How to add RewriteCond %{REQUEST_URI} in the specific loop - .htaccess

I have the following htaccess code:
RewriteCond %{QUERY_STRING} .
RewriteCond %{HTTP_USER_AGENT} 11A465|Ahrefs|ArchiveBot|AspiegelBot|Baiduspider|bingbot|BLEXBot|Bytespider|CCBot|Curebot|Daum|Detectify|DotBot|Grapeshot|heritrix|Kinza|LieBaoFast|Linguee|LMY47V|MauiBot|Mb2345Browser|MegaIndex|MicroMessenger|MJ12bot|MQQBrowser|PageFreezer|PiplBot|Riddler|Screaming.Frog|Search365bot|SearchBlox|Seekport|SemanticScholarBot|SemrushBot|SEOkicks|serpstatbot|Siteimprove.com|Sogou.web.spider|trendictionbot|TurnitinBot|UCBrowser|weborama-fetcher|Vagabondo|VelenPublicWebCrawler|YandexBot|YisouSpider [NC]
RewriteRule ^.* - [F,L]
This code blocks direct access to " /?s= " based on the listed user agents detection.
Now, I would like to implement RewriteCond %{REQUEST_URI} ^/search to this code, so that it works for " /search/ " as well.
In other words - if bots try to access " /?s= " OR " /search/ " - they get blocked.
I was able to make this work for /search/ or for /?s= but never for both at the same time, which is my goal.
Original code snippet link: https://support.acquia.com/hc/en-us/articles/360042181273-Block-Access-to-Bad-Bots-coming-from-the-Huawei-Cloud
Here is the screenshot that might give you more insight into what is happening and what I'm trying to accomplish.
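The goal stated above comes down to one boolean grouping: block when the user agent is a listed bot AND (the query string is non-empty OR the path starts with /search). The sketch below models that grouping in Python rather than in .htaccess, only to make the intended logic explicit. The function name should_block and the shortened bot list are illustrative assumptions, not part of the original snippet. Note that the `.` pattern in the original first condition matches any non-empty query string, not only `s=`. In mod_rewrite terms, this grouping would presumably be expressed by putting an [OR] flag on the first of the two query/path conditions, but that is not tested here.

```python
import re

# Shortened, illustrative subset of the user-agent list from the question.
BOT_PATTERN = re.compile(r"Ahrefs|bingbot|SemrushBot|YandexBot", re.IGNORECASE)


def should_block(path: str, query: str, user_agent: str) -> bool:
    """Hypothetical model: bot AND (non-empty query OR path under /search)."""
    has_query = re.search(r".", query) is not None         # mirrors: QUERY_STRING .
    is_search = re.search(r"^/search", path) is not None   # mirrors: REQUEST_URI ^/search
    is_bot = BOT_PATTERN.search(user_agent) is not None
    return is_bot and (has_query or is_search)


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; AhrefsBot/7.0)"
    human = "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"
    for path, query, agent in [
        ("/", "s=test", bot),
        ("/search/test/", "", bot),
        ("/", "", bot),
        ("/search/test/", "", human),
    ]:
        print(path, repr(query), agent[:20], "->", should_block(path, query, agent))
```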

Related

htaccess extract path only from THE_REQUEST into a variable

This should be rather simple, but after trying for several hours and also searching everywhere, all the related answers do not suffice.
I have this so far:
RewriteCond %{THE_REQUEST} /([a-zA-Z0-9-_/\.]+)?
RewriteRule ^(.*)$ - [E=TRGT:%1]
What the match should be doing is the following:
for URL example.com it should contain nothing, or not be defined
for URL example.com/ it should contain nothing, or not be defined
for URL example.com/some-word it should contain some-word
for URL example.com/some-word/ it should contain some-word
for URL example.com/some-word/?foo=bar it should contain some-word
for URL example.com/another_word/ it should contain another_word
for URL example.com/folder/file.ext it should contain folder/file.ext
for URL example.com/some/other.dot?bar=foo it should contain some/other.dot
for URL example.com/thing.ext/?foo=bar&bar=foo it should contain thing.ext
What I have so far seems to work, except when the request path ends in a slash, e.g. some/ or some/?thing=wat; then it contains some/ instead of some.
I think I'm missing something really simple; any help will be appreciated, thank you.
UPDATE
I've managed to meet these exact requirements with the following code. Every attempt to do it in a one-liner failed horribly, so I did it in several lines:
RewriteCond %{THE_REQUEST} (?<=\s)(.*?)(?=\s)
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
RewriteCond %{ENV:HREFPATH} (^.*)?\?
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
RewriteCond %{ENV:HREFPATH} /(.*)
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
RewriteCond %{ENV:HREFPATH} (.*)/$
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
If anybody can reduce that to a s3xy 1(or 2)-liner I will choose your answer; thanks in advance.
Based on the accepted answer here the following fulfills the requirements of the question:
RewriteCond %{THE_REQUEST} \s/+([^?]*?)/*[\s?]
RewriteRule ^ - [E=HREFPATH:%1]
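To check the pattern above against the cases listed in the question, it can be run outside Apache. This is a minimal sketch in Python, assuming that Python's `re` module treats this particular pattern the same way as the PCRE engine used by mod_rewrite, and that THE_REQUEST has the usual "METHOD target HTTP/version" shape. The helper name extract_path is hypothetical.

```python
import re

# Same pattern as the RewriteCond above, applied to a full request line.
PATH_RE = re.compile(r"\s/+([^?]*?)/*[\s?]")


def extract_path(request_line: str) -> str:
    """Return the captured group (what %1 would hold), or "" if nothing matches."""
    match = PATH_RE.search(request_line)
    return match.group(1) if match else ""


if __name__ == "__main__":
    targets = [
        "/",
        "/some-word",
        "/some-word/",
        "/some-word/?foo=bar",
        "/another_word/",
        "/folder/file.ext",
        "/some/other.dot?bar=foo",
        "/thing.ext/?foo=bar&bar=foo",
    ]
    for target in targets:
        captured = extract_path(f"GET {target} HTTP/1.1")
        print(f"{target!r} -> {captured!r}")
```

The lazy group `([^?]*?)` lets the following `/*` absorb trailing slashes, which is what removes the some/ problem described in the question.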

Replacing - sign with + Query string in htaccess

I have tried my best, but only part of it works. I want to replace the - sign with a + sign within a specific query string in a category.
I want to replace
https://www.example.com/stack/?s=over-flow
to
https://www.example.com/stack/?s=over+flow
I have tried my best with the below code
RewriteCond %{QUERY_STRING} ^(.+)-(.+)$
RewriteRule ^(.*)$ /$1?%1+%2 [L,R=301,NE]
It works, but it breaks other parts of my site that contain a - sign in the URL
I want it specifically on /stack/?s= only
Please help me out, thanks in advance
You may use this rule to target a specific query string with only a single parameter:
RewriteCond %{QUERY_STRING} ^(s=[^&-]+)-([^&]+)$ [NC]
RewriteRule ^(stack/?)$ /$1?%1+%2 [L,R=301,NE]
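The effect of the condition above on different query strings can be sketched in Python, assuming Python's `re` handles this pattern like mod_rewrite's regex engine does. The function names rewrite_once and rewrite_until_stable are hypothetical. The second one imitates a client following each redirect in turn.

```python
import re
from typing import Optional

# Same pattern as the RewriteCond above; [NC] corresponds to IGNORECASE.
QUERY_RE = re.compile(r"^(s=[^&-]+)-([^&]+)$", re.IGNORECASE)


def rewrite_once(query: str) -> Optional[str]:
    """Return the redirected query string (%1+%2), or None if the rule does not apply."""
    match = QUERY_RE.match(query)
    if match is None:
        return None
    return f"{match.group(1)}+{match.group(2)}"


def rewrite_until_stable(query: str) -> str:
    """Apply the rule repeatedly, as successive redirects would."""
    while (next_query := rewrite_once(query)) is not None:
        query = next_query
    return query


if __name__ == "__main__":
    for q in ["s=over-flow", "s=a-b-c", "s=overflow", "page=a-b", "s=over-flow&page=2"]:
        print(q, "->", rewrite_once(q), "| final:", rewrite_until_stable(q))
```

Because `[^&-]+` cannot cross a hyphen, each pass replaces only the first hyphen, so a value with several hyphens would take one redirect per hyphen. A query string with more than one parameter does not match at all, which is consistent with the "single parameter" wording in the answer.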

Rewritecond Query String search on subpages

I'm writing because of an obstacle I ran into with .htaccess and the query string. What I need to achieve is a search inside subpages, submitted from a FORM using the GET method. I tried everything I could find on the internet yesterday evening. What partly works is:
RewriteCond %{QUERY_STRING} ^checkin=(.*)&checkout=(.*)$
RewriteRule ^index\.php$ /%1/%2/%3/? [L,R=301]
This rule redirects from:
/chocolate/?checkin=15-08-2016&checkout=18-08-2016
to:
/15-08-2016/18-08-2016/
It is probably easy for you to make the subpage name come first, before checkin and checkout. I spent many hours on it but couldn't make it work properly. What I want is:
/chocolate/15-08-2016/18-08-2016/
I even tried to send this subpage's name through INPUT type HIDDEN but it also wasn't working.
Your rule does not mention chocolate anywhere, so add it to the substitution:
RewriteCond %{QUERY_STRING} ^checkin=([^\s&]+)&checkout=([^\s&]+)$
RewriteRule ^index\.php$ /chocolate/%1/%2/? [L,R=301]
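What the two captures in the condition above hold, and how they end up in the redirect target, can be sketched in Python. This assumes Python's `re` matches this pattern the same way as mod_rewrite. The function name redirect_target and its subpage parameter are hypothetical; the rule itself hard-codes chocolate.

```python
import re
from typing import Optional

# Same pattern as the RewriteCond above.
DATES_RE = re.compile(r"^checkin=([^\s&]+)&checkout=([^\s&]+)$")


def redirect_target(query: str, subpage: str = "chocolate") -> Optional[str]:
    """Build /subpage/%1/%2/ from the query string, or None if it does not match."""
    match = DATES_RE.match(query)
    if match is None:
        return None
    checkin, checkout = match.group(1), match.group(2)
    return f"/{subpage}/{checkin}/{checkout}/"


if __name__ == "__main__":
    print(redirect_target("checkin=15-08-2016&checkout=18-08-2016"))
    # An extra parameter is not swallowed into the second capture: no match at all.
    print(redirect_target("checkin=15-08-2016&checkout=18-08-2016&x=1"))
```

Compared with the `(.*)` groups in the question, `[^\s&]+` stops at the next `&`, so each capture holds exactly one parameter value.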

htaccess codes did not work

This is EXACTLY the same case as: (htaccess) How to prevent a file from DIRECT URL ACCESS?
But none of the code provided in the answers works for me. I tried the snippets one by one, then tried combining them, but it still does not work. Here is my code:
# prevent direct image url access
# ----------
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http(s)://(www\.)?example\.com [NC]
RewriteCond %{HTTP_REFERER} !^http(s)://(www\.)?example\.com.*$ [NC]
# this does not work
RewriteRule \.(png|gif|jpe?g)$ - [F]
# and this
RewriteRule \.(png|gif|jpe?g)$ - [F,NC]
# and this
RewriteRule \.(png|gif|jpe?g)$ https://example.com/wp-login.php [NC,R,L]
# even by combining them
# ----------
# /prevent direct image url access
The case simulation:
index.php has <img src="test.png" alt=""> and should be normally accessible. The requirement is: http://example.com/test.png shouldn't be accessible.
I use WordPress on WP Engine, and I think WordPress's default rewrite doesn't cause the problem, since the code from the answers is placed above the WordPress rewrite rules.
UPDATE
I use PHP Version 5.5.9-1ubuntu4.14 on Apache 2 on wp engine
Your rules basically work for me, except for one thing:
The (s) is not doing what you think it does.
RewriteCond %{HTTP_REFERER} !^http(s)://(www\.)?example\.com [NC]
With parentheses you define a group, which doesn't make any sense at this point: the s inside it is still required, so the pattern only matches https. If you remove the (s), it works for http.
If you want to match https too, you have to write it like this:
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
The ? will make the preceding character (or group, if in parentheses) optional.
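The difference between the two referer patterns can be checked directly. The following is a minimal sketch in Python, assuming its `re` module behaves like mod_rewrite's regex engine for these simple patterns. The names GROUPED and OPTIONAL are illustrative.

```python
import re

# Pattern from the question: the group (s) is mandatory, so only https matches.
GROUPED = re.compile(r"^http(s)://(www\.)?example\.com", re.IGNORECASE)

# Pattern from the answer: s? makes the s optional, so http and https both match.
OPTIONAL = re.compile(r"^https?://(www\.)?example\.com", re.IGNORECASE)


def matches(pattern: re.Pattern, referer: str) -> bool:
    """True if the referer matches at the start, like an unnegated RewriteCond."""
    return pattern.search(referer) is not None


if __name__ == "__main__":
    for referer in ["http://example.com/", "https://www.example.com/page"]:
        print(referer, "grouped:", matches(GROUPED, referer),
              "optional:", matches(OPTIONAL, referer))
```

Since the conditions in the question are negated with `!`, a referer that fails to match is treated as foreign, which is why pages served over plain http were affected by the `(s)` version.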

.htaccess RewriteRule not working

Hi people#stackoverflow,
Maybe I have a fundamental misconception about the working of RewriteRule. Or maybe not. Nevertheless, I'm trying to figure this out now for two days, without any progress.
This is the current situation:
I have a Joomla website with SEF and mod_rewrite turned on.
This results in the URL:
mysite.com/index.php?option=com_remository&Itemid=7
being rewritten to:
mysite.com/sub-directory/sub-directory/0000-Business-files/
These are the lines that are currently used in my .htaccess (all standard Joomla)
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^([^\-]*)\-(.*)$ $1 $2 [N]
RewriteCond %{QUERY_STRING} mosConfig_[a-zA-Z_]{1,21}(=|\%3D) [OR]
RewriteCond %{QUERY_STRING} base64_encode.*\(.*\) [OR]
RewriteCond %{QUERY_STRING} (\<|%3C).*script.*(\>|%3E) [NC,OR]
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
RewriteRule ^(.*)$ index.php [F,L]
# RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/index.php
RewriteCond %{REQUEST_URI} (/|\.php|\.html|\.htm|\.feed|\.pdf|\.raw|/[^.]*)$ [NC]
RewriteRule (.*) index.php
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
This is what I want to achieve:
When a visitor uses this URL
mysite.com/sub directory/sub directory/0000 Business files/
it should lead him to the right page.
Although I know it's not the best idea to use spaces in a URL, I'm confronted with the fact that these 'spacious' URLs are used in a PDF that has already been issued.
I thought I could use mod_rewrite to rewrite these URL's. But all I get is 'page not found'
I've added this rule on top of the .htaccess file:
RewriteRule ^([^\-]*)\-(.*)$ $1 $2 [N]
But this is not working. What am I doing wrong? Or, also possible, am I missing the point on when and how to use mod_rewrite?
rgds, Eric
First off, the default behavior of Apache is usually to allow direct URLs that map to the underlying file system (relative to the document root), and you should use RewriteRule when you want to work around that. Looking at your question, it seems like you want to browse the filesystem and so you should not use a RewriteRule.
If mysite.com/sub+directory/sub+directory/0000+Business+files/ doesn't work (without your rule), I'm wondering: do you have that directory structure on your server? I.e. does it look like this?
[document root]/index.php
[document root]/sub directory/sub directory/0000 Business files/
If not, I'm not sure I understand what you're trying to achieve, and what you mean by the visitor being "led to the right page". Could you provide an example URL that the user provides, and the corresponding URL (or file system path) that you want the user to be served?
Regarding your rewrite rule, I'm not even sure that it is allowed, and I'm surprised you don't get a 500 Internal Server Error. RewriteRule takes two arguments (matching pattern and substitution) and optionally some flags, but because of the space between $1 and $2 you're supplying three arguments (+ flags).
EDIT: I got the pattern wrong, but it still doesn't make much sense. It matches against any URL that has at least one dash in it, and then picks out the parts before and after the first dash. So, for a URL like "this-is-a-url-path/to-a-file/on-the-server", $1 would be "this" and $2 would be "is-a-url-path/to-a-file/on-the-server". Again, if I had some example URLs and their corresponding rewrites, I could help you find the right pattern.
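The split described in the EDIT can be reproduced with a small sketch in Python, assuming its `re` module handles this pattern like mod_rewrite's regex engine. The name split_at_first_dash is hypothetical.

```python
import re
from typing import Optional, Tuple

# Same pattern as the RewriteRule in the question.
DASH_RE = re.compile(r"^([^\-]*)\-(.*)$")


def split_at_first_dash(path: str) -> Optional[Tuple[str, str]]:
    """Return ($1, $2) as the rule would capture them, or None without a dash."""
    match = DASH_RE.match(path)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(split_at_first_dash("this-is-a-url-path/to-a-file/on-the-server"))
    # A path that contains spaces but no dash is never matched by the rule.
    print(split_at_first_dash("sub directory/sub directory/0000 Business files/"))
```

The second call shows that a URL with spaces and no dashes, like the ones printed in the PDF, does not match this pattern at all.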
On a side note, spaces aren't allowed in URLs, but the browser and server probably do some work behind the scenes, allowing your PDFs to be picked up correctly.
