multiple slashes on url: how to remove? - .htaccess

Based on code found here: remove multiple trailing slashes mod_rewrite
I have the following htaccess
Options +FollowSymLinks
DirectorySlash Off
RewriteEngine on
RewriteOptions inherit
RewriteBase /
#
# remove multiple slashes from url
#
RewriteCond %{HTTP_HOST} !=""
RewriteCond %{THE_REQUEST} ^[A-Z]+\s//+(.*)\sHTTP/[0-9.]+$ [OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(.*/)/+\sHTTP/[0-9.]+$
RewriteRule .* http://%{HTTP_HOST}/%1 [R=301,L]
#
# Remove multiple slashes anywhere in URL
#
RewriteCond %{THE_REQUEST} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
Yet I found that Googlebot has crawled this URL: http://www.example.com/aaa/bbb/////////bbb-ccc/bbb-ddd.htm. (aaa, bbb, ccc and ddd stand for keywords in the URL and are not to be taken literally; I just show the pattern of the URL.)
Testing the above URL on my live server, I found that the slash removal does not work.
Can anyone offer tips or improvements to the existing code? Thank you.
EDIT 1
#Sylwester provided the following code
# if match set environment variable and start over
RewriteRule ^(.*?)//+(.*)$ $1/$2 [E=REDIR:1,N]
# if done at least one. redirect with 301
RewriteCond %{ENV:REDIR} 1
RewriteRule ^/(.*) /$1 [R=301,L]
It is not working either. I still see the ////// inside the URL. I have put this set of rules at the very top of my htaccess file, right below the "RewriteBase /", so that it is not affected by other rules, yet nothing changes.
Any other suggestion?

Per-directory context (.htaccess) is tricky because Apache has already removed the redundant slashes from the path that RewriteRule sees, so a pattern such as //+ never matches there. Instead, check %{REQUEST_URI}, which still holds the original URI, and let the RewriteRule pattern match anything:
# NB: Only works for per directory and .htaccess
# Needs "AllowOverride All" in global config for .htaccess
RewriteEngine On
RewriteBase "/"
Options +FollowSymlinks
# Check if the REQUEST_URI has redundant slashes
# and redirect to self if it has (which apache has cleaned up already)
RewriteCond %{REQUEST_URI} //+
RewriteRule ^(.*) $1 [R=301,L]
If you can edit the global config, I would prefer this in the virtual host instead:
RewriteEngine On
# if match set environment variable and start over
RewriteRule ^(.*?)//+(.*)$ $1/$2 [E=REDIR:1,N]
# if done at least one. redirect with 301
RewriteCond %{ENV:REDIR} 1
RewriteRule ^/(.*) /$1 [R=301,L]
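For the .htaccess case, a variant of the same idea can test the raw request line instead of REQUEST_URI. This is only a sketch, assuming the .htaccess sits in the document root: THE_REQUEST contains the request line exactly as the client sent it (for example "GET /aaa/bbb////bbb-ccc/bbb-ddd.htm HTTP/1.1"), so the repeated slashes are still visible there, while the RewriteRule pattern sees the already collapsed path.

```apache
RewriteEngine On

# Match "//" only inside the path part of the request line;
# [^?\s]* stops at the query string, so "//" in a parameter value is ignored.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*//

# $1 is the per-directory path, which Apache has already collapsed,
# so redirecting to /$1 sends the client to the clean URL.
RewriteRule ^(.*)$ /$1 [R=301,L]
```

After the redirect, the new request line no longer contains "//", so the condition fails and the rule does not fire a second time.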

Related

.htaccess rewrite to same alias without infinite redirects

I have...
| .htaccess : (v1)
RewriteEngine on
RewriteRule ^in?$ login.php
So, /in --is-really--> /login.php
This much works great. We all can learn how to do this from: .htaccess redirect with alias url
But, I want it to also work in reverse...
If someone should enter /login.php into the address bar, I want it to change to /in.
So also, /login.php --rewrites-to--> /in
From this Answer to a different Question, I want to be ready for anything, using REQUEST_URI. So, my .htaccess file starts with this...
| .htaccess : (v2)
RewriteEngine on
# Remove index.php, if a user adds it to the address
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} ^/(.+/)?index\.php
RewriteRule (^|/)index\.php(/|$) /%1 [R=301,L]
# "in" --> login.php
RewriteRule ^in?$ login.php
That also works great.
But now, I want to add this rule (my Question here) for /in <--> /login.php both ways, just how / <--> /index.php already works with .htaccess (v2). So, I adopted the settings and added a second rule...
| .htaccess : (v3) —not working!
RewriteEngine on
# Remove index.php, if a user adds it to the address
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} ^/(.+/)?index\.php
RewriteRule (^|/)index\.php(/|$) /%1 [R=301,L]
# "in" --> login.php, and also redirect back to it
RewriteCond %{REQUEST_URI} ^/(.+/)?login\.php
RewriteRule (^|/)login\.php(/|$) /%1in [R=302,L]
RewriteRule ^in?$ login.php
...but then /in and /login.php both cause an infinite redirect loop.
What's the right way to do this, still using REQUEST_URI, and still having both rewrite rules (for index.php and for login.php)?
These Questions did not help:
Rewrite rule to hide folder, doesn't work right without trailing slash
This is not about a trailing slash
Allow multiple IPs to access Wordpress Site Admin via .htaccess
This is not about IP-based access
Htaccess URLs redirects are working for http not all https
This is not about https vs http
Rewrite-rules issues : .htaccess
This is not about cleaning up the GET array in the URL
apache htaccess rewrite with alias
This is not about rewriting the host/domain, thereby preserving the path
rewrite htaccess causes infinite loop?
This is not about www subdomain rewrites
.htaccess rewrite page with alias
This is not about rewriting "pretty" URLs nor about how to use slug settings in WordPress
Htaccess alias or rewrite confusion
This is not about simply having multiple rules with the same destination
htaccess rewrite to include #!
I'm not trying to rewrite #!
The reason for the redirect loop is a missing RewriteCond %{ENV:REDIRECT_STATUS} ^$ before the second redirect rule, the one that redirects login.php. Remember that a RewriteCond applies only to the RewriteRule that immediately follows it.
Suggested .htaccess:
RewriteEngine on
# Remove index.php, if a user adds it to the address
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} ^/(.+/)?index\.php$ [NC]
RewriteRule ^ /%1 [R=301,L]
# "in" --> login.php, and also redirect back to it
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} ^/(.+/)?login\.php$ [NC]
RewriteRule ^ /%1in [R=302,L]
RewriteRule ^in?$ login.php [L,NC]
It won't cause a redirect loop because after the first rewrite to /login.php, the REDIRECT_STATUS variable becomes 200, and the RewriteCond %{ENV:REDIRECT_STATUS} ^$ then stops the redirect from firing again.
Thanks to the help from the user with the correct answer, I found that...
RewriteCond %{ENV:REDIRECT_STATUS} ^$
...doesn't go in .htaccess only once, but every time on the line before...
RewriteCond %{REQUEST_URI} ...
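An alternative guard, shown here only as a sketch and not as what the accepted answer uses, is to test THE_REQUEST instead of REQUEST_URI. THE_REQUEST holds the request line as the browser sent it and is not changed by internal rewrites, so the redirect fires only when the visitor literally typed /login.php. The sketch assumes the .htaccess is in the document root, and it uses ^in$ rather than ^in?$, because the ? makes the n optional and would also match /i.

```apache
RewriteEngine On

# Redirect /login.php to /in only when the client itself requested it.
# After the internal rewrite below, THE_REQUEST still reads "GET /in ...",
# so this condition fails and no loop occurs.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/login\.php[?\s] [NC]
RewriteRule ^login\.php$ /in [R=302,L]

# Serve /in from login.php without changing the address bar.
RewriteRule ^in$ login.php [L]
```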

Rewrite only one specific url

I want to rewrite one specific url.
http://example1.com should be http://example2.de .
But http://example1.com/subdir or http://sub.example1.com should remain the same.
I found the following, which successfully redirects example1.com, but also every URL that starts with example1.com:
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
Background: I want to redirect the main page of a WP Multisite, but I want to make sure that I can still work with the WordPress backend and run other multisites on subdomains.
To match only the bare example1.com domain (nothing before it, and no path after it), use the following code:
RewriteCond %{HTTP_HOST} ^example1\.com$ [NC]
RewriteRule ^/?$ http://example2.de/ [R=301,L]
The /? in the rule pattern matches both example1.com and example1.com/ (but nothing else). The path has to be tested in the RewriteRule pattern rather than in the host condition, because HTTP_HOST never contains a slash.
You're pretty close but you don't need to capture URI in $1:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example1\.com$ [NC]
RewriteRule ^$ http://example2.de/ [L,R=301]

SEO Url for Profiles

Preface
I'm trying to rewrite a URL for a profile page. All of my application pages have a .html extension, so I'm trying to match just letters, numbers, hyphens (-), and periods (.).
So these would be valid
site.com/steve
site.com/steve-robbins
site.com/steve.robbins
But these wouldn't be
site.com/steve.html
site.com/steve-robbins.php
Assume I have a check in place so that custom URLs don't have .html or .php on the end.
Problem
I'm currently using this but it's not working
RewriteRule ^([a-zA-Z0-9\.-]+)$ profile.php?url=$1 [L]
It should set url to steve, but it's setting it to profile.php
What am I doing wrong?
My complete .htaccess
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^[^.]+\.[^.]+$
RewriteRule ^(.*) http://www.%{HTTP_HOST}/$1 [R=301]
#
# LOGIN
#
RewriteRule ^([a-z0-9]{255})/activate\.html$ login.php?activate=$1 [L]
RewriteRule ^logout\.html$ login.php?logout [L]
#
# SETTINGS
#
RewriteRule ^change-([a-z]+)\.html$ account-settings.php?$1 [L]
RewriteRule ^([a-zA-Z0-9\.-]+)$ profile.php?url=$1 [L]
# SEO friendly URLs
RewriteRule ^([a-zA-Z0-9-_.]+)\.html$ $1.php [L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([a-zA-Z0-9-_.]+)\.php
RewriteRule ^([a-zA-Z0-9-_.]+)\.php$ $1.html [R=301]
Add this to the top of your rules (under the RewriteBase / directive):
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
That should stop it from looping. The rewrite engine will keep re-applying all the rules until the URI going in (sans query string) is the same as the URI that comes out of the rules. That's why the value of url is profile.php.
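Another way to keep the second pass from matching, sketched here under the assumption that profile.php exists as a real file next to the .htaccess, is to skip the profile rule whenever the request already maps to an existing file. On the second pass the path is profile.php, the file is found on disk, and the rule is skipped, so url keeps its original value.

```apache
# Only treat the path as a profile name when it is not an existing file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z0-9.-]+)$ profile.php?url=$1 [L]
```

The condition belongs directly above the profile rule, since a RewriteCond applies only to the RewriteRule that follows it.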
I'm kind of a beginner at interpreting mod_rewrite rules, but if I understand it correctly, your rule is matched and then matched again. Either add something to the URL matching scheme, like /profile/user, or add a condition to not redirect if already redirected.
Try adding a leading slash to the redirect like this:
RewriteRule ^([a-zA-Z0-9.-]+)$ /profile.php?url=$1 [L]
The reason you're getting a url value of profile.php is because the [L] flag is kinda misleading when it comes to the .htaccess file. In the server config files it does exactly what you'd think, but in the .htaccess file it stops reading rules at that rule, but then goes through the rules again until path is unchanged by any of the rules. By adding the leading /, your rule will not match the second time around as you exclude / from the regex. I spent a while struggling with this feature myself.

.htaccess URL rewriting challenge

I'm having trouble with some URL rewriting.
All of the stuff below works fine, but I need to add a rule which removes querystrings from URLS.
site.com/page?a=b
will become
site.com/page
Can someone help out? I have done some reading on .htaccess but I find it terribly complex. Also, will need to know where in the file my new directives should appear.
Thanks.
# EE 404 page for missing pages
ErrorDocument 404 /index.php/404/index
# Simple 404 for missing files
ErrorDocument 404 "File Not Found"
# Rewriting will likely already be on, uncomment if it isnt
RewriteEngine On
RewriteBase /
# Block access to "hidden" directories whose names begin with a period. This
# includes directories used by version control systems such as Subversion or Git.
RewriteRule "(^|/)\." - [F]
# remove the www - Uncomment to activate
#
# RewriteCond %{HTTPS} !=on
# RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
#
# Remove the trailing slash to paths without an extension
# Uncomment to activate
#
# RewriteRule ^(.*)/$ /$1 [R=301,L]
#
# Remove index.php
# Uses the "include method"
# http://expressionengine.com/wiki/Remove_index.php_From_URLs/#Include_List_Method
#
RewriteCond %{QUERY_STRING} !^(ACT=.*)$ [NC]
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5})$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^/(home|inc|publishers|sidebars|about|include-template|testing|advertisers|products|sitemap|style|ad-choices|social-bar|search|404|members|P[0-9]{2,8}) [NC]
RewriteRule (.*) /index.php/$1 [L]
This would remove the query string from the rewritten URL (the trailing ? in the substitution discards it):
# remove query string
RewriteRule ^(.*) /index.php/$1? [L]
Hope it helps.
I'm a little late with an answer, but since I was searching for similar behaviour, I thought I should share it: you can also add a flag to a rewrite rule to remove the query string. The flag is [QSD], and it avoids workarounds like the ? at the end ;-)
Here you can find more about this flag. I feel like pointing out that "This flag is available in [Apache] version 2.4.0 and later"
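To make the query string disappear from the address bar as well, the rule has to be an external redirect. A minimal sketch, assuming Apache 2.4.0 or later, a .htaccess in the document root, and that ExpressionEngine ACT requests must keep their query string (as the existing ACT condition in the question suggests):

```apache
# Redirect only when the query string is non-empty; this also prevents a loop,
# because the redirected request has no query string left.
RewriteCond %{QUERY_STRING} .
# Assumption: leave ACT=... requests untouched
RewriteCond %{QUERY_STRING} !^ACT= [NC]
# QSD discards the query string on the redirect target
RewriteRule ^(.*)$ /$1 [QSD,R=301,L]
```

This block would go below RewriteBase / and above the "Remove index.php" rules, so the redirect happens before the internal rewrite to index.php. On Apache 2.2, drop QSD and end the substitution with a ? instead.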

How can I make a stack-overflow style user URL?

I'd like to map:
mywebsite.com/users/ -> mywebsite.com/users/users.php
mywebsite.com/users -> mywebsite.com/users/users.php
mywebsite.com/users/username -> mywebsite.com/users/user.php?name=username
At present, I'm using this .htaccess in the users directory:
RewriteEngine On
RewriteRule ^([a-zA-Z0-9_])*$ user.php?name=$1 [L]
RewriteRule ^.*$ users.php [L]
However, it never generates the user.php?name=$1 URL.
Why doesn't it work?
Here are the rules (place them in the .htaccess file in the website root folder; if placed elsewhere, some tweaking is required):
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# 1
RewriteRule ^users/?$ /users/users.php [L]
# 2
RewriteCond %{REQUEST_URI} !^/users/users?\.php
RewriteRule ^users/([^/]+)$ /users/user.php?name=$1 [QSA,L]
Rule #1 will match the first two of your URLs.
Rule #2 handles the specific user mapping. Its condition ensures that it does not rewrite already rewritten URLs.
UPDATE:
If you want to place it in the .htaccess file in the /users/ folder, then the URL mywebsite.com/users (without the trailing slash) most likely will not work.
But in any case -- here are the rules:
Options +FollowSymLinks -MultiViews
RewriteEngine On
# 1
RewriteRule ^$ users.php [L]
# 2
RewriteCond %{REQUEST_URI} !^/users/users?\.php
RewriteRule ^([^/]+)$ user.php?name=$1 [QSA,L]
It looks to me like it is applying both rules in sequence: the url matches the first rule, so it adds user.php?name=$1, and then it matches the second rule, so it replaces the string with users.php. If that is the problem, then for your particular case you could fix it by replacing the second regular expression with ^/?$.
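Two details of the original rules are worth spelling out. In ^([a-zA-Z0-9_])*$ the * repeats the group itself, so $1 holds only the last character of the name, and the pattern also matches an empty path. And since the rules are run again after a rewrite, the catch-all ^.*$ then matches user.php and sends it to users.php. A sketch for the .htaccess in the /users/ directory that avoids both, assuming user names consist only of letters, digits and underscores:

```apache
RewriteEngine On

# /users/ (empty per-directory path) lists all users
RewriteRule ^$ users.php [L]

# /users/username: the + is inside the group, so $1 is the whole name.
# user.php and users.php contain a dot, so they never match this pattern
# and are not rewritten again on the next pass.
RewriteRule ^([a-zA-Z0-9_]+)$ user.php?name=$1 [L]
```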
