How do I 410 a pattern of toxic PrestaShop URLs? - .htaccess

I've been trying to return a 410 for a pattern of a few thousand toxic URLs that got indexed by Google for one of my clients. I want every URL on the website that contains ?% to send a 410 response code so that it gets de-indexed.
Example URL:
https://www.example.com/fr/promotions?%25252525253Bid_lang=6&p=6
I've tried putting this in the .htaccess file, above the IfModule section, as suggested in another thread here, but it didn't work. Any ideas?
RewriteRule ^[a-z]shop($|/) - [G]

I want every urls of the website that contain ?% to send a 410 response code
Using mod_rewrite at the top of the .htaccess file:
# Send 410 for any URL where the query string starts with "%"
RewriteCond %{QUERY_STRING} ^%
RewriteRule ^ - [G]
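The rule in the question did not match because a RewriteRule pattern is only tested against the URL-path, never the query string, so the query string has to be checked with a RewriteCond. If matching every query string that starts with % turns out to be too broad, a narrower variant might look like the sketch below. It assumes (this is not confirmed in the question) that all the unwanted URLs share the repeatedly encoded %25...3Bid_lang prefix seen in the example URL.

```apache
# Sketch only: assumes every unwanted URL has a query string that begins
# with "%", one or more "25" pairs, then "3Bid_lang=" (as in the example URL).
# QUERY_STRING is not URL-decoded, so the pattern is tested against the raw value.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^%(25)+3Bid_lang=
RewriteRule ^ - [G]
```

The G flag returns the 410 immediately, so no L flag is needed.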

Related

Using htaccess how do I RewriteRule/RewriteCond with no filename?

Hoping this isn't a duplicate; I've done a lot of looking and just get more confused, as I don't use .htaccess often.
I would like to have some pretty URLs, and I see lots of help on cases where, for example, index.php is passed a parameter such as page. So I can currently convert www.example.com/index.php?page=help to www.example.com/help.
Obviously I'm not clued up on this, but I would also like to handle a URL such as www.example.com/?page=help.
I can't seem to find much info, and in adapting the original I am obviously going wrong somewhere.
Any help or pointers in the right direction would be greatly appreciated. I'm sure it's probably stupidly simple.
My alterations so far which do not seem to work are:
RewriteCond %{THE_REQUEST} ^.*/?page=$1
RewriteRule ^(.*)/+page$ /$1[QSA,L]
I also recently tried QUERY_STRING, but just get a server error.
RewriteCond %{QUERY_STRING} ^page=([a-zA-Z]*)
RewriteRule ^(.*) /$1 [QSA,L]
I've given up for now, so thought I would ask. I want to ensure the request starts with ?page and then make a clean URL from the page parameter.
This is the whole/basic process...
1. HTML Source
Make sure you are linking to the "pretty/canonical" URL in your HTML source. This should be a root-relative URL starting with a slash (or absolute), in case you rewrite from different URL path depths later. For example:
<a href="/help">Help Page</a>
2. Rewrite the "pretty" URL
In .htaccess (using mod_rewrite), internally rewrite the "pretty" URL back to the file that actually handles the request, ie. the "front-controller" (eg. index.php, passing the page URL parameter if you wish). For example:
DirectoryIndex index.php
RewriteEngine On
# Rewrite URL of the form "/help" to "index.php?page=help"
RewriteRule ^[^.]+$ index.php?page=$0 [L]
The RewriteRule pattern ^[^.]+$ matches any URL-path that does not include a dot. By excluding a dot we can easily omit any request that would map to a physical file (that includes a file extension delimited by a dot).
The $0 backreference contains the entire URL-path that is matched by the RewriteRule pattern.
The DirectoryIndex is required when the "homepage" (root-directory) is requested, when the URL-path is otherwise empty. In this case the page URL parameter is not passed to our script.
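If page names ever need to contain a dot, a common alternative to excluding dots is to test whether the request maps to an existing file or directory. This is a different technique from the one above, shown only as a minimal sketch that reuses the same index.php front-controller and page parameter.

```apache
DirectoryIndex index.php

RewriteEngine On

# Skip requests that map to an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Rewrite everything else to the front-controller
RewriteRule ^(.+)$ index.php?page=$1 [L]
```

The trade-off is a filesystem check on every request, which the dot-exclusion pattern avoids.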
3. Implement the front-controller / router (ie. index.php)
In index.php (your "front-controller" / router) we read the page URL parameter and serve the appropriate content. For example:
<?php
$pages = [
'home' => '/content/homepage.php',
'help' => '/content/help-page.php',
'about' => '/content/about-page.php',
'404' => '/content/404.php',
];
// Default to "home" if "page" URL param is omitted or is empty
$page = empty($_GET['page']) ? 'home' : $_GET['page'];
// Default to 404 "page" if not found in the array/DB of pages
$handler = $pages[$page] ?? $pages['404'];
include($_SERVER['DOCUMENT_ROOT'].$handler);
As seen in the above script, the actual "content" is stored in the /content subdirectory. (This could also be a location outside of the document root.) By storing these files in a separate directory they can be easily protected from direct access.
4. Redirect the "old/ugly" URL to the "new/pretty" URL [OPTIONAL]
This is only strictly necessary (in order to preserve SEO) if you are changing an existing URL structure and the "old/ugly" (original) URLs have been exposed (indexed by search engines, linked to by third parties, etc.), otherwise the "old" URL (ie. /index.php?page=abc) is accessible. This is the same whenever you change an existing URL structure.
If the site is new and you are implementing the "new/pretty" URLs from the start then this is not so important, but it does prevent users from accessing the old URLs if they were ever exposed/guessed.
The following would go before the internal rewrite and after the RewriteEngine directive. For example:
# Redirect "old" URL of the form "/index.php?page=help" to "/help"
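RewriteCond %{ENV:REDIRECT_STATUS} ^$
# Test the query string first: a matching [OR] condition skips the rest of
# the OR chain, and %1 is taken from the last condition that matched
RewriteCond %{QUERY_STRING} ^page=([^.&]*) [OR]
RewriteCond %{REQUEST_URI} ^/index\.php$
# QSD (Apache 2.4+) stops the original query string being appended to the target
RewriteRule ^(index\.php)?$ /%1 [QSD,R=301,L]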
The check against the REDIRECT_STATUS environment variable prevents a redirect-loop by not redirecting requests that have already been rewritten by the later rewrite.
The %1 backreference contains the value of the page URL parameter, as captured from the preceding CondPattern (RewriteCond directive). (Note how this is different to the $n backreference as used in the rewrite above.)
The above redirects all URL variants both with/without index.php and with/without the page URL parameter. For example:
/index.php?page=help -> /help
/?page=help -> /help
/index.php -> / (homepage)
/?page= -> / (homepage)
TIP: Test first with 302 (temporary) redirects to prevent potential caching issues.
Comments / improvements / Exercises for the reader
The above does not handle additional URL parameters. You can use the QSA (Query String Append) flag on the initial rewrite to append additional URL parameters on the initially requested URL. However, implementing the reverse redirect is not so trivial.
You don't need to pass the page URL parameter in the rewrite. The entire (original) URL is available in the PHP superglobal $_SERVER['REQUEST_URI'] (which also includes the query string - if any). You can then parse this variable to extract the required part of the URL instead of relying on the page URL parameter. This generally allows greatest flexibility, without having to modify .htaccess later.
However, being able to pass a page URL parameter can be "useful" if you ever want to manually rewrite (override) a URL route using .htaccess.
Incorporate regex (wildcard pattern matching) in the "router" script so you can generate URLs with "parameters". eg. /<page>/<param1>/<param2> like /photo/cat/large.
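The first two points above can be sketched as variants of the original rewrite. Only one of the two rules should be active at a time; the lang parameter in the comment is a made-up example.

```apache
DirectoryIndex index.php

RewriteEngine On

# Variant 1: keep any additional URL parameters from the request.
# "/help?lang=fr" is rewritten to "index.php?page=help&lang=fr"
RewriteRule ^[^.]+$ index.php?page=$0 [QSA,L]

# Variant 2: pass nothing and let index.php parse $_SERVER['REQUEST_URI'].
# The original query string is passed through unchanged.
#RewriteRule ^[^.]+$ index.php [L]
```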
Reference:
https://httpd.apache.org/docs/2.4/rewrite/
https://httpd.apache.org/docs/2.4/rewrite/intro.html
https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html
RewriteCond %{QUERY_STRING} ^page=([^&]+)
RewriteRule ^$ /%1? [R=302,L]
Can't delete this, and didn't want to waste anyone's time responding.

How to rewrite a url with multiple parameters to new url using .htaccess

I have rewritten my site from ASP to PHP. I need to redirect a few pages with multiple parameters.
These are a few of the old URLs:
mysite.co.uk/productlist_paged.asp?cid=1&offset=10
mysite.co.uk/productlist_paged.asp?cid=1&offset=20
mysite.co.uk/productlist_paged.asp?cid=1&offset=30
mysite.co.uk/productlist_paged.asp?cid=1&offset=40
to the following new pages:
mysite.co.uk/Compare/Roland-Digital-Pianos/43/1
mysite.co.uk/Compare/Roland-Digital-Pianos/43/2
mysite.co.uk/Compare/Roland-Digital-Pianos/43/3
mysite.co.uk/Compare/Roland-Digital-Pianos/43/4
I was hoping to keep the number 43 out of the redirect, as this is a number that will change when products are added/removed.
cid=1 corresponds to Roland-Digital-Pianos, and e.g. offset=10 corresponds to the 1 at the end of the URL.
Any help welcome.
You could go with something like:
RewriteEngine On
RewriteCond %{QUERY_STRING} cid=1&offset=([0-9]+)0
RewriteRule ^productlist_paged\.asp$ /Compare/Roland-Digital-Pianos/43/%1? [R=301,L]
There is an extra 0 at the end of the RewriteCond pattern, outside the capture group; otherwise, %1 would be 10 or 20 instead of 1 or 2.
The extra ? at the end of the substitution removes the original query string from the redirect URL.
Note: I didn't add ^ and $ to the RewriteCond pattern, so the query string doesn't necessarily have to start or end with these parameters, i.e. productlist_paged.asp?test=1&cid=1&offset=10&test2=1 will also match the RewriteCond and get redirected.
Hope it helps!
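If the looser match is not wanted, the condition can be anchored so that only the exact two-parameter query string redirects. A sketch, assuming Apache 2.4+ for the QSD flag (which does the same job as the trailing ?):

```apache
RewriteEngine On

# Only match a query string that is exactly "cid=1&offset=<n>0"
RewriteCond %{QUERY_STRING} ^cid=1&offset=([0-9]+)0$
# QSD discards the original query string from the redirect target
RewriteRule ^productlist_paged\.asp$ /Compare/Roland-Digital-Pianos/43/%1 [QSD,R=301,L]
```

The 43 is still hard-coded here, as in the rule above.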

How to get only first part of string before slash in URL (HTACCESS)

To achieve this URL pattern: www.example.com/abc-xyz/mno-pqr/123.html
I am using the following in .htaccess:
rewriterule ^(.*)/(.*)/(.*).html$ index.php?lyrics_id=$3&singer=$1&song=$2 [L]
Below are example URLs that are causing duplicate title errors for my site.
www.example.com/abc-xyz/WHATEVER/ANOTHER_WHATEVER/mno-pqr/283.html
www.example.com/abc-xyz/I_DONT_WANT_THIS_PART/mno-pqr/283.html
www.example.com/abc-xyz/HELP_ME/REMOVE/mno-pqr/283.html
www.example.com/abc-xyz/HELP/REMOVE/THIS/PART/mno-pqr/283.html
I want to get only the first part, before the first slash, as the singer.
I want exactly this:
www.example.com/abc-xyz/mno-pqr/123.html
with no other segments between abc-xyz and mno-pqr.
Please help me write the .htaccess rule.
It seems that you only want to match the url if it consists of exactly 3 parts. You can do this, by matching everything but the delimiter in each part. The delimiter here is /.
Also please note that . matches EVERY CHARACTER. If you want to match the period character instead, you have to escape it (\.).
RewriteRule ^([^/]*)/([^/]*)/([^/]*)\.html$ /index.php?lyrics_id=$3&singer=$1&song=$2 [L]
That works for my 3-parameter rule, but how do I add a 301 redirect for all those URLs that have more than 3 parts?
If the extra parts are always between the first part and the 2nd part of the url, like you showed above, you have to redirect the user with a 301 header:
RewriteRule ^([^/]*)/.*/([^/]*)/([^/]*)\.html$ /$1/$2/$3.html [R=301,L]
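When the two rules are combined, order matters: the external redirect has to come before the internal rewrite so that the longer URLs are cleaned up first. A minimal sketch combining both rules from this answer (using + rather than * so that empty segments are not matched, which is my own tweak):

```apache
RewriteEngine On

# 1. Redirect URLs with extra segments between the first and the last two parts
#    e.g. /abc-xyz/HELP/REMOVE/mno-pqr/283.html -> /abc-xyz/mno-pqr/283.html
RewriteRule ^([^/]+)/.+/([^/]+)/([^/]+)\.html$ /$1/$2/$3.html [R=301,L]

# 2. Internally rewrite the clean 3-part URL to the script
RewriteRule ^([^/]+)/([^/]+)/([^/]+)\.html$ /index.php?lyrics_id=$3&singer=$1&song=$2 [L]
```

A 3-part URL never matches the first rule, because that rule needs at least four path segments.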

mod_rewrite for a 301 redirect. Not redirecting to proper location

I'm trying to use a RewriteRule (using ISAPI, NOT on an Apache server) to 301 redirect a url such as:
http://www.mydomain.com/news/story-title/
to
http://www.mydomain.com/news/detail/story-title/
What I've gotten so far is:
RewriteRule ^news/(?!detail)/?$ news/detail/$1/ [L,R=301]
which successfully ignores urls that already have the "detail" in them (in some of my first attempts I ended up with a loop and a url like "/news/detail/detail/detail..."), but visiting /news/story-title/ gives me a 404 so it's not redirecting to the proper location.
Change your rewrite rule to
RewriteRule ^news/(?!detail)([^/]+)/?$ news/detail/$1/ [L,R=301]
EDIT : (How it works?)
/(?!detail) is a negative lookahead, but it's also non-capturing, i.e. it matches / but not what comes after it; it just makes sure that what follows isn't "detail". So, I added a capturing group ([^/]+) to capture those characters (one or more of anything that's not a /), optionally followed by a trailing /.
Hence, the $1 now gets replaced with the matched directory name.
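One caveat: (?!detail) also rejects any story title that merely starts with "detail" (for example a hypothetical /news/details-of-the-event/). A slightly stricter sketch, assuming the rewrite engine accepts the same PCRE syntax as the rule above:

```apache
# Only skip the "detail" segment itself, not titles that start with "detail"
RewriteRule ^news/(?!detail(?:/|$))([^/]+)/?$ news/detail/$1/ [L,R=301]
```

With this pattern /news/detail/ and /news/detail are still left alone, while /news/details-of-the-event/ is redirected like any other story.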

301 Redirect in htaccess. Matching Page IDs

I would like some help creating an htaccess 301 redirect for the below type of url.
In total there's around 500 or so products but rather than write a redirect for every url, which would be very bulky and time consuming, I'm hoping there's an easier way that I haven't yet found to create a kind of regular expression match?
OLD: http://www.example.co.uk/test-product-name-slug/prod_233.html
NEW: http://www.example.co.uk/test-product-name-slug-233.html
The new URL can be accessed by browsing to example.co.uk/-233.html, which then rewrites to example.co.uk/test-product-name-slug-233.html
So it would appear I need a way of detecting whether the incoming visitor is requesting a URL that contains prod_id and redirecting to -id
I hope that all makes sense.
Hopefully this is what you're looking for. It only matches one directory level, e.g.:
/some-product_name-random/prod_9393.html => /some-product_name-random-9393.html
.htaccess
RewriteEngine on
RewriteBase /
RewriteRule ^([a-zA-Z0-9_-]+)/prod_([0-9]+)\.html$ /$1-$2.html [R=301,L,QSA]
Regex and Parameters Explained
([a-zA-Z0-9_-]+) matches product name $1 (letters, digits, underscores and hyphens)
prod_([0-9]+) matches product id $2
[R=301] 301 permanent redirect
[L] stop processing further rules (may be removed, but usually good practice for specific rules when using multiple rules for different scenarios)
[QSA] keep query string domain.com/somepath/page.html?querystring=value&otherstuff (?...)
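Because the substitution has no query string of its own, mod_rewrite passes the original query string through unchanged by default, so the redirect should also work without QSA. A minimal sketch for testing, assuming slugs contain only letters, digits, hyphens and underscores, and using a temporary 302 until the behavior is confirmed:

```apache
RewriteEngine On
RewriteBase /

# /test-product-name-slug/prod_233.html -> /test-product-name-slug-233.html
# Any query string on the request is carried over to the target by default
RewriteRule ^([a-zA-Z0-9_-]+)/prod_([0-9]+)\.html$ /$1-$2.html [R=302,L]
```

Once the redirect behaves as expected, change R=302 back to R=301.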

Resources