RewriteRule after 3 or 4 characters - .htaccess

My URLs are like this:
www.example.com/1234-title-of-an-event/whatever/
I need to take everything after the "1234-" (but it could be a 3 digit number as well, like "123-") and before the slash "/whatever" in order to redirect as:
www.example.com/title-of-an-event/
I am trying the following rule (as the very last rule), but it doesn't seem to work and I only get a 500 Internal Server Error.
RewriteRule ^\/([0-9]{3,4})-(.*)\/(.*) https://www.example.com/$2 [R=301, L]

RewriteRule ^\/([0-9]{3,4})-(.*)\/(.*) https://www.example.com/$2 [R=301, L]
The 500 error is most probably caused by the space in the flags argument. It should be [R=301,L] (no space). However, this directive won't do anything in .htaccess, because of the slash prefix on the RewriteRule pattern. In a directory context (ie. .htaccess), the URL-path that is matched by the RewriteRule pattern excludes the directory-prefix, which notably ends in a slash, so the URL-path that is matched never starts with a slash. (You would need to match the slash prefix in a server/vhost context.)
This also should probably not be the "very last rule" if you have other mod_rewrite directives - there could be a conflict (without seeing your entire file).
There is no need to escape slashes in the RewriteRule pattern. Slashes carry no special meaning here, since spaces are effectively used as regex delimiters in Apache config files.
I would also modify your regex to match everything except a slash (ie. [^/]+) instead of everything. Regex quantifiers are greedy by default, so your capturing subpattern (.*) would match title-of-an-event/whatever in your example URL, not title-of-an-event as intended.
So, try the following instead, near the top of your .htaccess file:
RewriteRule ^\d{3,4}-([^/]+) /$1 [R=302,L]
This matches /1234-title-of-an-event and discards everything else, returning title-of-an-event in the $1 backreference. (Is the trailing /<whatever>/ required in order to make a successful match?)
\d is simply shorthand for [0-9].
There is no need to have capturing sub-patterns in the regex if the backreferences are not being used.
There is no need to include the absolute URL in the substitution unless you have multiple domains or are canonicalising the scheme/hostname in the redirected response.
Note that this is a 302 (temporary) redirect. Only change it to a 301 (permanent) redirect, if that is the intention, once you have confirmed that it works OK, to avoid caching issues. You will need to ensure that your browser cache is cleared before testing.
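The points above about the slash prefix and greedy matching can be checked outside Apache. The following is only a sketch in Python, whose re syntax behaves the same way as Apache's PCRE for these particular patterns; the variable names are made up for illustration and the string is the URL-path as a .htaccess RewriteRule would see it.

```python
import re

# URL-path as RewriteRule sees it in .htaccess: the leading slash is already removed.
url_path = "1234-title-of-an-event/whatever/"

# Original pattern: requires a leading slash, so it never matches here.
with_slash = re.match(r"^/([0-9]{3,4})-(.*)/(.*)", url_path)

# Same pattern without the slash prefix: the greedy (.*) runs up to the last slash.
greedy = re.match(r"^([0-9]{3,4})-(.*)/(.*)", url_path)

# Suggested pattern: [^/]+ stops at the first slash.
segment = re.match(r"^\d{3,4}-([^/]+)", url_path)

print(with_slash)        # None
print(greedy.group(2))   # title-of-an-event/whatever
print(segment.group(1))  # title-of-an-event
```

The same suggested pattern also matches a 3-digit prefix such as 123-title-of-an-event/whatever/.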

Related

Redirection from after slash word to query

I am very new in coding and redirection in .htaccess.
I would need a redirection for a lot of URLs with the slug /brand/.
For example:
https://example.com/brand/AAA to /shop/?filter_marke=AAA
https://example.com/brand/BBB to /shop/?filter_marke=BBB
https://example.com/brand/CCC to /shop/?filter_marke=CCC
and so on.
You could perform the "redirect" like the following using mod_rewrite:
RewriteEngine On
RewriteRule ^brand/(\w+)$ /shop/?filter_marke=$1 [R=302,L]
The order of rules can be important. This would need to go near the top of the .htaccess file, before any existing rewrites. If this is a WordPress site, then it would need to go before the # BEGIN WordPress comment marker.
The $1 backreference in the substitution string (2nd argument) contains the value of the word after /brand/ in the URL-path.
UPDATE:
I only forgot to mention that there is a slash after the variable, means the incoming link looks like ..../AAA/
In that case you can simply append a trailing slash to the end of the pattern, ie. ^brand/(\w+)/$. Or make the trailing slash optional so it matches both. ie. ^brand/(\w+)/?$.
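To see which requests the optional-trailing-slash pattern accepts, here is a small sketch in Python (not Apache itself). The function name redirect_target is hypothetical, and the input is the URL-path without its leading slash, as in .htaccess.

```python
import re

# Same regex as the RewriteRule pattern with the optional trailing slash.
BRAND_RULE = re.compile(r"^brand/(\w+)/?$")

def redirect_target(url_path):
    """Return the redirect target for a URL-path, or None if the rule does not match."""
    m = BRAND_RULE.match(url_path)
    return f"/shop/?filter_marke={m.group(1)}" if m else None

print(redirect_target("brand/AAA"))       # /shop/?filter_marke=AAA
print(redirect_target("brand/AAA/"))      # /shop/?filter_marke=AAA
print(redirect_target("brand/my-brand"))  # None
```

Note that \w does not match a hyphen, so a brand such as my-brand would need a character class like [\w-]+ instead.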

Replace part of a long URL and redirect

Is there a way to redirect the URL as follows:
URL is generated based on a filtering system so it is like this
https://example.com/product-category-no-slash-generated-part-is-autoadded-here
Due to the massive product number, it is impossible for me to change all generated URL-s but I need to change, for example, only no-slash part to something-else, so redirect does this:
Old URL:
https://example.com/product-category-no-slash-generated-part-is-autoadded-here
New URL:
https://example.com/product-category-something-else-generated-part-is-autoadded-here
I hope I managed to explain the problem.
I tried to use stuff like RewriteRule ^/no-slash/(.*)$ /something-else/$1 [L] but I think this does not work for what I need.
To replace no-slash with something-else in a URL-path that consists of a single path segment, you can do something like the following using mod_rewrite, near the top of the root .htaccess file:
RewriteEngine On
# Replace "no-slash" in URL-path with "something-else"
RewriteRule ^([\w-]+)no-slash([\w-]+)$ /$1something-else$2 [R=302,L]
This assumes the URL-path can only consist of the characters 0-9, a-z, A-Z, _ (underscore) and - (hyphen).
The $1 and $2 backreferences contain the matched URL-path before and after the string to replace respectively.
I tried to use stuff like RewriteRule ^/no-slash/(.*)$ /something-else/$1 [L]
In this you are matching slashes in the URL-path - which do not occur in your example. You are also not allowing for anything before the string you want to replace (eg. product-category-).
In a .htaccess context, the URL-path matched by the RewriteRule pattern does not start with a slash. So, a pattern like ^/no-slash will never match.
UPDATE:
another example. example.com/demo-tools-for-construction-work So word TOOLS in URL must be replaced with EQUIPMENT-AND-TOOLS.
(I'm assuming this should all be lowercase.)
A problem with your second example (in comments) is that tools also exists in the target URL, so this would naturally result in an endless redirect loop.
To prevent this "loop" you would need to exclude the URL you are redirecting to. eg. You could exclude URLs that already contain equipment-and-tools.
For example:
# Replace "tools" in URL-path with "equipment-and-tools"
# - except if it already contains "equipment-and-tools"
RewriteCond %{REQUEST_URI} !equipment-and-tools
RewriteRule ^([\w-]+)tools([\w-]+)$ /$1equipment-and-tools$2 [R=302,L]
The ! prefix on the CondPattern (2nd argument to the RewriteCond directive) negates the expression. So, in this case it is successful when equipment-and-tools is not contained in the requested URL.
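The effect of the negated condition can be sketched in Python. This only mimics the two directives above (it is not how Apache evaluates them internally), and redirect_target is a hypothetical helper name.

```python
import re

# Same regex as the RewriteRule pattern.
RULE = re.compile(r"^([\w-]+)tools([\w-]+)$")

def redirect_target(url_path):
    """Return the redirect target, or None when no redirect should happen."""
    # Mimics: RewriteCond %{REQUEST_URI} !equipment-and-tools
    if "equipment-and-tools" in url_path:
        return None
    m = RULE.match(url_path)
    if m is None:
        return None
    return f"/{m.group(1)}equipment-and-tools{m.group(2)}"

first = redirect_target("demo-tools-for-construction-work")
print(first)                       # /demo-equipment-and-tools-for-construction-work
print(redirect_target(first[1:]))  # None, so the second request is not redirected again
```

Without the condition, the pattern alone would still match the redirected URL-path, which is where the loop comes from.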

How to redirect double URL to single URL with htaccess

Google Search Console is showing 404 Page Not Found error for
https://example.com/page/https://example.com/page/
and the link is coming from an external website.
I want to redirect with .htaccess:
https://example.com/page/https://example.com/page/
to
https://example.com/page/
Can anyone help me in this regard?
Try the following mod_rewrite directives at the top of your .htaccess file:
RewriteEngine On
RewriteRule ^(.*?)https?:/ /$1 [R=301,L]
This just removes any trailing part of the URL-path that starts with http:/ (or https:/).
UPDATE: The ? in the capturing subpattern (.*?) makes it non-greedy, so it captures only up to the first occurrence of https:/ and discards the rest. A greedy (.*) would capture up to the last occurrence, so the request would be redirected repeatedly, removing one occurrence of https:/ each time until none were left.
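The difference only shows when the malformed part is repeated more than once. A quick check in Python, as an illustration only: the example URL-path below is made up, and the double slashes are already written as single slashes, as they would be by the time the RewriteRule pattern is applied.

```python
import re

# Hypothetical URL-path with the URL repeated twice after the real path.
url_path = "page/https:/example.com/page/https:/example.com/page/"

lazy = re.match(r"^(.*?)https?:/", url_path).group(1)
greedy = re.match(r"^(.*)https?:/", url_path).group(1)

print(lazy)    # page/
print(greedy)  # page/https:/example.com/page/
```

With the non-greedy version a single redirect to /page/ is enough, whereas the greedy result still contains https:/ and would be redirected again.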
Additional notes:
First test with 302 (temporary) redirect to make sure it works. Only change to 301 when confirmed, to avoid caching issues.
The URL-path that is matched by the RewriteRule pattern has already had sequences of slashes reduced to single slashes, so you can't match // (double slash) here (but I don't think you need to).
If there are query strings involved then you may need a slightly different approach and another directive, since the query string itself (as opposed to the URL-path) might contain the "repeated URL" that needs to be removed (we would need to see an example first). The RewriteRule pattern matches against the URL-path only, not the query string.
On Windows: If the (scheme and) colon (:) appears in the first path segment (ie. the malformed link is for the document root) then Apache will generate a 403 Forbidden before .htaccess is able to redirect. There is nothing you can do to avoid this since it is a limitation of the OS (colons are not allowed in filesystem paths - the 403 occurs when Apache tries to map the URL to a filesystem path). This does not happen on Linux. For example: https://example.com/https://example.com/.
UPDATE: If you are not seeing a redirect, just a 404 then you may need to enable additional pathname information (PATH_INFO) on your URLs. For example, at the top of your .htaccess file:
AcceptPathInfo On

htaccess RewriteRule with literal question marks (not query string)

I need to be able to match question marks because there was a translated text encoding mistake, and part of the URL ended up hardcoded with question marks in them. Here's a URL example that I need to rewrite:
https://example.com/Documentation/Product????/index.html
Here is my current rewrite rule. It works when the characters following "Product" are not question marks, but when they are, the rule doesn't apply.
RewriteRule "^Documentation/Product[^/]+/(.*)$" "https://s3.amazonaws.com/company-documentation/Help/Product/$1" [L,NC]
How would I make sure that question marks are considered to be characters too in this rule? I can't expect that only question marks and not the original non-English characters will be in the URL, so I want the rule above to match both question marks and any other character.
I found this topic which seems relevant, but the flags don't help, and the answer doesn't explain how to overcome the problem mentioned in the "Aside".
https://webmasters.stackexchange.com/questions/107259/url-path-with-encoded-question-mark-results-in-incorrect-redirect-when-copied-to
https://example.com/Documentation/Product????/index.html
You say it's "not a query string", but actually that is exactly what it is. And that is why you can't match it with the RewriteRule pattern. The above URL is split as follows:
URL-path: /Documentation/Product (matched by the RewriteRule pattern)
Query string: ???/index.html (note 3 ? - the first one starts the query string)
To match the query string you'll need an additional RewriteCond directive that checks against the QUERY_STRING server variable.
For example, to match the above URL, you would need to do something like:
RewriteCond %{QUERY_STRING} ^\?*/index\.html
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/index.html [NC,R,L]
This matches any number of erroneous ? at the start of the query string.
I've added the R (redirect) flag. Your directive (without the R flag) would trigger an external redirect anyway (because you are specifying an absolute URL in the substitution), but it is far better to be explicit here. This is also a temporary (302) redirect. If this should be permanent (301) then change it to R=301, but only once you have confirmed that it's working OK (301s are cached hard by the browser, which can make testing problematic).
UPDATE:
...so I want the rule above to match both question marks and any other character.
Only if there are question marks in the URL will there be a query string, so I think it is advisable to keep these two rules separate.
If there could be any erroneous characters at the start of the query string and if you want to capture the end part of the URL (like you are doing in your original directive, eg. index.html) then you can modify the above to read:
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
Note the %1 (as opposed to $1) backreference in the substitution string. This is a backreference to the captured group in the last matched CondPattern (ie. /(.*)$).
You can follow this with your existing directive (but remember to include the R flag) for more "normal" URLs that don't contain a ? (ie. query string).
NB: Surrounding the arguments in double quotes is entirely optional in this example. They are only required if you have unescaped spaces in the pattern or substitution arguments.
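The URL split and the two CondPatterns above can be checked with a short Python sketch. It uses the standard urllib.parse.urlsplit function to stand in for the way the URL is divided at the first ?, and is an illustration only, not a description of Apache's internals.

```python
import re
from urllib.parse import urlsplit

parts = urlsplit("https://example.com/Documentation/Product????/index.html")
print(parts.path)   # /Documentation/Product
print(parts.query)  # ???/index.html

# First CondPattern: any number of stray "?" followed by /index.html
print(bool(re.search(r"^\?*/index\.html", parts.query)))  # True

# Second CondPattern: capture everything after the first slash (the %1 backreference)
print(re.search(r"/(.*)$", parts.query).group(1))  # index.html
```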
In summary
# Redirect URLs of the form:
# "/Documentation/Product?<anything#1>/<anything#2>"
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
# Redirect URL-paths of the form (no query string):
# "/Documentation/Product<something>/<anything>"
RewriteRule ^Documentation/Product[^/]+/(.*) https://s3.amazonaws.com/company-documentation/Help/Product/$1 [NC,R,L]

.htaccess rewrite /files/users/1/file.pdf to /view/?file=file.pdf

I am terrible with mod_rewrite, however I need to rewrite any request to the folder /files/users/*/ (* is a wildcard) to /view/ and insert the filename into a query parameter like so:
/files/users/9/test.pdf becomes /view/?file=test.pdf
How would I go about this assuming that the .htaccess file will be located inside /files/users/?
I would really appreciate if you explained how your solution works as I am slowly trying to become familiar with mod_rewrite.
So, you wanna have all my trade secrets on a silver plate?
Well, I try my best. ;-)
First of all, you must know where the documentation is. Look here for the reference: mod_rewrite. Or mod_rewrite, if your Apache version is 2.2.
You will find an overview with lots of links at Apache mod_rewrite. There, you will find a nice introduction to rewriting URLs. Also look here for lots of standard examples.
Since mod_rewrite supports PCRE regular expressions, you might need perlre and/or regular-expression.info from time to time.
Now to your question
RewriteEngine On
RewriteRule ^(?:.+?)/(.*) /view/?file=$1
This might already be sufficient. It looks for a subdirectory (?:.+?) in /files/users and captures the name of a file (.*) in this subdirectory. If this pattern matches, it rewrites the URL to /view/?file= and appends the captured file with $1, which gives /view/?file=$1.
All untested, of course, have fun.
P.S. Additional info is here at SO at .htaccess info and .htaccess faq.
Put the directive below in your .htaccess file to rewrite /files/users/9/test.pdf to /view/?file=test.pdf. In practical terms this means that if a visitor requests http://yourdomain.com/files/users/9/test.pdf, the request is internally rewritten to http://yourdomain.com/view/?file=test.pdf (the URL shown in the browser does not change).
RewriteRule ^[^/]+/(.*)$ /view/?file=$1 [L]
A RewriteRule directive is part of the Apache mod_rewrite module. It takes two arguments:
Pattern - a regular expression to match against the current URL path (note that the URL path is not the entire URL, just the path part, eg. /my/path; in a .htaccess context the leading slash / is stripped, giving us my/path).
Substitution - the destination URL or path that the request will be rewritten or redirected to.
Explaining the rule
The pattern ^[^/]+/(.*)$:
^ - the regex must match from the start of the string
[^/] - match everything but forward slash
+ - repetition operator which means: match 1 or more characters
/ - matches a forward slash
(.*) - matches any characters. The dot means match any character. The star operator means repeat 0 or more times. The parentheses mean the match is captured as a group and can be used in backreferences.
$ - the regex must match until the end of the string
The substitution /view/?file=$1:
...means that we rewrite the URL path to the /view/ folder with the query parameter file. The query parameter file will contain our first grouped match from the pattern as we pass it the $1 value (which means the first match from our RewriteRule pattern).
The [L] flag:
...means that mod_rewrite stops processing any further rewrite rules in the current pass once this rule matches. This helps to avoid unwanted behaviour from later rules.
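The pattern breakdown above can be tried out in Python as a rough check. This assumes the .htaccess file is in /files/users/, so the matched URL-path is relative to that directory; the function name rewrite is hypothetical.

```python
import re

def rewrite(url_path):
    """Mimic: RewriteRule ^[^/]+/(.*)$ /view/?file=$1 [L]"""
    m = re.match(r"^[^/]+/(.*)$", url_path)
    return f"/view/?file={m.group(1)}" if m else None

print(rewrite("9/test.pdf"))      # /view/?file=test.pdf
print(rewrite("test.pdf"))        # None
print(rewrite("9/sub/test.pdf"))  # /view/?file=sub/test.pdf
```

The last call shows that (.*) also captures any further subdirectories, not just the filename.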

Resources