.htaccess rewrite /files/users/1/file.pdf to /view/?file=file.pdf - .htaccess

I am terrible with mod_rewrite; however, I need to rewrite any request to the folder /files/users/*/ (* is a wildcard) to /view/ and insert the filename into a query parameter like so:
/files/users/9/test.pdf becomes /view/?file=test.pdf
How would I go about this assuming that the .htaccess file will be located inside /files/users/?
I would really appreciate if you explained how your solution works as I am slowly trying to become familiar with mod_rewrite.

So, you wanna have all my trade secrets on a silver plate?
Well, I try my best. ;-)
First of all, you must know where the documentation is. Look here for the reference: mod_rewrite. Or mod_rewrite, if your Apache version is 2.2.
You will find an overview with lots of links at Apache mod_rewrite. There, you will find a nice introduction to rewriting URLs. Also look here for lots of standard examples.
Since mod_rewrite supports PCRE regular expressions, you might need perlre and/or regular-expressions.info from time to time.
Now to your question
RewriteEngine On
RewriteRule ^(?:.+?)/(.*) /view/?file=$1
This might already be sufficient. It looks for a subdirectory (?:.+?) in /files/users and captures the name of a file (.*) in this subdirectory. If this pattern matches, it rewrites the URL to /view/?file= and appends the captured file name with $1. For /files/users/9/test.pdf, this gives /view/?file=test.pdf.
All untested, of course, have fun.
P.S. Additional info is here at SO at .htaccess info and .htaccess faq.

Put the directive below in your .htaccess file to rewrite /files/users/9/test.pdf to /view/?file=test.pdf. In practical terms, this means that if a visitor requests http://yourdomain.com/files/users/9/test.pdf, the server internally serves the content of http://yourdomain.com/view/?file=test.pdf instead.
RewriteRule ^[^/]+/(.*)$ /view/?file=$1 [L]
A RewriteRule directive is part of the Apache mod_rewrite module. It takes two arguments:
Pattern - a regular expression to match against the current URL path (note that the URL path is not the entire URL, only e.g. /my/path; in a .htaccess context the leading slash / is also stripped, giving us my/path).
Substitution - the destination URL or path that the request will be rewritten OR redirected to.
Explaining the rule
The pattern ^[^/]+/(.*)$:
^ - the regex must match from the start of the string
[^/] - matches any single character except a forward slash
+ - repetition operator which means: match the preceding element 1 or more times
/ - matches a forward slash
(.*) - matches any characters. The dot means match any single character. The star operator means repeat the preceding element 0 or more times. The parentheses mean the match is grouped and can be used in backreferences.
$ - the regex must match until the end of the string
The substitution /view/?file=$1:
...means that we rewrite the URL path to the /view/ folder with the query parameter file. The query parameter file will contain our first grouped match from the pattern as we pass it the $1 value (which means the first match from our RewriteRule pattern).
The [L] flag:
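...means that mod_rewrite will stop processing further rewrite rules in the current pass. This is handy to avoid unwanted behaviour. Note that in a .htaccess context the rewritten URL may still be passed through the rules again, so [L] on its own does not rule out loops.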
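As a rough check of the pattern explained above, here is a small Python sketch that applies the same regular expression to the path mod_rewrite would see from /files/users/.htaccess. Python's re module is only a stand-in for Apache's PCRE engine, and the function name rewrite_to_view is made up for this illustration.

```python
import re

# Same pattern as in the RewriteRule above.
PATTERN = re.compile(r"^[^/]+/(.*)$")


def rewrite_to_view(path_in_htaccess):
    """Return the rewritten URL, or None if the rule would not match.

    path_in_htaccess is the URL path with the directory prefix
    (/files/users/) already stripped, e.g. "9/test.pdf".
    """
    match = PATTERN.match(path_in_htaccess)
    if match is None:
        return None
    # group(1) plays the role of the $1 backreference.
    return "/view/?file=" + match.group(1)


print(rewrite_to_view("9/test.pdf"))      # /view/?file=test.pdf
print(rewrite_to_view("test.pdf"))        # None (no subdirectory)
print(rewrite_to_view("9/sub/test.pdf"))  # /view/?file=sub/test.pdf
```

The last call shows that (.*) also captures nested paths, which may or may not be what you want.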

Related

Rewriterule after 3 or 4 characters

My URLs are like this:
www.example.com/1234-title-of-an-event/whatever/
I need to take everything after the "1234-" (but it could be a 3 digit number as well, like "123-") and before the slash "/whatever" in order to redirect as:
www.example.com/title-of-an-event/
I am trying with the following rule (as the very last rule), but it doesn't seem to work and I only get a 500 Internal Server Error.
RewriteRule ^\/([0-9]{3,4})-(.*)\/(.*) https://www.example.com/$2 [R=301, L]
The 500 error is most probably caused by the space in the flags argument. It should be [R=301,L] (no space). However, this directive won't do anything in .htaccess, because of the slash prefix on the RewriteRule pattern. In a directory context (ie. .htaccess), the URL-path that is matched by the RewriteRule pattern excludes the directory-prefix, which notably ends in a slash, so the URL-path that is matched never starts with a slash. (You would need to match the slash prefix in a server/vhost context.)
This also should probably not be the "very last rule" if you have other mod_rewrite directives - there could be a conflict (without seeing your entire file).
There is no need to escape slashes in the RewriteRule pattern. Slashes carry no special meaning here, since spaces are effectively used as regex delimiters in Apache config files.
I would also modify your regex to match everything except a slash (ie. [^/]+), instead of everything, since regex is greedy by default: your capturing subpattern would match title-of-an-event/whatever in your example URL, not title-of-an-event, as intended.
So, try the following instead, near the top of your .htaccess file:
RewriteRule ^\d{3,4}-([^/]+) /$1 [R=302,L]
This matches 1234-title-of-an-event at the start of the URL-path and discards everything else, returning title-of-an-event in the $1 backreference. (Is the trailing /<whatever>/ required in order to make a successful match?)
\d is simply shorthand for [0-9].
There is no need to have capturing sub-patterns in the regex if the backreferences are not being used.
There is no need to include the absolute URL in the substitution unless you have multiple domains or are canonicalising the scheme/hostname in the redirected response.
Note that this is a 302 (temporary) redirect. Only change it to a 301 (permanent) redirect, if that is the intention, once you have confirmed that this works OK, to avoid caching issues. You will need to ensure that your browser cache is cleared before testing.
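To make the points about the leading slash and greedy matching concrete, here is a minimal Python sketch. It uses Python's re module as an approximation of Apache's regex engine, and it assumes the .htaccess file sits in the document root, so the matched path has no leading slash.

```python
import re

# In .htaccess the matched URL-path has no leading slash.
path = "1234-title-of-an-event/whatever/"

original = re.compile(r"^\/([0-9]{3,4})-(.*)\/(.*)")
suggested = re.compile(r"^\d{3,4}-([^/]+)")

# The original pattern expects a leading slash, so it never matches here.
print(original.match(path))                        # None

# Even with the slash present, the greedy (.*) captures too much.
print(original.match("/" + path).group(2))         # title-of-an-event/whatever

# [^/]+ stops at the first slash, for 4-digit and 3-digit prefixes alike.
print(suggested.match(path).group(1))              # title-of-an-event
print(suggested.match("123-other-event/").group(1))  # other-event
```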

How to redirect double URL to single URL with htaccess

Google Search Console is showing 404 Page Not Found error for
https://example.com/page/https://example.com/page/
and the link is coming from an external website.
I want to redirect with .htaccess:
https://example.com/page/https://example.com/page/
to
https://example.com/page/
Can anyone help me in this regard?
Try the following mod_rewrite directives at the top of your .htaccess file:
RewriteEngine On
RewriteRule ^(.*?)https?:/ /$1 [R=301,L]
This just removes any trailing part of the URL-path that starts with http:/ (or https:/).
UPDATE: The ? in the capturing subpattern (.*?) makes it non-greedy, so it only captures up to the first occurrence of https:/ and discards the rest, rather than up to the last occurrence (greedy) and looping (redirect loop) until all occurrences of https:/ were removed.
Additional notes:
First test with 302 (temporary) redirect to make sure it works. Only change to 301 when confirmed, to avoid caching issues.
The URL-path that is matched by the RewriteRule pattern has already had sequences of slashes reduced to single slashes, so you can't match // (double slash) here (but I don't think you need to).
If there are query strings involved then you may need a slightly different approach and another directive, since the query string itself (as opposed to the URL-path) might contain the "repeated URL" that needs to be removed (we would need to see an example first). The RewriteRule pattern matches against the URL-path only, not the query string.
On Windows: If the (scheme and) colon (:) appears in the first path segment (ie. the malformed link is for the document root) then Apache will generate a 403 Forbidden before .htaccess is able to redirect. There is nothing you can do to avoid this since it is a limitation of the OS (colons are not allowed in filesystem paths - the 403 occurs when Apache tries to map the URL to a filesystem path). This does not happen on Linux. For example: https://example.com/https://example.com/.
UPDATE: If you are not seeing a redirect, just a 404 then you may need to enable additional pathname information (PATH_INFO) on your URLs. For example, at the top of your .htaccess file:
AcceptPathInfo On
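To illustrate the greedy versus non-greedy point from the first UPDATE above, here is a short Python sketch. Python's re module stands in for Apache's regex engine, and the sample path (with the URL repeated twice, a case not in the original question) is made up to make the difference visible.

```python
import re

# Apache has already collapsed "//" to "/", so "https://" arrives as "https:/".
path = "page/https:/example.com/page/https:/example.com/page/"

lazy = re.compile(r"^(.*?)https?:/")
greedy = re.compile(r"^(.*)https?:/")

# Non-greedy: capture only up to the FIRST "https:/", one redirect is enough.
print("/" + lazy.match(path).group(1))    # /page/

# Greedy: capture up to the LAST "https:/", so another redirect would follow.
print("/" + greedy.match(path).group(1))  # /page/https:/example.com/page/
```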

.htaccess basic mod rewrite

I am only just starting to learn how to rewrite urls with the .htaccess file.
How would I change:
http://www.url.net/games/game_one.php
into this:
http://www.url.net/games/game-one/
This is what I have been trying
RewriteRule ^/games/game-one.php ^/games/game-one/ [NC,L]
If you want people to use /games/game-one/ explicitly, you have to rewrite it so that it requests /games/game-one.php, which is the opposite of what you have in your question.
RewriteEngine On
RewriteRule ^games/game-one/$ /games/game-one.php
If you want to rewrite other URLs too, then you'd need to use a technique similar to the prior answer.
Try this:
RewriteRule ^(/games/game-one)\.php $1/
What that says is: match anything starting with /games/game-one followed by .php, remember the first part of that match (a capturing group in regex speak), then replace the match with that first part and a slash character. Note that to match a literal period character you must precede it with a \, since . is a special character that means "any character". Also note that in a .htaccess file the leading slash is not part of the path being matched, so there the pattern would need to start with ^(games/game-one) instead.
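Here is a small Python sketch of why the period needs escaping. It uses Python's re module in place of Apache's regex engine and assumes the .htaccess file is in the document root, so the pattern is written without the leading slash.

```python
import re

escaped = re.compile(r"^(games/game-one)\.php")
unescaped = re.compile(r"^(games/game-one).php")

# The escaped dot only matches a literal period.
print(bool(escaped.match("games/game-one.php")))    # True
print(bool(escaped.match("games/game-oneXphp")))    # False

# An unescaped dot matches any character, including "X".
print(bool(unescaped.match("games/game-oneXphp")))  # True

# \1 is Python's spelling of the $1 backreference.
print(escaped.sub(r"\1/", "games/game-one.php"))    # games/game-one/
```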

Using .htaccess to style URL directory style

I have searched this question and looked around but can't seem to get this working in practice. This is my .htaccess file:
Options +FollowSymLinks
RewriteEngine on
RewriteRule /poker/(.*)/(.*)/$ /poker/?$1=$2
I am trying to get my page to work like this:
mysite.com/poker/page/home
But this just isn't working. I have used 3 different generators and tried typing it manually from tutorials, but it is just returning a 404. Any ideas are greatly appreciated; it could be really obvious.
Thanks
You do not have a trailing slash in your example, yet your rule requires one. You can make the trailing slash optional:
RewriteEngine on
RewriteRule /poker/(.*)/(.*)/?$ /poker/?$1=$2
Note however, that a URI such as /poker/a/b/c/d/e/f/g is also a match here: a/b/c/d/e/f will match the first subpattern and g will match the second one, because (.*) is greedy. (With a trailing slash on that URI, the first subpattern would swallow a/b/c/d/e/f/g and the second would be empty.) Be more specific if you wish to match only content between slashes, e.g. ([^/]*)
Well, there's really nothing wrong with the rules that you have if http://mysite.com/poker/?page=home resolves correctly. The only thing is that if this is in an htaccess file, the leading slash is removed from the URI when it's matched against in a RewriteRule, so you need to remove it from your regular expression (or make it optional):
RewriteRule ^poker/(.+)/(.+)/?$ /poker/?$1=$2
And maybe make the groupings (.+) instead so that there is at least one character there.
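To see how the greedy groups behave compared with the stricter [^/]+ form, here is a quick Python sketch. Python's re module is used as a stand-in for Apache's regex engine, and the variable names greedy and strict are just labels for this example.

```python
import re

greedy = re.compile(r"^poker/(.+)/(.+)/?$")
strict = re.compile(r"^poker/([^/]+)/([^/]+)/?$")

# The intended URL works with both patterns.
print(greedy.match("poker/page/home").groups())   # ('page', 'home')
print(strict.match("poker/page/home").groups())   # ('page', 'home')

# Greedy groups also accept deeper paths and split them at the last slash.
print(greedy.match("poker/a/b/c/d").groups())     # ('a/b/c', 'd')
print(strict.match("poker/a/b/c/d"))              # None

# With a trailing slash, the greedy second group keeps the slash.
print(greedy.match("poker/page/home/").groups())  # ('page', 'home/')
print(strict.match("poker/page/home/").groups())  # ('page', 'home')
```

If only exactly two segments should match, the [^/]+ form looks like the safer choice.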

RewriteRule in htaccess

Could anyone explain the following line please?
RewriteRule ^(.*)$ /index.php/$1 [L]
The parts of the rewrite rule break down as follows:
RewriteRule
Indicates this line will be a rewrite rule, as opposed to a rewrite condition or one of the other rewrite engine directives
^(.*)$
Matches all characters (.*) from the beginning ^ to the end $ of the request
/index.php/$1
The request will be re-written with the data matched by (.*) in the previous example being substituted for $1.
[L]
This tells mod_rewrite that if the pattern in step 2 matches, apply this rule as the "Last" rule, and don't apply any more rules.
The mod_rewrite documentation is really comprehensive, but admittedly a lot to wade through to decode such a simple example.
The net effect is that all requests will be routed through index.php, a pattern seen in many model-view-controller implementations for PHP. index.php can examine the requested URL segments (and potentially whether the request was made via GET or POST) and use this information to dynamically invoke a certain script, without the location of that script having to match the directory structure implied by the request URI.
For example, /users/john/files/index might invoke the function index('john') in a file called user_files.php stored in a scripts directory. Without mod_rewrite, the more traditional URL would probably use an arguably less readable query string and invoke the file directly: /user_files.php?action=index&user=john.
That will cause every request to be handled by index.php, which can extract the actual request from $_SERVER['REQUEST_URI']
So, a request for /foo/bar will be rewritten as /index.php/foo/bar
(I'm commenting here because I don't yet have the rep to comment on the answers)
Point #2 in meagar's answer doesn't seem exactly right to me. I might be out on a limb here (I've been searching all over for help with my .htaccess rewrites...), and I'd be glad for any clarification, but this is from the Apache 2.2 documentation on RewriteRule:
What is matched?
The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string. If you wish to match against the hostname, port, or query string, use a RewriteCond with the %{HTTP_HOST}, %{SERVER_PORT}, or %{QUERY_STRING} variables respectively.
To me that seems to say that for a URL of
http://some.host.com/~user/folder/index.php?param=value
the part that will actually be matched is
~user/folder/index.php
So that is not matching "all characters (.*) from the beginning ^ to the end $ of the request", unless "the request" doesn't mean what I thought it does.
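The quoted documentation can be pictured with a short Python sketch that splits the example URL into the parts mod_rewrite treats separately. This is only an approximation: urlsplit and re come from Python's standard library, not from Apache, and stripping the leading slash assumes a .htaccess file in the document root.

```python
import re
from urllib.parse import urlsplit

url = "http://some.host.com/~user/folder/index.php?param=value"
parts = urlsplit(url)

# Hostname and query string are not part of what the pattern sees;
# they are available through %{HTTP_HOST} and %{QUERY_STRING} instead.
print(parts.netloc)  # some.host.com
print(parts.query)   # param=value

# Only the path is matched (assumption: .htaccess in the document root,
# so the leading slash is stripped before matching).
matched_path = parts.path.lstrip("/")
print(matched_path)  # ~user/folder/index.php

# ^(.*)$ captures that whole path, but never the query string.
captured = re.match(r"^(.*)$", matched_path).group(1)
print("/index.php/" + captured)  # /index.php/~user/folder/index.php
```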

Resources