htaccess rewrite block - target specific directory (for PreRender.io caching) - .htaccess

I would like to modify the below ModRewrite block to only be triggered when the URL starts like this:
http(s)://my-website.com/n/ ....
The rewrite is used to send the (Angular) page to PreRender for caching so search engines can index it. The whole Angular portion of the site lives under /n/, so that is all I need to cache.
<IfModule mod_proxy_http.c>
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Only proxy the request to Prerender if it's a request for HTML
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(index\.php)?(.*) http://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$3 [P,L]
</IfModule>

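I think you'd want to add a condition on the request path, anchored with ^ so that it only matches URLs that start with /n/:
RewriteCond %{REQUEST_URI} ^/n/
Place it after the two existing conditions. The [OR] flag only joins the first two, so the result is (bot user agent OR _escaped_fragment_) AND path under /n/:
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
RewriteCond %{REQUEST_URI} ^/n/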
# Only proxy the request to Prerender if it's a request for HTML
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(index\.php)?(.*) http://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$3 [P,L]
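An alternative sketch: the /n/ restriction can also be folded into the RewriteRule pattern itself instead of a separate condition. This assumes the .htaccess file sits in the document root, so the pattern sees the path without its leading slash. The user-agent and extension lists are abbreviated here for readability, and the non-capturing group is a simplification of mine, not part of the original Prerender snippet.

```apache
<IfModule mod_proxy_http.c>
    RewriteEngine On

    # Match crawlers by user agent, or the _escaped_fragment_ query string.
    # (Abbreviated list; keep the full list from the original block.)
    RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|linkedinbot|slackbot [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_

    # Proxy only paths under n/ that do not look like static assets.
    # (Abbreviated extension list; keep the full list from the original rule.)
    RewriteRule ^(?!.*?\.(?:js|css|png|jpg|gif|ico))(n/.*)$ http://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$1 [P,L]
</IfModule>
```

Because the pattern has a single capturing group, the backreference is $1 rather than the $3 used in the original rule.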

Related

.htaccess Angular app crawler redirect not working on specific URLs

I am making an inventory management site using Angular and Firebase. Because this is Angular, there are problems with web crawlers, specifically the Slack/Twitter/Facebook/etc. crawlers that grab meta information to display a card/tile. Angular does not do well with this.
I have a site at https://domain.io (just an example) and, because of the Angular issue, I have a Firebase function that created a new site that I can redirect traffic to. When it gets the request (onRequest), I can grab whatever query parameters I've sent it and call the DB to render the page server-side.
So, the three examples that I need to redirect are:
From: https://domain.io/item/ABC123
To: https://us-central1-domain.cloudfunctions.net/metaTags?item=ABC123
--
From: https://domain.io/bench/USERNAME
To: https://us-central1-domain.cloudfunctions.net/metaTags?bench=USERNAME
--
From: https://domain.io/bench/USERNAME/ITEMTYPE
To: https://us-central1-domain.cloudfunctions.net/metaTags?bench=USERNAME&type=ITEMTYPE
I can't seem to get the right combination. This is what I have right now. Note: The item redirect is working, but the other two are not...
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^index\.html$ - [L]
# If a bot goes to an Item
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|Slackbot|Slack-ImgProxy|Slackbot-LinkExpanding|Site\ Analyzer|SiteAnalyzerBot|Viber|Whatsapp|Telegram [NC,OR]
RewriteCond %{REQUEST_URI} ^/item/
RewriteRule ^item/(.+)$ https://us-central1-domain.cloudfunctions.net/metaTags?item=$1 [NC,L]
# If a bot goes to a simple bench
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|Slackbot|Slack-ImgProxy|Slackbot-LinkExpanding|Site\ Analyzer|SiteAnalyzerBot|Viber|Whatsapp|Telegram [NC,OR]
RewriteCond %{REQUEST_URI} ^/bench/
RewriteRule ^bench/(.+)$ https://us-central1-domain.cloudfunctions.net/metaTags?user=$1 [NC]
# If a bot goes to a bench of a type
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|Slackbot|Slack-ImgProxy|Slackbot-LinkExpanding|Site\ Analyzer|SiteAnalyzerBot|Viber|Whatsapp|Telegram [NC,OR]
RewriteCond %{REQUEST_URI} ^/bench/
RewriteRule ^bench/(.+)/(.+)$ https://us-central1-domain.cloudfunctions.net/metaTags?user=$1&type=$2 [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.html [L]
</IfModule>
It's also important to note that I would like this to work on any subdomain (my dev and staging environments are set up as subdomains), and to make sure traffic is always directed to https.
Any thoughts?
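Use the [NC,L] flags on both bench RewriteRules as well, so that processing stops once one of them matches.
Use ([^/]+) instead of (.+) in the regex patterns, so that each capture stops at the next slash and bench/USERNAME is not confused with bench/USERNAME/ITEMTYPE.
Change [NC,OR] to [NC] on the user-agent RewriteCond, so that a rule fires only when the user agent matches AND the path matches. With [OR], every visitor to that path is redirected, bot or not.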
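Put together, the three suggestions above might look like the sketch below. It is untested and makes a few assumptions: the user-agent list is abbreviated, the separate REQUEST_URI conditions are dropped because the rule patterns already check the path, and the query parameter names (user, type) are taken from the rules in the question rather than from the example target URLs, so use whichever names the function actually reads.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteRule ^index\.html$ - [L]

    # Bot requests /item/ID
    RewriteCond %{HTTP_USER_AGENT} facebookexternalhit|twitterbot|linkedinbot|Slackbot [NC]
    RewriteRule ^item/([^/]+)/?$ https://us-central1-domain.cloudfunctions.net/metaTags?item=$1 [NC,L]

    # Bot requests /bench/USERNAME/ITEMTYPE
    RewriteCond %{HTTP_USER_AGENT} facebookexternalhit|twitterbot|linkedinbot|Slackbot [NC]
    RewriteRule ^bench/([^/]+)/([^/]+)/?$ https://us-central1-domain.cloudfunctions.net/metaTags?user=$1&type=$2 [NC,L]

    # Bot requests /bench/USERNAME
    RewriteCond %{HTTP_USER_AGENT} facebookexternalhit|twitterbot|linkedinbot|Slackbot [NC]
    RewriteRule ^bench/([^/]+)/?$ https://us-central1-domain.cloudfunctions.net/metaTags?user=$1 [NC,L]

    # Everything else falls through to the Angular app
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.html [L]
</IfModule>
```

Since the target is an absolute URL on a different host, mod_rewrite answers with an external redirect rather than an internal rewrite. The rules do not test the host name, so they apply on any subdomain served from this directory, and the target is always https.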
You can try the code below (not tested).
# If a bot goes to a simple bench
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|Slackbot|Slack-ImgProxy|Slackbot-LinkExpanding|Site\ Analyzer|SiteAnalyzerBot|Viber|Whatsapp|Telegram [NC,OR]
RewriteCond %{REQUEST_URI} ^/bench/(.*)
RewriteRule ^bench/(.*)$ https://us-central1-domain.cloudfunctions.net/metaTags?user=$1 [NC]
# If a bot goes to a bench of a type
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|Slackbot|Slack-ImgProxy|Slackbot-LinkExpanding|Site\ Analyzer|SiteAnalyzerBot|Viber|Whatsapp|Telegram [NC,OR]
RewriteCond %{REQUEST_URI} ^/bench/(.*)/(.*)
RewriteRule ^bench/(.*)/(.*)$ https://us-central1-domain.cloudfunctions.net/metaTags?user=$1&type=$2 [NC]
If it doesn't work, try adding the L flag, i.e. [NC,L] instead of [NC].

.htaccess: Allow specific URL

I have an AngularJS site, and the .htaccess file has been configured to redirect all traffic from mysite.se/* to mysite.se/index.html.
It is also configured with prerender.io to let search engines access the site.
Question: I want to add a blog at blog.mysite.se, and the problem is of course that it will be redirected to mysite.se/index.html. What do I need to add to the .htaccess file to let requests for blog.mysite.se through?
Thanks
RewriteEngine On
RewriteCond %{HTTP_HOST} ^MYSITE\.se$ [NC]
RewriteRule ^.*$ http://www.MYSITE.se%{REQUEST_URI} [R=301,L]
# If requested resource exists as a file or directory
# (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
# Go to it as is
RewriteRule ^ - [L]
# If non existent
# If path ends with / and is not just a single /, redirect to without the trailing /
RewriteCond %{REQUEST_URI} ^.*/$
RewriteCond %{REQUEST_URI} !^/$
RewriteRule ^(.*)/$ $1 [R,QSA,L]
# Handle Prerender.io
RequestHeader set X-Prerender-Token "kEJ0CC1gMnj6V0J4u8xu"
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Proxy the request
RewriteRule ^(.*)$ http://service.prerender.io/http://%{HTTP_HOST}$1 [P,L]
# If non existent
# Accept everything on index.html
RewriteRule ^ /index.html
Insert this rule just below the RewriteEngine On line so that the blog subdomain skips all of the rules that follow:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^blog\. [NC]
RewriteRule ^ - [L]
# rest of rules here

RewriteCond to exclude a separate URL

I'm using .htaccess to redirect mobile devices, with code found in "RewriteCond to exclude a path not working".
I've tried rewriting the part that excludes a path so that it excludes a URL instead, but can't get it to work.
Basically, I have a second URL on my hosting account: an addon domain that is stored in a subfolder of my public html folder. My main website is stored in the public html folder itself, with the .htaccess that redirects to a mobile subdomain for that website.
Any suggestions on how I can stop the addon domain from redirecting to the mobile version of the main website?
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
# Check if mobile=1 is set and set cookie 'mobile' equal to 1
RewriteCond %{QUERY_STRING} (^|&)mobile=1(&|$)
RewriteRule ^ - [CO=mobile:1:%{HTTP_HOST}]
# Check if mobile=0 is set and set cookie 'mobile' equal to 0
RewriteCond %{QUERY_STRING} (^|&)mobile=0(&|$)
RewriteRule ^ - [CO=mobile:0:%{HTTP_HOST}]
# Skip next rule if mobile=0 [OR] if it's a file [OR] if /path/
RewriteCond %{QUERY_STRING} (^|&)mobile=0(&|$) [OR]
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_URI} ^.*/path/.*$
RewriteRule ^ - [S=1]
# Check if this looks like a mobile device
RewriteCond %{HTTP_PROFILE} !^$ [OR]
RewriteCond %{HTTP_X_WAP_PROFILE} !^$ [OR]
RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
# Check if we're not already on the mobile site
RewriteCond %{HTTP_HOST} !^m\.
# Check to make sure we haven't set the cookie before
RewriteCond %{HTTP_COOKIE} !mobile=0(;|$)
# Don't redirect "path" pages
RewriteCond %{REQUEST_URI} !^.*www.addondomain.com/.*$ [NC]
# Now redirect to the mobile site
RewriteRule ^ http://m.example.com/ [R,L,NC]
Add this below RewriteBase /:
RewriteCond %{HTTP_HOST} ^(www\.)?addondomain\.com$ [NC]
RewriteRule ^ - [L]
This tells mod_rewrite to do nothing for addondomain.com, so none of the later rules run for that host.
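For context, the condition tried in the question cannot work because %{REQUEST_URI} holds only the path (for example /page.html), never the host name, so a pattern containing www.addondomain.com will not match it. A hedged alternative to the early-exit rule is to test %{HTTP_HOST} directly on the redirect rule. This sketch reuses the question's conditions and only swaps the last one:

```apache
# Check if this looks like a mobile device
RewriteCond %{HTTP_PROFILE} !^$ [OR]
RewriteCond %{HTTP_X_WAP_PROFILE} !^$ [OR]
RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
# Check if we're not already on the mobile site
RewriteCond %{HTTP_HOST} !^m\.
# Check to make sure we haven't set the cookie before
RewriteCond %{HTTP_COOKIE} !mobile=0(;|$)
# Don't redirect requests that arrived on the addon domain (host check, not path check)
RewriteCond %{HTTP_HOST} !^(www\.)?addondomain\.com$ [NC]
# Now redirect to the mobile site
RewriteRule ^ http://m.example.com/ [R,L]
```

With this variant the cookie rules earlier in the file still run for the addon domain. The early-exit rule avoids that, which is one reason to prefer it.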

Redirect a specific but wild card * URL (with folder(s)) to new domain (same structure)

I need to redirect a specific URL (with its structure) to the same URL(s) on a new domain, but not other URLs.
domainA.com/company/careers*
domainB.com/company/careers*
The reason for this is a 3rd party vendor supplying a jQuery-based iframe app that performs a referrer check before loading.
I realize there is a bigger SEO/duplicate content issue that needs to be addressed, but a lot of additional work needs to happen before domainA.com is fully redirected to domainB.com, so for now it's only the "career" section.
The site is using IIS6 with HeliconTech's ISAPI_Rewrite 3
http://www.helicontech.com/isapi_rewrite/doc/introduct.htm
Current Rules:
# Helicon ISAPI_Rewrite configuration file
# Version 3.1.0.59
<VirtualHost www.domainA.com www.domainB.com>
RewriteEngine On
#RewriteBase /
#RewriteRule ^pubs/(.+)\.pdf$ /404/?pub=$1.pdf [NC,R=301,L]
# Send away some bots
RewriteCond %{HTTP:User-Agent} (?:YodaoBot|Yeti|ZmEu|Morfeus\Scanner) [NC]
RewriteRule .? - [F]
# Ignore directories from FarCry Friendly URL processing
RewriteCond %{REQUEST_URI} !(^/measureone|^/blog|^/demo|^/_dev)($|/)
RewriteRule ^([a-zA-Z0-9\/\-\%:\[\]\{\}\|\;\<\>\?\,\*\!\#\#\$\ \(\)\^_`~]*)$ /index.cfm?furl=$1 [L,PT,QSA]
RewriteCond %{REQUEST_URI} ^/company/careers [NC]
RewriteRule ^company/careers/?(.*)$ http://www.domainname.com/company/careers/$1 [R=301,L]
# Allow CFFileServlet requests
RewriteCond %{REQUEST_URI} !(?i)^[\\/]CFFileServlet
RewriteBase /blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* /blog/index.php [L]
</VirtualHost>
<VirtualHost blog.domainA.com>
RewriteEngine On
#redirect old blog.domainA.com/* posts to www.domainB.com/blog/*
RewriteCond %{HTTP_HOST} ^blog.domainA\.com [nc]
RewriteRule (.*) http://www.domainB.com/blog$1 [R=301,L]
</VirtualHost>
It seems that the "RewriteBase /blog/" line breaks your "careers" rule, as it implies that the request should be domainA.com/blog/company/careers*.
Please consider having it like this:
<VirtualHost www.domainA.com www.domainB.com>
RewriteEngine On
RewriteBase /
#RewriteRule ^pubs/(.+)\.pdf$ /404/?pub=$1.pdf [NC,R=301,L]
# Send away some bots
RewriteCond %{HTTP:User-Agent} (?:YodaoBot|Yeti|ZmEu|Morfeus\Fucking\Scanner) [NC]
RewriteRule .? - [F]
# Ignore directories from FarCry Friendly URL processing
RewriteCond %{REQUEST_URI} !(^/measureone|^/blog|^/demo|^/_dev)($|/)
RewriteRule ^([a-zA-Z0-9\/\-\%:\[\]\{\}\|\;\<\>\?\,\*\!\#\#\$\ \(\)\^_`~]*)$ /index.cfm?furl=$1 [L,PT,QSA]
RewriteCond %{REQUEST_URI} ^/company/careers [NC]
RewriteRule ^company/careers/?(.*)$ http://www.domainname.com/company/careers/$1 [R=301,L]
# Allow CFFileServlet requests
RewriteCond %{REQUEST_URI} !(?i)^[\\/]CFFileServlet
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^blog/.* /blog/index.php [L]
</VirtualHost>
If you still have issues, enable logging in httpd.conf by putting
RewriteLogLevel 9
and check how your request is processed in rewrite.log.
Just check to see if the request starts with /company/careers
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/company/careers [NC]
RewriteRule ^company/careers/?(.*)$ http://domainB.com/company/careers/$1 [R=301,L]
See if that works for you.
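One point the rules above do not address: the VirtualHost line in the question applies the same rule set to both www.domainA.com and www.domainB.com, so a careers redirect with no host check would fire again on the target domain and loop. The FarCry catch-all also appears to match company/careers and stop with [L] before the careers rule is reached. A minimal sketch, assuming the target is www.domainB.com, that RewriteBase / is in effect, and that these lines are placed above the FarCry rule:

```apache
RewriteEngine On
RewriteBase /

# Redirect only when the request arrived on domainA (with or without www),
# so the rule does not fire again once the browser lands on domainB.
RewriteCond %{HTTP_HOST} ^(?:www\.)?domainA\.com$ [NC]
# Keep whatever follows "careers" unchanged, matching the careers* wildcard.
RewriteRule ^company/careers(.*)$ http://www.domainB.com/company/careers$1 [NC,R=301,L]
```

The REQUEST_URI condition is omitted because the rule pattern already restricts the match to paths starting with company/careers.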

Subdomain is taking over requests to domain

I've created a subdomain called mobile for my website through cPanel. I redirect mobile devices to that subdomain, but there is JavaScript that lives there that makes AJAX calls to the actual domain. I have structured these calls to go to website.com/mobile/.... However, these aren't going through, and I suspect that it's because the server is looking for ... in my /mobile directory, but the request is supposed to be rewritten in .htaccess to website.com/index.php?params=mobile/....
Here's the .htaccess:
# redirect phones/tablets to mobile site
RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
RewriteCond %{HTTP_HOST} !mobile\.website\.com [NC]
RewriteCond %{REQUEST_URI} !^/mobile [NC]
RewriteRule ^(.*)$ http://www.mobile.website.com/$1 [L,R=302]
# not a file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# website.com/home => website.com/index.php?params=home
RewriteRule ^(.+)(\?.+)?$ index.php?params=$1 [L,QSA]
This works on my local machine but not on the live server. I have created a subdomain locally via
<VirtualHost *:80>
DocumentRoot "C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs/website/mobile"
ServerName mobile.website.local
</VirtualHost>
and it works perfectly: when I go to mobile.website.local or website.local/mobile, I get the mobile site, and when I go to website.local/mobile/users/login I get the correct JSON output for the AJAX request.
How can I keep my mobile subdomain alive in /mobile/ but have requests to website.com/mobile/... be forwarded with the last rewrite rule?
Thanks!
Just add a specific rewrite for /mobile ahead of the other rules, so that it bypasses the file-or-directory check:
RewriteCond %{REQUEST_URI} ^/mobile [NC]
RewriteRule ^(.+)(\?.+)?$ index.php?params=$1 [L,QSA]
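# redirect phones/tablets to mobile site
RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
RewriteCond %{HTTP_HOST} !mobile\.website\.com [NC]
RewriteCond %{REQUEST_URI} !^/mobile [NC]
RewriteCond %{QUERY_STRING} !^params=mobile(.*)$
RewriteRule ^(.*)$ http://www.mobile.website.com/$1 [L,R=302]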
# not a file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# website.com/home => website.com/index.php?params=home
RewriteRule ^(.+)(\?.+)?$ index.php?params=$1 [L,QSA]
If anything is unclear, let me know and I'll see if I can help :)
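A side note on the pattern used in these rules: a RewriteRule pattern is matched against the URL path only, never the query string, so the (\?.+)? group can never capture anything. A slightly simpler sketch of the /mobile rule, assuming the same index.php front controller, relies on QSA alone to carry the original query string over:

```apache
# Send requests whose path starts with /mobile to the front controller,
# without first checking whether a matching file or directory exists.
RewriteCond %{REQUEST_URI} ^/mobile [NC]
# QSA appends the original query string to params=...
RewriteRule ^(.+)$ index.php?params=$1 [L,QSA]
```

The same simplification applies to the final catch-all rule.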
