htaccess 301 redirect while replacing characters in URL - Helicon ISAPI_Rewrite module - .htaccess

I have an old site that is being rebuilt. Instead of using a folder structure, the new site uses sub-domains. The segments are different, but the redirect itself is pretty simple. I can handle it like so:
RewriteRule ^segment/blog/view$ http://blogs.site.com/segment/article [R=301,NE,NC,L]
RewriteRule ^segment/blog$ http://blogs.site.com/segment [R=301,NE,NC,L]
So if I had www.site.com/segment/blog, it will now go to blogs.site.com/segment.
If I request www.site.com/segment/blog/view/catchy_name_goes_here, it currently redirects to blogs.site.com/segment/article/catchy_name_goes_here, but I NEED it to go to blogs.site.com/segment/article/catchy-name-goes-here.
My issue comes from a decision to change the separator in the URI. The old articles were built with underscores '_' and the new articles are built with hyphens '-'.
How can I replace the underscores in the article titles with hyphens?

Try these rules:
RewriteRule ^/segment/blog$ http://blogs.site.com/segment [R,I,L]
# replace _ by - repeatedly ($1 already starts with /, so no extra slash is prepended)
RewriteRule ^(/segment/blog/view)/([^_]*)_+(.*)$ $1/$2-$3 [I,N,U]
# all _s gone, now do a redirect
RewriteRule ^/segment/blog/view/([^_]+)$ http://blogs.site.com/segment/article/$1 [R,I,L,U]
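For readers on stock Apache rather than Helicon's module, the same loop-then-redirect idea can be sketched with mod_rewrite. This is only a sketch, assuming the rules live in the document-root .htaccess (so the patterns carry no leading slash) and reusing the hostnames from the question; it has not been tried against the asker's IIS setup.

```apache
RewriteEngine On

# Each pass swaps the first run of underscores for a hyphen, then [N]
# restarts the ruleset so the next underscore is handled on the next pass.
RewriteRule ^(segment/blog/view/[^_]*)_+(.*)$ $1-$2 [N,NC]

# Final pass: no underscores remain, so issue the permanent redirect.
RewriteRule ^segment/blog/view/([^_]+)$ http://blogs.site.com/segment/article/$1 [R=301,NE,NC,L]
```

Under these assumptions, a request for /segment/blog/view/catchy_name_goes_here would take three passes through the first rule before the second rule redirects it to the hyphenated URL.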

I ended up having to use the following. I don't know how many people this might affect, given the unusual settings for this site in particular, but I thought I would post the answer in case it helps anyone who needs it.
The full setup is a server running IIS on Server 2k. The site consists of several static content pages, VBScript, classic ASP, and .NET, all intertwined with ExpressionEngine pages. It's a mess, to say the least. To top it off, Helicon Tech's ISAPI_Rewrite module version 3 is running on the server for .htaccess usage. No sub-expressions, groupings, etc. were being processed. The index.php rule was getting bypassed as well.
This all said, I ended up with the following, which handled everything I needed.
RewriteRule ^index.php/segment/blog/view/([^_]*)_+(.*)$ http://www.site.com/index.php/segment/blog/view/$1-$2 [R,NC]
RewriteRule ^index.php/segment/blog/view/([^_]*)$ http://blogs.site.com/segment/article/$1 [R=301,I,L,U]
RewriteRule ^segment/blog$ http://blogs.site.com/segment [R=301,NE,NC,L]
RewriteRule ^/segment/blog$ http://blogs.site.com/segment [R,I,L]

Related

Subdomain in .htaccess file only works with index

I have a subdomain setup in my .htaccess, which only seems to work with the default index.html page. I'd LIKE it to work for ANY page in the folder corresponding to the subdomain. Edited for privacy, assume my domain is example.org. The pertinent parts of the file look like this...
#subdomain
RewriteCond %{HTTP_HOST} ^subname\.example\.org$ [OR]
RewriteCond %{HTTP_HOST} ^www\.subname\.example\.org$
# (a few lines added by my hosting company deleted -- see below)
RewriteRule ^/?$ "http\:\/\/example\.org\/subname\/" [R=301,L]
So the result of the above is that if I have an index.html page in my 'public-html' (root?) folder for http://example.org, and a different index.html stored in a sub-folder (with the same name as the subdomain), I get this expected result, which works:
browse to: http://example.org results in viewing http://example.org/index.html
browse to: http://subname.example.org results in viewing http://example.org/subname/index.html
Great so far. This is what I expected when I created the domain name. However, given a specific file myfile.html stored in the subname folder, I would expect this to work also, and it doesn't...
browse to: http://subname.example.org/myfile.html results in a 404 error.
This despite the fact that browsing to http://example.org/subname/myfile.html works fine. In that case myfile.html is displayed. So is there anything I can do to modify the subdomain code to get the result I'm looking for? Namely, browsing to http://subname.example.org/ANYFILE should work as well as browsing to http://example.org/subname/ANYFILE, regardless of what 'ANYFILE' is. This, after all, is one of the main reasons I set up the subdomain to begin with!
Note: I confess that I relied on my hosting company's cPanel utility to create the subdomain code, so I asked their tech support for help first. Long story short, they didn't help. Maybe what I hoped for is not actually possible?
Also, the lines I deleted from the code had to do with something called "well-known/acme-challenge", added by my hosting company at some point. Since removing them had no effect on the behavior I've described, I left them out to avoid clouding the issue.
RewriteRule ^/?$ "http\:\/\/example\.org\/subname\/" [R=301,L]
This only "redirects" the document root. To redirect all URLs you need to change the above to read something like:
RewriteRule (.*) http://example.org/subname/$1 [R=301,L]
The $1 backreference refers to the URL-path captured in the RewriteRule pattern, i.e. (.*).
No need to backslash-escape the colons, slashes and dots in the substitution string (that's typical of cPanel).
Also, the lines I deleted from the code had to do with something called "well-known/acme-challenge", added by my hosting company at some point.
Those lines will likely be required when the (Let's Encrypt?) SSL cert auto-renews. (Although the above redirects to "http" - are you not using HTTPS?)
UPDATE:
RewriteCond %{HTTP_HOST} ^subname\.example\.org$ [OR]
RewriteCond %{HTTP_HOST} ^www\.subname\.example\.org$
Just as an aside, these two conditions could be reduced to a single condition if you wanted. For example, the above is equivalent to:
RewriteCond %{HTTP_HOST} ^(www\.)?subname\.example\.org$
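Putting the pieces of this answer together, the whole block might look something like the sketch below. The ACME exclusion is an assumption standing in for the hosting company's deleted lines, whose exact form is not shown in the question; the hostnames are the placeholders used above.

```apache
RewriteEngine On

# Match the subdomain with or without the www. prefix.
RewriteCond %{HTTP_HOST} ^(www\.)?subname\.example\.org$ [NC]
# Assumption: leave certificate-validation requests alone (illustrative pattern).
RewriteCond %{REQUEST_URI} !^/\.well-known/acme-challenge/
# Redirect every path, carrying the captured URL-path across via $1.
RewriteRule ^(.*)$ http://example.org/subname/$1 [R=301,L]
```

With this in place, a request for http://subname.example.org/myfile.html should be redirected to http://example.org/subname/myfile.html rather than returning a 404.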

Eliminate duplicate directory name in url using .htaccess file

For some reason, I'm getting duplicate directory names in some urls within a subfolder on our website. This seems to affect only crawlers as the files within this directory work fine when navigated.
I'd like to simply remove the duplicate directory name and make mydomain.com/sub/sub redirect to mydomain.com/sub.
I've tried many versions but my .htaccess skills are lacking apparently. I currently have (not working of course):
RewriteRule ^mydomain.com/sub/sub/(.*) mydomain.com/sub/$1 [L,R=301]
RewriteRule ^mydomain.com/sub/sub/(.*) mydomain.com/sub/$1 [L,R=301]
The RewriteRule pattern matches against the URL-path only - you appear to have included (part of) the domain name. Also, mydomain.com in the substitution string is going to be seen as a relative subdirectory.
Assuming you have a limited number of subdirectories where this occurs then to reduce /sub/sub/<something> to just /sub/<something> you would do something like this:
RewriteEngine On
RewriteRule ^sub/sub/(.*) /sub/$1 [R=301,L]
If you have other directives in your .htaccess file, then this needs to go near the top.
First test with 302 (temporary) redirects to avoid potential caching issues. Clear your browser cache before testing.
But to echo @arkascha's comment... the reason why crawlers are finding these URLs in the first place would seem to be a fault in your URL structure/internal links - so this is what ultimately needs to be fixed.
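If the duplication turns up under more than a handful of directories, one possible generalisation is to let a regex backreference detect any repeated first segment. This is a sketch only: it would also collapse directory names that are legitimately repeated, so it is safest scoped to paths you know are affected. It uses a 302 in line with the testing advice above.

```apache
RewriteEngine On

# \1 must repeat the first path segment exactly, e.g. sub/sub/page.html.
# 302 while testing; switch to 301 once the behaviour is confirmed.
RewriteRule ^([^/]+)/\1/(.*)$ /$1/$2 [R=302,L]
```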

Htaccess - Detecting the URL

For my family members I was giving each person their own subdomain
(sister1.mydomain.com, sister2.mydomain.com, etc...)
I was using PHP to detect the domain, and then I'd load information related to the subdomain dynamically.
I'd like to get rid of the subdomains and use the power of .htaccess
My goal is to give the same URL:
www.mydomain.com/sister1
www.mydomain.com/sister2
www.mydomain.com/mommy
www.mydomain.com/daddyo
Obviously, I don't plan to have literal working directories for each person.
I'd pass the "sister1" portion to a process.php script that takes care of the rest.
I've figured out how to do it by manually typing each RewriteRule in my htaccess file:
Options +FollowSymLinks
AddDefaultCharset UTF-8
RewriteEngine on
RewriteBase /
RewriteRule ^/?sister1$ process.php?entity=sister1 [L]
RewriteRule ^/?sister2$ process.php?entity=sister2 [L]
RewriteRule ^/?mommy$ process.php?entity=mommy [L]
RewriteRule ^/?daddyo$ process.php?entity=daddyo [L]
I feel this is the long way of doing it.
Is there a more universal way of extracting the text after the first "/" forwardslash, and passing it to process.php?entity=$1 ?
I tried it this way:
RewriteRule ^/([A-Za-z0-9-]+)/?$ process.php?entity=$1 [NC,L]
I'm getting the Apache 404 error: "Not Found".
It is because you have a mandatory / in the beginning of your rule, i.e., you are always looking for something like /sibling in the URL. Your first examples have that first forward slash as optional due to the question mark after it.
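You do not need a leading forward slash: in .htaccess, the rewrite rule normally matches against the part of the URL after the domain name and its slash, for example the string/mod/rewrite/gets/is.here part of: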
www.example.com/string/mod/rewrite/gets/is.here
So just remove the starting slash and it should work.
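Applied to the rule from the question, the fix might look like the sketch below. The two RewriteCond lines are an addition not present in the question; they are there on the assumption that real files and directories should still be served directly.

```apache
RewriteEngine On
RewriteBase /

# Assumption: let existing files and directories through untouched.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# No leading slash in the pattern; the optional trailing slash is kept.
RewriteRule ^([A-Za-z0-9-]+)/?$ process.php?entity=$1 [L]
```

Because the character class has no dot in it, a request for process.php itself does not match the pattern, so the rule does not rewrite its own target.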

redirect rule for multiple pages

I have page-not-found errors in webmaster because of the page/2/0 part of a URL, which came from the Smart Paging module's clean URL feature (Drupal). I have now uninstalled the Smart Paging module, but these page-not-found errors are still there:
www.mysite.com/a/b/c/page/2/0
www.mysite.com/a/d/e/page/3/0
www.mysite.com/a/f/g/page/4/0
www.mysite.com/a/h/i/page/5/0
and so on.
I want to redirect
www.mysite.com/a/b/c/page/2/0 to www.mysite.com/a/b/c
www.mysite.com/a/d/e/page/3/0 to www.mysite.com/a/d/e
www.mysite.com/a/f/g/page/4/0 to www.mysite.com/a/f/g
www.mysite.com/a/h/i/page/5/0 to www.mysite.com/a/h/i
with one redirect rule. How can I do this?
In short, I want to remove the page/x/0 part from the URL and redirect to the remaining part of that URL.
You want something like this:
RewriteEngine On
RewriteRule ^/?([^/]+)/([^/]+)/([^/]+)/page/[0-9]+/0$ /$1/$2/$3 [L,R=301]
The pattern captures the first 3 path nodes (/anything/anything/anything/), which must then be followed by the literal page, then some digits, then a zero. You can make it even more general, accepting anything after "page", by changing the pattern to:
RewriteEngine On
RewriteRule ^/?([^/]+)/([^/]+)/([^/]+)/page/ /$1/$2/$3 [L,R=301]
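Both rules above assume exactly three path segments before page. If the depth can vary, a single greedy capture is one way to drop that assumption. This is a sketch, assuming the rule is placed ahead of any other rewrite rules in the same .htaccess file so that it is evaluated first.

```apache
RewriteEngine On

# (.+) captures everything before /page/<digits>/0, however many segments deep.
RewriteRule ^/?(.+)/page/[0-9]+/0$ /$1 [R=301,L]
```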

htaccess RewriteRule for directory is changing the URL in an undesirable way

EDIT
After a comment from Seth below, and heading to a helpful Apache page here, I have found that VirtualHosts are the way to go for the following issue.
/edit
--ORIGINAL POST--
First, a little background on file setup. I am running a LAMP server that hosts multiple domains. I have staging and live sites on this server, under different directories under the web root.
examples
/webroot/live/site1/[public files]
/webroot/live/site2/[public files]
/webroot/stage/site1/[public files]
/webroot/stage/site2/[public files]
The domains for each of these go to the IP of the server, which points at the webroot directory. I have an .htaccess file there to load the appropriate content based on the http_host.
examples
RewriteCond %{HTTP_HOST} ^www\.site1-live\.com [NC]
RewriteRule ^(.*)$ /live/site1/$1 [PT,L,QSA]
RewriteCond %{HTTP_HOST} ^www\.site1-stage\.com [NC]
RewriteRule ^(.*)$ /stage/site1/$1 [PT,L,QSA]
These work great for hitting the home page and any of the internal pages, even with the specific pages being like site1-live.com/view/123. Each site's htaccess handles those.
My issue (sorry it took so long to get here):
When I head to any subdirectory within a site, like www.site1-live.com/rss, the content loads just fine, but the URL changes to something like the following
http://www.site1-live.com/live/site1/rss/
Essentially showing the path from the webroot to the files.
How can I avoid this? I obviously want the url to remain www.site1-live.com/rss. Do I need an htaccess file inside the rss directory to block this somehow?
Thanks in advance!
Replace ^www with ^(.*),
then put the whole URL in the second line: www.yourdomain.com/live/...
Doug,
why do you need the QSA flag?
Anyway, what is happening is that mod_dir (or whatever is serving your directories) is redirecting www.site1-live.com/rss (without the trailing /) to the equivalent URL with the trailing /, and that redirect exposes the rewritten path.
If you don't use mod_alias or something like that on the rewritten URLs, removing the PT should work as you expect.
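For comparison, the VirtualHost approach mentioned in the edit at the top of this question could be sketched roughly as follows. The hostnames and paths come from the examples in the question; the port and the choice of one block per hostname are assumptions. These blocks belong in the main server configuration, not in .htaccess, and depending on the Apache version a matching <Directory> access block may also be needed.

```apache
# One block per hostname, each with its own document root.
<VirtualHost *:80>
    ServerName www.site1-live.com
    DocumentRoot /webroot/live/site1
</VirtualHost>

<VirtualHost *:80>
    ServerName www.site1-stage.com
    DocumentRoot /webroot/stage/site1
</VirtualHost>
```

Because each hostname then has its own document root, no rewrite into /live/site1/ is involved, so a trailing-slash redirect for /rss has no internal path to expose.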
