Help with 301 redirects on outgoing links from my site - .htaccess

I work for a company that links out to partners through a third-party website that tracks the clicks. So, for example, on our site there will be an outgoing link something like this (names changed to protect my work):
<a href="link.php?link=chuckecheese">check it out kids!</a>
If you go into link.php, you'll see I define the link there:
$outlink['chuckecheese'] = "http://partners.linktrackingisprettycool.com/x/212/CD1/$STAMP";
$STAMP is a timestamp and is replaced with, say, "12-25-09-1200" for noon on Christmas.
When a user clicks on this link, he goes to www.chuckecheese.com
This all works fine, but it isn't as good for SEO purposes as it could be. I want to make it so that search engines will see it as a link to chuckecheese.com, which helps our partners' PageRank and is more honest.
I'm in .htaccess trying to make up rewrite rules but I'm confused and don't know exactly how it's done. I tried:
RewriteRule http://www.chuckecheese.com$ link.php?link=chuckecheese$ [QSA]
But this doesn't seem to work. What should I try next?
Thanks in advance for any help. You guys on here are always awesome, and I appreciate the part that the good people at Stack Overflow play in me remaining employed.

You can't use a rewrite rule to redirect the user here. Rewrite rules only apply to requests that your own web server processes, and the outgoing link points at a third-party host.
You might try using some JavaScript to achieve this: the href points to chuckecheese.com, but on click you change document.location to the tracking URL you actually want to hit.
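For example, a rough sketch of that idea, reusing the link.php?link=chuckecheese tracking URL from your question (crawlers read the plain href, while a normal click runs the handler instead):
<!-- the href is the real destination; the onclick sends real visitors through the tracker -->
<a href="http://www.chuckecheese.com"
   onclick="window.location = 'link.php?link=chuckecheese'; return false;">check it out kids!</a>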
Edited question for bounty
What you could do is pre-process your links based on the user agent of the browser. So when the user agent is Googlebot (one of the strings below), you display the real URL of http://www.chuckecheese.com.
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Googlebot/2.1 (+http://www.google.com/bot.html)
When the user agent is not Googlebot, you display the link that does the traffic analytics.
You can find a list of user agents at the following URLs:
http://www.useragentstring.com/Googlebot2.1_id_71.php
http://www.user-agents.org/
http://www.pgts.com.au/pgtsj/pgtsj0208c.html
If Googlebot isn't sending the expected user agent (or it changes in the future), Google recommends you do a reverse lookup against the IP address. This will be a small performance hit.
You can verify that a bot accessing your server really is Googlebot by using a reverse DNS look up, verifying that the name is in the googlebot.com domain, and then doing a forward DNS look up using that googlebot name. This is useful if you're concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot. -Google
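A rough sketch of that verification in PHP, using the standard gethostbyaddr()/gethostbyname() functions; the googlebot.com/google.com host check is an assumption based on Google's description above, not an official API:
// Returns true if the requesting IP reverse-resolves to a googlebot.com/google.com
// host AND that host resolves forward to the same IP, per Google's recommendation.
function isRealGooglebot($ip)
{
    $host = gethostbyaddr($ip);      // reverse DNS lookup
    if ($host === false || $host === $ip) {
        return false;                // no PTR record
    }
    // verified crawler hosts end in googlebot.com or google.com
    if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }
    // forward-confirm: the name must resolve back to the original IP
    return gethostbyname($host) === $ip;
}
// Example: isRealGooglebot($_SERVER['REMOTE_ADDR']);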
Edited for further explanation
Assuming you are using PHP, you generate the link at runtime. Here is some code I whipped up.
function getRealURL($url)
{
    // adjust this regex to match the pattern of your traffic-analysis URLs
    // (preg_match instead of the deprecated ereg)
    if (preg_match('/link=(.+)$/', $url, $matches))
    {
        // adjust this so the URLs come out correctly
        return "http://www." . $matches[1] . ".com";
    }
    else
    {
        return $url;
    }
}
function isGoogle()
{
    // exact match against the known Googlebot user-agent strings;
    // a substring check or a reverse DNS lookup is more robust if these ever change
    switch ($_SERVER['HTTP_USER_AGENT'])
    {
        case 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)':
        case 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)':
        case 'Googlebot/2.1 (+http://www.google.com/bot.html)':
            return true;
        default:
            return false;
    }
}
function showLink($trackingUrl)
{
    // Googlebot sees the real destination URL; everyone else gets the
    // traffic-analysis link, so clicks are still tracked
    if (isGoogle())
    {
        return getRealURL($trackingUrl);
    }
    else
    {
        return $trackingUrl;
    }
}
<html>
...
Come eat pizza at <a href='<?=showLink("link.php?link=chuckecheese")?>'>chuck e cheese!</a>
...
</html>
I doubt Google would care about something like this since both links go to the same place.
But check the TOS to be sure.
http://www.google.com/accounts/TOS

One of your assumptions is not right. You say:
"I want to make it so that search engines will see it as a link to chuckecheese.com, which helps our score when people search for chuck e cheese because we'll be seen as linking right to them."
If this really helped SEO-wise, everybody would spam links to all the great sites just to get PageRank, and the game would be too easy. The beneficiary of a link is the recipient page/site, not the sender.

Hey PG... linking out to other websites will not give you any further PageRank, just as having your AdWords ads appear on a thousand other sites will not give you PageRank. And yes, your partners do benefit from having you link to them. As for the benefits of being open that you speak of: from my understanding of what you have written, it is just another fancy redirect, and Google knows that.

Related

remove utm_* from URL and keep GA tracking

I had to add some ugly links to my website that look like "/page/?utm_source=value&utm_source=value2&utm_source=value3....", and I would like to keep the URLs clean. I found these directives and tried them; the redirect works, but the tracker doesn't, because the parameters aren't passed to GA.
RewriteCond %{QUERY_STRING} ^((.*?)&|)utm_
RewriteRule ^(.*)$ /$1?%2 [R=301,NE,L]
Is there another solution for this?
Thanks
It seems there is no clean way to do this. I checked a couple of big websites that use utm_campaign, and the parameters stay in the URL after clicking the link, so even big websites consider this fine.
But if one really wants to remove them, it can be done manually. Assuming that you have a canonical address (which you should have if you care about SEO), see this PHP pseudocode:
function cleanupUtm($currentCanonicalUrl)
{
    $currentUrl = $_SERVER['REQUEST_URI'];
    // Check whether there are any "utm_" parameters in the current URL
    if (strpos($currentUrl, 'utm_') === false) {
        return;
    }
    // Redirect to the clean canonical URL, dropping the utm_ parameters
    redirect($currentCanonicalUrl);
}
function redirect($location)
{
    // Some helper redirect function, see https://stackoverflow.com/a/768472/1657819
    // 301 so search engines remember the clean URL as the permanent one
    header("Location: " . $location, true, 301);
    die();
}
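A hypothetical usage sketch (the page path and canonical URL are made up): each page calls cleanupUtm() with its own canonical URL before any output is sent, since header() only works before output starts.
// at the very top of /page/index.php, before any HTML is echoed
cleanupUtm('https://example.com/page/');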
But obviously this triggers an extra redirect and page reload.
Alternative
As an alternative, you may consider using Google Tag Manager to replace the long, boring campaign URL with a neat URL like https://example.com/#instagram. See for example this article.
While it looks neat (and much better than a long utm... string), I feel that it gives you less flexibility (or you will end up with hundreds of tags for all the cases you need).
Google Analytics removes the UTM parameters from the pages report by default, but if you want to remove any other parameter
you can go to Admin > View Settings > Exclude URL Query Parameters.
You can learn a bit more about it here:
https://support.google.com/analytics/answer/1010249?hl=en
Hope this helps

how to prevent a country from accessing my website

I have a French website and I get a lot of Russian visitors (bots?) that I don't want in my GA stats.
So first I tried to block Russian visitors from accessing my website via .htaccess, but the Russian visitors kept showing up.
Then I tried to allow only French visitors, still with an .htaccess solution:
<Limit GET HEAD POST>
order deny,allow
allow from 2.0.0.0/12
allow from 2.16.2.0/23
allow from 2.16.9.0/24
...
deny from all
</Limit>
Unfortunately I still see Russian visitors in my GA. How is that possible?
I don't know what to do anymore, please shed some light!
Thx
Your .htaccess solution may be failing to filter out these spam results because the Russian hits in question may not be "real".
Check out this article on referral spam: http://www.analyticsedge.com/2014/12/removing-referral-spam-google-analytics/
There is a nice description of "ghost referrals" that directly manipulate GA to post fake page views by using random GA numbers (you may even find that your GA number appears in the referral link for these hits when you view acquisition > all traffic > referrals).
Unfortunately, if this is the case then .htaccess will not do anything for you and you will need to use GA filters to filter out these results going forward.

Trying to create seo friendly url

I'm trying to create friendly URLs for my site, but with no success :(( and I have two questions.
The first is:
How do I change the URL from domain.com/page/something.php to domain.com/something?
And the second is:
Will the changes create duplicate content, and if so, how do I fix the problem?
Thank you for your time,
Have a nice day
Check out the official URL Rewriting guide: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
You'll be using the simplest use case, Canonical URLs. Assuming you have no other pages that you need to worry about, you can use a rule like this: (note: untested, your usage may vary)
RewriteRule ^/(.*)([^/]+)$ /$1$2.php
While that example might not exactly work for your use case, hopefully reading the guide and my example will help get you on your way.
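If you specifically want domain.com/something to serve /page/something.php from a per-directory .htaccess file, a hedged sketch might look like this; the page directory name is taken from your question, the flags are assumptions, and the 301 rule is one way to handle your second question by collapsing the old .php URLs onto the clean ones:
RewriteEngine On
# send old /page/something.php requests to the clean /something URL,
# so search engines only ever see one URL per page (avoids duplicate content)
RewriteCond %{THE_REQUEST} \s/page/([^/.]+)\.php[\s?]
RewriteRule ^ /%1 [R=301,L]
# internally serve /something from /page/something.php (the URL in the browser stays clean)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/.]+)/?$ page/$1.php [L]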

Add .html when rewriting URL in htaccess?

I'm in the process of rewriting all the URLs on my site that end with .php and/or have dynamic URLs so that they're static and more search engine friendly.
I'm trying to decide if I should rewrite file names as simple strings of words, or if I should add .html to the end of everything. For example, is it better to have a URL like
www.example.com/view-profiles
or
www.example.com/view-profiles.html
???
Does anyone know if the search engines favor doing it one way or another? I've looked all over Stack Overflow (and several other resources) but can't find an answer to this specific question.
Thanks!
SEO-optimized URLs should follow this logic (listed by priority):
unique (1 URL == 1 resource)
permanent (they do not change)
manageable (1 logic per site section, no complicated exceptions)
easily scalable logic
short
with a targeted keyword phrase
based on this
www.example.com/view-profiles
would be the better choice.
That said:
Google has something I call "dust crawling prevention" (see the paper "Do Not Crawl in the DUST" from this Google researcher: http://research.google.com/pubs/author6593.html), so when Google discovers a URL it must decide whether that specific page is worth crawling.
Google gives URLs ending in .html a "bonus" credit of trust: "this is an HTML page, I probably want to crawl it".
That said: if your site mostly consists of HTML pages with actual textual content, this "bonus" is not needed.
I personally only add .html to HTML sitemap pages that consist solely of long lists, and only if I have a few million of them, as I have seen a slightly better crawl rate on those pages. For all other pages I strictly keep the URL logic mentioned above.
br
franz, austria, vienna
p.s.: please see https://webmasters.stackexchange.com/ for non-programming-related SEO questions

Getting "mywebsite.org/" to resolve to "mywebsite.org/index.php"

At my work we have various web pages that, my boss feels, are being ranked lower than they should be because "mywebsite.org/category/" looks like a different URL to search engines than "mywebsite.org/category/index.php" does, even though they show the same file. I don't think it works this way but he's convinced. Maybe I'm wrong though. I have two questions:
How do I make it so that it will say "index.php" in the address bar for all subcategories?
Is this really how pagerank works?
Besides changing all the links everywhere, a simpler solution is to use a rewrite rule. Make sure it is a permanent redirect, or Google will keep using the old link (without index.php). How you do this exactly depends on your web server, but for Apache HTTPd it looks something like the example given below.
Yes. Or so I've heard. Very few people know for sure. But Google mentions this guideline (as "Be consistent"). Make sure to check out all of Google's Webmaster guidelines.
Apache config for rewrite rule:
# in the generic config
LoadModule rewrite_module modules/mod_rewrite.so
# in your virtual host
RewriteEngine On
# redirect everything that ends in a slash to the same, but with index.php added
RewriteRule ^(.*)/$ $1/index.php [R=301,L]
# or the other way around, as suggested
# RewriteRule ^(.*)/index.php$ $1/ [R=301,L]
Adding this code to the top of every page should also work:
<?php
if (substr($_SERVER['REQUEST_URI'], -1) == '/') {
$new_request_uri = $_SERVER['REQUEST_URI'].'index.php';
header('HTTP/1.1 301 Moved Permanently');
header('Location: '.$new_request_uri);
exit;
}
?>
You don't tell us if you're using straight PHP or some other framework, but for PHP, probably you just need to change all the links on your site to "mywebsite.org/category/index.php".
I think it's possible that this does affect your search engine rank. However, you would be better off using only "mywebsite.org/category" rather than adding "index.php" to each one.
Bottom line is that you need to make sure all your links in your website use one or the other. What actually gets shown in the address bar is unimportant.
A simple solution is to put in the <head> tag:
<link rel="canonical" href="http://mywebsite.org/category/" />
Then, no matter which page the search engine ends up on, it will know it is simply a different view of /category/
And for your second question: yes, it can affect your results if Google thinks you are spamming. If it couldn't, they wouldn't have added support for rel="canonical", although I wouldn't be surprised if they treat somedir/index.* the same as somedir/.
I'm not sure if /category/ and /category/index.php are considered two URLs for SEO, but there is a good chance that it will affect them one way or another. There is nothing wrong with making a quick change just to be sure.
A few thoughts:
URLs
Rather than adding /index.php, you will be better off making it so there is no index.php on any of them, since the keyword 'index' is probably not what you want.
You can make a script that will check if the URL of the current page ends in index.php and remove it, then forward to the resulting URL.
For example, on one of my sites, I require the 'www.' for my domain (www.domain.com and domain.com are considered two URLs for search purposes, though not always), so I have a script that checks each page and, if there is no www., adds it and forwards.
if (APPLICATION_LIVE) {
    if (strtolower($_SERVER["HTTP_HOST"]) != "www.domain.com") {
        header("HTTP/1.1 301 Moved Permanently"); // Recognized by search engines and may count the link toward the correct URL...
        // REQUEST_URI already starts with a slash, so only the scheme and host are prepended
        header("Location: " . "http://www.domain.com" . $_SERVER["REQUEST_URI"]);
        exit();
    }
}
You could modify that to do what you need.
That way, if a crawler visits the wrong URL, it will be notified that it was replaced with the correct URL. If a person visits the wrong URL, they will be forwarded to the correct URL (most won't notice), and then if they copy the URL from the browser to send to someone or link to that page, they will end up linking to the correct URL for that page.
LINKING URLS
The way other pages link to your pages is more important for SEO. Make sure all your in-site links use the proper URL (without /index.php), and that if you have a 'link to this page' feature, it doesn't include the /index.php part. You can't control how everyone links to you, but you can take some control over it, like with the script in item 1.
URL ROUTING
You may also want to consider using some sort of framework or stand-alone URL rerouting scheme. It could make it so there were more keywords, etc.
See here for an example: http://docs.kohanaphp.com/general/routing
I agree with everyone who's saying to ditch the index.php. Please don't force your visitor to type index.php if not typing it could get them the same result.
You didn't say if you're on an IIS or Apache server.
IIS can be set to assume index.php is the default page so that http://mywebsite.org/ will resolve correctly without including index.php.
I would say that if you want to include the default page and force your users to type the page name in the url, make the page name meaningful to a search engine and to your visitors.
Example:
http://mywebsite.org/teaching-web-scripting.php
is far more descriptive and beneficial for SEO rankings than just
http://mywebsite.org/index.php
Might want to take a look at robots.txt files? Not quite the best solution, but you should be able to implement something workable with them...
