Stop Alexa from collecting visitor stats - statistics

Is there a way to stop Alexa from collecting my visitor stats? I don't want everyone to be able to look up my domain on Alexa and see the overall user statistics, which countries most of my visitors come from, or other information about them.
For example, when we do not want crawlers to reach some of our pages and want to limit indexing of them, we use nofollow in a meta tag or in the rel attribute of links.
But what can we do about Alexa's robots? Is there a way to limit Alexa too?

Alexa gets its information from browser toolbars that users install on purpose or as part of a bundle with other software. For some browsers the toolbar modifies the User-Agent string, so you can check whether a user is reporting to Alexa using
<?php if (preg_match('/Alexa/i', $_SERVER['HTTP_USER_AGENT'] ?? '')) { echo 'This is Alexa Reporter :)'; } ?>
But you cannot prevent Alexa from getting this information. Even if you redirect all these people somewhere else, Alexa will still know the referring domains, search keywords, etc.
Here is an article that provides more information about Alexa Rank - http://netberry.co.uk/alexa-rank-explained.htm

Related

How to check if a site user is real (without every time showing him a captcha)?

I want to count page views and/or users on my site.
How to exclude bot (or otherwise fraudulent) views from the count?
I want to make it highly secure, so that it would be very difficult to write a bot that significantly tampers with the statistics.
My ideas of solutions:
Use Google Analytics API (does it have such an API?)
Show a captcha before showing the page (very disruptive for the user experience)
You can verify reCAPTCHA server-side and get the user's "botness" scored. Simo Ahava has a great guide on implementing this.
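
As a rough sketch of the score-based approach: reCAPTCHA v3's siteverify endpoint returns JSON with a `success` flag and a `score` between 0.0 and 1.0, where higher means more likely human. The helper below only decides whether to count a view from that already-parsed response. The function name `should_count_view` and the 0.5 threshold are assumptions made for illustration and are not taken from the guide mentioned above.

```python
def should_count_view(verify_result: dict, min_score: float = 0.5) -> bool:
    """Decide whether a page view should be counted.

    `verify_result` is assumed to be the parsed JSON returned by
    reCAPTCHA v3's siteverify endpoint. The 0.5 threshold is an
    arbitrary starting point, not a recommended value.
    """
    if not verify_result.get("success", False):
        # Token missing, expired, or invalid: do not count the view.
        return False
    score = verify_result.get("score")
    if not isinstance(score, (int, float)):
        # No score in the response, so there is no "botness" signal to use.
        return False
    return score >= min_score


# Hypothetical responses, shaped like siteverify output.
print(should_count_view({"success": True, "score": 0.9}))
print(should_count_view({"success": True, "score": 0.1}))
print(should_count_view({"success": False, "error-codes": ["invalid-input-response"]}))
```

Keeping the decision separate from the HTTP call makes the threshold easy to tune once you see how scores are distributed for your own traffic.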

Get Google Search Results Content

I want to get or buy Google search results content (structured) from Google itself or from any other source that can sell Google data legally. For example, I want all results for a specific keyword over the most recent 6 months.
It would be a good workaround if, at this stage, I could get just the page content as raw text.
Automated reading or scraping of Google SERPs is against Google's ToS. From this point of view, nobody sells such data legally: any seller violates Google's ToS.
There are many offers on the market where you can get SERP data as JSON or full HTML through API access. Just google for it.
Every seller does SERP scraping the same way, and you can do it on your own. Run many proxies with IP addresses from the countries whose SERPs you need, and query Google with some kind of headless browser. Use captcha-solving services to keep getting data even if an IP gets banned. Multithread your queries to get more data at once. That's the whole magic.

How does Mixpanel's Search Keyword work?

I'm curious about how Mixpanel tracks which Search Keywords an event is affiliated with. Is this from organic search (vs. paid search ads)?
If yes, how do they do it? At a glance, I guess organic search works this way:
The search result link goes to a proxy link with some query parameters that contain the (encrypted) search term and the real destination link.
The proxy redirects to the real destination link.
Google Analytics knows the organic search keyword used in a session because Google intercepts it at that midpoint. I'm not sure if there's any way for someone outside of Google (including Mixpanel) to intercept that info. Right? (Correct me if I'm wrong.)
If there is a way for the destination website to know the organic search keyword, can I be enlightened on the method?
I don't think this is coming from organic search or paid ads, for a couple of reasons:
Most organic traffic is now over HTTPS, which makes it hard to get the search parameters. Google shows this data through the Webmaster Tools console, which is able to grab keyword data in a different way (I assume through the Google backend and not the URL itself). Otherwise, you are stuck with the "Not Provided" issue in Google Analytics.
Mixpanel only captures the default UTM parameters: utm_campaign, utm_source, utm_term, utm_medium and utm_content. Mixpanel also names these properties as you would expect: UTM Medium, UTM Source, etc.
I can't tell from your screenshot, but it seems this might be a custom property that your Mixpanel setup is setting, perhaps from an internal search engine? Or perhaps you're grabbing a custom URL query parameter?
Can you provide more information as to how this event is being captured?
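
To illustrate how a keyword can reach an analytics tool through the landing-page URL rather than through the search engine, here is a minimal sketch that pulls the standard UTM parameters (utm_source, utm_medium, utm_campaign, utm_content, utm_term) out of a URL. The function name `extract_utm` and the example URL are hypothetical, and this is not Mixpanel's own code.

```python
from urllib.parse import urlparse, parse_qs

# The five standard UTM parameters.
UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term")


def extract_utm(url: str) -> dict:
    """Return the UTM parameters present in `url` (first value of each)."""
    query = parse_qs(urlparse(url).query)
    return {key: query[key][0] for key in UTM_KEYS if key in query}


# Hypothetical landing-page URL; `q` stands in for an internal search parameter.
landing = ("https://example.com/landing"
           "?utm_source=newsletter&utm_medium=email&utm_term=running+shoes&q=internal")
print(extract_utm(landing))
```

A parameter such as `q` is ignored here, which is why a keyword coming from an internal search box would only show up if the tracking setup sends it explicitly as a custom property.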

How does tracking of the web traffic source work?

This may be a stupid question, but I can't find an answer to it anywhere on the web.
In Google Analytics it is possible to check the origin of a connection to our website. My question: how can Google track the origin of those connections?
If there is a value in document.referrer (for the JavaScript tracker; with the Measurement Protocol you have to pass the referrer as a parameter), Google identifies the source as a referral, unless the referring site is configured (in the defaults or via custom settings) as a search engine (which is really just a referrer with a known search parameter). Via the settings you can also exclude URLs from the referrer reports so that they appear as direct traffic.
If there are campaign parameters, Google uses those (or else a Google click ID (gclid) from auto-tagging in AdWords, which serves a similar purpose). If the campaign parameters or the gclid are stripped out (e.g. by redirects), AdWords ad clicks will be reported as organic search.
If there is no referrer and there are no campaign parameters or gclid (i.e. a direct type-in or a bookmark), Google will identify the source as a direct hit, unless you have clicked an AdWords ad before. In that case the acquisition report will show the source as CPC (cost per click); as Google puts it, they use the last known marketing channel as the source, and direct is not a marketing channel according to Google. However, the multi-channel reports will identify those visits more correctly as direct (which is why multi-channel and acquisition reports usually do not quite match).
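
The decision order described above can be sketched roughly as follows. This is a simplified illustration, not Google's actual implementation: the function name `classify_source`, the channel labels, and the short list of search engine hosts are all assumptions.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs

# Assumed, heavily abbreviated list of search engine host fragments.
SEARCH_ENGINE_HOSTS = ("google.", "bing.com", "search.yahoo.com")


def classify_source(landing_url: str, referrer: str,
                    last_channel: Optional[str] = None) -> str:
    """Classify a hit as cpc, campaign, organic, referral or direct."""
    params = parse_qs(urlparse(landing_url).query)

    # 1. A gclid or campaign parameters take priority over the referrer.
    if "gclid" in params:
        return "cpc"
    if "utm_source" in params:
        return "campaign"

    # 2. Otherwise use the referrer: known search engines count as organic.
    if referrer:
        host = urlparse(referrer).netloc.lower()
        if any(fragment in host for fragment in SEARCH_ENGINE_HOSTS):
            return "organic"
        return "referral"

    # 3. No referrer and no campaign data: fall back to the last known
    #    marketing channel if there is one, otherwise report direct.
    return last_channel or "direct"


print(classify_source("https://example.com/?gclid=abc", "https://www.google.com/"))
print(classify_source("https://example.com/", "https://www.google.com/"))
print(classify_source("https://example.com/", "", last_channel="cpc"))
```

The second call shows the stripped-gclid case: without the click ID, the same ad click is indistinguishable from an organic search visit.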

Why does Google Analytics show fewer visits than One&One stats?

Comparing Google Analytics results to the One&One hosting monthly statistics shows a huge discrepancy.
For last month:
Google shows 1046 visits.
One&one stats show 15304 unique visits.
The Google code is in the footer, which appears on every page.
I'm aware GA only works with JS enabled, but can I really assume there are that many non-JS users?
Google Analytics is a good indicator of how many humans are visiting your website.
Here are some things to check:
How many bots are in your monthly stats? You can usually find something labeled User-Agent on your stats page. Googlebot, Slurp, msnbot and others will be visiting every page on your site.
Make sure you've read Google Analytics' definition of a visit.
Make sure you've read what your statistics provider means by "unique visit". Does that mean unique visitor, page view or something else?
Raw hits on servers can be misleading for a number of reasons:
If you have external style sheets, JavaScript files, etc., each of them could be counted as a hit in the web server log.
RSS feed readers will periodically update without being asked to by a human.
Check the page views in Google Analytics. It's possible that 1&1 is tracking unique page views instead of actual visits.
Google Analytics works for almost all users (I believe less than 5% have JS disabled). I had the same discrepancy; in my case the difference disappeared once I took the bots into account (server-side statistics often count them, as they produce HTTP requests). You probably have the same "problem".
Neither set of stats is wrong; they just count different things. Google Analytics is the more "accurate" one, i.e. its numbers are the ones you want to look at. The hosting stats, which look only at HTTP requests, often without filtering, are less interesting.
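
To get a feel for how much of the server-side total comes from bots and static files, you could split raw access-log lines into rough buckets. The sketch below assumes the log is in Combined Log Format; the regular expressions, the bucket names, and the sample lines are illustrative assumptions, not how 1&1 or Google Analytics count anything.

```python
import re
from collections import Counter

# Rough, assumed patterns for crawler user agents and static asset paths.
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)
STATIC_PATTERN = re.compile(r"\.(css|js|png|jpe?g|gif|ico)(\?|$)", re.IGNORECASE)
# Combined Log Format: ... "METHOD path PROTO" status size "referer" "user-agent"
LINE_PATTERN = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count log lines as bot, static, page, or unparsed."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.search(line)
        if not match:
            counts["unparsed"] += 1
        elif BOT_PATTERN.search(match["agent"]):
            counts["bot"] += 1
        elif STATIC_PATTERN.search(match["path"]):
            counts["static"] += 1
        else:
            counts["page"] += 1
    return counts


# Made-up sample lines for illustration.
sample = [
    '66.249.66.1 - - [10/Oct/2012:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.4 - - [10/Oct/2012:13:56:01 +0000] "GET /feed.xml HTTP/1.1" 200 900 "-" "Yahoo! Slurp"',
    '203.0.113.7 - - [10/Oct/2012:13:57:12 +0000] "GET /style.css HTTP/1.1" 200 512 "http://example.com/" "Mozilla/5.0 (Windows NT 6.1) Firefox/15.0"',
    '203.0.113.7 - - [10/Oct/2012:13:57:11 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/15.0"',
]
print(summarize(sample))
```

Only the "page" bucket is roughly comparable to what a JavaScript tracker would see, and even that still includes bots that do not identify themselves.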
Blogger, and probably other sites, serve a different page template or skin to mobile visitors. In my case, that template didn't contain the Google Analytics snippet, so those hits went uncounted until I noticed and fixed it.
