Does anyone know whether Google adds domains to its crawl list when they are known only through Gmail email content?
A domain to which no one has ever linked and which was never submitted to Google or DMOZ has turned up in Google results.
Anyone know if they index emailed URLs?
It could be that the domain was publicly listed somewhere after you purchased it. I'm not sure whether Google crawls mail for SERPs, but it would make sense: Gmail is Google's biggest social network, and it could indicate trends, albeit from much more private conversations.
Google is also a registrar. I don't see why they couldn't use that data to find new sites to index.
We had an account on SendGrid and used link branding with a custom subdomain (emails.example.com). We have since switched to another email service, but we still want to support the links in our old emails. Right now those links work, because our DNS still points emails.example.com to sendgrid.net and SendGrid still redirects to the correct URLs. But that leaves us completely dependent on SendGrid, and we don't know whether at some point in the future it will simply stop redirecting these links.
So the question is: how can we redirect old email links to point to our website?
I suspect it's impossible, as the link branding feature replaced not just the domain but also the path in the links.
Maybe someone from SendGrid will have some answer?
Twilio SendGrid developer evangelist here.
I've checked internally and as long as you keep the DNS in place, the redirects will continue to work.
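For reference, the record to keep is the branded-link CNAME described in the question, i.e. a zone entry along these lines (host and target are the ones from the question; confirm the exact target host in your SendGrid settings):

```
; keep this CNAME in place and old branded links keep resolving
emails.example.com.   IN   CNAME   sendgrid.net.
```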
I have a newsletter subscription form on a website running Kentico 9. It uses a simple CAPTCHA that "prompts users to retype a sequence of numbers from an image". For a few months I have noticed many new subscriptions whose email addresses look plausible because of their domains, but whose attached names are just strings of random letters (for example, vPkGNFtUjyxcEQ). I checked some of the addresses on CleanTalk and they were reported as spam.
Is it possible for bots to subscribe to the newsletter even with this kind of captcha? How can I prevent that?
Thanks!
Yes, it is possible for bots to get past those old CAPTCHA forms. You're better off introducing reCAPTCHA v2 or v3 on your site. There is some code on the old Kentico Marketplace that lets you import and use the new reCAPTCHA functionality.
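Whichever reCAPTCHA version you pick, the server-side step is a POST of the user's token to Google's siteverify endpoint. A sketch of that step (endpoint and parameter names are from Google's reCAPTCHA docs; the Node-style usage below is illustrative, since your Kentico handler would do the equivalent in .NET):

```javascript
// Sketch: build the body for a reCAPTCHA siteverify request.
// secret = your server-side reCAPTCHA secret key,
// token = the g-recaptcha-response value posted with the form,
// remoteIp = optional client IP address.
const SITEVERIFY_URL = 'https://www.google.com/recaptcha/api/siteverify';

function buildSiteverifyBody(secret, token, remoteIp) {
  const params = new URLSearchParams({ secret: secret, response: token });
  if (remoteIp) {
    params.set('remoteip', remoteIp);
  }
  return params.toString();
}

// Usage (Node 18+): reject the subscription unless verification succeeds.
// const res = await fetch(SITEVERIFY_URL, {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
//   body: buildSiteverifyBody(process.env.RECAPTCHA_SECRET, token),
// });
// const { success, score } = await res.json(); // v3 also returns a score
```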
I'm using node.js with express.
I want to show a specific banner to users arriving from a Google AdWords campaign. I can use req.get('referer') and see that the visit came from google.com, but how can I tell it's from a campaign? And how can I test this locally?
You should add tracking params to your Google AdWords campaigns, e.g. ?source=google-adwords. You can then check for the existence of this param. This is what tracking systems like Piwik do.
Google AdWords also lets you pass dynamic parameter values, called ValueTrack params, to gain deeper insight into which keywords and ads your traffic came from.
You could link Google AdWords with Google Analytics and enable auto-tagging of your ads, which automatically adds a param named gclid to all of your Google ad links. You could then check for the existence of this param.
The referrer should be your last resort, as it is not very reliable.
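A minimal sketch of the param check in Node (the param names source=google-adwords and gclid match the tagging described above; adCampaignInfo is a hypothetical helper name):

```javascript
// Sketch: classify a request URL as AdWords traffic by its query params.
// gclid comes from auto-tagging; source=google-adwords is the manual tag
// suggested above -- adjust to whatever your campaigns actually use.
function adCampaignInfo(fullUrl) {
  const params = new URL(fullUrl).searchParams;
  if (params.has('gclid')) {
    return { fromAdWords: true, via: 'auto-tagging' };
  }
  if (params.get('source') === 'google-adwords') {
    return { fromAdWords: true, via: 'manual-tag' };
  }
  return { fromAdWords: false };
}

// In an Express handler, reconstruct the full URL first:
// const info = adCampaignInfo(`${req.protocol}://${req.get('host')}${req.originalUrl}`);
```

This also answers the local-testing part: just open http://localhost:3000/?source=google-adwords in a browser to simulate campaign traffic.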
Is it possible to crawl check-in data from foursquare in a greedy way (even without a friend connection to all the users), just like crawling publicly available Twitter messages? If you have any experience or suggestions, please share. Thanks.
If you have publicly available tweets containing links to foursquare, you can resolve the foursquare short links (4sq.com/XXXXXX) by making a HEAD request. The HEAD request returns a URL containing a check-in ID and a signature. You can use those two values to retrieve a check-in object via the foursquare API /checkins/ endpoint. You're only allowed to access 500 of these per hour.
You must abide by both the Twitter and foursquare terms of service -- in foursquare's case, you may not display this information to anyone, nor may you retain any user information for more than 3 hours (since the user has not authorized your application).
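Once the short link resolves, pulling the two values out of the final URL is simple string work. A sketch, assuming the resolved URL has the shape https://foursquare.com/&lt;user&gt;/checkin/&lt;id&gt;?s=&lt;signature&gt; (the exact path layout is an assumption -- inspect a real redirect to confirm):

```javascript
// Sketch: extract the check-in ID and signature from a resolved 4sq.com
// URL, for use with the foursquare /checkins/ endpoint (the signature is
// passed as a query param). Returns null if the URL doesn't look like a
// check-in link.
function parseCheckinUrl(resolvedUrl) {
  const u = new URL(resolvedUrl);
  const parts = u.pathname.split('/');
  const idx = parts.indexOf('checkin');
  if (idx === -1 || idx + 1 >= parts.length || !u.searchParams.has('s')) {
    return null;
  }
  return { checkinId: parts[idx + 1], signature: u.searchParams.get('s') };
}
```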
You can only get the check-in data for a location if the manager of the location gives you OAuth access to your application. If you have that, you can use the real-time API defined here: https://developer.foursquare.com/docs/realtime.html
No, it is not possible to crawl check-in data similar to Twitter. This information is considered personal data and is not public.
You can crawl twitter data for foursquare data =)
Our company uses Google Apps, and I want to find a way to search the All Mail folders of all employees simultaneously: the goal is to return a complete list of emails our company has had to/from a given email address. I am new to the Gmail APIs - is there a way to do what I'm hoping to do? Any advice would be appreciated. Thanks!
I am a little worried about the ethics of doing this, and I imagine it would be a concern for your domain's users too, but it is technically possible.
I believe this is the kind of service that Postini could provide for you.
Alternatively, you can use 2-legged OAuth in conjunction with Gmail IMAP. This would allow you to programmatically iterate through your domain's users, log in to IMAP for each, and search for the email address. See Gmail IMAP and SMTP using OAuth.
This may sound insecure, but to enable this behaviour in the first place you would need Google Apps administrator access to your domain (to enable OAuth access and acquire the domain's consumer secret). See also OAuth: Managing the OAuth key and secret.
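For the IMAP search itself, the criterion for "to or from a given address" is a standard OR clause. A sketch of just that piece (the OAuth handshake and IMAP transport are left to a client library; "[Gmail]/All Mail" is Gmail's English-locale folder name):

```javascript
// Sketch: build the IMAP SEARCH criterion matching mail to OR from one
// address, to be sent after SELECT "[Gmail]/All Mail". Quotes and
// backslashes in the address are escaped per IMAP quoted-string rules.
function allMailSearchCriterion(address) {
  const quoted = '"' + address.replace(/(["\\])/g, '\\$1') + '"';
  return `OR FROM ${quoted} TO ${quoted}`;
}
```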