I'm building a user authentication system in Node.js and use a confirmation email to verify that a new account is real.
The user creates an account and is prompted to check their email for a URL they click to verify the account.
It works great, no issues.
What's unusual is that in testing, when I email myself (to simulate the new-user process) and click the verification URL, two more connections hit the endpoint almost immediately afterward. Upon inspection, the source IPs appear to belong to Google. What's even more interesting is that the user-agent strings are random versions of Chrome.
Here's an example of the last sequence. The first entry is the HTTP 200 request; the next two, the HTTP 400s, are Google. (Upon verification I remove the user's verification code from the database, so subsequent requests return HTTP 400.)
162.158.78.180 - - [03/Jul/2020:20:35:40 +0000] "GET /v1/user/verify/95a546cf7ad448a18e7512ced322d96f HTTP/1.1" 200 70 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36" "hidden.com" "72.191.192.163" "US" "en-US,en;q=0.9"
162.158.187.117 - - [03/Jul/2020:20:35:43 +0000] "GET /v1/user/verify/95a546cf7ad448a18e7512ced322d96f HTTP/1.1" 400 28 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36" "hidden.com" "74.125.210.22" "US" "en-US,en;q=0.9"
162.158.187.117 - - [03/Jul/2020:20:35:43 +0000] "GET /v1/user/verify/95a546cf7ad448a18e7512ced322d96f HTTP/1.1" 400 28 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36" "hidden.com" "74.125.210.8" "US" "en-US,en;q=0.9"
Now, I'm using Cloudflare, so the first IP address is a Cloudflare address, but the second one you see is the real one [as reported by Cloudflare] ... I modified my "combined" log format in Nginx.
Anyhow, any idea what this is? Or why Google would be doing this?
It's just incredibly suspicious given the use of randomized user agent strings.
And one last note: if I open the Chrome DevTools console and go into the Network tab before I click a verification link from my email, the two follow-up connections never come. It's like Google knows I'm monitoring ... this is just so incredibly odd that I had to ask the community. I'm thinking maybe this is an extension that's infected with some kind of tracking, but how then do the IPs come back as Google?
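For reference, the verify endpoint boils down to something like the following. This is a simplified sketch assuming Express, with an in-memory store standing in for the real database; the names are placeholders, not my actual code:

// Rough sketch of the verify flow, assuming Express; an in-memory Map stands
// in for the real database and the names are placeholders.
const express = require('express');
const app = express();

// code -> pending user record (in the real app this lives in the database)
const pendingVerifications = new Map([
  ['95a546cf7ad448a18e7512ced322d96f', { email: 'user@example.com' }],
]);

app.get('/v1/user/verify/:code', (req, res) => {
  const user = pendingVerifications.get(req.params.code);
  if (!user) {
    // Code already consumed (or never existed): this is why the two
    // follow-up requests in the log above come back as HTTP 400.
    return res.status(400).json({ error: 'invalid or expired verification code' });
  }
  pendingVerifications.delete(req.params.code); // one-time use
  res.json({ verified: true });
});

app.listen(3000);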
This is going to sound weird, but I started having the same problem this week and I've been going legitimately crazy trying to figure out what's causing it. Is your desktop a Windows machine? If so, is it running the Adobe update service (adobeupdateservice.exe)?
I have been killing processes and shutting off Chrome extensions one by one, and it finally stopped doing this after I killed the Adobe services. I have two computers with mostly identical software installs and only one will trigger this in Chrome; one difference is that the computer triggering these "cache.google.com" requests has a much newer adobeupdateservice from about two weeks ago.
I have been throwing requests at it all day now and it has finally stopped sending phantom follow-up requests from weird IPs.
Related
For personal learning purposes I set up a small Express web server in Node.js and opened it up to the outside world. It only serves a single static page and has no other endpoints.
Occasionally some bots make random requests, apparently attempting to exploit some common vulnerabilities. These requests are all logged.
There is one type of request that interests me; I have so far been unable to learn what it is trying to achieve.
Here are a few examples I have received over time:
[10/Dec/2020:23:55:37 +0000] "POST http://kiedys.fun/2d17b63552b6cc403d7066358f302265b36b5a21669505db3cf95cce34e15a5a2532aa55a638229487ce0e37b4422bd55181b877e45517a893f1e74b819b43e105bd36b73aa1c2ae8508607a1aee868858476c5044269cc94ae93de48b1ac16a HTTP/1.1"
[11/Dec/2020:10:51:04 +0000] "POST http://pomidorowa.xyz/7f22fd5911841fb9cea80c0145b9371d29da1d1b69795298e1b5891ffc9847b848f357a9f46a5ff87e9c85da481d37a322c7edd30aa640679521a12e07d18d1a7438b0fc26638363136141661a4ff98e873c46a7b86982d6038dca5a6adc1c2c HTTP/1.1"
[16/Dec/2020:09:09:26 +0000] "POST http://seduced.xyz/80143c6a4e331dd4422b3b75cc961a86df0eeb0b5428b8133e6d81783dc2fb2269b72803d001a200f51583d8217541795d258baa087d18fc3d30cdd1bb19a6f27933e8085f1a85c83f2181586bf4a8b397b8c651ffc126cb8cdb0aef42097a75 HTTP/1.1"
[17/Dec/2020:11:07:21 +0000] "POST http://kaymcclurg.best/35a28a78179508d919df81fb6e000bf346b0df58c84abcccea5367fbd430b32a429551c8710650314b9aa78c9dfee6723e53a2057dd92911d5431bab101a04b504d11d24476930d9d1ff8544f1a8abe9562392901bd3e86d059d5d236cebc52d HTTP/1.1"
[18/Dec/2020:00:03:55 +0000] "POST http://verdlet.website/f006d2c96296e7ab0462b6927f98ec36800db9b8b05cdf5ef75be509830f46edb90c2b9c48d10b66763d32560761359261cc20b6684de0dba79f99e19657a5b85a5037b8f4818552e93f757fdb1a449149f81e4b39e6eccc6effbb59b7ae2231 HTTP/1.1"
[01/Jan/2021:22:33:47 +0000] "POST http://zwykle.xyz/bde81af2ba9fde1c1c50fb38316a9e5f74ecdac9ca614ff5bf9d2b11c08482e19ff2d074576d0d25f8ad25028830e8e1b82611935b9d88e5e611e0ed7670174a9f1240b08f13599f039d7e96ff5edfaa058dc8d867e11be95e16d076b7270991 HTTP/1.1"
They all appear to contain an arbitrary domain name, followed by a seemingly random string.
I was unable to find a description of these in a web search, possibly because the requests, while following a pattern, are each unique.
I would be interested to learn what function these POST requests serve.
Your site is being probed, just like every other site on the web. They might also be trying to exploit some Express CVEs in the hope of compromising your server.
You could install fail2ban, which will block some of this traffic.
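If you also want to drop these at the application level, here is a minimal sketch, assuming Express, that rejects any request whose request line carries an absolute URI like the examples above; the 400 status and the logging are arbitrary choices, not a standard recipe:

// Minimal sketch, assuming Express: reject requests whose request target is an
// absolute URI (e.g. "POST http://kiedys.fun/..." as in the logs above).
// Node exposes the raw request target in req.url, so a simple prefix test works.
const express = require('express');
const app = express();

app.use((req, res, next) => {
  if (/^https?:\/\//i.test(req.url)) {
    console.log('rejected proxy-style request:', req.method, req.url, req.ip);
    return res.status(400).end();
  }
  next();
});

// The single static page mentioned in the question.
app.get('/', (req, res) => res.send('hello'));

app.listen(8080);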
I scoured the internet but did not find any good resource on how to create routes with a custom scheme (my-app://) in Node.js.
Strictly speaking, it would not really be a custom protocol; it would be HTTP but served under another scheme.
How can I do that?
I can install any npm packages.
If it is HTTP, then even though some other client application uses another scheme to connect, you will still receive it as HTTP on the server side.
In fact, in the HTTP protocol you don't get the protocol scheme in the request. You get the host (hostname and port) in the Host header, and you get the path (with the query string but no fragment part) in the request line of the GET (or POST, etc.). At no point does the client send any indication of which protocol it is using, unless it's a request to a forward proxy server (but not if it's a reverse proxy).
It is your server that assumes which protocol scheme is used, because it knows what protocol it speaks on a given port. In the case you describe, where a client uses some other protocol name in the URL but connects to your server using HTTP, your server only needs to understand HTTP, and routes don't usually include the protocol anyway (maybe unless it's Diet.js, but even there it's used in the listen argument, not in the routes).
This is an example HTTP request:
GET / HTTP/1.1
Host: localhost:3344
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/53.0.2785.143 Chrome/53.0.2785.143 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,pl;q=0.6
The only place where it says "HTTP" is the first line, which defines the version of the protocol so that the server can interpret the request properly. You would need to keep that anyway for your server to work with the built-in http module or any framework in Node. If you changed it, you would have to write your own parser for the protocol.
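To make this concrete, here is a minimal sketch using only the built-in http module; the port and paths are arbitrary. Whatever scheme the client shows in its own UI, all your server ever sees is the request line and headers, so routing is purely by path:

// Minimal sketch with the built-in http module: the server never learns which
// scheme the client displayed; it only sees the method, path, and headers.
const http = require('http');

const server = http.createServer((req, res) => {
  // For a normal client, req.url is just the path plus query string,
  // e.g. "/hello?x=1", with no scheme and no host.
  console.log(req.method, req.url, 'Host:', req.headers.host);

  if (req.url === '/hello') {
    res.end('routed by path only\n');
  } else {
    res.statusCode = 404;
    res.end('not found\n');
  }
});

server.listen(3344);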
I'm developing a tool for checking the integrity of links in a web page.
I've noticed that various Instagram URLs return an HTTP 500 response in cases where visiting the same URL in a browser yields an HTTP 200 response and the expected resource.
This happens when requesting regular Instagram URLs as a browser user would, not when using the REST API.
A typical request/response using cURL:
curl -v http://instagram.com/p/YKTkxHBA-P/
* About to connect() to instagram.com port 80 (#0)
* Trying 54.225.159.246... connected
> GET /p/YKTkxHBA-P/ HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: instagram.com
> Accept: */*
>
< HTTP/1.1 500 Server Error
< Cache-Control: no-cache
< Content-Type: text/html
< Date: Tue, 15 Oct 2013 08:31:09 GMT
< Server: nginx
< Content-Length: 87
< Connection: keep-alive
<
<html><body><h1>500 Server Error</h1>
An internal server error occured.
</body></html>
* Connection #0 to host instagram.com left intact
* Closing connection #0
For some time I did get HTTP 200 responses in such cases, but I am now consistently getting HTTP 500 responses.
This is all specific to a given host; the same URLs, even when requested with cURL, return HTTP 200 responses from other machines.
Because this is specific to the host sending the requests, I suspect some form of rate limiting or request filtering by IP is going on; however, I can find no documentation to that effect.
Will Instagram officially return an HTTP 500 response as above because a given IP has been denied access?
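For context, the check the tool performs is nothing more exotic than this. It is a simplified sketch using only Node's built-in http module, not the actual tool; the URL is the one from the transcript above:

// Simplified sketch of the check: request the URL and record the status code.
const http = require('http');

function checkLink(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      res.resume(); // discard the body; only the status code matters here
      resolve(res.statusCode);
    }).on('error', reject);
  });
}

checkLink('http://instagram.com/p/YKTkxHBA-P/')
  .then((status) => console.log('status:', status)) // 200 from some hosts, 500 from this one
  .catch(console.error);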
This is an IP rate limit. If you want to skip the part where you contact Instagram and wait for the monkey crew they have working over there to fix the problem, simply assign 1000 IPs to your server and loop through them randomly for your requests. You won't see any more 500s.
Cheers.
I received an email from Instagram support just yesterday.
...
Hi,
We made some changes to our server configurations, can you please check if you are still seeing the 500 errors?
Thanks,
...
...Well, I was 100% certain that those 500s didn't come from IG's IP rate limit, because they weren't being returned before either.
I've been checking the log files and found a couple of 502s (Bad Gateway) and "host unreachable" errors, though no more 500s per se, after 2014-04-14 18:28:56 (PST).
Looks like they've finally sorted it out after nearly a month... ^_^
I had the exact same problem - unfortunately there is no way around it. It happens because of too many requests. At first I thought it was my IP or possibly my UDID, until I signed into my app from my own phone and from my home IP but using a different Instagram ID, and it finally worked as expected again. Eventually I could use my own ID again as time went by, but with limited requests. I am not exactly sure how the algorithm works, but the more time went on, the more requests I could make again.
Also, this is in real-time on an actual iPhone in an actual app - not on the iOS sim or Instagram API console, FYI.
Main Point: The request limit is based on the user (5000 requests per hour per user)... there is no IP rate limit.
Hope this helps :)
Clayton
I have the same problem. As I've read, you have to get extra access to the API; Sandbox mode in your application does not allow you to use the full API. To get extra permissions, go to your client preferences, "Permissions" tab.
It seems to be related to the cURL version. I also experience the same problem with v7.22.0 on 4 different machines and 10 different IPs, while v7.30.0 and v7.19.7 work like a charm.
Currently investigating further.
I'm almost 100% sure this happens because the domain/IP address is blocked from the Instagram API.
Reason:
Fetching the JSON works in the browser, but not with cURL from the web service.
I copied my exact application from my primary domain (where the application doesn't work) to another domain. The same application worked.
The strange thing is that you get a "500 Internal Error" back rather than a message like "Your IP is blocked".
I have an application hosted on Linux 5.5 that uses SMB and RPC calls to a Windows server to gain some data from the registry.
The problem is that when I look at the Wireshark traces I see a response coming from the Windows server stating bind_nak, reason: protocol version not supported. I can see that the Linux side is using major version 5 and minor version 0. I tried with a Windows Server 2008 machine and saw the same problem. Because of this I am not able to get the data I want.
Any idea how I can diagnose the problem? What should I look for on the Windows/Linux servers?
Note: the initial back-and-forth using the SMB protocol is successful. That is, I can see in the Wireshark traces that the protocol was negotiated, and then I can see commands such as Session Setup AndX Request, Session Setup AndX Response, NT Create AndX Request, NT Create AndX Response, etc.
In my web page I make an Ajax request to a WCF service. If the service throws an error then that is passed back in the JSON. The JavaScript error handler then reveals a hidden div with a mailto URL prepopulated with my details so that team members (this is a small internal app) can send me the error including the stack trace. Here's an example resulting URL from a test run:
mailto:tttttttt#mmmmmmmmm.com?subject=potential%20seed%20save%20failed&body=Potential%20seed%20URL%20=%20unknown%0DResponse%20%3A%20%7B%22ExceptionDetail%22%3A%7B%22HelpLink%22%3Anull%2C%22InnerException%22%3Anull%2C%22Message%22%3A%22testing%22%2C%22StackTrace%22%3A%22%20%20%20at%20SavePotentialSeedSearches.WCFService.StorePotentialSeed(String%20url%2C%20String%20name)%20in%20C%3A%5C%5CTFS%5C%5CProjects%5C%5CSeeds%5C%5CPreliminaries%5C%5CSavePotentialSeedSearches%5C%5CWCFService.svc.cs%3Aline%2021%5Cu000d%5Cu000a%20%20%20at%20SyncInvokeStorePotentialSeed(Object%20%2C%20Object%5B%5D%20%2C%20Object%5B%5D%20)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object%20instance%2C%20Object%5B%5D%20inputs%2C%20Object%5B%5D%26%20outputs)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc%26%20rpc)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc%26%20rpc)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage31(MessageRpc%26%20rpc)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean%20isOperationContextSet)%22%2C%22Type%22%3A%22System.ArgumentException%22%7D%2C%22ExceptionType%22%3A%22System.ArgumentException%22%2C%22Message%22%3A%22testing%22%2C%22StackTrace%22%3A%22%20%20%20at%20SavePotentialSeedSearches.WCFService.StorePotentialSeed(String%20url%2C%20String%20name)%20in%20C%3A%5C%5CTFS%5C%5CProjects%5C%5CSeeds%5C%5CPreliminaries%5C%5CSavePotentialSeedSearches%5C%5CWCFService.svc.cs%3Aline%2021%5Cu000d%5Cu000a%20%20%20at%20SyncInvokeStorePotentialSeed(Object%20%2C%20Object%5B%5D%20%2C%20Object%5B%5D%20)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object%20instance%2C%20Object%5B%5D%20inputs%2C%20Object%5B%5D%26%20outputs)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc%26%20rpc)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc%26%20rpc)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage31(MessageRpc%26%20rpc)%5Cu000d%5Cu000a%20%20%20at%20System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean%20isOperationContextSet)%22%7D
That's 2354 characters long.
Other answers suggest that URLs above 2000 characters are a bad idea, as some browsers may struggle with them. But are mailto URLs parsed in any way by the browser, or are they handed straight on to the default mail tool? If they are handed on, does anyone have data on the length of mailto URLs that various mail tools (and in particular Outlook) can handle?
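For reference, the error handler boils down to something like the following. This is a simplified browser-side sketch; the element IDs, address, and response shape are illustrative rather than the real application:

// Simplified sketch of the error handler: build a mailto URL from the failed
// response and reveal the hidden div. The IDs and address are placeholders.
function showErrorMailto(response) {
  var subject = 'potential seed save failed';
  var body = 'Response: ' + JSON.stringify(response); // includes the WCF stack trace

  var mailto = 'mailto:someone@example.com' +
    '?subject=' + encodeURIComponent(subject) +
    '&body=' + encodeURIComponent(body);

  document.getElementById('error-mailto-link').href = mailto;
  document.getElementById('error-panel').style.display = 'block';
}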
As noted here
- Yes, the browser will parse the URL before sending it on.
- Safari and most email clients have no hard limit (it depends on available CPU and RAM).
2015 Web Browser Testing:
Safari: 705000000 characters in 2 minutes; limited by 16GB RAM
(Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/601.1.56 (KHTML, like Gecko) Version/9.0 Safari/601.1.56)
Firefox: 268435455 characters in 20 seconds; limited by maximum string length
(Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:41.0) Gecko/20100101 Firefox/41.0)
Chrome: 2097132 characters in 1 second; limited without explanation
(Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36)
IE: 2029 characters in 5 seconds; limited without explanation
(Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; rv:11.0) like Gecko)
2015 Email Client Testing:
Mozilla Thunderbird 38.3.0: 2097132 works in 1 second; 268435455 uses 100% CPU for 2 minutes but fails to render the body and is not usable
SeaMonkey 2.38: 2097132 works in 5 seconds; 268435455 uses 100% CPU for a long time (more than 5 minutes)
Apple Mail 8.2: 500000 works in 14 seconds; 2097132 uses 100% CPU for a long time (more than 5 minutes)
Microsoft Outlook 2013: trims the URL to 2070 characters in 1 second
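Given those numbers, if Outlook and IE users matter to you, it is worth truncating the body before building the link. A hedged sketch follows; the 2000-character budget is my own conservative choice based on the figures above, not a documented limit:

// Keep the whole mailto URL under roughly 2000 characters so it survives the
// most restrictive cases measured above (IE at 2029, Outlook trimming at 2070).
// The exact budget is an assumption, not a documented limit.
function buildMailto(address, subject, body) {
  var MAX_URL = 2000;
  var prefix = 'mailto:' + address +
    '?subject=' + encodeURIComponent(subject) + '&body=';

  var encodedBody = encodeURIComponent(body);
  if (prefix.length + encodedBody.length > MAX_URL) {
    encodedBody = encodedBody
      .slice(0, MAX_URL - prefix.length)
      .replace(/%[0-9A-F]?$/i, ''); // drop any partially cut %XX escape
  }
  return prefix + encodedBody;
}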