Unable to bulk upload log data in ArangoDB? - arangodb

I need to bulk upload the log data in ArangoDB. It is not in JSON and CSV format, its a log data in gz format.
203.109.94.55 - - [19/Jun/2015:16:02:45 +0000] "GET /origin-cdn.firstcry.com/brainbees/images/products/thumb/506739a.jpg HTTP/1.1" 200 21514 "-" "Mozilla/5.0 (Linux; Android 4.4.4; XT1022 Build/KXC21.5-40) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.114 Mobile Safari/537.36" "-"
183.87.73.202 - - [19/Jun/2015:16:02:45 +0000] "GET /origin-cdn.firstcry.com/brainbees/images/products/bigthumb/555258a.jpg HTTP/1.1" 200 34903 "-" "Dalvik/2.1.0 (Linux; U; Android 5.0.2; AO5510 Build/LRX22G)" "-"
183.87.73.202 - - [19/Jun/2015:16:02:45 +0000] "GET /origin-cdn.firstcry.com/brainbees/images/products/bigthumb/555401a.jpg HTTP/1.1" 200 32334 "-" "Dalvik/2.1.0 (Linux; U; Android 5.0.2; AO5510 Build/LRX22G)" "-"
Even when we are trying for uploading single file through Arangosh then too it is not uploading and generating error that format is invalid.
I have daily data of 2 GB to upload and process, how can I do through ArangoDB?
As I have gone through documents of ArangoDB and found the bulk upload for only JSON. Any help with how to upload and process the same will be grateful?

ArangoDB only supports bulk upload of JSON, CSV, or TSV. Therefore you need to convert the log file. I good starting point is google. For example, there is a project called "log2json":
https://github.com/kadnan/logs2json
with some minor tweaks you should be able to generate lines of JSON (the above project creates one large JSON, which is not what you want).

Related

Seemingly random '403 Server failed to authenticate the request error' error when interacting with Azure Blob Storage via SAS Key

I have a React application that fetches a SAS Key from an API, then uses it to upload a file to Azure Blob Storage or make other changes there. However, an issue I sometimes run into at very intermittent intervals is a "403 Server failed to authenticate the request" error. Sometimes I can go days of testing without running into this error. When it arises, I can often refresh the page and successfully complete the action next time. However, other times like tonight I can't complete a single action to the blob storage server.
The full error message text is below:
403 (Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.)
Uncaught (in promise) RestError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:48abb2df-a01e-0034-2ca8-03daae000000
Time:2022-11-29T04:13:40.0000267Z
My request headers:
Accept: application/xml
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cache-Control: no-cache
Connection: keep-alive
Content-Length: 0
DNT: 1
Host: <servername, omitted for privacy>
Origin: http://localhost:3000
Pragma: no-cache
Referer: http://localhost:3000/
sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: cross-site
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.56
x-ms-blob-public-access: container
x-ms-client-request-id: 53bc0407-af75-4468-afc8-15cca65a5914
x-ms-version: 2021-10-04
An example of the URL that is sent with an old SAS Key:
https://<servername omitted for privacy>/<container name>?sv=2021-10-04&ss=bf&srt=sco&spr=https&st=2022-11-29T04%3A26%3A29Z&se=2022-11-29T04%3A36%3A29Z&sp=rwdlacupi&sig=wqPy8ZyXVPDj5pC49MVamyZWx9ROav6SyTw8aktWEpY%3D&restype=container
I have viewed many other posts on this site asking the same question but can't seem to fix my issue with their posted solutions. My SAS Key often fails even when I have verified it has no plus signs present. I have a x-ms-version header item being sent as can be seen above. And my machine time is correct.
I wondered if maybe duplicate SAS keys were being generated, but I checked my console on the API which logs every time a key is generated and am seeing only one entry when the operation fails.

Python Request Waits not Responding

import requests
res = requests.get("https://api.nasdaq.com/api/quote/list-type/nasdaq100")
When I enter url to browser it gives data, but when I use python it does not get data just waits. Thanks.
Seems like the user agent is the culprit after all:
>>> import requests
>>> requests.get("https://api.nasdaq.com/api/quote/list-type/nasdaq100", headers={"User-Agent":'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36'})
<Response [200]>
Websites tend to block bots' and scrapers' user-agents. It is a common internet etiquette to respect their wishes.
Seems like Nasdaq's API has free and premium tiers. I guess they wish programmatic Python users to obtain an API key.
Keep in mind they also have their own Python API.
Looks like Nasdaq's server leaves the connection open and gets stuck when a different user agent is sent.
Try switching between the commented and uncommented ssl.send():
import socket, ssl
context = ssl.create_default_context()
hostname = "api.nasdaq.com"
sock = socket.create_connection((hostname, 443))
ssl = context.wrap_socket(sock, server_hostname=hostname)
#ssl.send(b'GET /api/quote/list-type/nasdaq100 HTTP/1.1\r\nHost: api.nasdaq.com\r\nUser-Agent: python-requests/2.27.1\r\nAccept-Encoding: identity\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n')
ssl.send(b'GET /api/quote/list-type/nasdaq100 HTTP/1.1\r\nHost: api.nasdaq.com\r\nUser-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X);AppleWebKit/602.1.50 (KHTML, like Gecko) CriOS/56.0.2924.75;Mobile/14E5239e Safari/602.1\r\nAccept-Encoding: identity\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n')
print(ssl.recv(1024))
Faking request's user agent in chrome works though and returns a 403 unauthorized.

Cloudfront always miss HEAD request. Why?

I am sending an ajax HEAD request to a file served over CloudFront. The max-age set for this file in S3 is 1800. Always it miss the cache in CloudFront.
Is this the expected behavior? Or is there anything has to be configured so that the HEAD request hits cloudfront until the specified time in max-age?
EDIT
Two consecutive head request/response are as below:
Request 1
Request URL:https://360-dev.web-dev.mydomain.com/resources/data/master.json
Request Method:HEAD
Status Code:304 Not Modified
Remote Address:52.84.105.65:443
Response Headers
view source
Cache-Control:max-age=1800, private
Connection:keep-alive
Date:Tue, 27 Sep 2016 10:20:56 GMT
ETag:"213cd6a833efde3409a8dc3808e01c46"
Last-Modified:Thu, 22 Sep 2016 11:35:17 GMT
Server:AmazonS3
Via:1.1 f2eee4ce6eb32d1b7578af7dc2c917de.cloudfront.net (CloudFront)
X-Amz-Cf-Id:QgmjSCu2uIam9Jmo63a8g-qytd6OsyalTEpNUGOaMp0EtJkheENkIA==
x-amz-storage-class:REDUCED_REDUNDANCY
X-Cache:Miss from cloudfront
Request Headers
view source
Accept:*/*
Accept-Encoding:gzip, deflate, sdch, br
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Type:application/x-www-form-urlencoded
Cookie:UnicaID=gU6xZpJesOr-Z6LqaYt; __utma=227427714.2013234852.1473314245.1473314245.1473314245.1; __utmz=227427714.1473314245.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); SMIDENTITY=qDrbSpOZGmtMUtkclvRVlc+KUiEI64G3S7hLBFdqQCoWNLGHde2Ra7dVVWFmIjMXbUR5y3gyxjPBFA8Lcugrv3hY87qavk7fpL2XSLfBSDo4s2hMJeJXD69/iMJwe09pf7ZRxguLJc/o+lDEcIG/rLxtBBNbXnjavsLs5sipgR9A0Wf+XHLEUtBPztis4ydwMZbOoxb3kxmyuUceJgKsCA6un4FhMR3OZrbWyh6S9lEQ4/1KgHyf3P5CZwmit0ZUawjOnFMTyH/TWml3EB/spjeB69N64FDf4DsigqqFq/06Bp6nmXeq2dn9TWTWtJ3DNeSu62JyjE2KJ/59wkJ4NHzpPjiHRtbhh441bisCqjoHQ1KKrkKvnIlbNs7Brql04DRlEvIBuycumQD4DYbESvto3gw0rGpKDiD13k6AUJ/pyI7974aQcR8i9eCXWBPD5Jnx+J+DWGh1XWXCRZgu6jBGQ6sx/e6yfuo45eLqXpa7D+qBBFDSoBjtgog30vIyKcpHwcLa603X22K9wdspX/DO8QuV2vBMtYcaYC85Y3NC+0jznqfIUOqqvPvHk24dEnyS9iB6lyd9KqDR6HPcjwMzBtXdWnv0EZKwssrqgEAJL7eOfxptOpG3u5mf3YL8; SMSESSION=yOYBm7PNsDUiLFC/accSxCQgb8Ps8ZOJwWABqge/q3ktPHwef0AAtP31vm3mSkmB6Xny0NyevVx4NgkvoREs3K8lHNrNPabQAW6TJYQ4X3DMWK0HvrsaJYDRWA+lHqQCsZeOaYOwH6WxecH8jIcBC6MZmoG3eBuyegeFi2yJG/jqRi2FkcV4c4ffvg3FTUmF3GcMRvGI4G+YC5WubDhwKs7p1M/e3XyUCM6FwCTnSRVLhDt1q6M+4HJAw7j3B73mdt3axe9wzZ5lSsNGAzyI8v/2i9avLdEHtbIJSpgkWjEIWlNDgPj/jhtttY0zugLAttAblbPZr+w9Mvafh8fRmYHlBLr8sFjJFEk1fs8sqs9I+GRa8KFfk9UPImSu5iiIML9HH/ga/KaSvfL7BvZ/vUvqeudXIy3zR10j1uy9dLKAlduuSqoYwJrLpa5+u4hRRl8450JQLsNry9slNL4zTBYrE6aFsKKsu/+rTXq4tZ/fFWBMz28rC2JRroBhtmAbV3MMqw/WonCUpEyHxsRzDYHu+sAQhP585Pf3l2zxN63aqtzYqK5lE5pKyF8ivb1zFgdE5aZbGSYsjIw2p1l3MCfZDOkIqUiorNxVgz9vXCaafOblEARizV5nwMC/k+VqNuhBZgcTfNt3izOXfZfTxw+VG2eO97jgO/0XpDTix2Ok9VcS5r1jYJ6Afbo12fWWOk4oC835jOvtINjI8GYQmI3qS/Hy1gHIynU31o5X7cQyOOM4OC0JbnSXWwbvP+c03j8fh1jVpv03pW/0HlQvhbtWIgcB3YAUSnKChu8Ae29UhAZFPvzvg0pmM5zk31J5TLJq0ng9glu21hcD7kLU69ytzMKHwZTMIf9HdYFKJLfaLVSqo328E+yZ7kuaGvLX0Xt5qQHPaoYEbuU2HnLkh8DxIuv7hp7t8aNUaqsEVuhr1cDzV52Wzt0WMUP6KP149MmXTzn8s1FrHtzUxkGdAjj9HBF5AHpoUa79XPCpI7etRX2Beo2IZHg5BvDtYBv8ntczsQNNYyCqAUzD6ZE4u7nHssmLnDT+; CloudFront-Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly8zNjAtZGV2Ki53ZWItZGV2LmJtcy5jb20vKiIsIkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTQ3NTAwNjQ2OX19fV19; CloudFront-Signature=SvN~2tPgK~N~GzoY2pVOFZN1nic4t2Kgq3AucGD8gvuGS4iqjnlBIceFM~k5ZHZRlSbWa8V8QZzoYuMZvY2GvAjGJrDigJD93Vxq0qCm6alexx5~yxtX1FebaFAp68fgqo1tbjVYm7nCYrvGl2RebFcucbN6RC-Lo6aBvPnIgTrXqa6OrJKgxQQxii~LE7l9XnnKHWoYnrjBZEFWuqJ5fHrWK1MennKKAh67nOO9OGznX9slQRXBGCpNV4SCICzQEMaMxHBANjVE7nTfP9YussBV-AXYaQdkvdNt6LWcDotZu~wDDqlrBpNcru6EqJackyUAOvS982t4BPGAiL1jjQ__; CloudFront-Key-Pair-Id=APKAIOWOUVDQ5VOOZ5IA; Custom-Insite-Cookie=eyJMREFQR3JvdXAiOlsiIl0sInVzZXJOYW1lIjoic3VicmFtbTIiLCJpbnRlcm5ldCI6dHJ1ZSwiZW52aXJvbm1lbnQiOiIzNjAtZGV2IiwibG9naW5EYXRlIjoiMjAxNi0wOS0yN1QxMDowMTowOS4zODhaIn0%3D; rtFa=fa1/uY2RI0bciZ9ZZSoGEQfpjl1ezr5wmPN7/U+ySlNuu2iSn1blhq9qeQBW2Iq3gSllMAAlV5troHg6UfB2KKR7TkNU5Q3IS3TnYS+XJJxSaLXv5ghig7fDjU1KKCs9IbrJaCW9XIMzgtfDlxXE/EHRhD3+u5xX4KegxZwGjWMMNM0QOOZrtMvk98h08BhpKfAChj8CPmeaghOehRhgxbOlLwQ+1AHIgrZ4Y8n7sbW4zw4NnAPhTgdtfJ43midH1pfqIH5ijy5x4a+61nczQFkI9+WxMfqsBVJDDteeBLVc+NPuWw84JRlar01jB4Qpm0VFzt8sXMOApfNosrsgR1iQmlcdZqSpMcrKIEwKw11GZiyNGyVzMd3R1/vzQI9gIAAAAA==
Host:360-dev.web-dev.mydomain.com
If-Modified-Since:Thu, 22 Sep 2016 11:35:17 GMT
If-None-Match:"213cd6a833efde3409a8dc3808e01c46"
Referer:https://360-dev.web-dev.mydomain.com/home.html
User-Agent:Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36
X-Requested-With:XMLHttpRequest
Request 2
Request URL:https://360-dev.web-dev.mydomain.com/resources/data/master.json
Request Method:HEAD
Status Code:304 Not Modified
Remote Address:52.84.105.65:443
Response Headers
view source
Cache-Control:max-age=1800, private
Connection:keep-alive
Date:Tue, 27 Sep 2016 10:22:09 GMT
ETag:"213cd6a833efde3409a8dc3808e01c46"
Last-Modified:Thu, 22 Sep 2016 11:35:17 GMT
Server:AmazonS3
Via:1.1 0f99540d655ae57ac39033aac52161f5.cloudfront.net (CloudFront)
X-Amz-Cf-Id:3ci8nzrDmmcJMq7_ElxxU6HPbedPqp0P1fDXiDUFIO7b-qm_R2bxMg==
x-amz-storage-class:REDUCED_REDUNDANCY
X-Cache:Miss from cloudfront
Request Headers
view source
Accept:*/*
Accept-Encoding:gzip, deflate, sdch, br
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Type:application/x-www-form-urlencoded
Cookie:UnicaID=gU6xZpJesOr-Z6LqaYt; __utma=227427714.2013234852.1473314245.1473314245.1473314245.1; __utmz=227427714.1473314245.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); SMIDENTITY=qDrbSpOZGmtMUtkclvRVlc+KUiEI64G3S7hLBFdqQCoWNLGHde2Ra7dVVWFmIjMXbUR5y3gyxjPBFA8Lcugrv3hY87qavk7fpL2XSLfBSDo4s2hMJeJXD69/iMJwe09pf7ZRxguLJc/o+lDEcIG/rLxtBBNbXnjavsLs5sipgR9A0Wf+XHLEUtBPztis4ydwMZbOoxb3kxmyuUceJgKsCA6un4FhMR3OZrbWyh6S9lEQ4/1KgHyf3P5CZwmit0ZUawjOnFMTyH/TWml3EB/spjeB69N64FDf4DsigqqFq/06Bp6nmXeq2dn9TWTWtJ3DNeSu62JyjE2KJ/59wkJ4NHzpPjiHRtbhh441bisCqjoHQ1KKrkKvnIlbNs7Brql04DRlEvIBuycumQD4DYbESvto3gw0rGpKDiD13k6AUJ/pyI7974aQcR8i9eCXWBPD5Jnx+J+DWGh1XWXCRZgu6jBGQ6sx/e6yfuo45eLqXpa7D+qBBFDSoBjtgog30vIyKcpHwcLa603X22K9wdspX/DO8QuV2vBMtYcaYC85Y3NC+0jznqfIUOqqvPvHk24dEnyS9iB6lyd9KqDR6HPcjwMzBtXdWnv0EZKwssrqgEAJL7eOfxptOpG3u5mf3YL8; SMSESSION=yOYBm7PNsDUiLFC/accSxCQgb8Ps8ZOJwWABqge/q3ktPHwef0AAtP31vm3mSkmB6Xny0NyevVx4NgkvoREs3K8lHNrNPabQAW6TJYQ4X3DMWK0HvrsaJYDRWA+lHqQCsZeOaYOwH6WxecH8jIcBC6MZmoG3eBuyegeFi2yJG/jqRi2FkcV4c4ffvg3FTUmF3GcMRvGI4G+YC5WubDhwKs7p1M/e3XyUCM6FwCTnSRVLhDt1q6M+4HJAw7j3B73mdt3axe9wzZ5lSsNGAzyI8v/2i9avLdEHtbIJSpgkWjEIWlNDgPj/jhtttY0zugLAttAblbPZr+w9Mvafh8fRmYHlBLr8sFjJFEk1fs8sqs9I+GRa8KFfk9UPImSu5iiIML9HH/ga/KaSvfL7BvZ/vUvqeudXIy3zR10j1uy9dLKAlduuSqoYwJrLpa5+u4hRRl8450JQLsNry9slNL4zTBYrE6aFsKKsu/+rTXq4tZ/fFWBMz28rC2JRroBhtmAbV3MMqw/WonCUpEyHxsRzDYHu+sAQhP585Pf3l2zxN63aqtzYqK5lE5pKyF8ivb1zFgdE5aZbGSYsjIw2p1l3MCfZDOkIqUiorNxVgz9vXCaafOblEARizV5nwMC/k+VqNuhBZgcTfNt3izOXfZfTxw+VG2eO97jgO/0XpDTix2Ok9VcS5r1jYJ6Afbo12fWWOk4oC835jOvtINjI8GYQmI3qS/Hy1gHIynU31o5X7cQyOOM4OC0JbnSXWwbvP+c03j8fh1jVpv03pW/0HlQvhbtWIgcB3YAUSnKChu8Ae29UhAZFPvzvg0pmM5zk31J5TLJq0ng9glu21hcD7kLU69ytzMKHwZTMIf9HdYFKJLfaLVSqo328E+yZ7kuaGvLX0Xt5qQHPaoYEbuU2HnLkh8DxIuv7hp7t8aNUaqsEVuhr1cDzV52Wzt0WMUP6KP149MmXTzn8s1FrHtzUxkGdAjj9HBF5AHpoUa79XPCpI7etRX2Beo2IZHg5BvDtYBv8ntczsQNNYyCqAUzD6ZE4u7nHssmLnDT+; CloudFront-Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly8zNjAtZGV2Ki53ZWItZGV2LmJtcy5jb20vKiIsIkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTQ3NTAwNjQ2OX19fV19; CloudFront-Signature=SvN~2tPgK~N~GzoY2pVOFZN1nic4t2Kgq3AucGD8gvuGS4iqjnlBIceFM~k5ZHZRlSbWa8V8QZzoYuMZvY2GvAjGJrDigJD93Vxq0qCm6alexx5~yxtX1FebaFAp68fgqo1tbjVYm7nCYrvGl2RebFcucbN6RC-Lo6aBvPnIgTrXqa6OrJKgxQQxii~LE7l9XnnKHWoYnrjBZEFWuqJ5fHrWK1MennKKAh67nOO9OGznX9slQRXBGCpNV4SCICzQEMaMxHBANjVE7nTfP9YussBV-AXYaQdkvdNt6LWcDotZu~wDDqlrBpNcru6EqJackyUAOvS982t4BPGAiL1jjQ__; CloudFront-Key-Pair-Id=APKAIOWOUVDQ5VOOZ5IA; Custom-Insite-Cookie=eyJMREFQR3JvdXAiOlsiIl0sInVzZXJOYW1lIjoic3VicmFtbTIiLCJpbnRlcm5ldCI6dHJ1ZSwiZW52aXJvbm1lbnQiOiIzNjAtZGV2IiwibG9naW5EYXRlIjoiMjAxNi0wOS0yN1QxMDowMTowOS4zODhaIn0%3D; rtFa=fa1/uY2RI0bciZ9ZZSoGEQfpjl1ezr5wmPN7/U+ySlNuu2iSn1blhq9qeQBW2Iq3gSllMAAlV5troHg6UfB2KKR7TkNU5Q3IS3TnYS+XJJxSaLXv5ghig7fDjU1KKCs9IbrJaCW9XIMzgtfDlxXE/EHRhD3+u5xX4KegxZwGjWMMNM0QOOZrtMvk98h08BhpKfAChj8CPmeaghOehRhgxbOlLwQ+1AHIgrZ4Y8n7sbW4zw4NnAPhTgdtfJ43midH1pfqIH5ijy5x4a+61nczQFkI9+WxMfqsBVJDDteeBLVc+NPuWw84JRlar01jB4Qpm0VFzt8sXMOApfNosrsgR1iQmlcdZqSpMcrKIEwKw11GZiyNGyVzMd3R1/vzQI9gIAAAAA==
Host:360-dev.web-dev.mydomain.com
If-Modified-Since:Thu, 22 Sep 2016 11:35:17 GMT
If-None-Match:"213cd6a833efde3409a8dc3808e01c46"
Referer:https://360-dev.web-dev.mydomain.com/home.html
User-Agent:Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36
X-Requested-With:XMLHttpRequest
See the below image of a 'behaviour' for reference:
Things to note:
Behaviours work off 'path patterns' in the priority order that
they are defined, make sure you open the correct behaviour to see
how its configured.
Default TTL will come into play if your
server does NOT send cache headers such as "Cache-Control max-age,
Cache-Control s-maxage, and Expires to objects".
Min & max
interact with the HTTP headers sent and do what their names suggest.
If in head requests you have different headers, all the headers
will become part of the key which cloud front will use to cache it,
and compare that key with the new requests to determine whether its
the same or not as the previous name.
If cookies are being different, most notably authentication & session keys (and any other) and if forward cookies option is not set to "None" it will
use the cookie values as part of the key as well (this will most
likely cause your cache to be seldom used). 6) Query strings
forwarding if used, will also cause it to become part of the key.
Now you can either determine it from this info or if you can't then paste a screenshot of your behaviour screen, after making sure its the correct behaviour, and I can help further.

Socket.io multiple connections from one client

I have a Node JS + Socket.io setup handling a chat + voting website. Since day one I've noticed, that from time to time there were multiple connections made from one client.
Client's browser (user agent) and network (ip) are different every time, and there is around 2-3k connections made at a time. This behavior is quite hard to diagnose/reproduce, since it happens once every couple days.
From what i've observed, it's like client's browser connects and immediately disconnects (times out), so it tries to reconnect. Socket.io doesn't know that user "timed out", so io.engine.clientsCount shows increased connections count for few minutes. This wouldn't be a problem, there is not much impact on performance, but it shows false data in stats, which is not acceptable for me.
As it shows on a screenshot below, this behavior causes "spikes" in connection counters:
I couldn't find any information about this behavior, but I had few ideas:
Client is behind a firewall - Node.JS listens on ports above 3000, so some public networks won't allow connections. This would be it, but then no connection would be made at all
Client uses some kind of proxy - But then I would get proxies IP address, instead of client's
Possible DOS attack - Nope. To simple, doesn't cause any damage.
Installed plugin / disabled feature - this could be possible, but I have no idea what would cause such a behavior.
Funny thing is, this behavior was observed with clients using relatively up-to-date browsers, so it's not like someone is trying his luck with IE4 :)
This is a real head-scratcher for me.. If anyone have observed something like this, i would really appreciate the help
Edit 2016-02-07
I caught 2 users with this issue and got some info about them:
HTTP headers:
host: ***.***.**.**:3000
connection: keep-alive
referer: http://***************.pl/
origin: http://***************.pl
x-wap-profile: http://wap.samsungmobile.com/uaprof/GT-P7510.xml
accept: text/xml, text/html, application/xhtml+xml, image/png, text/plain, */*;q=0.8
accept-charset: utf-8, iso-8859-1, utf-16, *;q=0.7
user-agent: Mozilla/5.0 (Linux; U; Android 3.2; pl-pl; GT-P7510 Build/HTJ85B) AppleWebKit/534.13 (KHTML, like Gecko) Version/4.0 Safari/534.13
accept-encoding: gzip,deflate
accept-language: pl-PL, en-US
cookie: io=ejdnQT_TOXNpRc-GAJU8
Socket info:
time: Sun Feb 07 2016 17:13:05 GMT+0100 (CET)
address: ***.**.**.**
xdomain: true
secure: false
issued: 1454861585297
url: /socket.io/?EIO=3&transport=polling&t=1454865217050-32520
-----
HTTP headers:
host: ***.***.**.**:3002
connection: keep-alive
referer: http://***************.pl/
origin: http://***************.pl
accept: text/xml, text/html, application/xhtml+xml, image/png, text/plain, */*;q=0.8
accept-charset: utf-8, iso-8859-1, utf-16, *;q=0.7
user-agent: Mozilla/5.0 (Linux; U; Android 3.2; pl-pl; GT-P7300 Build/HTJ85B) AppleWebKit/534.13 (KHTML, like Gecko) Version/4.0 Safari/534.13
accept-encoding: gzip,deflate
accept-language: pl-PL, en-US
cookie: io=lLFL2chztf7gZzh3AFtG
Socket info:
time: Sun Feb 07 2016 17:40:55 GMT+0100 (CET)
address: **.***.*.**
xdomain: true
secure: false
issued: 1454863255162
url: /socket.io/?EIO=3&transport=polling&t=1454863408365-11598
They both appear to be using Samsung tablets with Android 3.2. I've tested on Android AVD emulator with this version, but got no results.

Cloudant "case_clause" error with pouchdb when replicating

I am working with Pouchdb and Cloudant, and when my web app starts up it does a replication from Cloudant down to my pouchdb in the browser. I have an idea of how pouchdb works internally, and this is how I believe the process works (high level):
Replication starts
Gets a checkpoint doc from cloudant db (contains latest sequence number retrived from server, if not exists, assumes sequence # is 0, which is my case)
Grabs the changes from the changes freed starting at that sequence number (it grabs up to 25 changes)
Writes(or updates) the checkpoint doc back to cloudant server with the new sequence number (this way if a network error occurs, it can continue where it left off or for the next replication)
Repeats until no changes left
Replication complete
The problem is at step 4, that when pouch tries to write that doc to the cloudant server (for the first time), the server returns a 'case_clause' error. I am thinking the issue might be an invalid id sent to cloudant (cloudant doesn't accept ids of this format), because the id of the doc written to the server is _local/799c37dfaefb3774a04f55c7f8cee947 (or other random numbers and characters at the end). I don't know if that is a valid doc id or not (for cloudant that is, because this is accurate for pouchdb), so I guess I am asking, is that the issue (unacceptable id for cloudant), or is there some other issue based on the error the cloudant server returns.
Here is the doc being written:
{
_id: "_local/799c37dfaefb3774a04f55c7f8cee947",
last_seq: "63"
}
Here is the full error output from Chrome debugger:
{
error: "case_clause"
reason: "{{case_clause,{ok,{error,[{{doc,>,
{338,
[>]},
{[{>,>}]},
[],false,[]},
{error,internal_server_error}}]}}},
[{fabric,update_doc,3},{chttpd_db,'-update_doc/6-fun-0-',3}]}"
stack: Array[4]
0: "chttpd_db:update_doc/6"
1: "chttpd:handle_request/1"
2: "mochiweb_http:headers/5"
3: "proc_lib:init_p_do_apply/3"
length: 4
__proto__: Array[0]
status: 500
}
Note: When I go into cloudant's Futon and manually enter the url for the checkpoint doc using its id, it does not exist.
Thanks
EDIT:
Header Info from the above request using Chrome debugger:
Request URL:http://lessontrek.toddbluhm.c9.io/db/ilintindingreseseldropec/_local%2F799c37dfaefb3774a04f55c7f8cee947
Request Method:PUT
Status Code:500 Internal Server Error
Request Headersview parsed
PUT /db/ilintindingreseseldropec/_local%2F799c37dfaefb3774a04f55c7f8cee947 HTTP/1.1
Host: lessontrek.toddbluhm.c9.io
Connection: keep-alive
Content-Length: 111
Accept: application/json
Origin: http://lessontrek.toddbluhm.c9.io
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36
Content-Type: application/json
Referer: http://lessontrek.toddbluhm.c9.io/app
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cookie: connect.sid=s%3A8MVBFmbizTX4VNOqZNtIuxQI.TZ9yKRqNv0ePbTB%2FmSpJsncYszJ8qBSD5EWHzxQYIbg; AuthSession=(removed for security purposes, but valid); db_name=ilintindingreseseldropec; __utma=200306492.386329876.1368934655.1375164160.1375252679.55; __utmc=200306492; __utmz=200306492.1372711539.22.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); c9.live.proxy=(removed for security purposes, but valid)
Request Payloadview parsed
{"_id":"_local/799c37dfaefb3774a04f55c7f8cee947","last_seq":"63","_rev":"338-7db9750558e43e2076a3aa720a6de47b"}
Response Headersview parsed
HTTP/1.1 500 Internal Server Error
x-powered-by: Express
vary: Accept-Encoding
x-couch-request-id: 7d2ca9fc
server: CouchDB/1.0.2 (Erlang OTP/R14B)
date: Wed, 31 Jul 2013 07:29:23 GMT
content-type: application/json
cache-control: must-revalidate
content-encoding: gzip
transfer-encoding: chunked
via: 1.1 project-livec993c2dc8b8c.rhcloud.com (node-web-proxy/0.4)
X-C9-Server: proxy_subdomain_collab-bus2_01
Cloudant, like CouchDB, expects all _local revs to begin "0-". pouchdb should not be generating rev values of this form. If you try this PUT against CouchDB you get the same stack trace.

Resources