Understanding Ookla speedtest JSON values

I'm having some difficulties matching the JSON results with the values shown on the website.
I run the test with speedtest -f json-pretty -u bps:
{
"type": "result",
"timestamp": "2022-03-16T01:40:00Z",
"ping": {
"jitter": 68.655000000000001,
"latency": 11.285
},
"download": {
"bandwidth": 804925,
"bytes": 5394240,
"elapsed": 6706
},
"upload": {
"bandwidth": 97467,
"bytes": 1321920,
"elapsed": 15005
},
"result": {
"id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"url": "https://www.speedtest.net/result/c/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"persisted": true
}
}
But when I go to the URL, the result page shows the download speed as 6.44 Mbps.
How do those 3 download values become 6.44 Mbps?

Those 3 values mean:
Bandwidth - Actual internet download speed in Bps (bytes per second); from man speedtest:
The human-readable format defaults to Mbps
and any machine-readable formats (csv, tsv, json, jsonl, json-pretty) use bytes as the unit of measure with max precision.
To get the value in Mbps (as shown on the result page), you have to divide it by 125,000:
The bytes per second measurements can be transformed into the human-readable output format default unit of megabits (Mbps) by dividing the bytes per second value by 125,000.
Bytes - Volume of data used during the test (also in bytes)
Elapsed - Duration of the download test, in ms (how long the test took).
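As a minimal sketch (assuming the pretty-printed JSON above is saved as result.json, a hypothetical file name), the conversion that reproduces the 6.44 Mbps shown on the result page looks like this:

import json

with open("result.json") as f:           # any source of the JSON output works here
    result = json.load(f)

BYTES_PER_SEC_PER_MBPS = 125_000         # 1 Mbps = 125,000 bytes per second

download_mbps = result["download"]["bandwidth"] / BYTES_PER_SEC_PER_MBPS
upload_mbps = result["upload"]["bandwidth"] / BYTES_PER_SEC_PER_MBPS

print(f"Download: {download_mbps:.2f} Mbps")   # 804925 / 125000 = 6.44 Mbps
print(f"Upload:   {upload_mbps:.2f} Mbps")     # 97467 / 125000 = 0.78 Mbps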

Related

Facebook's Duckling Cannot Identify Time Dimension Correctly

I'm using Facebook's Duckling to parse text. When I pass the text: 13h 47m it correctly classifies the entire text as DURATION (= 13 hours 47 minutes).
However, when I pass the text: 13h 47m 13s it cannot identify the 13s part of the string as being part of the DURATION. I was expecting it to parse 13 hours, 47 minutes and 13 seconds, but it essentially ignores the 13s part.
Command: curl -XPOST http://127.0.0.1:0000/parse --data locale=en_US&text="13h 47m 13s"
JSON Array:
[
{
"latent": false,
"start": 0,
"dim": "duration",
"end": 7,
"body": "13h 47m",
"value": {
"unit": "minute",
"normalized": {
"unit": "second",
"value": 49620
},
"type": "value",
"value": 827,
"minute": 827
}
},
{
"latent": false,
"start": 8,
"dim": "number",
"end": 10,
"body": "13",
"value": {
"type": "value",
"value": 13
}
}
]
Is this a bug? How can I update Duckling so that it parses the text as described above?
The documentation seems pretty clear about this:
To extend Duckling's support for a dimension in a given language, typically 4 files need to be updated:
Duckling/<Dimension>/<Lang>/Rules.hs
Duckling/<Dimension>/<Lang>/Corpus.hs
Duckling/Dimensions/<Lang>.hs (if not already present in Duckling/Dimensions/Common.hs)
Duckling/Rules/<Lang>.hs
Taking a look in Duckling/Duration/Rules.hs, I see:
ruleIntegerUnitofduration = Rule
{ name = "<integer> <unit-of-duration>"
, pattern =
[ Predicate isNatural
, dimension TimeGrain
]
-- ...
So next I peeked in Duckling/TimeGrain/EN/Rules.hs (because Duckling/TimeGrain/Rules.hs did not exist), and see:
grains :: [(Text, String, TG.Grain)]
grains = [ ("second (grain) ", "sec(ond)?s?", TG.Second)
-- ...
Presumably this means 13h 47m 13sec would parse the way you want. To make 13h 47m 13s parse in the same way, I guess the first thing I would try would be to make the regex above a bit more permissive, maybe something like s(ec(ond)?s?)?, and see if that does the trick without breaking anything else you care about.
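As a quick sanity check outside of Duckling (a Python sketch, just to test the regex idea rather than the Haskell rule itself), the more permissive pattern accepts the bare s while still matching every spelling the current pattern handles:

import re

current = re.compile(r"sec(ond)?s?")       # the grain regex quoted above
proposed = re.compile(r"s(ec(ond)?s?)?")   # the suggested, more permissive variant

for token in ["s", "sec", "secs", "second", "seconds"]:
    print(token, bool(current.fullmatch(token)), bool(proposed.fullmatch(token)))
# Only "s" differs: it fails the current pattern and passes the proposed one.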

Elasticsearch query causing NodeJS heap out of memory

What's happening now?
Recently I built an Elasticsearch query. Its main purpose is to get the data count per hour, going back 12 weeks.
When the query gets called over and over again, NodeJS memory grows from 20 MB up to 1024 MB. Surprisingly, the memory doesn't hit the top right away. It stays stable under 25 MB for several minutes and then suddenly starts growing (25 MB, 46 MB, 125 MB, 350 MB... up to 1024 MB), finally causing NodeJS to run out of heap memory. Whether I call this query or not, the memory keeps growing and is never released. This scenario only happens on the remote server (running in Docker); the local Docker environment is totally fine (the memory configuration is identical).
How am I querying?
Like below:
const query = {
"size": 0,
"query": {
"bool": {
"must": [
{ terms: { '_id.keyword': array_id } },
{
"range": {
"date_created": {
"gte": start_timestamp - timestamp_twelve_weeks,
"lt": start_timestamp
}
}
}
]
}
},
"aggs": {
"shortcode_log": {
"date_histogram": {
"field": "date_created",
"interval": "3600ms"
}
}
}
}
What's the return value?
Like below (the total query time is around 2 seconds):
{
"aggs_res": {
"shortcode_log": {
"buckets": [
{
"key": 1594710000,
"doc_count": 2268
},
{
"key": 1594713600,
"doc_count": 3602
},
{//.....total item count 2016
]
}
}
}
If your histogram interval is really 3600ms (shouldn't it be 3600s?), that is a really short period over which to aggregate 12 weeks of data.
3600ms is only 0.06 minutes, which means:
24,000 periods per day
168,000 per week
2,016,000 for 12 weeks
That can explain:
why your script waits a long time before doing anything
why your memory explodes when you try to loop over the buckets
In your example, you only get 2,016 buckets back (exactly 12 weeks × 7 days × 24 hours, i.e. one-hour buckets), so I think there is a small difference between your two tests.
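The arithmetic is easy to reproduce (a rough sketch, independent of any Elasticsearch client):

TWELVE_WEEKS_MS = 12 * 7 * 24 * 3600 * 1000

buckets_with_ms_interval = TWELVE_WEEKS_MS / 3600         # "interval": "3600ms"
buckets_with_hour_interval = TWELVE_WEEKS_MS / 3_600_000  # "interval": "3600s" (one hour)

print(buckets_with_ms_interval)    # 2,016,000 buckets: a huge response to hold in memory
print(buckets_with_hour_interval)  # 2,016 buckets: matches the sample response above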
New update: the issue is solved. The project has a layer between the server and the DB, and the code in that layer was preventing the query's memory from being released.

CosmosDB size too big

Can someone help me analyse the data size for CosmosDB?
I uploaded my data from a JSON file.
Single region.
I use the CosmosDB SQL API for this DB.
There are 95,969 rows/documents.
Each document looks like the one below, about 704 bytes; the only field that varies in size is "CityName": "Carleton Place".
However, the JSON data file is 26.7 MB, while CosmosDB reports 64 MB.
How come it inflates by 32 MB?
The index is 15.45 MB, which seems OK since I have spatial points.
{
"agegroup": 2,
"locationgeometry": {
"type": "Point",
"coordinates": [ 45.14478, -76.14443 ]
},
"ProvinceAbbr": "ON",
"age": 34,
"LHIN_LocationID": 11,
"Latitude": 45.14478,
"Longitude": -76.14443,
"PostalCode": "K7C 1X2",
"CityName": "Carleton Place",
"CityType": "D",
"ProvinceName": "Ontario",
"id": "3e496a96-db77-4535-b73b-5ab317b44231",
"_rid": "sGVsAMC4X4ICAAAAAAAAAA==",
"_self": "dbs/sGVsAA==/colls/sGVsAMC4X4I=/docs/sGVsAMC4X4ICAAAAAAAAAA==/",
"_etag": "\"0000cd97-0000-0200-0000-5d586f650000\"",
"_attachments": "attachments/",
"_ts": 1566076773
}
The 26.7 MB JSON file was created from MS SQL; the original MS SQL storage is 18.94 MB with a 2.5 MB index.
I have a SQL API CosmosDB container with 6 logical partitions, and each partition is about 15 MB (that is, million bytes, not MiB, i.e. 1024*1024 bytes).
That is about 85 megabytes in total.
Using the DTUI tool to export the container to a JSON dump, the text dump file size is 48.74 MB.
Part of the overhead is CosmosDB's internal fields, which are stored in CosmosDB but are not part of the user data and thus not part of the export. Example fields (you can see them in the data explorer):
_rid
_self
_etag
_attachments
_ts
There are other overheads not seen in data explorer.
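As a rough, back-of-the-envelope sketch (using the system-field values from the sample document above), those internal fields alone add a couple of hundred bytes per document, which is on the order of 20 MB across roughly 96,000 documents:

import json

system_fields = {
    "_rid": "sGVsAMC4X4ICAAAAAAAAAA==",
    "_self": "dbs/sGVsAA==/colls/sGVsAMC4X4I=/docs/sGVsAMC4X4ICAAAAAAAAAA==/",
    "_etag": "\"0000cd97-0000-0200-0000-5d586f650000\"",
    "_attachments": "attachments/",
    "_ts": 1566076773,
}

per_doc_overhead = len(json.dumps(system_fields))          # a bit over 200 bytes
total_overhead_mb = per_doc_overhead * 95_969 / 1_000_000  # document count from the question

print(per_doc_overhead, "bytes of system fields per document")
print(round(total_overhead_mb, 1), "MB of system-field overhead in total")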
Anyway, you should not be too concerned about the size, as most of your cost will typically come from RUs (usage, provisioned).
Hope this helps!

Request telemetry - "durationMetric"?

When parsing exported Application Insights telemetry from Blob storage, the request data looks something like this:
{
"request": [
{
"id": "3Pc0MZMBJgQ=",
"name": "POST Blah",
"count": 6,
"responseCode": 201,
"success": true,
"url": "https://example.com/api/blah",
"durationMetric": {
"value": 66359508.0,
"count": 6.0,
"min": 11059918.0,
"max": 11059918.0,
"stdDev": 0.0,
"sampledValue": 11059918.0
},
...
}
],
...
}
I am looking for the duration of the request, but I see that I am presented with a durationMetric object.
According to the documentation the request[0].durationMetric.value field is described as
Time from request arriving to response. 1e7 == 1s
But if I query this using Analytics, the values don't match up to this field:
They do, however, match up to the min, max and sampledValue fields.
Which field should I use? And what does that "value": 66359508.0 value represent in the above example?
It doesn't match because you're seeing sampled data (meaning this event represents sampled data from multiple requests). I'd recommend starting with https://azure.microsoft.com/en-us/documentation/articles/app-insights-sampling/ to understand how sampling works.
In this case, the "matching" value would come from durationMetric.sampledValue (notice that value == count * sampledValue).
It's hard to compare exactly what you're seeing because you don't show the Kusto query you're using, but you do need to be aware of sampling when writing AI Analytics queries. See https://azure.microsoft.com/en-us/documentation/articles/app-insights-analytics-tour/#counting-sampled-data for more details.
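A small sketch with the numbers from the exported telemetry above makes the relationship (and the 1e7 == 1s scale) concrete:

duration_metric = {"value": 66359508.0, "count": 6.0, "min": 11059918.0,
                   "max": 11059918.0, "stdDev": 0.0, "sampledValue": 11059918.0}

TICKS_PER_SECOND = 1e7  # per the docs: 1e7 == 1s

# value is the sum over the requests this sampled event represents
assert duration_metric["value"] == duration_metric["count"] * duration_metric["sampledValue"]

print(duration_metric["sampledValue"] / TICKS_PER_SECOND)  # ~1.106 s for a single request
print(duration_metric["value"] / TICKS_PER_SECOND)         # ~6.636 s summed over the 6 represented requests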

Instagram API: .caption.created_time vs .created_time

What use case would create a difference between .caption.created_time and .created_time in the metadata objects from the JSON response? My app has been monitoring media recent data from the tags endpoint for about a week, collecting 50 data points, and those two properties have always been the exact same Epoch time. However, these properties are different in the example response in Instagram's docs, albeit the difference is only four seconds. Copied below:
"caption": {
"created_time": "1296703540",
"text": "#Snow",
"from": {
"username": "emohatch",
"id": "1242695"
},
"id": "26589964"
},
"created_time": "1296703536",
The user may have created the post with the original caption, then edited the caption and saved it 4 seconds after posting the original (to fix a typo, etc.).
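For reference, the gap in the documentation example is just the difference between the two epoch values:

caption_created = int("1296703540")  # caption.created_time from the example
media_created = int("1296703536")    # created_time from the example

print(caption_created - media_created)  # 4 seconds between posting and the caption edit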
