Using reduce with a composite key in a CouchDB view returns no result on GET - couchdb

I have a couchdb view with the following map function:
function(doc) {
  if (doc.date_of_operation) {
    var date_triple = doc.date_of_operation.split("/");
    var d = new Date(date_triple[2], date_triple[1] - 1, date_triple[0], 0, 0, 0, 0);
    emit([d, doc.name], 1);
  }
}
When I issue a GET request for this, I get the whole view's data (2.8MB):
$ curl -X GET http://somehost:5984/ops-db/_design/ops-views/_view/counts
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2751k 0 2751k 0 0 67456 0 --:--:-- 0:00:41 --:--:-- 739k
However, when I add a reduce function:
function (key, values, rereduce) {
  return sum(values);
}
I no longer get the per-key data when using curl:
$ curl -X GET http://somehost:5984/ops-db/_design/ops-views/_view/counts
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 42 0 42 0 0 7069 0 --:--:-- --:--:-- --:--:-- 8400
The result looks like this:
{"rows":[
{"key":null,"value":27065}
]}
The view's map & reduce functions were added through the Futon interface, and when the Reduce checkbox is checked there, I do get one row for every (date, name) pair, with the values accumulated for that pair. What changes when querying through a GET?

When calling the view through curl, you can pass the query parameters that trigger the reduce and the grouping, e.g. explicitly tell CouchDB to run the reduce function:
$ curl -X GET 'http://somehost:5984/ops-db/_design/ops-views/_view/counts?reduce=true'
Without grouping, the reduce collapses the whole key range into a single row with key null, which is the result you are seeing. To get one row per [date, name] key, as Futon shows when the Reduce checkbox is checked, add group=true (or use group_level to group on a key prefix):
$ curl -X GET 'http://somehost:5984/ops-db/_design/ops-views/_view/counts?group=true'
You can read more on the available options here (under the Querying Options section).
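With reduce=true but no group parameter, CouchDB runs the reduce over the entire key range and returns a single null-keyed row; with group=true it sums per distinct key instead. A minimal Python sketch of those two behaviours (this only mimics the semantics, nothing here talks to CouchDB, and the sample rows are invented):

```python
from itertools import groupby

# Rows as the map function would emit them: ([date, name], 1)
mapped = [
    (("2013-01-01", "alice"), 1),
    (("2013-01-01", "alice"), 1),
    (("2013-01-01", "bob"), 1),
    (("2013-01-02", "alice"), 1),
]

def query(rows, group=False):
    """Mimic a sum reduce: group=False collapses everything into one
    null-keyed row; group=True sums per exact composite key."""
    if not group:
        return [{"key": None, "value": sum(v for _, v in rows)}]
    out = []
    for key, grp in groupby(sorted(rows), key=lambda r: r[0]):
        out.append({"key": list(key), "value": sum(v for _, v in grp)})
    return out

print(query(mapped))              # one row, key null (the curl result above)
print(query(mapped, group=True))  # one row per [date, name] pair (the Futon result)
```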

The reduce can simply be the built-in reduce function:
_sum
So a simple "view" would look like this:
{
  "_id": "_design/foo",
  "_rev": "2-6145338c3e47cf0f311367a29787757c",
  "language": "javascript",
  "views": {
    "test1": {
      "map": "function(doc) {\n  emit(null, 1);\n}",
      "reduce": "_sum"
    }
  }
}

Related

Lost Clients PBI

I am trying to get the number of lost clients per month. The measure I'm using is shown below:
LostClientsRunningTotal =
VAR currdate = MAX('Date'[Date])
VAR turnoverinperiod = [Turnover]
VAR clients =
    ADDCOLUMNS(
        Client,
        "Turnover Until Now",
        CALCULATE(
            [Turnover],
            DATESINPERIOD('Date'[Date], currdate, -1, MONTH)
        ),
        "Running Total Turnover",
        [RunningTotalTurnover]
    )
VAR lostclients =
    FILTER(
        clients,
        [Running Total Turnover] > 0 &&
        [Turnover Until Now] = 0
    )
RETURN
    IF(turnoverinperiod > 0, COUNTROWS(lostclients))
The problem is that I'm getting the running total and the result it returns is the following:
[screenshot of the running-total result]
What I need is the lost clients per month, so I tried to use the DATEADD function to get the lost clients of the previous month and then subtract the current ones.
The desired result would be, for Nov-22 for instance, 629 (December running total) - 544 (November running total) = 85.
For some reason the **DATEADD** function is not returning the desired result, and I can't make heads or tails of it.
How should I approach this issue? Thank you in advance.
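The arithmetic described here (monthly lost clients = difference between consecutive running totals) can be sketched outside DAX. A minimal Python illustration using only the two numbers given in the example, and following the question's convention of assigning the December-minus-November difference to Nov-22 (the month labels and dict ordering are assumptions):

```python
# Running totals from the example: 544 by Nov-22, 629 by Dec-22.
running_total = {"Nov-22": 544, "Dec-22": 629}

months = list(running_total)  # insertion order: Nov-22, Dec-22
# Per-month figure = next month's running total minus this month's,
# i.e. 629 - 544 = 85 for Nov-22, matching the desired result above.
lost_per_month = {
    m: running_total[nxt] - running_total[m]
    for m, nxt in zip(months, months[1:])
}
print(lost_per_month)  # {'Nov-22': 85}
```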

Python comparing values from two dictionaries where keys match and one set of values is greater

I have the following datasets:
kpi = {
    "latency": 3,
    "cpu_utilisation": 0.98,
    "memory_utilisation": 0.95,
    "MIR": 200,
}
ns_metrics = {
    "timestamp": "2022-10-04T15:24:10.765000",
    "ns_id": "cache",
    "ns_data": {
        "cpu_utilisation": 0.012666666666700622,
        "memory_utilisation": 8.68265852766783,
    },
}
What I'm looking for is an elegant way to compare the cpu_utilisation and memory_utilisation values from each dictionary, and if either utilisation figure in ns_metrics is greater than the corresponding value in kpi, print (for now) a message saying which utilisation value was greater, i.e. cpu, memory, or both. Naturally, I can do something simple like this:
if ns_metrics["ns_data"]["cpu_utilisation"] > kpi["cpu_utilisation"]:
    print("true: over cpu threshold")
if ns_metrics["ns_data"]["memory_utilisation"] > kpi["memory_utilisation"]:
    print("true: over memory threshold")
But this seems a bit long-winded with many if conditions, and I was hoping there is a more elegant way of doing it. Any help would be greatly appreciated.
Maybe you can use a loop to do this:
check_list = ["cpu_utilisation", "memory_utilisation"]
for i in check_list:
    if ns_metrics["ns_data"][i] > kpi[i]:
        print("true: over {} threshold".format(i.split('_')[0]))
If the keys differ between the two dicts, you can use a mapping dict to do it, like this:
check_mapping = {"cpu_utilisation": "cpu_utilisation_1"}
for kpi_key, ns_key in check_mapping.items():
    ....
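The same check also fits in a list comprehension, which collects every metric that is over its threshold (cpu, memory, or both) in one pass:

```python
kpi = {"latency": 3, "cpu_utilisation": 0.98, "memory_utilisation": 0.95, "MIR": 200}
ns_metrics = {
    "timestamp": "2022-10-04T15:24:10.765000",
    "ns_id": "cache",
    "ns_data": {
        "cpu_utilisation": 0.012666666666700622,
        "memory_utilisation": 8.68265852766783,
    },
}

check_list = ["cpu_utilisation", "memory_utilisation"]
# Every metric whose ns_metrics value exceeds its kpi threshold.
over = [k for k in check_list if ns_metrics["ns_data"][k] > kpi[k]]

for k in over:
    print("true: over {} threshold".format(k.split('_')[0]))
# With the data from the question only memory_utilisation exceeds its KPI.
```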

Azure FaceAPI limits iteration to 20 items

I have a list of image urls from which I use the MS Azure Face API to extract some features from the photos. The problem is that whenever I iterate over more than 20 urls, it seems not to work on any url after the 20th one. There is no error shown. However, when I manually changed the range to iterate over the next 20 urls, it worked.
Side note: on the free version, MS Azure Face allows only 20 requests/minute; however, even when I sleep for up to 60s after every 10 requests, the problem still persists.
FYI, I have 360,000 urls in total, and so far I have made only about 1,000 requests.
Can anyone help tell me why this happens and how to solve this? Thank you so much!
# My code
i = 0
for post in list_post[800:1000]:
    i += 1
    try:
        image_url = post['photo_url']
        headers = {'Ocp-Apim-Subscription-Key': KEY}
        params = {
            'returnFaceId': 'true',
            'returnFaceLandmarks': 'false',
            'returnFaceAttributes': 'age,gender,headPose,smile,facialHair,glasses,emotion,hair,makeup,occlusion,accessories,blur,exposure,noise',
        }
        response = requests.post(face_api_url, params=params, headers=headers, json={"url": image_url})
        post['face_feature'] = response.json()[0]
    except (KeyError, IndexError):
        continue
    if i % 10 == 0:
        time.sleep(60)
The free version has a max of 30,000 requests per month, so your 360,000 faces will take about a year to run.
The standard version costs USD 1 per 1,000 requests, giving a total cost of USD 360. This option supports 10 transactions per second.
https://azure.microsoft.com/en-au/pricing/details/cognitive-services/face-api/
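One thing worth checking (an assumption, since the except clause in the question swallows everything): when the limit is hit, the service returns a JSON error object rather than a list of faces, so response.json()[0] raises KeyError and the except silently skips the post, which would explain seeing no error at all. A sketch that surfaces the error instead; parse_faces is a hypothetical helper, and the {"error": {...}} envelope follows Azure's usual error shape:

```python
def parse_faces(payload):
    """Return the first detected face, or raise with the service's message.

    Assumed shapes: a successful call returns a JSON list of faces,
    while a failed/throttled call returns {"error": {"code": ..., "message": ...}}.
    """
    if isinstance(payload, dict) and "error" in payload:
        err = payload["error"]
        raise RuntimeError("Face API error {}: {}".format(err.get("code"), err.get("message")))
    if not payload:  # empty list: no face detected in the image
        return None
    return payload[0]

# Fake response bodies for illustration:
ok = [{"faceId": "abc", "faceAttributes": {"age": 31.0}}]
limited = {"error": {"code": "429", "message": "Rate limit is exceeded."}}

print(parse_faces(ok)["faceId"])  # abc
try:
    parse_faces(limited)          # no longer fails silently
except RuntimeError as e:
    print(e)
```

In the loop from the question this would replace the bare `response.json()[0]`, so a throttled batch stops the run (or logs) instead of quietly producing no output.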

How to assign each item with the keys in map

I'm storing (non-personal) data in a list of Strings which is given values by the user once an action is carried out.
Since the List is only given values once the user performs the action, I don't know how many items there will be.
I'm trying to sort this List into two types of data: Strings and DateTime.
I've been trying to convert the list into a Map, but I'm not sure how to assign each item to the keys "appname" and "durations". How would you do this?
Please suggest a better way of solving this issue, if there is one.
**Update** - data example:
YouTube
1h 20m on screen - 2m background
1h 22m
Chrome
1h 26m on screen - 10m background
1h 36m
Google Maps
3h 4m on screen - 2h 54m background
5h 58m
The data is stored using a final List<String> _usage.
**Update:** added an example of what data should be parsed and how to access it.
I'm looking to parse this data as JSON:
{"app": "YouTube", "duration": "1h 22m"},
// removes the '1h 20m on screen - 2m background'
{"app": "Chrome", "duration": "1h 36m"},
// removes the '1h 26m on screen - 10m background'
{"app": "Google Maps", "duration": "5h 58m"}
// removes the '3h 4m on screen - 2h 54m background'
I'd preferably like to access it through a Map or HashMap type.
I guess you could make something like this, but it is difficult to make a stable solution without knowing all the pitfalls in the data. E.g. I assume we can always guarantee that the name of the app comes first and that some duration follows at some point:
List<String> _usage = [
  'YouTube',
  '1h 20m on screen - 2m background',
  '1h 22m',
  'Chrome',
  '1h 26m on screen - 10m background',
  '1h 36m',
  'Google Maps',
  '3h 4m on screen - 2h 54m background',
  '5h 58m'
];

void main(List arguments) {
  final map = <String, Duration>{};
  for (var i = 0; i < _usage.length; i += 3) {
    final app = _usage[i];
    final duration = _parseDuration(_usage[i + 2]);
    if (map.containsKey(app)) {
      map[app] += duration;
    } else {
      map[app] = duration;
    }
  }
  // {YouTube: 1:22:00.000000, Chrome: 1:36:00.000000, Google Maps: 5:58:00.000000}
  print(map);
}

final _regExp = RegExp(r'(?<hours>\d+)h (?<minutes>\d+)m');

Duration _parseDuration(String line) {
  final match = _regExp.firstMatch(line);
  if (match == null) {
    throw Exception('Could not get duration from: $line');
  }
  return Duration(
      hours: int.parse(match.namedGroup('hours')),
      minutes: int.parse(match.namedGroup('minutes')));
}
I also extended the solution so that if the same app name comes up again, we add the duration instead of overwriting the previous value.

Redis Node - Querying a list of 250k items of ~15 bytes takes at least 10 seconds

I'd like to query a whole list of 250k items of ~15 bytes each.
Each item (some coordinates) is a ~15-byte string like this: xxxxxx_xxxxxx_xxxxxx.
I'm storing them using this function :
function setLocation({id, lat, lng}) {
  const str = `${id}_${lat}_${lng}`
  client.lpush('locations', str, (err, status) => {
    console.log('pushed:', status)
  })
}
Using nodejs, doing a lrange('locations', 0, -1) takes between 10 seconds and 15 seconds.
Slowlog from the Redis Labs console: [screenshot]
I tried to use sets, same results.
According to this post, this shouldn't take more than a few milliseconds.
What am I doing wrong here?
Update:
I'm using an instance on Redis lab
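If the payload itself is the bottleneck (250k items plus protocol overhead shipped over a remote link to a hosted instance), fetching the list in pages at least bounds the work per round trip and keeps any single LRANGE from dominating the slowlog. A sketch of chunked retrieval; to keep it runnable without a server, FakeRedis is an in-memory stand-in for the client, and the chunk size is an arbitrary choice:

```python
class FakeRedis:
    """In-memory stand-in mimicking lrange(key, start, stop)
    with Redis semantics (stop is inclusive, -1 means end of list)."""
    def __init__(self, items):
        self._items = list(items)

    def lrange(self, key, start, stop):
        if stop == -1:
            return self._items[start:]
        return self._items[start:stop + 1]

def iter_list(client, key, chunk=10_000):
    """Yield list items page by page instead of one huge LRANGE 0 -1."""
    start = 0
    while True:
        page = client.lrange(key, start, start + chunk - 1)
        if not page:
            return
        yield from page
        start += chunk

client = FakeRedis(f"{i}_{i}_{i}" for i in range(25))
print(sum(1 for _ in iter_list(client, "locations", chunk=10)))  # 25
```

With a real redis client the same iter_list works unchanged, since lrange takes the same (key, start, stop) arguments.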
