Select top/latest 10 in couchdb? - couchdb

How would I execute a query equivalent to "select top 10" in couch db?
For example I have a "schema" like so:
title body modified
and I want to select the last 10 modified documents.
As an added bonus if anyone can come up with a way to do the same only per category. So for:
title category body modified
return a list of latest 10 documents in each category.
I am just wondering if such a query is possible in couchdb.

To get the first 10 documents from your db you can use the limit query option.
E.g. calling
http://localhost:5984/yourdb/_design/design_doc/_view/view_name?limit=10
You get the first 10 documents.
View rows are sorted by the key; adding descending=true in the querystring will reverse their order. You can also emit only the documents you are interested using again the querystring to select the keys you are interested.
So in your view you write your map function like:
function(doc) {
emit([doc.category, doc.modified], doc);
}
And you query it like this:
http://localhost:5984/yourdb/_design/design_doc/_view/view_name?startkey=["youcategory"]&endkey=["youcategory", date_in_the_future]&limit=10&descending=true

here is what you need to do.
Map function
function(doc)
{
if (doc.category)
{
emit(['category', doc.category], doc.modified);
}
}
then you need a list function that groups them, you might be temped to abuse a reduce and do this, but it will probably throw errors because of not reducing fast enough with large sets of data.
function(head, req)
{
% this sort function assumes that modifed is a number
% and it sorts in descending order
function sortCategory(a,b) { b.value - a.value; }
var categories = {};
var category;
var id;
var row;
while (row = getRow())
{
if (!categories[row.key[0]])
{
categories[row.key[0]] = [];
}
categories[row.key[0]].push(row);
}
for (var cat in categories)
{
categories[cat].sort(sortCategory);
categories[cat] = categories[cat].slice(0,10);
}
send(toJSON(categories));
}
you can get all categories top 10 now with
http://localhost:5984/database/_design/doc/_list/top_ten/by_categories
and get the docs with
http://localhost:5984/database/_design/doc/_list/top_ten/by_categories?include_docs=true
now you can query this with a multiple range POST and limit which categories
curl -X POST http://localhost:5984/database/_design/doc/_list/top_ten/by_categories -d '{"keys":[["category1"],["category2",["category3"]]}'
you could also not hard code the 10 and pass the number in through the req variable.
Here is some more View/List trickery.

slight correction. it was not sorting untill I added the "return" keyword in your sortCategory function. It should be like this:
function sortCategory(a,b) { return b.value - a.value; }

Related

How to access 'Abbreviation' field of a custom list in NetSuite custom lists

I have a custom list that is used as a matrix option of Inventory item. Its 'Color'. This custom list has an abbreviation column. I am creating a saved search on item and using Color field(join) and trying to access 'Abbreviation' field of color.
Abbreviation on custom list is available on when 'Matrix Option List' is checked.
Can someone please help me achieve this? I tried to do this through script and it seems like we cannot access 'Abbreviation' column through script. I also tried to use script to write a search directly on 'Color' - custom list and get the 'abbreviation' through search columns. It did not work. Is there a way to access 'Abbreviation' from custom lists?
Thanks in Advance
You can access it via suitescript by using the record type "customlist" and the internal id of the list like so:
var rec = nlapiLoadRecord('customlist', 5);
var abbreviation = rec.getLineItemValue('customvalue', 'abbreviation', 1);
nlapiLogExecution('DEBUG', 'abbreviation', abbreviation);
Keep in mind that the third argument of getLineItemValue is the line number, not the internal ID of the item in the list. If you want to find a specifc line item, you may want to use rec.findLineItemValue(group, fldnam, value).
Unfortunately, it doesn't look like this translates to saved searches. The suiteanswer at https://netsuite.custhelp.com/app/answers/detail/a_id/10653 has the following code:
var col = new Array();
col[0] = new nlobjSearchColumn('name');
col[1] = new nlobjSearchColumn('internalid');
var results = nlapiSearchRecord('customlist25', null, null, col);
for ( var i = 0; results != null && i < results.length; i++ )
{
var res = results[i];
var listValue = (res.getValue('name'));
var listID = (res.getValue('internalid'));
nlapiLogExecution('DEBUG', (listValue + ", " + listID));
}
However, whatever part of the application layer translates this into a query doesn't handle the abbreviation field. One thing to keep in mind is that the 'custom list' record is basically a header record, and each individual entry is it's own record that ties to it. You can see some of the underlying structure here, but the takeaway is that you'd need some way to drill-down into the list entries, and the saved search interface doesn't really support it.
I could be wrong, but I don't think there's any way to get it to execute in a saved search as-is. I thought the first part of my answer might help you find a workaround though.
Here is a NetSuite SuiteScript 2.0 search I use to find the internalId for a given abbreviation in a custom list.
/**
* look up the internal id value for an abbreviation in a custom list
* #param string custom_list_name
* #param string abbreviation
* #return int
* */
function lookupNetsuiteCustomListInternalId( custom_list_name, abbreviation ){
var internal_id = -1;
var custom_list_search = search.create({
type: custom_list_name,
columns: [ { name:'internalId' }, { name:'abbreviation' } ]
});
var filters = [];
filters.push(
search.createFilter({
name: 'formulatext',
formula: "{abbreviation}",
operator: search.Operator.IS,
values: abbreviation
})
);
custom_list_search.filters = filters;
var result_set = custom_list_search.run();
var results = result_set.getRange( { start:0, end:1 } );
for( var i in results ){
log.debug( 'found custom list record', results[i] );
internal_id = results[i].getValue( { name:'internalId' } );
}
return internal_id;
}
Currently NetSuite does not allows using join on matrix option field. But as you mentioned, you can use an extra search to get the result, you could first fetch color id from item and then use search.lookupFields as follows
search.lookupFields({ type: MATRIX_COLOR_LIST_ID, id: COLOR_ID, columns: ['abbreviation'] });
Note: Once you have internalid of the color its better to use search.lookupFields rather than creating a new search with search.create.

synchronous execution of database function in node.js

There are two collections products and stock in my database.Each product have multipple stock with different supplier and attributes of products.
I have selected the products from products collection and run a loop for each product. I need to append price and offer price from the stock collection to products i have selected already.
I think the for loop compleated its execution before executiong the find method on stock collection.I need to execute everything in the loop in serial manner(not asynchronous). Please check the code below and help me to solve this. I'm new in node.js and mongodb
collection = db.collection('products');
collection.find().toArray(function(err, abc) {
var finalout = [];
for( var listProducts in abc){
var subfinal = {
'_id' :abc[listProducts]['_id'],
'min_price' :abc[listProducts]['value']['price'],
'stock' :abc[listProducts]['value']['stock'],
'name' :abc[listProducts]['value']['name'],
'price' :'',
'offer_price' :'',
};
collection = db.collection('stock');
collection.find({"product":abc[listProducts]['_id'] ,"supplier":abc[listProducts]['value']['def_supplier']}).toArray(function(err, newprod) {
for( var row in newprod){
subfinal['price'] = newprod[row]['price'];
subfinal.offer_price = newprod[row]['offer_price'];
}
finalout.push(subfinal);
});
}
console.log(finalout);
});
Yes, the loop is running and starting the get requests on the database all at the same time. It is possible, like you said, to run them all sequentially, however it's probably not what you're looking for. Doing it this way will take more time, since each request to Mongo would need to wait for the previous one to finish.
Considering that each iteration of the loop doesn't depend on the previous ones, all you're really looking for is a way to know once ALL of the operations are finished. Keep in mind that these can end in any order, not necessarily the order in which they were initiated in the loop.
There's was also an issue in the creation of the subfinal variable. Since the same variable name is being used for all iterations, when they all come back they'll be using the final assignment of the variable (all results will be written on the same subfinal, which will have been pushed multiple times into the result). To fix that, I've wrapped the entire item processing iteration into another function to give each subfinal variable it's own scope.
There are plenty of modules to ease this kind of management, however here is one way to run some code only after all the Mongo calls are finished using a counter and callback that doesn't require any additional installs:
var finished = 0; // incremented every time an iteration is done
var total = Object.keys(abc).length; // number of keys in the hash table
collection = db.collection('products');
collection.find().toArray(function(err, abc) {
var finalout = [];
for( var listProducts in abc){
processItem(listProducts);
}
});
function processItem(listProducts) {
var subfinal = {
'_id' :abc[listProducts]['_id'],
'min_price' :abc[listProducts]['value']['price'],
'stock' :abc[listProducts]['value']['stock'],
'name' :abc[listProducts]['value']['name'],
'price' :'',
'offer_price' :'',
};
collection = db.collection('stock');
collection.find({"product":abc[listProducts]['_id'] ,"supplier":abc[listProducts]['value']['def_supplier']}).toArray(function(err, newprod) {
for( var row in newprod){
subfinal['price'] = newprod[row]['price'];
subfinal.offer_price = newprod[row]['offer_price'];
}
finalout.push(subfinal);
finished++;
if (finished === total) { // if this is the last one to finish.
allFinished(finalout);
}
});
}
function allFinished(finalout) {
console.log(finalout);
}
It could be written more concisely with less variables, but it should be easier to understand this way.

Distinct values in Azure Search Suggestions?

I am offloading my search feature on a relational database to Azure Search. My Products tables contains columns like serialNumber, PartNumber etc.. (there can be multiple serialNumbers with the same partNumber).
I want to create a suggestor that can autocomplete partNumbers. But in my scenario I am getting a lot of duplicates in the suggestions because the partNumber match was found in multiple entries.
How can I solve this problem ?
The Suggest API suggests documents, not queries. If you repeat the partNumber information for each serialNumber in your index and then suggest based on partNumber, you will get a result for each matching document. You can see this more clearly by including the key field in the $select parameter. Azure Search will eliminate duplicates within the same document, but not across documents. You will have to do that on the client side, or build a secondary index of partNumbers just for suggestions.
See this forum thread for a more in-depth discussion.
Also, feel free to vote on this UserVoice item to help us prioritize improvements to Suggestions.
I'm facing this problem myself. My solution does not involve a new index (this will only get messy and cost us money).
My take on this is a while-loop adding 'UserIdentity' (in your case, 'partNumber') to a filter, and re-search until my take/top-limit is met or no more suggestions exists:
public async Task<List<MachineSuggestionDTO>> SuggestMachineUser(string searchText, int take, string[] searchFields)
{
var indexClientMachine = _searchServiceClient.Indexes.GetClient(INDEX_MACHINE);
var suggestions = new List<MachineSuggestionDTO>();
var sp = new SuggestParameters
{
UseFuzzyMatching = true,
Top = 100 // Get maximum result for a chance to reduce search calls.
};
// Add searchfields if set
if (searchFields != null && searchFields.Count() != 0)
{
sp.SearchFields = searchFields;
}
// Loop until you get the desired ammount of suggestions, or if under desired ammount, the maximum.
while (suggestions.Count < take)
{
if (!await DistinctSuggestMachineUser(searchText, take, searchFields, suggestions, indexClientMachine, sp))
{
// If no more suggestions is found, we break the while-loop
break;
}
}
// Since the list might me bigger then the take, we return a narrowed list
return suggestions.Take(take).ToList();
}
private async Task<bool> DistinctSuggestMachineUser(string searchText, int take, string[] searchFields, List<MachineSuggestionDTO> suggestions, ISearchIndexClient indexClientMachine, SuggestParameters sp)
{
var response = await indexClientMachine.Documents.SuggestAsync<MachineSearchDocument>(searchText, SUGGESTION_MACHINE, sp);
if(response.Results.Count > 0){
// Fix filter if search is triggered once more
if (!string.IsNullOrEmpty(sp.Filter))
{
sp.Filter += " and ";
}
foreach (var result in response.Results.DistinctBy(r => new { r.Document.UserIdentity, r.Document.UserName, r.Document.UserCode}).Take(take))
{
var d = result.Document;
suggestions.Add(new MachineSuggestionDTO { Id = d.UserIdentity, Namn = d.UserNamn, Hkod = d.UserHkod, Intnr = d.UserIntnr });
// Add found UserIdentity to filter
sp.Filter += $"UserIdentity ne '{d.UserIdentity}' and ";
}
// Remove end of filter if it is run once more
if (sp.Filter.EndsWith(" and "))
{
sp.Filter = sp.Filter.Substring(0, sp.Filter.LastIndexOf(" and ", StringComparison.Ordinal));
}
}
// Returns false if no more suggestions is found
return response.Results.Count > 0;
}
public async Task<List<string>> SuggestionsAsync(bool highlights, bool fuzzy, string term)
{
SuggestParameters sp = new SuggestParameters()
{
UseFuzzyMatching = fuzzy,
Top = 100
};
if (highlights)
{
sp.HighlightPreTag = "<em>";
sp.HighlightPostTag = "</em>";
}
var suggestResult = await searchConfig.IndexClient.Documents.SuggestAsync(term, "mysuggestion", sp);
// Convert the suggest query results to a list that can be displayed in the client.
return suggestResult.Results.Select(x => x.Text).Distinct().Take(10).ToList();
}
After getting top 100 and using distinct it works for me.
You can use the Autocomplete API for that where does the grouping by default. However, if you need more fields together with the result, like, the partNo plus description it doesn't support it. The partNo will be distinct though.

Map/Reduce differences between Couchbase & CloudAnt

I've been playing around with Couchbase Server and now just tried replicating my local db to Cloudant, but am getting conflicting results for my map/reduce function pair to build a set of unique tags with their associated projects...
// map.js
function(doc) {
if (doc.tags) {
for(var t in doc.tags) {
emit(doc.tags[t], doc._id);
}
}
}
// reduce.js
function(key,values,rereduce) {
if (!rereduce) {
var res=[];
for(var v in values) {
res.push(values[v]);
}
return res;
} else {
return values.length;
}
}
In Cloudbase server this returns JSON like:
{"rows":[
{"key":"3d","value":["project1","project3","project8","project10"]},
{"key":"agents","value":["project2"]},
{"key":"fabrication","value":["project3","project5"]}
]}
That's exactly what I wanted & expected. However, the same query on the Cloudant replica, returns this:
{"rows":[
{"key":"3d","value":4},
{"key":"agents","value":1},
{"key":"fabrication","value":2}
]}
So it somehow only returns the length of the value array... Highly confusing & am grateful for any insights by some M&R ninjas... ;)
It looks like this is exactly the behavior you would expect given your reduce function. The key part is this:
else {
return values.length;
}
In Cloudant, rereduce is always called (since the reduce needs to span over multiple shards.) In this case, rereduce calls values.length, which will only return the length of the array.
I prefer to reduce/re-reduce implicitly rather than depending on the rereduce parameter.
function(doc) { // map
if (doc.tags) {
for(var t in doc.tags) {
emit(doc.tags[t], {id:doc._id, tag:doc.tags[t]});
}
}
}
Then reduce checks whether it is accumulating document ids from the identical tag, or whether it is just counting different tags.
function(keys, vals, rereduce) {
var initial_tag = vals[0].tag;
return vals.reduce(function(state, val) {
if(initial_tag && val.tag === initial_tag) {
// Accumulate ids which produced this tag.
var ids = state.ids;
if(!ids)
ids = [ state.id ]; // Build initial list from the state's id.
return { tag: val.tag,
, ids: ids.concat([val.id])
};
} else {
var state_count = state.ids ? state.ids.length : state;
var val_count = val.ids ? val.ids.length : val;
return state_count + val_count;
}
})
}
(I didn't test this code, but you get the idea. As long as the tag value is the same, it doesn't matter whether it's a reduce or rereduce. Once different tags start reducing together, it detects that because the tag value will change. So at that point just start accumulating.
I have used this trick before, although IMO it's rarely worth it.
Also in your specific case, this is a dangerous reduce function. You are building a wide list to see all the docs that have a tag. CouchDB likes tall lists, not fat lists. If you want to see all the docs that have a tag, you could map them.
for(var a = 0; a < doc.tags.length; a++) {
emit(doc.tags[a], doc._id);
}
Now you can query /db/_design/app/_view/docs_by_tag?key="3d" and you should get
{"total_rows":287,"offset":30,"rows":[
{"id":"project1","key":"3d","value":"project1"}
{"id":"project3","key":"3d","value":"project3"}
{"id":"project8","key":"3d","value":"project8"}
{"id":"project10","key":"3d","value":"project10"}
]}

Getting a list of documents with a maximum field value from CouchDB view

Let's say I have blog entries like these in my CouchDB database:
{"name":"Mary", "postdate":"20110412", "subject":"this", "message":"blah"}
{"name":"Joe", "postdate":"20110411", "subject":"that", "message":"yadda"}
{"name":"Mary", "postdate":"20110411", "subject":"and this", "message":"blah-blah"}
{"name":"Joe", "postdate":"20110410", "subject":"And other thing", "message":"yada-yada"}
{"name":"Jane", "postdate":"20110409", "subject":"Serious stuff", "message":"Not really"}
It's pretty easy to get a list of all posts. But how do I get a list of latest posts from all the users?
Like that:
{"name":"Mary", "postdate":"20110412", "subject":"this", "message":"blah"}
{"name":"Joe", "postdate":"20110411", "subject":"that", "message":"yadda"}
{"name":"Jane", "postdate":"20110409", "subject":"Serious stuff", "message":"Not really"}
Try with this map function:
function(doc) {
if (doc.postdate && doc.name) {
emit([doc.name, doc.postdate], 1);
}
}
and the following reduce function:
function(keys, values, rereduce) {
var max = 0,
ks = rereduce ? values : keys;
for (var i = 1, l = ks.length; i < l; ++i) {
if (ks[max][0][1] < ks[i][0][1]) max = i;
}
return ks[max];
}
and query it with group_level=1. It gives you the _id of the posts, then you can retrieve them all with a single query with the keys parameter or using a POST.
I am not sure if this is the best approach, but it seems to work.
UPDATE: fixed map to handle rereduce correctly.
You're going to emit the postdate as the key because keys are sorted. For example, this is what your map function will look like...
function(doc) {
if(doc.postdate) {
emit(doc.postdate, doc);
}
}
That will give you all the docs sorted ascending by postdate. If you want descending then query with ?descending=true
Cheers.

Resources