SoundCloud API search filtering bug

There is a problem with the API's track filtering.
This request returns JSON:
https://api.soundcloud.com/tracks.json?offset=0&filter=public,streamable&tags=electonica&duration[from]=0&order_by=hotness&consumer_key=ZbOZAiYTx8wbuIaqRhfubg
But removing the zero from the duration parameter makes it return only an empty array ([]):
https://api.soundcloud.com/tracks.json?offset=0&filter=public,streamable&tags=electonica&duration[from]=&order_by=hotness&consumer_key=ZbOZAiYTx8wbuIaqRhfubg
The bigger issue is that the following parameters don't work either, even though they should return results. I expect tracks (mixes, in fact) longer than 25 minutes, hence duration[from]=1500000 (25 × 60 × 1000 ms):
http://api.soundcloud.com/tracks.json?offset=0&filter=public,streamable&tags=lounge&created_at[from]=2012-04-01%2023:29:12&duration[from]=1500000&order_by=hotness&consumer_key=ZbOZAiYTx8wbuIaqRhfubg
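For reference, here is a small sketch that rebuilds that last query programmatically, mainly to make the parameters and the millisecond arithmetic explicit; all values are copied from the URLs above (querystring percent-encodes the brackets and the space, which is the standards-compliant form of the same URL):

    # Rebuild the failing request; values are copied from the URLs above.
    querystring = require "querystring"
    params = querystring.stringify
      offset: 0
      filter: "public,streamable"
      tags: "lounge"
      "created_at[from]": "2012-04-01 23:29:12"
      "duration[from]": 25 * 60 * 1000   # 25 minutes in milliseconds = 1500000
      order_by: "hotness"
      consumer_key: "ZbOZAiYTx8wbuIaqRhfubg"
    url = "http://api.soundcloud.com/tracks.json?#{params}"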

Related

Logic App Until action, how to set count dynamically?

I have a logic app that is fetching data from an API endpoint. The API uses pagination, with a limit of 50 objects per request, and then provides a link for the next 50 objects, and so on until all the objects have been returned; however, I have no idea how many objects there will be for each request. My flow is briefly described below:
First, make an initial HTTP request against the endpoint.
Parse the HTTP response body to be able to use the nextLink URL provided.
Run an Until loop with the condition to run until nextLink is equal to null.
In the Until loop, a Set Variable action sets a new URL for each request, with a new pagination offset appended to the end of the URL: "&_offset=100".
The issue with the Until loop is that you can set limits for Count and Timeout. As I have no clue how many pages there will be, I expect this loop to run until the specified condition is met. However, I have tried specifying some different values, listed below:
Count = 1 - Resulted in just 1 run
Count = empty - Resulted in it running for an hour (approx 3300 loops), as specified by the Timeout value.
Count = 60 - Resulted in it running 60 times
I have researched how many pages this specific request has, and it turns out to be 290. My expectation is that this Until loop will run until nextLink is equal to null, which will be after 290 loops. But I wonder if there is any possibility to specify a dynamic value for Count in the Until action?
I expect the Until action to run as many times as needed based on how many pages there are; at least, that is what I suppose it should do, because if I need to specify how many times it should run, then this action is pretty useless. Hopefully someone in here has faced the same issue.
Best regards
As far as I know, the "Until" action requires us to define at least one limit to prevent endless loops.
For your problem, you can just define a count which is large enough to allow your endpoint to show all of the pages. If you want to specify a dynamic value for the count, you need to meet two conditions:
You have to be able to access the total number of pages (if your endpoint provides a URL to get it).
The count set in the "Until" action can only reference trigger inputs, trigger outputs, and parameters.
Based on the statement in your question, I guess you can't meet these two conditions, so I think you can just set a count which is large enough.
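For what it's worth, a sketch of what that looks like in the underlying workflow definition language (the action name and the count of 5000 are illustrative; per the constraint above, count could instead reference something like @parameters('maxPages'), but not a variable set during the run):

    "Until_nextLink_is_null": {
      "type": "Until",
      "expression": "@equals(variables('nextLink'), null)",
      "limit": {
        "count": 5000,
        "timeout": "PT2H"
      },
      "actions": {}
    }

The loop body (the HTTP request and the Set Variable action that updates nextLink) goes inside "actions"; it is omitted here.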

Mongodb empty find({}) query taking long time to return docs

I have ~19,000 docs in an mLab sandbox, total size ~160 MB. I will have queries that may want a heavily filtered subset of those docs, or a completely unfiltered one. In all cases, the information I need returned is a 100-element array that is an attribute of each document. So in the case of no filter, I expect to get back 19k arrays, which I will then do further processing on. The query I'm doing to test this is db.find({}, { vector: 1 });.
Using .explain(), I can see that this query doesn't take long at all. However, it takes upwards of 50 seconds for my code to see the returned data. At first I thought it was simply the download of a bunch of data taking a long time, but the total vector data size is only around 20 MB, which shouldn't take that long. (For context, my Node.js server is running locally on my PC while the DB is hosted, so there will be some transfer.)
What is taking so long? An empty query shouldn't take any time to execute, and projecting the results down to only the vectors means I should only have to download ~20 MB of data, which should be a matter of seconds.
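No answer is recorded for this one, but here is a minimal sketch of the query being described (assuming the 3.x Node.js driver; the connection string and database/collection names are placeholders), with a larger batch size as one knob worth trying, since over a remote link the number of getMore round trips can matter as much as the raw transfer size:

    # Placeholder connection string and names; only the vector field is
    # projected, so roughly the ~20 MB of vector data crosses the wire
    # rather than the full ~160 MB.
    { MongoClient } = require "mongodb"
    uri = "mongodb://user:pass@ds123456.mlab.com:12345/mydb"
    MongoClient.connect uri, (err, client) ->
      throw err if err
      t0 = Date.now()
      client.db("mydb").collection("docs")
        .find({}, projection: { vector: 1, _id: 0 })
        .batchSize(5000)   # fewer getMore round trips to the hosted sandbox
        .toArray (err, docs) ->
          throw err if err
          console.log "#{docs.length} docs in #{Date.now() - t0} ms"
          client.close()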

TableStorage queryEntities sometimes returning 0 entries but no error

Table Storage & Node.js
When using the function "queryEntities", result.entries.length is sometimes 0, even when I am pretty sure there are a lot of entries in the table. The "where" parameters are OK, but sometimes (maybe one call in every 100) it returns 0 entries. No error returned. Just 0 entries.
And that is causing trouble in my function.
My theory is that the database sometimes gets saturated: this function executes every 10 seconds, and maybe sometimes one run starts before the previous one finishes and both operate on the same table, and instead of an error it returns a length of 0, which is awful.
Is there any way to resolve this? Shouldn't it return an error?
This is expected behavior. In this particular scenario, please check for the presence of continuation tokens in the response. The presence of these tokens indicates that there may be more entities available matching the query, and you should execute the same query again with the continuation token you received.
Please read this document for explanation: https://learn.microsoft.com/en-us/rest/api/storageservices/query-timeout-and-pagination.
From this link:
A query against the Table service may return a maximum of 1,000 items at one time and may execute for a maximum of five seconds. If the result set contains more than 1,000 items, if the query did not complete within five seconds, or if the query crosses the partition boundary, the response includes headers which provide the developer with continuation tokens to use in order to resume the query at the next item in the result set.
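As a minimal sketch of that advice, assuming the classic azure-storage Node package (the table name and filter are placeholders): keep re-issuing the query with the returned continuation token until the token is gone, treating an empty entries array that arrives with a token as a normal intermediate response.

    # Drain a Table Storage query across continuation tokens.
    azure = require "azure-storage"
    tableService = azure.createTableService()   # uses AZURE_STORAGE_CONNECTION_STRING

    queryAll = (table, query, token, acc, done) ->
      tableService.queryEntities table, query, token, (err, result) ->
        return done err if err
        acc.push result.entries...
        if result.continuationToken
          # Zero entries plus a token is normal: resume with the token.
          queryAll table, query, result.continuationToken, acc, done
        else
          done null, acc

    # Usage: start with a null token and an empty accumulator.
    query = new azure.TableQuery().where "PartitionKey eq ?", "partition1"
    queryAll "mytable", query, null, [], (err, entries) ->
      throw err if err
      console.log "Got #{entries.length} entities"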

Mongoose limiting query to 1000 results when I want more/all (migrating from 2.6.5 to 3.1.2)

I'm migrating my app from Mongoose 2.6.5 to 3.1.2, and I'm running into some unexpected behavior. Namely, I notice that query results are automatically being limited to 1000 records, while pretty much everything else works the same. In my code (below) I set a value, maxIvDataPoints, that limits the number of data points returned (and ultimately sent to the client browser); that value was set elsewhere to 1500. I use a count query to determine the total number of potential results, and then a subsequent mod to limit the actual query results, using the count and the value of maxIvDataPoints to determine the value of the mod. I'm running Node 0.8.4 and MongoDB 2.0.4, writing server-side code in CoffeeScript.
Prior to installing Mongoose 3.1.x, the code was working as I had wanted, returning just under 1500 data points each time. After installing 3.1.2 I'm getting exactly 1000 data points returned each time (assuming there are more than 1000 data points in the specified range). The results are truncated, so that data points 1001 to ~1500 are the ones no longer being returned.
It seems there may be some setting somewhere that governs this behavior, but I can't find anything in the docs, on here, or in the Google group. I'm still a relative n00b so I may have missed something obvious.
DataManager::ivDataQueryStream = (testId, minTime, maxTime, callback) ->
  # If minTime and maxTime have been provided, set a flag to limit time extents of query
  unless isNaN(minTime)
    timeLimits = true
  # Load the max number of IV data points to be displayed from CONFIG
  maxIvDataPoints = CONFIG.maxIvDataPoints
  # Construct a count query to determine the number of IV data points in range
  ivCountQuery = TestDataPoint.count({})
  ivCountQuery.where "testId", testId
  if timeLimits
    ivCountQuery.gt "testTime", minTime
    ivCountQuery.lt "testTime", maxTime
  ivCountQuery.exec (err, count) ->
    ivDisplayQuery = TestDataPoint.find({})
    ivDisplayQuery.where "testId", testId
    if timeLimits
      ivDisplayQuery.gt "testTime", minTime
      ivDisplayQuery.lt "testTime", maxTime
    # If the data set is too large, use modulo to sample, keeping the total data series
    # for display below maxIvDataPoints
    if count > maxIvDataPoints
      dataMod = Math.ceil count/maxIvDataPoints
      ivDisplayQuery.mod "dataPoint", dataMod, 1
    ivDisplayQuery.sort "dataPoint" #, 1 <-- new sort syntax for Mongoose 3.x
    callback ivDisplayQuery.stream()
You're getting tripped up by a pair of related factors:
Mongoose's default query batchSize changed to 1000 in 3.1.2.
MongoDB has a known issue where a query that requires an in-memory sort places a hard limit, equal to the query's batch size, on the number of documents returned.
So your options are to put a compound index on TestDataPoint that would allow MongoDB to use it for sorting by dataPoint in this type of query, or to increase the batch size to at least the total count of documents you're expecting.
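Sketches of both workarounds, using names from the question's code (TestDataPointSchema is the hypothetical schema behind the TestDataPoint model):

    # Option 1: a compound index so MongoDB can sort on dataPoint via the
    # index instead of in memory (add to the TestDataPoint schema definition).
    TestDataPointSchema.index { testId: 1, dataPoint: 1 }

    # Option 2: raise the query's batch size above the expected result count,
    # e.g. just before the ivDisplayQuery.stream() call above.
    ivDisplayQuery.batchSize 2000   # comfortably above the ~1500 expected points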
Wow that's awful. I'll publish a fix to mongoose soon removing the batchSize default (was helpful when streaming large result sets). Thanks for the pointer.
UPDATE: 3.2.1 and 2.9.1 have been released with the fix (removed batchSize default).

Flickr API returning duplicate photos

I've come across a confusing issue with the Flickr API.
When I do a photo search (flickr.photos.search) and request high page numbers, I often get duplicate photos returned for different page numbers.
Here are three URLs; they should each return three different sets of images. However, they bizarrely return the same images:
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ca3035f67faa0fcc72b74cf6e396e6a7&tags=gizmo&tag_mode=all&per_page=3&page=6820
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ca3035f67faa0fcc72b74cf6e396e6a7&tags=gizmo&tag_mode=all&per_page=3&page=6821
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ca3035f67faa0fcc72b74cf6e396e6a7&tags=gizmo&tag_mode=all&per_page=3&page=6822
Has anyone else come across this?
I seem to be able to recreate this on any tag search.
Cheers.
After further investigation, it seems there's an undocumented "feature" built into the API which never allows you to get more than 4000 photos returned from flickr.photos.search.
So whilst 7444 pages are available, it will only let you load the first 1333 (4000 photos at 3 per page is roughly 1333 pages).
It is possible to retrieve more than 4000 images from Flickr; your query has to be paginated by (for example) temporal range, such that the total number of images from each sub-query is no more than 4000. You can also use other parameters, such as a bounding box, to limit the total number of images in the response.
For example, if you are searching with the tag 'dogs', this is what you can do (a binary search over the time range; a sketch follows the steps):
Specify a minimum date and a maximum date in the request URL, such as Jan 1st 1990 and Jan 1st 2015.
Inspect the total number of images in the response. If it is more than 4000, divide the temporal range in two and work on the first half until you get fewer than 4000 images from the query. Once you get that, request all the pages from that time range, then move on to the next interval and do the same until (a) the number of required images is met, or (b) you have searched the entire initial time interval.
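A minimal sketch of that range-splitting (CoffeeScript 2 with Node's global fetch; the API key is a placeholder, while the min_upload_date / max_upload_date Unix-timestamp parameters and the photos.total response field come from the Flickr API docs):

    API = "https://api.flickr.com/services/rest/"
    KEY = "YOUR_API_KEY"   # placeholder

    # Count how many photos a tag has in the upload-time range [start, end].
    countPhotos = (tag, start, end) ->
      url = "#{API}?method=flickr.photos.search&api_key=#{KEY}&tags=#{tag}" +
            "&min_upload_date=#{start}&max_upload_date=#{end}" +
            "&per_page=1&format=json&nojsoncallback=1"
      res = await fetch url
      body = await res.json()
      Number body.photos.total

    # Recursively split [start, end] until every slice holds at most 4000
    # photos; each slice can then be paged through without hitting the cap.
    splitRanges = (tag, start, end, out = []) ->
      total = await countPhotos tag, start, end
      if total <= 4000
        out.push [start, end] if total > 0
      else
        mid = Math.floor((start + end) / 2)
        await splitRanges tag, start, mid, out
        await splitRanges tag, mid + 1, end, out
      out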
