Spotify Metadata API: Search By Artist - search

The original plan was to write this as a blog post, entitled "Inefficiencies in the Spotify Metadata API : Or, How the Jackson 5 killed my Browser", but changed my mind at the last minute as I have a habit of missing the obvious in documentation,perhaps an undocumented feature might exist which I have missed, or someone else has solved the issue - so hence this question has a certain blog-post tone about it!
I am developing a small web-app, mostly for a small group of people, which will allow anyone to update a Spotify playlist. As not everyone has Spotify (though I don't know why!), the page will update a database with songs, as App running in Spotify on my laptop polls the database for updates, then using the Spotify Apps API, the playlist is updated, and anyone subscribing to the playlist gets the update. This is ok, though I would like to use push rather than poll, but that's a topic for another day.
I searched around for a Javascript library to use with the Spotify Metadata API, and found one (https://github.com/palmerj3/SpotifyJS) though it was basically a wrapper and still required you to parse the JSON yourself. Thinking I could go one better and put some basic parsing in for the most common fields (title, artist, album, Spotify URI) I started working on my own library/JQuery plugin.
Search by track is not a problem, it's a single call to the spotify metadata API, the results are easily parsed, matching the returned artist with the requested artist (if present) makes for an easy search by title/artist.
Search by Artist (obtain a list of all songs by a particular artist) though, appears to be a pain-in-the-**! As best as I can tell from the docs, this is the process.
Search for the artist: this will return a list of artists which match the query
For each artist, lookup their albums: this will return a list albums
Lookup each album and retrieve a track list
Compare the artist for each track with the search artist, if it matches output
The first step will return a small list of artist matches, Foo Fighters has 2, Silverchair 1, and The Jackson 5 has 4. This small list turns into a larger number of album matches - from memory Foo Fighters returned 112, which then turns into even larger number of track lists. From a Javascript/JQuery perspective, this leads to daisy-chained AJAX request, for each step, and at each step massive numbers of, nearly concurrent GET requests against the Spotify servers.
The initial version I wrote cheated and used synchronous AJAX, and worked ok, as each request must complete before the next would start, though, this would lock the browser up for some time, and removed the possibility of using feedback to the user that the system was running. I then switched to asynchronous requests and all hell broke loose! You immediately hit issues with rate limiting on the Spotify end, which returns resoponses with 502 bad gateway (not listed in the spotify docs as a status by the way), or 503 - both of which JQuery interpreted as status code 0 - which was interesting, requiring debugging in Firebug. I throttled the requests down on the client side, I found that 1 every second was about right, to avoid rate limiting and ensure I got a response containing data each time, however, this then causes massive lock ups in the browser as it had upwards of 30 or 40 GET requests in parallel, all returning pretty much at the same time (though some requests responded after 15+ seconds!) and then parsing all the JSON responses.
I looked into alleviating the load by using a server-side approach, though this has downsides as well:
1. you don't avoid the basic issue in that the API can not handle the task in an efficient manner
2. for a busy site, the bandwidth usage will be against the server, which will also present a single IP, for multiple users you will soon hit the rate limit due to parallel users
The server side does offer caching though which could be beneficial, to this end I found a PHP Library - metatune (https://github.com/mikaelbr/metatune) advertised as the "The ultimate PHP Wrapper for the Spotify Metadata API", but unfortunately only offers the same basic lookup/search as the Spotify Metadata API - i.e.: no listing of all songs by an artist.
Thus, I have now disabled searching by artist, until I find a suitable solution.
Assuming I have not missed anything, it seems, to me at least, to not be an efficient API design, as it encourages you to make large numbers of requests to the Spotify servers, which is not good for me as a client, and not good for Spotify as a server. I can't help but think that if there was a request such as:
ws.spotify.com/search/1/artist.json?q=foo+fighters&extras=tracks
then the issues discussed here would be alleviated, a single request would cover what requires 3 sets of multiple requests currently; rate limiting wouldn't be as big an issue; the overheads to process the data on the client are greatly reduced; the overheads for Spotify to handle would be reduced and the entire service would be more efficient. The fact that the request would return a very large data set is not an issue, as the API already splits data into "pages".
So, my questions to the crowd:
1. Have I missed something obvious in the documentation, or is there a secret request?
2. In the absence of an API request, does anyone have a suggestion on how to make my system more efficient?
3. Has anyone solved this issue before?
Thanks for reading! Took a long time to get to the questions, but I felt it necessary to provide as much reasoning to find the best solution, and also, it illustrates the deficiency in the API, which I hope someone from Spotify will notice!
Finally as an aside, projects like this make me feel like we've swapped Flash for Javascript but the performance is still as bad! Anyone else feel the same?
Cheers!
sockThief

Unless I'm missing something, does this do what you want?
http://ws.spotify.com/search/1/track.json?q=artist:foo+fighters
The artist: prefix tells the search service to only match on artist. You can read more about the advanced search syntax (which also works in the client) here.

Related

Shaping API endpoints and response

Hello Stackoverflow,
I'm writing API's for quite a bit of time right now and now it came to work with one of these bigger api's. Started wondering how to shape this API, as many times I've seen on a bigger platforms that one big entity (for example product page in shop) is loaded separately (We can see that item body loaded, but comments are still fetching etc.).
Usually what I've done was attaching comments as a relation in SQL query, so my frontend queried single API Endpoint like:
http://api.example.com/items/:id
And it returned all necessary data like seller info, photos etc.
Logically seller info and photos are small pieces of data (Item can only have 1 seller and no more than 10 photos for example), but number of comments might be way larger collection with relationship (comment author).
Does it make sense to separate one endpoint into 2 independent endpoints like:
http://api.example.com/items/:id
http://api.example.com/items/:id/comments
What are downsides of this approach? Is it common practice? Or maybe I misunderstood some concept?
One downside might be 2 request performed, but on the other hand, first endpoint should return data faster (as it's lighter than fetching n of comments), so page might be displayed faster and display spinner for comments section. This way I'll be able to paginate comments too.
Are there any improvements that might be included in this separation of endpoints? Or maybe I'm totally wrong and it should be done totally different way?
I think it is a good approach if:
The number of comments of one item can be large, because with this approach you could paginate it easier.
If you are going to need to access to the comments of one item without needing rest of item information
I think any of the previous conditions justify this decition, and yes, it is common approach.

How to Look Up Spotify IDs (Song / Track IDs) in Bulk?

I have a list of songs - is there a way (using the Spotify / Echo Nest API) to look up the Spotify ID for each track in bulk?
If it helps, I am planning on running these IDs through the "Get Audio Features" part of their API.
Thanks in advance!
You can use the Spotify Web API to retrieve song IDS. First, you'll need to register to use the API. Then, you will need to perform searches, like in the example linked here.
The Spotify API search will be most useful for you if you can provide specifics on albums and artists. The search API allows you to insert multiple query strings. Here is an example (Despacito by Justin Bieber:
https://api.spotify.com/v1/search?q=track:"' + despacito + '"%20artist:"' + bieber + '"&type=track
You can paste that into your browser and scan the response if you'd like. Ultimately you are interested in the song id, which you can find in the uri:
spotify:track:6rPO02ozF3bM7NnOV4h6s2
Whichever programming language you choose should allow you to loop through these calls to get the song IDs you want. Good luck!
It has been a few years, and I am curious how far you got with this project. I was doing the same thing around 2016 as well. I am just picking up the project again, and noticing you still cannot do large bulk ID queries by Artist,Title.
For now I am just handling HttpStatusCode 429 and sleeping the thread as I loop through a library. It's kind of slow but, I mean it gets the job done. After I get them I do the AudioFeatures query for 100 tracks at a time so it goes pretty quickly that way.
So far, this is the slowest part and I really wish there was a better way to do it, or even a way to make your own 'Audio Features' based on your library It just takes a lot of computing cycles. However ... one possible outcome might be to only do it for tracks that you cannot find on Spotify ;s

Instagram API media/popular

What are the queries we can use with media/popular. Can we localize it according to country or geolocation?
Also is there a way to get the discovery feature's results with the api?
This API is no longer supported.
Ref : https://www.instagram.com/developer/endpoints/media/
I was recently struggling with same problem and came to conclusion there is no other way except the hard one.
If you want location based popular images you must go with location endpoint.
https://api.instagram.com/v1/locations/214413140/media/recent
This link brings up recent media from custom location, key being the location-id. Your job is now to follow simple pagination api and merge responded arrays into one big bunch of JSON. $response['pagination']['next_max_id'] parameter is responsible for pagination, so you simply send every next request with max_id of previous request.
https://api.instagram.com/v1/locations/214413140/media/recent?max_id=1093665959941411696
End result will depend on the amount of information you gathered. In the end you will just gonna need to sort the array with like count and you're up to go whatever you were going to do.
Of course important part is to save images locally rather than generating every time user opens the webpage. Reason being not only generation time but limited amount of requests per hour.
Hope someone will come up better solution or Instagram API will finally support media/popular by location.

Spotify API, same music with differents IDs in App get the same IDs from API

Title says almost everything.
I found that the music "Boom - 2006 Remastered Version" has two different IDs that can be found in the App:
3EKjTDAEIdyQqsA9qtb5P2
0zlAqnRv07p9ezzFf3k2ky
But when using the API to get information about each one, it returns the same ID:
3EKjTDAEIdyQqsA9qtb5P2
Is this a bug?
It is unfortunately not a bug, but it is indeed very annoying, and your code needs to be able to handle it.
"Give me info for track A! Ok, here is info for track B, just like you asked".
It is a legacy thing still left in the Spotify metadata model called track redirects (the some concept exist on albums and artist too, but are less of a problem there). It was made so that we could quickly merge duplicate albums. It means that once upon time, there were two "different" tracks on different albums that were identical. We had lots of them on artists pages for popular artists. Labels would very often upload one album for one country and another identical one for another country instead of just saying that one album was available in two countries. Sometimes by mistake, most often because of cross licensing issues between labels and countries.
Track redirects are quite rare though if you look at the entire catalog. Most of these redirect tracks are only surfaced in old playlists and are for instance never returned in search results or artist pages. These days we never merge duplicates like this, but instead make sure only one is shown on artist pages, etc. and link to the other in case one is unavailable in your country. That is the concept called Track relinking in the docs. https://developer.spotify.com/web-api/track-relinking-guide/
I work at Spotify and bump into this problem every now and then. I want to change this so the tracks and album become just regular duplicates, because it is much easier to reason about, but it will take a while to fix. I guess I can update my answer here in a few years when it is done.

Instagram synced many images from a tag with Real time, what to do with deleted images

Using the Instagram API, I subscribed to a tag with the Real time feature. I sync media that match my project's criteria, then save those to DB. When users visit my website, I display these images from my DB (and not from instagram API).
From time to time, I see broken links showing up in the images. I identified that the source of the problem is that those images have now been deleted.
What's a good way to handle this?
Probably not attempting to duplicate the Instagram DB (or part thereof) would be the best option. Depending on the usage of your project and what sort of tags you're subscribing to, that could get pretty large pretty quickly.
Short of that, doing a quick HTTPRequest to the image URL (and checking the response code) before deciding whether to display it would do the job.
#Steve Crawford is on the right track.
The problem with your solution is that you are duplicating volatile data that you:
a) can't control
b) don't receive notifications on.
I would think the better method would be track the meta data of the images you are interested (like the author, url,date,etc) and then display them if they are still available.
If you are going to cache data you also need to a way to invalidate your cache. So another option would be to duplicate the data as you already are, but also have a background job to ensure that the data is still valid and remove the ones that aren't.

Resources