Is there a way to get Mel-frequency cepstrum coefficients of a track from the Spotify API?

I am looking to get the MFCCs (Mel-frequency cepstrum coefficients) of a Spotify track. My main aim is to identify the genre of a track, and the algorithm I'm studying right now uses MFCCs to extract features from a track.
I think there might be two ways to do this:
1. Spotify's API has an endpoint called https://api.spotify.com/v1/audio-analysis/{id}. This is what the output looks like for a track. Maybe there is a way to get the MFCCs from this output?
2. Get the raw audio of the track from an API endpoint and then use a (different) library to compute the MFCCs from it (see the sketch below).
Or, is there any other method I can try?
Thanks :)
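For option 2, the closest thing to raw audio that the Web API exposes is the 30-second preview_url of a track. A rough sketch of that route, assuming the track has a preview and using librosa (a separate library, not part of the Spotify API) to compute the MFCCs:

```python
import tempfile
import urllib.request

import librosa
import numpy as np

# Placeholder: take preview_url from the track object returned by GET /v1/tracks/{id}
preview_url = "https://p.scdn.co/mp3-preview/..."

# Save the 30-second MP3 preview to a temporary file, then decode it to mono audio
with tempfile.NamedTemporaryFile(suffix=".mp3") as tmp:
    tmp.write(urllib.request.urlopen(preview_url).read())
    tmp.flush()
    y, sr = librosa.load(tmp.name, sr=22050, mono=True)

# 13 MFCCs per frame; the result has shape (13, n_frames)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

# A common track-level feature vector for genre classification: per-coefficient mean and std
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
print(features.shape)  # (26,)
```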
Edit:
The output of the audio-analysis API for the track given here contains a key called "tmfccrack". Is this related to the MFCCs?
I found out that you can get the genre of a Spotify track by getting the genre of the corresponding artist through the Spotify API. That gets me what I want for now, but I think I should keep the question open because it asks for the MFCCs of a track and not just the genre.
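For anyone reading later, a minimal sketch of that genre-via-artist workaround (the access token and track ID are placeholders):

```python
import requests

TOKEN = "YOUR_ACCESS_TOKEN"   # placeholder OAuth token
TRACK_ID = "SOME_TRACK_ID"    # placeholder track ID
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# 1. Fetch the track to find its primary artist (track objects have no genre field)
track = requests.get(
    f"https://api.spotify.com/v1/tracks/{TRACK_ID}", headers=HEADERS
).json()
artist_id = track["artists"][0]["id"]

# 2. Artist objects do carry a list of genres
artist = requests.get(
    f"https://api.spotify.com/v1/artists/{artist_id}", headers=HEADERS
).json()
print(artist["genres"])  # e.g. ['pop', 'dance pop']
```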

Related

How is a track's 30-second preview, obtained from preview_url, defined (Spotify Web API)?

I am interested in using the raw audio data provided by the Spotify Web API in Python. I wonder whether the audio sample follows any rules that define the 30 seconds provided by the preview_url.
preview_url | string | A link to a 30 second preview (MP3 format) of the track. Can be null
Is the 30-second clip of the track extracted from:
The first 30 sec?
The track after 1 minute?
The track between 1-3mins?
A random part of the track?
Spotify analyses every track and is able to tell where the different parts of the song begin and end.
I suppose that what you hear in the 30-second preview is Spotify's guess at the refrain/main part of the song.
Therefore you can't generally say which part is chosen, because that is determined algorithmically for each song.
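You can inspect those detected parts yourself: the audio-analysis endpoint returns a "sections" array with start times and durations. A rough sketch (the token and track ID are placeholders); note that the API does not tell you which section the preview was cut from:

```python
import requests

TOKEN = "YOUR_ACCESS_TOKEN"   # placeholder
TRACK_ID = "SOME_TRACK_ID"    # placeholder

analysis = requests.get(
    f"https://api.spotify.com/v1/audio-analysis/{TRACK_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
).json()

# Each section has a start time, duration and confidence, plus loudness/tempo/key estimates
for i, section in enumerate(analysis["sections"]):
    print(f"section {i}: starts at {section['start']:.1f}s, "
          f"lasts {section['duration']:.1f}s, loudness {section['loudness']} dB")
```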

Getting total number of streams and track release date through Spotify API

I'm trying to get a large list of songs released in year X, together with their number of plays/streams.
I've been using the Spotify API, and I have a number of highly popular songs. Now, for my purposes, I also need a list of non-popular songs (low play counts). I am wondering if there is any strategy to get a list of songs (maybe recently played ones?) and extract their release year and total number of plays.
I've been going through the API documentation and I can only find 'popularity', which seems different from the total number of plays. Secondly, I haven't found a way to get a list of recently played songs yet. Should I be considering another type of strategy?
I know that you can get a list of recently played songs of all users in certain user groups on last.fm. Perhaps there is something similar in the Spotify API?
Unfortunately, there is no way to get play counts through the Spotify API, only the Popularity metric.
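What you can do with the API is search by release year and read each track's popularity (a relative 0-100 score, not a play count) and the album's release_date. A minimal sketch, with a placeholder token and an arbitrary keyword combined with the year filter:

```python
import requests

TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder OAuth token

resp = requests.get(
    "https://api.spotify.com/v1/search",
    headers={"Authorization": f"Bearer {TOKEN}"},
    # A keyword combined with the year: field filter; paginate with offset for more results
    params={"q": "love year:2015", "type": "track", "limit": 50, "offset": 0},
).json()

for track in resp["tracks"]["items"]:
    print(track["name"],
          track["album"]["release_date"],  # e.g. '2015-06-19'
          track["popularity"])             # 0-100, not a stream count
```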

How to get comments from social media and use them as your data?

I've proposed a title for our thesis, "Movie Success Prediction through Social Media Comments using Sentiment Analysis". Is there a way to get the comments on social media (Twitter, Instagram, Facebook, etc.) and use them in your software, like an API or any other way? Is it even possible to use your software on different social media platforms to get the comments for prediction, or should I change my title and stick to one platform like Facebook or Twitter only?
What's a good algorithm for this?
What programming language and framework/IDE should I use?
I've done lots of research on Google and am still hoping for more info here. Thank you.
Edit: I'll only use YouTube and the YouTube API.
From the title of your question, it seems that the method you need is distant supervision. You retrieve data with labels you think are appropriate for your task. For instance, a tweet containing the #perfect hashtag would probably be a positive tweet. So, you can define sets of hashtags for your task (negative, positive, or even neutral) and then retrieve tweets for those via the Twitter API. For your task, these should be about movies, so your data should contain movie-related information in the first place.
Given that you will deal with text data and you'd like to create your own dataset, it is better to start with Twitter. Its API works for your needs and is very well documented. The language and frameworks are up to you, since the APIs support many popular languages. Personally, I'd start with Python or Java to solve future problems more easily with community support.
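As a minimal sketch of the labeling step of distant supervision (the hashtag seed sets are made up for illustration; the tweets themselves would come from a Twitter API search restricted to movie-related queries):

```python
# Hypothetical seed hashtag sets; in practice you would pick ones suited to movie tweets
POSITIVE_TAGS = {"#perfect", "#loved", "#mustwatch"}
NEGATIVE_TAGS = {"#boring", "#waste", "#worstmovie"}

def distant_label(tweet_text):
    """Assign a noisy sentiment label from hashtags alone, with no human annotation."""
    tags = {tok.lower().rstrip(".,!?") for tok in tweet_text.split() if tok.startswith("#")}
    is_pos = bool(tags & POSITIVE_TAGS)
    is_neg = bool(tags & NEGATIVE_TAGS)
    if is_pos and not is_neg:
        return "positive"
    if is_neg and not is_pos:
        return "negative"
    return None  # ambiguous or unlabeled; drop it from the training set

tweets = [
    "That ending was #perfect, a #mustwatch",
    "Two hours I will never get back #boring",
]
print([(t, distant_label(t)) for t in tweets])
```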
For a general survey of this area, you may dive into papers and resources from here:
https://scholar.google.com.tr/scholar?hl=en&q=distant+supervision+sentiment+analysis
Distant supervision can also be used to create a sentiment lexicon out of millions of English tweets by using sets of negative and positive hashtags. You may take a look at Chapter 5 of this thesis (https://spectrum.library.concordia.ca/980377/1/Ozdemir_MCompSc_F2015.pdf); this may give a good insight for your thesis, too.
Hope this helps.
Cheers

Bing Speech to Text API returning very wrong text

I am trying the Bing Speech To Text API on audio files that contain real conversations between a person who answers customers in a call center and a customer who calls the call center to resolve their questions. Thus, these audios have two people talking, and sometimes long periods of silence while the customer waits for an answer from support. These audios are 5 to 10 minutes long.
My questions are:
What is the best approach to transcribe audios like these using Microsoft Cognitive Services?
What APIs do I have to use, besides Bing Speech To Text?
Do I have to cut or convert the audios before sending them to Bing Speech To Text?
I am asking because the Bing Speech to Text API is returning text that is very different from the audio content. It is impossible to use or understand. But, of course, I think I am making some mistake.
Please, could you explain to me the best strategy to work with audio files like this?
I would be very glad for any help.
Best Regards,
I had run into this problem with conversations as well. Make sure that the transcription mode is set to "conversation" instead of "interactive."
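For what it's worth, with the old Bing Speech REST endpoint the recognition mode is part of the URL path, so "conversation" is selected like this. A rough sketch under that assumption (placeholder key; the token endpoint, region and audio format may differ in your setup, and the REST endpoint only accepts short utterances, so a 5-10 minute call has to be split into chunks or sent through the client/WebSocket SDK instead):

```python
import requests

SUBSCRIPTION_KEY = "YOUR_SUBSCRIPTION_KEY"  # placeholder

# Exchange the subscription key for a short-lived access token
token = requests.post(
    "https://api.cognitive.microsoft.com/sts/v1.0/issueToken",
    headers={"Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY},
).text

# The recognition mode ("interactive", "conversation" or "dictation") is part of the path
url = "https://speech.platform.bing.com/speech/recognition/conversation/cognitiveservices/v1"

with open("call_chunk.wav", "rb") as audio:  # a short 16 kHz, 16-bit mono PCM chunk
    resp = requests.post(
        url,
        params={"language": "en-US", "format": "detailed"},
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "audio/wav; codec=audio/pcm; samplerate=16000",
        },
        data=audio,
    )
print(resp.json())
```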

What information do I get from Spotify's Audio Analysis?

When using Spotify's API to analyse a track (https://developer.spotify.com/web-api/console/get-audio-analysis-track/), it returns a bunch of numbers and strings.
Does anybody know what these numbers are all about and how to interpret them?
You should take a look at Spotify's documentation for the audio analysis:
https://developer.spotify.com/web-api/get-audio-analysis/
If you look at the "track" element, you can see it returns a number of useful stats, such as the tempo, key, mode (minor/major) and loudness of the song. In the "segments" elements you can also get a more detailed pitch and timbral (tonal) analysis for parts of the song.
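As an illustrative sketch of pulling a few of those fields out with plain HTTP (the token and track ID are placeholders):

```python
import requests

TOKEN = "YOUR_ACCESS_TOKEN"   # placeholder
TRACK_ID = "SOME_TRACK_ID"    # placeholder

analysis = requests.get(
    f"https://api.spotify.com/v1/audio-analysis/{TRACK_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
).json()

track = analysis["track"]
print("tempo (BPM):", track["tempo"])
print("key (0=C, 1=C#/Db, ...):", track["key"])
print("mode (1=major, 0=minor):", track["mode"])
print("loudness (dB):", track["loudness"])

# Each segment carries a 12-dimensional "pitches" (chroma) vector and a 12-dimensional "timbre" vector
first_segment = analysis["segments"][0]
print("first segment pitches:", first_segment["pitches"])
print("first segment timbre:", first_segment["timbre"])
```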
