Does the Google Speech API use Mel Frequency Cepstral Coefficient (MFCC) feature extraction? - speech-to-text

Greetings StackOverflow experts,
I would like to clarify whether the Google Speech API uses Mel Frequency Cepstral Coefficient (MFCC) feature extraction.
If so, are there any articles or journals that discuss it?
Please enlighten me.
Thank you and have a wonderful day.

Most modern speech recognition systems use log-mel filterbank features rather than MFCCs. Google also adds noise subtraction.
You can check this paper on Google technology:
Acoustic Modeling for Google Home
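To make the distinction concrete: MFCCs are just log-mel filterbank energies with one extra step, a DCT that decorrelates the bands. Below is a minimal NumPy sketch of both; the parameter values (16 kHz sample rate, 512-point FFT, 40 mel bands) are illustrative defaults, not Google's actual configuration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def log_mel_features(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    # Short-time power spectrum -> mel filterbank -> log
    n_frames = 1 + (len(signal) - n_fft) // hop
    fbank = mel_filterbank(sr, n_fft, n_mels)
    feats = np.empty((n_frames, n_mels))
    window = np.hamming(n_fft)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + n_fft] * window
        power = np.abs(np.fft.rfft(frame)) ** 2 / n_fft
        feats[i] = np.log(fbank @ power + 1e-10)
    return feats

def dct2(x, n_out):
    # Orthonormal DCT-II along the last axis: the decorrelation
    # step that turns log-mel energies into cepstral coefficients
    N = x.shape[-1]
    n = np.arange(N)
    basis = np.cos(np.pi * np.arange(n_out)[:, None] * (2 * n + 1) / (2 * N))
    basis *= np.sqrt(2.0 / N)
    basis[0] /= np.sqrt(2.0)
    return x @ basis.T

def mfcc(signal, sr=16000, n_coeffs=13, **kw):
    # MFCC = DCT of the log-mel features; keep the first few coefficients
    return dct2(log_mel_features(signal, sr, **kw), n_coeffs)
```

Neural acoustic models work fine on the correlated log-mel features directly, which is why the DCT step has largely been dropped.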

Related

Sound detection of cutting wood

I'm really new to machine learning. I have a project to identify a given sound (e.g., cutting wood). The audio clip will contain several sounds, and I need to recognize that particular sound among them. I have read some articles about machine learning, but I still lack the knowledge of where to start this project, and I'm also running out of time.
Any help will be really appreciated. Can anyone please tell me how to do this?
Can I directly perform template matching (algorithms) on a sound?
It's a long journey ahead of you, and Stack Overflow isn't a good place for asking such a generic question. Consult the help section for more.
To get you started, here are some web sites:
Awesome Bioacoustic
Comparative Audio Analysis With Wavenet, MFCCs, UMAP, t-SNE and PCA
Here are two small repos of mine related to audio classification:
Gender classification from audio
Kiwi / not-a-kiwi bird calls detector
They might give you an idea of where to start your project. Check the libraries I am using - they will likely be of help to you.
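To answer the template-matching sub-question: yes, as a first baseline you can slide the spectrogram of a reference recording over the clip and score each offset by normalized correlation. Here is a toy NumPy sketch of that idea (all window sizes are illustrative); expect it to be far less robust than a trained classifier, since it only matches one exact sound.

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    # Magnitude short-time spectrum, one row per frame
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    spec = np.empty((n_frames, n_fft // 2 + 1))
    for i in range(n_frames):
        spec[i] = np.abs(np.fft.rfft(signal[i * hop : i * hop + n_fft] * window))
    return spec

def match_template(clip_spec, template_spec):
    # Slide the template over the clip; score each offset by the
    # normalized correlation of the flattened spectrogram patches
    t = template_spec.ravel()
    t = (t - t.mean()) / (t.std() + 1e-10)
    n = len(template_spec)
    scores = []
    for start in range(len(clip_spec) - n + 1):
        p = clip_spec[start : start + n].ravel()
        p = (p - p.mean()) / (p.std() + 1e-10)
        scores.append(float(p @ t) / t.size)   # in [-1, 1]
    return np.array(scores)
```

The peak of `match_template` marks the frame offset where the clip most resembles the template; in practice you would set a detection threshold on the score.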

Tutorial tensorflow audio pitch analysis

I'm a beginner with TensorFlow and Python, and I'm trying to build an app that automatically detects key moments (yellow/red cards, goals, etc.) in a football (soccer) match.
I'm starting to understand how to do video analysis by training the program on a dataset I built myself, downloading images from the web and tagging them. To obtain better results, I was wondering if anyone has suggestions for tutorials on how to also train my app on audio files, so that the program can detect when there is a pitch variation in the video's audio, and on how to combine video and audio analysis.
Thank you in advance
Since you are new to Python and to TensorFlow, I recommend you focus on just audio for now, especially since it's a strong indicator of events of importance in a football match (red/yellow cards, nasty fouls, goals, strong chances, good plays, etc.).
Very simply, without using much ML at all, you can use the average volume over a time period to infer significance. If you want to get a little more sophisticated, you can consider speech-to-text libraries to look for keywords in commentator speech.
Using video to try to determine when something important is happening is much, much more challenging.
This page can help you get started with audio signal processing in Python.
https://bastibe.de/2012-11-02-real-time-signal-processing-in-python.html
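The average-volume idea above fits in a few lines: compute the RMS level of fixed windows and flag windows that stand well above the clip's typical level. A minimal sketch, assuming mono samples as a NumPy array (the one-second window and 3x-median threshold are arbitrary starting points to tune):

```python
import numpy as np

def loud_moments(samples, sr, window_s=1.0, threshold=3.0):
    """Return start times (seconds) of windows whose RMS exceeds
    `threshold` times the median RMS of the whole clip."""
    win = int(sr * window_s)
    n = len(samples) // win
    # RMS level of each non-overlapping window
    rms = np.sqrt(np.mean(samples[: n * win].reshape(n, win) ** 2, axis=1))
    baseline = np.median(rms)  # robust "normal crowd noise" level
    return [float(i * window_s) for i in np.nonzero(rms > threshold * baseline)[0]]
```

On match audio, the flagged timestamps are candidate events (crowd roars); you can then cut clips around them for manual review or further classification.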

Measurement of Spotify Audio Features

I am currently conducting research on American pop songs using Spotify's audio features (e.g., danceability, tempo, and valence). However, I couldn't find any documentation with details about how these features are measured. I know there's a brief description of each feature, but it doesn't explain the exact measurement. Could you let me know where I can find it?
Thanks.
The Echo Nest was a music data analysis platform acquired by Spotify, and its expertise currently powers Spotify's recommendation tools.
The Audio Features API endpoint extracts a more "high-level" analysis from audio and songs, whereas the Audio Analysis endpoint extracts more "low-level", technical data.
Essentially, "high-level" features are more explicit and use clearer semantics - plain English - so that they can be easily understood by a layman ("danceability", for instance), but it all derives from the low-level analysis, really.
Here is some documentation, if you wish to dive deeper into the matter:
http://docs.echonest.com.s3-website-us-east-1.amazonaws.com/_static/AnalyzeDocumentation.pdf

Anyone knows about text-based emotion detection systems that offer a demo?

I recently finished work on my text-based emotion detection engine and I am looking for other existing working systems to compare with mine in order to know what should be improved and also report comparisons in an upcoming paper.
I have come across many companies claiming to do emotion detection from text but only this one offers a demo that I can use to compare with my system: http://www.o2mc.io/portfolio-posts/text-analysis-restful-api-language-polarity-and-emotion/ (scroll all the way down to see the "try it yourself" section).
Please note that I am not looking for polarity classification, which is the simpler task of saying whether a text is positive or negative. What I am looking for is emotions (sadness, anger, joy, etc.). Does anyone here know of any company/university/person offering a demo of such a system?
As a reference, here is the link to my own system's demo:
http://demo.soulhackerslabs.com/emotion/
Your help is very much appreciated.

How to implement Knowledge graph

I'm looking to implement something like Google's direct answers, which use a knowledge graph. Are there any useful resources I can read? Also, where can I find data for that?
Thanks in advance
From the semantic web aspect of your question, I guess an interesting starting point for you is the DBpedia dataset (example queries), maybe in combination with the strings-to-Wikipedia-concepts dataset.
I am excited about this free class: Probabilistic Graphical Models - Daphne Koller, Professor. I would look her up and watch her YouTube videos in the meantime.
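To give a concrete feel for how a "direct answer" can be pulled from a knowledge graph, here is a small sketch that queries DBpedia's public SPARQL endpoint for a resource's English abstract. The endpoint URL and the `dbo:abstract` property are standard DBpedia; the helper function names are my own, and `ask_dbpedia` requires network access.

```python
import json
import urllib.parse
import urllib.request

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

def build_query(resource):
    # Fetch the English abstract of one DBpedia resource
    return f"""
    SELECT ?abstract WHERE {{
      <http://dbpedia.org/resource/{resource}>
        <http://dbpedia.org/ontology/abstract> ?abstract .
      FILTER (lang(?abstract) = "en")
    }} LIMIT 1
    """

def ask_dbpedia(resource):
    # Send the query over HTTP and return the abstract text, or None.
    params = urllib.parse.urlencode({
        "query": build_query(resource),
        "format": "application/sparql-results+json",
    })
    with urllib.request.urlopen(f"{DBPEDIA_ENDPOINT}?{params}", timeout=10) as resp:
        rows = json.load(resp)["results"]["bindings"]
    return rows[0]["abstract"]["value"] if rows else None
```

For example, `ask_dbpedia("Barack_Obama")` returns a paragraph of biography text, which a direct-answers system would then trim down to the sentence answering the user's question.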