Sound detection of cutting woods - audio

I'm really new to machine learning. I have a project to identify a given sound (e.g., cutting wood). The audio clip will contain several sounds, and I need to recognise that particular sound within it. I have read some articles about machine learning, but I still don't know where to start this project, and I'm running out of time.
Any help would be really appreciated. Can anyone please tell me how to do this?
Can I directly perform template matching (algorithms) on a sound?

It's a long journey ahead of you, and Stack Overflow isn't a good place for asking such a generic question. Consult the help section for more.
To get you started, here are some web sites:
Awesome Bioacoustic
Comparative Audio Analysis With Wavenet, MFCCs, UMAP, t-SNE and PCA
Here are two small repos of mine related to audio classification:
Gender classification from audio
Kiwi / not-a-kiwi bird calls detector
They might give you an idea of where to start your project. Check the libraries I am using; they will likely be of help to you.
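On the template-matching part of the question: the following is a minimal sketch, not taken from the repos above, of matching a short reference sound against a longer clip with normalized cross-correlation. The function name `best_match` and the synthetic tone are illustrative assumptions. Direct template matching like this tends to work only when the target sound is nearly identical each time, which is one reason feature-based classifiers are usually preferred for sounds such as cutting wood.

```python
import numpy as np

def best_match(signal, template):
    """Return (offset, score) of the best normalized cross-correlation match.

    Hypothetical helper for illustration; score is in [-1, 1].
    """
    signal = np.asarray(signal, dtype=float)
    template = np.asarray(template, dtype=float)
    n = len(template)
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    best_offset, best_score = 0, -1.0
    # Slide the template over the signal one sample at a time.
    for start in range(len(signal) - n + 1):
        window = signal[start:start + n]
        w = window - window.mean()
        denom = np.linalg.norm(w) * t_norm
        score = float(w @ t) / denom if denom > 0 else 0.0
        if score > best_score:
            best_offset, best_score = start, score
    return best_offset, best_score

# Synthetic demo: a short windowed tone hidden in low-level noise.
rng = np.random.default_rng(0)
sample_rate = 8000
template = np.sin(2 * np.pi * 440 * np.arange(400) / sample_rate) * np.hanning(400)
clip = 0.05 * rng.standard_normal(4000)
clip[1500:1900] += template  # the template starts at sample 1500

offset, score = best_match(clip, template)
print(offset, round(score, 3))
```

The brute-force loop is fine for short clips; for long recordings an FFT-based correlation would be the usual replacement.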

Related

Tutorial tensorflow audio pitch analysis

I'm a beginner with TensorFlow and Python, and I'm trying to build an app that automatically detects some key moments in a football (soccer) match (yellow/red cards, goals, etc.).
I'm starting to understand how to do video analysis by training the program on a dataset I built myself, downloading images from the web and tagging them. To get better results, I was wondering if someone could suggest tutorials on how to train my app on audio files as well, so that the program can detect when there is a pitch variation in the audio of the video and combine the video and audio analysis.
Thank you in advance
Since you are new to Python and to TensorFlow, I recommend you focus on just audio for now, especially since it's a strong indicator of important events in a football match (red/yellow cards, nasty fouls, goals, strong chances, good plays, etc.).
Very simply, without using much ML at all, you can use the average volume of a time period to infer significance. If you want to get a little more sophisticated, you can consider speech-to-text libraries to look for keywords in commentator speech.
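The average-volume idea can be sketched roughly as follows. This is a minimal illustration under assumptions: the function name `loud_windows`, the one-second window, and the "twice the median RMS" threshold are all hypothetical choices, and the audio is synthetic rather than a real match recording.

```python
import numpy as np

def loud_windows(samples, sample_rate, window_seconds=1.0, factor=2.0):
    """Return start times (in seconds) of windows whose RMS volume exceeds
    `factor` times the median RMS, plus the per-window RMS values.

    Hypothetical helper; the threshold rule is an assumption.
    """
    samples = np.asarray(samples, dtype=float)
    win = int(window_seconds * sample_rate)
    n_windows = len(samples) // win
    # Drop the incomplete tail and split into equal-length windows.
    frames = samples[:n_windows * win].reshape(n_windows, win)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    threshold = factor * np.median(rms)
    times = [float(i * window_seconds) for i in np.flatnonzero(rms > threshold)]
    return times, rms

# Synthetic demo: 10 s of quiet noise with one loud second (a "crowd roar").
rng = np.random.default_rng(0)
sample_rate = 1000
audio = 0.1 * rng.standard_normal(10 * sample_rate)
audio[4 * sample_rate:5 * sample_rate] *= 10.0

times, rms = loud_windows(audio, sample_rate)
print(times)
```

Using the median as the baseline keeps a few very loud moments from raising the threshold too much.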
Using video to try to determine when something important is happening is much, much more challenging.
This page can help you get started with audio signal processing in Python.
https://bastibe.de/2012-11-02-real-time-signal-processing-in-python.html

Passive and automatic face recognition

Hi guys. At school we use badges to mark who is present, and for my exam I want to upgrade that system.
I would like to create a face recognition system. Basically, I would like to mount a Raspberry Pi with a camera over the doors so that students are automatically marked as present when they pass through.
I know of OpenBR, but I don't understand whether I can use it for my project, and I have some issues with it: I can't install it, and it returns an error when I test it.
Do you know whether OpenBR can do the trick for me (keep in mind that there are a lot of us at school), or whether there are other technologies I can use?
You could look at using opencv to train an object detector to look for the badge:
http://docs.opencv.org/2.4/doc/user_guide/ug_traincascade.html
https://www.youtube.com/watch?v=WEzm7L5zoZE
If each of the badges has a unique identifier for the student, you could then analyse the identifier to take attendance.
Identifying the badge / face would be the "easy" part. Identifying the student would be the hard part!
Identifying people from photos is tricky, and I would estimate that Facebook has spent millions on this problem.
Here are a couple of links that may be useful
http://scikit-learn.sourceforge.net/0.6/auto_examples/applications/plot_face_recognition.html
OpenCV identify person with face detection
Since you are using a Raspberry Pi for your project:
Software:
1. OpenCV-Python is a very good choice.
2. SimpleCV is simpler to use but less powerful than OpenCV. It's still OK for your purpose.
Hardware:
You also need to be aware of the hardware: a USB webcam is not a good choice because of its slow speed.
The camera module is better because it uses a dedicated camera serial interface (CSI) to transfer data.

How does OCR work? and how to add OCR to an alphabet

I have an alphabet which has not been tackled before, so when scanned, there's no way to detect the letters for recognition with OCR. I'm trying to program OCR for it, but don't have much experience in this. I'd appreciate some hints as to where to get started, and how such a system is normally implemented.
Take a look at this page--it describes the training process for an open source OCR engine.
The free Stanford Online Machine Learning class has a great set of lessons on Photo OCR in Part XVIII.
This blog post has a brief description of the example taught in the class.
There are some excellent resources on Google Books. Likewise, if you search for Optical Character Recognition on Amazon, there are some pretty up-to-date books that look to be fairly thick and intellectually challenging :D heh
btw - I'm well aware this post has some age, but you never know when some other person might stumble across this and find just what they need. And if this even has the chance of helping out, then so be it. OCR is such a strange subject, that there's not too much out there that can really really answer the deep-machine ended questions. Especially if you're going to attempt to write your own library. :P

Onset to Beat Detection?

How do you determine which onsets are beats? I am using spectral flux for note onset detection and a running mean for peak-picking/thresholding.
I am working only with guitar, so approaches that rely on the presence of percussion may not help with this. Any ideas?
Thanks!
EDIT: Wow...just realized this question is 3 years old...sorry to resurrect an old post.
My Master's thesis was in beat detection, and the main advantage of my method over all other published methods of beat detection was in resolution, both in the time domain and the frequency (beat) domain. You can find my thesis here. What it basically boils down to (after a lot of filtering) is a comb-filter convolution. My code is an adaptation of this project, which contains Matlab files for you to see how it works.
My code (both in C++ and the Matlab port) is not publicly available due to possible copyright issues with my university, but if you email me at dberm22[at]gmail[dot]com, I'd be more than willing to ahem::discuss my work with you.
Try using a beat tracking algorithm. Beat tracking is a distinct problem from onset detection.
I think there's a good algorithm in the Queen Mary plugin set for Sonic Visualiser. The plugins are open source, so you can have a look at the code to figure out how they work.
Or do a search on Google Scholar for "beat tracking". There are a number of effective approaches. Dan Ellis' is a good one to start with. It's intuitive, and there's code available in Matlab and Java.
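To illustrate why beat tracking differs from onset detection, here is a minimal sketch of one common first step: estimating the tempo from the periodicity of an onset-strength envelope via autocorrelation. It is not the Queen Mary plugin code or Dan Ellis's implementation, and it does not place individual beats. The function name `estimate_tempo`, the 60 to 180 BPM search range, and the synthetic envelope are assumptions made for the example.

```python
import numpy as np

def estimate_tempo(onset_env, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Estimate tempo (BPM) from an onset-strength envelope.

    Hypothetical helper: picks the autocorrelation peak within a BPM range.
    `frame_rate` is the number of envelope frames per second.
    """
    env = np.asarray(onset_env, dtype=float)
    env = env - env.mean()
    # Autocorrelation for non-negative lags only.
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    min_lag = int(round(frame_rate * 60.0 / max_bpm))
    max_lag = int(round(frame_rate * 60.0 / min_bpm))
    lag = min_lag + int(np.argmax(ac[min_lag:max_lag + 1]))
    return 60.0 * frame_rate / lag

# Synthetic envelope: 20 s at 100 frames/s, strong onsets on the beat
# (every 0.5 s, i.e. 120 BPM) and weaker onsets halfway between beats.
frame_rate = 100
env = np.zeros(20 * frame_rate)
env[0::50] = 1.0    # on-beat onsets
env[25::50] = 0.4   # off-beat onsets that are not beats

tempo = estimate_tempo(env, frame_rate)
print(tempo)
```

Once a tempo is known, the onsets that fall near multiples of the beat period are the candidates for beats; full beat trackers add a step that chooses a consistent beat sequence.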

Where to begin learning about audio processing?

I've tried looking up how I might go about this for a while now, and maybe I am using the wrong terminology in my searches or it's way too advanced for me. I basically want to be able to analyze audio files in real time. I know hardly anything about audio processing, so I should probably start small and work my way up. Eventually I'd like to be able to display a power (or frequency?) spectrum corresponding to audio playing in real time. Basically like the WinAmp spectrogram (terminology?)
Any online tutorials with perhaps an API suggestion or two would be greatly appreciated. I've found some vague explanations (mostly dealing with calculating FFTs then converting them to something...). Like I said, I know little of audio processing, so knowing where to start would be great.
Language of choice: C++
You could look into VST plugins as a starting point for the theory behind audio processing. There's a blog with some tutorials in C++ here.
You can also check out other SO questions on VST plugins for more info.
I believe Audacity can run VST plugins; I'll look at that.
EDIT: Audacity doesn't support them out of the box, but you can enable it. You could download a trial of something like ableton live too.
I'd recommend using a graphical tool to begin with to prototype some ideas. Try Puredata or something similar.
http://puredata.info/
JUCE is a fantastic way to get to grips with C++ with an audio slant.
http://www.rawmaterialsoftware.com/juce.php
I've also stumbled across UGen which might help you get up and running without having to understand too much of the sample-by-sample processing theory. I've not looked at this much yet but it looks interesting at the outset.
http://code.google.com/p/ugen/
The KVR forums are full of knowledgeable people who will help and direct newcomers to audio and plugin development.
http://www.kvraudio.com/
If you're feeling brave, then dive into a good book. I've heard a lot of good things about the following:
http://www.amazon.com/DAFX-Digital-Udo-246-lzer/dp/0471490784
Good luck! This is not an easy area to get going in!
(PS, the blog linked in the above answer is mine -> it's out of date and won't help you actually do any signal processing)
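As for the "calculating FFTs then converting them to something" step mentioned in the question, here is a minimal sketch of turning one frame of samples into a power spectrum in decibels. It is written in Python with NumPy for brevity, although the question asks about C++; the same steps (window, FFT, squared magnitude, log scale) carry over with any FFT library. The function name `power_spectrum_db`, the Hann window, and the 1024-sample frame are assumptions for illustration.

```python
import numpy as np

def power_spectrum_db(frame, sample_rate):
    """Return (frequencies in Hz, power in dB) for one frame of audio.

    Hypothetical helper for illustration.
    """
    frame = np.asarray(frame, dtype=float)
    # A window reduces spectral leakage from the frame edges.
    windowed = frame * np.hanning(len(frame))
    spectrum = np.fft.rfft(windowed)           # complex FFT bins, real input
    power = np.abs(spectrum) ** 2              # squared magnitude per bin
    power_db = 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return freqs, power_db

# Demo: a 1 kHz sine sampled at 8 kHz; the loudest bin should sit at 1 kHz.
sample_rate = 8000
t = np.arange(1024) / sample_rate
frame = np.sin(2 * np.pi * 1000.0 * t)

freqs, power_db = power_spectrum_db(frame, sample_rate)
peak_hz = freqs[np.argmax(power_db)]
print(peak_hz)
```

A real-time display repeats this on successive (often overlapping) frames of the incoming audio and draws `power_db` against `freqs` each time.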

Resources