Linux Audio record and quality comparison - linux

I am starting a project to test the audio performance on linux.
What I need to do is to play the audio on our websystem and check the audio quality (or just check it has audio output) on linux.
I am going to record the audio on linux with ffmpeg. Is there any other better choice?
I don't know how to (automation) check I recorded is what I played, as well as the quality of recorded audio.

I think what you need is PESQ (Perceptual Evaluation of Sound Quality). However I have not found anything which is open source/free and out of the box.
You can download the recommendation from here:
http://www.itu.int/rec/T-REC-P.862-200511-I!Amd2/en
Basically this is the reference implementation of PESQ.
Sevana has an audio quality analyser which is not an ITU standard, it is AQuA:
http://www.sevana.fi/aqua_wiki.php
It is available for linux but I think you have to pay for it.
You can also check the similarities for two audio files with cross-correlation, please refer to here:
https://dsp.stackexchange.com/questions/736/how-do-i-implement-cross-correlation-to-prove-two-audio-files-are-similar
I just learned that lot of people are using Matlab or Octave to generate the necessary data, for example:
http://bagustris.blogspot.ie/2011/11/calculate-time-lag-from-cross.html

Related

Getting multiple audio clips to same level

I am working on a project that involves using a lot of found audio clips (some new, some very old archival and poor quality etc).
I am trying to figure out a way to have all audio clips to be of a similar quality (if this is possible) and play at a similar volume?
I have use of both audacity and ableton...any suggestions would be great.
What you are asking for is commonly called normalization. There are several tools that can do it, including commandline tools and also audacity.
You'll find the tool in audacity under Effect > Normalize...
You can select multiple audio tracks.
You could also consider using a limiter and/or a compressor on your track. Have a look in the Live effect reference for more info on these: https://www.ableton.com/en/manual/live-audio-effect-reference/
The results will not be as good as applying normalization by hand, but it will be a lot quicker.

Open field usage of Google Resonance Audio SDK

is there a scenario where we can use the Google Resonance Audio SDK not with headphones, but with real speakers (e.g. mounted in a 360° cyrcle setting)?
Or are all algorithms not working for real speaker outputs?
Thank you!
Currently, Resonance Audio is optimized for headphone playback. For example, HRTF processing is done in the Ambisonics domain, without generating (virtual) speaker signals - this is because it is a much more efficient way of generating binaural output.
However, in the Resonance Audio open source release, the Ambisonic Codec class can readily be used to decode Ambisonics to any arbitrary loudspeaker array. To use that with the rest of the Resonance Audio system, however, it would be necessary to modify/extend the audio processing graph by adding a new decoder node.
Please, feel free to add a feature request and, depending on popularity, we might consider adding that in the future!

Determining the quality of mp3 audio streams

I have built a source client using Portaudio and LAME which streams the microphone input to an Icecast server to be listened to online via the HTML5 tag. I have managed to (supposedly) get the quality of the stream to MP3 320kbps at 44.1kHz and am looking for a way to confirm this using tests and or benchmarks.
I have an indication that these stats are somewhat correct from looking at stream inspectors in software such as iTunes and VLC, but I am looking to get a more in-depth data set.
What I basically want is to be able to test how much of the original file is being lost over the stream and if or how much the quality changes depending on environmental conditions of the broadcaster or streamer.
Does anyone know of any tools, frameworks to get some hard numbers or representations of this data?
If VLC tells you the stream is 320kbit CBR, then it is.
It sounds like what you're looking for is a comparison of the actual audio content. This is highly subjective. MP3 is built to use features of how our hearing works to save bandwidth. For example, quiet sounds are masked by loud sounds. High frequencies are harder to hear and are simply rolled off.
You can compare the spectral analysis between the original PCM-sampled waveform and the MP3 decoded waveform, but this doesn't tell you how humans interpret that sound. For that, you would have to survey humans.

Is there open source audio feature extraction software avaliable?

I undertaking a personal project which involves the development of a system which will automatically generate audio thumbnail clips (about 30 seconds in length) from a full length track.
In order to do this I want to look at the energy and pitch of the audio to try and correctly identify its major structural features.
Is there any open source software available that can do energy/pitch extraction? If not I will start looking into alternative methods using MATLAB.
Thanks!
YAAFE (Yet Another Audio Feature Extractor) http://yaafe.sourceforge.net/ does audio feature extraction in MATLAB, Python and C.
You might want to look into the Echo Nest API. It has a lot of audio analysis capabilities, and I know there's a script bundled in the Remix package that can automagically turn songs into shorter or longer versions (I believe the script is called earworm).
Audacity may do it.
Try JAudio which can extract features from an audio.
MARSYAS contains bextract for analysis, can find MFCCs and various other timbral and spectral features. http://marsyas.info/

Converting audio to code and vice-versa

Having just witnessed Sound Load technology on the Nintendo DS game Bangai-O Spritis. I was curious as to how this technology works? Does anyone have any links, documentation or sample code on implementing such a feature, that would allow the state of an application to be saved and loaded via audio?
Its the same old thing used in ZX Spectrum era. You load programs/games from tape.Only the sound quality and the filters are probably better.
In my opinion something like Bluetooth or WiFi is better. You can also send files that can be put on some storage and then load them. I find these methods much easier than sound because if there is a lot of noise around you cannot do much.
It is just a conversion of data to audio and then back from audio to data.
Search for Zotyocopy and Copy86M on google - these are the utilities used for saving a game to tape after loading it into memory on zx spectrum.
If you want to pass data as audio through the air there are a few things you need to be aware of though, such as how the speaker and microphone interact for example. It is important that they don't distort or alter the sound too much as what you are sending are in fact the raw bytes.
Some audio software will let you open any file as audio so that you may listen to it. If you record audio as data do not use lossy compression such as mp3 on the audio file!

Resources