audio volume normalization [closed] - audio

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
I am writing a music player and I want to normalize the audio volume across different songs.
I could think of some different ways to do this, e.g.:
1. Go through all PCM samples (assume floating point from -1 to 1) and select m = max(abs(sample)). Then apply the factor 1/m to all the PCM samples. This would put the peak at 1.
2. Go through the PCM stream and, for each position, take a Hanning window of some width around it and calculate the average of the absolute samples. From those values, pick the maximum and normalize everything by it.
3. The same as 2, but with some other way of getting an averaged value.
2 and 3 have the disadvantage that I might need some clipping and thus lose some quality. By normalizing to 0.95 or so instead of 1, I could maybe avoid this to some degree. But I think 2 and 3 have the advantage that they might give the more natural normalization for the user. Wikipedia also has some information about this and mentions RMS, ReplayGain and EBU R128 as ways to measure the loudness of a song.
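To make approaches 1 and 2 concrete, here is a minimal sketch in plain Python. It is only an illustration of the ideas above, not how any particular player does it: the function names, the window and hop sizes, and the `target_level` and `ceiling` defaults are all assumptions picked for the example.

```python
import math

def peak_normalize(samples, target=1.0):
    """Approach 1: scale so the largest absolute sample equals `target`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]

def max_windowed_level(samples, window=1024, hop=512):
    """Approach 2: largest Hann-weighted average of |sample| over the stream."""
    window = min(window, len(samples))
    if window == 0:
        return 0.0
    # Hann weights without the zero endpoints, so the weight sum is never 0.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 1) / (window + 1))
            for i in range(window)]
    norm = sum(hann)
    best = 0.0
    for start in range(0, len(samples) - window + 1, hop):
        chunk = samples[start:start + window]
        level = sum(w * abs(s) for w, s in zip(hann, chunk)) / norm
        best = max(best, level)
    return best

def loudness_normalize(samples, target_level=0.2, ceiling=0.95):
    """Scale to an assumed target level, hard-clipping anything above `ceiling`."""
    level = max_windowed_level(samples)
    if level == 0.0:
        return list(samples)
    gain = target_level / level
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]
```

The hard clip in `loudness_normalize` is exactly the quality loss mentioned above: a short transient that sits far above the averaged level gets flattened at the ceiling.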
How are other popular music players (like iTunes or so) doing this?

iTunes uses the Sound Check technology. "Sound Check is a proprietary Apple Inc. technology similar in function to ReplayGain. It is available in iTunes and on the iPod." (from Wikipedia) So, this is no solution for me.
It seems that ReplayGain is the most common technique. The algorithm is explained here. Sample implementations are mp3gain (GPL) and ffmpeg-replaygain (GPL, derived from mp3gain). I now have my own implementation in my MusicPlayer project (BSD licence).
See also these projects with implementations:
http://sox.sourceforge.net/
http://r128gain.sourceforge.net/
official ReplayGain homepage
official ReplayGain 1.0 specification
Wikipedia: ReplayGain

Related

Baby Cry Sound detection [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
I want to write code to detect the sound of a baby crying. I am using Windows as the platform. At present I can capture audio samples and produce their frequency plot (using an FFT), but I am not sure how to proceed.
I wanted to ask what steps I should follow to detect the baby cry sound given its time-frequency plot.
I have seen methods such as a median filter followed by an HMM used in speech recognition. But for simple sound detection, do I need such a sophisticated method?
I will be very grateful if you could help me.
Hidden Markov models are widely used in speech recognition, but since you don't really need to know what your baby is saying (next project: baby translator), I don't think that is what you need.
What you should probably do is look at a lot of spectrograms of babies crying and look for patterns. Or, even better, let your algorithm do this. To do that, you calculate certain metrics of your sound called MFCCs (mel-frequency cepstral coefficients).
You do this on, say, 1000 samples of crying sound, and then you have 1000 vectors of metrics.
Now, for each metric you calculate the mean and the standard deviation. This gives you a way to tell how much a sample of some random baby sound differs from the average crying sound.
This sounds very hard, but I know there are tools out there. Have a look at Sphinx. You can probably train it to do this.
But either way, start by collecting baby-crying sounds ;) (but don't steal candy)
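The per-metric statistics idea from this answer can be sketched roughly like this. It assumes the feature vectors (for example MFCCs) have already been computed by some other tool; the function names and the choice of a root-mean-square z-score as the distance are my own assumptions, not something the answer specifies.

```python
import math
import statistics

def fit_profile(vectors):
    """Per-metric mean and standard deviation over the training vectors."""
    columns = list(zip(*vectors))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return means, stdevs

def distance_from_profile(vector, means, stdevs, eps=1e-9):
    """Root-mean-square z-score: 0 means 'exactly the average cry'."""
    z = [(x - m) / max(s, eps) for x, m, s in zip(vector, means, stdevs)]
    return math.sqrt(sum(v * v for v in z) / len(z))
```

You would call `fit_profile` once on the crying samples, then compare `distance_from_profile` for a new sound against a threshold that you would have to tune on recordings that are not cries.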

How or where can I get separate notes of an instrument for playback in my application? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
I am looking to create a music creation application, and would like to allow the user to play the individual notes of an instrument. Is there a place online where I can find individual sound files that I may playback for each note, or is there a way of programmatically "generating" each pitch? I am not concerned with sound quality at this point in my development.
EDIT: I am still in the early stages of development. I want the app to be browser-based, using JavaScript or something similar. I am developing on Linux, if that is of any relevance. The notes will be played via an on-screen interface.
The University of Iowa's Electronic Music Studios has a very nice and complete archive of sampled instruments, with one musical note per file. You should also check out freesound, though that is a much more general-purpose sample sharing site.
There are plenty of places online to find sampled instruments. If you're not concerned about sound quality, some free soundfonts will most likely do the job.
For example, this site http://soundfonts.homemusician.net/ has pianos, basses, guitars, horn etc. (Google "free sf2" for more)
There are plenty of ways to generate (aka synthesise) tones as well.
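As a rough illustration of generating a pitch programmatically, here is a sketch that turns a MIDI note number into a frequency (equal temperament, A4 = 440 Hz) and renders a plain sine tone to a WAV file. It is written in Python only for the sake of a short example; the function names and defaults are made up, and a browser app would do the equivalent with whatever audio API it has available.

```python
import math
import struct
import wave

def midi_note_to_hz(note):
    """Equal-tempered frequency for a MIDI note number (69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def sine_tone(freq_hz, seconds=1.0, sample_rate=44100, amplitude=0.5):
    """Float samples in [-1, 1] for a pure sine wave."""
    count = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]

def write_wav(target, samples, sample_rate=44100):
    """Write mono 16-bit PCM to a path or a writable binary file object."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

For example, `write_wav("c4.wav", sine_tone(midi_note_to_hz(60)))` would produce one second of middle C. A pure sine sounds nothing like a real instrument, but that may be fine if sound quality is not a concern yet.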
If you don't mind MIDI files, you can get a free MIDI software piano and create your own files: C.mid, C#.mid, D.mid, etc.
Here's one with a quirky interface but there are many more:
http://download.cnet.com/MidiPiano/3000-2133_4-10542342.html
The easiest way to do this is to simply output MIDI messages to the synth built-in to every computer. No need to create MIDI files or use extra sound fonts.
You didn't mention what language you are using, so it is hard to suggest ways to get started. In all cases though, you'll want to read up a bit on what MIDI actually is.
Basically, MIDI is nothing but control data, commonly used with synthesizers. At a basic level, there are note-on, and note-off messages. There are many other kinds of messages too, such as pitch bend, control change, etc. MIDI supports 16 "channels", which are sent all down the same line, just with a different identifier.
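To show how small these messages are, here is a sketch that builds the raw bytes for note-on and note-off. The helper names are hypothetical, and actually sending the bytes to a synth needs a platform MIDI API or library that is not shown here.

```python
def _check(channel, note, velocity):
    """Validate the ranges the MIDI protocol allows."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 0 <= note <= 127 or not 0 <= velocity <= 127:
        raise ValueError("note and velocity must be 0..127")

def note_on(channel, note, velocity=64):
    """Status byte 0x90 plus channel, then note number and velocity."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note, velocity=0):
    """Status byte 0x80 plus channel, then note number and release velocity."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])
```

The channel lives in the low four bits of the status byte, which is where the limit of 16 channels comes from.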
A good utility (on Windows) for debugging MIDI messages (and getting a better idea of the protocol in general!) can be found here: http://www.midiox.com/

sensor programming [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 11 months ago.
I have a question about sensor programming. I'm looking for a sensor that tells me, for example, whether a glass of water is more than half full. I've already googled that, but I can't find anything.
So my questions are:
Where can I buy such a sensor?
What programming language do I need to control such a sensor?
Thanks for answers..
Update from comments below one of the answers
What I really need it for is a big container with some corn in it. I want the sensor to tell me as soon as the corn drops below a defined point in the container, so that I can calculate when I have to refill the container.
Your sensor could be a level sensor. There are several principles on which level sensors work (see here). Some of them will work with granular solid material. (For example, an ultrasonic range sensor could shoot a pulse at the surface of corn mass, detect the reflection, measure round trip time of flight.)
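The time-of-flight arithmetic for the ultrasonic example is simple enough to sketch. Everything here is an assumption for illustration: the sensor is mounted at the top of the container looking straight down, the speed of sound is taken as roughly 343 m/s (dry air at about 20 °C), and the function names and the 0.5 threshold are made up.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate, dry air at about 20 degrees C

def distance_from_echo(round_trip_s, speed=SPEED_OF_SOUND_M_S):
    """The pulse travels to the surface and back, so halve the path."""
    return speed * round_trip_s / 2.0

def fill_fraction(round_trip_s, sensor_height_m):
    """Fraction of the container that is full, clamped to 0..1."""
    level = sensor_height_m - distance_from_echo(round_trip_s)
    return max(0.0, min(1.0, level / sensor_height_m))

def needs_refill(round_trip_s, sensor_height_m, threshold=0.5):
    """True once the corn surface drops below the chosen fraction."""
    return fill_fraction(round_trip_s, sensor_height_m) < threshold
```

How you obtain `round_trip_s` depends entirely on the sensor and on what it is connected to.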
... or it could be a proximity sensor, as somebody had suggested above.
... or it could be a weight sensor. Here's an application note on weighing vessels.
If you google "level sensor for grains", you may find something useful.
What language to use would depend on what you connect the sensor to. If it is connected to a microcontroller, the language would most likely be C. If it is connected to a PC, it would depend a lot on the particular model of the sensor.
By the way, here's a web group dedicated to sensors.
I would imagine you could use a similar mechanism to a car's fuel tank. Have a mechanism that stays afloat in the container with an attached arm and a magnet on it, then using a Hall sensor you can observe the change in hall reading as the floating part rises or falls within the container.
"What I really need it for is a big container, in which is some corn."
Perhaps one of those sensors that are used to ensure garage entry ways are clear before an automatic garage door is allowed to close. It uses an optical beam of light.
Do you know the size of the glass in question? You could just get a scale and work out how heavy the glass would be when it is half full of water. My guess is that you could probably find a sensor that could do this and it would most likely need to be written in C.
This guy seems to be having the same problems:
http://forums.makezine.com/comments.php?DiscussionID=6052
Good luck.
Also check out Arduino for micro controller electronics.

WAV-MIDI matching [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Let's consider a variation of the "WAV to MIDI" conversion problem. I'm aware of the complexity of such a problem, and I know that a vast literature exists on the more general subject of Music Information Retrieval (MIR).
But let's suppose here that we already have both the WAV and the MIDI representation of a piece of music, so we don't actually have to discover the pitches inside the WAV signal from scratch... we "just" have to match the pitches detected (using a suitable algorithm) with the NoteOn events contained in the MIDI representation. I would definitely expect that we should use the information contained in the MIDI file to give some hints to the pitch detection algorithm.
Such a matching tool could be very useful, for example for MIDI "humanization": we could make the MIDI representation more expressive using the information retrieved from the WAV signal to "fine tune" note onsets, durations, dynamics, etc...
Does anybody know if such a problem has already been addressed in literature?
Any form of contribution or assistance will be greatly appreciated.
Thanks in advance.
At the 2010 Music Hackday in London some people used the MATCH Vamp plugin to align score to Youtube videos. It was very impressive! Maybe their source code could be of use. I don't know how well MATCH works on audio generated from MIDI files, but that could be worth a try. Here's a link: http://wiki.musichackday.org/index.php?title=Auto_Score_Tubing
This guy appears to have done something similar: http://www.musanim.com/wavalign/ His results are definitely interesting.
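One generic way to line up two sequences that describe the same music at different speeds is dynamic time warping (DTW). The sketch below is not taken from either of the tools linked above; it is a textbook DTW over two pitch sequences, with made-up names, assuming you already have the MIDI note numbers in order and a per-frame pitch estimate from the WAV.

```python
def dtw_align(reference, observed, cost=lambda a, b: abs(a - b)):
    """Return (total_cost, path), where path pairs reference and observed indices."""
    n, m = len(reference), len(observed)
    inf = float("inf")
    # acc[i][j] = cheapest cost of aligning the first i and first j items.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(reference[i - 1], observed[j - 1])
            acc[i][j] = step + min(acc[i - 1][j - 1],  # advance both
                                   acc[i - 1][j],      # advance reference only
                                   acc[i][j - 1])      # advance observed only
    # Walk back from the end, always taking the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((acc[i - 1][j - 1], i - 1, j - 1),
                      (acc[i - 1][j], i - 1, j),
                      (acc[i][j - 1], i, j - 1))
    path.reverse()
    return acc[n][m], path
```

With `reference = [60, 62, 64]` and `observed = [60, 60, 62, 64, 64]`, each MIDI note is paired with the frames in which it sounds, which is the kind of information you would need to adjust onsets and durations. Real audio would need a more robust cost function than a plain pitch difference.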
This seems like an interesting idea. What are you trying to do? Is it just matching the pitch of the notes, or do you have something else in mind?
One thing you could look into: if you know the note (as an integer value, I think; it's been a while) that will be passed into the noteOn method, you may be able to use that to map it to the WAV signal. It depends on what you are trying to do.
Also, there are some things that you could play around with in (I think it is called) the MIDI controller, such as modulation, pitch, volume and pan, or playing a couple of notes simultaneously. What you could do with this is have a background thread that changes some of those effects while the note is being played. For example, you could have a note get quieter the longer it is played, or have a note pan between the left and right speakers, etc.
I haven't really played with this code in a long time, but there are some examples of using a MIDI controller.

Looking for programs on audio tape/cassette containing programs for Sinclair ZX80 PC? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 6 years ago.
OK, so back before ice age, I recall having a Sinclair ZX80 PC (with TV as a display, and a cassette tape player as storage device).
Obviously, the programs on cassette tapes made a very distinct sound (er... noise) when playing the tape... I was wondering if someone still had those tapes?
The reason (and the reason this Q is programming related) is that IIRC different languages made somewhat different pitched noises, but I would like to run the tape and listen myself to confirm if that was really the case...
I have the tapes, but they've been stored in the garage at my parents' house and the last thirty years haven't been kind to them.
You can get images here though: http://www.zx81.nl/dload if that's any use. Perhaps there is a tool out there for converting from the bytes back to the audio ;)
Edit: Perhaps here: http://ldesoras.free.fr/prod.html#src_ay3hacking
On the ZX80, ZX81 and ZX Spectrum, tape output is achieved by the CPU toggling the output line level between a high state and a low state. Input is achieved by having the CPU watch an input line level. This very low level of operation was one of Sir Clive's cost-saving measures; rival machines like the BBC Micro had dedicated hardware for serialisation and deserialisation of data, so the CPU would just say "output 0xfe" and then the hardware would make the relevant noises and raise an interrupt when it was ready for the next byte. The BBC Micro specifically implements the Kansas City Standard, whereas the Sinclair machines in every instance use whatever ad hoc format best fitted the constraints of the machine.
The effect of that is that while almost every other machine that uses tape has tape output that sounds much the same from one program to the next by necessity, programs on a Sinclair machine could choose to use whatever encoding they wanted, which is the principle around which a thousand speed loaders were written. It's therefore not impossible that different programs would output distinctively different sounds. Some even used the symmetry between the tape input and output to do crude digital sampling, editing and playback, though they were never more than novelties for obvious reasons.
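To give a feel for what "the CPU toggles the line" means for the sound, here is a toy sketch of a software tape encoder. The scheme (a burst of square pulses per bit followed by a silent gap) and every number in it are illustrative assumptions, not the documented format of any Sinclair machine.

```python
def byte_to_bits(value):
    """Most significant bit first."""
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]

def encode_bits(bits, pulses_for_zero=4, pulses_for_one=9,
                half_period=3, gap=10):
    """Return a list of line levels (1 = high, 0 = low), one per time step."""
    levels = []
    for bit in bits:
        pulses = pulses_for_one if bit else pulses_for_zero
        for _ in range(pulses):
            levels.extend([1] * half_period)  # line held high
            levels.extend([0] * half_period)  # line held low
        levels.extend([0] * gap)              # silence between bits
    return levels
```

Because a 1 bit takes longer than a 0 bit in a scheme like this, the rhythm of the noise follows the data, and a loader that picked different pulse counts or timings would sound different again.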
That being said, the base units of the ZX80 and ZX81 contained just 1 KB of RAM, so it's quite likely that programmers would just use the ROM routines for reading and writing data, due to space constraints if nothing else. Then the sound differences would just be down to characteristic data, as suggested by slugster.
I know these come up on auction sites like Ebay quite frequently - if you want to buy them yourself. If you get someone else who owns one to listen then you are going to get their subjective opinion :)
In any case, the language used to save it would be the secondary cause of the pitch changes - it will be related to the data. IOW you could probably create a straight binary data file that sounded very similar to a BASIC program (the BASIC would have been saved as text, as it is interpreted).
I know the thread's old but... I was playing about with something similar last night and I've got a WAV of an old ZX81 game if you're still interested? PM me and I'll post it somewhere.
You can use something like http://www.wintzx.fr/ or pick something from http://www.worldofspectrum.org/utilities.html#tzxtools to convert an emulator file to an audio file and then you can just play it on your PC. Some tools also allow you to play the file directly. Emulator files can be found at http://www.zx81.nl/files.html and many other places.

Resources