Audio File: Playing data through one Speaker Only? - audio

I am working on a simple application that performs a speaker test. It is intended to play first through the left speaker, then through the right one (or based on selection). As there is no way of achieving this directly, I am trying to overwrite alternate bytes. I checked in a hex editor, and the bytes do repeat in alternating pairs (2 bytes each). But after overwriting, it still plays sound through both speakers.
I am currently using signed 16-bit little-endian samples.
Am I doing something wrong?
These are the available formats I can use while recording the sound:
Unsigned 8-bit samples
Signed 8-bit samples
Unsigned 16-bit little-endian samples
Unsigned 16-bit big-endian samples
Signed 16-bit little-endian samples
Signed 16-bit big-endian samples

I think that you want to look into using AUPannerUnit, or aupn, which controls the left and right stereo channels. There should be an AUPannerUnits.txt tutorial file that is helpful.
Source: Apple Audio Unit Documentation
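If you do want to pursue the byte-overwriting approach from the question, the usual pitfall is overwriting alternate single bytes instead of alternate samples: with 16-bit stereo, each sample is 2 bytes and each frame is 4 bytes (left sample, then right sample), so one channel occupies bytes 0-1 and the other bytes 2-3 of every frame. A minimal Java sketch, assuming 16-bit interleaved stereo; the helper name is made up:
// Mute one channel of interleaved 16-bit stereo PCM.
// Frame layout: [L low][L high][R low][R high] = 4 bytes per frame.
static void muteChannel(byte[] pcm, boolean muteLeft) {
    for (int i = 0; i + 3 < pcm.length; i += 4) {
        int off = muteLeft ? i : i + 2; // start of the 2-byte sample to silence
        pcm[off] = 0;      // low byte
        pcm[off + 1] = 0;  // high byte
    }
}
Note that 0 is only silence for signed formats; for unsigned 8-bit data the silence value is 0x80.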

Related

Addition of PCM Audio Files - Mixing Audio

I am posed with the task of mixing raw data from audio files. I am currently struggling to get a clean sound from mixing the data; I keep getting distortion or white noise.
Let's say I have two byte arrays of data from two AudioInputStreams. The AIS is used to stream a byte array from a given audio file, and I can play back single audio files using SourceDataLine's write method. I want to play two audio files simultaneously, so I am aware that I need to perform some sort of PCM addition.
Can anyone recommend whether this addition should be done with float values or byte values? Also, when it comes to adding 3, 4, or more audio files, I am guessing the summed values will overflow. Do I need to divide by a certain amount to avoid this? Let's say I am adding two 16-bit audio files (min -32,768, max 32,767).
I admit I have had some advice on this before but can't seem to get it working! I have code for what I have tried, but not with me!
Any advice would be great.
Thanks
First off, I question whether you are actually working with fully decoded PCM data values. Directly adding bytes would only make sense if the sound was recorded at 8-bit resolution, which is done less and less. These days, audio is more commonly recorded as 16-bit values, or more. There are some situations that don't require as much frequency content, but with current systems the CPU savings aren't as critical, so people opt to keep at least "CD quality" (16-bit resolution, stereo, 44100 frames per second).
So step one: you have to make sure that you are properly converting the byte streams to valid PCM. For example, with 16-bit encoding, the two bytes have to be appended in the correct order (which may be either big-endian or little-endian), and the resulting value used.
Once that is properly handled, it is usually sufficient to simply add the values and maybe impose a min and max filter to ensure the signal doesn't go beyond the defined range. I can think of two reasons why this works: (a) audio is usually recorded at a low enough volume that summing will not cause overflow, (b) the signals are random enough, with both positive and negative values, that moments where all the contributors line up in either the positive or negative direction are rare and short-lived.
Using a min and max will "clip" the signals and can introduce some audible distortion, but it is a much less horrible sound than overflow! If your sources are routinely hitting the min and max, you can multiply one or more of the contributing signals, as a whole, by a volume factor in the range 0 to 1 to bring the audio values down.
For 16-bit data, it works to perform operations directly on the signed integers that result from appending the two bytes together (-32768 to 32767). But it is a more common practice to "normalize" the values, i.e., convert the 16-bit integers to floats ranging from -1 to 1, perform operations at that level, and then convert back to integers in the range -32768 to 32767 and break those integers into byte pairs.
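As a rough illustration of that whole chain (append little-endian byte pairs into signed values, sum, clip to the 16-bit range, re-encode), here is a minimal Java sketch. It assumes both arrays hold signed 16-bit little-endian PCM in the same format; the mix helper name is just for illustration:
// Mix two signed 16-bit little-endian PCM byte arrays by summing samples.
static byte[] mix(byte[] a, byte[] b) {
    byte[] out = new byte[Math.min(a.length, b.length)];
    for (int i = 0; i + 1 < out.length; i += 2) {
        // Append the two bytes in little-endian order to get each sample.
        int sa = (short) ((a[i] & 0xFF) | (a[i + 1] << 8));
        int sb = (short) ((b[i] & 0xFF) | (b[i + 1] << 8));
        int sum = sa + sb;
        // Clip rather than let the value wrap around (overflow).
        if (sum > 32767) sum = 32767;
        if (sum < -32768) sum = -32768;
        out[i] = (byte) sum;            // low byte
        out[i + 1] = (byte) (sum >> 8); // high byte
    }
    return out;
}
The same structure works with normalized floats: divide each decoded sample by 32768.0, sum, clip to the range -1 to 1, then scale back up and split into byte pairs.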
There is a free book on digital signal processing that is well worth reading: Steven W. Smith's "The Scientist and Engineer's Guide to Digital Signal Processing." It will give much more detail and background.

File information of .raw audio files using terminal in linux

How do I get file information such as sampling rate, bit rate, etc. of .raw audio files using the terminal in Linux? soxi works for .wav files, but it isn't working for .raw.
If your life depended on discovering an answer, you could make some assumptions to tease apart the unknowns ... however, there is no automated way, since the missing header is exactly what would give you the easy answers ...
The audio editor Audacity allows you to open up a RAW file, make some guesses about its format, and play the track.
http://www.audacityteam.org
In Audacity go to File -> Import -> Raw Data...
Typical settings for audio ripped from a CD are signed 16-bit little-endian, stereo, 44100 Hz ... toy with trying stereo vs mono for starters.
Those picklist widgets give you wiggle room to discover the format of your PCM audio, provided the source audio is something recognizable when properly rendered ... it would be harder if the actual audio were noise.
However, if you need a programmatic method, then rolling your own solution to ask the same questions that appear in that import dialog is possible ... is that what you need, or will Audacity work for you? We can go down the road of writing code to play off the unknowns mentioned in Frank Lauterwald's comment.
To kick-start discovering this information programmatically: if the binary raw audio is 16-bit, then each audio sample (point on the audio curve) consumes two bytes of your PCM file. For mono audio, the following two bytes would be your next sample; if it's stereo, the two following bytes would be the sample from the other channel. If there are more than two channels, just repeat. Typical audio is little-endian. The sampling rate is important when rendering the audio, not when programmatically parsing raw bytes. One approach would be to create an output file with a WAV header followed by your source PCM data, populating the header with answers from your guesswork. This way you could listen to the output file to help confirm your guesses.
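A minimal Java sketch of that idea; the format values are guesses you would adjust until the output sounds right, and the file names are placeholders:
// Wrap raw PCM in a canonical 44-byte WAV header so a normal player can render it.
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RawToWav {
    public static void main(String[] args) throws IOException {
        int sampleRate = 44100, channels = 1, bitsPerSample = 16; // the guesses
        byte[] pcm = Files.readAllBytes(Paths.get("input.pcm"));
        int byteRate = sampleRate * channels * bitsPerSample / 8;
        ByteBuffer h = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        h.put("RIFF".getBytes()).putInt(36 + pcm.length).put("WAVE".getBytes());
        h.put("fmt ".getBytes()).putInt(16)        // fmt chunk, 16 bytes long
         .putShort((short) 1)                      // audio format 1 = PCM
         .putShort((short) channels)
         .putInt(sampleRate)
         .putInt(byteRate)
         .putShort((short) (channels * bitsPerSample / 8)) // block align
         .putShort((short) bitsPerSample);
        h.put("data".getBytes()).putInt(pcm.length);
        try (FileOutputStream out = new FileOutputStream("output.wav")) {
            out.write(h.array());
            out.write(pcm);
        }
    }
}
If the result plays at the wrong speed or pitch, adjust the sample rate; if it sounds like noise, try the other endianness or channel count.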
Here is a sample 500k mono PCM audio file, signed 16-bit, which can be imported into Audacity or used as input for rolling your own identification code:
The_Constructus_Corporation_Long_Street-ycexQvMy03k_excerpt_mono.pcm

24-Bit Sound Capture in Matlab on Linux

I have a rather different question. I'm using MATLAB on a Linux Gentoo machine. I have a few Asus Xonar STX soundcards, and I'm trying to use them as a sensitive audio frequency analyzer using the PlayRec non-blocking audio I/O package.
Now, I know that if you try to use the audiorecorder function and specify 24 bits on Linux, MATLAB will tell you that 24-bit is only supported on Windows. However, the ALSA literature does not imply that this is a limitation of the operating system or of ALSA itself; as a matter of fact, ALSA seems to allow you to specify a 24-bit PCM device. And PlayRec uses PortAudio, which in turn uses ALSA on Linux systems.
This is all well and good, but PlayRec doesn't seem to have a means of specifying the bit depth, just the sample rate. I have run many tests and know what the transfer function of my soundcard is (the floating-point return value to input voltage conversion ratio), and I know my peak voltage is 3 V and my noise is around 100 uV. This gives me 20*log10(3/100e-6) ≈ 90 dB, which is closer to what I would expect to see from 16 bits (roughly 96 dB of dynamic range) than from 24.
My real question is this: Is there some way of verifying that I am in fact getting 24 bits in my captured signal?
And if I am not, is there some inherent limitation of ALSA or MATLAB which restricts me to 16-bit data from sound capture devices, even when using a 3rd-party program to gather that data?
If you observe the data that PlayRec is putting out through playrec('getRec', ...), you'll see that it is always single-precision floating point (tested on Windows, MATLAB R2013b, most current Playrec). You can verify this yourself after recording a single page with Playrec by looking in the workspace window of the IDE, or by running whos('<variable_name_of_page>') at the command line.
If you look at Line 50 of pa_dll_playrec.h, you'll see that single-precision is chosen by definition:
/* Format to be used for samples with PortAudio = 32bit */
typedef float SAMPLE;
Unfortunately, this does not completely answer the question of exact sample precision, because the PortAudio lib converts samples from the APIs' varying formats into the defined one. So if you want to know what precision you're actually getting, I'd suggest a very pragmatic solution: look at the mantissa of the 32-bit floating-point sample values. A simple fprintf('%+.32f\n', data) should suffice to find out how many decimal places are actually used.
Edit: I just realized I got it wrong. But here's the trick: record audio off an empty channel of your audio device. Plot the recorded data and zoom into the noise floor. If you're just getting plain zeros, the device is probably not activated properly (or has too good a signal-to-noise ratio; try an external interface and/or turn up the gain a little bit). Depending on the actual bit resolution of the recorded data, you'll see quantization steps in the samples; depending on the bit depth originally used by the quantizer, those steps are bigger or smaller. Below you'll see the comparison between 16-bit (left) and 24-bit (right) of two separately recorded blocks, from the same audio device, only that I used PortAudio's WASAPI API (on Windows, obviously) on the left and ASIO on the right:
The difference is quite obvious: at these very low levels, 16-bit only allows three values, while 24-bit has much finer stepping. This should be a sufficient answer to your question of how to determine the real bit depth and whether your signal is recorded at 24-bit: if there are sample steps smaller than 2^-15, the odds are pretty good.
Looking into this topic made me realize that it very much depends on the API of the currently chosen recording device at which bit depth the quantization actually happens. ASIO seems to always use 24-bit, while, for example, WASAPI falls back to 16-bit.
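If you want to automate that visual inspection, one pragmatic check is to measure the smallest nonzero gap between recorded sample values: in true 16-bit data the gap cannot be smaller than 2^-15 (about 3.05e-5). A minimal sketch of that check, written here in Java purely for illustration; the helper name is made up:
// Return the smallest nonzero gap between sample values; compare it to 2^-15.
static double smallestStep(float[] samples) {
    float[] s = samples.clone();
    java.util.Arrays.sort(s);
    double min = Double.POSITIVE_INFINITY; // stays infinite if all values are equal
    for (int i = 1; i < s.length; i++) {
        double d = (double) s[i] - s[i - 1];
        if (d > 0 && d < min) min = d;
    }
    return min;
}
A result well below 2^-15 suggests the device is really delivering more than 16 bits.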
If you can store that signal as a WAV file, run the file command on it from the Linux command line. Something like:
file x.wav
will give you the sampling rate and the bits the file was encoded at. The output is usually something like: 16 bit, 16000 Hz, etc.

Decode G711(PCM u-law)

Please bear with me, as my understanding of audio codecs is limited.
I have this audio source from an IP cam (through an http://... CGI interface).
I am trying to write several client programs to play this audio source on Windows, Mac, and Android phones. The audio is encoded in G.711 (PCM u-law).
Do I need to decode the PCM audio data to raw audio data before I can pass it to the audio engine to play? If so, is there some sample code for how to decode it?
I am confused, as I believed PCM was already raw. Could I just feed it directly to the audio engine on Android, for example?
thanks much in advance
It depends on what API you are using to play sound, but most require linear PCM, and you have µ-law PCM, so unless your API supports µ-law playback you will need to convert the µ-law sample values to linear.
With G.711 the compressed µ-law samples are 8 bits, and these are expanded to 14-bit linear values, which you will store in a buffer as 2 bytes per sample. There is a brief description of the µ-law encoding on the G.711 Wikipedia page.
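For reference, a minimal Java sketch of the expansion, modeled on the widely circulated public-domain Sun g711.c routine; note that it returns the 14-bit linear value already scaled up to the full 16-bit range, ready to be stored as 2 bytes per sample:
// Expand one 8-bit G.711 u-law byte to a linear sample (range about +/-32124).
static short ulawToLinear(byte ulaw) {
    int u = ~ulaw & 0xFF;             // u-law bytes are stored bit-inverted
    int t = ((u & 0x0F) << 3) + 0x84; // 4-bit mantissa, plus the bias of 132
    t <<= (u & 0x70) >> 4;            // apply the 3-bit exponent (segment)
    return (short) ((u & 0x80) != 0 ? (0x84 - t) : (t - 0x84));
}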
You may find this useful:
u-Law companding algorithm in C

Do all audio formats have a header for message length?

Do all audio formats have a header with the audio length (in seconds)?
If not, what kinds of audio formats have that information embedded in the header?
Thank you.
Not necessarily. Typical wav files will have a wave format chunk (WAVEFORMATEX if you're coding on Windows) which contains the sample rate and number of bits per sample. Most of the WAV files you'll tend to come across are in PCM format where you know that there is always the same number of samples per second and bits per sample, so from the size of the file and these values you can work out the duration exactly.
There are other, much rarer types of WAV file which may be compressed, and for those you'll need to use the 'average bytes/sec' field of the WAV header to work out the length.
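As a small worked sketch of the PCM case, assuming the 'fmt ' values and the size of the 'data' chunk have already been read from the header:
// Duration of an uncompressed PCM WAV from its header fields.
static double durationSeconds(long dataBytes, int sampleRate, int channels, int bitsPerSample) {
    long bytesPerSecond = (long) sampleRate * channels * bitsPerSample / 8;
    return (double) dataBytes / bytesPerSecond;
}
For example, 1,764,000 data bytes at 44100 Hz, stereo, 16-bit gives 1,764,000 / 176,400 = 10 seconds.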
If you're using AIFF (largely used on Macs) then this has similar data members in the header.
Getting the length from an MP3 file is more difficult -- some suggestions are in this other question
