Crazy audio PCM file - audio

I can't figure out why two raw files that are not byte-identical look exactly the same when imported into Audacity.
Let me explain. I have a 16-bit PCM file named file1.pcm, which I import into Audacity with these parameters: Signed 16-bit PCM, little-endian, mono, 8000 Hz.
Right after that I export it with these parameters: raw, header-less, Signed 16-bit PCM.
I would expect the two files to be identical, but they are not.
So I have two different files, yet if I import them both into Audacity I get exactly the same thing:
[Images: file 1, file 2, and the two files imported into Audacity]
Does anyone have an explanation, and in particular how we get from one file to the other?
Thanks
.G

The problem was dither. Dither was enabled in Audacity; when I disable it, the imported and exported files are exactly the same.
.G
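Dither adds a small amount of noise when the audio is reduced to 16 bits on export, which would explain files that differ byte for byte yet look the same on screen. A minimal sketch for checking this, assuming both files are signed 16-bit little-endian mono as in the question (the function names are made up for illustration):

```python
import array
import sys


def load_s16le(raw: bytes) -> array.array:
    """Interpret raw bytes as signed 16-bit little-endian samples."""
    samples = array.array("h")
    # Drop a trailing odd byte, if any, so the length is a whole number of samples.
    samples.frombytes(raw[: len(raw) - len(raw) % 2])
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def diff_stats(a: bytes, b: bytes):
    """Return (number of differing samples, largest absolute difference)."""
    sa, sb = load_s16le(a), load_s16le(b)
    n = min(len(sa), len(sb))
    diffs = [abs(sa[i] - sb[i]) for i in range(n)]
    changed = sum(1 for d in diffs if d)
    return changed, max(diffs, default=0)
```

Calling diff_stats(open("file1.pcm", "rb").read(), open("file2.pcm", "rb").read()), where file2.pcm is a hypothetical name for the exported file, should report many changed samples but only a very small largest difference if dither is the cause.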

Related

File information of .raw audio files using terminal in linux

How can I get file information such as sampling rate, bit rate, etc. for .raw audio files from the terminal in Linux? soxi works for .wav files, but it doesn't work for .raw.
If your life depended on finding an answer, you could make some assumptions to tease apart the unknowns. However, there is no automated way, because the header that would give you the easy answers is missing.
The audio tool Audacity lets you open a raw file, make some guesses, and play the track:
http://www.audacityteam.org
In Audacity go to File -> Import -> Raw Data...
Settings typical for audio ripped from a CD are signed 16-bit PCM, stereo, 44100 Hz; for starters, toggle between stereo and mono.
Those picklist widgets give you room to discover the format of your PCM audio, provided the source audio is recognizable when rendered properly. It would be harder if the actual audio were noise.
However, if you need a programmatic method, you can roll your own solution that asks the same questions as the import dialog. Is that what you need, or will Audacity work for you? We can go down the road of writing code to work through the unknowns mentioned in @Frank Lauterwald's comment.
To kick-start discovering this information programmatically: if the raw audio is 16-bit, each audio sample (a point on the audio curve) consumes two bytes of your PCM file. For mono audio the following two bytes are the next sample; for stereo they are the sample from the other channel. With more than two channels the pattern simply repeats. Typical audio is little-endian. The sampling rate matters when rendering the audio, not when parsing the raw bytes. One approach is to create an output file consisting of a WAV header followed by your source PCM data, and to populate the header with the answers from your guesswork. You can then listen to this output file to help confirm your guesses.
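The WAV-header approach can be sketched with Python's standard wave module. This is only an illustration: the function name and the default guesses (mono, 2 bytes per sample, 8000 Hz) are assumptions to be replaced with your own guesswork.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, channels: int = 1,
                     sample_width: int = 2, sample_rate: int = 8000) -> bytes:
    """Prepend a WAV header describing the guessed format to raw PCM data.

    The wave module writes little-endian PCM; by WAV convention,
    1-byte samples are unsigned and wider samples are signed.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)      # guess: 1 = mono, 2 = stereo
        w.setsampwidth(sample_width)  # guess: bytes per sample
        w.setframerate(sample_rate)   # guess: samples per second
        w.writeframes(pcm)            # the PCM bytes are copied unchanged
    return buf.getvalue()
```

Write the returned bytes to a .wav file and play it; if it sounds wrong, change one guess at a time and try again.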
Here is a sample 500k mono PCM audio file (signed 16-bit) that can be imported into Audacity or used as input for your own identification code:
The_Constructus_Corporation_Long_Street-ycexQvMy03k_excerpt_mono.pcm

Converting From 4-bit RAW Audio to WAV (or another output format)

Okay, so I've got some .raw files from an old game (Zork Nemesis) and determined that they're audio files, however I'm having trouble converting them into something meaningful.
With a bit of trial and error in Audacity I've found that I can listen to a still-noisy version of the audio using raw import settings of 8-bit signed PCM, stereo, with a sample rate of 22050 Hz. However, my suspicion is that the files may in fact be encoded with 4 bits per sample at a sample rate of 44100 Hz, but I'm having trouble finding a tool that can handle this.
What I'm looking for is either a tool that can handle 4-bit raw formats, or even a tool that can determine (or guess at) the format of a given .raw file, so I know for sure what I'm dealing with (as I'm just going by trial and error so far).
I've tried sox, but I'm most likely doing something wrong as it complains of an unsupported size:
sox -r 44100 -e signed -b 4 -c 2 in.raw out.wav
I was also going to try ffmpeg, but I can't find the appropriate format/codec to set.
In case it gives any further clues: I've tried various combinations of settings. Increasing the sample size while decreasing the sample rate increases the (white) noise, and even 8-bit is still noisy, which is why I'm thinking 4-bit. I've tried signed and unsigned, which strangely doesn't seem to make much of a difference.
sox expects .raw input with 8-bit or higher encoding. So if you run
sox -r 44100 -e signed -b 8 -c 2 in.raw out.wav
it should work just fine. So either the file is actually encoded with 8 or more bits per sample, or you need to find a converter which accepts this form of input.
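If no such converter turns up, one way to test the 4-bit guess is to expand each nibble to a full byte yourself and import the result as unsigned 8-bit PCM. The sketch below assumes plain linear, unsigned 4-bit samples packed two per byte, which is only a guess; if the game uses some ADPCM variant instead, the output will still sound like noise. The function name and the nibble-order flag are made up for illustration.

```python
def unpack_4bit_unsigned(data: bytes, high_nibble_first: bool = True) -> bytes:
    """Expand packed 4-bit unsigned samples into 8-bit unsigned samples."""
    out = bytearray()
    for byte in data:
        hi, lo = byte >> 4, byte & 0x0F
        first, second = (hi, lo) if high_nibble_first else (lo, hi)
        # Multiplying by 17 maps the range 0..15 onto 0..255.
        out.append(first * 17)
        out.append(second * 17)
    return bytes(out)
```

The output is twice as long as the input, so each input byte now yields two 8-bit samples. Try both nibble orders, since the packing order is unknown.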

Dismantling a WAVE file

Sorry that this is not directly a programming question; it came up while I was trying to batch-convert audio files, which is proving difficult.
I have an audio file which I exported from a package. This audio file is in the RIFF WAVE format. As far as I have read about headers, a normal header is 44 bytes long and contains the sub-parts "fmt " and "data". However, this header shows all kinds of weird junk which I cannot place anywhere.
If anyone is an audio guru of sorts, please help me make this audio file accessible to most audio players. I do not mind losing some of the header data as long as the actual content plays.
Here is a screenshot of my current header data unaltered:
Thanks in advance.
44 bytes is the size of a minimal WAV file header. The format allows for other data chunks in the header in addition to the RIFF, fmt and data chunks.
It looks like you have some cue information in your file. This is not a problem; most audio players should accept a WAV file with these chunks.
How to write cues/markers to a WAV file in .NET discusses how to add a cue chunk to a file.
http://www.sonicspot.com/guide/wavefiles.html covers some of the additional chunks a wav file can have.
Mike
Turns out this WAVE thing is just a container, and it actually contains a .ogg. I used the third-party tool ww2ogg to get these .ogg files out of the WAVE container. Thanks for all the help though!
http://en.wikipedia.org/wiki/WAV has a table of WAV files with different compression. You can inspect the value of the AudioFormat field of the fmt chunk in a hex editor and compare it against a list of the most common codecs used for compression.
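Instead of a hex editor, the chunk layout and the AudioFormat field can also be read with a few lines of Python. This is a rough sketch, assuming the file starts with a standard RIFF/WAVE preamble; the function names are made up for illustration.

```python
import struct


def list_riff_chunks(data: bytes):
    """Return (chunk id, size, payload offset) for each chunk in a WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # skip "RIFF", the 4-byte total size, and "WAVE"
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), size, pos + 8))
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return chunks


def audio_format(data: bytes) -> int:
    """Return the AudioFormat code stored at the start of the fmt chunk."""
    for chunk_id, size, offset in list_riff_chunks(data):
        if chunk_id == "fmt ":
            return struct.unpack_from("<H", data, offset)[0]
    raise ValueError("no fmt chunk found")
```

A value of 1 means uncompressed PCM and 3 means IEEE float; any other value points to a codec that a player needs in order to decode the data chunk.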

Do all audio formats have a header for message length

Do all audio formats have a header field for the audio length (in seconds)?
If not, which audio formats have that information embedded in the header?
Thank you.
Not necessarily. Typical wav files will have a wave format chunk (WAVEFORMATEX if you're coding on Windows) which contains the sample rate and number of bits per sample. Most of the WAV files you'll tend to come across are in PCM format where you know that there is always the same number of samples per second and bits per sample, so from the size of the file and these values you can work out the duration exactly.
There are other types of WAV file though which may be compressed (though these are much rarer) and for those you'll need to use the 'average bytes/sec' field of the WAVE header to work out the length.
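Both calculations are simple divisions. A minimal sketch, assuming you have already read the relevant header fields and the size of the audio data in bytes (the function and parameter names are made up for illustration):

```python
def pcm_duration_seconds(data_bytes: int, sample_rate: int,
                         channels: int, bits_per_sample: int) -> float:
    """Exact duration of uncompressed PCM audio."""
    bytes_per_frame = channels * bits_per_sample // 8  # one sample per channel
    return data_bytes / (bytes_per_frame * sample_rate)


def duration_from_byte_rate(data_bytes: int, avg_bytes_per_sec: int) -> float:
    """Duration estimate for compressed WAV data from the average byte rate."""
    return data_bytes / avg_bytes_per_sec
```

For example, 16000 bytes of 16-bit mono audio at 8000 Hz is exactly one second, because every second consumes 8000 samples of two bytes each.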
If you're using AIFF (largely used on Macs), it has similar data members in the header.
Getting the length from an MP3 file is more difficult; some suggestions are in this other question.

compressed and uncompressed .wav files

What is the difference between compressed and uncompressed .wav files?
The WAV format is a container format for audio files in Windows.
The WAV file consists of a header and the contents. The header contains information about the size, sampling frequency, resolution, and other properties of the audio contained in the WAV file (the duration can be derived from these). Generally, after the header is the actual audio data.
Since WAV is a container format, the data it contains can be stored in various formats. One of which is uncompressed PCM, but it can also store ADPCM, MP3 and other formats, and can be read and written if an audio codec for the format is available.
The difference between compressed and uncompressed WAV files is that the data contained within the WAV file is either uncompressed raw audio samples, or it is compressed using an audio codec, in which case, it must be decompressed before it can be played back.
Further reading:
Wikipedia: Audio compression (data)
Wikipedia: WAV
Wikipedia: Codec
There's a great explanation here. The basic difference is that an uncompressed wave file has just the raw bits in it as they "appear". There is nothing done to compress or shrink them. A compressed wave file uses some sort of codec to shrink down the data before putting it in the file.
The difference between the two is basically file size: the compressed file may be smaller than the uncompressed one, while the content is essentially the same.
You have to be very careful when using the word "uncompressed" when talking about media.
Basically ALL digital media is compressed in some way, audio or video. No matter what it is, it is compressed in some way. It's intrinsic to converting from analog to digital.
The problem isn't really technical, it's linguistic.
People think that uncompressed means "nothing done to it" when in reality there really isn't any way you can do this. There is always some kind of compression done when you convert the analog signal coming out of the mic and going into a file. It's essential.
What uncompressed means is very high quality. And different "Uncompressed" codecs do things differently.
I know more about video codecs, so I will base my example on those.
Black Magic (a company that makes video output cards) has an Uncompressed codec. It's very good and makes beautiful images, but it's not really "uncompressed". Sure, it's big, but compare it to a DPX or TIFF image sequence and it isn't that big, and is quite compressed. It's only 10-bit, whereas something like an OpenEXR image sequence is 32-bit, and coming from film, that is still technically compressed. It has to be.
It's just the nature of the beast.

Resources