Do all audio formats have a header for message length? - audio

Do all audio formats have a header field for the audio length (in seconds)?
If not, which audio formats have that information embedded in the header?
Thank you.

Not necessarily. Typical WAV files have a wave format chunk (WAVEFORMATEX if you're coding on Windows) which contains the sample rate, the number of channels and the number of bits per sample. Most of the WAV files you'll come across are in PCM format, where there is always the same number of samples per second and bits per sample, so from the size of the audio data and these values you can work out the duration exactly.
There are other types of WAV file though which may be compressed (though these are much rarer) and for those you'll need to use the 'average bytes/sec' field of the WAVE header to work out the length.
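As a rough sketch of the calculation described above (not the only way to do it), Python's standard `wave` module can read the header fields of a PCM WAV file and derive the duration. The helper names `pcm_wav_duration_seconds`, `estimate_duration_from_avg_bytes` and `make_silent_wav` are made up for this illustration. Note that the `wave` module only handles uncompressed PCM, so the 'average bytes/sec' estimate is shown as plain arithmetic on values you would have to read from the header yourself.

```python
import io
import wave


def pcm_wav_duration_seconds(wav_file):
    """Duration of an uncompressed PCM WAV: frame count / sample rate."""
    with wave.open(wav_file, "rb") as w:
        return w.getnframes() / w.getframerate()


def estimate_duration_from_avg_bytes(data_size_bytes, avg_bytes_per_sec):
    """Estimate for compressed WAV data, using the header's average bytes/sec."""
    return data_size_bytes / avg_bytes_per_sec


def make_silent_wav(seconds, rate=44100, channels=1, sample_width=2):
    """Build an in-memory PCM WAV of silence so the example needs no real file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)  # bytes per sample, so 2 means 16-bit
        w.setframerate(rate)
        frames = int(seconds * rate)
        w.writeframes(b"\x00" * (frames * channels * sample_width))
    buf.seek(0)
    return buf


print(pcm_wav_duration_seconds(make_silent_wav(2.5)))
```

With a real file, pass its path to `pcm_wav_duration_seconds` instead of the in-memory buffer.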
If you're using AIFF (largely used on Macs), it has similar data members in its header.
Getting the length from an MP3 file is more difficult -- some suggestions are in this other question

Related

Is there a way to set the details of a file in Windows using python?

I want to be able to set the "Title" and "Comments" (listed in properties->details) of some mp3 files in Windows using python. Is this possible, perhaps with a library like PyWin32? Also, would these details be visible in other operating systems or are they Windows-specific? Thanks.
Simple Answer:
Yes, you can set the 'Title' and 'Comments' (and many other fields) of an mp3 file in Windows using Python.
Also, the details are visible on all operating systems and are not Windows-specific.
First you have to understand what an mp3 file is and how data is organized within it.
Detailed Answer:
Raw audio takes up a lot of space. For example, a 10-second mono audio signal sampled at 48 kHz with a bit depth of 16 bits per sample takes 10*48000*16 bits, which is close to 1 MB. So a 5-minute song takes almost 30 MB. But, if you look, most 5-minute mp3 songs are around 5 MB (of course this depends on the sampling frequency, the bit depth and the amount of compression used). How is that possible? It is possible because the data is compressed using signal processing techniques, which is a big topic in itself and which we will not discuss here. So, to create an mp3 file we need an encoder, which converts the raw audio data to compressed data, and every time you play an mp3 song a decoder converts the data from the compressed format back to raw audio, which is the only form you can actually listen to. Compression is done to save storage and also transmission bandwidth (basically, to reduce the amount of data sent over the internet).
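The size arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `raw_audio_size_bytes` is invented here, and mono audio is assumed unless a channel count is passed.

```python
def raw_audio_size_bytes(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Size of uncompressed PCM audio: duration x rate x sample size x channels."""
    total_bits = seconds * sample_rate_hz * bits_per_sample * channels
    return total_bits // 8


ten_seconds = raw_audio_size_bytes(10, 48000, 16)       # 960000 bytes, close to 1 MB
five_minutes = raw_audio_size_bytes(5 * 60, 48000, 16)  # 28800000 bytes, almost 30 MB
print(ten_seconds, five_minutes)
```

For stereo, pass `channels=2` and the sizes double.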
Now, on to how data is organized inside an mp3 file. An mp3 file obviously contains the compressed data. In addition, many mp3 files contain some metadata (like the Title and Comments you mentioned in your question). There are several formats for storing this metadata, so a program reading the mp3 file must also support the metadata format used; only then can you see the information. The metadata is operating-system independent and can be seen on any operating system, provided you have a suitable decoder.
Finally, yes, you can edit the metadata on Windows (or on any OS, for that matter) using Python. If you want to do this using only Python, without any library, you need to understand how data is organized inside an mp3 file, find the metadata inside it, edit it and store it back. But there are Python libraries and packages that support editing the metadata of mp3 files, and you can use them directly. Again, the metadata is independent of the OS: once you edit the properties, you should be able to see them on any OS, provided the decoder you use supports that metadata format.
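To make the "find the metadata, edit it and store it back" idea concrete, here is a minimal sketch using only the standard library. It assumes the simplest of the metadata formats, ID3v1, which is a fixed 128-byte block at the end of the file beginning with `TAG`. The function names `set_id3v1` and `read_id3v1` are invented for this example, and it overwrites the whole tag rather than merging fields. Many files and players use the more involved ID3v2 format instead, and whether a particular player or Windows Explorer displays ID3v1 fields is not something this sketch guarantees; for real work a tagging library is the practical choice.

```python
ID3V1_SIZE = 128  # "TAG" + title(30) + artist(30) + album(30) + year(4) + comment(30) + genre(1)


def _field(text, width):
    """Encode text as a fixed-width, zero-padded ID3v1 field."""
    return text.encode("latin-1", "replace")[:width].ljust(width, b"\x00")


def set_id3v1(mp3_bytes, title="", comment="", artist="", album="", year=""):
    """Return a copy of mp3_bytes with an ID3v1 tag appended (or replaced)."""
    body = mp3_bytes
    if len(body) >= ID3V1_SIZE and body[-ID3V1_SIZE:-ID3V1_SIZE + 3] == b"TAG":
        body = body[:-ID3V1_SIZE]  # drop the existing tag before writing a new one
    tag = (b"TAG" + _field(title, 30) + _field(artist, 30) + _field(album, 30)
           + _field(year, 4) + _field(comment, 30)
           + b"\xff")  # genre byte; 255 is used here to mean "not set"
    return body + tag


def read_id3v1(mp3_bytes):
    """Return the ID3v1 text fields as a dict, or None if there is no tag."""
    tag = mp3_bytes[-ID3V1_SIZE:]
    if len(tag) < ID3V1_SIZE or tag[:3] != b"TAG":
        return None

    def text(start, end):
        return tag[start:end].split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {"title": text(3, 33), "artist": text(33, 63), "album": text(63, 93),
            "year": text(93, 97), "comment": text(97, 127)}


# Stand-in bytes instead of a real mp3, purely for demonstration.
fake_mp3 = b"\xff\xfb" + b"\x00" * 100
tagged = set_id3v1(fake_mp3, title="My Title", comment="My comment")
print(read_id3v1(tagged))
```

To apply this to a real file, read it with `open(path, "rb")`, pass the bytes through `set_id3v1`, and write the result back, ideally to a copy first.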
Some links which will help you:
mp3 tag tool
Another stack overflow question which gives details about libraries that support viewing and editing of meta data using Python

File information of .raw audio files using terminal in linux

How to get file information like sampling rate, bit rate etc of .raw audio files using terminal in linux? Soxi works for .wav files but it isn't working for .raw.
If your life depended on finding an answer, you could make some assumptions to tease apart the unknowns. However, there is no automated way, since the missing header is exactly what would have given you the easy answers.
The audio analysis tool Audacity allows you to open a RAW file, make some guesses and play the track
http://www.audacityteam.org
In Audacity, go to File -> Import -> Raw Data...
The settings above are typical for audio ripped from a CD. Toy with stereo vs mono for starters.
Those picklist widgets give you wiggle room to discover the format of your PCM audio, provided the source audio is something recognizable when properly rendered. It would be harder if the actual audio was noise.
However, if you need a programmatic method, rolling your own solution that asks the same questions as the import window above is possible. Is that what you need, or will Audacity work for you? We can go down the road of writing code to play off the unknowns mentioned in @Frank Lauterwald's comment.
To kick-start discovering this information programmatically: if the binary raw audio is 16-bit, then each audio sample (a point on the audio curve) consumes two bytes of your PCM file. For mono audio, the following two bytes are your next sample; if it's stereo, those two following bytes are the sample from the other channel. If there are more than two channels, just repeat the pattern. Typical audio is little-endian. The sampling rate matters when rendering the audio, not when programmatically parsing the raw bytes. One approach would be to create an output file with a WAV header followed by your source PCM data, populating the header with the answers from your guesswork. That way you can listen to the output file to help confirm your guesses.
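A minimal sketch of that approach in Python follows. It assumes signed 16-bit little-endian samples, and the channel count and sample rate are guesses you would vary until the result sounds right. The helper names `decode_s16le` and `wrap_raw_pcm_as_wav` are made up for this illustration; the standard `wave` module writes the WAV header.

```python
import io
import struct
import wave


def decode_s16le(pcm_bytes, channels=1):
    """Split raw signed 16-bit little-endian PCM into one tuple of samples per channel."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    # Samples are interleaved: ch0, ch1, ..., ch0, ch1, ...
    return [samples[c::channels] for c in range(channels)]


def wrap_raw_pcm_as_wav(pcm_bytes, out_file, channels=1, sample_width=2, rate=44100):
    """Write a WAV header built from guessed parameters, followed by the raw PCM data."""
    with wave.open(out_file, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)  # bytes per sample
        w.setframerate(rate)
        w.writeframes(pcm_bytes)


# Tiny made-up input: two stereo frames, (1, -1) and (2, -2).
raw = struct.pack("<4h", 1, -1, 2, -2)
print(decode_s16le(raw, channels=2))

guess = io.BytesIO()  # use a path such as "guess.wav" to get a playable file
wrap_raw_pcm_as_wav(raw, guess, channels=2, rate=44100)
```

If the guess is wrong, the output usually sounds obviously off (noise, wrong speed or wrong pitch), which is what makes listening a useful check.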
Here is a sample 500k mono PCM audio file signed 16 bit which can be imported into audacity or used as input to rolling your own identification code
The_Constructus_Corporation_Long_Street-ycexQvMy03k_excerpt_mono.pcm

Dismantling a WAVE file

Sorry for this not being a programming question directly, but more indirectly, as I am trying to batch convert audio files, which is proving difficult.
I have an audio file which I exported from a package. This audio file is in the RIFF WAVE format. As far as I have read up on headers, normal headers are 44 bytes long and contain the sub-parts "fmt " and "data". However, this header shows all kinds of weird junk which I cannot actually place anywhere.
If anyone is an audio guru of sorts, please help me out with how to make this audio file accessible to most audio players. I do not mind losing some of the header data as long as it plays the actual content.
Here is a screenshot of my current header data unaltered:
Thanks in advance.
44 bytes is the size of a minimal WAV file header. The format allows for other chunks in the header in addition to the RIFF, fmt and data chunks.
It looks like you have some cue information in your file. This is not a problem; most audio players should accept a WAV file with these chunks.
How to write cues/markers to a WAV file in .NET discusses how to add a cue chunk to a file.
http://www.sonicspot.com/guide/wavefiles.html covers some of the additional chunks a wav file can have.
Mike
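To see which chunks a file actually contains, a small chunk walker is enough. The sketch below assumes the standard RIFF layout (a 4-byte chunk ID, a 4-byte little-endian size, then the payload, padded to an even length); the function name `iter_riff_chunks` and the sample bytes are invented for the illustration.

```python
import struct


def iter_riff_chunks(data):
    """Yield (chunk_id, payload_offset, payload_size) for each top-level chunk."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip "RIFF", the overall size and "WAVE"
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, pos + 8, size
        pos += 8 + size + (size & 1)  # payloads are padded to an even length


# Made-up example file: fmt chunk, an extra LIST chunk, then 4 bytes of audio data.
fmt = struct.pack("<HHIIHH", 1, 1, 44100, 88200, 2, 16)
body = (b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt)) + fmt
        + b"LIST" + struct.pack("<I", 3) + b"abc" + b"\x00"
        + b"data" + struct.pack("<I", 4) + b"\x00" * 4)
sample = b"RIFF" + struct.pack("<I", len(body)) + body

for chunk_id, offset, size in iter_riff_chunks(sample):
    print(chunk_id, offset, size)
```

On a real file, read the bytes with `open(path, "rb").read()` and pass them in; anything other than `fmt ` and `data` is an extra chunk that a player can skip over.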
Turns out this WAVE thing is just a container, and it actually contains a .ogg. I used ww2ogg 3rd party tool to get out these .ogg files as wave. Thanks for all the help though!
http://en.wikipedia.org/wiki/WAV has a table of WAVE files with different compression. You can inspect the value of the AudioFormat field of the fmt chunk in a hex editor and compare it against a list of the most common codecs used for compression.
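The same check can be done in code instead of a hex editor. This sketch looks for the `fmt ` chunk and reads its first two bytes, which hold the AudioFormat tag. The function name `audio_format_tag` is hypothetical, and the lookup table lists only a handful of well-known tag values, so an unknown tag simply means it is not in this short list.

```python
import struct

# A few well-known format tags; many more are registered.
KNOWN_FORMAT_TAGS = {
    0x0001: "PCM (uncompressed)",
    0x0002: "Microsoft ADPCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0055: "MPEG Layer 3",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}


def audio_format_tag(data):
    """Return the AudioFormat value from the fmt chunk of RIFF/WAVE bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            return struct.unpack_from("<H", data, pos + 8)[0]
        pos += 8 + size + (size & 1)  # skip to the next chunk (even-padded)
    raise ValueError("no fmt chunk found")


# Made-up header claiming MP3 data inside a WAV container.
fmt = struct.pack("<HHIIHH", 0x0055, 2, 44100, 16000, 1, 0)
body = b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt
example = b"RIFF" + struct.pack("<I", len(body)) + body

tag = audio_format_tag(example)
print(hex(tag), KNOWN_FORMAT_TAGS.get(tag, "unknown"))
```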

Estimating the time-position in an audio using data?

I am wondering how to estimate my current position in an audio stream, in terms of time, using the data itself.
For example, I read the data in byte[8192] blocks. How can I know how much time one byte[8192] block corresponds to?
If this is some sort of raw-ish encoding, like PCM, this is simple. The length in time is a function of the sample rate, bit depth, and number of channels. 30 seconds of 16-bit audio at 44.1kHz in mono is about 2.6MB (30 * 44100 * 2 = 2,646,000 bytes). However, you also need to factor in headers and container format crapola. WAV files, for example, can have a lot of other stuff in them.
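For the PCM case, the conversion from bytes to time is one division. The sketch below is illustrative only: the function name `block_duration_seconds` is invented, and it assumes the bytes counted are audio samples with any header already skipped.

```python
def block_duration_seconds(num_bytes, sample_rate_hz, bits_per_sample, channels):
    """Playback time covered by num_bytes of uncompressed PCM data."""
    bytes_per_second = sample_rate_hz * channels * bits_per_sample // 8
    return num_bytes / bytes_per_second


# One 8192-byte block of 16-bit mono audio at 44.1 kHz (88200 bytes per second).
per_block = block_duration_seconds(8192, 44100, 16, 1)

# Position after reading 100 such blocks.
position = block_duration_seconds(100 * 8192, 44100, 16, 1)
print(per_block, position)
```

Keeping a running total of bytes read and converting that total avoids accumulating rounding error from adding up per-block durations.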
Compressed formats are much more tricky. You can never be sure where you are without playing through the file to get to where you are. Of course you can always guesstimate based on the percentage of the file length, if that is good enough for your case.
I think this is not what he was asking.
First you have to tell us what kind of data you are using. WAV? MP3? Usually, unless you know where a block came from (so that you know whether it carries some kind of frame information and where to find it), you cannot determine that block's position.
If you have both the full stream and this block, you can search the stream for the block to find its position.

compressed and uncompressed .wav files

What is the difference between compressed and uncompressed .wav files?
The WAV format is a container format for audio files in Windows.
The WAV file consists of a header and the contents. The header contains information about the size, sampling frequency, resolution, and other properties of the audio contained in the WAV file, from which the duration can be worked out. Generally, the actual audio data comes after the header.
Since WAV is a container format, the data it contains can be stored in various formats. One of which is uncompressed PCM, but it can also store ADPCM, MP3 and other formats, and can be read and written if an audio codec for the format is available.
The difference between compressed and uncompressed WAV files is that the data contained within the WAV file is either uncompressed raw audio samples, or it is compressed using an audio codec, in which case, it must be decompressed before it can be played back.
Further reading:
Wikipedia: Audio compression (data)
Wikipedia: WAV
Wikipedia: Codec
There's a great explanation here. The basic difference is that an uncompressed wave file has just the raw bits in it as they "appear". There is nothing done to compress or shrink them. A compressed wave file uses some sort of codec to shrink down the data before putting it in the file.
The difference between the two is basically the file size: the compressed file is usually smaller than the uncompressed one, while the content is basically the same.
You have to be very careful when using the word "uncompressed" when talking about media.
Basically ALL digital media is compressed in some way, audio or video. No matter what it is, it is compressed in some way. It's intrinsic to converting from analog to digital.
The problem isn't really technical, it's linguistic.
People think that uncompressed means "nothing done to it" when in reality there really isn't any way you can do this. There is always some kind of compression done when you convert the analog signal coming out of the mic and going into a file. It's essential.
What uncompressed means is very high quality. And different "uncompressed" codecs do things differently.
I know more about video codecs, so I will base my example on those.
Blackmagic (a company that makes video output cards) has an Uncompressed codec. It's very good and makes beautiful images, but it's not really "uncompressed". Sure, it's big. But compare it to a DPX or TIFF image sequence and it isn't that big, and is quite compressed. It's only 10-bit, whereas something like an OpenEXR image sequence is more like 32-bit, and coming from film, even that is still technically compressed. It has to be.
It's just the nature of the beast.

Resources