How to read audio and video packets from mp4 file - audio

I am trying to write code in C/C++ (Objective-C) to parse the audio and video data from an MP4 file.
I know that the data in an MP4 file is contained under atoms, but I am not sure how I can parse out the audio and video data separately.
Thanks in advance for any help.

The MP4 format is fairly complicated, so I suggest you use a library. But if you can't use a library, or just want to learn the format, then you must parse about a dozen boxes (atoms) under the root moov box. The information from there can be used to find the frames within the mdat atom. The full specification is ISO/IEC 14496-12; you should be able to find a copy online.
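If you do go the hand-rolled route, a minimal sketch like the following (file name and error handling are illustrative) walks the top-level boxes and prints each type and size. Real demuxing then means descending into moov -> trak -> mdia -> minf -> stbl and using those sample tables to locate the audio and video frames inside mdat:

    #include <cstdint>
    #include <cstdio>

    int main() {
        std::FILE *f = std::fopen("movie.mp4", "rb");   // illustrative file name
        if (!f) return 1;

        unsigned char hdr[8];
        while (std::fread(hdr, 1, 8, f) == 8) {
            // Box header: 32-bit big-endian size followed by a 4-character type.
            uint64_t size = ((uint32_t)hdr[0] << 24) | (hdr[1] << 16) | (hdr[2] << 8) | hdr[3];
            char type[5] = { (char)hdr[4], (char)hdr[5], (char)hdr[6], (char)hdr[7], 0 };
            uint64_t header_len = 8;

            if (size == 1) {                    // size == 1: 64-bit "largesize" follows the type
                unsigned char big[8];
                if (std::fread(big, 1, 8, f) != 8) break;
                size = 0;
                for (int i = 0; i < 8; ++i) size = (size << 8) | big[i];
                header_len = 16;
            } else if (size == 0) {             // size == 0: box extends to the end of the file
                std::printf("box %s extends to end of file\n", type);
                break;
            }

            std::printf("box %s, size %llu bytes\n", type, (unsigned long long)size);
            std::fseek(f, (long)(size - header_len), SEEK_CUR);   // skip the box payload
        }
        std::fclose(f);
        return 0;
    }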

Related

Is there any visualization tool for .flac audio file or .ts audio file?

I am pretty new to processing audio files.
I want to build a web app that can take an audio file and turn it into a visualization for the user, like this: https://github.com/CrowdCurio/audio-annotator
Right now I am researching how to visualize audio data. The original data stored in S3 comes in two forms, .ts and .flac, which is why I want to ask whether there is any visualization tool that can directly use .ts or .flac audio files.
The only solution I can think of right now is to first convert them into .wav or .mp3 so that most visualization tools can process them, but .wav files waste a lot of storage as far as I know.
So if you know of any approach or tool to do this, please let me know!
Audio visualization requires audio data. Your compressed audio isn't audible until decoded. Therefore, you must decode the files to PCM before visualizing them.
This doesn't require that you store the files as WAV, but you'll at least have to decode them on-the-fly.
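For the FLAC files, a decoding library such as libsndfile can do this on the fly. Here is a minimal sketch (assuming libsndfile is installed and using an illustrative file name) that decodes a FLAC file to float PCM, ready to be reduced to waveform peaks; the .ts streams would need a demuxer/decoder such as ffmpeg/libav instead:

    #include <sndfile.h>
    #include <cstdio>
    #include <vector>

    int main() {
        SF_INFO info = {};
        SNDFILE *snd = sf_open("input.flac", SFM_READ, &info);   // illustrative file name
        if (!snd) { std::fprintf(stderr, "open failed: %s\n", sf_strerror(nullptr)); return 1; }

        // Decode the whole file to interleaved float PCM; these samples are what
        // you would reduce to min/max peaks for a waveform visualization.
        std::vector<float> pcm((size_t)info.frames * info.channels);
        sf_count_t frames_read = sf_readf_float(snd, pcm.data(), info.frames);

        std::printf("decoded %lld frames, %d channels, %d Hz\n",
                    (long long)frames_read, info.channels, info.samplerate);
        sf_close(snd);
        return 0;
    }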

File information of .raw audio files using terminal in linux

How do I get file information like sampling rate, bit rate, etc. of .raw audio files using the terminal in Linux? soxi works for .wav files but it isn't working for .raw.
If your life depended on discovering an answer you could make some assumptions to tease apart the unknowns ... however there is no automated way, since it is the missing header that would have given you the easy answers ...
The audio analysis tool called Audacity allows you to open up a RAW file, make some guesses and play the track
http://www.audacityteam.org
In Audacity go to File -> Import -> Raw Data...
The settings in that import dialog are typical for audio ripped from a CD ... toy with trying stereo vs mono for starters.
Those picklist widgets give you wiggle room to discover the format of your PCM audio, given that the source audio is something recognizable when properly rendered ... it would be harder if the actual audio were noise
However, if you need a programmatic method, then rolling your own solution to answer those same questions which appear in the import dialog is possible ... is that what you need, or will Audacity work for you? We can go down the road of writing code to play off the unknowns mentioned in @Frank Lauterwald's comment
To kick-start discovering this information programmatically: if the binary raw audio is 16 bit, then each audio sample (a point on the audio curve) consumes two bytes of your PCM file. For mono audio the following two bytes would be your next sample; however, if it is stereo, those two following bytes would be the sample from the other channel. If there are more than two channels, just repeat. Typical audio is little endian. The sampling rate matters when rendering the audio, not when programmatically parsing raw bytes. One approach would be to create an output file with a WAV header followed by your source PCM data, populating the header with answers from your guesswork. This way you could listen to the output file to help confirm your guesses.
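As a minimal sketch of that last idea (assuming signed 16-bit little-endian samples and illustrative file names; the sample rate and channel count are the guesses you are testing), the following writes a canonical 44-byte WAV header in front of the raw PCM:

    #include <cstdint>
    #include <cstdio>

    static void le32(std::FILE *f, uint32_t v) {
        unsigned char b[4] = { (unsigned char)v, (unsigned char)(v >> 8),
                               (unsigned char)(v >> 16), (unsigned char)(v >> 24) };
        std::fwrite(b, 1, 4, f);
    }
    static void le16(std::FILE *f, uint16_t v) {
        unsigned char b[2] = { (unsigned char)v, (unsigned char)(v >> 8) };
        std::fwrite(b, 1, 2, f);
    }

    int main() {
        const uint32_t sample_rate = 44100;   // guess
        const uint16_t channels    = 1;       // guess: mono
        const uint16_t bits        = 16;      // guess: signed 16-bit little endian

        std::FILE *in  = std::fopen("mystery.pcm", "rb");   // illustrative file names
        std::FILE *out = std::fopen("guess.wav", "wb");
        if (!in || !out) return 1;

        std::fseek(in, 0, SEEK_END);
        uint32_t data_len = (uint32_t)std::ftell(in);
        std::fseek(in, 0, SEEK_SET);

        const uint16_t block_align = channels * bits / 8;

        // Canonical 44-byte WAV header built from the guesses above.
        std::fwrite("RIFF", 1, 4, out); le32(out, 36 + data_len); std::fwrite("WAVE", 1, 4, out);
        std::fwrite("fmt ", 1, 4, out); le32(out, 16);
        le16(out, 1);                                   // AudioFormat 1 = uncompressed PCM
        le16(out, channels); le32(out, sample_rate);
        le32(out, sample_rate * block_align);           // byte rate
        le16(out, block_align); le16(out, bits);
        std::fwrite("data", 1, 4, out); le32(out, data_len);

        // Append the raw samples unchanged, then listen to guess.wav.
        char buf[4096]; size_t n;
        while ((n = std::fread(buf, 1, sizeof buf, in)) > 0) std::fwrite(buf, 1, n, out);
        std::fclose(in); std::fclose(out);
        return 0;
    }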
Here is a sample 500k mono PCM audio file (signed 16 bit) which can be imported into Audacity or used as input for rolling your own identification code:
The_Constructus_Corporation_Long_Street-ycexQvMy03k_excerpt_mono.pcm

Dismantling a WAVE file

Sorry for this not being a programming question directly, but rather indirectly, as I am trying to batch convert audio files, which is proving difficult.
I have an audio file which I exported from a package. This audio file is in the RIFF WAVE format. As far as I have read about headers, a normal header is 44 bytes long and contains the sub-chunks "fmt " and "data". However, this header shows all kinds of weird junk which I cannot actually place anywhere.
If anyone is an audio guru of sorts, please help me out with how to make this audio file playable in most audio players. I do not mind losing some of the header data as long as the actual content plays.
Here is a screenshot of my current header data unaltered:
Thanks in advance.
44 bytes is the size of a minimal WAV file header. The format allows for other data chunks in the file in addition to the RIFF, fmt and data chunks.
It looks like you have some cue information in your file. This is not a problem; most audio players should accept a WAV file with these chunks.
How to write cues/markers to a WAV file in .NET discusses how to add a cue chunk to a file.
http://www.sonicspot.com/guide/wavefiles.html covers some of the additional chunks a wav file can have.
Mike
Turns out this WAVE thing is just a container, and it actually contains an .ogg stream. I used the third-party tool ww2ogg to extract these .ogg files from the WAVE containers. Thanks for all the help though!
According to http://en.wikipedia.org/wiki/WAV there is a table of WAV files with different compression formats. You can just inspect the value of the AudioFormat field of the fmt chunk in a hex editor to identify which of the most common codecs was used for compression.
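If you would rather do that inspection in code than in a hex editor, a small sketch like this (the file name is a placeholder) walks the RIFF chunks, prints each chunk ID and size, and dumps the interesting fmt fields, including AudioFormat:

    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    static uint32_t rd32(const unsigned char *p) { return p[0] | (p[1] << 8) | (p[2] << 16) | ((uint32_t)p[3] << 24); }
    static uint16_t rd16(const unsigned char *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

    int main() {
        std::FILE *f = std::fopen("mystery.wav", "rb");   // illustrative file name
        if (!f) return 1;

        unsigned char riff[12];
        if (std::fread(riff, 1, 12, f) != 12 ||
            std::memcmp(riff, "RIFF", 4) || std::memcmp(riff + 8, "WAVE", 4)) {
            std::fprintf(stderr, "not a RIFF/WAVE file\n");
            return 1;
        }

        unsigned char hdr[8];
        while (std::fread(hdr, 1, 8, f) == 8) {
            uint32_t size = rd32(hdr + 4);
            std::printf("chunk '%.4s', %u bytes\n", (const char *)hdr, size);

            if (!std::memcmp(hdr, "fmt ", 4) && size >= 16) {
                unsigned char fmt[16];
                if (std::fread(fmt, 1, 16, f) != 16) break;
                std::printf("  AudioFormat=%u  channels=%u  sample rate=%u  bits=%u\n",
                            rd16(fmt), rd16(fmt + 2), rd32(fmt + 4), rd16(fmt + 14));
                std::fseek(f, (long)(size - 16 + (size & 1)), SEEK_CUR);  // skip any extension + pad byte
            } else {
                std::fseek(f, (long)(size + (size & 1)), SEEK_CUR);       // chunks are word aligned
            }
        }
        std::fclose(f);
        return 0;
    }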

Difference between audio encoding/decoding and format conversion

Recently I have been trying to convert an audio file from one format to another with ffmpeg. I did some googling, but the results left me a little confused about the difference between encoding/decoding an audio file and converting it from one format to another.
Let me describe it this way: there are several different file formats for video files (sometimes also called "wrappers" or containers). There are also several different codecs which can be used to encode (or compress) the audio and video. Audio and video use different codecs, and the encoded streams can be stored in different file types/formats.
So when you talk about "encoding" vs. "converting" a couple of things come into play.
"Encoding" would be the act of taking audio/video and encoding them into a given codec(s). "Converting" implies having stuff in one format, but wanting it in another. There are two ways of looking at this:
1. Often called "repackaging" - this is when the video (for example) has already been encoded correctly (let's say H.264, with a bunch of parameters), but you want it in a different file type - maybe it's an .AVI and you wanted it in an .MP4. This doesn't involve changing the actual video - just re-wrapping the H.264 stream in a new "wrapper", and is thus a fast operation.
2. Re-encoding. Let's say your audio was in MP3 format and you wanted it in AAC format. This would require decoding the entire MP3 stream and re-encoding it into AAC.
Obviously you can also do "1" and "2" together.
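As a concrete illustration with the ffmpeg command line (file names are placeholders, and the first case assumes the existing streams are permitted in an MP4 container):

    ffmpeg -i input.avi -c copy output.mp4     (case 1: repackage only, streams are copied untouched)
    ffmpeg -i input.mp3 -c:a aac output.m4a    (case 2: decode the MP3 and re-encode it as AAC)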
Refer to Formats and Codecs for detailed information.
Hope it helps!

How do I create an mp4 file from a collection of H.264 frames and audio frames?

I have a program that captures and stores H.264-encoded video as well as audio into a proprietary file format. I need to be able to export that video and audio to an MP4 file. I prefer C# but will use C++ if necessary. Any suggestions?
To produce an MPEG-4 Part 14 (.mp4) file you need a multiplexer. There is a choice of multiplexers out there:
FFmpeg (libavformat)
DirectShow filters (free and open source from GDCL, commercial)
Windows 7+ Media Foundation file sink
API and complexity vary, because some of these multiplexers are expected to be part of a pipeline rather than completely standalone classes. You might want to check the respective samples (and perhaps the license agreements, too) to see what is best for you.
Take a look at libmp4v2. It is fairly straightforward to use:
http://code.google.com/p/mp4v2/
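For the libmp4v2 route, a rough sketch could look like the following. The call names come from the mp4v2 headers, but the exact signatures, the track parameters, and the SPS/PPS handling shown here are assumptions that should be checked against your version of the library; the frame data is assumed to come from your proprietary file:

    #include <mp4v2/mp4v2.h>

    // Hypothetical sketch: mux already-encoded H.264 and AAC samples into an .mp4.
    // Check every call against <mp4v2/mp4v2.h>; defaults and signatures vary by version.
    void write_mp4(const char *path)
    {
        MP4FileHandle file = MP4Create(path);
        if (file == MP4_INVALID_FILE_HANDLE) return;

        // 90 kHz timescale with 3000 ticks per frame ~= 30 fps; width, height and the
        // profile/compat/level bytes must match what your encoder actually produced.
        MP4TrackId video = MP4AddH264VideoTrack(file, 90000, 3000, 1280, 720,
                                                0x64, 0x00, 0x1f,   // taken from the stream's SPS
                                                3);                 // 4-byte NAL length prefixes

        // Feed the SPS/PPS extracted from your proprietary file (hypothetical buffers):
        // MP4AddH264SequenceParameterSet(file, video, sps, spsLen);
        // MP4AddH264PictureParameterSet(file, video, pps, ppsLen);

        // AAC audio track: 48 kHz, 1024 samples per AAC frame.
        MP4TrackId audio = MP4AddAudioTrack(file, 48000, 1024, MP4_MPEG4_AUDIO_TYPE);

        // Then, for every stored frame, write one sample per track, e.g.:
        // MP4WriteSample(file, video, frameBytes, frameSize, MP4_INVALID_DURATION, 0, isIdrFrame);
        // MP4WriteSample(file, audio, aacBytes, aacSize);

        (void)audio;
        MP4Close(file);
    }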
