How would you render an audio wave form like this from PCM data?
How can I compare mic stream data with wave file data to find out whether the file's sound is mixed into the stream?
I'm using pyaudio and FFT to get audio frequency data frames. Are there any libraries for audio recognition? I do not need speech detection, just detection of certain sounds saved in files.
I am learning how to generate audio waveforms with SDL 2.0.
When I init the SDL audio, it asks me to provide an SDL_AudioFormat, which specifies the audio format, and a callback function, which is called when the audio system needs more data.
The SDL documentation lists many audio formats, but it gives no information about what actual data I should write to the callback buffer.
I tested these formats:
float with Sine: (-1,1)
S8(signed byte) with square wave: [-128, 127]
U16(unsigned short): [-32768, 32767]
All of them worked.
The question is that I don't know what exactly these audio formats mean.
Can somebody give me some information about it?
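As a rough illustration of what the format names usually imply, here is a minimal Python sketch (not SDL code) that maps one normalized float sample in [-1, 1] to the integer ranges normally associated with signed 8-bit, signed 16-bit and unsigned 16-bit samples. The helper names are hypothetical, and the scaling by 127 and 32767 is one common convention, assumed here for illustration. Note that an unsigned 16-bit sample runs from 0 to 65535 with silence at 32768, while [-32768, 32767] is the signed 16-bit range.

```python
def float_to_s8(x):
    """Map a float in [-1, 1] to a signed 8-bit sample in [-128, 127]."""
    return max(-128, min(127, int(round(x * 127))))

def float_to_s16(x):
    """Map a float in [-1, 1] to a signed 16-bit sample in [-32768, 32767]."""
    return max(-32768, min(32767, int(round(x * 32767))))

def float_to_u16(x):
    """Map a float in [-1, 1] to an unsigned 16-bit sample in [0, 65535].

    Unsigned formats shift the curve so that silence sits at mid-scale.
    """
    return float_to_s16(x) + 32768

silence = (float_to_s8(0.0), float_to_s16(0.0), float_to_u16(0.0))
full_scale = (float_to_s8(1.0), float_to_s16(1.0), float_to_u16(1.0))
```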
I currently have the idea to code a small audio converter application (e.g. FLAC to MP3 or m4a format) in C# or Python, but my problem is that I do not know at all how audio conversion works.
After some research, I read about analog-to-digital and digital-to-analog converters, but I guess this would be a digital-to-digital conversion or something like that, wouldn't it?
If someone could explain precisely how it works, it would be greatly appreciated.
Thanks.
raw digital audio is called PCM which is the audio format fundamental to any audio processing system ... it's uncompressed ... just a series of integers representing the height of the audio curve for each sample of the curve (the Y axis where time is the X axis along this curve)
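To make "a series of integers representing the height of the curve" concrete, here is a small Python sketch that samples a sine curve and quantizes each height to a signed 16-bit integer. The function name, the 440 Hz tone, the 44100 Hz sample rate and the 0.5 amplitude are all illustrative assumptions, not values from the question.

```python
import math

def sine_pcm16(freq_hz, sample_rate, num_samples, amplitude=0.5):
    """Return signed 16-bit integers sampling a sine curve.

    Each list entry is the height (Y axis) of the curve at one
    sample instant n / sample_rate (X axis).
    """
    peak = 32767  # largest positive value of a signed 16-bit sample
    samples = []
    for n in range(num_samples):
        t = n / sample_rate
        y = amplitude * math.sin(2 * math.pi * freq_hz * t)
        samples.append(int(round(y * peak)))
    return samples

samples = sine_pcm16(440.0, 44100, 8)
```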
... this PCM audio can be compressed using some codec then bundled inside a container often together with video or metadata channels ... so to convert audio from A to B you would first need to understand the container spec as well as the compressed audio codec so you can decompress audio A into PCM format ... then do the reverse ... compress the PCM into the codec of B then bundle it into the container of B
Before venturing further into this I suggest you master the art of WAVE audio files ... beauty of WAVE is that it's typically just a 44 byte header followed by the uncompressed integers of the audio curve ... write some code to read a WAVE file then parse the header (identify bit depth, sample rate, channel count, endianness) to enable you to iterate across each audio sample for each channel ... prove that it's working by sending your bytes into an output WAVE file ... diff input WAVE against output WAVE as they should be identical ... once mastered you are ready to venture into your above stated goal ... do not skip over grokking the notion of interleaving stereo audio as well as spreading out a single audio sample which has a bit depth of 16 bits across two bytes of storage and the reverse namely stitching together multiple bytes into a single integer with a bit depth of 16, 24 or even 32 bits while keeping endianness squared away ... this may sound scary at first however all necessary details are on the net as it's how I taught myself this level of detail
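The exercise described above can be sketched in Python roughly as follows. This is a minimal illustration, assuming the simplest case: a canonical 44 byte header with no extra chunks and 16-bit little-endian samples. The function names make_wav and parse_wav and the sample values are made up for the example; real files may carry extra chunks that this sketch rejects.

```python
import io
import struct
import wave

def make_wav(frames, channels=2, sample_rate=44100):
    """Build a 16-bit PCM WAVE file in memory from interleaved samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit depth
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(frames), *frames))
    return buf.getvalue()

def parse_wav(data):
    """Parse a canonical 44-byte header, then de-interleave the samples."""
    (riff, _size, wave_id, fmt_id, _fmt_len, _fmt_tag, channels,
     sample_rate, _byte_rate, _block_align, bits, data_id,
     data_len) = struct.unpack("<4sI4s4sIHHIIHH4sI", data[:44])
    if (riff, wave_id, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical 44-byte-header WAVE file")
    if bits != 16:
        raise ValueError("this sketch only handles 16-bit samples")
    body = data[44:44 + data_len]
    # Stitch each pair of bytes into one signed integer (little-endian).
    samples = [int.from_bytes(body[i:i + 2], "little", signed=True)
               for i in range(0, len(body), 2)]
    # Interleaved stereo is L0 R0 L1 R1 ...; slice out one list per channel.
    per_channel = [samples[c::channels] for c in range(channels)]
    return channels, sample_rate, bits, per_channel

left = [0, 1000, -1000, 32767]
right = [5, -5, 7, -32768]
interleaved = [s for frame in zip(left, right) for s in frame]
original = make_wav(interleaved)

channels, sample_rate, bits, per_channel = parse_wav(original)

# Round trip: re-interleave the channels and write the file again.
rebuilt = make_wav([s for frame in zip(*per_channel) for s in frame],
                   channels, sample_rate)
identical = rebuilt == original
```

Comparing rebuilt against original plays the role of the diff between input and output WAVE files suggested above.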
modern audio compression algorithms leverage knowledge of how people perceive sound to discard information which is indiscernible (lossy) as opposed to lossless algorithms which retain all the informational load of the source ... opus (http://opus-codec.org/) is a current favorite codec untainted by patents and is open source
I need the audio files for the YUV test video sequences such as foreman.yuv, akiyo.yuv, etc. None of the YUV sequences online come with an audio file.
Any other YUV sequence with an audio file that is suitable for encoder analysis would also work.
I think there is no audio available for the sequences you mention.
But have a look at https://media.xiph.org/video/derf/, at the bottom of the page you have full sequences with flac audio.
Is there a case where a video file could contain both MJPEG frames and a sound layer? I know that originally, people used to place an 8 kHz uncompressed PCM track alongside their MJPEG movie, since it is streamed/decoded/played frame by frame with no motion prediction needed. Can some decoder accept MJPEG with a more recent audio format?
[EDIT 1]
What I'll try first is to check whether ffmpeg handles the conversion of audio/video movies to MJPEG with audio, and I'll explore the header and the layers with a hex editor.
[EDIT 2]
OK. I've studied an MJPEG file with audio:
ffmpeg -i some_movie_with_music.mp4 -f avi -acodec mp3 -vcodec mjpeg mjpegWithSound.avi
And there's an MP3 file split across the total number of frames, one piece under each JPEG, plus some changes in the header. So it's easy to implement in a context where a mobile application would offer the user the opportunity to add an MP3 file to a series of JPEGs or to a movie. So, one more reason to use MJPEG when a platform has no encoder yet.
It's fun to watch your application take shape. :-) I'm going to assume this is a follow-on to your last question and that you want to write C# code to accomplish this task. Are you still writing this into an AVI container? AVI stands for "audio/video interleaved" and is designed to transport both audio and video.
So, yes, you should be able to write both MJPEG and audio into an AVI file.
Guess what! You have lots of options for audio codecs too. We haven't cataloged quite as many audio codecs as video codecs (but close). Good news, though: Implementing a basic audio encoder in pure C# should be much simpler than trying to port even an MPEG-1 video encoder. Alternatively, check around to see if you can find an MP3 encoder written in pure C#. AVI accommodates MP3. If not, try IMA ADPCM. It's easy to implement and gives you 4:1 compression. Thus, if you have a monophonic, 44100 Hz, 16-bit stream, that requires 88200 bytes/sec. IMA ADPCM will give you roughly 22050 bytes/sec (plus small overhead).
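The byte-rate arithmetic above can be checked with a few lines of Python. The helper name is hypothetical, and the 4:1 ratio simply reflects IMA ADPCM storing each 16-bit sample in 4 bits; the small per-block overhead mentioned above is ignored here.

```python
def pcm_bytes_per_second(sample_rate, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate * (bits_per_sample // 8) * channels

# Monophonic, 44100 Hz, 16-bit PCM stream.
pcm_rate = pcm_bytes_per_second(44100, 16, 1)

# IMA ADPCM keeps 4 bits per 16-bit sample, i.e. roughly a quarter
# of the PCM rate (block header overhead not included).
adpcm_rate = pcm_rate * 4 // 16
```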