Is there a way to get from a byte array to audio?

I want to extract the audio files from a game's .lib files. I heard that converting the .lib into a byte array is a good first step, which is where I am now, but I don't know what to do next.
Is there a method for doing this?
Small part of the array:

Related

Reading and understanding a raw audio file (specifically MP3)

I am trying to understand what the raw data from an audio file looks like and how to get at that data. I want to analyze it and see whether I can write a program that recognizes patterns in a song, such as finding the same beat in the chorus of a hip hop song. I think this could be doable if the data is in integer form.
I've looked up many tutorials for this, but they all use other libraries or don't explain it in a way I understand (more than likely the source of my issue).
I am wondering if there is someone out there who can help me understand a few things.
1). In an MP3 file, what is actually being stored? Is it an integer that tells the radio/amp/audio player a frequency, another integer for amplitude, and so on? (Oversimplified, because I don't know what other data is stored in an audio file.)
2). If it is stored in an integer format, is there a way to read the integers and analyze them? If it is not stored in an integer format, how is it stored, and is there a way to convert it to an integer format?
3). In visual representations of audio files like this one, it seems clearer what is what. It seems like the frequency is where on the circle the audio is represented, and the amplitude is how high it jumps. Is this right, or does it just appear that way and I am misunderstanding it?
4). Is this task harder than I think it is? Since I haven't found any good explanations or tutorials on how to do it, I am skeptical about how easy it would be.
(Sorry if this was poorly phrased; this is my first question on Stack and I am just illiterate :^)

Sampling WAV files to get the amplitude at a specific time

I am wondering if there is any way to step through a .wav file and get the amplitude/dB at a specific point in the file. I am reading it into a byte array now, but that is no help to me as far as I can see.
I am using this in conjunction with some hardware I have developed that encodes light data into binary and outputs audio. I won't get into the details, but I need to be able to do this in C# or C++. I can't find any info on this anywhere. I have never programmed anything relating to audio, so excuse me if this is a very easy thing.
I don't have anything started since this is the starting point, so if anybody can point me to some functions, libraries, or methods for collecting the amplitude of the wave at a specific time in the file, I would greatly appreciate it.
I hope this is enough info, and thank you in advance if you are kind enough to help.
It is possible, and it is straightforward: a file with PCM audio contains one value per channel for every 1/sample-rate of a second, so the values for time t start at frame index t × sample rate.
The sample format varies, however: 8-bit or 16-bit integers, or single-precision floating point values. You have to take this into account, and it is the reason you cannot use the bytes from the byte array directly.
The .WAV file also has a header preceding the actual payload.
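A minimal sketch of that lookup, in Python rather than the C# or C++ asked for, using only the standard library's `wave` and `struct` modules. The function names `amplitude_at` and `to_dbfs` are made up for illustration, and the sketch assumes an uncompressed 8-bit or 16-bit PCM file; the `wave` module does not read floating point WAV data, so that case is not covered here.

```python
import math
import struct
import wave


def amplitude_at(path, seconds):
    """Return one sample per channel at the given time, scaled to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()        # bytes per sample: 1 = 8-bit, 2 = 16-bit
        channels = wav.getnchannels()
        index = int(seconds * wav.getframerate())   # frame index = time * sample rate
        if not 0 <= index < wav.getnframes():
            raise ValueError("time is outside the file")
        wav.setpos(index)                 # positions are in frames; the header is skipped for us
        frame = wav.readframes(1)         # one frame = one sample for every channel
    if width == 1:
        # 8-bit WAV samples are unsigned and centred on 128
        return [(b - 128) / 128.0 for b in frame]
    if width == 2:
        # 16-bit WAV samples are signed, little-endian
        values = struct.unpack("<%dh" % channels, frame)
        return [v / 32768.0 for v in values]
    raise ValueError("unsupported sample width: %d bytes" % width)


def to_dbfs(sample):
    """Convert a normalised sample to decibels relative to full scale."""
    return 20 * math.log10(abs(sample)) if sample else float("-inf")
```

A single sample is an instantaneous value, so for a loudness reading it is probably more useful to apply the same idea to a short window of frames around the time of interest.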

Efficiently generating a time index of pre-transcribed speech using its audio source and open source tools

On TED.com they have transcriptions, and clicking a part of the transcription jumps to the matching section of the video.
I want to do this for 80 hours of audio and transcriptions I have, on Linux with OSS.
This is the approach I'm thinking of:
Start small with a 30-minute sample
Split the audio up into 2-minute WAV chunks, even if it breaks words up
Run the phrase spotter from CMU Sphinx's long-audio-aligner on each chunk, with the transcript
Take the time index for the words/phrases identified in each chunk and calculate the estimated time of the n-grams in the original audio file.
Does this seem like an efficient approach? Has anyone actually done this?
Are there alternative approaches worth trying, like dumb word counting, that may be accurate enough?
You can just feed all your audio and text into a long audio aligner and it will give you the timestamps of the words. Using these timestamps you can jump to a specific word in a file.
I'm not sure why you want to split your audio or do anything else.
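As a rough illustration of the last step, here is a small Python sketch of how word timestamps could drive the jump. The `aligned` list is invented sample data, not real aligner output, and the actual output format of a long audio aligner will differ; the function names are hypothetical too.

```python
from bisect import bisect_right

# Hypothetical aligner output: (word, start time in seconds), sorted by start time.
aligned = [
    ("on", 0.00),
    ("ted", 0.35),
    ("they", 0.80),
    ("have", 1.05),
    ("transcriptions", 1.30),
]


def seek_time(aligned, word_index):
    """Time to jump to when the reader clicks the word at `word_index`."""
    return aligned[word_index][1]


def word_at(aligned, seconds):
    """Index of the word being spoken at `seconds` (the last word that has started)."""
    starts = [start for _, start in aligned]
    return max(bisect_right(starts, seconds) - 1, 0)
```

`seek_time` covers the click-to-jump direction, and `word_at` covers the reverse one, for example highlighting the current word during playback.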

Automatically partition audio files into small parts

I am looking for a way to automatically extract parts from audio files. Something like ImageMagick for audio files.
I only need to extract random parts of a fixed length from a large set of complete Ogg Vorbis files. I know how to interpret the output of a program automatically, so I would be able to write a small script if I had programs to do the following:
Get the length of the file
Extract a part of the file, given an offset in seconds and a length
Is there any program that allows me to do this under Linux?
If there is a Python library that can do this, that would work as well.
You can use SoX (Sound eXchange) to do both.
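A sketch of what such a script might look like in Python, shelling out to SoX. It relies on two SoX features: `soxi -D`, which prints a file's duration in seconds, and the `trim` effect, which takes a start offset and a length. The function names are made up for illustration, and the sketch assumes a SoX build with Ogg Vorbis support on the `PATH`.

```python
import random
import subprocess


def duration_seconds(path):
    """Ask soxi for the length of the file in seconds."""
    out = subprocess.run(["soxi", "-D", path], check=True,
                         capture_output=True, text=True).stdout
    return float(out)


def trim_command(src, dst, offset, length):
    """Build the sox call that copies `length` seconds starting at `offset`."""
    return ["sox", src, dst, "trim", str(offset), str(length)]


def random_offset(total, length, rng=random):
    """Pick a start time so that the excerpt fits inside the file."""
    if length > total:
        raise ValueError("excerpt is longer than the file")
    return rng.uniform(0, total - length)


def extract_random_part(src, dst, length):
    """Write a random excerpt of `length` seconds from src to dst; return its offset."""
    offset = random_offset(duration_seconds(src), length)
    subprocess.run(trim_command(src, dst, offset, length), check=True)
    return offset
```

Calling `extract_random_part("song.ogg", "part.ogg", 30)` would then write a 30-second excerpt; the file names here are placeholders.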

Hashing raw audio data

I'm looking for a solution to this task: I want to open any audio file (MP3, FLAC, WAV), decode it to its extracted form, and hash that data. The thing is, I don't know how to get this extracted audio data. DirectX could do the job, right? Also, I suppose that if I have, for example, two MP3 files, both 320 kbps, where only the ID3 tags differ and one of the files has garbage mixed in with the audio data (the MP3 format allows garbage inside), and I extract both files, I should get exactly the same audio data, right? It'd only differ if one file is 128 kbps and the other 320 kbps, for example. So the question is: is there a way to use DirectX to get this extracted audio data? I imagine it'd be some function returning a byte array or something. It would also be handy to extract the whole file without playback. I want to process hundreds of files, so 3-10 minutes each (if files have to be played at natural speed for decoding) is way worse than one second per file (only extracting).
I hope my question is understandable.
Thanks a lot for answers,
Aaron
Use SoX, http://sox.sourceforge.net/ (multiplatform). It's faster than real time, as you'd like, and it's designed for batch mode much more than DirectX is. For example, sox in.mp3 -r 48k -b 16 -L -c 1 out.raw (format options placed before out.raw apply to the output file). Loop that over your hundreds of files using whatever scripting language you like (bash, python, .bat, ...).
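One way the loop could look in Python, as a sketch only: it asks SoX to write headerless PCM to standard output (`-t raw` with `-` as the output file) and feeds the bytes to SHA-256. The function names are hypothetical, and the format options sit after the input file so that they describe the output. Whether the decoded bytes are identical from run to run depends on SoX's resampling and dither settings, which this sketch does not control, so treat bit-exact hashes as an assumption to verify on your own files.

```python
import hashlib
import subprocess


def decode_command(path):
    """sox call that writes headerless 16-bit mono PCM to standard output."""
    return ["sox", path, "-t", "raw", "-r", "48k", "-b", "16",
            "-e", "signed-integer", "-L", "-c", "1", "-"]


def hash_stream(stream, chunk_size=65536):
    """SHA-256 of everything readable from a binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def hash_audio(path):
    """Hash the decoded samples of one file, ignoring tags and container layout."""
    proc = subprocess.Popen(decode_command(path), stdout=subprocess.PIPE)
    with proc.stdout:
        result = hash_stream(proc.stdout)
    if proc.wait() != 0:
        raise RuntimeError("sox failed on %s" % path)
    return result
```

Reading in chunks keeps memory use flat even for long files, and nothing is played back, so the speed is limited by decoding rather than by the length of the audio.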
