A simple way to turn raw ascii data into sound - audio

I'm looking for a method to process simple data into an audio output such as an mp3 file. The data is in the form of a two-column text file with a timestamp in milliseconds and a level in millivolts.
Ideally the method would be scriptable (with Linux or Unix tools). I have tried using Audacity to read raw data; however, it seems to expect binary files and doesn't seem to be flexible with sample rates etc.
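One possible approach, sketched here with only the Python standard library, is to convert the text file to a WAV file first and then hand that to an mp3 encoder. The sketch assumes whitespace-separated columns, evenly spaced timestamps (so the sample rate can be inferred from the mean step), and that scaling the largest level to full 16-bit range is acceptable. The function names `read_samples` and `text_to_wav` are made up for illustration.

```python
import struct
import wave


def read_samples(path):
    """Parse 'time_ms level_mv' rows; blank lines and '#' comments are skipped."""
    times, levels = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            t, v = line.split()[:2]
            times.append(float(t))
            levels.append(float(v))
    return times, levels


def text_to_wav(txt_path, wav_path):
    """Write the levels as 16-bit mono PCM and return the inferred sample rate."""
    times, levels = read_samples(txt_path)
    if len(times) < 2:
        raise ValueError("need at least two samples to infer a sample rate")
    # Assumption: samples are evenly spaced, so the mean step gives the rate.
    step_ms = (times[-1] - times[0]) / (len(times) - 1)
    rate = round(1000.0 / step_ms)
    # Scale so the largest excursion uses the full signed 16-bit range.
    peak = max(abs(v) for v in levels) or 1.0
    pcm = [int(round(v / peak * 32767)) for v in levels]
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return rate
```

Calling `text_to_wav("data.txt", "data.wav")` (file names are placeholders) produces a WAV file that an external encoder such as lame or ffmpeg can turn into an mp3. If the timestamps are not evenly spaced, the data would need to be resampled first, which this sketch does not attempt.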

Related

Is there a way to set the details of a file in Windows using python?

I want to be able to set the "Title" and "Comments" (listed in properties->details) of some mp3 files in Windows using python. Is this possible, perhaps with a library like PyWin32? Also, would these details be visible in other operating systems or are they Windows-specific? Thanks.
Simple Answer:
Yes, you can set 'Title' and 'Comments' (and many other fields) of an mp3 file in Windows using Python.
Also, the details are visible on all operating systems and are not Windows-specific.
First you have to understand what an mp3 file is and how data is organized within it.
Detailed Answer:
Raw audio takes up a lot of space. For example, a 10-second audio signal sampled at 48 kHz with a bit depth of 16 bits per sample has a size of 10*48000*16 bits, which is close to 1 MB. So a 5-minute song would take almost 30 MB. But, if you look, most 5-minute mp3 songs are around 5 MB (of course it depends on the sampling frequency, bit depth and amount of compression used). How is that possible? It is possible because the data is compressed using signal processing techniques, which is a big topic in itself and which we will not discuss here. So, to create an mp3 file we need something called an encoder, which converts the raw audio data to compressed data, and every time you play an mp3 song a decoder is used, which converts the data from the compressed format back to raw audio, the only form you can actually listen to. Compression is done to save storage and also transmission bandwidth (basically reducing the amount of data to be transmitted over the internet).
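The size arithmetic above can be checked with a few lines of Python. This is only an illustration; the helper name `raw_pcm_bytes` is made up, and the figures in the answer correspond to a single (mono) channel, so stereo doubles them.

```python
def raw_pcm_bytes(seconds, sample_rate, bits_per_sample, channels=1):
    """Size of uncompressed PCM audio in bytes."""
    return seconds * sample_rate * bits_per_sample * channels // 8


# 10 seconds at 48 kHz, 16 bits, mono: 960,000 bytes, i.e. close to 1 MB.
ten_seconds = raw_pcm_bytes(10, 48000, 16)

# A 5-minute (300 s) song with the same parameters: 28,800,000 bytes.
five_minutes = raw_pcm_bytes(300, 48000, 16)
```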
Now, on to how data is organized inside an mp3 file. An mp3 file obviously contains the compressed data. In addition, many mp3 files contain some metadata (like the Title and Comments you mentioned in your question). There are several formats for storing this metadata. A decoder that decodes the mp3 file must also support decoding the metadata; only then can you see the information, otherwise you can't. The metadata is operating system independent and can be seen on any operating system, provided you have a proper decoder.
Finally, yes, you can edit the metadata on Windows (or, for that matter, on any OS) using Python. If you want to do this using only Python, without any library, you need to understand how data is organized inside an mp3 file, find the metadata inside it, edit it and store it back. But there are Python libraries and packages that support editing the metadata of an mp3 file, and you can use them directly. Also, the metadata is independent of the OS, and once you edit your properties you should be able to see them in any OS, provided the decoder you use supports them.
Some links which will help you:
mp3 tag tool
Another stack overflow question which gives details about libraries that support viewing and editing of meta data using Python
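To give a feel for the "find the metadata, edit it and store it back" route without any library, here is a minimal sketch for the oldest and simplest tag format, ID3v1: a fixed 128-byte block at the end of the file that starts with the bytes "TAG". The function names are made up for illustration. ID3v2, the more elaborate format stored at the start of the file, is not handled here, and whether a given player or file manager displays ID3v1 fields is not something this sketch can promise; for ID3v2 a library such as mutagen is likely the more practical route.

```python
import struct

# ID3v1 layout: "TAG", title, artist, album, year, comment, genre = 128 bytes.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def _field(text, size):
    """Encode text as Latin-1, truncated and NUL-padded to a fixed width."""
    return text.encode("latin-1", "replace")[:size].ljust(size, b"\x00")


def build_id3v1(title="", artist="", album="", year="", comment="", genre=255):
    return ID3V1.pack(b"TAG", _field(title, 30), _field(artist, 30),
                      _field(album, 30), _field(year, 4),
                      _field(comment, 30), genre)


def write_id3v1(path, **fields):
    """Overwrite an existing ID3v1 tag, or append one if none is present."""
    tag = build_id3v1(**fields)
    with open(path, "r+b") as f:
        f.seek(0, 2)
        if f.tell() >= 128:
            f.seek(-128, 2)
            if f.read(3) == b"TAG":
                f.seek(-128, 2)  # existing tag: write over it
            else:
                f.seek(0, 2)     # no tag: append at the end
        f.write(tag)


def read_id3v1(path):
    """Return the ID3v1 fields as a dict, or None if the file has no tag."""
    with open(path, "rb") as f:
        f.seek(0, 2)
        if f.tell() < 128:
            return None
        f.seek(-128, 2)
        raw = f.read(128)
    if raw[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = ID3V1.unpack(raw)

    def clean(b):
        return b.split(b"\x00")[0].decode("latin-1").strip()

    return {"title": clean(title), "artist": clean(artist),
            "album": clean(album), "year": clean(year),
            "comment": clean(comment), "genre": genre}
```

For example, `write_id3v1("song.mp3", title="My Title", comment="My Comment")` would set the two fields (the file name is a placeholder). Note that this replaces the whole tag, so fields not passed in are cleared.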

File information of .raw audio files using terminal in linux

How to get file information like sampling rate, bit rate etc of .raw audio files using terminal in linux? Soxi works for .wav files but it isn't working for .raw.
If your life depended on discovering an answer, you could make some assumptions to tease apart the unknowns ... however, there is no automated way, since the header that would give you the easy answers is missing ...
The audio analysis tool Audacity allows you to open a RAW file, make some guesses and play the track
http://www.audacityteam.org
In Audacity go to File -> Import -> Raw Data...
The import dialog asks for the encoding, byte order, number of channels and sample rate. Signed 16-bit PCM, stereo, 44100 Hz is typical for audio ripped from a CD ... try stereo vs mono for starters.
Those picklist widgets give you wiggle room to discover the format of your PCM audio, given that the source audio is something recognizable when properly rendered ... it would be harder if the actual audio was noise
However, if you need a programmatic method, then rolling your own solution that asks the same questions as that dialog is possible ... is that what you need, or will Audacity work for you? We can go down the road of writing code to play off the unknowns mentioned in @Frank Lauterwald's comment
To kick-start discovering this information programmatically: if the binary raw audio is 16 bit, then each audio sample (point on the audio curve) will consume two bytes of your PCM file. For mono audio the following two bytes would be your next sample; however, if it's stereo, these two following bytes would be the sample from the other channel. If there are more than two channels, just repeat. Typical audio is little endian. Sampling rate is important when rendering the audio, not when programmatically parsing raw bytes. One approach would be to create an output file with a WAV header followed by your source PCM data. Populate the header with answers from your guesswork. This way you could listen to this output file to help confirm your guesses.
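The WAV-header approach can be sketched with Python's standard `wave` module, which writes the header for you. The function name `pcm_to_wav` and its default values are guesses to be varied, not facts about any particular file. The sketch also assumes the samples are already little endian (and signed, for 16 bit), which is what a WAV file expects; big-endian data would need its bytes swapped first.

```python
import wave


def pcm_to_wav(pcm_path, wav_path, rate=44100, channels=2, sample_width=2):
    """Wrap headerless PCM in a WAV header built from guessed parameters.

    Returns the number of frames written (one frame = one sample per channel).
    """
    with open(pcm_path, "rb") as f:
        data = f.read()
    frame_size = channels * sample_width
    # Drop trailing bytes that do not form a complete frame.
    data = data[: len(data) - len(data) % frame_size]
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)  # bytes per sample: 2 means 16 bit
        w.setframerate(rate)
        w.writeframes(data)
    return len(data) // frame_size
```

Trying a few combinations, for example `pcm_to_wav("in.pcm", "guess_mono.wav", channels=1)` and `pcm_to_wav("in.pcm", "guess_stereo.wav", channels=2)`, and listening to the results is a quick way to confirm or reject a guess. A wrong channel count or sample rate tends to show up as audio that plays at the wrong speed or pitch.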
Here is a sample 500k mono PCM audio file signed 16 bit which can be imported into audacity or used as input to rolling your own identification code
The_Constructus_Corporation_Long_Street-ycexQvMy03k_excerpt_mono.pcm

Converting Audio From Unknown Format

I would like to create a utility in either PHP or Perl to convert an audio file created by Nortel's Callpilot voice mail system into a wave file. The problem is that the format, which has the .vbk file extension, is unknown to virtually any audio player. To date, I have not found one that will play a .vbk file. I've looked at audio file conversion libraries in CPAN and tried many of them, but they don't recognize the file. I was not successful with PHP's audio format manipulation either. Nortel does provide a converter; however, it does not suit my needs. I would like to have this run via cron on a CentOS system. I don't know how to reverse engineer this format. There seem to be just scraps of info on this format on the web. This page indicates that it is "based on the H.232 format":
https://www.odesk.com/o/jobs/job/Reverse-Engineer-Nortel-VBK-Audio-Format_~~f501f11679f3f6bb/
I know this is a very old thread, but I've recently been looking into converting Nortel's vbk format as well. Importing the vbk files into Audacity with the raw data option worked for me with these settings: Encoding: U-Law, Byte order: little-endian, Channels: 1 Channel (Mono), Sample rate: 8000 Hz. I'm not sure if they have multiple formats for their vbk files, but mine were from a BCM50 phone system.
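If those Audacity settings also fit your files, the same conversion could be scripted for cron. The sketch below is in Python rather than PHP or Perl and uses only the standard library; it decodes each byte with the standard G.711 mu-law expansion and writes a 16-bit mono WAV. The function names and the `skip_bytes` parameter are hypothetical: nothing here is known about a vbk header, so any header bytes are simply decoded as a short burst of noise unless you skip them.

```python
import struct
import wave


def ulaw_to_linear(u):
    """Expand one G.711 mu-law byte to a signed 16-bit PCM value."""
    u = ~u & 0xFF
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def ulaw_file_to_wav(src, dst, rate=8000, skip_bytes=0):
    """Decode a headerless mu-law file to a 16-bit mono WAV file.

    skip_bytes is a guess-and-check knob for a possible leading header.
    Returns the number of samples written.
    """
    with open(src, "rb") as f:
        data = f.read()[skip_bytes:]
    pcm = [ulaw_to_linear(b) for b in data]
    with wave.open(dst, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return len(pcm)
```

Since mu-law samples are one byte each, the byte-order setting in the Audacity dialog should not matter for this encoding.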
Well, this is the joy of closed proprietary systems. But there is a chance they could play nice. Try to contact Callpilot and see if they'll give you the format specs. It's worth a shot.
As for reverse engineering, you need to be able to generate known content, like a constant tone at 60 Hz for exactly 1 second. Then at 50 Hz. Then for 10 seconds. Compare them. Isolate the data from the metadata. There is going to be compression involved, so try a handful of common compression schemes; researching Nortel's practices may tell you more. If you can feed the result into a player and get a tone back out, you're on your way.
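The reference tones themselves are easy to produce. Here is a small sketch that writes a sine tone to a WAV file using only the Python standard library; the function name, the 8000 Hz default rate and the half-scale amplitude are arbitrary choices. Getting the tone recorded by the voice mail system, so that it comes back out in the unknown format, is a separate step this does not cover.

```python
import math
import struct
import wave


def write_tone(path, freq_hz, seconds, rate=8000, amplitude=0.5):
    """Write a constant sine tone as 16-bit mono PCM; return the sample count."""
    n = int(round(seconds * rate))
    pcm = [
        int(round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    ]
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % n, *pcm))
    return n
```

Calls such as `write_tone("tone_60hz_1s.wav", 60, 1)`, `write_tone("tone_50hz_1s.wav", 50, 1)` and `write_tone("tone_60hz_10s.wav", 60, 10)` give the three inputs suggested above (file names are placeholders).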
There's probably more informed and structured ways to go about reverse engineering, but from my experience it's a lot of trial and error.

automatically partition audio files into small parts

I am looking for a way to automatically extract parts from audio files. Something like Imagemagick for audio files.
I only need to extract random parts of a fixed length from a large set of complete ogg-vorbis files. I know how to automatically interpret the output of a program, so I would be able to write a small script if I had programs to do the following:
Get the length of the file
Extract a part of the file, given an offset in seconds and a length
Is there any program, which allows me to do this under linux? The files I am using are ogg vorbis files.
If there is a python library, which is able to do this, it would work as well.
You can use SoX (Sound eXchange) to do both.
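As a rough sketch of how that could look from Python: `soxi -D` prints a file's duration in seconds, and SoX's `trim` effect takes a start offset and a length. The function names below are made up, and the sketch assumes that `sox` and `soxi` are on the PATH and were built with Ogg Vorbis support.

```python
import random
import subprocess


def duration_seconds(path):
    """Ask soxi for the duration of an audio file in seconds."""
    result = subprocess.run(["soxi", "-D", path], check=True,
                            capture_output=True, text=True)
    return float(result.stdout.strip())


def pick_offset(duration, length, rng=random):
    """Choose a random start so that the part fits inside the file."""
    if length > duration:
        raise ValueError("requested part is longer than the file")
    return rng.uniform(0.0, duration - length)


def trim_command(src, dst, offset, length):
    """Build the sox call that copies `length` seconds starting at `offset`."""
    return ["sox", src, dst, "trim", "%.3f" % offset, "%.3f" % length]


def extract_random_part(src, dst, length):
    """Extract a random part of fixed length; return the offset that was used."""
    offset = pick_offset(duration_seconds(src), length)
    subprocess.run(trim_command(src, dst, offset, length), check=True)
    return offset
```

For instance, `extract_random_part("full.ogg", "part.ogg", 10)` would write a random 10-second excerpt (file names are placeholders). Looping this over a directory covers the "large set of files" case.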

Hashing raw audio data

I'm looking for a solution to this task: I want to open any audio file (MP3, FLAC, WAV), decode it to raw audio data, and hash that data. The thing is, I don't know how to get the decoded audio data. DirectX could do the job, right? Also, I suppose that if I have, for example, two MP3 files, both 320 kbps, where only the ID3 tags differ and one of the files has garbage mixed in with the audio data (the MP3 format allows garbage inside), then decoding both should give exactly the same audio data, right? It would only differ if one file is 128 kbps and the other 320 kbps, for example. So the question is: is there a way to use DirectX to get this decoded audio data? I imagine it would be some function returning a byte array or something. Also, it would be handy to decode the whole file without playback. I want to process hundreds of files, so 3-10 minutes each (if files have to be played at natural speed for decoding) is way worse than one second each (decoding only).
I hope my question is understandable.
Thanks a lot for answers,
Aaron
Use SoX, http://sox.sourceforge.net/ (multiplatform). It decodes faster than real time, as you'd like, and it's designed for batch mode much more than DirectX is. For example, sox in.mp3 -r 48k -b 16 -e signed-integer -L -c 1 out.raw (the format options go before the output file, because SoX applies them to the file name that follows them). Loop that over your hundreds of files using whatever scripting language you like (bash, python, .bat, ...).
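One way to combine the decoding and hashing steps, sketched in Python: let SoX write raw PCM to standard output (a `-` in place of the output file name, with `-t raw`) and feed that stream into a hash. The function names and the choice of SHA-256 are illustrative. The sketch assumes `sox` is on the PATH and was built with support for the input formats. Whether two differently tagged MP3 files really decode to identical bytes is something to verify on your own files rather than take from this sketch.

```python
import hashlib
import subprocess


def decode_command(path):
    """sox call that decodes `path` to raw 16-bit mono 48 kHz PCM on stdout."""
    return ["sox", path, "-t", "raw", "-e", "signed-integer",
            "-b", "16", "-L", "-c", "1", "-r", "48k", "-"]


def hash_stream(stream, chunk_size=65536):
    """SHA-256 of everything readable from a binary stream, read in chunks."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def hash_audio(path):
    """Hash the decoded audio of a file, ignoring tags and container bytes."""
    with subprocess.Popen(decode_command(path), stdout=subprocess.PIPE) as proc:
        digest = hash_stream(proc.stdout)
    if proc.returncode != 0:
        raise RuntimeError("sox failed on %s" % path)
    return digest
```

Converting everything to one fixed output format means the hash also depends on SoX's resampling and channel mixing. If you would rather hash the samples exactly as decoded, the `-r` and `-c` options could be dropped.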
