I have code that decodes an MP3 and populates an array with all the 'values'.
My question is: what are those values? Are they frequencies? Are they amplitudes?
This is the code:
File file = new File(song.getFilepath());
if (file.exists()) {
    AudioInputStream in = AudioSystem.getAudioInputStream(file);
    AudioInputStream din = null;
    AudioFormat baseFormat = in.getFormat();
    AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
            baseFormat.getSampleRate(),
            16,
            baseFormat.getChannels(),
            baseFormat.getChannels() * 2,
            baseFormat.getSampleRate(),
            false);
    din = AudioSystem.getAudioInputStream(decodedFormat, in);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] data = new byte[4096];
    SourceDataLine line = getLine(decodedFormat);
    int nBytesRead = 0, nBytesWritten = 0;
    while (nBytesRead != -1) {
        nBytesRead = din.read(data, 0, data.length);
        if (nBytesRead != -1) {
            nBytesWritten = line.write(data, 0, nBytesRead);
            out.write(data, 0, nBytesRead);
        }
    }
    byte[] audio = out.toByteArray();
    System.err.println(audio.length);
    for (byte b : audio) {
        System.err.println(b);
    }
}
I get about 40,000,000 numbers (the length of the byte array) for a 3-minute song, but I have no idea what they are.
They are amplitudes. Usually, each amplitude is 16 bits (2 bytes, with a range of -32768 to 32767), and there are two channels (left and right), so one sample frame spans four bytes.
Example:
100, 0, 2, 1
means the left amplitude is 100 (out of 32767) and the right is 258: the data is little-endian, so the second sample is 2 + 1 * 256 = 258.
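A minimal sketch (assuming the format negotiated above: 16-bit signed PCM, little-endian, two channels) that turns the decoded byte[] back into per-channel amplitudes:

for (int i = 0; i + 3 < audio.length; i += 4) {
    short left  = (short) ((audio[i]     & 0xff) | (audio[i + 1] << 8));
    short right = (short) ((audio[i + 2] & 0xff) | (audio[i + 3] << 8));
    // left and right are amplitudes in the range -32768..32767
}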
Related
I have a buffer (array) full of audio data, filled by AudioRecord.
I want to play it reversed, but when I try to reverse the buffer I just get noise.
public void startWriteAndPlayProcess() {
    writeAudio();
    playAudio();
}

private void writeAudio() {
    audioTrack.write(reverseArray(audioBuffer), 0, bufferSize);
}

private void playAudio() {
    audioTrack.play();
}

private byte[] reverseArray(byte[] array) {
    byte[] array1 = new byte[array.length];
    for (int i = 0; i < array1.length; i++) {
        array1[i] = array[array1.length - 1 - i];
    }
    return array1;
}
What can you people recommend?
The underlying audio samples are actually an array of shorts (16-bit) or ints (24- or 32-bit). If you just reverse the raw byte array, you swap the byte order within each sample, putting the least significant byte where the most significant byte should be, and that makes your signal sound like noise. To get it to work properly you need to first convert the byte array to an array of the proper type, reverse that, and then convert it back into a byte array.
private void writeAudio()
{
    short[] shortArray = toShortArray(audioBuffer);
    short[] reversedShortArray = reverseArray(shortArray);
    byte[] reversedByteArray = toByteArray(reversedShortArray);
    audioTrack.write(reversedByteArray, 0, bufferSize);
}

private short[] toShortArray(byte[] byteArray)
{
    short[] shortArray = new short[byteArray.length / 2];
    for (int i = 0; i < shortArray.length; i++)
    {
        // mask the low byte so sign extension doesn't clobber the high byte
        shortArray[i] = (short)((byteArray[i*2] & 0xff) | (byteArray[i*2 + 1] << 8));
        // alternatively - depending on the endianness of the data:
        // shortArray[i] = (short)((byteArray[i*2] << 8) | (byteArray[i*2 + 1] & 0xff));
    }
    return shortArray;
}
Of course you'll have to change the type of reverseArray. I'll leave it up to you to figure out how to go back to bytes from the short array or to write the int versions of them if that's what you need.
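A sketch of those remaining pieces for the 16-bit case, keeping the little-endian byte order used in toShortArray() above (note that for stereo data you would reverse whole frames, i.e. left/right sample pairs, rather than individual samples):

private short[] reverseArray(short[] array)
{
    short[] reversed = new short[array.length];
    for (int i = 0; i < reversed.length; i++)
    {
        reversed[i] = array[array.length - 1 - i];
    }
    return reversed;
}

private byte[] toByteArray(short[] shortArray)
{
    byte[] byteArray = new byte[shortArray.length * 2];
    for (int i = 0; i < shortArray.length; i++)
    {
        byteArray[i*2]     = (byte)(shortArray[i] & 0xff);        // low byte first
        byteArray[i*2 + 1] = (byte)((shortArray[i] >> 8) & 0xff); // then high byte
    }
    return byteArray;
}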
I'm writing audio from an external decoding library on OS X to an AIFF file, and I am able to swap the endianness of the data with OSSwapInt32().
The resulting AIFF file (16-bit PCM stereo) does play, but the left and right channels are swapped.
Would there be any way to swap the channels as I am writing each buffer?
Here is the relevant loop:
do
{
    xmp_get_frame_info(writer_context, &writer_info);
    if (writer_info.loop_count > 0)
        break;

    writeModBuffer.mBuffers[0].mDataByteSize = writer_info.buffer_size;
    writeModBuffer.mBuffers[0].mNumberChannels = inputFormat.mChannelsPerFrame;

    // Set up our buffer to do the endianness swap
    void *new_buffer;
    new_buffer = malloc((writer_info.buffer_size) * inputFormat.mBytesPerFrame);
    int *ourBuffer = writer_info.buffer;
    int *ourNewBuffer = new_buffer;

    memset(new_buffer, 0, writer_info.buffer_size);

    int i;
    for (i = 0; i <= writer_info.buffer_size; i++)
    {
        ourNewBuffer[i] = OSSwapInt32(ourBuffer[i]);
    }

    writeModBuffer.mBuffers[0].mData = ourNewBuffer;
    frame_size = writer_info.buffer_size / inputFormat.mBytesPerFrame;
    err = ExtAudioFileWrite(writeModRef, frame_size, &writeModBuffer);
} while (xmp_play_frame(writer_context) == 0);
This solution is very specific to 2-channel audio. I chose to do it in the same loop where you change the byte ordering, to avoid an extra pass. I go through the loop half as many times and process two samples per iteration. The samples are interleaved, so I copy from odd sample indexes into even sample indexes and vice versa.
for (i = 0; i < writer_info.buffer_size / 2; i++)
{
    ourNewBuffer[i*2] = OSSwapInt32(ourBuffer[i*2 + 1]);
    ourNewBuffer[i*2 + 1] = OSSwapInt32(ourBuffer[i*2]);
}
An alternative is to use a table lookup for channel mapping.
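To illustrate the table-lookup idea (sketched here in Java only for brevity; `interleaved` is a hypothetical array of interleaved samples): channelMap[c] names the input channel that output channel c should read from, so a stereo left/right swap is simply { 1, 0 }.

int channels = 2;
int[] channelMap = { 1, 0 };  // output channel c copies from input channel channelMap[c]
int[] remapped = new int[interleaved.length];
for (int frame = 0; frame < interleaved.length / channels; frame++) {
    for (int c = 0; c < channels; c++) {
        remapped[frame * channels + c] = interleaved[frame * channels + channelMap[c]];
    }
}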
I'm working with my GSM modem (Huawei E171) to send USSD commands.
To do this, I first issue these commands:
AT+CMGF=1
AT+CSCS=? ----> the result is "IRA", which is my modem's default
After that I sent the following commands, got these results, and everything works fine.
//*141*1# -----> to check my balance
+CUSD:
0,"457A591C96EB40B41A8D0692A6C36C17688A2E9FCB667AD87D4EEB4130103D
0C8281E4753D0B1926E7CB2018881E06C140F2BADE5583819A4250D24D2FC
BDD653A485AD787DD65504C068381A8EF76D80D2287E53A55AD5653D554
31956D04",15
//*100# ----> this command gives me some options to recharge my mobile
+CUSD:
1,"06280627062C06470020062706CC06310627064606330644000A0030002E062E0
63106CC062F00200634062706310698000A0031002E067E062706330627063106A
F0627062F000A0032002E0622067E000A0033002E06450644062A000A003
4002E06330627064506270646000A0035002E067E0627063106330
6CC06270646000A002300200028006E0065007800740029000A",72
I found some code to decode these results.
To decode the balance-check result I used:
string result141="457A591C96EB40B41A8D0692A6C36C17688A......."
byte[] packedBytes = ConvertHexToBytes(result141);
byte[] unpackedBytes = UnpackBytes(packedBytes);
// sometimes this works and sometimes it doesn't - I couldn't figure out why
string o = Encoding.Default.GetString(unpackedBytes);
The code for my functions is:
public static byte[] ConvertHexToBytes(string hexString)
{
    if (hexString.Length % 2 != 0)
        return null;
    int len = hexString.Length / 2;
    byte[] array = new byte[len];
    for (int i = 0; i < array.Length; i++)
    {
        string tmp = hexString.Substring(i * 2, 2);
        array[i] = byte.Parse(tmp, System.Globalization.NumberStyles.HexNumber);
    }
    return array;
}
public static byte[] UnpackBytes(byte[] packedBytes)
{
    byte[] shiftedBytes = new byte[(packedBytes.Length * 8) / 7];
    int shiftOffset = 0;
    int shiftIndex = 0;

    // Shift the packed bytes to the left according to the offset (position of the byte)
    foreach (byte b in packedBytes)
    {
        if (shiftOffset == 7)
        {
            shiftedBytes[shiftIndex] = 0;
            shiftOffset = 0;
            shiftIndex++;
        }
        shiftedBytes[shiftIndex] = (byte)((b << shiftOffset) & 127);
        shiftOffset++;
        shiftIndex++;
    }

    int moveOffset = 0;
    int moveIndex = 0;
    int unpackIndex = 1;
    byte[] unpackedBytes = new byte[shiftedBytes.Length];

    if (shiftedBytes.Length > 0)
    {
        unpackedBytes[unpackIndex - 1] = shiftedBytes[unpackIndex - 1];
    }

    // Move the bits to the appropriate byte (unpack the bits)
    foreach (byte b in packedBytes)
    {
        if (unpackIndex != shiftedBytes.Length)
        {
            if (moveOffset == 7)
            {
                moveOffset = 0;
                unpackIndex++;
                unpackedBytes[unpackIndex - 1] = shiftedBytes[unpackIndex - 1];
            }
            if (unpackIndex != shiftedBytes.Length)
            {
                // Extract the bits to be moved
                int extractedBitsByte = (packedBytes[moveIndex] & _decodeMask[moveOffset]);
                // Shift the extracted bits to the proper offset
                extractedBitsByte = (extractedBitsByte >> (7 - moveOffset));
                // Move the bits to the appropriate byte (unpack the bits)
                int movedBitsByte = (extractedBitsByte | shiftedBytes[unpackIndex]);
                unpackedBytes[unpackIndex] = (byte)movedBitsByte;
                moveOffset++;
                unpackIndex++;
                moveIndex++;
            }
        }
    }

    // Remove the padding if exists
    if (unpackedBytes[unpackedBytes.Length - 1] == 0)
    {
        byte[] finalResultBytes = new byte[unpackedBytes.Length - 1];
        Array.Copy(unpackedBytes, 0, finalResultBytes, 0, finalResultBytes.Length);
        return finalResultBytes;
    }
    return unpackedBytes;
}
But to decode the second result I used:
string strHex = "06280627062C06470020062706CC06310......";
strHex = strHex.Replace(" ", "");
int nNumberChars = strHex.Length / 2;
byte[] aBytes = new byte[nNumberChars];
using (var sr = new StringReader(strHex))
{
    for (int i = 0; i < nNumberChars; i++)
        aBytes[i] = Convert.ToByte(new String(new char[2] { (char)sr.Read(), (char)sr.Read() }), 16);
}
string decodedmessage = Encoding.BigEndianUnicode.GetString(aBytes, 0, aBytes.Length);
Both of them work correctly, but why do I need a different decoding method for each result?
How can I tell which of these two decodings I should use?
USSD responses arrive as +CUSD unsolicited result codes, formatted as follows:
+CUSD: <m>[, <str_urc>[, <dcs>]]
Where "m" is the type of action required, "str_urc" is the response string, and "dcs" is the response string encoding.
This quote is from a Siemens Cinterion MC55i manual but applies generally to other modem manufacturers:
If dcs indicates that GSM 03.38 default alphabet is used TA converts GSM alphabet into current TE character
set according to rules of GSM 07.05 Annex A. Otherwise in case of invalid or omitted dcs conversion of
str_urc is not possible.
USSDs can be sent 7-bit encoded or in UCS2, hence when looking at your two example responses you can see a DCS of either 15 or 72.
GSM 03.38 Cell Broadcast Data Coding Scheme in integer format (default 15). In case of an invalid or omitted
dcs from the network side (MT), str_urc will not be given out.
So if you get a DCS of 15 then it is GSM 7-bit encoded, and if it is 72 then it is UCS2. From this you can easily select either your first decoding routine or your second.
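A minimal sketch of that selection (written in Java for illustration; the question's code is C#). The hex-to-bytes step is inlined, and gsm7BitUnpack() is a placeholder for a 7-bit unpacker like the UnpackBytes routine above:

static String decodeCusdPayload(String hex, int dcs) {
    byte[] raw = new byte[hex.length() / 2];
    for (int i = 0; i < raw.length; i++) {
        raw[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
    }
    if (dcs == 72) {
        // UCS2: 16-bit big-endian code units (same as Encoding.BigEndianUnicode in C#)
        return new String(raw, java.nio.charset.StandardCharsets.UTF_16BE);
    }
    // DCS 15: GSM 03.38 7-bit default alphabet, packed septets.
    // gsm7BitUnpack() is assumed here; strictly the unpacked septets index the GSM
    // alphabet, which only mostly overlaps with ASCII.
    return new String(gsm7BitUnpack(raw), java.nio.charset.StandardCharsets.US_ASCII);
}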
I am trying to use OpenCL and image2d_t objects to speed up image convolution. When I noticed that the output was a blank image of all zeros, I simplified the OpenCL kernel to a basic read from the input and write to the output (shown below). With a little bit of tweaking, I got it to write a few scattered pixels of the image into the output image.
I have verified that the image is intact up until the call to read_imageui() in the OpenCL kernel. I wrote the image to GPU memory with CommandQueue::enqueueWriteImage() and immediately read it back into a brand new buffer in CPU memory with CommandQueue::enqueueReadImage(). The result of this call matched the original input image. However, when I retrieve the pixels with read_imageui() in the kernel, the vast majority of the pixels are set to 0.
C++ source:
int height = 112;
int width = 9216;
unsigned int numPixels = height * width;
unsigned int numInputBytes = numPixels * sizeof(uint16_t);
unsigned int numDuplicatedInputBytes = numInputBytes * 4;
unsigned int numOutputBytes = numPixels * sizeof(int32_t);
cl::size_t<3> origin;
origin.push_back(0);
origin.push_back(0);
origin.push_back(0);
cl::size_t<3> region;
region.push_back(width);
region.push_back(height);
region.push_back(1);
std::ifstream imageFile("hri_vis_scan.dat", std::ifstream::binary);
checkErr(imageFile.is_open() ? CL_SUCCESS : -1, "hri_vis_scan.dat");
uint16_t *image = new uint16_t[numPixels];
imageFile.read((char *) image, numInputBytes);
imageFile.close();
// duplicate our single channel image into all 4 channels for Image2D
cl_ushort4 *imageDuplicated = new cl_ushort4[numPixels];
for (int i = 0; i < numPixels; i++)
    for (int j = 0; j < 4; j++)
        imageDuplicated[i].s[j] = image[i];
cl::Buffer imageBufferOut(context, CL_MEM_WRITE_ONLY, numOutputBytes, NULL, &err);
checkErr(err, "Buffer::Buffer()");
cl::ImageFormat inFormat;
inFormat.image_channel_data_type = CL_UNSIGNED_INT16;
inFormat.image_channel_order = CL_RGBA;
cl::Image2D bufferIn(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, inFormat, width, height, 0, imageDuplicated, &err);
checkErr(err, "Image2D::Image2D()");
cl::ImageFormat outFormat;
outFormat.image_channel_data_type = CL_UNSIGNED_INT16;
outFormat.image_channel_order = CL_RGBA;
cl::Image2D bufferOut(context, CL_MEM_WRITE_ONLY, outFormat, width, height, 0, NULL, &err);
checkErr(err, "Image2D::Image2D()");
int32_t *imageResult = new int32_t[numPixels];
memset(imageResult, 0, numOutputBytes);
cl_int4 *imageResultDuplicated = new cl_int4[numPixels];
for (int i = 0; i < numPixels; i++)
    for (int j = 0; j < 4; j++)
        imageResultDuplicated[i].s[j] = 0;
std::ifstream kernelFile("convolutionKernel.cl");
checkErr(kernelFile.is_open() ? CL_SUCCESS : -1, "convolutionKernel.cl");
std::string imageProg(std::istreambuf_iterator<char>(kernelFile), (std::istreambuf_iterator<char>()));
cl::Program::Sources imageSource(1, std::make_pair(imageProg.c_str(), imageProg.length() + 1));
cl::Program imageProgram(context, imageSource);
err = imageProgram.build(devices, "");
checkErr(err, "Program::build()");
cl::Kernel basic(imageProgram, "basic", &err);
checkErr(err, "Kernel::Kernel()");
basic.setArg(0, bufferIn);
basic.setArg(1, bufferOut);
basic.setArg(2, imageBufferOut);
queue.finish();
cl_ushort4 *imageDuplicatedTest = new cl_ushort4[numPixels];
for (int i = 0; i < numPixels; i++)
{
    imageDuplicatedTest[i].s[0] = 0;
    imageDuplicatedTest[i].s[1] = 0;
    imageDuplicatedTest[i].s[2] = 0;
    imageDuplicatedTest[i].s[3] = 0;
}
double gpuTimer = clock();
err = queue.enqueueReadImage(bufferIn, CL_FALSE, origin, region, 0, 0, imageDuplicatedTest, NULL, NULL);
checkErr(err, "CommandQueue::enqueueReadImage()");
// Output from above matches input image
err = queue.enqueueNDRangeKernel(basic, cl::NullRange, cl::NDRange(height, width), cl::NDRange(1, 1), NULL, NULL);
checkErr(err, "CommandQueue::enqueueNDRangeKernel()");
queue.flush();
err = queue.enqueueReadImage(bufferOut, CL_TRUE, origin, region, 0, 0, imageResultDuplicated, NULL, NULL);
checkErr(err, "CommandQueue::enqueueReadImage()");
queue.flush();
err = queue.enqueueReadBuffer(imageBufferOut, CL_TRUE, 0, numOutputBytes, imageResult, NULL, NULL);
checkErr(err, "CommandQueue::enqueueReadBuffer()");
queue.finish();
OpenCL kernel:
__kernel void basic(__read_only image2d_t input, __write_only image2d_t output, __global int *result)
{
    const sampler_t smp = CLK_NORMALIZED_COORDS_TRUE | // Natural coordinates
                          CLK_ADDRESS_NONE |           // Clamp to zeros
                          CLK_FILTER_NEAREST;          // Don't interpolate
    int2 coord = (get_global_id(1), get_global_id(0));
    uint4 pixel = read_imageui(input, smp, coord);
    result[coord.s0 + coord.s1 * 9216] = pixel.s0;
    write_imageui(output, coord, pixel);
}
The coordinates in the kernel are currently mapped to (x, y) = (width, height).
The input image is a single channel greyscale image with 16 bits per pixel, which is why I had to duplicate the channels to fit into OpenCL's Image2D. The output after convolution will be 32 bits per pixel, which is why numOutputBytes is set to that. Also, although the width and height appear weird, the input image's dimensions are 9216x7824, so I'm only taking a portion of it to test the code first, so it doesn't take forever.
I added in a write to global memory after reading from the image in the kernel to see if the issue was reading the image or writing the image. After the kernel executes, this section of global memory also contains mostly zeros.
Any help would be greatly appreciated!
The documentation for read_imageui states that
Furthermore, the read_imagei and read_imageui calls that take integer coordinates must use a sampler with normalized coordinates set to CLK_NORMALIZED_COORDS_FALSE and addressing mode set to CLK_ADDRESS_CLAMP_TO_EDGE, CLK_ADDRESS_CLAMP or CLK_ADDRESS_NONE; otherwise the values returned are undefined.
But you're creating a sampler with CLK_NORMALIZED_COORDS_TRUE while passing in integer (non-normalized) coordinates, so the values you read back are undefined. Build the sampler with CLK_NORMALIZED_COORDS_FALSE (keeping CLK_ADDRESS_NONE, or using CLK_ADDRESS_CLAMP or CLK_ADDRESS_CLAMP_TO_EDGE) and the reads should behave.
I'm trying to create a tool/asset converter that rasterises a font to a texture page for an XNA game using the FreeType2 engine.
Below, the first image is the direct output from the FreeType2 engine. The second image is the result after attempting to convert it to a System::Drawing::Bitmap.
Target: http://www.freeimagehosting.net/uploads/fb102ee6da.jpg
Current result: http://www.freeimagehosting.net/uploads/9ea77fa307.jpg
Any hints/tips/ideas on what is going on here would be greatly appreciated. Links to articles explaining byte layout and pixel formats would also be helpful.
FT_Bitmap *bitmap = &face->glyph->bitmap;
int width = (face->bitmap->metrics.width / 64);
int height = (face->bitmap->metrics.height / 64);
// must be aligned on a 32 bit boundary or 4 bytes
int depth = 8;
int stride = ((width * depth + 31) & ~31) >> 3;
int bytes = (int)(stride * height);
// as *.bmp
array<Byte>^ values = gcnew array<Byte>(bytes);
Marshal::Copy((IntPtr)glyph->buffer, values, 0, bytes);
Bitmap^ systemBitmap = gcnew Bitmap(width, height, PixelFormat::Format24bppRgb);
// create bitmap data, lock pixels to be written.
BitmapData^ bitmapData = systemBitmap->LockBits(Rectangle(0, 0, width, height), ImageLockMode::WriteOnly, bitmap->PixelFormat);
Marshal::Copy(values, 0, bitmapData->Scan0, bytes);
systemBitmap->UnlockBits(bitmapData);
systemBitmap->Save("Test.bmp");
Update. Changed PixelFormat to 8bppIndexed.
FT_Bitmap *bitmap = &face->glyph->bitmap;
// stride must be aligned on a 32 bit boundary or 4 bytes
int depth = 8;
int stride = ((width * depth + 31) & ~31) >> 3;
int bytes = (int)(stride * height);
target = gcnew Bitmap(width, height, PixelFormat::Format8bppIndexed);
// create bitmap data, lock pixels to be written.
BitmapData^ bitmapData = target->LockBits(Rectangle(0, 0, width, height), ImageLockMode::WriteOnly, target->PixelFormat);
array<Byte>^ values = gcnew array<Byte>(bytes);
Marshal::Copy((IntPtr)bitmap->buffer, values, 0, bytes);
Marshal::Copy(values, 0, bitmapData->Scan0, bytes);
target->UnlockBits(bitmapData);
Ah ha. Worked it out.
FT_Bitmap is an 8-bit image, so the correct PixelFormat was 8bppIndexed, which resulted in this output:
Not aligned to a 32-bit boundary: http://www.freeimagehosting.net/uploads/dd90fa2252.jpg
Each row of a System::Drawing::Bitmap needs to be aligned on a 32-bit boundary.
I was calculating the stride but was not padding it when writing the bitmap. I copied the FT_Bitmap buffer to a byte[] and then wrote that to a MemoryStream, adding the necessary padding to each row.
int stride = ((width * pixelDepth + 31) & ~31) >> 3;
int padding = stride - (((width * pixelDepth) + 7) / 8);
array<Byte>^ pad = gcnew array<Byte>(padding);
array<Byte>^ buffer = gcnew array<Byte>(size);
Marshal::Copy((IntPtr)source->buffer, buffer, 0, size);
MemoryStream^ ms = gcnew MemoryStream();
for (int i = 0; i < height; ++i)
{
    ms->Write(buffer, i * width, width);
    ms->Write(pad, 0, padding);
}
Pinned the memory so the GC would leave it alone.
// pin memory and create bitmap
GCHandle handle = GCHandle::Alloc(ms->ToArray(), GCHandleType::Pinned);
target = gcnew Bitmap(width, height, stride, PixelFormat::Format8bppIndexed, handle.AddrOfPinnedObject());
ms->Close();
As there is no greyscale variant of Format8bppIndexed, the image was still not correct.
(image: http://www.freeimagehosting.net/uploads/8a883b7dce.png)
Then I changed the bitmap palette to 256-level greyscale.
// 256-level greyscale palette
ColorPalette^ palette = target->Palette;
for (int i = 0; i < palette->Entries->Length; ++i)
    palette->Entries[i] = Color::FromArgb(i, i, i);
target->Palette = palette;
(image: http://www.freeimagehosting.net/uploads/59a745269e.jpg)
Final solution.
error = FT_Load_Char(face, ch, FT_LOAD_RENDER);
if (error)
    throw gcnew InvalidOperationException("Failed to load and render character");

FT_Bitmap *source = &face->glyph->bitmap;

int width = (face->glyph->metrics.width / 64);
int height = (face->glyph->metrics.height / 64);
int pixelDepth = 8;
int size = width * height;

// stride must be aligned on a 32 bit boundary or 4 bytes
// padding is the number of bytes to add to make each row a 32-bit aligned row
int stride = ((width * pixelDepth + 31) & ~31) >> 3;
int padding = stride - (((width * pixelDepth) + 7) / 8);

array<Byte>^ pad = gcnew array<Byte>(padding);
array<Byte>^ buffer = gcnew array<Byte>(size);
Marshal::Copy((IntPtr)source->buffer, buffer, 0, size);

MemoryStream^ ms = gcnew MemoryStream();
for (int i = 0; i < height; ++i)
{
    ms->Write(buffer, i * width, width);
    ms->Write(pad, 0, padding);
}

// pin memory and create bitmap
GCHandle handle = GCHandle::Alloc(ms->ToArray(), GCHandleType::Pinned);
target = gcnew Bitmap(width, height, stride, PixelFormat::Format8bppIndexed, handle.AddrOfPinnedObject());
ms->Close();

// 256-level greyscale palette
ColorPalette^ palette = target->Palette;
for (int i = 0; i < palette->Entries->Length; ++i)
    palette->Entries[i] = Color::FromArgb(i, i, i);
target->Palette = palette;

FT_Done_FreeType(library);
Your "depth" value doesn't match the PixelFormat of the Bitmap. It needs to be 24 to match Format24bppRgb. The PF for the bitmap needs to match the PF and stride of the FT_Bitmap as well, I don't see you take care of that.