What are the other sub-chunks of WAV files? - audio

I'm writing a program to read and process WAV files for a digital signal processing class project, and I have two test files. I can read the RIFF, fmt, and data chunks properly. Both files have fmt Chunk Size: 16, but File B has this stray chunk of hex between the fmt and data chunks.
I'm certain it's not random data. I guessed that it holds metadata about the file, so I converted the song title, Colors, to hex and found 43 6f 6c 6f 72 73 within that stray chunk. I don't think this is a coincidence. All the sites I've visited only mention a 2-byte field that gives the size of extra parameters at the end of the fmt chunk. That can't be the case for File B if both fmt chunks claim to have only 16 bytes.
I suspect that there are other chunks present in File B, but I haven't found anything about these optional(?) chunks. What other sub-chunks can appear in a WAV file? I simply don't know the tags of the other chunks that may be present.
File A ("i ran so far away.wav") contains this header. I downloaded this file from the Internet.
5249 4646 24c0 c900 5741 5645 666d 7420
1000 0000 0100 0100 2256 0000 44ac 0000
0200 1000 6461 7461 00c0 c900
File B ("Colors.wav") contains this header. This is a file I downloaded from a .mp3 to .wav converter.
5249 4646 7c32 4a02 5741 5645 666d 7420
1000 0000 0100 0200 44ac 0000 10b1 0200
0400 1000 4c49 5354 5000 0000 494e 464f
4941 5254 0500 0000 466c 6f77 0000 494e
414d 0700 0000 436f 6c6f 7273 0000 4950
5244 0f00 0000 436f 6465 2047 6561 7373
204f 5031 0000 4953 4654 0e00 0000 4c61
7666 3537 2e32 362e 3130 3000 6461 7461
0032 4a02
If it's helpful, below is output from the program I wrote.
File A
File Descriptor: RIFF
RIFF Chunk Size: 13221924
File Format: WAVE
fmt Chunk Descriptor: fmt
fmt Chunk Size: 16
Audio Format: 1
Number of Channels: 1
Sampling Rate: 22050
Byte Rate: 44100
Block Align: 2
Bits Per Sample: 16
Data Chunk Descriptor: data
Data Chunk Size: 13221888
File B
File Descriptor: RIFF
RIFF Chunk Size: 38417020
File Format: WAVE
fmt Chunk Descriptor: fmt
fmt Chunk Size: 16
Audio Format: 1
Number of Channels: 2
Sampling Rate: 44100
Byte Rate: 176400
Block Align: 4
Bits Per Sample: 16
Data Chunk Descriptor: data
Data Chunk Size: 38416896

The RIFF file specification allows any chunk ID a program wants, with the caveat that it might conflict with another program if the same chunk ID is used for a different purpose. When writing a program to deal with RIFF files, you are NOT required to understand every chunk type, because that would be impossible. You must, however, write your reader in such a way that it is able to skip over unrecognized chunk IDs.
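A minimal sketch of such a reader in Node.js, assuming the file (or at least its header) is already in a Buffer. The name listChunks is illustrative and not part of any library. It reads each four-character ID and little-endian size, then skips the payload regardless of what the ID is:

```javascript
// Walk the top-level chunks of a RIFF/WAVE buffer without interpreting them.
// listChunks is a hypothetical helper name used only for this sketch.
function listChunks(buf) {
  const riff = buf.toString("latin1", 0, 4);
  const form = buf.toString("latin1", 8, 12);
  if (riff !== "RIFF" || form !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }
  const chunks = [];
  let pos = 12; // first chunk starts right after "RIFF" <size> "WAVE"
  while (pos + 8 <= buf.length) {
    const id = buf.toString("latin1", pos, pos + 4); // FourCC, e.g. "fmt "
    const size = buf.readUInt32LE(pos + 4);          // payload size, little-endian
    chunks.push({ id, size, dataOffset: pos + 8 });
    // Payloads are padded to an even length: skip one extra byte if size is odd.
    pos += 8 + size + (size % 2);
  }
  return chunks;
}
```

Run against the File B header from the question, this reports three chunks: "fmt " (16 bytes), "LIST" (80 bytes), and "data" (38416896 bytes). A reader that only wants audio can simply ignore the LIST entry.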
The file you are looking at has a predefined, optional 'INFO' chunk in it. If you dump the ASCII from the hex you posted, you'll find:
INFO
IART Flow
INAM Colors
IPRD Code Geass OP1
ISFT Lavf57.26.100
This chunk ID is covered in the Wikipedia page for RIFF: https://en.wikipedia.org/wiki/Resource_Interchange_File_Format#Use_of_the_INFO_chunk
or here
http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
and it's also covered in the RIFF specification. Sorry I don't have a link.

Regarding your followup question about the LIST P:
It isn't a LISTP chunk; it is a LIST chunk with a data size of 80 bytes (0x50). When that size byte is interpreted as an ASCII character, it just happens to be a 'P'. Chunk names, also called FourCCs, are strictly four characters. Eighty bytes further in the hex, you will see the beginning of the "data" chunk, with its size of 38416896 bytes.
After the data size field containing 80, you see the list type ID, which is "INFO". Then comes a list of sub-chunks (that's why it's called a LIST chunk) with the actual data: the first sub-chunk is "IART", followed by its size of 5 bytes, followed by the 5 bytes themselves, which, when interpreted as ASCII, are "Flow" plus a null terminator. A byte of padding follows so that the chunk has an even number of bytes (an integral number of WORDs).
Then comes an "INAM" sub-chunk, with size of 7, data of "Colors\0" and again a byte of padding. And so on.
From @jaket's second link, http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info, you can see that these are standard sub-chunk tags:
IART: artist
INAM: title
IPRD: Product
ISFT: Software
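The sub-chunk layout described above can be decoded with a short loop. This is a sketch in Node.js, assuming the 80-byte payload of the LIST chunk has already been sliced out of the file; parseInfoList is a made-up name, and decoding the strings as Latin-1 is an assumption (the encoding of INFO text is not stated in the thread):

```javascript
// Decode the payload of a LIST chunk whose list type is "INFO".
// parseInfoList is a hypothetical helper; payload is a Buffer that starts
// with the four bytes "INFO" followed by the sub-chunks.
function parseInfoList(payload) {
  if (payload.toString("latin1", 0, 4) !== "INFO") return null;
  const tags = {};
  let pos = 4;
  while (pos + 8 <= payload.length) {
    const id = payload.toString("latin1", pos, pos + 4); // e.g. "IART"
    const size = payload.readUInt32LE(pos + 4);          // includes the NUL terminator
    const raw = payload.subarray(pos + 8, pos + 8 + size);
    tags[id] = raw.toString("latin1").replace(/\0+$/, ""); // drop trailing NULs
    pos += 8 + size + (size % 2);                         // skip the pad byte if size is odd
  }
  return tags;
}
```

For the bytes posted in the question this yields IART "Flow", INAM "Colors", IPRD "Code Geass OP1", and ISFT "Lavf57.26.100".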

Related

serialport communication using Node.js - packaging and sending 8-bit binary data

I'm doing a POC with serial port communication using Node.js.
I'm connected to a cellular modem via a serial port, and the goal is to transmit the data over UDP.
My script collects data from the modem (e.g. the 15-digit IMEI number) in the form of a String. Then I transmit the data like this:
var IMEI = "354345678654561";
serialPort.write(IMEI + '\r');
On the modem side, each digit is received as ASCII text, coded as a single byte (8-bit binary), and transmitted as 15 bytes over the air. On the server side I receive 33 35 34…
But I wish to send the binary value of the String, "1 0100 0010 0100 0110 1000 1000 0100 1101 1100 1100 0110 0001", which would be 7 bytes in total.
I tried Buffer.from() and played with Arrays, but with no real success.
Any help is appreciated.
I figured out that doing
BUF1[0] = 0x27;
BUF1[1] = 0x0F;
serialPort.write(BUF1, 0,);
serialPort.write(BUF1, 1,);
would actually work.
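For the 7-byte packing asked about in the question, one possible approach is to treat the IMEI as a single integer and emit its big-endian bytes. This is only a sketch: imeiToBytes is a made-up helper name, and whether the modem forwards raw bytes unchanged is an assumption that would need checking against the modem's documentation.

```javascript
// Pack a 15-digit IMEI string into 7 big-endian bytes.
// imeiToBytes is a hypothetical helper, not part of serialport or Node.js.
function imeiToBytes(imei) {
  if (!/^\d{15}$/.test(imei)) {
    throw new RangeError("expected a 15-digit decimal string");
  }
  let value = BigInt(imei);        // any 15-digit number fits in 50 bits
  const out = Buffer.alloc(7);     // 7 bytes = 56 bits
  for (let i = 6; i >= 0; i--) {
    out[i] = Number(value & 0xffn); // lowest byte goes last (big-endian)
    value >>= 8n;
  }
  return out;
}

console.log(imeiToBytes("354345678654561").toString("hex")); // 014246884dcc61
```

The hex output matches the bit string in the question (1 0100 0010 0100 0110 ...), padded with leading zero bits to a whole number of bytes. The receiver has to reverse the same packing to recover the digits.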

Trouble displaying signed unsigned bytes with python

I have a weird problem! I wrote client/server Python code that uses a Bluetooth serial link to send and receive byte frames (for example: [0x73, 0x87, 0x02 ....]).
Everything works; sending and receiving work very well!
The problem is the display of my frames. I noticed that bytes from 0 to 127 are displayed as-is, but from 128 upward the byte is displayed with an extra C2 (194) in front of it. For example, for [0x73, 0x7F, 0x87, 0x02, 0x80 ....] == [115, 127, 135, 2, 128 ....], the hex display shows 73 7F C2 87 2 C2 80 ..; note the extra C2 byte that appears from nowhere!
Since it starts at 128, I think it is due to a signed (-128 to 127) versus unsigned (0 to 255) problem.
Does anyone have any idea what causes this?
Thank you
0xc2 and 0xc3 are byte values that appear when encoding character values between U+0080 and U+00FF as UTF-8. Something on the transmission side is trying to send text instead of bytes, and something in the middle is (properly) converting the text to UTF-8 bytes before sending. The fix is to send bytes instead of text in the first place.
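The effect is not specific to Python. A small Node.js illustration, using the frame from the question, shows the difference between sending the values as bytes and sending them as text that gets UTF-8 encoded along the way:

```javascript
// The frame from the question, as numeric byte values.
const frame = [0x73, 0x7f, 0x87, 0x02, 0x80];

// Sent as raw bytes: every value stays a single byte.
const asBytes = Buffer.from(frame);

// Treated as text (one character per value) and then encoded as UTF-8:
// code points 0x80-0xFF each become two bytes, prefixed with 0xC2 or 0xC3.
const asText = String.fromCharCode(...frame);
const asUtf8 = Buffer.from(asText, "utf8");

console.log(asBytes.toString("hex")); // 737f870280
console.log(asUtf8.toString("hex"));  // 737fc28702c280
```

The second line reproduces the 73 7F C2 87 2 C2 80 display from the question, which suggests that a text-to-UTF-8 step sits somewhere in the sending path.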

Converting hex values in buffer to integer

Background: I'm using node.js to get the volume setting from a device via serial connection. I need to obtain this data as an integer value.
I have the data in a buffer ('buf'), and am using readInt16BE() to convert to an int, as follows:
console.log( buf )
console.log( buf.readInt16BE(0) )
Which gives me the following output as I adjust the external device:
<Buffer 00 7e>
126
<Buffer 00 7f>
127
<Buffer 01 00>
256
<Buffer 01 01>
257
<Buffer 01 02>
258
Problem: All looks well until we reach 127, then we take a jump to 256. Maybe it's something to do with signed and unsigned integers - I don't know!
Unfortunately I have very limited documentation about the external device, I'm having to reverse engineer it! Is it possible it only sends a 7-bit value? Hopefully there is a way around this?
Regarding a solution - I must also be able to convert back from int to this format!
Question: How can I create a sequential range of integers when 7F seems to be the largest value my device sends, which causes a big jump in my integer scale?
Thanks :)
127 is the maximum value of a signed 8-bit integer. If the integer is overflowing into the next byte at 128, it would be safe to assume you are not being sent a 16-bit value, but rather 2 signed 8-bit values (in effect, each byte carries 7 bits of the number), and reading the value as a 16-bit integer would be incorrect.
I would start by using the first byte as a multiplier of 128 and adding the second byte; this will give the series you are seeking.
buf = Buffer.from([0,127]) //<Buffer 00 7f>
buf.readInt8(0) * 128 + buf.readInt8(1)
>127
buf = Buffer.from([1,0]) //<Buffer 01 00>
buf.readInt8(0) * 128 + buf.readInt8(1)
>128
buf = Buffer.from([1,1]) //<Buffer 01 01>
buf.readInt8(0) * 128 + buf.readInt8(1)
>129
The way to get back is to divide by 128, round it down to the nearest integer for the first byte, and the second byte contains the remainder.
i = 129
buf = Buffer.from([Math.floor(i / 128), i % 128])
<Buffer 01 01>
I needed to treat the data as two signed 8-bit values. As per @forrestj, the solution is to do:
valueInt = buf.readInt8(0) * 128 + buf.readInt8(1)
We can also convert the int value into the original format by doing the following:
byte1 = Math.floor(valueInt / 128)
byte2 = valueInt % 128
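The two conversions can be wrapped into a pair of helpers. This is a sketch that assumes, as the answers above do, that the device uses only the low 7 bits of each byte; decodeVolume and encodeVolume are illustrative names, and the 0 to 16383 range follows from that 2 x 7-bit assumption rather than from any device documentation:

```javascript
// Decode two 7-bit bytes (high part first) into one integer.
function decodeVolume(buf) {
  return (buf[0] & 0x7f) * 128 + (buf[1] & 0x7f);
}

// Encode an integer back into the same two-byte form.
function encodeVolume(value) {
  if (!Number.isInteger(value) || value < 0 || value > 16383) {
    throw new RangeError("value must be an integer from 0 to 16383");
  }
  return Buffer.from([value >> 7, value & 0x7f]);
}

console.log(decodeVolume(Buffer.from([0x00, 0x7f]))); // 127
console.log(decodeVolume(Buffer.from([0x01, 0x00]))); // 128
console.log(encodeVolume(129));                       // <Buffer 01 01>
```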

Extract thumbnail from jpeg file

I'd like to extract the thumbnail image from JPEGs without any external library. This should not be too difficult: I just need to know where the thumbnail starts and ends in the file, and simply cut it out. I have studied a lot of documentation (e.g. http://www.media.mit.edu/pia/Research/deepview/exif.html) and tried to analyze JPEGs, but not everything is clear. I tried to trace the bytes step by step, but got confused deep in the file. Is there any good documentation, or readable source code, for extracting the thumbnail's start and end positions within a JPEG file?
Thank you!
Exiftool is very capable of doing this quickly and easily:
exiftool -b -ThumbnailImage my_image.jpg > my_thumbnail.jpg
For most JPEG images created by phones or digital cameras, the thumbnail image (if present) is stored in the APP1 marker (FFE1). Inside this marker segment is a TIFF file containing the EXIF information for the main image and the optional thumbnail image stored as a JPEG compressed image. The TIFF file usually contains two "pages" where the first page is the EXIF info and the second page is the thumbnail stored in the "old" TIFF type 6 format. Type 6 format is when a JPEG file is just stored as-is inside of a TIFF wrapper. If you want the simplest possible code to extract the thumbnail as a JFIF, you will need to do the following steps:
Familiarize yourself with JFIF and TIFF markers/tags. JFIF markers consist of two bytes: 0xFF followed by the marker type (0xE1 for APP1). These two bytes are followed by the two-byte length stored in big-endian order. For TIFF files, consult the Adobe TIFF 6.0 reference.
Search your JPEG file for the APP1 (FFE1) EXIF marker. There may be multiple APP1 markers and there may be multiple markers before the APP1.
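The APP1 marker you're looking for contains the ASCII identifier "Exif" followed by two zero bytes immediately after the length field. Note that the bytes in the file are mixed case ("Exif"), so a case-sensitive search for "EXIF" will not match.
Look for "II" or "MM" (6 bytes after the length field) to indicate the endianness used in the TIFF file. II = Intel = little endian, MM = Motorola = big endian.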
Skip through the first page's tags to find the second IFD where the image is stored. In the second "page", look for the two TIFF tags which point to the JPEG data. Tag 0x201 has the offset of the JPEG data (relative to the II/MM) and tag 0x202 has the length in bytes.
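The steps above can be sketched in Node.js as follows. This is not a robust parser: it assumes the whole JPEG is in a Buffer, that every segment before the image data carries a length field, and that tags 0x201 and 0x202 are stored as 32-bit LONG values. Truncated or malformed input may throw a RangeError from the Buffer read calls. The function names are made up for this example.

```javascript
// Return the embedded EXIF thumbnail as a Buffer, or null if none is found.
function findExifThumbnail(jpeg) {
  if (jpeg.length < 4 || jpeg[0] !== 0xff || jpeg[1] !== 0xd8) return null; // no SOI
  let pos = 2;
  while (pos + 4 <= jpeg.length) {
    if (jpeg[pos] !== 0xff) return null;                 // lost sync with the markers
    const marker = jpeg[pos + 1];
    if (marker === 0xda || marker === 0xd9) return null; // image data or end of file
    const segLen = jpeg.readUInt16BE(pos + 2);           // big-endian, includes itself
    const isExif = marker === 0xe1 &&
      jpeg.toString("latin1", pos + 4, pos + 10) === "Exif\0\0";
    if (isExif) return readThumbnailFromTiff(jpeg, pos + 10);
    pos += 2 + segLen;                                   // jump to the next marker
  }
  return null;
}

// tiff is the offset of the "II"/"MM" byte-order mark inside jpeg.
function readThumbnailFromTiff(jpeg, tiff) {
  const order = jpeg.toString("latin1", tiff, tiff + 2);
  if (order !== "II" && order !== "MM") return null;
  const le = order === "II";
  const u16 = (o) => (le ? jpeg.readUInt16LE(o) : jpeg.readUInt16BE(o));
  const u32 = (o) => (le ? jpeg.readUInt32LE(o) : jpeg.readUInt32BE(o));

  const ifd0 = tiff + u32(tiff + 4);              // first IFD ("page")
  const count0 = u16(ifd0);                       // number of 12-byte entries
  const nextOffset = u32(ifd0 + 2 + count0 * 12); // link to the second IFD
  if (nextOffset === 0) return null;              // no second IFD, so no thumbnail

  const ifd1 = tiff + nextOffset;
  const count1 = u16(ifd1);
  let offset = null;
  let length = null;
  for (let i = 0; i < count1; i++) {
    const entry = ifd1 + 2 + i * 12;
    const tag = u16(entry);
    if (tag === 0x0201) offset = u32(entry + 8);  // thumbnail offset, relative to II/MM
    if (tag === 0x0202) length = u32(entry + 8);  // thumbnail length in bytes
  }
  if (offset === null || length === null) return null;
  return jpeg.subarray(tiff + offset, tiff + offset + length);
}
```

A caller could read the file with fs.readFileSync, pass the resulting Buffer to findExifThumbnail, and write any non-null result to a new .jpg file.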
There is a much simpler solution for this problem, but I don't know how reliable it is: Start reading the JPEG file from the third byte and search for FFD8 (start of JPEG image marker), then for FFD9 (end of JPEG image marker). Extract it and voila, that's your thumbnail.
A simple JavaScript implementation:
function getThumbnail(file, callback) {
    if (file.type == "image/jpeg") {
        var reader = new FileReader();
        reader.onload = function (e) {
            var array = new Uint8Array(e.target.result),
                start, end;
            // Start at byte 2 to skip the main image's own FFD8 marker.
            for (var i = 2; i < array.length; i++) {
                if (array[i] == 0xFF) {
                    if (!start) {
                        if (array[i + 1] == 0xD8) {
                            start = i; // start-of-image marker of the thumbnail
                        }
                    } else {
                        if (array[i + 1] == 0xD9) {
                            end = i + 2; // include the FFD9 end-of-image marker
                            break;
                        }
                    }
                }
            }
            if (start && end) {
                callback(new Blob([array.subarray(start, end)], {type:"image/jpeg"}));
            } else {
                // TODO scale with canvas
            }
        };
        // Only the first 50000 bytes of the file are scanned.
        reader.readAsArrayBuffer(file.slice(0, 50000));
    } else if (file.type.indexOf("image/") === 0) {
        // TODO scale with canvas
    }
}
The Wikipedia page on JFIF at http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format gives a good description of the JPEG header (the header contains the thumbnail as an uncompressed raster image). That should give you an idea of the layout and thus the code needed to extract the info.
Hexdump of an image header (hexdump prints 16-bit words in little-endian order, so the two bytes of each pair appear swapped):
sdk@AndroidDev:~$ head -c 48 stfu.jpg |hexdump
0000000 d8ff e0ff 1000 464a 4649 0100 0101 4800
0000010 4800 0000 e1ff 1600 7845 6669 0000 4d4d
0000020 2a00 0000 0800 0000 0000 0000 feff 1700
Using zero-based byte offsets in file order: image magic FF D8 (bytes 0-1), APP0 segment marker FF E0 (bytes 2-3), header length (bytes 4-5, big-endian), header type signature ("JFIF\0" or "JFXX\0", bytes 6-10), version (bytes 11-12), density units (byte 13), X density (bytes 14-15), Y density (bytes 16-17), thumbnail width (byte 18), thumbnail height (byte 19), and finally the rest up to "Header Length" is thumbnail data.
From the above example, you can see that the header length is 16 bytes (bytes 4-5) and the version is 01.01 (bytes 11-12). Further, as Thumbnail Width and Thumbnail Height are both 0x00, the image doesn't contain a thumbnail.
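The same fields can be read programmatically. The following Node.js sketch assumes the file starts with the start-of-image marker immediately followed by a JFIF APP0 segment, as in the dump above; parseJfifHeader is an illustrative name, and files that begin with a different segment simply yield null.

```javascript
// Read the fixed-size JFIF APP0 fields from the first 20 bytes of a JPEG.
function parseJfifHeader(buf) {
  if (buf.length < 20) return null;
  if (buf.readUInt16BE(0) !== 0xffd8) return null;             // SOI
  if (buf.readUInt16BE(2) !== 0xffe0) return null;             // APP0 marker
  if (buf.toString("latin1", 6, 11) !== "JFIF\0") return null; // signature
  return {
    length: buf.readUInt16BE(4),   // segment length, counted from byte 4
    versionMajor: buf[11],
    versionMinor: buf[12],
    densityUnits: buf[13],
    xDensity: buf.readUInt16BE(14),
    yDensity: buf.readUInt16BE(16),
    thumbnailWidth: buf[18],
    thumbnailHeight: buf[19],
  };
}

// First 20 bytes of the dump above, in file order.
const header = Buffer.from("ffd8ffe000104a46494600010101004800480000", "hex");
console.log(parseJfifHeader(header));
```

For these bytes the result has length 16, version 1.01, a density of 72 x 72, and a 0 x 0 thumbnail, which agrees with the reading of the dump given above.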

Confused about "three successive writes: bytes 10, bytes 32, bytes 54"?

I am studying the SMSC smc91cx driver code, and I am learning how to write test code for the smc91c111 NIC by following the instructions in Application Note 9-6. I cannot understand the following instructions under "Transmitting A Packet":
Write the destination address (three successive writes: bytes 10, bytes 32, bytes 54)
Write 0xFFFF, 0xFFFF, 0xFFFF
Write the source address (three successive writes: bytes 10, bytes 32, bytes 54)
Write 0x0000, 0x0000, 0x0000
I cannot make sense of these instructions. Should I write 10 bytes of 0xFF, then 32 bytes, then 54 bytes to the buffer, or just write 0xFF at the 10th, 32nd, and 54th byte positions?
But if so, why would you write 0x0000 to the same position?
Rather than allocating several different registers to write to, that chip has you write to the same DATA register serially until you set all the info. The DATA register is 2 bytes wide, but a MAC address is 6 bytes, numbered 0-5. So you have to write it 2 bytes at a time: bytes number 1 and 0 first, followed by bytes number 3 and 2, then bytes number 5 and 4. Then write 0xFFFF 3 times to the DATA register, then repeat for the source address and the 0x0000s.
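As a small illustration of that byte pairing, the sketch below splits a 6-byte MAC address into the three 16-bit values that would be written in turn. It assumes "bytes 10" means byte 1 in the upper half of the word and byte 0 in the lower half; macToWords is a made-up helper, and the actual register write is left out because it depends on how the chip is mapped on the target platform.

```javascript
// Split a 6-byte MAC address into three 16-bit words: bytes 1|0, 3|2, 5|4.
function macToWords(mac) {
  if (mac.length !== 6) {
    throw new RangeError("a MAC address has 6 bytes");
  }
  return [
    (mac[1] << 8) | mac[0],
    (mac[3] << 8) | mac[2],
    (mac[5] << 8) | mac[4],
  ];
}

// Broadcast destination ff:ff:ff:ff:ff:ff gives 0xFFFF three times,
// which matches "Write 0xFFFF, 0xFFFF, 0xFFFF" in the application note.
console.log(macToWords([0xff, 0xff, 0xff, 0xff, 0xff, 0xff]));
```

An all-zero source address likewise gives 0x0000 three times, which is why the note's second example writes 0x0000, 0x0000, 0x0000.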
