Smooth Streaming and AAC-Low Complexity audio codec. Data format? - audio

I am writing a Smooth Streaming client application. On the server side (IIS 7 with Media Services extensions), I have a bunch of ISMV and ISMA files encoded using Expression Encoder pro 4 with the "H.264 IIS Smooth Streaming iPhone WiFi" preset. In a nutshell, it uses the "H.264 baseline" video codec, and the AAC-LC audio codec.
On the client side however is where I'm having problems, specifically with the audio chunks. While I have been able to make sense of the H.264 video stream (it is essentially a sequence of raw NAL units prefixed by their length, without the NAL unit "start code" 0, 0, 0, 1), I still haven't been able to crack what is inside the AAC LC audio stream, i.e. what comes in the "mdat" (Media Data Box) atom. It is most definitely not an MP4 container, but what is it then?
I am pasting below the first 128 (number chosen arbitrarily) bytes of one AAC-LC fragment (MDAT portion only) obtained from the server, in case anyone can figure it out from there.
unsigned char data[128] = {
0x21, 0x09, 0x0A, 0xBF, 0xBF, 0xFF, 0xFF, 0xD5, 0xB1, 0x8D, 0xC4, 0xA1,
0x18, 0x0D, 0x25, 0xC9, 0x2E, 0x49, 0x2E, 0x10, 0x88, 0x91, 0x10, 0x01,
0x13, 0x23, 0x2C, 0x36, 0x25, 0x60, 0x6B, 0x94, 0x8C, 0x74, 0xD7, 0x4A,
0x95, 0xD3, 0x03, 0x91, 0x5B, 0x76, 0xDE, 0x27, 0xC5, 0xB2, 0x4C, 0xCF,
0xEB, 0x3E, 0xDD, 0xFF, 0x22, 0xAF, 0xC3, 0xF8, 0x60, 0x36, 0x49, 0xBC,
0xAE, 0x4D, 0x10, 0x31, 0xC6, 0x28, 0x2A, 0xEB, 0xCA, 0x94, 0x51, 0xD8,
0x61, 0x1B, 0xC6, 0x2A, 0x91, 0x71, 0xE4, 0x8C, 0xF8, 0x19, 0x2C, 0xDE,
0x71, 0xBB, 0xE3, 0xBD, 0x36, 0xB4, 0x45, 0x37, 0x02, 0x61, 0x48, 0x8E,
0x19, 0x80, 0xD5, 0x24, 0x97, 0x24, 0x92, 0x44, 0x08, 0x89, 0x12, 0x00,
0xB3, 0xF8, 0x1E, 0xE2, 0xBD, 0xCD, 0x4E, 0xF7, 0xA9, 0xE2, 0x0E, 0xD8,
0xEA, 0xFA, 0xCF, 0xDB, 0x4E, 0x69, 0x6F, 0xEE
};

After a long research and this tip I received on the IIS forums, I was able to figure it out. Basically this is a raw AAC stream, which needs to be wrapped with headers before it can be played back. The simplest and most common header format appears to be ADTS, which consists in adding a 7-byte header in front of each sample.

Related

How do I reduce the color palette of an image in Python 3?

I've seen answers for Python 2, but support for it has ended and the Image module for Python2 is incompatible with aarch64.
I'm just looking to convert some images to full CGA color, without the hassle of manually editing the image.
Thanks in advance!
Edit: I think I should mention that I'm still pretty new to Python. I think I should also mention I'm trying to use colors from a typical CGA adapter.
Here's one way to do it with PIL - you may want to check I have transcribed the list of colours accurately from here::
#!/usr/bin/env python3
from PIL import Image
# Create new 1x1 palette image
CGA = Image.new('P', (1,1))
# Make a CGA palette and push it into image
CGAcolours = [
0x00, 0x00, 0x00,
0x00, 0x00, 0xaa,
0x00, 0xaa, 0x00,
0x00, 0xaa, 0xaa,
0xaa, 0x00, 0x00,
0xaa, 0x00, 0xaa,
0xaa, 0x55, 0x00,
0xaa, 0xaa, 0xaa,
0x55, 0x55, 0x55,
0x55, 0x55, 0xff,
0x55, 0xff, 0x55,
0x55, 0xff, 0xff,
0xff, 0x55, 0x55,
0xff, 0x55, 0xff,
0xff, 0xff, 0x55,
0xff, 0xff, 0xff]
CGA.putpalette(CGAcolours)
# Open our image of interest
im = Image.open('swirl.jpg')
# Quantize to our lovely CGA palette, without dithering
res = im.quantize(colors=len(CGAcolours), palette=CGA, dither=Image.Dither.NONE)
res.save('result.png')
That transforms this:
into this:
Or, if you don't really want to write any Python, you can do it on the command-line with ImageMagick in a highly analogous way:
# Create 16-colour CGA swatch in file "cga16.gif"
magick xc:'#000000' xc:'#0000AA' xc:'#00AA00' xc:'#00AAAA' \
xc:'#AA0000' xc:'#AA00AA' xc:'#AA5500' xc:'#AAAAAA' \
xc:'#555555' xc:'#5555FF' xc:'#55FF55' xc:'#55FFFF' \
xc:'#FF5555' xc:'#FF55FF' xc:'#FFFF55' xc:'#FFFFFF' \
+append cga16.gif
That creates this 16x1 swatch (enlarged so you can see it):
# Remap colours in "swirl.jpg" to the CGA palette
magick swirl.jpg +dither -remap cga16.gif resultIM.png
Doubtless the code could be converted for EGA, VGA and the other early PC hardware palettes... or even to the Rubik's Cube palette:
Resulting in:
Or the Bootstrap palette:
Or the Google palette:

BPF filter fails

Can anyone suggest why this (classic) BPF program sometimes lets non-DHCP-response packets through:
# Load the Ethertype field
BPF_LD | BPF_H | BPF_ABS 12
# And reject the packet if it's not 0x0800 (IPv4)
BPF_JMP | BPF_JEQ | BPF_K 0x0800 0 8
# Load the IP protocol field
BPF_LD | BPF_B | BPF_ABS 23
# And reject the packet if it's not 17 (UDP)
BPF_JMP | BPF_JEQ | BPF_K 17 0 6
# Check that the packet has not been fragmented
BPF_LD | BPF_H | BPF_ABS 20
BPF_JMP | BPF_JSET | BPF_K 0x1fff 4 0
# Load the IP header length field
BPF_LDX | BPF_B | BPF_MSH 14
# And load that offset + 16 to get the UDP destination port
BPF_LD | BPF_IND | BPF_H 16
# And reject the packet if the destination port is not 68
BPF_JMP | BPF_JEQ | BPF_K 68 0 1
# Accept the frame
BPF_RET | BPF_K 1500
# Reject the frame
BPF_RET | BPF_K 0
It doesn't let every frame through, but under heavy network load it fails quite often. I'm testing it with this Python 3 program:
import ctypes
import struct
import socket
ETH_P_ALL = 0x0003
SO_ATTACH_FILTER = 26
SO_ATTACH_BPF = 50
filters = [
0x28, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x15, 0x00, 0x00, 0x08, 0x00, 0x08, 0x00, 0x00,
0x30, 0x00, 0x00, 0x00, 0x17, 0x00, 0x00, 0x00, 0x15, 0x00, 0x00, 0x06, 0x11, 0x00, 0x00, 0x00,
0x28, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x45, 0x00, 0x04, 0x00, 0xff, 0x1f, 0x00, 0x00,
0xb1, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
0x15, 0x00, 0x00, 0x01, 0x44, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0xdc, 0x05, 0x00, 0x00,
0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
]
filters = bytes(filters)
b = ctypes.create_string_buffer(filters)
mem_addr_of_filters = ctypes.addressof(b)
pf = struct.pack("HL", 11, mem_addr_of_filters)
pf = bytes(pf)
def main():
sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
sock.bind(("eth0", ETH_P_ALL))
sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, pf)
# sock.send(req)
sock.settimeout(1)
try:
data = sock.recv(1500)
if data[35] == 0x43:
return
print('Packet got through: 0x{:02x} 0x{:02x}, 0x{:02x}, 0x{:02x}'.format(data[12], data[13], data
except:
print('Timeout')
return
sock.close()
for ii in range(1000):
main()
If I do this while SCPing a big core file to the host running that script, it doesn't reach the one-second timeout in the large majority of, but not all, cases. Under lighter load failures are much rarer - eg twiddling around on an ssh link while the socket is receiving; sometimes it gets through 1000 iterations without failure.
The host in question is Linux 4.9.0. The kernel has CONFIG_BPF=y.
Edit
For a simpler version of the same question, why does this BPF program let through any packets at all:
BPF_RET | BPF_K 0
Edit 2
The tests above were on an ARM64 machine. I've retested on amd64 / Linux 5.9.0. I still see failures, though not nearly as many.
I got a response on LKML explaining this.
The problem is that the filter is applied as the frame arrives on the interface, not as it's passed to userspace with recv(). So under heavy load, frames arrive between the creation of the socket with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL)) and the filter being applied with sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, pf). These frames sit in the queue; once the filter is applied, subsequent arriving packets have the filter applied to them.
So once the filter is applied, it's necessary to "drain" any queued frames from the socket before you can rely on the filter.

How to extract SPS and PPS from RTMP stream (avc1 encoded)?

I'm working on an extension to Node Media Server to save incoming streams to disk as MP4. For this conversion to MP4 I'm leaning heavily on the Apple QuickTime Movie Specification, The ISO/IEC 14496-14 Specification (which I discovered in the Rust-MP4 GitHub repository for free), and The HLS.js Source Code
I'm testing with a single video at the moment. Once this works I'll start experimenting with other videos. For my use case I only need to support H.264 video and AAC audio.
Currently when an RTMP connection is established, the first 3 packets I receive are consistently:
1 AMF metadata packet (RTMP cid = 6) containing information like video width, video height, bitrate, audio sample rate, etc
1 audio packet (RMTP cid = 4) containing 7 bytes of data. I assume this is the AAC config packet
1 video packet (RTMP cid = 5) containing 46 bytes of data. I assume this is the AVC config packet
When writing the MP4 moov atom, there are two places where I need to utilize additional information not located in the AMF metadata (and presumably located in these two config packets):
In the esds atom, The HLS.js source appends "config" data. I assume I just append the entire 7-byte payload from the audio config packet here
In the avcC atom, The HLS.js source append the "sps" and "pps" data. This is the root of my issue
Regarding the parsing of these 46 bytes, I found code in Node Media Server and HLS.js that seems to parse the same data. The difference between these two pieces of code is that Node Media Server expects an additional 13 bytes of data at the start of the packet. The packet I receive seems to contain these additional 13 bytes, so I simply follow their lead in extracting width, height, profile, compat, and level information. The 46 bytes in particular are:
[0x17, 0x00, 0x00, 0x00, 0x00, 0x01, 0x42, 0xc0, 0x1f, 0xff, 0xe1, 0x00, 0x19, 0x67, 0x42, 0xc0, 0x1f, 0xa6, 0x11, 0x02, 0x80, 0xbf, 0xe5, 0x84, 0x00, 0x00, 0x03, 0x00, 0x04, 0x00, 0x00, 0x03, 0x00, 0xc2, 0x3c, 0x60, 0xc8, 0x46, 0x01, 0x00, 0x05, 0x68, 0xc8, 0x42, 0x32, 0xc8]
Breaking this down for the bytes I can easily parse (prior to the use of Exponential Golomb encoding):
[
0x17, // "frame type", specifies H.264 or HVEC
0x00, 0x00, 0x00, 0x00, 0x01, // ignored. Reserved?
0x42, // profile
0xc0, // compat
0x1f, // level
0xff, // "info.nalu" (per Node Media Server source)
0xe1, // "info.nb_sps" (per Node Media Server source)
0x00, 0x19, // "nal size"
// Above here are the bits exclusively seen by Node Media Server (specific to RTMP?)
// Below here are the bits passed to HLS.js as "unit.data" (common to all AVC1 streams?):
0x67, // "nal type"
0x42, // profile (again?)
0xc0, // compat (again?)
0x1f, // level (again?)
// Below here, data is not necessarily byte-aligned as Exponential Golomb encoding is used
// ...
]
Now the problem I'm running into is during the creation of the moov atom (and the avcC atom, specifically) I need to know both the sps and the pps bytes. From the HLS.js source it looks like the sps may just be this video config packet minus the first 13 bytes. However how do I find the pps? Is pps actually the last few bytes of this packet, and I should split it somewhere? Will this be delivered in another packet? If two video packets are to be expected, is there some way I should differentiate them so I know which one is sps and which one is pps?
If I can figure out this last little bit, then I should be completely done writing the moov packet (after which point I just need to figure out the proper format for the mdat packet and I should have working code)
Update: For the record, I just checked the fourth packet being delivered to see if it might contain pps data. After reconnecting to the stream ~20 times the fourth packet was consistently a video packet (RTMP cid = 5), but the size of the packet ranged from 16000 bytes to 21000 bytes. I suspect this is legitimate video data.
Second Update: I just checked what the offset was in the video config packet when I finished parsing the SPS, and I was on byte 23 (0x84). It's therefore likely that the PPS is in fact at the end of this byte array, but I'm not sure how many bytes are delimiters / headers (NAL type, NAL length, etc) and how many bytes are the actual PPS.

USB Audio Descriptor

I'm trying to implement a PIC32 MCU as a Audio device, using USB audio class 1.
I have implementet this project: PIC32 USB Digital Audio Accesory Board Demos.zip
and it works fine, but now i want to cut away some of the Audio Control Interface, so i'm having a more simple Audio function:
The device seems to get enumerated properly acording to the status LED's on the board, and it appears in Device Manager's list of audio devices, but it has a small yellow exclamation mark.. And when i plug in the device, windows tells me: "Device driver software was not succesfully installed"
Anybody has a clue?
USB descriptors, block of code:
ROM BYTE configDescriptor1[] ={
//CD
0x09, // Size : 9 Bytes
0x02, // Configuration Descriptor (0x02)
//0xE4, // Total length in bytes of data returned Includes the combined length of all descriptors (configuration, interface, endpoint, and class- or vendor-specific) returned for this configuration.
0xB4, // Ved simpel Audio Function
0x00, // 2. Byte af Total Length // 228
0x03, // Number of Interfaces: 3
0x01, // bConfigurationValue, Value to use as an argument to select this configuration
0x00, // iConfiguration, Index of String Descriptor describing this configuration
_DEFAULT | _SELF, // bmAttributes, 0b01100000 -> D6: Self-powered, D7: Reserved (set to one)
0xFA, // Maximum Power : 250 * 2mA = 0,5A
//ID - INTERFACE 0 Control
0x09, // Size : 9 Bytes
0x04, // Interface Descriptor (0x04)
0x00, // Number of Interface: Interface nr 0
0x00, // Value used to select alternative setting
0x00, // Number of Endpoints used for this interface, 0
0x01, // Class Code (Assigned by USB Org), AUDIO
0x01, // Subclass Code (Assigned by USB Org), AUDIOCONTROL
0x00, // Protocol Code (Assigned by USB Org)
0x00, // Index of String Descriptor Describing this interface
//ACID - HEADER
0x0A, // Size : 10 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x01, // HEADER descriptor subtype
0x00,0x01, // Audio Device compliant to the USB Audio specification version 1.00
//0x64,0x00, // 100 bytes - Total number of bytes returned for the class-specific AudioControl interface descriptor. // Includes the combined length of this descriptor header and all Unit and Terminal descriptors.
0x00,0x34, // wTotalLength
0x02, // bInCollection -> Number of streaming interfaces = 2
0x01, // baInterfaceNr(1) -> 0x01 = 1 -> Interface number of the first AudioStreaming interface in the Collection
0x02, // baInterfaceNr(2) -> 0x02 = 2 -> Interface number of the second AudioStreaming interface in the Collection
//ACID - INPUT_TERMINAL ID = 1
0x0C, // size : 12 bytes
0x24, // CS_INTERFACE Descriptor Type
0x02, // INPUT_TERMINAL - Descriptor subtype = 2
0x01, // ID of this Input Terminal. // Constant uniquely identifying the Terminal within the audio function.
0x01,0x01, // wTerminalType -> 0x0101 = USB streamming
0x00, // bAssocTerminal -> 0x00 = No association.
//0x03, // bAssocTerminal -> 0x03 = Associated with OUTPUT TERMINAL 3
0x02, // bNrChannels -> 0x02 two channel.
0x03,0x00, // wChannelConfig -> 0x0003 = stereo, Right / Left // se Audio Devices Dok side 34
//0x00,0x00, // wChannelConfig -> 0x0000 = mono ?
0x00, // iChannelNames -> 0x00 = Unused.
0x00, // iTerminal -> 0x00 = Unused.
//ACID - INPUT_TERMINAL ID = 4
0x0C, // size : 12 bytes
0x24, // CS_INTERFACE Descriptor Type
0x02, // INPUT_TERMINAL - Descriptor subtype
0x04, // bTerminalID -> ID of this Input Terminal = 4
0x01,0x02, // wTerminalType -> 0x0201 = Microphone
0x00, // bAssocTerminal -> 0x00 = No association.
//0x06, // bAssocTerminal -> 0x06 = Associated with OUTPUT TERMINAL 6
0x01, // bNrChannels -> 0x01 one channel.
0x01,0x00, // wChannelConfig -> Left Front <- original fra ex.
//0x00,0x00, // wChannelConfig -> 0x0000 = mono
0x00, // iChannelNames -> 0x00 = Unused.
0x00, // iTerminal -> 0x00 = Unused.
//ACID - OUTPUT_TERMINAL ID = 3
0x09, // size : 9 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x03, // OUTPUT_TERMINAL - Descriptor subtype
0x03, // bTerminalID -> ID of this Output Terminal = 3
0x01,0x03, // wTerminalType -> 0x0301 = Speaker
0x00, // bAssocTerminal -> 0x00 = Unused
//0x01, // bAssocTerminal -> 0x01 = Associated with INPUT TERMINAL 1
//0x02, // bSourceID -> 0x02 = From FEATURE_UNIT ID 2 = USB stream
0x01, // bSourceID -> 0x01 = From Input Terminal ID 1
0x00, // iTerminal -> 0x00 = Unused.
//ACID - OUTPUT_TERMINAL ID = 6
0x09, // Size : 9 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x03, // OUTPUT_TERMINAL - Descriptor subtype
0x06, // bTerminalID -> ID of this Output Terminal = 6
0x01,0x01, // wTerminalType -> 0x0101 = USB streaming
0x00, // bAssocTerminal -> 0x00 = Unused
//0x04, // bAssocTerminal -> 0x04 = Associated with INPUT TERMINAL 4
//0x09, // bSourceID -> 0x09 = From Selector Unit ID 9
0x04, // bSourceID -> 0x04 = From INPUT_TERMINAL ID 4
0x00, // iTerminal -> 0x00 = Unused.
/* ##### INTERFACE 1 - OUT ##### */
//ID - INTERFACE 1/0 Stream
0x09, // Size : 9 Bytes
0x04, // Interface Descriptor (0x04)
0x01, // bInterfaceNumber -> 0x01 Interface ID = 1 Number of this interface. Zero-based value identifying the index in the array of concurrent interfaces supported by this configuration.
0x00, // bAlternateSetting -> 0x00 = index of this interface's alternate setting
0x00, // bNumEndpoints -> 0x00 = 0 Endpoints to this interface
0x01, // bInterfaceClass -> 0x01 = Audio Interface
0x02, // bInterfaceSubclass -> 0x02 = AUDIO_STREAMING
0x00, // bInterfaceProtocol -> 0x00 = Unused
0x00, // iInterface -> 0x00 = Unused
//ID - INTERFACE 1/1 Stream
0x09, // Size : 9 Bytes
0x04, // Interface Descriptor (0x04)
0x01, // bInterfaceNumber -> 0x01 Interface ID = 1
0x01, // bAlternateSetting -> 0x01 = index of this interface's alternate setting
0x01, // bNumEndpoints -> 0x01 = 1 Endpoints to this interface
0x01, // bInterfaceClass -> 0x01 = Audio Interface
0x02, // bInterfaceSubclass -> 0x02 = AUDIO_STREAMING
0x00, // bInterfaceProtocol -> 0x00 = Unused
0x00, // iInterface -> 0x00 = Unused
//ASID - GENERAL
0x07, // Size : 7 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x01, // bDescriptorSubtype -> 0x01 = GENERAL subtype
0x01, // bTerminalLink -> 0x01 = The Terminal ID of the Terminal to which the endpoint of this interface is connected.
0x01, // bDelay -> 0x01 = Delay (delta) introduced by the data path (see Section 3.4, ?Inter Channel Synchronization? - in Audio Devices). Expressed in number of frames.
0x01,0x00, // wFormatTag -> 0x0001 = PCM
//ASID - FORMAT_TYPE
0x0E, // Size : 14 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x02, // bDescriptorSubtype -> 0x02 = FORMAT_TYPE
0x01, // bFormatType -> 0x01 = FORMAT_TYPE_I -> ref: A.1.1 Audio Data Format Type I Codes -> Audio Data Format Dok
0x02, // bNrChannels -> 0x02 = Two channels
0x02, // bSubFrameSize -> 0x02 = 2 bytes pr audio subframe
0x10, // bBitResolution -> 0x10 = 16 bit pr sample
0x02, // bSamFreqType -> 0x02 = 2 sample frequencies supported
0x80,0xBB,0x00, // tSamFreq -> 0xBB80 = 48000 Hz
0x00,0x7D,0x00, // tSamFreq -> 0x7D00 = 32000 Hz
//0x44,0xAC,0x00,
//ED - ENDPOINT OUT
0x09, // Size : 9 Bytes
0x05, // 0x05 -> ENDPOINT Descriptor Type
0x01, // bEndpointAddress -> 0x01 = adress 1, OUT, -> ref 9.6.6 Endpoint -> usb_20 Dok
0x09, // bmAttributes -> 0b00001001 -> Bits 0-1 = 01 = Isochronous , Bits 2-3 = 10 = Adaptive
AUDIO_MAX_SAMPLES * sizeof ( AUDIO_PLAY_SAMPLE ), 0x00, // wMaxPacketSize -> 48 * 4 = 0x0030 :Maximum packet size this endpoint is capable of sending or receiving when this configuration is selected.
0x01, // bInterval -> 0x01 = 1 millisecond
0x00, // ?? not described -> 0x00 = Unused
0x00, // ?? not described -> 0x00 = Unused
//CSED - CS ENDPOINT
0x07, // Size : 7 Bytes
0x25, // CS_ENDPOINT
0x01, // bDescriptorSubtype -> 0x01 = GENERAL
0x01, // bmAttributes -> 0b00000001 = Bit 1 = 1 => Sample Freq Control is supported by this endpoint
0x00, // bLockDelayUnits -> 0x00 = Indicates the units used for the wLockDelay field: 0 = Undefined
0x00,0x00, // the time it takes this endpoint to reliably lock its internal clock recovery circuitry.
/* ##### INTERFACE 2 - IN ##### */
//ID - INTERFACE 2/0 Stream
0x09, // Size : 9 Bytes
0x04, // Interface Descriptor (0x04)
0x02, // bInterfaceNumber -> 0x02 Interface ID = 2
0x00, // bAlternateSetting -> 0x00 = Value used to select this alternate setting for the interface identified in the prior field
0x00, // bNumEndpoints -> 0x00 = 0 -> Number of endpoints used by this interface
0x01, // bInterfaceClass -> 0x01 = 1 = AUDIO
0x02, // bInterfaceSubClass -> 0x02 = AUDIO_STREAMING
0x00, // bInterfaceProtocol -> 0x00 = Unused
0x00, // iInterface -> 0x00 = Unused -> Index of string descriptor.
//ID - INTERFACE 2/1 Stream
0x09, // Size : 9 Bytes
0x04, // Interface Descriptor (0x04)
0x02, // bInterfaceNumber -> 0x02 Interface ID = 2
0x01, // bAlternateSetting -> 0x01 = Value used to select this alternate setting for the interface identified in the prior field
0x01, // bNumEndpoints -> 0x01 = 1 -> Number of endpoints used by this interface
0x01, // bInterfaceClass -> 0x01 = 1 = AUDIO
0x02, // bInterfaceSubClass -> 0x02 = AUDIO_STREAMING
0x00, // bInterfaceProtocol -> 0x00 = Unused
0x00, // iInterface -> 0x00 = Unused -> Index of string descriptor.
//ASID - GENERAL
0x07, // Size : 7 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x01, // GENERAL Descriptor
0x06, // bTerminalLink -> 0x06 = The Terminal ID of the Terminal to which the endpoint of this interface is connected = 6
0x01, // bDelay -> 0x01 = Delay (delta) introduced by the data path (see Section 3.4, ?Inter Channel Synchronization? - in Audio Devices). Expressed in number of frames.
0x01,0x00, // wFormatTag -> 0x0001 = PCM
//ASID - FORMAT_TYPE
0x0E, // Size : 14 Bytes
0x24, // CS_INTERFACE Descriptor Type
0x02, // bDescriptorSubtype -> 0x02 = FORMAT_TYPE
0x01, // bFormatType -> 0x01 = FORMAT_TYPE_I -> ref: A.1.1 Audio Data Format Type I Codes -> Audio Data Format Dok
0x02, // bNrChannels -> 0x02 = Two channels
0x02, // bSubFrameSize -> 0x02 = 2 bytes pr audio subframe
//0x03, // bSubFrameSize -> 0x03 = 3 bytes pr audio subframe
0x10, // bBitResolution -> 0x10 = 16 bit pr sample
//0x18, // bBitResolution -> 0x18 = 24 bit pr sample
0x02, // bSamFreqType -> 0x02 = 2 sample frequencies supported
0x80,0xBB,0x00, // tSamFreq -> 0xBB80 = 48000 Hz
0x00,0x7D,0x00, // tSamFreq -> 0x7D00 = 32000 Hz
//0x44,0xAC,0x00, // 44100 Hz
//ED - ENDPOINT IN
0x09, // Size : 9 Bytes
0x05, // 0x05 -> ENDPOINT Descriptor Type
0x82, // bEndpointAddress -> 0x82 = adress 2, IN, -> ref 9.6.6 Endpoint -> usb_20 Dok
0x05, // bmAttributes -> 0b00000101 -> Bits 0-1 = 01 = Isochronous , Bits 2-3 = 01 = Asynchronous
AUDIO_MAX_SAMPLES * (sizeof ( AUDIO_PLAY_SAMPLE )), 0x00, // wMaxPacketSize -> 48 * 4 = 0x0030 :Maximum packet size this endpoint is capable of sending or receiving when this configuration is selected.
//0x20,0x01,
0x01, // bInterval -> 0x01 = 1 millisecond
0x00, // bRefresh -> ??
0x00, // bSynchAddress -> ??
//CSED - CS ENDPOINT
0x07, // Size : 7 Bytes
0x25, // bDescriptorType -> 0x25 = CS_ENDPOINT
0x01, // bDescriptorSubtype -> 0x01 = GENERAL
0x01, // bmAttributes -> 0b00000001 = Bit 1 = 1 => Sample Freq Control is supported by this endpoint
0x00, // bLockDelayUnits -> 0x00 = Indicates the units used for the wLockDelay field: 0 = Undefined
0x00,0x00, // the time it takes this endpoint to reliably lock its internal clock recovery circuitry
};
So i solved my question!
As chris commented, i had to implement some sort of control, into the control interface. First i added two Feature units, as seen on the picture. I implemented made then support no controls, by setting:
bmaControls(1) = 0x00 // No control support
This cause windows to install the device driver properly, but i still couldn't see the microphone in "Recording Devices" in the sound settings.
The i implemented Mute control, by changing the control parameter to:
bmaControls(1) = 0x01 // Mute support
Now i could see the microphone in the settings, but the gain was different from what i expected. Then i changed the controls to Volume control, by changing the control parameter to:
bmaControls(1) = 0x02 // Volume support
This made the microphone work as i expected...
Thanks Chris!

Setting values of bytearray

I have a BYTE data[3]. The first element, data[0] has to be a BYTE with very specific values which are as follows:
typedef enum
{
SET_ACCURACY=0x01,
SET_RETRACT_LIMIT=0x02,
SET_EXTEND_LIMT=0x03,
SET_MOVEMENT_THRESHOLD=0x04,
SET_STALL_TIME= 0x05,
SET_PWM_THRESHOLD= 0x06,
SET_DERIVATIVE_THRESHOLD= 0x07,
SET_DERIVATIVE_MAXIMUM = 0x08,
SET_DERIVATIVE_MINIMUM= 0x09,
SET_PWM_MAXIMUM= 0x0A,
SET_PWM_MINIMUM = 0x0B,
SET_PROPORTIONAL_GAIN = 0x0C,
SET_DERIVATIVE_GAIN= 0x0D,
SET_AVERAGE_RC = 0x0E,
SET_AVERAGE_ADC = 0x0F,
GET_FEEDBACK=0x10,
SET_POSITION=0x20,
SET_SPEED= 0x21,
DISABLE_MANUAL = 0x30,
RESET= 0xFF,
}TYPE_CMD;
As is, if I set data[0]=SET_ACCURACY it doesn't set it to 0x01, it sets it to 1, which is not what I want. data[0] must take the value 0x01 when set it equal to SET_ACCURACY. How do I make it so that it does this for not only SET_ACCURACY, but all the other values as well?
EDIT: Actually this works... I had a different error in my code that I attributed to this. Sorry!
Thanks!
"data[0]=SET_ACCURACY doesn't set it to 0x01, it sets it to 1"
It assigns value of SET_ACCURACY to the data[0], which means that bits 00000001 are stored into memory at address &data[0]. 0x01 is hexadecimal representation of this byte, 1 is its decimal representation.

Resources