Correct WAVE_FORMAT_1S16 PCM dual-channel format? - audio

I am trying to play two 16-bit PCM streams on Windows 10 WinMM audio,
each on a separate channel, using this WAVEFORMATEX:
const WAVEFORMATEX
_wex_ = // (WAVEFORMATEX)
{ .wFormatTag = WAVE_FORMAT_PCM
, .nChannels = 2
, .nSamplesPerSec = 8000
, .nAvgBytesPerSec= 16000
, .nBlockAlign = 4
, .wBitsPerSample = 16
, .cbSize = 0
};
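(For reference, the WAVEFORMATEX documentation relates these fields, for plain PCM, as:
   nBlockAlign     = nChannels * wBitsPerSample / 8;   // 2 * 16/8 = 4 bytes per interleaved frame
   nAvgBytesPerSec = nSamplesPerSec * nBlockAlign;     // 8000 * 4 = 32000 bytes per second
so for 2 channels of 16-bit samples at 8000 Hz the derived values come out as 4 and 32000.)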
When I play them from two processes, process A laying out its stream like:
Bit 31:            | Bit 15:           Bit 0:
<PCM_16_BITS>      | 00 ... 00
and process B laying out its stream like:
Bit 31:            | Bit 15:           Bit 0:
00 ... 00          | <PCM_16_BITS>
then both streams play successfully: they are mixed by the WAS mixer device, and since
each stream occupies only one channel, each is heard on only the left or the right speaker.
But if I write a single process which combines the two streams, so that
a single stream is laid out like:
Bit 31:            | Bit 15:           Bit 0:
<LEFT PCM_16_BITS> | <RIGHT PCM_16_BITS>
then the stream plays as garbled nonsense.
Please could anyone enlighten me as to the actual byte layout that Windows Audio
expects the frames to have for 2-channel 16-bit PCM as configured by my WAVEFORMATEX?
The code that invokes waveOutOpen, waveOutPrepareHeader, waveOutWrite, etc. works fine;
it is only when I try to play a combined stream, with the left and right 16-bit samples
packed into the high and low 16 bits of 32-bit words, that the output is garbled.
I guess I just do not know what format Windows Audio is expecting here.
Or is my WAVEFORMATEX in error somehow?
Or could someone point to where this is documented? The Microsoft docs go into excruciating detail on header-file contents without actually explaining things like stream layouts at all.
Thanks in advance for any helpful advice / replies.

Aha! I finally found:
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ksmedia/ns-ksmedia-waveformatextensible
and
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ksmedia/ns-ksmedia-ksaudio_channel_config
which do answer the question - I guess if I am sending 2 channels
in one frame I have to use 12 bytes per channel.

OK, the last answer wasn't really an answer - here is a better attempt -
To save anyone else the headaches I have been through the last few days,
here is the corrected 'pwfx' waveOutOpen parameter:
const WAVEFORMATEXTENSIBLE
_wex_ext_ =
{ .Format =
{ .wFormatTag = WAVE_FORMAT_EXTENSIBLE
, .nChannels = 2
, .nSamplesPerSec = 8000
, .nAvgBytesPerSec= 32000
, .nBlockAlign = 4
, .wBitsPerSample = 16
, .cbSize = sizeof(WAVEFORMATEXTENSIBLE) -
sizeof(WAVEFORMATEX)
}
, .Samples = {16}
, .dwChannelMask = (SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT)
, .SubFormat = DEFINE_WAVEFORMATEX_GUID(WAVE_FORMAT_PCM)
};
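Two side notes on that initializer: the {16} brace initializer sets the union member Samples.wValidBitsPerSample, which can also be written out explicitly, and DEFINE_WAVEFORMATEX_GUID(WAVE_FORMAT_PCM) should expand to the same GUID as KSDATAFORMAT_SUBTYPE_PCM. A sketch of the more explicit spelling, using the same mmreg.h/ksmedia.h names:
, .Samples.wValidBitsPerSample = 16                                 // how many of the 16 container bits carry audio
, .dwChannelMask = (SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT)
, .SubFormat     = DEFINE_WAVEFORMATEX_GUID(WAVE_FORMAT_PCM)        // the PCM subformat GUID (KSDATAFORMAT_SUBTYPE_PCM)
Also note that, compared with the first WAVEFORMATEX above, nAvgBytesPerSec is now 32000, consistent with the nSamplesPerSec * nBlockAlign relation noted earlier, where the first attempt declared 16000.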
So if that is used as the structure to which 'pwfx' points:
switch
( r =
waveOutOpen
( &aoh // Audio Output Handle
, WAVE_MAPPER
, ((WAVEFORMATEX*)&_wex_ext_)
, ((DWORD_PTR)waveOutProc) // Callback function on WOM_DONE
, ((DWORD_PTR)0UL) // no user parameter
, WAVE_FORMAT_DIRECT | CALLBACK_FUNCTION // do not modify audio data
)
)
{case MMSYSERR_NOERROR:
break;
case MMSYSERR_ALLOCATED:
ok = false;
emsg = "Specified resource is already allocated.";
break;
case MMSYSERR_BADDEVICEID:
ok = false;
emsg = "Specified device identifier is out of range.";
break;
case MMSYSERR_NODRIVER:
ok = false;
emsg = "No device driver is present.";
break;
case MMSYSERR_NOMEM:
ok = false;
emsg = "Unable to allocate or lock memory.";
break;
case WAVERR_BADFORMAT:
ok = false;
emsg = "Attempted to open with an unsupported waveform-audio format.";
break;
case WAVERR_SYNC:
ok = false;
emsg = "The device is synchronous but waveOutOpen was called without using the WAVE_ALLOWSYNC flag.";
break;
default:
ok = false;
if ( 0 == waveOutGetErrorText(r, ((u8_t*) &pcm[0].l[0]), 1024 ))
{ u16_t len = 1024;
s8_t *str = ((s8_t*)&pcm[0].l[0]);
utf8str( ((u16_t*)&pcm[0].l[0]), wcslen((U16_t*)&pcm[0].l[0]), &str, &len);
emsg = ((const char*) &pcm[0].l[0]);
}
break;
}
Then either of the two SINGLE-CHANNEL layouts, played by two processes:
Bit 31:            | Bit 15:           Bit 0:
<16-bit LE PCM L>  | 0 ... 0              # all zeros
or:
Bit 31:            | Bit 15:           Bit 0:
0 ... 0            | <16-bit LE PCM R>
works fine, and mixes so that "PCM L" plays on the LEFT channel and "PCM R"
plays on the RIGHT channel.
But I have had NO success playing this from a single process:
Bit 31:            | Bit 15:           Bit 0:
<16-bit LE PCM L>  | <16-bit LE PCM R>
The same occurs when playing the stream with ALSA (alsa-lib) on Linux,
so it is not a Windows-specific thing.
On Linux, with ALSA's S16_LE format, I have tried playing the same streams,
in one process, like:
Bit 31:            | Bit 15:           Bit 0:
<16-bit LE PCM L>  | 0 ... 0              # all zeros
0 ... 0            | <16-bit LE PCM R>
I have also tried:
Bit 31:            | Bit 15:           Bit 0:
0 ... 0            | <16-bit LE PCM L>
0 ... 0            | <16-bit LE PCM R>
and:
Bit 31:            | Bit 15:           Bit 0:
<16-bit LE PCM L>  | 0 ... 0
<16-bit LE PCM R>  | 0 ... 0
But these do not work either (the stream is full of clicks and glitches,
the signals seem to alternate between left and right, and they are barely
audible). Of course, how could that work? I have specified the S16_LE
format, which expects 16-bit samples, and I am writing 32-bit words.
So, I'm really stuck as to what the exact interleaving byte format
should be.
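For what it's worth, both the WinMM formats above and ALSA's SND_PCM_FORMAT_S16_LE (with interleaved access) describe the same frame layout: each frame is nBlockAlign = 4 bytes, the LEFT sample first, then the RIGHT sample, each one a signed little-endian 16-bit value with no padding. A minimal sketch, assuming a plain int16_t buffer (left_sample and right_sample are placeholders):
   // One interleaved 16-bit stereo frame, in increasing memory-address order:
   //   byte 0: left  low byte    byte 1: left  high byte
   //   byte 2: right low byte    byte 3: right high byte
   int16_t frame[2];
   frame[0] = left_sample;    // channel 0 (left)  occupies the first two bytes
   frame[1] = right_sample;   // channel 1 (right) occupies the next two bytes
   // A buffer of N frames is simply L0 R0 L1 R1 ... L(N-1) R(N-1), N * 4 bytes long.
Be careful with the "bit 31 | bit 0" pictures: if you load a whole frame into a 32-bit integer on a little-endian machine, the first (left) sample lands in the LOW-order 16 bits of that integer's value, not the high 16 bits.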
On Windows, the code doing the interleaving looks like:
typedef struct pcm_frame_s
{ U16_t l[320], r[320], lr[640];
} PCMFrame_t;
static
PCMFrame_t pcm[16] = {0};
...
// pcmA & pcmB are pointers to pcm[i].l[0] & pcm[i].r[0] :
if ( pcmA && pcmB )
{ register U8_t
np;
register U16_t
*ppcm =&pcm[i].lr[0];
for( np = 0
, pcmA=&pcm[i].l[0]
, pcmB=&pcm[i].r[0]
; np < 160
; ++np, ++ppcm, ++pcmA, ++pcmB // ppcm advances twice per frame: once here, once in the body
)
{ *ppcm = *pcmA ? *pcmA : pcm_slnc[np]; // left sample (zero samples replaced by comfort noise)
ppcm += 1;
*ppcm = *pcmB ? *pcmB : pcm_slnc[np]; // right sample
} // pcm_slnc is a pre-computed "Comfort Noise" block
}else
{ register U8_t
np;
register U16_t
*ppcm = &pcm[i].lr[0]
, *pspcm = pcmA ? pcmA : pcmB;
if(left) // user has chosen to put 1st stream on left channel
{
for(np=0; np < 160; ++np, ++ppcm, ++pspcm)
{ *ppcm = *pspcm;
ppcm += 1;
*ppcm = pcm_slnc[np];
}
} else
{
for(np=0; np < 160; ++np, ++ppcm, ++pspcm)
{ *ppcm = pcm_slnc[np];
ppcm += 1;
*ppcm=*pspcm;
}
}
}
while ( (WHDR_PREPARED | WHDR_DONE)
!= (wh[i].dwFlags & (WHDR_PREPARED | WHDR_DONE))
)
{ if (!WaitOnAddress
( &(wh[i].dwFlags)
, &playingDwFlags
, sizeof(u32_t)
, INFINITE
)
)
{ say(FMT("WaitOnAddress failed in INFINITE mode."));
ok = false;
break;
}
}
wh[i].dwFlags &= ~WHDR_DONE;
switch
( r = waveOutWrite(aoh, &wh[i], sizeof(wh[i])) )
{case MMSYSERR_NOERROR:
...
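One more thing worth double-checking when a combined stream comes out garbled: the WAVEHDR setup is not shown above, and dwBufferLength is a byte count, not a sample or frame count. A hypothetical setup for one of the buffers above (the names wh, pcm, aoh and the 160-frame size come from the code above; everything else is assumption):
   wh[i].lpData         = (LPSTR)&pcm[i].lr[0];
   wh[i].dwBufferLength = 160 * 2 * sizeof(U16_t);   // bytes = frames * channels * bytes per sample = 640
   wh[i].dwFlags        = 0;
   wh[i].dwLoops        = 0;
   waveOutPrepareHeader(aoh, &wh[i], sizeof(wh[i]));
If dwBufferLength were set to 160 or 320 instead of 640, only part of each interleaved buffer would be played, which also sounds broken.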
I think I MUST specify 32 bits per channel and use a 32-bit frame size?
So there is no way I can interleave two 16-bit left|right PCM values next
to each other without using a 32-bit-per-sample stream format?
I am on the verge of giving up and just using two processes and the mixer,
which strangely DOES honor the "<left 16 bits>|0" (left process) and
"0|<right 16 bits>" (right process) layouts and sends them to the left and
right speakers as expected.
But is there really no way to specify two 16-bit channels in a 32-bit word?
Strange.

Related

Audio Unit RemoteIO Setting interleaved float gives kAudioUnitErr_FormatNotSupported

I am working with the Audio Unit RemoteIO to obtain low-latency audio output. AFAIK the audio unit only accepts certain audio formats, depending on the hardware. My problem is that I have a C++ DSP sound engine that works with interleaved float PCM, and I do not want to implement a format converter since it could slow things down in the RemoteIO callback. I tried obtaining a low-latency Audio Unit with the following format:
AudioStreamBasicDescription const audioDescription = {
.mSampleRate = defaultSampleRate,
.mFormatID = kAudioFormatLinearPCM,
.mFormatFlags = kAudioFormatFlagIsFloat,
.mBytesPerPacket = defaultSampleRate * STEREO_CHANNEL,
.mFramesPerPacket = 1,
.mBytesPerFrame = STEREO_CHANNEL * sizeof(Float32),
.mChannelsPerFrame = STEREO_CHANNEL,
.mBitsPerChannel = 8 * sizeof(Float32),
.mReserved = 0
};
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
kOutputBus,
&audioDescription,
sizeof(audioDescription));
This fails with the error code kAudioUnitErr_FormatNotSupported -10868. If I try to obtain a float PCM NON-interleaved audio stream with the following:
AudioStreamBasicDescription const audioDescription = {
.mSampleRate = defaultSampleRate,
.mFormatID = kAudioFormatLinearPCM,
.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked | kAudioFormatFlagIsNonInterleaved,
.mBytesPerPacket = sizeof(float),
.mFramesPerPacket = 1,
.mBytesPerFrame = sizeof(float),
.mChannelsPerFrame = STEREO_CHANNEL,
.mBitsPerChannel = 8 * sizeof(float),
.mReserved = 0
};
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
kOutputBus,
&audioDescription,
sizeof(audioDescription));
Everything works fine. However I want to obtain an interleaved audio stream for my DSP engine to work without format conversion. Is this possible at all?
PS. waiting for hotpaw2 to guide me :)
Your error is probably due to this line:
.mBytesPerPacket = defaultSampleRate * STEREO_CHANNEL,
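mBytesPerPacket is the size of one packet, and with mFramesPerPacket = 1 a packet is a single frame, so it has to equal mBytesPerFrame (all channels of one frame), not the sample rate times the channel count. A sketch of a self-consistent interleaved description, reusing the names from the question (kAudioFormatFlagIsPacked added on the assumption the floats are packed):
AudioStreamBasicDescription const audioDescription = {
    .mSampleRate       = defaultSampleRate,
    .mFormatID         = kAudioFormatLinearPCM,
    .mFormatFlags      = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,  // interleaved: no NonInterleaved flag
    .mBytesPerPacket   = STEREO_CHANNEL * sizeof(Float32),                    // == mBytesPerFrame, since mFramesPerPacket == 1
    .mFramesPerPacket  = 1,
    .mBytesPerFrame    = STEREO_CHANNEL * sizeof(Float32),
    .mChannelsPerFrame = STEREO_CHANNEL,
    .mBitsPerChannel   = 8 * sizeof(Float32),
    .mReserved         = 0
};
Even with a consistent description the RemoteIO unit may still refuse interleaved float on some hardware, in which case the non-interleaved route you already have working is the fallback.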

sending audio via bluetooth a2dp source esp32

I am trying to send a measured I2S analogue signal (e.g. from a mic) to the sink device via Bluetooth, instead of the default noise.
Currently I am trying to change bt_app_a2d_data_cb():
static int32_t bt_app_a2d_data_cb(uint8_t *data, int32_t i2s_read_len)
{
if (i2s_read_len < 0 || data == NULL) {
return 0;
}
char* i2s_read_buff = (char*) calloc(i2s_read_len, sizeof(char));
bytes_read = 0;
i2s_adc_enable(I2S_NUM_0);
while(bytes_read == 0)
{
i2s_read(I2S_NUM_0, i2s_read_buff, i2s_read_len,&bytes_read, portMAX_DELAY);
}
i2s_adc_disable(I2S_NUM_0);
// taking care of the watchdog//
TIMERG0.wdt_wprotect=TIMG_WDT_WKEY_VALUE;
TIMERG0.wdt_feed=1;
TIMERG0.wdt_wprotect=0;
uint32_t j = 0;
uint16_t dac_value = 0;
// change 16bit input signal to 8bit
for (int i = 0; i < i2s_read_len; i += 2) {
dac_value = ((((uint16_t) (i2s_read_buff[i + 1] & 0xf) << 8) | ((i2s_read_buff[i + 0]))));
data[j] = (uint8_t) dac_value * 256 / 4096;
j++;
}
// testing for loop
//uint8_t da = 0;
//for (int i = 0; i < i2s_read_len; i++) {
// data[i] = (uint8_t) (i2s_read_buff[i] >> 8);// & 0xff;
// da++;
// if(da>254) da=0;
//}
free(i2s_read_buff);
i2s_read_buff = NULL;
return i2s_read_len;
}
All I hear from the sink device is a sawtooth sound.
Any ideas what to do?
Your data can be an array of values representing the analogue signal or its variations; for example, a 32 kHz sound signal is 32,000 samples for every second of captured sound. If your data is to be transmitted in offline mode, you can prepare the outgoing data as a buffer plus a terminator character, then send the buffer via the Bluetooth module of the sender device connected to the microcontroller. On the receiving device, once you see the terminator character (e.g. "\r") you can process the incoming buffer. In my case I had to send a string array of numbers; I often received one or two unknown leading characters, so I rejected them while filling the receive buffer (see: "how to trim unknown first characters of string in code vision").
If you want online mode, i.e. the data must be transmitted and played concurrently, you must account for the delays and processing time of all the microcontrollers and devices involved (Bluetooth, EEPROM ICs, and so on).
I'm also working on an "A2DP source ESP32" project, playing a WAV file from SPIFFS.
If the WAV file is 44100 Hz, 16-bit, stereo, you can write the stream of bytes from the file directly into the data[] array.
When I tried to write less data than the len variable and return less (for example 88), I got an error; now I'm trying to figure out how to reduce this buffer because of the large latency (len = 512).
Also, the data in the data[] array is stored as stereo.
Example: read data from file to data[ ]-array:
size_t read;
read = fread((void*) data, 1, len, fwave);//fwave is a file
if(read<len){ // if we hit EOF, go back to the beginning of the file
fseek(fwave , 0x2C , SEEK_SET); // skip the 44-byte WAV header
read = fread((void*) (&(data[read])), 1, len-read, fwave); // read the remainder
}
If the file is mono, I convert it to stereo like this (I read half, then duplicate the data):
int32_t lenHalf=len/2;
read = fread((void*) data, 1, lenHalf, fwave);
if(read<lenHalf){ // if we hit EOF, go back to the beginning of the file
fseek(fwave , 0x2C , SEEK_SET); // skip the 44-byte WAV header
read = fread((void*) (&(data[read])), 1, lenHalf-read, fwave); // read the remainder
}
//copy to the second channel
uint16_t *data16=(uint16_t*)data;
for (int i = lenHalf/2-1; i >= 0; i--) {
data16[(i << 1)] = data16[i];
data16[(i << 1) + 1] = data16[i];
}
I think you are getting the sawtooth sound because:
1. your data is mono?
2. in your "return i2s_read_len;", i2s_read_len is less than len;
3. you change the 16-bit input signal to 8-bit, but in the data[] array the data should be 16-bit stereo: 2 bytes left, 2 bytes right, 2 bytes left, 2 bytes right, ...
I'm not sure, it's a guess.
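Putting points 1 and 3 together, here is a sketch of what the conversion could look like if the sink really expects 16-bit interleaved stereo. The helper name fill_stereo_from_adc and everything inside it are illustrative, not from the question; it assumes the ADC delivers one 12-bit sample in the low bits of each 16-bit word (as the question's own code does), and that only len/2 bytes of I2S data are read so the stereo-expanded result fills exactly len output bytes:
// Hypothetical helper: fill the A2DP output buffer with 16-bit interleaved stereo
// built from 12-bit mono ADC samples.
static int32_t fill_stereo_from_adc(uint8_t *data, int32_t len, const uint8_t *i2s_read_buff)
{
    int16_t *out    = (int16_t *)data;   // len/2 int16 slots = len/4 stereo frames
    int32_t  frames = len / 4;           // one frame = one left + one right int16
    for (int32_t n = 0; n < frames; ++n) {
        uint16_t raw    = (uint16_t)(((i2s_read_buff[2*n + 1] & 0x0F) << 8) | i2s_read_buff[2*n]);
        int16_t  sample = (int16_t)((raw << 4) - 32768);  // scale 0..4095 to roughly -32768..32752 (centring is an assumption)
        out[2*n]     = sample;            // left
        out[2*n + 1] = sample;            // right: duplicate the mono signal
    }
    return len;
}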

Format settings for iOS multimixer au when using a bluetooth endpoint

Hi Core Audio/AU community,
I have hit a roadblock during development. My current AUGraph is set up as 2 mono streams -> mixer unit -> RemoteIO unit on an iOS platform. I am using the mixer to mix the two mono streams into interleaved stereo. However, the mono streams need not be mixed at all times while being played out in stereo, i.e. the interleaved stereo output should be composed of the 1st mono stream in the left ear and the 2nd mono stream in the right ear. I am able to accomplish this using the kAudioUnitProperty_MatrixLevels property on the multichannel mixer.
//Left out //right out
matrixVolumes[0][0]=1; matrixVolumes[0][1]=0.001;
matrixVolumes[1][0]=0.001; matrixVolumes[1][1]=0.001;
result = AudioUnitSetProperty(mAumixer, kAudioUnitProperty_MatrixLevels, kAudioUnitScope_Input, 0,matrixVolumes , matrixPropSize);
if (result) {
printf("Error while setting kAudioUnitProperty_MatrixLevels from mixer on bus 0 %ld %08X %4.4s\n", result, (unsigned int)result, (char*)&result);
return -1;
}
printf("setting matrix levels kAudioUnitProperty_MatrixLevels on bus 1 \n");
//Left out //right out
matrixVolumes[0][0]=0.001; matrixVolumes[0][1]=1;
matrixVolumes[1][0]=0.001; matrixVolumes[1][1]=0.001;
result = AudioUnitSetProperty(mAumixer, kAudioUnitProperty_MatrixLevels, kAudioUnitScope_Input, 1,matrixVolumes , matrixPropSize);
if (result) {
printf("Error while setting kAudioUnitProperty_MatrixLevels from mixer on bus 1 %ld %08X %4.4s\n", result, (unsigned int)result, (char*)&result);
return -1;
}
As shown above, I am using the matrix levels to keep the streams separate in the interleaved stereo output. This works fine when I am using a wired headset: the 1st mono stream plays in the left ear and the 2nd mono stream plays in the right ear. But when I switch to a Bluetooth headset, the output is a mix of both mono streams playing on both the left and right channels, so the matrix levels do not seem to take effect there. The formats used for the input and output of the mixer are as follows:
printf("create Input ASBD\n");
// client format audio goes into the mixer
obj->mInputFormat.mFormatID = kAudioFormatLinearPCM;
int sampleSize = ((UInt32)sizeof(AudioSampleType));
obj->mInputFormat.mFormatFlags = kAudioFormatFlagsCanonical;
obj->mInputFormat.mBitsPerChannel = 8 * sampleSize;
obj->mInputFormat.mChannelsPerFrame = 1; //mono
obj->mInputFormat.mFramesPerPacket = 1;
obj->mInputFormat.mBytesPerPacket = obj->mInputFormat.mBytesPerFrame = sampleSize;
obj->mInputFormat.mFormatFlags |= kAudioFormatFlagIsNonInterleaved;
// obj->mInputFormat.mSampleRate = obj->mGlobalSampleRate;(// set later while initializing audioStreamer or from the client app)
printf("create output ASBD\n");
// output format for the mixer unit output bus
obj->mOutputFormat.mFormatID = kAudioFormatLinearPCM;
obj->mOutputFormat.mFormatFlags = kAudioFormatFlagsCanonical | (kAudioUnitSampleFractionBits << kLinearPCMFormatFlagsSampleFractionShift);
obj->mOutputFormat.mChannelsPerFrame = 2;//stereo
obj->mOutputFormat.mFramesPerPacket = 1;
obj->mOutputFormat.mBitsPerChannel = 8 * ((UInt32)sizeof(AudioUnitSampleType));
obj->mOutputFormat.mBytesPerPacket = obj->mOutputFormat.mBytesPerFrame = 2 * ((UInt32)sizeof(AudioUnitSampleType));
// obj->mOutputFormat.mSampleRate = obj->mGlobalSampleRate; (// set later while initializing)
N.B.: I am setting the sample rates separately from the application. The input and output sample rates of the mixer are the same as the sample rate of the mono audio files.
Thanks in advance for taking a look at the issue...:)

Converting 24 bit USB audio stream into 32 bit stream

I'm trying to convert a 24-bit USB audio stream into a 32-bit stream so my microcontroller's peripherals can play happily with the stream (it can only handle 16- or 32-bit data, like most MCUs...).
The following code is what I got from the MCU's manufacturer... it didn't work as expected and I ended up with really distorted audio.
// Function takes usb stream and processes the data for our peripherals
// #data - usb stream data
// #byte_count - size of stream
void process_usb_stream(uint8_t *data, uint16_t byte_count) {
// Etc code that gets buffers ready to read the stream...
// Conversion here!
int32_t *buffer;
int sample_count = 0;
for (int i = 0; i < byte_count; i += 3) {
buffer[sample_count++] = data[i] | data[i+1] << 8 | data[i+2] << 16;
}
// Send buffer to peripherals for them to use...
}
Any help with converting the data from a 24 bit stream to 32 bit stream would be super awesome! This area of work is very hard for me :(
data[...] is a uint8_t, so after the usual integer promotions the shifts are done as plain int and the assembled value leaves the 24-bit sign bit in bit 23 rather than bit 31: negative samples come out as large positive numbers, which is not what you want. Cast each byte to int32_t and shift the whole value up by another 8 bits so the sign bit lands in bit 31 and the full 32-bit range is used. (Strictly speaking, shifting a 1 into the sign bit is not portable C either; if you want to be pedantic, assemble the value in a uint32_t and convert at the end.)
Also, you're treating the data as if it were in little-endian format. Make sure it is. I'll assume that's correct, so something like this works:
int32_t *buffer;        // assumed to point at storage for byte_count/3 samples
int sample_count = 0;
for (int i = 0; i+3 <= byte_count; ) {
    int32_t v  = ((int32_t)data[i++]) << 8;    // low byte  -> bits 15..8
    v         |= ((int32_t)data[i++]) << 16;   // mid byte  -> bits 23..16
    v         |= ((int32_t)data[i++]) << 24;   // high byte -> bits 31..24 (sign bit ends up in bit 31)
    buffer[sample_count++] = v;
}
Finally, note that this assumes that byte_count is divisible by 3 -- make sure that's true!
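As a quick sanity check of the sign handling (a worked example, not from the original post): the most negative 24-bit sample is 0x800000 = -8388608, stored little-endian as the bytes 00 00 80. Packed into bits 31..8 as above, the result is the bit pattern 0x80000000, which as an int32_t is -2147483648 = -8388608 * 256, the correct full-scale 32-bit value. The original code instead produces 0x00800000 = +8388608, a large positive number, which is why negative half-waves get mangled and the audio sounds distorted.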
This is DSP stuff; you could also post this question on http://dsp.stackexchange.com
In DSP the process of changing the bit depth is called scaling:
16-bit resolution has 65,536 possible values,
24-bit resolution has 16,777,216,
and 32-bit has 4,294,967,296, so the factor between 24-bit and 32-bit is 256.
According to https://electronics.stackexchange.com/questions/229268/what-is-name-of-process-used-to-change-sample-bit-depth/229271
reduction from 24-bit to 16-bit is called scaling down and is done by dividing each value by 256.
This can be done by shifting right by 8 bits:
y = x >> 8. When scaling down this way the low-order bits are lost.
Scaling up to 32-bit is more complicated and there are several approaches. The simplest is to place the 24-bit value in a 32-bit register and shift it left by 8:
sample32 = ((int32_t)sample24) << 8;
and, if you want to use the low 8 bits, interpolate them (e.g. linear interpolation) rather than leaving them zero.
Maybe there are much better scaling-up algorithms; ask on http://dsp.stackexchange.com
See also http://blog.bjornroche.com/2013/05/the-abcs-of-pcm-uncompressed-digital.html for the scaling-up problem...

Reading PCI MSICAP register

I am trying to enable multiple MSIs on my PCI card. Before enabling them, I read the PCI config space at MSICAP + 2h: MC - Message Signaled Interrupt Message Control.
The way I am doing it is as follows:
u16 val;
int pos = 0x50;
pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &val);
printk("\n val = %x", val);
val |= 0x51;
pci_write_config_word(dev, pos + PCI_MSI_FLAGS, val);
pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &val);
printk("\n After pci_enable val = %x", val);
Firstly, is this the right way of accessing the MSI CAP + 2 offset?
I read the value as 0x80, which says only one MSI is supported, but the HW spec says the device supports 16 MSIs, so the expected value is 0x88.
Taking the HW spec at its word, when I enable more than one MSI as follows:
ret = pci_enable_msi_block(dev, 2);
Value of "ret" is 1 which says my request for multiple MSI failed.
When checked in kernel code, this is the path where it fails and this says MSI CAP was 0x00.
292 #define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
670 pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &msgctl);
671 maxvec = 1 << ((msgctl & PCI_MSI_FLAGS_QMASK) >> 1);
672 if (nvec > maxvec)
673 return maxvec;
Am I missing something here?
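As an aside, here is a sketch of reading the same register without hard-coding pos = 0x50, using the stock kernel helpers (pci_find_capability, PCI_CAP_ID_MSI, and the same PCI_MSI_FLAGS_QMASK decode the core code above uses; the surrounding error handling is assumed):
u16 msgctl;
int pos = pci_find_capability(dev, PCI_CAP_ID_MSI);   /* locate the MSI capability instead of assuming 0x50 */
if (!pos)
    return -ENODEV;                                   /* device advertises no MSI capability at all */
pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &msgctl);
printk("MSI control = %#x, multiple message capable = %d vectors\n",
       msgctl, 1 << ((msgctl & PCI_MSI_FLAGS_QMASK) >> 1));
If that prints 1 vector, then the config space as the kernel sees it really does advertise only a single message, whatever the datasheet says, and pci_enable_msi_block() capping the request at 1 is the expected behaviour.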
