Play file audio - redhawksdr

I have a raw data file of a sound recording, with each sample stored as a 16 bit short. I want to play this file through Redhawk.
I have a file_source_s component connected to an AudioSink component.
I was expecting to hear sound from my speakers when starting these components. But when I start both components, I cannot hear any sound.
Here are the file_source_s property values:
filename: name
itemsize: 2
repeat: true
seek: true
seek_point: 0
whence: SEEK_SET
I know:
the problem is not AudioSink. I have tested the AudioSink with the signal generator (SigGen) and I could hear sound through my speakers.
file_source_s is finding the file. When I put in a non-existent file name, file_source_s gives the "No such file or directory" error. I can also see the first 1024 bytes of the file when I plot the short_out port, but the plot does not update.

The AudioSink component uses information from the received SRI (Signal Related Information) to determine the audio's sample rate, as seen on line 156 of the AudioSink component:
int sample_rate = static_cast<int>(rint(1.0/current_sri.xdelta));
It receives the SRI from upstream components, in this case file_source_s.
The component file_source_s is part of the GNUHAWK component package. The GNUHAWK library provides software that enables a GNU Radio block to be integrated into the REDHAWK software framework. Since SRI is a REDHAWK construct and not present in GNU Radio, it does not appear that the file_source_s block gathers enough information via its properties to represent the correct xdelta / sample rate for the audio file.
I'd recommend using a pure REDHAWK component like DataReader, which takes the sample rate as a property.

Related

About DirectShow source filter

I have created (C++, Win10, VS2022) a simple source DirectShow filter. It gets an audio stream from an external source (a file for testing; the network in future) and produces an audio stream on its output pin, which I connect to the speaker.
To do this I have implemented the FillBuffer method for the output pin (CSourceStream) of the filter. Media type - MEDIATYPE_Stream/MEDIASUBTYPE_PCM.
Before being connected, the pin gets info about the media type via SetMediaType (WAVEFORMATEX) and remembers the audio parameters - wBitsPerSample, nSamplesPerSec, nChannels. The audio stream comes from the external source (file or net) to FillBuffer with those parameters. It works fine.
But I need to handle the situation when the external source sends an audio stream to the filter with different parameters (for example, the old stream had 11025 Hz, and the current one 22050 Hz).
Could you help me - which actions and calls should I make in the FillBuffer() method if I receive an audio stream with a changed wBitsPerSample, nSamplesPerSec, or nChannels parameter?
These parameters have already been agreed between my output pin and the input pin of the speaker, and I need to change that agreement correctly.
You need to improve the implementation and handle
Dynamic Format Changes
...
QueryAccept (Downstream) is used when an output pin proposes a format change to its downstream peer, but only if the new format does not require a larger buffer.
This might be non-trivial, because baseline DirectShow filters are not required to support dynamic changes. That is, the ability to change formats depends on your actual pipeline and the implementation of the other filters.
You should also be able to find the SDK helper classes CDynamicSourceStream and CDynamicSource.
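A rough sketch of that QueryAccept path from inside FillBuffer, assuming the new format still fits the existing allocator buffers; the class and member names (CMyStream, m_formatChanged, m_newWfx) are hypothetical, and error handling is elided:

```cpp
// Sketch of a dynamic format change pushed from a CSourceStream-derived
// output pin. m_newWfx is assumed to hold the new WAVEFORMATEX.
HRESULT CMyStream::FillBuffer(IMediaSample *pSample)
{
    if (m_formatChanged)   // set when the external source switches parameters
    {
        CMediaType mt;
        mt.SetType(&MEDIATYPE_Audio);
        mt.SetSubtype(&MEDIASUBTYPE_PCM);
        mt.SetFormatType(&FORMAT_WaveFormatEx);
        mt.SetFormat((BYTE*)&m_newWfx, sizeof(WAVEFORMATEX));
        mt.SetSampleSize(m_newWfx.nBlockAlign);

        // Ask the downstream input pin whether it accepts the new format.
        IPin *pDownstream = GetConnected();
        if (pDownstream && pDownstream->QueryAccept(&mt) == S_OK)
        {
            // Attach the new type to the next sample; the downstream filter
            // picks it up via IMediaSample::GetMediaType.
            pSample->SetMediaType(&mt);
            SetMediaType(&mt);   // update our own pin's notion of the format
            m_formatChanged = false;
        }
        // If QueryAccept fails, you must stop the graph and reconnect the
        // pins instead - a buffer-size change cannot be done in-band this way.
    }

    // ... fill pSample with audio in the (possibly new) format ...
    return S_OK;
}
```

Whether this works end-to-end depends on the downstream renderer actually honoring per-sample media types, which is exactly the caveat in the answer above.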

Sending a webcam input to zoom using a recorded clip

I have an idea that I have been working on, but there are some technical details that I would love to understand before I proceed.
From what I understand, Linux communicates with the underlying hardware through device files in /dev. I was messing around with using my video camera as input to Zoom, and I found someone explaining that I need to create a virtual device and mount it to the output of another program called v4l2loopback.
My questions are
1- How does Zoom detect the webcams available for input? My /dev directory has 2 "files" for video (/dev/video0 and /dev/video1), yet Zoom only detects one webcam. Is the webcam communication done through these video files or not? If yes, why doesn't simply creating one affect Zoom's input choices? If not, how does Zoom detect the input and read the webcam feed?
2- Can I create a virtual device and write a kernel module for it that feeds the input from a local file? I have written a lot of kernel modules, and I know they have read, write, and release methods. I want to parse the video whenever a read request from Zoom is issued. How should the video be encoded? Is it MP4, a raw format, or something else? How fast should I be sending input (in terms of kilobytes)? I think it is a function of my webcam recording specs. If it is 1920x1080, each pixel is 3 bytes (RGB), and it is recording at 20 fps, I can simply calculate how many bytes are generated per second, but how does Zoom expect the input to be fed to it? Assuming that it reads the stream in real time, it should be reading input every few milliseconds. How do I get access to such information?
Thank you in advance. This is a learning experiment, I am just trying to do something fun that I am motivated to do, while learning more about Linux-hardware communication. I am still a beginner, so please go easy on me.
Apparently, there are two types of /dev/video* files. One for the metadata and the other is for the actual stream from the webcam. Creating a virtual device of the same type as the stream in the /dev directory did result in Zoom recognizing it as an independent webcam, even without creating its metadata file. I did finally achieve what I wanted, but I used OBS Studio virtual camera feature that was added after update 26.0.1, and it is working perfectly so far.

IMFTransform SetInputType()/SetOutputType() fails

I'm trying to play back MP3 (and similar audio files) using WASAPI shared mode and a Media Foundation IMFSourceReader on Windows 7. From what I understand, I have to use an IMFTransform between the IMFSourceReader decoding and the WASAPI playback. Everything seems fine apart from when I call SetInputType()/SetOutputType() on the IMFTransform.
The relevant snippets of code are:
MFCreateSourceReaderFromURL(...); // Various test mp3 files
...
sourceReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &reader.audioType);
//sourceReader->GetNativeMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, &reader.audioType);
...
audioClient->GetMixFormat(&player.mixFormat);
...
MFCreateMediaType(&player.audioType);
MFInitMediaTypeFromWaveFormatEx(player.audioType, player.mixFormat, sizeof(WAVEFORMATEX) + player.mixFormat->cbSize);
...
hr = CoCreateInstance(CLSID_CResamplerMediaObject, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&unknown);
ASSERT(SUCCEEDED(hr));
hr = unknown->QueryInterface(IID_PPV_ARGS(&resampler.transform));
ASSERT(SUCCEEDED(hr));
unknown->Release();
hr = resampler.transform->SetInputType(0, inType, 0);
ASSERT(hr != DMO_E_INVALIDSTREAMINDEX);
ASSERT(hr != DMO_E_TYPE_NOT_ACCEPTED);
ASSERT(SUCCEEDED(hr)); // Fails here with hr = 0xc00d36b4
hr = resampler.transform->SetOutputType(0, outType, 0);
ASSERT(hr != DMO_E_INVALIDSTREAMINDEX);
ASSERT(hr != DMO_E_TYPE_NOT_ACCEPTED);
ASSERT(SUCCEEDED(hr)); // Fails here with hr = 0xc00d6d60
I suspect I am misunderstanding how to negotiate the input/output IMFMediaType's between things, and also how to take into consideration that IMFTransform needs to operate on uncompressed data?
It seems odd to me that the output type fails, but maybe that is a knock-on effect of the input type failing first - if I try to set the output type first, it fails too.
In recent versions of Windows you would probably prefer to take advantage of stock functionality which is already there for you.
When you configure the Source Reader object, IMFSourceReader::SetCurrentMediaType lets you specify the media type you want your data in. If you set a media type compatible with the WASAPI requirements, the Source Reader will automatically add a transform to convert the data for you.
However...
Audio resampling support was added to the source reader with Windows 8. In versions of Windows prior to Windows 8, the source reader does not support audio resampling. If you need to resample the audio in versions of Windows earlier than Windows 8, you can use the Audio Resampler DSP.
... which means that indeed you might need to manage the MFT yourself. The input media type for the MFT is supposed to come from IMFSourceReader::GetCurrentMediaType. To instruct the Source Reader to use uncompressed audio, you need to build the media type that a decoder for this type of stream would decode the audio to. For example, if your file is MP3, you would read the number of channels and the sampling rate and build a compatible PCM media type (or take the system decoder and ask it separately for its output media type, which is an even cleaner way). You would set this uncompressed audio media type using IMFSourceReader::SetCurrentMediaType. This media type would also be your input media type for the Audio Resampler MFT. This instructs the Source Reader to add the necessary decoders, and IMFSourceReader::ReadSample would give you converted data.
The output media type for the resampler MFT would be derived from the audio format you obtained from WASAPI and converted using the API calls you mentioned at the top of your code snippet.
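A sketch of that flow on Windows 7, stitched together from the calls already in the question (error handling and Release calls elided; sourceReader, audioClient, and resampler are the objects from the snippet above):

```cpp
// 1. Ask the Source Reader for uncompressed PCM: a partial media type with
//    only the major type and subtype set lets the reader insert the MP3
//    decoder internally and fill in the remaining attributes.
IMFMediaType *partial = NULL;
MFCreateMediaType(&partial);
partial->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
partial->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM);
sourceReader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM,
                                  NULL, partial);

// 2. Read back the complete type the reader settled on - this (not the
//    compressed MP3 type) becomes the resampler's *input* type.
IMFMediaType *inType = NULL;
sourceReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM,
                                  &inType);

// 3. Build the *output* type from the WASAPI mix format, as in the question.
WAVEFORMATEX *mix = NULL;
audioClient->GetMixFormat(&mix);
IMFMediaType *outType = NULL;
MFCreateMediaType(&outType);
MFInitMediaTypeFromWaveFormatEx(outType, mix,
                                sizeof(WAVEFORMATEX) + mix->cbSize);

// 4. Both types are now uncompressed PCM flavors the resampler understands.
resampler->SetInputType(0, inType, 0);
resampler->SetOutputType(0, outType, 0);
```

The key difference from the failing code is step 1/2: the resampler's input type must be the decoded PCM type negotiated with the reader, not the compressed type the reader initially reports.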
To look the error codes up you can use this:
https://www.magnumdb.com/search?q=0xc00d36b4
https://www.magnumdb.com/search?q=0xc00d6d60
Also, you should generally be able to play audio files with less effort using the Media Foundation Media Session API. The Media Session uses the same primitives to build a playback pipeline and takes care of format fitting.
Ah so are you saying I need to create an additional object that is the decoder to fit between the IMFSourceReader and IMFTransform/Resampler?
No. By doing SetCurrentMediaType with a proper media type, you have the Source Reader add the decoder internally so that it can give you already decompressed data. Starting with Windows 8 it is also capable of converting between PCM flavors, but on Windows 7 you need to take care of this yourself with the Audio Resampler DSP.
You can manage a decoder yourself, but you don't need to, since the Source Reader's decoder would do the same more reliably.
You might need a separate decoder just to help you guess what PCM media type the decoder would produce, so that you can request it from the Source Reader. MFTEnumEx is the proper API to look a decoder up.
I am not sure how to decide on or create a suitable decoder object? Do I need to enumerate a list of suitable ones somehow rather than assume specific ones?
The mentioned MFTEnum and MFTEnumEx API calls can enumerate decoders, either all available or filtered by given criteria.
Another way is to use a partial media type (see the relevant explanation and code snippet here: Tutorial: Decoding Audio). A partial media type is a signal about the desired format, requesting that the Media Foundation API supply a primitive that matches this partial type. See the comments below for related discussion links.

External source for sample rate of Redhawk system

We are using Redhawk for an FM modulator. It reads an audio modulating signal from a file, performs the modulation, then sends the modulated data from Redhawk to an external program via TCP/IP for DAC and up-conversion to RF.
The data flows through the following components: rh.FileReader, rh.DataConverter, rh.fastfilter, an FM modulator, rh.DataConverter, and rh.sinksocket. The FM modulator is a custom component.
The rh.sinksocket sends data to an external server program that sends the samples from Redhawk to an FPGA and DAC.
At present the sample rate appears to be controlled via the rh.FileReader component. However, we would like the external DAC to set the sample rate of the system, not the rh.FileReader component of Redhawk, for example via TCP/IP flow control.
Is it possible to use an external DAC as the clock source for a Redhawk waveform?
The property on FileReader dictating the sample rate simply tells it what the sample rate of the provided file is. This is used for the Signal Related Information (SRI) passed to downstream components, and for the output rate if you do not block or throttle. That is, FileReader does not resample the given file to meet the given sample rate.
If you want to resample to a given rate you can try the ArbitraryRateResampler component.
Regarding setting these properties via some external mechanism (TCP/IP), you would want to write a specific component or REDHAWK service that listens for this external event and then makes a configure call to set the property you'd like changed.
If these events are global and can apply to many applications on your domain then a service is the right pattern, if these events are specific to a single application then a component might make more sense.

Correct way to encode Kinect audio with lame.exe

I receive data from a Kinect v2, which is (I believe, information is hard to find) 16kHz mono audio in 32-bit floating point PCM. The data arrives in up to 4 "SubFrames", which contain 256 samples each.
When I send this data to lame.exe with -r -s 16 --bitwidth 32 -m m I get an output containing gaps (supposedly where the second channel should be). These command-line switches should, however, take stereo and downmix it to mono.
I've also tried importing the raw data into Audacity, but I still can't figure out the correct way to get continuous audio out of it.
EDIT: I can get continuous audio when I only save the first SubFrame. The audio still doesn't sound right though.
In the end I went with Ogg Vorbis - a free format, so no problems there either. I use the following command-line switches for oggenc2.exe:
oggenc2.exe --raw-format=3 --raw-chan=1 --raw-rate=16000 - --output=[filename]
