Save an RGB24 sample to bitmap - visual-c++

I'm using Windows Media Foundation to do some messing around with my webcam. I've been able to successfully retrieve a data sample from the webcam and identify that the format is RGB24. Now I'd like to save a single frame as a bitmap. A small snippet of the code I'm using to read a sample from the webcam is below.
IMFSample *pSample = NULL;
hr = pReader->ReadSample(
MF_SOURCE_READER_ANY_STREAM, // Stream index.
0, // Flags.
&streamIndex, // Receives the actual stream index.
&flags, // Receives status flags.
&llTimeStamp, // Receives the time stamp.
&pSample // Receives the sample or NULL.
);
So once I've got pSample populated with an IMFSample how can I save it as a bitmap?

Below is the code snippet I used to save a bitmap from an IMFSample. I've taken a lot of shortcuts, and I'm pretty sure I'm only able to get away with doing things this way because my webcam defaults to returning an RGB24 stream with a 640 x 480 pixel buffer, which means there's no stride padding to worry about in pData.
hr = pReader->ReadSample(
MF_SOURCE_READER_ANY_STREAM, // Stream index.
0, // Flags.
&streamIndex, // Receives the actual stream index.
&flags, // Receives status flags.
&llTimeStamp, // Receives the time stamp.
&pSample // Receives the sample or NULL.
);
wprintf(L"Stream %d (%I64d)\n", streamIndex, llTimeStamp);
HANDLE file;
BITMAPFILEHEADER fileHeader;
BITMAPINFOHEADER fileInfo;
DWORD write = 0;
file = CreateFile(L"sample.bmp",GENERIC_WRITE,0,NULL,CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL); //Sets up the new bmp to be written to
fileHeader.bfType = 19778; //Sets our type to BM or bmp
fileHeader.bfSize = sizeof(BITMAPFILEHEADER) + sizeof(BITMAPINFOHEADER) + 640 * 480 * 3; //Total file size: both headers plus the pixel data
fileHeader.bfReserved1 = 0; //sets the reserves to 0
fileHeader.bfReserved2 = 0;
fileHeader.bfOffBits = sizeof(BITMAPFILEHEADER)+sizeof(BITMAPINFOHEADER); //Sets offbits equal to the size of file and info header
fileInfo.biSize = sizeof(BITMAPINFOHEADER);
fileInfo.biWidth = 640;
fileInfo.biHeight = 480;
fileInfo.biPlanes = 1;
fileInfo.biBitCount = 24;
fileInfo.biCompression = BI_RGB;
fileInfo.biSizeImage = 640 * 480 * (24/8);
fileInfo.biXPelsPerMeter = 2400;
fileInfo.biYPelsPerMeter = 2400;
fileInfo.biClrImportant = 0;
fileInfo.biClrUsed = 0;
WriteFile(file,&fileHeader,sizeof(fileHeader),&write,NULL);
WriteFile(file,&fileInfo,sizeof(fileInfo),&write,NULL);
IMFMediaBuffer *mediaBuffer = NULL;
BYTE *pData = NULL;
pSample->ConvertToContiguousBuffer(&mediaBuffer);
hr = mediaBuffer->Lock(&pData, NULL, NULL);
WriteFile(file, pData, fileInfo.biSizeImage, &write, NULL);
mediaBuffer->Unlock();
mediaBuffer->Release();
CloseHandle(file);
I've included a bit of a discussion here.
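If the camera ever delivers a different resolution, or a format whose rows carry padding, a slightly more defensive variation reads the frame size and stride from the reader's current media type instead of hard-coding 640 x 480, and copies the frame row by row. This is only a rough, untested sketch (error handling omitted; it assumes the sample came from the first video stream and that the format really is RGB24):
IMFMediaType *pType = NULL;
UINT32 width = 0, height = 0;
LONG stride = 0;
hr = pReader->GetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pType);
hr = MFGetAttributeSize(pType, MF_MT_FRAME_SIZE, &width, &height);
hr = MFGetStrideForBitmapInfoHeader(MFVideoFormat_RGB24.Data1, width, &stride); // stride is in bytes and may be negative
pType->Release();
// ...fill BITMAPFILEHEADER / BITMAPINFOHEADER from width and height exactly as above...
IMFMediaBuffer *mediaBuffer = NULL;
BYTE *pData = NULL;
pSample->ConvertToContiguousBuffer(&mediaBuffer);
hr = mediaBuffer->Lock(&pData, NULL, NULL);
LONG rowPitch = (stride < 0) ? -stride : stride;
for (UINT32 y = 0; y < height; y++)
{
    BYTE *pRow = pData + (LONG)y * rowPitch; // copy one row at a time so any padding at the end of a row is skipped
    WriteFile(file, pRow, width * 3, &write, NULL);
}
mediaBuffer->Unlock();
mediaBuffer->Release();
CloseHandle(file);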

Related

TarsosDSP pitch detection from a .wav file - the resulting frequency is always about half

I'm trying to use the TarsosDSP library to detect pitch from a .wav file, and the resulting frequency is always about half of what I expect.
Here is my code.
public class Main {
public static void main(String[] args){
try{
float sampleRate = 44100;
int audioBufferSize = 2048;
int bufferOverlap = 0;
//Create an AudioInputStream from my .wav file
URL soundURL = Main.class.getResource("/DetectPicthFromWav/329.wav");
AudioInputStream stream = AudioSystem.getAudioInputStream(soundURL);
//Convert into TarsosDSP API
JVMAudioInputStream audioStream = new JVMAudioInputStream(stream);
AudioDispatcher dispatcher = new AudioDispatcher(audioStream, audioBufferSize, bufferOverlap);
MyPitchDetector myPitchDetector = new MyPitchDetector();
dispatcher.addAudioProcessor(new PitchProcessor(PitchEstimationAlgorithm.YIN, sampleRate, audioBufferSize, myPitchDetector));
dispatcher.run();
}
catch(FileNotFoundException fne){fne.printStackTrace();}
catch(UnsupportedAudioFileException uafe){uafe.printStackTrace();}
catch(IOException ie){ie.printStackTrace();}
}
}
class MyPitchDetector implements PitchDetectionHandler{
//Here the result of pitch is always less than half.
@Override
public void handlePitch(PitchDetectionResult pitchDetectionResult,
AudioEvent audioEvent) {
if(pitchDetectionResult.getPitch() != -1){
double timeStamp = audioEvent.getTimeStamp();
float pitch = pitchDetectionResult.getPitch();
float probability = pitchDetectionResult.getProbability();
double rms = audioEvent.getRMS() * 100;
String message = String.format("Pitch detected at %.2fs: %.2fHz ( %.2f probability, RMS: %.5f )\n", timeStamp,pitch,probability,rms);
System.out.println(message);
}
}
}
The 329.wav file was generated from the http://onlinetonegenerator.com/ website with a 329 Hz tone.
I don't know why the detected pitch is always 164.5 Hz. Is there any problem in my code?
Well, I don't know what methods you are using, but given that the frequency is exactly halved, could it be a problem of the wrong sample rate being set?
Most operations assume the sample rate at which the signal was originally sampled; maybe you've passed a value (or the default is) half of that?
I just had the same problem with TarsosDSP on Android. For me the answer was that the file from http://onlinetonegenerator.com/ has 32-bit samples instead of 16-bit, which appears to be the default. Reading 32-bit frames as if they were 16-bit doubles the apparent number of samples per cycle, which is exactly why the detected pitch comes out at half (329 Hz reported as 164.5 Hz). Relevant code:
AssetFileDescriptor afd = getAssets().openFd("440.wav"); // 440Hz sine wave
InputStream is = afd.createInputStream();
TarsosDSPAudioFormat audioFormat = new TarsosDSPAudioFormat(
/* sample rate */ 44100,
/* HERE sample size in bits */ 32,
/* number of channels */ 1,
/* signed/unsigned data */ true,
/* big-endian byte order */ false
);
AudioDispatcher dispatcher = new AudioDispatcher(new UniversalAudioInputStream(is, audioFormat), 2048, 0);
PitchDetectionHandler pdh = ...
AudioProcessor p = new PitchProcessor(PitchProcessor.PitchEstimationAlgorithm.FFT_YIN, 44100, 2048, pdh);
dispatcher.addAudioProcessor(p);
new Thread(dispatcher, "Audio Dispatcher").start();

No sound output with WASAPI

I am having trouble with WASAPI. It does not output any sound, even though I have checked the data being written to the buffer.
Since there is no sound output at all, I have no idea how to track down the problem.
The problem is probably somewhere in the following code.
SoundStream::SoundStream() : writtenCursor(0), writeCursor(0), distroy(false)
{
IMMDeviceEnumerator * pEnumerator = nullptr;
HResult(CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, IID_PPV_ARGS(&pEnumerator)));
IMMDevice * pDevice = nullptr;
HResult(pEnumerator->GetDefaultAudioEndpoint(eRender, eMultimedia, &pDevice));
SafeRelease(&pEnumerator);
HResult(pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pAudioClient));
SafeRelease(&pDevice);
WAVEFORMATEXTENSIBLE * pwfx = nullptr;
hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
REFERENCE_TIME hnsRequestedDuration = REFTIMES_PER_SEC * 2;
HResult(pAudioClient->GetMixFormat((WAVEFORMATEX**)&pwfx));
HResult(pAudioClient->Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
hnsRequestedDuration,
0,
(WAVEFORMATEX*)pwfx,
NULL));
pAudioClient->SetEventHandle(hEvent);
channel = (size_t)pwfx->Format.nChannels;
bits = (size_t)pwfx->Format.wBitsPerSample;
validBits = (size_t)pwfx->Samples.wValidBitsPerSample;
frequency = (size_t)pwfx->Format.nSamplesPerSec;
buffer.reshape({ 0, channel, bits >> 3 });
CoTaskMemFree(pwfx);
HResult(pAudioClient->GetBufferSize(&bufferFrameCount));
HResult(pAudioClient->Start());
if (pAudioClient)
{
thread = std::thread([&]()
{
this->Sync();
});
}
}
You could look at my WASAPI.cpp code at http://jdmcox.com (which works fine).
You should also check if the expected wave format is float:
//SubFormat 00000003-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_IEEE_FLOAT
//SubFormat 00000001-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_PCM
GUID G;
WORD V;
WAVEFORMATEX *pwfx = NULL;
bool itsfloat;
pAudioClient->GetMixFormat(&pwfx);
// Did we receive a WAVEFORMATEXTENSIBLE?
if(pwfx->cbSize >= 22) {
G = ((WAVEFORMATEXTENSIBLE*)pwfx)->SubFormat;
V = ((WAVEFORMATEXTENSIBLE*)pwfx)->Samples.wValidBitsPerSample;
if (G.Data1 == 3) itsfloat = true;
else if (G.Data1 == 1) itsfloat = false;
}
You know you received a WAVEFORMATEXTENSIBLE and not a simple WAVEFORMATEX because pwfx->cbSize >= 22.
See more at:
IAudioClient::GetMixFormat
https://learn.microsoft.com/en-us/windows/win32/api/audioclient/nf-audioclient-iaudioclient-getmixformat
WAVEFORMATEXTENSIBLE
https://learn.microsoft.com/en-us/windows/win32/api/mmreg/ns-mmreg-waveformatextensible
You could look at my WASAPI.cpp code at http://jdmcox.com AGAIN.
Now it works in shared mode as well as exclusive mode.
I should note that no conversion of the wave format or the wave data is necessary in shared mode -- Windows takes care of converting both to and from the format it uses to mix waves.
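For completeness: with AUDCLNT_STREAMFLAGS_EVENTCALLBACK nothing will be heard until the render buffer is actually filled from the event-driven loop (the Sync() thread in the question, which isn't shown). A minimal sketch of what that loop could look like, reusing the members from the question (pAudioClient, hEvent, bufferFrameCount, channel, distroy, and the HResult helper) and assuming the mix format turned out to be 32-bit float:
IAudioRenderClient *pRenderClient = nullptr;
HResult(pAudioClient->GetService(__uuidof(IAudioRenderClient), (void**)&pRenderClient));
while (!distroy)
{
    WaitForSingleObject(hEvent, INFINITE); // WASAPI signals this event when it wants more data
    UINT32 padding = 0;
    HResult(pAudioClient->GetCurrentPadding(&padding));
    UINT32 framesToWrite = bufferFrameCount - padding; // free space in the shared buffer, in frames
    BYTE *pData = nullptr;
    HResult(pRenderClient->GetBuffer(framesToWrite, &pData));
    float *samples = (float*)pData; // only valid if the mix format is KSDATAFORMAT_SUBTYPE_IEEE_FLOAT
    for (UINT32 i = 0; i < framesToWrite * channel; ++i)
        samples[i] = 0.0f; // replace the silence with real sample data
    HResult(pRenderClient->ReleaseBuffer(framesToWrite, 0));
}
pRenderClient->Release();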

Hooking IDirect3DDevice9::EndScene method to capture a gameplay video: can not get rid of a text overlay in the recorded video

In fact it is a wild mix of technologies, but the answer to my question (I think) is closest to Direct3D 9. I am hooking into an arbitrary D3D9 application, in most cases a game, and injecting my own code to modify the behavior of the EndScene function. The back buffer is copied into a surface which is set to point to a bitmap in a push source DirectShow filter. The filter samples the bitmaps at 25 fps and streams the video into an .avi file. There is a text overlay shown across the game's screen telling the user about a hot key combination that is supposed to stop gameplay capture, but this overlay is not supposed to show up in the recorded video. Everything works fast and beautifully except for one annoying fact. On random occasions, a frame with the text overlay makes its way into the recorded video. This is not a desired artefact; the end user only wants to see his gameplay in the video and nothing else. I'd love to hear if anyone can share ideas of why this is happening. Here is the source code for the EndScene hook:
using System;
using SlimDX;
using SlimDX.Direct3D9;
using System.Diagnostics;
using DirectShowLib;
using System.Runtime.InteropServices;
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
[System.Security.SuppressUnmanagedCodeSecurity]
[Guid("EA2829B9-F644-4341-B3CF-82FF92FD7C20")]
public interface IScene
{
unsafe int PassMemoryPtr(void* ptr, bool noheaders);
int SetBITMAPINFO([MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]byte[] ptr, bool noheaders);
}
public class Class1
{
object _lockRenderTarget = new object();
public string StatusMess { get; set; }
Surface _renderTarget;
//points to image bytes
unsafe void* bytesptr;
//used to store headers AND image bytes
byte[] bytes;
IFilterGraph2 ifg2;
ICaptureGraphBuilder2 icgb2;
IBaseFilter push;
IBaseFilter compressor;
IScene scene;
IBaseFilter mux;
IFileSinkFilter sink;
IMediaControl media;
bool NeedRunGraphInit = true;
bool NeedRunGraphClean = true;
DataStream s;
DataRectangle dr;
unsafe int EndSceneHook(IntPtr devicePtr)
{
int hr;
using (Device device = Device.FromPointer(devicePtr))
{
try
{
lock (_lockRenderTarget)
{
bool TimeToGrabFrame = false;
//....
//logic based on elapsed milliseconds deciding if it is time to grab another frame
if (TimeToGrabFrame)
{
//First ensure we have a Surface to render target data into
//called only once
if (_renderTarget == null)
{
//Create offscreen surface to use as copy of render target data
using (SwapChain sc = device.GetSwapChain(0))
{
//Att: created in system memory, not in video memory
_renderTarget = Surface.CreateOffscreenPlain(device, sc.PresentParameters.BackBufferWidth, sc.PresentParameters.BackBufferHeight, sc.PresentParameters.BackBufferFormat, Pool.SystemMemory);
} //end using
} // end if
using (Surface backBuffer = device.GetBackBuffer(0, 0))
{
//The following line is where main action takes place:
//Direct3D 9 back buffer gets copied to Surface _renderTarget,
//which has been connected by references to DirectShow's
//bitmap capture filter
//Inside the filter ( code not shown in this listing) the bitmap is periodically
//scanned to create a streaming video.
device.GetRenderTargetData(backBuffer, _renderTarget);
if (NeedRunGraphInit) //ran only once
{
ifg2 = (IFilterGraph2)new FilterGraph();
icgb2 = (ICaptureGraphBuilder2)new CaptureGraphBuilder2();
icgb2.SetFiltergraph(ifg2);
push = (IBaseFilter) new PushSourceFilter();
scene = (IScene)push;
//this way we get bitmapfile and bitmapinfo headers
//ToStream is slow, but run it only once to get the headers
s = Surface.ToStream(_renderTarget, ImageFileFormat.Bmp);
bytes = new byte[s.Length];
s.Read(bytes, 0, (int)s.Length);
hr = scene.SetBITMAPINFO(bytes, false);
//we just supplied the header to the PushSource
//filter. Let's pass reference to
//just image bytes from LockRectangle
dr = _renderTarget.LockRectangle(LockFlags.None);
s = dr.Data;
Result r = _renderTarget.UnlockRectangle();
bytesptr = s.DataPointer.ToPointer();
hr = scene.PassMemoryPtr(bytesptr, true);
//continue building graph
ifg2.AddFilter(push, "MyPushSource");
icgb2.SetOutputFileName(MediaSubType.Avi, @"C:\foo.avi", out mux, out sink);
icgb2.RenderStream(null, null, push, null, mux);
media = (IMediaControl)ifg2;
media.Run();
NeedRunGraphInit = false;
NeedRunGraphClean = true;
StatusMess = "now capturing, press shift-F11 to stop";
} //end if
} // end using backbuffer
} // end if Time to grab frame
} //end lock
} // end try
//It is usually thrown when the user makes game window inactive
//or it is thrown deliberately when time is up, or the user pressed F11 and
//it resulted in stopping a capture.
//If it is thrown for another reason, it is still a good
//idea to stop recording and free the graph
catch (Exception ex)
{
//..
//stop the DirectShow graph and cleanup
} // end catch
//draw overlay
using (SlimDX.Direct3D9.Font font = new SlimDX.Direct3D9.Font(device, new System.Drawing.Font("Times New Roman", 26.0f, FontStyle.Bold)))
{
font.DrawString(null, StatusMess, 20, 100, System.Drawing.Color.FromArgb(255, 255, 255, 255));
}
return device.EndScene().Code;
} // end using device
} //end EndSceneHook
As it happens sometimes, I finally found an answer to this question myself, if anyone is interested. It turned out that backbuffer in some Direct3D9 apps is not necessarily refreshed each time the hooked EndScene is called. Hence, occasionally the backbuffer with the text overlay from the previous EndScene hook call was passed to the DirectShow source filter responsible for collecting input frames. I started stamping each frame with a tiny 3 pixel overlay with known RGB values and checking if this dummy overlay was still present before passing the frame to the DirectShow filter. If the overlay was there, the previously cached frame was passed instead of the current one. This approach effectively removed the text overlay from the video recorded in the DirectShow graph.
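The check itself is small. Sketched below against the native D3D9 API rather than SlimDX (the code above is C#), with a made-up marker color and assuming an X8R8G8B8 back buffer; the real implementation stamps and tests three pixels:
// Hypothetical marker color stamped into a 3 x 1 pixel corner of the back buffer.
const D3DCOLOR MARKER = D3DCOLOR_XRGB(3, 2, 1);
// Stamp the marker right after drawing the status text, so it travels with the overlay.
void StampMarker(IDirect3DDevice9 *device, IDirect3DSurface9 *backBuffer)
{
    RECT r = { 0, 0, 3, 1 };
    device->ColorFill(backBuffer, &r, MARKER);
}
// Called after GetRenderTargetData(): if the copied frame still carries the marker,
// the game has not redrawn the back buffer since the previous hook call.
bool FrameIsStale(IDirect3DSurface9 *renderTargetCopy)
{
    D3DLOCKED_RECT lr;
    if (FAILED(renderTargetCopy->LockRect(&lr, NULL, D3DLOCK_READONLY)))
        return false;
    const BYTE *p = (const BYTE*)lr.pBits; // X8R8G8B8 memory layout: B, G, R, X
    bool stale = (p[0] == 1 && p[1] == 2 && p[2] == 3);
    renderTargetCopy->UnlockRect();
    return stale;
}
In the hook the flow then becomes: copy the back buffer, call FrameIsStale() on the copy and substitute the previously cached frame if it returns true, then draw the text overlay and stamp the marker before calling EndScene.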

How to get a list of all Microsoft Media Foundation Transforms (MFTs) available on a system

I'm a newbie to native development on Windows, but I've been tasked with creating a small app that will list all the transforms (MFTs) available for various video and audio codecs.
Looking at the MSDN documentation, there doesn't seem to be much direct documentation on doing this. The docs I've found indicate that this information is stored in the registry (not sure where), so that could be one avenue.
Is this possible?
Generally how should I do it?
Thanks
EDIT:
It does seem that a call to MFTEnumEx with the MFT_REGISTER_TYPE_INFO parameters set to NULL returns a count of 8:
MFTEnumEx(MFT_CATEGORY_VIDEO_DECODER,MFT_ENUM_FLAG_ALL,NULL, NULL, &ppActivate, &count);
assert(count > 0);
Still have to get the actual values though. But the passed ppActivate param should contain an enumeration of them.
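For example, the friendly name of each MFT that came back can be read from its IMFActivate; a quick sketch (error handling omitted):
for (UINT32 i = 0; i < count; i++)
{
    LPWSTR friendlyName = NULL;
    UINT32 nameLength = 0;
    if (SUCCEEDED(ppActivate[i]->GetAllocatedString(MFT_FRIENDLY_NAME_Attribute, &friendlyName, &nameLength)))
    {
        wprintf(L"MFT %u: %s\n", i, friendlyName);
        CoTaskMemFree(friendlyName);
    }
    ppActivate[i]->Release(); // the caller owns the activation objects...
}
CoTaskMemFree(ppActivate); // ...and the array itself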
EDIT:
It's surprising, but while the count above == 8, there are no video or audio attributes (the video/audio IMFAttributes object is NULL):
IMFAttributes* videoAttributes = NULL;
if(SUCCEEDED(hr)){
hr = pProfile->GetVideoAttributes(&videoAttributes);
//If there are no container attributes set in the transcode profile, the GetVideoAttributes method succeeds and videoAttributes receives NULL.
}
assert(videoAttributes != NULL); //FAILS!
EDIT:
This is a method that pulls all the IMFMediaTypes from the machine (a modified call from the book Developing Microsoft® Media Foundation Applications); I then enumerate over them in the caller:
HRESULT CTranscoder::GetVideoOutputAvailableTypes(
DWORD flags,
CComPtr<IMFCollection>& pTypeCollection)
{
HRESULT hr = S_OK;
IMFActivate** pActivateArray = NULL;
MFT_REGISTER_TYPE_INFO outputType;
UINT32 nMftsFound = 0;
do
{
// create the collection in which we will return the types found
hr = MFCreateCollection(&pTypeCollection);
BREAK_ON_FAIL(hr);
// initialize the structure that describes the output streams that the encoders must
// be able to produce. In this case we want video encoders - so major type is video,
// and we want the specified subtype
outputType.guidMajorType = MFMediaType_Video;
outputType.guidSubtype = MFVideoFormat_WMV3;
// get a collection of MFTs that fit the requested pattern - video encoders,
// with the specified subtype, and using the specified search flags
hr = MFTEnumEx(
MFT_CATEGORY_VIDEO_ENCODER, // type of object to find - video encoders
flags, // search flags
NULL, // match all input types for an encoder
&outputType, // get encoders with specified output type
&pActivateArray,
&nMftsFound);
BREAK_ON_FAIL(hr);
// now that we have an array of activation objects for matching MFTs, loop through
// each of those MFTs, extracting all possible and available formats from each of them
for(UINT32 x = 0; x < nMftsFound; x++)
{
CComPtr<IMFTransform> pEncoder;
UINT32 typeIndex = 0;
// activate the encoder that corresponds to the activation object
hr = pActivateArray[x]->ActivateObject(IID_IMFTransform,
(void**)&pEncoder);
// while we don't have a failure, get each available output type for the MFT
// encoder we keep looping until there are no more available types. If there
// are no more types for the encoder, IMFTransform::GetOutputAvailableType()
// will return MF_E_NO_MORE_TYPES
while(SUCCEEDED(hr))
{
IMFMediaType* pType;
// get the available type for the type index, and increment the typeIndex
// counter
hr = pEncoder->GetOutputAvailableType(0, typeIndex++, &pType);
if(SUCCEEDED(hr))
{
// store the type in the IMFCollection (the collection holds its own reference)
hr = pTypeCollection->AddElement(pType);
pType->Release();
}
}
}
} while(false);
// possible valid errors that may be returned after the previous for loop is done
if(hr == MF_E_NO_MORE_TYPES || hr == MF_E_TRANSFORM_TYPE_NOT_SET)
hr = S_OK;
// if we successfully used MFTEnumEx() to allocate an array of the MFT activation
// objects, then it is our responsibility to release each one and free up the memory
// used by the array
if(pActivateArray != NULL)
{
// release the individual activation objects
for(UINT32 x = 0; x < nMftsFound; x++)
{
if(pActivateArray[x] != NULL)
pActivateArray[x]->Release();
}
// free the memory used by the array
CoTaskMemFree(pActivateArray);
pActivateArray = NULL;
}
return hr;
}
Caller:
hr=transcoder.GetVideoOutputAvailableTypes( MFT_ENUM_FLAG_ALL, availableTypes);
if (FAILED(hr)){
wprintf_s(L"didn't like the printVideoProfiles method");
}
DWORD availableInputTypeCount =0;
if(SUCCEEDED(hr)){
hr= availableTypes->GetElementCount(&availableInputTypeCount);
}
for(DWORD i = 0; i< availableInputTypeCount && SUCCEEDED(hr); i++)
{
//really a IMFMediaType*
IMFAttributes* mediaInterface = NULL;
if(SUCCEEDED(hr)){
hr = availableTypes->GetElement(i, (IUnknown**)&mediaInterface) ;}
if(SUCCEEDED(hr)){
//see http://msdn.microsoft.com/en-us/library/aa376629(v=VS.85).aspx for a list of attributes to pull off the media interface.
GUID majorType;
hr = mediaInterface->GetGUID(MF_MT_MAJOR_TYPE, &majorType);
LPOLESTR majorGuidString = NULL;
hr = StringFromCLSID(majorType,&majorGuidString);
wprintf_s(L"major type: %s \n", majorGuidString);
wprintf_s(L"is a video? %i \n", IsEqualGUID(MFMediaType_Video,majorType));
GUID subType;
if(SUCCEEDED(mediaInterface->GetGUID(MF_MT_SUBTYPE, &subType))){
LPOLESTR minorGuidString = NULL;
if(SUCCEEDED(StringFromCLSID(subType,&minorGuidString)))
wprintf_s(L"subtype: %s \n", minorGuidString);
}
//Contains a DirectShow format GUID for a media type: http://msdn.microsoft.com/en-us/library/dd373477(v=VS.85).aspx
GUID formatType;
if(SUCCEEDED(mediaInterface->GetGUID(MF_MT_AM_FORMAT_TYPE, &formatType))){
LPOLESTR formatTypeString = NULL;
if(SUCCEEDED(StringFromCLSID(formatType,&formatTypeString)))
wprintf_s(L"format type: %s \n", formatTypeString);
}
UINT32 numeratorFrameRate = 0;
UINT32 denominatorFrameRate = 0;
if(SUCCEEDED(MFGetAttributeRatio(mediaInterface, MF_MT_FRAME_RATE, &numeratorFrameRate, &denominatorFrameRate)))
wprintf_s(L"framerate: %i/%i \n", numeratorFrameRate, denominatorFrameRate);
UINT32 widthOfFrame = 0;
UINT32 heightOfFrame = 0;
if(SUCCEEDED(MFGetAttributeSize(mediaInterface, MF_MT_FRAME_SIZE, &widthOfFrame, &heightOfFrame)))
wprintf_s(L"height of frame: %i width of frame: %i \n", heightOfFrame, widthOfFrame);
UINT32 isCompressedP = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_COMPRESSED, &isCompressedP)))
wprintf_s(L"is media compressed? %iu \n", (BOOL)isCompressedP);
BOOL isCompressedP2 = 0;
if(SUCCEEDED((((IMFMediaType*)mediaInterface)->IsCompressedFormat(&isCompressedP2))))
wprintf_s(L"is media compressed2? %i \n", isCompressedP2);
UINT32 fixedSampleSizeP = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_FIXED_SIZE_SAMPLES, &fixedSampleSizeP)))
wprintf_s(L"is fixed sample size? %iu \n", fixedSampleSizeP);
UINT32 sampleSize = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_SAMPLE_SIZE, &sampleSize)))
wprintf_s(L"sample size: %iu \n", sampleSize);
UINT32 averageBitrate = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_AVG_BITRATE, &averageBitrate)))
wprintf_s(L"average bitrate: %u \n", averageBitrate);
UINT32 aspectRatio = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_PAD_CONTROL_FLAGS, &aspectRatio)))
wprintf_s(L"4 by 3? %i 16 by 9? %i None? %i \n", aspectRatio == MFVideoPadFlag_PAD_TO_4x3, MFVideoPadFlag_PAD_TO_16x9 == aspectRatio, MFVideoPadFlag_PAD_TO_None == aspectRatio);
UINT32 drmFlag = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_DRM_FLAGS, &drmFlag)))
wprintf_s(L"requires digital drm: %i requires analog drm: %i requires no drm: %i", drmFlag == MFVideoDRMFlag_DigitallyProtected, drmFlag == MFVideoDRMFlag_AnalogProtected, MFVideoDRMFlag_None == drmFlag);
UINT32 panScanEnabled = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_PAN_SCAN_ENABLED, &panScanEnabled)))
wprintf_s(L"pan/scan enabled? %i", panScanEnabled);
UINT32 maxFrameRateNumerator = 0;
UINT32 maxFrameRateDenominator = 0;
if(SUCCEEDED(MFGetAttributeRatio(mediaInterface, MF_MT_FRAME_RATE_RANGE_MAX, &maxFrameRateNumerator, &maxFrameRateDenominator)))
wprintf_s(L"max framerate range: %i/%i \n", maxFrameRateNumerator, maxFrameRateDenominator);
}
}
It's getting some attributes from the IMFMediaType interface, but not many attributes are set, and
the call to mediaInterface->GetUINT32(MF_MT_COMPRESSED, &isCompressedP) isn't successful while the call to ((IMFMediaType*)mediaInterface)->IsCompressedFormat(&isCompressedP2) is, which makes me wonder if I'm doing it wrong.
This is an old question, but no one should go away unanswered.
As you discovered, MFTEnumEx can give you the list of MFTs, either as a bulk list or filtered by a criterion. Once you have the collection of transforms, you have an IMFActivate for every transform available.
With an IMFActivate in hand, this code snippet shows how you can obtain information about the transform: you can list its attributes or access an attribute of interest by its key, and you can obtain the category and the input and output media types (MFT_INPUT_TYPES_Attributes, MFT_OUTPUT_TYPES_Attributes).
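For instance, the registered output types are stored on the activation object as a blob of MFT_REGISTER_TYPE_INFO structures; a sketch, assuming pActivate is one element of the array MFTEnumEx returned:
UINT32 blobSize = 0;
if (SUCCEEDED(pActivate->GetBlobSize(MFT_OUTPUT_TYPES_Attributes, &blobSize)))
{
    UINT32 typeCount = blobSize / sizeof(MFT_REGISTER_TYPE_INFO);
    MFT_REGISTER_TYPE_INFO *pTypes = new MFT_REGISTER_TYPE_INFO[typeCount];
    if (SUCCEEDED(pActivate->GetBlob(MFT_OUTPUT_TYPES_Attributes, (UINT8*)pTypes, blobSize, NULL)))
    {
        for (UINT32 i = 0; i < typeCount; i++)
        {
            LPOLESTR subtype = NULL;
            if (SUCCEEDED(StringFromCLSID(pTypes[i].guidSubtype, &subtype)))
            {
                wprintf(L"output type %u: %s\n", i, subtype);
                CoTaskMemFree(subtype);
            }
        }
    }
    delete[] pTypes;
}
MFT_INPUT_TYPES_Attributes works the same way for the input side.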
Here is sample code and MFT dump samples:
How to enumerate Media Foundation transforms on your system
Enumerating Media Foundation Transforms (MFTs)

Simplest way to capture raw audio from audio input for real time processing on a mac

What is the simplest way to capture audio from the built-in audio input and be able to read the raw sampled values (as in a .wav) in real time as they come in when requested, like reading from a socket?
Hopefully code that uses one of Apple's frameworks (Audio Queues). The documentation is not very clear, and what I need is very basic.
Try the AudioQueue framework for this. You mainly have to perform 3 steps:
set up an audio format describing how to sample the incoming analog audio
start a new recording AudioQueue with AudioQueueNewInput()
register a callback routine which handles the incoming audio data packets
In step 3 you have a chance to analyze the incoming audio data with AudioQueueGetProperty().
It's roughly like this:
static void HandleAudioCallback (void *aqData,
AudioQueueRef inAQ,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc) {
// Here you examine your audio data
}
static void StartRecording() {
// now let's start the recording
AudioQueueNewInput (&aqData.mDataFormat, // The sampling format to record with (an AudioStreamBasicDescription)
HandleAudioCallback, // Your callback routine
&aqData, // Custom user data handed to your callback
NULL,
kCFRunLoopCommonModes,
0,
&aqData.mQueue); // Your fresh created AudioQueue
AudioQueueStart(aqData.mQueue,
NULL);
}
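Note that the snippet above leaves out the format setup and, more importantly, the buffer handling: an input queue will not fire your callback until you have allocated and enqueued some buffers before starting it, and each buffer has to be enqueued again from the callback. Roughly (a sketch, with an assumed buffer size):
enum { kNumberBuffers = 3 };
UInt32 bufferByteSize = 4096; // pick based on the format and the latency you need
for (int i = 0; i < kNumberBuffers; ++i) {
    AudioQueueBufferRef buffer;
    AudioQueueAllocateBuffer(aqData.mQueue, bufferByteSize, &buffer);
    AudioQueueEnqueueBuffer(aqData.mQueue, buffer, 0, NULL);
}
// ...and at the end of HandleAudioCallback, hand the buffer back to the queue:
AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);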
I suggest the Apple AudioQueue Services Programming Guide for detailed information about how to start and stop the AudioQueue and how to set up all the required objects correctly.
You may also take a closer look at Apple's demo program SpeakHere, but IMHO it is a bit confusing to start with.
It depends on how 'real-time' you need it.
If you need it very crisp, go right down to the bottom level and use Audio Units. That means setting up an INPUT callback. Remember, when this fires you need to allocate your own buffers and then request the audio from the microphone.
That is, don't get fooled by the presence of a buffer pointer in the parameters... it is only there because Apple is using the same function declaration for the input and render callbacks.
Here is a paste out of one of my projects:
OSStatus dataArrivedFromMic(
void * inRefCon,
AudioUnitRenderActionFlags * ioActionFlags,
const AudioTimeStamp * inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * dummy_notused )
{
OSStatus status;
RemoteIOAudioUnit* unitClass = (RemoteIOAudioUnit *)inRefCon;
AudioComponentInstance myUnit = unitClass.myAudioUnit;
AudioBufferList ioData;
{
int kNumChannels = 1; // one channel...
enum {
kMono = 1,
kStereo = 2
};
ioData.mNumberBuffers = kNumChannels;
for (int i = 0; i < kNumChannels; i++)
{
int bytesNeeded = inNumberFrames * sizeof( Float32 );
ioData.mBuffers[i].mNumberChannels = kMono;
ioData.mBuffers[i].mDataByteSize = bytesNeeded;
ioData.mBuffers[i].mData = malloc( bytesNeeded );
}
}
// actually GET the data that arrived
status = AudioUnitRender( (void *)myUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
& ioData );
// take MONO from mic
const int channel = 0;
Float32 * outBuffer = (Float32 *) ioData.mBuffers[channel].mData;
// get a handle to our game object
static KPRing* kpRing = nil;
if ( ! kpRing )
{
//AppDelegate * appDelegate = [UIApplication sharedApplication].delegate;
kpRing = [Game singleton].kpRing;
assert( kpRing );
}
// ... and send it the data we just got from the mic
[ kpRing floatsArrivedFromMic: outBuffer
count: inNumberFrames ];
return status;
}
