I have a video capture card with SDK for Visual C++. Color frames (640 x 480) become available to me at 30 fps in a callback from the SDK. Currently, I am writing the entire image sequence out one at a time as individual bmp files in a separate thread -- that's 108,000 files in an hour, or about 100 GB per hour, which is not manageable. I would like to push these incoming frames to one AVI file instead, with optional compression. Where do I even start? Wading through the MSDN DirectShow documentation has confused me so far. Are there better examples out there? Is OpenCV the answer? I've looked at some examples, but I'm not sure OpenCV would even recognize the card as a capture device, nor do I understand how it even recognizes capture devices in the first place. Also, I'm already getting the frames in, I just need to put them out to AVI in some consumer thread that does not back up my producer thread. Thanks for any help.
I've used CAviFile before. It works pretty well, though I had to tweak it a bit to allow the user to pick the codec; I took that code from CAviGenerator. The interface for CAviFile is very simple; here's some sample code:
CAviFile *Avi = new CAviFile(fileName.c_str(), 0, 10);   // file name, codec FOURCC, frame rate (as I recall from the CAviFile sources)
HRESULT res = Avi->AppendNewFrame(Width, Height, ImageBuffer, BitsPerPixel);
if (FAILED(res))
{
    std::cout << "Error recording AVI: " << Avi->GetLastErrorMessage() << std::endl;
}
delete Avi;   // closes and finalizes the file
Obviously you have to ensure your ImageBuffer contains data in the right format, etc., but once I got that sorted out it worked great.
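On your producer/consumer worry: what worked for me was a plain queue between the SDK callback and the writer thread, so the callback only ever copies a frame and returns. A minimal sketch, assuming C++11 threading and the CAviFile names above (OnFrame and WriterThread are made-up names):

#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

std::queue<std::vector<unsigned char>> g_frames;
std::mutex g_mtx;
std::condition_variable g_cv;
bool g_done = false;

// called from the SDK's capture callback (producer): copy and return fast
void OnFrame(const unsigned char* data, size_t bytes)
{
    std::lock_guard<std::mutex> lock(g_mtx);
    g_frames.push(std::vector<unsigned char>(data, data + bytes));
    g_cv.notify_one();
}

// consumer thread body: drain the queue into the AVI writer
void WriterThread(CAviFile* avi, int width, int height, int bpp)
{
    for (;;)
    {
        std::unique_lock<std::mutex> lock(g_mtx);
        g_cv.wait(lock, [] { return !g_frames.empty() || g_done; });
        if (g_frames.empty()) break;               // done and fully drained
        std::vector<unsigned char> frame = std::move(g_frames.front());
        g_frames.pop();
        lock.unlock();                             // don't hold the lock while encoding
        avi->AppendNewFrame(width, height, frame.data(), bpp);
    }
    // to stop: set g_done under the lock, notify, then join this thread
}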
You can either use Video for Windows or DirectShow. Each comes with its own set of codecs (and can be extended).
Though Microsoft considers VfW deprecated, it is still perfectly usable and is easier to set up than DirectShow.
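To give a feel for the VfW route, here is a minimal hedged sketch that appends raw 24-bit frames to an AVI via the AVIFile API (error handling trimmed; remember DIBs are bottom-up; for compression you would wrap the stream with AVIMakeCompressedStream before writing):

#include <windows.h>
#include <vfw.h>
#pragma comment(lib, "vfw32.lib")

bool WriteAvi(LPCTSTR path, BYTE* const* frames, int frameCount, int w, int h)
{
    AVIFileInit();
    PAVIFILE file = NULL;
    if (FAILED(AVIFileOpen(&file, path, OF_WRITE | OF_CREATE, NULL)))
        { AVIFileExit(); return false; }

    AVISTREAMINFO si = {0};
    si.fccType = streamtypeVIDEO;
    si.dwScale = 1;
    si.dwRate  = 30;                       // 30 fps
    si.dwSuggestedBufferSize = w * h * 3;
    SetRect(&si.rcFrame, 0, 0, w, h);

    PAVISTREAM stream = NULL;
    AVIFileCreateStream(file, &stream, &si);

    BITMAPINFOHEADER bih = {0};            // uncompressed 24-bit BGR
    bih.biSize = sizeof(bih);
    bih.biWidth = w;  bih.biHeight = h;
    bih.biPlanes = 1; bih.biBitCount = 24;
    bih.biCompression = BI_RGB;
    bih.biSizeImage = w * h * 3;
    AVIStreamSetFormat(stream, 0, &bih, sizeof(bih));

    for (int i = 0; i < frameCount; ++i)
        AVIStreamWrite(stream, i, 1, frames[i], bih.biSizeImage,
                       AVIIF_KEYFRAME, NULL, NULL);

    AVIStreamRelease(stream);
    AVIFileRelease(file);
    AVIFileExit();
    return true;
}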
Well, you need to attach an AVI Mux (CLSID_AviDest) to your capture card, then attach a File Writer (CLSID_FileWriter), and it will write everything out for you.
Admittedly, setting up the capture graph is not necessarily easy, as DirectShow makes you jump through a million and one hoops.
It's much easier using the ICaptureGraphBuilder2 interface. Thankfully, Microsoft has given a really nice rundown of how to do this:
http://msdn.microsoft.com/en-us/library/dd318627.aspx
Adding an encoder is not easy though, and it is conveniently glossed over in that link.
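For orientation, here is a hedged sketch of how those pieces fit together with ICaptureGraphBuilder2; pCap is your capture filter and pEncoder a compressor you created (for instance from the enumeration below), and error checks are trimmed:

#include <dshow.h>
#pragma comment(lib, "strmiids.lib")

HRESULT BuildCaptureGraph(IBaseFilter* pCap, IBaseFilter* pEncoder)
{
    IGraphBuilder* pGraph = NULL;
    ICaptureGraphBuilder2* pBuilder = NULL;
    CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER,
                     IID_IGraphBuilder, (void**)&pGraph);
    CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC_SERVER,
                     IID_ICaptureGraphBuilder2, (void**)&pBuilder);
    pBuilder->SetFiltergraph(pGraph);

    pGraph->AddFilter(pCap, L"Capture");
    pGraph->AddFilter(pEncoder, L"Encoder");

    // SetOutputFileName creates the AVI mux and the file writer and connects them.
    IBaseFilter* pMux = NULL;
    HRESULT hr = pBuilder->SetOutputFileName(&MEDIASUBTYPE_Avi, L"out.avi", &pMux, NULL);
    if (SUCCEEDED(hr))
        hr = pBuilder->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,
                                    pCap, pEncoder, pMux);
    // remember to Release the interfaces and run the graph via IMediaControl
    return hr;
}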
Here is an example I wrote for an MFC app of mine that enumerates all the video compressors on a system.
BOOL LiveInputDlg::EnumerateVideoCompression()
{
    CComboBox* pVideoCompression = (CComboBox*)GetDlgItem( IDC_COMBO_VIDEOCOMPRESSION );
    pVideoCompression->SetExtendedUI( TRUE );
    pVideoCompression->SetCurSel( pVideoCompression->AddString( _T( "<None>" ) ) );

    // Create the system device enumerator and ask for the video compressor category.
    ICreateDevEnum* pDevEnum = NULL;
    IEnumMoniker* pEnum = NULL;
    HRESULT hr = CoCreateInstance( CLSID_SystemDeviceEnum, NULL, CLSCTX_INPROC_SERVER, IID_ICreateDevEnum, (void**)&pDevEnum );
    if ( FAILED( hr ) )
    {
        return FALSE;
    }
    hr = pDevEnum->CreateClassEnumerator( CLSID_VideoCompressorCategory, &pEnum, 0 );
    pDevEnum->Release();
    if ( FAILED( hr ) )
    {
        return FALSE;
    }
    if ( pEnum )  // S_FALSE (empty category) leaves pEnum NULL
    {
        IMoniker* pMoniker = NULL;
        hr = pEnum->Next( 1, &pMoniker, NULL );
        while ( hr == S_OK )
        {
            // Read a display name for the compressor from its property bag.
            IPropertyBag* pPropertyBag = NULL;
            hr = pMoniker->BindToStorage( NULL, NULL, IID_IPropertyBag, (void**)&pPropertyBag );
            if ( FAILED( hr ) )
            {
                pMoniker->Release();
                pEnum->Release();
                return FALSE;
            }
            VARIANT varName;
            VariantInit( &varName );
            hr = pPropertyBag->Read( L"Description", &varName, NULL );
            if ( FAILED( hr ) )
            {
                hr = pPropertyBag->Read( L"FriendlyName", &varName, NULL );
                if ( FAILED( hr ) )
                {
                    pPropertyBag->Release();
                    pMoniker->Release();
                    pEnum->Release();
                    return FALSE;
                }
            }
            {
                USES_CONVERSION;
                TCHAR* pName = OLE2T( varName.bstrVal );
                // Keep the moniker alive as item data; bind it to IBaseFilter
                // later, when the user actually picks this compressor.
                int index = pVideoCompression->AddString( pName );
                pVideoCompression->SetItemDataPtr( index, pMoniker );
                VariantClear( &varName );
                pPropertyBag->Release();
            }
            hr = pEnum->Next( 1, &pMoniker, NULL );
        }
        pEnum->Release();
    }
    return TRUE;
}
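When the user picks an entry, the moniker stored as item data turns into the encoder filter; roughly (names assumed from the dialog above; the <None> entry has no moniker attached):

IMoniker* pMoniker = (IMoniker*)pVideoCompression->GetItemDataPtr( pVideoCompression->GetCurSel() );
IBaseFilter* pEncoder = NULL;
if ( pMoniker )
    pMoniker->BindToObject( NULL, NULL, IID_IBaseFilter, (void**)&pEncoder );
// then add pEncoder to the graph, e.g. via the RenderStream sketch above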
Good Luck! :)
Related
I have an application that needs to encode some audio files in MP3 format at a 320 kbps bitrate. I'm using DirectShow to accomplish this, together with the lameDS-3.99.5 DirectShow filter.
The problem is that even if I set the lameDS-3.99.5 filter in GraphEdit to use Constant Bitrate - 320 kbps, the encoding is always done at 128 kbps.
What I need is a way to set the bitrate of the lameDS-3.99.5 DirectShow filter programmatically.
I have searched all over the web but couldn't find an example of doing this.
Some advice I found is to use the IAMStreamConfig interface on the output pin of the filter, but I couldn't find any code example that achieves this.
Thank you for helping me.
@Roman
Thank you very much for your reply.
Please find my function below:
HRESULT CDShowGraph::AddLAMEMP3EncoderFilter( CComPtr<IBaseFilter>& spCodec )
{
    HRESULT hr = spCodec.CoCreateInstance( IID_LAMEAudioEncoder_Filter );
    if ( FAILED(hr) )
    {
        return hr;
    }
    // Walk the encoder's pins looking for the output pin.
    IEnumPins *pEnum = NULL;
    IPin *pPin = NULL;
    hr = spCodec->EnumPins(&pEnum);
    if ( FAILED(hr) )
    {
        return hr;
    }
    while (S_OK == pEnum->Next(1, &pPin, NULL))
    {
        PIN_DIRECTION pinDir;
        hr = pPin->QueryDirection(&pinDir);
        if (SUCCEEDED(hr) && pinDir == PINDIR_OUTPUT)
        {
            // Try to set the output format through IAMStreamConfig.
            IAMStreamConfig * pamconfig = NULL;
            hr = pPin->QueryInterface(IID_IAMStreamConfig, (void **)&pamconfig);
            if (SUCCEEDED(hr))
            {
                AM_MEDIA_TYPE *pmt = NULL;
                hr = pamconfig->GetFormat(&pmt);
                if (SUCCEEDED(hr))
                {
                    //audio_set_capformat(pmt);
                    WAVEFORMATEX *format = (WAVEFORMATEX *) pmt->pbFormat;
                    format->nAvgBytesPerSec = 320000 / 8;  // 320 kbps = 40000 bytes/sec
                    hr = pamconfig->SetFormat(pmt);
                    DeleteMediaType(pmt);
                }
                pamconfig->Release();  // only released when QueryInterface succeeded
            }
        }
        pPin->Release();
    }
    pEnum->Release();
    hr = m_ptrGraph->m_pGraphBuilder->AddFilter( spCodec, NULL );
    return hr;
}
Execution gets as far as:
AM_MEDIA_TYPE *pmt = NULL;
hr = pamconfig->GetFormat(&pmt);
The call itself returns success, but pmt comes back as 0x00000000 (Bad Ptr), so when I inspect the format in the debugger, every field shows:
wFormatTag CXX0030: Error: expression cannot be evaluated
nChannels CXX0030: Error: expression cannot be evaluated
nSamplesPerSec CXX0030: Error: expression cannot be evaluated
nAvgBytesPerSec CXX0030: Error: expression cannot be evaluated
nBlockAlign CXX0030: Error: expression cannot be evaluated
wBitsPerSample CXX0030: Error: expression cannot be evaluated
cbSize CXX0030: Error: expression cannot be evaluated
Using IAMStreamConfig::SetFormat - Sample rate and bit rate of a wave file created by DirectShow
There are multiple code snippets on Stack Overflow that use IAMStreamConfig to set a video media type; an audio format is set the same way, and another example of this is here.
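For reference, the usual IAMStreamConfig pattern on an audio output pin looks like the sketch below: enumerate the pin's capabilities and select the media type whose WAVEFORMATEX carries the bitrate you want (320 kbps is 40000 bytes per second). This is a hedged sketch with a hypothetical helper name, and it only works if the pin really exposes IAMStreamConfig; DeleteMediaType is the DirectShow base-class helper already used in the question's code:

HRESULT Select320kbps(IPin* pPin)
{
    IAMStreamConfig* pConfig = NULL;
    HRESULT hr = pPin->QueryInterface(IID_IAMStreamConfig, (void**)&pConfig);
    if (FAILED(hr)) return hr;                      // pin doesn't support it

    int count = 0, size = 0;
    hr = pConfig->GetNumberOfCapabilities(&count, &size);
    for (int i = 0; SUCCEEDED(hr) && i < count; i++)
    {
        AM_MEDIA_TYPE* pmt = NULL;
        AUDIO_STREAM_CONFIG_CAPS caps;
        if (SUCCEEDED(pConfig->GetStreamCaps(i, &pmt, (BYTE*)&caps)))
        {
            WAVEFORMATEX* wfx = (WAVEFORMATEX*)pmt->pbFormat;
            if (wfx && wfx->nAvgBytesPerSec == 40000)   // 320 kbps
            {
                hr = pConfig->SetFormat(pmt);
                DeleteMediaType(pmt);
                break;
            }
            DeleteMediaType(pmt);
        }
    }
    pConfig->Release();
    return hr;
}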
This question (and the information here, for an older version) says that the LAME encoder does not support IAMStreamConfig, but there is another way to specify the output details: Configure LAME MP3 encoder in DirectShow application using IAudioEncoderProperties.
I am having trouble with WASAPI. It does not output any sound, and I have checked the data being written to the buffer.
Since it produces no sound at all, I have no idea how to track the problem down.
The problem may be somewhere in the following code.
SoundStream::SoundStream() : writtenCursor(0), writeCursor(0), distroy(false)
{
    // Find the default render endpoint and activate an IAudioClient on it.
    IMMDeviceEnumerator * pEnumerator = nullptr;
    HResult(CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, IID_PPV_ARGS(&pEnumerator)));
    IMMDevice * pDevice = nullptr;
    HResult(pEnumerator->GetDefaultAudioEndpoint(eRender, eMultimedia, &pDevice));
    SafeRelease(&pEnumerator);
    HResult(pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pAudioClient));
    SafeRelease(&pDevice);

    // Initialize in shared, event-driven mode using the engine's mix format.
    WAVEFORMATEXTENSIBLE * pwfx = nullptr;
    hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
    REFERENCE_TIME hnsRequestedDuration = REFTIMES_PER_SEC * 2;  // 100-ns units: a two-second buffer request
    HResult(pAudioClient->GetMixFormat((WAVEFORMATEX**)&pwfx));
    HResult(pAudioClient->Initialize(
        AUDCLNT_SHAREMODE_SHARED,
        AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
        hnsRequestedDuration,
        0,
        (WAVEFORMATEX*)pwfx,
        NULL));
    pAudioClient->SetEventHandle(hEvent);

    // Cache the format details before freeing the mix format.
    channel = (size_t)pwfx->Format.nChannels;
    bits = (size_t)pwfx->Format.wBitsPerSample;
    validBits = (size_t)pwfx->Samples.wValidBitsPerSample;
    frequency = (size_t)pwfx->Format.nSamplesPerSec;
    buffer.reshape({ 0, channel, bits >> 3 });
    CoTaskMemFree(pwfx);

    HResult(pAudioClient->GetBufferSize(&bufferFrameCount));
    HResult(pAudioClient->Start());
    if (pAudioClient)
    {
        // Feed the device from a worker thread.
        thread = std::thread([&]()
        {
            this->Sync();
        });
    }
}
You could look at my WASAPI.cpp code at http://jdmcox.com (which works fine).
You should also check if the expected wave format is float:
//SubFormat 00000003-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_IEEE_FLOAT
//SubFormat 00000001-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_PCM
GUID G;
WORD V;
WAVEFORMATEX *pwfx = NULL;
bool itsfloat;
pAudioClient->GetMixFormat(&pwfx);
// Did we receive a WAVEFORMATEXTENSIBLE?
if (pwfx->cbSize >= 22) {
G = ((WAVEFORMATEXTENSIBLE*)pwfx)->SubFormat;
V = ((WAVEFORMATEXTENSIBLE*)pwfx)->Samples.wValidBitsPerSample;
if (G.Data1 == 3) itsfloat = true;
else if (G.Data1 == 1) itsfloat = false;
}
You know you received a WAVEFORMATEXTENSIBLE and not a simple WAVEFORMATEX because pwfx->cbSize >= 22.
See more at:
IAudioClient::GetMixFormat
https://learn.microsoft.com/en-us/windows/win32/api/audioclient/nf-audioclient-iaudioclient-getmixformat
WAVEFORMATEXTENSIBLE
https://learn.microsoft.com/en-us/windows/win32/api/mmreg/ns-mmreg-waveformatextensible
You could look at my WASAPI.cpp code at http://jdmcox.com AGAIN.
Now it works in shared mode as well as exclusive mode.
I should note that no conversion of the wave format or wave data is necessary in shared mode: Windows takes care of converting both to and from the format it uses to mix waves.
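For what it's worth, here is a hedged sketch of the render loop the Sync() thread above presumably needs: get an IAudioRenderClient from the client, then on each event fill whatever space is free (float samples assumed, per the mix-format check above; pAudioClient, hEvent, bufferFrameCount and channel are the names from the question's code):

IAudioRenderClient* pRenderClient = NULL;
pAudioClient->GetService(__uuidof(IAudioRenderClient), (void**)&pRenderClient);

while (WaitForSingleObject(hEvent, 2000) == WAIT_OBJECT_0)
{
    UINT32 padding = 0;
    pAudioClient->GetCurrentPadding(&padding);
    UINT32 framesFree = bufferFrameCount - padding;
    if (framesFree == 0) continue;

    BYTE* pData = NULL;
    if (SUCCEEDED(pRenderClient->GetBuffer(framesFree, &pData)))
    {
        float* samples = (float*)pData;
        // ... write framesFree * channel float samples here ...
        pRenderClient->ReleaseBuffer(framesFree, 0);
    }
}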
I want to capture images from a webcam without any post-processing, that is, NO auto focus, exposure correction, white balance and the like. Basically I want to capture continuous frames from the webcam, compare each frame with the previous one, and save a frame to disk only when there is an actual change. Because of the post-processing, almost every frame comes back as different for me.
Code so far:
using namespace cv;
// exact per-pixel equality test for two same-size 3-channel images
bool identical(cv::Mat m1, cv::Mat m2)
{
if ( m1.cols != m2.cols || m1.rows != m2.rows || m1.channels() != m2.channels() || m1.type() != m2.type() )
{
return false;
}
for ( int i = 0; i < m1.rows; i++ )
{
for ( int j = 0; j < m1.cols; j++ )
{
if ( m1.at<Vec3b>(i, j) != m2.at<Vec3b>(i, j) )
{
return false;
}
}
}
return true;
}
int main() {
    // old C-API capture from camera index 1
    CvCapture* capture = cvCaptureFromCAM( 1 );
    int i = 0, firsttime = 0;
    char filename[40];
    Mat img1, img2;
    if ( !capture ) {
        fprintf( stderr, "ERROR: capture is NULL \n" );
        getchar();
        return -1;
    }
    cvNamedWindow( "img1", CV_WINDOW_AUTOSIZE );
    cvNamedWindow( "img2", CV_WINDOW_AUTOSIZE );
    while ( 1 ) {
        IplImage* frame = cvQueryFrame( capture );
        if ( !frame ) {
            fprintf( stderr, "ERROR: frame is null...\n" );
            getchar();
            break;
        }
        img1 = frame;               // wrap the current frame (no copy)
        if ( firsttime == 0 ) {
            img2 = frame;           // seed the "previous" frame on the first pass
            fprintf( stderr, "firsttime\n" );
        }
        if ( (cvWaitKey(10) & 255) == 27 ) break;   // ESC quits
        i++;
        sprintf(filename, "D:\\testimg\\img%d.jpg", i);
        cv::cvtColor(img1, img1, CV_BGR2GRAY);
        imshow( "img1", img1 );
        imshow( "img2", img2 );
        imwrite(filename, img1);    // save the current frame
        if ( identical(img1, img2) )
        {
            //write to diff path
        }
        img2 = imread(filename, 1); // reload the just-saved (lossy) jpg as the new "previous" frame
        firsttime = 1;
    }
    // Release the capture device housekeeping
    cvReleaseCapture( &capture );
    return 0;
}
While you're at it, I'd be grateful if you could suggest a workaround using another frame-comparison approach as well :)
I had this problem, and the only solution I found was to write a program based on DirectShow (in case you're using Windows), so no OpenCV code at all.
With a bit of luck, you can get the properties page of your camera and switch things off there:
VideoCapture cap(0);
cap.set(CV_CAP_PROP_SETTINGS,1);
And please, skip the C API in favour of C++; it'll go away soon.
Forgot to mention: you can change the camera settings from VLC as well.
@Prince, sorry, I've been looking for my DirectShow code but couldn't find it, and I don't think it would help anyway, because I used it with a DeckLink (Blackmagic Design) card. Since I had never done that before, it was pretty hard. My suggestion is to try GraphEditPlus:
http://www.infognition.com/GraphEditPlus/
It helps a lot, and it's easy to use!
Good luck!
If you just wish to capture frames when there is an actual change, try background subtraction algorithms. Also, instead of just subtracting subsequent frames, use one of the many algorithms already implemented for you in OpenCV; they are much more robust to changes in lighting conditions etc. than vanilla background subtraction.
In Python:
backsub = cv2.BackgroundSubtractorMOG2(history=10000,varThreshold=100)
fgmask = backsub.apply(frame, None, 0.01)
Here, frame is each picture read from your webcam stream.
Google for the corresponding function in C++.
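A hedged C++ equivalent using the OpenCV 2.4-era API (in OpenCV 3+ you would use cv::createBackgroundSubtractorMOG2 instead); the save threshold is an arbitrary illustration:

#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap(0);
    cv::BackgroundSubtractorMOG2 backsub(10000, 100);   // history, varThreshold
    cv::Mat frame, fgmask;
    while (cap.read(frame))
    {
        backsub(frame, fgmask, 0.01);                   // same learning rate as above
        // save only when enough pixels actually changed
        if (cv::countNonZero(fgmask) > (int)(frame.total() / 100))
        {
            // cv::imwrite(...) here
        }
        if (cv::waitKey(10) == 27) break;               // ESC quits
    }
    return 0;
}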
I'm new to C++ and trying to add OpenCV into Microsoft's Kinect samples. I was able to do it for the ColorBasics-D2D sample by modifying this function
void CColorBasics::ProcessColor()
{
HRESULT hr;
NUI_IMAGE_FRAME imageFrame;
// Attempt to get the color frame
hr = m_pNuiSensor->NuiImageStreamGetNextFrame(m_pColorStreamHandle, 0, &imageFrame);
if (FAILED(hr))
{
return;
}
INuiFrameTexture * pTexture = imageFrame.pFrameTexture;
NUI_LOCKED_RECT LockedRect;
// Lock the frame data so the Kinect knows not to modify it while we're reading it
pTexture->LockRect(0, &LockedRect, NULL, 0);
// Make sure we've received valid data
if (LockedRect.Pitch != 0)
{
BYTE * pBuffer = (BYTE*) LockedRect.pBits;
cvSetData(img,(BYTE*) pBuffer, img->widthStep);
Mat m(img);     // wrap the IplImage data (no copy)
Mat hsv;
vector<Mat> mv(3, Mat(cvSize(640,480), CV_8UC1));
cvtColor(m, hsv, CV_BGR2HSV);
cvtColor(hsv, m, CV_HSV2BGR);
IplImage iplimg(m);
cvNamedWindow("rgb",1);
cvShowImage("rgb",&iplimg);
// Draw the data with Direct2D
m_pDrawColor->Draw(static_cast<BYTE *>(LockedRect.pBits), LockedRect.size);
// If the user pressed the screenshot button, save a screenshot
if (m_bSaveScreenshot)
{
WCHAR statusMessage[cStatusMessageMaxLen];
// Retrieve the path to My Photos
WCHAR screenshotPath[MAX_PATH];
GetScreenshotFileName(screenshotPath, _countof(screenshotPath));
// Write out the bitmap to disk
hr = SaveBitmapToFile(static_cast<BYTE *>(LockedRect.pBits), cColorWidth, cColorHeight, 32, screenshotPath);
if (SUCCEEDED(hr))
{
// Set the status bar to show where the screenshot was saved
StringCchPrintf( statusMessage, cStatusMessageMaxLen, L"Screenshot saved to %s", screenshotPath);
}
else
{
StringCchPrintf( statusMessage, cStatusMessageMaxLen, L"Failed to write screenshot to %s", screenshotPath);
}
SetStatusMessage(statusMessage);
// toggle off so we don't save a screenshot again next frame
m_bSaveScreenshot = false;
}
}
// We're done with the texture so unlock it
pTexture->UnlockRect(0);
// Release the frame
m_pNuiSensor->NuiImageStreamReleaseFrame(m_pColorStreamHandle, &imageFrame);
}
This works fine. However, when I add the same thing to the SkeletalViewer example, it just displays an empty window.
/// <summary>
/// Handle new color data
/// </summary>
/// <returns>true if a frame was processed, false otherwise</returns>
bool CSkeletalViewerApp::Nui_GotColorAlert( )
{
NUI_IMAGE_FRAME imageFrame;
bool processedFrame = true;
HRESULT hr = m_pNuiSensor->NuiImageStreamGetNextFrame( m_pVideoStreamHandle, 0, &imageFrame );
if ( FAILED( hr ) )
{
return false;
}
INuiFrameTexture * pTexture = imageFrame.pFrameTexture;
NUI_LOCKED_RECT LockedRect;
pTexture->LockRect( 0, &LockedRect, NULL, 0 );
if ( LockedRect.Pitch != 0 )
{
BYTE * pBuffer = (BYTE*) LockedRect.pBits;
cvSetData(img,(BYTE*) pBuffer, img->widthStep);
Mat m(img);
IplImage iplimg(m);
cvNamedWindow("rgb",1);
cvShowImage("rgb",&iplimg);
m_pDrawColor->Draw( static_cast<BYTE *>(LockedRect.pBits), LockedRect.size );
}
else
{
OutputDebugString( L"Buffer length of received texture is bogus\r\n" );
processedFrame = false;
}
pTexture->UnlockRect( 0 );
m_pNuiSensor->NuiImageStreamReleaseFrame( m_pVideoStreamHandle, &imageFrame );
return processedFrame;
}
I'm not sure why the same code doesn't work in this example. I'm using Visual Studio 2010 and OpenCV 2.4.2.
Thanks
Figured it out. cvShowImage never gets a chance to paint unless HighGUI's message loop runs, so adding a waitKey call fixes it. Changed it to this:
if ( LockedRect.Pitch != 0 )
{
BYTE * pBuffer = static_cast<BYTE *>(LockedRect.pBits);
cvSetData(img,(BYTE*) pBuffer, img->widthStep);
Mat m(img);
IplImage iplimg(m);
cvNamedWindow("rgb",1);
cvShowImage("rgb",&iplimg);
waitKey(1);   // give HighGUI a chance to process its events so the frame is drawn
m_pDrawColor->Draw( static_cast<BYTE *>(LockedRect.pBits), LockedRect.size );
}
What is the simplest way to capture audio from the built-in audio input and be able to read the raw sampled values (as in a .wav) in real time as they come in when requested, like reading from a socket?
Hopefully with code that uses one of Apple's frameworks (Audio Queues). The documentation is not very clear, and what I need is very basic.
Try the AudioQueue framework for this. You mainly have to perform 3 steps:
set up an audio format describing how to sample the incoming analog audio
start a new recording AudioQueue with AudioQueueNewInput()
register a callback routine which handles the incoming audio data packets
Alongside step 3 you also have a chance to inspect the queue's properties with AudioQueueGetProperty().
It's roughly like this:
static void HandleAudioCallback (void *inUserData,
                                 AudioQueueRef inAQ,
                                 AudioQueueBufferRef inBuffer,
                                 const AudioTimeStamp *inStartTime,
                                 UInt32 inNumPackets,
                                 const AudioStreamPacketDescription *inPacketDesc) {
    // Here you examine your audio data (inBuffer->mAudioData,
    // inBuffer->mAudioDataByteSize), then hand the buffer back to the
    // queue so it can be filled again:
    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
}
static void StartRecording() {
    // now let's start the recording
    AudioQueueNewInput (&aqData.mDataFormat, // how to sample (an AudioStreamBasicDescription)
                        HandleAudioCallback, // your callback routine
                        &aqData,             // user data handed to the callback
                        NULL,                // run loop (NULL = internal thread)
                        kCFRunLoopCommonModes,
                        0,
                        &aqData.mQueue);     // your freshly created AudioQueue
    // The queue only records into buffers you give it, so allocate and
    // enqueue a few before starting (size and count here are illustrative):
    for (int i = 0; i < 3; ++i) {
        AudioQueueBufferRef buffer;
        AudioQueueAllocateBuffer(aqData.mQueue, 8192, &buffer);
        AudioQueueEnqueueBuffer(aqData.mQueue, buffer, 0, NULL);
    }
    AudioQueueStart(aqData.mQueue,
                    NULL);
}
I suggest the Apple AudioQueue Services Programming Guide for detailed information about how to start and stop the AudioQueue and how to correctly set up all the required objects.
You may also have a closer look at Apple's demo program SpeakHere. But this is IMHO a bit confusing to start with.
It depends how 'real-time' you need it.
If you need it very crisp, go right down to the bottom level and use Audio Units. That means setting up an INPUT callback. Remember, when this fires you need to allocate your own buffers and then request the audio from the microphone.
I.e. don't get fooled by the presence of a buffer pointer in the parameters... it is only there because Apple is using the same function declaration for the input and render callbacks.
Here is a paste from one of my projects:
OSStatus dataArrivedFromMic(
void * inRefCon,
AudioUnitRenderActionFlags * ioActionFlags,
const AudioTimeStamp * inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * dummy_notused )
{
OSStatus status;
RemoteIOAudioUnit* unitClass = (RemoteIOAudioUnit *)inRefCon;
AudioComponentInstance myUnit = unitClass.myAudioUnit;
AudioBufferList ioData;
{
int kNumChannels = 1; // one channel...
enum {
kMono = 1,
kStereo = 2
};
ioData.mNumberBuffers = kNumChannels;
for (int i = 0; i < kNumChannels; i++)
{
int bytesNeeded = inNumberFrames * sizeof( Float32 );
ioData.mBuffers[i].mNumberChannels = kMono;
ioData.mBuffers[i].mDataByteSize = bytesNeeded;
ioData.mBuffers[i].mData = malloc( bytesNeeded );
}
}
// actually GET the data that arrived
status = AudioUnitRender( myUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
& ioData );
// take MONO from mic
const int channel = 0;
Float32 * outBuffer = (Float32 *) ioData.mBuffers[channel].mData;
// get a handle to our game object
static KPRing* kpRing = nil;
if ( ! kpRing )
{
//AppDelegate * appDelegate = [UIApplication sharedApplication].delegate;
kpRing = [Game singleton].kpRing;
assert( kpRing );
}
// ... and send it the data we just got from the mic
[ kpRing floatsArrivedFromMic: outBuffer
count: inNumberFrames ];
// free the temporary buffers allocated above (otherwise every callback leaks)
for (UInt32 i = 0; i < ioData.mNumberBuffers; i++)
free( ioData.mBuffers[i].mData );
return status;
}
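For completeness, a hedged sketch of how such an input callback gets wired onto the RemoteIO unit in the first place (the property constants are the real AudioUnit API; unitClass and myUnit are the names assumed above):

// register the input callback
AURenderCallbackStruct cb;
cb.inputProc = dataArrivedFromMic;
cb.inputProcRefCon = unitClass;        // (__bridge void *)unitClass under ARC

AudioUnitSetProperty( myUnit,
                      kAudioOutputUnitProperty_SetInputCallback,
                      kAudioUnitScope_Global,
                      0,
                      &cb,
                      sizeof(cb) );

// and enable input on bus 1 of the RemoteIO unit
UInt32 one = 1;
AudioUnitSetProperty( myUnit,
                      kAudioOutputUnitProperty_EnableIO,
                      kAudioUnitScope_Input,
                      1,               // input element/bus
                      &one,
                      sizeof(one) );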