how to convert byte* into jpeg file in VC++ - visual-c++

How to convert BYTE* into a JPEG file in VC++
I am capturing video samples and writing them out as BMP files, but I want to write those video samples to JPEG files instead, using MFC support in ATL COM.

Use libjpeg. Download it from: http://www.ijg.org/

From what it appears, you have the image data in a buffer pointed to by a byte pointer. Note that the type is actually BYTE (all uppercase). If the data is already in JPEG format, why don't you write it out to a file (with a suitable '.jpg' or '.jpeg' extension) and try loading it with an image editor? Otherwise, you will need to decode it to a raw format and then encode that as JPEG.
Or you need to explain your problem in more detail, preferably with some code.
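A minimal sketch of that first check, assuming the captured sample sits in an unsigned char (BYTE) buffer of known size. The names looksLikeJpeg and writeBufferToFile are made up for illustration and are not part of any library. A JPEG stream starts with the two bytes FF D8, so if the test passes the buffer can simply be dumped to a .jpg file:

```cpp
#include <cstddef>
#include <cstdio>

// Returns true when the buffer starts with the JPEG start-of-image marker FF D8.
bool looksLikeJpeg(const unsigned char* data, std::size_t size)
{
    return data != nullptr && size >= 2 && data[0] == 0xFF && data[1] == 0xD8;
}

// Writes the buffer unchanged to `path`; returns false on any I/O error.
bool writeBufferToFile(const unsigned char* data, std::size_t size, const char* path)
{
    std::FILE* out = std::fopen(path, "wb");
    if (!out) {
        return false;
    }
    const std::size_t written = std::fwrite(data, 1, size, out);
    const bool closed = (std::fclose(out) == 0);
    return closed && written == size;
}
```

If looksLikeJpeg returns false, the data is most likely raw pixels (or some other format) and has to go through a real encoder such as libjpeg.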

Converting raw image data to JPEG can be achieved with ImageMagick.

You may also try the CxImage C++ class to save your stills to a JPEG-encoded file.
There are some more Windows API oriented alternatives available on CodeProject, for instance CMiniJpegEncoder.
It is even possible to render JPEG to file from Windows bitmap using libgd library if compiled with libjpeg support. Here is code of small extension function gdImageTrueColorAttachBuffer I developed for this purpose some time ago:
// libgd extension by Mateusz Loskot <mateusz at loskot dot net>
// Originally developed for Windows CE to enable direct drawing
// on Windows API Device Context using libgd API.
// Complete example available in libgd CVS:
// http://cvs.php.net/viewvc.cgi/gd/libgd/examples/windows.c?diff_format=u&revision=1.1&view=markup
//
gdImagePtr gdImageTrueColorAttachBuffer(int* buffer, int sx, int sy, int stride)
{
int i;
int height;
int* rowptr;
gdImagePtr im;
im = (gdImage *) malloc (sizeof (gdImage));
if (!im) {
return 0;
}
memset (im, 0, sizeof (gdImage));
#if 0
if (overflow2(sizeof (int *), sy)) {
return 0;
}
#endif
im->tpixels = (int **) malloc (sizeof (int *) * sy);
if (!im->tpixels) {
free(im);
return 0;
}
im->polyInts = 0;
im->polyAllocated = 0;
im->brush = 0;
im->tile = 0;
im->style = 0;
height = sy;
rowptr = buffer;
if (stride < 0) {
int startoff = (height - 1) * stride;
rowptr = buffer - startoff;
}
i = 0;
while (height--) {
im->tpixels[i] = rowptr;
rowptr += stride;
i++;
}
im->sx = sx;
im->sy = sy;
im->transparent = (-1);
im->interlace = 0;
im->trueColor = 1;
im->saveAlphaFlag = 0;
im->alphaBlendingFlag = 1;
im->thick = 1;
im->AA = 0;
im->cx1 = 0;
im->cy1 = 0;
im->cx2 = im->sx - 1;
im->cy2 = im->sy - 1;
return im;
}
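bool gdSaveJPEG(void* bits, int width, int height, const char* filename)
{
    bool success = false;
    int stride = ((width * 1 + 3) >> 2) << 2;
    // Negative stride makes row 0 point at the last row of the buffer
    // (Windows bitmaps are usually stored bottom-up).
    gdImage* im = gdImageTrueColorAttachBuffer((int*)bits, width, height, -stride);
    if (0 != im)
    {
        FILE* jpegout = fopen(filename, "wb");
        if (0 != jpegout)
        {
            gdImageJpeg(im, jpegout, -1); // negative quality selects the default
            fclose(jpegout);
            success = true;
        }
        // The rows belong to the caller's buffer, so release only what
        // gdImageTrueColorAttachBuffer allocated; gdImageDestroy would
        // also try to free every row pointer.
        free(im->tpixels);
        free(im);
    }
    return success;
}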
I hope it helps.
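As a side note on the stride handling above: rows of an uncompressed Windows bitmap are padded to a multiple of 4 bytes, and with a positive height the bottom row comes first in memory, which is why the attach function is called with a negative stride. Here is a small sketch of that arithmetic, counted in bytes rather than in int-sized pixels. The helper names dibStrideBytes and dibRow are assumptions for illustration only:

```cpp
#include <cstddef>
#include <cstdint>

// Bytes per row of an uncompressed Windows bitmap:
// each row is padded up to a multiple of 4 bytes.
std::size_t dibStrideBytes(std::size_t width, std::size_t bitsPerPixel)
{
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Pointer to visual row y (0 = top of the picture) in a bottom-up bitmap,
// where the first row in memory is the bottom row of the picture.
const std::uint8_t* dibRow(const std::uint8_t* bits, std::size_t height,
                           std::size_t stride, std::size_t y)
{
    return bits + (height - 1 - y) * stride;
}
```

For example, a 3 pixel wide 24-bit bitmap has 9 bytes of pixel data per row but a stride of 12 bytes.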

Related

How can I write a qt application to display a dcm image?

I have found a way using vtk to display dcm image. But vtk is too much for what I want, I only want to display a dcm image. The dcmtk will process the dcm image for me.
So is there an easy way for me to display dcm image?
Thanks in advance.
The smallest learning curve and code requirement will likely be to use Grass Roots DICOM. (http://gdcm.sourceforge.net/wiki/index.php/Main_Page) This library will link to Qt and give you a quick way to load an image. The only thing to remember is that a DICOM image file does not contain an image that Qt (or anything else) can display directly. You have to load the DICOM data and convert the image to display it.
These are the lines to add to the project file. Note that the paths will have to match your machine and versions, not mine:
#INCLUDEPATH += /usr/local/include/gdcm-2.4/
#LIBS += -L"/usr/local/lib/" -lgdcmCommon -lgdcmDICT -lgdcmDSED -lgdcmIOD -lgdcmMEXD -lgdcmMSFF -lgdcmjpeg12 -lgdcmjpeg16 -lgdcmopenjpeg -lgdcmjpeg8
#LIBS += -L"/usr/local/lib/" -lgdcmcharls -lexpat -lgdcmzlib
This is an example converter, once you have the dicom image loaded, it will convert it to a Qt QImage.
bool imageConverters::convertToFormat_RGB888(gdcm::Image const & gimage, char *buffer, QImage* &imageQt)
{
unsigned int dimX;
unsigned int dimY;
int photoInterp;
const unsigned int* dimension = gimage.GetDimensions();
if (dimension == 0)
{
dimX = 800;
dimY = 600;
}
else
{
dimX = dimension[0];
dimY = dimension[1];
}
gimage.GetBuffer(buffer);
photoInterp = gimage.GetPhotometricInterpretation();
qDebug() << "photo interp = " << photoInterp;
qDebug() << "pixel format = " << gimage.GetPixelFormat();
// Let's start with the easy case:
if( photoInterp == gdcm::PhotometricInterpretation::RGB )
{
if( gimage.GetPixelFormat() != gdcm::PixelFormat::UINT8 )
{
return false;
}
unsigned char *ubuffer = (unsigned char*)buffer;
// QImage::Format_RGB888 13 The image is stored using a 24-bit RGB format (8-8-8).Format_RGB888 Format_ARGB32
imageQt = new QImage((unsigned char *)ubuffer, dimX, dimY, 3*dimX, QImage::Format_RGB888);
//imageQt = &imageQt->rgbSwapped();
}
else
if( photoInterp == gdcm::PhotometricInterpretation::MONOCHROME2 ||
photoInterp == gdcm::PhotometricInterpretation::MONOCHROME1
)
{
if( gimage.GetPixelFormat() == gdcm::PixelFormat::UINT8 || gimage.GetPixelFormat() == gdcm::PixelFormat::INT8
|| gimage.GetPixelFormat() == gdcm::PixelFormat::UINT16)
{
// We need to copy each individual 8bits into R / G and B:
unsigned char *ubuffer = new unsigned char[dimX*dimY*3];
unsigned char *pubuffer = ubuffer;
for(unsigned int i = 0; i < dimX*dimY; i++)
{
*pubuffer++ = *buffer;
*pubuffer++ = *buffer;
*pubuffer++ = *buffer++;
}
imageQt = new QImage(ubuffer, dimX, dimY, QImage::Format_RGB888);
}
else
if( gimage.GetPixelFormat() == gdcm::PixelFormat::INT16 )
{
// We need to copy each individual 16bits into R / G and B (truncate value)
short *buffer16 = (short*)buffer;
unsigned char *ubuffer = new unsigned char[dimX*dimY*3];
unsigned char *pubuffer = ubuffer;
for(unsigned int i = 0; i < dimX*dimY; i++)
{
// Scalar Range of gdcmData/012345.002.050.dcm is [0,192], we could simply do:
// *pubuffer++ = *buffer16;
// *pubuffer++ = *buffer16;
// *pubuffer++ = *buffer16;
// instead do it right:
*pubuffer++ = (unsigned char)std::min(255, (32768 + *buffer16) / 255);
*pubuffer++ = (unsigned char)std::min(255, (32768 + *buffer16) / 255);
*pubuffer++ = (unsigned char)std::min(255, (32768 + *buffer16) / 255);
buffer16++;
}
imageQt = new QImage(ubuffer, dimX, dimY, QImage::Format_RGB888);
}
else
{
std::cerr << "Pixel Format is: " << gimage.GetPixelFormat() << std::endl;
return false;
}
}
else
{
std::cerr << "Unhandled PhotometricInterpretation: " << gimage.GetPhotometricInterpretation() << std::endl;
return false;
}
return true;
}
@john elemans
If I use the code that you gave me, it seems to be wrong. My image's pixel format is UINT16, so the program will execute the following statements.
unsigned char *ubuffer = new unsigned char[dimX*dimY*3];
unsigned char *pubuffer = ubuffer;
for(unsigned int i = 0; i < dimX*dimY; i++)
{
*pubuffer++ = *buffer;
*pubuffer++ = *buffer;
*pubuffer++ = *buffer++;
}
imageQt = new QImage(ubuffer, dimX, dimY, QImage::Format_RGB888);
But after the conversion, the result is not right.
The original image looks like this: before
And this is the converted image: after
The result shows that the code isn't right. I have also tried other ways. One of the snippets I tried is this:
short *buffer16 = (short*)buffer;
unsigned char *ubuffer = new unsigned char[dimX*dimY*3];
unsigned char *pubuffer = ubuffer;
for (unsigned int i = 0; i < dimX*dimY; i++)
{
*pubuffer++ = *buffer16;
*pubuffer++ = *buffer16;
*pubuffer++ = *buffer16;
buffer16++;
}
imageQt = new QImage(ubuffer, dimX, dimY, QImage::Format_RGB888);
After I used this code to convert the image, I almost believed that I had succeeded. But the result looks like this: (sorry, I don't have enough reputation to post more than 2 links).
I only want to convert a DICOM image to a bitmap, but I don't know how.
Finally, thank you for your help.
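A likely reason both attempts look wrong: the first loop reads one byte per pixel from a buffer that holds two bytes per pixel, and the second keeps only the low 8 bits of each 16-bit value, so brighter pixels wrap around. Below is a sketch of one way to map 16-bit grayscale to RGB888 by stretching the actual min/max of the image to 0..255. This is an assumption-laden simplification: gray16ToRgb888 is a made-up name, and a real DICOM viewer would normally apply the window center/width and rescale values stored in the file instead of a plain min/max stretch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts 16-bit grayscale pixels to a tightly packed RGB888 buffer,
// linearly stretching [min, max] of the input onto [0, 255].
std::vector<std::uint8_t> gray16ToRgb888(const std::uint16_t* src, std::size_t pixelCount)
{
    std::vector<std::uint8_t> rgb(pixelCount * 3);
    if (pixelCount == 0) {
        return rgb;
    }
    const auto range = std::minmax_element(src, src + pixelCount);
    const std::uint32_t lo = *range.first;
    const std::uint32_t hi = *range.second;
    const std::uint32_t span = (hi > lo) ? (hi - lo) : 1; // avoid dividing by zero
    for (std::size_t i = 0; i < pixelCount; ++i) {
        const std::uint8_t g =
            static_cast<std::uint8_t>((src[i] - lo) * 255u / span);
        rgb[3 * i + 0] = g; // R
        rgb[3 * i + 1] = g; // G
        rgb[3 * i + 2] = g; // B
    }
    return rgb;
}
```

The returned buffer could then be handed to QImage together with an explicit bytes-per-line of 3*dimX, the way the RGB branch of the converter does, and it has to stay alive as long as the QImage uses it.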

C++ FFmpeg distorted sound when converting audio

I'm using the FFmpeg library to generate MP4 files containing audio from various files, such as MP3, WAV, OGG, but I'm having some troubles (I'm also putting video in there, but for simplicity's sake I'm omitting that for this question, since I've got that working). My current code opens an audio file, decodes the content and converts it into the MP4 container and finally writes it into the destination file as interleaved frames.
It works perfectly for most MP3 files, but when inputting WAV or OGG, the audio in the resulting MP4 is slightly distorted and often plays at the wrong speed (up to many times faster or slower).
I've looked at countless examples of using the converting functions (swr_convert), but I can't seem to get rid of the noise in the exported audio.
Here's how I add an audio stream to the MP4 (outContext is the AVFormatContext for the output file):
audioCodec = avcodec_find_encoder(outContext->oformat->audio_codec);
if (!audioCodec)
die("Could not find audio encoder!");
// Start stream
audioStream = avformat_new_stream(outContext, audioCodec);
if (!audioStream)
die("Could not allocate audio stream!");
audioCodecContext = audioStream->codec;
audioStream->id = 1;
// Setup
audioCodecContext->sample_fmt = AV_SAMPLE_FMT_S16;
audioCodecContext->bit_rate = 128000;
audioCodecContext->sample_rate = 44100;
audioCodecContext->channels = 2;
audioCodecContext->channel_layout = AV_CH_LAYOUT_STEREO;
// Open the codec
if (avcodec_open2(audioCodecContext, audioCodec, NULL) < 0)
die("Could not open audio codec");
And to open a sound file from MP3/WAV/OGG (from the filename variable)...
// Create context
formatContext = avformat_alloc_context();
if (avformat_open_input(&formatContext, filename, NULL, NULL)<0)
die("Could not open file");
// Find info
if (avformat_find_stream_info(formatContext, 0)<0)
die("Could not find file info");
av_dump_format(formatContext, 0, filename, false);
// Find audio stream
streamId = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
if (streamId < 0)
die("Could not find Audio Stream");
codecContext = formatContext->streams[streamId]->codec;
// Find decoder
codec = avcodec_find_decoder(codecContext->codec_id);
if (codec == NULL)
die("cannot find codec!");
// Open codec
if (avcodec_open2(codecContext, codec, 0)<0)
die("Codec cannot be found");
// Set up resample context
swrContext = swr_alloc();
if (!swrContext)
die("Failed to alloc swr context");
av_opt_set_int(swrContext, "in_channel_count", codecContext->channels, 0);
av_opt_set_int(swrContext, "in_channel_layout", codecContext->channel_layout, 0);
av_opt_set_int(swrContext, "in_sample_rate", codecContext->sample_rate, 0);
av_opt_set_sample_fmt(swrContext, "in_sample_fmt", codecContext->sample_fmt, 0);
av_opt_set_int(swrContext, "out_channel_count", audioCodecContext->channels, 0);
av_opt_set_int(swrContext, "out_channel_layout", audioCodecContext->channel_layout, 0);
av_opt_set_int(swrContext, "out_sample_rate", audioCodecContext->sample_rate, 0);
av_opt_set_sample_fmt(swrContext, "out_sample_fmt", audioCodecContext->sample_fmt, 0);
if (swr_init(swrContext))
die("Failed to init swr context");
Finally, to decode+convert+encode...
// Allocate and init re-usable frames
audioFrameDecoded = av_frame_alloc();
if (!audioFrameDecoded)
die("Could not allocate audio frame");
audioFrameDecoded->format = fileCodecContext->sample_fmt;
audioFrameDecoded->channel_layout = fileCodecContext->channel_layout;
audioFrameDecoded->channels = fileCodecContext->channels;
audioFrameDecoded->sample_rate = fileCodecContext->sample_rate;
audioFrameConverted = av_frame_alloc();
if (!audioFrameConverted)
die("Could not allocate audio frame");
audioFrameConverted->nb_samples = audioCodecContext->frame_size;
audioFrameConverted->format = audioCodecContext->sample_fmt;
audioFrameConverted->channel_layout = audioCodecContext->channel_layout;
audioFrameConverted->channels = audioCodecContext->channels;
audioFrameConverted->sample_rate = audioCodecContext->sample_rate;
AVPacket inPacket;
av_init_packet(&inPacket);
inPacket.data = NULL;
inPacket.size = 0;
int frameFinished = 0;
while (av_read_frame(formatContext, &inPacket) >= 0) {
if (inPacket.stream_index == streamId) {
int len = avcodec_decode_audio4(fileCodecContext, audioFrameDecoded, &frameFinished, &inPacket);
if (frameFinished) {
// Convert
uint8_t *convertedData=NULL;
if (av_samples_alloc(&convertedData,
NULL,
audioCodecContext->channels,
audioFrameConverted->nb_samples,
audioCodecContext->sample_fmt, 0) < 0)
die("Could not allocate samples");
int outSamples = swr_convert(swrContext,
&convertedData,
audioFrameConverted->nb_samples,
(const uint8_t **)audioFrameDecoded->data,
audioFrameDecoded->nb_samples);
if (outSamples < 0)
die("Could not convert");
size_t buffer_size = av_samples_get_buffer_size(NULL,
audioCodecContext->channels,
audioFrameConverted->nb_samples,
audioCodecContext->sample_fmt,
0);
if (buffer_size < 0)
die("Invalid buffer size");
if (avcodec_fill_audio_frame(audioFrameConverted,
audioCodecContext->channels,
audioCodecContext->sample_fmt,
convertedData,
buffer_size,
0) < 0)
die("Could not fill frame");
AVPacket outPacket;
av_init_packet(&outPacket);
outPacket.data = NULL;
outPacket.size = 0;
if (avcodec_encode_audio2(audioCodecContext, &outPacket, audioFrameConverted, &frameFinished) < 0)
die("Error encoding audio frame");
if (frameFinished) {
outPacket.stream_index = audioStream->index;
if (av_interleaved_write_frame(outContext, &outPacket) != 0)
die("Error while writing audio frame");
av_free_packet(&outPacket);
}
}
}
}
av_frame_free(&audioFrameConverted);
av_frame_free(&audioFrameDecoded);
av_free_packet(&inPacket);
I have also tried setting appropriate pts values for outgoing frames, but that doesn't seem to affect the sound quality at all.
I'm also unsure how/if I should be allocating the converted data, can av_samples_alloc be used for this? What about avcodec_fill_audio_frame? Am I on the right track?
Any input is appreciated (I can also send the exported MP4s if necessary, if you want to hear the distortion).
if (avcodec_encode_audio2(audioCodecContext, &outPacket, audioFrameConverted, &frameFinished) < 0)
die("Error encoding audio frame");
You seem to be assuming that the encoder will eat all submitted samples - it doesn't. It also doesn't cache them internally. It will eat a specific number of samples (AVCodecContext.frame_size), and the rest should be resubmitted in the next call to avcodec_encode_audio2().
[edit]
OK, so your edited code is better, but not there yet. You're still assuming the decoder will output at least frame_size samples for each call to avcodec_decode_audioN() (after resampling), which may not be the case. If that happens (and it does, for ogg), your avcodec_encode_audioN() call will encode an incomplete input buffer (because you say it's got frame_size samples, but it doesn't). Likewise, your code also doesn't deal with cases where the decoder outputs a number significantly bigger than the frame_size expected by the encoder (like 10*frame_size), in which case you'll get overruns. Basically, your 1:1 decode/encode mapping is the main source of your problem.
As a solution, consider the swrContext a FIFO, where you input all decoder samples, and loop over it until it's got less than frame_size samples left. I'll leave it up to you to learn how to deal with end-of-stream, because you'll need to flush cached samples out of the decoder (by calling avcodec_decode_audioN() with an AVPacket where .data = NULL and .size = 0), flush the swrContext (by calling swr_convert() with a NULL input until it returns 0) as well as flush the encoder (by feeding it NULL AVFrames until it returns an AVPacket with .size = 0). Right now you'll probably get an output file where the end is slightly truncated. That shouldn't be hard to figure out.
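To make the FIFO idea concrete without any FFmpeg calls, here is a small stand-alone sketch. SampleFifo is a hypothetical helper, not an FFmpeg type, and it handles a single channel of 16-bit samples only. It accepts chunks of any size and hands back frames of exactly frame_size samples, which is the decoupling of decoder output from encoder input described above:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical helper: buffers samples of one channel and hands them
// out in fixed-size frames, regardless of how they arrived.
class SampleFifo {
public:
    explicit SampleFifo(std::size_t frameSize) : frameSize_(frameSize) {}

    // Append however many samples the decoder/resampler produced this time.
    void push(const std::vector<std::int16_t>& samples) {
        queue_.insert(queue_.end(), samples.begin(), samples.end());
    }

    // Fill `frame` with exactly frameSize samples; false if too few are buffered.
    bool popFrame(std::vector<std::int16_t>& frame) {
        if (frameSize_ == 0 || queue_.size() < frameSize_) {
            return false;
        }
        const auto end = queue_.begin() + static_cast<std::ptrdiff_t>(frameSize_);
        frame.assign(queue_.begin(), end);
        queue_.erase(queue_.begin(), end);
        return true;
    }

    // End of stream, to be called once popFrame returns false:
    // returns the leftover samples padded with silence (empty if none).
    std::vector<std::int16_t> flush() {
        std::vector<std::int16_t> last(queue_.begin(), queue_.end());
        queue_.clear();
        if (!last.empty()) {
            last.resize(frameSize_, 0);
        }
        return last;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t frameSize_;
    std::deque<std::int16_t> queue_;
};
```

The main loop then becomes: push after every decode, call popFrame in a loop and encode each frame, and call flush once at end of stream. In real code you would let swr_convert buffer the samples as suggested, or use the AVAudioFifo API from libavutil, rather than a hand-written queue.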
This code works for me for m4a/ogg/mp3 to m4a/aac conversion:
#include "libswresample/swresample.h"
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libavutil/opt.h"
#include <stdio.h>
#include <stdlib.h>
static void die(char *str) {
fprintf(stderr, "%s\n", str);
exit(1);
}
static AVStream *add_audio_stream(AVFormatContext *oc, enum AVCodecID codec_id)
{
AVCodecContext *c;
AVCodec *encoder = avcodec_find_encoder(codec_id);
AVStream *st = avformat_new_stream(oc, encoder);
if (!st) die("av_new_stream");
c = st->codec;
c->codec_id = codec_id;
c->codec_type = AVMEDIA_TYPE_AUDIO;
/* put sample parameters */
c->bit_rate = 64000;
c->sample_rate = 44100;
c->channels = 2;
c->sample_fmt = encoder->sample_fmts[0];
c->channel_layout = AV_CH_LAYOUT_STEREO;
// some formats want stream headers to be separate
if(oc->oformat->flags & AVFMT_GLOBALHEADER)
c->flags |= CODEC_FLAG_GLOBAL_HEADER;
return st;
}
static void open_audio(AVFormatContext *oc, AVStream *st)
{
AVCodecContext *c = st->codec;
AVCodec *codec;
/* find the audio encoder */
codec = avcodec_find_encoder(c->codec_id);
if (!codec) die("avcodec_find_encoder");
/* open it */
AVDictionary *dict = NULL;
av_dict_set(&dict, "strict", "+experimental", 0);
int res = avcodec_open2(c, codec, &dict);
if (res < 0) die("avcodec_open");
}
int main(int argc, char *argv[]) {
av_register_all();
if (argc != 3) {
fprintf(stderr, "%s <in> <out>\n", argv[0]);
exit(1);
}
// Allocate and init re-usable frames
AVCodecContext *fileCodecContext, *audioCodecContext;
AVFormatContext *formatContext = NULL, *outContext; // avformat_open_input needs NULL or an allocated context
AVStream *audioStream;
SwrContext *swrContext;
int streamId;
// input file
const char *file = argv[1];
int res = avformat_open_input(&formatContext, file, NULL, NULL);
if (res != 0) die("avformat_open_input");
res = avformat_find_stream_info(formatContext, NULL);
if (res < 0) die("avformat_find_stream_info");
AVCodec *codec;
res = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, &codec, 0);
if (res < 0) die("av_find_best_stream");
streamId = res;
fileCodecContext = avcodec_alloc_context3(codec);
avcodec_copy_context(fileCodecContext, formatContext->streams[streamId]->codec);
res = avcodec_open2(fileCodecContext, codec, NULL);
if (res < 0) die("avcodec_open2");
// output file
const char *outfile = argv[2];
AVOutputFormat *fmt = av_guess_format(NULL, outfile, NULL);
if (!fmt) die("av_guess_format");
outContext = avformat_alloc_context();
outContext->oformat = fmt;
audioStream = add_audio_stream(outContext, fmt->audio_codec);
open_audio(outContext, audioStream);
res = avio_open2(&outContext->pb, outfile, AVIO_FLAG_WRITE, NULL, NULL);
if (res < 0) die("url_fopen");
avformat_write_header(outContext, NULL);
audioCodecContext = audioStream->codec;
// resampling
swrContext = swr_alloc();
av_opt_set_channel_layout(swrContext, "in_channel_layout", fileCodecContext->channel_layout, 0);
av_opt_set_channel_layout(swrContext, "out_channel_layout", audioCodecContext->channel_layout, 0);
av_opt_set_int(swrContext, "in_sample_rate", fileCodecContext->sample_rate, 0);
av_opt_set_int(swrContext, "out_sample_rate", audioCodecContext->sample_rate, 0);
av_opt_set_sample_fmt(swrContext, "in_sample_fmt", fileCodecContext->sample_fmt, 0);
av_opt_set_sample_fmt(swrContext, "out_sample_fmt", audioCodecContext->sample_fmt, 0);
res = swr_init(swrContext);
if (res < 0) die("swr_init");
AVFrame *audioFrameDecoded = av_frame_alloc();
if (!audioFrameDecoded)
die("Could not allocate audio frame");
audioFrameDecoded->format = fileCodecContext->sample_fmt;
audioFrameDecoded->channel_layout = fileCodecContext->channel_layout;
audioFrameDecoded->channels = fileCodecContext->channels;
audioFrameDecoded->sample_rate = fileCodecContext->sample_rate;
AVFrame *audioFrameConverted = av_frame_alloc();
if (!audioFrameConverted) die("Could not allocate audio frame");
audioFrameConverted->nb_samples = audioCodecContext->frame_size;
audioFrameConverted->format = audioCodecContext->sample_fmt;
audioFrameConverted->channel_layout = audioCodecContext->channel_layout;
audioFrameConverted->channels = audioCodecContext->channels;
audioFrameConverted->sample_rate = audioCodecContext->sample_rate;
AVPacket inPacket;
av_init_packet(&inPacket);
inPacket.data = NULL;
inPacket.size = 0;
int frameFinished = 0;
while (av_read_frame(formatContext, &inPacket) >= 0) {
if (inPacket.stream_index == streamId) {
int len = avcodec_decode_audio4(fileCodecContext, audioFrameDecoded, &frameFinished, &inPacket);
if (frameFinished) {
// Convert
uint8_t *convertedData=NULL;
if (av_samples_alloc(&convertedData,
NULL,
audioCodecContext->channels,
audioFrameConverted->nb_samples,
audioCodecContext->sample_fmt, 0) < 0)
die("Could not allocate samples");
int outSamples = swr_convert(swrContext, NULL, 0,
//&convertedData,
//audioFrameConverted->nb_samples,
(const uint8_t **)audioFrameDecoded->data,
audioFrameDecoded->nb_samples);
if (outSamples < 0) die("Could not convert");
for (;;) {
outSamples = swr_get_out_samples(swrContext, 0);
if (outSamples < audioCodecContext->frame_size * audioCodecContext->channels) break; // see comments, thanks to @dajuric for fixing this
outSamples = swr_convert(swrContext,
&convertedData,
audioFrameConverted->nb_samples, NULL, 0);
size_t buffer_size = av_samples_get_buffer_size(NULL,
audioCodecContext->channels,
audioFrameConverted->nb_samples,
audioCodecContext->sample_fmt,
0);
if (buffer_size < 0) die("Invalid buffer size");
if (avcodec_fill_audio_frame(audioFrameConverted,
audioCodecContext->channels,
audioCodecContext->sample_fmt,
convertedData,
buffer_size,
0) < 0)
die("Could not fill frame");
AVPacket outPacket;
av_init_packet(&outPacket);
outPacket.data = NULL;
outPacket.size = 0;
if (avcodec_encode_audio2(audioCodecContext, &outPacket, audioFrameConverted, &frameFinished) < 0)
die("Error encoding audio frame");
if (frameFinished) {
outPacket.stream_index = audioStream->index;
if (av_interleaved_write_frame(outContext, &outPacket) != 0)
die("Error while writing audio frame");
av_free_packet(&outPacket);
}
}
}
}
}
swr_close(swrContext);
swr_free(&swrContext);
av_frame_free(&audioFrameConverted);
av_frame_free(&audioFrameDecoded);
av_free_packet(&inPacket);
av_write_trailer(outContext);
avio_close(outContext->pb);
avcodec_close(fileCodecContext);
avcodec_free_context(&fileCodecContext);
avformat_close_input(&formatContext);
return 0;
}
I wanted to include a couple things I found when I was working with the above code.
I had one file get stuck in an infinite loop. The reason is that the file had a sample rate of 48000 and the code converts it to 44100. This caused it to always have extra outSamples that swr_convert would not grab. So I ended up changing add_audio_stream to match the input stream's sample rate.
c->sample_rate = fileCodecContext->sample_rate;
Also, I had to produce WAV files as my output, and there the encoder reported a frame_size of 0, so I just chose a number: after a few tests I went with 32. I noticed that if I went too big (e.g. 128) I would get audio glitches.
if (audioFrameConverted->nb_samples <= 0) audioFrameConverted->nb_samples = 32; //wav files have a 0
Changed the if statement that breaks out of the loop to check nb_samples if frame_size is 0.
if ((outSamples < audioCodecContext->frame_size * audioCodecContext->channels) || (audioCodecContext->frame_size == 0 && (outSamples < audioFrameConverted->nb_samples * audioCodecContext->channels))) break; // see comments, thanks to @dajuric for fixing this
There was also a glitch when I was testing outputting to ogg files where the timestamp data was missing so the file wouldn't play correctly in vlc. There were a few lines I added that helped with that.
out_audioStream->time_base = in_audioStream->time_base; // entered before avio_open.
outPacket.dts = audioFrameDecoded->pkt_dts;//rest after avcodec_encode_audio2
outPacket.pts = audioFrameDecoded->pkt_pts;
av_packet_rescale_ts(&outPacket, in_audioStream->time_base, out_audioStream->time_base);
Variables might be a little different, since I converted the code to C#. Thought this might help someone.
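For anyone wondering what the timestamp rescaling step does numerically, here is a simplified sketch. TimeBase and rescaleTs are made-up names, the function only handles non-negative timestamps, and it ignores the overflow protection and rounding modes that FFmpeg's own av_rescale_q and av_packet_rescale_ts provide. A timestamp counted in units of one time base is converted to the nearest count in another:

```cpp
#include <cstdint>

// A rational time base, e.g. {1, 44100} means one tick is 1/44100 s.
struct TimeBase {
    std::int64_t num;
    std::int64_t den;
};

// Converts `ts` ticks of `src` into ticks of `dst`, rounding to nearest.
// Simplified: non-negative ts only, no overflow handling.
std::int64_t rescaleTs(std::int64_t ts, TimeBase src, TimeBase dst)
{
    const std::int64_t numerator = src.num * dst.den;
    const std::int64_t denominator = src.den * dst.num;
    return (ts * numerator + denominator / 2) / denominator;
}
```

So one second of audio stamped in a 1/48000 time base (48000 ticks) becomes 44100 ticks in a 1/44100 time base.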
Actually swr_convert won't work for that, try to use swr_convert_frame instead.

Get distance from kinect depth image using ubuntu 12.04 LTS and opencv

I found out from one site that it is possible to find the distance from the raw depth video output of the Kinect through the 2 bytes assigned to a particular pixel, as shown in this link - tutorial. Based on this, I wrote some code to find the distance of the middle point from the Kinect sensor.
I compiled it and ran it on Ubuntu, and it produces output. The output shows some values as distance, ranging from about 150 to 1147. I hope it is showing the distance in mm.
But I am not sure whether it is right or wrong. I am providing the code below. Is my code working correctly, or do I need to make some changes?
Code:
#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <stdio.h>
#include "libfreenect_cv.h"
int getDist(IplImage *depth){
int x = depth->width/2;
int y = depth->height/2;
printf("width= %d and height %d \n",x,y);
int d = depth->imageData[x*2+y*640*2+1];
printf("1st value is %d \n",d);
d= d << 8;
d= d+depth->imageData[x*2+y*640*2];
return d;
}
IplImage *GlViewColor(IplImage *depth)
{
static IplImage *image = 0;
if (!image) image = cvCreateImage(cvSize(640,480), 8, 3);
unsigned char *depth_mid = (unsigned char*)(image->imageData);
int i;
for (i = 0; i < 640*480; i++) {
int lb = ((short *)depth->imageData)[i] % 256;
int ub = ((short *)depth->imageData)[i] / 256;
switch (ub) {
case 0:
depth_mid[3*i+2] = 255;
depth_mid[3*i+1] = 255-lb;
depth_mid[3*i+0] = 255-lb;
break;
case 1:
depth_mid[3*i+2] = 255;
depth_mid[3*i+1] = lb;
depth_mid[3*i+0] = 0;
break;
case 2:
depth_mid[3*i+2] = 255-lb;
depth_mid[3*i+1] = 255;
depth_mid[3*i+0] = 0;
break;
case 3:
depth_mid[3*i+2] = 0;
depth_mid[3*i+1] = 255;
depth_mid[3*i+0] = lb;
break;
case 4:
depth_mid[3*i+2] = 0;
depth_mid[3*i+1] = 255-lb;
depth_mid[3*i+0] = 255;
break;
case 5:
depth_mid[3*i+2] = 0;
depth_mid[3*i+1] = 0;
depth_mid[3*i+0] = 255-lb;
break;
default:
depth_mid[3*i+2] = 0;
depth_mid[3*i+1] = 0;
depth_mid[3*i+0] = 0;
break;
}
}
return image;
}
int main(int argc, char **argv)
{
while (cvWaitKey(100) != 27) {
IplImage *image = freenect_sync_get_rgb_cv(0);
if (!image) {
printf("Error: Kinect not connected?\n");
return -1;
}
cvCvtColor(image, image, CV_RGB2BGR);
IplImage *depth = freenect_sync_get_depth_cv(0);
if (!depth) {
printf("Error: Kinect not connected?\n");
return -1;
}
cvShowImage("RGB", image);
//int d = getDist(depth);
printf("value is %d \n",getDist(depth));
cvShowImage("Depth", GlViewColor(depth));//GlViewColor(depth)
}
cvDestroyWindow("RGB");
cvDestroyWindow("Depth");
//cvReleaseImage(image);
//cvReleaseImage(depth);
return 0;
}
The code seems to be fine. Scale the image from the range (150-1147) to (0-255) and display it as grayscale. It will help you to have a better understanding of the image. Doing so will result in the nearest object being dark-colored and the farthest being light-colored. It would be better than using the GlViewColor function.
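A sketch of both steps, under a few assumptions: the depth buffer holds 16-bit little-endian values (as getDist assumes), and depthAt and depthToGray are illustrative names only. One detail that may be worth double-checking in getDist is that imageData is a plain char pointer, so on platforms where char is signed a low byte above 127 is added as a negative number. Reading the bytes as unsigned avoids that:

```cpp
#include <cstddef>
#include <cstdint>

// Reads the 16-bit little-endian value of pixel (x, y) from a raw depth
// buffer; widthStep is the number of bytes per image row.
std::uint16_t depthAt(const char* data, std::size_t widthStep,
                      std::size_t x, std::size_t y)
{
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(data) + y * widthStep + x * 2;
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Maps depth values in [lo, hi] linearly onto [0, 255], clamping outside it,
// so near objects come out dark and far objects light.
std::uint8_t depthToGray(std::uint16_t d, std::uint16_t lo, std::uint16_t hi)
{
    if (hi <= lo || d <= lo) {
        return 0;
    }
    if (d >= hi) {
        return 255;
    }
    return static_cast<std::uint8_t>((d - lo) * 255 / (hi - lo));
}
```

With the range from the question, depthToGray(d, 150, 1147) gives 0 for the nearest reading and 255 for the farthest.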

How do I create an AIFF file

I have some raw sound data that I want to make into an AIFF file format. I know the specifics of the audio data. I tried creating a wave from the audio, but that didn't work. OS X does have a function to create the header, but it directly addresses a file and I might not want to do that (that and the function, SetupAIFFHeader is deprecated and unavailable in 64-bit code).
Apple's Core Audio API will create and write data to an AIFF file, and other formats. It works pretty well, but in my opinion the API is difficult to use. I'll paste some example code below, but you'd probably want to change it. AudioFileWriteBytes can write more than 2 bytes at a time. There is another wrapper API in AudioToolbox/ExtendedAudioFile.h which will let you write a format like 32 bit floats, and have it translated to an underlying format, be it AIFF/PCM or a compressed format.
double sampleRate = 44100;
double duration = ...;
long nSamples = (long)(sampleRate * duration);
// Format struct for 1 channel, 16 bit PCM audio
AudioStreamBasicDescription asbd;
memset(&asbd, 0, sizeof(asbd));
asbd.mSampleRate = sampleRate;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger;
asbd.mBitsPerChannel = 16;
asbd.mChannelsPerFrame = 1;
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = 2;
asbd.mBytesPerPacket = 2;
CFURLRef url = makeUrl("hello.aiff");
AudioFileID audioFile;
OSStatus res;
res = AudioFileCreateWithURL(url, kAudioFileAIFFType, &asbd,
kAudioFileFlags_EraseFile, &audioFile);
checkError(res);
UInt32 numBytes = 2;
for (int i=0; i<nSamples; i++) {
SInt16 sample = ... // something between SHRT_MIN and SHRT_MAX;
sample = OSSwapHostToBigInt16(sample);
res = AudioFileWriteBytes(audioFile, false, i*2, &numBytes, &sample);
checkError(res);
}
res = AudioFileClose(audioFile);
checkError(res);
checkError is asserting that res == noErr. makeUrl looks like:
CFURLRef makeUrl(const char *cstr) {
CFStringRef path = CFStringCreateWithCString(0, cstr, kCFStringEncodingUTF8);
CFURLRef url = CFURLCreateWithFileSystemPath(NULL, path, 0, false);
CFRelease(path);
return url;
}
As much as I hate wheel-reinvention, I suspect your best bet might be to roll your own AIFF save routines.
AIFF is an extension of the old Electronic Arts EA-IFF format which was used on the Amiga; it's a series of 4-byte identifiers (similar to FOURCCs), block lengths and data payloads. The Wikipedia article is quite informative and provides links to other sites which contain detailed information about the format.
http://en.wikipedia.org/wiki/Audio_Interchange_File_Format
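To give an idea of how little is involved, here is a hedged sketch of a roll-your-own writer for the simplest case: mono, 16-bit, uncompressed, integer sample rate. The function names are invented for this example, and it builds the file in memory so you can write the bytes wherever you like. The layout is a FORM chunk containing a COMM chunk (format description, with the sample rate stored as an 80-bit extended float) and an SSND chunk (the samples), all big-endian. Note that each chunk size excludes the 8-byte chunk header.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace {
void putTag(std::vector<std::uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}
void putU16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}
void putU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    putU16(out, static_cast<std::uint16_t>(v >> 16));
    putU16(out, static_cast<std::uint16_t>(v & 0xFFFF));
}
} // namespace

// Encodes a positive integer sample rate as an 80-bit extended float:
// 16 bits sign/exponent (bias 16383), then a 64-bit mantissa with an
// explicit leading 1 bit.
std::array<std::uint8_t, 10> sampleRateToExtended80(std::uint32_t rate)
{
    std::array<std::uint8_t, 10> out{};
    if (rate == 0) {
        return out;
    }
    int msb = 31;
    while (((rate >> msb) & 1u) == 0) {
        --msb; // find the highest set bit
    }
    const std::uint16_t exponent = static_cast<std::uint16_t>(16383 + msb);
    const std::uint64_t mantissa = static_cast<std::uint64_t>(rate) << (63 - msb);
    out[0] = static_cast<std::uint8_t>(exponent >> 8);
    out[1] = static_cast<std::uint8_t>(exponent & 0xFF);
    for (int i = 0; i < 8; ++i) {
        out[2 + i] = static_cast<std::uint8_t>((mantissa >> (56 - 8 * i)) & 0xFF);
    }
    return out;
}

// Builds a complete mono 16-bit AIFF file image from host-order samples.
std::vector<std::uint8_t> buildAiffMono16(const std::vector<std::int16_t>& samples,
                                          std::uint32_t sampleRate)
{
    const std::uint32_t frames = static_cast<std::uint32_t>(samples.size());
    const std::uint32_t dataBytes = frames * 2;
    std::vector<std::uint8_t> out;
    putTag(out, "FORM");
    putU32(out, 4 + (8 + 18) + (8 + 8 + dataBytes)); // everything after this field
    putTag(out, "AIFF");
    putTag(out, "COMM");
    putU32(out, 18);      // COMM payload size
    putU16(out, 1);       // channels
    putU32(out, frames);  // sample frames
    putU16(out, 16);      // bits per sample
    const std::array<std::uint8_t, 10> rate = sampleRateToExtended80(sampleRate);
    out.insert(out.end(), rate.begin(), rate.end());
    putTag(out, "SSND");
    putU32(out, 8 + dataBytes); // offset + blockSize + samples
    putU32(out, 0);             // offset
    putU32(out, 0);             // blockSize
    for (std::int16_t s : samples) {
        putU16(out, static_cast<std::uint16_t>(s)); // big-endian sample
    }
    return out;
}
```

Writing the returned vector to disk with fwrite or std::ofstream in binary mode should give a file that ordinary players can open. More channels, other bit depths or non-integer rates need a bit more care (odd-sized chunks must be padded to an even length, for instance).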
I was able to write a proper AIFF file. The last bit that was getting me was I was using a sizeof() for a structure's size, where the size omits the first eight bytes. I did use Apple's deprecated AIFF.h header to get the structures, and it seems that neither QuickTime X nor 7 reads the metadata I set in it.
You can see my work at PlayerPRO's PlayerPRO 6 branch. It's in a file called PPApp_AppDelegate.m in the function -createAIFFDataFromSettings:data:
Here is some C code that will create an AIFF file using the Apple CoreAudio and AudioToolbox frameworks for macOS.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include "CoreAudio/CoreAudio.h"
#include "CoreAudio/CoreAudioTypes.h"
#include "AudioToolbox/AudioToolbox.h"
#include "AudioToolbox/AudioFile.h"
CFURLRef MakeUrl(const char *cstr);
void CheckError(OSStatus res);
AudioStreamBasicDescription asbd;
AudioFileID audioFile;
OSStatus res;
void CheckError(OSStatus result) {
if (result == noErr) return;
switch(result) {
case kAudioFileUnspecifiedError:
printf("kAudioFileUnspecifiedError");
break;
case kAudioFileUnsupportedFileTypeError:
printf("kAudioFileUnsupportedFileTypeError");
break;
case kAudioFileUnsupportedDataFormatError:
printf("kAudioFileUnsupportedDataFormatError");
break;
case kAudioFileUnsupportedPropertyError:
printf("kAudioFileUnsupportedPropertyError");
break;
case kAudioFileBadPropertySizeError:
printf("kAudioFileBadPropertySizeError");
break;
case kAudioFilePermissionsError:
printf("kAudioFilePermissionsError");
break;
case kAudioFileNotOptimizedError:
printf("kAudioFileNotOptimizedError");
break;
case kAudioFileInvalidChunkError:
printf("kAudioFileInvalidChunkError");
break;
case kAudioFileDoesNotAllow64BitDataSizeError:
printf("kAudioFileDoesNotAllow64BitDataSizeError");
break;
case kAudioFileInvalidPacketOffsetError:
printf("kAudioFileInvalidPacketOffsetError");
break;
case kAudioFileInvalidFileError:
printf("kAudioFileInvalidFileError");
break;
case kAudioFileOperationNotSupportedError:
printf("kAudioFileOperationNotSupportedError");
break;
case kAudioFileNotOpenError:
printf("kAudioFileNotOpenError");
break;
case kAudioFileEndOfFileError:
printf("kAudioFileEndOfFileError");
break;
case kAudioFilePositionError:
printf("kAudioFilePositionError");
break;
case kAudioFileFileNotFoundError:
printf("kAudioFileFileNotFoundError");
break;
default:
printf("unknown error");
break;
}
exit(result);
}
CFURLRef MakeUrl(const char *cstr) {
CFStringRef path = CFStringCreateWithCString(0, cstr, kCFStringEncodingUTF8);
CFURLRef url = CFURLCreateWithFileSystemPath(NULL, path, 0, false);
CFRelease(path);
return url;
}
int main() {
double sampleRate = 44100.0;
double duration = 10.0;
long nSamples = (long)(sampleRate * duration);
memset(&asbd, 0, sizeof(asbd));
// Format struct for 1 channel, 16 bit PCM audio
asbd.mSampleRate = sampleRate;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger;
asbd.mBitsPerChannel = 16;
asbd.mChannelsPerFrame = 1;
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = 2;
asbd.mBytesPerPacket = 2;
CFURLRef url = MakeUrl("sinpos.aiff");
res = AudioFileCreateWithURL(url, kAudioFileAIFFType, &asbd,
kAudioFileFlags_EraseFile, &audioFile);
CheckError(res);
UInt32 numBytes = 2;
int freq = 44; // period in samples, not Hz: 100 gives approx 440Hz, 2940 gives 15Hz, 44 gives approx 1000Hz
for (int i=0; i<nSamples; i++) {
int x = (i % freq);
double angle = 2.0*M_PI*x/freq;
double s = 1.0*32767*sin(angle);
SInt16 sample = (SInt16) s;
sample = OSSwapHostToBigInt16(sample);
res = AudioFileWriteBytes(audioFile, false, i*2, &numBytes, &sample);
CheckError(res);
}
res = AudioFileClose(audioFile);
CheckError(res);
exit(0);
}
The Makefile is as follows:
aiffcreate: aiffcreate.c
	gcc -o $@ $< -framework AudioToolbox -framework CoreFoundation -framework CoreAudio -lm
clean:
	rm *.aiff aiffcreate || true
Running ./aiffcreate on the command line creates a file named sinpos.aiff, which contains a pure tone of roughly 1000Hz (44100/44, about 1002Hz) lasting 10 seconds.

Convert .m4a to PCM using libavcodec

I'm trying to convert a .m4a file to raw PCM file so that I can play it back in Audacity.
According to the AVCodecContext it is a 44100 Hz track using the sample format AV_SAMPLE_FMT_FLTP, which, to my understanding, means that when it is decoded using avcodec_decode_audio4 I should get two arrays of floating point values (one for each channel).
I'm unsure of the significance of the AVCodecContext's bits_per_coded_sample = 16.
Unfortunately Audacity plays the result back as if the original track were mixed in with some white noise.
Here is some sample code of what I've done. Note that I've also added a case for a track that uses signed 16-bit non-interleaved data (sample_fmt = AV_SAMPLE_FMT_S16P), which Audacity plays back fine.
int AudioDecoder::decode(std::string path)
{
const char* input_filename=path.c_str();
av_register_all();
AVFormatContext* container=avformat_alloc_context();
if(avformat_open_input(&container,input_filename,NULL,NULL)<0){
printf("Could not open file");
}
if(avformat_find_stream_info(container, NULL)<0){
printf("Could not find file info");
}
av_dump_format(container,0,input_filename,false);
int stream_id=-1;
int i;
for(i=0;i<container->nb_streams;i++){
if(container->streams[i]->codec->codec_type==AVMEDIA_TYPE_AUDIO){
stream_id=i;
break;
}
}
if(stream_id==-1){
printf("Could not find Audio Stream");
}
AVDictionary *metadata=container->metadata;
AVCodecContext *ctx=container->streams[stream_id]->codec;
AVCodec *codec=avcodec_find_decoder(ctx->codec_id);
if(codec==NULL){
printf("cannot find codec!");
}
if(avcodec_open2(ctx,codec,NULL)<0){
printf("Codec cannot be found");
}
AVSampleFormat sfmt = ctx->sample_fmt;
AVPacket packet;
av_init_packet(&packet);
AVFrame *frame = avcodec_alloc_frame();
int buffer_size = AVCODEC_MAX_AUDIO_FRAME_SIZE + FF_INPUT_BUFFER_PADDING_SIZE;
uint8_t buffer[buffer_size];
packet.data=buffer;
packet.size =buffer_size;
FILE *outfile = fopen("test.raw", "wb");
int len;
int frameFinished=0;
while(av_read_frame(container,&packet) >= 0)
{
if(packet.stream_index==stream_id)
{
//printf("Audio Frame read \n");
int len=avcodec_decode_audio4(ctx, frame, &frameFinished, &packet);
if(frameFinished)
{
if (sfmt==AV_SAMPLE_FMT_S16P)
{ // Audacity: 16bit PCM little endian stereo
int16_t* ptr_l = (int16_t*)frame->extended_data[0];
int16_t* ptr_r = (int16_t*)frame->extended_data[1];
for (int i=0; i<frame->nb_samples; i++)
{
fwrite(ptr_l++, sizeof(int16_t), 1, outfile);
fwrite(ptr_r++, sizeof(int16_t), 1, outfile);
}
}
else if (sfmt==AV_SAMPLE_FMT_FLTP)
{ //Audacity: big endian 32bit stereo start offset 7 (but has noise)
float* ptr_l = (float*)frame->extended_data[0];
float* ptr_r = (float*)frame->extended_data[1];
for (int i=0; i<frame->nb_samples; i++)
{
fwrite(ptr_l++, sizeof(float), 1, outfile);
fwrite(ptr_r++, sizeof(float), 1, outfile);
}
}
}
}
}
fclose(outfile);
av_close_input_file(container);
return 0;
}
I'm hoping I've just done a naive conversion (most/least significant bit issues), but so far I've been unable to figure it out. Note that Audacity can only import RAW float data if it's 32-bit or 64-bit float (big or little endian).
Thanks for any insight.
I think the problem is in "nb_samples". It's not exactly what you need. It's better to try "linesize[0]".
Example:
char* ptr_l = (char*)frame->extended_data[0];
char* ptr_r = (char*)frame->extended_data[1];
size_t size = sizeof(float);
for (int i=0; i<frame->linesize[0]; i+=size)
{
fwrite(ptr_l, size, 1, outfile);
fwrite(ptr_r, size, 1, outfile);
ptr_l += size;
ptr_r += size;
}
That's for "float"; repeat the same for "int16_t", but with "size" set to "sizeof(int16_t)".
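To check the interleaving step on its own, here is a minimal sketch that works on plain arrays. interleave_stereo_f32 is a made-up name, and the number of samples per channel is passed in as a parameter, so you can try it with whichever count you decide is correct for your frames.

```c
#include <assert.h>
#include <stddef.h>

/* Interleave two planar float channels into L R L R ... order.
   `n` is the number of samples per channel; `out` must hold 2 * n floats. */
void interleave_stereo_f32(const float *left, const float *right,
                           size_t n, float *out)
{
    assert(left != NULL && right != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = left[i];   /* even slots: left channel */
        out[2 * i + 1] = right[i];  /* odd slots: right channel */
    }
}
```

The interleaved buffer can then go out with a single fwrite(out, sizeof(float), 2 * n, outfile). Keep in mind that fwrite emits the floats in the host's byte order, which is little-endian on x86, so the byte order chosen in the raw import dialog has to match.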
You need to convert the samples from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16P.
How to convert sample rate from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16?
Here is a working example (pAudioBuffer ends up holding the PCM data without the white noise):
SwrContext *swr;
swr=swr_alloc();
av_opt_set_int(swr,"in_channel_layout",2,0);
av_opt_set_int(swr, "out_channel_layout", 2, 0);
av_opt_set_int(swr, "in_sample_rate", codecContext->sample_rate, 0);
av_opt_set_int(swr, "out_sample_rate", codecContext->sample_rate, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16P, 0);
swr_init(swr);
int16_t * pAudioBuffer = (int16_t *) av_malloc (AUDIO_INBUF_SIZE * 2);
while(av_read_frame(fmt_cntx,&readingPacket)==0){
if(readingPacket.stream_index==audioSteam->index){
AVPacket decodingPacket=readingPacket;
while(decodingPacket.size>0){
int gotFrame=0;
int result=avcodec_decode_audio4(codecContext,frame,&gotFrame,&decodingPacket);
if(result<0){
av_frame_free(&frame);
avformat_close_input(&fmt_cntx);
return NULL;
}
if(result>=0 && gotFrame){
int data_size=frame->nb_samples*frame->channels;
swr_convert(swr,&pAudioBuffer,frame->nb_samples,frame->extended_data,frame->nb_samples);
jshort *outShortArray=(*pEnv)->NewShortArray(pEnv,data_size);
(*pEnv)->SetShortArrayRegion(pEnv,outShortArray,0,data_size,pAudioBuffer);
(*pEnv)->CallVoidMethod(pEnv,pObj,callBackShortBuffer,outShortArray,data_size);
(*pEnv)->DeleteLocalRef(pEnv,outShortArray);
decodingPacket.size -= result;
decodingPacket.data += result;
}else{
decodingPacket.size=0;
decodingPacket.data=NULL;
}}
}
// free the packet that av_read_frame filled in, whatever stream it belonged to
av_free_packet(&readingPacket);
}

Resources