ALSA playback interrupted without snd_pcm_hw_params_get_* calls - audio

I'm finding that a simple ALSA playback program behaves differently when I put in some calls to snd_pcm_hw_params_get_* functions. My program plays a sine wave from a buffer repeatedly. When I include the calls, I get a pure tone as I expected. When I remove the calls, however, I get a series of beeps. This worries me, because I would not expect calls that retrieve data to have anything to do with how the sound is played. I get this behavior both on a cheap USB sound card and my (presumably nicer) internal sound card.
Here is the code:
#define GETPARAMS
int main() {
snd_pcm_t *handle;
snd_pcm_hw_params_t *params;
const char name[] = "hw:0,0";
int dir;
snd_pcm_stream_t stream = SND_PCM_STREAM_PLAYBACK;
snd_pcm_access_t access = SND_PCM_ACCESS_RW_INTERLEAVED;
snd_pcm_format_t format = SND_PCM_FORMAT_S16_LE;
unsigned int rate = 48000;
unsigned int channels = 2;
unsigned int periods = 4;
snd_pcm_uframes_t periodsize = 2048;
int num_frames = 2*periodsize;
snd_pcm_hw_params_alloca(&params);
snd_pcm_open(&handle, name, stream, 0);
snd_pcm_hw_params_any(handle, params);
#ifdef GETPARAMS
printf("\nparameters before setting:\n");
snd_pcm_hw_params_get_rate(params, &rate, &dir);
printf(" rate = %d, dir = %d\n", rate, dir);
snd_pcm_hw_params_get_channels(params, &channels);
printf(" channels = %d\n", channels);
snd_pcm_hw_params_get_periods(params, &periods, &dir);
printf(" periods = %d, dir = %d\n", periods, dir);
snd_pcm_hw_params_get_buffer_size(params, &periodsize);
printf(" periodsize = %ld\n", periodsize);
#endif
snd_pcm_hw_params_set_access(handle, params, access);
snd_pcm_hw_params_set_format(handle, params, format);
snd_pcm_hw_params_set_rate_near(handle, params, &rate, &dir);
snd_pcm_hw_params_set_channels(handle, params, 2);
snd_pcm_hw_params_set_periods(handle, params, periods, 0);
snd_pcm_hw_params_set_buffer_size(handle, params, num_frames);
snd_pcm_hw_params(handle, params);
#ifdef GETPARAMS
printf("\nparameters after setting:\n");
snd_pcm_hw_params_get_rate(params, &rate, &dir);
printf(" rate = %d, dir = %d\n", rate, dir);
snd_pcm_hw_params_get_channels(params, &channels);
printf(" channels = %d\n", channels);
snd_pcm_hw_params_get_periods(params, &periods, &dir);
printf(" periods = %d, dir = %d\n", periods, dir);
snd_pcm_hw_params_get_buffer_size(params, &periodsize);
printf(" periodsize = %ld\n\n", periodsize);
#endif
int16_t *data = (int16_t*)calloc(2*periodsize, sizeof(int16_t));
loadpage(data, 2*periodsize);
snd_pcm_sframes_t frames;
snd_pcm_prepare(handle);
for (int i=0; i<8; i++) {
frames = snd_pcm_writei(handle, data, num_frames);
if (frames < 0)
frames = snd_pcm_recover(handle, frames, 0);
if (frames < 0) {
printf("snd_pcm_writei failed: %s\n", snd_strerror(frames));
}
if (frames > 0 && frames < num_frames)
printf("short write (expected %d, write %li)\n", num_frames, frames);
}
snd_pcm_close(handle);
free(data);
}
loadpage() fills the buffer. When I comment out the #define GETPARAMS I get a series of short beeps. When I include it I get a pure tone.
Here is the output when GETPARAMS is defined:
parameters before setting:
rate = 48000, dir = 32766
channels = 2
periods = 4, dir = 32766
periodsize = 2048
parameters after setting:
rate = 48000, dir = 0
channels = 2
periods = 4, dir = 0
periodsize = 4096

You must not call the snd_pcm_hw_params_get_*() functions if the parameters have not yet been set, because at that time the configuration space still contains multiple potential values for each parameter.
To print the current state of the hw_params container, use snd_pcm_hw_params_dump():
snd_output_t *output;
snd_output_stdio_attach(&output, stdout, 0);
...
snd_pcm_hw_params_dump(params, output);
...
snd_output_close(output);
Anyway, the problem is that the initial values of periods, periodsize, and num_frames are inconsistent, and that the _get_ calls overwrite these variables with other values that happen to be consistent.
I do not know what values you actually want to use, but note that the period size and the buffer size are measured in frames, and that one frame contains all samples of all channels, i.e., in this case, one frame has four bytes.
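The frame arithmetic can be checked with a few lines of plain C. This is only a sketch: the helper names are hypothetical and not part of ALSA, and the numbers in the note below are the ones from the question (S16_LE, 2 channels, 4 periods of 2048 frames).

```c
#include <stddef.h>

/* Bytes in one frame: one sample for every channel. */
size_t bytes_per_frame(unsigned channels, unsigned bytes_per_sample)
{
    return (size_t)channels * bytes_per_sample;
}

/* Buffer size, in frames, that is consistent with the period settings. */
unsigned long buffer_frames(unsigned periods, unsigned long period_frames)
{
    return (unsigned long)periods * period_frames;
}

/* Number of int16_t elements needed to hold `frames` frames of S16 audio. */
size_t s16_samples_for(unsigned long frames, unsigned channels)
{
    return (size_t)frames * channels;
}
```

With the question's initial values, 4 periods of 2048 frames would need a buffer of 8192 frames, while the code requests 2*periodsize = 4096. Likewise, writing num_frames = 4096 stereo frames reads 8192 int16_t values, but calloc(2*periodsize, ...) with periodsize = 2048 provides only 4096.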

Related

sending audio via bluetooth a2dp source esp32

I am trying to send a measured I2S analogue signal (e.g. from a mic) to the sink device via Bluetooth instead of the default noise.
Currently I am trying to change bt_app_a2d_data_cb():
static int32_t bt_app_a2d_data_cb(uint8_t *data, int32_t i2s_read_len)
{
if (i2s_read_len < 0 || data == NULL) {
return 0;
}
char* i2s_read_buff = (char*) calloc(i2s_read_len, sizeof(char));
bytes_read = 0;
i2s_adc_enable(I2S_NUM_0);
while(bytes_read == 0)
{
i2s_read(I2S_NUM_0, i2s_read_buff, i2s_read_len,&bytes_read, portMAX_DELAY);
}
i2s_adc_disable(I2S_NUM_0);
// taking care of the watchdog//
TIMERG0.wdt_wprotect=TIMG_WDT_WKEY_VALUE;
TIMERG0.wdt_feed=1;
TIMERG0.wdt_wprotect=0;
uint32_t j = 0;
uint16_t dac_value = 0;
// change 16bit input signal to 8bit
for (int i = 0; i < i2s_read_len; i += 2) {
dac_value = ((((uint16_t) (i2s_read_buff[i + 1] & 0xf) << 8) | ((i2s_read_buff[i + 0]))));
data[j] = (uint8_t) dac_value * 256 / 4096;
j++;
}
// testing for loop
//uint8_t da = 0;
//for (int i = 0; i < i2s_read_len; i++) {
// data[i] = (uint8_t) (i2s_read_buff[i] >> 8);// & 0xff;
// da++;
// if(da>254) da=0;
//}
free(i2s_read_buff);
i2s_read_buff = NULL;
return i2s_read_len;
}
I can hear the sawtooth sound from the sink device.
Any ideas what to do?
Your data can be an array of floats representing the analog signal or its variations; for example, a 32 kHz sound signal contains 32,000 samples for every second of captured sound. If the data is to be transmitted in offline mode, you can prepare the outgoing data as a buffer plus a terminator character, then send the buffer through the Bluetooth module of the sending device, which is connected to the appropriate microcontroller. On the receiving device, once you get the terminator character, such as "\r", you can process the incoming buffer. In my case, I had to send a string array of numbers, but I often received one or two unknown characters at the start; to avoid this, I reject them while filling the receive container.
how to trim unknown first characters of string in code vision
If you want it in online mode, i.e. your data must be transmitted and played concurrently, you must account for delays and allow a reasonable processing time for all the microcontrollers and devices involved, such as Bluetooth modules, EEPROM ICs and so on.
I'm also working on a project "a2dp source esp32".
I'm playing a wav-file from spiffs.
If the wav-file is 44100, 16-bit, stereo then you can directly write a stream of bytes from the file to the array data[ ].
When I tried to write less data than the len variable specifies and return a smaller value (for example 88), I got an error. Now I'm trying to figure out how to reduce this buffer because of the large latency (len=512).
Also, the data in the array data[ ] is stored as stereo.
Example: read data from file to data[ ]-array:
size_t read;
read = fread((void*) data, 1, len, fwave);//fwave is a file
if(read<len){//If get EOF, go to begin of the file
fseek(fwave , 0x2C , SEEK_SET);//skip wav-header 44 bytes
read = fread((void*) (&(data[read])), 1, len-read, fwave);//read up
}
If file mono, I convert it to stereo like this (I read half and then double data):
int32_t lenHalf=len/2;
read = fread((void*) data, 1, lenHalf, fwave);
if(read<lenHalf){
fseek(fwave , 0x2C , SEEK_SET);//skip wav-header 44 bytes
read = fread((void*) (&(data[read])), 1, lenHalf-read, fwave);//read up
}
//copy to the second channel
uint16_t *data16=(uint16_t*)data;
for (int i = lenHalf/2-1; i >= 0; i--) {
data16[(i << 1)] = data16[i];
data16[(i << 1) + 1] = data16[i];
}
I think you get the sawtooth sound for one of these reasons:
your data is mono?
the value in your "return i2s_read_len;" is less than len
you convert the 16-bit input signal to 8-bit, but the array data[ ] holds 16-bit samples: 2ByteLeft-2ByteRight-2ByteLeft-2ByteRight-...
I'm not sure; it's a guess.
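If the guess about the layout is right, the conversion would have to produce interleaved 16-bit stereo rather than 8-bit values. Below is a minimal sketch in plain C. It assumes the ADC delivers unsigned 12-bit samples in the low bits of each 16-bit word, which is what the mask in the question suggests. The function name is hypothetical and not part of ESP-IDF.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n unsigned 12-bit mono samples into interleaved signed 16-bit
 * stereo (L, R, L, R, ...). `out` must have room for 2 * n samples.
 * Returns the number of bytes written to `out`. */
size_t adc12_to_s16_stereo(const uint16_t *adc, size_t n, int16_t *out)
{
    for (size_t i = 0; i < n; i++) {
        /* Centre the unsigned value around zero, then scale 12 -> 16 bits. */
        int32_t centered = (int32_t)(adc[i] & 0x0FFF) - 2048;
        int16_t s = (int16_t)(centered * 16);
        out[2 * i]     = s;   /* left channel  */
        out[2 * i + 1] = s;   /* right channel */
    }
    return n * 2 * sizeof(int16_t);
}
```

Each mono input sample becomes four output bytes, so a callback asked for len bytes would need len / 4 ADC samples to fill the buffer completely.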

Display "Hello World" on framebuffer in linux

I am running Linux 3.14 on my ARM target and I want to show some lines of characters on the display using the framebuffer. I can change the colors of the display using the code below.
#include <stdio.h>
unsigned char colours[8][4] = {
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
{ 0x00, 0xFF, 0x00, 0xFF }, // green
};
int frames[] = {0,5,10,15,20,25,30};
int columns = 800;
int lines = 480;
#define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))
int frame(int c, int l){
int i;
for(i=0; i < ARRAY_SIZE(frames); i++){
if((c==frames[i])&&((l>=frames[i])&&l<=(lines-frames[i]))){
return 1;
}
if((c==columns-frames[i])&&((l>=frames[i])&&l<=(lines-frames[i]))){
return 1;
}
if((l==frames[i])&&((c>=frames[i])&&c<=(columns-frames[i]))){
return 1;
}
if((l==lines-frames[i])&&((c>=frames[i])&&c<=(columns-frames[i]))){
return 1;
}
}
return 0;
}
int main(int argc, char **argv)
{
unsigned char pixel[3];
int l, c;
char *filename = argv[1];
printf ("Device : %s\n",filename);
FILE *f = fopen(filename,"wb");
if(f){
printf("Device open success \n");
for(l=0; l<lines; l++){
for(c=0; c < columns; c++){
if(frame(c,l)){
fwrite(colours[3], 1, sizeof(colours[3]), f);
}else{
int colour = c/(columns/ARRAY_SIZE(colours));
fwrite(colours[colour], 1, sizeof(colours[colour]), f);
}
}
}
fclose(f);
}
else
printf("Device open failed \n");
return 0;
}
In the same way, I want to show some lines of characters on the display. For example, I want to show the characters "Hello world !!!" on the display using the framebuffer.
Could anyone help me work this out?
You can find an elegant piece of code to do this in tslib. tslib is a C library for filtering touchscreen events. Actually, you don't need tslib itself for your purpose (so you don't have to build it). In its tests you can find a utility to access the framebuffer.
They provide fbutils.h, whose implementation you can find in fbutils-linux.c. This code is very simple in that it directly manipulates the Linux framebuffer and does not have any dependencies. Currently it's not even 500 lines, and if you only want to display text, you can remove the other, irrelevant functionality. It supports two fonts, font_8x8 and font_8x16, whose definitions you can find in the respective .c files.
I won't go into code details as it is easy to understand. I will just list the current API and provide simpler code for the open and close functionality.
int open_framebuffer(void);
void close_framebuffer(void);
void setcolor(unsigned colidx, unsigned value);
void put_cross(int x, int y, unsigned colidx);
void put_string(int x, int y, char *s, unsigned colidx);
void put_string_center(int x, int y, char *s, unsigned colidx);
void pixel(int x, int y, unsigned colidx);
void line(int x1, int y1, int x2, int y2, unsigned colidx);
void rect(int x1, int y1, int x2, int y2, unsigned colidx);
void fillrect(int x1, int y1, int x2, int y2, unsigned colidx);
To manipulate the linux framebuffer, first you should memory map it into your process address space. After memory mapping you can access it just like an array. Using some ioctl you can get information about the framebuffer such as resolution, bytes-per-pixel etc. See here for details.
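Once the framebuffer is mapped, "accessing it like an array" comes down to computing a byte offset from the line length and the pixel size. The sketch below uses hypothetical helper names. It assumes a 32-bit pixel format for the write helper and ignores panning offsets. In real code, line_length would come from the fixed screen info and the pixel size from the variable screen info.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of pixel (x, y) in a linear framebuffer. */
size_t fb_pixel_offset(unsigned x, unsigned y,
                       unsigned bytes_per_pixel, unsigned line_length)
{
    return (size_t)y * line_length + (size_t)x * bytes_per_pixel;
}

/* Store one 32-bit pixel value; the caller must keep (x, y) in range. */
void fb_put_pixel32(uint8_t *fb, unsigned x, unsigned y,
                    unsigned line_length, uint32_t value)
{
    memcpy(fb + fb_pixel_offset(x, y, 4, line_length), &value, sizeof value);
}
```

Note that line_length may be larger than xres * bytes_per_pixel, which is why the offset uses it rather than the visible width.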
In the code below, you can pass the name of the fb device to open it, such as /dev/fb0. You can use the rest of the functions in the original code for drawing.
int open_framebuffer(const char *fbdevice)
{
uint32_t y, addr;
fb_fd = open(fbdevice, O_RDWR);
if (fb_fd == -1) {
perror("open fbdevice");
return -1;
}
if (ioctl(fb_fd, FBIOGET_FSCREENINFO, &fix) < 0) {
perror("ioctl FBIOGET_FSCREENINFO");
close(fb_fd);
return -1;
}
if (ioctl(fb_fd, FBIOGET_VSCREENINFO, &var) < 0) {
perror("ioctl FBIOGET_VSCREENINFO");
close(fb_fd);
return -1;
}
xres_orig = var.xres;
yres_orig = var.yres;
if (rotation & 1) {
/* 1 or 3 */
y = var.yres;
yres = var.xres;
xres = y;
} else {
/* 0 or 2 */
xres = var.xres;
yres = var.yres;
}
fbuffer = mmap(NULL,
fix.smem_len,
PROT_READ | PROT_WRITE, MAP_FILE | MAP_SHARED,
fb_fd,
0);
if (fbuffer == (unsigned char *)-1) {
perror("mmap framebuffer");
close(fb_fd);
return -1;
}
memset(fbuffer, 0, fix.smem_len);
bytes_per_pixel = (var.bits_per_pixel + 7) / 8;
transp_mask = ((1 << var.transp.length) - 1) <<
var.transp.offset; /* transp.length unlikely > 32 */
line_addr = malloc(sizeof(*line_addr) * var.yres_virtual);
addr = 0;
for (y = 0; y < var.yres_virtual; y++, addr += fix.line_length)
line_addr[y] = fbuffer + addr;
return 0;
}
void close_framebuffer(void)
{
memset(fbuffer, 0, fix.smem_len);
munmap(fbuffer, fix.smem_len);
close(fb_fd);
free(line_addr);
xres = 0;
yres = 0;
rotation = 0;
}
You can find examples of its usage in test programs in the folder, such as ts_test.c.
You can extend this code to support other fonts, display images etc.
Good luck!
First, I strongly suggest avoiding the fopen/fwrite functions to access devices. These functions manage internal buffers that can cause trouble. Prefer the open and write functions.
Next, you can't keep using a series of if .. then .. else .. to render real graphics. You need to allocate a buffer that represents your framebuffer. Its size will be columns * lines * 4 bytes (one byte per primary color, plus one for alpha). To write the pixel at column c of line l, use something like:
buf[(l * columns + c) * 4 + 0] = red_value;
buf[(l * columns + c) * 4 + 1] = green_value;
buf[(l * columns + c) * 4 + 2] = blue_value;
buf[(l * columns + c) * 4 + 3] = alpha_value;
Once your buffer is fully filled, write it with:
write(fd, buf, columns * lines * 4);
(where fd is the file descriptor returned by fd = open("/dev/fb0", O_WRONLY);)
Check that you are now able to set arbitrary pixels on your framebuffer.
Finally, you need a database of rendered characters. You could create it yourself, but I suggest using https://github.com/dhepper/font8x8.
The fonts are monochrome, so each bit represents one pixel. On your framebuffer, you need 4 bytes per pixel, so you will have to do some conversion.
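The bit-to-pixel conversion could look roughly like the sketch below. It assumes each glyph is 8 bytes, one per row, with the least significant bit as the leftmost pixel (this is how I understand font8x8 to be laid out; verify against its README). The function name is hypothetical, and the caller must make sure the glyph fits inside the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand a monochrome 8x8 glyph into a buffer with 4 bytes per pixel.
 * buf:     pixel buffer that is `columns` pixels wide
 * x0, y0:  top-left corner of the glyph in the buffer
 * fg, bg:  4-byte pixel values for set and clear bits */
void draw_glyph8x8(uint8_t *buf, int columns, int x0, int y0,
                   const uint8_t glyph[8],
                   const uint8_t fg[4], const uint8_t bg[4])
{
    for (int row = 0; row < 8; row++) {
        for (int col = 0; col < 8; col++) {
            /* Assumption: bit 0 of each row byte is the leftmost pixel. */
            int set = (glyph[row] >> col) & 1;
            const uint8_t *px = set ? fg : bg;
            size_t off = ((size_t)(y0 + row) * columns + (x0 + col)) * 4;
            for (int k = 0; k < 4; k++)
                buf[off + k] = px[k];
        }
    }
}
```

A string is then drawn by calling this once per character and advancing x0 by 8 each time.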
This is a really basic way to access the framebuffer; there is plenty of room for improvement:
columns, lines and the pixel representation should be negotiated/retrieved using the FBIO*ET_*SCREENINFO ioctls.
using write to access the framebuffer is not the preferred method. It is slow and does not make it easy to update the framebuffer. The preferred method uses mmap.
if you want to animate the framebuffer, you need to use a double buffer: allocate a buffer twice as large as necessary, write alternately to the first or second half, and switch the displayed buffer with FBIOPAN_DISPLAY.
font8x8 is not ideal. You may want to use any other font available on the web. You then need a library to decode the font format and to render a glyph (= a letter) at a particular size into a buffer (the rasterization step) that you can copy to your screen; libfreetype handles both, and libpango can handle text layout on top of it.
you may want to accelerate the buffer copy between your glyph database and your screen framebuffer (the composition step), but that is a far longer story that involves true GPU drivers.

Large overhead in CUDA kernel launch outside GPU execution

I am measuring the running time of kernels, as seen from a CPU thread, by measuring the interval from before launching a kernel to after a cudaDeviceSynchronize (using gettimeofday). I have a cudaDeviceSynchronize before I start recording the interval. I also instrument the kernels to record the timestamp on the GPU (using clock64) at the start of the kernel by thread(0,0,0) of each block from block(0,0,0) to block(occupancy-1,0,0) to an array of size equal to number of SMs. Every thread at the end of the kernel code, updates the timestamp to another array (of the same size) at the index equal to the index of the SM it runs on.
The intervals calculated from the two arrays are 60-70% of that measured from the CPU thread.
For example, on a K40, while gettimeofday gives an interval of 140ms, the avg of intervals calculated from GPU timestamps is only 100ms. I have experimented with many grid sizes (15 blocks to 6K blocks) but have found similar behavior so far.
__global__ void some_kernel(long long *d_start, long long *d_end){
if(threadIdx.x==0){
d_start[blockIdx.x] = clock64();
}
//some_kernel code
d_end[blockIdx.x] = clock64();
}
Does this seem possible to the experts?
Does this seem possible to the experts?
I suppose anything is possible for code you haven't shown. After all, you may just have a silly bug in any of your computation arithmetic. But if the question is "is it sensible that there should be 40ms of unaccounted-for time overhead on a kernel launch, for a kernel that takes ~140ms to execute?" I would say no.
I believe the method I outlined in the comments is reasonably accurate. Take the minimum clock64() timestamp from any thread in the grid (but see note below regarding SM restriction). Compare it to the maximum time stamp of any thread in the grid. The difference will be comparable to the reported execution time of gettimeofday() to within 2 percent, according to my testing.
Here is my test case:
$ cat t1040.cu
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#define LS_MAX 2000000000U
#define MAX_SM 64
#define cudaCheckErrors(msg) \
do { \
cudaError_t __err = cudaGetLastError(); \
if (__err != cudaSuccess) { \
fprintf(stderr, "Fatal error: %s (%s at %s:%d)\n", \
msg, cudaGetErrorString(__err), \
__FILE__, __LINE__); \
fprintf(stderr, "*** FAILED - ABORTING\n"); \
exit(1); \
} \
} while (0)
#include <time.h>
#include <sys/time.h>
#define USECPSEC 1000000ULL
__device__ int result;
__device__ unsigned long long t_start[MAX_SM];
__device__ unsigned long long t_end[MAX_SM];
unsigned long long dtime_usec(unsigned long long start){
timeval tv;
gettimeofday(&tv, 0);
return ((tv.tv_sec*USECPSEC)+tv.tv_usec)-start;
}
__device__ __inline__ uint32_t __mysmid(){
uint32_t smid;
asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
return smid;}
__global__ void kernel(unsigned ls){
unsigned long long int ts = clock64();
unsigned my_sm = __mysmid();
atomicMin(t_start+my_sm, ts);
// junk code to waste time
int tv = ts&0x1F;
for (unsigned i = 0; i < ls; i++){
tv &= (ts+i);}
result = tv;
// end of junk code
ts = clock64();
atomicMax(t_end+my_sm, ts);
}
// optional command line parameter 1 = kernel duration, parameter 2 = number of blocks, parameter 3 = number of threads per block
int main(int argc, char *argv[]){
unsigned ls;
if (argc > 1) ls = atoi(argv[1]);
else ls = 1000000;
if (ls > LS_MAX) ls = LS_MAX;
int num_sms = 0;
cudaDeviceGetAttribute(&num_sms, cudaDevAttrMultiProcessorCount, 0);
cudaCheckErrors("cuda get attribute fail");
int gpu_clk = 0;
cudaDeviceGetAttribute(&gpu_clk, cudaDevAttrClockRate, 0);
if ((num_sms < 1) || (num_sms > MAX_SM)) {printf("invalid sm count: %d\n", num_sms); return 1;}
unsigned blks;
if (argc > 2) blks = atoi(argv[2]);
else blks = num_sms;
if ((blks < 1) || (blks > 0x3FFFFFFF)) {printf("invalid blocks: %d\n", blks); return 1;}
unsigned ntpb;
if (argc > 3) ntpb = atoi(argv[3]);
else ntpb = 256;
if ((ntpb < 1) || (ntpb > 1024)) {printf("invalid threads: %d\n", ntpb); return 1;}
kernel<<<1,1>>>(100); // warm up
cudaDeviceSynchronize();
cudaCheckErrors("kernel fail");
unsigned long long *h_start, *h_end;
h_start = new unsigned long long[num_sms];
h_end = new unsigned long long[num_sms];
for (int i = 0; i < num_sms; i++){
h_start[i] = 0xFFFFFFFFFFFFFFFFULL;
h_end[i] = 0;}
cudaMemcpyToSymbol(t_start, h_start, num_sms*sizeof(unsigned long long));
cudaMemcpyToSymbol(t_end, h_end, num_sms*sizeof(unsigned long long));
unsigned long long htime = dtime_usec(0);
kernel<<<blks,ntpb>>>(ls);
cudaDeviceSynchronize();
htime = dtime_usec(htime);
cudaMemcpyFromSymbol(h_start, t_start, num_sms*sizeof(unsigned long long));
cudaMemcpyFromSymbol(h_end, t_end, num_sms*sizeof(unsigned long long));
cudaCheckErrors("some error");
printf("host elapsed time (ms): %f \n device sm clocks:\n start:", htime/1000.0f);
unsigned long long max_diff = 0;
for (int i = 0; i < num_sms; i++) {printf(" %12llu ", h_start[i]);}
printf("\n end: ");
for (int i = 0; i < num_sms; i++) {printf(" %12llu ", h_end[i]);}
for (int i = 0; i < num_sms; i++) if ((h_start[i] != 0xFFFFFFFFFFFFFFFFULL) && (h_end[i] != 0) && ((h_end[i]-h_start[i]) > max_diff)) max_diff=(h_end[i]-h_start[i]);
printf("\n max diff clks: %llu\nmax diff kernel time (ms): %f\n", max_diff, max_diff/(float)(gpu_clk));
return 0;
}
$ nvcc -o t1040 t1040.cu -arch=sm_35
$ ./t1040 1000000 1000 128
host elapsed time (ms): 2128.818115
device sm clocks:
start: 3484744 3484724
end: 2219687393 2228431323
max diff clks: 2224946599
max diff kernel time (ms): 2128.117432
$
Notes:
This code can only be run on a cc3.5 or higher GPU due to the use of 64-bit atomicMin and atomicMax.
I've run it on a variety of grid configurations, on both a GT640 (very low end cc3.5 device) and a K40c (high end), and the timing results between host and device agree to within 2% for reasonably long kernel execution times. If you pass 1 as the first command line parameter, with very small grid sizes, the kernel execution time will be very short (nanoseconds) whereas the host will see about 10-20us. This is kernel launch overhead being measured. So the 2% number applies to kernels that take much longer than 20us to execute.
It accepts 3 (optional) command line parameters, the first of which varies the amount of time the kernel will execute.
My timestamping is done on a per-SM basis, because the clock64() resource is indicated to be a per-SM resource. The sm clocks are not guaranteed to be synchronized between SMs.
You can modify the grid dimensions. The second optional command line parameter specifies the number of blocks to launch. The third optional command line parameter specifies the number of threads per block. The timing methodology I have shown here should not be dependent on number of blocks launched or number of threads per block. If you specify fewer blocks than SMs, the code should ignore "unused" SM data.
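The host-side reduction at the end of the test case can be isolated into two small functions, shown here in plain C so the arithmetic is easy to check. The function names are hypothetical. The clock rate is assumed to be in kHz, as reported by cudaDevAttrClockRate, so clocks divided by kHz gives milliseconds.

```c
/* Largest (end - start) over all SMs that recorded both timestamps.
 * Unused SMs keep their initial values (all ones / zero) and are skipped. */
unsigned long long max_elapsed_clocks(const unsigned long long *start,
                                      const unsigned long long *end, int n)
{
    unsigned long long best = 0;
    for (int i = 0; i < n; i++) {
        if (start[i] == 0xFFFFFFFFFFFFFFFFULL || end[i] == 0)
            continue;               /* this SM never ran a block */
        if (end[i] > start[i] && end[i] - start[i] > best)
            best = end[i] - start[i];
    }
    return best;
}

/* Convert a clock count to milliseconds given the clock rate in kHz. */
double clocks_to_ms(unsigned long long clocks, int clock_khz)
{
    if (clock_khz <= 0)
        return 0.0;
    return (double)clocks / (double)clock_khz;
}
```

Timestamps are only compared within the same SM, never across SMs, because the per-SM clocks are not guaranteed to be synchronized.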

Using Lame function hip_decode in Android NDK to decode mp3 return 0

I am using Lame's mpglib to decode mp3 to PCM in the Android NDK for playback. But when I call hip_decode(), it returns 0, meaning "need more data before we can complete the decode". I have no idea how to solve this. Can someone help me? Here is my code:
void CBufferWrapper::ConvertMp3toPCM (AAssetManager* mgr, const char *filename){
Print ("ConvertMp3toPCM:file:%s", filename);
AAsset* asset = AAssetManager_open (mgr, filename, AASSET_MODE_UNKNOWN);
// the asset might not be found
assert (asset != NULL);
// open asset as file descriptor
off_t start, length;
int fd = AAsset_openFileDescriptor (asset, &start, &length);
assert (0 <= fd);
long size = AAsset_getLength (asset);
char* buffer = (char*)malloc (sizeof(char)*size);
memset (buffer, 0, size*sizeof(char));
AAsset_read (asset, buffer, size);
AAsset_close (asset);
hip_t ht = hip_decode_init ();
int count = hip_decode (ht, (unsigned char*)buffer, size, pcm_l, pcm_r);
free (buffer);
Print ("ConvertMp3toPCM: length:%ld,pcmcount=%d",length, count);
}
I used MACRO "HAVE_MPGLIB" to compile Lame in NDK. So I think it should work for decoding literally.
Yesterday I had the same problem, except that I was using lame_enc.dll. I did not know how to resolve this 0 return value either, which is the reason for this post.
Create a buffer to hold the mp3 data: unsigned char mp3Data[4096];
Create two buffers for the pcm data, but bigger than the mp3 one:
short lpcm[4096 * 100];
short rpcm[4096 * 100];
Open the mp3 file and initialize hip.
Now enter a do-while loop that runs until the number of bytes read is 0 (the end of the file).
Inside the loop, read 4096 bytes into mp3Data and call hip_decode with
hip_decode(ht, mp3Data, bytesRead, lpcm, rpcm);
You are right, it returns 0. It is asking you for more data.
You need to repeat the reading of 4096 bytes and the call to hip_decode until it returns a valid number of samples.
Here is the important part of my program:
int total = 0;
int hecho = 0;
int leido = 0;
int lon = 0;
int x;
do
{
total = fread(mp3b, 1, MAXIMO, fich);
leido += total;
x = hip_decode(hgf, mp3b, total, izquierda, derecha);
if(x > 0)
{
int tamanio;
int y;
tamanio = 1.45 * x + 9200;
unsigned char * bu = (unsigned char *) malloc(tamanio);
y = lame_encode_buffer(lamglofla, izquierda, derecha, x, bu, tamanio);
fwrite(bu, 1, y, fichs);
free(bu);
}
}while(total > 0);
My program decodes an mp3 file and encodes the output into another mp3 file.
I hope this is useful.
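The feed-until-output pattern can also be sketched without LAME, to make the control flow explicit. Everything here is hypothetical: decode_fn stands in for a call such as hip_decode, and demo_decoder is a fake that returns 0 for its first two calls, mimicking a decoder that is still buffering input.

```c
#include <stddef.h>

/* Decoder callback: returns the number of samples produced by this call,
 * 0 if it needs more input, or a negative value on error. */
typedef int (*decode_fn)(void *ctx, const unsigned char *chunk, size_t len);

/* Feed `len` bytes to `decode` in pieces of at most `chunk_size` bytes.
 * A return value of 0 from the decoder is not an error: keep feeding.
 * Returns the total number of samples decoded, or -1 on error. */
long feed_in_chunks(const unsigned char *buf, size_t len, size_t chunk_size,
                    decode_fn decode, void *ctx)
{
    long total = 0;
    size_t pos = 0;

    if (chunk_size == 0)
        return -1;

    while (pos < len) {
        size_t n = (len - pos < chunk_size) ? len - pos : chunk_size;
        int got = decode(ctx, buf + pos, n);
        if (got < 0)
            return -1;      /* a real decoder error */
        total += got;       /* got == 0 only means "send more data" */
        pos += n;
    }
    return total;
}

/* Stand-in decoder for illustration only: yields nothing for its first
 * two calls, then 1152 samples per call. `ctx` is an int call counter. */
int demo_decoder(void *ctx, const unsigned char *chunk, size_t len)
{
    int *calls = ctx;
    (void)chunk;
    (void)len;
    (*calls)++;
    return (*calls <= 2) ? 0 : 1152;
}
```

In a real program the body of the loop would also consume the PCM samples whenever the decoder returns a positive count, as the loop above does with lame_encode_buffer.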

How to convert sample format from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16?

I am decoding AAC to PCM with ffmpeg using avcodec_decode_audio3. However, it decodes into the AV_SAMPLE_FMT_FLTP sample format (PCM 32-bit float planar) and I need AV_SAMPLE_FMT_S16 (PCM 16-bit signed, S16LE).
I know that the ffmpeg command line can do this easily with -sample_fmt. I want to do the same in code, but I still can't figure it out.
audio_resample did not work for me: it fails with the error message: .... conversion failed.
EDIT 9th April 2013: Worked out how to use libswresample to do this... much faster!
At some point in the last 2-3 years FFmpeg's AAC decoder's output format changed from AV_SAMPLE_FMT_S16 to AV_SAMPLE_FMT_FLTP. This means that each audio channel has its own buffer, and each sample value is a 32-bit floating point value scaled from -1.0 to +1.0.
Whereas with AV_SAMPLE_FMT_S16 the data is in a single buffer, with the samples interleaved, and each sample is a signed integer from -32768 to +32767.
And if you really need your audio as AV_SAMPLE_FMT_S16, then you have to do the conversion yourself. I figured out two ways to do it:
1. Use libswresample (recommended)
#include "libswresample/swresample.h"
...
SwrContext *swr;
...
// Set up SWR context once you've got codec information
swr = swr_alloc();
av_opt_set_int(swr, "in_channel_layout", audioCodec->channel_layout, 0);
av_opt_set_int(swr, "out_channel_layout", audioCodec->channel_layout, 0);
av_opt_set_int(swr, "in_sample_rate", audioCodec->sample_rate, 0);
av_opt_set_int(swr, "out_sample_rate", audioCodec->sample_rate, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16, 0);
swr_init(swr);
...
// In your decoder loop, after decoding an audio frame:
AVFrame *audioFrame = ...;
int16_t* outputBuffer = ...;
swr_convert(swr, (uint8_t **)&outputBuffer, audioFrame->nb_samples, (const uint8_t **)audioFrame->extended_data, audioFrame->nb_samples);
And that's all you have to do!
2. Do it by hand in C (original answer, not recommended)
So in your decode loop, when you've got an audio packet you decode it like this:
AVCodecContext *audioCodec; // init'd elsewhere
AVFrame *audioFrame; // init'd elsewhere
AVPacket packet; // init'd elsewhere
int16_t* outputBuffer; // init'd elsewhere
int out_size = 0;
...
int len = avcodec_decode_audio4(audioCodec, audioFrame, &out_size, &packet);
And then, if you've got a full frame of audio, you can convert it fairly easily:
// Convert from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16
int in_samples = audioFrame->nb_samples;
int in_linesize = audioFrame->linesize[0];
int i=0;
float* inputChannel0 = (float*)audioFrame->extended_data[0];
// Mono
if (audioFrame->channels==1) {
for (i=0 ; i<in_samples ; i++) {
float sample = *inputChannel0++;
if (sample<-1.0f) sample=-1.0f; else if (sample>1.0f) sample=1.0f;
outputBuffer[i] = (int16_t) (sample * 32767.0f);
}
}
// Stereo
else {
float* inputChannel1 = (float*)audioFrame->extended_data[1];
for (i=0 ; i<in_samples ; i++) {
outputBuffer[i*2] = (int16_t) ((*inputChannel0++) * 32767.0f);
outputBuffer[i*2+1] = (int16_t) ((*inputChannel1++) * 32767.0f);
}
}
// outputBuffer now contains 16-bit PCM!
I've left a couple of things out for clarity... the clamping in the mono path should ideally be duplicated in the stereo path. And the code can be easily optimized.
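As a sketch of what that could look like with the clamping shared by every channel, here is a standalone version in plain C that does not depend on FFmpeg. The function names are hypothetical, and `planes` plays the role of extended_data cast to float pointers. Rounding to nearest is a choice made for this sketch, not something the code above does.

```c
#include <stdint.h>

/* Clamp to [-1.0, 1.0], scale to 16 bits and round to nearest. */
int16_t float_to_s16(float s)
{
    if (s < -1.0f) s = -1.0f;
    else if (s > 1.0f) s = 1.0f;
    float scaled = s * 32767.0f;
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert planar float audio (one buffer per channel) into interleaved
 * 16-bit samples. `out` must hold nb_samples * channels elements. */
void planar_to_interleaved_s16(const float *const *planes, int channels,
                               int nb_samples, int16_t *out)
{
    for (int i = 0; i < nb_samples; i++)
        for (int c = 0; c < channels; c++)
            out[i * channels + c] = float_to_s16(planes[c][i]);
}
```

Because the clamp lives in float_to_s16, mono, stereo and multichannel input all take the same path.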
I found two resampling functions in FFmpeg. Their performance may be better.
avresample_convert()
http://libav.org/doxygen/master/group__lavr.html
swr_convert() http://spirton.com/svn/MPlayer-SB/ffmpeg/libswresample/swresample_test.c
Thanks Reuben for a solution to this. I did find that some of the sample values were slightly off when compared with a straight ffmpeg -i file.wav. It seems that in the conversion, they use a round() on the value.
To do the conversion, I did what you did, with a bit of modification to work for any number of channels:
if (audioCodecContext->sample_fmt == AV_SAMPLE_FMT_FLTP)
{
int nb_samples = decoded_frame->nb_samples;
int channels = decoded_frame->channels;
int outputBufferLen = nb_samples * channels * 2;
short* outputBuffer = new short[outputBufferLen/2];
for (int i = 0; i < nb_samples; i++)
{
for (int c = 0; c < channels; c++)
{
float* extended_data = (float*)decoded_frame->extended_data[c];
float sample = extended_data[i];
if (sample < -1.0f) sample = -1.0f;
else if (sample > 1.0f) sample = 1.0f;
outputBuffer[i * channels + c] = (short)round(sample * 32767.0f);
}
}
// Do what you want with the data etc.
}
I went from ffmpeg 0.11.1 -> 1.1.3 and found the change of sample format annoying. I looked at setting the request_sample_fmt to AV_SAMPLE_FMT_S16 but it seems the aac decoder doesn't support anything other than AV_SAMPLE_FMT_FLTP anyway.

Resources