Deadlock without an explicit lock - multithreading

I am testing a pthread program.
The program is simple: the main thread creates a child thread, and both threads operate on a shared queue.
The child thread scans the queue in an infinite loop and reports the minimum element and its position.
The main thread also runs a loop; each iteration deletes the minimum element found by the child thread from the queue and appends some new elements to the end of the queue.
The minimum element, its position, and the queue are all declared as global variables.
The main thread ends when the queue is empty, and it then cancels the child thread.
The process is somewhat like a breadth-first search.
The queue is implemented as an array with a size counter. Deletion replaces the element to be deleted with the last element and decrements the size counter by one.
No lock is used, yet the program gets stuck when it runs.
More surprisingly, if I insert some printf statements to view the status, it may finish.
What causes this program to run endlessly?
struct multiblocks_pthread_args {
volatile int local_e;
volatile int local_v;
volatile int local_pos;
int* Q;
int* val;
volatile int* size;
} para;
volatile int update = 0;
void* child_thread ( void* args ) {
pthread_setcanceltype ( PTHREAD_CANCEL_ASYNCHRONOUS, NULL );
multiblocks_pthread_args* arglist = ( multiblocks_pthread_args* ) args;
bindToCore ( 1 );
int* list = arglist -> Q, * value = arglist -> val;
while ( true ) {
int size, e = 0, v, pos; // e stays 0 when the queue is empty
do {
size = * ( arglist->size );
v = INF;
pos = 0;
update = 0;
for ( int i = 0; i < size; i++ ) {
int vi = value[i];
if ( vi < v ) {
pos = i;
v = vi;
}
}
} while ( update );
if ( size > 0 ) e = list[pos];
arglist->local_e = e;
arglist->local_pos = pos;
arglist->local_v = v;
}
return NULL;
}
void main_thread () {
int size;
int* Q = ( int* ) malloc ( sizeof ( int ) * numNode );
int** hash = ( int** ) malloc ( sizeof ( int* ) * numNode );
NodeColor* color = ( NodeColor* ) malloc ( sizeof ( NodeColor ) * numNode );
// NodeColor is a enum with 3 values: WHITE, GRAY, BLACK
memset ( color, 0, sizeof ( NodeColor ) * numNode );
pthread_t tid;
para.val = ( int* ) malloc ( sizeof ( int ) * numNode );
para.Q = Q;
para.size = &size;
pthread_create ( &tid, NULL, child_thread, &para );
// Only one element is in the queue
size = 0;
para.Q[size] = 0;
para.val[size] = 0;
hash[0] = &para.val[size]; // hash is used to modify the value of particular element
++size;
color[0] = GRAY;
while ( true ) {
int global_e, global_v = INF, global_pos;
global_e = para.local_e, global_v = para.local_v, global_pos = para.local_pos;
if ( size == 0 ) break;
if ( color[global_e] != BLACK ) {
value[global_e] = global_v, color[global_e] = BLACK;
if ( size > 0 ) {
--size;
para.Q[global_pos] = para.Q[size];
para.val[global_pos] = para.val[size];
hash[para.Q[global_pos]] = & para.val[global_pos];
update = 1;
}
for ( int i = 0; i < MAXDEG; ++i ) {
int ee = ;// new element;
int vv = ;// value of new element;
if ( /* if new element is valid */ ) {
if ( color[ee] == WHITE ) { // WHITE means ee is not in the queue
para.Q[size] = ee;
para.val[size] = vv;
hash[ee] = &para.val[size];
++size, color[ee] = GRAY;
} else {
*hash[ee] = vv;
}
update = 1;
}
}
}
}
free ( Q );
pthread_cancel ( tid );
printf ( "Computation finishes!!!" );
return ;
}

That's not a deadlock but a race condition.
The overall structure of your hang is this: you start with a WHITE item at index 0, and this loop goes on forever:
size = 1;
while (size != 0) {
if (WHITE) --size;
for (...) {
if (WHITE) ++size;
}
}
The only way this changes is if your child thread sets pos to something other than 0. But the child thread only does that when size is greater than 1. There you have your race condition.
My diagnosis may not be accurate; cleaner code would help a lot. Names like Q, e, and v save you a couple of keystrokes but can easily cost you days, as in this example. You also use numbers and enums interchangeably, which is bad practice.
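One common way to rule out this kind of race is to guard every access to the shared queue with a single mutex, so that the scan for the minimum and the delete-by-swap can never interleave. The following is only a minimal sketch under that assumption, not the asker's code: the names shared_queue, queue_push, and queue_pop_min and the fixed capacity are hypothetical, and the scan is folded into the consumer call instead of running in a separate spinning thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 1024  /* illustrative fixed capacity (assumption) */

/* Hypothetical shared queue: an array plus a size counter, as in the
   question, but every field is only touched while holding the mutex. */
typedef struct {
    pthread_mutex_t lock;
    int items[QUEUE_CAPACITY];
    int values[QUEUE_CAPACITY];
    size_t size;
} shared_queue;

static void queue_init(shared_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->size = 0;
}

/* Append an element; returns 0 on success, -1 if the queue is full. */
static int queue_push(shared_queue *q, int item, int value)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->size < QUEUE_CAPACITY) {
        q->items[q->size] = item;
        q->values[q->size] = value;
        q->size++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Find the minimum-value element and delete it by moving the last element
   into its slot, all inside one critical section. Returns 0 and fills the
   outputs, or -1 if the queue is empty. */
static int queue_pop_min(shared_queue *q, int *item, int *value)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->size > 0) {
        size_t pos = 0;
        for (size_t i = 1; i < q->size; i++) {
            if (q->values[i] < q->values[pos])
                pos = i;
        }
        assert(pos < q->size);
        *item = q->items[pos];
        *value = q->values[pos];
        q->size--;
        q->items[pos] = q->items[q->size];
        q->values[pos] = q->values[q->size];
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}
```

Because the position is computed and used under the same lock, it can never refer to a queue state that the other thread has already changed.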

ESP8266 painless mesh, sometimes does not connect

I have a simple setup between 2 wemos d1 boards. They work with a painless mesh.
The devices each have buttons and LEDs with which users can interact with one another.
The problem I am having is that sometimes the two units don't reconnect after one of them is turned off. I am extensively testing what happens when one of the two nodes drops off and then comes back on again. Sometimes they connect fast, sometimes slowly, and sometimes they won't connect at all.
Resetting the turned-off module again mostly works, but sometimes I need to reset the first module as well or they will never connect again. Judging by the LEDs and operation, the program keeps running. Resetting both devices always works to reconnect the two.
// #define wrong_led // I may or may not have made a slight error in soldering the duo leds
bool connected = 0 ;
const uint32_t R1 = 1000 ;
const uint32_t R2 = 4700 ;
const uint32_t threshold[] =
{
/*4 * 1023 * R1 / ( R2 + ( 4 * R1) ) , // 470 ->*/ 682,
/*3 * 1023 * R1 / ( R2 + ( 3 * R1) ) , // 398 ->*/ 579,
/*2 * 1023 * R1 / ( R2 + ( 2 * R1) ) , // 305 ->*/ 446,
/*1 * 1023 * R1 / ( R2 + ( 1 * R1) ) , // 179 ->*/ 262,
/*0 * 1023 * R1 / ( R2 + ( 0 * R1) ) , // 0 */ 0,
} ;
const int nSections = 5 ;
#ifdef wrong_led
const int red[] = { D0, D2, D4, D6, 3 } ;
const int green[] = { D1, D3, D5, D7, 1 } ;
#else
const int green[] = { D0, D2, D4, D6, 3 } ;
const int red[] = { D1, D3, D5, D7, 1 } ;
#endif
const int switchesPin = A0 ;
uint32_t timeOut[nSections] = {0,0,0,0} ;
const int debugPin = 2 ; // DEBUG TEST ME
const uint32_t timeOutInterval = 3000 ;
const uint32_t sendInterval = 2000 ;
const uint32_t connectionTimeout = 10000 ;
enum tokenStates
{
AVAILABLE,
IN_POSSESSION,
TAKEN,
} ;
uint8_t token[ nSections ] ;
Debounce button[] =
{
Debounce ( 255 ),
Debounce ( 255 ),
Debounce ( 255 ),
Debounce ( 255 ),
Debounce ( 255 )
} ;
/************** FUNCTIONS **************/
void updateLEDs()
{
if( !connected )
{
REPEAT_MS( 500 )
{
for (int i = 0; i < nSections ; i++) digitalWrite( red[i], !digitalRead(red[i] )) ; // toggle all red lights during connecting
}
END_REPEAT
}
else for (int i = 0; i < nSections ; i++)
{
switch (token[i])
{
case AVAILABLE: analogWrite( green[i], 32 ) ; // green
digitalWrite( red[i], LOW ) ; break;
case IN_POSSESSION:analogWrite( green[i], 32 ) ; // yellow
digitalWrite( red[i], HIGH ) ; break;
case TAKEN: digitalWrite( green[i], LOW ) ; // red
digitalWrite( red[i], HIGH ) ; break;
}
}
} ;
void newConnection(uint32_t nodeId)
{
connected = 1 ;
}
void debounceInputs()
{
REPEAT_MS( 50 )
{
int sample = analogRead( switchesPin ) ;
for (int i = 0; i < nSections ; i++)
{
uint16_t ref ;
if( threshold[i] >= 35 ) ref = threshold[i] ;
else ref = 35 ;
if( sample >= ref - 35
&& sample <= ref + 35 ) button[i].debounceInputs( 1 ) ;
else button[i].debounceInputs( 0 ) ;
}
} END_REPEAT
}
void processInputs( )
{
for (int i = 0; i < nSections ; i++ )
{
String message = "" ;
message += i ;
message += ',' ;
if( button[i].readInput() == FALLING )
{
if( token[i] == TAKEN ) { continue ; } // token is claimed by another discard button press
else if( token[i] == AVAILABLE ) // if the token is available.....
{
token[i] = IN_POSSESSION ; // claim the token
message += TAKEN ;
}
else if( token[i] == IN_POSSESSION ) // if the token is in possession
{
token[i] = AVAILABLE ; // free up the token
message += AVAILABLE ;
}
mesh.sendBroadcast( message ) ;
}
}
}
void transceiveTokens()
{
static uint8_t index = 0 ;
REPEAT_MS( sendInterval / nSections ) // visit one token per tick, so each claimed token is re-broadcast once every sendInterval
{
if( token[index] == IN_POSSESSION )
{
String message = "" ;
message += index ;
message += ',' ;
message += TAKEN ;
mesh.sendBroadcast( message ) ;
}
if( ++ index == nSections ) index = 0 ;
}
END_REPEAT
for (int i = 0; i < nSections ; i++ )
{
if( token[i] != IN_POSSESSION // if a node which claimed a token is turned off while still possessing the token
&& millis() - timeOut[i] >= timeOutInterval ) // the token becomes available again after a timeout
{
token[i] = AVAILABLE ;
}
}
}
void incomingMessage( uint32 from, String msg )
{
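unsigned int tokenState ;
unsigned int tokenID ;
// parse directly from the String; ignore malformed messages and out-of-range token IDs
if( sscanf( msg.c_str(), "%u,%u", &tokenID, &tokenState ) != 2 ) return ;
if( tokenID >= (unsigned int) nSections ) return ;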
if( token[ tokenID ] == IN_POSSESSION && tokenState == TAKEN ) // if we have the token and another node also claims it...
{ // ...free the token again, and transmit it.
token[ tokenID ] = AVAILABLE ;
String message = "" ;
message += tokenID ;
message += ',' ;
message += AVAILABLE ;
mesh.sendBroadcast( message ) ;
}
if( tokenState == AVAILABLE )
{
token[ tokenID ] = AVAILABLE ;
}
token[tokenID] = tokenState ; // update token with state
if( token[tokenID] == TAKEN )
{
timeOut[tokenID] = millis() ; // set timeout
}
}
void setup()
{
debounceInputs() ; // to be sure
mesh.init( MESH_PREFIX, MESH_PASSWORD, MESH_PORT );
mesh.onReceive(&incomingMessage );
mesh.onNewConnection( &newConnection );
for( int i = 0 ; i < nSections ; i ++ )
{
pinMode( green[i], OUTPUT ) ;
pinMode( red[i], OUTPUT ) ;
digitalWrite( green[i], LOW ) ;
digitalWrite( red[i], LOW ) ;
}
}
void loop()
{
debounceInputs() ;
processInputs( ) ;
updateLEDs() ;
transceiveTokens() ;
mesh.update() ;
if( millis() > connectionTimeout ) // the first node which is powered on, does need to work eventually, even when it is the only one.
{
connected = 1 ;
}
}
I am yet to build three more units. I am hoping that having a network with at least 2 active nodes at all times will solve this problem.
I am curious as to why it sometimes does work and sometimes it does not work.
AFAIK I am not making any obvious mistakes. None of the functions in loop() take incredibly long, but I do not know how often mesh.update() needs to be called. For all I know, the functions together take too long. However, as long as neither node is turned off, there seem to be no problems at all. Intervals between messages are also larger than 100ms. About mesh.update(), the painless mesh website only states that:
This routine runs various maintainance tasks... Not super interesting, but things don't work without it.
What could it be?

Add user input to array of unknown size

I'm still pretty new to programming, and my only experience before C was JavaScript. I'm taking CS50 Introduction to Computer Science, and one of the lectures has example code that computes the average of some user input. It looks like this:
#include <cs50.h>
#include <stdio.h>
const int TOTAL = 3;
float average(int length, int array[]);
int main(void)
{
int scores[TOTAL];
for (int i = 0; i < TOTAL; i++)
{
scores[i] = get_int("Score: ");
}
printf("Average: %f\n", average(TOTAL, scores));
}
float average(int length, int array[])
{
int sum = 0;
for (int i = 0; i < length; i++)
{
sum += array[i];
}
return sum / (float) length;
}
The feature I'm trying to add is to determine the size of the array dynamically from the user input, instead of using a fixed value (TOTAL in this case). For example: I need a loop that keeps asking the user for a score (instead of just 3 times as in the code above), and when the user types zero (0), the loop breaks and the size of the array is defined by how many scores the user has typed.
This is what I've done:
int main(void)
{
int score;
// start count for size of array
int count = - 1;
do
{
score = get_int("Score: ");
// add one to the count for each score
count++;
}
while (score != 0);
// now the size of the array is defined by how many times the user has typed.
int scores[count];
for (int i = 0; i < count; i++)
{
// how do I add each score to the array???
}
}
My problem is how to add each score that the user types to the array. Thanks in advance!
You need a "dynamic data structure" that can expand as required. Here are two ways:
Allocate an array of initial size with malloc and realloc when there is not enough room.
Use a linked list
(there are more ways, but these are quite common for this problem)
You need a data structure that keeps track of the size and stores the data.
Here you have a simple implementation:
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
size_t size;
int result[];
}SCORES_t;
SCORES_t *addScore(SCORES_t *scores, int score)
{
size_t newsize = scores ? scores -> size + 1 : 1;
scores = realloc(scores, newsize * sizeof(scores -> result[0]) + sizeof(*scores));
if(scores)
{
scores -> size = newsize;
scores -> result[scores -> size - 1] = score;
}
return scores;
}
double getAverage(const SCORES_t *scores)
{
double average = 0;
if(scores)
{
for(size_t index = 0; index < scores -> size; average += scores -> result[index], index++);
average /= scores -> size;
}
return average;
}
int main(void)
{
int x;
SCORES_t *scores = NULL;
while(scanf("%d", &x) == 1 && x >= 0)
{
SCORES_t *temp = addScore(scores, x);
if(temp)
{
scores = temp;
}
else
{
printf("Memory allocation error\n");
free(scores);
scores = NULL; // avoid using or freeing the block again
break;
}
}
if(scores) printf("Number of results: %zu Average %f\n", scores -> size, getAverage(scores));
free(scores);
}
https://godbolt.org/z/5oPesn
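The linked-list option mentioned above could look roughly like the sketch below. It is only an illustration: the names score_node, score_list, and the three helper functions are made up for this example, and it keeps a tail pointer so that appending stays constant-time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node and list types, for illustration only. */
typedef struct score_node {
    int score;
    struct score_node *next;
} score_node;

typedef struct {
    score_node *head;
    score_node *tail;
    size_t count;
} score_list;

/* Append one score; returns 0 on success, -1 if allocation fails. */
static int score_list_append(score_list *list, int score)
{
    score_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->score = score;
    node->next = NULL;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
    list->count++;
    return 0;
}

/* Average of all stored scores; 0.0 for an empty list. */
static double score_list_average(const score_list *list)
{
    assert(list != NULL);
    if (list->count == 0)
        return 0.0;
    long sum = 0;
    for (const score_node *n = list->head; n != NULL; n = n->next)
        sum += n->score;
    return (double) sum / (double) list->count;
}

/* Free every node and reset the list to empty. */
static void score_list_free(score_list *list)
{
    score_node *n = list->head;
    while (n != NULL) {
        score_node *next = n->next;
        free(n);
        n = next;
    }
    list->head = list->tail = NULL;
    list->count = 0;
}
```

A caller would start from an empty list (all fields zero), call score_list_append for each score read, and call score_list_free once the average has been printed.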
regarding:
int main(void)
{
int score;
// start count for size of array
int count = - 1;
do
{
score = get_int("Score: ");
// add one to the count for each score
count++;
}
while (score != 0);
// now the size of the array is defined by how many times the user has typed.
int scores[count];
for (int i = 0; i < count; i++)
{
// how do I add each score to the array???
}
}
This does not compile and contains several logic errors.
It is missing the statements #include <cs50.h> and #include <stdio.h>.
regarding:
int score;
// start count for size of array
int count = - 1;
do
{
score = get_int("Score: ");
// add one to the count for each score
count++;
}
while (score != 0);
This only defines a single variable, score, and each time through the loop that single variable is overwritten. Also, the first time through the loop, the counter count will be incremented to 0, not 1.
On each following pass through the loop, the variable score is overwritten again (i.e. all prior values entered by the user are lost).
Suggest using dynamic memory. Note: to use dynamic memory, you will need the header file stdlib.h for the prototypes of malloc(), realloc() and free(). Suggest:
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
int main( void )
{
// pointer to array of scores
int * scores = NULL;
// start count for size of array
int count = 0;
while( 1 )
{
int score = get_int("Score: ");
if( score != 0 )
{ // then user has entered another score to put into array
count++;
int * temp = realloc( scores, count * sizeof( int ) );
if( ! temp )
{ // realloc failed
// output error info to `stderr`
// note: `perror()` from `stdio.h`
perror( "realloc failed" );
// cleanup
free( scores );
// `exit()` and `EXIT_FAILURE` from `stdlib.h`
exit( EXIT_FAILURE );
}
// implied else, 'realloc()' successful, so update the target pointer
scores = temp;
// insert the new score into the last slot of the array
scores[ count - 1 ] = score;
}
else
{ // user entered 0 so exit the loop
break;
}
}
// ... use the `count` entries in `scores` here ...
// cleanup so there is no memory leak
free( scores );
}
Note: before exiting the program, pass scores to free() so there is no memory leak.
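Calling realloc() for every single score works, but a common variation grows the capacity geometrically so that most insertions need no reallocation at all. The following is a sketch of that variation only; the type int_vector, the function int_vector_push, and the starting capacity of 4 are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array: 'count' slots in use out of 'capacity'. */
typedef struct {
    int *data;
    size_t count;
    size_t capacity;
} int_vector;

/* Append a value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if realloc fails (the old data stays valid). */
static int int_vector_push(int_vector *v, int value)
{
    assert(v != NULL);
    if (v->count == v->capacity) {
        size_t new_capacity = v->capacity ? v->capacity * 2 : 4;
        int *temp = realloc(v->data, new_capacity * sizeof *temp);
        if (temp == NULL)
            return -1;
        v->data = temp;
        v->capacity = new_capacity;
    }
    v->data[v->count++] = value;
    return 0;
}
```

Start from a zero-initialized int_vector, push each score as it is read, and pass the data pointer to free() when done.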

Dividing processes equally among users, linux kernel programming

I want to distribute CPU time equally among users, regardless of how many processes each user runs.
For example, I have 4 users: users A and B have 2 processes each and users C and D have 3 processes each, so in total there are 10 processes. Each user should get 25% of the CPU for their processes. I edited a certain part of the sched.c file in the Linux kernel, but there is a part where I am stuck.
What I want is this: suppose we have 4 users as in this example. No matter how many processes the users have, they should all use the CPU equally. Say user A has 2, user B has 2, user C has 3, and user D has 3, for a total of 10 processes. Users A and B will each use 25% of the CPU, which is 12.5% per process. Users C and D will each use 25% of the CPU, which is about 8.33% per process.
The CPU should be shared equally among users rather than among processes. How can I do this?
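The target shares in this example can be written down as a small helper. This is only a sketch of the arithmetic, not scheduler code: the function names are made up for illustration, and it assumes every user has at least one runnable process.

```c
#include <assert.h>

/* Percentage of the CPU each user receives when users share it equally. */
static double user_share_percent(int num_users)
{
    assert(num_users > 0);
    return 100.0 / num_users;
}

/* Percentage of the CPU each process of one user receives, assuming that
   user's share is split evenly among that user's runnable processes. */
static double process_share_percent(int num_users, int procs_of_user)
{
    assert(procs_of_user > 0);
    return user_share_percent(num_users) / procs_of_user;
}
```

With 4 users, a user with 2 processes gets 12.5% per process and a user with 3 processes gets 25/3, roughly 8.33%, per process.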
asmlinkage void schedule(void)
{
struct schedule_data * sched_data;
struct task_struct *prev, *next, *p;
struct list_head *tmp;
int this_cpu, c;
/* our variables */
unsigned int rnd;
unsigned int found = 0;
gid_t runGid;
unsigned int sumOfAllFlags = 0;
spin_lock_prefetch(&runqueue_lock);
BUG_ON(!current->active_mm);
need_resched_back:
prev = current;
this_cpu = prev->processor;
if (unlikely(in_interrupt())) {
printk("Scheduling in interrupt\n");
BUG();
}
release_kernel_lock(prev, this_cpu);
/*
* 'sched_data' is protected by the fact that we can run
* only one process per CPU.
*/
sched_data = & aligned_data[this_cpu].schedule_data;
spin_lock_irq(&runqueue_lock);
/* move an exhausted RR process to be last.. */
if (unlikely(prev->policy == SCHED_RR))
if (!prev->counter)
{
prev->counter = NICE_TO_TICKS(prev->nice);
move_last_runqueue(prev);
}
switch (prev->state)
{
case TASK_INTERRUPTIBLE:
if (signal_pending(prev))
{
prev->state = TASK_RUNNING;
break;
}
default:
del_from_runqueue(prev);
case TASK_RUNNING:;
}
prev->need_resched = 0;
/*
* this is the scheduler proper:
*/
repeat_schedule:
/*
* Default process to select..
*/
next = idle_task(this_cpu);
//prev->willBeChoosen = 0; //willBeChoosen of prev process = 0
if (sched_type == SCHED_DEFAULT)
{
// next = idle_task(this_cpu);
c = -1000;
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
if (can_schedule(p, this_cpu))
{
int weight = goodness(p, this_cpu, prev->active_mm);
if (weight > c)
c = weight, next = p;
}
}
/* Do we need to re-calculate counters? */
if (unlikely(!c))
{
struct task_struct *p;
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
for_each_task(p)
{
p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
goto repeat_schedule;
}
}
else if (sched_type == GTICKET)
{
current->prevJiffies = jiffies;
// sum of group flags of all processes
sumOfAllFlags = 0;
// calculate sum of flags of all processes
// -> are there any unprocessed waiting processes
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
//p->willBeChoosen = 0;
sumOfAllFlags = sumOfAllFlags + p->groupFlag;
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
// Check if all processed
// If so go to repeat schedule
// Mark all existing as unprocessed
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
if (sumOfAllFlags == 0)
{
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
p->groupFlag = 1;
}
// goto repeat_schedule;
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
if((current->prevJiffies - p->prevJiffies) > 1)
{
p->counter--;
}
else if((current->prevJiffies - p->prevJiffies) < 4)
{
p->counter--;
}
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
// Random Selection of next process
// Random Selection is between 1 and 15
/*get_random_bytes(&rnd, sizeof(unsigned int));
if(rnd < 0)
rnd = rnd*(-1);
if(maxTicket>0)
{
rnd = (rnd % maxTicket);
rnd++;
}*/
// if process's ticket is greater or equal to rnd
// next process <- that process
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
if (can_schedule(p, this_cpu))
{
if (p->groupFlag>0)
{
p->willBeChoosen = 20;
runGid = get_gid(p->user->uid);
//p->prevJiffies = jiffies;
break;
}
}
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
//choosing next process using goodness with the integrated will be chosen variable
next = idle_task(this_cpu);
c = -1000;
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
if (can_schedule(p, this_cpu))
{
int weight = goodness(p, this_cpu, prev->active_mm);
if (weight > c)
{
c = weight, next = p;
}
}
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
p->willBeChoosen = 0;
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
/* Do we need to re-calculate counters? */
if (unlikely(!c))
{
struct task_struct *p;
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
for_each_task(p)
{
p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
goto repeat_schedule;
}
// Assign all processes with the same group id as processed
// So that a group based fair scheduler can be achived
spin_unlock_irq(&runqueue_lock);
read_lock(&tasklist_lock);
list_for_each(tmp, &runqueue_head)
{
p = list_entry(tmp, struct task_struct, run_list);
if (get_gid(p->user->uid) == runGid)
p->groupFlag = 0;
}
read_unlock(&tasklist_lock);
spin_lock_irq(&runqueue_lock);
}
/*
* from this point on nothing can prevent us from
* switching to the next task, save this fact in
* sched_data.
*/
sched_data->curr = next;
task_set_cpu(next, this_cpu);
spin_unlock_irq(&runqueue_lock);
if (unlikely(prev == next))
{
/* We won't go through the normal tail, so do this by hand */
prev->policy &= ~SCHED_YIELD;
goto same_process;
}

How to parse interleaved buffer into distinct multiple channel buffers with PortAudio

hope you can help me :)
I'm trying to get audio data from a multichannel ASIO device with the PortAudio library. Everything is OK: I managed to set the default host API to ASIO and I also managed to select 4 specific channels as inputs. I then get an interleaved audio stream which sounds correct, but I would like to get each channel's data separately.
PortAudio allows non-interleaved recording, but I don't know how to write or modify my RecordCallBack and the multi-buffer pointer (one buffer per channel). Sure I've tried... :(
It would be of massive help to me if someone knows how to deal with this issue.
The original RecordCallBack function is taken from a well-known stereo example (slightly modified to manage 4 channels instead of 2), but it manages a single interleaved buffer:
static int recordCallback( const void *inputBuffer, void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData )
{
paTestData *data = (paTestData*)userData;
const short *rptr = (const short*)inputBuffer;
short *wptr = &data->recordedSamples[data->frameIndex * NUM_CHANNELS_I];
long framesToCalc;
long i;
int finished;
unsigned long framesLeft = data->maxFrameIndex - data->frameIndex;
(void) outputBuffer; /* Prevent unused variable warnings. */
(void) timeInfo;
(void) statusFlags;
(void) userData;
if( framesLeft < framesPerBuffer )
{
framesToCalc = framesLeft;
finished = paComplete;
}
else
{
framesToCalc = framesPerBuffer;
finished = paContinue;
}
if( inputBuffer == NULL )
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = SAMPLE_SILENCE; /* ch1*/
if( NUM_CHANNELS_I == 4 ){
*wptr++ = SAMPLE_SILENCE;/* ch2*/
*wptr++ = SAMPLE_SILENCE;/* ch3*/
*wptr++ = SAMPLE_SILENCE;} /* ch4*/
}
}
else
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = *rptr++; /* ch1*/
if( NUM_CHANNELS_I == 4 ){
*wptr++ = *rptr++;/* ch2*/
*wptr++ = *rptr++;/* ch3*/
*wptr++ = *rptr++;} /* ch4*/
}
}
data->frameIndex += framesToCalc;
return finished;
}
The *inputbuffer pointer is declared as:
PaStream* stream;
And the Open_Stream function is called:
err = Pa_OpenStream(
&stream,
NULL, /* no input */
&outputParameters,
SAMPLE_RATE,
FRAMES_PER_BUFFER,
paClipOff, /* we won't output out of range samples so don't bother clipping them */
playCallback,
&data );
Interleaved just means the bytes for each channel follow one after another, as in:
aabbccddeeaabbccddeeaabbccddee (each character represents one byte)
where this input buffer contains two bytes (16 bits) for each of the 5 channels a, b, c, d and e. The pattern repeats 3 times across the set of channels, which equates to 3 samples per channel. Knowing the input is interleaved, it could be extracted into separate output buffers, one per channel. In your code, however, you have just a single output buffer, which as you say is due to the required callback signature. One approach is to write each output channel into the single output buffer at a distinct offset per channel, so the output would be
aaaaaabbbbbbccccccddddddeeeeee
then, outside the callback, extract each channel using the same per-channel offset.
First you need the size of the given output buffer in bytes, say X, the number of channels, Y, and the number of bytes per channel per sample, Z. The global channel offset, counted in samples, would then be
size_offset = X / (Y * Z) # assure this is an integer
# if its a fraction then error in assumptions
so when addressing the output buffer, both inside the callback and outside, we use this offset together with the channel we are on, W (values 0, 1, 2, 3, ...), and the sample number within that channel, K (values 0 .. size_offset - 1):
index_output_buffer = (K + (W * size_offset)) * Z # 1st byte of sample pair
now use index_output_buffer ... then calculate the follow-on index:
index_output_buffer = (K + (W * size_offset)) * Z + 1 # 2nd byte of sample pair
and use it. If Z were to vary, you could put the two steps above into a loop that uses Z to control the number of iterations; the above assumes samples are two bytes.
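The offset scheme described above can be sketched in C as follows. This is an illustration under stated assumptions, not PortAudio API code: the function name deinterleave_i16 is made up, it works in whole 16-bit samples rather than bytes (so Z drops out of the index), and its frames parameter plays the role of size_offset.

```c
#include <assert.h>
#include <stddef.h>

/* Split an interleaved buffer (frame after frame, one sample per channel
   in each frame) into a planar buffer where each channel occupies its own
   contiguous region of 'frames' samples. 'planar' must have room for
   frames * channels samples. */
static void deinterleave_i16(const short *interleaved, short *planar,
                             size_t frames, size_t channels)
{
    assert(channels > 0);
    for (size_t k = 0; k < frames; k++) {          /* K: sample within channel */
        for (size_t w = 0; w < channels; w++) {    /* W: channel number */
            planar[w * frames + k] = interleaved[k * channels + w];
        }
    }
}
```

After the call, channel W starts at planar + W * frames, so each channel can be read back as an ordinary contiguous array.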
Thanks, Scott, for your help. The solution was right in front of my eyes, and in the end I didn't have to work with sample offsets. I didn't give you enough information about the code, so your approach was excellent, but the code itself provides an easier way to do that:
The data is stored in a structure:
typedef struct
{
int frameIndex; /* Index into sample array. */
int maxFrameIndex;
short *recordedSamples;
}
paTestData;
I modified it to:
typedef struct
{
int frameIndex; /* Index into sample array. */
int maxFrameIndex;
short *recordedSamples;
short * recordedSamples2; //ch2
short * recordedSamples3; //ch3
short *recordedSamples4; //ch4
}
paTestData;
Then I just had to allocate these buffers in memory and modify the recordCallback function as follows:
static int recordCallback( const void *inputBuffer, void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData )
{
paTestData *data = (paTestData*)userData;
const short *rptr = (const short*)inputBuffer;
short *wptr = &data->recordedSamples[data->frameIndex];
short *wptr2=&data->recordedSamples2[data->frameIndex];
short *wptr3=&data->recordedSamples3[data->frameIndex];
short *wptr4=&data->recordedSamples4[data->frameIndex];
long framesToCalc;
long i;
int finished;
unsigned long framesLeft = data->maxFrameIndex - data->frameIndex;
(void) outputBuffer; /* Prevent unused variable warnings. */
(void) timeInfo;
(void) statusFlags;
(void) userData;
if( framesLeft < framesPerBuffer )
{
framesToCalc = framesLeft;
finished = paComplete;
}
else
{
framesToCalc = framesPerBuffer;
finished = paContinue;
}
if( inputBuffer == NULL )
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = SAMPLE_SILENCE; //ch1
if( NUM_CHANNELS_I == 4 ){
*wptr2++ = SAMPLE_SILENCE;//ch2
*wptr3 ++= SAMPLE_SILENCE;//ch3
*wptr4++ = SAMPLE_SILENCE;} //ch4
}
}
else
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = *rptr++; //ch1
if( NUM_CHANNELS_I == 4 ){
*wptr2++ = *rptr++;//ch2
*wptr3++ = *rptr++;//ch3
*wptr4 ++= *rptr++;} //ch4
}
}
data->frameIndex += framesToCalc;
return finished;
}
Hope this can help other people. And thanks again, Scott.

messed up using do_futex?

I'm getting a weird error. I implemented these two functions:
int flag_and_sleep(volatile unsigned int *flag)
{
int res = 0;
(*flag) = 1;
res = syscall(__NR_futex, flag, FUTEX_WAIT, 1, NULL, NULL, 0);
if(0 == res && (0 != (*flag)))
die("0 == res && (0 != (*flag))");
return 0;
}
int wake_up_if_any(volatile unsigned int *flag)
{
if(1 == (*flag))
{
(*flag) = 0;
return syscall(__NR_futex, flag, FUTEX_WAKE, 1, NULL, NULL, 0);
}
return 0;
}
and test them by running two Posix threads:
static void die(const char *msg)
{
fprintf(stderr, "%s %u %lu %lu\n", msg, thread1_waits, thread1_count, thread2_count);
_exit( 1 );
}
volatile unsigned int thread1_waits = 0;
void* threadf1(void *p)
{
int res = 0;
while( 1 )
{
res = flag_and_sleep( &thread1_waits );
thread1_count++;
}
return NULL;
}
void* threadf2(void *p)
{
int res = 0;
while( 1 )
{
res = wake_up_if_any( &thread1_waits );
thread2_count++;
}
return NULL;
}
After thread2 has had a million or so iterations, the assertion fires:
./a.out
0 == res && (0 != (*flag)) 1 261129 1094433
This means that the syscall, and thereby do_futex(), returned 0. The man page says it should only do so if woken up by a do_futex(WAKE) call. But before I make a WAKE call, I set the flag to 0. Here it appears that the flag is still 1.
This is Intel, which means a strong memory model. So if in thread1 I see results from a syscall in thread2, I must also see the results of the write in thread2 that came before the call.
The flag and all pointers to it are volatile, so I don't see how gcc could fail to read the correct value.
I'm baffled.
Thanks!
The race happens when thread 1 goes through a full cycle and re-enters the WAIT call while thread 2 is on its way from
(*flag) = 0;
to
return syscall(__NR_futex, flag, FUTEX_WAKE, 1, NULL, NULL, 0);
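That late WAKE then wakes thread 1 after it has already set the flag back to 1, so the syscall returns 0 while the flag is 1. So the test is faulty, not the futex.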
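One way to make the pair of functions tolerate such a late wake-up is to treat a return from FUTEX_WAIT purely as a hint and re-check the flag in a loop. The sketch below assumes Linux and C11 atomics; the wrapper names futex_wait and futex_wake are made up for this example, and the waker uses a compare-and-swap instead of the original separate test and store.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers around the raw futex system call (Linux only). */
static long futex_wait(atomic_uint *addr, unsigned int expected)
{
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

static long futex_wake(atomic_uint *addr, int max_waiters)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, max_waiters, NULL, NULL, 0);
}

/* Raise the flag and sleep until a waker has cleared it. A return from
   futex_wait() is only a hint: the flag is re-checked, so a wake-up that
   was meant for an earlier sleep just causes another wait. */
static void flag_and_sleep(atomic_uint *flag)
{
    assert(flag != NULL);
    atomic_store(flag, 1);
    while (atomic_load(flag) == 1) {
        long res = futex_wait(flag, 1);
        if (res == -1 && errno != EAGAIN && errno != EINTR)
            break;  /* unexpected error: give up rather than spin */
    }
}

/* Clear the flag if it is raised and wake at most one sleeper.
   Returns the number of threads woken (0 if the flag was not raised). */
static long wake_up_if_any(atomic_uint *flag)
{
    unsigned int expected = 1;
    if (atomic_compare_exchange_strong(flag, &expected, 0))
        return futex_wake(flag, 1);
    return 0;
}
```

With this shape, a stale wake-up finds the flag still at 1 and simply puts the waiter back to sleep, so no assertion about the flag value after the syscall is needed.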
