I'm trying to interleave the inputs before generating the output.
I have a 64-bit input and a 64-bit output.
This is the code I tried, but I would like to know whether there is a more efficient way of doing it:
for (integer i = 0; i < 16; i++) begin
  out[i*4]       = in[i];
  out[(i*4) + 1] = in[i+16];
  out[(i*4) + 2] = in[i+32];
  out[(i*4) + 3] = in[i+48];
end
If by "efficient" you mean more compact code, then you can use nested for loops:
for (int i = 0; i < 16; i++) begin
  for (int j = 0; j < 4; j++) begin
    out[ (i*4) + j ] = in[ i + (j*16) ];
  end
end
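You could also assign four bits at a time with an indexed part-select. Note that the concatenation order below matches your loop only if out is declared with an ascending range such as [0:63]; with the more common [63:0], reverse the operands so that in[i] lands in out[i*4].
for (int i = 0; i < 16; i++)
  out[i*4 +: 4] = {in[i],in[i+16],in[i+32],in[i+48]};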
Based on your code, out would follow order:
00 16 32 48 01 17 33 49 02 18 34 50 ... 15 31 47 63
In SystemVerilog that can be compressed to:
foreach(out[i]) out[i] = in[ 16*(i%4) + (i/4) ];
or the equivalent:
foreach(in[i]) out[ 4*(i%16) + (i/16) ] = in[i];
Your original code and my compressed solutions are logically equivalent. Any decent synthesizer would treat them the same. Fewer lines of code will not change anything. It really comes down to human readability and how you plan to handle future changes to the interleave function.
From the simulation perspective, modulo and division operations are more resource intensive. A good simulator should recognize that 4 and 16 are powers of 2 and optimize the math accordingly. If not, try replacing 16*(i%4) + (i/4) with {(i[1:0]+i[6]),i[5:2]} (note: this is the simplest logic but less human readable). You should not see a meaningful performance difference with a decent simulator.
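As a quick sanity check of the index arithmetic, here is a minimal sketch in plain JavaScript (not SystemVerilog). The function names are made up for illustration. It assumes i stays in the range 0..63, where i[6] is always 0, so the bit-level form reduces to moving the low two bits of i above bits 5..2.

```javascript
// Hypothetical helper names; this only checks the arithmetic, it is not HDL.

// Source index as written in the foreach version: 16*(i%4) + (i/4)
function sourceIndexArithmetic(i) {
  return 16 * (i % 4) + Math.floor(i / 4);
}

// Same index built from bit fields: low two bits of i move to the top,
// bits 5..2 of i become the low four bits.
function sourceIndexBits(i) {
  return ((i & 0b11) << 4) | ((i >> 2) & 0b1111);
}

// Collect every i in 0..63 where the two forms disagree.
const mismatches = [];
for (let i = 0; i < 64; i++) {
  if (sourceIndexArithmetic(i) !== sourceIndexBits(i)) {
    mismatches.push(i);
  }
}

// First eight source indices, to compare against the order listed above.
const order = Array.from({ length: 8 }, (_, i) => sourceIndexArithmetic(i));

console.log("mismatches:", mismatches.length);
console.log("first outputs take inputs:", order.join(" "));
```

The first eight indices come out as 0 16 32 48 1 17 33 49, which agrees with the order listed above, and the two forms agree for every i in 0..63.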
I tried to run the following Node.js code:
for (let a = 5; a === 11 ; a++) {
console.log(a);
}
I want to know how this loop arrives at its result (logically, step by step). Why does it log nothing to the console?
Example:
It starts with 5.
It then gets added by 1 each time.
Its stopping condition is...
Thanks in advance and do ask in comments if the question wasn't clear.
This particular loop should not print anything.
for (let a = 5; a === 11 ; a++) {
console.log(a);
}
The loop states: "Execute the code inside me, as long as a is equal to 11."
However, in our case, a is initialized to 5. Because of this, when we first start the loop, its run condition is already false, so the body never executes.
An example of a loop that would print the numbers from 5 to 10:
for (let a = 5; a < 11 ; a++) {
console.log(a);
}
It is important to note that the condition in a for loop is the run condition, not the end condition.
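To make the "run condition" point concrete, here is a small sketch that spells out the order in which a for loop evaluates its parts, written as an explicit while loop. The helper name runLoop is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: behaves like `for (let a = start; test(a); a++) { ... }`
// but records the values instead of logging them.
function runLoop(start, test) {
  const seen = [];
  let a = start;        // 1. the initializer runs once
  while (test(a)) {     // 2. the run condition is checked BEFORE every iteration
    seen.push(a);       // 3. the loop body
    a++;                // 4. the increment runs after the body
  }
  return seen;
}

const withEquals = runLoop(5, (a) => a === 11);   // condition is false at a = 5
const withLessThan = runLoop(5, (a) => a < 11);   // condition is true for 5..10

console.log(withEquals);
console.log(withLessThan);
```

With a === 11 the condition fails on the very first check, so the result is an empty array. With a < 11 the body runs for 5 through 10.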
OK, so the problem here is that the loop body only runs while the variable a is equal to 11.
If you want everything from 5 up to, but not including, 11, use the < operator.
I'll explain the steps in this example.
for (let a = 5; a < 11; a++) {
  console.log(a);
}
It starts with 5.
It is incremented by 1 after each iteration.
It stops as soon as a is greater than or equal to 11.
I was going through learnyounode when I had to complete the juggling async challenge, and that is where I came across a similar but simpler problem, shown below:
// try print numbers 0 - 9
for (var i = 0; i < 10; i++) {
  setTimeout(function() {
    console.log(i)
  })
}
The above snippet gives this output:
10
10
10
10
10
10
10
10
10
10
This is not the intended result. However, when I write it in the following way:
// try print numbers 0 - 9
var f = function(i) {
setTimeout(function() {
console.log(i)
})
}
for (var i = 0; i < 10; i++)f(i);
I get the desired output. So what exactly is happening when I put the setTimeout() call inside that function?
In the first snippet, the variable i is defined outside the function. The callback can access it simply because it lives in an enclosing (higher-level) scope and no other variable named i is defined in the function's own scope (or in any intermediate scope, had there been one).
The for loop sets ten timeouts (enqueued on the event loop). But they are executed only after the for loop has finished, by which time the value of i is 10 (the value that ended the loop).
In the second example, you wrap the setTimeout() call in a function, which is executed immediately on each loop iteration.
On each call, the current value of i is passed as a parameter (also named i, which is defined in the function's local scope and, because it has the same name, hides the fact that you are no longer reading the outer variable).
See the slight modification of your second example below:
// try print numbers 0 - 9
var f = function(j) {
setTimeout(function() {
console.log(i, j)
})
}
for (var i = 0; i < 10; i++)f(i);
// Output:
// -------
// 10 0
// 10 1
// 10 2
// 10 3
// 10 4
// 10 5
// 10 6
// 10 7
// 10 8
// 10 9
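The same effect can be reproduced without timers, which may make it easier to see that the delay itself is not the cause. This is a minimal sketch: the functions are stored in arrays and called only after the loops have finished, which stands in for the event loop running the callbacks later. The names shared, copied and makeReader are made up for this example.

```javascript
// Case 1: every function closes over the single, function-scoped variable i.
const shared = [];
for (var i = 0; i < 3; i++) {
  shared.push(function () {
    return i; // reads i at call time, not at creation time
  });
}

// Case 2: each call to makeReader gets its own parameter j,
// holding a copy of the value that was passed in.
function makeReader(j) {
  return function () {
    return j;
  };
}
const copied = [];
for (var k = 0; k < 3; k++) {
  copied.push(makeReader(k));
}

// Call everything only now, after both loops are done.
const sharedResults = shared.map((fn) => fn());
const copiedResults = copied.map((fn) => fn());

console.log(sharedResults);
console.log(copiedResults);
```

The first array contains the final value of i three times, while the second contains 0, 1 and 2, one per call.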
This is because a variable declared with var is function-scoped, not block-scoped.
The first snippet is equivalent to the code below:
```
var i;
// try print numbers 0 - 9
for (i = 0; i < 10; i++) {
  setTimeout(function() {
    console.log(i)
  })
}
```
Every i refers to the same variable (the same location in memory), which holds 10 by the time the callbacks run.
In the second snippet, because you pass i as a parameter to a function, the current value of i is copied into a new variable that is local to that call. Each i inside the function therefore refers to its own location, holding a different value.
I'm wondering if it would be feasible to automatically test for race conditions using a debugger.
For example, imagine you want to test a multi-threaded queue. Among other things, you would want to test that you can concurrently call enqueue() and dequeue().
A simple unit test could start two threads, one calling enqueue() and the other calling dequeue() in a loop, and check the results:
// thread A
for( int i=0; i<count; i+=1 ) {
enqueue( queue, i );
}
// thread B
for( int i=0; i<count; i+=1 ) {
ASSERT( i == dequeue( queue ) );
}
Now, a clever test driver, running the unit test in gdb or lldb, should be able to wait for breakpoints set inside both loops and then use the debugger's si (step instruction) command to simulate all possible interleavings of the two threads.
My question is not if this is technically possible (it is). What I want to know is this:
Assuming the enqueue() function has 10 instructions and the dequeue() function has 20 - how many different interleavings does the test have to try?
Let's see...
If we only have 2 instructions in each: a,b and A,B:
a,b,A,B
a,A,b,B
a,A,B,b
A,a,b,B
A,a,B,b
A,B,a,b
That's 6.
For a,b,c and A,B,C:
a,b,c,A,B,C
a,b,A,c,B,C
a,b,A,B,c,C
a,b,A,B,C,c
a,A,b,c,B,C
a,A,b,B,c,C
a,A,B,b,c,C
a,A,b,B,C,c
a,A,B,b,C,c
a,A,B,C,b,c
A,a,b,c,B,C
A,a,b,B,c,C
A,a,B,b,c,C
A,B,a,b,c,C
A,a,b,B,C,c
A,a,B,b,C,c
A,B,a,b,C,c
A,a,B,C,b,c
A,B,a,C,b,c
A,B,C,a,b,c
That's 20, unless I'm missing something.
If we generalize it to N instructions (say, N is 26) in each and start with a...zA...Z, then there will be 27 possible positions for z (from before A to after Z), at most 27 positions for y, at most 28 for x, at most 29 for w, etc. This suggests a factorial at worst. In reality, however, it's less than that, but I'm being a bit lazy, so I'm going to use the output from a simple program that calculates the number of possible "interleavings" instead of deriving the exact formula:
1 & 1 -> 2
2 & 2 -> 6
3 & 3 -> 20
4 & 4 -> 70
5 & 5 -> 252
6 & 6 -> 924
7 & 7 -> 3432
8 & 8 -> 12870
9 & 9 -> 48620
10 & 10 -> 184756
11 & 11 -> 705432
12 & 12 -> 2704156
13 & 13 -> 10400600
14 & 14 -> 40116600
15 & 15 -> 155117520
16 & 16 -> 601080390
So, with these results you may conclude that while the idea is correct, it's going to take an unreasonable amount of time to use it for code validation.
Also, you should remember that you need to take into account not only the order of instruction execution, but also the state of the queue. That's going to increase the number of iterations.
Here's the program (in C):
#include <stdio.h>

unsigned long long interleavings(unsigned remaining1, unsigned remaining2)
{
    switch (!!remaining1 * 2 + !!remaining2)
    {
    default: // remaining1 == 0 && remaining2 == 0
        return 0;
    case 1: // remaining1 == 0 && remaining2 != 0
    case 2: // remaining1 != 0 && remaining2 == 0
        return 1;
    case 3: // remaining1 != 0 && remaining2 != 0
        return interleavings(remaining1 - 1, remaining2) +
               interleavings(remaining1, remaining2 - 1);
    }
}

int main(void)
{
    unsigned i;
    for (i = 0; i <= 16; i++)
        printf("%3u items can interleave with %3u items %llu times\n",
               i, i, interleavings(i, i));
    return 0;
}
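For what it's worth, the recursion above also has a closed form. An interleaving of m and n instructions is fully determined by choosing which n of the m + n slots belong to the second thread, so the count is the binomial coefficient C(m + n, n). Here is a minimal sketch in JavaScript (BigInt is used to keep the arithmetic exact; the function name countInterleavings is made up for this example):

```javascript
// Hypothetical helper: number of ways to interleave m and n instructions
// while preserving the order within each thread, i.e. C(m + n, n).
function countInterleavings(m, n) {
  const bigM = BigInt(m);
  let result = 1n;
  // After step k, result equals C(m + k, k), so every division is exact.
  for (let k = 1n; k <= BigInt(n); k++) {
    result = (result * (bigM + k)) / k;
  }
  return result;
}

console.log(countInterleavings(2, 2).toString());
console.log(countInterleavings(3, 3).toString());
console.log(countInterleavings(16, 16).toString());
console.log(countInterleavings(10, 20).toString());
```

The first three values agree with the table (6, 20 and 601080390). For the 10 and 20 instructions in the question, this gives 30045015 interleavings for a single enqueue()/dequeue() pair, before taking loop iterations or queue state into account.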
BTW, you could also save an order of magnitude (or two) of the overhead due to interfacing with the debugger and due to the various context switches, if you simulate pseudo-code instead. See this answer to a somewhat related question for a sample implementation. This may also give you more fine-grained control over switching between the threads than direct execution.