Recently I modified my code to store everything in the RenderScript (before that I copied the data back and forth, which wasn't efficient), but now the garbage collector is collecting garbage like crazy. (Still, the app is performing better this way.)
I can't figure out what needs to be collected. I use everything, and I don't create new arrays in the functions that I call frequently. My only idea is that if I do this:
void __attribute__((kernel)) diffuseVelocityY(float in, uint32_t x, uint32_t y) {
velocityY_prev[x] = velocityY[x] + a*(velocityY_prev[x-1] + velocityY_prev[x+1] + velocityY_prev[x-(width)] + velocityY_prev[x+(width)])/(1+(4*a));
}
it creates a temporary pointer for it, because I'm using data from the same pointer that I want to update (I have no idea if this is the way it works). I tried changing it so that it puts the data in an empty pointer, and after it finishes I copy the data to the right place. It seemed to collect less garbage, but there was still garbage collection and the performance went down as well.
I uploaded my code here if someone wants to look (the _befores are from before I modified the code).
I have no idea how to stop the garbage collection. I hope someone can help.
One of the methods:
void set_bnd_densiy_prev(int b){
for (int i = 1; i <= gridSizeY; i++) {
density_prev[IX(0, i)] = (b == 1 ? -density_prev[IX(1, i)] : density_prev[IX(1, i)]);
density_prev[IX(gridSizeX + 1, i)] = (b == 1 ? -density_prev[IX(gridSizeX, i)] : density_prev[IX(gridSizeX, i)]);
}
for (int i = 1; i <= gridSizeX; i++) {
density_prev[IX(i, 0)] = (b == 2 ? -density_prev[IX(i, 1)] : density_prev[IX(i, 1)]);
density_prev[IX(i, gridSizeY + 1)] = (b == 2 ? -density_prev[IX(i, gridSizeY)] : density_prev[IX(i, gridSizeY)]);
}
density_prev[IX(0 ,0 )] = 0.5f*(density_prev[IX(1,0 )]+density_prev[IX(0 ,1)]);
density_prev[IX(0 ,gridSizeY+1)] = 0.5f*(density_prev[IX(1,gridSizeY+1)]+density_prev[IX(0 ,gridSizeY )]);
density_prev[IX(gridSizeX+1,0 )] = 0.5f*(density_prev[IX(gridSizeX,0 )]+density_prev[IX(gridSizeX+1,1)]);
density_prev[IX(gridSizeX+1,gridSizeY+1)] = 0.5f*(density_prev[IX(gridSizeX,gridSizeY+1)]+density_prev[IX(gridSizeX+1,gridSizeY )]);
}
Code generated from it:
private final static int mExportFuncIdx_set_bnd_densiy_prev = 3;
public void invoke_set_bnd_densiy_prev(int b) {
FieldPacker set_bnd_densiy_prev_fp = new FieldPacker(4);
set_bnd_densiy_prev_fp.addI32(b);
invoke(mExportFuncIdx_set_bnd_densiy_prev, set_bnd_densiy_prev_fp);
}
The problem was with the function arguments, because RenderScript needs to create a FieldPacker to handle them on every call. So if you have the same problem, remove the function arguments, then copy and paste the function, modify the variables, and call the different functions. It's not pretty, but it works.
(Thanks thebaron for your help)
I have been working on an exercise from Google's dev tech guide. It is called Compression and Decompression; you can check the following link for the description of the problem: Challenge Description.
Here is my code for the solution:
public static String decompressV2 (String string, int start, int times) {
String result = "";
for (int i = 0; i < times; i++) {
inner:
{
for (int j = start; j < string.length(); j++) {
if (isNumeric(string.substring(j, j + 1))) {
String num = string.substring(j, j + 1);
int times2 = Integer.parseInt(num);
String temp = decompressV2(string, j + 2, times2);
result = result + temp;
int next_j = find_next(string, j + 2);
j = next_j;
continue;
}
if (string.substring(j, j + 1).equals("]")) { // If it is a closing bracket
break inner;
}
result = result + string.substring(j,j+1);
}
}
}
return result;
}
public static int find_next(String string, int start) {
int count = 0;
for (int i = start; i < string.length(); i++) {
if (string.substring(i, i+1).equals("[")) {
count= count + 1;
}
if (string.substring(i, i +1).equals("]") && count> 0) {
count = count- 1;
continue;
}
if (string.substring(i, i +1).equals("]") && count== 0) {
return i;
}
}
return -111111;
}
I will explain a little about the inner workings of my approach. It is a basic solution that uses simple recursion and loops.
So, let's start from the beginning with a simple decompression:
DevTech.decompressV2("2[3[a]b]", 0, 1);
As you can see, the 0 indicates that it has to iterate over the string starting at index 0, and the 1 indicates that the string has to be evaluated only once: 1[ 2[3[a]b] ]
The core idea is that every time you encounter a number, you call the algorithm again (recursively) and then continue from where the string inside its brackets ends; that's what the find_next function is for.
When it finds a closing bracket, the inner loop breaks; that's how I chose to signal the stop.
I think that is the main idea behind the algorithm. If you read the code closely, you'll get the full picture.
So here are some of my concerns about the way I've written the solution:
I could not find a cleaner way to tell the algorithm where to go next when it finds a number, so I kind of hardcoded it with the find_next function. Is there a cleaner way to do this inside the decompress function?
About performance: it wastes a lot of time doing the same thing again when there is a number bigger than 1 at the beginning of a bracket.
I am relatively new to programming, so maybe this code also needs improvement, not in the idea but in the way it's written. I would be very grateful for some suggestions.
This is the approach I figured out, but I am sure there are a couple more. I could not think of any others, but it would be great if you could share your ideas.
The description lists some things that you should be aware of when developing the solution: handling non-repeated strings, handling repetitions inside, not doing the same job twice, and not copying too much. Are these covered by my approach?
The last point is about test cases. I know that confidence is very important when developing solutions, and the best way to gain confidence in an algorithm is test cases. I tried a few and they all worked as expected. But what techniques do you recommend for developing test cases? Is there any software for it?
That's all. I am new to the community, so I am open to suggestions about how to improve the quality of the question. Cheers!
Your solution involves a lot of string copying that really slows it down. Instead of returning strings that you concatenate, you should pass a StringBuilder into every call and append substrings onto that.
That means you can use your return value to indicate the position to continue scanning from.
You're also parsing repeated parts of the source string more than once.
My solution looks like this:
public static String decompress(String src)
{
StringBuilder dest = new StringBuilder();
_decomp2(dest, src, 0);
return dest.toString();
}
private static int _decomp2(StringBuilder dest, String src, int pos)
{
int num=0;
while(pos < src.length()) {
char c = src.charAt(pos++);
if (c == ']') {
break;
}
if (c>='0' && c<='9') {
num = num*10 + (c-'0');
} else if (c=='[') {
int startlen = dest.length();
pos = _decomp2(dest, src, pos);
if (num<1) {
// 0 repetitions -- delete it
dest.setLength(startlen);
} else {
// copy output num-1 times
int copyEnd = startlen + (num-1) * (dest.length()-startlen);
for (int i=startlen; i<copyEnd; ++i) {
dest.append(dest.charAt(i));
}
}
num=0;
} else {
// regular char
dest.append(c);
num=0;
}
}
return pos;
}
I would try to return a tuple that also contains the next index where decompression should continue from. Then we can have a recursion that concatenates the current part with the rest of the block in the current recursion depth.
Here's JavaScript code. It takes some thought to encapsulate the order of operations that reflects the rules.
function f(s, i=0){
if (i == s.length)
return ['', i];
// We might start with a multiplier
let m = '';
while (!isNaN(s[i]))
m = m + s[i++];
// If we have a multiplier, we'll
// also have a nested expression
if (s[i] == '['){
let result = '';
const [word, nextIdx] = f(s, i + 1);
for (let j=0; j<Number(m); j++)
result = result + word;
const [rest, end] = f(s, nextIdx);
return [result + rest, end]
}
// Otherwise, we may have a word,
let word = '';
while (isNaN(s[i]) && s[i] != ']' && i < s.length)
word = word + s[i++];
// followed by either the end of an expression
// or another multiplier
const [rest, end] = s[i] == ']' ? ['', i + 1] : f(s, i);
return [word + rest, end];
}
var strs = [
'2[3[a]b]',
'10[a]',
'3[abc]4[ab]c',
'2[2[a]g2[r]]'
];
for (const s of strs){
console.log(s);
console.log(JSON.stringify(f(s)));
console.log('');
}
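The same idea of returning the text together with the index to resume from can be sketched in Java. This is only an illustration under stated assumptions: the names TupleDecompressor, Parsed and parse are hypothetical, the input is assumed to be well formed (every count is followed by a bracketed group), and, unlike the JavaScript above, the rest of the current block is handled with a loop rather than a second recursive call.

```java
class TupleDecompressor {
    // Hypothetical result type: the expanded text plus the index to resume from.
    static final class Parsed {
        final String text;
        final int next;
        Parsed(String text, int next) {
            this.text = text;
            this.next = next;
        }
    }

    static String decompress(String s) {
        return parse(s, 0).text;
    }

    // Expands everything from index i up to the matching ']' or the end of input.
    static Parsed parse(String s, int i) {
        StringBuilder out = new StringBuilder();
        while (i < s.length() && s.charAt(i) != ']') {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                // Read a multi-digit repeat count.
                int count = 0;
                while (i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9') {
                    count = count * 10 + (s.charAt(i) - '0');
                    i++;
                }
                // Assumption: s.charAt(i) is '[' here, so the group starts at i + 1.
                Parsed inner = parse(s, i + 1);
                for (int k = 0; k < count; k++) {
                    out.append(inner.text);
                }
                i = inner.next; // resume right after the group's closing bracket
            } else {
                out.append(c);
                i++;
            }
        }
        // Step over the closing bracket, if there is one.
        int next = (i < s.length()) ? i + 1 : i;
        return new Parsed(out.toString(), next);
    }

    public static void main(String[] args) {
        System.out.println(decompress("2[3[a]b]"));
        System.out.println(decompress("3[abc]4[ab]c"));
    }
}
```

Each nested group is parsed only once and then repeated, so the caller never has to rescan the source to find where the group ended.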
I have just started with CPLEX, and this is my problem:
I have a decision variable Y (equal to 1 if the patient is allocated on day i at hour h) with three parameters (patient, day, time), and I want to display those results in a table in Excel: one table with the entries where Y == 1, with their parameters beside it.
If Ypih == Zpm == 1 (Zpm is the decision variable indicating whether patient p is seen by doctor m), then write to Excel that patient p is registered to consult doctor m on day i at hour h.
My problem is that I cannot display the parameters over their ranges for every instance of the loop.
So how do I go through the solution pool to get the values of p, i, h when Y == 1 == Z and display them?
You can solve your problem as shown below (assuming you're using the ILOG CPLEX Optimization Studio C++ library).
// solve your model
cplex.solve();
// now, we will verify all variables that are equal to 1
// first, we will loop through variables Y
for (int p_ = 0; p_ < maxP; p_++) {
for (int i_ = 0; i_ < maxI; i_++) {
for (int h_ = 0; h_ < maxH; h_++) {
// if Y_{pih} == 1
if (cplex.getValue(cplex.varY[p_][i_][h_]) == 1) {
// we will look if there is a variable Z == 1
for (int m_ = 0; m_ < maxM; m_++) {
if (cplex.getValue(cplex.varZ[p_][m_]) == 1) {
// print or store your variables
}
}
}
}
}
}
After solving your model, you will need to verify which variables are equal to one. Thus, you can loop through all of your model variables and verify whether they are one or not, using the getValue CPLEX function.
See this link for a description regarding the CPLEX function.
This is a question not about how LongAdder works, it's about an intriguing implementation detail that I can't figure out.
Here is the code from Striped64 (I've cut out some parts and left the relevant parts for the question):
final void longAccumulate(long x, LongBinaryOperator fn,
boolean wasUncontended) {
int h;
if ((h = getProbe()) == 0) {
ThreadLocalRandom.current(); // force initialization
h = getProbe();
wasUncontended = true;
}
boolean collide = false; // True if last slot nonempty
for (;;) {
Cell[] as; Cell a; int n; long v;
if ((as = cells) != null && (n = as.length) > 0) {
if ((a = as[(n - 1) & h]) == null) {
//logic to insert the Cell in the array
}
// CAS already known to fail
else if (!wasUncontended) {
wasUncontended = true; // Continue after rehash
}
else if (a.cas(v = a.value, ((fn == null) ? v + x : fn.applyAsLong(v, x)))){
break;
}
A lot of things in the code are clear to me, except for this part:
// CAS already known to fail
else if (!wasUncontended) {
wasUncontended = true; // Continue after rehash
}
Where does this certainty that the following CAS will fail come from?
This is really confusing, for me at least, because this check only makes sense for a single case: when some thread enters the longAccumulate method for the n-th time (n > 1) and the busy spin is at its first cycle.
It's like this code is saying: if you (some thread) have been here before and you have some contention on a particular Cell slot, don't try to CAS your value to the already existing one, but instead rehash the probe.
I honestly hope this makes sense to someone.
It's not that it will fail, it's more that it has failed. The call to this method is done by the LongAdder add method.
public void add(long x) {
Cell[] as; long b, v; int m; Cell a;
if ((as = cells) != null || !casBase(b = base, b + x)) {
boolean uncontended = true;
if (as == null || (m = as.length - 1) < 0 ||
(a = as[getProbe() & m]) == null ||
!(uncontended = a.cas(v = a.value, v + x)))
longAccumulate(x, null, uncontended);
}
}
The first set of conditionals is related to existence of the long Cells. If the necessary cell doesn't exist, then it will try to accumulate uncontended (as there was no attempt to add) by atomically adding the necessary cell and then adding.
If the cell does exist, try to add (v + x). If the add failed, then there was some form of contention; in that case, try to do the accumulating optimistically/atomically (spin until successful).
So why does it have
wasUncontended = true; // Continue after rehash
My best guess is that with heavy contention, it will try to give the running thread time to catch up and will force a retry of the existing cells.
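To make the "CAS failed, so rehash before retrying" idea concrete, here is a much simplified striped counter. It is a sketch under stated assumptions and not the real Striped64: the class name StripedCounterSketch is hypothetical, the cell array has a fixed size instead of growing, and the probe is a local variable instead of a per-thread field. The only point it illustrates is that after a failed CAS the thread moves to a different slot instead of hammering the same one.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounterSketch {
    // Fixed number of cells; the length is a power of two so "& mask" picks a slot.
    private final AtomicLongArray cells = new AtomicLongArray(8);

    void add(long x) {
        int mask = cells.length() - 1;
        // Local stand-in for the per-thread probe; "| 1" keeps it nonzero.
        int probe = ThreadLocalRandom.current().nextInt() | 1;
        for (;;) {
            int idx = probe & mask;
            long v = cells.get(idx);
            if (cells.compareAndSet(idx, v, v + x)) {
                return; // uncontended on this slot, done
            }
            // The CAS failed, so another thread is using this slot right now.
            // Rehash (xorshift) to try a different slot on the next iteration.
            probe ^= probe << 13;
            probe ^= probe >>> 17;
            probe ^= probe << 5;
        }
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < cells.length(); i++) {
            total += cells.get(i);
        }
        return total;
    }
}
```

In the real class the first CAS attempt happens in add, and its result is handed to longAccumulate as wasUncontended, so a false value there means a CAS on the current slot has already failed once before the loop starts.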
I'm trying to write my first real program with dynamic arrays, but I've come across a problem I cannot understand. Basically, I am trying to take a dynamic array, copy it into a temporary one, add one more address to the original array, then copy everything back to the original array. Now the original array has one more address than before. This worked perfectly when trying with ints, but strings crash my program. Here's an example of the code I'm struggling with:
void main()
{
int x = 3;
std::string *q;
q = new std::string[x];
q[0] = "1";
q[1] = "2";
q[2] = "3";
x++;
std::string *temp = q;
q = new std::string[x];
q = temp;
q[x-1] = "4";
for (int i = 0; i < 5; i++)
std::cout << q[i] << std::endl;
}
If I were to make q and temp into pointers to int instead of string then the program runs just fine. Any help would be greatly appreciated, I've been stuck on this for an hour or two.
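q = temp only copies the pointer (a shallow copy). You lose the newly allocated 4-element array, which is leaked, and q points at the original 3-element array again.
Since you reallocated q to have 4 elements, but then immediately reassigned it to temp (which was allocated with only 3 elements), accessing (and assigning) the element at x-1 is now outside the bounds of the array. With int the out-of-bounds write is still undefined behavior; it just happens not to crash.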
If you have to do it this way for some reason, it should look like this:
auto temp = q;
q = new std::string[x];
// copy the x - 1 old elements into the new, larger array
for (int i = 0; i < x - 1; ++i)
q[i] = temp[i];
delete[] temp;
q[x - 1] = "4";
However, this is obviously more complex and very much more prone to error than the idiomatic way of doing this in C++. Better to use std::vector<std::string> instead.
Simple copy pasta here:
static void Main(string[] args)
{
List<Task> Tasks = new List<Task>();
Random r = new Random();
for (int o = 0; o < 5; o++)
Tasks.Add(Task.Factory.StartNew(() => { int i = r.Next(0, 3000); Thread.Sleep(i); Console.WriteLine("{0}: {1}", o, i); }));
Task.WaitAll(Tasks.ToArray());
Console.Read();
}
When you run that, you will get something like this:
5: 98
5: 198
5: 658
5: 1149
5: 1300
What am I not understanding about this? Writing each iteration of o is showing as 5 for all threads when I'd expect to see numbers 0 through 4 in random order.
I tried using an actual method instead of anonymous and it does the same thing. What am I missing?
Edit: I just found the problem with my very first post and edited the question, so sorry if you answered about the improper order problem. However, I am curious as to why o is not writing properly.
() =>
{
int i = r.Next(0, 3000);
Thread.Sleep(i);
Console.WriteLine("{0}: {1}", o, i);
})
You are closing over your loop variable o with the delegate you use for your task. By the time it is executed, your loop has finished and you only get the end value 5 for o. Remember that you are creating a closure over the loop variable, not its current value; the value is only evaluated when the delegate is executed once the task is started.
You have to create a local copy of the loop variable instead, which you can then use safely:
for (int o = 0; o < 5; o++)
{
int localO = o;
Tasks.Add(Task.Factory.StartNew(() => { int i = r.Next(0, 3000); Thread.Sleep(i); Console.WriteLine("{0}: {1}", localO, i); }));
}
There are at least two problems here.
The problem with o having the value of 5 on every iteration is one of those "gotchas" of lexical closures. If you want o to capture its current value, you must create a locally scoped variable inside your loop and use that in your lambda, e.g.:
for (int o = 0; o < 5; ++o)
{
int localO = o;
// now use "localO" in your lambda ...
}
Also, Random is not thread-safe. Using the same instance of Random simultaneously across multiple threads can corrupt its state and give you unexpected results.
I think you are making the assumption that the tasks are executed in the order they are created, and the TPL makes no such guarantees...
As for the 'o' parameter always printing as 5, that is because it is a local variable in the parent scope of the anonymous function, hence when the print is actually executed its value is 5 because the loop has completed (compare to 'i' being scoped within the anonymous function)