Weird memory usage on Node.js

This simple code stores 1 million strings (100 chars length) in an array.
function makestring(len) {
  var s = '';
  while (len--) s = s + '1';
  return s;
}
var s = '';
var arr = [];
for (var i = 0; i < 1000000; i++) {
  s = makestring(100);
  arr.push(s);
  if (i % 1000 == 0) console.log(i + ' - ' + s);
}
When I run it, I get this error:
(...)
408000 - 1111111111111111111 (...)
409000 - 1111111111111111111 (...)
FATAL ERROR: JS Allocation failed - process out of memory
That's strange: 1 million strings of 100 characters each should only be about 100 megabytes.
But if I move the s = makestring(100); outside the loop...
var s = makestring(100);
var arr = [];
for (var i = 0; i < 1000000; i++) {
  arr.push(s);
  if (i % 1000 == 0) {
    console.log(i + ' - ' + s);
  }
}
This executes without errors!
Why? How can I store 1 Million objects in node?

The moment you move the string generation outside the loop, you create just one string and push it into the array a million times.
Inside the array, however, only references to that one string are stored, which is far less memory-consuming than storing a million separate strings.

Your first example builds 1,000,000 distinct strings.
In your second example, you're taking the same string object and adding it to your array 1,000,000 times. (It's not copying the string; each entry of the array points to the same object.)
V8 does a lot of things to optimize string use. For example, string concatenation is less expensive (in most cases) than you might think. Rather than building a whole new string, it will typically connect the pieces in a linked-list fashion under the covers.
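To see the difference directly, you can compare heap growth for many distinct strings versus many references to one string. This is a minimal sketch, not the original benchmark: the heapMB helper is hypothetical, exact numbers vary by Node.js/V8 version, and global.gc() is only available when the script is run with --expose-gc.
// Compare heap growth for 1e5 distinct strings vs 1e5 references to one string.
// (Scaled down from the question's 1e6 so it finishes comfortably.)
function heapMB() {
  if (global.gc) global.gc(); // only available when run with --expose-gc
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

let start = heapMB();
const distinct = [];
for (let i = 0; i < 1e5; i++) distinct.push('1'.repeat(100) + i); // a new string every time
console.log('distinct strings: +' + (heapMB() - start).toFixed(1) + ' MB');

start = heapMB();
const shared = [];
const one = '1'.repeat(100);
for (let i = 0; i < 1e5; i++) shared.push(one); // the same string, referenced 1e5 times
console.log('shared string:    +' + (heapMB() - start).toFixed(1) + ' MB');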

Related

Subset Leetcode, size of a List

I am really curious about one thing. In the code below, where I have a nested for loop, I use
subsetArr.size()
as the inner loop's limit, and when I submit it I get a memory-limit-exceeded error. But if I store the size just before the for loop,
int size = subsetArr.size();
and then use
i < size;
as the limit, it works fine. What can be the cause?
class Solution {
  public List<List<Integer>> subsets(int[] nums) {
    List<List<Integer>> subsetArr = new ArrayList<>();
    subsetArr.add(new ArrayList());
    for (int num : nums) {
      for (int i = 0; i < subsetArr.size(); i++) {
        List<Integer> takenList = new ArrayList<>(subsetArr.get(i));
        takenList.add(num);
        subsetArr.add(takenList);
      }
    }
    return subsetArr;
  }
}
Look at what this loop does:
for (int i = 0; i < subsetArr.size(); i++) {
  List<Integer> takenList = new ArrayList<>(subsetArr.get(i));
  takenList.add(num);
  subsetArr.add(takenList); // <-- here
}
Each iteration adds to the collection. So in the next iteration, subsetArr.size() will be larger. Thus you have a loop which indefinitely increases the size of the collection until it runs out of resources.
Contrast that to when you store the value:
int size = subsetArr.size();
In this case, while subsetArr.size() may change, size won't unless you update it. So as long as you don't update size, you have a finite loop.
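For what it's worth, the same pitfall exists in JavaScript. A small hypothetical sketch of the same subsets algorithm (not the poster's code): caching the length before the inner loop keeps it finite, whereas re-reading the length of the array being appended to would never terminate.
const nums = [1, 2, 3];
const subsets = [[]];
for (const num of nums) {
  // cache the length first; the inner loop appends to "subsets",
  // so using subsets.length as the bound would grow forever
  const size = subsets.length;
  for (let i = 0; i < size; i++) {
    subsets.push([...subsets[i], num]);
  }
}
console.log(subsets.length); // 8 subsets of [1, 2, 3]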

I cannot find out why this code keeps skipping a loop

Some background on what is going on:
We are processing addresses into standardized forms. This is the code that takes addresses scored by how many components were found and rescores them using a Levenshtein algorithm across similar post codes.
The scores are how many components were found in that address divided by the number missed, giving a ratio.
The input data, scoreDict, is a dictionary containing arrays of arrays. The first level is keyed by score, so there are 12 arrays because there are 12 scores in this file (it adjusts per file). Inside each of those is a separate array for every address that received that score. Don't ask me why I'm doing it that way, my brain is dead.
The code correctly goes through each score array, and each one is properly filled with the unique elements that make it up. It is not short by any amount, nothing is duplicated, I have checked.
When we hit the score of -1 (assigned to any address that doesn't fit some rule, so we can't use its post code to find components and none are found), the loop ONLY PROCESSES EVERY OTHER ADDRESS IN THAT SCORE ARRAY.
It doesn't do this to any other score array, I have checked.
I have tried changing the number to something else like 99; same issue, except one LESS address got rescored and the rest stayed at the original failing score of 99.
I am going insane. Can anyone find where in this loop something may be going wrong that causes it to only do every other line? The index counters line and sc come through in the correct order and do not skip. I have checked.
I am sorry this is not professional, I have been at this one loop for 5 hours
Rescore: function Rescore(scoreDict) {
  let tempInc = 0;
  //Loop through all scores stored in scoreDict
  for (var line in scoreDict) {
    let addUpdate = "";
    //Loop through each line stored by score
    for (var sc in scoreDict[line.toString()]) {
      console.log(scoreDict[line.toString()].length);
      let possCodes = new Array();
      const curLine = scoreDict[line.toString()][sc];
      console.log(sc);
      const curScore = curLine[1].split(',')[curLine[1].split(',').length - 1];
      switch (true) {
        case curScore == -1:
          let postCode = (new RegExp('([A-PR-UWYZ][A-HK-Y]?[0-9][A-Z0-9]?[ ]?[0-9][ABD-HJLNP-UW-Z]{2})', 'i')).exec(curLine[1].replace(/\\n/g, ','));
          let areaCode;
          //if (curLine.split(',')[curLine.split(',').length-2].includes("REFERENCE")) {
          if ((postCode = (new RegExp('(([A-Z][A-Z]?[0-9][A-Z0-9]?(?=[ ]?[0-9][A-Z]{2}))|[0-9]{5})', 'i').exec(postCode))) !== null) {
            for (const code in Object.keys(addProper)) {
              leven.LoadWords(postCode[0], Object.keys(addProper)[code]);
              if (leven.distance < 2) {
                //Weight will have adjustment algorithms based on other factors
                let weight = 1;
                //Add all codes that are close to the same to a temp array
                possCodes.push(postCode.input.replace(postCode[0], Object.keys(addProper)[code]).split(',')[0] + "(|W|)" + (leven.distance / weight));
              }
            }
            let highScore = 0;
            let candidates = new Array();
            //Use the component script from cityprocess to rescore
            for (var i = 0; i < possCodes.length; i++) {
              postValid.add([curLine[1].split(',').slice(0, curLine[1].split(',').length - 2) + '(|S|)' + possCodes[i].split("(|W|)")[0]]);
              if (postValid.addChunk[0].split('(|S|)')[postValid.addChunk[0].split('(|S|)').length - 1] > highScore) {
                candidates = new Array();
                highScore = postValid.addChunk[0].split('(|S|)')[postValid.addChunk[0].split('(|S|)').length - 1];
                candidates.push(postValid.addChunk[0]);
              } else if (postValid.addChunk[0].split('(|S|)')[postValid.addChunk[0].split('(|S|)').length - 1] == highScore) {
                candidates.push(postValid.addChunk[0]);
              }
            }
            score.Rescore(curLine, sc, candidates[0]);
          }
          //} else if (curLine.split(',')[curLine.split(',').length-2].contains("AREA")) {
          //  leven.LoadWords();
          //}
          break;
        case curScore > 0:
          //console.log("That's a pretty good score mate");
          break;
      }
      //console.log(line + ": " + scoreDict[line].length);
    }
  }
  console.log(tempInc);
  score.ScoreWrite(score.scoreDict);
}
The issue was that I was looping over the array I was editing. As each element got rescored it was removed from the array and moved into a separate one, so the array got shorter by one element each time. After the first element was rescored and removed, the loop moved on to the second index, which was now the third element, because everything had shifted up by one index.
I fixed it by entering an empty placeholder for each removed element, so everything kept its index and the array kept its length, and then clearing the empty values later in the code.
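The same pattern, reduced to a small hypothetical sketch (not the original scoring code): splicing elements out of the array you are iterating shifts the remaining elements up by one index, so every other element gets skipped, while overwriting with a placeholder keeps the indices stable.
const items = ['a', 'b', 'c', 'd'];
// Buggy: removing while iterating skips every other element.
for (let i = 0; i < items.length; i++) {
  // process items[i], then remove it
  items.splice(i, 1); // after removing 'a', index 1 is now 'c', so 'b' is never visited
}

// Stable: mark processed elements with a placeholder, clean up afterwards.
const items2 = ['a', 'b', 'c', 'd'];
for (let i = 0; i < items2.length; i++) {
  // process items2[i], then mark it as handled
  items2[i] = null; // keeps the array length and the remaining indices unchanged
}
const leftovers = items2.filter(x => x !== null); // drop the placeholders later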

Node.js sliced string memory

I was trying:
r = [];
for (i = 0; i < 1e3; i++) {
  a = (i + '').repeat(1e6);
  r[i] = a.slice(64, 128);
}
and got an out-of-memory error. From here we see it's because all the a strings are kept from being garbage collected, since a part of each one is still in use.
How can I make the slice not keep the parent string's memory? I tried r[i] = '' + a.slice(64, 128) + '' but still got OOM. Do I have to build it as a[64] + ... + a[127] (loops also count as brute force)?
Is it so hard to slice a string and keep only the necessary part of the old large one? The problem here only mentioned "copying every substring as a new string", but not "freeing the rest of the string while keeping the necessary part accessible".
In this case it makes sense for the application code to be more aware of system constraints:
const r = [];
for (let i = 0; i < 1e3; ++i) {
  const unitStr = String(i);
  // choose something other than "1e6" here:
  const maxRepeats = Math.ceil(128 / unitStr.length); // limit the size of the new string
  // only characters 64..127 are needed...
  r[i] = unitStr.repeat(maxRepeats).slice(64, 128);
}
The application improvement is to no longer construct 1000 strings of up to 3,000,000 characters each when only 64 characters are needed for each output string.
Your hardware and other constraints are not specified, but sometimes allowing the program more memory is appropriate:
node --max-old-space-size=8192 my-script.js
An analytic approach: use logic to determine more precisely the in-memory state required for each working data chunk, and, within the constraints provided, minimize the generation of unneeded in-memory string data.
const r = new Array(1e3).fill().map((e, i) => outputRepeats(i));

function outputRepeats(idx) {
  const OUTPUT_LENGTH = 128 - 64;
  const unitStr = String(idx); // eg, '1', '40' or '286'
  // determine from which character within "unitStr" the output window starts
  const startIdxWithinUnit = 64 % unitStr.length; // this can be further optimized for known ranges of the "idx" input
  // build the smallest repeated string that covers the output window
  // (may consume additional in-memory bytes: up to unitStr.length - 1)
  // this can be simplified by unconditionally using a few more bytes of memory and eliminating the second arithmetic term
  const maxOutputWindowStr = unitStr.repeat(Math.ceil(OUTPUT_LENGTH / unitStr.length) + Math.sign(startIdxWithinUnit));
  // return the exact resulting string
  return maxOutputWindowStr.slice(startIdxWithinUnit, startIdxWithinUnit + OUTPUT_LENGTH);
}
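If the large parent strings cannot be avoided, a commonly suggested workaround is to force a flat copy of the slice so the parent is no longer referenced, for example by round-tripping it through a Buffer. This is a sketch that relies on engine behaviour rather than anything the language guarantees, and it assumes the garbage collector can reclaim each parent string between iterations:
const r = [];
for (let i = 0; i < 1e3; i++) {
  const a = String(i).repeat(1e6);
  // decoding from a Buffer allocates a fresh string that does not
  // reference "a", so "a" becomes collectable after this iteration
  r[i] = Buffer.from(a.slice(64, 128), 'utf8').toString('utf8');
}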

Javascript ES6 - Node Js - Associative Array - Objects use - wrong default order

I have a problem trying to use associative arrays/objects in Node. With the following code I want to get the elements back in the same order I inserted them.
var aa = []
aa[0] = 1
aa['second'] = 'pep'
aa['third'] = 'rob'
aa[4] = 2
for (var pos in aa) console.log (aa[pos])
But internally Node puts the numeric keys first. Actual output:
1
2
pep
rob
I have created a parallel dynamic structure to solve this problem, but I'd like to ask for another solution or idea to achieve this.
Regards.
Ricardo.
First of all, I'd recommend using an object (dictionary) rather than an array for dynamic keys:
var aa = {};
Elements are listed in their default key order. You can check that order with:
var keys = Object.keys(aa);
for (var i = 0; i < keys.length; i++) {
  console.log(keys[i]);
}
If you need the elements in the same order they were inserted, save the insertion order in another array:
var aa = {};
var keys = [];
aa[0] = 1
keys.push(0);
aa['second'] = 'pep'
keys.push('second');
aa['third'] = 'rob'
keys.push('third');
aa[4] = 2
keys.push(4);
for (var i = 0; i < keys.length; i++) {
  console.log(aa[keys[i]]);
}
You may also want to give some ES6 features a try. Since you want to store data in a hash-like data structure that preserves insertion order, I would recommend giving Map a try:
var map = new Map();
map.set(0, 1);
map.set('second', 'pep');
map.set('third', 'rob');
map.set(4, 2);
for (var [key, value] of map) {
  console.log(key, value);
}
map.forEach(function (value, key) {
  console.log(key, value);
});
Node.js arrays are just objects with numeric keys and some added functions on their prototype. Also, associative arrays are often emulated as two arrays whose values are hinged together via a common index, for example:
let a = [1,2,3];
let b = ["one","two","three"];
Also, try to get away from walking arrays with for...in loops; use the many methods available via the array prototype instead. A for...in loop also iterates enumerable properties added to the object's prototype, so it would incorrectly include those as well.
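For example, the two hinged arrays can be walked with the array methods instead of a for...in loop (a small sketch repeating the a/b arrays from above):
const a = [1, 2, 3];
const b = ["one", "two", "three"];

// pair the values by their common index
a.forEach((value, index) => {
  console.log(value, b[index]); // 1 'one', 2 'two', 3 'three'
});

// or build key/value pairs directly
const pairs = a.map((value, index) => [value, b[index]]); // [[1, 'one'], [2, 'two'], [3, 'three']]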

Node.js variables being changed without any operations performed on them

I am using a node.js server for multiplayer synchronized dice, but I am having some strange problems with variables changing even though they are not referenced or operated on...
var origonalRolls = rolls;
//identify epic fails
var epicFails = [];
for (var r = 0; r < rolls.length; r++)
  if (rolls[r] === 1)
    epicFails.push(r);
console.log("TEST 1 :: " + JSON.stringify(rolls));
console.log("TEST 2 :: " + JSON.stringify(origonalRolls));
//remove epic fails and the corresponding highest
if (epicFails.length > 0) {
  for (var i = epicFails.length - 1; i >= 0; i--) {
    rolls.splice(epicFails[i], 1);
    if (rolls[rolls.length - 1] >= success)
      rolls.splice(rolls.length - 1, 1);
  }
}
console.log("TEST 3 :: " + JSON.stringify(rolls));
console.log("TEST 4 :: " + JSON.stringify(origonalRolls));
The above should find any element in the rolls array which is 1 and add its index to epicFails. It should then remove that element from rolls, as well as the highest remaining roll. (Note: rolls is sorted numerically.)
for some reason the output of this segment of code is as follows:
TEST 1 :: [1,1,2,3,3,6,7,7,9,9]
TEST 2 :: [1,1,2,3,3,6,7,7,9,9]
TEST 3 :: [2,3,3,6,7,7]
TEST 4 :: [2,3,3,6,7,7]
I am unsure why rolls and origonalRolls start out the same and also end up the same, since I am only modifying rolls.
Any help and/or explanation to this problem is welcome, it's been troubling me for a long time now...
In JavaScript, arrays and objects are assigned by reference, not copied. An array variable (origonalRolls) assigned from another array (rolls) is just another reference to the same array; it is not a new array, and modifying values through one variable is visible through the other.
You will need to copy the array to get an independent one. Since rolls contains only numbers, a shallow element-by-element copy is enough; for nested arrays or objects you would need a deep copy.
Replace var origonalRolls = rolls; with:
var origonalRolls = [];
for (var i = 0, len = rolls.length; i < len; i++) {
  origonalRolls[i] = rolls[i];
}
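Since rolls holds only numbers, a shallow copy of the array itself is sufficient; in any reasonably recent Node.js version the loop above can be written more briefly (a sketch keeping the original variable name):
var origonalRolls = rolls.slice(); // copies the elements into a new array
// or, with ES6 spread syntax:
var origonalRolls = [...rolls];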
