I'm currently making a program with many functions that utilise Math.rand(). I'm trying to generate a string with a given keyword (in this case, lathe). I want the program to log a string that has "lathe" (or any version of it, with capitals or not), but everything I've tried has the program hit its call stack size limit (I understand exactly why, I want the program to generate a string with the word without it hitting its call stack size).
What I have tried:
function generateStringWithKeyword(randNum: number) {
const chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/";
let result = "";
for(let i = 0; i < randNum; i++) {
result += chars[Math.floor(Math.random() * chars.length)];
if(result.includes("lathe")) {
continue;
} else {
generateStringWithKeyword(randNum);
}
}
console.log(result);
}
This is what I have now, after doing brief research on stackoverflow I learned that it might have been better to add the if/else block with a continue, rather than using
if(!result.includes("lathe")) return generateStringWithKeyword(randNum);
But both ways I had hit the call stack size limit.
A "correct" version of your algorithm, written as an iterative function instead of as a recursive one so as not to exceed stack depth, would look something like this:
function generateStringWithKeyword(randNum: number) {
const chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/";
let result = "";
let attemptCnt = 0;
while (!result.toLowerCase().includes("lathe")) {
attemptCnt++;
result = "";
for (let i = 0; i < randNum; i++) {
result += chars[Math.floor(Math.random() * chars.length)];
}
if (attemptCnt > 1e6) {
console.log("I GIVE UP");
return;
}
}
console.log(result);
return result;
}
I don't like when my browser hangs because of a script that won't finish, so I put a maximum attempt count in there. A million chances seems reasonable. When you try it out, this happens:
generateStringWithKeyword(10); // I GIVE UP
Which makes sense; let's perform a rough back-of-the-envelope probability calculation to see how long we might expect this to take. The chance that "lathe" will appear in some case at position 1 of the word is (2/64)×(2/64)×(2×64)×(2/64)×(2/64) ("L" or "l" appears first, followed by "A" or "a", etc) which is approximately 3×10-8. For a word of length 10, "lathe" can appear starting at positions 1, 2, 3, 4, 5, or 6. While this isn't exactly correct, let's think of this as multiplying your chances by 6 of getting the word somewhere, so the actual chance of getting a valid result is somewhere around 1.8×10-7. So we can expect that you'd need to make approximately 1 ÷ 1.8×10-7 = 5.6 million chances to succeed.
Oh, darn, I only gave it a million. Let's up that to 10 million and try again:
generateStringWithKeyword(10); // "lATHELEYSc"
Great! Although, it does sometimes still give up. And really, an algorithm which needs millions of tries before it succeeds is very, very inefficient. You might want to read about bogosort, a sorting algorithm which works by randomly shuffling things and checking to see if they are sorted, and it keeps trying until it works. It's used for educational purposes to highlight how such techniques don't really perform well enough to be practical. Nobody would ever want to use such an algorithm for real.
So how would you do this "the right" way? Well, my suggestion here is to just build your result correctly the first time. If you have 10 characters and 5 of them need to be "lathe" in some case, then you will need 5 truly random characters. So randomly decide how many of those letters should be before "lathe". If you pick 2, for example, then put 2 random characters, plus "lathe" in a random case, plus 3 more random characters.
It could be something like this, where I mostly use your same style of for-loops and += string concatenation:
function generateStringWithKeyword(randNum: number) {
const keyword = "lathe";
if (randNum < keyword.length) throw new Error(
"This is not possible; \"" + keyword + "\" doesn't fit in " + randNum + " characters"
);
const actuallyRandNum = randNum - keyword.length;
const chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/";
let result = "";
const kwInsertionPoint = Math.floor(Math.random() * (actuallyRandNum + 1));
for (let i = 0; i < kwInsertionPoint; i++) {
result += chars[Math.floor(Math.random() * chars.length)];
}
for (let i = 0; i < keyword.length; i++) {
result += Math.random() < 0.5 ? keyword[i].toLowerCase() : keyword[i].toUpperCase();
}
for (let i = kwInsertionPoint; i < actuallyRandNum; i++) {
result += chars[Math.floor(Math.random() * chars.length)];
}
return result;
}
If you run this, you will see that it is very efficient, and never gives up:
console.log(Array.from({ length: 4 }, () => generateStringWithKeyword(5)).join(" "));
// "lathE LaThe lATHe LatHe"
console.log(Array.from({ length: 4 }, () => generateStringWithKeyword(7)).join(" "));
// "p6lAtHe laThE01 nlaTheK lATHeRJ"
console.log(Array.from({ length: 4 }, () => generateStringWithKeyword(10)).join(" "));
// "giMqzLaTHe 5klAthegBo oVdLatHe0q twNlATheCr"
Playground link to code
I want to change each word that matches the synonym list randomly by another synonym or itself (to randomly keep this keyword).
I test if a string (input) contains one element of an array (words). If it's true, I want to randomly replace this with the element of this same list.
var input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
words_synonym = ["amazing", "formidable", "great", "smart"];
// first condition --> true if "input" contain one element of "words_synonym"
input = input.toLowerCase();
console.log(words_synonym.some(word => input.includes(word)));
after, I want to replace the "element" that validated the condition with a random element of the same array (words_synonym).
But I can't select this element. I have just true or false
var random_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))]
input = input.replace(element, random_word, 0)
thanks
The way you have it right now, you're checking if any of the synonyms match any of the words (via words_synonym.some(word => input.includes(word))). In order to do what you want, you'll need both the position of the target word and the new word, neither of which you have now. To do this, you'll want to break apart your nested loops.
The code words_synonym.some(word => input.includes(word)) is equivalent to:
let has_synonym = false;
for (word of words_synonym) { // this is a loop
if (input.includes(word)) { // this is also a loop
has_synonym = true;
break;
}
}
console.log(has_synonym);
So to fix your main issue, just replace includes with indexOf.
To handle the case of replacing all of the tokens, I would suggest keeping track of the token you have replaced outside of the loop, otherwise you end up replacing each token many times which may become very expensive. To do this, just keep track of your starting position outside of the loop and increment it with the end index of the replacement word. indexOf already takes a start argument for exactly this use case!
const input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
const words_synonym = ["amazing", "formidable", "great", "smart"];
let output = input;
let start = 0; // index of the end of the last replaced token
for (word of words_synonym) {
let index = output.indexOf(word, start);
while (index >= 0) {
const new_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))];
output = output.substr(0, index) + new_word + output.substr(index + word.length, output.length);
start = index + new_word.length + 1; // increment the start
index = output.indexOf(word, start);
}
}
console.log("input: ", input);
console.log("output: ", output);
You can use method find:
words_synonym.find(word => input.includes(word))
Which returns
The value of the first element in the array that satisfies the
provided testing function. Otherwise, undefined is returned.
from docs:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find
i have modify answer of dantiston and i have include a loop in order to change all the word match "words_synonym".
But there is a problem. The program don't check all the word of "words_synonym" but only the first with indexof.
var input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
words_synonym = ["amazing", "formidable", "great", "smart"];
let output = input;
for (word of words_synonym) {
let index = output.indexOf(word);
if (index >= 0) {
console.log(word);
var indexes = [], i = -1;
while ((i = output.indexOf(word, i+1)) != -1){
index=output.indexOf(word, i);
var new_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))];
output = output.substr(0, index) + new_word + output.substr(index + word.length, output.length);
}
}
}
console.log("input: ", input);
console.log("output: ", output);
How do I convert a string to an integer in JavaScript?
The simplest way would be to use the native Number function:
var x = Number("1000")
If that doesn't work for you, then there are the parseInt, unary plus, parseFloat with floor, and Math.round methods.
parseInt()
var x = parseInt("1000", 10); // You want to use radix 10
// So you get a decimal number even with a leading 0 and an old browser ([IE8, Firefox 20, Chrome 22 and older][1])
Unary plus
If your string is already in the form of an integer:
var x = +"1000";
floor()
If your string is or might be a float and you want an integer:
var x = Math.floor("1000.01"); // floor() automatically converts string to number
Or, if you're going to be using Math.floor several times:
var floor = Math.floor;
var x = floor("1000.01");
parseFloat()
If you're the type who forgets to put the radix in when you call parseInt, you can use parseFloat and round it however you like. Here I use floor.
var floor = Math.floor;
var x = floor(parseFloat("1000.01"));
round()
Interestingly, Math.round (like Math.floor) will do a string to number conversion, so if you want the number rounded (or if you have an integer in the string), this is a great way, maybe my favorite:
var round = Math.round;
var x = round("1000"); // Equivalent to round("1000", 0)
Try parseInt function:
var number = parseInt("10");
But there is a problem. If you try to convert "010" using parseInt function, it detects as octal number, and will return number 8. So, you need to specify a radix (from 2 to 36). In this case base 10.
parseInt(string, radix)
Example:
var result = parseInt("010", 10) == 10; // Returns true
var result = parseInt("010") == 10; // Returns false
Note that parseInt ignores bad data after parsing anything valid.
This guid will parse as 51:
var result = parseInt('51e3daf6-b521-446a-9f5b-a1bb4d8bac36', 10) == 51; // Returns true
There are two main ways to convert a string to a number in JavaScript. One way is to parse it and the other way is to change its type to a Number. All of the tricks in the other answers (e.g., unary plus) involve implicitly coercing the type of the string to a number. You can also do the same thing explicitly with the Number function.
Parsing
var parsed = parseInt("97", 10);
parseInt and parseFloat are the two functions used for parsing strings to numbers. Parsing will stop silently if it hits a character it doesn't recognise, which can be useful for parsing strings like "92px", but it's also somewhat dangerous, since it won't give you any kind of error on bad input, instead you'll get back NaN unless the string starts with a number. Whitespace at the beginning of the string is ignored. Here's an example of it doing something different to what you want, and giving no indication that anything went wrong:
var widgetsSold = parseInt("97,800", 10); // widgetsSold is now 97
It's good practice to always specify the radix as the second argument. In older browsers, if the string started with a 0, it would be interpreted as octal if the radix wasn't specified which took a lot of people by surprise. The behaviour for hexadecimal is triggered by having the string start with 0x if no radix is specified, e.g., 0xff. The standard actually changed with ECMAScript 5, so modern browsers no longer trigger octal when there's a leading 0 if no radix has been specified. parseInt understands radixes up to base 36, in which case both upper and lower case letters are treated as equivalent.
Changing the Type of a String to a Number
All of the other tricks mentioned above that don't use parseInt, involve implicitly coercing the string into a number. I prefer to do this explicitly,
var cast = Number("97");
This has different behavior to the parse methods (although it still ignores whitespace). It's more strict: if it doesn't understand the whole of the string than it returns NaN, so you can't use it for strings like 97px. Since you want a primitive number rather than a Number wrapper object, make sure you don't put new in front of the Number function.
Obviously, converting to a Number gives you a value that might be a float rather than an integer, so if you want an integer, you need to modify it. There are a few ways of doing this:
var rounded = Math.floor(Number("97.654")); // other options are Math.ceil, Math.round
var fixed = Number("97.654").toFixed(0); // rounded rather than truncated
var bitwised = Number("97.654")|0; // do not use for large numbers
Any bitwise operator (here I've done a bitwise or, but you could also do double negation as in an earlier answer or a bit shift) will convert the value to a 32 bit integer, and most of them will convert to a signed integer. Note that this will not do want you want for large integers. If the integer cannot be represented in 32 bits, it will wrap.
~~"3000000000.654" === -1294967296
// This is the same as
Number("3000000000.654")|0
"3000000000.654" >>> 0 === 3000000000 // unsigned right shift gives you an extra bit
"300000000000.654" >>> 0 === 3647256576 // but still fails with larger numbers
To work correctly with larger numbers, you should use the rounding methods
Math.floor("3000000000.654") === 3000000000
// This is the same as
Math.floor(Number("3000000000.654"))
Bear in mind that coercion understands exponential notation and Infinity, so 2e2 is 200 rather than NaN, while the parse methods don't.
Custom
It's unlikely that either of these methods do exactly what you want. For example, usually I would want an error thrown if parsing fails, and I don't need support for Infinity, exponentials or leading whitespace. Depending on your use case, sometimes it makes sense to write a custom conversion function.
Always check that the output of Number or one of the parse methods is the sort of number you expect. You will almost certainly want to use isNaN to make sure the number is not NaN (usually the only way you find out that the parse failed).
ParseInt() and + are different
parseInt("10.3456") // returns 10
+"10.3456" // returns 10.3456
Fastest
var x = "1000"*1;
Test
Here is little comparison of speed (macOS only)... :)
For Chrome, 'plus' and 'mul' are fastest (>700,000,00 op/sec), 'Math.floor' is slowest. For Firefox, 'plus' is slowest (!) 'mul' is fastest (>900,000,000 op/sec). In Safari 'parseInt' is fastest, 'number' is slowest (but results are quite similar, >13,000,000 <31,000,000). So Safari for cast string to int is more than 10x slower than other browsers. So the winner is 'mul' :)
You can run it on your browser by this link
https://jsperf.com/js-cast-str-to-number/1
I also tested var x = ~~"1000";. On Chrome and Safari, it is a little bit slower than var x = "1000"*1 (<1%), and on Firefox it is a little bit faster (<1%).
I use this way of converting string to number:
var str = "25"; // String
var number = str*1; // Number
So, when multiplying by 1, the value does not change, but JavaScript automatically returns a number.
But as it is shown below, this should be used if you are sure that the str is a number (or can be represented as a number), otherwise it will return NaN - not a number.
You can create simple function to use, e.g.,
function toNumber(str) {
return str*1;
}
Try parseInt.
var number = parseInt("10", 10); //number will have value of 10.
I love this trick:
~~"2.123"; //2
~~"5"; //5
The double bitwise negative drops off anything after the decimal point AND converts it to a number format. I've been told it's slightly faster than calling functions and whatnot, but I'm not entirely convinced.
Another method I just saw here (a question about the JavaScript >>> operator, which is a zero-fill right shift) which shows that shifting a number by 0 with this operator converts the number to a uint32 which is nice if you also want it unsigned. Again, this converts to an unsigned integer, which can lead to strange behaviors if you use a signed number.
"-2.123" >>> 0; // 4294967294
"2.123" >>> 0; // 2
"-5" >>> 0; // 4294967291
"5" >>> 0; // 5
In JavaScript, you can do the following:
ParseInt
parseInt("10.5") // Returns 10
Multiplying with 1
var s = "10";
s = s*1; // Returns 10
Using the unary operator (+)
var s = "10";
s = +s; // Returns 10
Using a bitwise operator
(Note: It starts to break after 2140000000. Example: ~~"2150000000" = -2144967296)
var s = "10.5";
s = ~~s; // Returns 10
Using Math.floor() or Math.ceil()
var s = "10";
s = Math.floor(s) || Math.ceil(s); // Returns 10
Please see the below example. It will help answer your question.
Example Result
parseInt("4") 4
parseInt("5aaa") 5
parseInt("4.33333") 4
parseInt("aaa"); NaN (means "Not a Number")
By using parseint function, it will only give op of integer present and not the string.
Beware if you use parseInt to convert a float in scientific notation!
For example:
parseInt("5.6e-14")
will result in
5
instead of
0
Also as a side note: MooTools has the function toInt() which is used on any native string (or float (or integer)).
"2".toInt() // 2
"2px".toInt() // 2
2.toInt() // 2
We can use +(stringOfNumber) instead of using parseInt(stringOfNumber).
Example: +("21") returns int of 21, like the parseInt("21").
We can use this unary "+" operator for parsing float too...
To convert a String into Integer, I recommend using parseFloat and not parseInt. Here's why:
Using parseFloat:
parseFloat('2.34cms') //Output: 2.34
parseFloat('12.5') //Output: 12.5
parseFloat('012.3') //Output: 12.3
Using parseInt:
parseInt('2.34cms') //Output: 2
parseInt('12.5') //Output: 12
parseInt('012.3') //Output: 12
So if you have noticed parseInt discards the values after the decimals, whereas parseFloat lets you work with floating point numbers and hence more suitable if you want to retain the values after decimals. Use parseInt if and only if you are sure that you want the integer value.
There are many ways in JavaScript to convert a string to a number value... All are simple and handy. Choose the way which one works for you:
var num = Number("999.5"); //999.5
var num = parseInt("999.5", 10); //999
var num = parseFloat("999.5"); //999.5
var num = +"999.5"; //999.5
Also, any Math operation converts them to number, for example...
var num = "999.5" / 1; //999.5
var num = "999.5" * 1; //999.5
var num = "999.5" - 1 + 1; //999.5
var num = "999.5" - 0; //999.5
var num = Math.floor("999.5"); //999
var num = ~~"999.5"; //999
My prefer way is using + sign, which is the elegant way to convert a string to number in JavaScript.
Try str - 0 to convert string to number.
> str = '0'
> str - 0
0
> str = '123'
> str - 0
123
> str = '-12'
> str - 0
-12
> str = 'asdf'
> str - 0
NaN
> str = '12.34'
> str - 0
12.34
Here are two links to compare the performance of several ways to convert string to int
https://jsperf.com/number-vs-parseint-vs-plus
http://phrogz.net/js/string_to_number.html
Here is the easiest solution
let myNumber = "123" | 0;
More easy solution
let myNumber = +"123";
In my opinion, no answer covers all edge cases as parsing a float should result in an error.
function parseInteger(value) {
if(value === '') return NaN;
const number = Number(value);
return Number.isInteger(number) ? number : NaN;
}
parseInteger("4") // 4
parseInteger("5aaa") // NaN
parseInteger("4.33333") // NaN
parseInteger("aaa"); // NaN
The easiest way would be to use + like this
const strTen = "10"
const numTen = +strTen // string to number conversion
console.log(typeof strTen) // string
console.log(typeof numTen) // number
I actually needed to "save" a string as an integer, for a binding between C and JavaScript, so I convert the string into an integer value:
/*
Examples:
int2str( str2int("test") ) == "test" // true
int2str( str2int("t€st") ) // "t¬st", because "€".charCodeAt(0) is 8364, will be AND'ed with 0xff
Limitations:
maximum 4 characters, so it fits into an integer
*/
function str2int(the_str) {
var ret = 0;
var len = the_str.length;
if (len >= 1) ret += (the_str.charCodeAt(0) & 0xff) << 0;
if (len >= 2) ret += (the_str.charCodeAt(1) & 0xff) << 8;
if (len >= 3) ret += (the_str.charCodeAt(2) & 0xff) << 16;
if (len >= 4) ret += (the_str.charCodeAt(3) & 0xff) << 24;
return ret;
}
function int2str(the_int) {
var tmp = [
(the_int & 0x000000ff) >> 0,
(the_int & 0x0000ff00) >> 8,
(the_int & 0x00ff0000) >> 16,
(the_int & 0xff000000) >> 24
];
var ret = "";
for (var i=0; i<4; i++) {
if (tmp[i] == 0)
break;
ret += String.fromCharCode(tmp[i]);
}
return ret;
}
String to Number in JavaScript:
Unary + (most recommended)
+numStr is easy to use and has better performance compared with others
Supports both integers and decimals
console.log(+'123.45') // => 123.45
Some other options:
Parsing Strings:
parseInt(numStr) for integers
parseFloat(numStr) for both integers and decimals
console.log(parseInt('123.456')) // => 123
console.log(parseFloat('123')) // => 123
JavaScript Functions
Math functions like round(numStr), floor(numStr), ceil(numStr) for integers
Number(numStr) for both integers and decimals
console.log(Math.floor('123')) // => 123
console.log(Math.round('123.456')) // => 123
console.log(Math.ceil('123.454')) // => 124
console.log(Number('123.123')) // => 123.123
Unary Operators
All basic unary operators, +numStr, numStr-0, 1*numStr, numStr*1, and numStr/1
All support both integers and decimals
Be cautious about numStr+0. It returns a string.
console.log(+'123') // => 123
console.log('002'-0) // => 2
console.log(1*'5') // => 5
console.log('7.7'*1) // => 7.7
console.log(3.3/1) // =>3.3
console.log('123.123'+0, typeof ('123.123' + 0)) // => 123.1230 string
Bitwise Operators
Two tilde ~~numStr or left shift 0, numStr<<0
Supports only integers, but not decimals
console.log(~~'123') // => 123
console.log('0123'<<0) // => 123
console.log(~~'123.123') // => 123
console.log('123.123'<<0) // => 123
// Parsing
console.log(parseInt('123.456')) // => 123
console.log(parseFloat('123')) // => 123
// Function
console.log(Math.floor('123')) // => 123
console.log(Math.round('123.456')) // => 123
console.log(Math.ceil('123.454')) // => 124
console.log(Number('123.123')) // => 123.123
// Unary
console.log(+'123') // => 123
console.log('002'-0) // => 2
console.log(1*'5') // => 5
console.log('7.7'*1) // => 7.7
console.log(3.3/1) // => 3.3
console.log('123.123'+0, typeof ('123.123'+0)) // => 123.1230 string
// Bitwise
console.log(~~'123') // => 123
console.log('0123'<<0) // => 123
console.log(~~'123.123') // => 123
console.log('123.123'<<0) // => 123
function parseIntSmarter(str) {
// ParseInt is bad because it returns 22 for "22thisendsintext"
// Number() is returns NaN if it ends in non-numbers, but it returns 0 for empty or whitespace strings.
return isNaN(Number(str)) ? NaN : parseInt(str, 10);
}
You can use plus.
For example:
var personAge = '24';
var personAge1 = (+personAge)
then you can see the new variable's type bytypeof personAge1 ; which is number.
Summing the multiplication of digits with their respective power of ten:
i.e: 123 = 100+20+3 = 1100 + 2+10 + 31 = 1*(10^2) + 2*(10^1) + 3*(10^0)
function atoi(array) {
// Use exp as (length - i), other option would be
// to reverse the array.
// Multiply a[i] * 10^(exp) and sum
let sum = 0;
for (let i = 0; i < array.length; i++) {
let exp = array.length - (i+1);
let value = array[i] * Math.pow(10, exp);
sum += value;
}
return sum;
}
The safest way to ensure you get a valid integer:
let integer = (parseInt(value, 10) || 0);
Examples:
// Example 1 - Invalid value:
let value = null;
let integer = (parseInt(value, 10) || 0);
// => integer = 0
// Example 2 - Valid value:
let value = "1230.42";
let integer = (parseInt(value, 10) || 0);
// => integer = 1230
// Example 3 - Invalid value:
let value = () => { return 412 };
let integer = (parseInt(value, 10) || 0);
// => integer = 0
Another option is to double XOR the value with itself:
var i = 12.34;
console.log('i = ' + i);
console.log('i ⊕ i ⊕ i = ' + (i ^ i ^ i));
This will output:
i = 12.34
i ⊕ i ⊕ i = 12
I only added one plus(+) before string and that was solution!
+"052254" // 52254
Number()
Number(" 200.12 ") // Returns 200.12
Number("200.12") // Returns 200.12
Number("200") // Returns 200
parseInt()
parseInt(" 200.12 ") // Return 200
parseInt("200.12") // Return 200
parseInt("200") // Return 200
parseInt("Text information") // Returns NaN
parseFloat()
It will return the first number
parseFloat("200 400") // Returns 200
parseFloat("200") // Returns 200
parseFloat("Text information") // Returns NaN
parseFloat("200.10") // Return 200.10
Math.floor()
Round a number to the nearest integer
Math.floor(" 200.12 ") // Return 200
Math.floor("200.12") // Return 200
Math.floor("200") // Return 200
function doSth(){
var a = document.getElementById('input').value;
document.getElementById('number').innerHTML = toNumber(a) + 1;
}
function toNumber(str){
return +str;
}
<input id="input" type="text">
<input onclick="doSth()" type="submit">
<span id="number"></span>
This (probably) isn't the best solution for parsing an integer, but if you need to "extract" one, for example:
"1a2b3c" === 123
"198some text2hello world!30" === 198230
// ...
this would work (only for integers):
var str = '3a9b0c3d2e9f8g'
function extractInteger(str) {
var result = 0;
var factor = 1
for (var i = str.length; i > 0; i--) {
if (!isNaN(str[i - 1])) {
result += parseInt(str[i - 1]) * factor
factor *= 10
}
}
return result
}
console.log(extractInteger(str))
Of course, this would also work for parsing an integer, but would be slower than other methods.
You could also parse integers with this method and return NaN if the string isn't a number, but I don't see why you'd want to since this relies on parseInt internally and parseInt is probably faster.
var str = '3a9b0c3d2e9f8g'
function extractInteger(str) {
var result = 0;
var factor = 1
for (var i = str.length; i > 0; i--) {
if (isNaN(str[i - 1])) return NaN
result += parseInt(str[i - 1]) * factor
factor *= 10
}
return result
}
console.log(extractInteger(str))
Thinking about this question on testing string rotation, I wondered: Is there was such thing as a circular/cyclic hash function? E.g.
h(abcdef) = h(bcdefa) = h(cdefab) etc
Uses for this include scalable algorithms which can check n strings against each other to see where some are rotations of others.
I suppose the essence of the hash is to extract information which is order-specific but not position-specific. Maybe something that finds a deterministic 'first position', rotates to it and hashes the result?
It all seems plausible, but slightly beyond my grasp at the moment; it must be out there already...
I'd go along with your deterministic "first position" - find the "least" character; if it appears twice, use the next character as the tie breaker (etc). You can then rotate to a "canonical" position, and hash that in a normal way. If the tie breakers run for the entire course of the string, then you've got a string which is a rotation of itself (if you see what I mean) and it doesn't matter which you pick to be "first".
So:
"abcdef" => hash("abcdef")
"defabc" => hash("abcdef")
"abaac" => hash("aacab") (tie-break between aa, ac and ab)
"cabcab" => hash("abcabc") (it doesn't matter which "a" comes first!)
Update: As Jon pointed out, the first approach doesn't handle strings with repetition very well. Problems arise as duplicate pairs of letters are encountered and the resulting XOR is 0. Here is a modification that I believe fixes the the original algorithm. It uses Euclid-Fermat sequences to generate pairwise coprime integers for each additional occurrence of a character in the string. The result is that the XOR for duplicate pairs is non-zero.
I've also cleaned up the algorithm slightly. Note that the array containing the EF sequences only supports characters in the range 0x00 to 0xFF. This was just a cheap way to demonstrate the algorithm. Also, the algorithm still has runtime O(n) where n is the length of the string.
static int Hash(string s)
{
int H = 0;
if (s.Length > 0)
{
//any arbitrary coprime numbers
int a = s.Length, b = s.Length + 1;
//an array of Euclid-Fermat sequences to generate additional coprimes for each duplicate character occurrence
int[] c = new int[0xFF];
for (int i = 1; i < c.Length; i++)
{
c[i] = i + 1;
}
Func<char, int> NextCoprime = (x) => c[x] = (c[x] - x) * c[x] + x;
Func<char, char, int> NextPair = (x, y) => a * NextCoprime(x) * x.GetHashCode() + b * y.GetHashCode();
//for i=0 we need to wrap around to the last character
H = NextPair(s[s.Length - 1], s[0]);
//for i=1...n we use the previous character
for (int i = 1; i < s.Length; i++)
{
H ^= NextPair(s[i - 1], s[i]);
}
}
return H;
}
static void Main(string[] args)
{
Console.WriteLine("{0:X8}", Hash("abcdef"));
Console.WriteLine("{0:X8}", Hash("bcdefa"));
Console.WriteLine("{0:X8}", Hash("cdefab"));
Console.WriteLine("{0:X8}", Hash("cdfeab"));
Console.WriteLine("{0:X8}", Hash("a0a0"));
Console.WriteLine("{0:X8}", Hash("1010"));
Console.WriteLine("{0:X8}", Hash("0abc0def0ghi"));
Console.WriteLine("{0:X8}", Hash("0def0abc0ghi"));
}
The output is now:
7F7D7F7F
7F7D7F7F
7F7D7F7F
7F417F4F
C796C7F0
E090E0F0
A909BB71
A959BB71
First Version (which isn't complete): Use XOR which is commutative (order doesn't matter) and another little trick involving coprimes to combine ordered hashes of pairs of letters in the string. Here is an example in C#:
static int Hash(char[] s)
{
//any arbitrary coprime numbers
const int a = 7, b = 13;
int H = 0;
if (s.Length > 0)
{
//for i=0 we need to wrap around to the last character
H ^= (a * s[s.Length - 1].GetHashCode()) + (b * s[0].GetHashCode());
//for i=1...n we use the previous character
for (int i = 1; i < s.Length; i++)
{
H ^= (a * s[i - 1].GetHashCode()) + (b * s[i].GetHashCode());
}
}
return H;
}
static void Main(string[] args)
{
Console.WriteLine(Hash("abcdef".ToCharArray()));
Console.WriteLine(Hash("bcdefa".ToCharArray()));
Console.WriteLine(Hash("cdefab".ToCharArray()));
Console.WriteLine(Hash("cdfeab".ToCharArray()));
}
The output is:
4587590
4587590
4587590
7077996
You could find a deterministic first position by always starting at the position with the "lowest" (in terms of alphabetical ordering) substring. So in your case, you'd always start at "a". If there were multiple "a"s, you'd have to take two characters into account etc.
I am sure that you could find a function that can generate the same hash regardless of character position in the input, however, how will you ensure that h(abc) != h(efg) for every conceivable input? (Collisions will occur for all hash algorithms, so I mean, how do you minimize this risk.)
You'd need some additional checks even after generating the hash to ensure that the strings contain the same characters.
Here's an implementation using Linq
public string ToCanonicalOrder(string input)
{
char first = input.OrderBy(x => x).First();
string doubledForRotation = input + input;
string canonicalOrder
= (-1)
.GenerateFrom(x => doubledForRotation.IndexOf(first, x + 1))
.Skip(1) // the -1
.TakeWhile(x => x < input.Length)
.Select(x => doubledForRotation.Substring(x, input.Length))
.OrderBy(x => x)
.First();
return canonicalOrder;
}
assuming generic generator extension method:
public static class TExtensions
{
public static IEnumerable<T> GenerateFrom<T>(this T initial, Func<T, T> next)
{
var current = initial;
while (true)
{
yield return current;
current = next(current);
}
}
}
sample usage:
var sequences = new[]
{
"abcdef", "bcdefa", "cdefab",
"defabc", "efabcd", "fabcde",
"abaac", "cabcab"
};
foreach (string sequence in sequences)
{
Console.WriteLine(ToCanonicalOrder(sequence));
}
output:
abcdef
abcdef
abcdef
abcdef
abcdef
abcdef
aacab
abcabc
then call .GetHashCode() on the result if necessary.
sample usage if ToCanonicalOrder() is converted to an extension method:
sequence.ToCanonicalOrder().GetHashCode();
One possibility is to combine the hash functions of all circular shifts of your input into one meta-hash which does not depend on the order of the inputs.
More formally, consider
for(int i=0; i<string.length; i++) {
result^=string.rotatedBy(i).hashCode();
}
Where you could replace the ^= with any other commutative operation.
More examply, consider the input
"abcd"
to get the hash we take
hash("abcd") ^ hash("dabc") ^ hash("cdab") ^ hash("bcda").
As we can see, taking the hash of any of these permutations will only change the order that you are evaluating the XOR, which won't change its value.
I did something like this for a project in college. There were 2 approaches I used to try to optimize a Travelling-Salesman problem. I think if the elements are NOT guaranteed to be unique, the second solution would take a bit more checking, but the first one should work.
If you can represent the string as a matrix of associations so abcdef would look like
a b c d e f
a x
b x
c x
d x
e x
f x
But so would any combination of those associations. It would be trivial to compare those matrices.
Another quicker trick would be to rotate the string so that the "first" letter is first. Then if you have the same starting point, the same strings will be identical.
Here is some Ruby code:
def normalize_string(string)
myarray = string.split(//) # split into an array
index = myarray.index(myarray.min) # find the index of the minimum element
index.times do
myarray.push(myarray.shift) # move stuff from the front to the back
end
return myarray.join
end
p normalize_string('abcdef').eql?normalize_string('defabc') # should return true
Maybe use a rolling hash for each offset (RabinKarp like) and return the minimum hash value? There could be collisions though.