Convert String of words/letters into an Integer - string

Today I've finally decided to make an account, in the hope of getting some help with an issue I've spent the last few hours hunting down, from Google to here to Unity Answers. Nothing I've found so far works.
What I'm looking for is a way to change a string of purely words/letters into an integer, so that "Hello World" would be translated into a string of numbers accordingly. This may be surprising, but it is a lot harder than it sounds. So far I've found a way to do essentially everything but this.
Presumably the best way would be to get the ASCII value of each letter in the string and put them all together into a single integer. (No sequences or need to separate them, just one single number.) I have no idea where to get started or how to do that, however. Really anything that you think would work, preferably as short-hand and un-bothersome as possible.
To be as clear as possible, I need to take the letter-only variable "example" and turn it into an integer, i.e. nothing but a sequence of digits.

If you're just trying to convert an arbitrary string into a random seed, then why not try randomSeed.GetHashCode()? That will return an int value suitable for setting the seed, which would produce the same number each time the same string is entered.
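The same idea can be sketched outside Unity. This is a minimal Python sketch, not the C# call above: it assumes a SHA-256 digest is an acceptable stand-in for the hash, because Python's built-in hash() is randomized per process for strings. The name seed_from_string is hypothetical.

```python
import hashlib
import random


def seed_from_string(text: str) -> int:
    """Map any string to a repeatable 32-bit seed (hypothetical helper)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Keep only the first 4 bytes so the seed fits in an unsigned 32-bit int.
    return int.from_bytes(digest[:4], "big")


# Two generators seeded from the same string produce the same sequence.
rng_a = random.Random(seed_from_string("Hello World"))
rng_b = random.Random(seed_from_string("Hello World"))
first_a = rng_a.random()
first_b = rng_b.random()
```

The seed is a fingerprint of the text, not an encoding of it, so the original string cannot be recovered from the number.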

You can iterate over all characters, get their charCode and chain them together. The first method splits the string into single chars and uses Array.reduce:
var str = 'qwertzuiop';
var num = parseInt(str.split('').reduce(function(a, b) {return a + b.charCodeAt(0);}, ''), 10);
The second calls Array.forEach on the string, because it has numerical indices and a length property.
var num = ''; [].forEach.call(str, function(c) {num += c.charCodeAt(0);});
num = parseInt(num);
In stoneaged browsers you have to use for-loops instead.
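For comparison, here is a rough Python sketch of the same "chain the character codes together" idea. Python integers have arbitrary precision, whereas a JavaScript number stops being exact above 2^53, so long strings lose digits in the snippets above. Both function names are hypothetical, and the second function is an alternative encoding that does not appear in the answer.

```python
def char_codes_to_int(text: str) -> int:
    """Concatenate the decimal code of every character into one integer."""
    digits = "".join(str(ord(ch)) for ch in text)
    # An empty string has no digits; treat it as 0 instead of raising.
    return int(digits) if digits else 0


def bytes_to_int(text: str) -> int:
    """Alternative: read the UTF-8 bytes as one big-endian base-256 number."""
    return int.from_bytes(text.encode("utf-8"), "big")


hello = char_codes_to_int("Hello World")
```

Note that plain concatenation is ambiguous, since codes have different lengths and different strings can yield the same digits. The base-256 variant avoids that for ordinary text.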

Related

Hash function to see if one string is scrambled form/permutation of another?

I want to check if string A is just a reordered version of string B. For example, "abc" = "bca" = "cab"...
There are other solutions here: https://www.geeksforgeeks.org/check-if-two-strings-are-permutation-of-each-other/
However, I was thinking a hash function would be an easy way of doing this, but the typical hash function takes order into consideration. Are there any hash functions that do not care about character order?
Are there any hash functions that do not care about character order?
I don't know of real-world hash functions that have this property, no. Because this is not a problem they are designed to solve.
However, in this specific case, you can make your own "hash" function (a very very bad one) that will indeed ignore order: just sum ASCII codes of characters. This works due to the commutative property of addition (a + b == b + a)
def isAnagram(self,a,b):
    sum_a = 0
    sum_b = 0
    for c in a:
        sum_a += ord(c)
    for c in b:
        sum_b += ord(c)
    return sum_a == sum_b
To reiterate, this is absolutely a hack, that only happens to work because input strings are limited in content in the judge system (only have lowercase ASCII characters and do not contain spaces). It will not work (reliably) on arbitrary strings.
For a fast check you could use a kind of hash function.
Candidates are:
xor all characters of a String
add all characters of a String
multiply all characters of a String (be careful: this might overflow for long Strings)
If the hash values are equal, it could still be a collision between two strings that are not permutations of each other. So you still need a dedicated comparison (e.g. sort the characters of each string before comparing them).
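A minimal Python sketch of that two-step check, using the "add all characters" candidate as the cheap filter and a sorted comparison as the dedicated compare. The function names are hypothetical.

```python
def order_insensitive_hash(text: str) -> int:
    """Sum of character codes; the same for every reordering of text."""
    return sum(ord(ch) for ch in text)


def is_permutation(a: str, b: str) -> bool:
    """True if b is a reordering of a."""
    # Cheap rejections first: different length or different hash.
    if len(a) != len(b):
        return False
    if order_insensitive_hash(a) != order_insensitive_hash(b):
        return False
    # Equal hashes may still be a collision, so compare the sorted characters.
    return sorted(a) == sorted(b)


collision = order_insensitive_hash("ad") == order_insensitive_hash("bc")
```

"ad" and "bc" both sum to 197, which is exactly the kind of collision the final comparison has to catch.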

String matching without using builtin functions

I want to search for a query (a string) in a subject (another string).
The query may appear in whole or in parts, but will not be rearranged. For instance, if the query is 'da', and the subject is 'dura', it is still a match.
I am not allowed to use string functions like strfind or find.
The constraints make this actually quite straightforward with a single loop. Imagine you have two indices initially pointing at the first character of both strings, now compare them - if they don't match, increment the subject index and try again. If they do, increment both. If you've reached the end of the query at that point, you've found it. The actual implementation should be simple enough, and I don't want to do all the work for you ;)
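If you want to compare against your own attempt afterwards, the two-index idea might look roughly like this in Python. It is a sketch rather than MATLAB, and the function name is hypothetical; the only string operations it uses are indexing and length.

```python
def matches_in_order(query: str, subject: str) -> bool:
    """True if the chars of query appear in subject in the same order."""
    q = 0  # index into query
    s = 0  # index into subject
    while q < len(query) and s < len(subject):
        if query[q] == subject[s]:
            q += 1  # matched: advance the query index as well
        s += 1      # always advance the subject index
    # Reaching the end of the query means every char was found in order.
    return q == len(query)
```

For example, matches_in_order("da", "dura") is true, while matches_in_order("ad", "dura") is false because no 'd' follows the 'a'.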
If this is homework, I suggest you look at the explanation which precedes the code and then try for yourself, before looking at the actual code.
The code below looks for all occurrences of chars of the query string within the subject string (variables m; and related ii, jj). It then tests all possible orders of those occurrences (variable test). An order is "acceptable" if it contains all desired chars (cond1) in increasing positions (cond2). The result (variable result) is affirmative if there is at least one acceptable order.
subject = 'this is a test string';
query = 'ten';
m = bsxfun(@eq, subject.', query);
%'// m: test if each char of query equals each char of subject
[ii jj] = find(m);
jj = jj.'; %'// jj: which char of query is found within subject...
ii = ii.'; %'// ii: ... and at which position
test = nchoosek(1:numel(jj),numel(query)).'; %'// test all possible orders
cond1 = all(jj(test) == repmat((1:numel(query)).',1,size(test,2)));
%'// cond1: for each order, are all chars of query found in subject?
cond2 = all(diff(ii(test))>0);
%// cond2: for each order, are the found chars in increasing positions?
result = any(cond1 & cond2); %// final result: 1 or 0
The code could be improved by using a better approach as regards to test, i.e. not testing all possible orders given by nchoosek.
Matlab allows you to view the source of built-in functions, so you could always try reading the code to see how the Matlab developers did it (although it will probably be very complex). (thanks Luis for the correction)
Finding a string in another string is a basic computer science problem. You can read up on it in any number of resources, such as Wikipedia.
Your requirement of non-rearranging partial matches recalls the bioinformatics problem of mapping splice variants to a genomic sequence.
You may solve your problem by using a sequence alignment algorithm such as Smith-Waterman, modified to work with all English characters and not just DNA bases.
Is this question actually from bioinformatics? If so, you should tag it as such.

Extract decimal part in a string in C#

I have a string whose data is loaded in this format: "float;#123456.0300000". From it I need to extract only 123456.03 and trim all other leading and trailing characters. Maybe I am missing something very basic; please let me know how I can do that. Thanks.
If the format is always float;# followed by the bit you want, then it's fairly simple:
// TODO: Validate that it actually starts with "float;#". What do you want
// to do if it doesn't?
string userText = originalText.Substring("float;#".Length);
// We wouldn't want to trim "300" to "3"
if (userText.Contains("."))
{
    userText = userText.TrimEnd('0');
    // Trim "123.000" to "123" instead of "123."
    if (userText.EndsWith("."))
    {
        userText = userText.Substring(0, userText.Length - 1);
    }
}
But you really need to be confident in the format - that there won't be anything else you need to remove, etc.
In particular:
What's the maximum number of decimal places you want to support?
Might there be thousands separators?
Will the decimal separator always be a period?
Can the value be negative?
Is an explicit leading + valid?
Do you need to handle scientific format (1.234e5 etc)?
Might there be trailing characters?
Might the trailing characters include digits?
Do you need to handle non-ASCII digits?
You may well want to use a regular expression if you need anything more complicated than the code above, but you'll really want to know as much as you can about your possible inputs before going much further. (I'd strongly recommend writing unit tests for every form of input you can think of.)
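As an illustration of the regular-expression route, here is a small Python sketch with a few of the test inputs spelled out. It is not C#, and the pattern encodes assumptions that the questions above are meant to settle: the prefix is exactly "float;#", digits are ASCII, the separator is a period, a leading minus is allowed, and nothing may trail the number. The names PATTERN and extract_decimal are hypothetical.

```python
import re
from decimal import Decimal

# Assumed format: "float;#" followed by ASCII digits with an optional fraction.
PATTERN = re.compile(r"^float;#(-?[0-9]+(?:\.[0-9]+)?)$")


def extract_decimal(text: str) -> str:
    """Return the numeric part without trailing fractional zeros."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("unexpected format: " + repr(text))
    # normalize() strips trailing zeros; the "f" format avoids exponent
    # notation, so 300 stays "300". Decimal's default context keeps 28
    # significant digits, which this sketch assumes is enough.
    return format(Decimal(match.group(1)).normalize(), "f")
```

Inputs that do not fit the assumed format raise ValueError, which keeps unexpected data from being trimmed silently.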
Why not just do:
String s = "float;#123456.0300000";
double d = double.Parse(s.Substring(s.IndexOf('#') + 1), System.Globalization.CultureInfo.InvariantCulture);
I haven't tested this, but it should be close.
Assuming the format doesn't change, just take everything after the '#' and parse it as a double, so you lose the trailing zeros. (int.Parse would throw a FormatException on the decimal point.)
If you need it as a string, then just call the ToString method.

repeat string with LINQ/extensions methods [duplicate]

Just a curiosity I was investigating.
The matter: simply repeating (multiplying, someone would say) a string/character n times.
I know there is Enumerable.Repeat for this aim, but I was trying to do this without it.
LINQ in this case seems pretty useless, because in a query like
from X in "s" select X
the string "s" is being explored and so X is a char. The same is with extension methods, because for example "s".Aggregate(blablabla) would again work on just the character 's', not the string itself. For repeating the string something "external" would be needed, so I thought lambdas and delegates, but it can't be done without declaring a variable to assign the delegate/lambda expression to.
So something like defining a function and calling it inline:
( (a)=>{return " "+a;} )("a");
or
delegate(string a){return " "+a}(" ");
would give a "without name" error (and so no recursion, AFAIK, even by passing a possible lambda/delegate as a parameter), and in the end couldn't even be created by C# because of its limitations.
It could be that I'm watching this thing from the wrong perspective. Any ideas?
This is just an experiment, I don't care about performances, about memory use... Just that it is one line and sort of autonomous. Maybe one could do something with Copy/CopyTo, or casting it to some other collection, I don't know. Reflection is accepted too.
To repeat a character n-times you would not use Enumerable.Repeat but just this string constructor:
string str = new string('X', 10);
To repeat a string I don't know anything better than using string.Join and Enumerable.Repeat:
string foo = "Foo";
string str = string.Join("", Enumerable.Repeat(foo, 10));
edit: you could use string.Concat instead if you need no separator:
string str = string.Concat( Enumerable.Repeat(foo, 10) );
If you're trying to repeat a string, rather than a character, a simple way would be to use the StringBuilder.Insert method, which takes an insertion index and a count for the number of repetitions to use:
var sb = new StringBuilder();
sb.Insert(0, "hi!", 5);
Console.WriteLine(sb.ToString());
Otherwise, to repeat a single character, use the string constructor as I've mentioned in the comments for the similar question here. For example:
string result = new String('-', 5); // -----
For the sake of completeness, it's worth noting that StringBuilder provides an overloaded Append method that can repeat a character, but has no such overload for strings (which is where the Insert method comes in). I would prefer the string constructor to the StringBuilder if that's all I was interested in doing. However, if I was already working with a StringBuilder, it might make sense to use the Append method to benefit from some chaining. Here's a contrived example to demonstrate:
var sb = new StringBuilder("This item is ");
sb.Insert(sb.Length, "very ", 2) // insert at the end to append
.Append('*', 3)
.Append("special")
.Append('*', 3);
Console.WriteLine(sb.ToString()); // This item is very very ***special***

Modifying a character in a string in Lua

Is there any way to replace a character at position N in a string in Lua?
This is what I've come up with so far:
function replace_char(pos, str, r)
    return str:sub(pos, pos - 1) .. r .. str:sub(pos + 1, str:len())
end
str = replace_char(2, "aaaaaa", "X")
print(str)
I can't use gsub either as that would replace every capture, not just the capture at position N.
Strings in Lua are immutable. That means, that any solution that replaces text in a string must end up constructing a new string with the desired content. For the specific case of replacing a single character with some other content, you will need to split the original string into a prefix part and a postfix part, and concatenate them back together around the new content.
This variation on your code:
function replace_char(pos, str, r)
    return str:sub(1, pos-1) .. r .. str:sub(pos+1)
end
is the most direct translation to straightforward Lua. It is probably fast enough for most purposes. I've fixed the bug that the prefix should be the first pos-1 chars, and taken advantage of the fact that if the last argument to string.sub is missing it is assumed to be -1 which is equivalent to the end of the string.
But do note that it creates a number of temporary strings that will hang around in the string store until garbage collection eats them. The temporaries for the prefix and postfix can't be avoided in any solution. But this also has to create a temporary for the first .. operator to be consumed by the second.
It is possible that one of two alternate approaches could be faster. The first is the solution offered by Paŭlo Ebermann, but with one small tweak:
function replace_char2(pos, str, r)
    return ("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1))
end
This uses string.format to do the assembly of the result in the hopes that it can guess the final buffer size without needing extra temporary objects.
But do beware that string.format is likely to have issues with any \0 characters in any string that it passes through its %s format. Specifically, since it is implemented in terms of standard C's sprintf() function, it would be reasonable to expect it to terminate the substituted string at the first occurrence of \0. (Noted by user Delusional Logic in a comment.)
A third alternative that comes to mind is this:
function replace_char3(pos, str, r)
    return table.concat{str:sub(1,pos-1), r, str:sub(pos+1)}
end
table.concat efficiently concatenates a list of strings into a final result. It has an optional second argument which is text to insert between the strings, which defaults to "" which suits our purpose here.
My guess is that unless your strings are huge and you do this substitution frequently, you won't see any practical performance differences between these methods. However, I've been surprised before, so profile your application to verify there is a bottleneck, and benchmark potential solutions carefully.
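The prefix/replacement/postfix pattern is not specific to Lua. As a cross-check, here is the same idea as a Python sketch (Python strings are immutable too). It keeps Lua's 1-based pos so the results line up with the functions above; the bounds check and the join-based variant, which mirrors table.concat, are additions for illustration.

```python
def replace_char(pos: int, text: str, replacement: str) -> str:
    """Return a copy of text with the character at 1-based pos replaced."""
    if not 1 <= pos <= len(text):
        raise IndexError("pos out of range")
    # Prefix is the first pos-1 chars; postfix is everything after pos.
    return text[:pos - 1] + replacement + text[pos:]


def replace_char_join(pos: int, text: str, replacement: str) -> str:
    """Same result, assembled in one step like table.concat."""
    if not 1 <= pos <= len(text):
        raise IndexError("pos out of range")
    return "".join([text[:pos - 1], replacement, text[pos:]])


result = replace_char(2, "aaaaaa", "X")
```

With the question's input, result is "aXaaaa".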
You should use pos inside your function instead of literal 1 and 3, but apart from this it looks good. Since Lua strings are immutable you can't really do much better than this.
Maybe
("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1, str:len()))
is more efficient than the .. operator, but I doubt it - if it turns out to be a bottleneck, measure it (and then decide to implement this replacement function in C).
With LuaJIT, you can use the FFI library to cast the string to a list of unsigned chars:
local ffi = require 'ffi'
txt = 'test'
ptr = ffi.cast('uint8_t*', txt)
ptr[1] = string.byte('o')
