I have the following string "10P_57.53%_568AA". I want to extract only numbers (integer and float numbers) without any other things. The output should be like this:
10 57.53 568
Since you've not provided any language, this would be general method:
Run a loop, get each character of string in a variable say X. Compare X as if(X>=0||X<=9||X=='.'){ Concat value of X to some string }
Related
My dataframe has columns where one has list of float values. When I train that column as X_train, I showing cannot string to float or tensorflow float data type.
DataSet:
I tried this:
df['sent_to_vec'].apply(lambda x: float(x))
or nested for loop to convert values in float type; but didn't get executed.
Try passing a string that's really just a floating-point number to the Python float() function:
f1 = float('0.1')
print(f1)
It works.
Try passing a string that's not just a floating-point number, but is instead some sort of array or list representation with multiple numbers separated by other punctuation:
f2 = float('[0.1, 0.2]')
print(f2)
You'll get the same error as you're asking about. That string, '[0.1, 0.2]' is not a representation of a floating-point number that float() can read.
You should look for a function that can read a string like '[0.1, 0.2]'. Can you see the code that wrote the Vectorized data.csv file? (Did you write that code, or that file?)
You'll want to use some function that does the reverse of whatever wrote that column of the file.
I'm looking for a simple and invertible way to represent a Julia string by an integer (e.g. for cryptography). To be clear, I'm not considering string representations of integers like "123", but arbitrary strings like "Hello". The representation doesn't need to be human-readable, but it needs to be easily invertible back to a unique string (so not a hash). It doesn't need to be efficient; I'm just looking for something as simple as possible. (Also, it's fine if it only works on a small character set, e.g. lowercase Roman letters.)
One naive way would be to collect the string into a vector of chars, parse(Int, _) each char to an integer, and concatenate the integers. But this seems cumbersome, and I suspect that there's in built-in Julia function (or small composition of functions) that will get the job done more easily.
If your strings only use the numbers 0-9 and letters a-z and A-Z, then you can parse the string directly as base 62 BigInteger:
julia> s = randstring(123)
"RFXkzD6VpWcwvbsxOtdTxS4DGcgciKgDXECa9fEK0Djcdkcj5N75vIHEMVyuH9mcYgvFbLhbPdrKyPIO4JsK1DKgZIacov6WKDZdIpGJ5iJ15dpjmcCBCybMmxB"
julia> i = parse(BigInt, s, base=62)
12798646956721889529517502411501433963894611324020956397632780092623456213685688389093681112679380669903728068303911743800989012987014660454736389459814982802097607808640628339365945710572579898457023165244164689548286133
julia> string(i, base=62)
"RFXkzD6VpWcwvbsxOtdTxS4DGcgciKgDXECa9fEK0Djcdkcj5N75vIHEMVyuH9mcYgvFbLhbPdrKyPIO4JsK1DKgZIacov6WKDZdIpGJ5iJ15dpjmcCBCybMmxB"
I created a (somewhat complicated) implementation that works for ASCII strings:
stringToInt(str::String) = sum(i -> Int(str[end-i]) * 128^i, 0:length(str)-1)
function intToString(m::Int)
chars = Char[]
for n in div(ceil(Int, log2(x)), 7)-1:-1:0
d, m = divrem(m, 128^n)
push!(chars, d)
end
String(chars)
end
Let me know if you can think of a better one.
I know the string methods str.isdigit, str.isdecimal and str.isnumeric.
I'm looking for a built-in method that checks if a character is algebraic, meaning that it can be found in a declaration of a decimal number.
The above mentioned methods return False for '-1' and '1.0'.
I can use isdigit to retrieve a positive integer from a string:
string = 'number=123'
number = ''.join([d for d in string if d.isdigit()]) # returns '123'
But that doesn't work for negative integers or floats.
Imagine a method called isnumber that works like this:
def isnumber(s):
for c in s:
if c not in list('.+-0123456789'):
return False
return True
string1 = 'number=-1'
string2 = 'number=0.1'
number1 = ''.join([d for d in string1 if d.isnumber()]) # returns '-1'
number2 = ''.join([d for d in string2 if d.isnumber()]) # returns '0.1'
The idea is to test against a set of "basic" algebraic characters. The string does not have to contain a valid Python number. It could also be an IP address like 255.255.0.1.
.
Does a handy built-in that works approximately like that exist?
If not, why not? It would be much more efficient than a python function and very useful. I've seen alot of examples on stackoverflow that use str.isdigit() to retrieve a positive integer from a string. Is there a reason why there isn't a built-in like that, although there are three different methods that do almost the same thing?
No such function exists. There are a bunch of odd characters that can be part of number literals in Python, such as o, x and b in the prefix of integers of non-decimal bases, and e to introduce the exponential part of a float. I think those plus the hex digits (0-9 and A-F) and sign characters and the decimal point are all you need.
You can put together a string with the right character yourself and test against it:
from string import hex_digits
num_literal_chars = hex_digits + "oxOX.+-"
That will get a bunch of garbage though if you use it to test against mixed text and numbers:
string1 = "foo. bar. 0xDEADBEEF 10.0.0.1"
print("".join(c for c in string1 if c in num_literal_chars))
# prints "foo.ba.0xDEADBEEF10.0.0.1"
The fact that it gives you a bunch of junk is probably why no builtin function exists to do this. If you want to match a certain kind of number out of a string, write an appropriate regular expression to match that specific kind of number. Don't try to do it character-by-character, or try to match all the different kinds of Python numbers.
I am trying to write a function that takes a string txt and returns an int of that string's character's ascii numbers. It also takes a second argument, n, that is an int that specified the number of digits that each character should translate to. The default value of n is 3. n is always > 3 and the string input is always non-empty.
Example outputs:
string_to_number('fff')
102102102
string_to_number('ABBA', n = 4)
65006600660065
My current strategy is to split txt into its characters by converting it into a list. Then, I convert the characters into their ord values and append this to a new list. I then try to combine the elements in this new list into a number (e.g. I would go from ['102', '102', '102'] to ['102102102']. Then I try to convert the first element of this list (aka the only element), into an integer. My current code looks like this:
def string_to_number(txt, n=3):
characters = list(txt)
ord_values = []
for character in characters:
ord_values.append(ord(character))
joined_ord_values = ''.join(ord_values)
final_number = int(joined_ord_values[0])
return final_number
The issue is that I get a Type Error. I can write code that successfully returns the integer of a single-character string, however when it comes to ones that contain more than one character, I can't because of this type error. Is there any way of fixing this. Thank you, and apologies if this is quite long.
Try this:
def string_to_number(text, n=3):
return int(''.join('{:0>{}}'.format(ord(c), n) for c in text))
print(string_to_number('fff'))
print(string_to_number('ABBA', n=4))
Output:
102102102
65006600660065
Edit: without list comprehension, as OP asked in the comment
def string_to_number(text, n=3):
l = []
for c in text:
l.append('{:0>{}}'.format(ord(c), n))
return int(''.join(l))
Useful link(s):
string formatting in python: contains pretty much everything you need to know about string formatting in python
The join method expects an array of strings, so you'll need to convert your ASCII codes into strings. This almost gets it done:
ord_values.append(str(ord(character)))
except that it doesn't respect your number-of-digits requirement.
I am reading a list with 2 integers from a file using readline(s) which returns a string. How does one convert the string of 2 integers back to 2 integers? Thanks.
I assume you have a string like "123 456". If so you can do this: map(parse_string, split(s)) where s is your string.
parse_string parses one Maxima expression, e.g. one integer. To get both integers, split the string, which makes a list, and then parse each element of the list.