Extracting words and numbers from variable - python-3.x

I'm trying to split a variable based on the type of characters.
I need this:
var = "word12345-54321" #No spaces to help out.
to turn into this:
var2 = "word"
var3 = "12345"
var4 = "54321"
I'm able to extract the word with regex, but I can't find a way to separate the numbers.

As you're already using a RegEx I'd suggest using a pattern like (\D*)(\d*)-(\d*) if the strings will always be in a LETTERSNUMBERS-NUMBERS format.

Related

re.sub replacing string using original sub-string

I have a text file. I would like to remove all decimal points and their trailing numbers, unless text is preceding.
e.g 12.29,14.6,8967.334 should be replaced with 12,14,8967
e.g happypants2.3#email.com should not be modified.
My code is:
import re
txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"
txt1 = re.sub(r',\d+[.]\d+', r'\d+',txt1)
print(txt1)
unless there is an easier way of completing this, how do I modify r'\d+' so it just returns the number without a decimal place?
You need to make use of groups in your regex. You put the digits before the '.' into parentheses, and then you can use '\1' to refer to them later:
txt1 = re.sub(r',(\d+)[.]\d+', r',\1',txt1)
Note that in your attempted replacement code you forgot to replace the comma, so your numbers would have been glommed together. This still isn't perfect though; the first number, since it doesn't begin with a comma, isn't processed.
Instead of checking for a comma, the better way is to check word boundaries, which can be done using \b. So the solution is:
import re
txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"
txt1 = re.sub(r'\b(\d+)[.]\d+\b', r'\1',txt1)
print(txt1)
Considering these are the only two types of string that is present in your file, you can explicitly check for these conditions.
This may not be an efficient way, but what I have done is split the str and check if the string contains #email.com. If thats true, I am just appending to a new list. For your 1st condition to satisfy, we can convert the str to int which will eliminate the decimal points.
If you want everything back to a str variable, you can use .join().
Code:
txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"
txt_list = []
for i in (txt1.split(',')):
if '#email.com' in i:
txt_list.append(i)
else:
txt_list.append(str(int(float(i))))
txt_new = ",".join(txt_list)
txt_new
Output:
'9,8,22,88,morris1.43#email.com,chat22.3#email.com,123,6'

How to replace ALL characters in a string with one character

Does anyone know a method that allows you to replace all the characters in a word with a single character?
If not, can anyone suggest a way to basically print _ (underscore) the number of times which is the length of the string itself without using any loops or ifs in the code?
mystring = '_'*len(mystring)
Of course, I'm guessing at the name of your string variable and the character that you want to use.
Or, if you just want to print it out, you can:
print('_'*len(mystring))
import re
str = "abcdefghi"
print(re.sub('[a-z]','_',str))

Getting substrings from a string in c# in the format Domain\Alias

I have a variable which has strings stored in the format "domain\alias" and I want to split this in two different strings domain and alias.
I have two solutions for the above case, but none of them are working in my case.
solution 1: separating alias from the string.
for this I am using the code below:
int index = name.IndexOf("\") + 1;
string piece = name.Substring(index);
where name is the variable which stores the string in the format "domain\alias"
This solution doesn't work for '\' however it works in case of '.'
solution 2:
separating domain from the string.
Here I got a solution below:
var domainFormattedString = #"fareast\v-sidmis";
var parts = domainFormattedString.Split('\\');
var domainString = parts[0];
return domainString;
this works, but it needs a string prefixed with #symbol and i have my string stored in the variable name for which this solution doesn't work.
Someone please help me to extract the two substrings from my variable name.
EDIT 1: Thanks all for your help! I figured out the issue...when i explicitly declare a string as: var x = "domian\alias" it creates and issue as \ is treated as a escape character by c# so i had to append # at the beginning. But I got to know that when a string is read from a user, the solution works!
\ has a special meaning so you need to override the escape sequence to be treated as normal character with another escape character.
string input = #"domain\alias";
int inputindex= input.IndexOf("\\");
string domain = input.Substring(0, inputindex);
string alias = input.Substring(inputindex+1);
Hope It helps eventhough better late than never :)

In Swift how to obtain the "invisible" escape characters in a string variable into another variable

In Swift I can create a String variable such as this:
let s = "Hello\nMy name is Jack!"
And if I use s, the output will be:
Hello
My name is Jack!
(because the \n is a linefeed)
But what if I want to programmatically obtain the raw characters in the s variable? As in if I want to actually do something like:
let sRaw = s.raw
I made the .raw up, but something like this. So that the literal value of sRaw would be:
Hello\nMy name is Jack!
and it would literally print the string, complete with literal "\n"
Thank you!
The newline is the "raw character" contained in the string.
How exactly you formed the string (in this case from a string literal with an escape sequence in source code) is not retained (it is only available in the source code, but not preserved in the resulting program). It would look exactly the same if you read it from a file, a database, the concatenation of multiple literals, a multi-line literal, a numeric escape sequence, etc.
If you want to print newline as \n you have to convert it back (by doing text replacement) -- but again, you don't know if the string was really created from such a literal.
You can do this with escaped characters such as \n:
let secondaryString = "really"
let s = "Hello\nMy name is \(secondaryString) Jack!"
let find = Character("\n")
let r = String(s.characters.split(find).joinWithSeparator(["\\","n"]))
print(r) // -> "Hello\nMy name is really Jack!"
However, once the string s is generated the \(secondaryString) has already been interpolated to "really" and there is no trace of it other than the replaced word. I suppose if you already know the interpolated string you could search for it and replace it with "\\(secondaryString)" to get the result you want. Otherwise it's gone.

Convert underscores to spaces in Matlab string?

So say I have a string with some underscores like hi_there.
Is there a way to auto-convert that string into "hi there"?
(the original string, by the way, is a variable name that I'm converting into a plot title).
Surprising that no-one has yet mentioned strrep:
>> strrep('string_with_underscores', '_', ' ')
ans =
string with underscores
which should be the official way to do a simple string replacements. For such a simple case, regexprep is overkill: yes, they are Swiss-knifes that can do everything possible, but they come with a long manual. String indexing shown by AndreasH only works for replacing single characters, it cannot do this:
>> s = 'string*-*with*-*funny*-*separators';
>> strrep(s, '*-*', ' ')
ans =
string with funny separators
>> s(s=='*-*') = ' '
Error using ==
Matrix dimensions must agree.
As a bonus, it also works for cell-arrays with strings:
>> strrep({'This_is_a','cell_array_with','strings_with','underscores'},'_',' ')
ans =
'This is a' 'cell array with' 'strings with' 'underscores'
Try this Matlab code for a string variable 's'
s(s=='_') = ' ';
If you ever have to do anything more complicated, say doing a replacement of multiple variable length strings,
s(s == '_') = ' ' will be a huge pain. If your replacement needs ever get more complicated consider using regexprep:
>> regexprep({'hi_there', 'hey_there'}, '_', ' ')
ans =
'hi there' 'hey there'
That being said, in your case #AndreasH.'s solution is the most appropriate and regexprep is overkill.
A more interesting question is why you are passing variables around as strings?
regexprep() may be what you're looking for and is a handy function in general.
regexprep('hi_there','_',' ')
Will take the first argument string, and replace instances of the second argument with the third. In this case it replaces all underscores with a space.
In Matlab strings are vectors, so performing simple string manipulations can be achieved using standard operators e.g. replacing _ with whitespace.
text = 'variable_name';
text(text=='_') = ' '; //replace all occurrences of underscore with whitespace
=> text = variable name
I know this was already answered, however, in my case I was looking for a way to correct plot titles so that I could include a filename (which could have underscores). So, I wanted to print them with the underscores NOT displaying with as subscripts. So, using this great info above, and rather than a space, I escaped the subscript in the substitution.
For example:
% Have the user select a file:
[infile inpath]=uigetfile('*.txt','Get some text file');
figure
% this is a problem for filenames with underscores
title(infile)
% this correctly displays filenames with underscores
title(strrep(infile,'_','\_'))

Resources