Python List Formatting and Updation - python-3.x

I have a list, e.g. a = ["dgbbgfbjhffbjjddvj/n//n//n' "]
How do I remove the trailing newlines, i.e. all the /n, along with the extra single quote at the end?
Expected result = ["dfgjhgjjhgfjjfgg"] (I typed it randomly)

You can use the string rstrip() method.
Usage:
str.rstrip([c])
where c is the set of characters to strip from the end of the string; whitespace is stripped by default when no argument is provided.
example:
a = ['Return a copy of the string\n', 'with trailing characters removed\n\n']
[i.rstrip('\n') for i in a]
result:
['Return a copy of the string', 'with trailing characters removed']
More about rstrip():
https://www.tutorialspoint.com/python3/string_rstrip.htm
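For the list in the question, the trailing junk appears to be made of literal "/" and "n" characters plus a quote and a space, rather than real newline characters. A minimal sketch for that case passes all of those characters to rstrip(). It assumes the text you want to keep never ends in one of them, because rstrip() treats its argument as a set of characters, not as a suffix.

```python
# Input copied from the question: literal "/n" sequences, a quote and a space.
a = ["dgbbgfbjhffbjjddvj/n//n//n' "]

# rstrip() removes any trailing run made up of the listed characters.
cleaned = [s.rstrip("/n' ") for s in a]
print(cleaned)  # ['dgbbgfbjhffbjjddvj']

# Real newline characters are handled the same way.
with_newlines = ["first line\n", "second line\n\n"]
stripped = [s.rstrip("\n") for s in with_newlines]
print(stripped)  # ['first line', 'second line']
```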

Related

Python - how to find string and remove string plus next x characters

I have the following string:
mystr = '(string_to_delete_20221012_11-36) keep this (string_to_delete_20221016_22-22) keep this (string_to_delete_20221017_20-55) keep this'
I wish to delete all the entries (string_to_deletexxxxxxxxxxxxxxx) (including the trailing space)
I sort of need pseudo code as follows:
If you find a string (string_to_delete then replace that string and the timestamp, closing parenthesis and trailing space with null e.g. delete the string (string_to_delete_20221012_11-36)
I would use a list comprehension but given that not all strings are contained inside parenthesis I cannot see what I could use to create the list via a string.split().
Is this something that needs regular expressions?
it seemed like a good place to put regex:
import re
pattern = r'\(string_to_delete_.*?\)\s*'
mystr = '(string_to_delete_20221012_11-36) keep this (string_to_delete_20221016_22-22) keep this (string_to_delete_20221017_20-55) keep this'
for match in re.findall(pattern, mystr):
    mystr = mystr.replace(match, '', 1)  # replace the first occurrence of the matched text with an empty string
print(mystr)
results with:
>> keep this keep this keep this
Brief regex breakdown: \(string_to_delete_.*?\)\s*
\( matches a literal left parenthesis (the escape is needed)
string_to_delete_ matches that literal text
.*? matches zero or more characters, as few as possible (non-greedy)
\) matches the closing parenthesis
\s* matches zero or more whitespace characters after it
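The find-then-replace loop can also be collapsed into a single re.sub() call, which substitutes every match in one pass. The sketch below uses [^)]* in place of .*? so the match cannot run past the first closing parenthesis; that is an assumption about the data (no nested parentheses inside the tag), not something the question states.

```python
import re

mystr = ('(string_to_delete_20221012_11-36) keep this '
         '(string_to_delete_20221016_22-22) keep this '
         '(string_to_delete_20221017_20-55) keep this')

# Remove each "(string_to_delete_...)" tag together with the whitespace after it.
cleaned = re.sub(r'\(string_to_delete_[^)]*\)\s*', '', mystr)
print(cleaned)  # keep this keep this keep this
```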

find regex expression based character match

I have a list of strings something like this:
a=['bukt/id=gdhf/year=989/month=98/day=12/hgjhg.csv','bukt/id=76fhfh/year=989/month=08/day=128/hkngjhg.csv']
The ids are unique. I want an output list that looks like this:
output_list = ['bukt/id=gdhf/','bukt/id=76fhfh/']
So basically I need a regular expression that matches any id and removes the rest of the string.
How can I do that in the most efficient way, considering the input list has more than 100K entries?
import re
rgx = r'(bukt/id=[a-zA-Z0-9]+/).+'
out = [re.search(rgx, s).group(1) for s in a]  # a is the list of strings from the question
The result will be in group 1. This captures "bukt/id=", followed by any alphanumeric characters and then a slash, and throws away the rest.
There's no need for regex, you can just split your string on /, discard everything after the second / and then join again with /:
a=['bukt/id=gdhf/year=989/month=98/day=12/hgjhg.csv','bukt/id=76fhfh/year=989/month=08/day=128/hkngjhg.csv']
out = ['/'.join(u.split('/')[:2]) for u in a]
print(out)
Output:
['bukt/id=gdhf', 'bukt/id=76fhfh']
If you want the trailing /, just add an empty string to the end of the split array:
out = ['/'.join(u.split('/')[:2] + ['']) for u in a]
Output:
['bukt/id=gdhf/', 'bukt/id=76fhfh/']
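For a large input list, one option is to compile the pattern once and reuse it. The sketch below is only an illustration: ID_PREFIX and extract_prefixes are hypothetical names, and it assumes every id is alphanumeric, as in the sample data. Which approach is actually faster is best measured on the real list, for example with timeit.

```python
import re

# Hypothetical helper names; the pattern mirrors the one used in the answer above.
ID_PREFIX = re.compile(r'bukt/id=[a-zA-Z0-9]+/')

def extract_prefixes(paths):
    """Return the leading 'bukt/id=<id>/' part of each path that has one."""
    prefixes = []
    for path in paths:
        match = ID_PREFIX.match(path)  # match() anchors at the start of the string
        if match:
            prefixes.append(match.group(0))
    return prefixes

a = ['bukt/id=gdhf/year=989/month=98/day=12/hgjhg.csv',
     'bukt/id=76fhfh/year=989/month=08/day=128/hkngjhg.csv']
output_list = extract_prefixes(a)
print(output_list)  # ['bukt/id=gdhf/', 'bukt/id=76fhfh/']
```

Strings that do not start with the expected prefix are skipped rather than raising an error, which differs from calling .group(1) directly on a failed search.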

Removing special characters from a string In a Groovy Script

I am looking to remove special characters from a string using Groovy. I'm nearly there, but it is removing the whitespace that is already in place, which I want to keep. I only want to remove the special characters (and not leave whitespace in their place). I am running the code below on the PostCode L&65$$ OBH
def removespecialpostcodce = PostCode.replaceAll("[^a-zA-Z0-9]+","")
log.info removespecialpostcodce
Currently it returns L65OBH but I am looking for it to return L65 OBH
Can anyone help?
Use the code below:
PostCode.replaceAll("[^a-zA-Z0-9 ]+","")
instead of
PostCode.replaceAll("[^a-zA-Z0-9]+","")
To remove all special characters in a String you can use a negated character class:
String str = "..\\.-._./-^+* ".replaceAll("[^A-Za-z0-9]","");
System.out.println("str: <"+str+">");
output:
str: <>
To keep the spaces in the text, add a space to the character class:
String str = "..\\.-._./-^+* ".replaceAll("[^A-Za-z0-9 ]","");
System.out.println("str: <"+str+">");
output:
str: < >
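The same negated character class works in other regex engines. As a rough sketch in Python rather than Groovy, applied to the postcode from the question (the variable names are illustrative only):

```python
import re

post_code = "L&65$$ OBH"

# Without a space in the class, the existing space is removed as well.
no_space = re.sub(r"[^a-zA-Z0-9]+", "", post_code)
print(no_space)  # L65OBH

# Adding a space to the class keeps the space that was already there.
with_space = re.sub(r"[^a-zA-Z0-9 ]+", "", post_code)
print(with_space)  # L65 OBH
```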

How to remove extra spaces in between string, matlab?

I have created a script to convert text to Morse code, and now I want to modify it to include a slash between words, so something like space, slash, space between Morse code words. I know my loop before the main loop is incorrect and I want to fix it to do as stated above. I just really need help. Thank you!
...
Word=input('Please enter a word:','s');
...
Code=MC_1;
...
case ' '
Code='/'
otherwise
Valid=0;
end
if Valid
fprintf('%s ',Code);
else
disp('Input has invalid characters!')
break
end
I know you want to write a loop to remove multiple spaces between words, but the best way to remove whitespace in your particular problem is to use regular expressions, specifically regexprep. Regular expressions are used to search for particular patterns or substrings within a larger string. In this case, we are looking for substrings that consist of more than one whitespace character. regexprep finds substrings that match a pattern and replaces them with another string. Here, you would search for any substrings that contain one or more whitespace characters and replace each with a single space. Also, I see that you've trimmed both leading and trailing whitespace from the string using strtrim, which is great. Now, all you need to do is call regexprep like so:
Word = regexprep(Word, '\s+', ' ');
\s+ is the regular expression for finding at least one white space character. We then replace this with a single whitespace. As such, supposing we had this string stored in Word:
Word = ' hello how are you ';
Doing a trim of leading and trailing whitespace, then calling regexprep in the way we talked about thus gives:
Word = strtrim(Word);
Word = regexprep(Word, '\s+', ' ')
Word =
hello how are you
As you can see, the leading and trailing white space was removed with strtrim, and the regular expression takes care of the rest of the spaces in between.
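For comparison, the same trim-then-collapse idea can be sketched in Python with re.sub(). The input string here is made up for illustration, and the last line shows one possible way to get the space, slash, space separator the question asks for, assuming words are separated only by whitespace.

```python
import re

word = "   hello    how   are  you   "

# Trim both ends, then replace every run of whitespace with one space.
collapsed = re.sub(r"\s+", " ", word.strip())
print(collapsed)  # hello how are you

# split() with no argument splits on runs of whitespace and drops the ends,
# so joining with " / " puts a slash between words.
slashed = " / ".join(word.split())
print(slashed)  # hello / how / are / you
```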
However, if you are dead set on using a loop, what you can do is use a logical variable which is set to true when we detect a white space, and then we use this variable and skip other white space characters until we hit a character that isn't a space. We would then place our space, then /, then space, then continue. In other words, do something like this:
Word = strtrim(Word); %// Remove leading and trailing whitespace
space_hit = false; %// Initialize space encountered flag
Word_noSpace = []; %// Will store our new string
for index=1:length(Word) %// For each character in our word
    if Word(index) == ' ' %// If we hit a space
        if space_hit %// Check to see if we have already hit a space
            continue; %// Continue if we have
        else
            Word_noSpace = [Word_noSpace ' ']; %// If not, add a space, then set the flag
            space_hit = true;
        end
    else
        space_hit = false; %// When we finally hit a non-space, set back to false
        Word_noSpace = [Word_noSpace Word(index)]; %// Keep appending characters
    end
end
Word = Word_noSpace; %// Replace to make compatible with the rest of your code
for Character = Word %// Your code begins here
...
...
The code above builds a new string, Word_noSpace, that holds the word with each run of spaces replaced by a single space. The loop goes through each character. When we encounter a space, we check whether we have already seen one: if we have, we skip it; if we haven't, we append a single space. When we hit a non-space character, we append it to the new string. The result is a string with no extra spaces.
Running the above code after you trim the leading and trailing white space thus gives:
Word =
hello how are you
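The flag-based loop translates almost line for line into Python. The sketch below is an illustration only; collapse_spaces is a hypothetical name, and it builds a list of characters and joins them at the end instead of growing a string inside the loop.

```python
def collapse_spaces(text):
    """Collapse each run of spaces into one space, using a 'space seen' flag."""
    text = text.strip()      # remove leading and trailing whitespace first
    result = []              # characters of the new string
    space_hit = False        # True while we are inside a run of spaces
    for ch in text:
        if ch == " ":
            if space_hit:
                continue     # already emitted a space for this run
            result.append(" ")
            space_hit = True
        else:
            space_hit = False
            result.append(ch)
    return "".join(result)

print(collapse_spaces("  hello   how  are you "))  # hello how are you
```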

Finding mean of ascii values in a string MATLAB

The string I am given is as follows:
scrap1 =
a le h
ke fd
zyq b
ner i
You'll notice there are 2 blank spaces (ASCII 32) in each row. I need to find the mean ASCII value in each column without taking the spaces (32) into account. So first I would convert to numeric values with double(scrap1), but then how do I find the mean without taking the spaces into account?
If it's only the ASCII 32 you want to omit:
d = double(scrap1);
result = mean(d(d~=32)); %// logical indexing to remove unwanted value, then mean
You can remove the spaces in the string with scrap1(scrap1 == ' ') = ''; This deletes every space from the input. Then you can convert to double and average the result.
You can probably use a regular expression to find the spaces and ignore them, using "\s":
findSpace = regexp(scrap1, '\s', 'ignore')
% I am not sure about the 'ignore' option; this is just what comes to mind. You can read more about regexp by typing doc regexp.
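Since the question asks for the mean in each column, it may help to see a per-column version spelled out. The sketch below is in Python, not MATLAB, and uses made-up rows of equal length, because the exact spacing of scrap1 is not recoverable from the post; column_means_ignoring_spaces is a hypothetical name.

```python
def column_means_ignoring_spaces(rows):
    """Mean character code per column, skipping spaces (code 32)."""
    means = []
    for column in zip(*rows):                 # iterate over columns of the char grid
        codes = [ord(ch) for ch in column if ch != " "]
        # A column made only of spaces has no mean; report None for it.
        means.append(sum(codes) / len(codes) if codes else None)
    return means

rows = ["ab c",
        "d ef"]   # hypothetical data, all rows the same length
print(column_means_ignoring_spaces(rows))  # [98.5, 98.0, 101.0, 100.5]
```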
