Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
I want to expand a string like the one below, but without using extra space:
a5b1c0d5a1a1
And the result should be:
aaaaabaa
I am stuck here. How to do it without extra space?
I would read each char and check that it is a letter, then take the next char and check that it is a number, then append the letter to the resulting string that many times.
In your example, the first thing I would read is a. It is a letter, so read the next char and check whether it is a number, which it is. So append five a's to the resulting string.
For example, use a loop that runs that many times to append the letter.
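The loop described above can be sketched in Python as follows. This is only a minimal sketch: the function name expand is made up for illustration, it assumes every count is a single digit, and it builds a separate result string, so it does use extra space.

```python
def expand(encoded: str) -> str:
    """Expand letter/digit pairs, e.g. 'a3' becomes 'aaa'."""
    pieces = []
    i = 0
    while i + 1 < len(encoded):
        letter, digit = encoded[i], encoded[i + 1]
        if letter.isalpha() and digit.isdigit():
            # Append the letter as many times as the digit says (0 drops it).
            pieces.append(letter * int(digit))
            i += 2
        else:
            # Assumption: anything that is not a letter/digit pair is skipped.
            i += 1
    return "".join(pieces)


print(expand("a3b1c0"))  # aaab
```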
UPDATE
Explaining my comment better.
So you're looping through the string.
At index 0 you have the 'a'. So you read a letter, then you expect to get a number, which is 5.
I now split the string into two other strings. The first one has everything up to and including the letter, which in this case is only a.
The second one has everything after the number (5 in this case), which is b1c0d5a1a1.
So take the first string, concatenate it with aaaa (5-1 copies, because you already have the first a), and then concatenate the rest of the string.
string = a5b1c0d5a1a1
string = substring(0,1) + "aaaa" + substring(2,stringsize);
In cases like 0, you can play around with the substring indexes so that you remove the letter instead of adding more copies.
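Here is a rough Python sketch of the splicing idea from the update. The name expand_by_splicing is hypothetical and single-digit counts are assumed. Python strings are immutable, so each splice still allocates a new string. It illustrates the index bookkeeping, not a truly in-place solution.

```python
def expand_by_splicing(s: str) -> str:
    """Rewrite the string pair by pair instead of building a separate result."""
    i = 0
    while i < len(s):
        letter = s[i]
        # Assumption: every letter is followed by exactly one digit.
        count = int(s[i + 1])
        # Everything before the pair + the repeated letter + everything after the digit.
        s = s[:i] + letter * count + s[i + 2:]
        # Skip over the copies just written; a count of 0 removes the letter.
        i += count
    return s


print(expand_by_splicing("a2b0c1"))  # aac
```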
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 months ago.
I'm trying to find the average of a large array of candidate compensation values. Some of the cells contain text with multiple numbers showing a range, such as "$100k - $120k". Others are labeled TC (e.g. "120k TC") for total compensation.
How can I find the average of these numbers with something along the lines of substituting letters or parsing the string into a number, WITHOUT changing the actual values listed? I do NOT want to mutate the original cell values. I only want to average them all through a formula that bypasses the additional "k", "TC" and "-", which make them un-averageable because they are not parsed as numbers.
You would need to clean up the text in stages.
find if a certain text is present: eg.
=IF(IFERROR(FIND("-",A1,1),"")<>"","- is present","")
=IF(IFERROR(FIND("TC",A1,1),"")<>"","TC is present","")
=IF(IFERROR(FIND("$",A1,1),"")<>"","$ is present","")
then split out the left and right price values if "-" is present: eg.
=LEFT(A1,FIND("-",A1,1)-1)
=RIGHT(A1,LEN(A1)-FIND("-",A1,1))
then if texts are present, remove those texts: eg.
=SUBSTITUTE(A1,"-","")
=SUBSTITUTE(A1,"$","")
=SUBSTITUTE(A1,"k","")
then you can use TRIM() to remove spaces at the ends, VALUE() to convert the text to a number, etc.
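For comparison, the same staged cleanup can be sketched outside the spreadsheet in Python. This is only an illustration under stated assumptions: the helper name cell_value is made up, a trailing "k" is taken to mean thousands, and a range such as "$100k - $120k" is represented by its midpoint, which the question does not specify.

```python
import re
from statistics import mean


def cell_value(text: str) -> float:
    """Turn one compensation cell into a single number without changing the cell."""
    amounts = []
    # Each number may be followed by "k"; "$", "-" and "TC" are simply ignored.
    for number, suffix in re.findall(r"(\d+(?:\.\d+)?)\s*(k?)", text, flags=re.IGNORECASE):
        value = float(number)
        if suffix:
            value *= 1000  # assumption: "k" means thousands
        amounts.append(value)
    # Assumption: a range counts as the average of its two ends.
    return mean(amounts)


cells = ["$100k - $120k", "120k TC"]
print(mean(cell_value(c) for c in cells))  # 115000.0
```

A cell with no digits at all would make mean() raise an error here, so real data would need a check for that case.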
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have a list of lists with pairs of name strings
lst = [['Smith-Wilson J.', 'Johnson M.'],['Williams B-M.', 'Jones A.']]
Some of the abbreviated middle names come with a hyphen (e.g. 'Williams B-M.') and some of the last names have a hyphen as well (e.g. 'Smith-Wilson J.').
I want to change only the hyphens attached to the abbreviated middle names to a dot (i.e. '.') using a list comprehension.
I know the index of the hyphen I want to change will always be string[-3].
Output should look like this
lst = [['Smith-Wilson J.', 'Johnson M.'],['Williams B.M.', 'Jones A.']]
If you split each name on whitespace, the last name will always be in the first position, right? So only check for hyphens in the parts after it: name.split()[1:]. Your code can be something like this:
new_lst = [[" ".join([name.split()[0]]+list(map(lambda part: part.replace('-','.'), name.split()[1:]))) for name in pair] for pair in lst]
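Since the question states that the hyphen to change is always at string[-3], a shorter variant can use that index directly. This is only a sketch that relies on that assumption holding for every name:

```python
lst = [['Smith-Wilson J.', 'Johnson M.'], ['Williams B-M.', 'Jones A.']]

new_lst = [
    # Replace the character at index -3 only when it really is a hyphen.
    [name[:-3] + '.' + name[-2:] if len(name) >= 3 and name[-3] == '-' else name
     for name in pair]
    for pair in lst
]

print(new_lst)
# [['Smith-Wilson J.', 'Johnson M.'], ['Williams B.M.', 'Jones A.']]
```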
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Hi, I have a huge set of data with thousands of columns. From one of the columns I need to extract certain string patterns, e.g. 41242456-2020-12, 41242456-2020-2 or 41242456-2020-200 (8-digit number, year, 1 to 3 digit number), which are mixed in with other text in the string. Most of the time the numbers appear at the beginning, but sometimes it looks like the following:
Blah Blah LEX#41242456-2020-12BLABLABLAH
Blah Blah LEXIDA ID:41242456-2020-12BLAHBLAHBLAH etc.
Hence I am unable to extract them all with one formula.
Is there a way I can use a formula or VBA code to extract only 41242456-2020-12 and remove all other characters?
Look here and elsewhere on the web on how to use regular expressions in Excel.
The regular expression you want to match against is \d{8}-[12]\d{3}-\d{1,3} which means
eight digits
a dash
a "1" or a "2" (because if it's 3 or 0 then I assume it's not a valid year)
three digits
a dash
one to three digits
You might want to use (\d{8})-([12]\d{3})-(\d{1,3}) so that the match gives you the three numbers separately. Parentheses in regular expressions mean 'return what matched this part.'
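To show how the pattern behaves on the sample strings, here is a small sketch using Python's re module rather than Excel or VBA. The function name extract_id is made up for illustration, and it returns only the first match in each string.

```python
import re

# Same pattern as above, with one capture group per number.
PATTERN = re.compile(r"(\d{8})-([12]\d{3})-(\d{1,3})")


def extract_id(text):
    """Return the first ID found in text, or None if there is no match."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


print(extract_id("Blah Blah LEX#41242456-2020-12BLABLABLAH"))        # 41242456-2020-12
print(PATTERN.search("LEXIDA ID:41242456-2020-200BLAH").groups())    # ('41242456', '2020', '200')
```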
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I need some help converting a file into a new file with the requirements below:
Split each row (long string) into sub-strings based on fixed lengths
Use the pipe delimiter "|" between each sub-string
Leave the last undefined column (sub-string) as-is, but add "|" before it.
Here is an example. Suppose a file (test.dat) has 2 rows:
PG123ABCD A 000{000
MK789HJKL32H00
Column 1: length(2)
Column 2: length(3)
Column 3: length(4)
Column 4: length(3)
Column 5: undefined, use all remaining value
Below is the final output I need. The example has only 2 rows, but suppose I have a file that has 1k+ similar rows, and I need to convert the original file to a new file based on the above requirements.
PG|123|ABCD| A |000{000
MK|789|HJKL|32H|00
cut -b 1-2,3-5,6-9,10-12,13-500 --output-delimiter='|' test.dat > 1.dat
I wrote the above command and it outputs exactly what I need.
The only question I have is about the last column. I used 13-500 as a fixed length for the undefined column, but the length of the remaining string varies between rows. Is there a generic way to define the last column's length, e.g. something like 13-max_length_of_the_row?
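As far as I know, cut accepts an open-ended range, so writing 13- (with no end) instead of 13-500 should select from byte 13 to the end of each line. The same fixed-width split can also be sketched in Python, where slicing without an end index takes whatever remains. The names WIDTHS and split_fixed are made up for illustration.

```python
# Widths of the defined columns; the last column takes whatever is left.
WIDTHS = [2, 3, 4, 3]


def split_fixed(line, widths=WIDTHS):
    """Split a line into fixed-width fields plus the remainder, joined by '|'."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width])
        start += width
    # No end index: the slice runs to the end of the line, whatever its length.
    fields.append(line[start:])
    return "|".join(fields)


for row in ["PG123ABCD A 000{000", "MK789HJKL32H00"]:
    print(split_fixed(row))
# PG|123|ABCD| A |000{000
# MK|789|HJKL|32H|00
```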
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I wish to split a string wherever an uppercase character occurs.
For example,
if the string is StackOverflow
the split should give me ['Stack' , 'Overflow']
The words may or may not be in a list but should be separate.
How do I do this?
EDIT :
How can I do this without regex?
You can import the re module and use regex:
>>> import re
>>> re.findall('[A-Z][^A-Z]*', 'StackOverflow')
['Stack', 'Overflow']
Explanation:
[A-Z] matches a single character in the range between A (ASCII 65) and Z (ASCII 90) (case sensitive)
[^A-Z]* matches any run of characters that are not in that range
^ as the first character inside the brackets negates the class (outside brackets it would mean the beginning of a string)
* is a quantifier: it matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
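For the EDIT asking how to do this without regex, a plain loop is one option. This is a minimal sketch with a made-up function name. Unlike the findall call above, it also keeps any lowercase characters that come before the first uppercase letter.

```python
def split_on_uppercase(text):
    """Split text into words, starting a new word at every uppercase character."""
    words = []
    current = ""
    for ch in text:
        if ch.isupper() and current:
            # An uppercase letter closes the word collected so far.
            words.append(current)
            current = ch
        else:
            current += ch
    if current:
        words.append(current)
    return words


print(split_on_uppercase("StackOverflow"))  # ['Stack', 'Overflow']
```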