Get digits at end of string in a pythonic way - string

I'm using Python 3.x. I'm trying to get the (int) number at the end of a string with the format
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
or in general, a single lowercase letter followed by a number of integers separated by '-'. What I need is to get the last number (12 and 657 in these cases). I have achieved this with the function
def ending(the_string):
    out = ''
    while the_string[-1].isdigit():
        out = the_string[-1] + out
        the_string = the_string[:-1]
    return out
but I'm sure there must be a more pythonic way to do this. In a previous step I manually check that the string starts the way I expect by doing something like
if st[0].isalpha() and st[1] == '-' and st[2].isdigit():
    statement...

I would just split the string on -, take the last of the splits and convert it to an integer.
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
def last_number(s):
    return int(s.split("-")[-1])
print(last_number(string_example_1))
# 12
print(last_number(string_example_2))
# 657

Without regular expressions, you could reverse the string, take elements from the string while they're still numbers, and then reverse the result. In Python:
from itertools import takewhile
def extract_final_digits(s):
    return int(''.join(reversed(list(takewhile(lambda c: c.isdigit(), reversed(s))))))
But the simplest is to just split on a delimiter and take the final element in the split list.
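For comparison, here is a minimal sketch of the regular-expression route that the answer above sets aside. The names last_number_re and looks_valid are made up for illustration, and the pattern in looks_valid assumes the format described in the question (one lowercase letter followed by '-'-separated integers).

```python
import re

def last_number_re(s):
    """Return the run of digits at the very end of s as an int."""
    match = re.search(r"\d+$", s)
    if match is None:
        raise ValueError(f"no trailing digits in {s!r}")
    return int(match.group())

def looks_valid(s):
    """Check the assumed format: one lowercase letter, then one or more '-<digits>' groups."""
    return re.fullmatch(r"[a-z](-\d+)+", s) is not None

print(last_number_re("l-45-98-567-567-12"))  # 12
print(looks_valid("s-89-657"))               # True
```

Unlike the split version, last_number_re does not depend on the delimiter, so it also works when the final number follows some other separator.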

Related

Keeping the same distance no matter the string length [duplicate]

I'm sure this is covered in plenty of places, but I don't know the exact name of the action I'm trying to do so I can't really look it up. I've been reading an official Python book for 30 minutes trying to find out how to do this.
Problem: I need to put a string in a certain length "field".
For example, if the name field was 15 characters long, and my name was John, I would get "John" followed by 11 spaces to create the 15 character field.
I need this to work for any string put in for the variable "name".
I know it will likely be some form of formatting, but I can't find the exact way to do this. Help would be appreciated.
This is super simple with format:
>>> a = "John"
>>> "{:<15}".format(a)
'John           '
You can use the ljust method on strings.
>>> name = 'John'
>>> name.ljust(15)
'John           '
Note that if the name is longer than 15 characters, ljust won't truncate it. If you want to end up with exactly 15 characters, you can slice the resulting string:
>>> name.ljust(15)[:15]
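Padding and truncating can also be folded into a single format specification, because a precision applied to a string caps its length. A minimal sketch; the helper name to_field is hypothetical.

```python
def to_field(text, width):
    """Left-align text in exactly `width` characters: pad short input, cut long input."""
    # "<{width}" pads on the right; ".{width}" truncates strings longer than width.
    return f"{text:<{width}.{width}}"

short = to_field("John", 15)
long_name = to_field("Bartholomew-Smithson", 15)
print(repr(short), len(short))
print(repr(long_name), len(long_name))
```

Both results are exactly 15 characters long, whichever side of the limit the input falls on.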
If you have Python 3.6 or higher, you can use f-strings
>>> string = "John"
>>> f"{string:<15}"
'John           '
Or if you'd like it right-aligned (padding on the left)
>>> f"{string:>15}"
'           John'
Centered
>>> f"{string:^15}"
'     John      '
For more variations, feel free to check out the docs: https://docs.python.org/3/library/string.html#format-string-syntax
You can use the rjust and ljust methods to add a specific character before or after a string until it reaches a given length.
The first parameter of those methods is the total length of the string after padding.
Right justified (add to the left)
numStr = '69'
numStr = numStr.rjust(5, '*')
The result is ***69
Left justified (add to the right)
numStr = '69'
numStr = numStr.ljust(3, '#')
The result is 69#
Fill with Leading Zeros
To pad with zeros you can simply use:
numStr = '69'
numStr = numStr.zfill(8)
Which gives you 00000069 as the result.
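The same three paddings can also be written with the format mini-language, where a fill character goes directly before the alignment symbol. A small sketch that compares the two spellings (the variable names are only for illustration):

```python
num_str = "69"

# fill character, then alignment (> right, < left), then total width
stars = f"{num_str:*>5}"
hashes = f"{num_str:#<3}"
zeros = f"{num_str:0>8}"

print(stars, stars == num_str.rjust(5, "*"))    # ***69 True
print(hashes, hashes == num_str.ljust(3, "#"))  # 69# True
print(zeros, zeros == num_str.zfill(8))         # 00000069 True
```

One difference to keep in mind: zfill places the zeros after a leading sign, so the two zero-padding forms only agree for unsigned input like this one.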
string = ""
name = input()  # The value of the field
length = int(input())  # The length of the field
string += name
string += " " * (length - len(name))  # Add extra spaces
This will add the number of spaces needed, provided the field length is >= the length of the name provided
name = "John"  # your variable
# this appends 15 spaces to the name, then cuts the result at 15 characters
result = (name + " " * 15)[:15]
I know this is a bit of an old question, but I've ended up making my own little class for it.
Might be useful to someone so I'll stick it up. I used a class variable, which is inherently persistent, to ensure sufficient whitespace was added to clear any old lines. See below:
2021-03-02 update: Improved a bit - when working through a large codebase, you know whether the line you are writing is one you care about or not, but you don't know what was previously written to the console and whether you want to retain it.
This update takes care of that, a class variable you update when writing to the console keeps track of whether the line you are currently writing is one you want to keep, or allow overwriting later on.
class consolePrinter():
    '''
    Class to write to the console
    Objective is to make it easy to write to console, with user able to
    overwrite previous line (or not)
    '''
    # -------------------------------------------------------------------------
    # Class variables
    stringLen = 0
    overwriteLine = False
    # -------------------------------------------------------------------------

    @staticmethod  # called on the class itself, so no self argument
    def writeline(stringIn, overwriteThisLine=False):
        import sys
        # Get length of stringIn and update stringLen if needed
        if len(stringIn) > consolePrinter.stringLen:
            consolePrinter.stringLen = len(stringIn) + 1
        ctrlString = "{:<" + str(consolePrinter.stringLen) + "}"
        prevOverwriteLine = consolePrinter.overwriteLine
        if prevOverwriteLine:
            # Previous line entry can be overwritten, so do so
            sys.stdout.write("\r" + ctrlString.format(stringIn))
        else:
            # Previous line entry cannot be overwritten, take a new line
            sys.stdout.write("\n" + stringIn)
        sys.stdout.flush()
        # Update the class variable for prevOverwriteLine
        consolePrinter.overwriteLine = overwriteThisLine
        return
Which then is called via:
consolePrinter.writeline("text here", True)
If you want this line to be overwriteable
consolePrinter.writeline("text here",False)
if you don't.
Note, for it to work right, all messages pushed to the console would need to be through consolePrinter.writeline.
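The core trick in the class is independent of the bookkeeping: return to the start of the line with "\r", then pad the new text so it covers whatever was there before. A stripped-down sketch of just that step, with a hypothetical helper name overwrite_frame:

```python
import sys

def overwrite_frame(text, previous_len):
    """Build the string that replaces a line of previous_len characters with text."""
    # "\r" moves the cursor back to column 0; ljust pads so old characters are covered.
    return "\r" + text.ljust(previous_len)

previous = 0
for message in ["downloading file 1 of 3", "file 2 of 3", "done"]:
    sys.stdout.write(overwrite_frame(message, previous))
    sys.stdout.flush()
    previous = max(previous, len(message))
sys.stdout.write("\n")
```

Without the padding, the shorter messages would leave the tail of the first one visible on screen.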
I generally recommend the f-string/format version, but sometimes you have a tuple, need, or want to use printf-style instead. I did this time and decided to use this:
>>> res = (1280, 720)
>>> '%04sx%04s' % res
'1280x 720'
Thought it was a touch more readable than the format version:
>>> f'{res[0]:>4}x{res[1]:>4}'
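Applied to the original question (left-aligned text in a fixed-width field), the printf-style spelling uses a minus flag, and a * lets the width come from a variable. A brief sketch, with illustrative variable names:

```python
name = "John"
width = 15

fixed = "%-15s" % name            # '-' left-aligns within 15 characters
dynamic = "%-*s" % (width, name)  # '*' takes the width from the tuple

print(repr(fixed), len(fixed))
print(fixed == dynamic)
```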
First check to see if the string needs to be shortened, then add spaces until it is as long as the field length.
fieldLength = 15
string1 = string1[:fieldLength]  # If it needs to be shortened, shorten it
while len(string1) < fieldLength:
    string1 += " "
Just whipped this up for my problem; it adds spaces until the length of the string is at least the min_length you give it.
def format_string(text, min_length):
    while len(text) < min_length:
        text += " "
    return text

How to remove the alphanumeric characters from a list and split them in the result?

def tokenize(s):
    string = s.lower().split()
    getVals = list([val for val in s if val.isalnum()])
    result = "".join(getVals)
    print(result)

tokenize('AKKK#eastern B!##est!')
I'm trying to get the output ('akkkeastern', 'best'),
but the output of the code above is AKKKeasternBest.
What changes should I make?
Using a list comprehension is a good way to filter elements out of a sequence like a string. In the example below, the list comprehension is used to build a list of characters (characters are also strings in Python) that are either alphanumeric or a space - we are keeping the space around to use later to split the list. After the filtered list is created, what's left to do is make a string out of it using join and last but not least use split to break it in two at the space.
Example:
string = 'AKKK#eastern B!##est!'
# Removes non-alphanumeric chars, but preserves spaces
filtered = [
    char.lower()
    for char in string
    if char.isalnum() or char == " "
]
# String-ifies filtered list, and splits on space
result = "".join(filtered).split()
print(result)
Output:
['akkkeastern', 'best']
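Wrapped back into the tokenize function from the question, the same approach might look like the sketch below. It returns the tokens instead of printing them, which is a change from the original function.

```python
def tokenize(s):
    """Lowercase s, keep only letters, digits and spaces, then split on whitespace."""
    kept = "".join(ch.lower() for ch in s if ch.isalnum() or ch == " ")
    return kept.split()

print(tokenize('AKKK#eastern B!##est!'))  # ['akkkeastern', 'best']
```

If a tuple is needed, as in the question's desired output, wrap the call in tuple(...).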

How can I slice and keep text in a list every specific character?

I used BeautifulSoup and got a result from .get_text(). The result contains a long text:
alpha = ['\n\n\n\nIntroduction!!\nGood\xa0morning.\n\n\n\nHow\xa0are\xa0you?\n\n']
It can be noticed that the number of \n is not the same, and there are \xa0 for spacing.
I want to slice every group of \n (\n\n or \n\n\n or \n\n\n\n ) and replace \xa0 with a space in a new list, to look like this:
beta = ['Introduction!!','Good morning.','How are you?']
How can I do it?
Thank you in advance.
I wrote a little script that solves your problem:
alpha = ['\n\n\n\nIntroduction!!\nGood\xa0morning.\n\n\n\nHow\xa0are\xa0you?\n\n']
beta = []
for s in alpha:
    # Turning the \xa0 into spaces
    s = s.replace('\xa0', ' ')
    # Breaking the string by \n
    s = s.split('\n')
    # Explanation 1
    s = list(filter(lambda s: s != '', s))
    # Explanation 2
    beta = beta + s
print(beta)
Explanation 1
Because there are sequences of \n inside the alpha string, split() generates some empty strings. The filter() call removes them from the list.
Explanation 2
When the string s is split, it becomes a list of strings, so we need to concatenate that list onto beta.
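The same steps can be condensed into one list comprehension. This is only a compact restatement of the script above, not a different technique:

```python
alpha = ['\n\n\n\nIntroduction!!\nGood\xa0morning.\n\n\n\nHow\xa0are\xa0you?\n\n']

beta = [
    line.replace('\xa0', ' ')
    for s in alpha
    for line in s.split('\n')
    if line  # skip the empty strings produced by consecutive \n
]
print(beta)  # ['Introduction!!', 'Good morning.', 'How are you?']
```

Splitting on '\n' explicitly matters here: calling split() with no argument would also break the text at every \xa0, since Python treats it as whitespace.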

Is there a way to substring, which is between two words in the string in Python?

My question is more or less similar to:
Is there a way to substring a string in Python?
but it's more specifically oriented.
How can I get a part of a string which is located between two known words in the initial string?
Example:
myString = "this is the initial string"
Substring = "initial"
knowing that "the" and "string" are the two known words in the string that can be used to get the substring.
Thank you!
You can start with simple string manipulation here. str.index is your best friend there, as it will tell you the position of a substring within a string; and you can also start searching somewhere later in the string:
>>> myString = "this is the initial string"
>>> myString.index('the')
8
>>> myString.index('string', 8)
20
Looking at the slice [8:20], we already get close to what we want:
>>> myString[8:20]
'the initial '
Of course, since we found the beginning position of 'the', we need to account for its length. And finally, we might want to strip whitespace:
>>> myString[8 + 3:20]
' initial '
>>> myString[8 + 3:20].strip()
'initial'
Combined, you would do this:
startIndex = myString.index('the')
substring = myString[startIndex + 3 : myString.index('string', startIndex)].strip()
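As a reusable helper, the combined steps might look like this sketch. The function name between is hypothetical; like str.index itself, it raises ValueError when either word is missing.

```python
def between(text, start_word, end_word):
    """Return the text between the first start_word and the next end_word, stripped."""
    start = text.index(start_word) + len(start_word)
    end = text.index(end_word, start)
    return text[start:end].strip()

print(between("this is the initial string", "the", "string"))  # initial
```

Using len(start_word) instead of the hard-coded 3 keeps the helper correct for markers of any length.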
If you want to look for matches multiple times, then you just need to repeat doing this while looking only at the rest of the string. Since str.index will only ever find the first match, you can use this to scan the string very efficiently:
searchString = 'this is the initial string but I added the relevant string pair a few more times into the search string.'
startWord = 'the'
endWord = 'string'
results = []
index = 0
while True:
    try:
        startIndex = searchString.index(startWord, index)
        endIndex = searchString.index(endWord, startIndex)
        results.append(searchString[startIndex + len(startWord):endIndex].strip())
        # move the index to the end
        index = endIndex + len(endWord)
    except ValueError:
        # str.index raises a ValueError if there is no match; in that
        # case we know that we’re done looking at the string, so we can
        # break out of the loop
        break
print(results)
# ['initial', 'relevant', 'search']
You can also try something like this:
mystring = "this is the initial string"
mystring = mystring.strip().split(" ")
for i in range(1, len(mystring) - 1):
    if mystring[i-1] == "the" and mystring[i+1] == "string":
        print(mystring[i])
I suggest using a combination of the split, index and join methods.
This should help if you are looking for more than one word in the substring.
Turn the string into a list of words:
words = string.split()
Get the index of your opening and closing markers, then return the substring:
start = words.index('the')
end = words.index('string')
substring = ' '.join(words[start + 1:end])
You may want to check that both markers are present before proceeding, since index raises a ValueError otherwise.
If your problem gets more complex, i.e. multiple occurrences of the pair, I suggest using a regular expression.
import re
substring = ''.join(re.findall(r'the (.+?) string', string))
re.findall returns the substrings as separate items in a list, so drop the join if you want to keep them apart.
I am including the spaces around the markers in the pattern so that they are not captured with the words in between; you can modify this to suit your needs.
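To keep each match separate, the list from re.findall can be returned as is. A minimal sketch with a hypothetical helper all_between; it assumes the markers are separated from the enclosed text by whitespace and does not check word boundaries.

```python
import re

def all_between(text, start_word, end_word):
    """Return every shortest stretch of text found between start_word and end_word."""
    # re.escape keeps the markers literal; (.+?) is non-greedy, so each match
    # stops at the nearest end_word.
    pattern = re.escape(start_word) + r"\s+(.+?)\s+" + re.escape(end_word)
    return re.findall(pattern, text)

search_string = ('this is the initial string but I added the relevant string '
                 'pair a few more times into the search string.')
print(all_between(search_string, 'the', 'string'))
# ['initial', 'relevant', 'search']
```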

Select part of a string and change it in lowercase or uppercase python 3.x

I want to convert a string so that the characters at even positions are in upper case and the characters at odd positions are in lower case.
Here is what I've tried so far:
def foldingo(chaine):
    chaineuh = chaine[0::2].upper()
    chaine = chaineuh[1::2].lower()
    return chaine
Your code takes every other character in chaine, uppercases them, and assigns those characters to chaineuh.
Then it takes every other character in chaineuh, lowercases them, and assigns those characters to chaine again. In other words:
abcdefg -> ACEG -> cg
You'll notice it's not keeping the characters that you're not trying to target.
You could try building all the uppercases and lowercases separately, then iterate with zip to get them together.
def fold(s):
    uppers = s[0::2].upper()
    lowers = s[1::2].lower()
    return zip(uppers, lowers)
But this doesn't quite work either, since zip gives you tuples, not strings, and will drop the last character in odd-length strings
abcdefg -> ACEG, bdf -> ('A', 'b'), ('C', 'd'), ('E', 'f')
We could fix that by using a couple calls to str.join and using itertools.zip_longest with a fillvalue='', but it's kind of like using a wrench to hammer in a nail. It's not really the right tool for the job. For the record: it would look like:
''.join([''.join(pair) for pair in itertools.zip_longest(uppers, lowers, fillvalue='')])
yuck.
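For completeness, here is that zip_longest variant as a runnable sketch. The name fold_zip is made up, and a single generator expression stands in for the nested joins.

```python
from itertools import zip_longest

def fold_zip(s):
    uppers = s[0::2].upper()
    lowers = s[1::2].lower()
    # fillvalue='' keeps the unpaired last character of odd-length input
    return ''.join(u + l for u, l in zip_longest(uppers, lowers, fillvalue=''))

print(fold_zip('abcdefg'))  # AbCdEfG
```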
Let's instead just iterate over the string and uppercase every other letter. We can use an alternating boolean to track whether we're upper'ing or lower'ing this time around.
def fold(s):
    time_to_upper = True
    result = ""
    for ch in s:
        if time_to_upper:
            result += ch.upper()
        else:
            result += ch.lower()
        time_to_upper = not time_to_upper
    return result
You could also use enumerate and a modulo to keep track:
def fold(s):
    result = ""
    for i, ch in enumerate(s):
        ch = ch.lower() if i % 2 else ch.upper()
        result += ch
    return result
Or by using itertools.cycle, str.join, and list comprehensions, we can make this a lot shorter (possibly at the cost of readability!)
import itertools
def fold(s):
    return ''.join([op(ch) for op, ch in zip(itertools.cycle([str.upper, str.lower]), s)])
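A middle ground between the loop and the cycle version is to put the enumerate test inside a generator expression. A short sketch, counting positions from 0 so that the first character is uppercased:

```python
def fold(s):
    """Uppercase characters at even indexes and lowercase those at odd indexes."""
    return ''.join(
        ch.lower() if i % 2 else ch.upper()
        for i, ch in enumerate(s)
    )

print(fold('abcdefg'))      # AbCdEfG
print(fold('Hello World'))  # HeLlO WoRlD
```

Spaces and punctuation count as positions too, which is why the W in the second example lands on an even index.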
