I'm sure this is covered in plenty of places, but I don't know the exact name of the action I'm trying to do so I can't really look it up. I've been reading an official Python book for 30 minutes trying to find out how to do this.
Problem: I need to put a string in a certain length "field".
For example, if the name field was 15 characters long, and my name was John, I would get "John" followed by 11 spaces to create the 15 character field.
I need this to work for any string put in for the variable "name".
I know it will likely be some form of formatting, but I can't find the exact way to do this. Help would be appreciated.
This is super simple with format:
>>> a = "John"
>>> "{:<15}".format(a)
'John           '
You can use the ljust method on strings.
>>> name = 'John'
>>> name.ljust(15)
'John           '
Note that if the name is longer than 15 characters, ljust won't truncate it. If you want to end up with exactly 15 characters, you can slice the resulting string:
>>> name.ljust(15)[:15]
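The pad-and-truncate step can also be folded into a single format specification, because a precision applied to a string caps its length. A minimal sketch of that idea; `to_field` is a hypothetical helper name, not something from the standard library:

```python
def to_field(text, width):
    """Return text left-justified in a field of exactly `width` characters.

    `to_field` is an illustrative helper name, not a built-in.
    """
    # "<{w}" pads on the right; ".{w}" truncates strings longer than w.
    return "{:<{w}.{w}}".format(text, w=width)


print(repr(to_field("John", 15)))
print(repr(to_field("Bartholomew Fitzgerald", 15)))
```

Short names come back padded to the field width, and longer ones are cut to it.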
If you have Python 3.6 or higher, you can use f-strings:
>>> string = "John"
>>> f"{string:<15}"
'John           '
Or, if you'd like it right-aligned (padded on the left):
>>> f"{string:>15}"
'           John'
Centered
>>> f"{string:^15}"
'     John      '
For more variations, feel free to check out the docs: https://docs.python.org/3/library/string.html#format-string-syntax
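Two of the variations described in those docs are often handy for fixed-width fields: taking the width from a variable and using a fill character other than a space. A small sketch, where `width` and the `.` fill character are assumptions chosen only for illustration:

```python
name = "John"
width = 15  # assumed field width, matching the question

left = f"{name:<{width}}"     # pad on the right with spaces
right = f"{name:>{width}}"    # pad on the left with spaces
dotted = f"{name:.<{width}}"  # pad on the right with '.' instead

print(repr(left))
print(repr(right))
print(repr(dotted))
```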
You can use rjust and ljust functions to add specific characters before or after a string to reach a specific length.
The first parameter of those methods is the total length of the string after the transformation.
Right justified (add to the left)
numStr = '69'
numStr = numStr.rjust(5, '*')
The result is ***69
Left justified (add to the right)
And for the left:
numStr = '69'
numStr = numStr.ljust(3, '#')
The result will be 69#
Fill with Leading Zeros
Also to add zeros you can simply use:
'69'.zfill(8)
Which gives you 00000069 as the result.
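One difference between zfill and rjust with a '0' fill character shows up with signed numbers. The sketch below uses hypothetical variable names to put the two side by side:

```python
signed = "-42"

# rjust treats the sign like any other character, so zeros land before it.
with_rjust = signed.rjust(6, "0")

# zfill keeps a leading '+' or '-' in front of the inserted zeros.
with_zfill = signed.zfill(6)

print(with_rjust)
print(with_zfill)
```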
name = input()         # the value for the field
length = int(input())  # the length of the field
string = name + " " * (length - len(name))  # add the extra spaces
This will add the number of spaces needed, provided the field has length >= the length of the name provided
name = "John"  # your variable
result = (name + " " * 15)[:15]  # this appends 15 spaces to "name",
                                 # then cuts the result at 15 characters
I know this is a bit of an old question, but I've ended up making my own little class for it.
Might be useful to someone so I'll stick it up. I used a class variable, which is inherently persistent, to ensure sufficient whitespace was added to clear any old lines. See below:
2021-03-02 update: Improved a bit - when working through a large codebase, you know whether the line you are writing is one you care about or not, but you don't know what was previously written to the console and whether you want to retain it.
This update takes care of that: a class variable, updated each time you write to the console, keeps track of whether the line you are currently writing is one you want to keep or one that may be overwritten later.
class consolePrinter():
    '''
    Class to write to the console
    Objective is to make it easy to write to console, with user able to
    overwrite previous line (or not)
    '''
    # -------------------------------------------------------------------------
    #Class variables
    stringLen = 0
    overwriteLine = False
    # -------------------------------------------------------------------------
    # -------------------------------------------------------------------------
    def writeline(stringIn, overwriteThisLine=False):
        import sys
        #Get length of stringIn and update stringLen if needed
        if len(stringIn) > consolePrinter.stringLen:
            consolePrinter.stringLen = len(stringIn)+1
        ctrlString = "{:<"+str(consolePrinter.stringLen)+"}"
        prevOverwriteLine = consolePrinter.overwriteLine
        if prevOverwriteLine:
            #Previous line entry can be overwritten, so do so
            sys.stdout.write("\r" + ctrlString.format(stringIn))
        else:
            #Previous line entry cannot be overwritten, take a new line
            sys.stdout.write("\n" + stringIn)
        sys.stdout.flush()
        #Update the class variable for prevOverwriteLine
        consolePrinter.overwriteLine = overwriteThisLine
        return
Which then is called via:
consolePrinter.writeline("text here", True)
If you want this line to be overwriteable
consolePrinter.writeline("text here",False)
if you don't.
Note, for it to work right, all messages pushed to the console would need to be through consolePrinter.writeline.
I generally recommend the f-string/format version, but sometimes you have a tuple, need, or want to use printf-style instead. I did this time and decided to use this:
>>> res = (1280, 720)
>>> '%04sx%04s' % res
'1280x 720'
Thought it was a touch more readable than the format version:
>>> f'{res[0]:>4}x{res[1]:>4}'
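For the left-justified field from the original question, printf-style formatting uses a `-` flag, and `*` lets the width come from the argument tuple. A brief sketch; the variable names are only illustrative:

```python
name = "John"
width = 15  # assumed field width, matching the question

fixed = "%-15s" % name            # '-' left-justifies within 15 characters
dynamic = "%-*s" % (width, name)  # '*' takes the width from the arguments

print(repr(fixed))
print(repr(dynamic))
```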
First check to see if the string's length needs to be shortened, then add spaces until it is as long as the field length.
fieldLength = 15
string1 = string1[:fieldLength]  # If it needs to be shortened, shorten it
while len(string1) < fieldLength:
    string1 += " "
Just whipped this up for my problem; it adds spaces until the length of the string is at least the min_length you give it.
def format_string(s, min_length):
    while len(s) < min_length:
        s += " "
    return s
I used BeautifulSoup and got a result from .get_text(). The result contains a long text:
alpha = ['\n\n\n\nIntroduction!!\nGood\xa0morning.\n\n\n\nHow\xa0are\xa0you?\n\n']
It can be noticed that the number of \n is not the same, and there are \xa0 for spacing.
I want to slice every group of \n (\n\n or \n\n\n or \n\n\n\n ) and replace \xa0 with a space in a new list, to look like this:
beta = ['Introduction!!','Good morning.','How are you?']
How can I do it?
Thank you in advance.
I wrote a little script that solves your problem:
alpha = ['\n\n\n\nIntroduction!!\nGood\xa0morning.\n\n\n\nHow\xa0are\xa0you?\n\n']
beta = []
for s in alpha:
    # Turning the \xa0 into spaces
    s = s.replace('\xa0',' ')
    # Breaking the string by \n
    s = s.split('\n')
    # Explanation 1
    s = list(filter(lambda s: s!= '',s))
    # Explanation 2
    beta = beta + s
print(beta)
Explanation 1
Because the alpha string contains runs of consecutive \n characters, split() generates some empty strings. The filter() call removes them from the list.
Explanation 2
When the string s is split, it becomes a list of strings, so we then need to concatenate that list onto beta.
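The same replace, split, and filter steps can also be written as a single list comprehension. This is only a compact sketch of the approach described above, reusing the `alpha` and `beta` names from the question:

```python
alpha = ['\n\n\n\nIntroduction!!\nGood\xa0morning.\n\n\n\nHow\xa0are\xa0you?\n\n']

beta = [
    line.replace('\xa0', ' ')     # turn non-breaking spaces into spaces
    for text in alpha
    for line in text.split('\n')  # break each string on newlines
    if line                       # skip the empty strings between newlines
]
print(beta)
```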
I need to take only the letters and numbers at the beginning of a string, but some numbers are decimals. The strings are not all formatted the same. Here are a few examples of some of the data and what I would need returned:
HB61 .M16 1973 I need HB61 returned
HB97.52 .R6163 1982 I need HB97.52 returned
HB98.V38 1994 I need HB98 returned
HB 119.G74 A3 2007 I need HB119 returned
I'm very new to coding so I'm hoping there's some simple solution that I just don't know?
I was going to just split it at the first dot and then get rid of the spaces, but this wouldn't allow me to keep the decimals such as HB97.52 which I need. I currently have code written just to test one string at a time. The code is as follows:
data = input("Data: ")
components = data.split(".")
str(components)
print(components[0].replace(" ", ""))
This works as expected except for the strings with decimals. For HB97.52 .R6163 1982 I would like HB97.52 returned, but it only returns HB97.
The following regular expression extracts the letters at the beginning of a string, followed by optional spaces, followed by a [possibly floating point] number:
s = ['HB61 .M16 1973', 'HB97.52 .R6163 1982',
'HB98.V38 1994', 'HB 119.G74 A3 2007']
import re
pattern = r"^[a-z]+\s*\d+(?:\.\d+)?"
[re.findall(pattern, part, flags=re.I)[0] for part in s]
#['HB61', 'HB97.52', 'HB98', 'HB 119']
If you do not want the spaces in the output, this slightly different pattern extracts the letter part and the number part separately, and then they are joined:
pattern = r"(^[a-z]+)\s*(\d+(?:\.\d+)?)"
list(map("".join, [re.findall(pattern, part, flags=re.I)[0] for part in s]))
#['HB61', 'HB97.52', 'HB98', 'HB119']
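If some strings might not start with letters followed by a number, indexing the findall result with [0] raises an IndexError. A hedged sketch of a wrapper that returns None instead; `leading_class` and `CALL_NUMBER` are hypothetical names, and the pattern is the two-group one above without the explicit `^`, since `match()` already anchors at the start:

```python
import re

# Letters, optional spaces, then a number with an optional decimal part.
CALL_NUMBER = re.compile(r"([a-z]+)\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def leading_class(text):
    """Return the leading letters and number joined together, or None."""
    match = CALL_NUMBER.match(text)  # match() only matches at the start
    if match is None:
        return None
    return match.group(1) + match.group(2)


for sample in ['HB61 .M16 1973', 'HB97.52 .R6163 1982',
               'HB98.V38 1994', 'HB 119.G74 A3 2007']:
    print(leading_class(sample))
```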
For something like HB61.45.78.R5000 what do you want? If you want HB61.45.78 then use this first snippet:
data = data.replace(' ', '')
data = data.split('.')
wanted = data[0]
for i in range(1,len(data)):
    if data[i][0].isalpha():
        break
    else:
        wanted += '.' + data[i]
Otherwise, if you want only HB61.45 then use
data = data.replace(' ', '')
data = data.split('.')
wanted = data[0]
# Guard against input with no dot at all, where data[1] does not exist
if len(data) > 1 and not data[1][0].isalpha():
    wanted += '.' + data[1]
I'm using python 3.x. I'm trying to get the (int) number at the end of a string with format
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
or in general, a single lowercase letter followed by a number of integers separated by '-'. What I need is to get the last number (12 and 657 in these cases). I have achieved this with the function
def ending(the_string):
    out = ''
    while the_string[-1].isdigit():
        out = the_string[-1] + out
        the_string = the_string[:-1]
    return out
but I'm sure there must be a more pythonic way to do this. In a previous instance I check manually that the string starts the way I like by doing something like
if st[0].isalpha() and st[1]=='-' and st[2].isdigit():
    statement...
I would just split the string on -, take the last of the splits and convert it to an integer.
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
def last_number(s):
    return int(s.split("-")[-1])
print(last_number(string_example_1))
# 12
print(last_number(string_example_2))
# 657
Without regular expressions, you could reverse the string, take elements from the string while they're still numbers, and then reverse the result. In Python:
from itertools import takewhile
def extract_final_digits(s):
    return int(''.join(reversed(list(takewhile(lambda c: c.isdigit(), reversed(s))))))
But the simplest is to just split on a delimiter and take the final element in the split list.
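For completeness, a regular-expression version is also short: `\d+$` matches a run of digits anchored at the end of the string. This is a sketch with a hypothetical function name, returning None when the string does not end in digits:

```python
import re


def trailing_int(s):
    """Return the integer at the end of s, or None if there is none."""
    match = re.search(r"\d+$", s)  # digits immediately before the end
    return int(match.group()) if match else None


print(trailing_int("l-45-98-567-567-12"))
print(trailing_int("s-89-657"))
```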
I need some help with a specific problem, which I cannot seem to find on this website.
I have a result which looks something like this:
result = "ooooooooooooooooooooooMMMMMMooooooooooooooooooMMMMMMooooooooooMMMMMMMMoo"
This is a transmembrane prediction. So for this string, I have another string of the same length, but is an amino acid code, for example:
amino_acid_code = "MSDENKSTPIVKASDITDKLKEDILTISKDALDKNTWHVIVGKNFGSYVTHEKGHFVYFYIGPLAFLVFKTA"
I want to do some research on the last "M" region. This can vary in length, as well as the "o" that comes after. So in this case I need to extract "PLAFLVFK" from the last string, which corresponds to the last "M" region.
I have something like this already, but I cannot figure out how to obtain the start position, and I also believe a simpler (or computationally better) solution is possible.
end = result.rfind('M')
start = ?
region_I_need = amino_acid_code[start:end]
Thanks in advance
To also find the start position, use rfind again, this time on the result string sliced so that it ends at the last 'M':
result = "ooooooooooooooooooooooMMMMMMooooooooooooooooooMMMMMMooooooooooMMMMMMMMoo"
amino_acid_code = "MSDENKSTPIVKASDITDKLKEDILTISKDALDKNTWHVIVGKNFGSYVTHEKGHFVYFYIGPLAFLVFKTA"
# add 1 to the indices to get the correct positions
end = result.rfind('M') + 1
start = result[:end].rfind('o') + 1
region_I_need = amino_acid_code[start:end]
print(start, end)
print(amino_acid_code[start:end])
# 62 70
# PLAFLVFK
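Another way to get both positions at once is to let a regular expression find every run of 'M' and keep the last one. This is a sketch of an alternative, not the method used above; it assumes the prediction string contains at least one 'M':

```python
import re

result = "ooooooooooooooooooooooMMMMMMooooooooooooooooooMMMMMMooooooooooMMMMMMMMoo"
amino_acid_code = "MSDENKSTPIVKASDITDKLKEDILTISKDALDKNTWHVIVGKNFGSYVTHEKGHFVYFYIGPLAFLVFKTA"

# Each match is one contiguous M region; span() gives (start, end),
# with end already exclusive, so it can be used directly for slicing.
m_regions = list(re.finditer(r"M+", result))
start, end = m_regions[-1].span()  # assumes at least one M region exists

region_I_need = amino_acid_code[start:end]
print(start, end)
print(region_I_need)
```

Keeping the whole `m_regions` list also makes it easy to look at the earlier regions later.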
My question is more or less similar to:
Is there a way to substring a string in Python?
but it's more specifically oriented.
How can I get a part of a string that is located between two known words in the initial string?
Example:
myString = "this is the initial string"
Substring = "initial"
knowing that "the" and "string" are the two known words in the string that can be used to get the substring.
Thank you!
You can start with simple string manipulation here. str.index is your best friend there, as it will tell you the position of a substring within a string; and you can also start searching somewhere later in the string:
>>> myString = "this is the initial string"
>>> myString.index('the')
8
>>> myString.index('string', 8)
20
Looking at the slice [8:20], we already get close to what we want:
>>> myString[8:20]
'the initial '
Of course, since we found the beginning position of 'the', we need to account for its length. And finally, we might want to strip whitespace:
>>> myString[8 + 3:20]
' initial '
>>> myString[8 + 3:20].strip()
'initial'
Combined, you would do this:
startIndex = myString.index('the')
substring = myString[startIndex + 3 : myString.index('string', startIndex)].strip()
If you want to look for matches multiple times, then you just need to repeat doing this while looking only at the rest of the string. Since str.index will only ever find the first match, you can use this to scan the string very efficiently:
searchString = 'this is the initial string but I added the relevant string pair a few more times into the search string.'
startWord = 'the'
endWord = 'string'
results = []
index = 0
while True:
    try:
        startIndex = searchString.index(startWord, index)
        endIndex = searchString.index(endWord, startIndex)
        results.append(searchString[startIndex + len(startWord):endIndex].strip())
        # move the index to the end
        index = endIndex + len(endWord)
    except ValueError:
        # str.index raises a ValueError if there is no match; in that
        # case we know that we’re done looking at the string, so we can
        # break out of the loop
        break
print(results)
# ['initial', 'relevant', 'search']
You can also try something like this:
mystring = "this is the initial string"
mystring = mystring.strip().split(" ")
for i in range(1,len(mystring)-1):
    if(mystring[i-1] == "the" and mystring[i+1] == "string"):
        print(mystring[i])
I suggest using a combination of list, split and join methods.
This should help if you are looking for more than 1 word in the substring.
Turn the string into a list of words:
words = string.split()
Get the index of your opening and closing markers, then return the substring:
start = words.index('the')
end = words.index('string')
substring = ' '.join(words[start+1:end])
You may want to check that both markers are actually present before proceeding, since index raises a ValueError otherwise.
If your problem gets more complex, e.g. multiple occurrences of the marker pair, I suggest using a regular expression.
import re
substring = ''.join(re.findall(r'the (.+?) string', string))
re.findall returns the matches as a list, so you can also keep the substrings separate instead of joining them.
The spaces in the pattern keep the whitespace around the markers out of the captured text; you can modify the pattern to suit your needs.
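To reuse the regular-expression approach with arbitrary marker words, the markers can be passed through `re.escape` so that any special characters in them are treated literally. A minimal sketch, assuming the markers are whole words separated from the text by whitespace; `between` is a hypothetical helper name:

```python
import re


def between(text, start_word, end_word):
    """Return every substring found between start_word and end_word."""
    pattern = (
        r"\b" + re.escape(start_word)  # opening marker as a whole word
        + r"\s+(.+?)\s+"               # shortest possible text in between
        + re.escape(end_word) + r"\b"  # closing marker as a whole word
    )
    return re.findall(pattern, text)


print(between("this is the initial string", "the", "string"))
```

Because the match is non-greedy, each opening marker is paired with the nearest closing marker that follows it.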