How to get rid of extra parentheses, quotes or commas in Python

I'm making a Discord bot and I'm running into an issue getting list values without extra parentheses, quotes or commas.
Here's the code:
#gets slots for page
def listindexcheck(slot):
    if totalitemcount > slot + (page * 9):
        return slot + (page * 9)
    else:
        return 0

if argument == 'show':
    #message set up
    embed = discord.Embed(
        title='title',
        description='Here is your inventory',
        colour=discord.Colour.red())
    for i in range(9):
        embed.add_field(name=f"Slot ({listindexcheck(i+1)})", value=f'Dur: {str(item_dur[listindexcheck(i+1)])}\n'
                        f'Mod: {str(item_mod[listindexcheck(i+1)])}\n'
                        f'E.lvl: {str(item_level[listindexcheck(i+1)])}\n'
                        f'*id:{str(item_ids[listindexcheck(i+1)])}*\n')
    embed.set_footer(text='page: ' + str(page+1))
    msg = await ctx.send(embed=embed)
    print (type(str(item_ids[listindexcheck(i+1)])))
And here's the output.
The type of the data before converting it to a string is list.
I tried converting the values to strings to get rid of at least the quotes, but that didn't work.
My question is: is there a way to just get the values without doing anything extra to them?
Thanks

From the looks of it, item_dur[listindexcheck(i+1)], item_mod[listindexcheck(i+1)], item_level[listindexcheck(i+1)] and item_ids[listindexcheck(i+1)] seem to be tuples with only one item inside. Why is there a comma inside the parentheses? Because it is the comma that makes the tuple, not the parentheses.
Therefore, to avoid the parentheses and comma when you print them, you can just take the first item of the tuple, like this: item_dur[listindexcheck(i+1)][0].
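Here is a minimal sketch of the difference, assuming each list entry really is a one-element tuple. The sample values below are made up for illustration and do not come from the question.

```python
# Hypothetical sample data: every entry is a one-element tuple.
item_dur = [(100,), (75,), (42,)]
item_mod = [("Sharp",), ("Heavy",), ("Light",)]

row = item_dur[1]
as_tuple = f"Dur: {row}"      # formats the whole tuple, punctuation included
as_value = f"Dur: {row[0]}"   # formats only the value stored inside it

print(as_tuple)               # Dur: (75,)
print(as_value)               # Dur: 75
print(f"Mod: {item_mod[1][0]}")  # Mod: Heavy
```

Indexing with [0] leaves the value itself untouched, so a string that happens to contain a comma or a parenthesis is kept intact.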

After you've turned it into a string you could use the replace() method to remove any extra pieces that you don't want in there. You would replace anything that you don't want in there with '' to simply remove it from the string.
Here's some info on the replace() method: https://www.w3schools.com/python/ref_string_replace.asp

You can use str.replace, and it is cleaner to wrap it in a function like this:
def delete_brackets(text: str):
    return text.replace('(', '').replace(')', '').replace(',', '')

embed.add_field(name=f"Slot ({listindexcheck(i+1)})", value=f'Dur: {delete_brackets(str(item_dur[listindexcheck(i+1)]))}\n'
                f'Mod: {delete_brackets(str(item_mod[listindexcheck(i+1)]))}\n'
                f'E.lvl: {delete_brackets(str(item_level[listindexcheck(i+1)]))}\n'
                f'*id:{delete_brackets(str(item_ids[listindexcheck(i+1)]))}*\n')
This should work for you.
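For reference, a small sketch of what such a helper does to the string form of a one-element tuple. The variant below also strips single quotes, which is an addition of mine and not part of the answer, and the sample tuples are hypothetical.

```python
def delete_brackets(text: str) -> str:
    # Remove parentheses, commas and single quotes left over from str(tuple).
    for unwanted in "(),'":
        text = text.replace(unwanted, "")
    return text

sample_text = ("Sharpness",)   # hypothetical one-element tuple holding a string
sample_number = (75,)          # hypothetical one-element tuple holding an int

cleaned_text = delete_brackets(str(sample_text))
cleaned_number = delete_brackets(str(sample_number))

print(cleaned_text)    # Sharpness
print(cleaned_number)  # 75
```

Note that this approach also deletes those characters when they are part of the value itself, so it only suits data that never contains them.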

Related

Get number from string in Python

I have a string, I have to get digits only from that string.
url = "www.mylocalurl.com/edit/1987"
Now from that string, I need to get 1987 only.
I have been trying this approach,
id = [int(i) for i in url.split() if i.isdigit()]
But I am getting [] list only.
You can use a regex to get the digits alone in a list.
import re
url = "www.mylocalurl.com/edit/1987"
digit = re.findall(r'\d+', url)
output:
['1987']
Replace all non-digits with blank (effectively "deleting" them):
import re
num = re.sub(r'\D', '', url)
You aren't getting anything because by default the .split() method splits a sentence up where there are spaces. Since you are trying to split a hyperlink that has no spaces, it is not splitting anything up. What you can do is called a capture using regex. For example:
import re
url = "www.mylocalurl.com/edit/1987"
regex = r'(\d+)'
numbers = re.search(regex, url)
captured = numbers.groups()[0]
If you do not know what regular expressions are, the code is basically saying: using the regex defined as r'(\d+)', which means "capture any run of digits", search through the url. Then captured holds the first captured group, which is 1987.
If you don't want to use a regex, you can still use .split(), but this time pass / as the separator, for example url.split('/').
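A minimal sketch of the split-based approach, reusing the list comprehension from the question with / as the separator:

```python
url = "www.mylocalurl.com/edit/1987"

# Split on "/" and keep only the segments made up entirely of digits.
ids = [int(part) for part in url.split("/") if part.isdigit()]

print(ids)  # [1987]
```

This only picks up path segments that are purely numeric, so a segment such as "v2" would be skipped.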

Combining several replace-statements into one in pandas

There is a DataFrame in pandas, see image below
Basically it is a table scraped from Wikipedia's article: https://de.wikipedia.org/wiki/Liste_der_Gro%C3%9Fst%C3%A4dte_in_Deutschland#Tabelle
For further processing, I am trying to clean up the data. So, these statements work well
df['Name'] = df['Name'].str.replace('\d+', '')
df['Name'] = df['Name'].str.strip()
df['Name'] = df['Name'].str.replace(',', '')
df['Name'] = df['Name'].str.replace('­-', '')
But how can I bring all these four statements into one? Probably using regular expressions.
I tried with df['Name'] = df['Name'].str.replace(r'[\d\-,]+', '') but it did not work. Maybe because of the word wrap character that was used.
My desired output is " Ber,li-n2 "-> "Berlin".
The unknown circumstances are going around 'Mönchen­gladbach1, 5'.
You are removing the data, so you may join the patterns you remove into a single pattern like the one you have. r'[\d,-]+' is a bit better stylistically.
You may remove any dash punctuation + soft hyphen (\u00AD) using [\u00AD\u002D\u058A\u05BE\u1400\u1806\u2010-\u2015\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D], so you may want to add these codes to the regex.
Remember to assign the cleaned data back to the column and add .str.strip().
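You may use the following (regex=True is needed on pandas 2.0 and later, where str.replace treats the pattern as a literal string by default):
df['Name'] = df['Name'].str.replace(r'[\u00AD\u002D\u058A\u05BE\u1400\u1806\u2010-\u2015\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D\d,-]+', '', regex=True).str.strip()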
If you do not want to add str.strip(), add ^\s+ and \s+$ alternatives to the regex:
df['Name'] = df['Name'].str.replace(r'^\s+|[\u00AD\u002D\u058A\u05BE\u1400\u1806\u2010-\u2015\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D\d,-]+|\s+$', '', regex=True)
Details
^\s+ - 1+ whitespaces at the start of the string
| - or
[\u00AD\u002D\u058A\u05BE\u1400\u1806\u2010-\u2015\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D\d,-]+ - 1 or more soft hyphens, Unicode dashes, digits, commas or - chars
| - or
\s+$ - 1+ whitespaces at the end of the string.
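To see the idea on plain strings, here is a rough sketch using Python's re module instead of pandas. The character class is shortened for readability (soft hyphen, the dashes U+2010 to U+2015, digits, commas and the ASCII hyphen), and clean_name is a hypothetical helper name.

```python
import re

# Shortened class for illustration: soft hyphen (U+00AD), Unicode dashes
# U+2010..U+2015, digits, commas and the ASCII hyphen.
UNWANTED = re.compile(r'[\u00AD\u2010-\u2015\d,-]+')

def clean_name(name: str) -> str:
    # Drop the unwanted characters first, then trim surrounding whitespace.
    return UNWANTED.sub('', name).strip()

berlin = clean_name(' Ber,li-n2 ')
gladbach = clean_name('M\u00F6nchen\u00ADgladbach1, 5')

print(berlin)    # Berlin
print(gladbach)  # Mönchengladbach
```

Stripping after the substitution matters for the second input: the space before the 5 only becomes trailing whitespace once the digit has been removed.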
You can go with
df['Name'] = df['Name'].str.replace('(\d+|,|­<|>|-)', '')
Put the items you want to remove into a group, and separate the different options using the pipe |

Delete \n character with Python

I have a list of sentences which have this "\n" character.
[("Types of Third\n-\nParties\n"),("Examples of third\n-\nparties"), ...]
I tried with the following code :
def remove_whitespace(sent_text):
    j=0
    for i in sent_text:
        sent_text[j]=i.rstrip("\n")
        j+=1
remove_whitespace(sent_text)
But the \n character didn't disappear.
Any idea please?
Thanks
You can also use list comprehension to remove these unwanted items.
input_list = [("Types of Third\n-\nParties\n"),("Examples of third\n-\nparties")]
def expunge_unwanted_elements(input_variable):
    cleaned = [item.replace('\n', ' ').strip() for item in input_variable]
    # Do you want to remove the dashes? If so use this one.
    # cleaned = [item.replace('\n', '').replace('-', ' ').strip() for item in input_variable]
    return cleaned
print (expunge_unwanted_elements(input_list))
# outputs
['Types of Third - Parties', 'Examples of third - parties']
# or this output if you use the other cleaned in the function
['Types of Third Parties', 'Examples of third parties']
Using str.split & str.join
Ex:
data = [("Types of Third\n-\nParties\n"),("Examples of third\n-\nparties")]
for text in data:
    text = "".join(text.split("\n"))
    print(text)
Output:
Types of Third-Parties
Examples of third-parties
One quick solution is using str.replace.
In your case:
def remove_whitespace(sent_text):
    j=0
    for i in sent_text:
        sent_text[j]=i.replace("\n","")
        j+=1
You can use the rstrip() function, but only for the end of the string.
If the text ends with \n or \r, text.rstrip() takes these off; any \n in the middle of the text stays where it is.
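A short sketch comparing the two calls on one of the sentences from the question, which probably explains why the original loop seemed to do nothing:

```python
text = "Types of Third\n-\nParties\n"

trailing_only = text.rstrip("\n")     # removes only the newline at the end
everywhere = text.replace("\n", "")   # removes every newline in the string

print(repr(trailing_only))  # 'Types of Third\n-\nParties'
print(repr(everywhere))     # 'Types of Third-Parties'
```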

set function with file- python3

I have a text file with given below content
Credit
Debit
21/12/2017
09:10:00
I have written Python code to convert the text into a set and discard \n.
with open('text_file_name', 'r') as file1:
    same = set(file1)
print (same)
print (same.discard('\n'))
For the first print statement, print (same), I get the correct result:
{'Credit\n','Debit\n','21/12/2017\n','09:10:00\n'}
But for the second print statement, print (same.discard('\n')), I get the result
None.
Can anybody help me figure out why I am getting None? I am using same.discard('\n') to discard \n in the set.
Note:
I am trying to understand the discard function with respect to set.
The discard method only removes an element that is in the set; since your set doesn't contain an element that is just \n, there is nothing for it to discard. What you are looking for is a map that strips the \n from each element, like so:
set(map(lambda x: x.rstrip('\n'), same))
which will return {'Credit', 'Debit', '09:10:00', '21/12/2017'} as the set. This sample works by using the map builtin, which applies its first argument to each element in the set. The first argument in our map usage is lambda x: x.rstrip('\n'), which simply removes any occurrences of \n on the right-hand side of each string.
discard removes the given element from the set only if it is present in it.
In addition, the function doesn't return any value, as it changes the set it was called on.
with open('text_file_name', 'r') as file1:
    same = set(file1)
print (same)
# Strip the trailing newline from every element; a last line without one is kept as-is.
same = {elem.rstrip('\n') for elem in same}
print (same)
There are 4 elements in the set, and none of them are newline.
It would be more usual to use a list in this case, as that preserves order while a set is not guaranteed to preserve order, plus it discards duplicate lines. Perhaps you have your reasons.
You seem to be looking for rstrip('\n'). Consider processing the file in this way:
s = set()  # note: {} would create an empty dict, which has no add()
with open('text_file_name') as file1:
    for line in file1:
        s.add(line.rstrip('\n'))
s.discard('Credit')
print(s) # This displays 3 elements, without trailing newlines.
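To make the behaviour of discard concrete, here is a small sketch with an in-memory set instead of a file (the two sample lines are taken from the question):

```python
lines = {"Credit\n", "Debit\n"}

# "\n" on its own is not an element of the set, so nothing is removed.
first_result = lines.discard("\n")
size_after_first = len(lines)

# An exact element is removed in place; the return value is still None.
second_result = lines.discard("Debit\n")

print(first_result, size_after_first)  # None 2
print(second_result, lines)            # None {'Credit\n'}
```

In both cases discard returns None, which is why printing its result never shows the set.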

Alternative to .replace() for replacing multiple substrings in a string

Are there any alternatives that are similar to .replace() but that allow you to pass more than one old substring to be replaced?
I have a function to which I pass video titles so that specific characters can be removed (because the API I'm passing the videos to has bugs that don't allow certain characters):
def videoNameExists(vidName):
    vidName = vidName.encode("utf-8")
    bugFixVidName = vidName.replace(":", "")
    search_url ='https://api.brightcove.com/services/library?command=search_videos&video_fields=name&page_number=0&get_item_count=true&token=kwSt2FKpMowoIdoOAvKj&any=%22{}%22'.format(bugFixVidName)
Right now, it's eliminating ":" from any video titles with vidName.replace(":", "") but I also would like to replace "|" when that occurs in the name string stored in the vidName variable. Is there an alternative to .replace() that would allow me to replace more than one substring at a time?
In Python 2, str.translate accepts a second argument listing the characters to delete:
>>> s = "a:b|c"
>>> s.translate(None, ":|")
'abc'
You may use re.sub
import re
re.sub(r'[:|]', "", vidName)
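For Python 3, where str.translate takes a single table argument, a minimal sketch of both approaches could look like this (the sample title is the one from the translate example; the variable names are mine):

```python
import re

title = "a:b|c"

# Python 3: build a translation table whose third argument lists
# the characters to delete, then apply it in one pass.
via_translate = title.translate(str.maketrans("", "", ":|"))

# Same result with a regular expression character class.
via_regex = re.sub(r"[:|]", "", title)

print(via_translate)  # abc
print(via_regex)      # abc
```

Both remove single characters only; for multi-character substrings, a regex alternation such as "foo|bar" passed to re.sub is the usual route.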
