Replace/Remove colons of Strings inside a list - python-3.x

Simple question:
I have the following list:
['nordhessen:abfall_b', 'nordhessen:anschlussstelle_b']
And I want to get this list:
['nordhessenabfall_b', 'nordhessenanschlussstelle_b']
How do I remove the colons?

data = ['nordhessen:abfall_b', 'nordhessen:anschlussstelle_b']
new_data = [item.replace(':', '') for item in data]

Related

How do I print without square brackets

I have this function, which is supposed to create a file that I can write to. The first parameter is the name of the file to create; the second and third parameters are lists. Inside the function, a for loop iterates over all_students_list, a list whose elements are themselves lists containing a first name and a last name. all_courses_list is a list of all the courses in the school, and schedule is a list returned by another function, giving the student's schedule. I then combine the student name and the schedule and write them to the file. The problem is that the square brackets [] are written as well. How can I get rid of them? I've already tried
.replace('[', '')
.replace(']', '')
But it doesn't work.
Here is my code.
def generate_student_schedules(filename, all_courses_list, all_students_list):
    with open(filename,'w') as fileout:
        for one_student in all_students_list:
            schedule = get_schedule(all_courses_list)
            one_line = ''
            one_line += (f'{one_student}')
            one_line += (f'{schedule}\n')
            fileout.write(one_line)
If one_student is an actual list, then you can use " ".join(one_student), so overall:
def generate_student_schedules(filename, all_courses_list, all_students_list):
    with open(filename,'w') as fileout:
        for one_student in all_students_list:
            schedule = get_schedule(all_courses_list)
            one_line = ''
            one_line += (" ".join(one_student))
            one_line += (f'{schedule}\n')
            fileout.write(one_line)
When you print a list, Python's default is to print the brackets and items in the list. You have to build a single string of the components of the list and print that single string. Your format string can pull out individual items or use join across all the items if they are all strings:
>>> student = ['John','Smith']
>>> schedule = ['Class1','Class2']
>>> print(student,schedule)
['John', 'Smith'] ['Class1', 'Class2']
>>> line = f'{student[1]}, {student[0]}: {", ".join(schedule)}'
>>> print(line)
Smith, John: Class1, Class2
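As a minimal sketch, the join idea can be applied to both the name and the schedule before each line is written. The helper name format_schedule_line and the sample data are hypothetical, and the sketch assumes both lists contain only strings:

```python
def format_schedule_line(student, schedule):
    """Build one output line with no list brackets (hypothetical helper)."""
    # Assumption: student is [first_name, last_name] and schedule is a list of str.
    name = " ".join(student)
    courses = ", ".join(schedule)
    return f"{name}: {courses}\n"

line = format_schedule_line(["John", "Smith"], ["Class1", "Class2"])
print(line, end="")  # John Smith: Class1, Class2
```

The returned string could then be passed to fileout.write inside the loop.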

Create a string from a list using list comprehension

I am trying to create a comma-separated string from the list given below:
['D:\\abc\\pqr\\123\\aaa.xlsx', 'D:\\abc\\pqr\\123\\bbb.xlsx', 'D:\\abc\\pqr\\123\\ccc.xlsx']
The new string should contain only the filenames, separated by commas, like this:
'aaa.xlsx,bbb.xlsx,ccc.xlsx'
I have achieved this using the code below:
n = []
for p in input_list:
    l = p.split('\\')
    l = l[len(l)-1]
    n.append(l)
a = ','.join(n)
print(a)
But instead of using multiple lines of code, I would like to achieve this in a single line using a list comprehension or a regular expression.
Thanks in advance...
Simply do:
main_list = ['D:\\abc\\pqr\\123\\aaa.xlsx', 'D:\\abc\\pqr\\123\\bbb.xlsx', 'D:\\abc\\pqr\\123\\ccc.xlsx']
print([x.split("\\")[-1] for x in main_list])
OUTPUT:
['aaa.xlsx', 'bbb.xlsx', 'ccc.xlsx']
If you want the result as a single string, simply do:
print(",".join([x.split("\\")[-1] for x in main_list]))
OUTPUT:
aaa.xlsx,bbb.xlsx,ccc.xlsx
Another way to do the same is:
print(",".join(map(lambda x : x.split("\\")[-1],main_list)))
OUTPUT:
aaa.xlsx,bbb.xlsx,ccc.xlsx
Note that os.path.basename is OS-dependent and may cause problems in cross-platform scripts: on POSIX systems it does not treat the backslash as a path separator.
Using os.path.basename with str.join
Ex:
import os
data = ['D:\\abc\\pqr\\123\\aaa.xlsx', 'D:\\abc\\pqr\\123\\bbb.xlsx', 'D:\\abc\\pqr\\123\\ccc.xlsx']
print(",".join(os.path.basename(i) for i in data))
Output:
aaa.xlsx,bbb.xlsx,ccc.xlsx
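If the script may run on a non-Windows machine, one possible way around the OS dependence of os.path.basename is pathlib.PureWindowsPath from the standard library, which parses Windows-style paths on any platform. This is a sketch of a different technique from the answers above, not something they proposed:

```python
from pathlib import PureWindowsPath

data = ['D:\\abc\\pqr\\123\\aaa.xlsx', 'D:\\abc\\pqr\\123\\bbb.xlsx', 'D:\\abc\\pqr\\123\\ccc.xlsx']

# PureWindowsPath always applies Windows path rules, whatever OS runs the script.
names = ",".join(PureWindowsPath(p).name for p in data)
print(names)  # aaa.xlsx,bbb.xlsx,ccc.xlsx
```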

Splitting a list entry in Python

I am importing a CSV file into a list in Python. When I split it into list elements and then print an index, the entry is printed like this:
2000-01-03,3.745536,4.017857,3.631696,3.997768,2.695920,133949200
How would I split this list so that I could print just a single element, like this?
2000-01-03
Here is my code so far.
def main():
    list = []
    filename = "AAPL.csv"
    with open(filename) as x:
        for line in x.readlines():
            val = line.strip('\n').split(',')
            list.append(val)
    print(list[2])
Your current code builds a list of lists: more precisely, a list (of rows) of lists (of fields).
To extract a single element, say the first field of the third row, you could do:
...
print(list[2][0])
Except for trivial tasks, however, you should use the csv module when processing CSV files, because it is robust to corner cases such as newlines or field separators contained in fields. Your code could become:
import csv

def main():
    list = []
    filename = "AAPL.csv"
    # newline='' is what the csv documentation recommends when opening files
    with open(filename, newline='') as x:
        rd = csv.reader(x)
        for val in rd:  # the reader is an iterator of lists of fields
            list.append(val)
    print(list[2][0])
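To illustrate the corner case mentioned above, here is a small sketch comparing a plain split(',') with csv.reader on a row whose field contains a comma. The sample data is made up for the illustration and is not taken from AAPL.csv:

```python
import csv
import io

# Hypothetical sample: the second field contains a comma inside quotes.
sample = 'date,note,close\n2000-01-03,"split, adjusted",3.997768\n'

# Naive approach: split every line on commas.
naive = [line.split(',') for line in sample.splitlines()]
# csv.reader understands the quoting and keeps the field intact.
parsed = list(csv.reader(io.StringIO(sample)))

print(naive[1])   # ['2000-01-03', '"split', ' adjusted"', '3.997768']
print(parsed[1])  # ['2000-01-03', 'split, adjusted', '3.997768']
```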

How to separate a string into 2 list of lists

I got this string:
\n
\n
N\tO\tHP\tM\tD\tU\tI\tN\tO\n
E\tS\tA\tE\tI\tT\tL\tN\tI\tN\n
N\tP\tN\tN\tN\tG\tAO\tD\tC\n
\n
\n
PERMANENTE
PETTINE
\n
\n
Actually, if you look at the original string, you cannot see the \t and \n; I just added them to make it easier to understand.
What I'm trying to do is separate it into 2 different lists of lists, for example:
lists1 = [[NOHPMDUINO][ESAEITLNIN][NPNNNGAODC]]
lists2 = [[PERMANENTE][PETTINE]]
I tried many methods to solve this, but without success.
At first I removed the newlines at the beginning with the .strip('\n') method, and I tried to use replace, but I don't know how to do it right.
Thank you zsomko and snakecharmerb.
Using zsomko's method and adding strip() to remove the newlines at the beginning, here is the loop I wrote to divide the result into 2 variables:
group1, group2 = [], []
var = True
for line in t:
    if line != ['']:
        if var:
            group1.append(line)
        else:
            group2.append(line)
    else:
        var = False
I hope this will help someone :) If somebody has a better, more efficient solution, I would like to hear it.
First eliminate the tabs and split the string into lines:
lines = [line.replace('\t', '') for line in string.splitlines()]
Then the following would yield the list of lists in the variable groups as expected:
groups = []
group = []
for line in lines:
    if group and not line:
        groups.append(group)
        group = []
    elif line:
        group.append(line)
# keep the last group if the string does not end with a blank line
if group:
    groups.append(group)
You can break the string into separate lines using its splitlines method - this will give you a list of lines without their terminating newline ('\n') characters.
Then you can loop over the list and replace the tab characters with empty strings using the str.replace method.
>>> for line in s.splitlines():
...     if not line:
...         # Skip empty lines
...         continue
...     cleaned = line.replace('\t', '')
...     print(cleaned)
...
NOHPMDUINO
ESAEITLNIN
NPNNNGAODC
PERMANENTE
PETTINE
Grouping the output in lists of lists is a little trickier. The question doesn't mention the criteria for grouping, so let's assume that lines which are not separated by empty lines should be listed together.
We can use a generator to iterate over the string, group adjacent lines and emit them as lists like this:
>>> def g(s):
...     out = []
...     for line in s.splitlines():
...         if not line:
...             if out:
...                 yield out
...                 out = []
...             continue
...         cleaned = line.replace('\t', '')
...         out.append([cleaned])
...     if out:
...         yield out
...
>>>
The generator collects lines in a list (out) which it yields each time it finds a blank line and the list is not empty; if the list is yielded it is replaced with an empty list. After looping over the lines in the string it yields the list again, if it isn't empty, in case the string didn't end with blank lines.
Looping over the generator returns the lists of lists in turn.
>>> for x in g(s):print(x)
...
[['NOHPMDUINO'], ['ESAEITLNIN'], ['NPNNNGAODC']]
[['PERMANENTE'], ['PETTINE']]
Alternatively, if you want a list of lists of lists, call list on the generator:
>>> lists = list(g(s))
>>> print(lists)
[[['NOHPMDUINO'], ['ESAEITLNIN'], ['NPNNNGAODC']], [['PERMANENTE'], ['PETTINE']]]
If you want to assign the result to named variables, you can unpack the call to list:
>>> group1, group2 = list(g(s))
>>> group1
[['NOHPMDUINO'], ['ESAEITLNIN'], ['NPNNNGAODC']]
>>> group2
[['PERMANENTE'], ['PETTINE']]
Note that this only works if you know in advance how many lists will be generated.
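As a more compact alternative to the loops above, the same grouping by blank lines could be sketched with itertools.groupby. This is a different technique from the answers given here; the function name split_groups is hypothetical, the sample string is reconstructed from the question, and each group holds plain strings rather than single-element lists:

```python
from itertools import groupby

def split_groups(text):
    """Group consecutive non-empty lines; blank lines act as separators."""
    lines = (line.replace('\t', '') for line in text.splitlines())
    # groupby clusters adjacent lines by truthiness; keep only non-empty runs.
    return [list(run) for has_text, run in groupby(lines, key=bool) if has_text]

s = ("\n\nN\tO\tHP\tM\tD\tU\tI\tN\tO\n"
     "E\tS\tA\tE\tI\tT\tL\tN\tI\tN\n"
     "N\tP\tN\tN\tN\tG\tAO\tD\tC\n"
     "\n\nPERMANENTE\nPETTINE\n\n\n")
group1, group2 = split_groups(s)
print(group1)  # ['NOHPMDUINO', 'ESAEITLNIN', 'NPNNNGAODC']
print(group2)  # ['PERMANENTE', 'PETTINE']
```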

using Python how to remove redundancy from rows of text file

Hello guys, I am using the RCV1 dataset. I want to remove duplicate words or tokens from the text file, but I am not sure how to do it. These are not duplicate rows; they are repeated words within articles. I am using Python; please help me with this. Please see the attached image to get an idea of the text file.
Assuming that the words in the text file are separated only by blank spaces (i.e., no attached commas or periods), the following code should work for you.
items = []
with open("data.txt") as f:
    for line in f:
        items += line.split()
newItemList = list(set(items))
If you would like to have the items as a single string:
newItemList = " ".join(list(set(items)))
If you want the order to be preserved as well, then do:
newItemList = []
for item in items:
    if item not in newItemList:
        newItemList += [item]
newItemList = " ".join(newItemList)
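The order-preserving loop above searches a list for every word, which can become slow on large files. A possible alternative, sketched here with made-up sample tokens instead of the real file, relies on the fact that dict keys are unique and keep insertion order in Python 3.7+:

```python
# Hypothetical sample tokens standing in for the words read from the file.
items = "the cat saw the dog and the cat ran".split()

# dict.fromkeys keeps the first occurrence of each word, in order.
unique_in_order = list(dict.fromkeys(items))
print(" ".join(unique_in_order))  # the cat saw dog and ran
```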
