Appending a sublist to another list in python - python-3.x

import csv
with open('LBP_for_paper.csv','r') as csvDataFile:
    datarows = csv.reader(csvDataFile, delimiter=',', quotechar='|')
    nofinding=[]
    rawrow=[]
    for row in datarows:
        if row[1]=='No Finding' and row[2]=='1':
            rawrow = list((row[0]+","+row[1]+","+row[2]+","+row[17]+","+row[18]))
            nofinding.append(rawrow)
print(nofinding[:2])
I am reading rows from a CSV file and want to build a customized nested list from certain columns. I expected
list((row[0]+","+row[1]+","+row[2]+","+row[17]+","+row[18]))
to return a list like
['00030805_000.png,No Finding,1,34777,69373']
which is stored in rawrow and then appended to a bigger list, nofinding. Instead, I am getting this output:
[['0', '0', '0', '3', '0', '8', '0', '5', '_', '0', '0', '0', '.',
'p', 'n', 'g', ',', 'N', 'o', ' ', 'F', 'i', 'n', 'd', 'i', 'n', 'g',
',', '1', ',', '3', '4', '7', '7', '7', ',', '6', '9', '3', '7', '3'],
['0', '0', '0', '3', '0', '8', '0', '4', '_', '0', '0', '0', '.', 'p',
'n', 'g', ',', 'N', 'o', ' ', 'F', 'i', 'n', 'd', 'i', 'n', 'g', ',',
'1', ',', '3', '5', '4', '0', '5', ',', '6', '3', '0', '8', '8']]
Desired output:
[ ['00030805_000.png,No Finding,1,34777,69373'], ['00030804_000.png,No Finding,1,35405,63088'] ]
Thank you

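Your issue is that rawrow = list((row[0]+","+row[1]+","+row[2]+","+row[17]+","+row[18])) turns the string into a list of its characters, because list() iterates over a string one character at a time.
If you want to keep it as a comma-delimited string, replace that line with the following (wrap the right-hand side in square brackets if each entry should be a one-element list, as in your desired output):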
rawrow = row[0]+","+row[1]+","+row[2]+","+row[17]+","+row[18]
or more cleanly:
rawrow = ",".join([row[row_index] for row_index in [0, 1, 2, 17, 18]])
I am curious, though, why you want:
[ ['00030805_000.png,No Finding,1,34777,69373'], ['00030804_000.png,No Finding,1,35405,63088'] ]
instead of this (csv.reader yields every field as a string):
[ ['00030805_000.png','No Finding','1','34777','69373'], ['00030804_000.png','No Finding','1','35405','63088'] ]
which you could achieve with the following:
rawrow = []
for row_index in [0, 1, 2, 17, 18]:
    rawrow.append(row[row_index])  # each field is already a separate string, so no split is needed
or in one line:
rawrow = [row[row_index] for row_index in [0, 1, 2, 17, 18]]
Furthermore, your whole code could be consolidated as follows:
import csv
with open('LBP_for_paper.csv','r') as csvDataFile:
    datarows = csv.reader(csvDataFile, delimiter=',', quotechar='|')
    # The comprehension must run inside the with block, because csv.reader reads the file lazily
    nofinding = [",".join([row[row_index] for row_index in [0, 1, 2, 17, 18]]) for row in datarows if row[1]=='No Finding' and row[2]=='1']
print(nofinding[:2])
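The difference between list(some_string) and [some_string] can be checked without the original file. The following is a minimal sketch, assuming a hypothetical in-memory CSV with 19 columns in place of LBP_for_paper.csv; the make_row helper and the filler values are made up for illustration, and only the column positions 0, 1, 2, 17 and 18 come from the question.

```python
import csv
import io

def make_row(name, label, flag, a, b):
    # Hypothetical helper: 19 columns, with filler in the columns the question does not show.
    return [name, label, flag] + ["x"] * 14 + [a, b]

# In-memory stand-in for the real CSV file (an assumption for this sketch).
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(make_row("00030805_000.png", "No Finding", "1", "34777", "69373"))
writer.writerow(make_row("00030806_000.png", "Effusion", "1", "11111", "22222"))
writer.writerow(make_row("00030804_000.png", "No Finding", "1", "35405", "63088"))
sample.seek(0)

wanted_columns = [0, 1, 2, 17, 18]
nofinding = []
for row in csv.reader(sample):
    if row[1] == "No Finding" and row[2] == "1":
        joined = ",".join(row[i] for i in wanted_columns)
        # [joined] wraps the whole string in a one-element list;
        # list(joined) would split it into single characters instead.
        nofinding.append([joined])

print(nofinding)
```

With these sample rows, nofinding has the nested shape asked for in the question: one single-string sublist per matching row.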

import csv
with open('LBP_for_paper.csv','r') as csvDataFile:
    datarows = csv.reader(csvDataFile, delimiter=',', quotechar='|')
    rawrow = []
    nofindings=[]
    for row in datarows:
        if row[1]=='No Finding' and row[2]=='1':
            # ''.join() on a single string returns the same string, so the fields are used directly
            rawrow = [row[row_index] for row_index in [0, 1, 2, 17, 18]]
            nofindings.append(rawrow)
print(nofindings[:3])
This solved my issue.

Related

Python - how to reassign cell values in a DataFrame given a list? - finding a fast way to achieve it in a big data table

I have a big table of size 5,905,635 × 30 (see figure 1) and a list with 5,905,635 elements, one per row (see figure 2). I want to reassign cell values in the table based on the elements of the list (see figure 3).
figure 1
figure 2
figure 3
For example, in the code below, I want to get df2 given df1 and list1. An easy way is to loop over the elements of list1: the first element is 'B', so set the first row of column B to 1 in df1; the second element is 'C', so set the second row of column C to 1; and so on. The final result should be df2. The problem is that this solution is too slow for a big table. I wonder if there is a fast way to achieve this goal.
import pandas as pd
df1 = pd.DataFrame({'A': ['0', '0','0', '0', '0', '0', '0'],
                    'B': ['0', '0','0', '0', '0', '0', '0'],
                    'C': ['0', '0','0', '0', '0', '0', '0'],
                    'D': ['0', '0','0', '0', '0', '0', '0'],
                    'E': ['0', '0','0', '0', '0', '0', '0']})
list1 = ['B','C','A','E','D','A','D']
df2 = pd.DataFrame({'A': ['0', '0','1', '0', '0', '1', '0'],
                    'B': ['1', '0','0', '0', '0', '0', '0'],
                    'C': ['0', '1','0', '0', '0', '0', '0'],
                    'D': ['0', '0','0', '0', '1', '0', '1'],
                    'E': ['0', '0','0', '1', '0', '0', '0']})

Adding a string to an array adds all characters separately

Any suggestions on how to merge these strings so that the two-digit numbers do not get split?
Sorry for my bad English.
def merge(strArr):
    newList = []
    for x in range(len(strArr)):
        newList += strArr[x]
    return newList
array_test = ["1, 3, 4, 7, 13", "1, 2, 4, 13, 15"]
print(merge(array_test))
Output: ['1', ',', ' ', '3', ',', ' ', '4', ',', ' ', '7', ',', ' ', '1', '3', '1', ',', ' ', '2', ',', ' ', '4', ',', ' ', '1', '3', ',', ' ', '1', '5']
Expected output: [1, 3, 4, 7, 13, 1, 2, 4, 13, 15]
Using list comprehension:
merged_arr = [n for s in array_test for n in s.split(", ")]
print(merged_arr)
This prints:
['1', '3', '4', '7', '13', '1', '2', '4', '13', '15']
It merges this way because, for lists, += is a concatenation that accepts any iterable, and in this context your string is treated as a sequence of characters:
lst = []
lst += "Hello"
# Equivalent to
# lst += ["H", "e", "l", "l", "o"]
If you want to join the strings into a single string, you can do:
out = ", ".join(array_test)
Your result looks the way it does because you take each inner string and add every character of it to the returned list, without removing any spaces or commas.
You can change your code to:
def merge(strArr):
    new_list = []
    for inner in strArr:
        # split into tokens first, then extend (+= on a list behaves like extend, not append)
        new_list.extend(inner.split(", "))
    return new_list
array_test =["1, 3, 4, 7, 13", "1, 2, 4, 13, 15"]
merged = merge(array_test)
as_int = list(map(int,merged))
print(merged)
print(as_int)
Output:
['1', '3', '4', '7', '13', '1', '2', '4', '13', '15']
[1, 3, 4, 7, 13, 1, 2, 4, 13, 15]
Without the conversion stored in as_int you will still have strings in your list; you need to convert them to integers.
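The behaviour the answers above rely on can be checked side by side. This is a small sketch with a made-up input string (numbers); the variable names are illustrative only and do not come from the question.

```python
numbers = "13, 15"

# += (like extend) iterates over the right-hand side, so a bare string
# contributes one element per character.
by_plus_equals = []
by_plus_equals += numbers

# append adds its argument as a single element, whatever it is.
by_append = []
by_append.append(numbers)

# Splitting first yields whole tokens; extend then adds one element per token.
by_split = []
by_split.extend(numbers.split(", "))

# Convert the tokens to integers once they are separated.
as_int = [int(token) for token in by_split]

print(by_plus_equals)  # ['1', '3', ',', ' ', '1', '5']
print(by_append)       # ['13, 15']
print(by_split)        # ['13', '15']
print(as_int)          # [13, 15]
```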

Trying to separate items in list by character but output returns multiple times

I am trying to separate the items in a list by character. This works, but whenever I run the code it shows the separated items multiple times. How can I fix this?
I've already tried using range in a for loop, but that hasn't worked. The only thing that gives any output is using
for character in x
My code:
def rle():
    askq = int(input("How many lines of RLE compressed data do you want to enter?"))
    if askq < 2:
        print("You must enter at least 2 lines of RLE compressed data.")
        rle()
    print("Please enter your RLE compressed data one line at a time")
    lines = []
    for i in range (0, askq):
        i = input("Which lines would you like to convert?")
        lines.append(i)
    num=0
    lines_input = [1,num]
    lines2 = []
    x = []
    for i in range(0,askq):
        num+=1
        if num in lines_input:
            x.append(lines[i])
        for x in lines:
            for character in x:
                lines2.append(character)
        print(lines2)
rle()
I expect the output of lines2 to be
["0","1","d","6","1"," ","0","1","b"]
but instead I get
['0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b']
['0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b']
['0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b', '0', '1', 'd', '6', '1', ' ', '0', '1', 'b']
Try this updated version:
def rle():
    askq = int(input("How many lines of RLE compressed data do you want to enter?"))
    if askq < 2:
        print("You must enter at least 2 lines of RLE compressed data.")
        return rle()  # return here so the rejected count is not processed after the retry
    print("Please enter your RLE compressed data one line at a time")
    lines = []
    for i in range (0, askq):
        i = input("Which lines would you like to convert?")
        lines.append(i)
    new_list = []
    for i in lines:
        new_list.extend(list(i))  # add every character of this line exactly once
    print(new_list)
rle()
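The repeated output most likely comes from flattening all lines inside another loop that also runs once per line. Here is a minimal sketch of both shapes, assuming two hypothetical input lines in place of the values read with input():

```python
# Hypothetical input lines; the question reads them with input() instead.
lines = ["01d61 01b", "02a"]

# Shape that repeats: the flattening loop sits inside another loop that runs
# once per line, so every character is added len(lines) times.
repeated = []
for _ in range(len(lines)):
    for line in lines:
        for character in line:
            repeated.append(character)

# Shape that does not repeat: a single pass over the lines.
flat = [character for line in lines for character in line]

print(len(repeated), len(flat))  # 24 12
print(flat)
```

The single-pass comprehension gives the same result as calling extend(list(line)) for each line.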

How to add items from one list to another?

I have two lists in Python, as shown below:
ls1
['A', 4, 'M', '1', 128.2, 169.818, '2019-02-27']
['B', 4, 'M', '1', 169.818, 172.3, '2019-02-25']
ls2
['2019-02-27','2019-02-25']
When I try to add the date items from the other list, each date is not added to its row; instead it is added as a separate element, like below:
ls3
['A', 4, 'M', '1', 128.2, 169.818, '2019-02-27'],
'2019-02-27',
['B', 4, 'M', '1', 169.818, 172.3, '2019-02-25'],
'2019-02-25'
I need ls3 to look like this instead:
['A', 4, 'M', '1', 128.2, 169.818, '2019-02-27','2019-02-27']
['B', 4, 'M', '1', 169.818, 172.3, '2019-02-25','2019-02-25']
You can use a list comprehension:
ls1 = [['A', 4, 'M', '1', 128.2, 169.818, '2019-02-27'], ['B', 4, 'M', '1', 169.818, 172.3, '2019-02-25']]
ls2 = ['2019-02-27','2019-02-25']
new_ls1 = [l1 + [l2] for l1, l2 in zip(ls1, ls2)]
A hackier way (not faster, so prefer the first one):
new_ls1 = list(map(list, zip(*zip(*ls1), ls2)))
Or, if you would like, you can operate in-place:
for i, item in enumerate(ls2):
    ls1[i].append(item)
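The question does not show the code that produced the unwanted ls3, so the first loop below is only an assumption about how that shape can arise: appending the row and the date to the outer list separately. The second part repeats the list-comprehension approach and shows that it leaves ls1 unchanged.

```python
ls1 = [['A', 4, 'M', '1', 128.2, 169.818, '2019-02-27'],
       ['B', 4, 'M', '1', 169.818, 172.3, '2019-02-25']]
ls2 = ['2019-02-27', '2019-02-25']

# Assumed mistake (not taken from the question): the row and its date are
# both appended to the OUTER list, so they end up as siblings.
flat = []
for row, date in zip(ls1, ls2):
    flat.append(row)
    flat.append(date)

# Intended result: a new row with the date appended to the INNER list.
# row + [date] builds a new list, so ls1 itself is not modified.
ls3 = [row + [date] for row, date in zip(ls1, ls2)]

print(len(flat))  # 4 elements: row, date, row, date
print(ls3)
```

Use the in-place loop with ls1[i].append(item) instead if ls1 itself should be updated.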

Mapping between matrices

I have 2 matrices:
list_alpha = [['a'],
              ['b'],
              ['c'],
              ['d'],
              ['e']]
list_beta = [['1', 'a', 'e', 'b'],
             ['2', 'd', 'X', 'X'],
             ['3', 'a', 'X', 'X'],
             ['4', 'd', 'a', 'c']]
My goal: if a letter from list_alpha appears in a sublist of list_beta, then the first element of that sublist (the number) is added to the matching row of list_alpha.
So my output would be:
final_list = [['a', '1', '3', '4'],
['b', '1'],
['c', '4'],
['d', '2', '4'],
['e', '1']]
But I'm pretty new to python and coding in general and I'm not sure how to do this. Is there a way to code this? Or do I have to change the way the data is stored in either list?
Edit:
Changing list_alpha to a dictionary helped!
Final code:
dict_alpha = {'a': [], 'b': [], 'c': [], 'd': [], 'e':[]}
list_beta = [['1', 'a', 'e', 'b'],
             ['2', 'd', 'X', 'X'],
             ['3', 'a', 'X', 'X'],
             ['4', 'd', 'a', 'c'],
             ['5', 'X', 'X', 'e'],
             ['6', 'c', 'X', 'X']]
for letter in dict_alpha:
    for item in list_beta:
        if letter in item:
            dict_alpha[letter].append(item[0])
print(dict_alpha)
You can also keep the nested-list structure of list_alpha; you only need to fix your for loop.
For example:
list_alpha = [['a'],
              ['b'],
              ['c'],
              ['d'],
              ['e']]
list_beta = [['1', 'a', 'e', 'b'],
             ['2', 'd', 'X', 'X'],
             ['3', 'a', 'X', 'X'],
             ['4', 'd', 'a', 'c'],
             ['5', 'X', 'X', 'e'],
             ['6', 'c', 'X', 'X']]
for al in list_alpha:
    for bt in list_beta:
        # start at index 1 so the row number itself is never compared with the letter
        for i in range(1, len(bt)):
            if bt[i] == al[0]:
                al.append(bt[0])
print(list_alpha)
Output:
[['a', '1', '3', '4'],
['b', '1'],
['c', '4', '6'],
['d', '2', '4'],
['e', '1', '5']]
Hope this helps!
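Both versions above scan list_beta once per letter. As an alternative, here is a sketch that walks list_beta a single time and uses a dictionary for lookup, then rebuilds the nested-list shape. The names wanted, index and final_list are illustrative, and it assumes each letter should be recorded at most once per row.

```python
list_alpha = [['a'], ['b'], ['c'], ['d'], ['e']]
list_beta = [['1', 'a', 'e', 'b'],
             ['2', 'd', 'X', 'X'],
             ['3', 'a', 'X', 'X'],
             ['4', 'd', 'a', 'c'],
             ['5', 'X', 'X', 'e'],
             ['6', 'c', 'X', 'X']]

# One empty list per wanted letter, in the order given by list_alpha.
wanted = [row[0] for row in list_alpha]
index = {letter: [] for letter in wanted}

# Single pass over list_beta: the first element is the row number,
# the remaining elements are the letters in that row.
for row_id, *letters in list_beta:
    for letter in set(letters):  # set(): count a letter once per row (assumption)
        if letter in index:      # placeholders such as 'X' are skipped
            index[letter].append(row_id)

# Back to the nested-list shape from the question.
final_list = [[letter, *ids] for letter, ids in index.items()]
print(final_list)
```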
