array([
['192', '895'],
['14', '269'],
['1', '23'],
['1', '23'],
['50', '322'],
['19', '121'],
['17', '112'],
['12', '72'],
['2', '17'],
['5,250', '36,410'],
['2,546', '17,610'],
['882', '6,085'],
['571', '3,659'],
['500', '3,818'],
['458', '3,103'],
['151', '1,150'],
['45', '319'],
['44', '335'],
['30', '184']
])
How can I remove some of the rows so that the array looks like this:
Table3=array([
['192', '895'],
['14', '269'],
['1', '23'],
['50', '322'],
['17', '112'],
['12', '72'],
['2', '17'],
['5,250', '36,410'],
['882', '6,085'],
['571', '3,659'],
['500', '3,818'],
['458', '3,103'],
['45', '319'],
['44', '335'],
['30', '184']
])
I removed the rows at index 2, 4, 6. I am not sure how I should do it. I have tried a few ways, but none of them worked.
It seems like you actually deleted indices 2, 5, 10, and 15 (not 2, 4, and 6). To do this you can use np.delete, pass it a list of the indices you want to delete, and apply it along axis=0 (arr here is the original array):
Table3 = np.delete(arr, [2, 5, 10, 15], axis=0)
>>> Table3
array([['192', '895'],
       ['14', '269'],
       ['1', '23'],
       ['50', '322'],
       ['17', '112'],
       ['12', '72'],
       ['2', '17'],
       ['5,250', '36,410'],
       ['882', '6,085'],
       ['571', '3,659'],
       ['500', '3,818'],
       ['458', '3,103'],
       ['45', '319'],
       ['44', '335'],
       ['30', '184']],
      dtype='<U6')
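If the data is a plain list of lists rather than a NumPy array, the same idea can be sketched without NumPy. This is only an illustration: the names rows, drop and kept are made up here, and only the first six rows of the question's data are used.

```python
# First six rows of the question's data, as a plain list of lists
rows = [['192', '895'], ['14', '269'], ['1', '23'],
        ['1', '23'], ['50', '322'], ['19', '121']]

# Indices to remove (a subset chosen for illustration)
drop = {2, 5}

# Keep every row whose position is not in `drop`
kept = [row for i, row in enumerate(rows) if i not in drop]
print(kept)
# [['192', '895'], ['14', '269'], ['1', '23'], ['50', '322']]
```

A set is used for drop so that each membership test is cheap, which matters only when many indices are removed.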
Below are two lists, lst1 and lst2, followed by the expected output.
lst1 = ['q','r','s','t','u','v','w','x','y','z']
lst2 =['1','2','3']
Output expected
[['q','1'], ['r','2'], ['s','3'], ['t','1'],['u','2'],['v','3'],['w','1'],['x','2'],['y','3'],
['z','1']]
This is a very simple approach to this problem.
lst1 = ['q','r','s','t','u','v','w','x','y','z']
lst2 = ['1','2','3']
new_list = []
for x in range(len(lst1)):
    new_list.append([lst1[x], lst2[x % 3]])
print(new_list) # [['q', '1'], ['r', '2'], ['s', '3'], ['t', '1'], ['u', '2'], ['v', '3'], ['w', '1'], ['x', '2'], ['y', '3'], ['z', '1']]
You could also use a list comprehension in this case, like so:
new_list = [[lst1[x], lst2[x % 3]] for x in range(len(lst1))]
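Both versions above hard-code the length of lst2 as 3. A small variation, sketched here with enumerate and len(lst2) (the name paired is just illustrative), keeps working if lst2 changes size:

```python
lst1 = ['q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
lst2 = ['1', '2', '3']

# i % len(lst2) wraps around to the start of lst2 whenever it runs out
paired = [[letter, lst2[i % len(lst2)]] for i, letter in enumerate(lst1)]
print(paired)
```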
You can use zip() and itertools.cycle().
from itertools import cycle
lst1 = ['q','r','s','t','u','v','w','x','y','z']
lst2 =['1','2','3']
result = [[letter, number] for letter, number in zip(lst1, cycle(lst2))]
print(result)
Expected output:
[['q', '1'], ['r', '2'], ['s', '3'], ['t', '1'], ['u', '2'], ['v', '3'], ['w', '1'], ['x', '2'], ['y', '3'], ['z', '1']]
Another solution would be to additionally use map().
result = list(map(list, zip(lst1, cycle(lst2))))
In case you want tuples instead of lists, you could just do
from itertools import cycle
lst1 = ['q','r','s','t','u','v','w','x','y','z']
lst2 =['1','2','3']
result = list(zip(lst1, cycle(lst2)))
print(result)
which would give you
[('q', '1'), ('r', '2'), ('s', '3'), ('t', '1'), ('u', '2'), ('v', '3'), ('w', '1'), ('x', '2'), ('y', '3'), ('z', '1')]
l1= [['1', 'apple', '1', '2', '1', '0', '0', '0'], ['1',
'cherry', '1', '1', '1', '0', '0', '0']]
l2 = [['1', 'cherry', '2', '1'],
['1', 'plums', '2', '15'],
['1', 'orange', '2', '15'],
['1', 'cherry', '2', '1'],
['1', 'cherry', '2', '1']]
output = []
for i in l1:
    for j in l2:
        if i[1] != j[1]:
            output.append(j)
            break
print(output)
Expected Output:
[['1', 'plums', '2', '15'], ['1', 'orange', '2', '15']]
How to stop iteration and find unique elements and get the sublist?
To find the elements in L2 that are not in L1 based on the fruit name:
l1= [[1,'apple',3],[1,'cherry',4]]
l2 = [[1,'apple',3],[1,'plums',4],[1,'orange',3],[1,'apple',4]]
output = []
for e in l2:
    if e[1] not in [f[1] for f in l1]:  # search by matching fruit
        output.append(e)
print(output)
Output
[[1, 'plums', 4], [1, 'orange', 3]]
You can store all the unique fruit names from list1 in a new list, then check for each element of list2 whether its name exists in the new list. Something like:
newlist = []
for item in l1:
    if item[1] not in newlist:
        newlist.append(item[1])  # store the name, not the whole sublist
output = []
for item in l2:
    if item[1] not in newlist:
        output.append(item)
print(output)
This is slightly inefficient but really straightforward to understand.
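The inefficiency comes from testing membership against a list, which scans it on every lookup. A sketch of the same idea with a set of names (seen_names is an illustrative name, and the data is the small example from the answer above):

```python
l1 = [[1, 'apple', 3], [1, 'cherry', 4]]
l2 = [[1, 'apple', 3], [1, 'plums', 4], [1, 'orange', 3], [1, 'apple', 4]]

# Collect the fruit names from l1 once; set lookups are O(1) on average
seen_names = {item[1] for item in l1}

# Keep the sublists of l2 whose fruit name never appears in l1
output = [item for item in l2 if item[1] not in seen_names]
print(output)  # [[1, 'plums', 4], [1, 'orange', 3]]
```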
I have a text file that has lines in following order:
1 id:0 e1:"a" e2:"b"
0 id:0 e1:"4" e2:"c"
0 id:1 e1:"6" e2:"d"
2 id:2 e1:"8" e2:"f"
2 id:2 e1:"9" e2:"f"
2 id:2 e1:"d" e2:"k"
and I have to extract a list of lists containing elements (e1,e2) with id determining the index of the outer list and inner list following the order of the lines. So in the above case my output will be
[[("a","b"),("4","c")],[("6","d")],[("8","f"),("9","f"),("d","k")]]
The problem for me is that, to know where a new inner list begins, I need to check whether the id value has changed. Each id does not have a fixed number of elements. For example, id:0 has 2, id:1 has 1 and id:2 has 3. Is there an efficient way to check this condition on the next line while building the list?
You can use itertools.groupby() for the job:
import itertools
def split_by(
        items,
        key=None,
        processing=None,
        container=list):
    for key_value, grouping in itertools.groupby(items, key):
        if processing:
            grouping = (processing(group) for group in grouping)
        if container:
            grouping = container(grouping)
        yield grouping
to be called as:
from operator import itemgetter
list(split_by(items, itemgetter(0), itemgetter(slice(1, None))))
The items can be easily generated from the text above (assuming it is contained in the file data.txt):
def get_items(filename='data.txt'):
    # with io.StringIO(text) as file_obj:  # to read from `text`
    with open(filename, 'r') as file_obj:  # to read from `filename`
        for line in file_obj:
            if line.strip():
                vals = line.replace('"', '').split()
                yield tuple(val.split(':')[1] for val in vals[1:])
Finally, to test all the pieces (where open(filename, 'r') in get_items() is replaced by io.StringIO(text)):
import io
import itertools
from operator import itemgetter
text = """
1 id:0 e1:"a" e2:"b"
0 id:0 e1:"4" e2:"c"
0 id:1 e1:"6" e2:"d"
2 id:2 e1:"8" e2:"f"
2 id:2 e1:"9" e2:"f"
2 id:2 e1:"d" e2:"k"
""".strip()
print(list(split_by(get_items(), itemgetter(0), itemgetter(slice(1, None)))))
# [[('a', 'b'), ('4', 'c')], [('6', 'd')], [('8', 'f'), ('9', 'f'), ('d', 'k')]]
This efficiently iterates through the input without unnecessary memory allocation.
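One caveat worth illustrating: itertools.groupby() only groups consecutive items that share a key, which is fine for the file in the question because lines with the same id are adjacent. The sketch below uses made-up tuples (the trailing ('0', 'x', 'y') row is hypothetical) to show what happens when an id reappears later, and how sorting by the key first changes the result.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical items: id '0' shows up again after id '1'
items = [('0', 'a', 'b'), ('0', '4', 'c'), ('1', '6', 'd'), ('0', 'x', 'y')]
by_id = itemgetter(0)

# groupby starts a new group every time the key changes
as_is = [[t[1:] for t in group] for _, group in groupby(items, key=by_id)]
print(as_is)
# [[('a', 'b'), ('4', 'c')], [('6', 'd')], [('x', 'y')]]

# sorted() is stable, so sorting by id first merges the split groups
# while keeping the original order inside each id
merged = [[t[1:] for t in group]
          for _, group in groupby(sorted(items, key=by_id), key=by_id)]
print(merged)
# [[('a', 'b'), ('4', 'c'), ('x', 'y')], [('6', 'd')]]
```

Sorting loads everything into memory, so it gives up the streaming behaviour described above; it is only needed when equal ids are not guaranteed to be adjacent.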
No other packages are required
Load and parse the file:
Beginning with a text file, formatted as shown in the question
# parse text file into dict
with open('test.txt', 'r') as f:
    text = [line[2:].replace('"', '').strip().split() for line in f.readlines()]  # clean each line and split it into a list
    text = [[v.split(':') for v in t] for t in text]  # split each value in the list into a list
    d = [{v[0]: v[1] for v in t} for t in text]  # convert lists to dicts
# text will appear as:
[[['id', '0'], ['e1', 'a'], ['e2', 'b']],
[['id', '0'], ['e1', '4'], ['e2', 'c']],
[['id', '1'], ['e1', '6'], ['e2', 'd']],
[['id', '2'], ['e1', '8'], ['e2', 'f']],
[['id', '2'], ['e1', '9'], ['e2', 'f']],
[['id', '2'], ['e1', 'd'], ['e2', 'k']]]
# d appears as:
[{'id': '0', 'e1': 'a', 'e2': 'b'},
{'id': '0', 'e1': '4', 'e2': 'c'},
{'id': '1', 'e1': '6', 'e2': 'd'},
{'id': '2', 'e1': '8', 'e2': 'f'},
{'id': '2', 'e1': '9', 'e2': 'f'},
{'id': '2', 'e1': 'd', 'e2': 'k'}]
Parse the list of dicts to expected output
Use .get to determine if a key exists, and return some specified value, None in this case, if the key is nonexistent.
dict.get defaults to None, so this method never raises a KeyError.
If None is a value in the dictionary, then change the default value returned by .get.
test.get(v[0], 'something here')
test = dict()
for r in d:
    v = list(r.values())
    if test.get(v[0]) is None:
        test[v[0]] = [tuple(v[1:])]
    else:
        test[v[0]].append(tuple(v[1:]))
# test dict appears as:
{'0': [('a', 'b'), ('4', 'c')],
'1': [('6', 'd')],
'2': [('8', 'f'), ('9', 'f'), ('d', 'k')]}
# final output
final = list(test.values())
[[('a', 'b'), ('4', 'c')], [('6', 'd')], [('8', 'f'), ('9', 'f'), ('d', 'k')]]
Code Updated and reduced:
In this case, text is a list of lists, and there's no need to convert it to dict d, as above.
For each list t in text, index [0] is always the key, and index [1:] are the values.
with open('test.txt', 'r') as f:
    text = [line[2:].replace('"', '').strip().split() for line in f.readlines()]  # clean each line and split it into a list
    text = [[v.split(':')[1] for v in t] for t in text]  # list of lists holding only the value at index 1
# text appears as:
[['0', 'a', 'b'],
['0', '4', 'c'],
['1', '6', 'd'],
['2', '8', 'f'],
['2', '9', 'f'],
['2', 'd', 'k']]
test = dict()
for t in text:
    if test.get(t[0]) is None:
        test[t[0]] = [tuple(t[1:])]
    else:
        test[t[0]].append(tuple(t[1:]))
final = list(test.values())
Using defaultdict
Will save a few lines of code
Using text as a list of lists from above
from collections import defaultdict as dd
test = dd(list)
for t in text:
    test[t[0]].append(tuple(t[1:]))
final = list(test.values())
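For comparison, a plain dict with dict.setdefault gives the same grouping without an import. This is a sketch that assumes text is the cleaned list of lists shown earlier in this answer; it is written out inline here so the snippet runs on its own.

```python
# `text` as produced by the parsing step above (written out for the sketch)
text = [['0', 'a', 'b'],
        ['0', '4', 'c'],
        ['1', '6', 'd'],
        ['2', '8', 'f'],
        ['2', '9', 'f'],
        ['2', 'd', 'k']]

test = {}
for key, *values in text:
    # setdefault returns the existing list for `key`, or stores and returns a new one
    test.setdefault(key, []).append(tuple(values))

final = list(test.values())
print(final)
# [[('a', 'b'), ('4', 'c')], [('6', 'd')], [('8', 'f'), ('9', 'f'), ('d', 'k')]]
```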
I am trying to remove sequential duplicates, separated by the delimiter '>', from the journey column, and also aggregate the values under the uu and convs columns.
INPUT
a=[['journey', 'uu', 'convs'],
['Ct', '10', '2'],
['Ct>Ct', '100', '3'],
['Ct>Pt>Ct', '200', '10'],
['Ct>Pt>Ct>Ct', '40', '5'],
['Ct>Pt>Bu', '1000', '8']]
OUTPUT
a=[['journey', 'uu', 'convs'],
['Ct', '110', '5'],
['Ct>Pt>Ct', '240', '15'],
['Ct>Pt>Bu', '1000', '8']]
I tried below to split but it didn't work
a='>'.join(set(a.split()))
You need to split your string by > and then you could use groupby to eliminate duplicate items in your string. For example:
from itertools import groupby
x = ['Ct>Pt>Ct>Ct', '40', '5']
print(">".join([i for i, _ in groupby(x[0].split(">"))]))
# 'Ct>Pt>Ct'
You could use this as a lambda function in another groupby to aggregate the lists. Then sum each element of the same index by using zip. Check it out:
a=[['journey', 'uu', 'convs'],
['Ct', '10', '2'],
['Ct>Ct', '100', '3'],
['Ct>Pt>Ct', '200', '10'],
['Ct>Pt>Ct>Ct', '40', '5'],
['Ct>Pt>Bu', '1000', '8']]
from itertools import groupby
result = [a[0]] # Add header
groups = groupby(
a[1:],
key=lambda x: ">".join([i for i, _ in groupby(x[0].split(">"))])
)
# groups:
# ['Ct', [['Ct', '10', '2'], ['Ct>Ct', '100', '3']]]
# ['Ct>Pt>Ct', [['Ct>Pt>Ct', '200', '10'], ['Ct>Pt>Ct>Ct', '40', '5']]]
# ['Ct>Pt>Bu', [['Ct>Pt>Bu', '1000', '8']]]
for key, items in groups:
    row = [key]
    for i in zip(*items):
        if i[0].isdigit():
            row.append(str(sum(map(int, i))))
    result.append(row)
print(result)
Prints:
[['journey', 'uu', 'convs'],
['Ct', '110', '5'],
['Ct>Pt>Ct', '240', '15'],
['Ct>Pt>Bu', '1000', '8']]
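Because groupby only merges adjacent rows, the approach above relies on rows that collapse to the same journey sitting next to each other, as they do in the sample input. If that cannot be assumed, a dictionary of running totals is one alternative. This is a sketch; collapse and totals are illustrative names, not part of the answer above.

```python
from itertools import groupby

def collapse(journey):
    """Drop consecutive repeats: 'Ct>Pt>Ct>Ct' -> 'Ct>Pt>Ct'."""
    return ">".join(step for step, _ in groupby(journey.split(">")))

a = [['journey', 'uu', 'convs'],
     ['Ct', '10', '2'],
     ['Ct>Ct', '100', '3'],
     ['Ct>Pt>Ct', '200', '10'],
     ['Ct>Pt>Ct>Ct', '40', '5'],
     ['Ct>Pt>Bu', '1000', '8']]

totals = {}  # collapsed journey -> [sum of uu, sum of convs]
for journey, uu, convs in a[1:]:
    sums = totals.setdefault(collapse(journey), [0, 0])
    sums[0] += int(uu)
    sums[1] += int(convs)

# Dicts keep insertion order (Python 3.7+), so rows come out in first-seen order
result = [a[0]] + [[key, str(uu), str(convs)] for key, (uu, convs) in totals.items()]
print(result)
```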
How do I convert this list of lists:
[['0', '1'], ['0', '2'], ['0', '3'], ['1', '4'], ['1', '6'], ['1', '7'], ['1', '9'], ['2', '3'], ['2', '6'], ['2', '8'], ['2', '9']]
To this list of tuples:
[(0, [1, 2, 3]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6, 8, 9])]
I am unsure how to implement this next step? (I can't use dictionaries,
sets, deque, bisect module. You can though, and in fact should, use .sort or sorted functions.)
Here is my attempt:
network= [['10'], ['0 1'], ['0 2'], ['0 3'], ['1 4'], ['1 6'], ['1 7'], ['1 9'], ['2 3'], ['2 6'], ['2 8'], ['2 9']]
network.remove(network[0])
friends=[]
for i in range(len(network)):
    element= (network[i][0]).split(' ')
    friends.append(element)
t=len(friends)
s= len(friends[0])
lst=[]
for i in range(t):
    a= (friends[i][0])
    if a not in lst:
        lst.append(int(a))
        for i in range(t):
            if a == friends[i][0]:
                b=(friends[i][1])
                lst.append([b])
print(tuple(lst))
It outputs:
(0, ['1'], ['2'], ['3'], 0, ['1'], ['2'], ['3'], 0, ['1'], ['2'], ['3'], 1, ['4'], ['6'], ['7'], ['9'], 1, ['4'], ['6'], ['7'], ['9'], 1, ['4'], ['6'], ['7'], ['9'], 1, ['4'], ['6'], ['7'], ['9'], 2, ['3'], ['6'], ['8'], ['9'], 2, ['3'], ['6'], ['8'], ['9'], 2, ['3'], ['6'], ['8'], ['9'], 2, ['3'], ['6'], ['8'], ['9'])
I am very close it seems, not sure what to do??
A simpler method:
l = [['0', '1'], ['0', '2'], ['0', '3'], ['1', '4'], ['1', '6'], ['1', '7'], ['1', '9'], ['2', '3'], ['2', '6'], ['2', '8'], ['2', '9']]
a = sorted(set(i[0] for i in l), key=int)  # sets are unordered; sort so position matches the integer key
b = [(i, []) for i in a]
for i in l:
    b[int(i[0])][1].append(i[1])
print(b)
Output:
[('0', ['1', '2', '3']), ('1', ['4', '6', '7', '9']), ('2', ['3', '6', '8', '9'])]
Alternate Answer (without using set)
l = [['0', '1'], ['0', '2'], ['0', '3'], ['1', '4'], ['1', '6'], ['1', '7'], ['1', '9'], ['2', '3'], ['2', '6'], ['2', '8'], ['2', '9']]
a=[]
for i in l:
    if i[0] not in a:
        a.append(i[0])
b = [(i, []) for i in a]
for i in l:
    b[int(i[0])][1].append(i[1])
print(b)
also outputs
[('0', ['1', '2', '3']), ('1', ['4', '6', '7', '9']), ('2', ['3', '6', '8', '9'])]
You can use Pandas:
import pandas as pd
l = [['0', '1'], ['0', '2'], ['0', '3'], ['1', '4'], ['1', '6'], ['1', '7'], ['1', '9'], ['2', '3'], ['2', '6'], ['2', '8'], ['2', '9']]
df = pd.DataFrame(l).astype(int)  # np.int was removed in NumPy 1.24; cast with the built-in int instead
s = df.groupby(0)[1].apply(list)
list(zip(s.index, s))
Output:
[(0, [1, 2, 3]), (1, [4, 6, 7, 9]), (2, [3, 6, 8, 9])]
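None of the outputs above include the reverse friendships shown in the question's target, such as 0 appearing in the list for 1. Under the question's constraints (no dictionaries or sets, sorting encouraged), one possible sketch is to record every pair in both directions, sort, and then merge neighbours. The names pairs, edges and result are illustrative, and listing users 3 to 9 as well is an assumption, since the question only shows the entries for 0, 1 and 2.

```python
pairs = [['0', '1'], ['0', '2'], ['0', '3'], ['1', '4'], ['1', '6'], ['1', '7'],
         ['1', '9'], ['2', '3'], ['2', '6'], ['2', '8'], ['2', '9']]

# Record each friendship in both directions, as integers
edges = []
for a, b in pairs:
    edges.append((int(a), int(b)))
    edges.append((int(b), int(a)))

# After sorting, all entries for the same person are adjacent
edges.sort()

result = []
for person, friend in edges:
    if result and result[-1][0] == person:
        # Same person as the previous entry: extend the list, skipping repeats
        if result[-1][1][-1] != friend:
            result[-1][1].append(friend)
    else:
        result.append((person, [friend]))

print(result[:3])
# [(0, [1, 2, 3]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6, 8, 9])]
```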