Convert a list of lists to a list of strings

I have a list of lists like this
list1 = [['I am a student'], ['I come from China'], ['I study computer science']]
len(list1) = 3
Now I would like to convert it into a list of strings like this
list2 = ['I', 'am', 'a', 'student','I', 'come', 'from', 'China', 'I','study','computer','science']
len(list2) = 12
I am aware that I could do the conversion this way
new_list = [','.join(x) for x in list1]
But it returns
['I,am,a,student','I,come,from,China','I,study,computer,science']
len(new_list) = 3
I also tried this
new_list = [''.join(x for x in list1)]
but it gives the following error
TypeError: sequence item 0: expected str instance, list found
How can I extract each word from the sublists of list1 and put them all into a single list of strings? I'm using Python 3 on Windows 7.

Following your edit, I think the most transparent approach is now the one that was adopted by another answer (an answer which has since been deleted, I think). I've added some whitespace to make it easier to understand what's going on:
list1 = [['I am a student'], ['I come from China'], ['I study computer science']]
list2 = [
    word
    for sublist in list1
    for sentence in sublist
    for word in sentence.split()
]
print(list2)
Prints:
['I', 'am', 'a', 'student', 'I', 'come', 'from', 'China', 'I', 'study', 'computer', 'science']

Given a list of lists where each sublist contains strings, this could be solved using jez's strategy like this:
list2 = ' '.join([' '.join(strings) for strings in list1]).split()
Where the list comprehension transforms list1 to a list of strings:
>>> [' '.join(strings) for strings in list1]
['I am a student', 'I come from China', 'I study computer science']
The outer join then merges those strings into a single string, and split() breaks that string into a list of words on whitespace.
If the sublists only contain single strings, you could simplify the list comprehension:
list2 = ' '.join([l[0] for l in list1]).split()
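For comparison, here is a minimal sketch of the same flattening done with itertools.chain.from_iterable from the standard library, which none of the answers above use. The helper name flatten_words is made up for illustration, and the sketch assumes every sublist contains only strings.

```python
from itertools import chain

def flatten_words(list_of_lists):
    """Split every string in every sublist on whitespace and flatten the result.

    flatten_words is a hypothetical helper name, not taken from the answers.
    """
    sentences = chain.from_iterable(list_of_lists)  # yields each string in turn
    return list(chain.from_iterable(s.split() for s in sentences))

list1 = [['I am a student'], ['I come from China'], ['I study computer science']]
list2 = flatten_words(list1)
print(list2)
print(len(list2))  # 12
```

Unlike the l[0] shortcut, this also handles sublists that hold several strings or none at all.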

Related

How to get unique values in nested list along single column?

I need to extract only the unique sublists, based on the first element, from a nested list. For example:
inp = [['a','b'], ['a','d'], ['e','f'], ['g','h'], ['e','i']]
out = [['a','b'], ['e','f'], ['g','h']]
My method is to break the list into two lists and check the elements individually.
lis = [['a','b'], ['a','d'], ['e','f'], ['g','h']]
lisa = []
lisb = []
for i in lis:
    if i[0] not in lisa:
        lisa.append(i[0])
        lisb.append(i[1])
out = []
for i in range(len(lisa)):
    temp = [lisa[i],lisb[i]]
    out.append(temp)
This is an expensive operation when dealing with lists of 1,000,000+ sublists. Is there a better method?
Use a memory-efficient generator function with an auxiliary set object to filter items on the first unique subelement (take first unique):
def gen_take_first(s):
    seen = set()
    for sub_l in s:
        if sub_l[0] not in seen:
            seen.add(sub_l[0])
            yield sub_l
inp = [['a','b'], ['a','d'], ['e','f'], ['g','h'], ['e','i']]
out = list(gen_take_first(inp))
print(out)
[['a', 'b'], ['e', 'f'], ['g', 'h']]
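A possible variation, sketched here under the assumption that you are on Python 3.7 or later (where dicts keep insertion order): let a dict keyed by the first element do the bookkeeping. The function name take_first_by_key is hypothetical. Like the set-based generator, it replaces the linear `not in lisa` list scan with a hash lookup.

```python
def take_first_by_key(rows):
    """Keep the first sublist seen for each distinct first element.

    take_first_by_key is a hypothetical name; relies on dict insertion
    order, which is guaranteed from Python 3.7 onward.
    """
    first_seen = {}
    for row in rows:
        # setdefault stores the row only if the key is not present yet
        first_seen.setdefault(row[0], row)
    return list(first_seen.values())

inp = [['a', 'b'], ['a', 'd'], ['e', 'f'], ['g', 'h'], ['e', 'i']]
out = take_first_by_key(inp)
print(out)  # [['a', 'b'], ['e', 'f'], ['g', 'h']]
```

This builds the whole result in memory, so the generator is still the better fit when you want to stream the output.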

Is there a way to split strings inside a list?

I am trying to split strings inside a list but I could not find any solution on the internet. This is a sample, but it should help you guys understand my problem.
array=['a','b;','c','d)','void','plasma']
for i in array:
    print(i.split())
My desired output should look like this:
output: ['a','b',';','c','d',')','void','plasma']
One approach uses re.findall on each item of the starting list, along with a list comprehension to flatten the resulting 2D list:
import re  # needed for re.findall
inp = ['a', 'b;', 'c', 'd)', 'void', 'plasma']
output = [j for sub in [re.findall(r'\w+|\W+', x) for x in inp] for j in sub]
print(output) # ['a', 'b', ';', 'c', 'd', ')', 'void', 'plasma']
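One caveat with the \W+ pattern: it keeps a run of consecutive non-word characters together, so 'foo(bar);' would end in a single ');' token, and whitespace would come back as tokens too. If you would rather get one token per punctuation character and drop whitespace, a sketch along these lines may help. The pattern and the name split_tokens are my own assumptions, not part of the answer above.

```python
import re

# One or more word characters, OR a single character that is
# neither a word character nor whitespace (i.e. one punctuation mark).
TOKEN = re.compile(r'\w+|[^\w\s]')

def split_tokens(items):
    """Flatten a list of strings into word and single-punctuation tokens.

    split_tokens is a hypothetical helper name used for illustration.
    """
    return [tok for item in items for tok in TOKEN.findall(item)]

inp = ['a', 'b;', 'c', 'd)', 'void', 'plasma']
print(split_tokens(inp))
# ['a', 'b', ';', 'c', 'd', ')', 'void', 'plasma']
print(split_tokens(['foo(bar);', 'x  y']))
# ['foo', '(', 'bar', ')', ';', 'x', 'y']
```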

How to create a matrix (5 x 5) with strings, without using numpy? (Python 3)

So I am creating a memory matching game, in which a player will pair two words from a list of words.
I am trying to create a 2D 5x5 matrix with strings without using numpy.
I've tried with for i in range(x): for j in range(x), but I can't get it to work.
So how do I do this?
Python doesn't have a built-in matrix type like that, but you can pretty much emulate it with a list of lists, or with a dict keyed by ordered pairs.
Here's the list of lists approach using a list comprehension inside a list comprehension:
from pprint import pprint
matrix = [[c for c in line] for line in '12345 abcde ABCDE vwxyz VWXYZ'.split()]
pprint(matrix)
The result, pretty-printed.
[['1', '2', '3', '4', '5'],
 ['a', 'b', 'c', 'd', 'e'],
 ['A', 'B', 'C', 'D', 'E'],
 ['v', 'w', 'x', 'y', 'z'],
 ['V', 'W', 'X', 'Y', 'Z']]
You can split on different characters in the inner or outer loops.
matrix = [[word for word in line.split()] for line in 'foo bar;spam eggs'.split(';')]
You get and set elements with a double lookup, like matrix[2][3].
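Since the question mentions a flat list of words, here is a rough sketch of building the 5x5 list of lists from such a list by slicing, plus the double lookup in action. The placeholder words and the names board, aliased and independent are assumptions for illustration. It also shows a common pitfall: multiplying the outer list repeats the same row object five times.

```python
size = 5
# Placeholder words; a real game would supply its own 25 words.
words = [f'word{n}' for n in range(size * size)]

# Slice the flat list into rows of `size` words each.
board = [words[row * size:(row + 1) * size] for row in range(size)]
print(board[2][3])       # word13
board[2][3] = 'changed'  # set with the same double lookup

# Pitfall: the outer * copies references to ONE inner list.
aliased = [[''] * size] * size
aliased[0][0] = 'x'
print(aliased[1][0])     # x  (every row changed)

# A comprehension creates a fresh inner list per row.
independent = [[''] * size for _ in range(size)]
independent[0][0] = 'x'
print(repr(independent[1][0]))  # ''
```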
Results can vary with pprint depending on the width of the words. Lists of lists are pretty easy to print in matrix form, though. .join() is the inverse of .split().
print('\n'.join('\t'.join(line) for line in matrix))
And the result in this case,
foo bar
spam eggs
This just uses a tab character '\t', which may or may not produce good results depending on your tab stops and word widths. You can control this more precisely by using the string justification methods (str.ljust, str.rjust, str.center), .format(), or f-strings with format specifiers.
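As a sketch of the justification idea, one option is to compute the widest word in each column and pad with str.ljust. The two-space gutter and the variable names here are arbitrary choices, and the sketch assumes every row has the same number of columns.

```python
matrix = [['foo', 'bar'], ['spam', 'eggs']]

# Width of the widest word in each column.
widths = [max(len(row[col]) for row in matrix) for col in range(len(matrix[0]))]

# Pad each word to its column width, join with two spaces,
# and trim the padding left after the last column.
lines = [
    '  '.join(word.ljust(width) for word, width in zip(row, widths)).rstrip()
    for row in matrix
]
print('\n'.join(lines))
# foo   bar
# spam  eggs
```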
Here's one with the pair-keyed dict. Recall that tuples of hashable types are hashable too.
{(i, j): 'x' for i in range(5) for j in range(5)}
You get and set elements with a pair lookup, like matrix[2, 3].
Again, you can use words.
{(i, j): word
for i, line in enumerate("""\
1 2 3 4 5
foo bar baz quux norlf
FOO BAR BAZ QUUX NORLF
spam eggs sausage bacon ham
SPAM EGGS SAUSAGE BACON HAM""".split('\n'))
for j, word in enumerate(line.split())}
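To make the pair lookup concrete, here is a small sketch that builds the 'x'-filled dict, sets one cell, and reads a row back. The name grid is just an illustrative choice.

```python
size = 5
grid = {(i, j): 'x' for i in range(size) for j in range(size)}

grid[2, 3] = 'spam'   # same as grid[(2, 3)]
print(grid[2, 3])     # spam

# Read one row back by fixing i and varying j.
row2 = [grid[2, j] for j in range(size)]
print(row2)           # ['x', 'x', 'x', 'spam', 'x']

# Print the whole grid row by row.
for i in range(size):
    print(' '.join(grid[i, j] for j in range(size)))
```

One trade-off to keep in mind: the dict has no built-in notion of rows, so row or column access always goes through a loop or comprehension like the ones above.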

How do I remove empty strings from my list of lists in Python 3.0?

If I print my list called text:
print(text)
the output looks something like
[['this', 'example', '',], ['a','b','']]
How do I remove the empty strings from here?
With a list comprehension:
text = [['this', 'example', '',], ['a','b','']]
print([[string for string in sublist if string] for sublist in text])
# [['this', 'example'], ['a', 'b']]
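Note that `if string` only drops truly empty strings. If the data may also contain whitespace-only strings (an assumption, since the question only shows ''), testing string.strip() covers both. The sketch below also shows slice assignment in case the original sublists should be modified in place rather than replaced.

```python
text = [['this', 'example', ''], ['a', 'b', '', '  ']]

# New nested list without empty or whitespace-only strings.
cleaned = [[s for s in sublist if s.strip()] for sublist in text]
print(cleaned)  # [['this', 'example'], ['a', 'b']]

# In-place variant: slice assignment keeps the same sublist objects.
for sublist in text:
    sublist[:] = [s for s in sublist if s.strip()]
print(text)     # [['this', 'example'], ['a', 'b']]
```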

Python check variable against multiple lists

I have 3 lists of data, and I need to test whether any of the data I get from the JSON response is in any of the lists. I'm probably being stupid about it, but I'm trying to learn and can't seem to get it to work right.
list1 = ['a', 'b', 'c']
list2 = ['a1', 'b1', 'c1']
list3 = ['a2', 'b2', 'c2']
#block of code...
#block of code...
content = json.loads(response.read().decode('utf8'))
data = content
for x in data:
    #if x['name'] in list1: #This works fine the line below does not.
    if x['name'] in (list1, list2, list3):
        print("something")
I suggest something simple and straight-forward:
if (x['name'] in list1 or
        x['name'] in list2 or
        x['name'] in list3):
    ...
As a Pythonic way to handle such tasks, you can use any() to simulate the or operator and all() to simulate the and operator.
So here you can use a generator expression within any():
if any(x['name'] in i for i in (list1, list2, list3)):
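What about concatenating the lists? Note that the concatenation must not be wrapped in another pair of brackets, or the test would compare the name against a single nested list:
if x['name'] in list1 + list2 + list3: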
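If the check runs for many records, another option is to merge the lists into one set up front so that each lookup is a hash lookup. This is only a sketch: the data list below is a made-up stand-in for the parsed JSON response, and it assumes all names are hashable strings. It also shows why the original test never matched: the tuple's elements are the lists themselves, not their contents.

```python
list1 = ['a', 'b', 'c']
list2 = ['a1', 'b1', 'c1']
list3 = ['a2', 'b2', 'c2']

# Hypothetical stand-in for the decoded JSON response.
data = [{'name': 'b1'}, {'name': 'zzz'}, {'name': 'c2'}]

# Why the original test fails: 'a' is compared to each whole list.
print('a' in (list1, list2, list3))  # False

# Build the combined set once, then test membership per record.
known = set().union(list1, list2, list3)
matches = [x['name'] for x in data if x['name'] in known]
print(matches)  # ['b1', 'c2']
```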
