I have this file that contains something like this:
OOOOOOXOOOO
OOOOOXOOOOO
OOOOXOOOOOO
XXOOXOOOOOO
XXXXOOOOOOO
OOOOOOOOOOO
And I need to read it into a 2D list so it looks like this:
[[O,O,O,O,O,O,X,O,O,O,O],[O,O,O,O,O,X,O,O,O,O,O],[O,O,O,O,X,O,O,O,O,O,O],[X,X,O,O,X,O,O,O,O,O,O],[X,X,X,X,O,O,O,O,O,O,O],[O,O,O,O,O,O,O,O,O,O,O]]
I have this code:
ins = open(filename, "r")
data = []
for line in ins:
    number_strings = line.split()  # Split the line on runs of whitespace
    numbers = [(n) for n in number_strings]
    data.append(numbers)  # Add the "row" to your list.
return data
But it doesn't seem to be working because the O's and X's do not have spaces between them. Any ideas?
Just use data.append(list(line.rstrip())). list() accepts a string as its argument and splits it into a list of single characters; rstrip() removes the trailing newline first.
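To make that concrete, here is a minimal sketch of the whole read loop. The helper name read_grid and the small three-line sample are made up for illustration, and io.StringIO stands in for the opened file so the snippet runs on its own:

```python
import io

def read_grid(lines):
    """Return one list of single-character strings per non-empty line."""
    grid = []
    for line in lines:
        row = line.rstrip()       # drop the trailing newline
        if row:                   # skip blank lines
            grid.append(list(row))
    return grid

# Hypothetical sample; with a real file use: with open(filename) as ins: ...
sample = io.StringIO("OOX\nOXO\nXOO\n")
grid = read_grid(sample)
print(grid)  # [['O', 'O', 'X'], ['O', 'X', 'O'], ['X', 'O', 'O']]
```

Because a file object is iterated line by line, the same function should work unchanged when passed the result of open(filename).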
I am importing a CSV file into a list in Python. When I split it into list elements and then print an index, the entry is printed like this:
2000-01-03,3.745536,4.017857,3.631696,3.997768,2.695920,133949200
How would I split this list if I wanted to print just a single element, like this?
2000-01-03
Here is my code so far.
def main():
    list = []
    filename = "AAPL.csv"
    with open(filename) as x:
        for line in x.readlines():
            val = line.strip('\n').split(',')
            list.append(val)
    print(list[2])
Your current code builds a list of lists: precisely, a list (of rows) of lists (of fields).
To extract one single element, say the first field of the third row, you could do:
...
print(list[2][0])
But except for trivial tasks, you should use the csv module when processing CSV files, because it is robust to corner cases such as newlines or field separators contained in fields. Your code could become:
import csv

def main():
    list = []
    filename = "AAPL.csv"
    with open(filename, newline='') as x:
        rd = csv.reader(x)
        for val in rd:  # the reader is an iterator of lists of fields
            list.append(val)
    print(list[2][0])
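To see the kind of corner case the csv module handles, here is a small sketch comparing a plain split(',') with csv.reader on a quoted field that contains a comma. The sample rows (including the note column) are invented for illustration and are not from AAPL.csv:

```python
import csv
import io

# Made-up sample: the second field of the data row contains a comma inside quotes
sample = io.StringIO(
    'date,note,volume\n'
    '2000-01-03,"split, adjusted",133949200\n'
)

# Naive approach: split every line on commas
naive = [line.rstrip("\n").split(",") for line in sample]

# csv module: understands the quoting
sample.seek(0)
rows = list(csv.reader(sample))

print(naive[1])  # ['2000-01-03', '"split', ' adjusted"', '133949200']
print(rows[1])   # ['2000-01-03', 'split, adjusted', '133949200']
```

The naive split breaks the quoted field into two pieces, while csv.reader keeps it as one field.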
I have a big text file like this example:
</Attributes>
FovCount,555
FovCounted,536
ScanID,1803C0555
BindingDensity,0.51
As you can see, some lines are empty, some are comma-separated, and others have a different format.
I would like to open the file and look for the lines that start with these three words: FovCount, FovCounted and BindingDensity. If a line starts with one of them, I want to get the number after the comma. From the numbers for FovCount and FovCounted I will compute a criteria value, and the final result should be a list with two items: criteria and BD (the number after BindingDensity). I wrote the following function in Python, but it does not return what I want. Do you know how to fix it?
def QC(file):
    with open(file) as f:
        for line in f.split(","):
            if line.startswith("FovCount"):
                FC = line[1]
            elif line.startswith("FovCounted"):
                FCed = line[1]
                criteria = FC/FCed
            elif line.startswith("BindingDensity"):
                BD = line[1]
    return [criteria, BD]
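You are trying to split the file on commas (,), but lines aren't separated by a comma; they are separated by a newline character (\n). A file object also has no split method, so f.split(",") raises an AttributeError.
Try changing f.split(",") to f.read().split("\n"), or use f.readlines(), which does much the same thing except that each line keeps its trailing newline.
You can then split each line into comma-separated segments using segments = line.strip().split(",").
You can check whether the first segment exactly matches one of your keywords: if segments[0] == "FovCounted", etc. An exact comparison matters here, because line.startswith("FovCount") is also true for lines that start with FovCounted.
You can then obtain the value by getting the second segment: value = segments[1]. It is a string, so convert it with float(value) before dividing.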
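Putting those steps together, one possible sketch looks like this. The function name qc, the values dictionary, and the choice to take any iterable of lines (rather than a file name) are my own assumptions; criteria is computed as FovCount / FovCounted, as in the question's code:

```python
WANTED = ("FovCount", "FovCounted", "BindingDensity")

def qc(lines):
    """Return [criteria, BD] from an iterable of text lines."""
    values = {}
    for line in lines:
        segments = line.strip().split(",")
        # Exact match on the first segment, so FovCount and FovCounted stay distinct
        if len(segments) == 2 and segments[0] in WANTED:
            values[segments[0]] = float(segments[1])
    criteria = values["FovCount"] / values["FovCounted"]
    return [criteria, values["BindingDensity"]]

# Lines taken from the example in the question, plus an empty line
sample = [
    "</Attributes>\n",
    "\n",
    "FovCount,555\n",
    "FovCounted,536\n",
    "ScanID,1803C0555\n",
    "BindingDensity,0.51\n",
]
result = qc(sample)
print(result)
```

With a real file you could call it as qc(open(path)) inside a with block. If one of the three keys is missing from the file, this sketch raises a KeyError rather than guessing a value.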
I got this string:
\n
\n
N\tO\tHP\tM\tD\tU\tI\tN\tO\n
E\tS\tA\tE\tI\tT\tL\tN\tI\tN\n
N\tP\tN\tN\tN\tG\tAO\tD\tC\n
\n
\n
PERMANENTE
PETTINE
\n
\n
In the original string you cannot actually see the \t and \n characters; I wrote them out here to make it easier to understand.
What I'm trying to do is separate it into 2 different lists of lists, for example:
lists1 = [[NOHPMDUINO], [ESAEITLNIN], [NPNNNGAODC]]
lists2 = [[PERMANENTE], [PETTINE]]
I tried many methods to solve this, but without success.
At first I removed the newlines at the beginning with the .strip('\n') method, and then I tried to use replace, but I don't know how to do it correctly.
Thank you zsomko and snakecharmerb,
Using zsomko's method and adding strip() to remove the newlines at the beginning, here is the loop I used to split the result into 2 variables:
var = True
for line in t:
    if line !=['']:
        if var:
            group1.append(line)
        else:
            group2.append(line)
    else:
        var = False
I hope this helps someone :) If somebody has a better, more efficient solution, I would like to hear it.
First eliminate the tabs and split the string into lines:
lines = [line.replace('\t', '') for line in string.splitlines()]
Then the following would yield the list of lists in the variable groups as expected:
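groups = []
group = []
for line in lines:
    if group and not line:
        groups.append(group)
        group = []
    elif line:
        group.append(line)
if group:  # keep the last group when the string does not end with a blank line
    groups.append(group)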
You can break the string into separate lines using its splitlines method - this will give you a list of lines without their terminating newline ('\n') characters.
Then you can loop over the list and replace the tab characters with empty strings using the str.replace method.
>>> for line in s.splitlines():
...     if not line:
...         # Skip empty lines
...         continue
...     cleaned = line.replace('\t', '')
...     print(cleaned)
...
NOHPMDUINO
ESAEITLNIN
NPNNNGAODC
PERMANENTE
PETTINE
Grouping the output in lists of lists is a little trickier. The question doesn't mention the criteria for grouping, so let's assume that lines which are not separated by empty lines should be listed together.
We can use a generator to iterate over the string, group adjacent lines and emit them as lists like this:
>>> def g(s):
...     out = []
...     for line in s.splitlines():
...         if not line:
...             if out:
...                 yield out
...                 out = []
...             continue
...         cleaned = line.replace('\t', '')
...         out.append([cleaned])
...     if out:
...         yield out
...
>>>
The generator collects lines in a list (out) which it yields each time it finds a blank line and the list is not empty; if the list is yielded it is replaced with an empty list. After looping over the lines in the string it yields the list again, if it isn't empty, in case the string didn't end with blank lines.
Looping over the generator returns the lists of lists in turn.
>>> for x in g(s):print(x)
...
[['NOHPMDUINO'], ['ESAEITLNIN'], ['NPNNNGAODC']]
[['PERMANENTE'], ['PETTINE']]
Alternatively, if you want a list of lists of lists, call list on the generator:
>>> lists = list(g(s))
>>> print(lists)
[[['NOHPMDUINO'], ['ESAEITLNIN'], ['NPNNNGAODC']], [['PERMANENTE'], ['PETTINE']]]
If you want to assign the result to named variables, you can unpack the call to list:
>>> group1, group2 = list(g(s))
>>> group1
[['NOHPMDUINO'], ['ESAEITLNIN'], ['NPNNNGAODC']]
>>> group2
[['PERMANENTE'], ['PETTINE']]
Note that to do this you need to know in advance how many lists will be generated.
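As an alternative to a hand-written generator, the same "group lines between blank lines" idea can be sketched with itertools.groupby from the standard library. This is a different technique from the generator g, and it produces groups of plain strings rather than one-element lists. The string s below is rebuilt from the question's example:

```python
from itertools import groupby

# The string from the question, with tabs and newlines written out
s = (
    "\n\n"
    "N\tO\tHP\tM\tD\tU\tI\tN\tO\n"
    "E\tS\tA\tE\tI\tT\tL\tN\tI\tN\n"
    "N\tP\tN\tN\tN\tG\tAO\tD\tC\n"
    "\n\n"
    "PERMANENTE\n"
    "PETTINE\n"
    "\n\n"
)

# Remove tabs, then group consecutive lines by whether they are non-empty
lines = [line.replace("\t", "") for line in s.splitlines()]
groups = [list(run) for nonblank, run in groupby(lines, key=bool) if nonblank]

print(groups)
# [['NOHPMDUINO', 'ESAEITLNIN', 'NPNNNGAODC'], ['PERMANENTE', 'PETTINE']]
```

groupby only merges adjacent items with the same key, so each run of non-empty lines becomes one group and the runs of blank lines are discarded.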
I have a text file that I converted into a list, but I want it to be a multi-dimensional list. Is there a way to do this easily?
This is my code:
crimefile = open(fileName, 'r')
yourResult = [line.split(',') for line in crimefile.readlines()]
Your code does create a 2-dimensional list (assuming your file is multiple lines of numbers where each number is separated by a comma). If you want to print out each individual list in yourResult, try this:
for list in yourResult:
    print(list)
To access a certain item in each list, for example the first number on each line, simply replace print(list) with print(list[0]).
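One detail worth checking: line.split(',') leaves the trailing newline attached to the last item of each row. A small sketch, assuming a made-up two-line sample in place of the real file (io.StringIO stands in for the opened crimefile), strips it first:

```python
import io

# Made-up sample standing in for the opened crime file
crimefile = io.StringIO("1,2,3\n4,5,6\n")

# Strip the newline before splitting so the last field is clean
yourResult = [line.rstrip("\n").split(",") for line in crimefile]

for row in yourResult:
    print(row[0])  # first item of each row

print(yourResult[1][2])  # row 1, column 2: '6'
```

The name row is used in the loop instead of list so that the built-in list is not shadowed.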
I just read the 10th line from file 'text.txt'
>>>line=linecache.getline("text.txt",10)
>>>line
"['\\x02', '\\x03']\n"
In this case I would like to create a list lst containing the two items '\\x02' and '\\x03':
>>>lst
['\\x02','\\x03']
I have to repeat this for other lines of text, always formatted like line but possibly with more items.
Any suggestions?
Thank you
This will take a string in that format with an arbitrary number of elements and convert it to a list.
line = "['\\x02', '\\x03']\n"
line = line.strip()[1:-1]
lst = [x.strip()[1:-1] for x in line.split(",")]
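Since you need to repeat this for several lines, the same slicing approach can be wrapped in a small function. The name parse_line and the second sample line are made up for illustration. Note the assumption that no item contains a comma; if one did, split(",") would cut it in two:

```python
def parse_line(line):
    """Split a line like "['a', 'b']\\n" into its quoted items, kept as raw text."""
    inner = line.strip()[1:-1]           # drop the surrounding [ and ]
    if not inner.strip():
        return []                        # "[]" gives an empty list
    # Strip spaces, then drop the quote character on each side of every item
    return [item.strip()[1:-1] for item in inner.split(",")]

lines = [
    "['\\x02', '\\x03']\n",
    "['\\x04', '\\x05', '\\x06']\n",     # hypothetical second line with more items
]
parsed = [parse_line(line) for line in lines]
print(parsed)
# [['\\x02', '\\x03'], ['\\x04', '\\x05', '\\x06']]
```

This keeps each item as the literal four characters \x02 and so on, which matches the output asked for. A parser such as ast.literal_eval would instead interpret the escape sequences and return the control characters themselves.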