alone,1
amazed,10
amazing,10
bad,1
best,10
better,7
excellent,10
These are some of the keywords and their 'values' that I need to store in a
data structure, preferably a list. Each line will be later used to access/extract the word and its 'value'.
The list I made in a while loop was:
line = KeywordFile.readline()
while line != '':
    line = KeywordFile.readline()
    line = line.rstrip()
And I tried to convert it to a list form by doing this:
list=[line]
However, when I print the list, I get this:
['amazed,10']
['amazing,10']
['bad,1']
['best,10']
['better,7']
['excellent,10']
I don't think that I'll be able to extract my 'values' from the lists that easily if they are inside quotation marks.
I'm looking for a better way to store each word and its 'value'.
Thanks in advance!
A dictionary is what you need here:
You could do something like:
out = {}
for line in KeywordFile:
    line = line.rstrip()
    if line:  # skip blank lines
        word, value = line.split(',')
        out[word] = int(value)  # store the value as a number
out will look like
{'alone': 1, 'amazed': 10, 'amazing': 10, 'bad': 1, ...}
and the values can be accessed by key: out['amazed'] will return 10.
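As a self-contained illustration of the dictionary approach, here is a minimal sketch that parses a few of the sample lines from the question. The `load_keywords` name and the in-memory `io.StringIO` stand-in for the real keyword file are assumptions made for the example.

```python
import io

def load_keywords(lines):
    """Build a {word: value} dict from an iterable of 'word,value' lines."""
    keywords = {}
    for line in lines:
        line = line.strip()
        if not line:  # ignore blank lines
            continue
        word, value = line.split(',')
        keywords[word] = int(value)
    return keywords

# Stand-in for the real file; any open file object works the same way.
sample = io.StringIO("alone,1\namazed,10\namazing,10\nbad,1\n")
keywords = load_keywords(sample)
print(keywords['amazed'])
```

Converting the value with int() means it can be used directly in arithmetic later on.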
Related
I have a .txt file that I want to search for specific words, or phrases. I want to be able to use an input to do this. Then I would like the file parsed for the input and printed. Basically something like this:
input("Search For:")I WANT TO ENTER MY SEARCH TERM HERE
print(I WANT TO PRINT WHAT I SEARCHED FOR ABOVE)
I am able to do this another way by creating a variable, and then just changing the variable name as needed, but this is not ideal for me. Any ideas on how to create an input to search my .txt?
word = 'Scrubbing'
# variable to store search term
with open(r'/Users/kev/PycharmProjects/find_text/common.txt', 'r') as fp:
    lines = fp.readlines()
    # read all lines in a list
    for index, line in enumerate(lines):
        if line.find(word) != -1:
            # check if string present on a current line
            print(word, 'string exists in file')
            print('Line Number:', index)
            print('Line:', line)
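One way to take the search term from the user is to pass the result of input() to a small helper. The sketch below is only an assumption about what is wanted: `find_lines` is a hypothetical name, and it accepts any iterable of lines so it can be tried without the real file.

```python
def find_lines(lines, term):
    """Return (line_number, text) pairs for every line containing term."""
    matches = []
    for number, line in enumerate(lines, start=1):
        if term in line:
            matches.append((number, line.rstrip('\n')))
    return matches

# Made-up sample lines standing in for the contents of the .txt file.
sample = ["Scrubbing floors\n", "Washing dishes\n", "Scrubbing pans\n"]
for number, text in find_lines(sample, "Scrubbing"):
    print(number, text)
```

With a real file, the same helper could be called as find_lines(fp, input("Search For: ")) inside the with open(...) block.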
I am comparatively new to python and data science and I was working with a CSV file which looks something like:
value1, value2
value3
value4...
Thing is, I want to assign a unique number to each of these values in the csv file such that the unique number acts as the key and the item in the CSV acts as the value like in a dictionary.
I tried using pandas but if possible, I wanted to know how I can solve this without using any libraries.
The desired output should be something like this:
{
"value1": 1,
"value2": 2,
"value3": 3,
.
.
.
and so on..
}
Was just about to talk about pandas before I saw that you wanted to do it in vanilla Python. I'd do it with pandas personally, but here you go:
You can read in lines from a file, split them by delimiter (','), and then get your word tokens.
master_dict = {}
counter = 1
with open("your_csv.csv", "r") as f:
    for line in f:
        words = line.split(',')
        for word in words:
            word = word.strip()  # drop surrounding spaces and the trailing newline
            if word:  # skip empty tokens
                master_dict[counter] = word
                counter += 1
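The desired output in the question maps each value to its number, which is the reverse of the number-to-value dictionary built above. A minimal sketch of that direction, assuming repeated values should keep the first number they were given (`number_values` is a hypothetical name):

```python
def number_values(lines):
    """Map each distinct comma-separated value to a unique number, starting at 1."""
    numbering = {}
    for line in lines:
        for value in line.split(','):
            value = value.strip()
            if value and value not in numbering:
                numbering[value] = len(numbering) + 1
    return numbering

print(number_values(["value1, value2\n", "value3\n", "value4\n"]))
```

An open file object can be passed in place of the list, since both yield one line at a time.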
I have a csv file which is not consistent. It looks like this where some have a middle name and some do not. I don't know the best way to fix this. The middle name will always be in the second position if it exists. But if a middle name doesn't exist the last name is in the second position.
john,doe,52,florida
jane,mary,doe,55,texas
fred,johnson,23,maine
wally,mark,david,44,florida
Let's say that you have ① wrong.csv and want to produce ② fixed.csv.
You want to read a line from ①, fix it and write the fixed line to ②. This can be done like this:
with open('wrong.csv') as input, open('fixed.csv', 'w') as output:
    for line in input:
        line = fix(line)
        output.write(line)
Now we want to define the fix function...
Each line has either 4 or 5 fields, separated by commas. So we split the line using the comma as a delimiter and return the line unmodified if it has 4 fields; otherwise we join field 0 and field 1 (Python counts from zero...), reassemble the output line and return it to the caller.
def fix(line):
    items = line.split(',')  # items is a list of strings
    if len(items) == 4:  # the line is OK as it stands
        return line
    # join first and middle name
    first_middle = ' '.join((items[0], items[1]))
    # we want to return a "fixed" line,
    # i.e., a string not a list of strings
    # we have to join the new name with the remaining info
    return ','.join([first_middle] + items[2:])
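To sanity-check the idea on the sample rows, here is a self-contained sketch. It assumes a well-formed row has four fields (first name, last name, age, state) and that a fifth field means a middle name sits in the second position; `merge_middle_name` is a hypothetical name.

```python
def merge_middle_name(line):
    """Join first and middle name with a space when a row has five fields."""
    items = line.rstrip('\n').split(',')
    if len(items) == 5:
        items = [items[0] + ' ' + items[1]] + items[2:]
    return ','.join(items)

rows = ["john,doe,52,florida", "jane,mary,doe,55,texas"]
for row in rows:
    print(merge_middle_name(row))
```

Unlike the function above, this variant strips the trailing newline, so it would need to be added back when writing to a file.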
I have this file that contains something like this:
OOOOOOXOOOO
OOOOOXOOOOO
OOOOXOOOOOO
XXOOXOOOOOO
XXXXOOOOOOO
OOOOOOOOOOO
And I need to read it into a 2D list so it looks like this:
[[O,O,O,O,O,O,X,O,O,O,O],[O,O,O,O,O,X,O,O,O,O,O],[O,O,O,O,X,O,O,O,O,O,O],[X,X,O,O,X,O,O,O,O,O,O],[X,X,X,X,O,O,O,O,O,O,O],[O,O,O,O,O,O,O,O,O,O,O]]
I have this code:
ins = open(filename, "r")
data = []
for line in ins:
    number_strings = line.split()  # Split the line on runs of whitespace
    numbers = [(n) for n in number_strings]
    data.append(numbers)  # Add the "row" to your list.
return data
But it doesn't seem to be working because the O's and X's do not have spaces between them. Any ideas?
Just use data.append(list(line.rstrip())). list() accepts a string as its argument and splits it into its individual characters.
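Put together, the reading loop could look like the sketch below. The `read_grid` name is an assumption, and an `io.StringIO` object stands in for the opened file.

```python
import io

def read_grid(lines):
    """Turn each non-blank line of text into a list of its characters."""
    return [list(line.rstrip()) for line in lines if line.strip()]

# A small 3x3 stand-in for the file of O's and X's.
sample = io.StringIO("OOX\nOXO\nXOO\n")
grid = read_grid(sample)
print(grid)
```

Each cell can then be read as grid[row][column].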
I have a file data.txt containing the following lines:
I would like to extract the lines of this file into a list of lists: each line becomes a list, and all of them are stored in ListOfLines, which is a list of lists.
When a cell has no data, I just want it to be -1.
I have tried this so far:
from random import randint
ListOfLines = []
with open(r"C:\data.txt", 'r') as file:
    data = file.readlines()
    for line in data:
        y = line.split()
        ListOfLines.append(y)
with open(r"C:\output.txt", 'a') as output:
    for x in range(0, 120):
        # 'item' represents one line
        for item in ListOfLines:
            item[2] = randint(1, 1000)
            for elem in item:
                output.write(str(elem))
                output.write(' ')
            output.write('\n')
        output.write('------------------------------------- \n')
How can I improve my program so that it contains less code and runs faster?
Thank you in advance :)
Well, sharing your sample data as an image doesn't make it easy to work with. As it stands I won't even bother, and I assume others feel the same.
However, data = file.readlines() forces the content of the file into a list first, and then you iterate through that list. You could iterate directly with 'for line in file:'. That improves it a little.
You haven't mentioned what you want from the output part, which seems quite messy.
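For the reading part, iterating the file directly might look like the sketch below. Since the sample data is not available, the column count and the rule for missing cells are assumptions: rows are split on whitespace and short rows are padded with -1 up to a hypothetical `width`.

```python
def read_rows(lines, width):
    """Split each line on whitespace and pad short rows with -1 up to width."""
    rows = []
    for line in lines:  # works on a file object directly, no readlines() needed
        cells = line.split()
        cells += [-1] * (width - len(cells))  # adds nothing if the row is already full
        rows.append(cells)
    return rows

# Made-up lines standing in for data.txt.
print(read_rows(["a b c\n", "d e\n"], width=3))
```

This only covers cells missing at the end of a line; gaps in the middle cannot be detected after a whitespace split and would need a real delimiter in the file.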