Python dictionary based on input file - python-3.x

I'm trying to build a dictionary like the one below from an input file with the structure shown. During conversion, the inner dictionary is being replicated. Any advice on what fix is needed to get the desired output?
input file data:/home/file1.txt
[student1]
fname : Harry
lname : Hoit
age : 22
[Student2]
fname : Adam
lname : Re
age : 25
expected output :
{'Student1' : {'fname' : 'Harry', 'lname' : 'Hoit', 'Age' : 22},
'Student2' : {'fname' : 'Adam', 'lname' : 'Re', 'Age' : 25}}
def dict_val():
    out = {}
    inn = {}
    path= '/home/file1.txt'
    with open(path, 'r') as f:
        for row in f:
            row = row.strip()
            if row.startswith("["):
                i = row[1:-1]
                # inn.clear() ## tried to clean the inner loop during second but its not correct
            else:
                if len(row) < 2:
                    pass
                else:
                    key, value = row.split('=')
                    inn[key.strip()] = value.strip()
                    out[i] = inn
    return out
print(dict_val())
current output: getting duplicate during second iteration
{'student1': {'fname': 'Adam', 'lname': 'Re', 'age': '25'},
'Student2': {'fname': 'Adam', 'lname': 'Re', 'age': '25'}}

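With just a small change you will get there; you were pretty close.
The modification checks for an empty line. When the line is empty, write the inn data to out and then rebind inn to a new, empty dictionary. Note that this relies on the sections in the file being separated by blank lines.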
def dict_val():
    out = {}
    inn = {}
    path= 'file.txt'
    with open(path, 'r') as f:
        for row in f:
            row = row.strip()
            if row.startswith("["):
                i = row[1:-1]
                continue
            # when the line is empty, write it to OUT dictionary
            # reset INN dictionary
            if len(row.strip()) == 0:
                if len(inn) > 0:
                    out[i] = inn
                    inn = {}
                continue
            key, value = row.split(':')
            inn[key.strip()] = value.strip()
    # if last line of the file is not an empty line and
    # the file reading is done, you can check if INN still
    # has data. If it does, write it out to OUT
    if len(inn) > 0:
        out[i] = inn
    return out
print(dict_val())

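When you do out[i] = inn, you store a reference to the inn dict, not a copy of it. This means that when inn is updated later in the loop, out['student1'] and out['Student2'] still point to the same object, so both show the most recent values.
To solve this, you can store a deepcopy of the inn object (a shallow copy is enough here, since the values are plain strings), or bind inn to a new dict for each section.
Ref: Nested dictionaries copy() or deepcopy()?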
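The aliasing described above can be reproduced without any file at all. The following is a minimal sketch, with made-up in-memory values standing in for the parsed rows; it first shows the shared reference and then the effect of storing a copy.

```python
import copy

# One inner dict stored under two keys: both keys refer to the same object.
inn = {"fname": "Harry"}
out = {}
out["student1"] = inn
out["Student2"] = inn
inn["fname"] = "Adam"  # the change is visible through both keys
shared = out["student1"] is out["Student2"]

# Storing a copy at assignment time decouples the entries.
fixed = {}
inn = {"fname": "Harry"}
fixed["student1"] = copy.deepcopy(inn)
inn["fname"] = "Adam"
fixed["Student2"] = copy.deepcopy(inn)

print(shared)
print(out)
print(fixed)
```

Because the inner values here are plain strings, dict(inn) or inn.copy() would behave the same as copy.deepcopy(inn); deepcopy only matters once the inner dict holds mutable objects itself.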

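I would build the nested dictionary in one pass, since it is only two levels deep. Each section header creates its own inner dictionary, so nothing is shared between students.
def dict_val(file):
    inn = {}
    with open(file, 'r') as f:
        for row in f:
            row = row.strip()
            if row.startswith("["):
                i = row[1:-1]
                inn[i] = {}  # start a new inner dict for this section
            elif len(row) > 2:
                key, value = row.split(':')
                inn[i][key.strip()] = value.strip()
    return inn
print(dict_val('/home/file1.txt'))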
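Since the input looks like an INI file, the standard library's configparser may also be worth a try. This is only a sketch of an alternative, not something taken from the answers above: the sample text is inlined as a string (SAMPLE and parse_sections are names made up for the example), configparser lowercases the option names by default, and every value stays a string.

```python
import configparser

# Inlined copy of the sample data from the question.
SAMPLE = """\
[student1]
fname : Harry
lname : Hoit
age : 22
[Student2]
fname : Adam
lname : Re
age : 25
"""

def parse_sections(text):
    """Return {section: {option: value}} for INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)  # ':' and '=' are both accepted as delimiters
    return {section: dict(parser[section]) for section in parser.sections()}

result = parse_sections(SAMPLE)
print(result)
```

To read from the file directly, parser.read(path) can replace read_string; converting age to int would still be a separate step.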

Related

Read data from txt file, store it, use it for analyzing, write it to the txt file

The task is to read the data from a given txt file and add the numbers in it to a list, so that every number on a row becomes an element of the list. After the file has been read, the created list is returned to main().
The list is then passed as a parameter to the Analyze function, which finds the min, max, average and sum in one go.
def lueTiedosto(data):
    Tiedosto = open("L07T4D1.txt", 'r', encoding="UTF-8")
    Rivi = Tiedosto.readline()
    while (len(Rivi) > 0):
        data.append(int(Rivi))
        Rivi = Tiedosto.readline()
    for element in data:
        print(element)
    print(f"Tiedosto L07T4D1.txt luettu.")
    Tiedosto.close()
    return element
The fixed code which works:
def lueTiedosto(data):
    Lue = input("Luettavan tiedoston nimi on ''.\n")
    print(f"Anna uusi nimi, enter säilyttää nykyisen: ", end='')
    Tiedosto = open(Lue, 'r', encoding="UTF-8")
    Rivi = Tiedosto.readline()
    while (len(Rivi) > 0):
        data.append(int(Rivi))
        Rivi = Tiedosto.readline()
    print(f"Tiedosto '{Lue}' luettu.")
    Tiedosto.close()
    return data
Making an assumption that your input file is similar to the following:
10000
12345
10008
12000
I would do the following:
filepath = r".......\L07T4D1.txt" # Path to file being loaded
def readData(filepath: str) -> list[int]:
    # Returns a list of integers from file
    rslt = []
    with open (filepath, 'r') as f:
        data = f.readline().strip()
        while data:
            data = data.split(' ')
            rslt.append(int(data[0]))
            data = f.readline().strip()
    return rslt
def analyze(data: list[int]) -> None:
    # prints results of data analysis
    print(f'Max Value = {max(data)}')
    print(f'Min Value = {min(data)}')
    print(f'Sum Value = {sum(data)}')
    print(f'Avg Value = {sum(data)/len(data)}')
Running analyze(readData(filepath)) Yields:
Max Value = 12345
Min Value = 10000
Sum Value = 44353
Avg Value = 11088.25
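One detail of the readline() loop above is that it stops at the first blank line. A possible variant, sketched here under the assumption that blank lines should simply be skipped, iterates over the file object and returns the statistics instead of printing them. The names read_numbers and summarize are made up for the example, and a temporary file stands in for L07T4D1.txt.

```python
import os
import tempfile
from statistics import mean

def read_numbers(path):
    """Read one integer per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [int(line) for line in f if line.strip()]

def summarize(numbers):
    """Return min, max, sum and average of a non-empty list."""
    return {
        "min": min(numbers),
        "max": max(numbers),
        "sum": sum(numbers),
        "avg": mean(numbers),
    }

# Temporary stand-in for the real input file, with a blank line in the middle.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("10000\n12345\n\n10008\n12000\n")

numbers = read_numbers(tmp.name)
os.remove(tmp.name)
stats = summarize(numbers)
print(stats)
```

Returning a dict keeps the analysis testable; main() can then decide how to print or write the values.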

Python - Sort list in descending order - Type Error

iPod Game Guesser - Leaderboard Feature
leaderboard = open("Score.txt", "a+")
score = str(score)
leaderboard.write(Username + ' : ' + score + '\n')
leaderboard.close()
leaderboard = open("Score.txt", "r")
Scorelist = leaderboard.readlines()
scores = {}
for row in Scorelist:
    user, score = row.split(':')
    scores[user] = int(score)
highest_ranking_users = sorted(scores, key=lambda x: scores[x], reverse=True)
for user in highest_ranking_users:
    print (f'{user} : {score[user]}')
This is the game I made for my GCSE OCR project. I am getting an error on the last line of my code, print (f'{user} : {score[user]}'). The error is as follows:
TypeError: string indices must be integers
Please help! Any comments will be appreciated!
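The TypeError comes from the last line: score is a string (the last value unpacked from row.split(':')), so score[user] indexes a string with a string. The dictionary is called scores. Also note that sorted() always returns a list, so sorting a dict gives you a list of its keys only. To keep the values alongside the names, sort the items and print from the resulting list of (name, score) pairs.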
score = {'Player1' : 9, 'Player2' : 6, 'Player3' : 7, 'Player4' : 8}
sorted_score = sorted(score.items(), key=lambda x: x[1], reverse=True)
for score in sorted_score:
    print('{} : {}'.format(score[0], score[1]))
This gives the result,
Player1 : 9
Player4 : 8
Player3 : 7
Player2 : 6
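You can adapt my code to read the scores from a file, converting each score to int so that the sort is numeric rather than alphabetical, i.e.
with open("test.txt", "r") as file:
    scores = []
    for row in file.read().splitlines():
        name, value = row.split(' : ')
        scores.append((name, int(value)))
sorted_score = sorted(scores, key=lambda x: x[1], reverse=True)
for score in sorted_score:
    print('{} : {}'.format(score[0], score[1]))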
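To see where the TypeError in the question comes from, here is a small sketch that keeps the question's dictionary approach. The rows list is hypothetical data standing in for the contents of Score.txt; the only real change from the question's loop is using scores[user] instead of score[user] when printing.

```python
# Hypothetical leaderboard lines, in the "name : score" format the game writes.
rows = ["alice : 7\n", "bob : 12\n", "carol : 9\n"]

scores = {}
for row in rows:
    user, score = row.split(":")
    scores[user.strip()] = int(score)  # int() ignores surrounding whitespace

# After the loop, score is still the last string that was unpacked,
# so indexing it with a user name fails.
try:
    score["alice"]
    error = None
except TypeError as exc:
    error = str(exc)

# Sorting the dict yields its keys; look the values up in scores, not score.
ranking = sorted(scores, key=scores.get, reverse=True)
lines = [f"{user} : {scores[user]}" for user in ranking]
print("\n".join(lines))
```

Stripping the user name also removes the trailing space that row.split(':') leaves on the key.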

Never resets list

I am trying to create a calorie counter. The program is run with standard input like this:
python3 calories.txt < test.txt
Inside the calories file, each food is in the following format: apples 500
The problem I am having is that whenever I calculate the values for a person, the list never seems to reset to empty.
import sys
food = {}
eaten = {}
finished = {}
total = 0
#mappings
def calories(x):
    with open(x,"r") as file:
        for line in file:
            lines = line.strip().split()
            key = " ".join(lines[0:-1])
            value = lines[-1]
            food[key] = value
def calculate(x):
    a = []
    for keys,values in x.items():
        for c in values:
            try:
                a.append(int(food[c]))
            except:
                a.append(100)
        print("before",a)
        a = []
        total = sum(a) # Problem here
        print("after",a)
        print(total)
def main():
    calories(sys.argv[1])
    for line in sys.stdin:
        lines = line.strip().split(',')
        for c in lines:
            values = lines[0]
            keys = lines[1:]
            eaten[values] = keys
        calculate(eaten)
if __name__ == '__main__':
    main()
Edit - forgot to include what test.txt would look like:
joe,almonds,almonds,blue cheese,cabbage,mayonnaise,cherry pie,cola
mary,apple pie,avocado,broccoli,butter,danish pastry,lettuce,apple
sandy,zuchini,yogurt,veal,tuna,taco,pumpkin pie,macadamia nuts,brazil nuts
trudy,waffles,waffles,waffles,chicken noodle soup,chocolate chip cookie
How to make it easier on yourself:
When reading the calorie data, convert the calories to int() as soon as possible; there is no need to do it every time you want to sum something up.
A dictionary has a .get(key, defaultvalue) accessor, so "if the food is not found, use 100 as the default" becomes a one-liner without try: ... except:
This works for me, not using sys.stdin but supplying the second file as a file argument as well, instead of piping it into the program using <.
I modified some of the parsing to remove whitespace, and calculate now returns a list of (name, calories) tuples.
May it help you to fix it to your liking:
def calories(x):
    with open(x,"r") as file:
        for line in file:
            lines = line.strip().split()
            key = " ".join(lines[0:-1])
            value = lines[-1].strip() # ensure no whitespaces in
            food[key] = int(value)
def getCal(foodlist, defValueUnknown = 100):
    """Get sum / total calories of a list of ingredients, unknown cost 100."""
    return sum( food.get(x,defValueUnknown ) for x in foodlist) # calculate it, if unknown assume 100
def calculate(x):
    a = []
    for name,foods in x.items():
        a.append((name, getCal(foods))) # append as tuple to list for all names/foods eaten
    return a
def main():
    calories(sys.argv[1])
    with open(sys.argv[2]) as f: # parse as file, not piped in via sys.stdin
        for line in f:
            lines = line.strip().split(',')
            for c in lines:
                values = lines[0].strip()
                keys = [x.strip() for x in lines[1:]] # ensure no whitespaces in
                eaten[values] = keys
    calced = calculate(eaten) # calculate after all are read into the dict
    print (calced)
Output:
[('joe', 1400), ('mary', 1400), ('sandy', 1600), ('trudy', 1000)]
Using sys.stdin and piping just led to my console blinking and waiting for manual input; maybe that is VS related.
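The two suggestions above (convert to int once, use .get() with a default) can be tried out without any files. The following is a minimal sketch: the food table and the sample lines are invented for illustration, and total_calories and totals_per_person are hypothetical helper names, not part of the answer's code.

```python
# Hypothetical calorie table; in the real program it is loaded from the calories file.
food = {"apple": 50, "cola": 150, "waffles": 200}

def total_calories(foods, default=100):
    """Sum the calories of the given foods; unknown foods count as `default`."""
    return sum(food.get(item, default) for item in foods)

def totals_per_person(lines):
    """Map each person to a total, starting from scratch for every line."""
    totals = {}
    for line in lines:
        name, *items = [part.strip() for part in line.split(",")]
        totals[name] = total_calories(items)
    return totals

sample = ["joe,apple,cola,mystery stew", "trudy,waffles,waffles"]
result = totals_per_person(sample)
print(result)
```

Because each total is computed from that person's own list, there is no shared accumulator that needs resetting between people.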

Open dictionary from .txt file and add/update new dict items and write updated dict to .txt file

I'm trying to open a dictionary from a .txt file, add/update new dict items, and write the updated dict back to the .txt file.
When I run the program and enter a date and hours, I get this error:
ValueError: dictionary update sequence element #0 has length 1; 2 is required.
My code:
old_dates_hours = {}
dates_hours = {}
def date():
    date = input('\nEnter a date: ')
    while True:
        try:
            hours = float(input('Enter hours: '))
        except ValueError:
            print('That was not a number!')
            continue
        else:
            break
    dates_hours[date.capitalize()] = hours
    answer = ask_yes_no("\nWant to enter more dates?, Enter 'y' of 'n': ")
    return dates_hours
def ask_yes_no(question):
    """Ask a yes or no question."""
    response = None
    while response not in ('y', 'n'):
        response = input(question).lower()
        if response == 'y':
            date()
        else:
            print('\nYou dont have more hours to fill in ')
    return response
def open_and_read_file():
    hours_file = open('HOURS.txt', 'r+')
    old_dates_hours = hours_file.readline()
    hours_file.close()
    return old_dates_hours
def write_to_file():
    hours_file = open('HOURS.txt', 'w')
    hours_file.write(str(dates_hours))
    hours_file.close()
dates_hours = date()
print('\nNew entered hours: ', dates_hours)
old_dates_hours = open_and_read_file()
print('\nOld hours: ', old_dates_hours)
print('\nThis are the recently given hours: ', dates_hours)
dates_hours.update(old_dates_hours)
print('This are the total saved hours: ', dates_hours)
write_ = write_to_file()
I have tested the "dates_hours.update(old_dates_hours)" call in the code below.
This works for me, but I can't get it working in the code above.
old_dates_hours = {'Za': 4, 'Zo': 6}
dates_hours = {'Ma': 13, 'Di': 9, 'Wo': 9, 'Vr': 5}
dates_hours.update(old_dates_hours)
print(dates_hours)
def write_to_file():
    uren_file = open('HOURS3.txt', 'w')
    uren_file.write(str(dates_hours))
    uren_file.close()
write_ = write_to_file()
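Quote from here:
The method update() adds dictionary dict2's key-values pairs in to dict.
Your open_and_read_file() function returns a string, not a dictionary, because readline() returns a single line of the file as a string. update() then tries to treat each character of that string as a key-value pair, which is why it reports an element of length 1 where 2 is required. You need to make a dictionary from the contents of your "HOURS.txt", then try to update.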
EDIT:
If your txt content is something like {'Mo':4, 'Vr':6, 'So':5}, you can use this:
import ast
# rest of the code
def open_and_read_file():
    hours_file = open('HOURS.txt', 'r+')
    old_dates_hours = ast.literal_eval(hours_file.read())
    hours_file.close()
    return old_dates_hours
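Put together, the read, update and write steps might look like the sketch below. It is only an illustration under a few assumptions: the file holds a single dict literal as written by str(), an empty file is treated as an empty dict, and a throwaway path in a temporary directory stands in for HOURS.txt. The helper names load_hours and save_hours are made up. Unlike the question's code, the new entries are merged into the old ones, so a newly entered value wins when a date appears in both.

```python
import ast
import os
import tempfile

def load_hours(path):
    """Read a dict literal from the file; an empty file gives an empty dict."""
    with open(path, "r") as f:
        text = f.read().strip()
    return ast.literal_eval(text) if text else {}

def save_hours(path, hours):
    """Write the dict back as a Python literal."""
    with open(path, "w") as f:
        f.write(str(hours))

# Throwaway file standing in for HOURS.txt.
path = os.path.join(tempfile.mkdtemp(), "HOURS.txt")
save_hours(path, {"Za": 4, "Zo": 6})

dates_hours = {"Ma": 13.0, "Za": 5.0}  # newly entered hours
merged = load_hours(path)
merged.update(dates_hours)             # new entries override old ones
save_hours(path, merged)
final = load_hours(path)
print(final)
```

ast.literal_eval only accepts Python literals, so it is a safer choice than eval() for reading the file back; the json module would be another option if the file does not need to stay a Python literal.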

Assigning multiple values to dictionary keys from a file in Python 3

I'm fairly new to Python but I haven't found the answer to this particular problem.
I am writing a simple recommendation program and I need to have a dictionary where cuisine is a key and name of a restaurant is a value. There are a few instances where I have to split a string of a few cuisine names and make sure all other restaurants (values) which have the same cuisine get assigned to the same cuisine (key). Here's a part of a file:
Georgie Porgie
87%
$$$
Canadian, Pub Food

Queen St. Cafe
82%
$
Malaysian, Thai

Mexican Grill
85%
$$
Mexican

Deep Fried Everything
52%
$
Pub Food
so it's just the first and the last one with the same cuisine but there are more later in the file.
And here is my code:
def new(file):
    file = "/.../Restaurants.txt"
    d = {}
    key = []
    with open(file) as file:
        lines = file.readlines()
    for i in range(len(lines)):
        if i % 5 == 0:
            if "," not in lines[i + 3]:
                d[lines[i + 3].strip()] = [lines[i].strip()]
            else:
                key += (lines[i + 3].strip().split(', '))
                for j in key:
                    if j not in d:
                        d[j] = [lines[i].strip()]
                    else:
                        d[j].append(lines[i].strip())
    return d
It gets all the keys and values printed but it doesn't assign two values to the same key where it should. Also, with this last 'else' statement, the second restaurant is assigned to the wrong key as a second value. This should not happen. I would appreciate any comments or help.
In the case where there is only one category, you don't check whether the key is already in the dictionary, so an existing list gets overwritten. You should handle this the same way as in the case of multiple categories, and then it works fine.
I don't know why you have file as an argument when it is then overwritten.
Additionally, you should build 'key' afresh for each restaurant, and not use += (which adds to the existing 'key').
When you check whether j is in the dictionary, you can check whether it is in the keys (d.keys()); plain j in d is equivalent.
def new(file):
    file = "/.../Restaurants.txt"
    d = {}
    key = []
    with open(file) as file:
        lines = file.readlines()
    for i in range(len(lines)):
        if i % 5 == 0:
            if "," not in lines[i + 3]:
                # strip the newline once, so lookup and insert use the same key
                cuisine = lines[i + 3].strip()
                if cuisine not in d.keys():
                    d[cuisine] = [lines[i].strip()]
                else:
                    d[cuisine].append(lines[i].strip())
            else:
                key = (lines[i + 3].strip().split(', '))
                for j in key:
                    if j not in d.keys():
                        d[j] = [lines[i].strip()]
                    else:
                        d[j].append(lines[i].strip())
    return d
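The "check if the key exists, otherwise create the list" pattern in both branches can be folded into one call to dict.setdefault. The following is a compact sketch of that idea, not the answerer's code. It assumes records of four lines separated by a blank line (which is what the i % 5 step implies), and uses an inlined copy of the sample data and a made-up function name, cuisines_to_restaurants.

```python
# Inlined copy of the sample records; blank lines separate the restaurants.
SAMPLE = """\
Georgie Porgie
87%
$$$
Canadian, Pub Food

Queen St. Cafe
82%
$
Malaysian, Thai

Mexican Grill
85%
$$
Mexican

Deep Fried Everything
52%
$
Pub Food
"""

def cuisines_to_restaurants(text):
    """Map each cuisine to the list of restaurant names that serve it."""
    index = {}
    for record in text.strip().split("\n\n"):
        name, _rating, _price, cuisines = record.split("\n")
        for cuisine in cuisines.split(","):
            # setdefault returns the existing list, or stores and returns a new one
            index.setdefault(cuisine.strip(), []).append(name.strip())
    return index

index = cuisines_to_restaurants(SAMPLE)
print(index)
```

Splitting on "," and stripping each part handles the single-cuisine and multi-cuisine lines with the same code path.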
Normally, I find that if you use names for the dictionary keys, you may have an easier time handling them later.
In the example below, I return a series of dictionaries, one for each restaurant. I also wrap the functionality of processing the values in a method called add_value(), to keep the code more readable.
In my example, I open the file with an explicit encoding so that the values are decoded consistently. Although not necessary, depending on the characters you are dealing with it may be useful. I'm also using itertools to read the file lines with an iterator. Again, not necessary depending on the case, but might be useful if you are dealing with really big files.
import copy, itertools
class RestaurantListParser(object):
    file_name = "restaurants.txt"
    base_item = {
        "_type": "undefined",
        "_fields": {
            "name": "undefined",
            "nationality": "undefined",
            "rating": "undefined",
            "pricing": "undefined",
        }
    }
    def add_value(self, formatted_item, field_name, field_value):
        if isinstance(field_value, str):
            # strip, process the values as you need.
            field_value = field_value.strip()
            formatted_item["_fields"][field_name] = field_value
        else:
            print('Error parsing field "%s", with value: %s' % (field_name, field_value))
    def generator(self, file_name):
        # decode the file as UTF-8 while reading
        with open(file_name, encoding='utf-8') as file:
            while True:
                # each record is 4 data lines plus the blank separator line
                lines = tuple(itertools.islice(file, 5))
                if not lines: break
                # Initialize our dictionary for this item
                formatted_item = copy.deepcopy(self.base_item)
                if "," not in lines[3]:
                    formatted_item['_type'] = lines[3].strip()
                else:
                    formatted_item['_type'] = lines[3].split(',')[1].strip()
                    self.add_value(formatted_item, 'nationality', lines[3].split(',')[0])
                self.add_value(formatted_item, 'name', lines[0])
                self.add_value(formatted_item, 'rating', lines[1])
                self.add_value(formatted_item, 'pricing', lines[2])
                yield formatted_item
    def split_by_type(self):
        d = {}
        for restaurant in self.generator(self.file_name):
            if restaurant['_type'] not in d:
                d[restaurant['_type']] = [restaurant['_fields']]
            else:
                d[restaurant['_type']] += [restaurant['_fields']]
        return d
Then, if you run:
p = RestaurantListParser()
print(p.split_by_type())
You should get:
{
'Mexican': [{
'name': 'Mexican Grill',
'nationality': 'undefined',
'pricing': '$$',
'rating': '85%'
}],
'Pub Food': [{
'name': 'Georgie Porgie',
'nationality': 'Canadian',
'pricing': '$$$',
'rating': '87%'
}, {
'name': 'Deep Fried Everything',
'nationality': 'undefined',
'pricing': '$',
'rating': '52%'
}],
'Thai': [{
'name': 'Queen St. Cafe',
'nationality': 'Malaysian',
'pricing': '$',
'rating': '82%'
}]
}
Your solution is simple, so it's ok. I'd just like to mention a couple of ideas that come to mind when I think about this kind of problem.
Here's another take, using defaultdict and split to simplify things.
from collections import defaultdict
record_keys = ['name', 'rating', 'price', 'cuisine']
def load(file):
    with open(file) as file:
        # strip so that trailing blank lines do not produce an empty record
        data = file.read().strip()
    restaurants = []
    # chop up input on each blank line (2 newlines in a row)
    for record in data.split("\n\n"):
        fields = record.split("\n")
        # build a dictionary by zipping together the fixed set
        # of field names and the values from this particular record
        restaurant = dict(zip(record_keys, fields))
        # split chops apart the type cuisine on comma, then _.strip()
        # removes any leading/trailing whitespace on each type of cuisine
        restaurant['cuisine'] = [_.strip() for _ in restaurant['cuisine'].split(",")]
        restaurants.append(restaurant)
    return restaurants
def build_index(database, key, value):
    index = defaultdict(set)
    for record in database:
        for v in record.get(key, []):
            # defaultdict will create a set if one is not present or add to it if one does
            index[v].add(record[value])
    return index
restaurant_db = load('/var/tmp/r')
print(restaurant_db)
by_type = build_index(restaurant_db, 'cuisine', 'name')
print(by_type)
