How to create a dictionary from a file with multiple lines - python-3.x

I'm trying to create a dictionary from multiple lines in a file, for example:
grocery store
apples
banana
bread
shopping mall
movies
clothing stores
shoe stores
What I'm trying to do is make the first row of each section (i.e. grocery store and shopping mall) the keys and everything underneath (apples, banana, bread and movies, clothing stores, shoe stores, respectively) the values. I've been fiddling around with the readline approach plus a while loop, but I haven't been able to figure it out. If anyone knows, please help. Thanks.

One solution is to store in a variable the boolean value for whether you're at the start of a section. I don't want to give away the exciting (?) ending, but you could start with is_first=True.
OK, I guess I do want to give away the ending after all. Here's what I had in mind, more or less:
with open(fname) as f:
    content = f.readlines()
is_first = True
d = {}
for line in content:
    if line == '\n':
        # A blank line ends the section; the next line is a new key.
        is_first = True
    elif is_first:
        key = line
        is_first = False
    else:
        # dicts have no put(); use get() with a default and assign by key.
        d[key] = d.get(key, '') + line
I find it easier to plan the code that way. Of course you could also solve this without an is_first variable, especially if you've already gone through the exercise of doing it with an is_first variable. I think the following is correct, but I wasn't incredibly careful:
with open(fname) as f:
    content = f.readlines()
d = {}
while content:
    key, content = content[0], content[1:]
    if key != '\n':
        # Consume lines up to the next blank line (or the end of the file).
        while content and content[0] != '\n':
            value, content = content[0], content[1:]
            d[key] = d.get(key, '') + value

@minopret has already given a pedagogically useful answer, and one that's important for beginners to understand. In a sense, even some seemingly more sophisticated approaches are often doing that under the hood -- using a kind of state machine, I mean -- so it's important to know.
But for the heck of it, I'll describe a higher-level approach. There's a handy function, itertools.groupby, which splits a sequence into contiguous groups. In this case, we can define a group as a run of consecutive non-empty lines: after stripping, bool(line) is False if the line is empty and True otherwise. We then build a dict from those groups.
from itertools import groupby
with open("shopdict.txt") as fin:
    stripped = map(str.strip, fin)
    grouped = (list(g) for k,g in groupby(stripped, bool) if k)
    d = {g[0]: g[1:] for g in grouped}
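To see what groupby is doing here, a small sketch on an in-memory list may help. The sample lines are made up for illustration and stand in for the stripped contents of the file. With bool as the key function, consecutive non-empty strings share the key True and each run of empty strings gets the key False:

```python
from itertools import groupby

# Hypothetical sample standing in for the stripped lines of the file.
lines = ["grocery store", "apples", "banana", "", "shopping mall", "movies"]

# groupby yields one (key, group) pair for each run of equal keys.
groups = [(k, list(g)) for k, g in groupby(lines, bool)]
for key, group in groups:
    print(key, group)

# Keep only the non-empty runs: the first line is the key, the rest are values.
d = {g[0]: g[1:] for k, g in groups if k}
print(d)
```

The group iterators are turned into lists right away because groupby invalidates a group as soon as the next one is requested.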

from itertools import groupby
with open("shopdict.txt") as fin:
    stripped = map(str.strip, fin)
    d = {k: g for b, (k, *g) in groupby(stripped, bool) if b}
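And here's a way just using for loops:
d = {}
with open("shopdict.txt") as fin:
    for key in fin:
        key = key.strip()
        d[key] = []
        # The inner loop keeps reading from the same file iterator,
        # so it consumes this section's items up to the blank line.
        for item in fin:
            if item.isspace():
                break
            d[key].append(item.strip())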
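Another option, sketched here under the assumption that sections are separated by at least one blank line (as the answers above also assume), is to split the whole text on blank lines. The function name parse_sections and the sample text are made up for illustration:

```python
def parse_sections(text):
    """Map the first line of each blank-line-separated block to the lines after it."""
    sections = {}
    for block in text.split("\n\n"):
        # Drop empty lines and surrounding whitespace within the block.
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if lines:
            sections[lines[0]] = lines[1:]
    return sections


sample = (
    "grocery store\napples\nbanana\nbread\n"
    "\n"
    "shopping mall\nmovies\nclothing stores\nshoe stores\n"
)
print(parse_sections(sample))
```

For a real file, the text would come from something like open(fname).read(). This reads the whole file into memory, which is fine for small files but not ideal for very large ones.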

Related

How can I optimise my code and make it readable?

The task is:
The user enters a number. You take one digit from the left and one from the right and sum them. Then you take the rest of the number and sum every digit in it. That gives two results. You have to sort them from largest to smallest and join them into a single number. I solved it, but I don't like how it looks. The task is pretty simple, but my code looks like trash. Maybe I should use more built-in functions and libraries. If so, could you please recommend some? Thank you.
a = int(input())
b = [int(i) for i in str(a)]
closesum = 0
d = []
e = ""
farsum = b[0] + b[-1]
print(farsum)
b.pop(0)
b.pop(-1)
print(b)
for i in b:
    closesum += i
print(closesum)
d.append(int(closesum))
d.append(int(farsum))
print(d)
for i in sorted(d, reverse = True):
    e += str(i)
print(int(e))
input()
You can use reduce
from functools import reduce
a = [0,1,2,3,4,5,6,7,8,9]
print(reduce(lambda x, y: x + y, a))
# 45
You can also pass in a shortened list instead of popping elements: b[1:-1]
The first two lines:
str_input = input() # input will always read strings
num_list = [int(i) for i in str_input]
The for loop at the end is unnecessary, and there is no need to sort only two elements. You can use a simple if/else condition to print what you want.
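Putting these suggestions together, a minimal sketch might look like the following. The function name combine_digits is made up for illustration, the built-in sum() stands in for reduce, and the input is assumed to be a string of at least two digits:

```python
def combine_digits(number_text):
    """Join (first + last digit) and (sum of the middle digits), larger value first."""
    digits = [int(ch) for ch in number_text]
    far_sum = digits[0] + digits[-1]
    # Slice off the first and last digits instead of popping them.
    close_sum = sum(digits[1:-1])
    # Only two values, so a plain comparison replaces sorting.
    if far_sum >= close_sum:
        return int(f"{far_sum}{close_sum}")
    return int(f"{close_sum}{far_sum}")


print(combine_digits("12345"))  # far sum 6, close sum 9
```

Comparing the two sums as numbers rather than as strings matters once one of them has two digits, because string comparison works character by character.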
You don't need a loop to sum a slice of a list. You can also use join to concatenate a list of strings without looping. Sort the numbers before converting them to strings with map(str, ...): sorting the strings instead compares them character by character, so '9' would sort ahead of '10'.
farsum = b[0] + b[-1]
closesum = sum(b[1:-1])
"".join(map(str, sorted((farsum, closesum), reverse=True)))

Comparing two arrays and getting the values which are not common

I am working on a problem a friend gave me: you are given two arrays, say a = [1,2,3,4] and b = [8,7,9,2,1], and you have to find the elements they do not have in common.
Expected output is [3,4,8,7,9]. Code below.
def disjoint(e,f):
    c = e[:]
    d = f[:]
    for i in range(len(e)):
        for j in range(len(f)):
            if e[i] == f[j]:
                c.remove(e[i])
                d.remove(d[j])
    final = c + d
    print(final)
print(disjoint(a,b))
I tried nested loops, creating copies of the given arrays, modifying the copies and then adding them together, but...
def disjoint(e,f):
    c = e[:] # list copies
    d = f[:]
    for i in range(len(e)):
        for j in range(len(f)):
            if e[i] == f[j]:
                c.remove(c[i]) # edited this line
                d.remove(d[j])
    final = c + d
    print(final)
print(disjoint(a,b))
When I try removing the common element from the list copies instead, I get a different output: [2,4,8,7,9]. Why?
This is my first question on this website. I'll be thankful if anyone can clear up my doubts.
Using sets you can do:
a = [1,2,3,4]
b = [8,7,9,2,1]
diff = (set(a) | set(b)) - (set(a) & set(b))
(set(a) | set(b)) is the union, set(a) & set(b) is the intersection and finally you do the difference between the two sets using -.
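Your bug comes when you remove the elements in the lines c.remove(c[i]) and d.remove(d[j]). Indeed, the common elements are e[i] and f[j], while c and d are the lists you are updating. Once an element has been removed, the indices of c and d no longer line up with those of e and f, so c[i] can be a different element from e[i].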
To fix your bug you only need to change these lines to c.remove(e[i]) and d.remove(f[j]).
Note also that your method to delete items in both lists will not work if a list may contain duplicates.
Consider for instance the case a = [1,1,2,3,4] and b = [8,7,9,2,1].
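One way to sidestep the duplicate problem, sketched here as an assumption about the desired behaviour rather than as part of the answer above, is to compute the common values once with sets and then filter both lists. The name disjoint_ordered is hypothetical, and the elements are assumed to be hashable:

```python
def disjoint_ordered(e, f):
    """Return the values found in only one of the two lists, in input order."""
    common = set(e) & set(f)
    # Every occurrence of a common value is dropped, so duplicates cannot
    # trigger a failed remove().
    return [x for x in e + f if x not in common]


print(disjoint_ordered([1, 2, 3, 4], [8, 7, 9, 2, 1]))
print(disjoint_ordered([1, 1, 2, 3, 4], [8, 7, 9, 2, 1]))
```

Unlike a plain set difference, this keeps the original order and keeps repeated values that appear in only one list.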
You can simplify your code to make it work:
def disjoint(e,f):
    c = e.copy() # [:] also works, but I think this is clearer
    d = f.copy()
    for i in e: # no need for an index; just walk over each item
        for j in f:
            if i == j: # if there is a match, remove it from both copies
                c.remove(i)
                d.remove(j)
    return c + d
print(disjoint([1,2,3,4],[8,7,9,2,1]))
There are many more efficient ways to achieve this. Check this Stack Overflow question to discover them: Get difference between two lists. My favorite way is to use a set (as in @newbie's answer). What is a set? Let's check the documentation:
A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. (For other containers see the built-in dict, list, and tuple classes, and the collections module.)
emphasis mine
Symmetric difference is perfect for our need!
Returns a new set with elements in either the set or the specified iterable but not both.
OK, here is how to use it in your case:
def disjoint(e,f):
    return list(set(e).symmetric_difference(set(f)))
print(disjoint([1,2,3,4],[8,7,9,2,1]))

Old maid card game

I am trying to write an Old Maid card game. I have reached the stage of removing pairs: if two cards have the same number (2-10) or the same letter (A, K, Q, J), both should be deleted. I have written several lines of code, but it does not work. Could you tell me why and help me fix it?
How can I identify the same number with different suits and delete both of them in the same list?
def x(alist):
    n = '2345678910AKJQ'
    a=[]
    b=[]
    for i in alist:
        j = ''.join([k for k in i if k in n])
        if not j in b:
            a.append(i)
            b.append(j)
    return a
Create a defaultdict of lists, split each card into its rank and its suit (the last character; I used standard letters, not symbols), and build a list comprehension of the keys that have only one value.
from collections import defaultdict
deck = ['10H','AS','AH','4C','4S','5D']
dd = defaultdict(list)
for d in deck:
    # Everything except the last character is the rank; the last is the suit.
    dd[d[:-1]].append(d[-1])
print([k for k,v in dd.items() if len(v)==1])
result:
['5', '10']
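The snippet above reports ranks only. If the goal is to get back the cards that remain after pairs are discarded, a sketch along the same lines could group whole cards by rank. This assumes any two cards of the same rank form a pair regardless of suit, and the function name remove_pairs is made up for illustration:

```python
from collections import defaultdict


def remove_pairs(hand):
    """Discard cards in pairs of equal rank and return the cards left over."""
    by_rank = defaultdict(list)
    for card in hand:
        # Rank is everything before the final suit letter, e.g. '10' in '10H'.
        by_rank[card[:-1]].append(card)
    remaining = []
    for cards in by_rank.values():
        # An odd count leaves exactly one unpaired card of that rank.
        if len(cards) % 2 == 1:
            remaining.append(cards[-1])
    return remaining


print(remove_pairs(['10H', 'AS', 'AH', '4C', '4S', '5D']))
```

With three cards of one rank, two are discarded as a pair and one stays in the hand.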

How do I make a program that checks if there are two of the same characters in a string?

I need a program that checks whether there are two or more of the same character in a string. They don't have to be right next to each other, like bb; they can be farther apart, like Bob. The same character just has to appear more than once.
What I have now doesn't work because it automatically says cool has two of the same characters:
import collections
word = 'cool'
c = collections.Counter(word)
if c>1:
    print (word,'has two of the same characters:')
else:
    print (word,'has no same characters:')
You are almost there, just need the .values() method of Counters. The following tests both cases.
import collections
def dupchar(word):
    c = collections.Counter(word)
    return any(i >= 2 for i in c.values())
for word in 'hot', 'cool':
    print('{} has {} same characters'.format(word,
        'two of the' if dupchar(word) else 'no'))
prints
hot has no same characters
cool has two of the same characters
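If the counts themselves are not needed, a shorter check is possible: a string has a repeated character exactly when it has fewer distinct characters than total characters. The sketch below uses a hypothetical helper name, has_repeated_char, and adds an optional ignore_case flag as an assumption, since the question's Bob example only repeats a letter when case is ignored:

```python
def has_repeated_char(word, ignore_case=False):
    """Return True if any character occurs more than once in word."""
    if ignore_case:
        word = word.lower()
    # A set keeps one copy of each character, so a shorter set means a repeat.
    return len(set(word)) < len(word)


for word in 'hot', 'cool', 'Bob':
    print(word, has_repeated_char(word), has_repeated_char(word, ignore_case=True))
```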

Create a dictionary from a file

I am writing code that lets the user input a .txt file of their choice. So, for example, if the text read:
"I am you. You ArE I."
I would like my code to create a dictionary that resembles this:
{I: 2, am: 1, you: 2, are: 1}
Having the words in the file appear as the key, and the number of times as the value. Capitalization should be irrelevant, so are = ARE = ArE = arE = etc...
This is my code so far. Any suggestions/help?
file = input("\n Please select a file")
name = open(file, 'r')
dictionary = {}
with name:
    for line in name:
        (key, val) = line.split()
        dictionary[int(key)] = val
Take a look at the examples in this answer:
Python : List of dict, if exists increment a dict value, if not append a new dict
You can use collections.Counter() to trivially do what you want, but if for some reason you can't use that, you can use a defaultdict or even a simple loop to build the dictionary you want.
Here is code that solves your problem. This will work in Python 3.1 and newer.
from collections import Counter
import string
def filter_punctuation(s):
    return ''.join(ch if ch not in string.punctuation else ' ' for ch in s)
def lower_case_words(f):
    for line in f:
        line = filter_punctuation(line)
        for word in line.split():
            yield word.lower()
def count_key(tup):
    """
    key function to make a count dictionary sort into descending order
    by count, then case-insensitive word order when counts are the same.
    tup must be a tuple in the form: (word, count)
    """
    word, count = tup
    return (-count, word.lower())
dictionary = {}
fname = input("\nPlease enter a file name: ")
with open(fname, "rt") as f:
    dictionary = Counter(lower_case_words(f))
print(sorted(dictionary.items(), key=count_key))
From your example I could see that you wanted punctuation stripped away. Since we are going to split the string on white space, I wrote a function that filters punctuation to white space. That way, if you have a string like hello,world this will be split into the words hello and world when we split on white space.
The function lower_case_words() is a generator, and it reads an input file one line at a time and then yields up one word at a time from each line. This neatly puts our input processing into a tidy "black box" and later we can simply call Counter(lower_case_words(f)) and it does the right thing for us.
Of course you don't have to print the dictionary sorted, but I think it looks better this way. I made the sort order put the highest counts first, and where counts are equal, put the words in alphabetical order.
With your suggested input, this is the resulting output:
[('i', 2), ('you', 2), ('am', 1), ('are', 1)]
Because of the sorting it always prints in the above order.
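For comparison, the defaultdict alternative mentioned earlier could be sketched as follows. It works on a string rather than a file so that it runs on its own, and the function name count_words is made up for illustration. Punctuation is mapped to spaces before splitting, mirroring filter_punctuation above:

```python
from collections import defaultdict
import string

# Translation table that turns every punctuation character into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def count_words(text):
    """Count case-insensitive word occurrences in text."""
    counts = defaultdict(int)
    for word in text.translate(PUNCT_TO_SPACE).lower().split():
        # Missing keys start at 0, so no existence check is needed.
        counts[word] += 1
    return dict(counts)


print(count_words("I am you. You ArE I."))
```

For a file, the text could come from f.read(), or the function could be called once per line with the counts merged.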
