return dictionary of file names as keys and word lists with words unique to file as values - python-3.x

I am trying to write a function that extracts only the words unique to each key and returns them in a dictionary like {"key1": "unique words", "key2": "unique words", ... }. I start out with a dictionary. To test it, I created a simple one:
d = {1:["one", "two", "three"], 2:["two", "four", "five"], 3:["one","four", "six"]}
My output should be:
{1:"three",
2:"five",
3:"six"}
I am thinking maybe I should split it into separate lists:
def return_unique(dct):
    Klist = list(dct.keys())
    Vlist = list(dct.values())
    aList = []
    for i in range(len(Vlist)):
        for j in Vlist[i]:
            if
What I'm stuck on is how do I tell Python to do this: if Vlist[i][j] is not in the rest of Vlist then aList.append(Vlist[i][j]).
Thank you.

You can try something like this:
def return_unique(data):
    all_values = []
    for i in data.values():  # Get all values
        all_values = all_values + i
    unique_values = set([x for x in all_values if all_values.count(x) == 1])  # Values which are not duplicated
    for key, value in data.items():  # For Python 3.x (for Python 2.x -> data.iteritems())
        for item in value:  # Comparing values of two lists
            for item1 in unique_values:
                if item == item1:
                    data[key] = item
    return data
d = {1:["one", "two", "three"], 2:["two", "four", "five"], 3:["one","four", "six"]}
print (return_unique(d))
result >> {1: 'three', 2: 'five', 3: 'six'}

Since a key may have more than one unique word associated with it, it makes sense for the values in the new dictionary to be a container type object to hold the unique words.
The set difference operator returns the difference between 2 sets:
>>> a = set([1, 2, 3])
>>> b = set([2, 4, 6])
>>> a - b
{1, 3}
We can use this to get the values unique to each key. Packaging these into a simple function yields:
def unique_words_dict(data):
    res = {}
    values = []
    for k in data:
        for g in data:
            if g != k:
                values += data[g]
        res[k] = set(data[k]) - set(values)
        values = []
    return res
>>> d = {1:["one", "two", "three"],
...      2:["two", "four", "five"],
...      3:["one","four", "six"]}
>>> unique_words_dict(d)
{1: {'three'}, 2: {'five'}, 3: {'six'}}
If you only have to do this once, you might be interested in the less efficient but more concise dictionary comprehension:
>>> from functools import reduce
>>> {k: set(d[k]) - set(reduce(lambda a, b: a+b, [d[g] for g in d if g!=k], [])) for k in d}
{1: {'three'}, 2: {'five'}, 3: {'six'}}
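For larger inputs, one possible alternative is to count in how many keys each word occurs and then filter each list once. The following is only a sketch: the function name unique_words_by_key is made up for illustration, and it assumes you want lists that keep the original word order rather than sets.

```python
from collections import Counter

def unique_words_by_key(data):
    # Count under how many keys each word appears; set() ignores repeats within one list.
    key_counts = Counter(word for words in data.values() for word in set(words))
    # Keep, in original order, the words that appear under exactly one key.
    return {k: [w for w in words if key_counts[w] == 1] for k, words in data.items()}

d = {1: ["one", "two", "three"], 2: ["two", "four", "five"], 3: ["one", "four", "six"]}
print(unique_words_by_key(d))  # {1: ['three'], 2: ['five'], 3: ['six']}
```

This makes one pass to count and one pass to filter, instead of rebuilding the list of "all other values" for every key.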

Related

Python reverse dictionary lookup in list comprehension

I have the following dictionary:
d = {}
d[1] = 'a'
d[2] = 'b'
d[3] = 'c'
d[4] = 'd'
I'd like to perform a reverse dictionary lookup for each character in a string:
input_string = "bad"
I get different results when I do this in a list comprehension as opposed to a nested for loop, and I don't understand why. As I understand, the list comprehension and the nested for loop should yield identical results. The list comprehension yields a list whose results are not in the order I would expect. My desired result here is that which is provided by the nested for loop, however I prefer to use the list comprehension to accomplish that. Perhaps this has something to do with python dictionary order of which I am unaware?
result1 = [key for key, value in d.items() for i in input_string if i == value]
print(result1)
> [1, 2, 4]
result2 = list()
for i in input_string:
    for key, value in d.items():
        if i == value:
            result2.append(key)
print(result2)
> [2, 1, 4]
In order to mimic the traditional loop, the outer loop should be over input_string and the inner loop should be over d in the list comprehension:
out = [k for i in input_string for k,v in d.items() if i==v]
Output:
[2, 1, 4]
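If the dictionary values are unique (an assumption that holds for the example above), another option is to build the inverse mapping once and look each character up directly. The name reverse is just illustrative:

```python
d = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
input_string = "bad"

# Invert the mapping once: value -> key (assumes no duplicate values).
reverse = {value: key for key, value in d.items()}

# Look up each character in input order, skipping characters that have no key.
result = [reverse[ch] for ch in input_string if ch in reverse]
print(result)  # [2, 1, 4]
```

The result follows the order of input_string by construction, and the dictionary is not rescanned for every character.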

Python: Convert 2d list to dictionary with indexes as values

I have a 2d list with arbitrary strings like this:
lst = [['a', 'xyz' , 'tps'], ['rtr' , 'xyz']]
I want to create a dictionary out of this:
{'a': 0, 'xyz': 1, 'tps': 2, 'rtr': 3}
How do I do this? This answer answers for 1D list for non-repeated values, but, I have a 2d list and values can repeat. Is there a generic way of doing this?
Maybe you could use two for-loops:
lst = [['a', 'xyz' , 'tps'], ['rtr' , 'xyz']]
d = {}
overall_idx = 0
for sub_lst in lst:
    for word in sub_lst:
        if word not in d:
            d[word] = overall_idx
            # Increment overall_idx below if you want to only increment if word is not previously seen
            # overall_idx += 1
        overall_idx += 1
print(d)
Output:
{'a': 0, 'xyz': 1, 'tps': 2, 'rtr': 3}
You could first flatten the list of lists into a single list using a 'double' list comprehension.
Next, get rid of all the duplicates using a dictionary comprehension. We could use a set for that, but we would lose the order.
Finally, use another dictionary comprehension to get the desired result.
lst = [['a', 'xyz' , 'tps'], ['rtr' , 'xyz']]
# flatten list of lists to a list
flat_list = [item for sublist in lst for item in sublist]
# remove duplicates
ordered_set = {x:0 for x in flat_list}.keys()
# create required output
the_dictionary = {v:i for i, v in enumerate(ordered_set)}
print(the_dictionary)
""" OUTPUT
{'a': 0, 'xyz': 1, 'tps': 2, 'rtr': 3}
"""
Also, with collections and itertools:
import itertools
from collections import OrderedDict
lstdict={}
lst = [['a', 'xyz' , 'tps'], ['rtr' , 'xyz']]
lstkeys = list(OrderedDict(zip(itertools.chain(*lst), itertools.repeat(None))))
lstdict = {lstkeys[i]: i for i in range(0, len(lstkeys))}
lstdict
output:
{'a': 0, 'xyz': 1, 'tps': 2, 'rtr': 3}
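On Python 3.7+, where plain dictionaries preserve insertion order, the same idea can probably be written more compactly with dict.fromkeys and itertools.chain.from_iterable. This is a sketch; the function name index_words is hypothetical:

```python
from itertools import chain

def index_words(nested):
    # dict.fromkeys drops duplicates while keeping first-seen order (Python 3.7+).
    unique_in_order = dict.fromkeys(chain.from_iterable(nested))
    # Number the unique words consecutively.
    return {word: i for i, word in enumerate(unique_in_order)}

lst = [['a', 'xyz', 'tps'], ['rtr', 'xyz']]
print(index_words(lst))  # {'a': 0, 'xyz': 1, 'tps': 2, 'rtr': 3}
```

Unlike the two-loop version with the counter incremented for every word, the indexes here are always consecutive, even when a repeated word is followed by a new one.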

swap the keys and values in a dictionary by storing the user input in dic

First, the input gives the dictionary length (say 3). Then the dictionary entries follow as keys and values separated by spaces, i.e.
"A 1
B 2
C 1"
Now dic={'A':1, 'B':2, 'C':1}
First, the keys and values should be swapped. If several keys share the same value, those keys should be merged into a list and assigned to that value, as shown below. (The program should work for a dictionary of any length.)
The output should be dicout={1:['A','C'], 2:'B'}.
Thank you.
Define:
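from collections import defaultdict
def make_dict(s):
    d = defaultdict(list)
    xs = s.split()  # split on any whitespace, so newline-separated input also works
    for k, v in zip(xs[1::2], xs[::2]):  # values become keys, keys become values
        d[k].append(v)
    for k, v in d.items():
        if len(v) == 1:  # unwrap single-element lists
            d[k] = v[0]
    return dict(d)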
Example usage:
>>> make_dict("A 1 B 2 C 1")
{'1': ['A', 'C'], '2': 'B'}
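If you want integer keys, as in the question's expected output, a line-based variant might look like the sketch below. The function name invert_pairs is made up, and it assumes each line holds exactly one key and one integer value. It also keeps every value as a list, which tends to be easier to work with than a mix of lists and single items.

```python
from collections import defaultdict

def invert_pairs(text):
    inverted = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, value = line.split()  # assumes exactly two fields per line
        inverted[int(value)].append(key)
    return dict(inverted)

print(invert_pairs("A 1\nB 2\nC 1"))  # {1: ['A', 'C'], 2: ['B']}
```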

Printing list containing a string

I am trying to store a variable containing some names in a list and print it, but I am unable to print the values stored in the variable.
name='vsb','siva','anand','soubhik' #variable containing some names
lis=['name'] # storing the variable in a list
for x in lis:
    print(x) #printing the list using loops
Maybe a dictionary? Try this:
variable_1 = "aa"
variable_2 = "bb"
lis = {}
lis['name1'] = variable_1
lis['name2'] = variable_2
for i in lis:
    print(i)
    print(lis[i])
Your name variable is actually a tuple.
Example of tuple declaration:
tup1 = ('physics', 'chemistry', 1997, 2000)
tup2 = (1, 2, 3, 4, 5 )
tup3 = "a", "b", "c", "d"
Example of list declaration:
list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5 ]
list3 = ["a", "b", "c", "d"]
For a better understanding you should read The Python Standard Library or do a tutorial.
For your problem maybe the dictionary is the solution:
# A tuple is a sequence of immutable Python objects
name='vsb','siva','anand','soubhik'
print('Tuple: ' + str(name)) # ('vsb', 'siva', 'anand', 'soubhik')
# This is a list containing one element: 'name'
lis=['name']
print('List: ' + str(lis)) # ['name']
# Dictionary with key 'name' and value ('vsb','siva','anand','soubhik')
dictionary={'name':name}
print('Dictionary: ' + str(dictionary))
print('Dictionary elements:')
print(dictionary['name'])
print('Tuple elements:')
for x in name:
    print(x)
print('List elements:')
for x in lis:
    print(x)
Output
Tuple: ('vsb', 'siva', 'anand', 'soubhik')
List: ['name']
Dictionary: {'name': ('vsb', 'siva', 'anand', 'soubhik')}
Dictionary elements:
('vsb', 'siva', 'anand', 'soubhik')
Tuple elements:
vsb
siva
anand
soubhik
List elements:
name
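If the goal is simply to print the names, the likely culprit in the question is the quotes: ['name'] stores the string 'name', not the variable. A minimal sketch of two ways to put the variable's contents into a list (the name nested is only illustrative):

```python
name = 'vsb', 'siva', 'anand', 'soubhik'  # this is a tuple of four strings

# Option 1: copy the tuple's items into a list.
lis = list(name)
for x in lis:
    print(x)

# Option 2: store the tuple itself as the single element of a list.
nested = [name]
for group in nested:
    for x in group:
        print(x)
```

Both loops print the four names, one per line.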

Get value from another defaultdict and update the original dict

Basically, I am trying to extract the values from one dictionary and update the value in another dictionary. I have four lists as follows:
a = [1,1,2,3,4,5]
b = [0,3,0,5,6,0]
c = [2,3,4,5,6,5]
d = [20,30,40,50,60,70]
So I use a defaultdict to store key, value pairs for a, b like:
from collections import defaultdict
one = defaultdict(list)
for k, v in zip(a, b):
    one[k].append(v)
two = defaultdict(list)
for k, v in zip(c, d):
    two[k].append(v)
Essentially, b is linked to c, so I am trying to extract the values in the two dictionary and then update the values in the one dictionary.
So in the end one would look like {1: 30, 3: 50, 4: 60}
This is my code:
three = defaultdict(list)
for k, v in one.items():
    if v in two.keys():
        newvalue = two[v].values()
        three[k].append(newvalue)
But I am now getting an error at the line if v in two.keys(): saying unhashable type: 'list'. I'm so lost; all I want to do is take the values from one dictionary, use them as keys into the other dictionary (whose keys are the values from the other table), and get the corresponding values.
You are creating a dictionary of lists in the beginning:
one = defaultdict(list)
for k, v in zip(a, b):
    one[k].append(v)
[output] : defaultdict(list, {1: [0, 3], 2: [0], 3: [5], 4: [6], 5: [0]})
two = defaultdict(list)
for k, v in zip(c, d):
    two[k].append(v)
[output] : defaultdict(list, {2: [20], 3: [30], 4: [40], 5: [50, 70], 6: [60]})
Therefore, when iterating with k, v in one.items(), you get a key and a list, and a list cannot be used as a dictionary key.
Simply iterate through that list and you should be good to go:
three = defaultdict(list)
for k, v in one.items():
    for value in v:
        if value in two.keys():
            newvalue = two[value]
            three[k].append(newvalue)
However, I'm getting this output:
defaultdict(list, {1: [[30]], 3: [[50, 70]], 4: [[60]]})
This sounds reasonable to me, but it is not the one you expected. Can you please explain?
Let's now try with a dict comprehension:
output = { k : two[v_2] for k,v in one.items() for v_2 in v}
[output] : {1: [30], 2: [], 3: [50, 70], 4: [60], 5: []}
Request to sum:
Of course, there are multiple ways of doing it; the quickest is again a dict comprehension with sum:
output_sum = {k: sum(v) for k,v in output.items()}
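Note that indexing a defaultdict with a missing key (as two[v_2] does) silently inserts that key, and that the comprehension keeps only the last lookup per key. If that is not wanted, a membership test avoids both. The sketch below uses a hypothetical helper lookup_values and assumes that keys with no match should be dropped and matched values flattened into one list:

```python
from collections import defaultdict

a = [1, 1, 2, 3, 4, 5]
b = [0, 3, 0, 5, 6, 0]
c = [2, 3, 4, 5, 6, 5]
d = [20, 30, 40, 50, 60, 70]

one = defaultdict(list)
for k, v in zip(a, b):
    one[k].append(v)
two = defaultdict(list)
for k, v in zip(c, d):
    two[k].append(v)

def lookup_values(one, two):
    result = {}
    for k, values in one.items():
        # "v in two" does not insert missing keys, unlike two[v] on a defaultdict.
        found = [x for v in values if v in two for x in two[v]]
        if found:  # drop keys that matched nothing
            result[k] = found
    return result

print(lookup_values(one, two))  # {1: [30], 3: [50, 70], 4: [60]}
```

Key 3 still maps to two values because 5 occurs twice in c; how to reduce that to a single number (first match, sum, and so on) depends on what the data means.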
