Dictionary within a list - how to find values - python-3.x

I have a dictionary of around 4000 Latin words and their English meanings. I've opened it within Python and added it to a list and it looks something like this:
[{"rex":"king"},{"ego":"I"},{"a, ab":"away from"}...]
I want the user to be able to input the Latin word they're looking for and then the program prints out the English meaning. Any ideas as to how I do this?

You shouldn't keep them in a list. Merge all the items into one main dictionary, then look up each meaning directly by its key.
If you get the small dictionaries from an iterable, you can build main_dict like this:
main_dict = {}
for dictionary in iterable_of_dict:
    main_dict.update(dictionary)  # merge each one-entry dict into main_dict
word = None
while word != "exit":
    word = input("Please enter your word (exit for exit): ")
    print(main_dict.get(word, "Sorry your word doesn't exist in dictionary!"))

Well you could just cycle through your 4000 items...
listr = [{"rex":"king"},{"ego":"I"},{"a, ab":"away from"}]
out = "no match found"
get = input("input please ")
for c in range(0,len(listr)):
    try:
        out = listr[c][get]
        break
    except KeyError:  # this entry does not hold the word, try the next one
        pass
print(out)
Maybe you should split the existing list into several alphabetically ordered lists to shorten the search.
Also, if the input does not exactly match the Latin key in the dictionary (for example "ab" against the key "a, ab"), nothing will be found.
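A minimal sketch of one way around the exact-match limitation, assuming that a key such as "a, ab" lists alternative forms separated by commas and that lookups should ignore case and surrounding spaces. The names build_lookup and translate are made up for this illustration.

```python
def build_lookup(entries):
    """Merge one-entry dicts into a single dict with one key per Latin form."""
    lookup = {}
    for entry in entries:
        for headwords, meaning in entry.items():
            # Assumption: a key such as "a, ab" holds several comma-separated forms.
            for form in headwords.split(","):
                lookup[form.strip().lower()] = meaning
    return lookup


def translate(lookup, word):
    """Look up a word, ignoring case and surrounding whitespace."""
    return lookup.get(word.strip().lower(), "no match found")


entries = [{"rex": "king"}, {"ego": "I"}, {"a, ab": "away from"}]
lookup = build_lookup(entries)
print(translate(lookup, "Ab"))
```

Building the merged dictionary once keeps each lookup a single hash access instead of a scan over all 4000 entries.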

Related

How to search if every word in string starts with any of the word in list using python

I am trying to filter sentences from my pandas DataFrame, which has 50 million records, using a keyword search. I want to keep a sentence if any word in it starts with any of these keywords.
WordsToCheck=['hi','she', 'can']
text_string1="my name is handhit and cannary"
text_string2="she can play!"
If I do something like this:
if any(key in text_string1 for key in WordsToCheck):
    print(text_string1)
I get a false positive because handhit contains hi in the latter part of the word.
How can I avoid all such false positives in my result set?
Secondly, is there a faster way to do this in Python? I am currently using the apply function.
I am following this link so that my question is not a duplicate: How to check if a string contains an element from a list in Python
If the case is important you can do something like this:
def any_word_starts_with_one_of(sentence, keywords):
    for kw in keywords:
        match_words = [word for word in sentence.split(" ") if word.startswith(kw)]
        if match_words:
            return kw
    return None
keywords = ["hi", "she", "can"]
sentences = ["Hi, this is the first sentence", "This is the second"]
for sentence in sentences:
    if any_word_starts_with_one_of(sentence, keywords):
        print(sentence)
If case is not important replace line 3 with something like this:
match_words = [word for word in sentence.split(" ") if word.lower().startswith(kw.lower())]
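As an alternative sketch (not the approach from the answer above), a single compiled regular expression with a word boundary can do the same prefix test in one pass per sentence. The helper name build_prefix_pattern and the sample sentences are made up for illustration, and matching is assumed to be case-insensitive.

```python
import re


def build_prefix_pattern(keywords):
    """Compile a pattern that matches any keyword at the start of a word."""
    # re.escape guards against keywords containing regex metacharacters.
    alternation = "|".join(re.escape(kw) for kw in keywords)
    # \b anchors the match at a word boundary, so "hi" matches "hit" but not "handhit".
    return re.compile(r"\b(?:" + alternation + ")", flags=re.IGNORECASE)


pattern = build_prefix_pattern(["hi", "she", "can"])
sentences = ["my name is handhit and cannary", "she can play!", "nothing here"]
matches = [s for s in sentences if pattern.search(s)]
print(matches)
```

The first sentence still matches, but only because cannary starts with can; handhit alone no longer triggers a hit. The same pattern string could probably be passed to pandas' Series.str.contains instead of using apply, though whether that is faster on 50 million rows would need to be measured.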

How to print a specific string containing a particular word - Python

I want to print out the entire string if it contains a particular word. For example:
a = ['www.facbook.com/xyz','www.google.com/xyz','www.amazon.com/xyz','www.instagram.com/xyz']
If I am looking for the word amazon, then the code should print www.amazon.com/xyz
I have found many examples that show how to find out whether a string contains a word, but I need to print out the entire string that contains the word.
Try this -
your_list = ['www.facebook.com/xyz', 'www.google.com/xyz', 'www.amazon.com/xyz', 'www.instagram.com/xyz']
word = 'amazon'
res = [x for x in your_list if word in x]
print(*res)
Output:
www.amazon.com/xyz
This works fine if only one or two strings contain the word; if multiple strings in the list contain it, they are printed on a single horizontal line.
Each match needs to be printed on its own line, but I do not know how to incorporate this into the code. It would be interesting to see how that looks.
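One way to get each match on its own line, as a small sketch: print accepts a sep argument, so the unpacked results can be separated by newlines instead of spaces. The helper name find_matches is made up for this example, and the search word xyz is chosen only because it matches every entry.

```python
def find_matches(strings, word):
    """Return every string that contains the given word."""
    return [s for s in strings if word in s]


your_list = ['www.facebook.com/xyz', 'www.google.com/xyz',
             'www.amazon.com/xyz', 'www.instagram.com/xyz']
res = find_matches(your_list, 'xyz')
# sep="\n" puts each matching string on a separate line.
print(*res, sep="\n")
```

Writing print("\n".join(res)) would give the same result.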

Extracting acronyms from each string in list index

I have a list of strings (the other posts only had single words or ints) imported from a file, and I am having trouble using nested loops to split each entry into its own list of words and then take the first letter of each word to create acronyms.
I have tried picking apart each entry and processing it in another loop to get the first letter of each word, but the closest I got was pulling the first letter of each entry in the original list.
text = (infile.read()).splitlines()
acronym = []
separator = "."
for i in range(len(text)):
    substring = [text[i]]
    for j in range(len(substring)):
        substring2 = [substring[j][:1]]
        acronym.append(substring2)
print("The Acronym is: ", separator.join(acronym))
Happy Path: The list of multi-word strings will be translated into acronyms, listed with line breaks.
Example of what should output at the end: D.O.D. \n N.S.A. \n etc.
What's happened so far: Earlier I got it to take the first letter of the first word of every entry, but I haven't figured out how to nest these loops to get to the individual words of each entry.
Useful knowledge: THE BEGINNING FORMAT AFTER SPLITLINES (since people couldn't read this) is a list whose entries look like this: ['Department of Defense', 'National Security Agency', ...]
What you have is kind of a mess. If you are going to be re-using code, it is often better to just make it into a function. Try this out.
def get_acronym(the_string):
    words = the_string.split()  # split() with no argument also copes with repeated spaces
    return_string = ""
    for word in words:
        return_string += word[0]
    return return_string
text = ['Department of Defense', 'National Security Agency']
for agency in text:
    print("The acronym is: " + get_acronym(agency))
I figured out how to do it from a file. File format was like this:
['This is Foo', 'Coming from Bar', 'Bring Your Own Device', 'Department of Defense']
So if this also helps anyone, enjoy~
with open(iname, 'r') as infile:  # iname holds the path of the input file
    text = infile.read().splitlines()
print("The Strings To Become Acronyms Are As Followed: \n", text, "\n")
acronyms = []
for string in text:
    words = string.split()
    letters = [word[0] for word in words]
    acronyms.append(".".join(letters).upper())
print("The Acronyms For These Strings Are: \n",acronyms)
This code outputs like this:
The Strings To Become Acronyms Are As Followed:
['This is Foo', 'Coming from Bar', 'Bring Your Own Device', 'Department of Defense']
The Acronyms For These Strings Are:
['T.I.F', 'C.F.B', 'B.Y.O.D', 'D.O.D']
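For the format the question asked for (D.O.D. with a trailing dot, one acronym per line), a minimal sketch could look like this. The function name dotted_acronym is hypothetical, and the input list stands in for the lines read from the file.

```python
def dotted_acronym(phrase):
    """Turn 'Department of Defense' into 'D.O.D.' (a dot after every initial)."""
    return "".join(word[0].upper() + "." for word in phrase.split())


phrases = ["Department of Defense", "National Security Agency"]
# Join with newlines so each acronym is printed on its own line.
print("\n".join(dotted_acronym(p) for p in phrases))
```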

How to remove/delete characters from end of string that match another end of string

I have thousands of strings (not in English) that are in this format:
['MyWordMyWordSuffix', 'SameVocabularyItemMyWordSuffix']
I want to return the following:
['MyWordMyWordSuffix', 'SameVocabularyItem']
Because strings are immutable and I want to start matching from the end, I keep confusing myself about how to approach it.
My best guess is some kind of loop that starts from the end of the strings and keeps checking for a match.
However, since I have so many of these to process, it seems like there should be a built-in way that is faster than looping through all the characters, but as I'm still learning Python I don't know of one (yet).
The nearest example I could find already on SO can be found here but it isn't really what I'm looking for.
Thank you for helping me!
You can use commonprefix from os.path to find the common suffix between them:
from os.path import commonprefix
def getCommonSuffix(words):
    # get common suffix by reversing both words and finding the common prefix
    prefix = commonprefix([word[::-1] for word in words])
    return prefix[::-1]
which you can then use to slice out the suffix from the second string of the list:
word_list = ['MyWordMyWordSuffix', 'SameVocabularyItemMyWordSuffix']
suffix = getCommonSuffix(word_list)
if suffix:
    print("Found common suffix:", suffix)
    # filter out suffix from second word in the list
    word_list[1] = word_list[1][0:-len(suffix)]
    print("Filtered word list:", word_list)
else:
    print("No common suffix found")
Output:
Found common suffix: MyWordSuffix
Filtered word list: ['MyWordMyWordSuffix', 'SameVocabularyItem']
Demo: https://repl.it/#glhr/55705902-common-suffix

Expected str instance, int found. How do I change an int to str to make this code work?

I'm trying to write code that analyses a sentence that contains multiple words and no punctuation. I need it to identify the individual words in the sentence that is entered and store them in a list. My example sentence is 'ask not what your country can do for you ask what you can do for your country'. I then need the original position of each word to be written to a text file. This is my current code, with parts taken from other questions I've found, but I just can't get it to work:
myFile = open("cat2numbers.txt", "wt")
list = [] # An empty list
sentence = "" # Sentence is equal to the sentence that will be entered
print("Writing to the file: ", myFile) # Telling the user what file they will be writing to
sentence = input("Please enter a sentence without punctuation ") # Asking the user to enter a sentence
sentence = sentence.lower() # Turns everything entered into lower case
words = sentence.split() # Splitting the sentence into single words
positions = [words.index(word) + 1 for word in words]
for i in range(1,9):
    s = repr(i)
    print("The positions are being written to the file")
    d = ', '.join(positions)
    myFile.write(positions) # write the places to myFile
    myFile.write("\n")
    myFile.close() # closes myFile
print("The positions are now in the file")
The error I've been getting is TypeError: sequence item 0: expected str instance, int found. Could someone please help me? It would be much appreciated.
The error stems from .join: it expects every item to be a string, but you're passing it a list of ints.
So the simple fix would be using:
d = ", ".join(map(str, positions))
which maps the str function on all the elements of the positions list and turns them to strings before joining.
That won't solve all your problems, though. You have used a for loop for some reason, in which you .close the file after writing. In subsequent iterations you'll get an error for attempting to write to a file that has been closed.
There are other things too: list = [] is unnecessary, and the name list should be avoided because it shadows the built-in; the initialization of sentence is also unnecessary. Additionally, if you want to ask for 8 sentences (the for loop), put your loop before doing your work.
All in all, try something like this:
with open("cat2numbers.txt", "wt") as f:
    print("Writing to the file: ", f.name) # Telling the user what file they will be writing to
    for i in range(8):
        sentence = input("Please enter a sentence without punctuation ").lower() # Asking the user to enter a sentence
        words = sentence.split() # Splitting the sentence into single words
        positions = [words.index(word) + 1 for word in words]
        f.write(", ".join(map(str, positions))) # write the places to the file
        f.write("\n")
print("The positions are now in the file")
This uses the with statement, which handles closing the file for you behind the scenes.
As I see it, in the for loop you write to the file, then close it, and then try to write to the closed file again. Couldn't this be the problem?
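As a side sketch, the positions can also be computed without calling words.index for every word, by remembering where each word was first seen. This gives the same numbers as the list comprehension above; the function name first_positions is made up for the example, and the sentence is hard-coded instead of read with input.

```python
def first_positions(sentence):
    """Return, for each word, the 1-based position of its first occurrence."""
    first_seen = {}
    positions = []
    for index, word in enumerate(sentence.lower().split(), start=1):
        # setdefault stores the position only the first time a word appears.
        first_seen.setdefault(word, index)
        positions.append(first_seen[word])
    return positions


sentence = "ask not what your country can do for you ask what you can do for your country"
# The ints must be converted to str before joining, as explained above.
line = ", ".join(map(str, first_positions(sentence)))
print(line)
```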

Resources