Searching for a wildcard pattern in a text file in Python - python-3.x

I am trying to search for words in Python that match a wildcard pattern. For example, in a text file laid out like a dictionary, I could search for r?v?r? and the correct output would be words such as 'rover', 'raver' and 'river'.
This is the code I have so far, but it only works when I type in the full word and not the wildcard form.
name = input("Enter the name of the words file:\n")
pattern = input("Enter a search pattern:\n")
textfile = open(name, 'r')
filetext = textfile.read()
textfile.close()
match = re.findall(pattern, filetext)
if match is True:
    print(match)
else:
    print("Sorry, matches for ", pattern, "could not be found")

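Use dots for blanks. In a regular expression a dot matches any single character, whereas ? makes the preceding character optional, so type r.v.r rather than r?v?r.
import re

name = input("Enter the name of the words file:\n")
pattern = input("Enter a search pattern:\n")
textfile = open(name, 'r')
filetext = textfile.read()
textfile.close()
# e.g. the pattern 'r.v.r' finds 'rover', 'raver' and 'river'
match = re.findall(pattern, filetext)
if match:
    print(match)
else:
    print("Sorry, matches for", pattern, "could not be found")
Also, re.findall returns a list and never the value True, so "match is True" is always false. Test the list itself instead:
if match: or if len(match) > 0:
whichever one suits your code.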
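If you would rather keep typing ? as the wildcard, the pattern can be translated into a regular expression before searching. The following is a minimal sketch, not part of the original answer: the helper names wildcard_to_regex and find_words are made up for illustration, and it assumes that ? should stand for exactly one word character and that only whole words should match.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a '?' wildcard pattern into a whole-word regular expression."""
    # Each '?' becomes \w (one word character); every other character is
    # escaped so that it is matched literally.
    parts = [r"\w" if ch == "?" else re.escape(ch) for ch in pattern]
    # \b on both sides restricts matches to complete words.
    return re.compile(r"\b" + "".join(parts) + r"\b")

def find_words(pattern, text):
    """Return every whole word in `text` that fits the wildcard pattern."""
    return wildcard_to_regex(pattern).findall(text)

sample = "rover raver river driver revert"
print(find_words("r?v?r", sample))  # ['rover', 'raver', 'river']
```

The word boundaries are what keep 'driver' and 'revert' out of the result; without them, the five-letter pattern would also match inside longer words.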

Related

Python user input matches both the list and file

How can I write Python code that checks whether the user's input matches both a list of words and the contents of a file?
For example:
list = ["banana", "apple"]
file = open("file_path", "r")
search_word = input("Search the word you want to search: ")
for search_word in file.read()
and search_word in list
    print("Search word is in the list and in the file")
else:
    print("Search word is not matches")
If the search word is not in the list, there is no need to search the file, since the first condition has already failed. Test that condition first and only read the file when it holds.
A simple substring test tells you whether the word appears in the file contents:
if search_word in data:
    print("match")
However, words that merely contain the search word will also match (e.g. pineapple would match apple).
You can instead use a regular expression to check whether the word occurs anywhere in the file as a whole word. The \b metacharacter matches at the beginning or end of a word, so apple won't match inside pineapple. The (?i) flag makes the search case-insensitive, so apple matches Apple, etc.
You can try something like this.
import re

words = ["banana", "apple"]  # renamed from "list" to avoid shadowing the built-in
search_word = input("Search the word you want to search: ")
if search_word not in words:
    # if not in list then don't need to search the file
    found = False
else:
    with open("file_path", "r") as file:
        data = file.read()
    # re.escape keeps any regex metacharacters in the word literal
    found = re.search(fr'(?i)\b{re.escape(search_word)}\b', data)
# now print results if there was a match or not
if found:
    print("Search word is in the list and in the file")
else:
    print("Search word does not match")
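To see the difference between the plain substring test and the \b version without needing a file, here is a small self-contained sketch. The function name contains_word and the sample strings are made up for illustration.

```python
import re

def contains_word(word, text):
    """Return True if `word` occurs in `text` as a whole word, ignoring case."""
    # re.escape guards against regex metacharacters in the search word.
    pattern = rf"\b{re.escape(word)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

print("apple" in "pineapple juice")                 # True: substring match
print(contains_word("apple", "pineapple juice"))    # False: not a whole word
print(contains_word("apple", "An Apple a day"))     # True: case-insensitive
```

Passing flags=re.IGNORECASE has the same effect as the inline (?i) flag used above.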

How to replace multiple strings in a file that are both lowercase as well as capitalized?

I am trying to read a file and replace all occurrences of a string in it. Most occurrences are lowercase, but one is capitalized. How do I process the file so that every occurrence is removed regardless of capitalization?
For reference this is the text I would like to edit and the word I would like to replace is "morning/Morning".
Text below:
"Good morning! / I was going to say you good morning / Good Afternoon Morning is when the sun comes up / I will call you in the morning"
See code below:
filename = input("Enter the filename: ")
stringToRemove = input("Enter the string to be removed: ")
infile = open(filename, 'r')
fileString = infile.read()
fileString = fileString.replace(stringToRemove, '')
infile.close()
outfile = open(filename, 'w')
outfile.write(fileString)
outfile.close()
print("Done")
You could use re.sub in case insensitive mode:
import re

filename = input("Enter the filename: ")
stringToRemove = input("Enter the string to be removed: ")
with open(filename, 'r') as infile:
    fileString = infile.read()
# re.escape treats the input as literal text; re.IGNORECASE matches "morning" and "Morning"
fileString = re.sub(r'\s*' + re.escape(stringToRemove) + r'\s*', ' ', fileString, flags=re.IGNORECASE).strip()
The output from your sample string here would be:
Good ! / I was going to say you good / Good Afternoon is when the sun comes up / I will call you in the
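The same substitution can be tried on the sample text directly, without a file. This is only a sketch; the function name remove_string is made up for illustration.

```python
import re

def remove_string(text, target):
    """Remove every occurrence of `target` from `text`, ignoring case."""
    # re.escape makes the user-supplied string literal; the whitespace around
    # each occurrence is collapsed into a single space.
    pattern = r"\s*" + re.escape(target) + r"\s*"
    return re.sub(pattern, " ", text, flags=re.IGNORECASE).strip()

sample = ("Good morning! / I was going to say you good morning / "
          "Good Afternoon Morning is when the sun comes up / "
          "I will call you in the morning")
print(remove_string(sample, "morning"))
```

Note that this removes the string wherever it appears, including inside longer words; adding \b on both sides of the pattern would restrict it to whole words.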

Question about editing a very long string with Python

I want to examine the hex content of a file.
with open("C:/python_tria/HEX/sample/test.zip", "rb+") as f:
    stri = str(f.read())
sta = stri.find('this is where to start')
end = stri.find('this is where to end')
My plan is to extract the part between 'sta' and 'end'.
What approach could I take?
You could try using re.findall on the file text to find what you are looking for:
import re

with open("C:/python_tria/HEX/sample/test.zip", "rb") as f:  # "rb" is enough for reading
    stri = str(f.read())
matches = re.findall(r'this is where to start.*?this is where to end', stri, flags=re.DOTALL)
if matches:
    print(matches[0])  # print the first match
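An alternative that stays close to the original find-based plan is to keep the data as bytes rather than converting it with str(), which produces the textual repr of the bytes object. The following is a sketch only: the function name extract_between and the sample data are made up, and the markers are the placeholder strings from the question.

```python
def extract_between(data, start_marker, end_marker):
    """Return the bytes between the two markers, or None if either is missing."""
    start = data.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)           # skip past the start marker itself
    end = data.find(end_marker, start)   # look for the end marker after it
    if end == -1:
        return None
    return data[start:end]

sample = b"\x00\x01this is where to start\xde\xad\xbe\xefthis is where to end\x02"
payload = extract_between(sample, b"this is where to start", b"this is where to end")
print(payload)        # b'\xde\xad\xbe\xef'
print(payload.hex())  # deadbeef
```

For a real file, data would come from open(path, "rb").read().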

Python: If a string is found, stop searching for that string, search for the next string, and output the matching strings

This code outputs the matching string once for every time it is in the file that is being searched (so I end up with a huge list if the string is there repeatedly). I only want to know if the strings from my list match, not how many times they match. I do want to know which strings match, so a True/False solution does not work. But I only want them listed once, each, if they match. I do not really understand what the pattern = '|'.join(keywords) part is doing - I got that from someone else's code to get my file to file matching working, but don't know if I need it. Your help would be much appreciated.
# declares the files used
filenames = ['//Katie/Users/kitka/Documents/appreport.txt', '//Dallin/Users/dallin/Documents/appreport.txt',
             '//Aidan/Users/aidan/Documents/appreport.txt']
# parses each file
for filename in filenames:
    # imports the necessary libraries
    import os, time, re, smtplib
    from stat import * # ST_SIZE etc
    # finds the time the file was last modified and error checks
    try:
        st = os.stat(filename)
    except IOError:
        print("failed to get information about", filename)
    else:
        # creates a list of words to search for
        keywords = ['LoL', 'javaw']
        pattern = '|'.join(keywords)
        # searches the file for the strings in the list, sorts them and returns results
        results = []
        with open(filename, 'r') as f:
            for line in f:
                matches = re.findall(pattern, line)
                if matches:
                    results.append((line, len(matches)))
        results = sorted(results)
        # appends results to the archive file
        with open("GameReport.txt", "a") as f:
            for line in results:
                f.write(filename + '\n')
                f.write(time.asctime(time.localtime(st[ST_MTIME])) + '\n')
                f.write(str(line) + '\n')
Untested, but this should work. Note that this only keeps track of which words were found, not which words were found in which files. I couldn't figure out whether or not that's what you wanted.
import fileinput

filenames = [...]
keywords = ['LoL', 'javaw']
# a set is like a list but with no duplicates, so even if a keyword
# is found multiple times, it will only appear once in the set
found = set()
# iterate over the lines of all the files
for line in fileinput.input(files=filenames):
    for keyword in keywords:
        if keyword in line:
            found.add(keyword)
print(found)
EDIT
If you want to keep track of which keywords are present in which files, then I'd suggest keeping a set of (filename, keyword) tuples:
filenames = [...]
keywords = ['LoL', 'javaw']
found = set()
for filename in filenames:
    with open(filename, 'rt') as f:
        for line in f:
            for keyword in keywords:
                if keyword in line:
                    found.add((filename, keyword))
for filename, keyword in found:
    print('Found the word "{}" in the file "{}"'.format(keyword, filename))
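As for the '|'.join(keywords) line asked about in the question: it builds the string 'LoL|javaw', which as a regular expression means "LoL or javaw", so one search covers every keyword. The regex approach can be combined with a set to list each keyword once. This is a sketch under assumptions: the function name find_keywords and the sample lines are made up for illustration.

```python
import re

def find_keywords(lines, keywords):
    """Return the set of keywords that occur at least once in `lines`."""
    # '|'.join(...) builds an alternation such as 'LoL|javaw'; re.escape keeps
    # any special characters in a keyword literal.
    pattern = re.compile("|".join(re.escape(k) for k in keywords))
    found = set()
    for line in lines:
        found.update(pattern.findall(line))
        if len(found) == len(keywords):
            break  # every keyword has been seen, no need to read further
    return found

sample_lines = ["LoL.exe started", "javaw running", "LoL.exe stopped", "notepad"]
print(sorted(find_keywords(sample_lines, ["LoL", "javaw"])))  # ['LoL', 'javaw']
```

Because a file object is itself an iterable of lines, an open file can be passed as the lines argument.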

Creating an autocorrect and word suggestion program in python

def autocorrect(word):
    Break_Word = sorted(word)
    Sorted_Word = ''.join(Break_Word)
    return Sorted_Word
user_input = ""
while (user_input == ""):
    user_input = input("key in word you wish to enter: ")
user_word = autocorrect(user_input).replace(' ', '')
with open('big.txt') as myFile:
    for word in myFile:
        NewWord = str(word.replace(' ', ''))
        Break_Word2 = sorted(NewWord.lower())
        Sorted_Word2 = ''.join(Break_Word2)
        if (Sorted_Word2 == user_word):
            print("The word",user_input,"exist in the dictionary")
Basically, I have a dictionary of correctly spelled words in "big.txt". If the user input matches a word in the dictionary, I want to print a line.
I compare the two strings after sorting their characters.
However, the following lines never execute:
if (Sorted_Word2 == user_word):
    print("The word",user_input,"exist in the dictionary")
When I hard-code other strings, like
if ("a" == "a"):
    print("The word",user_input,"exist in the dictionary")
it works. What is wrong with my code? How can I compare two strings when one of them comes from the file?
What do you mean by "I am not able to execute the line"? Does it throw an exception? If so, please post it, because I can run a version of your program and the results are as expected.
def autocorrect(word):
    Break_Word = sorted(word)
    Sorted_Word = ''.join(Break_Word)
    return Sorted_Word
user_input = ""
#while (user_input == ""):
# raw_input is Python 2; use input() in Python 3
user_input = raw_input("key in word you wish to enter: ").lower()
user_word = autocorrect(user_input).replace(' ', '')
print ("user word '{0}'".format(user_word))
for word in ["mike", "matt", "bob", "philanderer"]:
    NewWord = str(word.replace(' ', ''))
    Break_Word2 = sorted(NewWord.lower())
    Sorted_Word2 = ''.join(Break_Word2)
    if (Sorted_Word2 == user_word):
        print("The word",user_input,"exist in the dictionary")
key in word you wish to enter: druge
user word 'degru'
The word druge doesn't exist in the dictionary
key in word you wish to enter: Mike
user word 'eikm'
('The word','mike', 'exist in the dictionary')
Moreover, I don't know what all this "autocorrect" stuff is meant to do. All you appear to need is to search a list of words for an instance of your search word. Sorting the characters inside the search word achieves nothing.
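One difference between the two versions may explain the behaviour, although the thread does not confirm it: iterating over a file yields each line with its trailing newline, while the hard-coded list above has none. With the newline included, the sorted string from the file starts with "\n" and never equals the sorted user input. The sketch below strips each line before comparing. The helper names are made up, and io.StringIO stands in for big.txt, which is assumed to hold one word per line.

```python
import io

def sorted_letters(word):
    """Return the letters of `word` sorted, lowercased, with whitespace removed."""
    # .strip() drops the trailing "\n" that each line read from a file carries.
    return "".join(sorted(word.strip().replace(" ", "").lower()))

def word_exists(user_input, lines):
    """Return True if any line has the same letters as `user_input`."""
    target = sorted_letters(user_input)
    return any(sorted_letters(line) == target for line in lines)

fake_file = io.StringIO("mike\nmatt\nbob\n")  # stand-in for open('big.txt')
print(word_exists("Mike", fake_file))         # True

# Without stripping, the newline takes part in the comparison:
print("".join(sorted("mike\n")) == "".join(sorted("mike")))  # False
```

Because the letters are sorted before comparing, this also reports anagrams of dictionary words as existing; comparing the stripped, lowercased words directly avoids that.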
