How do I create a class (OOP) that compares two words to be the same, even if they're anagrams - python-3.x

I'm trying to write a class that compares two words and returns True if they're the same and False if not. Words that are anagrams or upper/lower-case versions of each other should still compare as equal (W1 = sTop, W2 = Pots, W1 == W2 result: True). I am new to coding, so I am struggling with the attribute part of this code. How do I get it to treat the words as the same under these conditions?
I have gone through this site as well as others to find the structure and overall idea behind OOP and have pieced together what I think is correct; however, I know it's not complete and will raise errors when I run it. I've tried calling the methods under str using my class's grade scope and it failed, which I expected. Any help in explaining/writing this problem would be great. Please excuse my novice ability in coding.
class Word:
    def _init_(self, word):
        self.word = word
    def _str_(self):
        w1 == w2
        return self.lower(word)
Expected outcomes:
Examples
word1 = Word("post")
word2 = Word("stop")
word1 == word2
Result: True
word1 = Word("")
word2 = Word("")
word1 == word2
Result: True
word1 = Word("aBlE")
str(word1)
Result: able
word1 = Word("able")
word2 = Word("baker")
word1 == word2
Result: False
word1 = Word("Hi there! :-)")
word2 = Word("Hit here! :-)")
word1 == word2
Result: True

Anagrams are words containing exactly the same letters, in the same numbers. You can write a function that takes two words, sorts the letters of each, and compares the sorted lists.
def are_anagrams(word1, word2):
    return sorted(word1.lower()) == sorted(word2.lower())
are_anagrams('abBa', 'BAba'), are_anagrams('abby', 'baba') # True, False
If you wanted to use a class, you could override the __eq__ dunder method that governs the behavior of operator ==:
Maybe something like this:
class AnagramWords:
    def __init__(self, word):
        self.word = word
        self.cmp = sorted(self.word.lower())
    def __eq__(self, other):
        """returns True if self is an anagram of other
        """
        if isinstance(other, str):
            other = AnagramWords(other)
        if isinstance(other, type(self)):
            return self.cmp == other.cmp
        raise NotImplementedError(f'AnagramWord cannot compare to {type(other)}')
def are_anagrams(word1, word2):
    return sorted(list(word1.lower())) == sorted(list(word2.lower()))
are_anagrams('abBa', 'BAba'), are_anagrams('abby', 'baba') # True, False
w1 = AnagramWords('AbBA')
w2 = AnagramWords('BBaa')
w3 = AnagramWords('bABy')
print(w1 == w2, w2 == w3) # True, False
print(w3 == 123) # NotImplementedError: AnagramWord cannot compare to <class 'int'>
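To cover every expected outcome in the question, including str(word1) returning the lowercase word, a minimal sketch might look like the following. The attribute name _key, the __hash__ method, and the choice to return NotImplemented instead of raising are assumptions made for illustration; none of them comes from the question.

```python
class Word:
    """A word that compares equal to any case-insensitive anagram of itself."""

    def __init__(self, word):
        self.word = word
        # Assumed helper attribute: the sorted lowercase characters
        # serve as a canonical key for comparison.
        self._key = sorted(word.lower())

    def __eq__(self, other):
        if not isinstance(other, Word):
            # Let Python fall back to its default handling for other types.
            return NotImplemented
        return self._key == other._key

    def __hash__(self):
        # Defining __eq__ removes the inherited hash, so supply a matching one.
        return hash(tuple(self._key))

    def __str__(self):
        return self.word.lower()


print(Word("post") == Word("stop"))                    # True
print(Word("") == Word(""))                            # True
print(str(Word("aBlE")))                               # able
print(Word("able") == Word("baker"))                   # False
print(Word("Hi there! :-)") == Word("Hit here! :-)"))  # True
```

Because spaces and punctuation are sorted along with the letters, the last example compares equal without any special handling.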

First of all,
The dunder (double-underscore: '__') methods are special methods that let you or Python hook into your code; for example, implementing __len__ on a class allows you to run len(obj) on an instance instead of obj.__len__(). Note that they need two underscores on each side (__init__ and __str__, not _init_ and _str_). The == operator is governed by __eq__, so defining __str__ alone won't give you the comparison you want.
Second, in your __str__ method you are trying to compare w1 == w2, which are two variables that you have neither defined nor accepted as arguments to your function.
My answer is that you don't always need to use OOP; for example, your case can be defined as a simple function as follows
edit
As I see I accidentally mixed up anagram with palindrome, I am adding the anagram version as well.
def is_anagram(w1, w2):
    return sorted(w1.lower()) == sorted(w2.lower())
# and I am keeping just for reference sake the palindrome one.
def is_palindrome(w1, w2):
    return w1.lower() == w2.lower()[::-1]
In the anagram function, I first lowercase the words so that the characters compare equal regardless of case. Then I use the sorted function, which takes a sequence (a str is a sequence) and returns its items as a sorted list. Finally, we compare the two lists to see whether the words are indeed anagrams of each other.
In the palindrome function, I accept two strings and compare the lowercased version of w1 to the reversed lowercased version of w2. I reverse the string with a slice that uses the default start (implicit, as it is blank before the first colon), the default stop, and a step of negative 1, which in effect reverses the string.
In any case, I wish you great luck on your programming journey!

Related

function that returns `True` if a string contains exactly two instances of a substring and `False` if it doesn't

I'm trying to write a function that returns True if a string contains exactly two instances of a substring and False if it doesn't.
I'm getting an error:
'return' outside function
I feel I'm very close but just can't quite get it, I'd appreciate being pointed in the right direction.
Should recursion be used?
s = ("xeroxes")
exes = str(x)
count = 0
def has_two_X(s):
    count = s.count(exes)
    for exes in s:
        count = +1
if count ==2:
    return True
else:
    return False
if __name__ == "__main__":
    print("string has :",count.s(exes))
If the code must return True when any letter occurs two or more times, you can store the letters seen so far in a dictionary and return True as soon as one of them shows up again. The loop has to live inside a function for the return statements to be valid (the name has_repeated_letter is just a placeholder):
def has_repeated_letter(s):
    mydict = {}
    for letter in s:
        if letter in mydict:
            return True
        else:
            mydict[letter] = 0
    return False # No letter repeats, so no substring can occur twice
To check whether a substring occurs exactly twice, I would recommend the same approach, but keep a count for each substring in the dictionary. Generating all the substrings takes O(n^2) time, and the hashmap needs roughly O(n^2) entries.
After constructing the hashmap, you can iterate over it and return True if some substring has exactly two occurrences.
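When the substring to look for is known in advance, as in the question, str.count may be enough on its own. The sketch below uses a hypothetical function name, has_exactly_two, and relies on the fact that str.count counts non-overlapping occurrences.

```python
def has_exactly_two(text, sub):
    """Return True if `sub` occurs exactly twice in `text`.

    str.count counts non-overlapping occurrences, so "aaa".count("aa") is 1.
    """
    return text.count(sub) == 2


if __name__ == "__main__":
    print(has_exactly_two("xeroxes", "x"))  # True
    print(has_exactly_two("xeroxes", "r"))  # False
```

No recursion is needed, and both the string and the substring are passed in as arguments instead of being read from module-level variables.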

Parsing an inconsistent field in Python [duplicate]

I am parsing a file and I want to check each line against a few complicated regexes. Something like this:
if re.match(regex1, line): do stuff
elif re.match(regex2, line): do other stuff
elif re.match(regex3, line): do still more stuff
...
Of course, to do the stuff, I need the match objects. I can only think of three possibilities, each of which leaves something to be desired.
if re.match(regex1, line):
    m = re.match(regex1, line)
    do stuff
elif re.match(regex2, line):
    m = re.match(regex2, line)
    do other stuff
...
which requires doing the complicated matching twice (these are long files and long regex :/)
m = re.match(regex1, line)
if m: do stuff
else:
    m = re.match(regex2, line)
    if m: do other stuff
    else:
        ...
which gets terrible as I indent further and further.
while True:
    m = re.match(regex1, line)
    if m:
        do stuff
        break
    m = re.match(regex2, line)
    if m:
        do other stuff
        break
    ...
which just looks weird.
What's the right way to do this?
You could define a function for the action required by each regex, pass it the match object, and do something like
def dostuff(m):
    stuff
def dootherstuff(m):
    otherstuff
def doevenmorestuff(m):
    evenmorestuff
actions = ((regex1, dostuff), (regex2, dootherstuff), (regex3, doevenmorestuff))
for regex, action in actions:
    m = re.match(regex, line)
    if m:
        action(m)
        break
for patt in (regex1, regex2, regex3):
    match = patt.match(line)
    if match:
        if patt == regex1:
            # some handling
        elif patt == regex2:
            # more
        elif patt == regex3:
            # more
        break
I like Tim's answer because it separates out the per-regex matching code to keep things simple. For my answer, I wouldn't put more than a line or two of code for each match, and if you need more, call a separate method.
In this particular case there appears to be no convenient way to do this in Python.
If Python accepted the syntax:
if (m = re.match(pattern,string)):
    text = m.group(1)
then all would be fine, but apparently you cannot do that.
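Since Python 3.8, assignment expressions (the := operator introduced by PEP 572) allow almost exactly this form. The sketch below is only an illustration: the two patterns and the function name classify are hypothetical and stand in for the question's regex1, regex2 and "do stuff".

```python
import re

# Hypothetical patterns, for illustration only.
NUMBER = re.compile(r"(\d+)")
WORD = re.compile(r"([a-z]+)")


def classify(line):
    # Each branch binds the match object and tests it in one expression.
    if (m := NUMBER.match(line)):
        return "number", m.group(1)
    elif (m := WORD.match(line)):
        return "word", m.group(1)
    return "unknown", None


print(classify("152 ravens"))  # ('number', '152')
print(classify("ravens 152"))  # ('word', 'ravens')
print(classify("!?"))          # ('unknown', None)
```

Each regex runs at most once per line, and the chain stays flat however many patterns are added.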
First off, do you really need to use regexps for your matching? Where I would use regexps in, e.g., perl, I'll often use string functions in python (find, startswith, etc).
If you really need to use regexps, you can make a simple search function that does the search, and if the match is returned, sets a store object to keep your match around before returning True.
e.g.,
def search(pattern, s, store):
    match = re.search(pattern, s)
    store.match = match
    return match is not None
class MatchStore(object):
    pass # irrelevant, any object with a 'match' attr would do
where = MatchStore()
if search(pattern1, s, where):
    pattern1 matched, matchobj in where.match
elif search(pattern2, s, where):
    pattern2 matched, matchobj in where.match
...
Your last suggestion is slightly more Pythonic when wrapped up in a function:
def parse_line():
    m = re.match(regex1, line)
    if m:
        do stuff
        return
    m = re.match(regex2, line)
    if m:
        do other stuff
        return
    ...
That said, you can get closer to what you want using a simple container class with some operator overloading:
class ValueCache():
    """A simple container with a returning assignment operator."""
    def __init__(self, value=None):
        self.value = value
    def __repr__(self):
        return "ValueCache({})".format(self.value)
    def set(self, value):
        self.value = value
        return value
    def __call__(self):
        return self.value
    def __lshift__(self, value):
        return self.set(value)
    def __rrshift__(self, value):
        return self.set(value)
match = ValueCache()
if (match << re.match(regex1, line)):
    do stuff with match()
elif (match << re.match(regex2, line)):
    do other stuff with match()
You can define a local function that accepts a regex, tests it against your input, and stores the result to a closure-scoped variable:
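def parse_line(line):
    match = None
    def matches(pattern):
        # nonlocal is only valid inside an enclosing function;
        # `line` is only read here, so it needs no declaration
        nonlocal match
        match = re.match(pattern, line)
        return match
    if matches(regex1):
        # do stuff with `match`
    elif matches(regex2):
        # do other stuff with `match`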
I'm not sure how Pythonic that approach is, but it's the cleanest way I've found to do regex matching in an if-elif-else chain and preserve the match objects.
Note that this approach will only work in Python 3.0+ as it requires the PEP 3104 nonlocal statement. In earlier Python versions there's no clean way for a function to assign to a variable in a non-global parent scope.
It's also worth noting that if you have a big enough file that you're worried about running a regex twice for each line you should also be pre-compiling them with re.compile and passing the resulting regex object to your check function instead of the raw string.
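As a rough sketch of that advice, the patterns can be compiled once at import time and paired with their handlers, so that each line costs at most one match attempt per pattern. The patterns, handler names and the process function below are hypothetical examples, not part of the question.

```python
import re


def handle_assignment(m):
    return "assign {} = {}".format(m.group(1), m.group(2))


def handle_comment(m):
    return "comment: " + m.group(1)


# Compiled once, outside the per-line loop (hypothetical patterns).
RULES = [
    (re.compile(r"(\w+)\s*=\s*(\w+)"), handle_assignment),
    (re.compile(r"#\s*(.*)"), handle_comment),
]


def process(line):
    """Run the first matching rule's handler, or return None."""
    for pattern, handler in RULES:
        m = pattern.match(line)
        if m:
            return handler(m)
    return None


print(process("x = 42"))    # assign x = 42
print(process("# a note"))  # comment: a note
print(process("???"))       # None
```

The order of RULES decides which pattern wins when more than one could match.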
I would break your regex up into smaller components and search for simple first with longer matches later.
something like:
if re.match(simplepart,line):
    if re.match(complexregex, line):
        do stuff
elif re.match(othersimple, line):
    if re.match(complexother, line):
        do other stuff
Why not use a dictionary as a switch statement?
def action1(match_object, line):
    do the stuff 1
def action2(match_object, line):
    do the stuff 2
regex_action_dict = {regex1 : action1, regex2 : action2}
for regex, action in regex_action_dict.items():
    match_object = re.match(regex, line)
    if match_object:
        action(match_object, line)
FWIW, I've stressed over the same thing, and I usually settle for the 2nd form (nested elses) or some variation. I don't think you'll find anything much better in general, if you're looking to optimize readability (many of these answers seem significantly less readable than your candidates to me).
Sometimes if you're in an outer loop or a short function, you can use a variation of your 3rd form (the one with break statements) where you either continue or return, and that's readable enough, but I definitely wouldn't create a while True block just to avoid the "ugliness" of the other candidates.
My solution, with an example; only one re.search() is performed per line:
text = '''\
koala + image # wolf - snow
Good evening, ladies and gentlemen
An uninteresting line
There were 152 ravens on a branch
sea mountain sun ocean ice hot desert river'''
import re
regx3 = re.compile(r'hot[ \t]+([^ ]+)')
regx2 = re.compile(r'(\d+|ev.+?ng)')
regx1 = re.compile(r'([%~#`\#+=\d]+)')
# Join the three patterns into one alternation; each contributes one group
regx = re.compile('|'.join((regx3.pattern,regx2.pattern,regx1.pattern)))
def one_func(line):
    print('I am one_func on : '+line)
def other_func(line):
    print('I am other_func on : '+line)
def another_func(line):
    print('I am another_func on : '+line)
tupl_funcs = (one_func, other_func, another_func)
for line in text.splitlines():
    print(line)
    m = regx.search(line)
    if m:
        print('m.groups() : ',m.groups())
        # The index of the first group that matched selects the function
        group_number = next(i for i,g in enumerate(m.groups()) if g)
        print("group_number : ",group_number)
        tupl_funcs[group_number](line)
    else:
        print('No match')
        print('No treatment')
    print()
result
koala + image # wolf - snow
m.groups() : (None, None, '+')
group_number : 2
I am another_func on : koala + image # wolf - snow
Good evening, ladies and gentlemen
m.groups() : (None, 'evening', None)
group_number : 1
I am other_func on : Good evening, ladies and gentlemen
An uninteresting line
No match
No treatment
There were 152 ravens on a branch
m.groups() : (None, '152', None)
group_number : 1
I am other_func on : There were 152 ravens on a branch
sea mountain sun ocean ice hot desert river
m.groups() : ('desert', None, None)
group_number : 0
I am one_func on : sea mountain sun ocean ice hot desert river
Make a class with the match as state. Instantiate it before conditional, this should store the string that you are matching against as well.
You can define a class wrapping the match object with a call method to perform the match:
class ReMatcher(object):
    match = None
    def __call__(self, pattern, string):
        self.match = re.match(pattern, string)
        return self.match
    def __getattr__(self, name):
        return getattr(self.match, name)
Then call it in your conditions and use it as if it was a match object in the resulting blocks:
match = ReMatcher()
if match(regex1, line):
    print(match.group(1))
elif match(regex2, line):
    print(match.group(1))
This should work in nearly any Python version, with slight adjustments in versions before new-style classes. As in my other answer, you should use re.compile if you're concerned about regex performance.

Finding a substring in a jumbled string

I am writing a function, includes(word1, word2), that takes two strings as arguments and checks whether word1 is included in word2. Word2 is a letter jumble. It should return a Boolean. Repeated letters are allowed; I am only checking whether the letters of word1 appear in word2 in the same order.
>>>includes('queen', 'qwertyuytresdftyuiokn')
True
'queen', 'QwertyUytrEsdftyuiokN'
I tried turning each word into lists so that it is easier to work with each element. My code is this:
def includes(w1, w2):
    w1 = list(w1)
    w2 = list(w2)
    result = False
    for i in w1:
        if i in w2:
            result = True
        else:
            result = False
    return result
But the problem is that I also need to check whether the letters of word1 come in the same order in word2, and my code doesn't check that. I couldn't find a way to implement that with lists, just as I couldn't get this far with strings, so I think I need to use another data structure such as a dictionary, but I don't know much about them.
I hope I understood what your goal is.
Python is not my thing, but I think I made it pythonic:
import itertools
def is_subsequence(pattern, items_to_use):
    items_to_use = iter(items_to_use)
    return all(any(x == y for y in items_to_use) for x, _ in itertools.groupby(pattern))
https://ideone.com/Saz984
Explanation:
itertools.groupby transforms pattern in such a way that consecutive duplicates are discarded
all items from the grouped pattern must fulfill the condition
any consumes the iterator items_to_use until it matches the current item. Note that items_to_use must be defined outside of the final expression so that its progress is kept every time the next item from pattern is verified.
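The same idea can be spelled out with explicit loops, which may be easier to follow for a beginner. This is a sketch, not a drop-in replacement: it assumes, based on the 'queen' example in the question, that runs of repeated letters in the first word should be collapsed before matching.

```python
from itertools import groupby


def includes(word, jumble):
    """Return True if the letters of `word` appear in `jumble` in order.

    Runs of repeated letters in `word` are collapsed first, so "queen"
    is matched as q, u, e, n (an assumption taken from the question's example).
    """
    remaining = iter(jumble)
    for letter, _ in groupby(word):
        # Advance through the jumble until this letter turns up. The iterator
        # remembers its position, which is what enforces the ordering.
        for candidate in remaining:
            if candidate == letter:
                break
        else:
            return False  # the jumble ran out before the letter was found
    return True


print(includes("queen", "qwertyuytresdftyuiokn"))  # True
print(includes("queen", "neuq"))                   # False
```

No dictionary is needed; a single pass over the jumble is enough because each letter is searched for only in the part that has not been consumed yet.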
If you are not just checking substrings:
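def include(a, b):
    # removes duplicates while keeping the original letter order
    # (a plain set() would scramble the order)
    a = "".join(dict.fromkeys(a))
    if len(a) == 1:
        return a in b
    try:
        pos = b.index(a[0])
        return include(a[1:], b[pos:])
    except (IndexError, ValueError):
        # empty pattern, or the letter does not occur in the rest of b
        return False
print(include('queen', 'qwertyuytresdftyuiokn'))
#True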

I want to know how I can shorten this code and make it look more proper

I want to know if the code I wrote can be shortened further. I was practicing and came across a task that asks you to return a boolean value. This is what the question says:
Given two strings, return True if either of the strings appears at the
very end of the other string, ignoring upper/lower case differences
(in other words, the computation should not be "case sensitive").
Note: s.lower() returns the lowercase version of a string.
def end_other(a, b):
    x = len(b)
    n = a[-x:]
    y = len(a)
    m = b[-y:]
    if b.lower() == n.lower() or a.lower() == m.lower() :
        return True
    else:
        return False
The Code is working properly but I wondered if it can be shortened more so it looks good.
You can write it like this:
def end_other(a, b):
    n = a[-len(b):]
    m = b[-len(a):]
    return b.lower() == n.lower() or a.lower() == m.lower()
I removed the variables x and y because each is used only once, and I removed the if-else statement because it's unnecessary: you can return the result of the comparison directly instead of checking its result and then returning it a second time.
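A further shortening is possible with the built-in str.endswith, which removes the slicing altogether. This is a sketch of an alternative, not the code from the answer above; note that it treats an empty string as appearing at the end of any string, which the slicing version does not.

```python
def end_other(a, b):
    """Return True if either string ends with the other, ignoring case."""
    a, b = a.lower(), b.lower()
    return a.endswith(b) or b.endswith(a)


print(end_other("Hiabc", "abc"))    # True
print(end_other("AbC", "HiaBc"))    # True
print(end_other("abc", "abXabc"))   # True
print(end_other("abc", "abd"))      # False
```

Lowercasing each string once up front also avoids calling lower() repeatedly.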

Why can't input() in Python 3.6 be placed in an if _ is _: statement like this one?

Edited version
x = input("Put 'thing' here \n")
if x is 'thing':
    print("Success thingx!")
else:
    print(x)
y = "thing"
if y is 'thing':
    print("Success thingy!")
else:
    print(x)
While I expected my result to be
Put 'thing' here
thing
#above is the input
Success thingx!
Success thingy!
I got the result
Put 'thing' here
thing
#above is the input
thing
Success thingy!
Is there an error in how I am writing this?
The is operator checks for identity. CPython uses string interning to reuse the same str object in different places. But input() creates a new str object, not the interned one.
x1 is x2 => id(x1) == id(x2)
Adding some id prints:
x = str(input("Put 'thing' here \n"))
print('x', id(x))
print('thing', id('thing'))
if x is 'thing':
    print("Success thingx!")
else:
    print(x)
y = str("thing")
print('y', id(y))
if y is 'thing':
    print("Success thingy!")
else:
    print(x)
then the output is
thing
x 42666048
thing 42737760
thing
y 42737760
Success thingy!
You should use == if you want to test for equality.
You're confusing is with ==.
The is operator checks exact identity, that the two objects are actually the same exact object. This is implementation-specific and often the case for interned strings (small string literals) in cpython. The return value of input() is dynamically constructed and doesn't participate in interning here.
You want to use == to check equality between strings. Reserve is for singletons such as True, False, None, ... (Ellipsis), NotImplemented, types, etc.
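The difference can be seen without typing any input. In the sketch below, the string assembled with str.join stands in for the value returned by input(); the variable names are made up for illustration.

```python
typed = "".join(["th", "ing"])  # built at run time, much like what input() returns
literal = "thing"

print(typed == literal)  # True: the characters are equal

# Identity is a separate question. Whether this prints True or False is an
# implementation detail and should not be relied on.
print(typed is literal)

result = None
print(result is None)  # True: None is a singleton, so `is` is appropriate
```

Recent Python versions also emit a SyntaxWarning for comparisons such as x is 'thing', because using is with a literal is almost always a mistake.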
