Can someone explain how this code works with range and slicing? - python-3.x

s = 'eljwboboblejr'  # don't paste into grader
count = 0
for i in range(len(s)):
    if s[i:i+3] == 'bob':
        count += 1
print('Number of times bob occurs is: ' + str(count))
I do not understand how len works here, or what if s[i:i+3] == 'bob' does.

Here, i runs through every index of s, and on each iteration the loop takes the slice from index i up to (but not including) index i+3. len(s) simply returns the length of s (the number of characters in it) as an integer, so range(len(s)) yields every valid index. The test s[i:i+3] == 'bob' checks whether that three-character slice is equal to 'bob'; if it is, the expression is True and count is incremented. It's not the greatest of explanations, but I hope it helps.

documentation for len is here:
https://docs.python.org/3.2/library/functions.html#len
Under the hood, len(s) calls the object's special ("magic") method __len__; it is a special method, not a private one.
documentation for range is here:
https://docs.python.org/3.2/library/functions.html#range
With one arg, range generates integers 0 to that arg (excluding arg itself).
The slice in the loop evaluates to 'elj', then 'ljw', then 'jwb', ... in subsequent iterations. The slice [a:b] doesn't include the b'th element.
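A small sketch of the points above, using the string from the question. The helper name count_overlapping is made up for illustration; it is not part of the original code.

```python
s = 'eljwboboblejr'

print(len(s))              # 13 characters
print(list(range(3)))      # [0, 1, 2]: range stops before its argument

def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing overlaps (hypothetical helper)."""
    count = 0
    for i in range(len(text)):
        # The slice covers indices i .. i+len(sub)-1; near the end of the
        # string it is simply shorter and never raises an IndexError.
        if text[i:i + len(sub)] == sub:
            count += 1
    return count

print([s[i:i + 3] for i in range(5)])   # ['elj', 'ljw', 'jwb', 'wbo', 'bob']
print(s[11:14])                         # 'jr' (slice runs off the end)
print(count_overlapping(s, 'bob'))      # 2
print(s.count('bob'))                   # 1
```

The last line is why the loop is needed at all: str.count only counts non-overlapping occurrences, so it misses the second 'bob' in 'bobob'.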

Related

calling functions in python 3 from within a function

Given a string, return the count of the number of times that a substring length 2 appears in the string and also as the last 2 chars of the string, so "hixxxhi" yields 1 (we won't count the end substring).
last2('hixxhi') → 1
last2('xaxxaxaxx') → 1
last2('axxxaaxx') → 2
I found this question in one of the websites (https://codingbat.com/prob/p145834).
The answer to the above question as given on the website is as follows :
def last2(str):
    # Screen out too-short string case.
    if len(str) < 2:
        return 0
    # last 2 chars, can be written as str[-2:]
    last2 = str[len(str)-2:]
    count = 0
    # Check each substring length 2 starting at i
    for i in range(len(str)-2):
        sub = str[i:i+2]
        if sub == last2:
            count = count + 1
    return count
I have a question about the following line of code:
last2 = str[len(str)-2:]
I know that this line extracts the last 2 letters of the string 'str'. What confuses me is the variable name: it is the same as the name of the function. So is this line calling the function again and updating the value of the variable 'str'?
def last2(str):
. . .
This creates a parameter called str that shadows the built-in str class*. Within this function, str refers to the str parameter, not the str built-in class.
This is poor practice though. Don't name your variables the same thing as existing builtins. This causes confusing situations like this, and leads to issues like this.
A better name would be something that describes what purpose the string has, instead of just a generic, non-meaningful str.
* The built-in str is actually a class, not a plain function. str(x) is a call to the constructor of the str class.
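A minimal sketch of both effects, assuming nothing beyond standard Python name lookup. The function broken is a made-up example, not code from the question.

```python
def last2(str):
    # Inside this function 'str' is the parameter, not the built-in class.
    # The next line is a plain assignment to a local name; nothing is called.
    last2 = str[-2:]
    return last2

def broken(str):
    # Here the shadowing bites: 'str' is the argument (a string), so
    # trying to call it like the built-in raises TypeError.
    return str(len(str))

print(last2('hixxhi'))        # hi
try:
    broken('abc')
except TypeError as exc:
    print('TypeError:', exc)
```

Outside the functions the names are untouched: last2 still refers to the function and str to the built-in class.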
def last2(str):
    if len(str) == 0:
        return 0
    last_two = str[-2:]
    count = 0
    for i in range(len(str)):
        if last_two == str[i:i + 2]:
            count += 1
    return count - 1
This is the first answer that worked for me. The official answer is better, but this one might be less confusing for you.

Max Length Removal

The problem is: if the string contains “100” as a substring, we can delete that substring. The task is to find the length of the longest substring that can be removed.
s = input('')
i = 0
if '100' not in s:
    print('0')
else:
    st = ''
    while i < len(s)-2:
        if s[i:i+3] == '100':
            s = s.replace('100', '')
            a = s.find('100')
            if a <= i:
                st = st + '100'
                i = a
            else:
                st = '100'
                i = i+1
        else:
            i = i+1
    print(len(st))
For the input 101001010000, this code prints 9 instead of 12;
somehow the else part is not getting executed.
Could someone please help me out?
s.replace() removes all occurrences of the substring, not just the first, searching from the start of the string.
This means that '101001010000'.replace('100', '') replaces two occurrences:
>>> '101001010000'.replace('100', '')
'101000'
but you count that as one replacement.
str.replace() takes a third argument, the number of replacements to be made, see the documentation:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
Use that to limit the number of replacements.
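As a rough illustration of the count argument, here is the input from the question with one removal per call. The loop only counts removals one at a time; it is a sketch of the fix described above, not a full solution to the original problem in general.

```python
s = '101001010000'

print(s.replace('100', ''))      # '101000'    (two occurrences removed at once)
print(s.replace('100', '', 1))   # '101010000' (only the first occurrence removed)

removed = 0
while '100' in s:
    # Remove exactly one occurrence per iteration so each one is counted.
    s = s.replace('100', '', 1)
    removed += 1

print(removed, removed * 3, repr(s))   # 4 12 ''
```

For this particular input every character ends up removed, which matches the expected answer of 12.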

How to get longest alphabetically ordered substring in python

I am trying to write a function that returns the longest substring of s in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', the function should return 'beggh'
Here is my function, which is not complete yet; it does not return the list sub.
Instead it raises this error:
"IndexError: string index out of range"
def longest_substring(s):
    sub=[]
    for i in range (len(s)-1):
        subs=s[i]
        counter=i+1
        while ord(s[i])<ord(s[counter]):
            subs+=s[counter]
            counter+=1
        sub.append(subs)
    return sub
It is not optimal (every start index is scanned again, so the worst case is quadratic rather than linear), but I made some modifications to your code (in Python 3):
def longest_substring(s):
    length = len(s)
    if length == 0:  # Empty string
        return s
    final = s[0]
    for i in range(length-1):
        current = s[i]
        counter = i+1
        while counter < length and ord(s[i]) <= ord(s[counter]):
            current += s[counter]
            counter += 1
            i += 1
        if len(final) < len(current):
            final = current
    return final

s = 'azcbobobegghakl'
print(longest_substring(s))
Output:
beggh
Modifications:
In your while loop you compare against the character at the fixed position i and only increment counter, so I increment i as well; each character is then compared with the one just before it. (Note that the for loop reassigns i at the start of every iteration, so already-checked characters are scanned again and the worst case is quadratic, not linear.)
The added check counter < length in the while condition is what prevents the IndexError.
You only test for less-than in while ord(s[i])<ord(s[counter]):, but equal characters have to be allowed too, so the test is now <=.
You built a list and appended every sequence to it, which is unnecessary unless you want to do other calculations on the sequences, so I keep a single string and replace it whenever a longer sequence is found.
Note: If two sequences have the same length, the first one to occur is returned.
Another Input:
s = 'acdb'
Output:
acd
I hope this will help you.
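For comparison, a single-pass variant is sketched below. It is not the code from the answer above; the function name longest_ordered_substring and its variables are made up here. It tracks where the current non-decreasing run started instead of rebuilding a string for every start index, so each character is visited once.

```python
def longest_ordered_substring(s):
    """Return the first longest run of non-decreasing characters in s."""
    if not s:
        return s
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(s)):
        if s[i] < s[i - 1]:
            # Order is broken, so a new run starts at i.
            run_start = i
        run_len = i - run_start + 1
        if run_len > best_len:  # strict '>' keeps the first run on ties
            best_start, best_len = run_start, run_len
    return s[best_start:best_start + best_len]

print(longest_ordered_substring('azcbobobegghakl'))  # beggh
print(longest_ordered_substring('acdb'))             # acd
```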

Searching a minimal string meeting some conditions

Recently, I was asked the following problem during an interview.
Given a string S, I need to find another string S2 such that S2 is a subsequence of S and also S is a subsequence of S2+reverse(S2). Here '+' means concatenation. I need to output the min possible length of S2 for given S.
I was told that this is a dynamic programming problem however I was unable to solve it. Can somebody help me with this problem?
EDIT-
Is there a way to do this in O(N^2) or less?
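To make the two conditions concrete, here is a small checker sketch in Python. It only verifies a candidate S2 and does not search for the minimum; the names is_subsequence and is_valid_s2 are illustrative, not from the question.

```python
def is_subsequence(small, big):
    """True if small can be obtained from big by deleting characters."""
    it = iter(big)
    # 'ch in it' consumes the iterator up to the match, so order is respected.
    return all(ch in it for ch in small)

def is_valid_s2(s, s2):
    """Check both conditions from the problem statement for a candidate s2."""
    return is_subsequence(s2, s) and is_subsequence(s, s2 + s2[::-1])

print(is_valid_s2('abcba', 'abc'))   # True:  'abcba' fits inside 'abccba'
print(is_valid_s2('abcba', 'ab'))    # False: 'abba' has no 'c'
```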
There are 2 important aspects to this problem.
Since we need S as a substring of S2+reverse(S2), S2 must have a length of at least n/2.
After concatenating S2 and reverse(S2), the letters repeat in a mirrored pattern around the join.
So the solution is to scan from the center of S to the end of S for two equal consecutive elements. If you find such a pair, compare the elements on either side of it, moving outwards.
If you are able to reach the end of the string this way, then the minimum number of elements (the result) is the distance from the start to the point where you found the consecutive elements. In this example it is C, i.e. 3.
This will not always happen at the center, i.e. you may not find consecutive elements there. If the consecutive elements are after the center, we can run the same test from that point.
[Figure: main string, substring, and concatenated string]
Now comes the main doubt: why do we consider only the left side starting from the center? The answer is simple: the concatenated string is made by S2+reverse(S2), so we are sure that the last element of the substring appears twice in a row in the concatenated string. No repetition in the first half of the main string can give a better result, because the final concatenated string needs at least n letters.
Now the matter of complexity:
Searching for consecutive letters takes at most O(n).
Checking the elements on either side iteratively has a worst-case complexity of O(n), i.e. at most n/2 comparisons.
The second check may fail many times, so the two complexities multiply, giving O(n*n).
I believe this is a correct solution and I have not found any loophole yet.
Let's say that S2 is "apple". Then we can make this assumption:
S2 + reverseS2 >= S >= S2
"appleelppa" >= S >= "apple"
So the given S will be something that includes "apple" and is no longer than "appleelppa". It could be "appleel" or "appleelpp".
String S = "locomotiffitomoc";
// as you can see, S2 is "locomotif", but
// we don't know S2 yet, so it starts blank
String S2 = "";
for (int a = 0; a < S.length(); a++) {
    int b = 0;
    // mirror outwards around the gap between positions a and a+1;
    // the bounds are checked explicitly instead of catching an exception
    while (a - b >= 0 && a + b + 1 < S.length()
            && S.charAt(a - b) == S.charAt(a + b + 1))
        b++;
    // if the loop stopped on a mismatch (or ran off the start), this a does not work;
    // if it stopped because it reached the end of the string, we found it
    if (a + b + 1 >= S.length()) {
        S2 = S.substring(0, a + 1);
        break;
    }
}
System.out.println(S2); // will print out "locomotif"
Congratulations, you found the minimum S2.
Each character of S can either be included in S2 or not. With that we can construct a recursion that tries two cases:
the first character of S is used for the cover,
the first character of S is not used for the cover,
and calculates the minimum of these two covers. To implement this, it is enough to track how much of S is already covered by the chosen S2+reverse(S2).
There are optimizations for the cases where we already know the result (cover found, or no cover possible), and there is no need to take the first character for the cover if it does not cover anything.
Simple Python implementation:
cache = {}

def S2(S, to_cover):
    if not to_cover:  # Covered
        return ''
    if not S:  # Not covered
        return None
    if len(to_cover) > 2*len(S):  # Can't cover
        return None
    key = (S, to_cover)
    if key not in cache:
        without_char = S2(S[1:], to_cover)  # Calculate with first character skipped
        cache[key] = without_char
        _f = to_cover[0] == S[0]
        _l = to_cover[-1] == S[0]
        if _f or _l:
            # Calculate with first character used
            with_char = S2(S[1:], to_cover[int(_f):len(to_cover)-int(_l)])
            if with_char is not None:
                with_char = S[0] + with_char  # Prepend char to result
                if without_char is None or len(with_char) <= len(without_char):
                    cache[key] = with_char
    return cache[key]

s = '21211233123123213213131212122111312113221122132121221212321212112121321212121132'
c = S2(s, s)
print(len(s), s)
print(len(c), c)

Matlab - How do I compare two strings letter by letter?

Essentially, I have two strings of equal length, let's say 'AGGTCT' and 'AGGCCT' for example's sake. I want to compare them position by position and get a readout of where they do not match. So here I would hope to get 1, because there is only one position (position 4) where they do not match. If anyone has ideas for the positional comparison code, that would help me a lot to get started.
Thank you!!
Use the following syntax to get the number of dissimilar characters for strings of equal size:
sum( str1 ~= str2 )
If you want to be case insensitive, use:
sum( lower(str1) ~= lower(str2) )
The expression str1 ~= str2 performs char-by-char comparison of the two strings, yielding a logical vector of the same size as the strings, with true where they mismatch (using ~=) and false where they match. To get your result simply sum the number of true values (mismatches).
EDIT: if you want to count the number of matching chars you can:
Use "equal to" == operator (instead of "not-equal to" ~= operator):
sum( str1 == str2 )
Subtract the number of mismatches from the total number:
numel(str1) - sum( str1 ~= str2 )
You can compare all the elements of the strings:
r = all(seq1 == seq2)
This compares char by char and returns true if all the elements of the resulting array are true. If the strings can have different sizes, you may want to compare the sizes first. An alternative is the following (note that it returns true when the strings differ):
r = any(seq1 ~= seq2)
Another solution is to use strcmp:
r = strcmp(seq1, seq2)
I would just like to point out that you are asking to calculate the Hamming distance (as you ask for alternatives - the article contains links to some). This is already discussed here. In short, the builtin command pdist can do it.

Resources