Could someone help me understand this code? - python-3.x

def count_char(text, char):
    count = 0
    for c in text:
        if c == char:
            count += 1
    return count

filename = input("Enter a filename: ")
with open(filename) as f:
    text = f.read()

for char in "abcdefghijklmnopqrstuvwxyz":
    perc = 100 * count_char(text, char) / len(text)
    print("{0} - {1}%".format(char, round(perc, 2)))

It's a script that counts the relative occurrence of letters abcdefghijklmnopqrstuvwxyz in the given text file.
The first block defines a function that counts how many times the character char is present in the text:
def count_char(text, char):
    count = 0
    for c in text:
        if c == char:
            count += 1
    return count
The second block asks you to input the name of the file:
filename = input("Enter a filename: ")
and saves the contents of that file as a string in the variable text:
with open(filename) as f:
    text = f.read()
The third block displays the relative occurrence of characters a b c d e f g h i j k l m n o p q r s t u v w x y z in text.
For each of these characters, it computes the ratio of the number of occurrences of that character, count_char(text, char), to the total length of the text, len(text), and multiplies the result by 100 to convert it to a percentage:
perc = 100 * count_char(text, char) / len(text)
and displays the results as a formatted string. The numbers in curly brackets are replaced by the character char and the percentage of its occurrence, rounded to two decimals round(perc, 2):
print("{0} - {1}%".format(char, round(perc, 2)))
You can read more about string formatting in Python here.
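The script above rescans the whole text once per letter. As a rough sketch of the same calculation done in a single pass, the text can be counted once with collections.Counter. The function name letter_percentages is hypothetical, and returning an empty dict for empty input is an assumption made here to avoid dividing by zero:

```python
from collections import Counter
import string


def letter_percentages(text):
    """Map each lowercase ASCII letter to its share of all characters in text, in percent."""
    if not text:
        return {}  # assumption: an empty text yields no percentages instead of dividing by zero
    counts = Counter(text)  # counts every character in one pass over the text
    return {
        char: round(100 * counts[char] / len(text), 2)
        for char in string.ascii_lowercase
    }


if __name__ == "__main__":
    for char, perc in letter_percentages("hello world").items():
        print("{0} - {1}%".format(char, perc))
```

As in the original script, the percentage is relative to all characters in the text, including spaces and punctuation.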

Related

Finding a substring that occurs k times in a long string

I'm trying to solve an algorithm task, but my solution exceeds the time limit.
The task statement is as follows:
You are given a long string consisting of lowercase Latin letters. We need to find all its substrings of length n that occur at least k times.
Input format:
The first line contains two natural numbers n and k separated by a space.
The second line contains a string consisting of lowercase Latin letters. The string length is 1 ≤ L ≤ 10^6.
n ≤ L, k ≤ L.
Output Format:
For each found substring, print the index of the beginning of its first occurrence (numbering in the string starts from zero).
Output indexes in any order, in one line, separated by a space.
My final solution looks something like this:
def polinomial_hash(s: str, q: int, R: int) -> int:
    h = 0
    for c in s:
        h = (h * q + ord(c)) % R
    return h

def get_index_table(inp_str, n):
    q = 1000000007
    power = q ** (n-1)
    R = 2 ** 64
    M = len(inp_str)
    res_dict = {}
    cur_hash = polinomial_hash(inp_str[:n], q, R)
    res_dict[cur_hash] = [0]
    for i in range(n, M):
        first_char = inp_str[i-n]
        next_char = inp_str[i]
        cur_hash = (
            (cur_hash - ord(first_char)*(power))*q
            + ord(next_char)) % R
        try:
            d_val = res_dict[cur_hash]
            d_val += [i-n+1]
        except KeyError:
            res_dict[cur_hash] = [i-n+1]
    return res_dict

if __name__ == '__main__':
    n, k = [int(i) for i in input().split()]
    inp_str = input()
    for item in get_index_table(inp_str, n).values():
        if len(item) >= k:
            print(item[0], end=' ')
Is it possible to optimize this solution, or can you suggest alternative approaches?
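One likely bottleneck in the code above is power = q ** (n-1): it is computed without a modulus, so for large n it becomes an enormous integer, and every multiplication by it in the loop is slow. Below is a minimal sketch, assuming the same q and R as in the question, that uses the three-argument pow to keep the power reduced modulo R and stores only a count and a first index per hash instead of a full list of positions. The function name find_frequent_substrings is hypothetical. Like the original, it trusts the hash and does not verify collisions, so it is a sketch rather than a guaranteed-correct solution:

```python
def find_frequent_substrings(s, n, k, q=1_000_000_007, R=2**64):
    """Return first-occurrence indexes of length-n substrings that occur at least k times.

    Substrings are compared by rolling hash only; collisions are not checked.
    """
    if n > len(s):
        return []
    power = pow(q, n - 1, R)  # modular exponentiation keeps the value below R
    h = 0
    for c in s[:n]:
        h = (h * q + ord(c)) % R
    first_index = {h: 0}  # hash -> index of the first window with that hash
    count = {h: 1}        # hash -> number of windows with that hash
    for i in range(n, len(s)):
        # Drop the leftmost character of the window, then append the new one.
        h = ((h - ord(s[i - n]) * power) * q + ord(s[i])) % R
        if h in count:
            count[h] += 1
        else:
            first_index[h] = i - n + 1
            count[h] = 1
    return [first_index[h] for h, c in count.items() if c >= k]


if __name__ == "__main__":
    print(*find_frequent_substrings("abcabcabc", 3, 2))
```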

Get the nth occurrence of a letter in a string (python)

Let's say there is a string "abcd#abcd#a#".
How can I get the index of the 2nd occurrence of '#', which should give 9 as the output?
Using a generator expression:
text = "abcd#abcd#a#"
gen = (i for i, l in enumerate(text) if l == "#")
next(gen) # skip as many as you need
4
next(gen) # get result
9
As a function:
def index_for_occurrence(text, token, occurrence):
    gen = (i for i, l in enumerate(text) if l == token)
    for _ in range(occurrence - 1):
        next(gen)
    return next(gen)
Result:
index_for_occurrence(text, "#", 2)
9
s = 'abcd#abcd#a#'
s.index('#', s.index('#')+1)
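The nested s.index call above handles the 2nd occurrence only. As a rough sketch of how the same idea could be extended to any n, str.find can be called in a loop, each time starting just past the previous match. The function name find_nth is hypothetical, and returning -1 when there are fewer than n occurrences is an assumption that mirrors str.find:

```python
def find_nth(text, token, n):
    """Return the index of the n-th occurrence (1-based) of token in text, or -1."""
    index = -1
    for _ in range(n):
        # Search again, starting one position after the previous match.
        index = text.find(token, index + 1)
        if index == -1:
            return -1  # fewer than n occurrences
    return index


if __name__ == "__main__":
    print(find_nth("abcd#abcd#a#", "#", 2))
```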

Multiplying all the digits of a number in python

If I have the number 101, I want to multiply all of its digits (1*0*1), but the result is zero. Instead, how can I split the number into 10 and 1 and multiply those (10*1)?
Similar examples,
3003033 -> 300*30*3*3 or
2020049 -> 20*200*4*9
You could split with a negative lookbehind, which checks that the position is not the start of the string, and a positive lookahead for a digit that is not 0.
REGEX: Essentially, this says to split wherever the next digit is not 0 and the position is not the start of the line.
/(?<!^)(?=[^0])/gm
Negative Lookbehind (?<!^)
  Assert that the regex below does not match
  ^ asserts position at start of a line
Positive Lookahead (?=[^0])
  Assert that the regex below matches
  Match a single character not present in the list below [^0]
    0 matches the character 0 literally (case sensitive)
CODE
import re
from functools import reduce

def sum_split_nums(num):
    nums = re.split(r'(?<!^)(?=[^0])', str(num))
    total = reduce((lambda x, y: int(x) * int(y)), nums)
    return " * ".join(nums), total

nums = [3003033, 2020049, 101, 4040]
for num in nums:
    expression, total = sum_split_nums(num)
    print(f"{expression} = {total}")
OUTPUT
300 * 30 * 3 * 3 = 81000
20 * 200 * 4 * 9 = 144000
10 * 1 = 10
40 * 40 = 1600
Let a and b be two integers, and let c be the number formed by appending n zeros to the right of b. Then a * c equals a * b * 10^n.
This simplifies the task: multiply the digits of your number together, but treat every 0 as 10. So you don't actually need to split the number.
Here I define two functions. Both convert the number to a string and loop over its digits; they differ in how they handle a 0:
1) Multiply the running result by the digit if it is not 0; otherwise multiply it by 10.
def multio1(x):
    s = str(x)
    ans = 1
    for i in range(len(s)):
        if s[i] != '0':
            ans *= int(s[i])
        else:
            ans *= 10
    return ans
2) Multiply the running result by the digit if it is not 0; otherwise increment a counter of zeros. At the end, append that many zeros to the right of the result.
def multio2(x):
    s = str(x)
    ans = 1
    number_of_zeros = 0
    for i in range(len(s)):
        if s[i] != '0':
            ans *= int(s[i])
        else:
            number_of_zeros += 1
    if number_of_zeros != 0:
        ans = str(ans)
        for i in range(number_of_zeros):
            ans += '0'
        ans = int(ans)
    return ans
Now multio1(x) and multio2(x) give the same results for x = 101, 3003033, 2020049, shown below:
10,81000,144000
That's kind of odd, but this code will work:
a = '3003033'
num = ''
last_itr = 0
tot=1
for i in range(len(a)-1):
    if a[i]=='0' and a[i+1]<='9' and a[i+1]>'0':
        tot*=int(a[last_itr:i+1])
        last_itr=i+1
    elif a[i]>'0' and a[i]<='9' and a[i+1]<='9' and a[i+1]>'0':
        tot*=int(a[i])
        last_itr=i+1
tot*=int(a[last_itr:len(a)])
print(tot)
Just assign your number, as a string, to a.
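For comparison, the grouping used in the answers above (each non-zero digit together with the zeros that follow it) can also be collected directly with re.findall instead of splitting. This is only a sketch: the function name multiply_zero_padded_groups is hypothetical, math.prod requires Python 3.8 or later, and returning 1 for an input with no non-zero digits (such as 0) is simply what math.prod does with an empty sequence, not something the question specifies:

```python
import math
import re


def multiply_zero_padded_groups(num):
    """Multiply the groups formed by each non-zero digit and the zeros that follow it."""
    # Each match is one non-zero digit plus any zeros directly after it, e.g. "300".
    groups = re.findall(r"[1-9]0*", str(num))
    # math.prod returns 1 for an empty sequence (e.g. when num is 0).
    return math.prod(int(g) for g in groups)


if __name__ == "__main__":
    for num in (3003033, 2020049, 101, 4040):
        print(num, "->", multiply_zero_padded_groups(num))
```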

Python - Input a str. and convert it to a number. (print calculation)

How do I input a letter (a, b or c) and then print the result as an int (5 * 1), not as a str (5 * a)?
number = int(input("Input a number: "))
letter = input("Input a letter: ")
a = 1
b = 5
c = 3
print(number * letter)
You can simply use a dictionary here. For your code:
number = int(input('Enter a number: '))
letter = input('Enter a letter: ')
values = {'a': 1, 'b': 5, 'c': 3}  # lookup table: letter -> value
print(number * values[letter.lower()])
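The direct lookup above raises KeyError for any letter other than a, b or c. A minimal sketch of the same dictionary idea wrapped in a function, assuming unknown letters should be reported rather than raise an error. The names LETTER_VALUES and multiply_by_letter are hypothetical:

```python
# Letter-to-value table taken from the question (a = 1, b = 5, c = 3).
LETTER_VALUES = {"a": 1, "b": 5, "c": 3}


def multiply_by_letter(number, letter, values=LETTER_VALUES):
    """Return number times the value mapped to letter, or None if the letter is unknown."""
    # dict.get returns None for a missing key instead of raising KeyError.
    value = values.get(letter.strip().lower())
    if value is None:
        return None
    return number * value


if __name__ == "__main__":
    result = multiply_by_letter(5, "b")
    print(result if result is not None else "Unknown letter")
```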
You can try this:
number = int(input('Input a number: '))
letter = input('Input a letter: ')
a = 1
b = 5
c = 3
letter = letter.lower()
if letter in 'abc':
    idx = 'abc'.index(letter)
    letter = [a, b, c][idx]
    # Print only after the letter has been mapped to a number.
    print(number * letter)
For a more flexible way, you can do this:
number = int(input('Input a number: '))
letter = input('Input a letter: ')
a = 1
b = 5
c = 3
d = 4
letter = letter.lower()
# locals() returns a dict of all variables in the current scope.
v = locals().get(letter)
if v is None:
    print('"%s" is not an expected choice!' % letter)
else:
    print(number * v)

Printing integers tiled horizontally - Python3

I'm trying to obtain the following: I want to print a range of integers, but once the range goes past 9, the tens digit (the '1' in '10') needs to be printed on the line above.
e.g.:
6 -> 123456
13 -> .........1111
......1234567890123
Note that if the range has fewer than 10 numbers, no upper line is printed. The '.' characters should be spaces, but the editor won't let me enter them.
I've tried the following:
line10 = ''
line1 = ''
if length > 10:
    for i in range(length):
        if (i + 1) // 10 == 0:
            line10 += ' '
        else:
            line10 += str((i + 1) // 10)
for i in range(length):
    line1 += str((i + 1) % 10)
if length > 10:
    print(line10)
print(line1)
This works, but how can I make it work for, say, 100 or 1000 without having to copy the lines of code?
Thanks in advance.
There may be a more elegant solution to your problem, but I believe this does what you require:
def number_printer(n):
    lines = [[] for m in range(len(str(n)))]
    for i in range(1, n+1):
        diff = len(str(n))-len(str(i))
        if diff > 0:
            for z in range(diff):
                lines[z].append(" ")
            for x, y in enumerate(str(i)):
                lines[x+diff].append(y)
        else:
            for x, y in enumerate(str(i)):
                lines[x].append(y)
    for line in lines:
        print("".join(line))  # print() call, as required in Python 3

if __name__ == "__main__":
    number_printer(132)
Essentially, it compares the length of each number it counts through against the length of the largest number you wish to print (132 in this example). Wherever it finds a difference (diff > 0), it appends the appropriate number of blank spaces so that all the numbers align (e.g. for the number 12 it appends 1 blank space, as the difference in length between 12 and 132 is 1).
Hopefully this is what you were after!
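The same padding idea can be sketched more compactly by right-aligning each number with str.rjust and then transposing the columns with zip. This is an alternative under stated assumptions, not the answer's own method; the function name number_ruler is hypothetical, and it returns the lines instead of printing them so they are easy to inspect:

```python
def number_ruler(n):
    """Return the lines of a vertical-digit ruler for 1..n, most significant digit on top."""
    width = len(str(n))
    # Right-align every number so that its digits line up in fixed-height columns.
    columns = [str(i).rjust(width) for i in range(1, n + 1)]
    # Transpose: row r holds the r-th character of every column.
    return ["".join(row) for row in zip(*columns)]


if __name__ == "__main__":
    for line in number_ruler(13):
        print(line)
```

With this approach no upper line is produced when every number has a single digit, and numbers up to 100 or 1000 need no extra code because the number of rows follows from len(str(n)).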

Resources