I am a novice programmer learning Python using the book "Introduction to Computation and Programming Using Python" by John V. Guttag.
I am attempting a finger exercise in Chapter 3:
"The Empire State Building is 102 stories high. A man wanted to know the highest floor from which he could drop an egg without the egg breaking. If it broke, he would go down a floor and try again. He would do this until the egg did not break. At worst, this method requires 102 eggs. Implement a method that at worst uses 7 eggs".
I implemented it this way. I do not know if it is the most efficient non-recursive method:
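low = 0      # assumed: floor 0 (the ground) is always safe; not shown in the post
high = 102   # assumed: top floor, taken from the problem statement; not shown in the post
ans = (high + low) // 2
x = int(input("The highest floor without egg breaking is: "))
eggs_left = 7
guess_list = [ans]
while eggs_left > 0 and low <= high:
    if ans == x:
        print('Highest floor without egg break is', ans)
        print('Sequence of guesses: ', guess_list)
        print('Eggs remaining are: ', eggs_left)
        break
    elif ans < x:
        low = ans + 1    # egg survived, so search the floors above
    else:
        high = ans - 1   # egg broke, so search the floors below
        eggs_left = eggs_left - 1
        print("One egg broken")
        if eggs_left == 0:
            print("No more eggs")
            break
    ans = (high + low) // 2
    guess_list.append(ans)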


Problem with rounding when calculating the minimum number of coins in change (Python)

I have a homework assignment in which I have to write a program that outputs the change to be given by a vending machine using the lowest number of coins. E.g. £3.67 can be dispensed as 1x£2 + 1x£1 + 1x50p + 1x10p + 1x5p + 1x2p.
However, I'm not getting the right answers and suspect that this might be due to a rounding problem.
change=float(input("Input change"))
while change!=0:
if change-2>=0:
if change-1>=0:
if change-0.5>=0:
if change-0.2>=0:
if change-0.1>=0:
Input: 2.3
Output: 10010
i.e. 2.2
Input: 3.4
Output: 11011
i.e. 3.3
Some actually work:
Input: 3.2
Output: 11010
i.e. 3.2
Input: 1.1
Output: 01001
i.e. 1.1
Floating point accuracy
Your approach is correct, but as you guessed, the rounding errors are causing trouble. This can be debugged by simply printing the change variable and information about which branch your code took on each iteration of the loop:
initial value: 3.4
taking a 2... new value: 1.4
taking a 1... new value: 0.3999999999999999 <-- uh oh
taking a 0.2... new value: 0.1999999999999999
taking a 0.1... new value: 0.0999999999999999
1 1 0 1 1
If you wish to keep floats for input and output, multiply by 100 on the way in (cast to an integer with int(round(change * 100))) and divide by 100 on the way out of your function, allowing you to operate on integers.
Additionally, without the 5p, 2p and 1p values, you'll be restricted in the precision you can handle, so don't forget to add those. Multiplying all of the values in your code by 100 gives:
initial value: 340
taking a 200... new value: 140
taking a 100... new value: 40
taking a 20... new value: 20
taking a 20... new value: 0
1 1 0 2 0
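The float-to-integer conversion described above could be sketched as follows. This is only an illustration: the helper names to_pence and to_pounds are hypothetical and do not appear in the original code. The rounding step matters because int() on its own truncates, so a product that lands just below a whole number of pence would lose a penny.

```python
def to_pence(pounds):
    """Convert a float amount in pounds to an integer number of pence."""
    # round() before int(): multiplying by 100 can land just below the
    # intended whole number, and int() alone would truncate it.
    return int(round(pounds * 100))


def to_pounds(pence):
    """Convert integer pence back to a float amount in pounds for display."""
    return pence / 100


amount = to_pence(3.4)   # work in integer pence from here on
amount -= 200            # take a £2 coin
amount -= 100            # take a £1 coin
remaining = amount       # integer arithmetic, so no rounding drift
print(remaining, to_pounds(remaining))
```

All comparisons and subtractions then happen on integers, and floats only appear at the edges of the program.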
Avoid deeply nested conditionals
Beyond the decimal issue, the nested conditionals make your logic very difficult to reason about. This is a common code smell; the more you can eliminate branching, the better. If you find yourself going beyond about 3 levels deep, stop and think about how to simplify.
Additionally, with a lot of branching and hand-typed code, it's very likely that a subtle bug or typo will go unnoticed or that a denomination will be left out.
Use data structures
Consider using dictionaries and lists in place of blocks like:
which can be elegantly and extensibly represented as:
denominations = [200, 100, 50, 20, 10, 5, 2, 1]
used = {x: 0 for x in denominations}
In terms of efficiency, you can use math to handle amounts for each denomination in one fell swoop. Divide the remaining amount by each available denomination in descending order to determine how many of each coin will be chosen and subtract accordingly. For each denomination, we can now write a simple loop and eliminate branching completely:
for val in denominations:
    used[val] += amount // val
    amount -= val * used[val]
and print or show a final result of used like:
278 => {200: 1, 100: 0, 50: 1, 20: 1, 10: 0, 5: 1, 2: 1, 1: 1}
The end result of this is that we've reduced 27 lines down to 5 while improving efficiency, maintainability and dynamism.
By the way, if the denominations were a different currency, it's not guaranteed that this greedy approach will work. For example, if our available denominations are 25, 20 and 1 cents and we want to make change for 63 cents, the optimal solution is 6 coins (3x 20 and 3x 1). But the greedy algorithm produces 15 (2x 25 and 13x 1). Once you're comfortable with the greedy approach, research and try solving the problem using a non-greedy approach.
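As a starting point for that research, here is a minimal sketch of one non-greedy approach: a bottom-up dynamic programming table over every amount up to the target. It is one possible technique, not the only one, and the function names greedy_change and min_coins are hypothetical. The greedy version is included only so the two can be compared on the 25/20/1 example.

```python
def greedy_change(amount, denominations):
    """Greedy change: take as many of the largest coin as possible, then move on."""
    used = {}
    for val in sorted(denominations, reverse=True):
        used[val], amount = divmod(amount, val)
    return used


def min_coins(amount, denominations):
    """Fewest coins for `amount` via dynamic programming, or None if impossible."""
    INF = float("inf")
    best = [0] + [INF] * amount   # best[t] = fewest coins that sum to t
    last = [0] * (amount + 1)     # last[t] = a coin used in one optimal solution for t
    for total in range(1, amount + 1):
        for coin in denominations:
            if coin <= total and best[total - coin] + 1 < best[total]:
                best[total] = best[total - coin] + 1
                last[total] = coin
    if best[amount] == INF:
        return None
    # Walk back through `last` to recover which coins were used.
    used = {}
    while amount > 0:
        coin = last[amount]
        used[coin] = used.get(coin, 0) + 1
        amount -= coin
    return used


print(sum(greedy_change(63, [25, 20, 1]).values()))  # coin count from greedy
print(min_coins(63, [25, 20, 1]))                    # coins chosen by the DP
```

The table costs time proportional to amount times the number of denominations, so it is slower than the greedy loop but does not depend on the coin system being greedy-friendly.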

How to skew random choice probability towards one option?

I am using the random library in python to select win or lose.
import random
choice = random.choice(['win','lose'])
Is there any way to set the probability in the code so that I get more lose than win every time the code runs?
Well, @CloC is technically right, you could do that, but it is better to use the standard Python library as it is designed: less code, fewer bugs.
For example, sampling with the probability of Win being 60%:
v = random.choices(["Win", "Lose"], weights=[60, 40], k=1)
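To make the weighting concrete, here is a small sketch that skews the outcome towards "lose", as the question asks. The 40/60 split, the helper name pick and the fixed seed are illustrative assumptions. Note that random.choices returns a list, so the single result is taken with [0].

```python
import random


def pick(win_weight=40, lose_weight=60, rng=random):
    """Return 'win' or 'lose', weighted by the two (relative) weights."""
    return rng.choices(["win", "lose"], weights=[win_weight, lose_weight], k=1)[0]


# Seeded generator so the demonstration is repeatable; drop the seed in real use.
rng = random.Random(0)
results = [pick(rng=rng) for _ in range(10_000)]
lose_share = results.count("lose") / len(results)
print(f"share of 'lose': {lose_share:.3f}")
```

Over many draws the share of "lose" should settle near the ratio of the weights, although any single run will vary.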
A way to control random (which won't be random...) would be to:
Generate a number between 1 and 100:
n = random.randint(1, 100) # randint includes both endpoints
and then compare it with the percentage of wins/losses you want:
chances_to_win = 51 # 50%
if n < chances_to_win:
    print(str(chances_to_win-1) +"% chances to Win")
    choice = "Win"
else:
    print(str(100-(chances_to_win-1)) +"% chances to Lose")
    choice = "Lose"
That way, you can control the percentage of wins and losses.

Can One Decrease Overall Accuracy in Python?

I'm making a program to calculate primes, and I look at the remainder of the possible prime I'm testing and all the primes I have so far, but stop if I get to the point where I am comparing the PossPrime to anything above its square root. (I can explain this if needed). I don't care about any digits after the decimal point of the sqrt; is there a way to tell Python not to bother calculating those?
And, is there a way to integrate that aspect (only testing the primes under it until I get to the sqrt) into the for loop?
#There's some boring setup before here that isn't problematic.
while True:
    PossWasDivis = False #initialize the var (used to convey if the possible prime was divis)
    sqrtP = int(sqrt(PossPrime))
    for iterationOfArray in range(2, sqrtP):
        # print ("Comparing: (", PossPrime, "% (", GlobPrimeList [iterationOfArray],")) == 0")
        if (PossPrime % (GlobPrimeList [iterationOfArray])) == 0:
            # print(PossPrime, "was divisible by", GlobPrimeList[iterationOfArray],"! Breaking for loop")
            PossWasDivis = True
            break # inferred from the message above; missing from the post as shown
        if (GlobPrimeList [iterationOfArray]) > sqrtP:
            break # inferred: stop once the primes exceed the square root
    if PossWasDivis == False: # Occurs when none of the tested #s are divis
        GlobPrimeList.append (PossPrime)
    #Switch between incrementing PossPrime by 2 and by 4
    if PossPrimeStat == 2:
        PossPrimeStat = 4
        PossPrime += 4
    else: # if PossPrimeStat == 4:
        PossPrimeStat = 2
        PossPrime += 2
The GlobPrimeList has some primes preloaded in it to start out with and continues finding more indefinitely until I cancel the program.
I don't care about any digits after the decimal point of the sqrt; is there a way to tell Python not to bother calculating those?
math.sqrt() works with floating-point values. The precision you get is baked right in to the math.
Python does contain functions for "arbitrary-precision" math, where you can specify how much precision you will get... but those functions are much slower than the functions that operate on float values. If you want faster code, you will be using float values, and there is no way to tell Python not to compute the whole value.
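One option the answer above does not mention: if only the whole-number part of the square root is wanted, math.isqrt (available from Python 3.8) returns it directly as an integer. The sketch below also shows one way to fold the "stop at the square root" test into the loop over known primes. The function names are hypothetical, and this is a simplified stand-in for the question's code rather than a rewrite of it.

```python
from math import isqrt


def is_prime_given(candidate, known_primes):
    """Trial-divide `candidate` by the known primes up to its integer square root."""
    limit = isqrt(candidate)      # integer result, no fractional digits
    for p in known_primes:
        if p > limit:
            break                 # every remaining prime is too large to be a factor
        if candidate % p == 0:
            return False
    return True


def primes_up_to(n):
    """Collect primes up to n, reusing the primes found so far as divisors."""
    primes = []
    for candidate in range(2, n + 1):
        if is_prime_given(candidate, primes):
            primes.append(candidate)
    return primes


print(primes_up_to(30))
```

Because the list of known primes is kept in increasing order, the loop can stop at the first prime above the limit instead of computing an index range in advance.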

Given a phrase without spaces add spaces to make proper sentence

This is what I have in mind, but it's O(n^2):
For example, if the input is "Thisisawesome", we need to check whether adding the current character makes the previously found set any longer and still meaningful. But to see how far we need to back up, we have to traverse all the way to the beginning. For example, "awe" and "some" are proper words, but "awesome" is the bigger word. Please suggest how we can improve the complexity. Here is the code:
void update(string in)
{
    int len = in.length();
    int DS[len];
    string word;
    for (int i = 0; i < len; i++) DS[i] = 0;
    for (int i = 0; i < len; i++)
    {
        for (int j = i + 1; j <= len; j++)
        {
            word = in.substr(i, j - i);
            DS[j-1] = (DS[j-1] > word.length()) ? DS[j-1] : word.length();
        }
    }
}
There is a dynamic programming solution which at first looks like it is going to be O(n^2) but which turns out to be only O(n) for sufficiently large n and fixed size dictionary.
Work through the string from left to right. At the ith stage you need to work out whether there is a solution for the first i characters. To solve this, consider every possible way to break those i characters into two chunks. If the second chunk is a word and the first chunk can be broken up into words then there is a solution. The first requirement you can check with your dictionary. The second requirement you can check by looking to see if you found an answer for the first j characters, where j is the length of the first chunk.
This would be O(n^2) because for each of 1,2,3,...n lengths you consider every possible split. However, if you know what the longest word in your dictionary is you know that there is no point considering splits which make the second chunk longer than this. So for each of 1,2,3...n lengths you consider at most w possible splits, where w is the longest word in your dictionary, and the cost is O(n).
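The approach described above might be sketched like this. The function name segment and the tiny example dictionary are assumptions made for illustration. The inner loop only looks back as far as the longest dictionary word, which is what keeps the number of splits per position bounded.

```python
def segment(text, dictionary):
    """Split `text` into dictionary words, or return None if that is impossible."""
    n = len(text)
    longest = max(len(word) for word in dictionary)
    # split_at[i] is the start index of the last word in a valid split of
    # text[:i], or None if the first i characters cannot be split.
    split_at = [None] * (n + 1)
    split_at[0] = 0               # the empty prefix is trivially splittable
    for i in range(1, n + 1):
        # Only consider second chunks no longer than the longest word.
        for j in range(max(0, i - longest), i):
            if split_at[j] is not None and text[j:i] in dictionary:
                split_at[i] = j
                break
    if split_at[n] is None:
        return None
    # Walk backwards through the recorded split points to recover the words.
    words = []
    i = n
    while i > 0:
        j = split_at[i]
        words.append(text[j:i])
        i = j
    return list(reversed(words))


words = {"this", "is", "awe", "some", "awesome"}
print(segment("thisisawesome", words))
```

Because j starts at the earliest allowed position, the longest matching final chunk is tried first, so "awesome" is preferred over "awe" plus "some" here. This sketch returns one valid split and makes no attempt to rank alternatives.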
I have coded my solution today, and will put it on a web site tomorrow. Anyway, the method is as follows:
Arrange the dictionary in a trie.
The trie can help to do multiple matches quickly, because all dictionary words starting with the same letters can be matched at the same time.
(e.g. "chairman" matches "chair" and "chairman" in a trie.)
Use Dijkstra's algorithm to find the best match.
(e.g. for "chairman", if we count "c" as position 0, then we have the relationships 0->5, 0->8, 1->5, 2->5, 5->8. These relationships form a network perfect for Dijkstra's algorithm.)
(Note: Where are the weights of the edges? See the next point.)
Assign weighting to dictionary words.
Without weighting, bad matches win out over good matches. (e.g. "iamahero" becomes "i ama hero" instead of "i am a hero".)
The SCOWL dictionary serves the purpose well, because it comes as word lists of different sizes. These sizes (10, 20, etc.) are a good choice for weighting.
After some tries I found a need to reduce the weighting of words ending with "s", so "eyesandme" becomes "eyes and me" instead of "eye sand me".
I have been able to split a paragraph in milliseconds. The algorithm has linear complexity in the length of the string to be split, so it scales well as long as there is enough memory.
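The steps above can be illustrated with a much smaller sketch. It is not the author's program: it uses a plain dict lookup in place of a trie, and the word weights are made-up stand-ins for SCOWL list sizes. Character positions are the graph nodes, each dictionary word found in the text is an edge, and Dijkstra's algorithm finds the cheapest path from position 0 to the end.

```python
import heapq


def split_words(text, weights):
    """Cheapest segmentation of `text`, using Dijkstra over character positions."""
    n = len(text)
    max_len = max(map(len, weights))
    best = {0: 0}      # cheapest known cost to reach each position
    prev = {}          # position -> start of the word that reached it
    heap = [(0, 0)]    # (cost so far, position)
    while heap:
        cost, i = heapq.heappop(heap)
        if i == n:
            break
        if cost > best.get(i, float("inf")):
            continue   # stale heap entry
        for j in range(i + 1, min(n, i + max_len) + 1):
            w = weights.get(text[i:j])
            if w is not None and cost + w < best.get(j, float("inf")):
                best[j] = cost + w
                prev[j] = i
                heapq.heappush(heap, (cost + w, j))
    if n not in best:
        return None    # no segmentation exists
    words = []
    j = n
    while j > 0:
        i = prev[j]
        words.append(text[i:j])
        j = i
    return " ".join(reversed(words))


# Hypothetical weights: common words are cheap, the rare word "ama" is expensive.
weighted = {"i": 10, "am": 10, "a": 10, "hero": 10, "ama": 70}
uniform = {word: 1 for word in weighted}
print(split_words("iamahero", uniform))
print(split_words("iamahero", weighted))
```

With uniform weights the path with the fewest words wins, which reproduces the "i ama hero" problem. With the rare word penalised, the four-word split becomes cheaper.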
Here's the dump (sorry for bragging). (The passage selected is "Novel" in Wikipedia.)
D:\GoogleDrive\programs\WordBreaker>"word breaker"<novelnospace.txt>output.txt
D:\GoogleDrive\programs\WordBreaker>type output.txt
Number of words after reading words-10.txt : 4101
Number of words after reading words-20.txt : 11329
Number of words after reading words-35.txt : 43292
Number of words after reading words-40.txt : 49406
Number of words after reading words-50.txt : 87966
Time elapsed in reading dictionary: 0.956782s
Enter the string to be broken into words:
a novel is along narrative normally in prose which describes fictional character
s and events usually in the form of a sequential story while i an watt in the ri
se of the novel 1957 suggests that the novel came into being in the early 18 th
century the genre has also been described as possessing a continuous and compreh
ensive history of about two thousand years with historical roots in classical gr
eece and rome medieval early modern romance and in the tradition of the novel la
the latter an italian word used to describe short stories supplied the present g
eneric english term in the 18 th century miguel de cervantes author of don quixo
te is frequently cited as the first significant europe an novelist of the modern
era the first part of don quixote was published in 1605 while a more precise de
finition of the genre is difficult the main elements that critics discuss are ho
w the narrative and especially the plot is constructed the themes settings and c
haracterization how language is used and the way that plot character and setting
relate to reality the romance is a related long prose narrative w alter scott d
efined it as a fictitious narrative in prose or verse the interest of which turn
s upon marvellous and uncommon incidents whereas in the novel the events are acc
ommodated to the ordinary train of human events and the modern state of society
however many romances including the historical romances of scott emily brontes w
u the ring heights and her man melvilles mo by dick are also frequently called n
ovels and scott describes romance as a kind red term romance as defined here sho
uld not be confused with the genre fiction love romance or romance novel other e
urope an languages do not distinguish between romance and novel a novel isle rom
and err o ma nil roman z o
Time elapsed in splitting: 0.00495095s
D:\GoogleDrive\programs\WordBreaker>type novelnospace.txt

Understanding Bayes' Theorem

I'm working on an implementation of a Naive Bayes Classifier. Programming Collective Intelligence introduces this subject by describing Bayes Theorem as:
Pr(A | B) = Pr(B | A) x Pr(A)/Pr(B)
As well as a specific example relevant to document classification:
Pr(Category | Document) = Pr(Document | Category) x Pr(Category) / Pr(Document)
I was hoping someone could explain to me the notation used here, what do Pr(A | B) and Pr(A) mean? It looks like some sort of function but then what does the pipe ("|") mean, etc?
Pr(A | B) = Probability of A happening given that B has already happened
Pr(A) = Probability of A happening
But the above is with respect to the calculation of conditional probability. What you want is a classifier, which uses this principle to decide whether something belongs to a category based on the previous probability.
See for a complete example
I think they've got you covered on the basics.
Pr(A | B) = Pr(B | A) x Pr(A)/Pr(B)
reads: the probability of A given B is the same as the probability of B given A times the probability of A divided by the probability of B. It's usually used when you can measure the probability of B and you are trying to figure out if B is leading us to believe in A. Or, in other words, we really care about A, but we can measure B more directly, so let's start with what we can measure.
Let me give you one derivation that makes this easier for writing code. It comes from Judea Pearl. I struggled with this a little, but after I realized how Pearl helps us turn theory into code, the light turned on for me.
Prior Odds:
O(H) = P(H) / (1 - P(H))
Likelihood Ratio:
L(e|H) = P(e|H) / P(e|¬H)
Posterior Odds:
O(H|e) = L(e|H)O(H)
In English, we are saying that the odds of something you're interested in (H for hypothesis) are simply the number of times you find it to be true divided by the number of times you find it not to be true. So, say one house out of 10,000 is robbed every day. That means that you have a 1/10,000 chance of being robbed (odds of 1 to 9,999), without any other evidence being considered.
The next one is measuring the evidence you're looking at: what is the probability of seeing the evidence you're seeing when your question is true, divided by the probability of seeing that evidence when your question is not true? Say you are hearing your burglar alarm go off. How often do you get that alarm when it's supposed to go off (someone opens a window when the alarm is on) versus when it's not supposed to go off (the wind set the alarm off)? If you have a 95% chance of a burglar setting off the alarm and a 1% chance of something else setting off the alarm, then you have a likelihood ratio of 95.0.
Your overall belief is just the likelihood ratio * the prior odds. In this case it is:
((0.95/0.01) * ((10**-4)/(1 - (10**-4))))
# => 0.0095009500950095
I don't know if this makes it any more clear, but it tends to be easier to have some code that keeps track of prior odds, other code to look at likelihoods, and one more piece of code to combine this information.
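A minimal sketch of that separation, using the burglar alarm numbers from above. The function names are hypothetical, and the final conversion from odds back to a probability is an extra step added for illustration.

```python
def prior_odds(p_h):
    """O(H) = P(H) / (1 - P(H))"""
    return p_h / (1 - p_h)


def likelihood_ratio(p_e_given_h, p_e_given_not_h):
    """L(e|H) = P(e|H) / P(e|not H)"""
    return p_e_given_h / p_e_given_not_h


def posterior_odds(odds, ratio):
    """O(H|e) = L(e|H) * O(H)"""
    return ratio * odds


def odds_to_probability(odds):
    """Convert odds back to a probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)


odds = prior_odds(1 / 10_000)            # one house in 10,000 is robbed
ratio = likelihood_ratio(0.95, 0.01)     # alarm given burglar vs. alarm otherwise
posterior = posterior_odds(odds, ratio)
probability = odds_to_probability(posterior)
print(posterior, probability)
```

Keeping the three pieces separate means new evidence can be folded in by multiplying the current odds by another likelihood ratio, provided the pieces of evidence can be treated as independent.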
I have implemented it in Python. It's very easy to understand because all formulas for Bayes theorem are in separate functions:
#Bayes Theorem
def get_outcomes(sample_space, f_name='', e_name=''):
    # Count outcomes, optionally restricted to one group (f_name) and/or one event (e_name)
    outcomes = 0
    for e_k, e_v in sample_space.items():
        if f_name=='' or f_name==e_k:
            for se_k, se_v in e_v.items():
                if e_name!='' and se_k == e_name:
                    outcomes += se_v
                elif e_name=='':
                    outcomes += se_v
    return outcomes
def p(sample_space, f_name):
    # P(F): share of all outcomes that fall in group f_name
    return get_outcomes(sample_space, f_name) / get_outcomes(sample_space, '', '')
def p_inters(sample_space, f_name, e_name):
    # P(F and E)
    return get_outcomes(sample_space, f_name, e_name) / get_outcomes(sample_space, '', '')
def p_conditional(sample_space, f_name, e_name):
    # P(E | F) = P(F and E) / P(F)
    return p_inters(sample_space, f_name, e_name) / p(sample_space, f_name)
def bayes(sample_space, f, given_e):
    # P(F | E) = P(F) * P(E | F) / sum over all groups k of P(k) * P(E | k)
    total = 0
    for e_k, e_v in sample_space.items():
        total += p(sample_space, e_k) * p_conditional(sample_space, e_k, given_e)
    return p(sample_space, f) * p_conditional(sample_space, f, given_e) / total
sample_space = {'UK':{'Boy':10, 'Girl':20},
                'FR':{'Boy':10, 'Girl':10},
                'CA':{'Boy':10, 'Girl':30}}
print('Probability of being from FR:', p(sample_space, 'FR'))
print('Probability to be French Boy:', p_inters(sample_space, 'FR', 'Boy'))
print('Probability of being a Boy given a person is from FR:', p_conditional(sample_space, 'FR', 'Boy'))
print('Probability to be from France given person is Boy:', bayes(sample_space, 'FR', 'Boy'))
sample_space = {'Grow' :{'Up':160, 'Down':40},
                'Slows':{'Up':30, 'Down':70}}
print('Probability economy is growing when stock is Up:', bayes(sample_space, 'Grow', 'Up'))
Pr(A | B): Conditional probability of A : i.e. probability of A, given that all we know is B
Pr(A) : Prior probability of A
Pr is the probability, Pr(A|B) is the conditional probability.
Check Wikipedia for details.
The pipe (|) means "given".
The probability of A given B is equal to the probability of B given A x Pr(A)/Pr(B)
Based on your question, I strongly advise you to read an undergraduate book on probability theory first. Without this you will not make proper progress with your task on the Naive Bayes Classifier.
I would recommend this book, or take a look at MIT OpenCourseWare.
The pipe is used to represent conditional probability.
Pr(A | B) = Probability of A given B
Let's say you are not feeling well and you surf the web for the symptoms. And the internet tells you that if you have these symptoms then you have XYZ disease.
In this case:
Pr(A | B) is what you are trying to find out, which is:
The probability of you having XYZ GIVEN THAT you have certain symptoms.
Pr(A) is the probability of having the disease XYZ
Pr(B) is the probability of having those symptoms
Pr(B | A) is what you find out from the internet, which is:
The probability of having the symptoms GIVEN THAT you have the disease.
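To see how the four quantities fit together, here is a small sketch with made-up numbers. The three probabilities below are purely hypothetical and chosen only so the arithmetic is easy to follow; the function name bayes is also just an illustrative choice.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Pr(A | B) = Pr(B | A) x Pr(A) / Pr(B)"""
    return p_b_given_a * p_a / p_b


# Hypothetical values, not real medical data:
p_symptoms_given_disease = 0.9   # Pr(B | A): what the internet tells you
p_disease = 0.01                 # Pr(A): how common disease XYZ is
p_symptoms = 0.1                 # Pr(B): how common the symptoms are overall

p_disease_given_symptoms = bayes(p_symptoms_given_disease, p_disease, p_symptoms)
print(p_disease_given_symptoms)  # about 0.09 with these inputs
```

Even though the symptoms are very likely when you have the disease, the chance of having the disease given the symptoms stays small under these assumed numbers, because the disease itself is rare and the symptoms are comparatively common.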
