def exp3(a,b):
    if b == 1:
        return a
    if (b%2)*2 == b:
        return exp3(a*a, b/2)
    else: return a*exp3(a,b-1)
This is a recursive exponentiator program.
Question 1:
If b is even, it should execute the branch guarded by (b%2)*2 == b. If b is odd, it will execute a*exp3(a,b-1). The program runs without problems. But if b is 4, (4%2)*2 = 0, and 0 is not equal to b. So I can't understand how b is handled when it's even.
Question 2:
I want to calculate the number of steps in the program. According to my textbook, I get the following formulas.
b even t(b) = 6 + t(b/2)
b odd t(b) = 6 + t(b-1)
Why is the first number 6? How can I get the number 3 in the beginning?
Your (b%2)*2 == b test is never true. I think you want b % 2 == 0 to test if b is even. The code still gets the right answer because the other recursive case (intended only for odd b values) works for even ones too (it's just less efficient).
As for your other question, I have no idea where the 6 is coming from either. It depends a lot on what you're counting as a "step". Usually it's most useful to discuss performance in terms of "Big-O" values rather than specific numbers.
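To make the fix concrete, here is a minimal sketch of the function with the even test written as b % 2 == 0. It also returns a step count, where a "step" is assumed to be one call to exp3; that definition is my own choice for illustration and will not reproduce the textbook's constant 6. Integer division (b // 2) is used so the exponent stays an integer.

```python
def exp3(a, b):
    """Return (a**b, steps) for an integer b >= 1.

    'steps' counts calls to exp3, which is only one possible
    definition of a step (an assumption for this sketch).
    """
    if b == 1:
        return a, 1
    if b % 2 == 0:
        # Even exponent: square the base and halve the exponent.
        result, steps = exp3(a * a, b // 2)
        return result, steps + 1
    # Odd exponent: peel off one factor of a.
    result, steps = exp3(a, b - 1)
    return a * result, steps + 1


print(exp3(2, 10))  # (1024, 5)
```

With the even branch actually taken, the number of calls grows roughly with log2(b) instead of with b.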
You are working at the cash counter at a fun-fair, and you have different types of coins available to you in infinite quantities. The value of each coin is already given. Can you determine the number of ways of making change for a particular number of units using the given types of coins?
counter = 0
def helper(n,c):
    global counter
    if n == 0:
        counter += 1
        return
    if len(c) == 0:
        return
    else:
        if n >= c[0]:
            helper(n - c[0], c)
        helper(n,c[1:])
def getWays(n, c):
    helper(n,c)
    print(counter)
    return counter
#the helper function takes n and c
#where
#n is amount whose change is to be made
#c is a list of available coins
Let n be the amount of currency units to return as change. You wish to find N(n), the number of possible ways to return change.
One easy solution would be to first choose the "first" coin you give (let's say it has value c), then notice that N(n) is the sum of all the values N(n-c) for every possible c. Since this appears to be a recursive problem, we need some base cases. Typically, we'll have N(1) = 1 (one coin of value one).
Let's do an example: 3 can be returned as "1 plus 1 plus 1" or as "2 plus 1" (assuming coins of value one and two exist). Therefore, N(3)=2.
However, if we apply the previous algorithm, it will compute N(3) to be 3.
+------------+-------------+------------+
| First coin | Second coin | Third coin |
+------------+-------------+------------+
| 2 | 1 | |
+------------+-------------+------------+
| | 2 | |
| 1 +-------------+------------+
| | 1 | 1 |
+------------+-------------+------------+
Indeed, notice that returning 3 units as "2 plus 1" or as "1 plus 2" is counted as two different solutions by our algorithm, whereas they are the same.
We therefore need to apply an additional restriction to avoid such duplicates. One possible solution is to order the coins (for example by decreasing value). We then impose the following restriction: if at a given step we returned a coin of value c0, then at the next step, we may only return coins of value c0 or less.
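This leads to the following induction relation (writing c0 for the value of the coin returned in the last step): N(n) is the sum of all the values of N(n-c) for all possible values of c less than or equal to c0. Note that the count therefore depends on both n and c0, and that returning 0 units counts as exactly one way.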
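The restricted recurrence can be sketched in Python as follows. The function name count_ways and the max_coin parameter are my own names for illustration, and the sketch assumes the coin values are distinct positive integers.

```python
def count_ways(n, coins, max_coin=None):
    """Count the ways to return n units when every coin handed out
    must be no larger than the previous one (max_coin)."""
    if n == 0:
        return 1  # exactly one way to return nothing
    if max_coin is None:
        # First step: any coin is allowed.
        max_coin = max(coins, default=0)
    return sum(
        count_ways(n - c, coins, c)
        for c in coins
        if c <= max_coin and c <= n
    )


print(count_ways(3, [1, 2]))  # 2
```

Because each step may only reuse the previous coin or a smaller one, "2 plus 1" is counted but "1 plus 2" is not, which removes the duplicate from the table above. For larger amounts the same subproblems recur, so memoizing on (n, max_coin) would be a natural next step.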
Happy coding :)
I have the following df,
A id
[ObjectId('5abb6fab81c0')] 0
[ObjectId('5abb6fab81c3'),ObjectId('5abb6fab81c4')] 1
[ObjectId('5abb6fab81c2'),ObjectId('5abb6fab81c1')] 2
I would like to flatten each list in A and assign its corresponding id to each element in the list, like this:
A id
ObjectId('5abb6fab81c0') 0
ObjectId('5abb6fab81c3') 1
ObjectId('5abb6fab81c4') 1
ObjectId('5abb6fab81c2') 2
ObjectId('5abb6fab81c1') 2
I think the comment comes from this question? You can use my original post or this one:
df.set_index('id').A.apply(pd.Series).stack().reset_index().drop(columns='level_1')
Out[497]:
id 0
0 0 1.0
1 1 2.0
2 1 3.0
3 1 4.0
4 2 5.0
5 2 6.0
Or
pd.DataFrame({'id':df.id.repeat(df.A.str.len()),'A':df.A.sum()})
Out[498]:
A id
0 1 0
1 2 1
1 3 1
1 4 1
2 5 2
2 6 2
This probably isn't the most elegant solution, but it works. The idea here is to loop through df (which is why this is likely an inefficient solution), and then loop through each list in column A, appending each item and the id to new lists. Those two new lists are then turned into a new DataFrame.
a_list = []
id_list = []
# Each row unpacks as (index, A, id) for this two-column df
for index, a, i in df.itertuples():
    for item in a:
        a_list.append(item)
        id_list.append(i)
df1 = pd.DataFrame(list(zip(a_list, id_list)), columns=['A', 'id'])
As I said, inelegant, but it gets the job done. There's probably at least one better way to optimize this, but hopefully it gets you moving forward.
EDIT (April 2, 2018)
I had the thought to run a timing comparison between mine and Wen's code, simply out of curiosity. The two variables are the length of column A, and the length of the list entries in column A. I ran a bunch of test cases, iterating by orders of magnitude each time. For example, I started with A length = 10 and ran through to 1,000,000, at each step iterating through randomized A entry list lengths of 1-10, 1-100 ... 1-1,000,000. I found the following:
Overall, my code is noticeably faster (especially at increasing A lengths) as long as the list lengths are less than ~1,000. As soon as the randomized list length hits the ~1,000 barrier, Wen's code takes over in speed. This was a huge surprise to me! I fully expected my code to lose every time.
Length of column A generally doesn't matter - it simply increases the overall execution time linearly. The only case in which it changed the results was for A length = 10. In that case, no matter the list length, my code ran faster (also strange to me).
Conclusion: If the list entries in A are on the order of a few hundred elements (or less) long, my code is the way to go. But if you're working with huge data sets, use Wen's! Also worth noting that as you hit the 1,000,000 barrier, both methods slow down drastically. I'm using a fairly powerful computer, and each were taking minutes by the end (it actually crashed on the A length = 1,000,000 and list length = 1,000,000 case).
Flattening and unflattening can be done using this function
def flatten(df, col):
    # One row per list element, keyed by the original index
    col_flat = pd.DataFrame([[i, x] for i, y in df[col].apply(list).items() for x in y], columns=['I', col])
    col_flat = col_flat.set_index('I')
    df = df.drop(columns=col)
    df = df.merge(col_flat, left_index=True, right_index=True)
    return df
Unflattening:
def unflatten(flat_df, col):
    # Collect col back into lists; keep the first value of every other column
    return flat_df.groupby(level=0).agg({**{c:'first' for c in flat_df.columns}, col: list})
After unflattening we get the same dataframe except column order:
(df.sort_index(axis=1) == unflatten(flatten(df, col), col).sort_index(axis=1)).all().all()
>> True
To create unique index you can call reset_index after flattening
x-.y includes all items of x except for those that are cells of y
But what if I want to get all items that are cells of x and of y?
I can achieve this by
x -.^:2 y
But this requires running an expensive operation twice.
Is there a better solution?
e. is often useful when working with sets.
x e. y
gives a list of matches:
for each item of x return 1 if it exists in the "set" y, 0 otherwise.
1 2 3 4 e. 5 9 2
0 1 0 0
Then,
x (e. # [) y
selects those elements that do exist in both lists.
1 2 3 4 (e. # [) 5 9 2
2
5 8 (e. # [) i.12
5 8
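For readers who do not know J, the same idea can be sketched in Python: test each item of x for membership in y, then keep the items that passed. The name intersect is hypothetical, and the sketch assumes the items are hashable.

```python
def intersect(x, y):
    """Keep the items of x that also occur in y, in x's order.

    Rough analogue of the J phrase  x (e. # [) y .
    """
    members = set(y)  # one pass over y, then constant-time lookups
    return [item for item in x if item in members]


print(intersect([1, 2, 3, 4], [5, 9, 2]))  # [2]
print(intersect([5, 8], range(12)))        # [5, 8]
```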
Doing -. twice is the classic way of implementing intersection in J.
The inefficiency is minor (a constant factor - and, in general, you should not concern yourself with efficiency issues in J unless they exceed a factor of 2 - when you have resource problems you're generally going to want to focus on the factor of 1000 or greater issues).
Put differently, if ([-.-.) or -.^:2 is too slow for you then -. would also be too slow for you. (This can happen on extremely large data sets where the underlying implementation has been inefficient. Recent versions of J have had some work done, to correct this issue.)
Disappointing, perhaps, but practical.
I am confused on how this is computed.
Input: groupBy (\x y -> (x*y `mod` 3) == 0) [1,2,3,4,5,6,7,8,9]
Output: [[1],[2,3],[4],[5,6],[7],[8,9]]
First, does x and y refer to the current and the next element?
Second, is this saying that it will group the elements that equal 0 when it is modded by 3? If so, how come there are elements that are not equal to 0 when modded by 3 in the output?
Found here: http://zvon.org/other/haskell/Outputlist/groupBy_f.html
To answer your second question: We compare two elements by multiplying them and seeing if the result is divisible by 3. "So why are there elements in the output not divisible by 3?" If they aren't divisible, that doesn't filter them out (that's what filter does); rather, when the predicate fails, the element goes into a separate group. When it succeeds, the element goes into the current group.
As to your first question, this took me a little while to figure out... x and y aren't two consecutive elements; rather, y is the current element and x is the first element in the current group. (!)
1 * 2 = 2; 2 `mod` 3 = 2; 1 and 2 go in separate groups.
2 * 3 = 6; 6 `mod` 3 = 0; 2 and 3 go in the same group.
2 * 4 = 8; 8 `mod` 3 = 2; 4 gets put in a different group.
...
Notice, on that last line, we're looking at 2 and 4 — not 3 and 4, as you might reasonably expect.
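The behaviour described above can be imitated with a short Python sketch. This is not the library's source, only a model under the assumption that each new element is compared against the first element of the current group; the name group_by is my own.

```python
def group_by(eq, items):
    """Group consecutive items, comparing each new item with the
    FIRST element of the current group (not with its neighbour)."""
    groups = []
    for y in items:
        if groups and eq(groups[-1][0], y):
            groups[-1].append(y)   # predicate holds: extend current group
        else:
            groups.append([y])     # predicate fails: start a new group
    return groups


print(group_by(lambda x, y: (x * y) % 3 == 0, range(1, 10)))
# [[1], [2, 3], [4], [5, 6], [7], [8, 9]]
```

Under this model 4 is compared with 2 (the head of [2,3]), which is why it starts a new group even though 3 * 4 is divisible by 3.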
First, does x and y refer to the current and the next element?
Roughly, yes.
Second, is this saying that it will group the elements that equal 0 when it is modded by 3? If so, how come there are elements that are not equal to 0 when modded by 3 in the output?
The lambda defines a relation between two integers x and y, which holds whenever the product x*y is a multiple of 3. Since 3 is prime, x must be a multiple of 3 or y must be such.
For the input [1,2,3,4,5,6,7,8,9], it is first checked whether 1 is in relation with 2. This is false, so 1 gets its own singleton group [1]. Then, we proceed with 2 and 3: now the relation holds, so 2,3 will share their group. Next, we check whether 2 and 4 are in relation: this is false. So, the group is [2,3] and not any larger. Then we proceed with 4 and 5 ...
I must confess that I do not like this example very much, since the relation is not an equivalence relation (because it is not transitive). Because of this, the exact result of groupBy is not guaranteed: the implementation might test the relation on 3,4 (true) instead of 2,4 (false), and build a group [2,3,4] instead.
Quoting from the docs:
The predicate is assumed to define an equivalence.
So, once this contract is violated, there are no guarantees on what the output of groupBy might be.
The groupBy function takes a list and returns a list of lists such that each sublist in the result contains only equal elements, based on the equality function you provide.
In this case, you are trying to find all sublists where for all sublist elements x and y, mod (x*y) 3 == 0 (and the ones where it doesn't == 0). Slightly weird, but there you go. groupBy only looks at adjacent elements. Sort the list to reduce the number of duplicate sets.
What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?
I could easily brute force the solution in an imperative programming language with loops. But I want to do this in Haskell and not having loops makes it much harder. I was thinking of doing something like this:
[n | n <- [1..], d <- [1..20], n `mod` d == 0] !! 0
But I know that won't work because "d" will make the condition equal True at d = 1. I need a hint on how to make it so that n mod d is calculated for [1..20] and can be verified for all 20 numbers.
Again, please don't give me a solution. Thanks.
As with many of the Project Euler problems, this is at least as much about math as it is about programming.
What you're looking for is the least common multiple of a set of numbers, which happen to be in a sequence starting at 1.
A likely tactic in a functional language is trying to make it recursive based on figuring out the relation between the smallest number divisible by all of [1..n] and the smallest number divisible by all of [1..n+1]. Play with this with some smaller numbers than 20 and try to understand the mathematical relation or perhaps discern a pattern.
Instead of a search until you find such a number, consider instead a constructive algorithm, where, given a set of numbers, you construct the smallest (or least) positive number that is evenly divisible by (aka "is a common multiple of") all those numbers. Look at the algorithms there, and consider how Euclid's algorithm (which they mention) might apply.
Can you think of any relationship between two numbers in terms of their greatest common divisor and their least common multiple? How about among a set of numbers?
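As a hedged illustration of that relationship, here is a Python sketch (not Haskell, so the exercise itself is left open) using the identity lcm(a, b) * gcd(a, b) = a * b for positive integers. The helper names are my own, and the demonstration uses 1 to 10, for which the expected answer 2520 is well known.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple via lcm(a, b) * gcd(a, b) == a * b."""
    return a * b // gcd(a, b)


def lcm_of_range(n):
    """Smallest positive number divisible by every integer in 1..n,
    built up one number at a time by folding lcm over the range."""
    return reduce(lcm, range(1, n + 1), 1)


print(lcm_of_range(10))  # 2520
```

The fold mirrors the recursive idea above: the answer for [1..n+1] is the lcm of the answer for [1..n] and n+1.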
If you look at it, this is a list filtering operation: an infinite list of numbers is filtered on whether each number is divisible by all numbers from 1 to 20.
So what we need is a function that takes a list of integers and an integer and tells whether the integer is divisible by all the numbers in the list:
isDivisible :: [Int] -> Int -> Bool
and then use this in List filter as
filter (isDivisible [1..20]) [1..]
Now as Haskell is a lazy language, you just need to take the required number of items (in your case you need just one hence List.head method sounds good) from the above filter result.
I hope this helps you. This is a simple solution and there will be many other single line solutions for this too :)
Alternative answer: You can just take advantage of the lcm function provided in the Prelude.
For efficiently solving this, go with Don Roby's answer. If you just want a little hint on the brute force approach, translate what you wrote back into english and see how it differs from the problem description.
You wrote something like "filter the product of the positive naturals and the positive naturals from 1 to 20"
what you want is more like "filter the positive naturals by some function of the positive naturals from 1 to 20"
You have to get mathy in this case. Do a foldl through [1..20], starting with an accumulator n = 1. For each number p of that list, you only proceed if p is prime. For that prime p, find the largest integer q such that p^q <= 20, then multiply n by p^q. Once the foldl finishes, n is the number you want.
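The fold described above might look like this in Python, written as a plain loop. The names is_prime and smallest_multiple are hypothetical, and the trial-division primality test is only meant for small limits such as 20.

```python
def is_prime(p):
    """Trial division; adequate for the small limits used here."""
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))


def smallest_multiple(limit):
    """Multiply together, for each prime p <= limit, the largest
    power of p that does not exceed limit."""
    n = 1
    for p in range(2, limit + 1):
        if is_prime(p):
            power = p
            while power * p <= limit:
                power *= p   # raise p to the highest power <= limit
            n *= power
    return n


print(smallest_multiple(10))  # 2520
```

For a limit of 10 the primes contribute 2^3, 3^2, 5 and 7, whose product is 2520.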
A possible brute force implementation would be
head [n|n <- [1..], all ((==0).(n `mod`)) [1..20]]
but in this case it would take way too long. The all function tests whether a predicate holds for all elements of a list. The point-free section ((==0).(n `mod`)) is short for the lambda (\d -> mod n d == 0).
So how could you speed up the calculation? Let's factorize our divisors in prime factors, and search for the highest power of every prime factor:
2 = 2
3 = 3
4 = 2^2
5 = 5
6 = 2 * 3
7 = 7
8 = 2^3
9 = 3^2
10 = 2 * 5
11 = 11
12 = 2^2*3
13 = 13
14 = 2 *7
15 = 3 * 5
16 = 2^4
17 = 17
18 = 2 * 3^2
19 = 19
20 = 2^2 * 5
--------------------------------
max= 2^4*3^2*5*7*11*13*17*19
Using this number we have:
all ((==0).(2^4*3^2*5*7*11*13*17*19 `mod`)) [1..20]
--True
Hey, it is divisible by all numbers from 1 to 20. Not very surprising. E.g. it is divisible by 15 because it "contains" the factors 3 and 5, and it is divisible by 16, because it "contains" the factor 2^4. But is it the smallest possible number? Think about it...