Google Kickstart Round D : Record Breaker Problem. Please help me to debug my program - python-3.x

Problem
Isyana is given the number of visitors at her local theme park on N consecutive days. The number of visitors on the i-th day is Vi. A day is record breaking if it satisfies both of the following conditions:
The number of visitors on the day is strictly larger than the number of visitors on each of the previous days.
Either it is the last day, or the number of visitors on the day is strictly larger than the number of visitors on the following day.
Note that the very first day could be a record breaking day!
Please help Isyana find out the number of record breaking days.
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each test case begins with a line containing the integer N. The second line contains N integers. The i-th integer is Vi.
Output
For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is the number of record breaking days.
Limits
Time limit: 20 seconds per test set.
Memory limit: 1GB.
1 ≤ T ≤ 100.
0 ≤ Vi ≤ 2 × 10^5.
Test set 1
1 ≤ N ≤ 1000.
Test set 2
1 ≤ N ≤ 2 × 10^5 for at most 10 test cases.
For the remaining cases, 1 ≤ N ≤ 1000.
Sample
Input
4
8
1 2 0 7 2 0 2 0
6
4 8 15 16 23 42
9
3 1 4 1 5 9 2 6 5
6
9 9 9 9 9 9
Output
Case #1: 2
Case #2: 1
Case #3: 3
Case #4: 0
In Sample Case #1, the record breaking days in 1 2 0 7 2 0 2 0 are the second day (2 visitors) and the fourth day (7 visitors).
In Sample Case #2, only the last day is a record breaking day.
In Sample Case #3, the first, the third, and the sixth days are record breaking days.
In Sample Case #4, there is no record breaking day.
-------------------------------------------------------------------------
This is the Python code I submitted for Kickstart Round D, problem 1: Record Breaker. It runs on my local machine without any runtime error, and I could not find a test case by brute force that breaks it or gives a wrong answer. When I submit it on Kickstart, however, it fails with a runtime error. What could be causing the runtime error on Kickstart? Please help!
cases = int(input().strip())
for q in range(1, cases + 1):
    l = int(input().strip())
    ls = list(map(int, input().split(" ")))
    l = len(ls)
    local_max = 0
    count = 0
    for i in range(l):
        if i == 0:
            if ls[1] < ls[0]:
                local_max = ls[0]
                count += 1
            continue
        if i == l - 1:
            if ls[i] > ls[i - 1] and ls[i] > local_max:
                count += 1
            break
        if ls[i] > ls[i - 1] and ls[i] > ls[i + 1] and ls[i] > local_max:
            count += 1
            local_max = ls[i]
        continue
    print("Case #{}: {}".format(q, count))

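Look at this line:
    if ls[1] < ls[0]:
What if ls has only 1 element? Then ls[1] does not exist and the lookup raises an IndexError, which is a runtime error.
After fixing that, your code still returns the wrong answer if there's only 1 element.
Also, consider this test case:
"65 87 87 34 64 59 93 20 95 85 24 99 62 100 60 19 100"
64 is not valid, but you count it: local_max is only updated on days you count, so it is still 0 when you reach 64.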
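One way to avoid both problems is to track the running maximum over every previous day, counted or not, and to skip the look-ahead on the last index. The following is a minimal sketch rather than a fix of the posted code; the function name count_record_breaking_days is an illustrative choice, and starting the maximum at -1 relies on the stated limit 0 ≤ Vi.

```python
def count_record_breaking_days(visitors):
    """Count days that beat every earlier day and the following day (if any)."""
    count = 0
    best_so_far = -1  # assumption: visitor counts are never negative (0 <= Vi)
    last = len(visitors) - 1
    for i, today in enumerate(visitors):
        beats_previous = today > best_so_far
        # The last day has no following day, so only the first condition applies.
        beats_next = i == last or today > visitors[i + 1]
        if beats_previous and beats_next:
            count += 1
        # Update the maximum on every day, not only on counted days.
        best_so_far = max(best_so_far, today)
    return count


print(count_record_breaking_days([1, 2, 0, 7, 2, 0, 2, 0]))  # 2
print(count_record_breaking_days([9]))  # 1 (single day is also the last day)
```

Because the maximum is updated unconditionally, a value such as 64 in the test case above is compared against 87 and is not counted.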

Related

Is there a way to sort a list so that rows with the same value in one column are evenly distributed?

Hoping to sort (below left) by sector but distribute evenly (below right):
Name  Sector.    Name.  Sector
A     1          A      1
B     1          E      2
C     1          H      3
D     4          D      4
E     2          B      1
F     2          F      2
G     2          J      3
H     3          I      4
I     4          C      1
J     3          G      2
Real data is 70+ rows with 4 sectors.
I've worked around it manually but would love to figure out how to do it with a formula in Excel.
Here's a more complete (and hopefully more accurate) idea - the carouselOrder is the column I'd like to generate via a formula.
guestID  guestSector  carouselOrder
1        1            1
2        1            5
3        1            9
4        1            13
5        2            2
6        2            6
7        2            10
8        2            14
9        3            3
10       3            7
11       3            11
12       2            18
13       1            17
14       1            20
15       1            23
16       2            21
17       2            24
18       2            27
19       1            26
20       1            29
21       1            30
22       1            31
23       3            15
24       3            19
25       3            22
26       3            25
27       3            28
28       1            32
29       4            4
30       4            8
31       4            12
32       4            16
When using Office 365 you can use the following in D2: =MOD(SEQUENCE(COUNTA(A2:A11),,0),4)+1
This creates a repeating counter of the sectors 1 to 4, with one value for each row of your data.
In C2 use the following:
=BYROW(D2#,LAMBDA(x,
INDEX(
FILTER($A$2:$A$11,$B$2:$B$11=x),
SUM(--(D$2:x=x)))))
This filters the Names whose sector equals the counter value in that row, then uses INDEX to return the n-th match, where n is the number of times that sector has appeared in the counter (D2#) up to the current row.
Let's try an approach that doesn't require a helper column. I will first explain the logic used to build the recurrence, then the Excel formula that implements it.
If we sort the input data Name and Sector. by Sector. in ascending order, the new positions of the Name values (letters) can be calculated as follows (Table 1):
Name  Sector.Sorted  Position
A     1              1+4*0=1
B     1              1+4*1=5
C     1              1+4*2=9
E     2              2+4*0=2
F     2              2+4*1=6
G     2              2+4*2=10
H     3              3+4*0=3
J     3              3+4*1=7
D     4              4+4*0=4
I     4              4+4*1=8
The new positions of Name (letters) follow this pattern (Formula 1):
position = Sector.Sorted + groupSize * factor
where groupSize is 4 in our case and factor counts how many times the same Sector.Sorted value has already appeared, starting from 0. Think of Sector.Sorted as groups, where each run of repeated values is one group: 1, 2, 3 and 4.
If we can build the Position values, we can sort Name by the new positions with the SORTBY(array, by_array1) function. Check the SORTBY documentation for more information on how this function works.
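To check Formula 1 outside Excel, the same position rule can be sketched in Python. This is only an illustration of the pattern, not a translation of the Excel formula; the function name distribute_evenly and the (name, sector) tuple layout are assumptions made for the example.

```python
from collections import defaultdict


def distribute_evenly(rows, group_size):
    """Order names by position = sector + group_size * factor (Formula 1)."""
    seen = defaultdict(int)  # how many rows of each sector were already placed
    keyed = []
    # sorted() is stable, so rows keep their original order within a sector.
    for name, sector in sorted(rows, key=lambda row: row[1]):
        factor = seen[sector]
        seen[sector] += 1
        keyed.append((sector + group_size * factor, name))
    return [name for _, name in sorted(keyed)]


rows = [("A", 1), ("B", 1), ("C", 1), ("D", 4), ("E", 2),
        ("F", 2), ("G", 2), ("H", 3), ("I", 4), ("J", 3)]
print(distribute_evenly(rows, 4))
# ['A', 'E', 'H', 'D', 'B', 'F', 'J', 'I', 'C', 'G']
```

The result matches the evenly distributed order requested in the question.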
Here is the formula to get the Name sorted in cell E2:
=LET(groupSize, 4, sorted, SORT(A2:B11,2), sName,
INDEX(sorted,,1),sSector, INDEX(sorted,,2),
seq0, SEQUENCE(ROWS(sSector),,0), mapResult,
MAP(sSector, seq0, LAMBDA(a,b, IF(b=0, "SAME",
IF(a=INDEX(sSector,b), "SAME", "NEW")))), factor,
SCAN(-1,mapResult, LAMBDA(aa,c,IF(c="SAME", aa+1,0))),
pos,MAP(sSector, factor, LAMBDA(m,n, m + groupSize*n)),
SORTBY(sName,pos)
)
Explanation
The name sorted holds the input data sorted by Sector. in ascending order, i.e. SORT(A2:B11,2). The names sName and sSector hold the first and second column of sorted.
To identify each group we need the following sequence (seq0) starting from 0, i.e. SEQUENCE(ROWS(sSector),,0).
Now we need to identify when a new group starts. We use the MAP function for that, and the result is stored under the name mapResult:
MAP(sSector, seq0, LAMBDA(a,b, IF(b=0, "SAME",
IF(a=INDEX(sSector,b), "SAME", "NEW"))))
The logic is the following: if we are at the beginning of the sequence (first value of seq0), it returns SAME. Otherwise we compare the current value of sSector (a) with the previous one, INDEX(sSector,b); because b is zero-based, as a one-based index it points at the previous row. If the two values are equal we are still in the same group; otherwise a new group has started.
The intermediate result of mapResult is:
Name  Sector Sorted  mapResult
A     1              SAME
B     1              SAME
C     1              SAME
E     2              NEW
F     2              SAME
G     2              SAME
H     3              NEW
J     3              SAME
D     4              NEW
I     4              SAME
The first two columns are shown just for illustrative purpose, but mapResult only returns the last column.
Now we just need a counter that restarts every time we find NEW. To do that we use the SCAN function and store the result under the name factor. This value is the factor we multiply by 4 within each group (see Table 1):
SCAN(-1,mapResult, LAMBDA(aa,c,IF(c="SAME", aa+1,0)))
The accumulator starts at -1 because the counter starts at 0. Every time SCAN finds SAME, it increments the previous value by 1. When it finds NEW (anything other than SAME), the accumulator is reset to 0.
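The running counter described above behaves like a fold that keeps its intermediate values. A rough Python analogue, assuming Python 3.8+ for the initial argument of itertools.accumulate, is sketched below; running_factor is a hypothetical helper name, and unlike SCAN, accumulate also emits the seed value, so it is dropped.

```python
from itertools import accumulate


def running_factor(map_result):
    """Count up on "SAME" and reset to 0 on anything else, seeded with -1."""
    steps = accumulate(
        map_result,
        lambda acc, flag: acc + 1 if flag == "SAME" else 0,
        initial=-1,
    )
    return list(steps)[1:]  # drop the -1 seed, which SCAN does not output


flags = ["SAME", "SAME", "SAME", "NEW", "SAME", "SAME", "NEW", "SAME", "NEW", "SAME"]
print(running_factor(flags))  # [0, 1, 2, 0, 1, 2, 0, 1, 0, 1]
```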
Here is the intermediate result of factor:
Name  Sector Sorted  mapResult  factor
A     1              SAME       0
B     1              SAME       1
C     1              SAME       2
E     2              NEW        0
F     2              SAME       1
G     2              SAME       2
H     3              NEW        0
J     3              SAME       1
D     4              NEW        0
I     4              SAME       1
The first three columns are shown for illustrative purpose.
Now we have all the elements to build our pattern for the new positions represented with the name pos:
MAP(sSector, factor, LAMBDA(m,n, m + groupSize*n))
where m is each element of sSector (Sector.Sorted) and n is the corresponding value of factor calculated above. As you can see, the Excel formula is a direct implementation of the generic formula (Formula 1 above). The intermediate result will be:
Name  Sector Sorted  mapResult  factor  pos
A     1              SAME       0       1
B     1              SAME       1       5
C     1              SAME       2       9
E     2              NEW        0       2
F     2              SAME       1       6
G     2              SAME       2       10
H     3              NEW        0       3
J     3              SAME       1       7
D     4              NEW        0       4
I     4              SAME       1       8
The previous columns are shown just for illustrative purpose. Now we have the new positions, so we are ready to sort based on the new positions for Name via:
SORTBY(sName,pos)
Update
The first MAP can be removed by building a single input array for SCAN that carries both the sSector value and the index position used to find the previous element. SCAN accepts only one array as input, so we combine both pieces of information into a new array. The following formula can be used instead:
=LET(groupSize, 4, sorted, SORT(A2:B11,2), sName,
INDEX(sorted,,1),sSector, INDEX(sorted,,2),
factor, SCAN(-1,sSector&"-"&SEQUENCE(ROWS(sSector),,0),
LAMBDA(aa,b, LET(s, TEXTSPLIT(b,"-"),item, INDEX(s,,1),
idx, INDEX(s,,2), IF(aa=-1, 0, IF(1*item=INDEX(sSector, idx), aa+1,0))))),
pos,MAP(sSector, factor, LAMBDA(m,n, m + groupSize*n)),
SORTBY(sName,pos)
)
Inside SCAN we use a LET function to calculate everything needed for the comparison within the LAMBDA. We extract item and the idx position used to find the previous element of sSector. Via:
1*item=INDEX(sSector, idx)
we can compare each element of sSector with the previous one, starting from the second element of sSector. We multiply item by 1 because TEXTSPLIT returns text; without the conversion to a number the comparison would fail.

Calculate production capacity per product/day up to goal

I have the following data.
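Available resources data per day (empty cells were lost in the flattened table, so the hours for each employee are listed in order and are not aligned to specific days):
    A  B          C    D   E   F   G   H   I   J   K   L   M   N
2      resources  day
3                 1    2   3   4   5   6   7   8   9   10  11  12
4      empl.1     8, 8, 4, 2, 2, 4, 4, 8, 8
5      empl.2     8, 4, 4, 8, 4, 8
6      empl.3     (no values)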
And different products with their production per hour (per employee) and the required quantity per part:
    P  Q        R                S
2      product  production/hour  required qty
3
4      prod.1   1                60
5      prod.2   1                6
6      prod.3   2                4
From this data I want to calculate the number of products that can be produced per day, based on the employees available that day and the production capacity for each product, up until the goal for that product is reached.
Edit: the calculation in the original post only calculated the hours spent per product per day, not the quantity of products produced; also, the MOD part gave wrong results if the daily produced quantity exceeded the goal.
I use the following formula to calculate the above (used in C11 and dragged to the right):
=LET(
prod,BYROW($B11:B13,LAMBDA(r,SUM(r))),
reached,--(prod<$S$4:$S$6),
dayprod,IFERROR(SUM(C4:C6)/SUM(reached*$R$4:$R$6),0)*reached*$R$4:$R$6,
IF(prod+dayprod>$S$4:$S$6,dayprod-((prod+dayprod)-$S$4:$S$6),dayprod))
This results in the following:
    A  B        C    D   E   F   G   H   I   J   K   L   M   N
9      product  day
10              1    2   3   4   5   6   7   8   9   10  11  12
11     prod.1   2    8   4   6   2   8   4   0   8   12  6   0
12     prod.2   2    4   0   0   0   0   0   0   0   0   0   0
13     prod.3   4    0   0   0   0   0   0   0   0   0   0   0
This formula sums the hours from the employees available that day and divides their hours over the products that did not reach the goal yet.
If the goal is reached the available hours are divided over the remaining products to produce.
Now the problem I'm having is the following:
If the goal for a product is reached partway through the day, the dayprod-((prod+dayprod)-$S$4:$S$6) part of the formula caps that product's output at the quantity still needed that day. The available employee hours, however, have already been divided over every product that still needed production. Take the following example:
prod.1, day 2: value 8
prod.2, day 2: value 4
The 8 for prod.1 is calculated on the basis that both prod.1 and prod.2 still need production and both take 1 hour per person to produce one unit.
With 16 hours available that day, that means a capacity of 8 for each.
But the challenge is that the goal for prod.2 is reached halfway through the day.
In fact, the first 4 hours are used by both employees to produce 4 of each product.
For the last 4 hours both employees can focus on prod.1, so prod.1 gains not 4 but 4 + 4 in that period, for a total of 12 produced, not the 8 currently calculated.
How can I get the formula to add the remaining time to the remaining products?
Original post, prior to edit, containing error (not calculating to number of products, but to number of hours spent per product per day only)
I use the following formula to calculate the above (used in C11 and dragged to the right):
=LET(
prod,BYROW($B11:B13,LAMBDA(r,SUM(r))),
reached,--(prod<$S$4:$S$6),
dayprod,IFERROR(SUM(C4:C6)/SUM(reached*$R$4:$R$6),0)*reached*$R$4:$R$6,
IF(prod+dayprod>$S$4:$S$6,dayprod-MOD(prod+dayprod,$S$4:$S$6),dayprod))
This results in the following:
    A  B        C    D   E   F   G   H   I   J   K   L   M   N
9      product  day
10              1    2   3   4   5   6   7   8   9   10  11  12
11     prod.1   2    8   4   6   2   8   4   0   8   12  6   0
12     prod.2   2    4   0   0   0   0   0   0   0   0   0   0
13     prod.3   4    0   0   0   0   0   0   0   0   0   0   0
This formula sums the hours from the employees available that day and divides their hours over the products that did not reach the goal yet.
If the goal is reached the available hours are divided over the remaining products to produce.
Now the problem I'm having is the following:
If the goal for a product is reached partway through the day, the MOD part of the formula calculates the remaining qty for that product for that day. The available employee hours, however, have already been divided over every product that still needed production. Take the following example:
prod.1, day 2: value 8
prod.2, day 2: value 4
The 8 for prod.1 is calculated based on both prod.1 & prod.2 in need for production still and both take 1 hour per person to produce one.
Having 16 hours available that day that means a capacity of 8 for each.
But the challenge lies in the goal being reached halfway the day.
In fact the first 4 hours are used by both employees to produce 4 of each product.
The last 4 hours both employees can focus on prod.1 resulting in a total of 12 being produced for prod.1, not 8.
I kind of broke my head on getting this far, but from here I could use some help.
How can I get the MOD part of the formula to add the remaining time to the remaining products?
I was able to find a solution to my problem.
I had to take the result of the formula already in place and check whether the running total up to the current day (including that day's production) exceeds the goal. If so, I take the difference between the day's calculated production and the production actually needed to reach the goal, converted to time. That time is then added to the remaining product(s) that still have not reached the goal, even after adding the day's production.
This results in the following formula in C11, dragged to the right:
=LET(
prod,BYROW($B11:B13,LAMBDA(r,SUM(r))),
prodhour,$R$4:$R$6,
goal,$S$4:$S$6,
reached,--(prod<goal),
dayprod,(IFERROR(SUM(C$4:C$6)/SUM(reached*prodhour),0)*reached*prodhour)*prodhour,
preres,IF(prod+dayprod>goal,dayprod-((prod+dayprod)-goal),dayprod),
timecorr,(dayprod*(dayprod<>preres)-preres*(dayprod<>preres))/prodhour,
reachedcorr,reached*(timecorr=0),
dayprodcorr,(IFERROR(SUM(timecorr)/SUM(reachedcorr*prodhour),0)*reachedcorr*prodhour)*prodhour,
IF(prod+dayprod>=goal,dayprod-((prod+dayprod)-goal),dayprod+dayprodcorr))
Here preres is the previous result (where I got stuck in the opening post).
The corr parts take care of the correction when the goal is reached for a product and there is still production time remaining.
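The redistribution idea can also be checked outside Excel. The sketch below is not a translation of the LET formula: it assumes that the day's hours are split equally between the products that still need units, and that time freed by a product reaching its goal is split again among the rest. The function name allocate_day and its parameters are hypothetical.

```python
def allocate_day(hours, rates, remaining):
    """Return units produced per product for one day.

    hours:     total employee hours available that day
    rates:     units produced per hour, per product
    remaining: units still needed to reach each product's goal
    """
    eps = 1e-9  # tolerance for floating-point leftovers
    produced = [0.0] * len(rates)
    need = [float(n) for n in remaining]
    while hours > eps:
        active = [i for i, n in enumerate(need) if n > eps]
        if not active:
            break  # every goal is reached; leftover hours stay unused
        share = hours / len(active)
        # Longest time slice all active products can absorb before one finishes.
        time_slice = min(share, min(need[i] / rates[i] for i in active))
        for i in active:
            made = time_slice * rates[i]
            produced[i] += made
            need[i] -= made
        hours -= time_slice * len(active)
    return produced


# Day 2 from the question: 16 hours, prod.1 still needs 58, prod.2 still needs 4.
print(allocate_day(16, [1, 1], [58, 4]))  # [12.0, 4.0]
```

Each pass either uses up the remaining hours or finishes at least one product, so the loop ends after at most one pass per product plus one.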

Selecting columns by their position while using a function in pandas

My dataframe looks something like this
import pandas as pd
frame = pd.DataFrame({'id':[1,2,3,4,5],
                      'week1_values':[0,0,13,39,64],
                      'week2_values':[32,35,25,78,200]})
I am trying to apply a function to calculate the Week over Week percentage difference between two columns('week1_values' and 'week2_values') whose names are being generated dynamically.
I want to create a function to calculate the percentage difference between weeks keeping in mind the zero values in the 'week1_values' column.
My function is something like this:
def WoW(df):
    if df.iloc[:,1] == 0:
        return (df.iloc[:,1] - df.iloc[:,2])
    else:
        return ((df.iloc[:,1] - df.iloc[:,2]) / df.iloc[:,1]) *100
frame['WoW%'] = frame.apply(WoW,axis=1)
When I try to do that, I end up with this error
IndexingError: ('Too many indexers', 'occurred at index 0')
How is one supposed to specify columns by their positions inside a function?
PS: Just want to clarify that since the column names are being generated dynamically, I am trying to select them by their position with the iloc function.
Because apply with axis=1 passes each row to the function as a Series, index it with a single position instead of a row/column pair:
def WoW(df):
    if df.iloc[1] == 0:
        return (df.iloc[1] - df.iloc[2])
    else:
        return ((df.iloc[1] - df.iloc[2]) / df.iloc[1]) *100
frame['WoW%'] = frame.apply(WoW,axis=1)
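To see why a single index is enough, it can help to inspect what the function actually receives. The small sketch below uses a shortened version of the question's frame (the three-row data is an assumption made for brevity) and only reports the type and dimensionality of each row.

```python
import pandas as pd

frame = pd.DataFrame({'id': [1, 2, 3],
                      'week1_values': [0, 13, 39],
                      'week2_values': [32, 25, 78]})

# With axis=1, each call receives one row as a one-dimensional Series.
row_kinds = frame.apply(lambda row: type(row).__name__, axis=1)
row_dims = frame.apply(lambda row: row.ndim, axis=1)

# A single positional index is therefore enough to reach a column's value.
week1 = frame.apply(lambda row: row.iloc[1], axis=1)

print(row_kinds.tolist())  # ['Series', 'Series', 'Series']
print(row_dims.tolist())   # [1, 1, 1]
print(week1.tolist())      # [0, 13, 39]
```

A one-dimensional object accepts only one positional indexer, which is why iloc[:,1] with two indexers is rejected inside the function.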
Vectorized alternative:
import numpy as np
s = frame.iloc[:,1] - frame.iloc[:,2]
frame['WoW%1'] = np.where(frame.iloc[:, 1] == 0, s, (s / frame.iloc[:,1]) *100)
print (frame)
   id  week1_values  week2_values        WoW%       WoW%1
0   1             0            32  -32.000000  -32.000000
1   2             0            35  -35.000000  -35.000000
2   3            13            25  -92.307692  -92.307692
3   4            39            78 -100.000000 -100.000000
4   5            64           200 -212.500000 -212.500000
You can use the pandas pct_change method to compute the percent change automatically.
s = (frame.iloc[:, 1:].pct_change(axis=1).iloc[:, -1]*100)
frame['WoW%'] = s.mask(np.isinf(s), frame.iloc[:, -1])
output:
   id  week1_values  week2_values        WoW%
0   1             0            32   32.000000
1   2             0            35   35.000000
2   3            13            25   92.307692
3   4            39            78  100.000000
4   5            64           200  212.500000
Note however that the way you currently do it in your custom function is biased. Changes from 0->20, or 10->12, or 100->120 would all produce 20 as output, which seems ambiguous.
suggested alternative
Use a classical percent increase, even if it leads to infinite values:
frame['WoW'] = frame.iloc[:, 1:].pct_change(axis=1).iloc[:, -1]*100
output:
   id  week1_values  week2_values         WoW
0   1             0            32         inf
1   2             0            35         inf
2   3            13            25   92.307692
3   4            39            78  100.000000
4   5            64           200  212.500000

Dropping all id rows if at least one cell meets a given criterion (e.g. has a missing value)

My dataset is in the following form:
clear
input id var
1 20
1 21
1 32
1 34
2 11
2 .
2 15
3 21
3 22
3 1
3 2
3 5
end
In my true dataset, observations are sorted by id and by year (not shown here).
What I need to do is to drop all the rows of a specific id if (at least) one of the following two conditions is met:
there is at least one missing value of var.
var decreases from one row to the next (for the same id)
So in my example what I would like to obtain is:
id var
1 20
1 21
1 32
1 34
Now, my unfortunate attempt has been to use row-wise operations together with by, in order to create a drop1 variable to be used later to subset the dataset.
Something along these lines (which is clearly wrong):
bysort id: gen drop1=1 if var[_n] < var[_n-1] | var[_n]==.
This doesn't work, and I am not even sure that I am considering the most "clean" and direct way to solve the task.
How would you proceed? Any help would be highly appreciated.
My interpretation is that you want to drop the complete group if either of the two conditions is met. I assume your dataset is sorted in some way, most likely on another variable. Otherwise, the structure is fragile.
The logic is simple. Check for decreasing values but leave out the first observation of each group, i.e., leave out _n == 1. For the first observation, var[_n-1] is missing, and Stata treats missing as larger than any number, so a non-missing first value would always compare as smaller. Then, check also for missings.
clear
set more off
input id var
1 20
1 21
1 32
1 34
2 11
2 .
2 15
3 21
3 22
3 1
3 2
3 5
end
// maintain original sequencing
gen orig = _n
order id orig
bysort id (orig) : gen todrop = sum((var < var[_n-1] & _n > 1) | missing(var))
list, sepby(id)
by id : drop if todrop[_N]
list, sepby(id)
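For readers who want to verify the expected result outside Stata, the same group rule can be sketched in plain Python. This is an illustration of the logic only, not of the Stata commands; the function name keep_clean_groups is hypothetical, and a missing value is represented by None as an assumption of the example.

```python
from itertools import groupby


def keep_clean_groups(rows):
    """Keep only ids with no missing value and no decrease between rows.

    rows: list of (id, var) tuples already sorted by id; var is None if missing.
    """
    kept = []
    for _, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        values = [value for _, value in group]
        has_missing = any(value is None for value in values)
        # Only compare neighbours when every value is present.
        decreases = not has_missing and any(
            later < earlier for earlier, later in zip(values, values[1:])
        )
        if not (has_missing or decreases):
            kept.extend(group)
    return kept


data = [(1, 20), (1, 21), (1, 32), (1, 34),
        (2, 11), (2, None), (2, 15),
        (3, 21), (3, 22), (3, 1), (3, 2), (3, 5)]
print(keep_clean_groups(data))
# [(1, 20), (1, 21), (1, 32), (1, 34)]
```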
One way to do this is to create some indicator variable as you had attempted. If you only want to drop where var decreases from one observation to the next, you could use:
clear
input id var
1 20
1 21
1 32
1 34
2 11
2 .
2 15
3 21
3 22
3 1
3 2
3 5
4 .
4 2
end
gen i = id if mi(var)
bysort id : egen k = mean(i)
drop if id == k
drop i k
drop if var[_n-1] > var[_n] & _n != 1
However, if you want to get the output you supplied in the post (drop all subsequent observations where var decreases from some max value), you could try the following in place of the last line above.
local N = _N
forvalues i = 1/`N' {
    drop if var[_n-1] > var[_n] & _n != 1
}
The loop just ensures that the drop if var... section of code is executed enough so that all observations where var < 34 are dropped.

spoj - CPCRC1C, sum of digits of numbers 1 to n, need clarification, not solution

Once, one boy's teacher asked him to calculate the sum of numbers 1 through n.
The boy quickly answered, and his teacher gave him another challenge: to calculate the sum of the digits of numbers 1 through n.
Input
Two space-separated integers 0 <= a <= b <= 10^9.
Output
The sum of the digits of numbers a through b.
Example
Input:
1 10
Output: 46
Can someone explain what is meant by the sum of the digits of numbers a to b?
From the example above, the sum of {1 2 3 4 5 6 7 8 9 10} is 55, by the well-known Gauss formula,
but the output is 46!
If I count from 2 to 9, excluding the border numbers 1 and 10, the answer is 44, still not 46.
So what is meant by sum of digits of numbers?
1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + (1 + 0) = 46
Don't treat the 10 as the number 10, but as the digits 1 and 0.
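The interpretation can be confirmed with a direct brute-force check. The sketch below only illustrates the meaning of the task for small ranges; it is far too slow for b up to 10^9 and is not meant as a solution, and the name digit_sum_range is an illustrative choice.

```python
def digit_sum_range(a, b):
    """Add up every decimal digit of every number from a to b inclusive."""
    return sum(int(digit) for n in range(a, b + 1) for digit in str(n))


print(digit_sum_range(1, 10))  # 46
```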
