Where is the mistake in my solution? The tests do not pass any further - python-3.x

Zenyk recorded the enemy's losses daily over n days, but suddenly noticed that he had made a mistake and lost the data on enemy losses for one of the days. However, Zenyk knows the total number of losses x, as well as the fact that every day the enemy lost a positive number of soldiers. This way he can reconstruct the number of enemy casualties for that day, provided that he has not made another mistake in his records. Help Zenyk recover the number of enemy casualties for the day for which there is no data. If Zenyk made an additional mistake in the notebook, output Another mistake!
Input data
The first line contains two integers n and x - the number of days over which Zenyk recorded statistics, and the total losses of the enemy.
The next line contains n - 1 numbers ai, separated by spaces - the daily enemy losses according to Zenyk's notebook.
Output data
Output one number - the number of enemy casualties on the day Zenyk forgot to record the data.
If Zenyk made another mistake in his notebook and the data in the notebook is contradictory, then output Another mistake!.
n, x = map(int, input().split())
for i in range(n-1):
    a = list(map(int, input().split()))
b = sum(a)
c = x - b
print(c)
if x < b:
    print('Another mistake!')
Input:
1 100000
Output:
100000
Input:
2 10
2
Output:
8
Input:
2 15
47
Output:
Another mistake!
This is my code, but the tests do not pass any further!
Can you suggest what is wrong and help me correct my code for this task?
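For comparison, here is a minimal sketch of logic that passes all three sample tests above (the function name `recover` and the assumption that all n - 1 values arrive on a single line are mine; note that when n = 1 there is no second input line to read):

```python
def recover(n, x, a):
    # a holds the n - 1 recorded daily losses; both the recorded losses
    # and the reconstructed one must be positive, otherwise the notebook
    # is contradictory
    missing = x - sum(a)
    if len(a) == n - 1 and all(v > 0 for v in a) and missing > 0:
        return missing
    return 'Another mistake!'

# Wiring it to stdin would look something like:
# n, x = map(int, input().split())
# a = list(map(int, input().split())) if n > 1 else []
# print(recover(n, x, a))
```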

Related

Confused by the behaviour of a 2-d array in Python

n, m, k = map(int, input().split())
students = [int(x) for x in input().split()]
classroom = []
count = 0
rows = [0] * k
for i in range(m):
    classroom.append(rows)
for i in students:
    for j in range(k):
        if classroom[i-1][j] == 1:
            continue
        else:
            classroom[i-1][j] = 1
            count += 1
            break
print(classroom)
I want to calculate the number of students who get seated in their preferred row (the seat should be vacant for the student). In my case 0 marks a vacancy, and there are n students with their preferred rows (an array of n with the ith element as the preferred row).
Now my input is 5 2 2
1 1 2 1 1
Here n=5, k=2 (row length), m=2 (number of rows), and array=[1,1,2,1,1] (students in the above code).
As per my code, classroom will be my 2d array of size 2x2.
Now, logically it should print the classroom [[1,1],[1,0]], but I'm unable to understand why it prints the classroom [[1,1],[1,1]].
I have also tested with input 5 2 2
1
Logically it should print classroom [[1,0],[0,0]], but it prints classroom [[1,0],[1,0]]. I have tested this on Python 3.
Please let me know what I did wrong, what concept I didn't understand, or what the logic behind this is.
This line
classroom.append(rows)
appends a reference to the same list object again and again. When one of the rows is modified, all of them appear to change, which is why the rows of your output are all the same.
Change this line to
classroom.append([0] * k)
This ensures that the rows are independent of each other.
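A tiny demonstration of the aliasing (2x2 sizes, purely illustrative):

```python
k = 2
rows = [0] * k
shared = [rows, rows]          # both entries reference the SAME list object
shared[0][0] = 1
print(shared)                  # [[1, 0], [1, 0]] - "both" rows changed

independent = [[0] * k for _ in range(2)]  # a fresh list per row
independent[0][0] = 1
print(independent)             # [[1, 0], [0, 0]]
```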

Calculate months of coverage based on multiple parameters in Pandas

I need to mark the months with 1s when the patient was covered by some product. One dose provides coverage for 1 month. I would also like to see the gaps in coverage.
Another detail is that quantity may affect the months of coverage too: let's say the quantity is 2, then the patient is covered for the next 2 months.
Right now I am using df.loc, which works for the first dose, but I can't wrap my mind around how to calculate those gaps in coverage.
df = pd.DataFrame({'patient': ['1', '2', '3', '4', '5', '6', '7'],
                   'dose1': ['A', 'B', 'B', 'A', 'C', 'C', 'C'],
                   'qty1': [1, 2, 1, 4, 1, 3, 4],
                   'days_since_last_dose1': [0, 0, 0, 0, 0, 0, 0],
                   'dose2': ['B', 'A', 'B', 'A', 'C', 'B', 'C'],
                   'qty2': [1, 2, 1, 4, 1, 3, 4],
                   'days_since_last_dose2': [23, 56, 120, 43, 30, 15, 60],
                   'dose3': ['B', 'B', 'B', 'A', 'A', 'C', 'B'],
                   'qty3': [3, 1, 1, 2, 1, 3, 4],
                   'days_since_last_dose3': [44, 22, 67, 150, 76, 32, 21],
                   **{f'M{m}': [0] * 7 for m in range(1, 13)}})
prod_1 = ['A']
prod_2 = ['B']
prod_3 = ['C']
df.loc[(df['dose1'].isin(prod_1)) & (df['qty1'] == 1), 'M1'] = 1
For example patient got Dose_1 (qty=1), which got him covered for 30 days, and comes back for Dose_2 (qty=2) 120 days later. Now it should be represented as:
M1 = 1, M2 = 0, M3 = 0, M4 = 0, M5 = 1 (patient came back 120 days after the first dose + double qty), M6 = 1, M7 = 0, M8 = 0 and so on.
Welcome to Stack Overflow!
data = []  # one dict per patient
for i in range(len(df['patient'])):  # separate each patient into its own dict
    newdict = {}
    for k, v in df.items():
        newdict[str(k)] = v[i]
    data.append(newdict)
for log in data:  # add up the total quantities and set that many months to 1
    for dose in range(log['qty1'] + log['qty2'] + log['qty3']):
        log[f'M{dose+1}'] = 1
df = pd.DataFrame.from_dict(data)
df.to_csv('doses.csv')
I enjoyed figuring this out.
Basically I separated each patient, added up the quantities, and put that total through a for loop which assigns 1 to the months within the range of the added-up quantities. I hope that's what you were aiming for.
Edit:
data = []
for i in range(len(df['patient'])):
    newdict = {}
    for k, v in df.items():
        newdict[str(k)] = v[i]
    data.append(newdict)
for log in data:
    for dose in range(1, log['qty1'] + 1):
        log[f'M{dose}'] = 1
    gap = 1 + round((log['days_since_last_dose2'] / 30) + 0.5)
    for dose in range(gap, gap + log['qty2']):
        log[f'M{dose}'] += 1
    gap1 = 1 + round((log['days_since_last_dose3'] / 30) + 0.5) + gap
    for dose in range(gap1, gap1 + log['qty3']):
        log[f'M{dose}'] += 1
OK, so I found the algorithm to calculate coverage; a value of 2 indicates an overlap in coverage.
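To make the gap arithmetic concrete, the expression in the edit converts the days since the previous dose into the 1-based month slot where the next dose's coverage starts (`month_slot` is my name for it; the ~30-day month is the same approximation used above):

```python
def month_slot(days_since_last):
    # ~30 days per month; the + 0.5 nudges partial months into the next slot
    return 1 + round(days_since_last / 30 + 0.5)

print(month_slot(120))  # 5 - matches the example: the second dose lands in M5
print(month_slot(23))   # 2 - under a month later, so coverage resumes in M2
```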

Print the first value of a dataframe based on condition, then iterate to the next sequence

I'm looking to perform data analysis on 100-years of climatological data for select U.S. locations (8 in particular), for each day spanning the 100-years. I have a pandas dataFrame set up with columns for Max temperature, Min temperature, Avg temperature, Snowfall, Precip Total, and then Day, Year, and Month values (then, I have an index also based on a date-time value). Right now, I want to set up a for loop to print the first Maximum temperature of 90 degrees F or greater from each year, but ONLY the first. Eventually, I want to narrow this down to each of my 8 locations, but first I just want to get the for loop to work.
Experimented with various iterations of a for loop.
for year in range(len(climate['Year'])):
    if (climate['Max'][year] >= 90).all():
        print(climate.index[year])
        break
Unsurprisingly, the output of the loop I provided prints the first 90 degree day period (from the year 1919, the beginning of my data frame) and breaks.
1919-06-12 00:00:00
That's fine. If I take out the break statement, all of the 90 degree days print, including multiple in the same year. I just want the first value from each year to print. Do I need to set up a second for loop to increment through the year? If I explicitly state the year, as below, while trying to loop through a counter, the loop still begins in 1919 and eventually reaches an out-of-bounds index. I know this logic is incorrect.
count = 1919
while count < 2019:
    for year in range(len(climate['Year'])):
        if (climate[climate['Year'] == count]['Max'][year] >= 90).all():
            print(climate.index[year])
    count = count + 1
Any input is sincerely appreciated.
You can achieve this without having a second for loop. Assuming the climate dataframe is ordered chronologically, this should do what you want:
current_year = None
for i in range(climate.shape[0]):
    if climate['Max'][i] >= 90 and climate['Year'][i] != current_year:
        print(climate.index[i])
        current_year = climate['Year'][i]
Notice that we're using the current_year variable to keep track of the latest year that we've already printed the result for. Then, in the if check, we're checking if we've already printed a result for the year of the current row in the loop.
That's one way to do it, but I would suggest taking a look at pandas.DataFrame.groupby because I think it fits your use case well. You could get a dataframe that contains the first >=90 max days per year with the following (again assuming climate is ordered chronologically):
climate[climate.Max >= 90].groupby('Year').first()
This just filters the dataframe to only contain the >=90 max days, groups rows from the same year together, and retains only the first row from each group. If you had an additional column Location, you could extend this to get the same except per location per year:
climate[climate.Max >= 90].groupby(['Location', 'Year']).first()
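A runnable sketch of the groupby approach on toy data (the dates and temperatures here are invented; calling `reset_index` first keeps the date available as a column after grouping):

```python
import pandas as pd

# toy stand-in for the climate frame, indexed by date
climate = pd.DataFrame(
    {'Max': [92, 95, 88, 91, 93],
     'Year': [1919, 1919, 1920, 1920, 1920]},
    index=pd.to_datetime(['1919-06-12', '1919-07-01',
                          '1920-05-30', '1920-08-15', '1920-09-01']))

first_hot = (climate[climate.Max >= 90]
             .reset_index()          # move the date index into a column
             .groupby('Year')
             .first())               # keep the first qualifying row per year
print(first_hot)
```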

Working out number of years in compound interest using python function

School Question:
Build a function retirement_age(PMT, i, FV, start_age) that calculates the (whole) age at which your customer can retire, if they:
Invest an amount, PMT at the END of every YEAR (with the first
payment made exactly one year from now),
at an interest rate of i% per year, compounded annually.
They require an amount of AT LEAST FV in order to be able to afford
retirement.
They just turned start_age years old.
I am struggling to solve for the number of years it would take the PMT payments to reach FV.
This is my code:
def retirement_age(PMT, i, FV, start_age):
    count = 0
    while PMT <= FV:  # PMT set to loop till it reaches FV
        PMT = PMT * (1+i)
        count = count + 1  # adds 1 to count value until while loop satisfied
    age = count + start_age  # adds count value to start_age to determine retirement age
    return int(age)  # returns age

print(retirement_age(20000, 0.1, 635339.63, 20))
my answer with this code:
57
The answer is supposed to be:
35
I can't tell what I'm doing wrong. And the task specifically mentions that we are not allowed to import external functions like math for example, which means I can't use math.log() which would probably solve all my problems.
First, I'll note that broad debugging questions like this aren't very appropriate for SO.
Having said that, I played around with it and after reading the specs again, I found the issue(s). I figured I might as well post it.
You only need to keep calculating while the principal is less than the future value. You can stop once they're equal.
The main issues however were that you aren't adding any money each year. You're just accumulating interest on the initial principal. And...
You invested PMT immediately. The investment doesn't happen until the end of the year, as the instructions emphasize. That means at the start of the looping, he has 0 invested. That means he doesn't start accumulating interest until the start of the second loop/year.
def retirement_age(PMT, i, FV, start_age):
    age = start_age
    p = 0
    while p < FV:
        p = PMT + p * (1+i)
        age += 1
    return int(age)

print(retirement_age(20000, 0.1, 635339.63, 20))
# 35
I introduced p to keep track of the running balance since it's separate from what's being added each year. Your logic for keeping track of age was also a little convoluted, so I simplified it down a bit.
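As an offline sanity check (the assignment forbids imports, so this is only for verifying the answer, not for submission), the closed-form future value of an ordinary annuity, FV = PMT * ((1 + i)^n - 1) / i, can be solved for the number of payments n:

```python
import math

PMT, i, FV, start_age = 20000, 0.1, 635339.63, 20
# solve FV = PMT * ((1 + i)**n - 1) / i  for n
n = math.log(FV * i / PMT + 1) / math.log(1 + i)
print(start_age + math.ceil(n))  # 35, agreeing with the loop above
```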

pandas how to flatten a list in a column while keeping list ids for each element

I have the following df,
A id
[ObjectId('5abb6fab81c0')] 0
[ObjectId('5abb6fab81c3'),ObjectId('5abb6fab81c4')] 1
[ObjectId('5abb6fab81c2'),ObjectId('5abb6fab81c1')] 2
I'd like to flatten each list in A and assign its corresponding id to each element in the list, like:
A id
ObjectId('5abb6fab81c0') 0
ObjectId('5abb6fab81c3') 1
ObjectId('5abb6fab81c4') 1
ObjectId('5abb6fab81c2') 2
ObjectId('5abb6fab81c1') 2
I think the comment is coming from this question. You can use my original post or this one:
df.set_index('id').A.apply(pd.Series).stack().reset_index().drop('level_1',1)
Out[497]:
id 0
0 0 1.0
1 1 2.0
2 1 3.0
3 1 4.0
4 2 5.0
5 2 6.0
Or
pd.DataFrame({'id':df.id.repeat(df.A.str.len()),'A':df.A.sum()})
Out[498]:
A id
0 1 0
1 2 1
1 3 1
1 4 1
2 5 2
2 6 2
This probably isn't the most elegant solution, but it works. The idea here is to loop through df (which is why this is likely an inefficient solution), and then loop through each list in column A, appending each item and the id to new lists. Those two new lists are then turned into a new DataFrame.
a_list = []
id_list = []
for index, a, i in df.itertuples():
    for item in a:
        a_list.append(item)
        id_list.append(i)
df1 = pd.DataFrame(list(zip(a_list, id_list)), columns=['A', 'id'])
As I said, inelegant, but it gets the job done. There's probably at least one better way to optimize this, but hopefully it gets you moving forward.
EDIT (April 2, 2018)
I had the thought to run a timing comparison between mine and Wen's code, simply out of curiosity. The two variables are the length of column A, and the length of the list entries in column A. I ran a bunch of test cases, iterating by orders of magnitude each time. For example, I started with A length = 10 and ran through to 1,000,000, at each step iterating through randomized A entry list lengths of 1-10, 1-100 ... 1-1,000,000. I found the following:
Overall, my code is noticeably faster (especially at increasing A lengths) as long as the list lengths are less than ~1,000. As soon as the randomized list length hits the ~1,000 barrier, Wen's code takes over in speed. This was a huge surprise to me! I fully expected my code to lose every time.
Length of column A generally doesn't matter - it simply increases the overall execution time linearly. The only case in which it changed the results was for A length = 10. In that case, no matter the list length, my code ran faster (also strange to me).
Conclusion: If the list entries in A are on the order of a few hundred elements (or fewer), my code is the way to go. But if you're working with huge data sets, use Wen's! Also worth noting that as you hit the 1,000,000 barrier, both methods slow down drastically. I'm using a fairly powerful computer, and each was taking minutes by the end (it actually crashed on the A length = 1,000,000 and list length = 1,000,000 case).
Flattening and unflattening can be done using the functions below.
def flatten(df, col):
    col_flat = pd.DataFrame([[i, x] for i, y in df[col].apply(list).iteritems() for x in y],
                            columns=['I', col])
    col_flat = col_flat.set_index('I')
    df = df.drop(col, 1)
    df = df.merge(col_flat, left_index=True, right_index=True)
    return df
Unflattening:
def unflatten(flat_df, col):
    return flat_df.groupby(level=0).agg({**{c: 'first' for c in flat_df.columns}, col: list})
After unflattening we get the same dataframe except for column order:
(df.sort_index(axis=1) == unflatten(flatten(df, 'A'), 'A').sort_index(axis=1)).all().all()
>> True
To create unique index you can call reset_index after flattening
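For what it's worth, pandas 0.25 and newer ship `DataFrame.explode`, which does exactly this in one call (the toy integers below stand in for the ObjectIds):

```python
import pandas as pd

df = pd.DataFrame({'A': [[1], [2, 3], [4, 5]], 'id': [0, 1, 2]})
# explode repeats the other columns (here, id) for each list element
flat = df.explode('A').reset_index(drop=True)
print(flat)
#    A  id
# 0  1   0
# 1  2   1
# 2  3   1
# 3  4   2
# 4  5   2
```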
