I've had a bit of trouble explaining this so please bear with me. I'm also very new to using excel so if there's a simple fix, I apologize in advance!
I have two columns, one listing number of days starting from 0 and increasing consecutively. The other column has the number of orders delivered. The two correspond to each other. For example, I've typed out how it would look below. It would mean that there were 100 orders delivered in 1 day, 150 orders delivered in 2 days, 800 orders delivered in 3 days, etc.
Is there a way to get summary statistics (mean, median, mode, upper and lower quartiles) for the number of days it took for the average order to get delivered? The only way I can think of solving this is to manually punch in "1" 100 times, "2" 150 times, etc. into a new column and take median, mean, and upper & lower quartile from that, but that seems extremely inefficient. Would I use a pivot table for this? Thank you in advance!
I tried using the data analysis add-on and doing summary statistics that way, but it didn't work. It just gave me the mean, median, mode, and quartiles of each individual column. It would have given me 3 for median number of days for delivery and 300 for median number of orders.
Method 1
The mean is just
=SUMPRODUCT(A2:A6,B2:B6)/SUM(B2:B6)
Mode is the value with highest frequency
=INDEX(A2:A6,MATCH(MAX(B2:B6),B2:B6,0))
The quartiles and median (or any other quantile, by varying the value of p) can be computed from first principles, following this reference:
=LET(p,0.25,
values,A2:A6,
freq,B2:B6,
N,SUM(freq),
h,(N+1)*p,
floorh,FLOOR(h,1),
ceilh,CEILING(h,1),
frac,h-floorh,
cusum,SCAN(0,SEQUENCE(ROWS(values)),LAMBDA(a,c,IF(c=1,0,a+INDEX(freq,c-1)))),
xlower,XLOOKUP(floorh-1,cusum,values,,-1),
xupper,XLOOKUP(ceilh-1,cusum,values,,-1),
xlower+(xupper-xlower)*frac)
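For anyone who wants to cross-check the spreadsheet results outside Excel, the same first-principles idea can be sketched in Python. This is only an illustration under stated assumptions: the day and order counts are made-up sample data (the first three counts echo the question, the last two are invented), and `weighted_quantile` is a hypothetical helper name, not a library function. It uses the same (N+1)*p position rule as the LET formula above and looks up the k-th smallest observation from the cumulative counts rather than expanding the data.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical sample data: days to deliver, and orders delivered in that many days.
days = [1, 2, 3, 4, 5]
orders = [100, 150, 800, 300, 50]


def weighted_quantile(values, freq, p):
    """Quantile of a frequency table using the (N+1)*p position rule."""
    cum = list(accumulate(freq))      # cum[i] = number of observations <= values[i]
    n = cum[-1]
    h = (n + 1) * p                   # 1-based fractional position in the sorted data
    if not 1 <= h <= n:
        raise ValueError("p is too extreme for this sample size")
    lo = int(h)                       # floor of h (h is positive)
    hi = min(lo + 1, n)
    frac = h - lo

    def kth(k):
        # k-th smallest observation: first value whose cumulative count reaches k
        return values[bisect_left(cum, k)]

    return kth(lo) + (kth(hi) - kth(lo)) * frac


# Weighted mean and mode, mirroring the SUMPRODUCT and INDEX/MATCH formulas.
weighted_mean = sum(v * f for v, f in zip(days, orders)) / sum(orders)
weighted_mode = days[orders.index(max(orders))]

print("mean  ", weighted_mean)
print("mode  ", weighted_mode)
print("Q1    ", weighted_quantile(days, orders, 0.25))
print("median", weighted_quantile(days, orders, 0.50))
print("Q3    ", weighted_quantile(days, orders, 0.75))
```

Working from cumulative counts keeps the cost proportional to the number of rows in the table, not the number of orders.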
Method 2
If you don't like doing it this way, you can always expand the data like this:
=AVERAGE(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1))
=MODE(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1))
=QUARTILE.EXC(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1),1)
=MEDIAN(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1))
and
=QUARTILE.EXC(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1),3)
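The expansion approach is also easy to sketch in Python with only the standard library. The sample data below is hypothetical, and the assumption here is that `statistics.quantiles` with `method="exclusive"` uses the same (N+1)*p rule as QUARTILE.EXC, so the two should agree.

```python
import statistics

# Hypothetical sample data: days to deliver, and orders delivered in that many days.
days = [1, 2, 3, 4, 5]
orders = [100, 150, 800, 300, 50]

# Expand the frequency table: repeat each day value once per order.
expanded = [d for d, count in zip(days, orders) for _ in range(count)]

mean_days = statistics.mean(expanded)
median_days = statistics.median(expanded)
mode_days = statistics.mode(expanded)
# n=4 returns the three cut points: lower quartile, median, upper quartile.
q1, q2, q3 = statistics.quantiles(expanded, n=4, method="exclusive")

print("mean", mean_days, "median", median_days, "mode", mode_days)
print("quartiles", q1, q2, q3)
```

Expanding is fine for a few thousand orders. For very large counts, working from cumulative frequencies avoids building the long list.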
I was able to build a discount curve for the Treasury market. However, I'm looking to use this to find the key rate risks of an individual bond (and eventually a portfolio of bonds).
The key rate risk I'm looking for is this: if I have a 30Y bond and we shift the 1Y rate that was used to discount the bond while holding the other rates constant, by how much does the price of the bond change? Repeating this for the other tenors (e.g. 2Y, 5Y, 7Y, etc.) and summing the results should get you to the overall duration of the bond, but it provides a better view of how the risk exposure breaks down.
http://www.investinganswers.com/financial-dictionary/bonds/key-rate-duration-6725
Is anyone aware of any documentation that demonstrates how to do this? Thank you.
Given that you have already built the bond and the discount curve, and you have linked them in some way similar to:
discount_handle = RelinkableYieldTermStructureHandle(discount_curve)
bond.setPricingEngine(DiscountingBondEngine(discount_handle))
you can first add a spread over the existing discount curve and then use the modified curve to price the bond. Something like:
nodes = [1, 2, 5, 7, 10]  # the key-rate tenors, in years
dates = [today + Period(n, Years) for n in nodes]  # today is the evaluation date
spreads = [SimpleQuote(0.0) for n in nodes]  # null spreads to begin
new_curve = SpreadedLinearZeroInterpolatedTermStructure(
    YieldTermStructureHandle(discount_curve),
    [QuoteHandle(q) for q in spreads],
    dates)
will give you a new curve with initial spreads all at 0 (and a horrible class name) that you can use instead of the original discount curve:
discount_handle.linkTo(new_curve)
After the above, the bond should still return the same price (since the spreads are all null).
When you want to calculate a particular key-rate duration, you can move the corresponding quote. For instance, if you want to bump the 5-year quote (the third in the list above), execute:
spreads[2].setValue(0.001) # 10 bps
The curve will update accordingly, and the bond price should change.
A note: the above will interpolate between spreads, so if you move the 5-year point by 10 bps and leave the 2-year point unchanged, then a rate around 3 years would move by about 3 bps. To mitigate this (in case that's not what you want), you can add more points to the curve and restrict the range that varies. For instance, if you add a point at 5 years minus one month and another at 5 years plus one month, then moving the 5-year point will only affect the two months around it.
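To make the bump-and-reprice loop concrete without needing a QuantLib installation, here is a minimal pure-Python sketch of the same idea. It is not the QuantLib API. The zero rates, the 10-year 4% annual bond and the helper names (`interpolate`, `bond_price`) are all assumptions chosen for illustration, with continuous compounding and linearly interpolated spreads as described in the note above.

```python
import math


def interpolate(x, xs, ys):
    """Linear interpolation, flat beyond the first and last node."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + w * (ys[i] - ys[i - 1])


def bond_price(cashflows, nodes, zero_rates, spreads):
    """Discount each cash flow at the interpolated zero rate plus spread."""
    total = 0.0
    for t, amount in cashflows:
        rate = interpolate(t, nodes, zero_rates) + interpolate(t, nodes, spreads)
        total += amount * math.exp(-rate * t)
    return total


nodes = [1, 2, 5, 7, 10]                              # key-rate tenors, in years
zero_rates = [0.030, 0.032, 0.035, 0.037, 0.040]      # hypothetical zero curve
cashflows = [(t, 4.0) for t in range(1, 10)] + [(10, 104.0)]  # hypothetical bond
BUMP = 0.0001                                         # 1 bp

base = bond_price(cashflows, nodes, zero_rates, [0.0] * len(nodes))

key_rate_durations = []
for i in range(len(nodes)):
    spreads = [0.0] * len(nodes)
    spreads[i] = BUMP                                 # bump one tenor, hold the rest
    bumped = bond_price(cashflows, nodes, zero_rates, spreads)
    key_rate_durations.append(-(bumped - base) / (base * BUMP))

# Parallel shift of every node, for comparison with the sum of key-rate durations.
parallel = bond_price(cashflows, nodes, zero_rates, [BUMP] * len(nodes))
parallel_duration = -(parallel - base) / (base * BUMP)

for tenor, krd in zip(nodes, key_rate_durations):
    print(f"{tenor:>2}Y key-rate duration: {krd:.4f}")
print(f"sum: {sum(key_rate_durations):.4f}  parallel: {parallel_duration:.4f}")
```

Because the interpolated bumps add up to a parallel shift, the key-rate durations should sum to roughly the duration obtained from bumping every node at once, up to second-order effects.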
I am trying to create a spreadsheet that can find the most likely probability that a student scored a specific grade on a test.
Only one student can score a grade and only one grade can have a student.
I have limited information about each student.
There are 5 students (1,2,3,4,5)
and the grades possible are only (100,90,80,70,60)
In the spreadsheet a 1 denotes that the student DIDN'T score that grade.
Does anyone know how to make a simulation that I can find the most likely probability of what student scored what grade?
Link:
https://docs.google.com/spreadsheets/d/1a8uUIRzUKsY3DolTM1A0ISqMd-42WCUCiDsxmUT5TKI/edit?usp=sharing
Based on your response in comments, each student has an equal likelihood of getting each grade. No simulation is necessary.
If you want to simulate it anyway, don't use Excel*. Create a vector of students, and pair it with a shuffled vector of the grades. Lather, rinse, repeat as many times as you want to see that the student-to-grade matching is uniformly distributed.
* - To get an idea of how bad Excel can be for random variate generation, enable the Analysis ToolPak, go to "Data -> Data Analysis" on the ribbon, and select "Random Number Generation". Fill in the dialog to request 10 variables and 2000 random numbers from a "Normal" distribution, leave the mean and standard deviation at 0 and 1, and enter a "Random Seed" value of 123. You will find that the resulting table contains 3 instances of the value "-9.35764". Values that extreme should occur about once per twenty thousand years if you generate a billion a second. Getting three of them is so extreme that it should happen once per 10^30 times the current estimated age of the universe. Conclude that either a) it's your lucky day, or b) Excel sucks at random numbers, and despite being informed about this as far back as 1998, Microsoft hasn't bothered to fix it.
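The shuffle-and-pair simulation described above could look something like the following in Python. This is a sketch under assumptions: the trial count and seed are arbitrary choices, and it follows the answer's premise that every student is equally likely to get every grade, ignoring any exclusions marked in the spreadsheet.

```python
import random
from collections import Counter

students = [1, 2, 3, 4, 5]
grades = [100, 90, 80, 70, 60]
TRIALS = 20000                      # arbitrary number of repetitions

rng = random.Random(2024)           # fixed seed (arbitrary) so runs are repeatable
counts = Counter()

for _ in range(TRIALS):
    shuffled = grades[:]            # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    # Pair student i with the i-th grade of the shuffled list: one grade each.
    for student, grade in zip(students, shuffled):
        counts[(student, grade)] += 1

proportions = {pair: c / TRIALS for pair, c in counts.items()}

for student in students:
    row = "  ".join(f"{proportions[(student, g)]:.3f}" for g in grades)
    print(f"student {student}: {row}")
```

Each printed row should hover around 0.2 per grade, which is the uniform matching the answer predicts.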
You need 100 lbs of bird feed. John's bag can carry 15 lbs and Mark's bag can carry 25 lbs. Both guys have to contribute exactly the same total amount each. What's the lowest number of trips each will have to take?
I have calculated this using systems of equations.
15x + 25y = 100
15x - 25y = 0
This equals out to:
John would have 3.33 trips and Mark would have 2 trips. Only one problem: you can't have 1/3 of a trip.
The correct answer is:
John would take 5 trips (75 lbs) and Mark would take 3 trips (75 lbs).
How do you calculate this? Is there an excel formula which can do both layers of this?
Assuming you put the total bird feed required in A1 and John's and Mark's bag limits in B1 and B2 respectively, then this formula in C1:
=MATCH(TRUE,INDEX(2*ROW(INDIRECT("1:100"))*LCM($B$1:$B$2)>=$A$1,,),0)*LCM($B$1:$B$2)/B1
will give the lowest number of trips required of John. Copying this formula down to C2 will give the equivalent result for Mark.
Note that the 100 in the part:
ROW(INDIRECT("1:100"))
was arbitrarily chosen and will give correct results providing neither John nor Mark is required to make more than twice that number of trips, i.e. 200. Obviously you can amend this value if you feel it necessary (up to a theoretical limit of 2^20).
Regards
Since John and Mark need to carry the same total amount of bird feed, what they will carry has to be a multiple of the least common multiple.
Since they both carry that amount the total amount will always be an even multiple of the LCM.
So find the least even multiple of the LCM that is at least 100, and calculate the number of trips John and Mark will have to take from that.
For John:
=CEILING(100/(2*LCM(15;25));1)*LCM(15;25)/15
For Mark:
=CEILING(100/(2*LCM(15;25));1)*LCM(15;25)/25
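The LCM reasoning can be checked with a few lines of Python. This is just a sketch of the same arithmetic, and `min_trips` is a hypothetical function name introduced for the example.

```python
import math


def min_trips(total, cap_a, cap_b):
    """Fewest trips for two carriers who must each haul the same total amount."""
    lcm = cap_a * cap_b // math.gcd(cap_a, cap_b)
    # Each person carries k * lcm, so together they carry 2 * k * lcm.
    # Pick the smallest whole k with 2 * k * lcm >= total (integer ceiling).
    k = -(-total // (2 * lcm))
    each = k * lcm
    return each // cap_a, each // cap_b


john_trips, mark_trips = min_trips(100, 15, 25)
print(john_trips, mark_trips)   # → 5 3
```

With 15 lb and 25 lb bags the LCM is 75, so each person carries 75 lbs and the pair delivers 150 lbs in total, which covers the 100 lbs required.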
I am trying to figure out the optimal number of products I should make per day. The idea is to display the values in a chart and then use the chart to find the optimal number of products to make per day.
Cost of production: $4
Sold for: $12
Leftovers sold for $1
So the ideal profit for a product is $8, but it could be -$3 if it's left over at the end of the day.
The daily demand of sales has a mean of 150 and a standard deviation of 30.
I have been able to generate a list of random numbers for how many products are demanded each day using: NORMINV(RAND(),mean,std_dev)
but I don't know where to go from here to figure out the amount sold from the amount of products made that day.
The number sold on a given day is min(# produced, daily demand).
ADDENDUM
The decision variable is a choice you make: "I will produce 150 each day", or "I will produce 145 each day". You told us in the problem statement that daily demand is a random outcome with a mean of 150 and a SD of 30. Let's say you go with producing 150, the mean of demand. Since it's the mean of a symmetric distribution, half the time you will sell everything you made and have no losses, but in most of those cases you actually could have sold more and made more money. You can't sell products you didn't make, so your profit is capped at selling 150 on those days. The other half of the time, you won't sell all 150 and will take a loss on the unsold items, reducing your profit a bit. The actual profit on any given day is a random variable, because it is determined by random demand.
Since profit is random, you can calculate your average earnings across many days based on the assumption that you produce 150. You can also average earnings based on the assumption that you produce 140 per day, or 160 per day, or any other number. It sounds like you've been asked to plot those average earnings versus how many you decided to produce, and choose a production level that results in the highest long-term average earnings.
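The procedure described above can be sketched as a small simulation. Python is used here only for illustration; the same layout works in a spreadsheet with one column of simulated demand and one profit column per production level. The number of simulated days, the seed, and the grid of production levels are arbitrary assumptions, while the prices and demand parameters come from the question.

```python
import random

UNIT_COST = 4.0        # cost of production
PRICE = 12.0           # regular selling price
SALVAGE = 1.0          # price for leftovers
MEAN_DEMAND = 150.0
SD_DEMAND = 30.0


def daily_profit(produced, demand):
    """Profit for one day given the production choice and realised demand."""
    sold = min(produced, demand)          # cannot sell what was not made
    leftover = produced - sold
    return PRICE * sold + SALVAGE * leftover - UNIT_COST * produced


def average_profit(produced, demands):
    return sum(daily_profit(produced, d) for d in demands) / len(demands)


rng = random.Random(42)                   # fixed seed (arbitrary) for repeatability
# The same simulated demands are reused for every production level,
# so the levels are compared on identical days.
demands = [max(0.0, rng.gauss(MEAN_DEMAND, SD_DEMAND)) for _ in range(20000)]

levels = range(100, 205, 5)               # candidate production levels
averages = {q: average_profit(q, demands) for q in levels}
best_level = max(averages, key=averages.get)

for q in levels:
    print(q, round(averages[q], 2))
print("best production level on this grid:", best_level)
```

Plotting `averages` against the production level gives the chart the question asks for, and the peak of that curve is the level to choose.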