Variation of weighted interval scheduling given fixed number of classrooms - dynamic-programming

I have a question about solving a weighted interval scheduling problem with a fixed number of classrooms. We are given a set of intervals, each with a start time, a finish time, and a weight. The aim is to find a schedule across two classrooms that maximizes the total weight. Is there an efficient way to do this with dynamic programming?
My approach was trivial: I built an algorithm that simply maximizes the weight of the intervals for each classroom. Is there a better way to do this?

My idea is not fully dynamic programming, but I think it will help.
Sort all classes by their start time.
For a class i, find the next class j whose start time is greater than or equal to the end time of i. (You can find it with binary search, because the array is sorted by start time.)
Let max_so_far be an array where max_so_far[z] contains the maximum weight of any single class from index z to the last.
For all i, find the maximum of weight(class[i]) + max_so_far[j].
Please find the code here
The time complexity of this code is O(n log n).
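The steps above can be sketched in Python as follows. This is a minimal sketch and not the code the answer links to: the function name `best_pair_weight`, the `(start, end, weight)` tuple layout, and the assumption that every class ends strictly after it starts are mine. Note that it returns the weight of one class plus the heaviest single class starting at or after its end, which is what the steps describe; it does not build a complete two-classroom schedule.

```python
from bisect import bisect_left

def best_pair_weight(classes):
    """classes: list of (start, end, weight) tuples with end > start (assumed)."""
    classes = sorted(classes, key=lambda c: c[0])  # sort by start time
    starts = [start for start, _, _ in classes]
    n = len(classes)

    # max_so_far[z] = heaviest single class among indices z..n-1.
    # The extra slot max_so_far[n] = 0 means "no later class available".
    max_so_far = [0] * (n + 1)
    for z in range(n - 1, -1, -1):
        max_so_far[z] = max(classes[z][2], max_so_far[z + 1])

    best = 0
    for _, end, weight in classes:
        # Binary search: first class whose start time is >= this end time.
        j = bisect_left(starts, end)
        best = max(best, weight + max_so_far[j])
    return best

example = [(0, 3, 5), (1, 2, 4), (2, 5, 6), (4, 6, 3)]
print(best_pair_weight(example))
```

Sorting and the n binary searches each cost O(n log n), and the suffix-maximum pass is O(n), which matches the stated complexity.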


sample size for a single arm study based on median time to event

For my master's thesis, I need to determine the number of cases (the sample size) required for a median time-to-event analysis. The method follows Brookmeyer & Crowley, 1982. My question is: how can I determine the sample size according to Brookmeyer and Crowley? How can I write down the equation for N? I know how to calculate the confidence interval, but my problem is how to determine the number of cases theoretically.
Edit:
My plan is: "Designing the trial with different characteristics: planning a single arm study without historical control. How can I determine the sample size N and what method is the best". The endpoint is the median time to event (PFS). I want to determine the sample size N and then calculate it, which is why I thought I could use or find a formula for N. I assume that the survival time is exponentially distributed. With this I want to look at: 1- Sample size based on distributional assumptions? 2- No implementation available? How to derive the p-value? Thanks for further help, best regards

The linearity conditions required by LP solver are not satisfied

So for an assignment I have to find the schedule that minimizes the sum of absolute differences between the demanded and scheduled number of workers per time interval by solving an integer linear optimization model.
So I modeled my schedule as a set cover problem and created a row with the demanded number of workers and a row with the actual number of workers.
I take the sum of the absolute differences between the rows as the objective and try to minimize that.
=SUM(ABS(C39:Z39-C33:Z33))
However I get the error "The linearity conditions required by LP solver are not satisfied" and I don't get why since the Linearity report says yes on everything.
*X_i is the number of times a shift is chosen.
ABS() is not a linear function. Who knows why Excel doesn't call that out... its internal solver does not have a great reputation.
You might try changing your objective function to some penalty * uncovered jobs and see if you can get your model up and running. Then maybe subtract the used workers from the available ones, sum that up, and add a penalty for unused workers...
As @AirSquid has already pointed out, the absolute value is not a linear function. However, in your context it is possible to linearize it. You can use the fact that
minimizing sum abs(x_i)
is equivalent to
minimizing sum a_i, where the a_i are new variables with constraints a_i >= x_i, a_i >= -x_i (here x_i stands for the difference between demanded and scheduled workers in interval i).
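To see why the reformulation gives the same optimum, here is a small brute-force check in Python. It is only a sketch on hypothetical toy data (three intervals, two shifts, and the bounds `MAX_X` and `MAX_A` are all my assumptions, not taken from the question), and it enumerates solutions instead of calling an LP solver. The point is that the second model uses only linear constraints and a linear objective, yet reaches the same minimum as the sum of absolute differences.

```python
from itertools import product

# Hypothetical toy data (not from the question): 3 time intervals, 2 shifts.
demand = [2, 2, 1]
covers = [[1, 1, 0],   # shift 0 covers intervals 0 and 1
          [0, 1, 1]]   # shift 1 covers intervals 1 and 2
MAX_X = 3              # assumed upper bound on how often a shift is chosen
MAX_A = 6              # assumed upper bound on each auxiliary variable a_t

def diffs(x):
    """Scheduled minus demanded workers per interval; linear in x."""
    return [sum(covers[s][t] * x[s] for s in range(len(x))) - demand[t]
            for t in range(len(demand))]

def best_nonlinear():
    """Minimise sum_t |diff_t| directly (the objective the LP solver rejects)."""
    return min(sum(abs(d) for d in diffs(x))
               for x in product(range(MAX_X + 1), repeat=len(covers)))

def best_linearised():
    """Minimise sum_t a_t subject to a_t >= diff_t and a_t >= -diff_t."""
    best = None
    for x in product(range(MAX_X + 1), repeat=len(covers)):
        d = diffs(x)
        for a in product(range(MAX_A + 1), repeat=len(demand)):
            # Only linear constraints and a linear objective are used here.
            if all(a_t >= d_t and a_t >= -d_t for a_t, d_t in zip(a, d)):
                if best is None or sum(a) < best:
                    best = sum(a)
    return best

print(best_nonlinear(), best_linearised())
```

In a spreadsheet model the same idea would mean adding one extra variable cell per time interval plus the two constraints, and minimizing the plain sum of those cells.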

How to design a score or signature function based on the time series data

I want to design a score or signature function based on a time series signal. Usually, the signal has ups and downs.
For a given time window, I want to design the score function based on the number of times the signal fluctuates, the duration of the fluctuations, and the magnitude of the fluctuations. I am wondering what kind of math I can use to design the function. I am not sure whether statistical features (mean, median, and so on) would be enough to design a unique function such that two time windows would be distinguishable.
Thanks!
Summary statistics will not give you what you want... but they can still be useful.
Things you can try:
Zero crossings on the signal will give you number of fluctuations. You'll have to use some central tendency value to move the signal about the 0 line in order to do this. Alternatively you can use FFT on the original to find the harmonic frequency as part of the score.
You could define the duration of fluctuations as the difference between zero crossings divided by two (since one fluctuation will reach the 0-line twice).
Magnitude can be done by finding the local minima and maxima - check out some packages with peak finding functions. You might want to use the mean or median to rule out local minima and maxima that fall on the wrong side of the line. Alternatively, finding the zero crossings on the derivative signal and then mapping them back to the original will give you all the local minima and maxima as well.
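A rough sketch of these three ideas in plain Python, using only the standard library. The function name `fluctuation_features`, the choice of the mean as the center line, and the returned keys are my assumptions; the gap between consecutive zero crossings is reported as is, so how to turn it into a "duration" is left to you. Flat runs and exact zeros are handled only crudely.

```python
from statistics import mean

def fluctuation_features(signal):
    """Return simple fluctuation features for one time window."""
    center = mean(signal)                      # central tendency used as the 0-line
    centered = [v - center for v in signal]

    # Zero crossings: indices where the centered signal changes sign.
    # Simplification: exact zeros are treated as non-negative.
    crossings = [i for i in range(1, len(centered))
                 if (centered[i - 1] < 0) != (centered[i] < 0)]
    gaps = [b - a for a, b in zip(crossings, crossings[1:])]

    # Local extrema: the first difference changes sign at index i.
    # Flat runs are not handled specially in this sketch.
    steps = [b - a for a, b in zip(centered, centered[1:])]
    magnitudes = []
    for i in range(1, len(steps)):
        rising_before, rising_after = steps[i - 1] > 0, steps[i] > 0
        if rising_before == rising_after:
            continue
        is_maximum = rising_before
        # Keep maxima above the center line and minima below it.
        if (is_maximum and centered[i] > 0) or (not is_maximum and centered[i] < 0):
            magnitudes.append(abs(centered[i]))

    return {
        "zero_crossings": len(crossings),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "mean_magnitude": mean(magnitudes) if magnitudes else 0.0,
    }

window = [1, 3, 1, -1, -3, -1, 1, 3, 1, -1, -3, -1]
print(fluctuation_features(window))
```

The three numbers can then be combined into a single score (for example a weighted sum), with the weights chosen for your application.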

Geometric Series - partial sum (processing efficiency)

So here is my situation. I have to solve a math problem on the server end and could expect tens of thousands of requests a second, so I'm trying to find the most efficient way to solve the problem.
The client will submit some number, let's call it A, and I need to determine the base of the exponent in a geometric series (see below) such that the result is as close to A as possible without exceeding it.
The problem is that in the real world each value of the geometric series is rounded, so standard math can't be applied directly.
round(x^1)+round(x^2)+round(x^3).
I can use the partial sum of geometric series equation to find some rough upper and lower limits using:
((x)^(n+1)-1)/((x)-1)
So say x=2 is a lower limit and x=2.03 is an upper limit... and the value I'm solving for is x=2.02392372838123.
So far the only solution I have found is to use a recursive function that goes through the decimals one at a time, testing until I find the number, but the load on the server is too high at the volume of requests I expect. (I am using node.js.)
Does anyone have any thoughts or suggestions on a more efficient way to solve this? Again, the only reason I can't solve this with math alone (to the best of my skill) is the real-world rounding of the numbers in the sum.
Thanks.
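One possible direction, sketched here in Python and not in node.js: for x >= 0 each rounded term is non-decreasing in x, so the rounded sum is non-decreasing too, and a bisection on x can replace the digit-by-digit search. The names `rounded_sum` and `largest_base`, the assumption that the number of terms `n` is known, and the use of round-half-up (to mimic JavaScript's `Math.round`) are all my assumptions.

```python
import math

def rounded_sum(x, n):
    """round(x^1) + round(x^2) + ... + round(x^n), rounding halves up."""
    return sum(math.floor(x ** k + 0.5) for k in range(1, n + 1))

def largest_base(target, n, iterations=60):
    """Approximate the largest x >= 0 with rounded_sum(x, n) <= target.

    Relies on rounded_sum being non-decreasing in x for x >= 0.
    """
    lo = 0.0                  # rounded_sum(0, n) == 0 <= target
    hi = float(target) + 1.0  # the first term alone already exceeds target
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if rounded_sum(mid, n) <= target:
            lo = mid          # mid is still feasible, move the lower bound up
        else:
            hi = mid          # mid overshoots, move the upper bound down
    return lo

x = largest_base(14, 3)
print(x, rounded_sum(x, 3))
```

Each request then costs a fixed number of sum evaluations (60 here), and starting from tighter bounds such as those from the closed-form partial sum would reduce that further.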

Occurrence prediction

I'd like to know what method is best suited for predicting event occurrences.
For example, given a set of data from 5 years of malaria infection occurrences and several other factors that affect the occurrences, I'd like to predict the next five years for malaria infection occurrences.
What I thought of doing was to derive a kind of occurrence factor using fuzzy logic rules, then average the occurrences with the occurrence factor to get the first predicted occurrence, then average everything again together with the predicted occurrence, and keep iterating for all five years. But I decided to seek help online first.
There are many ways to do forecasting, and each has its own advantages and disadvantages. The science of determining the accuracy of a forecast often consists of trying to minimize error. All forecasting comes down to using the past as a predictor of the future, adjusted by some amount. E.g. tomorrow the temperature will be the same as today, plus or minus some amount. How you decide the +/- is what varies.
Here are a range of techniques you might want to review:
Moving Averages (simple, single, double)
Exponential Smoothing
Decomposition (Trend + Seasonality + Cyclicals + Irregularities)
Linear Regression
Multiple Regression
Box-Jenkins (a.k.a. ARIMA, Auto-Regressive Integrated Moving Average)
Sorry for the vague answer, but forecasting is complex stuff.
What you describe about feeding your predictions back into the model to produce future predictions is standard stuff. I don't know if "fuzzy logic" gets you anything in particular. As any forecasting instructor will tell you, sometimes you just squint and look at the data. Context is everything.
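As an illustration of feeding predictions back into the model, here is a minimal Python sketch using a simple moving average, the first technique in the list above. The function name `iterate_moving_average` and the yearly counts are hypothetical; a real forecast would also use the other factors mentioned in the question.

```python
def iterate_moving_average(history, window, steps):
    """Forecast `steps` future values.

    Each forecast is the mean of the last `window` values, and every
    forecast is appended to the series so later forecasts can use it.
    """
    series = list(history)
    forecasts = []
    for _ in range(steps):
        next_value = sum(series[-window:]) / window
        forecasts.append(next_value)
        series.append(next_value)  # feed the prediction back in
    return forecasts

# Hypothetical yearly occurrence counts for five past years.
past = [120, 135, 128, 150, 142]
print(iterate_moving_average(past, window=3, steps=5))
```

Because every new value is an average of earlier ones, forecasts produced this way flatten out quickly, which is one reason to look at the trend and seasonality methods as well.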
I would use a logit or probit model to predict occurrence given a set of exogenous circumstances. Not sure why you want to iterate. That would basically be equivalent to including a lag in the regression formula. You could do it, and as long as the coefficient was <1, you wouldn't have the explosion problem.
If you want to introduce an element of endogeneity to the independent variables, you could use a VAR.
I think with your idea as stated, you'll have asymptotic behavior as time goes by. Either your data will converge to 0, or it will explode. That said, you'd probably have to give some data and/or describe its properties before anyone can help you. This is basically a simulation, and the factors are everything when it comes to extrapolation.
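The converge-or-explode behavior is easy to see in the simplest case, where each value is just a constant multiple of the previous one. This is a minimal Python sketch under that assumption (a pure lag with no intercept and no other factors); `iterate_lag` and the starting value are hypothetical.

```python
def iterate_lag(start, coefficient, steps):
    """Iterate y[t+1] = coefficient * y[t], starting from `start`."""
    values = [start]
    for _ in range(steps):
        values.append(coefficient * values[-1])
    return values

print(iterate_lag(100, 0.5, 5))  # coefficient below 1: shrinks toward 0
print(iterate_lag(100, 2, 5))    # coefficient above 1: grows without bound
```

With extra terms in the model (an intercept or outside factors) the series can settle at a non-zero level instead, which is why the properties of the data matter so much.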
