VBA: Generating data that mimics specific parameters (Avg, StdDev, etc.) - excel

I have modified a VBA array function given to me here: Excel Generate Normalized Data
That question explains what I am after.
Download the Excel workbook I am using to see the full setup:
http://www.mediafire.com/?smq5tl9poitdacc
I am using the following data (The left side are values I enter for the data to be based upon, the right side is the results of the generated data):
As you can see, the % Diff is very good for Avg Click, but Click/Time is off when the Day StdDev (Day +/-) is high. When the Day StdDev is low, the difference is close to 0.
I think this is because the variable NoClickDaysPerClick_Running_Avg becomes inaccurate: NoClickDays_Total (which is used indirectly to determine the running average) is "guessed" at the start and would need to be reassessed after each click, because a high StdDev adds randomness and the original "guess" becomes more and more inaccurate.
I am not sure whether this is the problem or, if it is, how I can solve it.
I am just looking for advice on the best way to do what I want. I am not sure why the StdDev values are so far off either, but that's not a big deal. I'd rather have a more accurate Click/Time than anything else, regardless of what the Day StdDev is.
Here is the function in my VBA:
Function ClickSpacer(Total_Days As Long, ClicksPerDay_Desired_Avg As Double, _
        Clicks_Desired_Deviation As Double, Clicks_Min As Integer, Clicks_Max As Integer, _
        TotalClicksOverTotalDays_Desired_Avg As Double, NoClickDays_Desired_Deviation As Double, _
        NoClickDays_Min As Integer, NoClickDays_Max As Integer)
    Dim Day_Array() As Integer
    ReDim Day_Array(1 To Total_Days, 1 To 1)
    Dim NumDaysToGetClicks As Double
    Dim ClickOffset As Long
    Dim Clicks_Total As Long
    Dim Clicks_SoFar As Long
    Dim Clicks_Remaining As Long
    Dim NoClickDaysPerClick_Desired_Avg As Double
    ' Number of clicks needed to achieve the desired average of clicks over time
    Clicks_Total = Round(Total_Days * TotalClicksOverTotalDays_Desired_Avg, 0)
    ' Number of days on which the user has to click at least once to achieve the desired avg. clicks per day
    NumDaysToGetClicks = Round(Clicks_Total / ClicksPerDay_Desired_Avg, 0)
    ' The number of non-click days needed to fill out the total days
    NoClickDays_Total = Round(Total_Days - NumDaysToGetClicks, 0)
    ' The guesstimated average of non-click days per click to fill out total non-click days
    ' Only used to seed ClickOffset and for comparison with the running avg
    NoClickDaysPerClick_Desired_Avg = NoClickDays_Total / NumDaysToGetClicks
    ' This variable is here to achieve results closer to the desired StdDev.
    ' A higher multiplier will not limit the deviation but just give an average deviation.
    ' For example, if the average was 3 with a +/- 2, then with a StdDevMulti of 1
    ' ALL numbers will be 1 (3-2) through 5 (3+2) with an avg of 3 and stddev of 2; the numbers will NEVER exceed the StdDev.
    ' With a StdDevMulti of 2, the numbers will be 0 through 6, but should still have an
    ' avg deviation of 2.
    StdDevMulti = 1
    NoClickDays_Desired_Deviation = NoClickDays_Desired_Deviation * StdDevMulti
    Clicks_Desired_Deviation = Clicks_Desired_Deviation * StdDevMulti
    ' Set the obvious defaults
    ClickedDaysSoFar = 0
    Clicks_SoFar = 0
    NoClickDays_SoFar = 0
    ' Give the ClickOffset a starting value
    ClickOffset = NoClickDaysPerClick_Desired_Avg
    Do
        ' Used to find the "running" average of days not clicked
        NoClickDays_Remaining = NoClickDays_Total - NoClickDays_SoFar
        ' Used to find the "running" average of clicks per day
        Clicks_Remaining = (Clicks_Total - Clicks_SoFar)
        ' Used in both "running" averages mentioned above; it also
        ' marks the end of the outer loop.
        RemainingClickedDays = (NumDaysToGetClicks - ClickedDaysSoFar)
        ' Find what the average number of clicks should be based on what remains
        ' and then apply the deviation. Only accept a click value at or below its max
        ' and at or above its min.
        Do
            ' Generate a random number between -1 and 1
            SignChanger = Rnd() - Rnd()
            ' Apply the randomized StdDev
            Clicks_Deviation = Clicks_Desired_Deviation * SignChanger
            ' Figure out the "running" average
            ClicksPerDay_Running_Avg = Clicks_Remaining / RemainingClickedDays
            ' Figure out a click value and round to the nearest whole number
            Generated_Clicks = Round(ClicksPerDay_Running_Avg + Clicks_Deviation, 0)
            ' Make sure it meets the requirements; if not, try again
        Loop While Generated_Clicks < Clicks_Min Or Generated_Clicks > Clicks_Max
        ' Set the click value at the spaced-out array index
        Day_Array(ClickOffset, 1) = Generated_Clicks
        ' Find a random space based upon the "running" avg. and desired deviation.
        ' Make sure it is between the min and max required.
        Do
            ' Generate a random number between -1 and 1
            SignChanger = Rnd() - Rnd()
            ' Apply the randomized StdDev
            NoClickDays_Deviation = NoClickDays_Desired_Deviation * SignChanger
            ' Figure out the "running" average
            NoClickDaysPerClick_Running_Avg = NoClickDays_Remaining / RemainingClickedDays
            ' Figure out a space value and round to the nearest whole number
            Generated_NoClickDays = Round(NoClickDaysPerClick_Running_Avg + NoClickDays_Deviation, 0)
            ' Make sure it meets the requirements; if not, try again
        Loop While Generated_NoClickDays < NoClickDays_Min Or Generated_NoClickDays >= NoClickDays_Max
        ' Define the array index based upon the spacing previously generated.
        ' Make sure to "add" to the already known index. Add 1 because you
        ' have to account for the index the click occupies
        ClickOffset = ClickOffset + Generated_NoClickDays + 1
        ' These should be self-explanatory
        ClickedDaysSoFar = ClickedDaysSoFar + 1
        Clicks_SoFar = Clicks_SoFar + Generated_Clicks
        NoClickDays_SoFar = NoClickDays_SoFar + Generated_NoClickDays
    Loop While ClickOffset < Total_Days And RemainingClickedDays > 0
    ' Set the function result to the array so that it returns the clicks as
    ' we want. Ideally this would just replace Total_Days fields under
    ' the base, so as not to require an array function. Neither of these works:
    'ClickSpacer = Range("P1:P" & UBound(Day_Array) + 1).Value
    'Range("P1:P" & UBound(Day_Array) + 1) = Application.Transpose(Day_Array)
    ClickSpacer = Day_Array
End Function

I think your assumption is correct. The "problem" with the code you have above is that it uses StdDev as the basis for generating random numbers, so the standard deviation will tend to be accurate and the mean will be less accurate.
If you want more accuracy with the mean and less with the standard deviation, then you'll have to "flip" how numbers are generated: they'll need to center around your desired mean and use the desired standard deviation as a guide, rather than the other way around.
I have an idea about how this can be done, but it will take more concentration than I can apply at work, so I'll have to come back and edit this later. I'll see what I can do.
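One possible way to "flip" the generation, offered only as a sketch and not necessarily what the answerer had in mind: draw a batch of normally distributed values, then shift and rescale the whole batch so that its sample mean and sample standard deviation match the targets. The function name SamplesWithExactMoments and its parameters are hypothetical. The results are real numbers, so rounding them to whole clicks or clamping them to a min/max afterwards will pull both statistics slightly away from the targets again.

```vba
' Hypothetical helper: returns n values whose sample mean and sample standard
' deviation equal target_mean and target_sd (up to floating-point error).
Public Function SamplesWithExactMoments(ByVal n As Long, _
        ByVal target_mean As Double, ByVal target_sd As Double) As Double()
    Const PI As Double = 3.14159265358979
    Dim raw() As Double, result() As Double
    Dim i As Long
    Dim total As Double, rawMean As Double, sumSq As Double, rawSd As Double

    If n < 2 Then Err.Raise 5, , "n must be at least 2"
    ReDim raw(1 To n)
    ReDim result(1 To n)

    ' Draw standard-normal values with the Box-Muller transform.
    ' 1 - Rnd() lies in (0, 1], which keeps Log() away from zero.
    For i = 1 To n
        raw(i) = Sqr(-2 * Log(1 - Rnd())) * Cos(2 * PI * Rnd())
        total = total + raw(i)
    Next i
    rawMean = total / n

    ' Sample standard deviation of the raw draws (n - 1 in the denominator).
    For i = 1 To n
        sumSq = sumSq + (raw(i) - rawMean) ^ 2
    Next i
    rawSd = Sqr(sumSq / (n - 1))
    If rawSd = 0 Then Err.Raise 5, , "degenerate sample"   ' practically impossible

    ' Standardise each draw, then rescale it to the requested mean and deviation.
    For i = 1 To n
        result(i) = target_mean + target_sd * (raw(i) - rawMean) / rawSd
    Next i
    SamplesWithExactMoments = result
End Function
```

Because the correction is applied to the batch as a whole, the mean is exact by construction, at the cost of the individual values no longer being fully independent draws.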

Related

variable looping

I am attempting to write code to help with some design work.
I want my code to read the radius of a circle, e.g. 40m.
We have rules that say we can use multiples of 12m to achieve this, e.g. 12+12+12+12, but that does not satisfy the requirement, so we then need to replace one of those numbers with a smaller number, starting at 10m, so the equation would look like 12+12+12+10. If that still doesn't work, we change a different number so the equation looks like 12+12+10+12. This needs to loop until we reach 40m (subject to change; it could be upwards of 93m, I am just using 40m for simplicity).
Is there a way to set up variables that add to each other to reach the desired result, looping through different numbers until that result is hit?
To add to this, the final number can be a close estimate as long as it is at least 0.5m long.
E.g. if our radius is 41.7, then no whole number will satisfy this, so my intention is to have a loop that finds the closest answer and then an additional bit of code that adjusts the final number.
I have no idea where to start, so I haven't tried much.
I expect to be able to input the radius and the number of lengths to be used. It's not always going to be 4 different numbers; it could be more or fewer. We know roughly how many we need, just not the order they need to be in.
For the output, we expect the cells in Excel to be filled with one length per cell, e.g. 12m in one cell, 12m in another, 12m in another, and then 4m in another, as this will fit our requirements.
If there is a way to loop through numbers until a desired number is hit, that would be ideal; I'm just not sure how I would write that in code.
As others have already noted, it's not fully clear what you need.
Anyway, the code below distributes the radius (B1) over the number of addends given in B3, each a multiple of the divisor you specify in B2, until the remainder is less than the divisor. The results are written to row 5, one addend per cell.
Sub distribute()
    ' Double rather than Integer so a radius such as 41.7 is not rounded on input
    Dim radius As Double
    Dim divisor As Integer, addends As Integer
    Dim times As Integer
    Dim i As Integer
    radius = Range("B1").Value
    divisor = Range("B2").Value
    addends = Range("B3").Value
    ' How many divisors each addend must hold so that all addends cover the radius
    times = WorksheetFunction.Ceiling(radius / divisor / addends, 1)
    For i = 1 To addends
        If radius > divisor * times Then
            Cells(5, i).Value = divisor * times
            radius = radius - divisor * times
        Else
            ' Whatever is left goes into this cell; any later cells get 0
            Cells(5, i).Value = radius
            radius = 0
        End If
    Next
End Sub

VBA Script to Extract data every 30s Interval

I have a list of voltage outputs and the time stamps at which they were recorded. For some background, my test increments a voltage by 1mV every 30s from 0 - 5V. The system reads every second or so, it can be inconsistent. The sheet has over 70000 rows but I only need just over 5000.
So far I have used =RIGHT(TEXT(A1, "hh:mm:ss"),2) to extract the seconds from the time. Unfortunately, the timestamps are not always exactly 30s apart, so I can't simply filter for every 0s and 30s. Sometimes 30s is skipped and it reads at, say, 35s.
How would I create a VBA script that extracts the increments at every 30s interval and if there is no perfect 30s jump, select the closest match to it (i.e. 29 or 31)?
Any guidance would be greatly appreciated!
OK, I think that the following will work. Note that this needs to be entered as an Excel array formula in the Flags output column, with the function's input parameter being the corresponding column that contains the times. (So the output column's formula should look like: {=FlagEvery30Sec(D2:D36)})
' Returns 1 if the corresponding row should be used, and 0 otherwise
Public Function FlagEvery30Sec(SourceTimes As Range) As Long()
    Dim Times() As Variant
    Dim Flags() As Long
    ' get the source values
    Times = SourceTimes.Value
    ' set the output values array to the same size
    Dim First As Long, Last As Long
    First = LBound(Times, 1)
    Last = UBound(Times, 1)
    ReDim Flags(First To Last, 1 To 1)
    Dim curr As Long, prev As Long
    prev = First
    curr = First
    Flags(curr, 1) = 1
    Dim currSecs As Double, prevSecs As Double, lastFlagSecs As Double
    ' the first row is always flagged, so its time is the initial base timestamp
    lastFlagSecs = CDbl(Times(curr, 1)) * 24 * 60 * 60
    prevSecs = lastFlagSecs
    curr = curr + 1
    While curr <= Last
        Flags(curr, 1) = 0 ' assume not flagged, change later
        currSecs = CDbl(Times(curr, 1)) * 24 * 60 * 60
        If (currSecs - lastFlagSecs) >= 30 Then
            ' we have passed the 30s mark: flag whichever of the current and
            ' previous rows is closer to it
            If ((currSecs - lastFlagSecs) - 30) <= (30 - (prevSecs - lastFlagSecs)) Then
                Flags(curr, 1) = 1
                lastFlagSecs = currSecs
            Else
                Flags(prev, 1) = 1
                lastFlagSecs = prevSecs
            End If
        End If
        prevSecs = currSecs
        prev = curr
        curr = curr + 1
    Wend
    FlagEvery30Sec = Flags
End Function
Note that there are basically two approaches to this:
1. Take the first timestamp, mark off every 30 seconds, then find the entry that is closest to each mark and flag it. Or,
2. Take the first timestamp and find the following entry that is closest to 30 seconds after it. Then take that entry's time as the base timestamp and find the next entry that is closest to 30 seconds after that. Repeat this until you reach the end.
My function above uses the second method. Note that this can drift, though, and the final result may contain more or fewer marks than TotalSeconds/30.
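For comparison, the first approach (fixed marks counted from the first timestamp) could be sketched roughly as follows. This is an illustration under assumptions, not part of the original answer: the name FlagNearestToMarks and the optional IntervalSecs parameter are made up here, and the times are assumed to be sorted ascending in a single column of at least two cells.

```vba
' Hypothetical sketch: flags, for every 30s mark counted from the first
' timestamp, the one row whose time is closest to that mark.
Public Function FlagNearestToMarks(SourceTimes As Range, _
        Optional ByVal IntervalSecs As Double = 30) As Variant
    Const SECS_PER_DAY As Double = 86400
    Dim times As Variant
    Dim flags() As Long
    Dim r As Long, firstRow As Long, lastRow As Long
    Dim startSecs As Double, elapsed As Double, dist As Double
    Dim mark As Long, bestMark As Long, bestRow As Long
    Dim bestDist As Double

    times = SourceTimes.Value                   ' 2-D, 1-based array
    firstRow = LBound(times, 1)
    lastRow = UBound(times, 1)
    ReDim flags(firstRow To lastRow, 1 To 1)    ' Long elements start at 0

    ' The first row sits exactly on mark 0, so it starts out as the best match.
    startSecs = CDbl(times(firstRow, 1)) * SECS_PER_DAY
    bestMark = 0: bestRow = firstRow: bestDist = 0

    For r = firstRow + 1 To lastRow
        elapsed = CDbl(times(r, 1)) * SECS_PER_DAY - startSecs
        mark = Int(elapsed / IntervalSecs + 0.5)    ' index of the nearest mark
        dist = Abs(elapsed - mark * IntervalSecs)
        If mark <> bestMark Then
            flags(bestRow, 1) = 1                   ' close out the previous mark
            bestMark = mark: bestRow = r: bestDist = dist
        ElseIf dist < bestDist Then
            bestRow = r: bestDist = dist            ' closer row for the same mark
        End If
    Next r
    flags(bestRow, 1) = 1                           ' close out the final mark
    FlagNearestToMarks = flags
End Function
```

Because every mark is measured from the first timestamp, this version does not drift, but a mark with no nearby reading is simply skipped. It would be entered as an array formula in the same way as FlagEvery30Sec.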

How to round a list of decimals in Excel, so that the sum of the whole numbers equal a defined total?

Having difficulty developing an excel function that will round a list of decimals, so that the sum of the whole numbers equals the original or a defined total.
Edit
I guess one way to do it is writing a function that first searches for the largest numbers and rounds them to the nearest whole number. That whole number is then counted and the function moves on to the next, until the total count equals the target total.
The problem that I am running into is that if there are too many numbers that are closer to 0, then the function will never equal the target total. So what the function then needs to do is identify the largest decimals, round them up, count, and then move on to the next until the sum of the count is equal to the target total. The left over data can then round to 0.
Sorry, I hope this is clearer....
I am dealing with larger data sets where the totals of the rounded whole numbers have much larger deviations to the original total.
It would be preferable if this could be accomplished with an Excel function; otherwise I am also open to doing it in VBA.
Thanks!
Edit 3: Here is an example data set:
Please study my project VBA.Round.
Browse to paragraph Rounding a series of numbers to a sum
Code is way too much to post here, but an example workbook is included for download.
Example:
This function will read the range of distribution values, round the sum, and fill the two ranges with rounded values of 2 and zero decimals, totalling to the requested total (confirmed, as seen, by the formula):
' Practical example for using Excel ranges for RoundSum
'
' Source URL:
' https://stackoverflow.com/questions/63715043/how-to-round-a-list-of-decimals-in-excel-so-that-the-sum-of-the-whole-numbers-e
'
' 2020-09-14. Gustav Brock, Cactus Data ApS, CPH.
'
Public Sub RoundDistribution()
    ' Named ranges. These should pairwise match in row size.
    Const VolumeName As String = "Volume"
    Const PercentValuesName As String = "Percent_Distribution"
    Const ValuesName As String = "Distribution"
    Const RoundedValuesName As String = "Rounded_Distribution"
    Dim Range As Excel.Range
    Dim Values() As Currency
    Dim Results() As Currency
    Dim Total As Integer
    Dim Index As Integer
    ' Read percent distribution values from the named range.
    Set Range = ThisWorkbook.Names(PercentValuesName).RefersToRange
    ' Read original volume value.
    Total = ThisWorkbook.Names(VolumeName).RefersToRange(1, 1)
    ' Dim input and output arrays.
    ReDim Values(1 To Range.Rows.Count)
    ReDim Results(1 To Range.Rows.Count)
    ' Fill input array.
    For Index = LBound(Values) To UBound(Values)
        Values(Index) = Range(Index, 1)
    Next
    ' Round total and retrieve array with distribution values.
    Results = RoundSum(Values, RoundMid(Total), 2)
    ' Fill named range with distribution values.
    For Index = LBound(Results) To UBound(Results)
        ThisWorkbook.Names(ValuesName).RefersToRange(Index, 1) = Results(Index)
    Next
    ' Round total and retrieve array with rounded distribution values.
    Results = RoundSum(Values, RoundMid(Total))
    ' Fill named range with rounded distribution values.
    For Index = LBound(Results) To UBound(Results)
        ThisWorkbook.Names(RoundedValuesName).RefersToRange(Index, 1) = Results(Index)
    Next
End Sub
Output:
Please note that the function can round to any number of decimals, and that it selects only one instance of the values 0.34 to obtain a match.
The full demo (Excel workbook) and code is still for download on GitHub.
I'd agree with @pghcpa that this is rather an arithmetic problem.
One idea for solution:
order the numbers descending on their fraction part
take the floor of each (i.e ignoring the fractions)
take the sum of those floors
compare that sum to the desired sum, take the difference
This way you'd probably have a positive difference, so you can start from the top and add 1 to each number downwards, all until the difference is gone.
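The idea above (often called the largest-remainder method) could look roughly like this in VBA. It is a minimal sketch under assumptions: the function name RoundToTotal is hypothetical, the input is a one-dimensional array of Doubles rather than a worksheet range, and the target is assumed to be at least the sum of the floors and no more than one unit per value above it.

```vba
' Hypothetical sketch: rounds each value down, then adds 1 to the values with
' the largest fractional parts until the rounded values sum to target.
Public Function RoundToTotal(values() As Double, ByVal target As Long) As Long()
    Dim result() As Long, frac() As Double
    Dim i As Long, k As Long, best As Long
    Dim shortfall As Long

    ReDim result(LBound(values) To UBound(values))
    ReDim frac(LBound(values) To UBound(values))

    ' Floor every value and work out how far the floors fall short of the target.
    shortfall = target
    For i = LBound(values) To UBound(values)
        result(i) = Int(values(i))
        frac(i) = values(i) - result(i)
        shortfall = shortfall - result(i)
    Next i

    If shortfall < 0 Or shortfall > UBound(values) - LBound(values) + 1 Then
        Err.Raise 5, , "target is too far from the sum of the values"
    End If

    ' Hand out the missing units one at a time, largest fraction first.
    ' (A repeated scan is used instead of a sort to keep the sketch short.)
    For k = 1 To shortfall
        best = LBound(values)
        For i = LBound(values) To UBound(values)
            If frac(i) > frac(best) Then best = i
        Next i
        result(best) = result(best) + 1
        frac(best) = -1        ' mark as used so it is not picked again
    Next k
    RoundToTotal = result
End Function
```

For example, the values 1.4, 2.3 and 3.3 with a target of 7 floor to 1, 2 and 3 (sum 6), so the single missing unit goes to 1.4, the value with the largest fraction, giving 2, 2 and 3.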

New to VBA in Excel. Trying to sum an incremented function

So what I am trying to do is take the difference between two values (x) and (y) to get (n). I then want to run (x) through a formula (n) times, incrementing (x) each time. Then I want to output the sum of all of those results into a cell. I can't figure out how to do it neatly within one cell like normal, so I've turned to VBA for once.
Here is what I have so far:
Sub Math()
    '
    'Math
    '
    Dim i As Integer
    i = 0
    Do While i < ((E42) - (d42))
        cell(h42).Value = ((((d42) + i) ^ 2) * 100) / 3
End Sub
What I'm stuck on is how to get the result of each loop and sum them all together. I expect to have an i value that can range anywhere from 1-100. The only way I can think of that would definitely work is messy: I would set aside a large number of cells in a column to calculate each iteration individually, then sum all of those together.
Alternatively, if there's a way to write a function that can calculate the sum over n of ((x+n)^2)*100/3, that would be much nicer, but I can't think of how it would be written.
Here is how you can make a function (which can be used directly in worksheet formulas) to form a sum:
Function eval_sum(n As Long, x As Double) As Double
    Dim s As Double, i As Long
    For i = 0 To n - 1
        s = s + (x + i) ^ 2
    Next i
    eval_sum = s * 100 / 3
End Function
This function evaluates:
100/3 * (x^2 + (x+1)^2 + (x+2)^2 + ... + (x+(n-1))^2)
It wasn't completely clear if this is what you were trying to do. If not, you can tweak the code to fit your needs.
Small point: I used Long rather than Integer. Forget that Integer exists. It is really legacy from the days of limited memory and 16-bit computers. In modern Excel an Integer is stored with 32 bits anyway. You are just risking overflow for no good reason.
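If a loop-free version is preferred, the same sum has a closed form, obtained by expanding each (x + i)^2 and using the standard formulas for the sum of the first n - 1 integers and of their squares. The sketch below assumes the sum really is the one evaluated by eval_sum above; the name eval_sum_closed is made up for illustration.

```vba
' Hypothetical closed-form counterpart of eval_sum. It relies on
'   x^2 + (x+1)^2 + ... + (x+n-1)^2
'     = n*x^2 + x*n*(n-1) + (n-1)*n*(2n-1)/6
Function eval_sum_closed(n As Long, x As Double) As Double
    Dim m As Double
    m = n      ' work in Double so the products cannot overflow a Long
    eval_sum_closed = (m * x ^ 2 + x * m * (m - 1) _
                       + (m - 1) * m * (2 * m - 1) / 6) * 100 / 3
End Function
```

A quick check with n = 2: the loop gives x^2 + (x+1)^2 = 2x^2 + 2x + 1, and the closed form gives 2x^2 + 2x + 6/6, which is the same expression.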

Need to calculate a formula

I need some math help, or some Excel help if that works better. I have a set of data points and I need to calculate the intermediate values. The table below lists the known data points; what I need to know is, if I enter 1200 ft, what MEG is available.
FOOTAGE MEG
1000 19.3
2000 20.66
3000 21.328
4000 21.398
5000 20.976
6000 20.155
7000 19.023
8000 17.658
9000 16.133
10,000 14.513
11,000 12.854
12,000 11.208
13,000 9.617
14,000 8.117
15,000 6.736
16,000 5.493
17,000 4.411
18,000 3.487
19,000 2.724
20,000 2.114
I have entered these into Excel and proceeded to find a formula that its chart believes to be the answer to my question. The formula it gives is
y = 8E-12x^3 - 3E-07x^2 + 0.0018x + 18.218
This actually gets me really good results for anything under 12k feet. After that the results stray further and further from accurate until I start getting negative numbers after 18k feet.
I tried entering more orders for it to calculate against, but that just made things worse.
Would I be better off splitting the chart in 2 (>10k ft and <10k ft) and using 2 formulas, or is there a good solution available using the whole chart?
Precision is tremendously important. In fact, there are several serious problems here with what you have done.
Mere use of coefficients with a single digit of precision will cause terrible problems. Remember that x is as large as 20000. Cubing a number of that size will be a huge number. Now, multiply it by a number on the order of 8e-12, and what do you get?
Oh, by the way, the actual value of those coefficients should be closer to
[8.38044124105504e-12 -2.95337111670131e-07 0.00176948515975282 18.2176584107327]
So does this make a difference?
8e-12*20000^3
ans =
64
8.38044124105504e-12*20000^3
ans =
67.0435299284403
It DOES make a difference, a serious one.
You MIGHT choose to use simple linear interpolation, but the cubic is a bit smoother. Beware extrapolation, as a cubic will do strange things if you try it. In fact though, the cubic polynomial has a significant amount of lack of fit. You can do significantly better using a 4th order polynomial, as long as you are careful to scale the independent variable (footage) by dividing by 10000.
a4 = -3.02325078929022
a3 = 21.0780945560741
a2 = -46.9692303618201
a1 = 26.3111163470058
a0 = 17.1162276831784
MEG = a0 + a1*footage/10000 + a2*(footage/10000)^2 +
a3*(footage/10000)^3 + a4*(footage/10000)^4
Note the importance of scaling by 10000 (or at least a number that is chosen to transform your numbers so they are on the order of 1 or so.)
I'd not go much past that point though in terms of increasing the order of the fit.
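As a sketch of how the scaled 4th order fit above might be used from a worksheet, the coefficients can be wrapped in a small VBA function. The function name MegFromFootage is an assumption made for illustration; the coefficients are copied from the fit above, and the result is only meaningful inside the tabulated range of 1000 to 20,000 ft.

```vba
' Hypothetical wrapper around the 4th order fit, with footage scaled by 10000.
Public Function MegFromFootage(ByVal footage As Double) As Double
    Const a4 As Double = -3.02325078929022
    Const a3 As Double = 21.0780945560741
    Const a2 As Double = -46.9692303618201
    Const a1 As Double = 26.3111163470058
    Const a0 As Double = 17.1162276831784
    Dim s As Double

    ' Scale the independent variable so that s is on the order of 1.
    s = footage / 10000

    ' Horner's scheme: same polynomial as a0 + a1*s + a2*s^2 + a3*s^3 + a4*s^4,
    ' evaluated without forming large powers explicitly.
    MegFromFootage = a0 + s * (a1 + s * (a2 + s * (a3 + s * a4)))
End Function
```

Entered in a cell as =MegFromFootage(10000), this returns about 14.513, which matches the tabulated value, and =MegFromFootage(1200) returns about 19.63.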
Can I suggest you use a small VBA script, which uses linear interpolation to pull out a value from your list:
Public Function Linterp(Tbl As Range, x As Double) As Variant
    ' linear interpolator / extrapolator
    ' Tbl is a two-column range containing known x, known y, sorted x ascending
    Dim nRow As Long
    Dim iLo As Long, iHi As Long
    nRow = Tbl.Rows.Count
    If nRow < 2 Or Tbl.Columns.Count <> 2 Then
        Linterp = CVErr(xlErrValue)
        Exit Function '-------------------------------------------------------->
    End If
    If x < Tbl(1, 1) Then ' x < xmin, extrapolate from first two entries
        iLo = 1
        iHi = 2
    ElseIf x > Tbl(nRow, 1) Then ' x > xmax, extrapolate from last two entries
        iLo = nRow - 1
        iHi = nRow
    Else
        iLo = Application.Match(x, Application.Index(Tbl, 0, 1), 1)
        If Tbl(iLo, 1) = x Then ' x is exact from table
            Linterp = Tbl(iLo, 2)
            Exit Function '---------------------------------------------------->
        Else ' x is between tabulated values, interpolate
            iHi = iLo + 1
        End If
    End If
    Linterp = Tbl(iLo, 2) + (Tbl(iHi, 2) - Tbl(iLo, 2)) * (x - Tbl(iLo, 1)) / (Tbl(iHi, 1) - Tbl(iLo, 1))
End Function
You call this from your sheet with something like:
=Linterp(A1:B10, 1200)
You can easily tweak the code to adjust how you want values outside of the range to be handled.
On a slightly different note, you may also be interested in this http://www.codecogs.com/excel_render which can draw out your equations.

Resources