How to use Excel data to find the period

I have three Excel columns of data from an experiment with a pendulum: time, angular displacement, and angular velocity. I was wondering if there is a way in Excel to calculate and then graph the period (and, if possible, display the function for the graph)... I realize it's kind of a dumb question. I'm still new to Excel.
Thanks for any pointers you can give!

If the Analysis ToolPak is installed, you can use Tools -> Data Analysis -> Fourier Analysis. If the data is a superposition of harmonic functions (sin, cos), the corresponding frequencies (i.e. inverse periods) will appear as peaks in the Fourier output.
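If you're comfortable stepping outside Excel, the same idea works as a short Python sketch using NumPy's FFT. Everything file- and column-related below is an assumption (a CSV named pendulum.csv with time in the first column and angle in the second), and the time column is assumed to be evenly spaced:

import numpy as np

# Assumed layout: CSV with a header row, time (s) in col 0, angle in col 1.
data = np.loadtxt("pendulum.csv", delimiter=",", skiprows=1)
t, theta = data[:, 0], data[:, 1]

dt = t[1] - t[0]                                      # assumes uniform sampling
spectrum = np.abs(np.fft.rfft(theta - theta.mean()))  # remove the DC offset
freqs = np.fft.rfftfreq(len(theta), d=dt)

f_peak = freqs[1:][np.argmax(spectrum[1:])]           # skip the zero-frequency bin
print("dominant frequency:", f_peak, "Hz -> period:", 1.0 / f_peak, "s")

For a pendulum this should show one strong peak at the oscillation frequency, matching what the Fourier Analysis tool displays.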

Related

Rebinning data in Excel

I have some wind data in Excel that is binned by wind speed and direction. I want to re-bin it to cover different intervals (assuming the data is uniform across the original intervals), i.e. to go from one set of bins to another.
Can anyone give me some pointers? I'm struggling to think of an elegant way to do this.
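For what it's worth, here is a minimal Python sketch of that proportional re-binning, spreading each original bin's count across the new bins by overlap length; the bin edges and counts below are made-up examples:

import numpy as np

old_edges = np.array([0.0, 5.0, 10.0, 15.0])        # original wind-speed bins
old_counts = np.array([12.0, 30.0, 8.0])            # counts per original bin
new_edges = np.array([0.0, 4.0, 8.0, 12.0, 15.0])   # desired bins

new_counts = np.zeros(len(new_edges) - 1)
for i, (lo, hi) in enumerate(zip(old_edges[:-1], old_edges[1:])):
    density = old_counts[i] / (hi - lo)             # uniform within the old bin
    for j, (nlo, nhi) in enumerate(zip(new_edges[:-1], new_edges[1:])):
        overlap = max(0.0, min(hi, nhi) - max(lo, nlo))
        new_counts[j] += density * overlap          # allocate by overlap length

print(new_counts)  # totals are preserved when the new edges span the old range

The same arithmetic can be written as Excel formulas (one MAX/MIN overlap term per old/new bin pair), but it gets verbose quickly.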

Excel, Determine where data takes a dive

I'm trying to:
- determine where, in a set of measurement data, the data takes a dive,
- plot a vertical line at that point, and
- plot a horizontal line in the graph.
I have no problem doing the 2nd and 3rd bullet points above on my own, so that's taken care of.
The problem I need help with is the first bullet point: determining WHERE the data takes a dive, i.e. WHERE the data crosses a threshold that basically says, "Whatever it is you're measuring is no longer performing as expected."
Here's what I'm doing:
I take measurements using a measuring device; the device logs the measurements in its internal memory and lets me download them to my computer as a CSV when the test session is complete.
I pull that CSV into an XLS and plot the data on a graph. (See attached image.)
Here's what I want to do:
If you look at the attached image, I would like to find the value where the data DEFINITELY crosses BELOW the horizontal line, so I can say, "Here is where the device being tested 'gave up the ghost' and was no longer able to perform as desired."
What the data roughly looks like:
Each measurement set will have the rough look and feel of the attached image, but slightly different each time (because each object I am testing has roughly the same performance characteristics, but each has its own manufacturing defects and variations).
The data set for the attached image is a data set of 7000 measurements.
I never really know where the horizontal line will be.
Examples of the data sets I have gotten in the past several tests look like this:
(394 to 0)
(390000 to 0)
(3.88 to 0)
(375000 to 0)
(39.55 to 0)
(59200 to 0)
and each data set will have about 1,000 to 7,000 measurements.
Here's how I was trying to solve this issue:
I was using SLOPE() and trying to latch onto where the slope of the line took a dive / started working its way toward a steep, near-vertical drop, figuring that once the slope becomes strongly negative the data MUST be taking a dive. That didn't really work.
I was also looking at using STDEV.P() in Excel, feeding it the entire data set. Then I looked at feeding it only the first 10, 30, or 60 measurements, but then I thought: we never really know how many measurements will come through. Then I thought I would feed the first 10% of the measurements to STDEV.P().
Please let me know what you think of this and please let me know of any ideas you may have.
Thanks.
H
Something like this should work to flag when the decay rate increases.
To find what 'direction' your data is going in, you need the derivative.
Excel doesn't have a derivative formula, but you can set one up pretty easily using (change in y)/(change in x), as demonstrated here:
http://faculty.educ.ubc.ca/sanderson/lab/CLFbiom/demo/diff.htm
I would then add a formula that counts how many data rows you have (=COUNTA(A:A) or similar),
then use that to get a step of 10% of your data,
and then check the value of the derivative in a cell against the cell 10% further down. If it's still negative (to account for the slight downhill at first), then you'll know the dive is sustained and not just a blip.
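Here is that recipe as a Python sketch, in case it's easier to prototype outside Excel; the 10% step comes from the suggestion above, while the file name and everything else are assumptions:

import numpy as np

y = np.loadtxt("measurements.csv", delimiter=",")  # hypothetical single column
dydx = np.diff(y)                                  # (change in y)/(change in x), unit x-steps
step = max(1, len(y) // 10)                        # 10% of the data rows
# (On noisy data you may want to smooth y first, e.g. with a moving average.)

# Flag the first point where the derivative is negative now AND still
# negative one step further on -- a sustained dive rather than a blip.
dive = None
for i in range(len(dydx) - step):
    if dydx[i] < 0 and dydx[i + step] < 0:
        dive = i
        break
print("dive starts near index", dive)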
The right way to go about this is to model the data with an unknown discontinuity, something like "if time < break_time then (some constant plus noise) else (decaying exponential)". Maximum likelihood estimation for that model might require iteration or other operations that are clumsy in Excel -- maybe you should consider VB or Python or some other programming language. That is, choose the tool to fit the problem, not the other way around.
See Seber and Wild, "Nonlinear Regression", for an extensive discussion of models with discontinuities.
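A least-squares version of that model is easy to sketch in Python; with Gaussian noise, least squares coincides with maximum likelihood. The "plateau, then decaying exponential" form, the synthetic data, and the grid of candidate break times are all assumptions for illustration:

import numpy as np
from scipy.optimize import least_squares

# Synthetic stand-in for the real measurements: plateau at 5, break at t = 6.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 400)
y = np.where(t < 6.0, 5.0, 5.0 * np.exp(-(t - 6.0) / 1.5)) + rng.normal(0, 0.1, t.size)

def pieces(p, t, t0):
    c, tau = p
    # Plateau before the break, exponential decay after. The exponent is
    # clipped at 0 so the unused branch of np.where cannot overflow.
    return np.where(t < t0, c, c * np.exp(np.minimum(0.0, -(t - t0) / tau)))

# The break time makes the objective non-smooth, so scan candidates for t0
# and fit only (plateau, tau) by least squares at each candidate.
best = None
for t0 in np.quantile(t, np.linspace(0.1, 0.9, 41)):
    res = least_squares(lambda p: pieces(p, t, t0) - y,
                        x0=[np.median(y[t < t0]), 1.0],
                        bounds=([1e-9, 1e-9], [np.inf, np.inf]))
    if best is None or res.cost < best[0]:
        best = (res.cost, t0, res.x)

print("estimated break time:", best[1], "plateau and tau:", best[2])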
If your data can be generally characterized as having:
(A) a more or less flat plateau region, followed by
(B) a downward trending region
then a basic strategy could be to start at the end of the data and march toward the beginning one point at a time, checking that the values are increasing. Once they stop increasing, you've found the break point.
The strategy assumes (unwisely?) that the downward trending region is smooth/noiseless. To make the solution more robust to noise, you could compare values that are 5 apart, or 10 apart, or whatever interval works to filter out the noise. Or you could use a moving average.
This strategy could potentially be made more efficient by starting the search somewhere in the middle of the data, but still in the downward trending portion. If you know (based on experience) that any value that is (say) 0.5X the maximum is in the downward trending portion, you could start the search there.
Hope that helps.
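A Python sketch of that backward march, with a comparison gap to tolerate noise (the gap of 10 and the toy data are assumptions):

import numpy as np

def find_break(y, gap=10):
    # March from the end toward the start. In the decay region, values grow
    # as we step backward; the first index where y[i - gap] <= y[i] means
    # the growth has stopped, i.e. we have reached the plateau.
    for i in range(len(y) - 1, gap - 1, -1):
        if y[i - gap] <= y[i]:
            return i - gap       # approximate break point
    return 0

# Toy data: a flat plateau of 300 points, then a noisy fall to zero.
rng = np.random.default_rng(1)
y = np.concatenate([np.full(300, 100.0), np.linspace(100.0, 0.0, 200)])
y += rng.normal(0.0, 1.0, y.size)
print(find_break(y))             # prints an index near the true break at 300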
It appears as though you want to detect when the slope changes from something near zero to something negative. One way to detect this is to calculate the 2nd derivative of the values (the slope of the slope). The 2nd derivative should be near zero in the flat portion of the data AND in the downward trending portion of the data, and it should go negative at the break point. So finding the minimum (most negative) value of the 2nd derivative should locate the break point.
To implement this, you probably will need to filter noise. So calculate the first derivative (slope) over some suitable window of data:
=SLOPE(B2:B26,A2:A26) filled down (a moving window of, say, 25 raw values, assuming x in column A and y in column B)
Then calculate the second derivative (slope of slope):
=SLOPE(C2:C26,A2:A26) filled down (a moving window of, say, 25 slope values, assuming the slopes from the previous step are in column C)
Then look for the minimum.
Hope that helps.
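The same approach as a Python sketch; the window of 25 comes from the answer above, and the synthetic data stands in for the real measurements:

import numpy as np

# Synthetic data: flat at 50 for 500 points, then an exponential decay.
rng = np.random.default_rng(2)
y = np.concatenate([np.full(500, 50.0), 50.0 * np.exp(-np.arange(300) / 60.0)])
y += rng.normal(0.0, 0.5, y.size)

win = 25
kernel = np.ones(win) / win
d1 = np.convolve(np.diff(y), kernel, mode="valid")   # windowed slope
d2 = np.convolve(np.diff(d1), kernel, mode="valid")  # slope of the slope

break_idx = int(np.argmin(d2))  # most negative curvature ~ the break point
print(break_idx)                # near the true break at 500, offset by ~the window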

Averaging many curves with different x and y values

I have several curves that contain many data points. The x-axis is time; say I have n curves whose data points fall at different times along the x-axis.
Is there a way to get an "average" of the n curves, despite the fact that the data points are located at different x-points?
I was thinking of maybe using a histogram to bin the values, but I am not sure what code to start with to accomplish something like this.
Can Excel or MATLAB do this?
I would also like to plot the standard deviation of the averaged curve.
One concern: the distribution of the x-values is not uniform. There are many more values close to t = 0, but at t = 5 (for example) the data points are much sparser.
Another concern: what happens if two values fall within one bin? I assume I would need to average those values before calculating the averaged curve.
I hope this conveys what I would like to do.
Any ideas on what code I could use (MATLAB, Excel, etc.) to accomplish my goal?
Since your series are not uniformly sampled, interpolating onto a common grid prior to computing the mean is one way to avoid biasing toward times where you have more frequent samples. Note that interpolation will likely reduce the range of your values, since the interpolated points aren't likely to fall exactly at the times of your measured extremes. This has a greater effect on the extreme statistics (e.g. the 5th and 95th percentiles) than on the mean. If you plan on going this route, you'll need the interp1 and mean functions.
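In Python the equivalent of interp1 is np.interp; here is a sketch with made-up curves (note that np.interp holds the end values outside each curve's sampled range, which is the range-truncation effect mentioned above):

import numpy as np

# Hypothetical curves, each sampled at its own times (denser near t = 0).
rng = np.random.default_rng(3)
curves = []
for _ in range(5):
    t = np.sort(rng.exponential(1.0, 80))
    t = t[t < 5.0]
    curves.append((t, np.exp(-t) + rng.normal(0.0, 0.05, t.size)))

grid = np.linspace(0.0, 5.0, 100)   # common time grid
stack = np.vstack([np.interp(grid, t, y) for t, y in curves])

mean_curve = stack.mean(axis=0)     # the "average" curve
std_curve = stack.std(axis=0)       # its standard deviation, point by point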
An alternative is to compute a weighted mean; this way you avoid truncating the range of your measured values. Assuming x is a vector of measured values and t is a vector of measurement times in seconds from some reference time, you can compute the time-weighted mean by:
timeStep = diff(t);                                          % duration each sample covers
weightedMean = sum(timeStep .* x(1:end-1)) / sum(timeStep);  % weighted by duration
As mentioned in the comments above, a sample of your data would help a lot in suggesting the appropriate method for calculating the "average".

Microsoft Excel: piece-wise least-squares fit with Solver. My Excel sheet produces right answers sometimes, wrong answers other times

I am trying to do a non-linear regression with data I have for my research. Since it is nonlinear, I can't use Simplex LP. Instead I am using GRG Nonlinear with upper and lower bounds on all parameters.
It is weird because Excel produces answers that are right sometimes and wrong other times. I have to manually change the parameters to arbitrary numbers, run Solver again, and hope it comes out right. Let me show you my Excel sheet.
https://drive.google.com/a/case.edu/file/d/0Bw0aJV0lW2eTaHFRUFhobVZ4NWs/edit?usp=sharing
Basically, I am looking to fit two straight lines to the data I have. The raw data can be divided into two portions, both linear. The point where the two lines cross is the Critical Value.
The correct output with my raw data is a Critical Value of 0.006707. The last time I ran it (the run saved in the Excel sheet), the Critical Value came out as 3.36E-06.
If it would help you to understand this better in context: I am measuring the surface tension of various systems. The critical value is the so-called Critical Micelle Concentration in my field.
Thanks Guys.
If you are using Excel Solver, I think it is falling into local optima. Restarting the solver after setting the variables to different random values may help it escape a local optimum. Before restarting the solver, record the value of the objective function and compare it with the new value found by the solver.
Alternatively, use a better solver outside Excel. There are plenty of open source solvers available on the Internet.
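For this particular two-line model you can also sidestep local optima entirely: scan every candidate split point, fit a line to each side by ordinary least squares, keep the split with the smallest total squared error, and intersect the two lines. A Python sketch with made-up data (the true critical value here is 0.4):

import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 60)
y = np.where(x < 0.4, 70.0 - 50.0 * x, 52.0 - 5.0 * x) + rng.normal(0.0, 0.3, x.size)

def two_line_fit(x, y, min_pts=3):
    best = None
    for k in range(min_pts, len(x) - min_pts):       # candidate split points
        a1, b1 = np.polyfit(x[:k], y[:k], 1)         # left line
        a2, b2 = np.polyfit(x[k:], y[k:], 1)         # right line
        sse = (np.sum((a1 * x[:k] + b1 - y[:k]) ** 2)
               + np.sum((a2 * x[k:] + b2 - y[k:]) ** 2))
        if best is None or sse < best[0]:
            best = (sse, a1, b1, a2, b2)
    _, a1, b1, a2, b2 = best
    return (b2 - b1) / (a1 - a2)   # where the lines cross: the critical value

print(two_line_fit(x, y))          # ~0.4 for this synthetic data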

Monte Carlo Simulation in Excel for Non-normal Distributions

I would like to simulate the performance of a baseball player. I know his expected performance for every future year and the standard deviations of those performances (based on regression analysis). At first I was thinking of using the NORMINV(RAND(),REF,REF) function in Excel, but the underlying distribution of baseball players' performances is dramatically right-skewed. Is there a way I can perform this sort of analysis in Excel or some other free or low-cost software? The end goal here is for the simulation to use the right-skewed distribution. Thanks very much.
R has lots of tools to do this sort of analysis, though you'd have to look through the docs to figure out how to use it. R is free, at least for non-commercial use.
If you have a cumulative distribution table (one that is evenly spaced and sufficiently detailed), you can easily generate random values from this distribution in Excel by looking up a uniform random number generated by RAND() in your distribution table and taking the corresponding "x-axis" value.
=OFFSET($A$1,MATCH(RAND(),$B$2:$B$102),0)
A1 is the cell just above the table of "x-axis" values.
B2:B102 is the cumulative distribution table.
This is a simplified example. Some small modifications may be needed to handle edge-cases and adjust for biases.
If you have enough empirical data you should be able to create the cumulative distribution table.
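The same inverse-CDF lookup works outside Excel too; here is a Python sketch with a made-up right-skewed table (np.interp adds linear interpolation between table rows, slightly smoother than the MATCH/OFFSET version):

import numpy as np

# Hypothetical empirical CDF table: x-axis values and cumulative probabilities.
xs = np.array([0.0, 5.0, 10.0, 20.0, 40.0, 80.0])    # right-skewed support
cdf = np.array([0.0, 0.30, 0.60, 0.85, 0.97, 1.0])   # must be non-decreasing

# Inverse-transform sampling: look up uniform draws in the CDF, then map
# back to the x-axis by interpolation.
rng = np.random.default_rng(5)
samples = np.interp(rng.random(10_000), cdf, xs)

print(samples.mean(), np.percentile(samples, [50, 90, 99]))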
