Trigonometric functions not always exact in Excel

I need to convert values between formats used by different applications, and one of the applications performs calculations in degrees, not radians.
Thus, the application always expects cos(90) to equal exactly 0.
But in Excel, there are rounding errors when I do cos(radians(90)).
How do I handle situations like this?
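One common way to handle this (not from the original thread, just a standard workaround) is to round the result to a sensible number of decimal places, or to treat anything smaller than a tolerance as zero. A minimal sketch in Excel formulas, assuming the angle in degrees sits in A1:

=ROUND(COS(RADIANS(A1)), 10)
=IF(ABS(COS(RADIANS(A1))) < 1E-12, 0, COS(RADIANS(A1)))

The first formula rounds away the tiny leftover error (on the order of 1E-16) that comes from representing pi/2 in floating point; the second leaves the value alone unless it is within 1E-12 of zero, in which case it snaps to exactly 0.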

Related

Limiting decimal precision in calculations

I currently have a very large dataset in SPSS where some variables have up to 8 decimal places. I know how to change the options so that SPSS only displays variables to 2 decimal places in the data view and output. However, SPSS still uses the full 8-decimal precision in all calculations. Is there a way to change the precision to 2 decimal places without having to manually change every case?
You'll have to round each of the variables. You can loop through them like this:
do repeat vr=var1 var2 var3 var4 var5.
compute vr=rnd(vr, 0.01).
end repeat.
If the variables are consecutive in the file, you can use the TO keyword like this:
do repeat vr=var1 to var5.
....
SPSS Statistics treats all numeric variables as double-precision floating point numbers, regardless of the display formats. Any computations (other than those in one older procedure, ALSCAL) are done in double precision, and you don't have any way to avoid that.
The solution that you've applied will not actually make any calculations use only two decimals of precision. You'll start from numbers rounded to approximately their two-decimal values, but most of those values aren't exactly expressible as double-precision floating-point numbers, so what you're actually using isn't quite what you're seeing.
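To see why, here is a quick illustration in MATLAB (used here only because it prints doubles conveniently; the same applies to SPSS's internal double-precision storage):

sprintf('%.20f', 0.12)   % prints 0.11999999999999999556 rather than exactly 0.12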

Why do Excel and Matlab give different results?

I have 352k values and I want to find the most frequent value among them.
The numbers are rounded to two decimal places.
I use the command mode(a) in MATLAB and MODE(B1:B352000) in Excel, but the results are different.
Where did I make a mistake, or which one can I believe?
Thanks
//edit: When I use other commands like average, the results are the same.
From Wikipedia:
For a sample from a continuous distribution, such as [0.935..., 1.211..., 2.430..., 3.668..., 3.874...], the concept is unusable in its raw form, since no two values will be exactly the same, so each value will occur precisely once. In order to estimate the mode of the underlying distribution, the usual practice is to discretize the data by assigning frequency values to intervals of equal distance, as for making a histogram, effectively replacing the values by the midpoints of the intervals they are assigned to. The mode is then the value where the histogram reaches its peak. For small or middle-sized samples the outcome of this procedure is sensitive to the choice of interval width if chosen too narrow or too wide
Thus, it is likely that the two programs use different interval sizes, yielding different answers. You can believe both (I presume), as long as you keep in mind that the value returned is an approximation to the true mode of the underlying distribution.
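To make the binning point concrete, here is a small MATLAB sketch (the 0.05 bin width and the variable name a are assumptions for illustration):

rawMode = mode(a);                                   % most frequent exact value, for comparison
[counts, edges] = histcounts(a, 'BinWidth', 0.05);   % equal-width bins; 0.05 is arbitrary
[~, k] = max(counts);
binnedMode = (edges(k) + edges(k+1)) / 2;            % midpoint of the fullest bin

Changing the 0.05 bin width will often move binnedMode, which is the sensitivity the quoted passage describes.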

Excel Interpolate with logarithmic prediction

Is there a function within Excel to Interpolate while taking into account a logarithmic prediction?
At the moment I am using linear interpolation but would like to find a better way to fill in the blanks if possible.
There's no logarithmic regression or interpolation in Excel, even in the Analysis ToolPak. You'll need much more advanced software for that, such as MATLAB.
If you're stuck working in Excel... here's a possible mathematical solution:
Rather than working with the raw data x and y, try plotting x against a^y, where a is the base. (Or plot log(x, a) against y.) If you have the correct base a (and there's no vertical offset), you will then have a linear relationship, so you can perform a linear interpolation as normal and then convert the interpolated values back to actual y values by taking the base-a log of them.
If you don't know what a is, you can instead fit a line for an arbitrary a, compute the residuals, and then use Solver to adjust a until the sum of squared residuals is as small as possible, at which point you have the best estimate of a.
Similarly, if there is a vertical offset b, treat it as another value to adjust and plot x against a^(y-b); look for the combination of a and b that gives a linear relationship.
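As a concrete sketch of the transform-and-interpolate idea (the cell layout is an assumption: x in A2:A20, y in B2:B20, the candidate base a in F1, and the new x value in F2):

In C2 (filled down):  =$F$1^B2
In F3:  =FORECAST.LINEAR($F$2, C2:C20, A2:A20)
In F4:  =LOG(F3, $F$1)

Column C holds the transformed values a^y, F3 is the straight-line estimate of a^y at the new x, and F4 converts that estimate back to a y value. Note that FORECAST.LINEAR fits one line through all of the transformed points, so it matches interpolation only when the transformed data really are linear; older Excel versions use FORECAST with the same arguments.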

Averaging many curves with different x and y values

I have several curves, each containing many data points. The x-axis is time; say I have n curves, each with data points at its own set of times on the x-axis.
Is there a way to get an "average" of the n curves, despite the fact that the data points are located at different x-points?
I was thinking maybe something like using a histogram to bin the values, but I am not sure which code to start with that could accomplish something like this.
Can Excel or MATLAB do this?
I would also like to plot the standard deviation of the averaged curve.
One concern: the distribution of x-values is not uniform. There are many more values close to t=0, but at t=5 (for example) the data points are much sparser.
Another concern: what happens if two values fall within one bin? I assume I would need to average those values before calculating the averaged curve.
I hope this conveys what I would like to do.
Any ideas on what code I could use (MATLAB, EXCEL etc) to accomplish my goal?
Since your series are not uniformly sampled, interpolating onto a common grid before computing the mean is one way to avoid biasing towards times where you have more frequent samples. Note that interpolation will tend to reduce the range of your values, because the interpolated points are unlikely to fall exactly at the times of your measured extremes; this affects the extreme statistics (e.g. 5th and 95th percentiles) more than the mean. If you go this route, you'll need the interp1 and mean functions.
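A minimal MATLAB sketch of that approach (the cell arrays tCell and xCell holding each curve's times and values, and the common grid, are assumptions for illustration):

% tCell{i}, xCell{i}: times and values of curve i (assumed storage layout)
nCurves = numel(tCell);
tCommon = linspace(min(cellfun(@min, tCell)), max(cellfun(@max, tCell)), 200); % common time grid
xGrid = nan(nCurves, numel(tCommon));
for i = 1:nCurves
    xGrid(i, :) = interp1(tCell{i}, xCell{i}, tCommon);  % NaN outside each curve's own range
end
meanCurve = mean(xGrid, 1, 'omitnan');    % average across curves at every common time
stdCurve  = std(xGrid, 0, 1, 'omitnan');  % spread across curves, for error bars
errorbar(tCommon, meanCurve, stdCurve)

Using 'omitnan' means that times outside a given curve's range simply don't contribute to the average at that point.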
An alternative is to do a weighted mean. This way you avoid truncating the range of your measured values. Assuming x is a vector of measured values and t is a vector of measurement times in seconds from some reference time, you can compute the weighted mean by:
timeStep = diff(t);                                         % duration between consecutive samples
weightedMean = sum(timeStep .* x(1:end-1)) / sum(timeStep); % time-weighted average of the values
As mentioned in the comments above, a sample of your data would help a lot in suggesting the appropriate method for calculating the "average".

Microsoft Excel. Piecewise Least-Squares Fit with Solver. My Excel sheet produces right answers sometimes, wrong answers other times

I am trying to do a non-linear regression with data I have for my research. Since it is nonlinear, I can't use Simplex LP. Instead I was doing GRG Nonlinear with upper and lower bounds on all parameters.
It is strange: Solver produces answers that are sometimes right and other times wrong. I have to manually change the parameters to arbitrary numbers, run Solver again, and hope that it comes out right. Let me show you my Excel sheet.
https://drive.google.com/a/case.edu/file/d/0Bw0aJV0lW2eTaHFRUFhobVZ4NWs/edit?usp=sharing
Basically, I am trying to fit two straight lines to the data I have. The raw data can be divided into two portions, both linear. The point at which the two lines cross is the Critical Value.
The correct output with my raw data is a Critical Value of 0.006707. The last time I ran it, which is the run saved in the Excel sheet, the Critical Value came out as 3.36E-06.
If it helps to understand this in context, I am measuring the surface tension of various systems. The critical value is the so-called Critical Micelle Concentration in my field.
Thanks Guys.
If you are using the Excel Solver, I think it is getting stuck in a local optimum. Restarting the solver after setting the variables to random values may help it escape local optima. Before restarting the solver, record the value of the objective function and compare it with the new value found by the solver.
Alternatively, use a better solver outside Excel. There are plenty of open source solvers available on the Internet.
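As an illustration of both suggestions, here is a minimal MATLAB sketch of a random-restart fit of two crossing lines (the piecewise model, the parameter scaling, and the number of restarts are assumptions, not taken from the original workbook):

% x, y: measured data as column vectors; p = [m1 b1 m2 b2] are the two line parameters
crossing = @(p) (p(4) - p(2)) / (p(1) - p(3));         % x where the two lines intersect
model    = @(p) (x <  crossing(p)) .* (p(1)*x + p(2)) ...
              + (x >= crossing(p)) .* (p(3)*x + p(4)); % line 1 below the crossing, line 2 above
sse      = @(p) sum((y - model(p)).^2);                % objective: sum of squared residuals

bestSSE = inf;
for k = 1:50                                           % 50 random restarts (arbitrary choice)
    [p, fval] = fminsearch(sse, randn(1, 4));          % in practice, scale the starts to your data
    if fval < bestSSE
        bestSSE = fval;
        bestP = p;
    end
end
criticalValue = crossing(bestP);                       % best estimate of the crossing point

Comparing bestSSE across restarts plays the same role as recording the objective value before re-running Solver.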
