I have X,Y,Z 3D point data I would like to average the Z value by an X/Y Grid in Excel - excel

I am working in Excel to analyze planar flatness measurements over time. I have 3 columns of X, Y, Z point data from a 3D measurement. I want to set up a grid based on the X/Y locations, with each grid square 10" x 10", and get the average Z value of all points that fall within each grid square.
My measurements cover 200" in X and 100" in Y, which gives me 10 squares by 20 squares. I use A-J and 1-20 to name the individual cells A1, A2, ... B1, B2, ... J19, J20.
Right now I have this:
D2:D201 are the Z values for the 200 points I have as an example
B2:B201 are the X values
C2:C201 are the Y values
A1 =(SUMIFS(D2:D201,B2:B201,">0",B2:B201,"<=10",C2:C201,">0",C2:C201,"<=10")/COUNTIFS(B2:B201,">0",B2:B201,"<=10",C2:C201,">0",C2:C201,"<=10"))
This gives me the average of all the Z values for points that fall in the X=0-10 and Y=0-10 Grid.
But I have to manually edit the >X, <=X+10 and >Y, <=Y+10 criteria for all 200 individual grid squares. Also, this doesn't work if I have more points in one measurement than another, because I would then have to edit 200 individual data ranges to compare one data set to another.
Is there a way to automate this so that I can enter variables like "Columns=20" "Rows=10" "Grid size = 10" and get all of that data?
The point of all this is to compare data sets that don't have perfectly matching X and Y values for a given Z value, to see trends. So I might have a grid C10 that has data from Set A with a point X25.014, Y100.106, Z.010 and another set of data (Set B) with a similar point X25.015, Y100.150, Z.011, and I would like to see a trend of Z=+.001 for grid C10 from whatever sheet I put all of the data into.
So I will also accept any alternative way to get that.
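As one possible alternative outside Excel, the grid averaging can be sketched in Python. This is only a minimal sketch under stated assumptions: the function names `grid_average` and `grid_trend` are hypothetical, the sample points are made up, columns are assumed to run along X and rows along Y, and each square uses the same "greater than lower edge, less than or equal to upper edge" rule as the SUMIFS criteria above.

```python
import math
from collections import defaultdict

def grid_average(points, grid_size=10.0, columns=20, rows=10):
    """Return {(column, row): mean Z} with 1-based grid indices."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, z in points:
        # Same bin edges as the SUMIFS criteria: lower < value <= upper.
        col = math.ceil(x / grid_size)
        row = math.ceil(y / grid_size)
        if 1 <= col <= columns and 1 <= row <= rows:
            sums[(col, row)] += z
            counts[(col, row)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

def grid_trend(avg_a, avg_b):
    """Difference in mean Z (B minus A) for squares present in both sets."""
    return {cell: avg_b[cell] - avg_a[cell] for cell in avg_a.keys() & avg_b.keys()}

# Made-up sample points (x, y, z), for illustration only.
set_a = [(3.0, 4.0, 0.002), (9.5, 1.0, 0.004), (25.014, 95.106, 0.010)]
set_b = [(25.015, 95.150, 0.011)]
avg_a = grid_average(set_a)
avg_b = grid_average(set_b)
trend = grid_trend(avg_a, avg_b)  # only square (3, 10) appears in both sets
```

Because the point lists are passed in whole, data sets with different numbers of points need no range editing, and the grid dimensions are plain parameters.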

Related

Excel - max between 2 series

I have 2 series of data. For the sake of simplicity, let's say the data looks like below,
set 1:
1 3
2 3.5
3 4
4 4.5
5 5
6 5.5
7 6
8 6.5
9 7
10 7.5
set 2:
1.5 2
2.8 4.5
3.5 8
4.5 6
5.5 4.8
6.5 4
7.5 6.5
8.5 9
9.5 3
10.5 4
After charting these 2 sets, I want to get the line with the higher data, i.e. the black line in the attached picture. How do I get that? My actual data has thousands of data points, so doing this manually isn't possible.
Added later: Another thing I forgot to mention, in my actual data 1 set has about 500 x,y values, and the other set has about 50 values. Though the end points have same/similar x values.
Thanks for your help.
Given your information about the chart and the tables, I would do something like this:
The new series will be based on two formulas:
In Column H, I have the formula for the max value (between your two series):
=MAX(B2,E2)
In Column G, I have the formula that, based on the max value (formula above), decides which X value to use (the X value from Series 1 or Series 2).
=IF(H2=B2,A2,D2)
Then I can plot my graph:
Series 1, Column B
Series 2, Column E
Series 3, Column H.
All series use the X values of Column G.
Introduction
A few assumptions/comments/pitfalls/constraints regarding my solution:
Set 1 and Set 2 are in columns A till D.
The combined data set will combine the x-values of both Sets, and will have additional data points where the lines cross.
It involves several helper columns, in particular to allow you to copy/paste this across multiple worksheets with data.
I did not try to condense too much, to improve readability, and probably some helper columns could be combined.
It was tested with the data set from the question, but difficult to guarantee all "boundary" conditions, e.g. identical data points between Set 1 and Set 2, zero overlap between the two data sets, empty data sets, etc. (I did test some of these, see my comments at the end).
Set 1 and Set 2 must be sorted (on x-values). If this is not the case, a few additional helper columns are needed to sort the data dynamically.
To better understand the solution described below, see herewith the resulting graph, based on the data set in the question (although I added one data point [2.5;3.75] to avoid having the data points of Set 1 and Set 2 perfectly alternating):
General solution outline / methodology
Combine both datasets in a single (sorted) column;
For all x-values, determine highest y-value, between the y-value in the Set, and the calculated y-value on the line segment from the neighboring values in the other Set (looks simple, in particular with the given example data set, but this is quite tricky to do when data sets have no alternating x-values);
Find the points (x & y values) where the lines of the graph are crossing (intersecting), let's call this Set 3;
Combine and sort (on x-values) the three data sets in two columns (for x & y values).
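The outline above can also be sketched in Python to make the logic easier to follow. This is a minimal sketch, not the spreadsheet implementation: the function names `interpolate` and `upper_envelope` are hypothetical, both sets are assumed to be sorted on x with at least two points and no repeated x-values, and the sample data is made up.

```python
def interpolate(series, x):
    """Linear interpolation on (x, y) pairs sorted by x; None outside the x-range."""
    for (x0, y0), (x1, y1) in zip(series, series[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return None

def upper_envelope(set1, set2):
    """Points of the higher line, including the crossings between the two sets."""
    # Step 1: combine the x-values of both sets in one sorted list.
    xs = sorted({x for x, _ in set1} | {x for x, _ in set2})
    # Step 2: at every x, evaluate both lines (None where a set has no data).
    rows = [(x, interpolate(set1, x), interpolate(set2, x)) for x in xs]
    result = []
    for i, (x, a, b) in enumerate(rows):
        result.append((x, max(v for v in (a, b) if v is not None)))
        if i + 1 < len(rows):
            nx, na, nb = rows[i + 1]
            if None not in (a, b, na, nb):
                d0, d1 = a - b, na - nb
                # Step 3: a sign change means the lines cross inside this interval.
                if d0 * d1 < 0:
                    t = d0 / (d0 - d1)
                    result.append((x + t * (nx - x), a + t * (na - a)))
    # Step 4: the result is already sorted on x.
    return result

# Made-up example: two straight lines that cross at (1, 1).
envelope = upper_envelope([(0, 0), (2, 2)], [(0, 2), (2, 0)])
```

Where the two sets do not overlap in x, the sketch simply follows whichever set has data there.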
The details and formulas
For the formulas, I assume row 1 contains headings, and the data start on row 2. All formulas should be entered in row 2, except for a few that I mention should go in row 3 (because they need data from the preceding row). The result is in columns E (x-values) and F (y-values); G till AG are helper columns.
Column E : =INDEX(AE$2:AE$30;MATCH(ROWS(AE$2:AE2);$AG$2:$AG$30;0)) This is the actual result. Gets all x-values in AE and sorts these based on the index column AG; this should actually be the last column in the logical flow, but for presentation purposes it is cleaner to have this next to the input data sets;
F : =INDEX(AF$2:AF$30;MATCH(ROWS(AF$2:AF2);$AG$2:$AG$30;0)) Same for y-values;
G : =IF(ISNA(H2);NA();COUNTIF($H$2:$H$30;"<="&H2)) Creates index to sort combined x-values of both data sets. You also can dynamically sort without such helper column, but then you need a VLOOKUP or INDEX/MATCH and with long decimal numbers I have some bad experiences with these;
H : =IF(ROW()-1<=COUNT($A$2:$A$30);A2;IF((ROW()-1)<=(COUNT($A$2:$A$30)+COUNT($C$2:$C$30));INDEX($C$2:$C$30;ROW()-COUNT($A$2:$A$30)-1;1);NA())) Combines x-values of both data sets, i.e. in columns A & C;
I : =IF(ROW()-1<=COUNT($B$2:$B$30);B2;IF((ROW()-1)<=(COUNT($B$2:$B$30)+COUNT($D$2:$D$30));INDEX($D$2:$D$30;ROW()-COUNT($B$2:$B$30)-1;1);NA())) Same for the y-values;
J : =IF(ROW()-1<=COUNT($A$2:$A$30);"S1";IF((ROW()-1)<=(COUNT($A$2:$A$30)+COUNT($C$2:$C$30));"S2";NA())) Assign "S1", or "S2" to each data point, as indication from which data set they come;
K : =IF(J2=J3;INTERCEPT(I2:I3;H2:H3);NA()) Determines the intercept of the line segment starting at that data point;
L : =IF(J2=J3;SLOPE(I2:I3;H2:H3);NA()) Same for slope;
M : =INDEX(H$2:H$30;MATCH(ROWS(H$2:H2);$G$2:$G$30;0)) Sorts all x-values;
N : =INDEX(I$2:I$30;MATCH(ROWS(I$2:I2);$G$2:$G$30;0)) Same for y-values
O : =INDEX(J$2:J$30;MATCH(ROWS(J$2:J2);$G$2:$G$30;0)) Same for corresponding "S1/S2" value to indicate from which data set they come;
P : =INDEX(K$2:K$30;MATCH(ROWS(K$2:K2);$G$2:$G$30;0)) Same for intercept;
Q : =INDEX(L$2:L$30;MATCH(ROWS(L$2:L2);$G$2:$G$30;0)) Same for slope;
R : =IF(O2="S1";"S2";"S1") Inversion between S1 & S2.
S : {=IFERROR(INDEX($O$2:$Q2;MAX(IF($O$2:$O2=$R3;ROW($O$2:$O2)-ROW(INDEX($O$2:$O2;1;1))+1));2);NA())} Array formula to be put in cell S3 (hence ctrl+shift+enter) that will search for the intercept of the preceding data point of the other data set.
T : {=IFERROR(INDEX($O$2:$Q2;MAX(IF($O$2:$O2=$R3;ROW($O$2:$O2)-ROW(INDEX($O$2:$O2;1;1))+1));3);NA())} Same for slope;
U : =IF(OR(ISNA(N2);NOT(ISNUMBER(S2)));NA();M2*T2+S2) Calculates the y-value on the line segment of the other data set;
V : =MAX(IFNA(U2;N2);N2) Maximum value between the original y-value and the calculated y-value on the corresponding line segment of the other data set;
W : =(V2=N2) Checks whether the y-value comes from the original data set or not;
X : =IF(O2="S1";IF(W2;"S1";"S2");IF(W2;"S2";"S1")) Determines on which data set (line) the y-value sits (S1 or S2);
Y : =IFERROR(AND((X2<>X3);COUNTIF(X3:$X$30;X2)>0);FALSE) Determines when the data sets cross (i.e. the lines on the graph intersect);
Z : =IF(Y2;(S2-P2)/(Q2-T2);NA()) Calculates x-value of intersection;
AA : =IF(Y2;Z2*Q2+P2;NA()) Calculates y-value of intersection;
AB : =COUNTIF($Z$2:$Z$30;"<="&Z2) Index to sort the newly calculated intersection points (I sort them because then combining them with the other data sets is straightforward, re-using the formula of column H);
AC : =INDEX(Z$2:Z$30;MATCH(ROWS(Z$2:Z2);$AB$2:$AB$30;0)) Sorted x-values of intersection points;
AD : =INDEX(AA$2:AA$30;MATCH(ROWS(AA$2:AA2);$AB$2:$AB$30;0)) Same for y-values;
AE : =IF(ROW()-1<=COUNT(M$2:M$30);M2;IF((ROW()-1)<=(COUNT(M$2:M$30)+COUNT(AC$2:AC$30));INDEX(AC$2:AC$30;ROW()-COUNT(M$2:M$30)-1;1);NA())) Combine x-values of Set 1, Set 2, and the intersection points;
AF : =IF(ROW()-1<=COUNT(V$2:V$30);V2;IF((ROW()-1)<=(COUNT(V$2:V$30)+COUNT(AD$2:AD$30));INDEX(AD$2:AD$30;ROW()-COUNT(V$2:V$30)-1;1);NA())) Same for y-values;
AG : =IF(ISNA(AE2);NA();COUNTIF($AE$2:$AE$30;"<="&AE2)) Creates the index to sort the resulting data set (this is used to calculate the final results in columns E & F).
All formulas go until row 30, but this needs to be changed of course based on the actual data sets. The idea is to add these formulas to one worksheet, and then columns E > AG can be copied to all other worksheets. There are obviously quite a few #N/A values, but these are on purpose, and are not errors or mistakes. On request, I can share the actual spreadsheet, so you do not have to retype all formulas.
Some additional comments
You have to modify some formulas (the sort indices) if there are identical x-values, either within Set 1 (which I will not cover here, as this seems unlikely, or would be a data input error), or between Set 1 and Set 2. The dynamic sorting does not work in that case. A workaround is to create a "synthetic" sort column, e.g. with =TEXT(H2;"0000.00000000000")&J2. This formats all numbers the same way as text, and appends S1 or S2. So this should give unique sort values, which would sort the same way as the corresponding numbers.
Empty data sets or data sets with only 1 value are not treated correctly either (the intercept formulas and finding values for the "previous" data point are meaningless in these cases).
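The "synthetic" sort column idea from the comments above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the helper name `sort_key` is hypothetical, the sample points are made up, and x-values are assumed to be non-negative and below 10000 so that they fit the fixed-width format.

```python
def sort_key(x, label):
    """Fixed-width text key: zero-padded x-value followed by the set label."""
    # 16 characters wide with 11 decimals, like the "0000.00000000000" format.
    return f"{x:016.11f}{label}"

# Made-up points; the x-value 2.5 occurs in both sets.
points = [(2.5, "S2"), (10.0, "S1"), (2.5, "S1"), (1.5, "S2")]
keys = [sort_key(x, label) for x, label in points]

# Rank of each key, counted the same way as COUNTIF(range; "<=" & key).
ranks = [sum(other <= key for other in keys) for key in keys]

# Sorting on the text keys gives the same order as sorting on the numbers,
# with the set label breaking the tie between identical x-values.
ordered = [point for _, point in sorted(zip(keys, points))]
```

Because every key is unique, every rank is unique as well, which is what the sort index needs.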

Excel scatter plot

Hello and good day to all
I have a question related to an Excel graph. I have a set of values like this
Now I want to plot these values as a scatter graph. I want to draw it in such a way that the X values stay the same and there are 4 different Y plots, i.e. A, B, C, D, in a single graph. Meaning I don't want to merge these Y values on a single X value. Is there any way to do that? I thank you for your time and help.
The data does not make much sense. All X values are exactly the same, so all data markers will be at exactly the same X position.
To create a scatter chart with multiple series:
- select the data from A1 to B11
- insert a scatter chart
- select the B, C, D ranges (C1 to E11) and copy
- select the chart and use Paste Special > insert as new series with series name in first row
Since all X values are the same, many data points overlap and are not distinguishable.
Edit after comment: if you do NOT want to plot the values at their true X position (i.e. the value 1.4), then use four pairs of X / Y coordinates. Use X values 1 to 4 with A to D. Then use text boxes to replace the X axis labels or hide the labels and show the legend instead.

Scatter plot for variable number of rows and specific columns

I want to create an automated scatter plot. This is the first example table based on the step size I end up measuring A, B, C, D for a specific frequency. In this scatter plot I created manually you can see I want to plot C v/s A for a particular frequency.
But I need to do this automatically as based on the step size the number of row can change. Here, since the step size decreased the number of samples increased, and now the scatter plot needs to update number of A and C values it plots.
Is there a formula I can use without using any macros?
The relation between the step size and frequency is (number of samples of a single frequency = (360/step size)) so for a step size of 60 you will have in reality six entries of frequency 100 and six of 200 .
You can use formulas to define chart ranges if you hide the formulas in named ranges. Combine that with the fact that #N/A values are not plotted and you can get this to work without VBA.
For your example graph you could define two named ranges as follows:
Name: A_100
Refers To: =IF(Sheet1!$E$3:$E$100=100,OFFSET(Sheet1!$A$3,0,0,360/Sheet1!$B$1,1),NA())
and
Name: C_100
Refers To: =IF(Sheet1!$E$3:$E$100=100,OFFSET(Sheet1!$C$3,0,0,360/Sheet1!$B$1,1),NA())
Then set the X and Y axis of the chart to SheetName!A_100 and SheetName!C_100
The IF statement filters out all the points that are not at frequency 100. If you have a formula for selecting the frequency, replace "Sheet1!$E$3:$E$100=100" with that.
The offset function takes the first cell in the column and expands the number of rows according to your 360/step size formula.
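The logic of the two named ranges can be mirrored in a short Python sketch, which may help when checking what the chart will receive. This is an illustration only: the function name `chart_values` is hypothetical, the sample data is made up, and NaN stands in for Excel's #N/A.

```python
import math

def chart_values(values, frequencies, target_frequency, step_size):
    """Keep values at the target frequency; everything else becomes NaN."""
    # Number of rows covered by the OFFSET range: 360 / step size.
    samples = int(360 / step_size)
    out = []
    for i, (value, frequency) in enumerate(zip(values, frequencies)):
        keep = i < samples and frequency == target_frequency
        out.append(value if keep else math.nan)
    return out

# Made-up data: step size 60 gives six rows at 100 followed by six rows at 200.
values = list(range(1, 13))
frequencies = [100] * 6 + [200] * 6
a_100 = chart_values(values, frequencies, 100, 60)
```

As in the named ranges, the rows that are filtered out stay in place as placeholders that are not plotted, so the X and Y series remain aligned.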

How to put two x,y coordinate values as one in scatter chart in excel

I want to do something like this
I want to make a point x,y coordinate for one and x,y for another and show two points in a single scatter chart.
I am unable to figure out how to do this
x y x2 y2
1 1 1 8
From data above I need to show two points (1,1) and (1,8)
Depending on your data layout, the data points can be added as one series or as two series. Steps vary with your Excel version.
Try this: click an empty cell that has no data in neighbor cells. Click Insert > XY Scatter chart. This will create a blank XY chart. Now add the series.
Right-click the chart > Select Data > Add > select the range(s) for the X and Y values. Repeat for any other series.
If the data is in one contiguous table, you only need one series.

How do I plot just the outer edge of a curve in Excel?

I have data that looks like this:
Risk Return
9.2 6.5
7.8 3.4
6.4 5.2
.
.
.
.
10.2 6.4
I created a scatter plot that looks like this:
How can I get just the outer edge of the curve? I'd like it to look like this:
Maybe pick off the co-ordinates for points that are nearest to the upper boundary, plot just those co-ordinates as a separate series and add a suitable trend line. Or create the envelope values from the formula shown by plotting the selection with Display Equation on chart. Simplified example:
Let's say your Risk is in column A rows 2-100, and Return is in column B rows 2-100.
Create a column C with this formula (it's an array formula so use Ctrl-Shift-Enter):
=MAX(IF($A$2:$A$100<=A2,$B$2:$B$100))=B2
Copy down to all of column C. This formula says: get the max return of all points with the same or lower risk than this point. If that return is equal to this point's return, then TRUE.
Now create columns D and E and with formulas like:
=IF(C2,A2,0)
=IF(C2,B2,0)
This will 0 out the points that are not part of the frontier, you can then plot these points.
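The same frontier test can be sketched in Python. This is a minimal sketch: the function name `frontier_flags` is hypothetical, only the four rows visible in the question are used, and the non-frontier points are dropped rather than set to 0.

```python
def frontier_flags(risks, returns):
    """True where a point has the best return among points with equal or lower risk."""
    flags = []
    for risk, ret in zip(risks, returns):
        best = max(r for k, r in zip(risks, returns) if k <= risk)
        flags.append(best == ret)
    return flags

# The four rows shown in the question (the elided rows are not included).
risks = [9.2, 7.8, 6.4, 10.2]
returns = [6.5, 3.4, 5.2, 6.4]
flags = frontier_flags(risks, returns)

# Keep only the frontier points, sorted by risk so they plot as one line.
frontier = sorted((k, r) for k, r, keep in zip(risks, returns, flags) if keep)
```

Dropping the points instead of zeroing them avoids plotting markers at (0, 0).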
