Exporting Data in X,Y,Z format - excel

I have a number of Excel files containing coordinate data that is not normalized.
A single row describes multiple points.
Invariably, one column gives the X value. As an illustration, some files have, say, 3 columns representing Y values and one column representing Z:
X Y1 Y2 Y3 Z
Other files have multiple Y and Z columns, in which point sets may share Y or Z columns.
There may be as many as 40 columns in a table and there are usually about 300 rows.
I need to export the data in CSV format with X, Y, Z values. Is there some relatively simple way to generate a sheet containing X, Y, Z coordinates from such data so that it can be exported?
So that
X Y1 Y2 Y3 Z
becomes
X, Y1, Z ; multiple rows
X, Y2, Z ; multiple rows
X, Y3, Z ; multiple rows
Such that the x and z values are repeated one for each set of Y values.
There is one other format that I have to convert. One column contains the X values, one row contains the Y values, and each Z value sits in the cell where the row of its X value meets the column of its Y value.
Complicating all this is that there may be omitted or missing values in some of the cells.
This appears to be the same problem in python
Convert datafiles 'X' 'Y' 'Z' 'data' format
In response to ANDY:
A sheet may be laid out as columns:
X Y1 Y2 Z1 Y3 Y4 Y5 Z2
The coordinate sets are:
X, Y1, Z1
X, Y2, Z1
X, Y3, Z2
X, Y4, Z2
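For comparison, the reshaping described above can be sketched outside Excel in a few lines of Python. This is only a sketch under stated assumptions: the function name unpivot_xyz, the sample numbers, and the idea of passing the (Y column, Z column) pairing explicitly are all illustrative, and the rows are assumed to have already been read from the sheet into plain lists. Rows with a blank X, Y or Z for a given pair are skipped.

```python
import csv
import io


def _blank(value):
    """Treat None and empty strings as missing cells."""
    return value is None or value == ""


def unpivot_xyz(header, rows, pairs, x_col="X"):
    """Return (x, y, z) tuples, one block of rows per (y_col, z_col) pair."""
    idx = {name: i for i, name in enumerate(header)}
    out = []
    for y_col, z_col in pairs:
        for row in rows:
            x, y, z = row[idx[x_col]], row[idx[y_col]], row[idx[z_col]]
            if _blank(x) or _blank(y) or _blank(z):
                continue  # omitted value: no point for this pair in this row
            out.append((x, y, z))
    return out


# Column layout and pairing taken from the post; the numbers are made up.
header = ["X", "Y1", "Y2", "Z1", "Y3", "Y4", "Y5", "Z2"]
rows = [
    [1, 10, 20, 100, 30, 40, 50, 200],
    [2, 11, None, 101, 31, 41, 51, 201],  # Y2 is missing in this row
]
pairs = [("Y1", "Z1"), ("Y2", "Z1"), ("Y3", "Z2"), ("Y4", "Z2")]

points = unpivot_xyz(header, rows, pairs)

# Write the result as CSV text (a real script would write to a file instead).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["X", "Y", "Z"])
writer.writerows(points)
csv_text = buffer.getvalue()
```

Keeping the pairing as data rather than hard-coding it means the same function can cover sheets where several Y columns share one Z column, or the other way round.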

Related

How to use VBA to incorporate functions to avoid occupation of extra cells?

This spreadsheet calculates distances between atoms, and we want to improve the formulas so that they do not occupy extra columns. (Atom coordinates are given in columns A to D, and the atom pair whose distance should be calculated is given in columns F to G.)
Currently in the first step, coordinates of specified atoms are picked up in columns I to O. e.g. Cell I4 is filled with the function:
=VLOOKUP($F4,$A$4:$E$1023,2,FALSE)
and then in the next step, the distance could be resolved in Column Q with Euclidean distance formula on the coordinates picked up. e.g. Cell Q4 is:
=SQRT(POWER((I4-M4),2)+POWER((J4-N4),2)+POWER((K4-O4),2))
Once the two atoms are specified, the distance between them is fully determined. Is it therefore possible to write a VBA function that combines these formulas and removes the intermediate steps from columns I to O? (These columns will be needed for other purposes in the future, and readability would be terrible if we put, for example, the six VLOOKUP functions directly into the final SQRT formula.)
I'm new to VBA. Any help would be appreciated. Thanks!
The original data in this spreadsheet is as below: (From the third line)
Atom_No X_coordinate Y_coordinate Z_coordinate Atom_No1 Atom_No2 X1 Y1 Z1 X2 Y2 Z2 Distance
1 2.35739851 13.17160225 4.022993565 4 2 3.827347994 9.501971245 8.374602318 4.403610706 11.14351559 6.991936684 2.222276039
2 4.403610706 11.14351559 6.991936684 3 2 0.721047342 12.58075523 2.64032793 4.403610706 11.14351559 6.991936684 5.879067059
3 0.721047342 12.58075523 2.64032793 1 4 2.35739851 13.17160225 4.022993565 3.827347994 9.501971245 8.374602318 5.879068118
4 3.827347994 9.501971245 8.374602318 2 1 4.403610706 11.14351559 6.991936684 2.35739851 13.17160225 4.022993565 4.13699687
… … … … 3 1 0.721047342 12.58075523 2.64032793 2.35739851 13.17160225 4.022993565 2.22227577
Finally, these two functions work and give the correct result when the coordinate table and the Atom_No1/Atom_No2 pairs are laid out as in the table above. Load both functions into a module, then enter in column R (e.g. cell R4):
=AtomsDistance(F4,G4)
You'll find the distance is the same as the one in column Q.
Function CoordinateVLookUp(Atom_No As Integer, CoordinateTableRange As Range, Column As Integer, isFuzzy As Boolean) As Double
    'Find one coordinate of an atom: Column selects x (2), y (3) or z (4) within CoordinateTableRange.
    Dim myResult As Variant
    'Application.VLookup returns an error value when nothing is found, so IsError can test it.
    '(Application.WorksheetFunction.VLookup would raise a run-time error instead.)
    myResult = Application.VLookup(Atom_No, CoordinateTableRange, Column, isFuzzy)
    If IsError(myResult) Then
        MsgBox ("No result found.")
    Else
        CoordinateVLookUp = myResult
    End If
End Function

Function AtomsDistance(Atom_No1 As Integer, Atom_No2 As Integer) As Double
    'Call CoordinateVLookUp to get the x, y, z coordinates of both atoms,
    'then calculate the distance with the Euclidean distance formula.
    Dim x1 As Double
    Dim y1 As Double
    Dim z1 As Double
    Dim x2 As Double
    Dim y2 As Double
    Dim z2 As Double
    Dim CoordinateTableRange As Range
    Set CoordinateTableRange = Range("A4:E1023") 'Set is required when assigning a Range object
    x1 = CoordinateVLookUp(Atom_No1, CoordinateTableRange, 2, False)
    y1 = CoordinateVLookUp(Atom_No1, CoordinateTableRange, 3, False)
    z1 = CoordinateVLookUp(Atom_No1, CoordinateTableRange, 4, False)
    x2 = CoordinateVLookUp(Atom_No2, CoordinateTableRange, 2, False)
    y2 = CoordinateVLookUp(Atom_No2, CoordinateTableRange, 3, False)
    z2 = CoordinateVLookUp(Atom_No2, CoordinateTableRange, 4, False)
    AtomsDistance = Math.Sqr((x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2) + (z1 - z2) * (z1 - z2))
End Function
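As a cross-check outside Excel, the same lookup-then-distance idea can be sketched in Python. The dictionary coords and the function atoms_distance are illustrative names, not part of the workbook; the coordinates are the four atoms from the table above.

```python
import math

# Atom number -> (x, y, z), copied from the first four rows of the table.
coords = {
    1: (2.35739851, 13.17160225, 4.022993565),
    2: (4.403610706, 11.14351559, 6.991936684),
    3: (0.721047342, 12.58075523, 2.64032793),
    4: (3.827347994, 9.501971245, 8.374602318),
}


def atoms_distance(atom_no1, atom_no2, table=coords):
    """Euclidean distance between two atoms; raises KeyError for unknown atoms."""
    return math.dist(table[atom_no1], table[atom_no2])


# Atom pair from the first data row of the table (Atom_No1 = 4, Atom_No2 = 2).
d_4_2 = atoms_distance(4, 2)
```

A dictionary lookup plays the role of VLOOKUP with an exact match, and an unknown atom number raises an error instead of silently returning zero.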

Excel Count/Sum Based on Multiple Criteria for Same Row

I have a relatively simple problem which I am getting stumped on - perhaps it is this brain fog from Covid. I'll try my best to explain the problem.
Here is a simulated dataset:
A B C D E F G H I J K L M N
1 X1 X2 X3 Y1 Y2 Y3 X1 X2 X3 X1 X2 X3 Ct St
2 1 2 0.2 0 2 0.5 1 2 0.1 2 0.3
3 1 2 0.3 1 1 0.2 1 0.3
4 1 2 0.6 1 2 0.1 1 0.6
5 1 2 1.1 2 0.7 1 0.5 1 1.1
A-N reflects the column names while the first column (1-5) reflects the row names in Excel.
Each column has been labelled as either X (e.g., male) and Y (e.g., female). There are three characteristics for male (X1, X2, X3) and three characteristics for female (Y1, Y2, Y3). We can think of adjacent columns as belonging to a trait (e.g., X1, X2, and X3 in columns A, B and C form a set of male characteristics for trait 1; X1, X2, and X3 in columns G, H and I form a set of similar characteristics but for trait 2, etc.).
For each row, I would like to calculate a count total (Ct, see column M) and sum total (St, see column N) based on a set of conditions.
Count total: Count the number of male (X) traits that feature a "1" for X1 and "2" for X2, giving a 'count total'.
Sum total: Sum the X3 values over male (X) traits that feature a "2" for X2, giving a 'sum total'.
I have manually calculated the count totals and sum totals for each row to make these definitions clearer. In row 1, there are two traits that fulfil the count total criteria (Ct = 2), whereby their X1 values = 1 and X2 values = 2. Notice that while the X2 value in column H qualifies (X2 = 2), X1 in column G is not equal to 1, so it is not counted. Furthermore, we only sum the X3 values for traits 1 and 2 (e.g., X3 in Column C and X3 in Column L), giving us a total of 0.3 (0.2 + 0.1).
The formulae should ignore sets of values that qualify but are for female traits (e.g., see row 3) and should work across missing values (e.g., in col J, row 4, X1 is missing, so it cannot be counted, even if X2 in col K row 4 features a qualifying value of 2).
I hope that makes sense.
My instinct was to use a SUMPRODUCT formula, but I am struggling to integrate the two conditions, e.g., for each row:
=SUMPRODUCT(((A1:L1="X1")*(A2:L2=1))*((A1:L1="X2")*(A2:L2=2)))
Any guidance would be much appreciated.
I haven't checked this thoroughly, but suggest for Ct
=SUMPRODUCT((A$1:J$1="X1")*(A2:J2=1)*(B$1:K$1="X2")*(B2:K2=2))
and for St
=SUMPRODUCT((A$1:J$1="X1")*(A2:J2=1)*(B$1:K$1="X2")*(B2:K2=2)*(C$1:L$1="X3")*C2:L2)
copied down.
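The trick in these formulas is that the ranges are offset by one column each (A:J, B:K, C:L), so position i in the product tests the header and value in column i together with those in columns i+1 and i+2. A rough Python equivalent, shown only to make that logic explicit: the function name count_and_sum and the sample row are hypothetical, and, like the formulas above, the sum requires both X1 = 1 and X2 = 2.

```python
def count_and_sum(header, row):
    """Mirror the offset-range SUMPRODUCT: scan each run of three adjacent columns."""
    ct = 0
    st = 0.0
    for i in range(len(header) - 2):
        # Only consider triples whose headers are exactly X1, X2, X3.
        if (header[i], header[i + 1], header[i + 2]) != ("X1", "X2", "X3"):
            continue
        if row[i] == 1 and row[i + 1] == 2:
            ct += 1
            if row[i + 2] is not None:  # a blank X3 contributes nothing
                st += row[i + 2]
    return ct, st


header = ["X1", "X2", "X3", "Y1", "Y2", "Y3", "X1", "X2", "X3", "X1", "X2", "X3"]
# Made-up row: trait 1 qualifies, the Y columns are ignored,
# trait 2 has a missing X1, trait 3 qualifies.
row = [1, 2, 0.2, 1, 2, 0.5, None, 2, 0.4, 1, 2, 0.1]

ct, st = count_and_sum(header, row)
```

The Y columns never match the header test, and a missing X1 fails the value test, which is the same behaviour the formulas rely on.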

Get the values of 4 unknowns from 2 given equations

I have 2 equations in which there are a total of 10 values and out of these 10 values 6 are known and the rest 4 values are unknown. Is there any method in python which could solve this type of problem?
I am talking about the second and fourth equations. Here all the X values are known and in the fourth equation, it is μ12 instead of μ21.
SymPy can tell you what two of the 4 are in terms of the other two (and known values):
>>> from sympy import var, solve
>>> var('i1 x2d k12 x1 x3 mu12 x2 x4 i2 x4d mu21')
(i1, x2d, k12, x1, x3, mu12, x2, x4, i2, x4d, mu21)
>>> solve(
... (i1*x2d-(k12*(x1-x3)+mu12*(x2-x4)),
... i2*x4d-(k12*(x3-x1)+mu21*(x4-x2))),
... (i1,i2,mu12,mu21))
{i1: mu12*(x2 - x4)/x2d + (k12*x1 - k12*x3)/x2d,
i2: mu21*(-x2 + x4)/x4d + (-k12*x1 + k12*x3)/x4d}
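Written out, the system passed to solve and the result it returns look as follows. The symbols are transcribed from the code above (x2d and x4d are kept as opaque symbols, exactly as in the code), so this is a restatement of the session rather than of the original equations in the question. With two equations and four unknowns, two of the unknowns remain free parameters, which is why i1 and i2 come back expressed in terms of mu12 and mu21.

```latex
\begin{aligned}
i_1\, x_{2d} &= k_{12}(x_1 - x_3) + \mu_{12}(x_2 - x_4) \\
i_2\, x_{4d} &= k_{12}(x_3 - x_1) + \mu_{21}(x_4 - x_2)
\end{aligned}
\qquad\Longrightarrow\qquad
\begin{aligned}
i_1 &= \frac{k_{12}(x_1 - x_3) + \mu_{12}(x_2 - x_4)}{x_{2d}} \\
i_2 &= \frac{k_{12}(x_3 - x_1) + \mu_{21}(x_4 - x_2)}{x_{4d}}
\end{aligned}
```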

Plane fitting through points in 3D using python

I have points in 3D space.
X Y Z
0 0.61853 0.52390 0.26304
1 0.61843 0.52415 0.26297
2 0.62292 0.52552 0.26108
3 0.62681 0.51726 0.25622
4 0.62772 0.51610 0.25903
I have defined a plane through the points which should divide them vertically, but it divides them neither vertically nor horizontally. The plane and the points end up far apart when I plot them.
import numpy as np
import matplotlib.pyplot as plt

def plane_equation(x1, y1, z1, x2, y2, z2, x3, y3, z3):
    a1 = x2 - x1
    b1 = y2 - y1
    c1 = z2 - z1
    a2 = x3 - x1
    b2 = y3 - y1
    c2 = z3 - z1
    a = b1 * c2 - b2 * c1
    b = a2 * c1 - a1 * c2
    c = a1 * b2 - b1 * a2
    d = (- a * x1 - b * y1 - c * z1)
    return a, b, c, d

# Finding the equation of the plane
a, b, c, d = plane_equation(x0, y0, z0, x1, y1, z1, x2, y2, z2)
print("equation of plane is ", a, "x +", b, "y +", c, "z +", d, "= 0.")
x = np.arange(0, 1, 0.1)
y = np.arange(0, 1, 0.1)
X,Y = np.meshgrid(x,y)
Z = a*X + b*Y + d
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.scatter(df.x, df.y, df.z, color = 'c', marker = 'o', alpha = 0.5)
surf = ax.plot_surface(X, Y, Z)
equation of plane is -0.0002496952000000007 x + 0.00036812320000000016 y + 0.0007697304000000002 z + -0.00024088567529317268 = 0.
I need the plane to pass through these points and to be vertical. The blue plane should pass through the cyan points.
There are several things that might go wrong here:
1. Too small values
Your normal is not normalized and its coordinates are of very small magnitude (0.000???), so it's possible that your plot is handling all the values as zero (the plot in your image shows the plane Z=0, which has nothing to do with the values you provided).
Your feedback in chat confirmed this assumption, so to solve the problem just normalize your normal:
n(nx,ny,nz) /= sqrt(nx*nx + ny*ny +nz*nz)
And compute d with these new values, for the plane written as nx*x + ny*y + nz*z = d (in the question's form a*x + b*y + c*z + d = 0 the sign of d is flipped):
d = nx*x0 + ny*y0 + nz*z0
where (x0,y0,z0) is any of the selected points.
2. Wrongly selected points
The 3 selected points should not be too close to each other and should not lie on a single line. If they are, the computed normal is invalid. Also, if you select points containing a lot of noise, the accuracy is lowered by it...
To improve this, select 3 points randomly and compute the normal. Compute n such normals and average them together. The higher n, the better the accuracy.
3. Fit
To improve accuracy even more, you can fit the normal and d: start from the normal from #1 or #2 and vary its coordinates and d within a small range around it so as to minimize the average or maximum distance of all points to the plane. This is O(n.log^4(m)), where n is the number of points used and m relates to the fitted range of each parameter, but it provides the best accuracy you can get.
You can use binary search, approximation search, or whatever optimizer your environment has at its disposal.
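Point #1 can be sketched in plain Python as below. This is a minimal illustration, not the asker's code: the helper names plane_through and z_on_plane are made up, and rows 0, 2 and 3 of the table are chosen only because rows 0 and 1 lie very close together. It keeps the question's convention a*x + b*y + c*z + d = 0. Note that a surface plotted from that equation needs z solved for, i.e. z = -(a*x + b*y + d) / c, which only works when c is not zero (a truly vertical plane cannot be written as z = f(x, y)).

```python
import math


def plane_through(p0, p1, p2):
    """Return (a, b, c, d) with unit normal (a, b, c) and a*x + b*y + c*z + d = 0."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    # Cross product u x v gives a vector perpendicular to the plane.
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(comp * comp for comp in n))
    if length == 0.0:
        raise ValueError("points are collinear or coincide; no unique plane")
    a, b, c = (comp / length for comp in n)  # normalize the normal
    d = -(a * p0[0] + b * p0[1] + c * p0[2])
    return a, b, c, d


def z_on_plane(coeffs, x, y):
    """Solve a*x + b*y + c*z + d = 0 for z (requires c != 0)."""
    a, b, c, d = coeffs
    return -(a * x + b * y + d) / c


# Rows 0, 2 and 3 of the point table in the question.
p0 = (0.61853, 0.52390, 0.26304)
p1 = (0.62292, 0.52552, 0.26108)
p2 = (0.62681, 0.51726, 0.25622)

coeffs = plane_through(p0, p1, p2)
```

With numpy, z_on_plane can be applied directly to the meshgrid arrays X and Y to get the Z array for plot_surface.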

Excel Mapping with multiple matches

My knowledge of Excel extends to most types of advanced formulas, but I don't know much about how to use VBA or macros. I have a problem which I'm struggling to solve using formulas. I have a sheet with two columns that looks like this:
x1 y1
x1 y2
x1 y3
x1 y4
x2 y2
x2 y3
x2 y4
x3 y1
x4 y2
And I'm trying to map these onto a sheet like this:
y1 y2 y3 y4
x1 1 1 1 1
x2 0 1 1 1
x3 1 0 0 0
x4 0 1 0 0
I usually try to apply a vlookup solution to such problems, but I can't figure out how to get vlookup to work given that the x values appear multiple times in the first table, and vlookup will always just stop at the first appearance.
Please let me know how to best approach solving this problem.
Thanks so much!
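Use COUNTIFS(). With the pairs in columns A and B, the x labels in column D (from D2 down) and the y labels in row 1 (from E1 across), put this in E2 and copy it across and down:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
But a pivot table may be better suited.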
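The COUNTIFS grid is a cross-tabulation: each cell counts how many times a given (x, y) pair occurs in the list. A short Python sketch of the same idea, using the pairs from the question (the variable names are illustrative):

```python
from collections import Counter

# The two-column list from the question, as (x, y) pairs.
pairs = [
    ("x1", "y1"), ("x1", "y2"), ("x1", "y3"), ("x1", "y4"),
    ("x2", "y2"), ("x2", "y3"), ("x2", "y4"),
    ("x3", "y1"),
    ("x4", "y2"),
]

counts = Counter(pairs)                 # (x, y) -> number of occurrences
xs = sorted({x for x, _ in pairs})      # row labels
ys = sorted({y for _, y in pairs})      # column labels

# One row per x label; a pair that never occurs counts as 0.
grid = {x: [counts[(x, y)] for y in ys] for x in xs}
```

Because the cells are counts rather than lookups, repeated x values are not a problem, which is exactly what VLOOKUP could not handle.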
