Create Line-Chart with different X-Values - excel

I have a certain number of measurements. Each in the following form:
Table A:
| Time [s] | Value |
| 0.5 | 2.0 |
| 50.3 | 33.7 |
| 100.0 | 25.5 |
Table B:
| Time [s] | Value |
| 1.3 | 12.7 |
| 27.8 | 25.0 |
| 97.5 | 20.0 |
| 100.0 | 7.1 |
Table C:
...
The time is always the same, from 0.0 seconds to 100.0 seconds.
The measurement points, as can be seen in the example, differ from table to table.
I now want to display the different measurements in one chart. Each table has its own line-graph. The X-Axis would display the Time.
Is something like this possible in Excel?

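Solved my problem by using a Scatter chart instead of a Line chart. A Line chart treats the X values as evenly spaced categories, whereas a Scatter chart plots each series against its own X values.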


Calculating the difference in value between columns

I have a dataframe with YYYYMM columns that contain monthly totals on the row level:
| yearM | feature | 201902 | 201903 | 201904 | 201905 |... ... ... 202009
|-------|----------|--------|--------|--------|--------|
| 0 | feature1 | Nan | Nan | 9.0 | 32.0 |
| 1 | feature2 | 1.0 | 1.0 | 1.0 | 4.0 |
| 2 | feature3 | Nan | 1.0 | 4.0 | 8.0 |
| 3 | feature4 | 9.0 | 15.0 | 19.0 | 24.0 |
| 4 | feature5 | 33.0 | 67.0 | 99.0 | 121.0 |
| 5 | feature6 | 12.0 | 15.0 | 17.0 | 19.0 |
| 6 | feature7 | 1.0 | 8.0 | 15.0 | 20.0 |
| 7 | feature8 | Nan | Nan | 1.0 | 9.0 |
I would like to convert the totals to the monthly change. The feature column should be excluded as I need to keep the feature names. The yearM in the index is a result of pivoting a dataframe to get the YYYYMM on the column level.
This is what the output should look like:
| yearM | feature | 201902 | 201903 | 201904 | 201905 |... ... ... 202009
|-------|----------|--------|--------|--------|--------|
| 0 | feature1 | Nan | 0.0 | 9.0 | 23.0 |
| 1 | feature2 | 1.0 | 0.0 | 0.0 | 3.0 |
| 2 | feature3 | Nan | 1.0 | 3.0 | 4.0 |
| 3 | feature4 | 9.0 | 6.0 | 4.0 | 5.0 |
| 4 | feature5 | 33.0 | 34.0 | 32.0 | 22.0 |
| 5 | feature6 | 12.0 | 3.0 | 2.0 | 2.0 |
| 6 | feature7 | 1.0 | 7.0 | 7.0 | 5.0 |
| 7 | feature8 | Nan | 0.0 | 1.0 | 8.0 |
The row level values now represent the change compared to the previous month instead of having the total for the month.
I know that I should start by filling the NaN rows in the starting column 201902 with 0:
df['201902'] = df['201902'].fillna(0)
I could also calculate them one by one with something similar to this:
df['201902'] = df['201902'].fillna(0) - df['201901'].fillna(0)
df['201903'] = df['201903'].fillna(0) - df['201902'].fillna(0)
df['201904'] = df['201904'].fillna(0) - df['201903'].fillna(0)
...
...
Hopefully there's a smarter solution though
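Use iloc or drop to select the month columns, then call diff with axis=1 for row-wise differences. Note that the first month column comes back as NaN, since it has no previous column to subtract.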
monthly_change = df.iloc[:, 1:].fillna(0).diff(axis=1)
# or
# monthly_change = df.drop(['feature'], axis=1).fillna(0).diff(axis=1)
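As a rough end-to-end sketch of the row-wise diff, here is the idea applied to the first three rows of the sample table. Restoring the first month from the original totals and re-attaching the feature column are assumptions based on the desired output shown in the question, not part of the answer above; the names month_cols and result are made up for illustration.

```python
import numpy as np
import pandas as pd

# First three rows of the sample data from the question.
df = pd.DataFrame({
    "feature": ["feature1", "feature2", "feature3"],
    "201902": [np.nan, 1.0, np.nan],
    "201903": [np.nan, 1.0, 1.0],
    "201904": [9.0, 1.0, 4.0],
    "201905": [32.0, 4.0, 8.0],
})

# Every column except the feature names is a YYYYMM total.
month_cols = df.columns.drop("feature")

# Treat missing totals as 0, then subtract each column from the next one.
monthly_change = df[month_cols].fillna(0).diff(axis=1)

# diff() leaves the first month all-NaN; assumption: keep the original
# totals there, as in the desired output table.
monthly_change[month_cols[0]] = df[month_cols[0]]

# Put the feature names back in front of the differences.
result = pd.concat([df[["feature"]], monthly_change], axis=1)
print(result)
```

If the first month should instead stay empty, the line that restores it can simply be dropped.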

Looking up a cross-reference but a range

I hope you can help me please.
How would I go about looking up data in the following cross-reference table?
I have the header row value (e.g. 25) and the column (mm) value, and want to return the value where they intersect. For example, for an item with (header row) X = 25 and (mm) Y = 0.48, I want 1.6 to be returned.
+--------------+-------+---------+---------+---------+
| (mm) width | 10~20 | 20.1~30 | 30.1~40 | 40.1~50 |
+--------------+-------+---------+---------+---------+
| 0.20~0.45 | 1.3 | 1.8 | 2.1 | 3.5 |
| 0.46~0.60 | 1.4 | 1.6 | 1.8 | 2.3 |
| 0.61~0.70 | 1.5 | 1.7 | 1.6 | 2.1 |
| 0.71~0.80 | 0.7 | 1.1 | 2.2 | 3.1 |
+--------------+-------+---------+---------+---------+
Thanks a lot for your support.
Try,
=INDEX(B2:E5, MATCH(TEXT(H2, "0.00"), A2:A5), MATCH(TEXT(G2, "0"), B1:E1))
With your current set up:
=SUMPRODUCT((G2>=--LEFT(B1:E1,FIND("~",B1:E1)-1))*(G2<=--MID(B1:E1,FIND("~",B1:E1)+1,2))*(G3<=--MID(A2:A5,FIND("~",A2:A5)+1,4))*(G3>=--LEFT(A2:A5,FIND("~",A2:A5)-1)),B2:E5)
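But if you modify the headers a little so that each one holds just the minimum of its range (10, 20.1, 30.1, 40.1 across the top and 0.20, 0.46, 0.61, 0.71 down the side), this simpler formula will work:
=INDEX(B2:E5,MATCH(G3,A2:A5),MATCH(G2,B1:E1))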
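For readers who want to check the minimum-based lookup outside Excel, here is a minimal Python sketch of the same idea: keep only the lower bound of each range and pick the last bound that is less than or equal to the lookup value, which is what MATCH does with its default approximate match. The names col_mins, row_mins, table and lookup are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of each range, taken from the table in the question.
col_mins = [10, 20.1, 30.1, 40.1]      # header row (X)
row_mins = [0.20, 0.46, 0.61, 0.71]    # (mm) width column (Y)

table = [
    [1.3, 1.8, 2.1, 3.5],
    [1.4, 1.6, 1.8, 2.3],
    [1.5, 1.7, 1.6, 2.1],
    [0.7, 1.1, 2.2, 3.1],
]


def lookup(x, y):
    """Return the cell whose ranges contain x (column) and y (row)."""
    # Index of the last lower bound that is <= the lookup value.
    col = bisect_right(col_mins, x) - 1
    row = bisect_right(row_mins, y) - 1
    if col < 0 or row < 0:
        raise ValueError("value is below the smallest range")
    return table[row][col]


print(lookup(25, 0.48))  # 1.6
```

Like the approximate MATCH, this does not check the upper bound of the last range, so a value above 50 or 0.80 still lands in the final column or row.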

tensorflow timeseries different lengths

I am trying to get a time series into TensorFlow to use with an LSTM. I have 4 files, but I'm not sure how to combine them. The biggest problem is that my first dataset has 1 data point per year, while 2 others contain monthly data that should be used as correlated inputs to predict the first set. The 4th dataset just has some metadata such as species and coordinates. Should I put them together somehow, and if so, how? Any advice in the right direction would be nice.
I have already looked at the time series documentation of TensorFlow and was also trying to follow this guide: https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/
but I struggle with aligning the yearly and monthly data. I manage the data in R but run TensorFlow in Python. I'm more familiar with R in general.
Thank you all for being here!
Header samples of the Data structure:
File1.csv:
years | noaa-tree-2657 |noaa-tree-2658 |noaa-tree-2659 |noaa-tree-2662
1901 | 1.676948 | 1.305594 | 0.6756204 | 0.7149572
1902 | 1.562344 | 0.899884 | 0.5102933 | 0.6351094
1903 | 1.687270 | 1.354678 | 0.9899198 | 0.6158589
File2.csv:
noaa-tree-2657 |noaa-tree-2658 |noaa-tree-2659 |noaa-tree-2662 |noaa-tree-2664
1 6.41 | 1.85 | 0.33 | 8.61 | 6.07
2 10.45 | 3.20 | 0.38 | 8.58 | 5.30
3 10.81 | 4.30 | 1.50 | 9.34 | 8.50
File3.csv:
noaa-tree-2657 |noaa-tree-2658 |noaa-tree-2659 |noaa-tree-2662 |noaa-tree-2664
1 -0.3 | 11.0 | 10.1 | -22.4 | -15.1
2 -2.9 | 10.2 | 8.8 | -14.5 | -13.3
3 1.0 | 14.3 | 14.7 | -13.8 | -12.7
File4.csv:
noaa-tree-2657 |noaa-tree-2658 |noaa-tree-2659 |noaa-tree-2662 |noaa-tree-2664
1 QUPR | PSME | PSME | PCGL | THOC
2 280.28 | 249.65 | 250.08 | 298 | 280.72
3 39.1 | 31.45 | 32.72 | 56.55 | 48.47

Looking to create weighted average of partitioned columns in Excel

Horrible title, but I couldn't find a way to describe what I'm trying to do concisely. This question was posed to me by a friend, and I'm usually competent in Excel, but in this case I am totally stumped.
Suppose I have the following data:
| A | B | C | D | E | F | G | H |
---------------------------------------------------------------------
1 | 0.50 | 0.50 | 1 | | | 0.30 | 0.30 | |
2 | 0.25 | 0.75 | 2 | | | 0.40 | 0.70 | |
3 | 1.00 | 1.75 | 8 | | | 0.30 | 1.00 | |
4 | 0.75 | 2.50 | 2 | | | 0.50 | 1.50 | |
5 | 1.25 | 3.75 | 3 | | | 1.75 | 3.25 | |
6 | 0.50 | 4.25 | 1 | | | 0.25 | 3.50 | |
7 | 1.00 | 5.25 | 0 | | | 0.50 | 4.00 | |
8 | 0.25 | 5.50 | 2 | | | 0.30 | 4.30 | |
9 | 0.25 | 5.75 | 9 | | | 0.25 | 4.55 | |
10 | 0.75 | 6.50 | 4 | | | 0.70 | 5.25 | |
11 | | | | | | 1.00 | 6.25 | |
12 | | | | | | 0.25 | 6.50 | |
Column A represents the distance traveled while the measurement in column C was collected. Column B represents the total distance traveled so far. So C1 represents some value produced during the process from distance 0 to 0.5, C2 represents the value from distance 0.5 to 0.75, C3 represents the value from 0.75 to 1.75, etc.
Column F represents a PLANNED second iteration of the same process, but with different measurement intervals. What I need is a way to PREDICT column H, based on a WEIGHTED AVERAGE of values from column C, based on where the intervals in column F intersect with the intervals in column A. For example, since F2 represents the measurement taken from distance 0.30 to 0.70 (an interval of 0.4, split 50/50 across the measurements in C1 and C2), H2 would be equal to: C1*0.5 + C2*0.5: 1.5.
Another example: H3 represents the expected measurement from an interval between 0.7 and 1.0, which is split between C2 (from 0.7 to 0.75 = 0.05) and C3 (from 0.75 to 1.0 = 0.25). So H3 = 16.7%*C2 + 83.3%*C3 = 0.33 + 6.67 = 7.0.
I'm looking for a way to do this in an Excel spreadsheet without using VBA or breaking it down into something like a Python script to process externally, but so far I'm not finding any way to do it.
Any ideas for accomplishing this entirely within Excel without any special add-ins/scripts installed ?
It's not pretty, but I think the following should work for all except H1 (which would need an added zero row):
=(MAX(0,INDEX(B:B,MATCH(G2,B:B,1))-G1)*INDEX(C:C,MATCH(G2,B:B,1)) +
(G2-INDEX(B:B,MATCH(G2,B:B,1)))*INDEX(C:C,MATCH(G2,B:B,1)+1)) /
MAX(G2-G1,G2-INDEX(B:B,MATCH(G2,B:B,1)))
It matches the values in B and C and weights them accordingly.
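To cross-check the spreadsheet results, the overlap weighting can be sketched in Python. This is not the Excel-only solution the question asks for, just a reference calculation under the assumption that both runs start at distance 0. Unlike the formula above, which appears to combine only the matched row and the one after it, the sketch sums over every source interval that a planned interval touches. The function name weighted_prediction and its parameters are made up for illustration.

```python
def weighted_prediction(src_ends, src_values, new_ends):
    """Overlap-weighted average of src_values for each planned interval.

    src_ends / new_ends are cumulative distances (columns B and G);
    both runs are assumed to start at distance 0.
    """
    src_starts = [0.0] + list(src_ends[:-1])
    new_starts = [0.0] + list(new_ends[:-1])
    predictions = []
    for lo, hi in zip(new_starts, new_ends):
        weighted_sum = 0.0
        covered = 0.0
        for s_lo, s_hi, value in zip(src_starts, src_ends, src_values):
            # Length of the overlap between [lo, hi] and [s_lo, s_hi].
            overlap = min(hi, s_hi) - max(lo, s_lo)
            if overlap > 0:
                weighted_sum += overlap * value
                covered += overlap
        # None when the planned interval lies outside the measured range.
        predictions.append(weighted_sum / covered if covered > 0 else None)
    return predictions


# Columns B and C from the question, and the first five rows of column G.
b = [0.50, 0.75, 1.75, 2.50, 3.75, 4.25, 5.25, 5.50, 5.75, 6.50]
c = [1, 2, 8, 2, 3, 1, 0, 2, 9, 4]
g = [0.30, 0.70, 1.00, 1.50, 3.25]

h = weighted_prediction(b, c, g)
print([round(v, 3) for v in h])
```

The second and third results correspond to the H2 and H3 examples worked out in the question. The fifth planned interval (1.50 to 3.25) overlaps three source intervals, which is the case worth comparing against the spreadsheet formula.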

Calculating median with three conditions to aggregate a large amount of data

Looking for some help here at aggregating more than 60,000 data points (a fish telemetry study). I need to calculate the median of acceleration values by individual fish, date, and hour. For example, I want to calculate the median for a fish moving from 2:00-2:59PM on June 1.
+--------+----------+-------+-------+------+-------+------+-------+-----------+-------------+
| Date | Time | Month | Diel | ID | Accel | TL | Temp | TempGroup | Behav_group |
+--------+----------+-------+-------+------+-------+------+-------+-----------+-------------+
| 6/1/10 | 01:25:00 | 6 | night | 2084 | 0.94 | 67.5 | 22.81 | High | Non-angled |
| 6/1/10 | 01:36:00 | 6 | night | 2084 | 0.75 | 67.5 | 22.81 | High | Non-angled |
| 6/1/10 | 02:06:00 | 6 | night | 2084 | 0.75 | 67.5 | 22.65 | High | Non-angled |
| 6/1/10 | 02:09:00 | 6 | night | 2084 | 0.57 | 67.5 | 22.65 | High | Non-angled |
| 6/1/10 | 03:36:00 | 6 | night | 2084 | 0.75 | 67.5 | 22.59 | High | Non-angled |
| 6/1/10 | 03:43:00 | 6 | night | 2084 | 0.57 | 67.5 | 22.59 | High | Non-angled |
| 6/1/10 | 03:49:00 | 6 | night | 2084 | 0.57 | 67.5 | 22.59 | High | Non-angled |
| 6/1/10 | 03:51:00 | 6 | night | 2084 | 0.57 | 67.5 | 22.59 | High | Non-angled |
+--------+----------+-------+-------+------+-------+------+-------+-----------+-------------+
I suggest adding a column (say hr) to your data (containing something like =HOUR(B2) copied down to suit) and pivoting your data with ID, Date, hr and Time for ROWS and Sum of Accel for VALUES. Then copy the pivot table (in Tabular format, without Grand Totals) and Paste Special, Values. On the copy, apply Subtotal At each change in: hr, Use function: Average, Add subtotal to: Sum of Accel then select the Sum of Accel column and replace SUBTOTAL(1, with MEDIAN(. Change Average to Median if required.
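If the data can be exported to CSV and loaded into pandas, the same grouping can be sketched in a few lines. This is an alternative to the pivot-table route above, not part of it; the column names follow the sample table, the hr column mirrors the =HOUR(B2) helper, and only the columns needed for the grouping are reproduced here.

```python
import pandas as pd

# The eight sample rows from the question (only the columns needed here).
df = pd.DataFrame({
    "Date": ["6/1/10"] * 8,
    "Time": ["01:25:00", "01:36:00", "02:06:00", "02:09:00",
             "03:36:00", "03:43:00", "03:49:00", "03:51:00"],
    "ID": [2084] * 8,
    "Accel": [0.94, 0.75, 0.75, 0.57, 0.75, 0.57, 0.57, 0.57],
})

# Hour of day, equivalent to the =HOUR(B2) helper column.
df["hr"] = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.hour

# One median per fish, date and hour.
medians = df.groupby(["ID", "Date", "hr"], as_index=False)["Accel"].median()
print(medians)
```

Each output row is one fish on one date in one hour, so the 2:00 to 2:59 example from the question corresponds to the row with hr equal to 2 (or 14 for the afternoon).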
