Filtering discrepancies in duplicate measurements - excel

I have a dataset with the following problem.
Sometimes a temperature sensor returns duplicate readings at the exact same minute, where one of the two duplicates is "reasonable" and the other is slightly off.
For example:
TEMP TIME
1 24.5 4/1/18 2:00
2 24.7 4/1/18 2:00
3 24.6 4/1/18 2:05
4 28.3 4/1/18 2:05
5 24.3 4/1/18 2:10
6 24.5 4/1/18 2:10
7 26.5 4/1/18 2:15
8 24.4 4/1/18 2:15
9 24.7 4/1/18 2:20
10 22.0 4/1/18 2:20
Lines 4, 7 & 10 are readings that should be removed because they are too high or too low (it doesn't make sense for the temperature to rise or drop by more than a degree within 5 minutes in a relatively stable environment).
The end goal with this dataset is to "average" the similar values (such as lines 1 & 2) and remove the lines that are too extreme (such as lines 4 & 7) from the dataset entirely.
My current idea is to look at the previously obtained row and, if one of the 2 duplicates is off by +/- 0.5 degrees, mark a 3rd column with TRUE so I can filter out all the TRUE values at the end. However, I'm not sure how to express within the IF statement that I'm looking for + OR - 0.5 of a previous number. Does anyone know?

Here is a google sheet example that does what you want:
https://docs.google.com/spreadsheets/d/1Va9RjSeulOfVTd-0b4EM4azbUkYUb22jXNc_EcafUO8/edit?usp=sharing
What I did:
Calculate a column of a 3-item running average of the data using "=AVERAGE(B3:B1)"
Filter the list using "=IF(ABS(B2-C2) < 1, B2, )"
Calculate the average of the filtered list
Using the absolute value is what provides the "+ OR -" that you were looking for: if the distance between the two numbers is too large, the term is not included.
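For readers who want to try the absolute-value idea outside a spreadsheet, here is a minimal Python sketch. It is not the running-average method from the linked sheet; it swaps in a simpler rule that I am assuming for illustration: duplicates that agree within a tolerance are averaged, otherwise the duplicate closest to the last accepted value is kept. The function name `clean_duplicates` and the 0.5 degree tolerance are assumptions taken from the question's wording.

```python
from itertools import groupby

def clean_duplicates(readings, tolerance=0.5):
    """readings: list of (temp, time) tuples, already sorted by time.

    Returns one (time, temp) pair per timestamp. This is an illustrative
    rule, not the method used in the linked sheet.
    """
    cleaned = []
    previous = None  # last accepted temperature
    for time, group in groupby(readings, key=lambda r: r[1]):
        temps = [temp for temp, _ in group]
        if max(temps) - min(temps) <= tolerance:
            # Duplicates agree (for a pair this is ABS(a - b) <= tolerance): average them.
            value = sum(temps) / len(temps)
        elif previous is not None:
            # Duplicates disagree: keep the one closest to the last accepted value.
            value = min(temps, key=lambda t: abs(t - previous))
        else:
            # Assumption: with no earlier value to compare against, skip this minute.
            continue
        value = round(value, 2)
        cleaned.append((time, value))
        previous = value
    return cleaned

readings = [
    (24.5, "4/1/18 2:00"), (24.7, "4/1/18 2:00"),
    (24.6, "4/1/18 2:05"), (28.3, "4/1/18 2:05"),
    (24.3, "4/1/18 2:10"), (24.5, "4/1/18 2:10"),
    (26.5, "4/1/18 2:15"), (24.4, "4/1/18 2:15"),
    (24.7, "4/1/18 2:20"), (22.0, "4/1/18 2:20"),
]
cleaned = clean_duplicates(readings)
```

On the sample data this keeps one value per minute: 24.6, 24.6, 24.4, 24.4 and 24.7.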

A simple solution came to mind. Follow the steps below:
Convert the data to a table
Add a 4th column at the end
Enter the formula "Current Value - Previous Value"
Filter the column for high difference values
Delete the filtered rows and you'll be left with the normal values

Or, if you only want to compare readings that share the same timestamp, do the following:
Convert your data to a table
Add a 4th column at the end of the table
Write the following formula in the 4th column:
=IF(Current_Time = Previous_Time, Current_Temp-Previous_Temp,"")
Filter and delete the rows with a high difference
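The helper column from the steps above can be sketched in Python as follows. The function name `same_time_differences` and the 1 degree cutoff are my assumptions (the question mentions changes of "more than a degree"); `None` stands in for the empty string the formula returns.

```python
def same_time_differences(rows):
    """rows: list of (temp, time) tuples in sheet order.

    Mirrors IF(Current_Time = Previous_Time, Current_Temp - Previous_Temp, ""):
    the difference is computed only when both rows share a timestamp.
    """
    diffs = []
    for i, (temp, time) in enumerate(rows):
        if i > 0 and rows[i - 1][1] == time:
            diffs.append(round(temp - rows[i - 1][0], 1))
        else:
            diffs.append(None)  # first row, or a new timestamp
    return diffs

rows = [
    (24.5, "4/1/18 2:00"), (24.7, "4/1/18 2:00"),
    (24.6, "4/1/18 2:05"), (28.3, "4/1/18 2:05"),
    (24.3, "4/1/18 2:10"), (24.5, "4/1/18 2:10"),
    (26.5, "4/1/18 2:15"), (24.4, "4/1/18 2:15"),
    (24.7, "4/1/18 2:20"), (22.0, "4/1/18 2:20"),
]
diffs = same_time_differences(rows)

# 1-based row numbers whose difference exceeds the assumed 1 degree cutoff.
flagged = [i + 1 for i, d in enumerate(diffs) if d is not None and abs(d) > 1]
```

Note that a large difference only says the pair disagrees. It is always attached to the second row of the pair, so it does not tell you which of the two readings is the outlier.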

Related

I came across this question and its answers. I'm having a similar challenge, but I tried the answer for non-365 Office users and could not get it

this data set below. It has 3 columns (PTQ, % Growth, $ Growth) that all need ranked individually then summed up and then ranked again for a total power rank of each region. Is there any way I can do this with a single formula? I do this a lot and it would be nice not to have to rank everything individually each time.
To clarify, I do not want to rank first on one column then another, they all need to be ranked equally together.
Data:
Region                  PTQ    % Growth  $ Growth
---------------------------------------------------
TR ARIZONA              103    17.5      201330
TR IDAHO UTAH           75.5   -6.3      -69976
TR LA HAWAII            99.4   19.2      194840
TR LA NORTH             125    32.7      241231
TR NORTHERN CALIFORNIA  102.3  26.2      308824
TR NORTHWEST            91.1   -0.6      -4801
TR SAN FRANSISCO        76.9   -16.7     -158387
TR SOUTHERN CALIFORNIA  106.9  30.8      495722
TR TUCSON               100.3  7.6       34888
Assuming the same layout as P.b., in I4:
=1+SUMPRODUCT(N(MMULT(CHOOSE({1,2,3},RANK(C$4:C$12,C$4:C$12),RANK(D$4:D$12,D$4:D$12),RANK(E$4:E$12,E$4:E$12)),{1;1;1})<SUM(RANK(C4,C$4:C$12),RANK(D4,D$4:D$12),RANK(E4,E$4:E$12))))
and copied down.
This is quite challenging in older Excel, but possible nonetheless:
=IFERROR(
INDEX(
MMULT(--(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)>=TRANSPOSE(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12))),ROW($C$4:$C$12)^0),ROW($A1))
-SUMPRODUCT(--(MMULT(--(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)>=TRANSPOSE(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12))),ROW($C$4:$C$12)^0)
=INDEX(MMULT(--(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)>=TRANSPOSE(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12))),ROW($C$4:$C$12)^0),
ROW($A1))))+1
,"")
(requires being entered with ctrl+shift+enter)
Explanation:
First an array is made of the sum of the 3 rankings:
RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)
This results in your so-called Rank Sum array.
Then, since RANK requires a range, not an array, we need an alternative to create a ranking of the array: MMULT can do that.
MMULT(RankSum>=TRANSPOSE(RankSum),ROW(RankSum)^0) creates an array of the ranked RankSum. However, if 2 are ranked equally (for instance at rank 1), it ranks both as 2, not 1. Therefore I used SUMPRODUCT to count the items in the calculated MMULT array that equal the indexed MMULT array result, as an alternative to COUNTIF, which is also limited to taking a range, not an array. So IndexedMMULTarray-SUMPRODUCT(--(MMULTarray=IndexedMMULTarray))+1 is your end result.
Calculation is based on your data being in B4:E12 and formula above is entered (with ctrl+shift+enter) in a cell in row 4 and copied down; I4 in the shared screenshot.
Even though this formula answers your question, I doubt it is what you expected. Changing the ranges to different ones could be very tedious, and calculating the rankings manually and then summing and ranking them is probably easier to maintain. You may make it more dynamic by adding INDEX in the ranges.
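To check what either formula should return, the rank-sum logic can be sketched in a few lines of Python. The helper names `rank_desc` and `power_rank` are hypothetical; the sketch assumes Excel's default RANK order (largest value gets rank 1, ties share a rank) and that the lowest rank sum is the best overall rank.

```python
def rank_desc(values):
    # Like Excel's RANK with its default order: 1 + the number of larger values.
    return [1 + sum(other > v for other in values) for v in values]

def power_rank(*columns):
    # Sum the per-column ranks for each row, then rank the sums ascending
    # (1 + the number of strictly smaller sums), so tied sums share a rank.
    rank_sums = [sum(ranks) for ranks in zip(*(rank_desc(c) for c in columns))]
    return [1 + sum(other < s for other in rank_sums) for s in rank_sums]

ptq = [103, 75.5, 99.4, 125, 102.3, 91.1, 76.9, 106.9, 100.3]
pct_growth = [17.5, -6.3, 19.2, 32.7, 26.2, -0.6, -16.7, 30.8, 7.6]
dollar_growth = [201330, -69976, 194840, 241231, 308824, -4801, -158387, 495722, 34888]

ranks = power_rank(ptq, pct_growth, dollar_growth)
```

With the question's data, TR LA NORTH and TR SOUTHERN CALIFORNIA both have a rank sum of 5 and tie for rank 1, so the next region (TR NORTHERN CALIFORNIA) gets rank 3.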

How to count items in Excel within a date range or without an end date

Note: date formats are DD/MM/YYYY
I have a list of records, each with one column for a start date, and one for an end date.
Every record has a start date, but if an item is current it has no end date and the end date cell is blank.
I want to write a/some formulas to determine how many records were a given age at a given date, rounded down to the nearest whole year.
So for example, how many records were 0-1 years old at the date at (cell reference R1), and then how many were 1-2 years, 2-3 years etc.
I want this to be reusable so that I can update the date at R1 each month and it recalculates automatically. This is easy enough for R1=TODAY, as I can assume all end dates are in the past, but for R1=EDATE(TODAY,-12) it becomes trickier.
As an example, in the yellow highlighted cell I want to calculate how many records were between 1&2 years old as of 30/06/21 (S1), AND were current at the time (i.e. exclude from the count any records that have an end date before 30/06/21).
The blue highlighted area is my data, the green area is what I'm trying to calculate. I don't mind adding an extra data column or two if it assists in the calculation, but I don't want to have to add an extra column for every year that I'm trying to calculate, if it can be avoided.
Start Date  End Date    Years (as of 30/06/2022)  Age  30/06/2022  30/06/2021  30/06/2020  30/06/2019  30/06/2018  30/06/2017  30/06/2016  30/06/2015  30/06/2014  30/06/2013
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
20/09/2021              0.77                      13   0
7/09/2020   4/12/2020   0.24                      12   0
6/08/2019               2.90                      11   0
17/02/2020              2.37                      10   0
1/04/2019               3.25                      9    0
16/03/2020  18/11/2020  0.68                      8    0
17/08/2021  19/11/2021  0.26                      7    0
23/08/2022              -0.15                     6    0
16/11/2020  1/04/2022   1.37                      5    0
20/04/2020  21/10/2021  1.50                      4    0
7/05/2019   26/02/2021  1.81                      3    2
29/06/2020  7/01/2021   0.53                      2    5
16/08/2021  20/04/2022  0.68                      1    5
                                                  0    13
I created a table for the data (insert table) --> tblData
=LET(calculatedAge,MAP(tblData[Start Date],tblData[End Date],
LAMBDA(startdate,enddate,
ROUNDDOWN((MIN(IF(ISBLANK(enddate),E$1,enddate),E$1)-startdate)/365,0))
),
filteredAges,FILTER(calculatedAge,calculatedAge=$D2),
IFERROR(ROWS(filteredAges),0))
The MAP-function returns the calculated age per target date (E$1) - rounded down.
It simulates a helper column - that then can be filtered.
Thanks to #Robert Mearns for the FILTER hack, as COUNTIF doesn't work in this scenario (see "Using LET with COUNTIF and Array, e.g. MAP").
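As a cross-check, the same calculation can be sketched in Python. The function names are hypothetical, and the sketch deliberately mirrors the formula: a blank end date is treated as the target date, the end date is capped at the target date, and the age is days / 365 truncated toward zero like ROUNDDOWN. Because of that truncation, a record that starts after the target date lands in age 0, and records that ended before the target date are still counted; an extra condition would be needed to exclude them.

```python
import math
from datetime import date

def age_in_years(start, end, as_of):
    # Blank end date (None) counts as the target date; cap the end at the target date.
    effective_end = min(end or as_of, as_of)
    # math.trunc rounds toward zero, like Excel's ROUNDDOWN(x, 0).
    return math.trunc((effective_end - start).days / 365)

def count_by_age(records, as_of, age):
    return sum(1 for start, end in records if age_in_years(start, end, as_of) == age)

# (start, end) pairs from the question; None marks a blank end date.
records = [
    (date(2021, 9, 20), None),
    (date(2020, 9, 7), date(2020, 12, 4)),
    (date(2019, 8, 6), None),
    (date(2020, 2, 17), None),
    (date(2019, 4, 1), None),
    (date(2020, 3, 16), date(2020, 11, 18)),
    (date(2021, 8, 17), date(2021, 11, 19)),
    (date(2022, 8, 23), None),
    (date(2020, 11, 16), date(2022, 4, 1)),
    (date(2020, 4, 20), date(2021, 10, 21)),
    (date(2019, 5, 7), date(2021, 2, 26)),
    (date(2020, 6, 29), date(2021, 1, 7)),
    (date(2021, 8, 16), date(2022, 4, 20)),
]
as_of = date(2022, 6, 30)
counts = [count_by_age(records, as_of, age) for age in range(4)]
```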

Excel - Find row with conditional statement in XLOOKUP

I'm trying to use XLOOKUP to find a value based on user inputs.
The table looks like this:
Type Start End 33 36 42 48
---------------------------------------
4002 1 7 1.17 1.34 1.5 1.84
4002 8 12 1.84 1.67 2.1 3.45
User selects type, number (can be between start and end), and 33-48
I can nest an XLOOKUP to specify the 3 criteria
=XLOOKUP(*type* & *number* , *typeRange* & *numberRange* ,XLOOKUP(*33-48* , *33-48Range* , *ResultRange* ))
And I can find if a value is between the columns
=IF(AND(*number*>=*Start*,*number*<=*End*),TRUE,FALSE)
Can I combine the two? The data is redundant for numbers 1-7, and I would like to keep the table small.
You can sort of combine them. I have added a couple of extra rows to the table to see what would happen if you had different Type values as well as number values. The problem then is that if you use approximate match and enter a number like 13, which is out of range, you might end up getting the next row of the table, which would be incorrect. One way around it is to use the match-mode options in XLOOKUP to search for the next smaller item in the Start column and the next larger item in the End column, and check whether the results match:
=IF(XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(B2:B7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,-1)=XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(C2:C7,"00"),
XLOOKUP(K2,D1:G1,D2:G7),,1),XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(C2:C7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,1),"Error")
If you have some checks in place which make it impossible for number to be out of range, then you can simplify the formula:
=XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(B2:B7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,-1)
or
=XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(C2:C7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,1)
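The intent behind these formulas (match the type, check that the number falls between Start and End, then pick the column) can be written out directly in Python. This is only a sketch of the logic, not a translation of XLOOKUP's match modes; `range_lookup` is a hypothetical name, the table holds just the two rows shown in the question, and `None` plays the role of "Error".

```python
# (type, start, end, {column header: value}) rows from the question's table.
TABLE = [
    (4002, 1, 7, {33: 1.17, 36: 1.34, 42: 1.5, 48: 1.84}),
    (4002, 8, 12, {33: 1.84, 36: 1.67, 42: 2.1, 48: 3.45}),
]

def range_lookup(table, type_, number, column):
    """Return the value for the row whose type matches and whose
    Start..End range contains number, or None if there is no such row."""
    for row_type, start, end, values in table:
        if row_type == type_ and start <= number <= end:
            return values.get(column)
    return None  # out of range or unknown type

in_first_range = range_lookup(TABLE, 4002, 5, 42)
in_second_range = range_lookup(TABLE, 4002, 8, 33)
out_of_range = range_lookup(TABLE, 4002, 13, 36)
```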

Summing every first month columns in Excel

I am trying to sum the first 7 columns, then the next 7 columns, and so on in Excel. For example, given the data below, I need the values added up weekly:
Day 02/01/2017 03/01/2017 04/01/2017 05/01/2017 06/01/2017 07/01/2017 08/01/2017 09/01/2017 10/01/2017 11/01/2017 12/01/2017 13/01/2017 14/01/2017 15/01/2017
Presented Calls 1000 1550 900 1455 789 987 1435 1200 1675 1230 1232 1400 999 650
So if I want to add the presented calls from 02/01 - 08/01 this should be sum(B2:H2)
Then the sum of the presented calls from 09/01 - 15/01 should be sum(I2:O2)
etc
However, at the moment Excel gives me sum(B2:H2) and then sum(C2:I2), which is incorrect. Can anyone help?
You can use the OFFSET() function combined with the COLUMN() function and a bit of arithmetic to get the desired range to sum.
Try entering this formula and fill across.
=SUM(OFFSET($B$2,0,(COLUMN()-COLUMN($B$2))*7,1,7))
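The arithmetic the formula relies on is that the n-th result starts 7 * n columns to the right of B2 and spans 7 columns. A small Python sketch of the same chunking, using the sample row from the question (`block_sums` is a hypothetical name):

```python
def block_sums(values, size=7):
    # Step through the row in jumps of `size`, summing each block,
    # just as the OFFSET formula jumps 7 columns per result cell.
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]

presented_calls = [1000, 1550, 900, 1455, 789, 987, 1435,
                   1200, 1675, 1230, 1232, 1400, 999, 650]
weekly = block_sums(presented_calls)
```

For the sample data the two weekly totals are 8116 and 8386. In the sheet, the first copy of the formula has to sit in column B so that COLUMN()-COLUMN($B$2) starts at 0.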

How to make a "trending" or "averaging" curve

I have a spreadsheet on which I've been tracking my weight for the last year.
I weigh myself nearly every day, and I can be off by as much as 5 pounds from day to day.
I would like to make a graph that shows the overall pattern of my weight loss / gain, but without all of the noise.
What are some formulas that I can use to calculate the overall trend?
Place the raw daily measurements in A1 thru A365. In B2 enter:
=(A1+A2+A3)/3
and copy down. Column B will give you a smoother dataset for plotting and trending.
Once you have enough data points a "moving average" will help reduce the daily noise. Let's say you have 10 data points starting in A1:
120.0 119.0 114.1 116.7 112.0 108.7 107.9 104.6 108.9 111.7
In cell C2 you could use the formula =AVERAGE(A1:C1) and copy it to the end of your data set. The relative references will always average the last 3 measurements.
Now your data looks like:
120.0 119.0 114.1 116.7 112.0 108.7 107.9 104.6 108.9 111.7
117.7 116.6 114.3 112.5 109.5 107.1 107.1 108.4
So your second row has far less variation than the raw data.
You can also get fancy and make the number of measurements variable. If that number were stored in A5 (below your data) then the formula would be something like
=AVERAGE(OFFSET(C1,0,0,1,-MIN(COLUMN(),$A$5)))
The MIN ensures that you don't go past the beginning of the data set (if you do a 5-day moving average you can't go back 5 days from the 4th day, etc.)
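Here is a minimal Python sketch of the same trailing moving average with a variable window, assuming the 10 sample values above. The name `moving_average` is hypothetical. Like the MIN in the OFFSET formula, it shortens the window at the start of the data instead of reaching past the first measurement.

```python
def moving_average(values, window):
    averages = []
    for i in range(len(values)):
        # Clip the window at the start of the data, like MIN(COLUMN(), $A$5).
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages

weights = [120.0, 119.0, 114.1, 116.7, 112.0, 108.7, 107.9, 104.6, 108.9, 111.7]
smoothed = [round(x, 1) for x in moving_average(weights, 3)]
```

From the third value onward this reproduces the second row shown above (117.7, 116.6, 114.3, ...). The first two entries average only the 1 or 2 values available so far.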
