Lookup previous date for the same item in an unsorted table - excel

I'm trying to create a way to QC daily data before pulling it into Spotfire to automate a monthly report. The data is sourced from the field, so the data quality isn't the greatest. Plus, some days get missed because nobody was working in the area that day.
I managed to do all of the calculations in Spotfire, but when I asked some folks about ways to QC the data, they all suggested doing the calculations, data cleaning, and QC in Excel before pulling it into Spotfire.
I'm not sure how to put a table in here, but here are the important columns.
Column A: unique well identifier
Column G: hours and value that I want to lookup
Column I: date
I got all of the calculations and filtering working in Spotfire and now want to replicate them in Excel.
The first thing I am trying to do is pull the previous date's HOURS value (Column H) for the same well. Some wells will have a value on the previous date and some might not.
The below formulas are a few that I tried out:
Formula 1: =DATE(YEAR(DATA!$I2),MONTH(DATA!$I2),DAY(DATA!$I2)-1)
Formula 2: =INDEX(A:I,MATCH(1,(I:I=DATA!$I2)*(A:A=DATA!$A2),1),9)
Results:
Formula 1 always yielded the previous day, which would work for 99% of the cases.
Formula 2 yielded #N/A

I don't know why you don't want to do the whole thing using Spotfire.
Here is an example of a Spotfire expression that gets the previous date (which seems to be what you want, but without a dataset it is hard to tell):
Min([Date_Col]) OVER (Previous([Date_Col]))
If the column currently holds a datetime, you may want to use the DatePart function first so that only the date is compared.
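For anyone who wants to check the lookup logic outside Excel or Spotfire, here is a minimal Python sketch. It assumes "previous date" means the most recent earlier date on record for the same well (not strictly the calendar day before), and that each well has at most one row per date. The function name `previous_hours` and the sample rows are made up for illustration.

```python
from datetime import date

def previous_hours(rows):
    """For each (well, day, hours) row, return the hours recorded on that
    well's most recent earlier date, or None if the well has no earlier row."""
    # Group readings by well first, so the order of the input does not matter.
    by_well = {}
    for well, day, hours in rows:
        by_well.setdefault(well, []).append((day, hours))

    result = []
    for well, day, _ in rows:
        earlier = [(d, h) for d, h in by_well[well] if d < day]
        # max() on (date, hours) tuples picks the latest earlier date.
        result.append(max(earlier)[1] if earlier else None)
    return result

# Hypothetical unsorted sample: W2 has no reading on 2 March.
rows = [
    ("W2", date(2021, 3, 3), 20.0),
    ("W1", date(2021, 3, 2), 24.0),
    ("W1", date(2021, 3, 1), 12.0),
    ("W2", date(2021, 3, 1), 8.0),
]
print(previous_hours(rows))
```

If you need the strict calendar day before instead, replace the `d < day` test with a comparison against `day - timedelta(days=1)`; wells with a missed day would then return None.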

Related

excel sumproduct formula calculating time

Hello dear forum members and admins,
I created a dashboard that calculates customer numbers from raw data, broken down by day and time slot.
That part works fine. I also want to calculate the monthly average customer numbers per time slot, and for this purpose I created a data table in a daily_pax_sheet. For example, if in January the total for CUSTOMERS E between 10:00 and 10:30 is 35, and that total was spread over 8 days, the result is roughly 4. In the daily_pax_details sheet, the formula in row 107 first calculates the sum of the data and then divides it by the number of days (8 in this example). In some cases the raw data has more than 20k lines, and the calculation takes far too long. Is there a quicker way to do this? How can I change the formula to make the calculation faster?
https://docs.google.com/spreadsheets/d/1y-2Ke2ssskzSM-wYszU54CIEraPbEX4X/edit#gid=459027650
UPDATE: thanks for the idea and solution from members
I also found another solution that may help other people in the future: a pivot table can count unique values if you change the value field settings. To do this, create the pivot table with "Add this data to the Data Model" selected, then change the value field settings to "Distinct Count". Hope it helps other people.
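The calculation described in the question (a total divided by the number of distinct days on which it occurred) can also be sketched in a single pass over the raw data. This is only an illustration in Python, not the workbook's formula; the function name `average_per_day`, the record layout, and the sample values are assumptions.

```python
from collections import defaultdict
from datetime import date

def average_per_day(records):
    """records: iterable of (day, slot, customers).
    Returns {slot: total customers / number of distinct days with data}."""
    totals = defaultdict(int)
    days = defaultdict(set)
    for day, slot, customers in records:
        totals[slot] += customers
        days[slot].add(day)  # a set keeps each day only once
    return {slot: totals[slot] / len(days[slot]) for slot in totals}

# Hypothetical sample: two rows fall on the same day for the first slot.
records = [
    (date(2021, 1, 4), "10:00-10:30", 3),
    (date(2021, 1, 4), "10:00-10:30", 5),
    (date(2021, 1, 5), "10:00-10:30", 4),
    (date(2021, 1, 5), "10:30-11:00", 6),
]
print(average_per_day(records))
```

Each raw row is read once, which is the same idea as letting a pivot table do the distinct count instead of recalculating a formula per cell.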
Your SUMIFS formulas use whole columns as criteria ranges, so every calculation checks more than 1 million cells. And because you use more than one column, every time you update anything in the file, Excel checks more than 4 million cells because of your formulas...
I changed them all to use the actual range of cells used in the raw data sheet instead of whole columns, and now it works perfectly.
As example, your formula is:
=SUMIFS('raw data'!$R:$R;'raw data'!$D:$D;'Daily customer'!C$3;'raw data'!$I:$I;'Daily customer'!$A4)
And I changed to:
=SUMIFS('raw data'!$R$2:$R$2461;'raw data'!$D$2:$D$2461;'Daily customer'!C$3;'raw data'!$I$2:$I$2461;'Daily customer'!$A4)
In your raw data sheet the data runs from row 2 to row 2461, so in every calculation Excel checks only 2460 cells per column, not 1 million.
Change all your formulas like this and you should notice better performance.
UPDATE: I've uploaded the modified file: https://docs.google.com/spreadsheets/d/11isonBHFJTFFWtZTJg66JHbtyrD0XFGl/edit?usp=sharing&ouid=114417674018837700466&rtpof=true&sd=true
It works smoothly for me. No lag or anything. I can change any cell value, move the graph, filter cells and everything is done almost instantly.

Excel - Get unique count based on multiple columns (including date)

I'm trying to get a unique count of data in Column B that fall into the month of June (Column A date field)
Screenshot of Spreadsheet
I highlighted the rows that fall within June in Orange and the duplicate data in red to make it easier to view.
Count Total formula is a simple:
=COUNTA(A:A)-1
Unique Data formula is:
=SUMPRODUCT(1/COUNTIF(B2:B21,B2:B21))
Count June formula is:
=COUNTIFS(A:A,">=01/06/2020",A:A,"<30/6/2020")
But I can't figure out how to get a count of unique data that falls within June (expected result is 13).
I've tried filter/unique formulas based on
Excel - Count unique values that meets multiple criteria
But I just can't get it to work. I know I could do it with VBA but this is part of a larger spreadsheet and every other part of the spreadsheet I've been able to do with Formulas, so would like to be able to do this last part with formulas too.
Anyone who can help will be a lifesaver; it's been driving me nuts for the last couple of hours.
In Excel 2016, which does not have the UNIQUE or FILTER functions, you can use this somewhat convoluted formula for a Unique count of June entries:
=SUM(IF(FREQUENCY(IF(LEN(IF(MONTH(Table1[Date])=6,Table1[Data],""))>0,MATCH(IF(MONTH(Table1[Date])=6,Table1[Data],""),IF(MONTH(Table1[Date])=6,Table1[Data],""),0),""),IF(LEN(IF(MONTH(Table1[Date])=6,Table1[Data],""))>0,MATCH(IF(MONTH(Table1[Date])=6,Table1[Data],""),IF(MONTH(Table1[Date])=6,Table1[Data],""),0),""))>0,1))
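This part of the formula: IF(MONTH(Table1[Date])=6,Table1[Data],"") returns an array containing the Data entry for each June row and an empty string for every other row.
The LEN(...)>0 test eliminates the resulting blanks.
MATCH then returns the position of the first occurrence of each entry, and FREQUENCY returns a nonzero count only at the first occurrence of each position, so IF(...>0,1) yields a 1 for each distinct entry.
Then we just add it up.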
Note that I used a Table and structured references, but you can convert it to regular addressing if you need to.
Of course, if you had Excel O365, you could use the simpler:
=COUNTA(UNIQUE(FILTER(Table1[Data],MONTH(Table1[Date])=6)))
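To sanity-check what either formula should return, the same "filter to June, then count distinct values" logic can be sketched in Python. Like the formulas above, it tests only the month and not the year. The function name `unique_count_in_month` and the sample rows are hypothetical.

```python
from datetime import date

def unique_count_in_month(rows, month):
    """rows: iterable of (day, value). Counts distinct non-empty values
    whose date falls in the given month number (1-12)."""
    return len({value for day, value in rows
                if day.month == month and value != ""})

# Hypothetical sample: "A" appears twice in June, "C" only in May.
rows = [
    (date(2020, 6, 1), "A"),
    (date(2020, 6, 15), "A"),
    (date(2020, 6, 30), "B"),
    (date(2020, 5, 31), "C"),
    (date(2020, 6, 2), ""),
]
print(unique_count_in_month(rows, 6))
```

The set comprehension plays the role of UNIQUE(FILTER(...)), and `len` plays the role of COUNTA.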

Excel - Take Average of Monthly Data

I have a historical data set for commodity pricing. Partway through the data set, prices start being recorded on specific days rather than as an average for the entire month. I want to keep the data set consistent, with only the average price for each month.
In the best-case scenario I would use an AVERAGEIF function; however, the data for each month doesn't have a consistent number of days.
How can I automate this process? If the month is the same as the previous row and different from the next row, calculate the average of the rows above (^) until you hit a different month.
Here's a simple display of what I mean: [image]
You can use a pivot table to get the output you want. It will also be neatly organized instead of having your averages mixed in with a mostly blank column. Photo below shows the set-up/output of a pivot table generated with random data.
For a solution without pivot tables, you can use the following formula :
=AVERAGEIFS($B$1:$B$30;$A$1:$A$30;">="&(A1-DAY(A1)+1);$A$1:$A$30;"<="&EOMONTH(A1;0))
The above example is from cell C1 and can be copied down the entire list. The dates are in $A$1:$A$30 and the values in $B$1:$B$30. The first condition tests against the first day of the month (calculated as A1-DAY(A1)+1), and the second against the last day of the month (calculated as EOMONTH(A1;0)).
This puts the month's average on every row, and it works even if your data is not sorted by date. If your data is sorted by date and you only want to display one number per month in the column (as in your own example), you can wrap the formula in an additional IF statement:
=IF(EOMONTH(A2;0)=EOMONTH(A1;0);"";AVERAGEIFS($B$1:$B$30;$A$1:$A$30;">="&(A1-DAY(A1)+1);$A$1:$A$30;"<="&EOMONTH(A1;0)))
So it will display empty in all cells, except where the month changes.
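As a cross-check on the AVERAGEIFS result, here is a small Python sketch of the same grouping: every price is assigned to its (year, month) bucket and averaged, regardless of how many days each month has or how the rows are ordered. The function name `monthly_averages` and the sample prices are made up for illustration.

```python
from collections import defaultdict
from datetime import date

def monthly_averages(rows):
    """rows: iterable of (day, price). Returns {(year, month): average price}."""
    buckets = defaultdict(lambda: [0.0, 0])  # running [sum, count] per month
    for day, price in rows:
        bucket = buckets[(day.year, day.month)]
        bucket[0] += price
        bucket[1] += 1
    return {key: total / count for key, (total, count) in buckets.items()}

# Hypothetical unsorted sample with a different number of days per month.
rows = [
    (date(2020, 2, 10), 30.0),
    (date(2020, 1, 3), 10.0),
    (date(2020, 2, 11), 40.0),
    (date(2020, 1, 20), 20.0),
    (date(2020, 2, 25), 50.0),
]
print(monthly_averages(rows))
```

Keying on both year and month matches the date-range conditions in the formula, which also keep January of one year separate from January of the next.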

Excel duplicate date subtraction

I have a table in Excel, with names in Column A, and dates in Column B. Names are present several times for the most part, each carrying a payment date in Column B. So if someone received one payment, the name is there once, with a date. If someone received eight, the name is there eight times, with eight different dates.
What I'm looking for is a method, to take each name (not the occurrence, but the same string), and present the difference of the maximum and the minimum date for every string that is the same (i.e. the date range of payments for every single person).
I tried basically everything in Excel. Conditional formatting and Pivot Tables did not help (the latter can only add, not subtract when using PT). Manual work would take a lot of time even if specifying min and max values for the entries, since the table has 17033 rows with 2218 unique names.
I would be grateful if you could help. I suspect the solution is not that hard, but I cannot really get my head around it.
You can use a standard array formula to list the distinct names e.g. in D2:-
=INDEX($A$2:$A$10000,MATCH(1,(COUNTIF($A$1:$D1,$A$2:$A$10000)=0)*($A$2:$A$10000<>""),0))
Then another one to find the maximum for each name and subtract the minimum:-
=MAX(IF($A$2:$A$10000=$D2,$B$2:$B$10000))-MIN(IF($A$2:$A$10000=$D2,$B$2:$B$10000))
But these are slow with ~10000 rows - a pivot table is much faster.
I would keep things really simple by putting the minimum and maximum date as value fields in the pivot table and manually adding a formula to subtract one from the other - will post a screenshot later.
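The max-minus-min logic itself is simple enough to verify on a small sample outside Excel. Below is a minimal Python sketch that tracks the earliest and latest date per name in one pass and reports the difference in days. The function name `payment_spans` and the sample names and dates are hypothetical.

```python
from datetime import date

def payment_spans(rows):
    """rows: iterable of (name, day). Returns {name: days between the
    earliest and latest payment date for that name}."""
    first, last = {}, {}
    for name, day in rows:
        first[name] = min(first.get(name, day), day)
        last[name] = max(last.get(name, day), day)
    return {name: (last[name] - first[name]).days for name in first}

# Hypothetical sample: Ann has three payments, Bob has one.
rows = [
    ("Ann", date(2021, 1, 20)),
    ("Bob", date(2021, 2, 2)),
    ("Ann", date(2021, 3, 1)),
    ("Ann", date(2021, 1, 5)),
]
print(payment_spans(rows))
```

A name with a single payment gets a span of 0, which is also what the array formula returns in that case.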

Return all cell values in a row if Month & Year match a dd/mm/yyyy formatted date cell in that row

This is possibly a tough one so forgive me while I spell out the exact circumstances and what is needed:
We have a workbook which is populated with data from our support calls (most of it is text) which is then analysed for certain criteria which are just Yes/No options (i.e. incorrect information received, late notice, etc). One of the columns is the date the support call was raised, formatted as dd/mm/yyyy. All of this data is on sheet 2.
What I have been asked to do is to have a management-friendly interface on sheet 1:
two boxes on sheet 1 labelled 'Month' and 'Year' - a manager can enter say January in Month and 2014 in Year and if they match the dd/mm/yyyy of any calls which were raised, this will then extract the whole row of values and place on sheet 3.
On sheet 1, a graph will then be populated from the data on sheet 3 to show things like how many support calls had incorrect information.
Any ideas? I've tried going through VLOOKUPS, MATCH and INDEX and can't find anything which makes any sense to me.
UPDATE
Thanks for all the input in such a short time frame. Apologies for not providing more information first time around - was on a tight deadline and had limited time to write the original post. Many thanks to both user2140261 and Scott Gall for the hints and explanations concerning pivot tables. I think that has given me enough information to head in the right direction (I ended up having to do the first graph manually, but seem to have some promising results with my first attempts with pivot tables and charts) so thank you once again.
When I have this properly worked out, I'll post some dummy information showing how it works in case anyone with a similar problem finds it useful.
May I suggest the use of a pivot table?
First break the Date Field into multiple columns on sheet two (which do not need to be visible btw).
Formula for Month: =TEXT(<date cell>,"mmm")
Formula for Year: =TEXT(<date cell>,"yyyy")
Then insert a pivot table on Sheet 1 using the whole data range on sheet two as the source.
Set the two new Columns (Month and Year) as Filters and the user can simply pick a month and year to view (multi select should also be available)
You will need to play around with what to put in the rows and columns and values a bit...
Keep in mind that the default "Values" calculation Excel applies may be Count (it is for text fields). This is rarely the desired measure; usually you want Sum. You can change it by clicking the small down arrow for the value field and editing the field settings.
Note your graph can then be fed from the resulting pivot table.
Hope you find this helpful... if so please vote up.
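To make the Month/Year filter described in the question concrete, here is a rough Python sketch of the row extraction and of one figure the graph might show. It assumes the call date is already a real date (not text) and sits first in each row, followed by a Yes/No flag. The names `rows_for_month` and `MONTHS` and the sample rows are made up for illustration.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")

def rows_for_month(rows, month_name, year):
    """Return every row whose date (first field) falls in the given
    month name and year, keeping the whole row."""
    month = MONTHS.index(month_name) + 1
    return [row for row in rows
            if row[0].month == month and row[0].year == year]

# Hypothetical sample: (date raised, incorrect information received?)
rows = [
    (date(2014, 1, 7), "Yes"),
    (date(2014, 1, 21), "No"),
    (date(2014, 2, 3), "Yes"),
    (date(2013, 1, 9), "Yes"),
]
selected = rows_for_month(rows, "January", 2014)
incorrect = sum(1 for row in selected if row[1] == "Yes")
print(len(selected), incorrect)
```

This corresponds to setting the Month and Year filters on the pivot table and counting the "Yes" values in the filtered result.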
