Create a table with the count of processed and received claims per Interval Band - powerbi-desktop

I have data containing the received time and processed time of each claim. I've created a column that bands each claim by the hour it was received/processed. Now I need to create a table in Power BI that shows the count of received/processed claims per Interval Band.
Data and a sample table from Excel are attached.
Using these columns, I created two new Power BI tables, one for the processed count and one for the received count.
Processed band = SELECTCOLUMNS('PA Online Receive',"Processed IntervalBand",'PA Online Receive'[Processed Interval Band],"ActualProcessedTime",'PA Online Receive'[Actual PROCESSED TIME])
Received band = SELECTCOLUMNS('PA Online Receive',"ReceivedCount",COUNT('PA Online Receive'[Received Interval Band]),"Received IntervalBand",'PA Online Receive'[Received Interval Band],"ActualProcessedTime",'PA Online Receive'[Actual PROCESSED TIME])
I then built a new table with the distinct TAT Bands (a) 0-1, b) 1-2, etc.) and related the processed/received interval band columns and the actual processed time to the main table.
The visual built from these tables shows an incorrect count.
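For reference, the counting logic the question is after can be sketched outside Power BI. The Python below is only an illustration of the expected result, not DAX and not the asker's model: the `hour_band` labelling, the function names, and the sample claims are all assumptions. The point it shows is that the received count and the processed count are each grouped by their own band column, independently of one another.

```python
from collections import Counter
from datetime import datetime


def hour_band(ts: datetime) -> str:
    """Label a timestamp with a one-hour band such as '09-10' (assumed band format)."""
    return f"{ts.hour:02d}-{(ts.hour + 1) % 24:02d}"


def band_counts(claims):
    """Return {band: (received_count, processed_count)} for (received, processed) pairs."""
    received = Counter(hour_band(r) for r, _ in claims)
    processed = Counter(hour_band(p) for _, p in claims)
    bands = sorted(set(received) | set(processed))
    # A Counter returns 0 for a band that only appears in the other column.
    return {band: (received[band], processed[band]) for band in bands}


# Made-up claims: (received time, processed time)
claims = [
    (datetime(2021, 8, 9, 9, 5), datetime(2021, 8, 9, 9, 50)),
    (datetime(2021, 8, 9, 9, 40), datetime(2021, 8, 9, 10, 15)),
    (datetime(2021, 8, 9, 10, 20), datetime(2021, 8, 9, 10, 45)),
]
result = band_counts(claims)
```

With this sample, band 09-10 has two received and one processed claim, and band 10-11 has one received and two processed.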

Related

Merging the 30-second sampling rate epochs into 30 minutes sampling epochs in the Excel File

I am a PhD student in Sport Science. My Excel file has five columns (Date, Time, ExactTime (computed by me using an Excel function), Activity Level), as shown in the attached photo. In the "ExactTime" column, each row gives the activity level at a 30-second interval. However, my PhD supervisor would like an Excel file containing the average activity level for each 30-minute interval, rather than the default 30-second interval. For instance, the first row would become 09-08-2021 12:12:00 (in the first column) together with the average activity level from 09-08-2021 11:42:00 to 12:12:00. I would be grateful for some step-by-step guidelines on how to do this. Many thanks! The link to my data file is here: https://drive.google.com/drive/folders/1roIDdcxGwsq9l630YR0gapQ_yM9hvf0g?usp=sharing
Goal: an Excel file containing the average activity level for each 30-minute interval, rather than the default 30-second interval.
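The aggregation described here can be sketched in Python as a rough illustration of the logic. Several things are assumptions: windows are anchored at the first timestamp, each window includes its start and excludes its end, and each output row is labelled with the window end (as in the 12:12:00 example). The function name and sample values are made up.

```python
from datetime import datetime, timedelta
from statistics import mean


def average_by_window(samples, window=timedelta(minutes=30)):
    """Average (timestamp, value) samples over consecutive windows.

    Windows start at the earliest timestamp; returns (window_end, mean_value) rows.
    """
    samples = sorted(samples)
    if not samples:
        return []
    start = samples[0][0]
    buckets = {}
    for ts, value in samples:
        index = (ts - start) // window  # timedelta // timedelta gives an int
        buckets.setdefault(index, []).append(value)
    return [(start + (i + 1) * window, mean(vals)) for i, vals in sorted(buckets.items())]


# Made-up data: one reading every 30 seconds for an hour, starting at 11:42:00
first = datetime(2021, 8, 9, 11, 42)
samples = [(first + timedelta(seconds=30 * i), 10 if i < 60 else 20) for i in range(120)]
rows = average_by_window(samples)
```

Here `rows` holds two entries, one ending at 12:12:00 and one ending at 12:42:00, each averaging the 60 readings that fall inside its window.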

Use Excel to calculate average and stdev of time differences in a time series?

EDIT1: download file with 2 days of real data
My home automation controller collects data from several 4-in-1 motion sensors in different rooms of my house. The sensor prioritizes motion, sending motion reports every few seconds, but also independently reports temperature, humidity, and illuminance. I am trying to determine if the temp and humidity reports are sent frequently enough to automate control of heaters and exhaust fans.
Sensors independently report each category to the controller, which sends the data to Excel. Sample data is below, without the motion reports that clutter up the real data.
A pivot table generated from the raw data:
Answering the question of frequency takes me several manual steps: sorting/filtering the dataset for temp/humidity by room, then manually adding a time diff column where time diff = (<current Date-Time cell> - <prev Date-Time cell>)*24*60. I then calculate the average and stdev of minutes between reports by manually selecting, in turn, each room/category subset in the time diff column; once for the average and once for the stdev.
After a few more manual steps, I end up with this desired result:
BUT I have to do it all over every time new data is added to the table. I'm certain Excel can do this automatically, but I didn't find a solution through pivots, power pivots, slicing, or queries. I'm hoping one of you Excel gurus can help. Thanks!
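The manual steps above (group by room and category, sort by time, take consecutive differences in minutes, then average and stdev) can be sketched in Python as follows. This only illustrates the calculation and is not an Excel solution; the function name, the row layout, and the sample readings are assumptions. `statistics.stdev` is the sample standard deviation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def report_gap_stats(rows):
    """rows: iterable of (timestamp, room, category).

    Returns {(room, category): (mean_minutes, stdev_minutes)} of the gaps
    between consecutive reports in each group.
    """
    grouped = defaultdict(list)
    for ts, room, category in rows:
        grouped[(room, category)].append(ts)
    stats = {}
    for key, stamps in grouped.items():
        stamps.sort()
        gaps = [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]
        if len(gaps) >= 2:  # a sample stdev needs at least two gaps
            stats[key] = (mean(gaps), stdev(gaps))
    return stats


# Made-up readings
rows = [
    (datetime(2021, 1, 1, 10, 0), "Kitchen", "temp"),
    (datetime(2021, 1, 1, 10, 10), "Kitchen", "temp"),
    (datetime(2021, 1, 1, 10, 30), "Kitchen", "temp"),
    (datetime(2021, 1, 1, 10, 0), "Bath", "humidity"),
    (datetime(2021, 1, 1, 10, 5), "Bath", "humidity"),
]
stats = report_gap_stats(rows)
```

Because the statistics are recomputed from the raw rows each time, adding new data only requires calling the function again. Groups with fewer than two gaps are left out rather than reported with an undefined stdev.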

Pivot Table project - Avoid using many INDEX and MATCH functions that make Excel crash

I need some help with an Excel project that's giving me headaches. I managed to achieve everything I wanted, but the result is too heavy for Excel and it crashes all the time. I'm over-using the INDEX and MATCH functions on large tables (50 000+ lines) and Excel doesn't like it. I'm looking for a lighter way to do the same thing.
Here's what I achieved: I created a report that helps me analyze my employees' performance versus their billing targets. To create this report, I used a Pivot Table.
That Pivot Table needs this information as its source:
Each sale that every employee made (amount in $ and date)
The hourly rate of each employee (which changes for every period, see TABLE1 below)
The billing target for each employee (which changes for every period, see TABLE1 below)
Here's my setup. I have 3 tables :
TABLE1 (See attached image) - A table where I manually input data for each of my employees (hourly rate and billing target). Their billing target and hourly rate change every period. So, each period has a different line and I indicate the first day of the period and the last day of the period.
TABLE2 (See attached image) - Table that contains sales data exported from another software I use. Each line represents an amount sold by an employee to a customer on a specific date. This table is pretty heavy and contains more than 50 000 lines. Moreover, the last 2 columns of this table use Index and Match functions to get the right hourly rate and the right billing target from TABLE1. That means that each of those 50 000 lines uses the INDEX and MATCH functions twice… This part is too heavy for Excel and I need a workaround.
Moreover, TABLE2 is refreshed every few days with new data coming from my other software (an ERP). So the solution must take that into account and must be permanent (I want to avoid steps that have to be repeated every time I refresh TABLE2 with new data).
TABLE3 - A Pivot Table that uses TABLE2 as its data source. I use the slicer to select the name of an employee and a timeline to specify which months I want to display. Then the Pivot Table shows my employee's statistics grouped by months. The main statistic is the amount of "billed hours" for each employee, which is in reality the amount of sales made by that employee, divided by their hourly rate on a specific date.
My thoughts :
It is absurd that TABLE2 uses that many INDEX and MATCH functions. For example, if Employee1 made 500 sales between 2020-07-01 and 2020-07-31 (the same month, thus the same period, thus the same hourly rate and billing target), there will be 500 different lines that will use INDEX and MATCH to get the same hourly rate and billing target from TABLE1. That leads to a lot of duplicated calculation and a lot of duplicated data.
Would it be possible for a Pivot Table Calculated Field to use INDEX and MATCH in its formula? And would it be lighter for Excel to do so?
Another way would be to add, at the bottom of TABLE2, 12 lines per year (1 for each month) for every employee where I would write their hourly rate and the billing target. That way, the Pivot Table would be able to display an hourly rate and a billing target for each month, for each employee. That solution would work and would be lighter for Excel, but it would create a high risk of making mistakes while manually inputting the data.
I'm open to all suggestions including VBA!
Thank you very much for your precious time!
EDIT : FORMULA
As requested, here's my INDEX AND MATCH formula that is in TABLE2 and gets the hourly rate from TABLE1 :
=INDEX(TAB_Employee_Data[[#All];[Hourly_Rate]];MATCH([#[Date (Cell)]]; IF(TAB_Employee_Data[[#All];[Name]]=[#[Employee(Cell)]];TAB_Employee_Data[[#All];[First day of the period]]);1))
TAB_Employee_Data is the tab that contains "TABLE1".
I translated the names of the fields since all my work is in French.
This formula does the following : it searches the name of an employee in TABLE1 and finds the period which fits the date of a line in TABLE2.
Also, to work properly, I need to sort the lines of TABLE1 in chronological order.
TABLE 1 :
TABLE 2:
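The lookup that the INDEX/MATCH formula performs (for an employee, find the latest period whose first day is on or before the sale date) can be sketched in Python with one sorted index per employee and a binary search. This is a sketch of the logic only, not a drop-in Excel or VBA fix; the function names, tuple layout, and sample figures are assumptions.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


def build_period_index(periods):
    """periods: iterable of (name, first_day, hourly_rate, billing_target).

    Returns {name: (sorted_first_days, [(hourly_rate, billing_target), ...])}.
    """
    by_name = defaultdict(list)
    for name, first_day, rate, target in periods:
        by_name[name].append((first_day, rate, target))
    index = {}
    for name, rows in by_name.items():
        rows.sort()  # chronological order, so TABLE1 itself need not be sorted
        index[name] = ([r[0] for r in rows], [(r[1], r[2]) for r in rows])
    return index


def lookup(index, name, sale_date):
    """Return (hourly_rate, billing_target) of the period containing sale_date."""
    starts, values = index[name]
    pos = bisect_right(starts, sale_date) - 1  # last period starting on or before the date
    if pos < 0:
        raise KeyError(f"no period for {name} on {sale_date}")
    return values[pos]


# Made-up TABLE1 rows
index = build_period_index([
    ("Employee1", date(2020, 8, 1), 110, 5500),
    ("Employee1", date(2020, 7, 1), 100, 5000),
])
july = lookup(index, "Employee1", date(2020, 7, 15))
august = lookup(index, "Employee1", date(2020, 8, 1))
```

The index is built once from TABLE1, so each of the 50 000+ sales lines costs a single binary search instead of a scan over the whole table.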

Statistical functions on non-numerical value

I am not looking for any code or formula but a rationale/logic.
Background: My data set comes in Date/Time format where a new timestamp is created for each new occurrence of an event.
My goal is to calculate the number of occurrences within each hour for a given day. Unfortunately, the system does not capture the number of occurrences per period as integers. So I have to count the number of times an hour value appears, i.e. the number of times the 4 o'clock hour appears. I am currently using a Pivot Table in Excel to count the number of times each hour appears. The fields in Rows are hour and date, and the field in Values is count of hour.
The trouble is that I cannot use any summarize functions to get things like sum, min, max, percentile, and standard deviation. For example, changing count to sum will only add up the hour values, so the sum of the 4 o'clock hour will return 12 instead of 3. So I am having to use array formulas on the pivot table to give me max, min, etc.
If I were to use this data in data viz tools like Tableau or Power BI, I wouldn't be able to get very far. I am looking for suggestions/workarounds that would allow me to reshape my data so it can be used in Pivot Tables in Excel and in data viz tools.
I know my question is not specific to one tool, but I am looking to enhance my understanding of data and data manipulation techniques.
EDIT: Please see attached image
Build a data model, using PowerPivot. Join your fact table to a calendar dimension table. Create a row count measure - you can then summarise that measure to suit (sum, average, min, etc)
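The idea behind a row count measure can be illustrated in Python: first turn the timestamps into one integer count per (date, hour), then summarise those counts. This is only a sketch of the rationale, not PowerPivot; the function names and sample timestamps are made up. Note that a (date, hour) pair with no events simply does not appear here, which is the gap a calendar dimension table would fill.

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def hourly_counts(timestamps):
    """Count events per (date, hour); each timestamp is one occurrence."""
    return Counter((ts.date(), ts.hour) for ts in timestamps)


def summarise_hour(counts, hour):
    """Summarise the per-day counts for one hour of the day."""
    values = [n for (_, h), n in counts.items() if h == hour]
    if not values:
        return None  # the hour never occurs in the data
    return {"sum": sum(values), "min": min(values), "max": max(values), "mean": mean(values)}


# Made-up events: three in the 4 o'clock hour on day 1, one on day 2
events = [
    datetime(2021, 1, 1, 4, 5),
    datetime(2021, 1, 1, 4, 20),
    datetime(2021, 1, 1, 4, 50),
    datetime(2021, 1, 2, 4, 10),
]
counts = hourly_counts(events)
summary = summarise_hour(counts, 4)
```

Once the counts exist as numbers, sum, min, max and the like behave as expected: day 1 contributes 3 for the 4 o'clock hour, not 12.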

Filtering block data in Excel by rows (weather forecast data from National Weather Service)

I need to get weather forecast data for a period of several years for 18 US cities. I can obtain the data from the National Weather Service, but the problem is that the Excel file contains data for all US cities and the data is presented in blocks. I got the data in uncompressed format from here: http://www.mdl.nws.noaa.gov/~mos/archives/mex.html. For example, here is the data for January 2006: https://www.dropbox.com/s/lf552bgbwdyusli/01.2006.xlsx. I need only the min and max temperature for 18 particular cities, which means I need only the first 5 rows of data for each city. The data cannot be sorted by columns and cannot be transposed, as it contains text cells.
I was wondering whether there is a faster way to select the data I need than searching for each particular city and copy and pasting the data in a new worksheet. This method would be fine if I had only a couple of files of data, but I have around 100 of them.
Thank you very much for your help!
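One way to avoid the manual copy and paste is to scan the rows once and keep the first few rows of every block that belongs to a wanted city. The Python below is a sketch under loud assumptions: each block is assumed to start with a header line whose first token is the station identifier and which contains the text "MOS GUIDANCE", and the file is assumed to have been read into a list of text lines already. The function name, the marker, and the sample lines are hypothetical placeholders, not the real file layout.

```python
def extract_blocks(lines, stations, rows_per_block=5, header_marker="MOS GUIDANCE"):
    """Keep the first rows_per_block lines (header included) of each wanted block.

    Returns {station: [block, block, ...]}, one block per matching header found.
    """
    wanted = set(stations)
    result = {}
    current, remaining = None, 0
    for line in lines:
        if header_marker in line:  # assumed block header
            parts = line.split()
            station = parts[0] if parts else ""
            if station in wanted:
                current, remaining = station, rows_per_block
                result.setdefault(station, []).append([])
            else:
                current, remaining = None, 0
        if current is not None and remaining > 0:
            result[current][-1].append(line)
            remaining -= 1
    return result


# Placeholder lines, not real forecast data
lines = [
    "AAAA MOS GUIDANCE day1", "a1", "a2", "a3",
    "BBBB MOS GUIDANCE day1", "b1", "b2",
    "AAAA MOS GUIDANCE day2", "c1", "c2", "c3",
]
blocks = extract_blocks(lines, ["AAAA"], rows_per_block=3)
```

The same function could be applied to each of the roughly 100 files in a loop, with the 18 station identifiers passed as `stations`.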

Resources