Reformatting Excel data structure from multiple rows into columns/rows format

I have the following case: my data is in a format like:
date - timestamp - value, where there are 29 values per day, so the date stays the same across those rows.
What I need is something like:
one row per date
the 29 timestamps of a day as columns
the values where date and timestamp meet
Is there a way to reformat the structure completely? As the data set is pretty big (10 years, 29 values per day), it would take ages to do manually. I need the result in the normal Excel format so I can easily import the data in C#.
What i have:
22.03.2018 08:00 200
22.03.2018 08:30 202
22.03.2018 ...
22.03.2018 22:00 120
23.03.2018 08:00 12
What I want:
08:00 08:30 ... 22:00
22.03.2018 200 202 ... 120
23.03.2018 12
Any help would be appreciated :)
Br
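If the workbook can be processed outside Excel first, this long-to-wide reshaping is a pivot. Below is a minimal pandas sketch, not a definitive solution. It assumes the three columns are named date, time and value (hypothetical names, as is the sample data) and that no date/time pair occurs twice, since pivot raises an error on duplicates.

```python
import pandas as pd

# Hypothetical sample in the long layout from the question:
# one row per (date, time) pair.
long_df = pd.DataFrame(
    {
        "date": ["22.03.2018", "22.03.2018", "22.03.2018", "23.03.2018"],
        "time": ["08:00", "08:30", "22:00", "08:00"],
        "value": [200, 202, 120, 12],
    }
)

# Parse the dd.mm.yyyy text so rows sort chronologically, not alphabetically.
long_df["date"] = pd.to_datetime(long_df["date"], format="%d.%m.%Y")

# One row per date, one column per timestamp; missing combinations become NaN.
wide_df = long_df.pivot(index="date", columns="time", values="value")

# wide_df.to_excel("wide.xlsx") would write the result back to a workbook
# (the file name is an assumption; an Excel writer such as openpyxl is needed).
```

The resulting frame has the 29 timestamps as column headers once the full data set is loaded, which matches the "What I want" layout.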

Related

Count values of an Excel column in different ranges

I want to count number of values in different ranges in an Excel column.
Example 1:
Imagine I have some data in 40 rows, each of which happened at a different time of day, like the image below:
Now I want to count the number of rows that fall in different ranges: for example, the number of rows with a time between 12:00 and 18:00, then 18:00 to 00:00, and so on up to 11:59 (just before the next 12:00).
Time range Count
00:00 to 06:00 ?
06:00 to 12:00 ?
12:00 to 18:00 ?
18:00 to 23:59 ?
In the end I would have a table with 4 rows that shows how many rows fall in each range, and I can create a chart from that.
Example 2:
Count people by age range. The result would look like this:
Age range Count
12 to 18 3
18 to 25 5
25 to 35 4
35 to 45 1
45 to 60 2
P.S:
I used COUNTIFS with a logical AND, but it didn't work, like this: =COUNTIFS(C:C,"AND(<00:00, >2:00)")
A correct use of COUNTIFS (which is different from COUNTIF) would be:
'Counts values strictly between 00:00 and 2:00
=COUNTIFS(C:C,">00:00",C:C,"<2:00")
Hope it helps
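For comparison, the same range counting can be sketched outside Excel with pandas. This is only an illustration: the sample times, the variable names and the labels are assumptions, and bucketing by the hour alone works only because every range boundary falls on a whole hour.

```python
import pandas as pd

# Hypothetical times of day, one per row.
times = pd.Series(["00:15", "06:00", "11:30", "12:00", "13:10", "17:45", "23:59"])
hours = pd.to_datetime(times, format="%H:%M").dt.hour

edges = [0, 6, 12, 18, 24]
labels = ["00:00 to 06:00", "06:00 to 12:00", "12:00 to 18:00", "18:00 to 24:00"]

# right=False makes each range include its lower edge and exclude its upper
# edge, so 06:00 is counted in "06:00 to 12:00" and not in the range before it.
slots = pd.cut(hours, bins=edges, labels=labels, right=False)

# One count per range, kept in range order.
counts = slots.value_counts().sort_index()
```

The age example works the same way with edges such as [12, 18, 25, 35, 45, 60] applied directly to the age column.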

Pandas groupby Month with annual summary

I have a list of items describing some orders placed like this
items=[('September 2021',1,40),('June 2022',1,77),....]
To get a dataframe grouped by date that shows how many orders I received and how much I was paid, I do the following:
tabla2=tabla.sort_values(by=['Date']).groupby(['Date']).agg({'Subscriptions':'count','Total amount (€)':'sum'}).astype('float64').round(2)
What I want is to include a row with the yearly numbers after the months of each year, and a Totals row at the bottom.
For the totals I do the following:
df1=pd.DataFrame(pd.Series({'Date':"<b>Totals</b>",'Subscriptions':"<b>{}</b>".format(tabla['Subscriptions'].sum().astype('int')),
'Total amount (€)':"<b>{}</b>".format(tabla['Total amount (€)'].sum().round(2))})).T.set_index(['Date'])
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
tabla2=pd.concat([tabla2,df1])
The <b> is for making it bold later when representing it with plotly.
So I end up having something like this
Date Subscriptions Total amount (€)
September 2021 15 345
.... ... ...
<b>2021</b> 132 1256
June 2022 17 452
... ... ...
<b>2022</b> 144 3215
<b>Totals</b> 1234 4567
What is the most pythonic way to accomplish this from the tabla2 dataframe?
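One possible approach, offered as a sketch and not as the definitive answer: aggregate per month first, then walk through the years and append a subtotal row after each year's months. The sample orders and the helper column Period are assumptions, and the month labels are assumed to be English names that strptime can parse.

```python
import pandas as pd

# Hypothetical orders in the (month label, subscriptions, amount) shape above.
items = [("September 2021", 1, 40.0), ("September 2021", 1, 35.5),
         ("December 2021", 1, 20.0), ("June 2022", 1, 77.0)]
tabla = pd.DataFrame(items, columns=["Date", "Subscriptions", "Total amount (€)"])

# Parse the label so months sort chronologically instead of alphabetically.
tabla["Period"] = pd.to_datetime(tabla["Date"], format="%B %Y")
monthly = tabla.groupby("Period").agg(
    {"Subscriptions": "count", "Total amount (€)": "sum"}
)

pieces = []
for year, block in monthly.groupby(monthly.index.year):
    # Monthly rows of this year, relabelled as "Month Year".
    pieces.append(block.set_axis(block.index.strftime("%B %Y")))
    # Yearly subtotal row, labelled in bold for the later plotly rendering.
    subtotal = block.sum().to_frame().T
    subtotal.index = [f"<b>{year}</b>"]
    pieces.append(subtotal)

totals = monthly.sum().to_frame().T
totals.index = ["<b>Totals</b>"]

tabla2 = pd.concat(pieces + [totals]).round(2)
```

Building the pieces in a list and concatenating once avoids repeated appends, and keeping the datetime index until the last step is what keeps the months in calendar order.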

Average column by specific datetime associated values

I have one column with the time in the format "dd/mm/yyyy hh:mm" and another with the temperature for that time point. I am looking to calculate the average day and night temperature of each month separately, i.e. average all temperatures between 06:00 and 18:00 in May and all temperatures between 18:00 and 06:00 in May, then the same for March, and so on.
Time Celsius(C)
06/05/2016 10:49 28
06/05/2016 11:49 29
06/05/2016 12:49 31
06/05/2016 13:49 27.5
06/05/2016 14:49 24
06/05/2016 15:49 25
06/05/2016 16:49 24.5
06/05/2016 17:49 23.5
06/05/2016 18:49 23
06/05/2016 19:49 22.5
06/05/2016 20:49 22.5
I am currently using the following formula:
=AVERAGEIFS(C2:C3643,B2:B3643,">=01/05/2016",B2:B3643,"<=31/05/2016",B2:B3643,">=01/05/2016 06:00",B2:B3643,"<=31/05/2016 18:00")
This is meant to calculate the average when the date is within May and the time is during the day. However, it doesn't appear to work: when I change the hour periods it still returns the same number (which is the average for the whole month).
You can use a long SUMPRODUCT formula.
For 06:00 to 18:00 in May:
=SUMPRODUCT(($A$2:$A$12>=DATE(2016,5,1))*($A$2:$A$12<=DATE(2016,5,31))*(MOD($A$2:$A$12,1)>=TIME(6,0,0))*(MOD($A$2:$A$12,1)<=TIME(18,0,0))*B2:B12)/SUMPRODUCT(($A$2:$A$12>=DATE(2016,5,1))*($A$2:$A$12<=DATE(2016,5,31))*(MOD($A$2:$A$12,1)>=TIME(6,0,0))*(MOD($A$2:$A$12,1)<=TIME(18,0,0)))
You can always replace all the DATE() and TIME() parts with cell references instead of hard coding them.
To get the values between 18:00 and 06:00, turn the time test into an OR by joining the two time Booleans with + instead of *:
=SUMPRODUCT(($A$2:$A$12>=DATE(2016,5,1))*($A$2:$A$12<=DATE(2016,5,31))*((MOD($A$2:$A$12,1)<=TIME(6,0,0))+(MOD($A$2:$A$12,1)>=TIME(18,0,0)))*B2:B12)/SUMPRODUCT(($A$2:$A$12>=DATE(2016,5,1))*($A$2:$A$12<=DATE(2016,5,31))*((MOD($A$2:$A$12,1)<=TIME(6,0,0))+(MOD($A$2:$A$12,1)>=TIME(18,0,0))))
This relies on you creating a table of months and time ranges like below:
Enter this formula in E2 and drag around as needed. It's an array formula, so must be entered with Ctrl-Shift-Enter:
=AVERAGE(IF(
((MONTH($A$2:$A$101)=MONTH(E$1&1))*
((MOD(HOUR($A$2:$A$101)-LEFT($D2,2),24))>=0)*
((MOD(HOUR($A$2:$A$101)-LEFT($D2,2),24))<12)),
$B$2:$B$101))
Notes:
The MONTH(E$1&1) part lets you get a month number from text like "Jan".
I used MOD and subtraction of the left part of the time range to get the target hour in a range from 0 to 23. This made it possible to filter on values between 0 and 11.
If Barry Houdini were still around he could do it in half the space, I'm sure.
When a period runs past midnight, such as 18:00 to 06:00 the next day, I find it useful to shift the times back first and then do the calculations.
6:00 to 18:00 =AVERAGE(IF((MONTH($A$2:$A$12-0.25)=D2)*(MOD($A$2:$A$12-0.25,1)<0.5),$B$2:$B$12,""))
18:00 to 6:00 =AVERAGE(IF((MONTH($A$2:$A$12-0.25)=D2)*(MOD($A$2:$A$12-0.25,1)>=0.5),$B$2:$B$12,""))
These are array formulas entered with Ctrl-Shift-Enter.
Here I am offsetting the time by 0.25 days, which is 6 hours, so the day period becomes 00:00 to 12:00 and the night period becomes 12:00 to 24:00.
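The same 6-hour offset idea can be sketched in pandas. The sample readings and the column names Time and Celsius are assumptions. As with MONTH($A$2:$A$12-0.25) in the formulas, a reading taken before 06:00 on the first of a month is counted toward the previous month's night.

```python
import pandas as pd

# Hypothetical readings in the "dd/mm/yyyy hh:mm" layout from the question.
df = pd.DataFrame(
    {
        "Time": ["06/05/2016 10:49", "06/05/2016 17:49", "06/05/2016 18:49",
                 "07/05/2016 02:30", "01/06/2016 05:10"],
        "Celsius": [28.0, 23.5, 23.0, 21.0, 19.0],
    }
)
df["Time"] = pd.to_datetime(df["Time"], format="%d/%m/%Y %H:%M")

# Shift every timestamp back 6 hours so the 18:00-06:00 night no longer
# straddles midnight: day is now 00:00-12:00, night is 12:00-24:00.
shifted = df["Time"] - pd.Timedelta(hours=6)
df["month"] = shifted.dt.month
df["period"] = shifted.dt.hour.lt(12).map({True: "day", False: "night"})

# Mean temperature per (month, period) pair.
averages = df.groupby(["month", "period"])["Celsius"].mean()
```

With this sample, all five readings land in month 5: two in the day period and three in the night period, including the one from 01/06/2016 05:10.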

How to calculate results and plot data depending on results of a specific column in excel?

I have a .csv file that contains thousands of rows. I collected this data as the output of running my program for 60 minutes. The output file contains a time column (in the format HH:MM:SS:MS) that records the time of each output. I want to plot the other columns of my output.csv file against the time column (taking the results for all columns every 1 minute).
For example:
I have a row like this:
Data Time
----- -----
455 10:00:00
894 10:00:00
785 10:00:00
898 10:00:01
789 10:00:01
748 10:00:02
248 10:00:02
788 10:00:02
148 10:00:02
742 10:00:02
... ...
266 10:01:00
... ...
Is there an easy way to plot the other columns against the time column (taking the results for all columns every 1 minute)?
While the question is not completely clear/consistent, I understand you want to count the number of data points in each of the first 15 intervals
10:00 <= time < 10:01
etc.
For the first interval, you can use
=SUMPRODUCT(($B$2:$B$10000>=TIME(10;0;0))*($B$2:$B$10000<TIME(10;1;0)))
I assume your time data is in B2:B10000.
You can expand this range as needed, there is no problem in having an excess range (blank cells will not be counted).
Or you could use
=SUMPRODUCT(($B:$B>=TIME(10;0;0))*($B:$B<TIME(10;1;0)))
You can easily create a column with the start time of each interval, and another column that uses (a modification of) this formula for the data count.
Then you would plot the two columns just created.
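Since the data already lives in a .csv file, the per-minute grouping can also be sketched with pandas. This is an illustration under assumptions: the sample rows, the column names Data and Time, and the plain HH:MM:SS time format (without the millisecond part) are all hypothetical.

```python
import pandas as pd

# Hypothetical rows shaped like the sample: a Data value and an HH:MM:SS time.
df = pd.DataFrame(
    {
        "Data": [455, 894, 785, 898, 789, 748, 266],
        "Time": ["10:00:00", "10:00:00", "10:00:00", "10:00:01",
                 "10:00:01", "10:00:02", "10:01:00"],
    }
)

# Parse the time column and truncate each value to the start of its minute.
minute = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.floor("min")

# Row count and mean per one-minute interval (start <= time < start + 1 min).
per_minute = df.groupby(minute.dt.strftime("%H:%M"))["Data"].agg(["count", "mean"])

# per_minute["count"].plot() would chart the counts if matplotlib is installed.
```

Each row of per_minute corresponds to one interval start, which plays the role of the start-time column described above.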

PowerPivot filter to slice data per hour, day or month

I have an Excel database file that contains the total passenger passes at a specific location. The passes are counted in periods of 2 minutes (e.g. 14:45:00 to 14:46:59). I have imported my database into PowerPivot and have also created relevant PivotTables and PivotCharts with some slicers to analyze them. How can I create a slicer that filters data by larger periods of time, such as hour, day or month?
You'll need to create a Date/time table which contains all hours in your data.
First you'll need to create a calculated column in your original data that has the date and the hour
Formula to use:
=DATE(YEAR([Original Date]),MONTH([Original Date]),DAY([Original Date]))+HOUR([Original Date])/24
Result in your original data table
Original Date Calculated_DateHour
02/15/2015 14:15 02/15/2015 14:00
02/15/2015 16:25 02/15/2015 16:00
Now, in Excel create a Master Date table with unique date/time for all data
[DateTable]
Master_DateHour Year Month Day Hour
02/15/2015 13:00 2015 2 15 13
02/15/2015 14:00 2015 2 15 14
02/15/2015 15:00 2015 2 15 15
02/15/2015 16:00 2015 2 15 16
02/15/2015 17:00 2015 2 15 17
{...}
03/17/2015 02:00 2015 3 17 2
Next you'll create a relationship between the DateTable[Master_DateHour] (lookup table) and Data[Calculated_DateHour].
Now you should be able to slice based on the Year/Month/Day/Hour as you see fit.
Additionally, you could create a hierarchy in the [DateTable] to give you more control over the presentation of Year/Month/Day/Hour.
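If the master table is prepared outside Excel before the import, the two steps above (truncating to the hour, then listing every hour) might look like the following pandas sketch. The sample timestamps are the two from the example, and generating the table this way is an assumption and not part of the PowerPivot workflow itself.

```python
import pandas as pd

# Hypothetical fact table holding the two timestamps from the example.
data = pd.DataFrame(
    {"Original Date": pd.to_datetime(["2015-02-15 14:15", "2015-02-15 16:25"])}
)

# Same idea as the calculated column: keep date and hour, drop the minutes.
data["Calculated_DateHour"] = data["Original Date"].dt.floor("h")

# Master table: one row per hour between the first and last hour in the data.
hours = pd.date_range(
    data["Calculated_DateHour"].min(),
    data["Calculated_DateHour"].max(),
    freq="h",
)
date_table = pd.DataFrame(
    {
        "Master_DateHour": hours,
        "Year": hours.year,
        "Month": hours.month,
        "Day": hours.day,
        "Hour": hours.hour,
    }
)
```

Listing every hour in the range, including hours with no data, keeps the lookup side of the relationship unique and gap-free.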
