I have a large Excel file where data is collected 48 times per day for the month, but this has been collected vertically rather than horizontally.
I need to split this out for each date of the month, but the data goes back 3 years, meaning I would have to apply a transpose formula over 340 times.
Is there a quicker way of doing this?
Current Format:
Desired Format:
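If stepping outside Excel is an option, the reshape can be done in one pass instead of hundreds of transposes. Below is a minimal Python sketch, assuming the data can be exported as (timestamp, value) pairs on a half-hourly grid; the function name `pivot_half_hourly` and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def pivot_half_hourly(rows):
    """Turn (timestamp, value) pairs into {date: [48 half-hourly values]}.

    Assumes readings fall on a half-hourly grid (00:00, 00:30, ...).
    Slots with no reading stay as None.
    """
    table = defaultdict(lambda: [None] * 48)
    for ts, value in rows:
        slot = ts.hour * 2 + ts.minute // 30  # 0..47
        table[ts.date()][slot] = value
    return dict(table)


# Made-up sample readings, one row per reading (the "vertical" layout)
rows = [
    (datetime(2021, 1, 1, 0, 0), 10.0),
    (datetime(2021, 1, 1, 0, 30), 11.0),
    (datetime(2021, 1, 2, 0, 0), 20.0),
]
wide = pivot_half_hourly(rows)
for day, values in sorted(wide.items()):
    print(day, values[:3])
```

Each date becomes one row of 48 values, which can then be written back out to CSV and opened in Excel.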
Related
I'm new to programming and data analysis in general, and need some help with a large dataset file (43 GB). It is a list of high-frequency trades for a stock containing two columns I'm interested in: Time (in UTC format with sub-second precision, e.g. 2019-01-01T00:06:41.033529796Z) and price. I have managed to open the file using delimiter software and split it into 509 files, each of which fits in an Excel sheet.
I now need to compare the price change during 5 minute intervals based on the prices in this file.
My first problem is that Excel doesn't recognise this time format.
Secondly, I need to understand how, perhaps using the =FLOOR formula, to split the list of trade times into 5-minute intervals and find the difference in the corresponding prices.
I have tried making Excel recognise the UTC format with no success. Any help would be appreciated!
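For a file this size, a short script may be easier than Excel. Here is a rough Python sketch, assuming the trades arrive in time order as (timestamp string, price) pairs and that "price change" means the last price minus the first price within each 5-minute interval. The function name `five_minute_changes` and the sample prices are hypothetical. The fractional seconds are simply dropped, since they do not affect which 5-minute bucket a trade falls into.

```python
from datetime import datetime


def five_minute_changes(trades):
    """Return {bucket_start: last_price - first_price} per 5-minute bucket.

    trades: iterable of (timestamp_string, price), assumed sorted by time.
    Only the first 19 characters (YYYY-MM-DDTHH:MM:SS) are parsed.
    """
    first_last = {}
    for ts, price in trades:
        t = datetime.strptime(ts[:19], "%Y-%m-%dT%H:%M:%S")
        # Floor the time to the start of its 5-minute interval
        bucket = t.replace(minute=t.minute - t.minute % 5, second=0)
        if bucket in first_last:
            first_last[bucket][1] = price  # update last price seen
        else:
            first_last[bucket] = [price, price]
    return {b: last - first for b, (first, last) in first_last.items()}


# Made-up sample trades
trades = [
    ("2019-01-01T00:06:41.033529796Z", 100.0),
    ("2019-01-01T00:09:59.999999999Z", 101.5),
    ("2019-01-01T00:10:00.000000001Z", 101.0),
]
changes = five_minute_changes(trades)
for bucket, change in sorted(changes.items()):
    print(bucket, change)
```

Because the function only needs one pass over the rows, it could be fed from `csv.reader` line by line, so the whole file never has to fit in memory.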
So I have a very large set of data (4 million rows+) with journey times between two location nodes for two separate years (2015 and 2024). These are stored in dat files in a format of:
Node A    Node B    Journey Time (s)
123       124       51.4
So I have one long file of over 4 million rows for each year. I need to interpolate journey times for a year between the two for which I have data. I've tried Power Query in Excel as well as Power BI Desktop but have had no reasonable solution beyond cutting the files into smaller < 1 million row pieces so that Excel can manage.
Any ideas?
What type of output are you looking for? PowerBI can easily handle this amount of data, but it depends what you expect your result to be. If you're looking for the average % change in node to node travel time between the two years, then PowerBI could be utilised as it is great at aggregating and comparing large datasets.
However, if you are wanting an output of every single node to node delta between those two years i.e. 4M row output, then PowerBI will calculate this, but then what do you do with it.... a 4M long table?
If you're looking to have an exported result >150K rows (PowerBI limit) or >1M rows (Excel limit), then I would use Python for that (as mentioned above)
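As a rough idea of what the Python route could look like, here is a minimal sketch. It assumes each year's file has been loaded into a dictionary keyed by (Node A, Node B), and that journey times change linearly between 2015 and 2024, which is an assumption and not something the data confirms. The function name and the 2024 figures are made up for illustration.

```python
def interpolate_journey_times(times_2015, times_2024, target_year):
    """Linearly interpolate journey times for a year between 2015 and 2024.

    Both inputs map (node_a, node_b) -> journey time in seconds.
    Pairs missing from either year are skipped.
    """
    frac = (target_year - 2015) / (2024 - 2015)
    return {
        pair: t0 + frac * (times_2024[pair] - t0)
        for pair, t0 in times_2015.items()
        if pair in times_2024
    }


# Made-up sample data; the 2015 row matches the example in the question
times_2015 = {(123, 124): 51.4, (123, 125): 80.0}
times_2024 = {(123, 124): 60.4}
times_2020 = interpolate_journey_times(times_2015, times_2024, 2020)
print(times_2020)
```

A dictionary with 4 million entries is well within reach of an ordinary machine, so there is no need to cut the files into pieces first.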
I need to find out on how many days (not how many times) the temperature has been below -15 degrees.
I have found data from a website that gives me the temperature every half hour, but that gives me 48 values for each date.
If the temperature has been below -15 more than once during a date, it should only be counted once.
Any ideas?
This is my data:
More example data
According to your sample image's columns, put this in an unused cell.
=SUMPRODUCT((D$2:D$999<=-15)/(COUNTIFS(B$2:B$999, B$2:B$999, D$2:D$999, "<=-15")+(D$2:D$999>-15)))
Adjust the rows to your actual data range to minimize calculating blank rows.
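For comparison, the same count outside Excel comes down to collecting the distinct dates that have at least one cold reading. This is a small Python sketch, assuming the readings are available as (timestamp, temperature) pairs; the function name and sample values are made up. It uses a strict "below" comparison, so change `<` to `<=` if a reading of exactly -15 should count, as it does in the formula above.

```python
from datetime import datetime


def count_cold_days(readings, threshold=-15.0):
    """Count distinct dates with at least one reading below the threshold."""
    return len({ts.date() for ts, temp in readings if temp < threshold})


# Made-up sample readings
readings = [
    (datetime(2021, 1, 1, 0, 0), -16.0),
    (datetime(2021, 1, 1, 0, 30), -17.0),  # same day, counted once
    (datetime(2021, 1, 2, 0, 0), -10.0),   # not cold enough
    (datetime(2021, 1, 3, 6, 0), -15.5),
]
cold_days = count_cold_days(readings)
print(cold_days)
```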
I'm looking to selectively copy a list of data in Excel for the purposes of reducing the quantity.
In the first column I have Date/time and in the second column I have a data value, in this case it's electrical meter readings.
The data is currently given every 15 minutes and what I'm trying to do is reduce that to every hour, i.e. effectively create a new column which extracts only the data from the original list for every hour (also with no gaps in the rows, therefore condensing the length of the list).
Any advice much appreciated!
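If the list can be exported, one simple approach is to keep only the rows whose timestamp falls exactly on the hour. Here is a minimal Python sketch, assuming the readings are (timestamp, value) pairs and that an on-the-hour reading exists for every hour; the function name and meter values are hypothetical.

```python
from datetime import datetime


def hourly_only(readings):
    """Keep only readings taken exactly on the hour, in original order."""
    return [(ts, v) for ts, v in readings if ts.minute == 0 and ts.second == 0]


# Made-up 15-minute meter readings
readings = [
    (datetime(2021, 1, 1, 0, 0), 100.0),
    (datetime(2021, 1, 1, 0, 15), 101.0),
    (datetime(2021, 1, 1, 0, 30), 102.0),
    (datetime(2021, 1, 1, 0, 45), 103.0),
    (datetime(2021, 1, 1, 1, 0), 104.0),
]
hourly = hourly_only(readings)
for ts, value in hourly:
    print(ts, value)
```

The result is a condensed list with no gaps, since the filtered-out rows are dropped and not blanked.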
I have a 2-way data table in Excel (as in the option under "What-If Analysis"). There are 50 rows when analysing a 50 year deal, but only 4 rows when analysing a 4 year deal.
I only want to use one data table (of 50 rows) but I don't want it to calculate all of the values if it doesn't have to. e.g. if I have a five year deal I want the values in the first 5 rows to be calculated, but for the rest I would like it to display 0 or a blank.
Is there a way to do this without VBA?
(I was thinking with VBA I could create a whole new data table every time I run it, but would prefer not to as I am still developing structures.)
I'm guessing that your row labels step consistently. Just blank out the years that are not required, when not required (rows 6 and upwards in your example for 5 years), and repopulate with series fill to suit.