Adding large data in Excel

I have around 200,000 rows of data in Excel, recorded at 15-minute intervals for each day over two years. Now I want a total for each day (summing all of that day's 15-minute readings at once), e.g. from 01/01/2014 to 12/31/2016. I did it using a basic formula (=SUM(range)), but it is very time-consuming. Can anyone help me find an easier way to solve this problem?

It's faster and more reliable to work with big data sets using Microsoft Power Query, where you can do the whole aggregation in a single query, or using Power View over the dataset loaded into the Data Model, which is very fast.
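If you are comfortable stepping outside Excel, the same daily roll-up is only a few lines of pandas. A minimal sketch, assuming the data is exported to CSV with a timestamp column named Timestamp and a value column named Value (both names are assumptions, adjust to your layout):

```python
import pandas as pd

# Assumed file and column names; adjust to match your export.
df = pd.read_csv("readings.csv", parse_dates=["Timestamp"])

# Collapse the 15-minute readings into one total per calendar day.
daily = df.set_index("Timestamp")["Value"].resample("D").sum()

daily.to_csv("daily_totals.csv")
```

Inside Excel, a PivotTable grouped by day (or a single SUMIFS keyed on the date) achieves the same result without writing one =SUM(range) per day.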

Related

Is there any way to clean data in Excel that was entered manually and contains heavy typos?

I have data that is reported in Excel format and entered manually. Lots of typos are expected in the report because the person entering the data is highly unskilled. The data is typed as the person hears it, according to his understanding; e.g. Shree/Shri/Sri sound almost identical when pronounced, but all three are distinct values in the data.
At present I am solving the problem by cleaning the data with OpenRefine, a Java-based localhost tool, using the various clustering methods available in the software. The data is added to the Excel pool incrementally, so the number of rows grows after each update. OpenRefine consumes heavy resources because it runs on localhost.
It would be really helpful if there were another way to solve this problem.
Thanks in advance!
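The phonetic clustering OpenRefine applies can be approximated in a short script, which sidesteps reloading the whole pool into OpenRefine after each update. A minimal sketch using a simplified Soundex key (the sample values come from the question; everything else is illustrative, and this is a rougher grouping than OpenRefine's full set of methods):

```python
from collections import defaultdict

def soundex(word: str) -> str:
    """Simplified Soundex: phonetically similar words share a key."""
    codes = {c: str(d) for d, group in enumerate(
        ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1) for c in group}
    word = word.lower()
    key, prev = word[0].upper(), codes.get(word[0])
    for ch in word[1:]:
        code = codes.get(ch)
        if code and code != prev:
            key += code
        if ch not in "hw":          # h and w do not break a run of codes
            prev = code
    return (key + "000")[:4]

# Group variants under one phonetic key, then pick a canonical spelling.
clusters = defaultdict(list)
for name in ["Shree", "Shri", "Sri", "Ram"]:
    clusters[soundex(name)].append(name)

print(dict(clusters))   # {'S600': ['Shree', 'Shri', 'Sri'], 'R500': ['Ram']}
```

Because the pool grows incrementally, you only need to key the newly added rows on each update and compare them against the existing cluster keys, rather than re-clustering everything.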

Memory issue using DAX query in Python script

I am having a memory issue when I execute a DAX query inside the code in the referenced picture. With around 10,000 rows it works, but more than that creates a memory issue. My query may return up to 50 million rows.
Question 1: What would be an efficient way to execute the query?
Question 2: What settings or properties might I change to handle a huge amount of data?
Question 3: Is it possible to use partitions and split the data to fill the data table?
I am new to Python coding. Please suggest whether the code needs to change, or any other efficient way to pull the data and load it into a data frame. My end goal is to send all the data in CSV format to a data lake. It currently works, but only for smaller row counts; I have tested up to 10k rows, which finishes in a few minutes, and it seems super inefficient to me.
Thanks in advance!
I'm not sure how I would automate it, but the Export Data function from DAX Studio works really fast. I just did 500k rows in under a minute. It works against tables in the model, so if you are working with a DAX Expression, you would have to create it as a DAX Table in the model first.
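On Question 3: if you stay in Python, a common workaround is to partition the query on some column and stream each slice to CSV rather than materializing all 50 million rows at once. A rough sketch, where run_dax stands in for whatever connector the existing code uses (pyadomd, ADOMD.NET, ...) and the table and column names are hypothetical:

```python
import pandas as pd

def run_dax(query: str) -> pd.DataFrame:
    """Placeholder for your existing connector call (e.g. pyadomd).
    Assumed to return one partition's rows as a DataFrame."""
    raise NotImplementedError

# Hypothetical table/column names; partition on whatever column
# splits the data evenly (a year, a month, a region key, ...).
years = range(2014, 2017)
for i, year in enumerate(years):
    query = f"""
    EVALUATE
    FILTER('Sales', 'Sales'[Year] = {year})
    """
    chunk = run_dax(query)
    # Append each partition to one CSV; write the header only once.
    chunk.to_csv("sales_export.csv", mode="w" if i == 0 else "a",
                 header=(i == 0), index=False)
```

Peak memory is then bounded by the largest partition instead of the full result set, at the cost of one query per slice.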

Why is Excel Pivot's 'Running total' option so much faster than Access DSUM?

I am wondering why, if I make a PivotTable in Excel from a recordset with about 50,000 rows, it takes about 30 seconds to produce a running total on a date field, yet when I want to achieve the same result in an Access table, DSUM takes over 30 minutes. Same data... Why is there such a performance difference? What does Excel do in the background?
You might find this article helpful:
http://azlihassan.com/apps/articles/microsoft-access/queries/running-sum-total-count-average-in-a-query-using-a-correlated-subquery
Here's what it says about DSum, DLookup, etc.:
They involve VBA calls and Expression Service calls, and they waste resources (opening additional connections to the data file). Particularly if JET must perform the operation on each row of a query, this really bogs things down.
Alternatives include looping through the recordset in VBA or creating a subquery. If you need to use DSUM, make sure your field is indexed and avoid text fields.
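To make the subquery alternative concrete, here is a small self-contained demonstration of a correlated-subquery running total. It uses sqlite3 so it runs anywhere, and the table and column names are made up for the example; the equivalent SELECT can be written directly as an Access query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("2014-01-01", 10), ("2014-01-02", 20), ("2014-01-03", 5)])

# Running total via a correlated subquery: for each row, sum every
# row dated on or before it. This replaces the per-row DSUM calls.
rows = conn.execute("""
    SELECT s.sale_date,
           s.amount,
           (SELECT SUM(s2.amount)
              FROM sales AS s2
             WHERE s2.sale_date <= s.sale_date) AS running_total
      FROM sales AS s
     ORDER BY s.sale_date
""").fetchall()

for r in rows:
    print(r)
```

The subquery still touches earlier rows for each output row, so an index on the date column matters at scale, but it avoids the VBA and Expression Service overhead that makes DSUM so slow.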

Excel - Best Way to Connect With Access Data

Here is the situation we have:
a) I have an Access database / application that records a significant amount of data. Significant fields would be hours, # of sales, # of unreturned calls, etc.
b) I have an Excel document that connects to the Access database and pulls data in to visualize it
As it stands now, the Excel file has a Refresh button that loads new data. The data is loaded into a large PivotTable. The main 'visual form' then uses VLOOKUP to get the results from the form, based on the related hours.
This operation is slow (~10 seconds) and seems to be redundant and inefficient.
Is there a better way to do this?
I am willing to go just about any route - just need directions.
Thanks in advance!
Update: I have confirmed (thanks to helpful comments/responses) that the problem is with the data loading itself. Removing all the VLOOKUPs only took a second or two off the load time. So the question stands: how can I get the data rapidly and reliably without so much time involved (it loads around 3,000 records into the PivotTables)?
You need to find out whether it's the PivotTable refresh or the VLOOKUPs that are taking the time (try removing the VLOOKUPs to see how long the refresh alone takes).
If it's the VLOOKUPs, you can usually speed them up (see http://www.decisionmodels.com/optspeede.htm for some hints).
If it's the PivotTable refresh, then it depends on which method you are using to get the data (Microsoft Query, ADO/DAO, ...) and how much data you are transferring.
One way to speed this up is to minimize the amount of data read into the pivot cache by reducing the number of columns and/or predefining a query to subset the rows.
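As an illustration of the "predefine a query" advice, here is a rough sketch that pulls only the needed columns and pre-filtered rows out of Access using pyodbc and pandas; the database path, table, and column names are all assumptions, and the same column/row trimming applies if you keep the pull inside Excel's own query:

```python
import pandas as pd
import pyodbc

# Assumed path and schema; requires the Access ODBC driver installed.
conn = pyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\calls.accdb")

# Select only the columns the dashboard needs, and pre-filter the rows,
# instead of loading every field of every record into the pivot cache.
query = """
    SELECT Hours, Sales, UnreturnedCalls
    FROM ActivityLog
    WHERE LogDate >= #2014-01-01#
"""
df = pd.read_sql(query, conn)
```

With only ~3,000 records, the transfer itself should be fast; if it still is not, the overhead is likely in opening the connection or in how the refresh is wired up, rather than in the data volume.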

Creating Excel dashboards from info stored on SQL

I'm creating an Excel dashboard that imports a variable number of months' worth of financial/accounting information from a database into an Excel sheet. Using this information, I have a Calculations sheet that computes some financial indicators, again month by month. Finally, this information is displayed in graphs on a separate sheet (one indicator per graph, with the monthly values plotted to show the trends). Currently I have written VBA code that formats the sheets to accommodate the number of months requested, pulls the data from the SQL server, and updates the graphs. Since there are 53 indicators for each operation (6 operations), this process takes about 3 minutes.
Does anyone recommend a better way to do this? The current way 'works' but I've often thought that there must be a more efficient way to do this.
Thanks!
Chris
You could look at skipping the Excel part and using SQL Server Reporting Services (SSRS). If you have ever used Business Objects or Crystal Reports, it's kind of the same thing, and I would imagine it would offer better performance than doing things in Excel.
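If you would rather keep code in the loop than move to SSRS, the usual win is fetching every operation, indicator, and month in one set-based query instead of looping per month in VBA. A minimal sketch with pandas and SQLAlchemy, where the connection string, table, and column names are all assumptions:

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string and schema; adjust for your server.
engine = create_engine(
    "mssql+pyodbc://user:pass@server/FinanceDB"
    "?driver=ODBC+Driver+17+for+SQL+Server")

# One round trip for all operations, indicators, and months at once.
df = pd.read_sql(
    """SELECT Operation, Indicator, MonthStart, Value
       FROM MonthlyIndicators
       WHERE MonthStart >= '2014-01-01'""",
    engine)

# Pivot so each indicator's monthly series is a column, ready to chart.
trends = df.pivot_table(index="MonthStart",
                        columns=["Operation", "Indicator"],
                        values="Value")
```

A single round trip for all 53 x 6 series typically beats hundreds of small pulls, whatever front end then draws the graphs.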
