Most efficient way to create a comparative table with Excel? - excel

Using Excel (mostly the VBA part), I would like to do the following:
I have two different tables with different kinds of data, but they both have at least the following 4 columns in common:
the name of the employee
the name of the project he worked on
the day he worked on it
the number of hours he worked on it that day.
One of the tables represents the planned working time (for instance, this employee should work x hours on project A and y hours on project B on the 5th of May...) and the other table represents the actual working time, i.e. the time the employee actually spent working on a project.
The project the employee actually worked on may differ from what was planned, or it may be the same but he might have spent a different amount of time on it.
Being new to Excel, I was wondering if any of you could give me ideas on the most efficient way to build a table comparing the two. Since I have a lot of rows, I'm a bit reluctant to use too many loops.
Thank you!

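If these are structured tables (say, on worksheets named Planned and Actual), I would create an MS Query like this:
SELECT Plan.Employee, Plan.Project, Plan.TotalPlanned, Act.TotalActual, Act.TotalActual / Plan.TotalPlanned AS ActualVsPlanned
FROM (SELECT Employee, Project, SUM(PlannedHrs) AS TotalPlanned FROM [Planned$] GROUP BY Employee, Project) AS Plan
INNER JOIN (SELECT Employee, Project, SUM(ActualHrs) AS TotalActual FROM [Actual$] GROUP BY Employee, Project) AS Act
ON Plan.Project = Act.Project AND Plan.Employee = Act.Employee
The above query compares hours per employee and project (planned vs. actual). Each sheet is summed before the join, so that several daily rows for the same employee and project do not multiply each other and inflate the totals. Because it is an INNER JOIN, it only returns employee/project pairs that appear in both sheets. This is just an example, as you can calculate other metrics.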
To use MS Query: Data -> From Other Sources -> From Microsoft Query,
or use my SQL AddIn (just to create the query): http://www.analystcave.com/excel-tools/excel-sql-add-in-free/
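For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same planned-vs-actual comparison. It is not the MS Query approach itself: it assumes each table has been exported as (employee, project, day, hours) rows, the sample data is made up, and unlike an inner join it keeps employee/project pairs that appear in only one table, since the question says the actual project may differ from the planned one.

```python
from collections import defaultdict


def total_hours(rows):
    """Sum hours per (employee, project) from (employee, project, day, hours) rows."""
    totals = defaultdict(float)
    for employee, project, _day, hours in rows:
        totals[(employee, project)] += hours
    return dict(totals)


def compare(planned_rows, actual_rows):
    """Return {(employee, project): (planned, actual, actual - planned)}.

    Pairs found in only one table are kept, with 0 hours on the missing side.
    """
    planned = total_hours(planned_rows)
    actual = total_hours(actual_rows)
    result = {}
    for key in sorted(planned.keys() | actual.keys()):
        p = planned.get(key, 0.0)
        a = actual.get(key, 0.0)
        result[key] = (p, a, a - p)
    return result


# Hypothetical sample data, not taken from the question.
planned = [
    ("Ann", "A", "2015-05-05", 4),
    ("Ann", "B", "2015-05-05", 4),
    ("Ann", "A", "2015-05-06", 8),
]
actual = [
    ("Ann", "A", "2015-05-05", 6),
    ("Ann", "C", "2015-05-05", 2),
    ("Ann", "A", "2015-05-06", 8),
]

comparison = compare(planned, actual)
for (employee, project), (p, a, diff) in comparison.items():
    print(employee, project, p, a, diff)
```

Each table is scanned once and the totals live in dictionaries, so there are no nested loops over the rows, which was the main worry in the question.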

Related

Tableau Calculated Field using FIXED

I have a database in Tableau from an Excel file. Every row in the database is one ticket (assigned to an Id of a customer) for a different theme park across two years.
The structure is like the following:
Every Id can buy tickets for different parks (or same park several times), also in different years.
What I am not able to do is flag those customers who have been to the same park in two different years (in the example, customer 004 has been to park a in 2016 and 2017).
How do I create this calculated field in Tableau?
(I managed to solve this in Excel with a SUMPRODUCT function, but the database has more than 500k rows and after a while it crashes. I also want to use a calculated field in case I update the Excel file with a new park or a new year.)
Ideally, the structure of the output I thought should be like the following (but I am open to different views, as long I get to the result): flag with 1 those customers who have visited the same park in two different years.
Create a calculated field called customer_park_years =
{ fixed [Customerid], [Park] : countd([year]) }
You can use that on the filter shelf to only include data for customer_park_years >= 2
Then you will be able to visualize only the data related to those customers visiting specific parks that they visited in multiple years. If you also want to then look at their behavior at other parks, you'll have to adjust your approach instead of just simply filtering out the other data. Changes depend on the details of your question.
But to answer your specific question, this should be an easy way to go.
Note that countd() can be slow for very large data sets, but it makes answering questions without reshaping your data easy, so it's often a good tradeoff.
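To make the behaviour of the FIXED expression concrete, here is a small Python sketch of the same idea: count distinct years per (customer, park) and flag every ticket row whose pair has at least two. The ticket rows are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def park_year_counts(tickets):
    """Map (customer_id, park) to its number of distinct years.

    Mirrors { fixed [Customerid], [Park] : countd([year]) }.
    """
    years = defaultdict(set)
    for customer_id, park, year in tickets:
        years[(customer_id, park)].add(year)
    return {key: len(value) for key, value in years.items()}


def flag_rows(tickets):
    """Append 1 to rows whose customer visited that park in 2+ years, else 0."""
    counts = park_year_counts(tickets)
    return [
        (customer_id, park, year, 1 if counts[(customer_id, park)] >= 2 else 0)
        for customer_id, park, year in tickets
    ]


# Hypothetical sample rows: (customer id, park, year).
tickets = [
    ("001", "a", 2016),
    ("001", "a", 2016),  # same park twice in one year: not a multi-year visit
    ("002", "b", 2017),
    ("004", "a", 2016),
    ("004", "a", 2017),
    ("004", "c", 2017),
]

flagged = flag_rows(tickets)
for row in flagged:
    print(row)
```

Counting distinct years rather than rows is what keeps customer 001 (two tickets for the same park in the same year) from being flagged.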
Try this:
IFNULL(STR({FIXED [Customerid], [Park] : IF COUNTD([year]) > 1 THEN 1 ELSE 0 END}), '0')

Excel Query looking up multiple values for the same name and presenting averages

Apologies if this has been asked before. I would be surprised if it hasn't but I am just not hitting the correct syntax to search and get the answer.
I have a table of raw data for my staff. It contains the name of the employee who completed a job and the start and finish times, among other things. I have no unique IDs other than name, and I can't change that, as I'm part of a large organisation and have to make do with the data I'm given.
What I would like to do is present a table (Table 2) that shows the name of each employee, takes the start/finish times for all of their jobs in Table 1, and presents the average time taken across all of their jobs.
I have used VLOOKUP in the past but I'm not sure it will cut it here. The raw data table contains approx. 6000 jobs each month.
In Table 1 I work out the time taken for each job with this formula:
=IF(V6>R6,V6-R6,24-R6+V6) (R= started Time) (V= Completed Time) in 24hr clock.
I have gone this route as some jobs are started before midnight and completed afterwards. My raw data also contains dates (started/completed) in separate columns, though, so I am open to an expert's feedback on this and on whether there is a better way to work out the total time from start to completion.
I believe the easiest way to tackle this would be with a Pivot Table. Calculate the time taken for each Name and Job combination in Table 1, then create a pivot table with the Name in the Row Labels and the Time in the Values, and change the Time Values to be an average instead of a sum.
Alternatively, you could create a unique list of names, perhaps with Data > Remove Duplicates, and then use an =AVERAGEIF formula.
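As a rough illustration of what the pivot table or AVERAGEIF computes, here is a Python sketch that averages job durations per name. It assumes the separate date and time columns mentioned in the question have been combined into full start and completion timestamps, which makes the before/after-midnight case fall out of plain subtraction. The names and times are made up.

```python
from collections import defaultdict
from datetime import datetime


def average_hours_by_name(jobs):
    """Average duration in hours per employee name.

    jobs: iterable of (name, started, completed) with datetime values.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, started, completed in jobs:
        hours = (completed - started).total_seconds() / 3600
        totals[name] += hours
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical sample jobs.
jobs = [
    # Starts before midnight, completes after: 3 hours.
    ("Jane Doe", datetime(2020, 1, 6, 22, 0), datetime(2020, 1, 7, 1, 0)),
    # 1 hour.
    ("Jane Doe", datetime(2020, 1, 7, 9, 0), datetime(2020, 1, 7, 10, 0)),
    # 30 minutes.
    ("John Roe", datetime(2020, 1, 7, 8, 30), datetime(2020, 1, 7, 9, 0)),
]

averages = average_hours_by_name(jobs)
print(averages)
```

Using the dates as well as the times also covers jobs that run longer than 24 hours, which a time-only formula cannot distinguish.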
Thanks, this gives me the thread to pull on. I have unique names as it's the person's full name, but I'll try pivot tables to hopefully make it a little more future-proof for other things to be reported on later.

Have VBA for Unique items in multicolumn range, how to filter multiple rows on previous results?

Intro
I'm trying to enhance a basic planning sheet (see below) with additional sorting.
The first column lists the resources. Each week has 2 columns representing 20 hours per column.
Example readout;
In week 30 Aron works 40h on project A. In week 31 he works 20h on project A and 20h on project B.
Jeff does not work in WK30 and works 40h on project C in WK31.
My Goal
1. Generate a unique list of projects over the weeks.
2. Be able to filter based on project name and get only the rows of the resources working on that project. (No specific need to filter out the other projects in the same row. So if I filter on project "A", I want to only see the rows of Aron and Dave.)
What I have
Basically item 1 is covered as follows:
The 2nd column (Projects) is an array-formula generated by a VBA function (taken from here) that returns all unique items in a multi column range. The cell formula looks like this, where the second argument of UniqueItems() determines if we only return the number of unique items (TRUE) or a list of all unique values (FALSE).
=TRANSPOSE(UniqueItems($C$4:$H$6,FALSE))
What is missing
Item 2 of my goal list is missing. If I currently select the filter option for Projects (see screenshot)
and filter on Project A, then I only get row 5 and not also row 4.
How would I go about filtering this properly?
VBA code is allowed, as are pointers to which regular formula functions I should use. A completely different solution with the same results is also fine. I thought about pivot tables, but I think they cannot handle empty cells around the range, which is common if there's no work for that resource.
The sheet used can be downloaded from here
IMHO, you will be much better off creating a worksheet that serves as a normalized database table, with one row per combination of person, week and project, plus a final column for the number of hours (and optionally, a cost for those hours customized per worker as needed).
Then use a pivot table to build the view you posted (or any other reporting view you need).
This will let you create multiple pivot tables that easily answer questions like:
how many hours are planned for project X in total? by a certain date?
how many total hours are planned for each worker per time period - who is over/under utilized?
how many total hours are planned for each worker per project?
etc.
Keeping the data separate from the reports is safer, and more modular - I wouldn't want a VBA bug to have the potential of corrupting/deleting raw data.
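To show what the normalized layout buys you, here is a hedged Python sketch that unpivots a wide planning sheet (two 20-hour cells per week, as described in the question) into person/week/project/hours rows and then answers both goals from those rows. The Aron and Jeff rows follow the example readout in the question; Dave's row and all function names are assumptions made for illustration.

```python
from collections import defaultdict

HOURS_PER_CELL = 20  # each weekly column stands for 20 hours, per the question


def normalize(weeks, wide_rows):
    """Turn wide rows (person, then two cells per week) into
    (person, week, project, hours) records, skipping empty cells."""
    totals = defaultdict(int)
    for person, *cells in wide_rows:
        for index, project in enumerate(cells):
            if project:
                totals[(person, weeks[index // 2], project)] += HOURS_PER_CELL
    return [
        (person, week, project, hours)
        for (person, week, project), hours in totals.items()
    ]


def people_on(records, project):
    """Names of everyone with at least one record for the project."""
    return sorted({person for person, _week, proj, _hours in records if proj == project})


def hours_per_project(records):
    """Total planned hours per project."""
    totals = defaultdict(int)
    for _person, _week, project, hours in records:
        totals[project] += hours
    return dict(totals)


weeks = ["WK30", "WK31"]
wide = [
    ("Aron", "A", "A", "A", "B"),
    ("Jeff", "", "", "C", "C"),
    ("Dave", "A", "", "", ""),  # hypothetical row; Dave's cells are not given
]

records = normalize(weeks, wide)
print(sorted({project for _p, _w, project, _h in records}))  # unique project list
print(people_on(records, "A"))
print(hours_per_project(records))
```

Once the data is in this shape, the unique project list and the per-project filter are both one-line queries, and empty cells simply produce no rows.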

Looking up values from different tables including newly found values

I have several documents which contain statistical data on the performance of companies. There are about 60 different Excel sheets representing different months and I want to collect the data into one big table. The original tables look something like this, but are bigger:
Each company takes two rows, which represent their profit from the sales of the product and the cost to manufacture the product. I need both of these numbers.
As I said, there are ~60 of these tables and I want to extract information about Product2. I want to put everything into one table where the columns represent months and the rows represent the profit and costs of each company. It could be easily done (I think) with the INDEX function, as all sheets are named similarly. The problem I faced is that in some periods other companies enter the market:
Some of them stay, some of them fail. I would like to collect information on all companies that exist today or ever existed, but newly found companies distort the list (in the second picture we see that company BA is in the 4th row, not BB). As the row of a company changes from time to time, using INDEX becomes problematic, because in some cases results of different companies end up in one row. Adjusting them one by one seems very painful.
Maybe there is some quick and efficient method to solve such problem?
Any help or ideas would be appreciated.
One thing you may want to try is linking the Excel spreadsheets as tables in Access. From there you can create a query that ties the tables together. As data changes in the spreadsheets, the query will reflect those changes.
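Whatever tool does the joining, the underlying idea is to match on company name rather than on row position. The Python sketch below illustrates only that idea; it is not the Access query itself. It assumes each month's Product2 figures have already been read into a dictionary keyed by (company, measure), and the company names and numbers are invented.

```python
def combine(monthly_tables):
    """Build one row per (company, measure) across all months.

    monthly_tables: {month: {(company, measure): value}}
    Returns (months, rows), where rows maps each key to a list of values
    aligned with months; None marks months where the company is absent.
    """
    months = list(monthly_tables)
    keys = sorted({key for table in monthly_tables.values() for key in table})
    rows = {
        key: [monthly_tables[month].get(key) for month in months]
        for key in keys
    }
    return months, rows


# Hypothetical data: company BA enters the market in the second month.
monthly = {
    "2015-01": {
        ("AA", "profit"): 10, ("AA", "cost"): 4,
        ("BB", "profit"): 7, ("BB", "cost"): 5,
    },
    "2015-02": {
        ("AA", "profit"): 11, ("AA", "cost"): 4,
        ("BA", "profit"): 3, ("BA", "cost"): 2,
        ("BB", "profit"): 8, ("BB", "cost"): 5,
    },
}

months, rows = combine(monthly)
for key, values in rows.items():
    print(key, values)
```

Because every value is looked up by name, a new company only adds rows and never shifts the figures of existing ones.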

Excel vb project-best practice

I'm not a VB developer, nor am I very familiar with Excel. Anyway, I have a project to be done using MS Excel (cannot use Access).
The system is to provide a ratio analysis (and some other analyses) of companies, where data from an annual report needs to be entered into the system. Then, based on data from several reports, I can derive graphs and all other information.
My question
Now, I can store the data in a single sheet, using it as a database. It'll be like
CompanyName Year Data1 Data2 Data3...
Here the CompanyName can be duplicated, as many years' data can be entered. If I use this method, each time I derive company data I have to search for the relevant rows in the worksheet and keep lots of data in an array as I read through those rows and produce the final result.
Or I can use a separate worksheet for each company. Then I only have to search for the relevant sheet name and perform operations in that worksheet itself easily.
So what is the best way to do this?
Thanks
Whatever way works. IMO you could create a defined range (or many) and issue SQL against it just like it was Access table(s). I'm for keeping all like data on the same worksheet even for different companies; but that's just my 2 cents. You can create a pivot to separate out the information and slice/dice it however needed
Since someone liked the comment as an answer:...
It might be simpler to do some of this just using formulas and Excel functions. The basic approach would be to keep the data on one sheet and sort it by year within company so that all the years for a company are grouped together. Then use Filter to create a list of unique companies. These steps get repeated each time you add new data.
Then create 2 formulas for each company: the first uses MATCH to find the first row containing the company name and the second uses COUNTIF to find how many rows there are for the company. Then you can use OFFSET(TopLeftCell,FirstRow-1,ColumnIndex,NumberOfRows,1) (or similar), where TopLeftCell is the first cell of the company column, to get the required range of data for charts, ratio analysis etc.
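The MATCH / COUNTIF / OFFSET combination can be sketched in Python as follows. This is only an illustration of the logic: it assumes the rows are sorted by company and then year, uses 0-based indexes where Excel's MATCH is 1-based, and the company names and figures are invented.

```python
def company_block(rows, company):
    """Return (first_index, count) for a company in rows sorted by company."""
    names = [row[0] for row in rows]
    first = names.index(company)  # like MATCH(company, names, 0), but 0-based
    count = names.count(company)  # like COUNTIF(names, company)
    return first, count


def column_slice(rows, company, column_index):
    """Values of one column for one company, like an OFFSET range
    starting at the company's first row and NumberOfRows tall."""
    first, count = company_block(rows, company)
    return [row[column_index] for row in rows[first:first + count]]


# Hypothetical rows: (CompanyName, Year, Data1, Data2).
# Sorting the tuples orders them by company, then year.
data = sorted([
    ("Beta", 2014, 50, 20),
    ("Acme", 2015, 120, 40),
    ("Acme", 2014, 100, 30),
    ("Beta", 2015, 60, 30),
])

print(company_block(data, "Beta"))
print(column_slice(data, "Beta", 2))
```

The slice is only correct while each company's rows are contiguous, which is why the sort has to be repeated whenever new data is added.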

Resources