Looking to clean Excel data and make it into a proper dataset using macro/formulae. This is how my data looks like.
Need to convert it like this:
The number of rows can reach >3000 and number of projects under each name is not constant.
Few steps identified are:
1. Remove the blank rows against each task(marked in red)
2. Remove Utilization column
3. Available hours column to populate the value in each empty cell below.
4. Start data and End Date from cell L4 in two new columns
5. Team Name in cell A9 to be populated in a new column "Team Name"
I've tried using the recorded the macro but it doesn't work in a new data set. Hence looking for some expert help.
The keyword to identify what you have to do is unpivot.
You can unpivot your data using:
Power Query, following these steps;
VBA macro, following these steps.
There are also others possible solutions to unpivot data...just use Google to find them.
Related
So I have to clean data where from a given range of rows maybe 2 or 3 are exact same, rest have at least one column different. I need a way to find it out as I don't want to do it manually. I've tried conditional formatting but that only works with columns.
In the image you can see rows 550:569 a few of them are exactly same. How do I highlight or find out that. I don't want to manually check each column
enter image description here
Insert a column (let's say column AG) where you put a formula like =TEXTJOIN(",",TRUE,A2:AF2)
Sort the range per the new column
Eliminate duplicate using Excel's Remove Duplicate tool.
I am using excel sheet and i have data column as shown below:
As we can see that some of the names are duplicate or appeared twice. My question is how can count unique name records or rows associated with each name for summary column.
Out put i am looking for is shown below:
Not sure which formula to use as count is counting all of that data i.e. '7' in this case. How can i use count or any other function to count unique records as shown above?
You can do what you're after with a pivot table.
Click the Insert tab then select "Recommended Pivot Tables".
A window will open up prompting you to select the data range. I recommend using a named range for your list and referencing that, but you can just highlight the list directly if you want.
Once the data range is selected, click "Ok" and new window will open with exactly what you want. A unique values list and a "Count of Column1". It is the default of the recommended pivot tables.
I outlined this because it's easy and fast, but it's important to understand you can make this pivot table yourself from scratch if you learn about pivot tables in general. Pivot tables are often overlooked in Excel as an option.
Lastly, you could get really advanced with Excel Power Queries. Just Google "Excel Power query" and you will be shown all kinds of information on them. They are a close second place in power to manipulate Excel data short of using VBA.
Good luck!
CountA(Unique(D2:D8,,False)) = 5 [Count(Unique(D2:D8)) is the same as False is the default.]
CountA(Unique(D2:D8,,True)) = 3 (once and only once)
Note: the Unique function was released in late 2019 to Office 365. So if you want to use this check your version, not present in 1908, present in 2006.
Edit: It's actually in 2002, I just updated my 1908 machine.
HTH
If names duplicates are removed the following formula can be used: =COUNTIF(B:B,F2)
If duplicates must be removed by formula, MATCH (searches for a specified item in a range of cells, and then returns the relative position of that item in the range.) and SMALL (Returns the k-th smallest value in a data set.) functions can be used as shown.
C$1048576 is used to reference last row number for a big list case.
formulas:
Column A, names sequence
Colunm B, names
Column C, formula =MATCH(B2,B:B,0)
Column D, formula =IF(COUNTIF(C2:$C$1048576,C2)=1,C2,"")
Column E, formula =SMALL(D:D,A2)
Column F, formula =VLOOKUP(E2,A:B,2,0)
Column G, formula =COUNTIF(B:B,F2)
For anyone like me without O265's lovely Unique & Filter Functions, and who doesnt want to use a pivot table, and there are many ways to do this, but this i have just done this in normal excel.
List of data in Column H, Formula in column O3. Drag down. Highlights your distinct and unique values from H.
=IF(COUNTIF(H:H,H28)=1,"U - "&COUNTIF(H:H,H28),IF(COUNTIF(H$1:H27,H28)=1,"U - "&COUNTIF(H:H,H28),"-"))
Formula is short. You can just do this and drag down. Apply the same principal to your worksheet data wherever it is.
=IF(COUNTIF(H:H,H3)=1,"U",IF(COUNTIF(H$1:H2,H3)=1,"U","-"))
Similarly, you can just use this formula here (credit goes to this source for this one):
=(COUNTIF($H$1:$H1,$H1)=1)+0
Id like to point out that the above formula is a better formula than mine. It highlights with a "1" (or with a tweak, the value of your choice) the first time any value is seen/spotted on any given list, whether duplicate or unique.
Whereas mine is a bit "more random" when picking up the "unique and distict" values.
Mine gets there in the end, but Extend Office's gets there first, as I think is proper (getting the first time a unqique distict value is spotted/occurs.).
Formula in K5 =IF((COUNTIF($H$5:$H5,$H5)=1)+0=1,"UNIQUE DIST","") and drag down...
You could append/add a normal basic countif after the results to show how many actual times the given value appears if you wanted. :
=IF((COUNTIF($H$5:$H5,$H5)=1)+0=1,"UNIQUE DIST","")&" - "&COUNTIF(H:H,H5)
I got a problem with some popular Excel question, dynamic ranges and data validation drop-downs and auto-populate. Lets say I got 2 sheets, and on one sheet I got drop-downs to choose from another sheet, and that is not a problem when I define cells and range using:
=OFFSET($A$19;;;COUNTA('0528 - info'!$E$2))
..but what about when I wanna add some new cells in between,so that they can be automatically recognized in which group they belong:
As you see for instance Column B has some "groups" where you can find more different "values" like in Column C, like Power Supply has MV1 and MV2... and so on. My drop-downs on the sheet 1 are called exactly like this "groups" and I did reference them manually using given function. But is it possible to populate my drop-downs automatically when I add for instance MV3 beneath MV2 in this table? Or RN7 on 14th row? Everytime I add new values I have to extend my dropdowns (what is fine..), but problem will be when I share this table to others, they gonna forget it 90%.
I hope you get my point, any suggest will be fine!
p.s. Indirect doesnt work in a way it should - It gives me all instances from the Column but not specific ones that I need.
=INDIRECT("Table4[VarEDS]")
Well this option gave me again what I already had before - all "matches" from the Column and still not ONLY matches that are for certain group. ...
If your Data Validation source is a "Table" as shown in your image then you can take advantage of "Table Column" Range which is dynamic. That means whenever you refer that column as NAMED range and if make changes to the column (Edit, Add, Delete) it will reflect in the referred cell.
You can use this technique even for ranges not in table. You need to NAME them with offset formula and make dynamic.
You can find dynamic address of your column as shown in the image below. Select entire column WITHOUT Header
Name your column data range with appropriate name as shown in image below
Then in Data Validation Window refer this name using F3 as shown in image below.
Then you can see... Even if you edit, add or delete any row in the column the data validation will change
Editing based on your comment below: If you want text from column B and Column C appear together in the validation dropdown list. Insert column in the table and join text from column B and C and then make data validation based on that column as shown in Colum D in image below
Finally I think I understood your question.
Watch this video
Excel: Find Multiple Matches & Dependent Drop Down List
After some days of searching and trying I got what I wanted - wasnt wasy job at all. Needed to combine more functions with the help of couple of videos from Leyla (Xelplus):
https://www.youtube.com/watch?v=gu4xJWAIal8
https://www.youtube.com/watch?v=7fYlWeMQ6L8&t=5s
First step was to make unique list of my values (text in my case) on separate sheet:
=IFERROR(INDEX(t_VarGroup[Vargrouptext];MATCH(0;INDEX(COUNTIF($J$2:J2;t_VarGroup[Vargrouptext]););0));"")
Then I needed to "extract" all the values that are belonging to the certain unique values:
=#IF($I3<COLUMNS($K$2:K$2);"";INDEX(t_EDS[[VarEDS]:[VarEDS]];AGGREGATE(15;3;(t_VarGroup[[Vargrouptext]:[Vargrouptext]]=$J3)/(t_VarGroup[[Vargrouptext]:[Vargrouptext]]=$J3)*(ROW(t_VarGroup[Vargrouptext])-ROW(t_VarGroup[[#Headers];[Vargrouptext]]));COLUMNS($K$2:K$2))))
FUrthermore, I created Unique drop down list:
=OFFSET($J$3;;;COUNTIF($J$3:$J$14;"?*"))
And then dependent drop down list nearby using:
=OFFSET($K$2;MATCH($H$2;$J$3:$J$17;0);;1;COUNTIF(OFFSET($K$2;MATCH($H$2;$J$3:$J$17;0);;1;20);"?*"))
And because I made it on other sheet, I had to reference them to an appropriate sheet name where my main sheet is - with drop downs, it is actually very useful for my future work and for everyone else who has struggling with drop downs but on a bit specific way =))
credits to: #Naresh Bhople for suggestion about Youtube videos.
I have a rather large excel sheet 20k+ rows. My excel document has three sheets named CM, PP, and CH.
CM only contains the Information I use.
PP is the public Information that contains ALL data.
CH is my change log.
What I'm trying to do is take the values from my CM sheet in Column A "CM(A)" and find them in the PP sheet Column A "PP(A)", then copy the found values from PP(A) and PP(F) "The sixth column over" to the third sheet CH(A) and CH(B).
This in of itself is rather simple, where I'm having a hard time is that sheet PP can contain multiple instances of the value in CM(A). The catch though is that I only need One of those specific values which is indicated by a value of "26" in column PP(B).
I just have no idea how to write the nested formula to make this happen.
Visual Goal of Formula
If you're unfamiliar with Array Formulas, you should definitely look into those as they are extremely helpful for tasks like this. You would need to use a conditional to test if the item occured more than once and then execute an INDEX-MATCH or VLOOKUP based on the results returned. I recreated your data structure and was able to achieve appropriate results using this formula in B1 of sheet CM:
=IF(COUNTIF(PP!A:A,A1)>1,INDEX(PP!F:F,MATCH(1,(PP!A:A=A1)*(PP!B:B=26),0)),VLOOKUP(A1,PP!A:F,6,FALSE))
Array formulas must be entered using Ctl+Shift+Enter, as is noted in the linked documentation.
I've a data set that shows;
employee name
date
time work started
time work ended
Now I am trying to have a report like sheet where I can select a certain employee name from a list of employees to view his/her time attended for a particular month.
I tried vlookup but went no where since I need to lookup by two columns plus a row.
Is this possible? without macros or vba.
Thanks
Since name and date are unique identifiers it is possible to use the sumifs function.
For ‘time in’ and ‘Rachel’ this will look as follows:
=Sumifs(column ‘time in’ from data set, column ‘name’ from dataset, “Rachel”, column ‘date’ from data set, “10/01/2017”)
Where Rachel and the date also can be a referenced cell.
=AGGREGATE(15,6,ROW(SHEET1!$A$2:$E$22)/((SHEET2!$B$1=SHEET1!$B$2:$B$22)*(SHEET2!$A4=SHEET1!$C$2:$C$22)),1)
The above formula will grab the row number that matches your criteria. to pull the information you want, you can place the row number inside an INDEX formula to get the following:
=INDEX(SHEET1!$D:$E,AGGREGATE(15,6,ROW(SHEET1!$A$2:$E$22)/((SHEET2!$B$1=SHEET1!$B$2:$B$22)*(SHEET2!$A4=SHEET1!$C$2:$C$22)),1),COLUMN(A1))
You can place the above in your first Time cell and copy right and down. You will see errors if criteria do not exist. ie no person of that name or no date data for that person. to avoid this you can wrap the whole thing in an IFERROR like the below:
=IFERROR(INDEX(SHEET1!$D:$E,AGGREGATE(15,6,ROW(SHEET1!$A$2:$E$22)/((SHEET2!$B$1=SHEET1!$B$2:$B$22)*(SHEET2!$A4=SHEET1!$C$2:$C$22)),1),COLUMN(A1)),"Nothing found")
if you would rather a blank than nothing found display change the "nothing found" to "" or 0 if you want 0 to be displayed.
Note: Aggregate is performing array like calculations in this case. As such you do not want to full column references as it will cause a lot of unnecessary calculations to be performed. Because you have unique entries, SUMIFS option given in another answer is a much better choice.
I think a pivot table will do the job for you.
Place the employee name in the filter, place date and
times in the rows.
Remove subtotals from the Pivot Table
Change Table layout to tabular and Repeat rows
Right click on the Time In and select Ungroup
Then you have the image below.
I have the following layout:
In B11 write this formula and drag down:
=INDEX($B$2:$E$5,MATCH($B$7&$A11,$B$2:$B$5&$C$2:$C$5,0),3)
In C11 write this and drag down:
=INDEX($B$2:$E$5,MATCH($B$7&$A11,$B$2:$B$5&$C$2:$C$5,0),4)
Note that these are Array-Formulas, so you need to enter them with CTRL + SHIFT + ENTER instead of the normal Enter.
You will get a #NV error if the employee hasn't worked on one of the dates A11 and A12. So you could surround the Formula with IFERROR to avoid this.