Pull row based on Column Match from multiple tables - excel

I am trying to do the following and am stuck.
I have an Excel spreadsheet with 3 tabs:
1. One tab is an input file
2. Second tab is a set of data
3. Third tab is a set of data
For #1, the first tab contains a list of file names and where they are located.
I then use Power Query to combine those two columns, FileNames and QuickCheck, to produce the table that I want to run "Quick Checks" against:
For #2 and #3, those tabs contain customer data.
Basically, with Power Query, how do I run a search where, if the Custom column in 1 matches the quick check column in #1, that row of data is pulled and output to another tab? My desired output file needs to look like this:

You can merge queries #1 and #2 using the Merge Queries button and then expand the new column it generates by clicking the button with the two arrows in the column header. You can do the same merge with queries #1 and #3, and then append the two merged queries together using the Append Queries button.
If you want to keep queries 1-3 unmodified, you can Duplicate or Reference the query you're going to use as the left table by expanding the queries pane (next to the table preview in the editor), right-clicking on the query name in the queries pane and selecting the relevant context menu item. You can then do the merge step on that without modifying the original query.
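As a rough illustration of what the merge-then-append steps compute, here is a minimal Python sketch. It is not Power Query code, and the sample values and the Customer column are assumptions made up for the example. It also keeps only matching rows, which corresponds to choosing the Inner join kind in the Merge dialog rather than the default Left Outer.

```python
# Hypothetical sample data; values and the Customer column are assumptions.
checks = [{"QuickCheck": "A-100"}, {"QuickCheck": "B-200"}]
customers_2 = [
    {"Custom": "A-100", "Customer": "Acme"},
    {"Custom": "C-300", "Customer": "Cobalt"},
]
customers_3 = [
    {"Custom": "B-200", "Customer": "Birch"},
    {"Custom": "A-100", "Customer": "Apex"},
]


def merge_matching(left, right, left_key, right_key):
    """Inner join: return the rows of `right` whose key appears in `left`."""
    wanted = {row[left_key] for row in left}
    return [row for row in right if row[right_key] in wanted]


# Merge #1 with #2, merge #1 with #3, then append the two results.
output = (
    merge_matching(checks, customers_2, "QuickCheck", "Custom")
    + merge_matching(checks, customers_3, "QuickCheck", "Custom")
)
```

Here `output` holds the matching rows from tab #2 followed by the matching rows from tab #3, which is the shape the appended query would have.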

Related

Is there a way to match values in Excel when an individual cell has multiple values?

Above is a picture of my Excel sheet. I have 2 columns of data that have multiple data points in them (separated by commas). This is how my data is spit out after running an online psychology experiment. I'm hesitant to split text to columns because some lines only have 3 values and other lines have 20+. Essentially, I need to match values in one column to values in the second column. For example, the first value in column G needs to match with the first value in column H. The second value needs to match with the second value, etc. I don't need to match up every value in both columns, however. I only need a (defined) subset of values.
I'm not sure if this is possible to do in Excel (or any Excel add-on) without separating the values into separate columns, but any help is appreciated!
I've seen this before in survey data: the output uses "packed data" where each cell contains many values. You will need Excel 2010+ for Windows (or Excel 365) for this solution. Otherwise, there is a solution that is also Mac compatible and does not involve VBA, but it takes time to construct. This approach should take you 10 mins to do. It is a lot of steps, but it is just clicking.
Let's say that these are your data in two columns in a table.
Click anywhere inside the table. Open the Data tab and click on From Table/Range:
This will convert your data into an Excel Table and ask you if your table has headers - yes it does. Click OK.
This will open the Power Query (PQ) editor (congratulations, you are now a step closer to being a data scientist, so take a selfie with this screen in the background and share it on social media).
You will see in the Applied Steps on the right hand side that PQ has helpfully detected the data type in a step called Changed Type. You need to undo that because it will likely think that your comma separated numbers are just one giant number. So click the X on the left side of that step.
On the right side, you can expand out Queries as shown above. Right click on your table and select Duplicate.
NB: This is not the most efficient way to do this, but I think this is something you just want to do one time and you probably don't want to go hacking through the Advanced Editor.
So now you have two tables:
Rename Table1 (2) to Output in the box on the right hand side just to create some clarity.
Right Click on the Response RT column in Output and Remove it. Click on Table1 and do the same thing to the Response column. So now you have Table1 with only the Response RT and Output with only the Responses. Now we will parse these into rows of cleaned data.
Parse Table1
First, in Table1, click on the Response RT column and in the Home tab you will see Split Column. 1) Click on that and choose By Delimiter.
2) It will default to Comma, but you need to click on Advanced options and choose the Rows radio button.
Click OK and it should turn your data into rows of separated numbers and change the type (this time helpfully) to decimal.
Now you need to add an index. 3) Go to the Add Column tab and click on Add Index, starting from 1.
Parse Output Table
Now go back to Output and repeat steps 1), 2) and 3) for it as well. Then you will have to take an extra step to clean up your text column. Right-Click on the Response column and choose Transform > Trim on the data.
That will get rid of those spurious spaces.
Merge Them Back Together
While you still have the Output table selected, go to the Home tab and choose Merge Queries.
It will bring up this window:
Choose Table1 from the bottom dropdown. Click on Index on both tables and click OK. You will get something like this:
Click on the button on the top right of the Table1 column and then unselect Index and Use original column name as prefix.
Click OK. Right click the Index column and Remove it. You now have your answer, but you still need to bring it back to Excel.
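The split, index and merge steps above can be sketched in Python to show the logic. This is only an illustration, not what Power Query runs; the function name `pair_packed` and the sample strings are hypothetical.

```python
def pair_packed(responses, response_times):
    """Pair the i-th packed response with the i-th packed response time."""
    # Split each packed cell into rows and trim stray spaces (Split Column + Trim).
    resp = [r.strip() for r in responses.split(",")]
    times = [float(t) for t in response_times.split(",")]
    # Index the times from 1, as in the Add Index step.
    times_by_index = {i: t for i, t in enumerate(times, start=1)}
    # Merge on the index; a response without a matching time gets None.
    return [(r, times_by_index.get(i)) for i, r in enumerate(resp, start=1)]


# Hypothetical packed cells from one row of the sheet.
pairs = pair_packed("yes, no, yes", "0.52, 1.30, 0.75")
```

Matching on an explicit index rather than on position alone mirrors the merge on the Index column, so rows with unequal numbers of values still line up from the first value onward.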
Putting it back in Excel
Click on Close and Load To on the left-hand side of the Home tab. To keep things simple, just click OK.
It will put both Output and Table1 as worksheets into your workbook (this is where I said it is not the most efficient approach). You can always delete the Table1 worksheet; Excel will complain when you do, but you can ignore it. Output is your answer.
Congratulations, you just did an ETL (extract, transform and load) operation in data analytics. Take another selfie with the answer and share it on social media.

How to filter in excel when cell has multiple items?

I'm looking to create a contract list for my work. We are in construction and have multiple contractors that do various jobs. I have a column for trade (meaning what kind of work they do) that I'm trying to filter on.
Is there a way that I can separate them with commas, hashtags, etc?
Select your data, go to the Get & Transform section on the Data tab and click From Table.
A Power Query window will open.
Right-click the Trade column and click Split Column by delimiter.
You'll get split columns like this:
Select the three trade columns and click Unpivot. You'll get an output like this:
Close and load the data back to Excel. You'll get the desired output, where you can add a slicer of your choice.
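To show what the split and unpivot steps produce, here is a small Python sketch of the same reshaping. The contractor names and trades are invented for the example, and the function name is hypothetical.

```python
# Hypothetical sample rows; names and trades are assumptions.
contractors = [
    {"Contractor": "Smith Co", "Trade": "Plumbing, Electrical"},
    {"Contractor": "Jones Ltd", "Trade": "Roofing"},
]


def unpivot_trades(rows):
    """Turn one comma-separated Trade cell into one row per trade."""
    out = []
    for row in rows:
        for trade in row["Trade"].split(","):
            trade = trade.strip()
            if trade:  # skip empty entries left by stray commas
                out.append({"Contractor": row["Contractor"], "Trade": trade})
    return out


long_rows = unpivot_trades(contractors)
```

With one trade per row, an ordinary filter or slicer on the Trade column can select every contractor that does a given kind of work.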

Excel populate table from same workbook

I have a table on Sheet1. I want to pull in data into this table based on entry in another tab.
For example, I have a table
based on information in other tabs, I want to populate the Score column.
Can this be accomplished using a SQL query or Power Query (pseudocode: Select "Score" from other tab where Name = Jack)?
I could watch for events in VBA so that when data is entered in the other tabs it is grabbed and pasted into this table, but that seems messy.
The reason I want to do this is, there are multiple tabs where people can enter their Scores. They cannot enter this in the main tab otherwise I wouldn't have a problem.
Load all datasets from your worksheet into Power Query and keep them as connections only (w/o loading to a worksheet), except the one that you want to populate with the data in the end. After loading, you can merge the different datasets using selected columns as the key for joining. The result of the merge is a table from which you can retrieve the particular columns of interest.
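The merge described here behaves like a lookup keyed on Name. A minimal Python sketch of that idea follows; the tab contents, the extra names and the function `fill_scores` are assumptions for illustration only.

```python
# Hypothetical data: the main table and two entry tabs.
main = [
    {"Name": "Jack", "Score": None},
    {"Name": "Jill", "Score": None},
    {"Name": "Sam", "Score": None},
]
tab_a = [{"Name": "Jack", "Score": 82}]
tab_b = [{"Name": "Jill", "Score": 91}]


def fill_scores(main_rows, *entry_tabs):
    """Left join: keep every main row and pull Score from the entry tabs."""
    scores = {}
    for tab in entry_tabs:
        for row in tab:
            scores[row["Name"]] = row["Score"]
    # Names with no entry in any tab keep a Score of None.
    return [{**row, "Score": scores.get(row["Name"])} for row in main_rows]


filled = fill_scores(main, tab_a, tab_b)
```

If the same name appears in more than one tab, this sketch keeps the last value seen; a real workbook would need a rule for such conflicts.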

Excel: find and order matches by column

I'm currently working with a huge epidemiological dataset spread over several Excel files. The files contain pathology and clinical reports for almost 30k patients. Each patient can have several pathology and clinical reports. The patients are assigned a unique ID.
I want to combine all files into one so that the ID for patient X001 would contain all the information from all the files. I cannot just copy/paste because the number of rows (IDs) in the files varies.
Here is an example of what I want to accomplish.
I want to combine two lists as follows.
As you can see, List1 and List2 vary in row numbers. Also, there are IDs in List1 that are not found in List2 and vice versa.
I want to merge them so that they align and match, see image below. Can someone provide code for this? I cannot do this manually since I have 100k rows in List1 and 30k rows in List2; that would take several weeks, with a risk of errors.
You can merge the tables using Excel's built-in Power Query, which can be found under the Data tab.
Note: Photos are taken from Excel 2016
The first step is to create the queries:
Within the Get & Transform section under the Data tab, click on New Query -> From File -> From Workbook and select the workbook that has the table you want to merge
Select the appropriate sheets in which your tables are found, and confirm that they are displaying properly
If you notice that the table is not correct, you can make changes to it via the Edit button below.
For example, if you notice that your Column headers are being treated as a normal value, you can click Use First Row as Headers under the Power Query Editor Home -> Transform
I would also recommend changing the name of the query so it makes more sense down the line
Once you are happy with the way the query is looking, click on the Close and Load Dropdown menu under the Power Query Editor Home and select Close and Load To...
Select Only Create Connection to add it into your Workbook Queries without duplicating the table.
Repeat the above steps for each table that you are looking to merge.
Once you have all of your tables linked via Queries, you can now move on to merging them:
Under the same section of New Query select Combine Queries -> Merge
Select the two queries you are looking to merge in each of the respective boxes
Confirm that they are correct via the preview window (don't worry if not all rows show)
A rule of thumb is to select your largest query first and the smaller one second
Next, highlight the columns you want to merge on. For your example it would be the ID. This is done simply by clicking on the column within the preview
Finally change the Join Kind to Full Outer and click OK
From here you should be back in the Power Query Editor
The final steps are modifying this merged query to your desired output
You should notice that there is a new column added next to your first original table with the name of the query at the top, next to the name is a button that allows you to expand out this query.
Select the appropriate columns you would like to merge into the other table and click OK
If at any point you make a mistake, you can retrace your changes under Applied Steps within the Query Settings Pane
Once you are happy with the way your newly merged query looks, go ahead and click on Close and Load
You should now have access to your new merged query, which will update based on changes made to the original connected files
If you want to make any additional changes going forward from this point just click anywhere inside of the table and you should see both the Table Tools and Query Tools tabs appear at the top
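For readers who want to see what the Full Outer join kind does to two lists keyed on ID, here is a small Python sketch. It assumes one row per ID in each list (the real data can have several reports per patient, which this sketch does not handle), and the column names Pathology and Clinical are invented for the example.

```python
def full_outer_join(list1, list2, key="ID"):
    """Keep every ID from both lists; missing values become None."""
    cols1 = [c for c in list1[0] if c != key]
    cols2 = [c for c in list2[0] if c != key]
    # Assumption: each ID appears at most once per list.
    by_id_1 = {row[key]: row for row in list1}
    by_id_2 = {row[key]: row for row in list2}
    merged = []
    for id_ in sorted(by_id_1.keys() | by_id_2.keys()):
        row = {key: id_}
        for c in cols1:
            row[c] = by_id_1.get(id_, {}).get(c)
        for c in cols2:
            row[c] = by_id_2.get(id_, {}).get(c)
        merged.append(row)
    return merged


# Hypothetical sample lists with IDs missing on either side.
list1 = [{"ID": "X001", "Pathology": "P1"}, {"ID": "X003", "Pathology": "P3"}]
list2 = [{"ID": "X001", "Clinical": "C1"}, {"ID": "X002", "Clinical": "C2"}]
merged = full_outer_join(list1, list2)
```

IDs found in only one list still appear in the result, with None in the columns that come from the other list, which is the aligned layout the question asks for.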

Excel VBA Code for concatenating data in multiple columns and removing duplicates

In reference to the attached "Example Photo" Image...
I would like to concatenate the unique data in Columns I and K into one cell (separated by line break) and remove the duplicated information in the other columns. My goal is to have the data look like rows 2 and 7 without the duplicated rows in between.
You can download Power Query, or if you have Excel 2016 it is included by default under the name Get & Transform in the Data tab.
Select any cell in your main table.
Go to Power Query or Data and select From Table/Range.
A dialog box showing the range will appear; click OK.
It will open the Query Editor
Go to Home select Group by.
In the Options:
Group by: Add all the fields you don't want to concatenate.
New Column name: It could be "Group".
Operation: Select All Rows.
OK.
Go to Add Column and select Custom Column.
As the formula, enter the field to concatenate:
[Column named in the Group By step][Column name where the data to concatenate is]
Go to the new field and click in the right corner (Arrows) and select Extract Values....
Select delimiter #(lf) OK.
Go to Home tab and select Advanced Editor.
There, look for ""#(lf)"" and delete the extra quotation marks so that it reads "#(lf)", then click Done.
Go to Home and select Close & Load.
It will create a new sheet with a table with your new data.
Use Wrap Text in the Home tab to see the line breaks.
You can append more data to the main table; then just right-click the Power Query table and choose Refresh to get your updated data.
I made this tutorial. It is in Spanish but I am using the English Excel version.
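The group-and-concatenate idea behind these steps can be sketched in Python as follows. The column names and sample rows are hypothetical, and the de-duplication inside each group is added here to match the question's request for unique data; it is not something the steps above state explicitly.

```python
def group_and_concat(rows, group_cols, concat_col):
    """Group rows by `group_cols` and join unique `concat_col` values with line feeds."""
    groups = {}
    for row in rows:
        group_key = tuple(row[c] for c in group_cols)
        values = groups.setdefault(group_key, [])
        # Keep each value once per group, in first-seen order.
        if row[concat_col] not in values:
            values.append(row[concat_col])
    return [
        {**dict(zip(group_cols, group_key)), concat_col: "\n".join(values)}
        for group_key, values in groups.items()
    ]


# Hypothetical sample rows with duplicated information.
rows = [
    {"Order": "A1", "Item": "Bolt"},
    {"Order": "A1", "Item": "Nut"},
    {"Order": "A1", "Item": "Bolt"},
    {"Order": "B2", "Item": "Washer"},
]
result = group_and_concat(rows, ["Order"], "Item")
```

The line feed used as the separator plays the same role as the #(lf) delimiter, so each group collapses to one row with its values stacked inside a single cell.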
