From a table to two columns in Excel

I would like to create two columns from a table.
The desired result looks like this (two separate columns):
From this table:
The table is much longer than this; this is just an example.
Can you help me solve this?
Thanks,
Gergo

To load your existing data into Power Query (PQ), it needs to be a table. To make this easy, I would convert it to a table with "My table has headers" unchecked. Then put some labels on your identifier rows so they can be converted easily. It looked like this after I did it.
Then load your table into PQ with the "From Table" option. It will probably try to set types and promote headers by default, which isn't helpful, so delete those steps. It should look like this.
To be able to map your multiple row headers, we'll need to do some pivot transforms. Make a reference to the table we just imported.
Keep the first 3 rows, select the first column and "Unpivot Other Columns". Then select the first column again and "Pivot Columns". Select the "Value" column as your "Values Column" and under the "Advanced options" select "Don't Aggregate".
After this pivot you'll now have a table that maps all of your columns to their header rows. I converted the column with the numbers to text since we'll be appending it as text later on. The result table looks like this:
The full code for this query was:
let
Source = Table1,
#"Kept First Rows" = Table.FirstN(Source,3),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Kept First Rows", {"Column1"}, "Attribute", "Value"),
#"Pivoted Column" = Table.Pivot(#"Unpivoted Other Columns", List.Distinct(#"Unpivoted Other Columns"[Column1]), "Column1", "Value"),
#"Changed Type" = Table.TransformColumnTypes(#"Pivoted Column",{{"Region", type text}})
in
#"Changed Type"
Then go back and make another reference to your imported table. This time remove the top 3 rows, select the first column and Unpivot Other Columns. You'll get a list of your dates with all the other columns of data unpivoted next to them. Now you can Merge Queries by matching the Attribute columns from this table and the table where you mapped out your header rows.
Expand the merged data and now you have your header rows mapped to every line of data in your table. You can add a custom column that creates the unique ID column you wanted, then remove and move around the columns to get the result data you want. The custom column code looked like this for me:
[Country Code]&"_"&DateTime.ToText([Column1], "MM/dd/yyyy")&"_"&[Region]
And this was the result table:
The full M code for this part was:
let
Source = Table1,
#"Removed Top Rows" = Table.Skip(Source,3),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Removed Top Rows", {"Column1"}, "Attribute", "Value"),
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns",{"Attribute"},RowIDs,{"Attribute"},"RowIDs",JoinKind.LeftOuter),
#"Expanded RowIDs" = Table.ExpandTableColumn(#"Merged Queries", "RowIDs", {"Country Code", "Country", "Region"}, {"Country Code", "Country", "Region"}),
#"Added Custom" = Table.AddColumn(#"Expanded RowIDs", "ID", each [Country Code]&"_"&DateTime.ToText([Column1], "MM/dd/yyyy")&"_"&[Region]),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Value", "ID"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns",{"ID", "Value"})
in
#"Reordered Columns"
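For readers who want to follow the logic outside Power Query, the two queries above can be sketched in Python. This is only an illustration under assumptions: the sample values and the names header_rows, data_rows and to_two_columns are hypothetical and are not taken from the asker's workbook.

```python
from datetime import date


def to_two_columns(header_rows, data_rows):
    """Unpivot a wide table whose first three rows are header rows.

    header_rows: three lists (country code, country, region), one entry per data column.
    data_rows: lists shaped like [date, value, value, ...].
    Returns (id, value) pairs in the same shape as the ID column built in the query.
    """
    codes, _countries, regions = header_rows
    result = []
    for row in data_rows:
        day, values = row[0], row[1:]
        for code, region, value in zip(codes, regions, values):
            # Same pattern as the custom column: code_MM/dd/yyyy_region
            row_id = f"{code}_{day.strftime('%m/%d/%Y')}_{region}"
            result.append((row_id, value))
    return result


# Hypothetical sample data
header_rows = [
    ["AT", "AT"],            # country code
    ["Austria", "Austria"],  # country
    ["32", "38"],            # region, kept as text
]
data_rows = [
    [date(2020, 1, 1), 10, 20],
    [date(2020, 1, 2), 30, 40],
]
pairs = to_two_columns(header_rows, data_rows)
```

Each data cell becomes one (ID, value) pair, which is what the merge of the unpivoted data with the header map produces.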

You can use the unpivot function of Power Query to achieve this. The instructions below might at first seem long and difficult, but they take no more than a minute once you get the logic.
First, arrange your headers into just one row. I think the country name can be omitted. So create a new header row as row 4, concatenate the country code and the integer with a formula such as the following in B4, and copy it to the right:
=B1&B3
Now copy the 4th row and paste it onto the same row as Values Only. Then you may delete rows 1-3, leaving just one row as the header. You should now have AT32 in B1, AT38 in C1, etc.
Now get your data into Power Query. With a non-blank cell selected in your table, go to the Data ribbon and press the From Table/Range button. Your data is now in Power Query.
Select the columns with numeric values in them (not the dates) and press the Transform / Unpivot Columns / Unpivot Columns button. This will get your data into the requested format.
You may now load your unpivoted data back into Excel using the Home / Close & Load button.
To get all the country and date information into one cell, create a new column and use a formula like the one below:
= LEFT(B2,2) & "_" & TEXT(A2, "dd/mm/yyyy") & "_" & RIGHT(B2, 2)
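As a rough check of what that formula does, here is a small Python sketch of the same string logic. The function name build_id and the sample date are hypothetical. Note the assumption the formula makes: the country code and the integer are each exactly two characters long, so a header such as AT132 would be split incorrectly.

```python
from datetime import date


def build_id(day, header):
    # header[:2] mirrors LEFT(B2, 2); header[-2:] mirrors RIGHT(B2, 2)
    return f"{header[:2]}_{day.strftime('%d/%m/%Y')}_{header[-2:]}"


example = build_id(date(2020, 3, 5), "AT32")
```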

Related

Excel - Analyzing / Counting the comma separated values in cells

Here's an Excel sheet, where we track Demand of Products across the countries [Sheet/Table name: Data]
Country    Products
India      A
Australia  A,B
Brazil     B, C
This data will be used to understand the demand for the products across the countries, by simply counting the products for each country. This is what the data will look like:
Products  Demand
A         2
B         2
C         1
[Sheet/Table name: Product-Demand]
One way I was able to do this was to split the comma-separated values in the Products column and then use COUNTIFS.
However, this approach has drawbacks:
Every time the Data table is updated, it has to be manually copy-pasted to another sheet.
The list of products may grow, e.g. a new product "Z" may be launched.
We use Power BI to create a heat map of products in demand across the countries. The Power BI Service pulls the data every Monday, so any manual effort defeats the automation.
Please advise on the best way to count the products (or comma-separated values) in a cell with the least amount of manual work.
Thanks!
Where the data is in columns A and B on both sheets, and there is a header row with the data starting in row 2, on the Product-Demand sheet, in cell B2, place this formula:
=COUNTIF(Data!$B$2:$B$4,"*" & A2 & "*")
then drag down
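The wildcard COUNTIF counts every cell that contains the product text anywhere, and it ignores case. A minimal Python sketch of that behaviour, using the sample data from the question (the function name count_containing is hypothetical):

```python
def count_containing(cells, product):
    # COUNTIF(range, "*" & product & "*") counts cells containing the text,
    # without regard to case
    needle = product.lower()
    return sum(1 for cell in cells if needle in cell.lower())


products_column = ["A", "A,B", "B, C"]
demand = {p: count_containing(products_column, p) for p in ["A", "B", "C"]}

# Caveat: a substring match also counts longer product names that contain the text
overlap = count_containing(["AB"], "A")
```

Because this is a substring match, it is only reliable while no product name is contained in another product name.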
You need to transform the data into a more useful structure. One way to do that is to use Power Query.
Start with your source data as a table called Table1.
Then, paste the following code in the Advanced Editor in Power Query:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Country", type text}, {"Products", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type"," ","",Replacer.ReplaceText,{"Products"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value", "Products", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Products.1", "Products.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Products.1", type text}, {"Products.2", type text}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Country"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"})
in
#"Removed Columns"
This code will produce a table of usefully-structured data. The final table you want can be produced by a PivotTable using Power Query's output table as a source.
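The same reshaping can be sketched in Python to show what the query and the PivotTable do together. This is an illustration only; unpivot_products is a hypothetical name. Unlike the query above, which splits into the two fixed columns Products.1 and Products.2, this sketch accepts any number of products per cell.

```python
from collections import Counter


def unpivot_products(rows):
    """Turn (country, "A,B") rows into one (country, product) row per product."""
    long_rows = []
    for country, products in rows:
        # Remove spaces first, as the Replaced Value step does, then split on commas
        for product in products.replace(" ", "").split(","):
            if product:
                long_rows.append((country, product))
    return long_rows


rows = [("India", "A"), ("Australia", "A,B"), ("Brazil", "B, C")]
long_rows = unpivot_products(rows)

# Counting the long table plays the role of the PivotTable
demand = Counter(product for _country, product in long_rows)
```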

Grab information from multiple cells and display it in one cell based on specific text match

I'm at my wits end. What formula should I even go for? Can I achieve what I'm trying to do in Google Sheets?
I have attached a picture of a mockup what I'm trying to do.
Basically I have 2 tables. Let us call them "Calendar" - TABLE A and "Schedule" TABLE B.
TABLE A - I mark manually an event and a name or names behind it.
TABLE B - This is the table I'm trying to create a formula for. In the picture, the area colored green is where the formula should go. Basically, I'm trying to search TABLE A and match the date (19 July) with the name (Mary).
In other words, the formula should find every cell in TABLE A for 19 July that mentions Mary, and display those entries in TABLE B in the cell under Mary and 19 July.
It would be great if I could also trim the end result and remove the names, so if a cell contained "Mary and Richard", the formula would know to remove those names from the output.
So basically: find every cell with "Mary" in TABLE A, and display them in TABLE B in one cell.
Here's an option with a helper sheet.
Let's say your data is on a sheet called Data, range Data!A1:AB21:
Create a helper sheet called Formula:
Cell B1 gets the headers from the Data sheet:
={Data!A1:AB1}
Cells A2:A5 have the fixed names, Mark, John, Richard, Tom.
This formula goes in EVERY cell within the range B2:AC5 (drag down and drag across):
=arrayformula(iferror(textjoin(",",1,trim(regexreplace(filter(Data!A:A,regexmatch(Data!A:A,$A2)),textjoin("|",1,$A$2:$A$5),))),))
A final Sheet called 'Results' has this in cell A1:
=transpose(Formula!A:AC)
It's not as elegant as having an arrayformula in a single cell on the Formula sheet, but it does work once you've set the data range and dragged the formula down and across.
You can obtain your results using Power Query provided that:
There is a reproducible method of obtaining the Names
In your example, each name is a single word, and they all occur together at the end of an activity substring
Multi-word names will require a different algorithm to extract.
To use Power Query (available in Windows Excel 2010+ and Office 365)
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
Source = Excel.CurrentWorkbook(){[Name="TableA"]}[Content],
//need to hard code list of names
names = {"Mary","John","Richard","Tom"},
//Type all as text
#"Changed Type" = Table.TransformColumnTypes(Source,
List.Transform(Table.ColumnNames(Source), each {_, Text.Type})),
//Unpivot to => a two column list
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {}, "Dates", "Value"),
//Extract the names
#"Added Custom" = Table.AddColumn(#"Unpivoted Columns", "Names", each List.Intersect({Text.Split([Value]," "),names})),
//Extract the Activities
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Activity", each Text.Start([Value],Text.PositionOf([Value],[Names]{0})-1)),
//Remove unneeded column
#"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"Value"}),
//Expand the names to => single row for each name/date
#"Expanded Names" = Table.ExpandListColumn(#"Removed Columns", "Names"),
//Group by Date and Name
// Aggregate the activites
grouped = Table.Group(#"Expanded Names",{"Dates","Names"},{
{"Activities", each Text.Combine([Activity],", "), type text}
}),
#"Pivoted Column" = Table.Pivot(grouped, List.Distinct(grouped[Names]), "Names", "Activities"),
//reArrange columns to desired order
#"Reordered Columns" = Table.ReorderColumns(#"Pivoted Column",List.Combine({{"Dates"}, names}))
in
#"Reordered Columns"
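The algorithm in the M code can be sketched in Python as follows. The calendar entries are hypothetical sample data and build_schedule is a hypothetical name. As in the query, the sketch assumes that each name is a single word and that the names come after the activity text.

```python
def build_schedule(calendar, names):
    """calendar maps a date label to a list of cell texts such as 'Dentist Mary Richard'."""
    collected = {}
    for day, cells in calendar.items():
        for cell in cells:
            words = cell.split(" ")
            found = [name for name in names if name in words]
            if not found:
                continue
            # The activity is everything before the first name in the cell
            first_name_at = min(words.index(name) for name in found)
            activity = " ".join(words[:first_name_at])
            for name in found:
                collected.setdefault((day, name), []).append(activity)
    # One combined text per date/name pair, like the Group By step
    return {key: ", ".join(acts) for key, acts in collected.items()}


names = ["Mary", "John", "Richard", "Tom"]
calendar = {
    "19 July": ["Dentist Mary Richard", "Gym Mary"],
    "20 July": ["Meeting Tom"],
}
schedule = build_schedule(calendar, names)
```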

Merging multiple data sources for a line graph in Excel and working with uneven data sizes

I have 15ish data sources that contain a date and some values. But the dates can be different from one sheet to another.
Sheet 1
Date        Infected
2020-03-28  10
2020-03-29  20
...
Sheet 2
Date        Infected
2020-04-15  5
2020-04-16  7
...
My goal is to produce a combined line graph containing all the sheets, but some tables have more data than others with the date series.
I can think of only one option to make them all the same size and cover all date values: Merge queries in Power Query (essentially many left joins to bring all sheets together and combine all the dates).
Is there another option I'm missing to combine these tables? Something at the graph level maybe so they can all refer to their own date series?
Tables that are structured like this and stored in different worksheets can be merged using the pivot function. Here are 2 scenarios based on the assumption each table is saved on a separate worksheet within a single Excel workbook:
1. Each table is saved as an Excel Table. The Tables are combined into a single table using Power Query and the resulting table is then loaded into a new worksheet.
2. The tables are not saved as Excel Tables, and doing this manually or figuring out how to execute this process with a VBA script seems like a lot of work. In this case, the tables can be more easily combined by creating a new Excel workbook and then connecting to the file containing the tables by using Power Query.
Here is an example for each scenario. The dataset consists of 3 tables each containing 10 rows where the dates between tables overlap a bit.
Scenario 1 - Data stored in Excel Tables
Here is what the sample data looks like in the first worksheet containing an Excel Table named Source1:
With all tables saved as Excel Tables and named following the same convention, it is time to open the Power Query Editor and complete the following steps:
Create a blank query and type = Excel.CurrentWorkbook() in the formula bar to list all the named objects contained in the workbook.
Filter the Name column to keep only rows with names starting with Source to avoid including the table combining all the data into this query once it is completed and refreshed.
Click on the expand button of the Content column, uncheck Use original column name as prefix and click OK.
Change the Date column data type to date.
Pivot the Name column with the Infected column as the Values Column and with Don't Aggregate selected under Advanced Options.
Now the data is merged appropriately and can be loaded in a new worksheet to create the line graph:
Here is the M code:
let
Source = Excel.CurrentWorkbook(),
#"Filtered Rows" = Table.SelectRows(Source, each Text.StartsWith([Name], "Source")),
#"Expanded Content" = Table.ExpandTableColumn(#"Filtered Rows", "Content", {"Date", "Infected"}, {"Date", "Infected"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Content",{{"Date", type date}}),
#"Pivoted Column" = Table.Pivot(#"Changed Type", List.Distinct(#"Changed Type"[Name]), "Name", "Infected")
in
#"Pivoted Column"
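To show what the pivot step produces when the date ranges only partly overlap, here is a small Python sketch. The function name pivot_by_date and the sample rows are hypothetical; the dates are ISO-formatted strings, so sorting them as text also sorts them chronologically.

```python
def pivot_by_date(sources):
    """sources maps a table name to a list of (date_string, infected) rows."""
    names = list(sources)
    by_date = {}
    for name, rows in sources.items():
        for day, infected in rows:
            by_date.setdefault(day, {})[name] = infected
    # One row per date, one column per source, None where a source has no value
    return [[day] + [by_date[day].get(name) for name in names]
            for day in sorted(by_date)]


sources = {
    "Source1": [("2020-03-28", 10), ("2020-03-29", 20)],
    "Source2": [("2020-03-29", 5), ("2020-03-30", 7)],
}
table = pivot_by_date(sources)
```

Dates missing from a source end up as empty cells, so every series in the line graph can share one date axis.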
Scenario 2 - Data not stored in Excel Tables
In this scenario, the data is stored in an external Excel workbook datasources.xlsx. Here is what the sample data looks like in the first worksheet:
In a separate Excel workbook, open the Power Query Editor and complete the following steps:
Create a blank query and type the following line in the formula bar using the appropriate file path: = Excel.Workbook(File.Contents("C:\FilePath\datasources.xlsx"))
Remove all columns except Name and Data.
Click on the expand button of the Data column, uncheck Use original column name as prefix and click OK.
Promote the first row to column headers.
Edit the data types for the Date and Infected columns.
Select the Date column and remove rows containing errors.
Pivot the Sheet1 column with the Infected column as the Values Column and with Don't Aggregate selected under Advanced Options.
Now the data is merged appropriately and can be loaded in a new worksheet to create the line graph, yielding the same result as in Scenario 1 except for the column headers and legend labels:
Here is the M code:
let
Source = Excel.Workbook(File.Contents("C:\FilePath\datasources.xlsx")),
#"Removed Columns" = Table.RemoveColumns(Source,{"Item", "Kind", "Hidden"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Columns", "Data", {"Column1", "Column2"}, {"Column1", "Column2"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Expanded Data", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Sheet1", type text}, {"Date", type date}, {"Infected", Int64.Type}}),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type", {"Date"}),
#"Pivoted Column" = Table.Pivot(#"Removed Errors", List.Distinct(#"Removed Errors"[Sheet1]), "Sheet1", "Infected")
in
#"Pivoted Column"

Concatenate multiple values into one cell, based on a lookup ID

I'm trying to concatenate multiple values in one cell based on a lookup ID. The problem is that this ID is sometimes alone in its cell, but at other times it sits among multiple IDs in the same cell (at the beginning, middle or end, separated by commas). I've been using the formula below, but it only returns the first value where the ID is alone.
Current formula --> =TEXTJOIN(",",TRUE,IFERROR(XLOOKUP(A2,D:D,E:E),"ID not found"))
Hope you can help.
Thanks.
Rows highlighted in blue, yellow and green are the expected results (I did them manually).
Row 7 is the actual result (wrong/incomplete) for the current formula.
You can try below approach to get results:
=TEXTJOIN(",",TRUE,IF(ISNUMBER(SEARCH(A2,D:D,1)),E:E,""))
One suggestion would be to limit the entire column usage to improve formula speed.
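A short Python sketch of what this formula computes, with hypothetical sample data and a hypothetical function name. SEARCH is a case-insensitive substring test, so an ID such as 101 also matches a cell containing 1010; that is worth checking against your real IDs.

```python
def lookup_join(target_id, id_cells, values):
    # ISNUMBER(SEARCH(id, cell)) is true when the cell contains the id anywhere
    needle = target_id.lower()
    matches = [value for cell, value in zip(id_cells, values)
               if needle in cell.lower()]
    # TEXTJOIN(",", TRUE, ...) joins the matches and skips the empty results
    return ",".join(matches)


id_cells = ["101", "100,101,102", "1010"]
values = ["Red", "Blue", "Green"]
joined = lookup_join("101", id_cells, values)
```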
This can also be done via power query. I used different ID values from your table on the right, but it should still work correctly.
Create a table that has ID's in Col1 and the Value to be Returned in Col2
Open the table in Power Query (Data tab > Get & Transform Data > From Table/Range).
If you don't have power query, reference this Complete Guide to Installing Power Query.
Power Query Steps
Change type of Col: Value to be Returned to text
Split Col: ID by Delimiter = ','
You should now have multiple ID columns (ID.X). Select all ID columns > right-click on the header > select Unpivot Columns
Remove any unnecessary columns
Select the column with all of your ID's > right click and select Group By
Under new column name, enter a new column header. Change the Operation to Sum. For Column, select your column that contains your Values to be Returned
Reference this guide for the next step. You need to manually configure the M code in the formula bar & change the formula from List.Sum([COL]).. to Text.Combine([COL], ",")..
The last step is to make sure that your new column is a text column, not a number.
I've attached a copy of my workbook, which should hopefully help. If not, I've pasted my code from the Advanced Editor below in case that is helpful. Be sure to update Table & Column names accordingly based on your workbook.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Value to be returned", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "ID", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"ID.1", "ID.2", "ID.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"ID.1", type text}, {"ID.2", type text}, {"ID.3", type text}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Value to be returned"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Value", "Value to be returned"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns", {"Value"}, {{"Value.1", each Text.Combine([Value to be returned], ", "), type number}}),
#"Changed Type2" = Table.TransformColumnTypes(#"Grouped Rows",{{"Value.1", type text}})
in
#"Changed Type2"
Let me know if this works or if you have any follow up questions.
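The split, unpivot and group steps can be sketched in Python like this. The sample rows and the name group_values_by_id are hypothetical. Stripping spaces around each ID is an added assumption that the query above does not make. Because each ID is compared as a whole, 101 and 1010 stay separate.

```python
def group_values_by_id(rows):
    """rows: (id_cell, value) pairs, where id_cell may hold several comma-separated IDs."""
    grouped = {}
    for id_cell, value in rows:
        for single_id in id_cell.split(","):
            single_id = single_id.strip()  # assumption: ignore spaces around IDs
            if single_id:
                grouped.setdefault(single_id, []).append(value)
    # Mirrors Text.Combine([Value to be returned], ", ") in the Group By step
    return {key: ", ".join(vals) for key, vals in grouped.items()}


rows = [("101", "Red"), ("100,101,102", "Blue"), ("1010", "Green")]
combined = group_values_by_id(rows)
```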

Excel Separate Pipe Delimited String into separate columns - where value starts with A, B, C but values in string are not chronological

I have an Excel sheet which has values from a database. One cell has responses from a multiple-choice question, stored as a pipe-delimited string. E.g. a user could be asked "What colour cars have you had?" with options such as:
A. Black
B. White
C. Red
D. Yellow
So a user can respond with A, C and D. These values are stored in one cell as "A. Black|C. Red|D. Yellow". I want to separate these values into an A column, a B column, a C column and a D column.
I tried using the Text to Columns feature, but it does not know that the A column should only contain As.
I think I need to add a formula to each column which looks for "A.", "B.", "C." or "D." and then finds the next pipe character. I think I need a substring function of some sort. Something like this, maybe:
=LEFT(C2,LEN(C2)-FIND("A.",C2))
But I don't know how to find the next occurrence of the pipe symbol - any ideas? Is there a nextIndexOf function in excel maybe?
Many thanks in advance
I think this would require either VBA or Power Query, because you are talking about having to analyze the data to determine what column it goes into. It's actually very easy to do in Power Query, but if you've never even heard of Power Query (it's built into Excel 2016+ and is a free add-in for 2013) then this solution is probably out of scope without an extensive explanation of what Power Query is.
That being said, for anyone interested, I took this dummy set of data, and converted it into an output table with Power Query which I believe is what was being asked in the question.
This is the code for the Query. A quick summary is we split it by the pipe delimiters and index it. Then the main trick is using the Group function combined with the Transpose function to get the split columns into an unpivoted data set. After expanding back out from the Group we can then split out the "ABCD" section from the answer, and pivot off that to get the ABCD columns aligned with the original entries.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 1, 1),
#"Grouped Rows" = Table.Group(#"Added Index", {"Index"}, {{"Rows", each Table.Transpose(Table.RemoveColumns(_, {"Index"})), type table}}),
#"Expanded Rows" = Table.ExpandTableColumn(#"Grouped Rows", "Rows", {"Column1"}, {"Rows.Column1"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Rows", each ([Rows.Column1] <> null)),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Rows.Column1", Splitter.SplitTextByDelimiter(". ", QuoteStyle.Csv), {"Selection", "Answer"}),
#"Pivoted Column" = Table.Pivot(#"Split Column by Delimiter1", List.Distinct(#"Split Column by Delimiter1"[Selection]), "Selection", "Answer")
in
#"Pivoted Column"
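For comparison, the core idea (split on the pipe, split off the letter, then place each answer in the column for its letter) can be sketched in a few lines of Python. The function name split_answers and the fixed list of letters are assumptions made for the illustration.

```python
def split_answers(cell, letters=("A", "B", "C", "D")):
    """Place each 'X. Answer' part of a pipe-delimited cell under its letter."""
    row = {letter: None for letter in letters}
    for part in cell.split("|"):
        letter, separator, answer = part.partition(". ")
        # Ignore parts that have no '. ' separator or an unexpected letter
        if separator and letter in row:
            row[letter] = answer
    return row


row = split_answers("A. Black|C. Red|D. Yellow")
```

Letters that were not chosen stay empty, so the A, B, C and D columns line up regardless of the order of the values in the string.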

Resources