I am working with an Excel sheet where several values are entered in a single cell of a particular column, separated by new lines.
For example, in Fig. 1, Col D and Col E contain values separated by new lines, i.e. A = Very Good, Needs Improvement. What I am trying to get is the same data in another form, as shown. Any pointers in this regard would be helpful.
Try using "Get & Transform", aka Power Query.
Steps:
Select your data and load it (with headers) into PQ.
Add a new custom column (named 'Custom' for example) and use the following custom column formula:
Table.FromColumns({Text.Split([Grades],"#(lf)"), Text.Split([Comment],"#(lf)")})
On the newly created column, click the expand button (top right) and expand both columns.
Delete the original columns 'Grades' and 'Comment'.
Additionally you could rename the last two columns back to 'Grades' and 'Comment'.
To make things a little easier, you could also just apply the following M code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Custom", each Table.FromColumns({Text.Split([Grades],"#(lf)"), Text.Split([Comment],"#(lf)")})),
    #"Expanded {0}" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1", "Column2"}, {"Custom.Column1", "Custom.Column2"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded {0}",{"Grades", "Comment"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.Column1", "Grades"}, {"Custom.Column2", "Comment"}})
in
    #"Renamed Columns"
Your end result should look like:
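For readers who want to check the row-splitting logic outside Power Query, here is a minimal Python sketch of the same idea: split each multi-line cell, pair up the n-th line of every column, and emit one row per pair. The function name and the sample row are made up for illustration and are not taken from the question's screenshot.

```python
from itertools import zip_longest

def split_rows(rows, columns, sep="\n"):
    """Expand rows whose `columns` hold newline-separated values."""
    out = []
    for row in rows:
        # Split each multi-line cell into a list of its lines.
        parts = [str(row[c]).split(sep) for c in columns]
        # Pair the n-th line of every column; shorter lists are padded
        # with None, much as Table.FromColumns pads with null.
        for values in zip_longest(*parts):
            new_row = dict(row)
            new_row.update(zip(columns, values))
            out.append(new_row)
    return out

# Hypothetical sample data.
sample = [{"Name": "A",
           "Grades": "Very Good\nNeeds Improvement",
           "Comment": "c1\nc2"}]
result = split_rows(sample, ["Grades", "Comment"])
```

All other columns (here only "Name") are copied unchanged onto every new row, which is what expanding the nested table does in the query.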
Try esProc. The following code splits the multi-line text in an Excel cell and expands it into multiple rows.
A
1 =file("data.xlsx").xlsimport#t()
2 =A1.run(Grades=Grades.split("\n"),Comment=Comment.split("\n"))
3 =A2.news(Grades.len();Names,Class,Year,Grades(#):Grades,Comment(#):Comment)
4 =file("result.xlsx").xlsexport#t(A3)
For more explanation, see http://c.raqsoft.com/article/1609902051322
DISCLAIMER: This is about our tool esProc. It’s freemium.
I have some data in my CSV file and I need to delete all rows except every 5th. How can I do that?
I'd advise you to load the CSV into PowerQuery. Though PQ by no means is my forte, I'd then take the following steps:
Add an Index-Column with a starting index of '1' and a standard increment of '1';
Add a custom column based on modulus 5, e.g.: =Number.Mod([Index],5)=0;
Filter your custom column based on 'TRUE' values;
Remove the index- & custom column.
For example:
Add the index column:
Add the custom column:
Filter the custom column:
Delete the index- and custom column:
End up with only every 5th row:
For what it's worth, this is the M code I used, which loads the data from a table in my worksheet (Source). You can load your data from the CSV instead:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each Number.Mod([Index],5)=0),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = true)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "Custom"})
in
    #"Removed Columns"
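The same index-and-modulus idea can be sketched in Python for anyone who wants to check the logic outside Power Query. The function name and the sample rows below are made up for illustration.

```python
def keep_every_nth(rows, n=5):
    """Return only the rows whose 1-based position is a multiple of n."""
    # enumerate(..., start=1) plays the role of the 1-based Index column;
    # index % n == 0 is the same test as Number.Mod([Index], 5) = 0.
    return [row for index, row in enumerate(rows, start=1) if index % n == 0]

# Hypothetical data: twelve rows named row1 .. row12.
rows = [f"row{i}" for i in range(1, 13)]
kept = keep_every_nth(rows)
```

Because the filter builds a new list instead of deleting in place, no helper columns have to be removed afterwards.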
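Untested, but in VBA it will be something like this:
Sub KeepEveryFifthRow()
    Dim i As Long, lastRow As Long
    ' Assumes the data is in column A of the active sheet, starting at row 1.
    lastRow = Cells(Rows.Count, 1).End(xlUp).Row
    For i = lastRow To 1 Step -1
        ' Delete every row whose number is not a multiple of 5.
        If i Mod 5 <> 0 Then Cells(i, 1).EntireRow.Delete
    Next i
End Sub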
It is crucial to go from end to beginning, or you'll mess up the indexes in your rows :-)
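To see why the direction matters, here is a small Python sketch (an illustration only, not VBA) that deletes from a list in place while walking from the last row to the first. The function name and the sample list are hypothetical.

```python
def delete_except_every_nth(rows, n=5):
    """Delete in place every row whose 1-based number is not a multiple of n."""
    # Walk from the last row number down to 1. Deleting row i only shifts
    # the rows after it, and those have already been handled.
    for i in range(len(rows), 0, -1):
        if i % n != 0:
            del rows[i - 1]  # convert the 1-based row number to a 0-based index
    return rows

# Hypothetical data: the values double as the original row numbers.
remaining = delete_except_every_nth(list(range(1, 13)))
```

Running the loop forwards instead would shift the remaining rows up after each deletion, so the row numbers being tested would no longer match the original rows.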
You could make a helper column with this formula:
=MOD(ROW(A2)-ROW($A$2)+1;5)
Replace the semicolon with a comma if your Excel locale requires it.
ROW($A$2) is the first row of the data.
Then apply a filter (Data --> Filter) and remove the tick for the 0, so that only the rows to be deleted stay visible.
Then:
Delete the visible sheet rows
Remove the filter
Remove the helper column
Then you can export the Excel file as csv.
Opening a csv in Excel, changing it and saving, might not work.
The end goal would be to change the granularity of a report, where each row would be repeated X times (where X is the nr of IDs in one cell), with the relevant ID on each row
So data like this
which should be displayed as such
Is there a way in which each row can be repeated, with the relevant IDs from the 4th column?
I tried something in Power Query Editor, but I only figured out a way to create more columns based on how many IDs there are, which is not the ideal solution.
I also found this article, which is really helpful: https://www.extendoffice.com/documents/excel/4054-excel-duplicate-rows-based-on-cell-value.html#a1. Yet it only solves half of the problem, as it only duplicates the rows based on how many IDs there are. How can this be done in a way that also populates the relevant ID?
You can use this query:
let
    Source = Excel.CurrentWorkbook(){[Name="Sheet1"]}[Content],
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "IDs", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"Ids.1", "Ids.2", "Ids.3"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Split Column by Delimiter", {"Name", "date", "detail"}, "Attribute", "ID"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"})
in
    #"Removed Columns"
It first uses the "split column" function and then unpivots the table by keeping the first three columns.
You have to adjust the sheet name and the column names as well.
2nd option:
let
    Source = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Ids", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Ids")
in
    #"Split Column by Delimiter"
This one uses the advanced options of the Split Column dialog and splits into rows instead of columns.
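Both queries end up with one output row per ID. As a rough illustration of that reshaping, here is a minimal Python sketch. The function name, the column names and the sample row are assumptions made for the example, not data from the question.

```python
def explode_ids(rows, column="Ids", sep="|"):
    """Repeat each row once per delimited value found in `column`."""
    out = []
    for row in rows:
        for value in row[column].split(sep):
            # Copy the row and overwrite the ID cell with a single ID.
            out.append({**row, column: value})
    return out

# Hypothetical sample row with three pipe-separated IDs.
sample = [{"Name": "x", "date": "2021-01-01", "detail": "d", "Ids": "1|2|3"}]
exploded = explode_ids(sample)
```

Unlike the first query, this does not need to know in advance how many IDs a cell can hold, which is also the advantage of splitting into rows in the second option.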
I would like to extract the top 5 players based on the sales by each employee (without Pivot Table / Auto filter).
Please refer to my input and output screenshot.
Any suggestions on how to obtain the top 5 ranks (even if repeated, as shown in the screenshots)?
I have already checked "Extract Top 5 Values for Each Group in a List without VBA" and some other links.
Thanks in advance for your time and consideration! Please let me know if my request is unclear and/or if you have any specific questions.
This is what I use to track the top 5 absentees...
Edit to suit your needs.
Formula in cell A1:
=INDEX(A$13:A$52,AGGREGATE(15,6,ROW($1:$40)/(B$13:B$52=B1),COUNTIF(B$1:B1,B1)))
Formula in cell B1:
=LARGE(B$13:B$52,ROW())
An alternative approach is to use Power Query, which is available in Excel 2010 Professional Plus and all later versions of Excel.
Steps are:
Add your input data table to the Power Query Editor;
Sort the table by Sales (descending), then by Employee (ascending);
Add an Index Column starting from 1;
Filter the Index column to show values less than or equal to 5;
Remove the Index column, then you should have something like the following:
Close & Load the output table to a new worksheet (by default).
Here is the Power Query M code for your reference. All the functions used are available in the GUI, so it should be easy and straightforward.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Employee", type text}, {"Month", type text}, {"Sales", type number}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Sales", Order.Descending}, {"Employee", Order.Ascending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
    #"Filtered Rows" = Table.SelectRows(#"Added Index", each [Index] <= 5),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
in
    #"Removed Columns"
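The sort-then-take-five logic of the query can be sketched in Python as follows. The function name and the sample figures are invented for the example and do not come from the screenshots.

```python
def top_n(rows, n=5):
    """Return the n rows with the highest Sales; ties are ordered by Employee."""
    # Negating Sales sorts it in descending order, while Employee stays
    # ascending, mirroring the two sort keys in the query.
    ordered = sorted(rows, key=lambda r: (-r["Sales"], r["Employee"]))
    return ordered[:n]

# Hypothetical data with a tie at 300.
data = [
    {"Employee": "Ann", "Sales": 100},
    {"Employee": "Cid", "Sales": 300},
    {"Employee": "Bob", "Sales": 300},
    {"Employee": "Dee", "Sales": 50},
    {"Employee": "Eve", "Sales": 200},
    {"Employee": "Fay", "Sales": 150},
    {"Employee": "Gus", "Sales": 10},
]
top5 = top_n(data)
```

Repeated sales values are kept as separate rows, so two employees with the same figure both appear as long as they fall within the first five positions.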
Let me know if you have any questions. Cheers :)
Try this one. As you have in your sample:
On Cell E16:
=VLOOKUP(LARGE($C$3:$C$12,ROW()-15),CHOOSE({2,1},$A$3:$A$12,$C$3:$C$12),2,FALSE)
On Cell F16:
=VLOOKUP(LARGE($C$3:$C$12,ROW()-15),CHOOSE({2,1},$B$3:$B$12,$C$3:$C$12),2,FALSE)
On Cell G16:
=VLOOKUP(LARGE($C$3:$C$12,ROW()-15),$C$3:$C$12,1,FALSE)
You can drag it down to get the list sorted.
Hope it helps!
How do you remove duplicate values from a single Excel cell (A1) using Power Query?
For example:
Anish,Anish,Prakash,Prakash,Prakash,Anish~,Anish~
Desired result:
Anish,Prakash,Anish~
Using Power Query, you can refer to a single cell in the current workbook if it is a named range. You could then use something like this, to list the distinct values:
let
    Source = Excel.CurrentWorkbook(){[Name="MyCell"]}[Content],
    #"Split List" = Text.Split(Source{0}[Column1],","),
    #"Removed Duplicates" = List.Distinct(#"Split List"),
    #"Combine Values" = Text.Combine(#"Removed Duplicates",",")
in
    #"Combine Values"
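The split, de-duplicate and re-join steps of this query can be sketched in Python as well. This is only an illustration of the logic, and the function name is made up.

```python
def dedupe_cell(text, sep=","):
    """Remove repeated items from a delimited string, keeping first occurrences."""
    # dict.fromkeys keeps one key per distinct item, in order of first
    # appearance, which matches what List.Distinct does to the split list.
    return sep.join(dict.fromkeys(text.split(sep)))

cleaned = dedupe_cell("Anish,Anish,Prakash,Prakash,Prakash,Anish~,Anish~")
# cleaned == "Anish,Prakash,Anish~"
```

The comparison is exact, so "Anish" and "Anish~" count as different values, as in the desired result above.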
I am new to M code. However, for others who might have a similar level of experience, I studied it a little and I think the following might be easier to understand:
#"Added Custom1" = Table.AddColumn(#"Extracted Values1", "Split1", each Text.Split([#"Cust"],",")),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "RemoveDuplicate1", each List.Distinct([#"Split1"])),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "CombineValue1", each Text.Combine([#"RemoveDuplicate1"],",")),
Simply copy the code above into the Advanced Editor and change the column names accordingly. In my case, the column names are Cust, Split1, RemoveDuplicate1 and CombineValue1. Of course, the names of the added columns might also be different in your query.
Basically, the three lines create three columns. If you create the three columns manually, you only need to copy and paste the expression after "each" from the corresponding line above.
See below: