Excel: How can I delete all rows except every fifth?

I have some data in a CSV file and I need to delete all rows except every 5th. How can I do that?

I'd advise you to load the CSV into Power Query. Though PQ is by no means my forte, I'd take the following steps:
Add an index column with a starting index of 1 and an increment of 1;
Add a custom column based on modulus 5, e.g. =Number.Mod([Index],5)=0;
Filter the custom column on TRUE values;
Remove the index and custom columns.
For what it's worth, this is the M code I used, with the data loaded from a table in my worksheet (the Source step). You can load the data from the CSV instead:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each Number.Mod([Index],5)=0),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = true)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "Custom"})
in
#"Removed Columns"
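For anyone working outside Excel, the same index-and-modulus idea can be sketched in plain Python. This is only an illustration under assumptions: the function name keep_every_nth_row, the has_header flag and the sample data are made up for the example and are not part of the answer above.

```python
import csv
import io


def keep_every_nth_row(csv_text, n=5, has_header=True):
    """Return CSV text that keeps the header (if any) and every n-th data row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = (rows[:1], rows[1:]) if has_header else ([], rows)
    # 1-based index, mirroring the index column that starts at 1
    kept = [row for index, row in enumerate(data, start=1) if index % n == 0]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(header + kept)
    return out.getvalue()


# Hypothetical sample: a header plus 12 data rows
sample = "name,value\n" + "".join(f"row{i},{i}\n" for i in range(1, 13))
print(keep_every_nth_row(sample))
```

With the 12-row sample only the 5th and 10th data rows survive, which matches the filter on Number.Mod([Index],5)=0.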

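Untested, but it will be something like this (VBA, assuming the data starts in row 1 of the active sheet with no header row):
Sub KeepEveryFifthRow()
    Dim i As Long, lastRow As Long
    ' Last used row in column A
    lastRow = Cells(Rows.Count, 1).End(xlUp).Row
    ' Loop from the end to the beginning
    For i = lastRow To 1 Step -1
        If i Mod 5 <> 0 Then Rows(i).Delete
    Next i
End Sub
It is crucial to go from end to beginning; otherwise each deletion shifts the rows below it and the row numbers no longer match :-)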
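The reason for walking backwards can be illustrated with a small Python sketch on a plain list (the function name and the sample list are hypothetical). Deleting an element only shifts the elements after it, so positions that have not been visited yet keep their original numbers when the loop runs from the end.

```python
def keep_every_fifth_in_place(rows):
    """Delete every row whose 1-based position is not a multiple of 5."""
    # Walk from the last position down to the first, so a deletion never
    # changes the position of a row that still has to be checked.
    for position in range(len(rows), 0, -1):
        if position % 5 != 0:
            del rows[position - 1]
    return rows


rows = list(range(1, 13))  # stand-in for sheet rows 1..12
print(keep_every_fifth_in_place(rows))
```

A forward loop over the same positions would skip rows after each deletion and eventually run past the end of the shortened list.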

You could add a helper column with this formula:
=MOD(ROW(A2)-ROW($A$2)+1;5)
Replace the semicolon with a comma if your Excel version requires it.
ROW($A$2) refers to the first row of the data, so the formula returns 0 on every 5th data row.
Then:
Apply a filter (Data --> Filter) on the helper column and remove the tick for 0, so only the rows to be deleted stay visible
Delete the visible sheet rows
Remove the filter
Remove the helper column
Then you can export the Excel file as CSV.
Opening a CSV directly in Excel, changing it and saving it might not work.

Related

Merge rows before executing transpose in Power Query?

I need to merge these project description rows into a single row so that every record has the same number of rows and I can transpose them into proper columns with Power Query (see image). I understand how to do the transpose in Power Query when the number of rows is consistent across records, but I cannot figure out how to do it when the number of rows differs. The data comes from a PDF that is horribly formatted and breaks the Project Description information into separate rows. That is the key problem; the rest is easy. See the snippet for what I mean.
Each transposed record will have seven columns:
Director Analysis
Address
Project
Area
Notice Date
Project Description
Appeal
I can get everything I need, including the headers. I just can't figure out how to merge the rows under Project Description so that I can proceed with the transpose.
This is a kludge, but it seems to work. It assumes the column to operate on is named a in Power Query.
It looks for everything between the row that contains "Project Description" and the row that contains "Appeals must be":
Create a shifted copy of the column, so each row can see what is on the row above
Add an index
Use custom columns to determine which rows need filtering out, and which rows are the start and end rows to combine, based on the first column and the shifted column
Merge the text together based on that info, merge the result back into the original table, then remove the extra rows
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// create shifted row
shiftedList = {null} & List.RemoveLastN(Source[a],1),
custom3 = Table.ToColumns(Source) & {shiftedList},
custom4 = Table.FromColumns(custom3,Table.ColumnNames(Source) & {"Next Row Header"}),
#"Added Index" = Table.AddIndexColumn(custom4, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try if Text.Contains([Next Row Header],"Project Description" ) then [Index] else if Text.Contains([a],"Appeals must be") then [Index] else null otherwise 0),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
#"Added Custom1" = Table.AddColumn(#"Filled Down", "Custom.1", each try if Text.Contains([Next Row Header],"Project Desc") then "remove" else if Text.Contains([a],"Appeals must be") then "keep" else null otherwise "keep"),
#"Filled Down1" = Table.FillDown(#"Added Custom1",{"Custom.1"}),
#"Filtered Rows1" = Table.SelectRows(#"Filled Down1", each ([Custom.1] = "remove")),
#"Grouped Rows1" = Table.Group(#"Filtered Rows1", {"Custom"}, {{"Count", each Text.Combine(List.Transform([a], Text.From), ","), type text}}),
#"Merged Queries" = Table.NestedJoin(#"Filled Down1", {"Index"}, #"Grouped Rows1", {"Custom"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Count"}, {"Count"}),
#"SwapValue"= Table.ReplaceValue( #"Expanded Table2", each [Custom.1], each if [Count] = null then [Custom.1] else "keep", Replacer.ReplaceValue,{"Custom.1"}),
#"Final Swap"=Table.ReplaceValue(#"SwapValue",each [a], each if [Count]=null then [a] else [Count] , Replacer.ReplaceValue,{"a"}),
#"Filtered Rows" = Table.SelectRows(#"Final Swap", each ([Custom.1] = "keep")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Next Row Header", "Index", "Custom", "Custom.1", "Count"})
in #"Removed Columns"
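The underlying idea (collapse everything between a start marker row and an end marker row into one row) can be sketched in plain Python. This is a simplified illustration, not a translation of the query above: the function name, the default marker strings and the sample lines are assumptions.

```python
def merge_description_rows(lines, start="Project Description",
                           end="Appeals must be", sep=","):
    """Join the rows found between a start-marker row and an end-marker row."""
    merged = []
    block = None  # None while we are outside a description block
    for text in lines:
        if block is None:
            merged.append(text)
            if start in text:
                block = []  # start collecting the rows that follow
        elif end in text:
            if block:
                merged.append(sep.join(block))
            merged.append(text)
            block = None
        else:
            block.append(text)
    if block:  # input ended before an end marker was seen
        merged.append(sep.join(block))
    return merged


# Hypothetical sample with two records of different lengths
sample = [
    "Project Description", "line one", "line two", "Appeals must be filed",
    "Project Description", "only line", "Appeals must be filed",
]
print(merge_description_rows(sample))
```

After this step every record has the same number of rows, which is the precondition the question names for the transpose.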

Power Query: group rows with a formula on multiple columns

I am trying to group/merge two rows by dividing the values in each, based on the value of another column (Eligible).
Starting from the raw data, I have reached this stage through several steps (unpivoting etc.) in Power Query.
Now I need a ratio per employee (eligible/not-eligible) for each month.
So for employee A, "Jan-14" will be -10/(-10 + -149) and so on. Any ideas will be appreciated. Thanks
Really appreciate the response. Interestingly, I used your other answer to reach this stage from the raw data.
Since we are calculating how much time an employee worked on eligible activities each month, we will be grouping on the Employee. The employee name was just for reference; I took it out and will later join with the employee query to get the names if required. There was a typo in the image: the last row should also be an employee with id 2.
So when there is a matching row, we use the formula to calculate the percentage of time spent on eligible activities, but:
If there isn't a matching row with eligible=1, then the outcome should be 0
If there isn't a matching row with eligible=0, then the outcome should be 1 (100%)
Try this and modify as needed. It assumes you start from the Eligible=0 rows and only picks up a matching Eligible=1 row. If there is an Eligible=1 row without an Eligible=0 row, it is removed. It also assumes we match on both Employee and EmployeeName.
~ ~ ~ ~ ~
Click-select the first three columns (Employee, EmployeeName, Eligible), right-click ... Unpivot Other Columns
Add a custom column with the name "One" and the formula =1+[Eligible]
Merge the table onto itself: Home ... Merge Queries ... with join kind Left Outer
Click to match the Employee, EmployeeName and Attribute columns in the two boxes, and match the One column in the top box to the Eligible column in the bottom box
In the new column, use the arrows atop the column to expand, choosing [x] only the Value column. Name the column Custom.Value
Add column ... Custom Column ... with formula = [Custom.Value] / ( [Custom.Value] + [Value])
Filter Eligible to only pick up the zeroes, using the arrow atop that column
Remove the extra columns
Click-select the Attribute column, Transform ... Pivot ... use Custom as the values column
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Employee", "EmployeeName", "Eligible"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "One", each 1+[Eligible]),
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Employee", "EmployeeName", "Attribute", "One"},#"Added Custom",{"Employee", "EmployeeName", "Attribute", "Eligible"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"Value"}, {"Custom.Value"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Added Custom", "Custom", each [Custom.Value]/([Custom.Value]+[Value])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Eligible] = 0)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Eligible", "Value", "One", "Custom.Value"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Custom", List.Sum)
in #"Pivoted Column"
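The two edge cases from the question's follow-up (no Eligible=1 row gives 0, no Eligible=0 row gives 1) are not covered by the query above, which drops unmatched Eligible=1 rows. A plain-Python sketch of the ratio including those rules could look like this; the function name, the record layout (employee, eligible, month, value) and the sample values for employees B and C are assumptions for illustration.

```python
from collections import defaultdict


def eligible_ratio(records):
    """Return {(employee, month): eligible / (eligible + not_eligible)}."""
    totals = defaultdict(lambda: {0: None, 1: None})
    for employee, eligible, month, value in records:
        slot = totals[(employee, month)]
        slot[eligible] = (slot[eligible] or 0) + value
    ratios = {}
    for key, slot in totals.items():
        if slot[1] is None:      # no Eligible=1 row
            ratios[key] = 0.0
        elif slot[0] is None:    # no Eligible=0 row
            ratios[key] = 1.0
        else:
            total = slot[0] + slot[1]
            # None when the values cancel out and the ratio is undefined
            ratios[key] = slot[1] / total if total else None
    return ratios


# Employee A uses the numbers from the question: -10 / (-10 + -149)
records = [
    ("A", 1, "Jan-14", -10), ("A", 0, "Jan-14", -149),
    ("B", 0, "Jan-14", -50),
    ("C", 1, "Jan-14", -20),
]
print(eligible_ratio(records))
```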

Replacing newline with new rows in excel sheet

I am working with an Excel sheet where the values inside particular columns are written on multiple lines within a single cell.
For example, in Fig. 1, Col D and Col E contain line breaks, i.e. A = Very Good, Needs Improvement. What I am trying to get is the same data in another form, as shown. Any pointers in this regard would be helpful.
Try using "Get & Transform", aka Power Query.
Steps:
Select your data and load it (with headers) into PQ.
Add a new custom column (named 'Custom' for example) and use the following custom column formula:
Table.FromColumns({Text.Split([Grades],"#(lf)"), Text.Split([Comment],"#(lf)")})
On the newly created column, click the expand button (top right) and expand both columns.
Delete the columns 'Grades' and 'Comment'.
Additionally, you could rename the last two columns back to 'Grades' and 'Comment'.
To make things a little easier, you could also just apply the following M code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each Table.FromColumns({Text.Split([Grades],"#(lf)"), Text.Split([Comment],"#(lf)")})),
#"Expanded {0}" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1", "Column2"}, {"Custom.Column1", "Custom.Column2"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded {0}",{"Grades", "Comment"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.Column1", "Grades"}, {"Custom.Column2", "Comment"}})
in
#"Renamed Columns"
Try esProc: the following code splits the multi-line text in each Excel cell and expands it into multiple rows.
A
1 =file("data.xlsx").xlsimport#t()
2 =A1.run(Grades=Grades.split("\n"),Comment=Comment.split("\n"))
3 =A2.news(Grades.len();Names,Class,Year,Grades(#):Grades,Comment(#):Comment)
4 =file("result.xlsx").xlsexport#t(A3)
For more explanation, see http://c.raqsoft.com/article/1609902051322
DISCLAIMER: This is about our tool esProc. It’s freemium.
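The split-and-expand step that both answers rely on can also be sketched in plain Python, with rows held as dictionaries. The function name and the sample row are hypothetical; the column names Grades and Comment are taken from the code above. Like Table.FromColumns, the sketch pads the shorter column with empty values (None here) when the two cells hold a different number of lines.

```python
from itertools import zip_longest


def expand_multiline(rows, columns=("Grades", "Comment")):
    """Turn each row into one row per line found in the given columns."""
    expanded = []
    for row in rows:
        parts = [str(row[name]).split("\n") for name in columns]
        # zip_longest pads with None if the cells differ in line count
        for values in zip_longest(*parts):
            new_row = dict(row)
            new_row.update(zip(columns, values))
            expanded.append(new_row)
    return expanded


# Hypothetical sample row with two lines per cell
sample = [{"Names": "Ann", "Grades": "A\nB",
           "Comment": "Very Good\nNeeds Improvement"}]
for line in expand_multiline(sample):
    print(line)
```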

Excel Power Query inserting column between other columns

I'm importing a bunch of columns to do some analysis on in Excel power query. Some of the analysis columns need to be inserted after a certain column, but every option for adding a column only lets me append the column to the very end. I want to insert the new columns after the one named "Total" for readability.
Below is a function that outputs the list of rearranged column names.
ReorderList:
(tableName as table, toBeMovedColumnName as any, optional afterColumnName as text) as list=>
//tableName - the table we want to reorder.
//toBeMovedColumnName - the name of the column whose position you want to change. Can be a list of column names.
//afterColumnName - the name of the column you want toBeMovedColumnName to be positioned after. If omitted, toBeMovedColumnName will be placed as the first column.
let
columnNames = Table.ColumnNames(tableName),
positionOf = if afterColumnName is null or afterColumnName = "" then 0 else List.PositionOf(columnNames, afterColumnName) + 1,
toBeMovedList = if Value.Is(toBeMovedColumnName, type list) = true then toBeMovedColumnName else {toBeMovedColumnName},
intermediaryList = List.Combine({List.FirstN(columnNames,positionOf),toBeMovedList}),
intermediaryList2 = List.RemoveItems(columnNames,intermediaryList),
reorderList = List.Combine({intermediaryList,intermediaryList2})
in
reorderList
Usage like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom1", each 4),
#"Reordered Columns" = Table.ReorderColumns(#"Added Custom", ReorderList(#"Added Custom","Custom1","Total"))
in
#"Reordered Columns"
Sample below.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// get baseline column names. Use this before inserting new analysis columns
Names = Table.ColumnNames(Source),
TotalSpot = List.PositionOf(Names,"Total"),
// add any code or steps here ; this is random sample. don't use
#"Added Custom" = Table.AddColumn(Source, "Custom1", each 4),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom2", each 5),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom3", each 6),
// insert this after all your new columns are added
// it moves all new columns to the right of the Total column
// replace #"Added Custom2" in step below with previous step name
#"Reordered Columns" = Table.ReorderColumns(#"Added Custom2",List.Combine ({List.FirstN(Names,TotalSpot+1),List.RemoveItems(Table.ColumnNames(#"Added Custom2"),Names),List.RemoveFirstN (Names,TotalSpot+1)}))
in #"Reordered Columns"
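The list manipulation behind both answers is easy to check outside Power Query. Below is a small Python sketch of the same reordering on a plain list of column names; the function name reorder_list and the sample names are made up, and unlike the M function above it also drops a moved column from the leading part if it already sits there.

```python
def reorder_list(column_names, to_move, after=None):
    """Return column names with `to_move` placed right after `after`.

    `to_move` may be one name or a list of names; with no `after`,
    the moved columns come first.
    """
    moved = list(to_move) if isinstance(to_move, (list, tuple)) else [to_move]
    position = column_names.index(after) + 1 if after else 0
    head = [name for name in column_names[:position] if name not in moved]
    head += moved
    rest = [name for name in column_names if name not in head]
    return head + rest


columns = ["Id", "Total", "Notes", "Custom1"]
print(reorder_list(columns, "Custom1", "Total"))
```

The resulting list plays the role of the second argument to Table.ReorderColumns.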

Excel Power Query - How do I append columns inside the same table in power query?

Maybe this is a very simple question, but I'm trying to figure out how to do this, as I have hundreds of columns, and doing it by hand (splitting them into separate queries and then appending them) doesn't seem very practical.
I've been working on a query and it returns me values in the following format:
Date | Time | Value | Time | Value...
A | B | C | D | E...
But I need to transform it to look like:
Date | Time | Value
A | B | C
A | D | E
Thanks for the help!
Using no custom code:
Load data into powerquery using Data ... From Table/Range...
Right-click Date column, choose unpivot other columns
Add column... index column... use default column name Index
Add column...Custom Column... with formula =Number.Mod([Index],2) and default name Custom
This converts the index column into alternating 0/1s
(Assuming your 2nd column is named Value.1) Add column...Custom Column... with formula =#"Added Custom"{[Index]+1}[Value.1] and default name Custom.1
That places the value from the row below the current one into the current row
Remove the alternating rows by clicking the arrow next to the Custom column and removing the [x] next to the 1
Click-select the Attribute, Index and Custom columns, right-click Remove Columns
Close and Load
Assuming your data is loaded as range Table1, you could use this code, pasted into Home...Advanced Editor...
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Date"}, "Attribute", "Value.1"),
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each Number.Mod([Index],2)),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each #"Added Custom"{[Index]+1}[Value.1]),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Custom] = 0)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Attribute", "Index", "Custom"})
in #"Removed Columns"
If you are willing to use some custom code, this creates two tables, one with the odd columns and one with the even columns, unpivots each of them, adds an index to both, then merges them back on that index. It works for any number of columns and might be faster than the above for larger data sets.
Assuming your data is loaded as range Table1, you could use this code, pasted into Home...Advanced Editor...
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
OddUnpivot= Table.AddIndexColumn(Table.UnpivotOtherColumns(Table.RemoveColumns(Source,List.RemoveFirstN(List.Alternate(Table.ColumnNames(Source),1,1,1),1)), {"Date"}, "Attribute", "Value"), "Index", 0, 1),
EvenUnpivot= Table.AddIndexColumn(Table.UnpivotOtherColumns(Table.RemoveColumns(Source,List.Alternate(Table.ColumnNames(Source),1,1)), {"Date"}, "Attribute", "Value"), "Index", 0, 1),
#"Merged Queries" = Table.NestedJoin(OddUnpivot,{"Index"},EvenUnpivot,{"Index"},"Table2",JoinKind.LeftOuter),
#"Expanded Table" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Value"}, {"Value.1"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table",{"Attribute", "Index"})
in #"Removed Columns"
LATER UPDATE:
More generically, I've decided I like this method better
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// 1 base column, then groups of 2 columns; stack the groups
Combo = List.Transform(List.Split(List.Skip(Table.ColumnNames(Source),1),2), each List.FirstN(Table.ColumnNames(Source),1) & _),
#"Added Custom" =List.Accumulate(
Combo,
#table({"Column1"}, {}),
(state,current)=> state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)),1)
)
in #"Added Custom"
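The reshaping itself (one key column followed by repeating Time/Value pairs, stacked into three columns) can be sketched in plain Python on lists. The function name, its key_count and group_size parameters and the sample row are assumptions for illustration. Note that this sketch emits the pairs row by row, whereas the List.Accumulate version above stacks all rows of the first pair before the second.

```python
def stack_pairs(header, rows, key_count=1, group_size=2):
    """Stack repeating column groups under the first group's headers."""
    out_header = header[:key_count + group_size]
    stacked = []
    for row in rows:
        keys = list(row[:key_count])
        rest = row[key_count:]
        # One output row per group of `group_size` columns
        for start in range(0, len(rest), group_size):
            stacked.append(keys + list(rest[start:start + group_size]))
    return out_header, stacked


header = ["Date", "Time", "Value", "Time", "Value"]
rows = [["A", "B", "C", "D", "E"]]
print(stack_pairs(header, rows))
```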
