Move all text values with the same ID field to separate columns - excel

I would like to move all the comments (Column B3:B14) to be new columns against each unique ID (Column A3:A14).
The Desired Format shows the layout that I would like to get to.
Hopefully that makes sense.

EDIT: This will do what you want using vba:
Option Explicit
Sub TransposeComments()
    Dim inSR%, inTR%, inTC%, rgSource As Range, rgTarget As Range
    Set rgSource = Range("A3") 'Change this if the 1st ID in the source table is moved
    Set rgTarget = Range("D3") 'Change this to start populating at another start point
    inTR = -1
    Do
        If rgSource.Offset(inSR) <> rgSource.Offset(inSR - 1) Then
            inTR = inTR + 1: inTC = 2
            rgTarget.Offset(inTR) = rgSource.Offset(inSR)
            rgTarget.Offset(inTR, 1) = rgSource.Offset(inSR, 1)
        Else
            rgTarget.Offset(inTR, inTC) = rgSource.Offset(inSR, 1)
            inTC = inTC + 1
        End If
        inSR = inSR + 1
        ''' End on 1st empty ID (assumes ID's in source data are contiguous and nothing is below them)
    Loop Until rgSource.Offset(inSR) = ""
End Sub
I've assumed you know how to implement and call/run the VBA. If not, let me know and I'll try to help with that. :)
============================================================
EDIT: How to do it all with formulas?
I'm unsure of how dynamic the extraction table has to be (as you don't say). For example:
o Will you be making a new extraction each time, or will you build a standing extractor table?
o Will the source data vary in size (so you need to grow and shrink the 'lookup' range)?
o Etc.
Given this, I've aimed for a solution that works and is adaptable. I'll leave it to you to adapt as appropriate 😊
To extract the unique serial numbers:
{=IFERROR(INDEX($A$2:$A$14, MATCH(0, COUNTIF($E$2:E2, $A$2:$A$14), 0)),"")}
To extract the corresponding comments:
{=IF($E3="","",IF(SUM(IF($A$2:$A$15=$E3,1))>=COUNTA($F$2:F$2),INDEX($B$2:$B$15,MATCH($E3,$A$2:$A$15,0)+COUNTA($F$2:F$2)-1),""))}
Notice the {}. Both are array formulas: type the formula without the braces and confirm it with Ctrl+Shift+Enter, and Excel adds the braces itself.
Pictogram:
Additional Information:
The solution proposed assumes that rows with the same serial number are contiguous (as shown in your example).
If that's not the case by default, you'll have to sort the source data so that it is.
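To illustrate why the contiguity assumption matters, here is a minimal sketch outside Excel, in Python. It is not part of the answer above; the function name and the sample data are made up for illustration. Like the VBA loop, it starts a new output row whenever the ID differs from the previous one.

```python
from itertools import groupby

def transpose_comments(rows):
    """Turn (id, comment) pairs into one output row per run of equal IDs."""
    result = []
    # groupby only merges *adjacent* equal keys, so an ID that reappears
    # later starts a second output row, just as it would in the VBA loop.
    for key, group in groupby(rows, key=lambda pair: pair[0]):
        result.append([key] + [comment for _, comment in group])
    return result

# Hypothetical sample data, not taken from the question.
rows = [(1, "a"), (1, "b"), (2, "c"), (1, "d")]

unsorted_result = transpose_comments(rows)
# sorted() is stable, so comments keep their original order within each ID.
sorted_result = transpose_comments(sorted(rows, key=lambda pair: pair[0]))
```

With the unsorted input, ID 1 ends up on two separate output rows; after sorting by ID, each ID gets exactly one row.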

You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
//read in the data
//change table name in next line to actual table name in your workbook
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//set data types
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Comments", type text}}),
//group by ID and concatenate the comments with a character not used in the comments
//I used a semicolon, but that could be changed
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {
{"Comment", each Text.Combine([Comments],";")},
//also generate Count of the number of comments in each ID group
//as the Maximum will be the count of the number of columns to eventually create
{"numCols", each Table.RowCount(_)}
}),
//calculate how many columns to create and delete that column
maxCols = List.Max(#"Grouped Rows"[numCols]),
remCount = Table.RemoveColumns(#"Grouped Rows","numCols"),
//Split into new columns
#"Split Column by Delimiter" = Table.SplitColumn(remCount, "Comment",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv),maxCols)
in
#"Split Column by Delimiter"
If you have Excel for Microsoft 365 on the Mac with the FILTER and UNIQUE functions, you can use:
D23: =UNIQUE(Table1[ID]) *or some other cell*
and in the adjacent column:
=TRANSPOSE(FILTER(Table1[Comments],Table1[ID]=D23))

Related

Regexpmatch in excel - 5+ characters Match

I am in need of help finding 5+ character patterns between 2 cells in the same worksheet.
I found information online to set up some example code and thought I would be able to tweak it to fit, but it is not working. Can anyone help me?
Here is what I am hoping to achieve:
Here is the formula I put into column C:
=Regexpmatch(A1:B1,"^[\S]{5}")
And the code in Visual Basic: Module1(Code)
Public Function RegExpMatch(input_range As Range, pattern As String, Optional match_case As Boolean = True) As Variant
Dim arRes() As Variant 'array to store the results
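Dim iInputCurRow As Long, iInputCurCol As Long, cntInputRows As Long, cntInputCols As Long 'index of the current row in the source range, index of the current column in the source range, count of rows, count of columns (each variable needs its own As Long, otherwise only the last one is a Long)
Dim regEx As Object 'late-bound VBScript.RegExp instance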
On Error GoTo ErrHandl
RegExpMatch = arRes
Set regEx = CreateObject("VBScript.RegExp")
regEx.pattern = pattern
regEx.Global = True
regEx.MultiLine = True
If True = match_case Then
regEx.ignorecase = False
Else
regEx.ignorecase = True
End If
cntInputRows = input_range.Rows.Count
cntInputCols = input_range.Columns.Count
ReDim arRes(1 To cntInputRows, 1 To cntInputCols)
For iInputCurRow = 1 To cntInputRows
For iInputCurCol = 1 To cntInputCols
arRes(iInputCurRow, iInputCurCol) = regEx.Test(input_range.Cells(iInputCurRow, iInputCurCol).Value)
Next
Next
RegExpMatch = arRes
Exit Function
ErrHandl:
RegExpMatch = CVErr(xlErrValue)
End Function
Sub Run()
End Sub
I put this formula into Column C and received results in both Columns C and D. However, I cannot tell what it is even pulling as all the values in Column C are TRUE and I see no pattern or reason to why I received the FALSEs where I did.
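Your regex pattern only tests whether each cell begins with five non-space characters. It never compares the two cells with each other, which is also why you get one result per input cell (columns C and D).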
It seems to me that what you really want to do is return TRUE if there are matching Words in the two strings, and if those words are five or more characters in length.
If that is not the case, please clarify.
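As a quick check of what the pattern from the question actually tests, here is a small Python sketch. It uses Python's re module rather than VBScript.RegExp, so treat it as an approximation of the UDF's behaviour; the input strings are invented.

```python
import re

# Same pattern as in the question; re.MULTILINE mirrors regEx.MultiLine = True.
pattern = re.compile(r"^[\S]{5}", re.MULTILINE)

def starts_with_five_non_space(text):
    """True when some line of text begins with five non-whitespace characters."""
    return pattern.search(text) is not None

# Hypothetical inputs, not the asker's data.
samples = ["Hello world", "abc defgh", "  Hello"]
checks = {s: starts_with_five_non_space(s) for s in samples}
```

Only the first sample passes: the second has a space within its first five characters and the third starts with spaces. Nothing in the test looks at a second string.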
Split each string into a list or array of Words
For Column1, a Word is separated by #, ., or the transition from lower case to upper case letters.
For Column 2, a Word is separated by a Space
Filter each list to only retain words containing five or more characters
Check to see if a word is present in both lists.
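The steps above can be sketched in a few lines of Python, purely to show the logic before committing to VBA or Power Query. The function names and sample strings are hypothetical, and the case-insensitive comparison is done here by lower-casing both word lists.

```python
import re

def long_words_col1(text, min_len=5):
    """Split at lower-to-upper case transitions and at '#' or '.'."""
    parts = re.split(r"(?<=[a-z])(?=[A-Z])|[#.]", text)
    return [p for p in parts if len(p) >= min_len]

def long_words_col2(text, min_len=5):
    """Split on spaces and keep only the long words."""
    return [w for w in text.split(" ") if len(w) >= min_len]

def has_common_word(col1, col2):
    """True when a word of five or more characters occurs in both strings."""
    a = {w.lower() for w in long_words_col1(col1)}
    b = {w.lower() for w in long_words_col2(col2)}
    return bool(a & b)

# Hypothetical examples, not the asker's data.
match_found = has_common_word("JohnSmith#example.com", "john smith")
no_match = has_common_word("JohnSmith", "John Doe")
```

In the second example "John" is shared, but it is only four characters long, so it is filtered out before the comparison.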
This can be done using VBA and/or Power Query.
Here is a Power Query solution:
Power Query is available in Windows Excel 2010+ and Excel 365 (Windows or Mac)
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Change next line to reflect actual data source
Source = Excel.CurrentWorkbook(){[Name="Table21"]}[Content],
//Set the data types
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
//Add custom column
#"Added Custom" = Table.AddColumn(#"Changed Type", "Match", each
let
/*Split column 1 by
Transition from lower case to upper case character.
Then by `#` or `.`
Filter to include only words with five or more characters*/
#"Split Col1" =
List.Select(
List.Combine(
List.Transform(
Splitter.SplitTextByCharacterTransition({"a".."z"},{"A".."Z"})([Column1]),
each Text.SplitAny(_,"#."))), each Text.Length(_)>=5),
/*Split Column 2 by <space>
Filter to include only words with length >=5*/
#"Split Col2" =
List.Select(
Text.Split([Column2]," "),
each Text.Length(_)>=5),
/*Create a List of words that are in both of the above lists
If there are one or more words in the Intersection of the two lists
then True, else False*/
Match =
List.Intersect(
{#"Split Col1",#"Split Col2"},Comparer.OrdinalIgnoreCase)
in
List.Count(Match) > 0, type logical)
in
#"Added Custom"
Seems like you could try:
Formula in C1:
=LET(x,TEXTSPLIT(B1," "),SUM(IFERROR(SEARCH(LEFT(FILTER(x,LEN(x)>4),5),A1),))>0)

word patterns within an excel column

I have 2 Excel data sets each comprising a column of word patterns and have been searching for a way to copy and group all instances of repetition within these columns into a new column.
This is the closest result I could find so far:
Sub Common5bis()
Dim Joined
Set d = CreateObject("Scripting.Dictionary") 'make dictionary
d.CompareMode = 1 'not case sensitive
a = Range("A1", Range("A" & Rows.Count).End(xlUp)).Value 'data to array
For i = 1 To UBound(a) 'loop through all records
If Len(a(i, 1)) >= 5 Then 'length at least 5
For l = 1 To Len(a(i, 1)) - 4 'all 5-character substrings within the record
s = Mid(a(i, 1), l, 5) 'that string
d(s) = d(s) + 1 'increment
Next
End If
Next
Joined = Application.Index(Array(d.Keys, d.items), 0, 0) 'join the keys and the items
With Range("D1").Resize(UBound(Joined, 2), 2) 'export range
.EntireColumn.ClearContents 'clear previous
.Value = Application.Transpose(Joined) 'write to sheet
.Sort .Range("B1"), xlDescending, Header:=xlNo 'sort descending
End With
End Sub
Which yielded this result for the particular question:
This example achieves 4 of the things I'm trying to achieve:
Identify repeating strings within a single column
Copies these strings into a separate column
Displays results in order of occurrence (in this case from least to most)
Displays the quantity of repetitions (including the first instance) in an adjacent column
However, although from reading the code there are basic things I've figured out that I can adapt to my purposes, it still fails to achieve these essential tasks which I'm still trying to figure out:
Identify individual words rather than single characters
I could possibly reduce the size from 5 to 3, but for the word strings I have (lists of pronouns from larger texts) that would catch "I I" repetitions but wouldn't work so well for "Your You" etc., whilst a size of at least 4 or 5 would miss anything starting with "I I"
Include an indefinite amount of values - looking at the code and the replies to the forum it comes from it looks like it's capped at 5, but I'm trying to find a way to identify all repetitions for all multiple word strings which could be something like "I I my you You Me I You my"
Is case sensitive - this is quite important as some words in the column have been capitalised to differentiate different uses
I'm still learning the basics of VBA but have manually typed out this example of what I'm trying to do with the code I've found above:
Intended outcome:
And so on
I'm a bit screwed at this point which is why I'm reaching out here (sorry if this is a stupid question, I'm brand new to VBA as my work almost never needs Excel, let alone macros) so will massively appreciate any constructive advice towards a solution!
Because I've been working with it recently, I note that you can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range or From within sheet
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
First add a custom function:
New blank query
Rename per the code comment
Edits to make case-insensitive
Custom Function
//rename fnPatterns
//generate all possible patterns of two words or more
(String as text)=>
let
//split text string into individual words & get the count of words
#"Split Words" = List.Buffer(Text.Split(String," ")),
wordCount = List.Count(#"Split Words"),
//start position for each number of words
starts = List.Numbers(0, wordCount-1),
//number of words for each pattern (minimum of two (2) words in a pattern)
words = List.Reverse(List.Numbers(2, wordCount-1)),
//generate patterns as index into the List and number of words
// will be used in the List.Range function
patterns = List.Combine(List.Generate(
()=>[r={{0,wordCount}}, idx=0],
each [idx] < wordCount-1,
each [r=List.Transform({0..starts{[idx]+1}}, (li)=> {li, wordCount-[idx]-1}),
idx=[idx]+1],
each [r]
)),
//Generate a list of all the patterns by using the List.Range function
wordPatterns = List.Distinct(List.Accumulate(patterns, {}, (state, current)=>
state & {List.Range(#"Split Words", current{0}, current{1})}), Comparer.OrdinalIgnoreCase)
in
wordPatterns
Main Function
let
//change next line to reflect data source
//if data has a column name other than "Column1", that will need to be changed also wherever referenced
Source = Excel.CurrentWorkbook(){[Name="Table17"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//Create a list of all the possible patterns for each string, added as a custom column
#"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "Patterns", each fnPatterns([Column1]), type list),
//removed unneeded original column of strings
#"Removed Columns" = Table.RemoveColumns(#"Invoked Custom Function",{"Column1"}),
//Expand the column of lists of lists into a column of lists
#"Expanded Patterns" = Table.ExpandListColumn(#"Removed Columns", "Patterns"),
//convert all lists to lower case for text-insensitive comparison
#"Added Custom" = Table.AddColumn(#"Expanded Patterns", "lower case patterns",
each List.Transform([Patterns], each Text.Lower(_))),
//Count number of matches for each pattern
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Count", each List.Count(List.Select(#"Added Custom"[lower case patterns], (li)=> li = [lower case patterns])), Int64.Type),
//Filter for matches of more than one (1)
// then remove duplicate patterns based on the "lower case pattern" column
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Count] > 1)),
#"Removed Duplicates" = Table.Distinct(#"Filtered Rows", {"lower case patterns"}),
//Remove lower case pattern column and sort by count descending
#"Removed Columns1" = Table.RemoveColumns(#"Removed Duplicates",{"lower case patterns"}),
#"Sorted Rows" = Table.Sort(#"Removed Columns1",{{"Count", Order.Descending}}),
//Re-construct original patterns as text
#"Extracted Values" = Table.TransformColumns(#"Sorted Rows",
{"Patterns", each Text.Combine(List.Transform(_, Text.From), " "), type text})
in
#"Extracted Values"
Note that you could readily implement a similar algorithm using VBA, the VBA.Split function and a Dictionary
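For readers who want to see the dictionary-based idea before writing it in VBA, here is a rough Python sketch of the same kind of algorithm. It is not a translation of the M code: the function names are invented, a Counter stands in for the Dictionary, and each pattern is counted at most once per row (an assumption on my part). Case sensitivity is a parameter, since the question asks for case-sensitive matching.

```python
from collections import Counter

def word_patterns(text, min_words=2):
    """All distinct runs of min_words or more consecutive words in text."""
    words = text.split()
    seen = set()
    for size in range(min_words, len(words) + 1):
        for start in range(len(words) - size + 1):
            seen.add(tuple(words[start:start + size]))
    return seen

def repeated_patterns(rows, case_sensitive=True):
    """Patterns found in more than one row, most frequent first."""
    counts = Counter()
    for row in rows:
        text = row if case_sensitive else row.lower()
        counts.update(word_patterns(text))  # each pattern counted once per row
    repeated = [(" ".join(p), n) for p, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: (-item[1], item[0]))

# Hypothetical rows, loosely modelled on the pronoun strings in the question.
rows = ["I I my you", "I my You", "I I my"]
sensitive = repeated_patterns(rows)
insensitive = repeated_patterns(rows, case_sensitive=False)
```

With case-sensitive matching, "my you" and "my You" are different patterns and neither repeats; with case_sensitive=False they merge and are reported.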

Increment difference between cells

I'm trying to duplicate data in a sheet with increments of 12 between each cell from a sheet with 1 cell per row. Between the 12-incremented rows there's other data. This means I can't drag to extend the formula. Like this for customer numbers:
'SheetA'E3 = 'SheetB'Y2
'SheetA'E15 = 'SheetB'Y3
'SheetA'E27 = 'SheetB'Y4
..and so on. I've tried extending 12/24 cells at a time and copying but I can't make it work. Extending doesn't add +1 to one sheet, just +12/+24 to both. Doing this manually will take months. Can this be done without a VBA solution?
Any suggestions? I'm sorry if my terminology isn't on point here.
SheetA:
Try this (run as VBA code):
Sub test1()
    For i01 = 0 To 100
        Worksheets("SheetA").Cells(3 + 12 * i01, 5) = Worksheets("SheetB").Cells(2 + i01, 25)
    Next i01
End Sub
Power Query, available in Windows Excel 2010+ and Office 365, can produce your SheetA given SheetB. Not sure about the effect of the variability you mention.
The query assumes that the correct parameters are listed as column headers in Sheet B. The column headers will get copied over as parameters to sheet A.
To use Power Query:
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Read in the data
//Change table name in next line to be the "real" table name
Source = Excel.CurrentWorkbook(){[Name="Table12"]}[Content],
//set data types based on first entry in the column
//will be independent of the column names
typeIt = Table.TransformColumnTypes(Source,
List.Transform(
Table.ColumnNames(Source), each
{_,Value.Type(Table.Column(Source,_){0})})
),
//UNpivot except for the c.number and c.name columns to create the Parameter and Level columns
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(typeIt, {"C. number", "C. name"}, "Parameter", "Level"),
//Group By C.Number
//Add the appropriate rows for each customer
//And a blank row to separate the customers
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"C. number"}, {
{"All", each _, type table [C. number=nullable number, C. name=nullable text, Parameter=text, Level=any]},
{"custLabel", (t)=> Table.InsertRows(t,0,{
[C. number = null, C. name=null,Parameter = null, Level = null],
[C. number = null, C. name=null, Parameter = "Customer Number", Level="Customer Name"],
[C. number = null, C. name=null,Parameter = t[C. number]{0}, Level = t[C. name]{0}],
[C. number = null, C. name=null,Parameter = "Parameter", Level = "Level"]
})}
}),
//Remove the unneeded columns and expand the remaining table
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"C. number", "All"}),
#"Expanded custLabel" = Table.ExpandTableColumn(#"Removed Columns", "custLabel", {"Parameter", "Level"}, {"Parameter", "Level"}),
//Remove the top blank row
//promote the new blank row to the Header location
#"Removed Top Rows" = Table.Skip(#"Expanded custLabel",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
//data type set to text since it will look better on the report
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Customer Number", type text}, {"Customer Name", type text}})
in
#"Changed Type"
Data
Results
[ Indirect with row() ]
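Assuming column E of SheetA (starting at E3) is the target and column Y of SheetB (starting at Y2) is the source data.
In cell SheetA!E3 put:
=INDIRECT("SheetB!Y"&(((ROW()-3)/12)+2))
Press Enter
Then select cell SheetA!E3 and copy it. Paste it into SheetA!E15, SheetA!E27 and so on (every 12th row). The formula calculates the source row from the row it sits in, so no editing is needed.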
Idea :
Find the relation between the target row number b and the source row number a. The pairs (b, a) = (3, 2), (15, 3), (27, 4) lead to a = (b-3)/12 + 2. (The math is like working out the equation of a straight line from three points.) Then use INDIRECT() to combine the calculated row number with the column letter.
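The row arithmetic can be checked with a few lines of Python. This is only a sketch of the mapping a = (b-3)/12 + 2; the function and parameter names are made up, and the error for rows that are not on the 12-row grid is an addition of this sketch.

```python
def source_row(target_row, first_target=3, first_source=2, step=12):
    """Map a SheetA row (3, 15, 27, ...) to the matching SheetB row (2, 3, 4, ...)."""
    offset = target_row - first_target
    if offset < 0 or offset % step != 0:
        # Rows in between would give a fractional row number in the formula.
        raise ValueError(f"row {target_row} is not one of the target rows")
    return first_source + offset // step

mapping = {b: source_row(b) for b in (3, 15, 27)}
```

A row that is not on the grid, such as 24, gives a fractional result in the worksheet formula, so INDIRECT cannot resolve the reference there.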

Excel Powerquery split table top / bottom 50 percent

I have an example table in Excel to illustrate my question.
Two columns (first name, last name), 11 rows and a header row.
I would like to make Get & Transform (Power Query) links to another sheet in the same workbook, where I would like to have two tables, A and B, with the same structure as the source table. I would like A to display rows 1-6 and B to display rows 7-11.
BUT: I would like this split to be dynamic. So I would want A to display the top 50% (rounded up) and B to display the rest. I've seen the top N rows option and read some posts about counting the rows in a different Power Query and using that. The image below comes from an Excel file shared via Filedropper.
Top Half:
let
Source = Excel.CurrentWorkbook(){[Name="SourceTable"]}[Content],
TopHalfRows = Number.RoundUp(Table.RowCount(Source) / 2),
KeepTopHalf = Table.FirstN(Source, TopHalfRows)
in
KeepTopHalf
Bottom Half:
let
Source = Excel.CurrentWorkbook(){[Name="SourceTable"]}[Content],
TopHalfRows = Number.RoundUp(Table.RowCount(Source) / 2),
DeleteTopHalf = Table.Skip(Source, TopHalfRows)
in
DeleteTopHalf
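The rounding logic in the two queries is easy to verify outside Power Query. Here is a minimal Python sketch with a made-up function name; math.ceil plays the role of Number.RoundUp and the slices stand in for Table.FirstN and Table.Skip.

```python
import math

def split_top_bottom(rows):
    """Return (top half rounded up, remaining rows)."""
    top_count = math.ceil(len(rows) / 2)
    return rows[:top_count], rows[top_count:]

# Eleven placeholder rows, matching the row count in the question.
rows = list(range(1, 12))
top, bottom = split_top_bottom(rows)
```

For 11 rows this gives 6 rows in the top table and 5 in the bottom one, which is the 1-6 / 7-11 split asked for.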
EDIT:
This shows how to amend by adding a filter step, before splitting:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Filtered Rows" = Table.SelectRows(Source, each Text.StartsWith([firstname], "Ab")),
TopHalfRows = Number.RoundUp(Table.RowCount(#"Filtered Rows") / 2),
KeepTopHalf = Table.FirstN(#"Filtered Rows", TopHalfRows)
in
KeepTopHalf

Excel PowerQuery (or dax is just as perfect) - add column with unique ID

I have a table (formatted as table) for the inputs.
I want to add a unique ID column to my table.
Constraints:
it should not use any other columns
it should be constant and stable, meaning that inserting a new row won't change any row IDs, but only adds a new one.
Anything calculated from another column's value is not useful because there will be typos. So changing the ID will mean data loss in other tables connected to this one.
Simply adding an index in query editor is not useful because there will be inserted rows in the middle, and ids are recalculated at this action
I am also open to any VBA solution. I tried to write a custom function that would add a new ID into a "rowID" column in the same row if there is no ID yet, but I failed at referencing the cells from a function called from a Table.
My suggestion would be to use a self referencing query.
Query "Data" below imports Excel table "Data" and also outputs to Excel table "Data".
In order to create such a query, first create a query "Data" that imports some Excel table (let's say Table1), run the query so table "Data" is created. Now you can adjust the query source from Table1 to Data and maintain this table in Excel (leaving blank IDs for new rows) and run the query to generate new IDs.
Otherwise the query should be pretty straightforward; if not: let me know where you need additional explanation.
let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Col1", Int64.Type}, {"Col2", type text}, {"ID", Int64.Type}}),
MaxID = List.Max(Typed[ID]),
OriginalSort = Table.AddIndexColumn(Typed, "OriginalSort",1,1),
OldRecords = Table.SelectRows(OriginalSort, each ([ID] <> null)),
NewRecords = Table.SelectRows(OriginalSort, each ([ID] = null)),
RemovedNullIDs = Table.RemoveColumns(NewRecords,{"ID"}),
NewIDs = Table.AddIndexColumn(RemovedNullIDs, "ID", MaxID + 1, 1),
NewTable = OldRecords & NewIDs,
OriginalSortRestored = Table.Sort(NewTable,{{"OriginalSort", Order.Ascending}}),
RemovedOriginalSort = Table.RemoveColumns(OriginalSortRestored,{"OriginalSort"})
in
RemovedOriginalSort
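In case the ID logic is easier to follow outside M, here is a rough Python sketch of the same idea: existing IDs are left alone, and rows with a blank ID get the next numbers after the current maximum, in their original order. The function name and sample rows are hypothetical, and starting at 1 when no IDs exist yet is an assumption of this sketch (the query above does not say what happens in that case).

```python
def assign_ids(rows):
    """Fill in missing IDs without changing existing ones or the row order."""
    existing = [r["ID"] for r in rows if r["ID"] is not None]
    next_id = max(existing, default=0) + 1  # assumption: start at 1 if no IDs exist
    result = []
    for r in rows:
        if r["ID"] is None:
            r = {**r, "ID": next_id}  # copy the row rather than mutating the input
            next_id += 1
        result.append(r)
    return result

# Hypothetical table with two rows inserted in the middle and at the end.
rows = [
    {"Col2": "a", "ID": 1},
    {"Col2": "inserted", "ID": None},
    {"Col2": "b", "ID": 2},
    {"Col2": "appended", "ID": None},
]
filled = assign_ids(rows)
```

Running it again on the result changes nothing, which is the stability the question asks for.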

Resources