I would like to transform the structure of an Excel file so that I can import it into a system that requires a specific layout.
Here is a small extract of the Excel file. It combines categorical variables, such as Line of business, with four dummy variables indicating which data categories (such as Customer) are used in a given process.
| Process name | Line of business | Customer | Potential customer | Employee | Vendor |
|--------------|------------------|----------|--------------------|----------|--------|
| Ad campaign | Marketing | x | x | | x |
| Payroll | HR | | | x | x |
What I want is to change the structure so that a new row is created for each variation of the dummy variables, and with a Data category column that applies/transposes the relevant data category name. The desired output would look like this:
| Process name | Line of business | Data category |
|--------------|------------------|--------------------|
| Ad campaign | Marketing | Customer |
| Ad campaign | Marketing | Potential customer |
| Ad campaign | Marketing | Vendor |
| Payroll | HR | Employee |
| Payroll | HR | Vendor |
What I have tried is a COUNTIF formula that counts the number of "x" per row. I have then used a VBA script that creates a new line with the process name for each variation of data categories.
Here is the code. The letters in the script refer to the columns in Excel: A is the Process name column and G is the COUNTIF column, and the script creates the number of rows I need.
Sub KopyKat()
Dim N As Long, i As Long, K As Long
Dim v As String, kk As Long, m As Long
N = Cells(Rows.Count, "G").End(xlUp).Row
K = 1
For i = 2 To N
kk = Cells(i, "G").Value
v = Cells(i, "A").Value
For m = 1 To kk
Cells(K + 1, "H") = v
K = K + 1
Next m
Next i
End Sub
So that it goes from this:
| Process name | Line of business | Customer | Potential customer | Employee | Vendor | COUNTIF |
|--------------|------------------|----------|--------------------|----------|--------|---------|
| Ad campaign | Marketing | x | x | | x | 3 |
| Payroll | HR | | | x | x | 2 |
To this:
| Process name | Line of business | Customer | Potential customer | Employee | Vendor | COUNTIF | Process name_2 |
|--------------|------------------|----------|--------------------|----------|--------|---------|----------------|
| Ad campaign | Marketing | x | x | | x | 3 | Ad campaign |
| Payroll | HR | | | x | x | 2 | Ad campaign |
| | | | | | | | Ad campaign |
| | | | | | | | Payroll |
| | | | | | | | Payroll |
This is as far as my limited VBA knowledge has taken me. I would like to change the code so that I get my desired output.
Thanks in advance!
This can be done simply with Power Query (a free add-in for Excel 2010 and 2013, built in from Excel 2016 onwards)
All steps can be done from the UI, but the M-code is below
Edit: Added step to rename Attribute column
Get data from table/range
Select the first two columns, then Unpivot Other Columns
Filter the Value column to show only the x's
Delete the Value column
Rename Attribute column --> Data Category
M-Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Process name", type text}, {"Line of business", type text}, {"Customer", type text}, {"Potential customer", type text}, {"Employee", type text}, {"Vendor", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Process name", "Line of business"}, "Attribute", "Value"),
#"Filtered Rows" = Table.SelectRows(#"Unpivoted Other Columns", each ([Value] = "x")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Value"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Attribute", "Data Category"}})
in
#"Renamed Columns"
First of all, I recommend Power Query for this kind of data manipulation.
Secondly, if you must do it with VBA, I think putting your data in an actual Excel table makes it much easier to organize the data and write the code.
Anyway, below is my solution. Adjust the bounds of the inner loop to match the number of input columns you have.
Note: looping over cells one by one is not the most efficient approach. If the data set is large, reading the values into an array and processing them in memory is typically far faster, because every individual cell access from VBA carries overhead. If you can provide more information about the data size, I can update my answer accordingly.
Option Explicit
Sub KopyKat()
'Result and data input sheets are specified here.
Dim wsInput As Worksheet: Set wsInput = Worksheets("Sheet5")
Dim wsOutput As Worksheet: Set wsOutput = Worksheets("Sheet6")
'Find the last used row on the input sheet (not on whichever sheet happens to be active)
Dim totalRow As Long
totalRow = wsInput.Cells(wsInput.Rows.Count, "A").End(xlUp).Row
Dim i As Long 'Row
Dim j As Long 'Column
Dim counter As Long: counter = 0
For i = 2 To totalRow
For j = 3 To 6 'Column numbers are hardcoded for the sake of the example
'Assumes a category is flagged only by the exact value "x"
If wsInput.Cells(i, j).Value = "x" Then
With wsOutput
.Cells(counter + 2, 1) = wsInput.Cells(i, 1).Value 'Process Name
.Cells(counter + 2, 2) = wsInput.Cells(i, 2).Value 'Line of Business
.Cells(counter + 2, 3) = wsInput.Cells(1, j).Value 'Data Category
End With
counter = counter + 1
End If
Next j
Next i
End Sub
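The array-based approach mentioned in the note above could look roughly like the sketch below. It is only an illustration under a few assumptions: the names `UnpivotFlags` and `UnpivotWithArrays` are made up for this example, the input is taken from the current region around A1 on "Sheet5", the first two columns are the identifier columns, and (unlike the loop above) the comparison ignores case and surrounding spaces.

```vba
Option Explicit

' Returns a 1-based 2-D array with one row per flagged cell in data.
' data:   2-D Variant array whose first row holds the headers.
' idCols: number of leading identifier columns; the rest are flag columns.
' Returns Empty when no cell is flagged.
Function UnpivotFlags(data As Variant, idCols As Long) As Variant
    Dim r As Long, c As Long, i As Long, k As Long, n As Long
    Dim result() As Variant

    ' First pass: count the flagged cells so the result can be sized once.
    For r = 2 To UBound(data, 1)
        For c = idCols + 1 To UBound(data, 2)
            If LCase$(Trim$(CStr(data(r, c)))) = "x" Then n = n + 1
        Next c
    Next r
    If n = 0 Then Exit Function

    ' Second pass: copy the identifier values and the matching header.
    ReDim result(1 To n, 1 To idCols + 1)
    For r = 2 To UBound(data, 1)
        For c = idCols + 1 To UBound(data, 2)
            If LCase$(Trim$(CStr(data(r, c)))) = "x" Then
                k = k + 1
                For i = 1 To idCols
                    result(k, i) = data(r, i)
                Next i
                result(k, idCols + 1) = data(1, c)
            End If
        Next c
    Next r
    UnpivotFlags = result
End Function

' Reads the input in one go and writes the output in one go.
' Sheet names are the same assumptions as in the loop version above.
Sub UnpivotWithArrays()
    Dim wsIn As Worksheet, wsOut As Worksheet, out As Variant
    Set wsIn = Worksheets("Sheet5")
    Set wsOut = Worksheets("Sheet6")
    out = UnpivotFlags(wsIn.Range("A1").CurrentRegion.Value, 2)
    If IsEmpty(out) Then Exit Sub
    wsOut.Range("A2").Resize(UBound(out, 1), UBound(out, 2)).Value = out
End Sub
```

Keeping the transformation in a function that only touches arrays means the worksheet is read once and written once, which is where most of the time saving comes from.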
Check this code and customize it to fit your needs.
It assumes the data layout from the question, with the headers in A1:F1 and the data starting in A2:
Sub Transpose()
Dim evalSheet As Worksheet
Dim evalRange As Range
Dim headerRange As Range
Dim evalCell As Range
Dim destCell As Range
Dim sheetName As String
Dim sourceRangeAddress As String
Dim headerRangeAddress As String
Dim destinationCellAddress As String
Dim rowCounter As Long
' Customize to fit your needs
sheetName = "Sheet1"
sourceRangeAddress = "A2:F3"
headerRangeAddress = "A1:F1"
destinationCellAddress = "I2"
Set evalSheet = ThisWorkbook.Worksheets(sheetName)
' Get the range
Set evalRange = evalSheet.Range(sourceRangeAddress)
Set headerRange = evalSheet.Range(headerRangeAddress)
Set destCell = evalSheet.Range(destinationCellAddress)
' Loop through each cell in the first column
Dim colOffset As Long
For Each evalCell In evalRange.Columns(1).Cells
' Evaluate the four flag columns (colOffset means how many columns to the right of the first column)
For colOffset = 2 To 5
If Trim(evalCell.Offset(columnOffset:=colOffset).Value) = "x" Then
destCell.Offset(rowOffset:=rowCounter, columnOffset:=0).Value = Trim(evalCell.Offset(columnOffset:=0).Value)
destCell.Offset(rowOffset:=rowCounter, columnOffset:=1).Value = Trim(evalCell.Offset(columnOffset:=1).Value)
' headerRange.Cells(n) is the n-th cell in the range (1-based), so it is one more than the offset
destCell.Offset(rowOffset:=rowCounter, columnOffset:=2).Value = Trim(headerRange.Cells(colOffset + 1).Value)
rowCounter = rowCounter + 1
End If
Next colOffset
Next evalCell
End Sub
This was solved using a formula.
Unfortunately, I need a solution that works in Excel 2016, and it seems VBA is the best/only route.
Legend: (this is across multiple worksheets in the same workbook)
Each column has a header.
Column A of Sheet3: List of Names
Column H of Sheet3: List of Email Addresses
Column M of Sheet1: contains the below formula dragged down, which produces a variable number of rows of data:
=IFERROR(INDEX($A$2:$A$42,MATCH(0,IF("1"=$L$2:$L$42,COUNTIF($O$1:$O1,$A$2:$A$42),""),0)),"")
In column M of Sheet1, I have an Index/Match formula, which populates with a list of people's names. (As said above, the number of names that appear is ever-changing.)
I'd like to look up each name that appears in column M of Sheet1 against column A of Sheet3 then return the respective email address from column H of Sheet3.
Additionally, I'd like to separate each email address with a semicolon, as this is to populate the To field of an Outlook email.
Snapshot of what the data looks like
| A, Sheet3 | H, Sheet3 | M, Sheet1 |
| --------------- | ------------------------ | ------------- |
| John Smith | JohnSmith#email.com | Frank Sinatra |
| Kimberly Jones | Kimberly#email.com | Corey Smith |
| Joe Montana | JoeMontana#email.com | Kimberly Jones|
| Dean Martin | DeanMartin#email.com | John Smith |
| Corey Smith | Corey.Smith#email.com | |
| Frank Sinatra | Frank.Sinatra#email.com | |
In cell F2 of Sheet1, the macro would produce the below:
Frank.Sinatra#email.com; Corey.Smith#email.com; Kimberly#email.com; JohnSmith#email.com
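Try the following function. It returns the joined addresses as a string, so a macro can write the result to the target cell, for example Worksheets(1).Range("F2").Value = JoinEmail().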
Function JoinEmail() As String
Dim Ws(1 To 2) As Worksheet
Dim vDB As Variant, vR() As Variant
Dim vName As Variant
Dim Dic As Object 'Dictionary
Dim i As Long, n As Integer
Dim s As String
Set Ws(1) = Sheets(1)
Set Ws(2) = Sheets(3)
Set Dic = CreateObject("Scripting.Dictionary")
vDB = Ws(2).UsedRange 'Sheets(3) data
With Ws(1)
vName = .Range("M2", .Range("M" & Rows.Count).End(xlUp))
End With
For i = 2 To UBound(vDB, 1)
Dic(vDB(i, 1)) = vDB(i, 8) 'name, email; assigning instead of Dic.Add avoids a run-time error on duplicate or blank names
Next i
For i = 1 To UBound(vName, 1)
s = vName(i, 1)
If Dic.Exists(s) Then
n = n + 1
ReDim Preserve vR(1 To n)
vR(n) = Dic(s)
End If
Next i
If n Then
JoinEmail = Join(vR, "; ")
Else
JoinEmail = ""
End If
End Function
Macro to keep on going to the next cell till the value doesn't match and for all the similar values, subtract the values from the bottom most row
Essentially my data is like this (There is only one buy for each name and it is the bottom most cell)
Name | Transaction.Type | Amount | Remaining (what macro needs to do)
Name1 | Sell | 5 | 15 (20-5)
Name1 | Sell | 10 | 10 (20-10)
Name1 | Sell | 15 | 5 (20-15)
Name1 | Buy | 20 |
Name2 | Sell | 25 | 5
Name2 | Buy | 30 |
So far my macro looks like
Dim sline As Integer
Dim eline As Integer
Dim rng As Range
Dim lastrow(1 To 3) As Long
Application.DisplayAlerts = False
With Worksheets("Testing Data 2")
lastrow(1) = .Cells(Rows.Count, "A").End(xlUp).Row
End With
For i = 2 To 4151
If Worksheets("Testing Data 2").Range("A" & i) <> Worksheets("Testing Data 2").Range("A" & i).Offset(1, 0) Then
eline = i
Worksheets("Testing Data 2").Range(":C" & eline)
'struggling to go from here
End If
Next i
Application.DisplayAlerts = True
You can do this without VBA, on the understanding that each Name has only one instance of Buy. Enter the following in D2 and fill down as needed; it leaves the Buy rows blank:
=IF(B2="Sell",SUMIFS(C:C,A:A,A2,B:B,"Buy")-C2,"")
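If a macro is still preferred, the same calculation can be sketched in VBA by walking the rows from the bottom up, so that the Buy row of each name is seen before its Sell rows. This is only a sketch under assumptions: the function and sub names are invented for this example, the data is assumed to sit in columns A:C with a header in row 1, the result is written to column D, and the sheet name is taken from the question.

```vba
Option Explicit

' data: 1-based 2-D Variant array, one row per transaction, no header row:
'       column 1 = name, column 2 = transaction type, column 3 = amount.
' Returns a 1-based array (n rows x 1 column) holding Buy minus Sell for
' every Sell row; Buy rows are left Empty.
Function RemainingAfterSell(data As Variant) As Variant
    Dim r As Long, n As Long
    Dim buyName As String, buyAmount As Double, haveBuy As Boolean
    Dim result() As Variant

    n = UBound(data, 1)
    ReDim result(1 To n, 1 To 1)

    ' Walk upwards: the Buy row is the bottom-most row of each name.
    For r = n To 1 Step -1
        If StrComp(CStr(data(r, 2)), "Buy", vbTextCompare) = 0 Then
            buyName = CStr(data(r, 1))
            buyAmount = CDbl(data(r, 3))
            haveBuy = True
        ElseIf haveBuy And CStr(data(r, 1)) = buyName Then
            result(r, 1) = buyAmount - CDbl(data(r, 3))
        End If
    Next r
    RemainingAfterSell = result
End Function

' Reads A2:C<last row> and writes the remaining amounts to column D.
Sub FillRemaining()
    Dim ws As Worksheet, lastRow As Long
    Set ws = Worksheets("Testing Data 2")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub   ' nothing below the header
    ws.Range("D2").Resize(lastRow - 1, 1).Value = _
        RemainingAfterSell(ws.Range("A2:C" & lastRow).Value)
End Sub
```

Finding the last row from the sheet itself also removes the need for the hardcoded upper bound of 4151 in the loop.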
I have managed to get to a point with a data set where I have a list of items delimited by a "|" symbol. I am now trying to separate each item in the list into its corresponding column; however, the column identifier is a variable-length piece of text at the end of each value.
Example Data (all in one column):
Column A
40.00A|24.00QS|8.00J[a]
40.00A|12.00J|8.00J[a]
20.00A|4.00V
30.00A|12.00CS|8.00QS
Desired Outcome:
+-------+-------+------+-------+-------+------+
| A | QS | J[a] | J | CS | V |
+-------+-------+------+-------+-------+------+
| 40.00 | 24.00 | 8.00 |       |       |      |
| 40.00 | | 8.00 | 12.00 | | |
| 20.00 | | | | | 4.00 |
| 30.00 | 8.00 | | | 12.00 | |
+-------+-------+------+-------+-------+------+
The number of trailing characters that define columns is fixed to 6 (A,QS,J[a],J,CS & V), so I know at the beginning how many columns I will need.
I have some ideas on how to do it directly with formulas, but that would require me to split the items into individual columns by the delimiter and then use some sort of IF statement in additional columns. I would prefer to avoid helper columns. I also looked at the following link, but it doesn't solve the problem, as it assumes the value matches the column heading (I can correct that, but I feel like there is a faster VBA solution here):
How to split single column (with unequal values) to multiple columns sorted according to values from the original single column?
I have been reading about Regular Expressions, and I suspect there is a solution there, but I can't quite figure out how to sort the result.
Once I have this data set up, it is a small task to unpivot it and get the data into a proper tabular format using Power Query.
Thanks in advance!
Since the headers are fixed, you can simply try it like this (the rows and columns of the source and destination data can be changed to suit your requirements):
Option Explicit
Sub test()
Dim Ws As Worksheet, SrcLastrow As Long, TrgRow As Long, Rw As Long
Dim Headers As Variant, xLine As Variant
Dim i As Long, j As Long
Set Ws = ThisWorkbook.ActiveSheet
'Column A assumed to have the texts
SrcLastrow = Ws.Range("A" & Rows.Count).End(xlUp).Row
TrgRow = 2
Headers = Array("A", "QS", "J[a]", "J", "CS", "V")
For Rw = 1 To SrcLastrow
xLine = Split(Ws.Cells(Rw, 1).Value, "|")
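For i = 0 To UBound(xLine)
xLine(i) = Trim(xLine(i)) ' Trim once per item rather than once per header
For j = 0 To UBound(Headers)
If Right(xLine(i), Len(Headers(j))) = Headers(j) Then
' Keep everything before the trailing header text. The output table is assumed to start at column D
Ws.Range("D" & TrgRow).Offset(0, j).Value = Left(xLine(i), Len(xLine(i)) - Len(Headers(j)))
Exit For ' Each item belongs to exactly one header
End If
Next j
Next i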
TrgRow = TrgRow + 1
Next
End Sub
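Since the question mentions regular expressions, here is a hedged sketch of that route as an alternative to the `Right`/`Len` comparison above. It uses the late-bound `VBScript.RegExp` object; the names `SplitItem`, `HeaderIndex` and `SplitBySuffix` are invented for this example, and the pattern assumes every item starts with a plain decimal number followed directly by its suffix.

```vba
Option Explicit

' Splits one item such as "8.00J[a]" into its number and its suffix.
' Returns Array(numberText, suffix), or Empty when the item does not match.
Function SplitItem(ByVal token As String) As Variant
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^(\d+(?:\.\d+)?)(.+)$"
    token = Trim$(token)
    If re.Test(token) Then
        Set m = re.Execute(token)(0)
        SplitItem = Array(m.SubMatches(0), m.SubMatches(1))
    End If
End Function

' Returns the position of suffix within headers, or -1 when it is absent.
Function HeaderIndex(headers As Variant, ByVal suffix As String) As Long
    Dim j As Long
    HeaderIndex = -1
    For j = LBound(headers) To UBound(headers)
        If headers(j) = suffix Then
            HeaderIndex = j
            Exit Function
        End If
    Next j
End Function

' Same layout assumptions as the macro above: text in column A of the
' active sheet, output one row lower, starting at column D.
Sub SplitBySuffix()
    Dim ws As Worksheet, headers As Variant, part As Variant, parsed As Variant
    Dim r As Long, lastRow As Long, idx As Long
    Set ws = ThisWorkbook.ActiveSheet
    headers = Array("A", "QS", "J[a]", "J", "CS", "V")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    For r = 1 To lastRow
        For Each part In Split(CStr(ws.Cells(r, 1).Value), "|")
            parsed = SplitItem(CStr(part))
            If Not IsEmpty(parsed) Then
                idx = HeaderIndex(headers, CStr(parsed(1)))
                If idx >= 0 Then ws.Cells(r + 1, 4 + idx).Value = parsed(0)
            End If
        Next part
    Next r
End Sub
```

Because the suffix is compared for equality with a header rather than as a trailing substring, the order of the entries in `headers` does not affect which column an item lands in.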
This is probably a very basic question, but I'm not really used to working with Excel, so here goes.
I have 2 spreadsheets: A and B
In spreadsheet "A" I have the following info:
+----------+--------+-------+------+
| DATE | CODE | CORRL | CAPA |
+----------+--------+-------+------+
| 01/03/17 | 110104 | 5 | 28 |
| 01/03/17 | 110104 | 7 | 30 |
| 01/03/17 | 810107 | 5 | 30 |
+----------+--------+-------+------+
and in the spreadsheet "B" the following info:
+----------+--------+-------+--------+
| DATE | CODE | CORRL | SN |
+----------+--------+-------+--------+
| 01/03/17 | 110104 | 5 | 182694 |
| 01/03/17 | 110104 | 5 | 571394 |
| 01/03/17 | 110104 | 7 | 298435 |
| 01/03/17 | 110104 | 7 | 205785 |
| 01/03/17 | 810107 | 5 | 234519 |
| 01/03/17 | 810107 | 5 | 229787 |
+----------+--------+-------+--------+
So what I need is that, as I move through the records of spreadsheet "A", only the records in spreadsheet "B" with the same values of DATE, CODE and CORRL are shown.
Example:
If I'm positioned in the 1st row of the spreadsheet "A" in the spreadsheet "B" only the first 2 records must be shown, that is:
+-----------+---------+--------+--------+
| DATE | CODE | CORRL | SN |
+-----------+---------+--------+--------+
| 01/03/17 | 110104 | 5 | 182694 |
| 01/03/17 | 110104 | 5 | 571394 |
+-----------+---------+--------+--------+
and so on
Thanks
I have to say, this is one of the more unusual requests I've seen for Excel functionality.
I think I have something for you.
Firstly, if you're not familiar with the VBA editor, you can open it by pressing Alt + F11. Another way in is the Developer tab of the ribbon, which is hidden by default. To show it, enable it under File > Options > Customize Ribbon.
From there you can get to the VBA editor as well as run macros.
From within there, add the following code ...
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
Dim lngRow As Long, lngCol As Long, strDate As String, strCode As String, strCorrl As String
Dim strKey As String, strSlaveKey As String, i As Long
If objSlaveSheet Is Nothing Then Exit Sub
objSlaveSheet.Rows.EntireRow.Hidden = True
objSlaveSheet.Rows(1).Hidden = False
If Target.Cells(1, 1).Row > 1 Then
With Target.Worksheet
strDate = .Cells(Target.Row, 1)
strCode = .Cells(Target.Row, 2)
strCorrl = .Cells(Target.Row, 3)
strKey = strDate & "_" & strCode & "_" & strCorrl
End With
' Now loop through all of the cells in the slave workbook.
' Start at the second row because the first has a header.
With objSlaveSheet
For lngRow = 2 To .Rows.Count
strSlaveKey = ""
For i = 1 To 3
strSlaveKey = strSlaveKey & "_" & .Cells(lngRow, i)
Next
strSlaveKey = Mid(strSlaveKey, 2)
If strSlaveKey = "__" Then Exit For
If strSlaveKey = strKey Then
.Rows(lngRow).Hidden = False
End If
Next
.Activate
.Cells(1, 1).Select
End With
End If
End Sub
... into the code module of the worksheet that should trigger the selection, which is your sheet in workbook A.
Also in workbook A, create a new Module in the VBA editor and paste the following code ...
Public objSlave As Workbook
Public objSlaveSheet As Worksheet
Public Sub SelectSlaveBook()
Dim objDlg As FileDialog, strFile As String, strSlaveSheetName As String
strSlaveSheetName = "Sheet1"
Set objDlg = Application.FileDialog(msoFileDialogOpen)
objDlg.Show
If objDlg.SelectedItems.Count > 0 Then
strFile = objDlg.SelectedItems(1)
Set objSlave = Application.Workbooks.Open(strFile, False, True)
Set objSlaveSheet = objSlave.Worksheets(strSlaveSheetName)
ThisWorkbook.Activate
End If
End Sub
... before moving on, make sure you change the value of strSlaveSheetName to be the name of the sheet where your data is in your "Slave" workbook (B).
Finally, in workbook A, add the following code to the ThisWorkbook object ...
Private Sub Workbook_Open()
SelectSlaveBook
End Sub
... now close the master workbook (in your case, workbook A) and open it again.
You will be prompted for the location of the "Slave" workbook (workbook B).
Once you've given that location, select what you want to select and all things held constant, it should work for you.
Of course, if it needs tweaks to suit your exact requirement, that's always possible.
I hope it works for you.
I have a table that looks like this:
| A | B | C | D |
+-------+------------+------------+------------+
1 | Name | Language 1 | Language 2 | Language 3 |
+=======+============+============+============+
2 | John | English | Chinese | Spanish |
3 | Wendy | Chinese | French | English |
4 | Peter | Spanish | Chinese | English |
And I want to generate a table that has only one language column. The other two language columns should become new rows like this:
| A | B |
+-------+----------+
1 | Name | Language |
+=======+==========+
2 | John | English |
3 | John | Chinese |
4 | John | Spanish |
5 | Wendy | Chinese |
6 | Wendy | French |
7 | Wendy | English |
8 | Peter | Spanish |
9 | Peter | Chinese |
10 | Peter | English |
I understand this will probably need a macro or something. If anybody could point me in the right direction, it would be much appreciated. I am not very familiar with VBA or the Excel object model.
This will do the trick. It is also dynamic: it supports as many language columns as you want, with a varying number of languages per person (reading of a row stops at the first blank cell).
Assumes the data is formatted as per the example:
Sub ShrinkTable()
Dim maxRows As Long
Dim maxCols As Long
Dim data As Variant
maxRows = Cells(1, 1).End(xlDown).row
maxCols = Cells(1, 1).End(xlToRight).Column
data = Range(Cells(1, 1), Cells(maxRows, maxCols))
Dim newSht As Worksheet
Set newSht = Sheets.Add
With newSht
.Cells(1, 1).Value = "Name"
.Cells(1, 2).Value = "Language"
Dim writeRow As Long
writeRow = 2
Dim row As Long
row = 2
Dim col As Long
Do While True
col = 2
Do While True
If data(row, col) = "" Then Exit Do 'Stop at the first blank language for this person
'Name
.Cells(writeRow, 1).Value = data(row, 1)
'Language
.Cells(writeRow, 2).Value = data(row, col)
writeRow = writeRow + 1
If col = maxCols Then Exit Do 'Exit clause
col = col + 1
Loop
If row = maxRows Then Exit Do 'Exit clause
row = row + 1
Loop
End With
End Sub
Messy but should work:
For Each namething In Range("A2", Range("A1").End(xlDown)) 'Start at A2 so the header row is not copied
Range("A1").End(xlDown).Offset(1, 0) = namething.Value
Range("A1").End(xlDown).Offset(0, 1) = namething.Offset(0, 2)
Range("A1").End(xlDown).Offset(1, 0) = namething.Value
Range("A1").End(xlDown).Offset(0, 1) = namething.Offset(0, 3)
namething.Offset(0, 2) = ""
namething.Offset(0, 3) = ""
Next
Then just sort the result by column A.
The following formula should work. The data in sheet2 would always reflect the data on sheet1 so you wouldn't have to re-run a macro to create a new list.
That being said, using a macro to generate it is probably a better choice, as it would allow more flexibility should you need to add a 4th language or something similar at a later date.
In Sheet2!A2
=INDIRECT("Sheet1!A"&ABS(INT((ROW()+1)/3))+1)
In Sheet2!B2
=INDIRECT("Sheet1!"&IF(ABS(INT((ROW()+1)/3)-(ROW()+1)/3)=0,"B",IF(ABS(INT((ROW()+1)/3)-(ROW()+1)/3)=(1/3),"C","D"))&ABS(INT((ROW()+1)/3))+1)
Add the column titles in A1 and B1 then autofill the formula down the sheet.