Grouping data tables - excel

I have following table in an Excel file:
customer ID sell attribute1 attribute2 attribute3 …… attribute N
Customer1 sell1 1 0 1 1
Customer1 sell2 0 0 0 0
Customer1 sell3 1 0 1 1
Customer2 sell4 0 0 0 1
Customer2 sell5 1 0 1 0
Customer3 sell6 1 0 0 0
…… ……
I need an Excel VBA function to remove redundant customers rows and produce only one row for each customer represent the all columns values for that customer ... ( max value for each column produce only maximum values in that column ), column sell its not consider in the final result.
The result as follows:
customer ID attribute1 attribute2 attribute3 …… attribute N
Customer1 1 0 1 1
Customer2 1 0 1 1
Customer3 1 0 0 0
etc....

VBA is not required for this. Select all your data and DATA > Outline - Subtotal:
At each change in: customer ID
Use function: Max
Add subtotals to: check each of the attributes (only)
Check Replace current subtotals and Summary below data
OK
Copy and Paste Special..., Values over the top. Filter to select for ColumnA, Text Filters, Contains..., Max, OK, select visible data (excluding last row if not wanted), copy and paste where required. Delete ColumnB.

Try this simple little procedure
Sub MergeData()
Dim lastrow As Long
Dim lastcol As Long
Dim i As Long, ii As Long
Application.ScreenUpdating = False
With ActiveSheet
lastrow = .Cells(.Rows.Count, "A").End(xlUp).Row
lastcol = .Cells(1, .Columns.Count).End(xlToLeft).Column
For i = lastrow To 2 Step -1
If .Cells(i, "A").Value = .Cells(i - 1, "A").Value Then
For ii = 3 To lastcol
.Cells(i - 1, ii).Value = -(.Cells(i - 1, ii).Value > 0 OR .Cells(i, ii).Value > 0)
Next ii
.Rows(i).Delete
End If
Next i
.Columns(2).Delete
End With
Application.ScreenUpdating = True
End Sub

Related

Compare 2 columns and remove duplicates without shifting

I have this below datasheet,
COLA COLB
ABC 10
ABC 10
ABC 15
XYZ 10
XYZ 15
XYZ 15
I want to compare both the columns and if they are duplicate I want to remove blank the cell, I don't want to shift.
For example like this,
COLA COLB
ABC 10
ABC 15
XYZ 10
XYZ 15
How can I do this in excel?
Cheers!!
no vba method
Step 1 identify rows to remove using formula.
place this formula at C2 and fill down.
=IF(COUNTIFS(A$2:A2,A2,B$2:B2,B2)=1,1,"REMOVE")
Step 2 Filter for "REMOVE" only
Step 3 Select entire rows and just press DEL button, then remove filter.
Try this.
Option Explicit
Dim i, j, count, lastrow As Integer
Dim number As Long
Sub delete_duplicates()
lastrow = Cells.Find("*", [A1], , , xlByRows, xlPrevious).Row
For i = 1 To lastrow
number = Cells(i, 2)
For j = 1 To lastrow
If number = Cells(j, 2) Then
count = count + 1
If count > 1 Then
Cells(j, 2) = ""
End If
End If
Next j
count = 0
Next i
End Sub

Copy values from mulit-column listbox to a sheet VBA

I have a listbox with column 4 columns (Name, Marital Status, Years Married, Gender). I need to copy the contents of my listbox to sheet 1. For example, if the listbox contained: Joe, Single, 0, Male then I need to have cells A2:D2 have the values Joe, Single, 0, Male, respectively.
You can loop over the rows/columns in a listbox, example:
For r = 0 to ListBox.ListCount - 1
For c = 0 to ListBox.ListColumns.Count - 1
Range("A2").Offset(r, c).Value = ListBox.List(r,c)
Next
Next
just changed ListColumns.Count - 1 to .ColumnCount - 1
For r = 0 to ListBox.ListCount - 1
For c = 0 to ListBox.ColumnCount - 1
Range("A2").Offset(r, c).Value = ListBox.List(r,c)
Next
Next

How to transpose in excel 2010

I have an excel sheet with the following format
a 2 3 4 0 0 0 0 0
a 2 5 6 7 0 0 0 0
a 4 5 9 0 0 0 0 0
b 5 5 9 0 0 0 0 0
b 1 1 1 1 1 1 1 1
I want to end up with something like this
a 2
a 3
a 4
a 2
a 5
a 6
a 7
a 4 .....
b 1
b 1
b 1 ....
Check out my unpivot add-in (just add a header row that you can remove afterwards). It will return the data in the format you need.
Try this
Sub move()
Dim cell As Range, _
found As Range
Dim letter As String
Set cell = Range("B1")
Set found = Range("A:A").Find("*", Range("A1"), searchdirection:=xlPrevious).Offset(2, 0)
Do While (cell.Value <> "")
letter = cell.Offset(0, -1).Value
Do While (cell.Value <> 0)
found.Value = letter
found.Offset(0, 1).Value = cell.Value
Set cell = cell.Offset(0, 1)
Set found = found.Offset(1, 0)
Loop
Set cell = Cells(cell.Row + 1, 2)
Loop
End Sub
I was able to do this pretty quickly with a combination of concatenation and de-concatenation and a formula from http://www.cpearson.com/excel/TableToColumn.aspx
=$A1&","&B1
=OFFSET($K$1:$R$5,TRUNC((ROW()-ROW($U$1))/COLUMNS($K$1:$R$5),0),MOD(ROW()-ROW($U$1),COLUMNS($K$1:$R$5)),1,1)
=LEFT(U1,FIND(",",U1,1)-1)
=RIGHT(U1,FIND(",",U1,1)-1)
Good Luck.

Macro to make all possible combinations of data in various columns in excel sheet?

In an another post, the user Excellll provided a macro address the aforementioned question.
I have a worksheet which has data as below:
A B C
abc,def,ghi,jkl 1,2,3 a1,e3,h5,j8
The following solution turns it into
abc 1 a1
abc 2 a1
abc 3 a1
abc 1 e3
abc 2 e3
abc 3 h5
However, I wanted to know how the macro can be modified as the number of columns of data grow from 3 columns of data to 10 columns of data.
I tried modifying the macro a number of times based upon the patterns in the code that I saw, but I kept getting an error.
I am a fan of recursion but only if I believe it provides the simpliest solution. I do not believe it is appropriate for this problem.
In the original question, UJ9 had:
Column A B C
Row 1 abc,def,ghi,jkl 1,2,3 a1,e3,h5,j8
and wanted:
Column A B C
Row 1 abc 1 a1
Row 2 abc 2 a1
Row 3 abc 3 a1
Row 4 abc 1 e3
Row 5 abc 2 e3
Row 6 abc 3 h5
:
Row 48 jkl 3 j8
user1657410 wants the same but with 10 columns.
The solutions for the original problem use three (one per column) nested for-loops. Adapting those solutions for ten nested for-loops is possible but not an easy implementation. Let us consider the principle behind those solutions and then look for a different implementation strategy.
If we index the values in each column we get:
Column A B C
Row 1 abc,def,ghi,jkl 1,2,3 a1,e3,h5,j8
Index 0 1 2 3 0 1 2 0 1 2 3
What the solutions do is generate every combination of index: 000 001 002 003 010 011 012 013 020 021 021 023 100 ... 323 and use the digits to select the appropriate substring from the appropriate string.
To adapt this approach for a larger number of columns we need to switch from nested for-loops to arrays with one entry per column. One array hold the maximum value of the index for the column and the other holds the currently selected index. The initial state would be something like:
Column A B C D E F G H I J
Maximum index array 4 3 4 4 3 2 6 3 4 2
Current index array 0 0 0 0 0 0 0 0 0 0
We now need a loop that will increment the Current index array like a speedometer except each column has its own maximum. That is, we want to add one to the rightmost element of the Current index array unless it is already at its maximum value. If it is at its maximum value, it is reset to zero and the next column to the left is incremented unless it is at its maximum value. This continues until the loop wants to increment the leftmost index past its maximum value. That is, we need a loop which will set the Current index array to the following values:
Column A B C D E F G H I J
Maximum index array 4 3 4 4 3 2 6 3 4 2
Current index array 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 0 1 2
0 0 0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 0 2 1
0 0 0 0 0 0 0 0 2 2
0 0 0 0 0 0 0 0 3 0
0 0 0 0 0 0 0 0 3 1
0 0 0 0 0 0 0 0 3 2
0 0 0 0 0 0 0 1 0 0
: :
4 3 4 4 3 2 6 3 4 2
For each different value of the Current index array, you select the appropriate substring from each column and generate a row containing the substrings.
Before we go any further, are you sure you want to generate a row per combination of sub-string? With the maximum index values I selected for my example, you would get 2,520,000 rows.
The code below assumes the source row is row 1. It outputs the generated rows starting at row 3. This code generates a table like the one above so you can properly understand how the code works. Below this code are instructions to amend it to output substrings. The code adjusts to the number of columns in the source row. The code does not check that your version of Excel can support the number of rows generated.
Sub Combinations()
Dim ColCrnt As Long
Dim ColMax As Long
Dim IndexCrnt() As Long
Dim IndexMax() As Long
Dim RowCrnt As Long
Dim SubStrings() As String
Dim TimeStart As Single
TimeStart = Timer
With Worksheets("Combinations")
' Use row 1 as the source row. Find last used column.
ColMax = .Cells(1, Columns.Count).End(xlToLeft).Column
' Size Index arrays according to number of columns
' Use one based arrays so entry number matches column number
ReDim IndexCrnt(1 To ColMax)
ReDim IndexMax(1 To ColMax)
' Initialise arrays
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
' SubStrings is a zero-based array with one entry
' per comma separated value.
IndexMax(ColCrnt) = UBound(SubStrings)
IndexCrnt(ColCrnt) = 0
Next
RowCrnt = 3 ' Output generated values starting at row 3
Do While True
' Use IndexCrnt() here.
' For this version I output the index values
For ColCrnt = 1 To ColMax
' This will generate an error if RowCrnt exceeds the maximum number
' of columns for your version of Excel.
.Cells(RowCrnt, ColCrnt).Value = IndexCrnt(ColCrnt)
Next
RowCrnt = RowCrnt + 1
' Increment values in IndexCrnt() from right to left
For ColCrnt = ColMax To 1 Step -1
If IndexCrnt(ColCrnt) < IndexMax(ColCrnt) Then
' This column's current index can be incremented
IndexCrnt(ColCrnt) = IndexCrnt(ColCrnt) + 1
Exit For
End If
If ColCrnt = 1 Then
' Leftmost column has overflowed.
' All combinations of index value have been generated.
Exit Do
End If
IndexCrnt(ColCrnt) = 0
' Loop to increment next column
Next
Loop
End With
Debug.Print Format(Timer - TimeStart, "#,###.##")
End Sub
If you are happy that you understand the above code, replace:
' For this version I output the index values
For ColCrnt = 1 To ColMax
.Cells(RowCrnt, ColCrnt).Value = IndexCrnt(ColCrnt)
Next
by:
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
.Cells(RowCrnt, ColCrnt).Value = SubStrings(IndexCrnt(ColCrnt))
Next
This revised code output the appropriate substring for each combination but it will be slow with large numbers of combination because it extracts the required substring from the source cell for every generated row. For example, it generates 27,648 rows in 12.66 seconds. The code below takes 9.15 seconds but uses a more advanced technique.
Step 1, replace:
Dim SubStrings() As String
by:
Dim SubStrings() As Variant
With Dim SubStrings() As String, SubString(N) can only contain a string. With Dim SubStrings() As Variant, SubString(N) can contain a string or an integer or a floating-point value. This is not good in most situations because a variant is slower to process than a string or a long and you will not be warned if you set it to the wrong sort of value for your code. However, I am going to store an array in SubString(N). I will be using what is called a ragged array because each row has a different number of columns.
Step 2, replace:
ReDim IndexCrnt(1 To ColMax)
ReDim IndexMax(1 To ColMax)
by:
ReDim IndexCrnt(1 To ColMax)
ReDim IndexMax(1 To ColMax)
ReDim SubStrings(1 To ColMax)
Step 3, replace:
' Initialise arrays
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
' SubStrings is a zero-based array with one entry
' per comma separated value.
IndexMax(ColCrnt) = UBound(SubStrings)
IndexCrnt(ColCrnt) = 0
Next
by:
' Initialise arrays
For ColCrnt = 1 To ColMax
SubStrings(ColCrnt) = Split(.Cells(1, ColCrnt).Value, ",")
IndexMax(ColCrnt) = UBound(SubStrings(ColCrnt))
IndexCrnt(ColCrnt) = 0
Next
With the first version, I overwrite the array SubStrings everytime I split a cell. With the second version, I save each column's substrings. With the values used by UJ9 in the original question, the new SubString looks like:
---- Columns -----
Row 0 1 2 3
1 abc def ghi jkl
2 1 2 3
3 a1 e3 h5 j8
Step 4: replace:
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
.Cells(RowCrnt, ColCrnt).Value = SubStrings(IndexCrnt(ColCrnt))
Next
by:
For ColCrnt = 1 To ColMax
.Cells(RowCrnt, ColCrnt).Value = SubStrings(ColCrnt)(IndexCrnt(ColCrnt))
Next
With the revised code I do not split a source cell for every generated value. I extract the substring I require from the array.
Note: if you have ever used two dimensional arrays, you will have written something like MyArray(Row,Column). Ragged arrays are different; you write MyArray(Row)(Column).
Here's a generalised solution that uses Recursion to handle any number of columns (greater than 1)
Sub Combinations()
Dim aSrc As Variant
' Get Data into an array
' This section is an example to get the source data into an array
' Replace this section if your data is sourced differently.
' The required format of aSrc is Array(1 To NumberOfColumnsOfData)
' where each element aSrc(n) is Array(1 To NumberOfRowsInColumnN, 1 To 1) of Variant
Dim rSrc As Range, colR As Range
Dim sh As Worksheet
Dim a As Variant
Dim i As Long
Set sh = ActiveSheet ' <-- Adjust to suit
Set rSrc = sh.[A:D] ' <-- Adjust to suit
ReDim aSrc(1 To rSrc.Columns.Count)
With sh
For i = 1 To rSrc.Columns.Count
Set colR = rSrc.Columns(i)
aSrc(i) = .Range(colR.Cells(1, 1), colR.Cells(.Rows.Count, 1).End(xlUp))
Next
End With
' Generate output
' This populates aDst(1 To lSize, 1 To NumberOfSourceColumns)
' where lSize is total number of combinations
Dim aDst As Variant
Dim lSize As Long
Dim n As Long
Dim aBase() As String
lSize = 1
For i = 1 To UBound(aSrc)
lSize = lSize * UBound(aSrc(i), 1)
Next
ReDim aDst(1 To lSize, 1 To UBound(aSrc))
ReDim aBase(0 To UBound(aSrc) - 1)
n = 1
aBase = Split(String(UBound(aSrc) - 1, ","), ",")
aBase(0) = aSrc(1)(1, 1)
Generate aSrc, aDst, aBase, 1, n
' Place output into sheet
' Starting at cell rDst
Dim rDst As Range
Set rDst = [E1] ' <-- Adjust to suit
Set rDst = rDst.Resize(UBound(aDst, 1), UBound(aDst, 2))
rDst = aDst
End Sub
Private Sub Generate(ByRef aSrc As Variant, ByRef aDst As Variant, ByRef aBase As Variant, ByVal pCol As Long, ByRef pDst As Long)
Dim i As Long, j As Long
If pCol = UBound(aSrc) Then
' If iterating the last source column, output to aDst
For i = 1 To UBound(aSrc(pCol), 1)
For j = 1 To UBound(aBase)
aDst(pDst, j) = aBase(j - 1)
Next
aDst(pDst, j) = aSrc(pCol)(i, 1)
pDst = pDst + 1
Next
Else
' If NOT iterating the last source column, aBase and call Generate again
For i = 1 To UBound(aSrc(pCol), 1)
aBase(pCol - 1) = aSrc(pCol)(i, 1)
Generate aSrc, aDst, aBase, pCol + 1, pDst
Next
End If
End Sub

How can I filter and copy data in Excel?

I have a lot of data in an excel worksheet. For calculations, I would like to restrict this data to the relevant data only. That is: filter the data and put the subset in another worksheet.
Relevant data is data that falls within a given minimum and maximum value.
For example:
Suppose I want to filter column A for values between 1 and 2, and column B for values between 0 and 1. Result should become like this.
A B C = Data
1 0 0 0
2 1 1 0
3 2 0 3
4 2 2 1
A B C = Result
1 1 1 0
2 2 0 3
Is there an easy solution for this?
The fact that I don't filter on exact matches apparently makes the problem more difficult.
Thanks in advance!
I've got a quick VBA procedure that will do just what you want...
Private Sub MultiFilter(DataRange As Range, CriteriaRange As Range, OutputRangeTL As Range)
Dim intRowCounter As Integer
Dim intColCounter As Integer
Dim varCurrentValue As Variant
Dim blnCriteriaError As Boolean
Dim rngOutputCurrent As Range
If CriteriaRange.Columns.Count <> DataRange.Columns.Count Then
Err.Raise Number:=513, Description:="CriteriaRange and DataRange must have same column count"
End If
If CriteriaRange.Rows.Count <> 2 Then
Err.Raise Number:=513, Description:="CriteriaRange must be of 2 rows"
End If
Set rngOutputCurrent = OutputRangeTL.Resize(1, DataRange.Columns.Count)
For intRowCounter = 1 To DataRange.Rows.Count
For intColCounter = 1 To DataRange.Columns.Count
varCurrentValue = DataRange.Cells(intRowCounter, intColCounter).Value
If Not (varCurrentValue >= CriteriaRange.Cells(1, intColCounter) _
And varCurrentValue <= CriteriaRange.Cells(2, intColCounter)) Then
''#i.e. criteria doesn't match
blnCriteriaError = True
Exit For
End If
Next intColCounter
If Not blnCriteriaError Then
''#i.e. matched all criteria
rngOutputCurrent.Value = DataRange.Resize(1).Offset(intRowCounter - 1).Value
Set rngOutputCurrent = rngOutputCurrent.Offset(1)
End If
blnCriteriaError = False
Next intRowCounter
End Sub
Usage:
DataRange:
0 0 0
1 1 0
2 0 3
2 2 1
CriteriaRange:
1 0 0
2 1 10
Then do:
Public Sub DoTheFilter()
MultiFilter Range("MyDataRange"), Range("MyCriteriaRange"), Range("MyOutputRangeTopLeft")
End Sub
The CriteriaRange is simply a 2 row range giving minimum and maximum values for each column.
This isn't the most elegant of most efficient way I'm sure, but I used it as a quick fix as I've needed to do this once or twice.
If you're not comfortable with using VBA code then let me know and I'm sure I can manage to convert it into a worksheet function for you (this would also have the added advantage of updating if you changed the criteria...)
Simon

Resources