Moving multiple sets columns into a single set of columns - excel

I have a spreadsheet with columns of data in A-B-C-D-E for a single Person's Data.
Then, the same headers are repeated in columns F-G-H-i-J, for a 2nd Person's Data
Then, the same headers are repeated again in the next 5 columns, and this goes on for 25 sets of columns (so I have data for 25 people going across).
There are about 50 rows down of this data.
The data headers are as follows:
A-Name
B-DOB
C-Level
D-Acct#
E-Tshirt Size
F-Name
G-DOB
H-Level
I-Acct#
E-Tshirt Size
Repeating for 25 sets
I would like to move all of the data so that is is in only columns A-B-C-D-E.
I would like ALL of the Names to appear in Column A, All of the DOBs to appear in Column B, etc.
I have seen something similar to this posted here but I don't understand how to re-write the code to fit my specific situation.
I have struggled with this problem for 3 years and copy/pasted everything by hand which takes hours. I would really appreciated any information on how to automate this.
Thank you!

it needs fiddling with but here's a start:
Sub CopySomeCells()
With Excel.ThisWorkbook.Sheets("myExample")
lastcellincolumnA = .Cells(.Rows.Count, 1).End(Excel.xlUp).Row
lastcellincolumnL = .Cells(.Rows.Count, 12).End(Excel.xlUp).Row
.Range("G1:L" & lastcellincolumnL).Copy
.Range("A" & (lastcellincolumnA + 1)).PasteSpecial Excel.xlValues
End With
End Sub

Related

VBA Range.End(xlDown) stops at last visible row

I am doing a simple VBA script in Microsoft Excel which iterates a list of cars and a list of information about when the cars were refueled to provide an overview of how many kilometers each car is driving each month.
I make use of the Range.End property to calculate the number of rows with data and then loop through the indicies.
Set Data = Worksheets("Tankninger") ' Danish for refuellings
NumRows = Data.Range("A1", Data.Range("A1").End(xlDown)).Rows.Count
For x = 1 To NumRows
' Process data
Next
Everything seemed to be working fine, however I found that if someone applied a filter to e.g. the sheet with refuelling data - e.g. only showing data related to car A, then NumRows would be assigned the index of the last visible row.
Example: if the refuling sheet contains 100 records and the records related car A are located on row 50-60, then NumRows would be assigned the value 60 - resulting in my script ignoring the last 40 records.
Is there a way to make the Range.End property ignore any filter applied to sheet, or will I have to change the implementation to use a while-loop instead?
I ended up replacing the for-loop with a while-loop. This allowed me to access every cell regardless of any filtering applied to the sheets.
Set Data = Worksheets("Tankninger") ' Danish for refuellings
r = 2
While Not IsEmpty(Cars.Cells(r, 1).value)
' Process data
Wend
What you can do is add the following in your code to remove filters before you find the last row with data.
'Remove all filters
Worksheets("Sheet1").Activate
On Error Resume Next
ActiveSheet.ShowAllData

Add and remove cells based on values

Original Post: Here (New post as it started to get very clunky and dragged away from the issues at hand)
I am looking to automatically grab data from an excel CRM output and take certain values into a new sheet. I have had a bit of luck with my progress, but I am still struggling to adapt the code properly.
First Iteration of Code:
Sub Client_CRM()
Range("A4:A44,C4:C44,G4:H44").Select
Selection.Copy
Sheets("Output Sheet").Select
Range("A1").Select
ActiveSheet.Paste
End Sub
Current code:
Sub Client_CRM()
Dim ClientStartRow As Long, ClientEndRow As Long
Dim Listed As Long
Set PortfolioRange = Worksheets("Client Paste").Range("A:M")
Set Listed = Worksheets("Client Paste").Range("A:A")
With Sheets("Client Paste")
Sheets("Output Sheet").Cells.Clear
Worksheets("Client Paste").Activate
ClientStartRow = .Range("A3").Row
':A").Find(What:="Listed", after:=.Range("A1")).Row
ClientEndRow = .Range("A:A").Find(What:="Totals", after:=.Range("A3"), LookIn:=xlValues, lookat:=xlPart, MatchCase:=False).Row
Range(ClientStartRow & ":" & ClientEndRow).Select
Selection.Columns(1).Copy
Sheets("Output Sheet").Select
Range("A3").Select
ActiveSheet.Paste
Sheets("Output Sheet").Range("B1:B70") = Application.VLookup(Listed, PortfolioRange, 8, False)
Sheets("Output Sheet").Range("C1:C70") = Application.VLookup(Listed, PortfolioRange, 3, False)
Sheets("Output Sheet").Range("D1:D70") = Application.VLookup(Listed, PortfolioRange, 7, False)
End With
End Sub
As you can see, I've slowly added and learnt more things throughout today.
What I am now looking to do is:
Find a better way to copy the columns over to the new sheet. **An issue that I have encountered is that maybe 1/10 CRM exports have an additional column, so the VLOOKUP can't accurately be used 100% - The CRM export has headers. Can I use some sort of code to grab these columns by value? They are exported and on Row 2. "Listed" "Quantity" "MV" "PW" are the 4 headings. Usually they are columns: 1,3,7,8 but in a rare instance they are 1,3,8,9...
Find a way to remove certain "blacklist" products. All products generally have a 3 part code that they are identified as. There are certain 3 part codes I do not want included and I want to be able to update this as time goes on. Ideally, I'd like to make a separate sheet with these codes and if they match to anything from the export, they aren't copied over...
Some product codes have 5 characters instead of 3, I'd like these ones to be coped in the same list but added to a separate list (Unsure if this is possible?)
Update:
Have worked out how to get the code to bring the 4 columns I want regardless of their order over.
Set PPSExport = Range("A2:M2")
For Each cell In PPSExport
If cell.Value = "Asset" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("A:A")
End If
If cell.Value = "Quantity" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("B:B")
End If
If cell.Value = "Market value" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("C:C")
End If
If cell.Value = "Portfolio weight %" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("D:D")
End If
Next cell
Sheets("Output Sheet").Select
End With
Thanks for any help,
I've already learnt so much already -- any pointers would be greatly appreciated :D
Yes, all of these things are possible. I will give a brief description on how to accomplish all of these things, but I recommend that you try to research how to do each of these tasks on your own before asking another question(s). It's also a good idea to keep the scope of your question limited. For example, you are asking about 3 loosely related items here. Yes, they are related to one another via your project, but in the general world of VBA programming, they are not. This will keep the conversation focused and easier to follow.
Find a better way to copy the columns over to the new sheet.
You made a great observation: your data is imported with headers. And your proposal is possible. You can certainly use the headers of a range (or table) to copy data. You could:
Iterate through all cells in the header row
If you come across one you are interested it, copy all of the data in that column to the new sheet
If you come across a column header you are not interested in, just skip it and move to the next one
Find a way to remove certain "blacklist" products.
This is possible, and your proposed solution sounds ideal to me. Keep a record of all blacklist values in a sheet, and reference that list when necessary.
Some product codes have 5 characters instead of 3, I'd like these ones
to be coped in the same list but added to a separate list
Certainly possible. Once you have your data:
Iterate through all of it and check how many characters are in the value
If there are 5, copy that data to a new location or store it somewhere
If there are not 5, move on to the next value

Secifying a common range for xlsread function in Matlab

I am trying to figure out how to specify a common range for xlsread() function in matlab.
Usually I use n=xlsread('filename','#sheet','A1:A10'), but I have quite a bit of data in the same sheet and I'd like to know if I can specify it with one range, i.e . if all my data is between '1:10', I want to specify 1:10 as range, and only call the letter values of each column.
I was thinking to do it as follows:
function [a,b,c]=getdata(filename,'1:10')
a=xlsread(filename,1,'A:A'???)
b=xlsread(filename,1,'B:B'???)
c=xlsread(filename,1,'C:C'???)
end
After some research I could not find any information as to how this is done.
Thanks in advance,
Greg
If you want to read 1 to 10 rows of column A, use:
data = xlsread(filename, 1, 'A1:A10');
If you want to read 1 to 10 rows of all columns, use:
data = xlsread(filename, 1, '1:10');
If you want to read 1 to 10 rows of, say, first three columns A, B, and C, use:
data = xlsread(filename, 1, 'A1:C10');
Using dynamic variable names is always a bad idea. Read this for explanation. But if you still want to create a, b, and c and so on depending on the number of columns in the Excel file, you can use:
for k=1:size(data,2)
assignin('caller', char(96+k), data(:,k)); %or char(64+k) for block letters
end
The above will work if number of columns are less than or equal to 26. This may only be feasible if you're dealing with a few columns. But I still recommend to avoid it.

RemoveDuplicates is not working as expected

I have a large data set that is exported from a website. I use a macro in my main ‘filter’ workbook to find the file and copy the data from Sheet1 of the exported file into Sheet1 of the filter workbook.
Once the data is copied into Sheet1 of the filter workbook, I use VBA to copy columns A/B/D/F/H/Z/AA/etc from Sheet 1 of the filter workbook into Sheet2 of the filter workbook AND also at the same time, I use this code here to attempt to delete any duplicate rows:
Worksheets("Sheet2").Range("A:DZ").RemoveDuplicates Columns:=15, Header:=xlYes
I am finding though that the RemoveDuplicates is not working as expected.
As an example, Sheet1 in the filter workbook (and export workbook) has 3344 rows. When I manually filter using conditional formatting to highlight duplicates, I can find 314 rows listed as duplicates (meaning 157 genuine rows of actual data and 157 rows which are duplicates of that actual data. I haven’t found any examples of duplicates existing more than one time each). So on Sheet2 I was expecting to see 3344 – 157 = 3157 Rows of real data. I don’t see that, or even 3030 rows (3344-314). Instead, I am getting 1897 rows paste into Sheet2, which is a difference of 1447 rows (1290 less rows than expected).
On top of that, I am manually checking the data to see what is up by using Control-F in the column and am finding that in some instances that both of the two duplicated items are missing from Sheet2 (versus it just deleting the one duplicate row).
The duplicate data is not on sequential rows and is scattered throughout the column in Sheet2. But when I sort it before I attempt to DeleteDuplicates, it does not seem to impact its accuracy or make it right.
I have also tried using the DeleteDuplicates in different locations of the code / at different times but the results are always off by the same amount (1447 missing rows instead of 157 missing rows).
I found only a few articles on this site, this one was the closest but not quiet my issue: Delete Rows With Duplicate Data VBA
And other help sites/forums mention there was some bug with office 2007 that prevents this from working (am using 2013).
Does anyone know where I can find a work around, or if one exists - or if this is still a real bug or just a problem with the code line I have above.
Adding bits of code I use in this example in case it is something within these that is causing the problem…
Part of the copy code:
wsFromSheet.Cells.Copy
wsToFile.Range("A1").PasteSpecial xlPasteValuesAndNumberFormats
Application.CutCopyMode = False
wbFromFile.Close True
Part of the ‘paste column code’:
Sheets("Sheet2").Rows(1 & ":" & Sheets("Sheet2").Columns.Count).ClearContents 'Clear from row 1 down
LastRowFromSiteTracker = xSht.Cells.SpecialCells(xlCellTypeLastCell).Row 'original report has 128 columns
xSht.Range("B1:B" & LastRowFromSiteTracker).Copy ySht.Cells(Rows.Count, "A").End(xlUp) 'customer name
‘repeat a bunch of times, then…
Application.CutCopyMode = False 'do I need this line?
Worksheets("Sheet2").Range("A:DZ").RemoveDuplicates Columns:=15, Header:=xlYes
End Sub
Example/sample of data:
Row Source Data Expected Data Actual Data
1 1000474608 1000474608 1000474608 (Dup missing from sheet2)
2 1000474608 1000487672 1000487672
3 1000487672 1000487674 1000487674
4 1000487674 1000487676 1000487676 (missing from sheet2, wasn’t a dup)
5 1000487676 1000487678 1000487678
6 1000487678 1000487680 1000487680
7 1000487680 1000487682 1000487682 (Dup missing from sheet2)
8 1000487682 1000520278 1000520278
9 1000487682 1000520280 1000520280
10 1000520278 1000520282 1000520282 (Is there)
11 1000520280 1000520286 1000520286
12 1000520282 1000520336 1000520336 (Is there)
13 1000520282 1000520338 1000520338
14 1000520286 1000520392 1000520392
15 1000520286 1000520394 1000520394
16 1000520336 1000530333 1000530333
17 1000520338
18 1000520392
19 1000520394
20 1000530333
EDIT: EDIT: EDIT:
So I've tried to do some more manual testing, and tried two separate things with the same set of data, getting two different results. I used the Conditional Formatting - Highlight Duplicates from the main Home ribbon and the Remove Duplicates from the Data ribbon.
The Remove Duplicates button finds and removed 163 items in Column P and leaves 3181 rows.
However, when I use the Highlight Duplicates conditional format finds 314 items that are duplicated within Column P, leaving 3030 non duplicates.
It does not make sense that these two numbers do not match. I thought it has something to do with the duplicates themselves - as most duplicated items have only one dup of itself (123123 shows up in two rows) but then just a small handful of rows are duplicated multiple times (234234 shows up in 4 or 6 columns).
So instead of using the manual way, I used the suggestions I've found online, and both of these also provide differing results when run:
3344 Base records
1897 left after scrub of duplicates (1446 removed)
Dim tmpAddress As String
tmpAddress = "A2:BZ" & Worksheets("ColScrub").UsedRange.Rows.Count
Worksheets("ColScrub").Range(tmpAddress).RemoveDuplicates Columns:=15, Header:=xlNo
3181 left after scrub of duplicates (162 removed)
Cells.Select
ActiveSheet.Range("$A$1:$EI$3345").RemoveDuplicates Columns:=31, Header:=xlYes
My further experience now shows that UsedRange is completely unreliable if you have blank rows or columns. UsedRange only includes rows/columns up to the blank one. I have found a better way to get the last of each. These function use 2 basic assumptions, which should hold true for mostof your spreadsheets.
For LastRow there is a "key" column, i.e. a column where the MUST be
data, for example an ID column
For LastCol there should be a header row (or row where you can guarantee the last column is filled)
With this in mind, I have created the following 2 functions retrieve the last values accurately, every time ... well, almost (my complete function handles issues of the footer rows with merged cells)
For the last row
Public Function Excel_GetLastRow(xlSheet As Excel.Worksheet, _
ByVal KeyColumn As Long) As Long
' This could be adjusted for exact max rows Excel allows
Const MAX_XL_ROWS As Long = 1048000
Excel_GetLastRow = xlSheet.Cells(MAX_XL_ROWS, KeyColumn).End(xlUp).row
End Function
And for last column
Public Function Excel_GetLastCol(xlSheet As Excel.Worksheet, _
ByVal HeaderRow As Long) As Long
' This could be adjusted for exact max columns Excel allows
Const MAX_XL_COLS As Long = 16000
Excel_GetLastCol = xlSheet.Cells(MAX_XL_COLS, HeaderRow).End(xlToLeft).Column
End Function
Using these values you can now set your complete data range successfully.
top left = Cells(HeaderRow + 1, 1)
bottom right = Cells(LastRow, LastCol)
My complete functions include error handling and allowances for possible merged cells in the footer row and last header column, but you get the idea.
Art
1) you are only clearing as many rows as your have columns, not rows
Also, you may not be clearing anything, so use the UsedRange.Rows for proper rowcount
This line ...
Sheets("Sheet2").Rows(1 & ":" & Sheets("Sheet2").Columns.Count).ClearContents
Should read ...
Sheets("Sheet2").Rows(1 & ":" & Sheets("Sheet2").UsedRange.Rows.Count).ClearContents
Without properly clearing the old data, unpredictable results may occur.
2) Excel VBA seems rather quirky in that many things won't work correctly without specifically "selecting" the object(s) in question AND specifiying complete (not columnar) ranges
3) I also prefer to leave out the header row (note the "A2") and pass Header:=xlNo
4) Unless you have more than 625 columns, BZ should be far enough
So add this to your code ...
Dim tmpAddress as String
tmpAddress = "A2:BZ" & Worksheets("Sheet2").UsedRange.Rows.Count
Worksheets("Sheet2").Activate
Worksheets("Sheet2").Range(tmpAddress).RemoveDuplicates Columns:=15, Header:=xlNo
Hope this helps :)
I don't know why (or if/how) this is any different, but this seems to be the closest I can get to true removal of duplicates. I wanted to add it here as an answer for others in similar situations.
Dim lastrow As Long
With ThisWorkbook.Worksheets("ColScrub")
If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
lastrow = .Cells.Find(What:="*", _
After:=.Range("A1"), _
Lookat:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row 'Change P1 back to A1 if needed
Else
lastrow = 1
End If
.Range("A1:AZ" & lastrow).RemoveDuplicates Columns:=Array(16), Header:=xlYes
End With
I have to go through each row visually to prove this works I think, and to rule out that it isn't deleting things that should not be deleted - but this seems to get rid of the 'double duplicates' (4-6 lines items instead of 2 like the other duplicates).

How can I add a 1 to the most recent, repeated row in Excel?

I have a dataset with 60+ thousand rows in excel and about 20 columns. The "ID column" sometimes repeats itself and I want to add a column that will return 1 only in the row that is the most recent only IF it repeats itself.
Here is the example. I have…
ID DATE ColumnX
AS1 Jan-2013 DATA
AS2 Feb-2013 DATA
AS3 Jan-2013 DATA
AS4 Dec-2013 DATA
AS2 Dec-2013 DATA
I want…
ID DATE ColumnX New Column
AS1 Jan-2013 DATA 1
AS2 Feb-2013 DATA 0
AS3 Jan-2013 DATA 1
AS4 Dec-2013 DATA 1
AS2 Dec-2013 DATA 1
I've been trying with a combination of sort and nested if's, but it depends on my data being always in the same order (so that it looks up the ID in the previous row).
Bonus points: consider my dataset if fairly large for excel, so the most efficient code that won't eat up processor would be appreciated!
An approach you could use is to point MSQuery at your table and use SQL to apply the business rules. On the positive side, this runs very quickly (a couple seconds in my tests against 64k rows). A huge minus is the query engine does not seem to support Excel tables exceeding 64k rows, but there might be ways to work around this. Regardless, I offer the solution in case it gives you some ideas.
To set up first give your data set a named range. I called it MYTABLE. Save. Next select a cell to the right of your table in row 1, and click through Data | From other sources | from Microsoft Query. Choose Excel Files* | OK, browse for your file. The Query Wiz should open, showing MYTABLE available, add all the columns. Click Cancel (really), and click Yes, you want to continue editing.
The MSQuery interface should open, click the SQL button and replace the code with the following. You will need to edit some specifics, such as the file path. (Also, note I used different column names. This was sheer paranoia on my part. The Jet engine is very finicky and I wanted to rule out conflicts with reserved words as I built this.)
SELECT
MYTABLE.ID_X,
MYTABLE.DATE_X,
MYTABLE.COLUMN_X,
IIF(MAXDATES.ID_x IS NULL,0,1) * IIF(DUPTABLE.ID_X IS NULL,0,1) AS NEW_DATA
FROM ((`C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE
LEFT OUTER JOIN (
SELECT MYTABLE1.ID_X, MAX(MYTABLE1.DATE_X) AS MAXDATE
FROM `C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE1
GROUP BY MYTABLE1.ID_X
) AS MAXDATES
ON MYTABLE.ID_X = MAXDATES.ID_X
AND MYTABLE.DATE_X = MAXDATES.MAXDATE)
LEFT OUTER JOIN (
SELECT MYTABLE2.ID_X
FROM `C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE2
GROUP BY MYTABLE2.ID_X
HAVING COUNT(1) > 1
) AS DUPTABLE
ON MYTABLE.ID_X = DUPTABLE.ID_X)
With the code in place MSQuery will complain the query can't be represented graphically. It's OK. The query will execute -- it might take longer than expected to run at this stage. I'm not sure why, but it should run much faster on subsequent refreshes. Once results return, File | Return data to Excel. Accept the defaults on the Import Data dialog.
That's the technique. To refresh the query against new data simply Data | Refresh. If you need to tweak the query you can get back to it though Excel via Data | Connections | Properties | Definition tab.
The code I provided returns your original data plus the NEW_DATA column, which has value 1 if the ID is duplicated and the date is the maximum date for that ID, otherwise 0. This code will not sort out ties if an ID's maximum date is on several rows. All such rows will be tagged 1.
Edit: The code is easily modified to ignore the duplication logic and show most recent row for all IDs. Simply change the last bit of the SELECT clause to read
IIF(MAXDATES.ID_x IS NULL,0,1) AS NEW_DATA
In that case, you could also remove the final LEFT JOIN with alias DUPTABLE.
Sort by ID, then by DATE (ascending). Define entries in new column to be 1 if previous row has the same ID and next row has a different ID or is empty (for last row), 0 otherwise.
It could be done in VBA. I'd be interested to know if this is possible just using formulas, I had to do something similar once before.
Sub Macro1()
Dim rowCount As Long
Sheets("Sheet1").Activate
rowCount = Cells(Rows.Count, 1).End(xlUp).Row
Columns("A:D").Select
Selection.AutoFilter
Range("D2:D" & rowCount).Select
Selection.ClearContents
Columns("A:D").Select
ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.SortFields.Add Key:=Range _
("B1:B" & rowCount), SortOn:=xlSortOnValues
ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.SortFields.Add Key:=Range _
("A1:A" & rowCount), SortOn:=xlSortOnValues
ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.Apply
Dim counter As Integer
For counter = 2 To rowCount
Cells(counter, 4) = 1
If Cells(counter, 1) = Cells(counter + 1, 1) Then Cells(counter, 4) = 0
Next counter
End Sub
So you activate the sheet and get the count of rows.
Then select and autofilter the results, and clear out Column D which has the 0s or 1s. Then filter on the values mbroshi suggested that you say you're already using. Then execute a loop for each record, changing the value to 1, but then back to 0 if the value ahead of it has the same ID.
Depending on your processor I dont think this would take more than a minute or two to run. If you do find something using formulas I would be interested to see it!

Resources