Alphanumerical sorting - paste error - large amount of data - excel

I've devellopped a quick little code to help me sort a large amount of data on an excel tool, it works perfectly except for one tiny detail ruining all of my work so far.
My table is 500+ columns wide and my algorithm would like to copy and paste two rows at a time (I use fused cells and do not wish to use an alternative to that).
My algorithm then tries to do :
Rows(i & ":" & i + 1).Cut
Rows(j & ":" & j + 1).Select
Selection.Insert shift:=xlDown
i and j being the row index (I try to put row j and j+1 just below row i and i+1
Which tells me that excel cannot do with the current ressources.
I tried to reduce the amount of cells with something like
range(cells(i,bc),cells(i+1,ec)).cut
range(cells(j,bc),cells(j+1,ec)).select
Selection.Insert shift:=xlDown
bc being the first column of my selection and ec the last
But still I have the same problem (selection is between 500 and 510 colums so around 1k cells)
So here is the question :
Is there a way to bypass that error and force the cut/paste? (I would like to avoid changing x columns at a time, slowing a process already quite slow)
Or a way to change the index of the row or something alike that I do not know about ?
Thank you for your time and future answers.
PEagle

This is not the answer, just too long to put as a comment.
I just ran the code below (with 700 columns of data per row), and it Cut >> Paste just fine without errors.
Option Explicit
Sub CopyFullTwoRows()
Dim i As Long, j As Long
' just for simulation
i = 2
j = 10
Rows(i & ":" & i + 1).Cut
Rows(j & ":" & j + 1).Insert Shift:=xlDown
End Sub

Sorry i am not yet at comment privilegies, so here is the answer to your comments/answers:
My file is about 6300KB.
For other programs, generaly just the outlook and firefox but computers at work are not very powerful.
Regarding the flatening of my merged cells and then sorting, the problem is then that I need to do sorting but only on even lines.
As it is a file from work I cannot send it on the internet, but to give you an idea, I have 2 lines working together, columns B to F are merged (B8+B9, B10+B11 etc, then C8+C9 etc.) and then I have two rows of data for each of these lines.
Unmerging would cause blank cells to pop and then sorting would mess the whole table.

I am sorry, my question is now obsolete.
I reorked the file, noticing that the file, for a reason I do not know, included all lines until the end of the sheet (1+ million). I have removed the unnecessary lines (recreated the file) and it now works fine.
sorry for your time loss, thank you for helping me.
See you soon.

Related

Add and remove cells based on values

Original Post: Here (New post as it started to get very clunky and dragged away from the issues at hand)
I am looking to automatically grab data from an excel CRM output and take certain values into a new sheet. I have had a bit of luck with my progress, but I am still struggling to adapt the code properly.
First Iteration of Code:
Sub Client_CRM()
Range("A4:A44,C4:C44,G4:H44").Select
Selection.Copy
Sheets("Output Sheet").Select
Range("A1").Select
ActiveSheet.Paste
End Sub
Current code:
Sub Client_CRM()
Dim ClientStartRow As Long, ClientEndRow As Long
Dim Listed As Long
Set PortfolioRange = Worksheets("Client Paste").Range("A:M")
Set Listed = Worksheets("Client Paste").Range("A:A")
With Sheets("Client Paste")
Sheets("Output Sheet").Cells.Clear
Worksheets("Client Paste").Activate
ClientStartRow = .Range("A3").Row
':A").Find(What:="Listed", after:=.Range("A1")).Row
ClientEndRow = .Range("A:A").Find(What:="Totals", after:=.Range("A3"), LookIn:=xlValues, lookat:=xlPart, MatchCase:=False).Row
Range(ClientStartRow & ":" & ClientEndRow).Select
Selection.Columns(1).Copy
Sheets("Output Sheet").Select
Range("A3").Select
ActiveSheet.Paste
Sheets("Output Sheet").Range("B1:B70") = Application.VLookup(Listed, PortfolioRange, 8, False)
Sheets("Output Sheet").Range("C1:C70") = Application.VLookup(Listed, PortfolioRange, 3, False)
Sheets("Output Sheet").Range("D1:D70") = Application.VLookup(Listed, PortfolioRange, 7, False)
End With
End Sub
As you can see, I've slowly added and learnt more things throughout today.
What I am now looking to do is:
Find a better way to copy the columns over to the new sheet. **An issue that I have encountered is that maybe 1/10 CRM exports have an additional column, so the VLOOKUP can't accurately be used 100% - The CRM export has headers. Can I use some sort of code to grab these columns by value? They are exported and on Row 2. "Listed" "Quantity" "MV" "PW" are the 4 headings. Usually they are columns: 1,3,7,8 but in a rare instance they are 1,3,8,9...
Find a way to remove certain "blacklist" products. All products generally have a 3 part code that they are identified as. There are certain 3 part codes I do not want included and I want to be able to update this as time goes on. Ideally, I'd like to make a separate sheet with these codes and if they match to anything from the export, they aren't copied over...
Some product codes have 5 characters instead of 3, I'd like these ones to be coped in the same list but added to a separate list (Unsure if this is possible?)
Update:
Have worked out how to get the code to bring the 4 columns I want regardless of their order over.
Set PPSExport = Range("A2:M2")
For Each cell In PPSExport
If cell.Value = "Asset" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("A:A")
End If
If cell.Value = "Quantity" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("B:B")
End If
If cell.Value = "Market value" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("C:C")
End If
If cell.Value = "Portfolio weight %" Then
cell.EntireColumn.Copy
ActiveSheet.Paste Destination:=Worksheets("Output Sheet").Range("D:D")
End If
Next cell
Sheets("Output Sheet").Select
End With
Thanks for any help,
I've already learnt so much already -- any pointers would be greatly appreciated :D
Yes, all of these things are possible. I will give a brief description on how to accomplish all of these things, but I recommend that you try to research how to do each of these tasks on your own before asking another question(s). It's also a good idea to keep the scope of your question limited. For example, you are asking about 3 loosely related items here. Yes, they are related to one another via your project, but in the general world of VBA programming, they are not. This will keep the conversation focused and easier to follow.
Find a better way to copy the columns over to the new sheet.
You made a great observation: your data is imported with headers. And your proposal is possible. You can certainly use the headers of a range (or table) to copy data. You could:
Iterate through all cells in the header row
If you come across one you are interested it, copy all of the data in that column to the new sheet
If you come across a column header you are not interested in, just skip it and move to the next one
Find a way to remove certain "blacklist" products.
This is possible, and your proposed solution sounds ideal to me. Keep a record of all blacklist values in a sheet, and reference that list when necessary.
Some product codes have 5 characters instead of 3, I'd like these ones
to be coped in the same list but added to a separate list
Certainly possible. Once you have your data:
Iterate through all of it and check how many characters are in the value
If there are 5, copy that data to a new location or store it somewhere
If there are not 5, move on to the next value

RemoveDuplicates is not working as expected

I have a large data set that is exported from a website. I use a macro in my main ‘filter’ workbook to find the file and copy the data from Sheet1 of the exported file into Sheet1 of the filter workbook.
Once the data is copied into Sheet1 of the filter workbook, I use VBA to copy columns A/B/D/F/H/Z/AA/etc from Sheet 1 of the filter workbook into Sheet2 of the filter workbook AND also at the same time, I use this code here to attempt to delete any duplicate rows:
Worksheets("Sheet2").Range("A:DZ").RemoveDuplicates Columns:=15, Header:=xlYes
I am finding though that the RemoveDuplicates is not working as expected.
As an example, Sheet1 in the filter workbook (and export workbook) has 3344 rows. When I manually filter using conditional formatting to highlight duplicates, I can find 314 rows listed as duplicates (meaning 157 genuine rows of actual data and 157 rows which are duplicates of that actual data. I haven’t found any examples of duplicates existing more than one time each). So on Sheet2 I was expecting to see 3344 – 157 = 3157 Rows of real data. I don’t see that, or even 3030 rows (3344-314). Instead, I am getting 1897 rows paste into Sheet2, which is a difference of 1447 rows (1290 less rows than expected).
On top of that, I am manually checking the data to see what is up by using Control-F in the column and am finding that in some instances that both of the two duplicated items are missing from Sheet2 (versus it just deleting the one duplicate row).
The duplicate data is not on sequential rows and is scattered throughout the column in Sheet2. But when I sort it before I attempt to DeleteDuplicates, it does not seem to impact its accuracy or make it right.
I have also tried using the DeleteDuplicates in different locations of the code / at different times but the results are always off by the same amount (1447 missing rows instead of 157 missing rows).
I found only a few articles on this site, this one was the closest but not quiet my issue: Delete Rows With Duplicate Data VBA
And other help sites/forums mention there was some bug with office 2007 that prevents this from working (am using 2013).
Does anyone know where I can find a work around, or if one exists - or if this is still a real bug or just a problem with the code line I have above.
Adding bits of code I use in this example in case it is something within these that is causing the problem…
Part of the copy code:
wsFromSheet.Cells.Copy
wsToFile.Range("A1").PasteSpecial xlPasteValuesAndNumberFormats
Application.CutCopyMode = False
wbFromFile.Close True
Part of the ‘paste column code’:
Sheets("Sheet2").Rows(1 & ":" & Sheets("Sheet2").Columns.Count).ClearContents 'Clear from row 1 down
LastRowFromSiteTracker = xSht.Cells.SpecialCells(xlCellTypeLastCell).Row 'original report has 128 columns
xSht.Range("B1:B" & LastRowFromSiteTracker).Copy ySht.Cells(Rows.Count, "A").End(xlUp) 'customer name
‘repeat a bunch of times, then…
Application.CutCopyMode = False 'do I need this line?
Worksheets("Sheet2").Range("A:DZ").RemoveDuplicates Columns:=15, Header:=xlYes
End Sub
Example/sample of data:
Row Source Data Expected Data Actual Data
1 1000474608 1000474608 1000474608 (Dup missing from sheet2)
2 1000474608 1000487672 1000487672
3 1000487672 1000487674 1000487674
4 1000487674 1000487676 1000487676 (missing from sheet2, wasn’t a dup)
5 1000487676 1000487678 1000487678
6 1000487678 1000487680 1000487680
7 1000487680 1000487682 1000487682 (Dup missing from sheet2)
8 1000487682 1000520278 1000520278
9 1000487682 1000520280 1000520280
10 1000520278 1000520282 1000520282 (Is there)
11 1000520280 1000520286 1000520286
12 1000520282 1000520336 1000520336 (Is there)
13 1000520282 1000520338 1000520338
14 1000520286 1000520392 1000520392
15 1000520286 1000520394 1000520394
16 1000520336 1000530333 1000530333
17 1000520338
18 1000520392
19 1000520394
20 1000530333
EDIT: EDIT: EDIT:
So I've tried to do some more manual testing, and tried two separate things with the same set of data, getting two different results. I used the Conditional Formatting - Highlight Duplicates from the main Home ribbon and the Remove Duplicates from the Data ribbon.
The Remove Duplicates button finds and removed 163 items in Column P and leaves 3181 rows.
However, when I use the Highlight Duplicates conditional format finds 314 items that are duplicated within Column P, leaving 3030 non duplicates.
It does not make sense that these two numbers do not match. I thought it has something to do with the duplicates themselves - as most duplicated items have only one dup of itself (123123 shows up in two rows) but then just a small handful of rows are duplicated multiple times (234234 shows up in 4 or 6 columns).
So instead of using the manual way, I used the suggestions I've found online, and both of these also provide differing results when run:
3344 Base records
1897 left after scrub of duplicates (1446 removed)
Dim tmpAddress As String
tmpAddress = "A2:BZ" & Worksheets("ColScrub").UsedRange.Rows.Count
Worksheets("ColScrub").Range(tmpAddress).RemoveDuplicates Columns:=15, Header:=xlNo
3181 left after scrub of duplicates (162 removed)
Cells.Select
ActiveSheet.Range("$A$1:$EI$3345").RemoveDuplicates Columns:=31, Header:=xlYes
My further experience now shows that UsedRange is completely unreliable if you have blank rows or columns. UsedRange only includes rows/columns up to the blank one. I have found a better way to get the last of each. These function use 2 basic assumptions, which should hold true for mostof your spreadsheets.
For LastRow there is a "key" column, i.e. a column where the MUST be
data, for example an ID column
For LastCol there should be a header row (or row where you can guarantee the last column is filled)
With this in mind, I have created the following 2 functions retrieve the last values accurately, every time ... well, almost (my complete function handles issues of the footer rows with merged cells)
For the last row
Public Function Excel_GetLastRow(xlSheet As Excel.Worksheet, _
ByVal KeyColumn As Long) As Long
' This could be adjusted for exact max rows Excel allows
Const MAX_XL_ROWS As Long = 1048000
Excel_GetLastRow = xlSheet.Cells(MAX_XL_ROWS, KeyColumn).End(xlUp).row
End Function
And for last column
Public Function Excel_GetLastCol(xlSheet As Excel.Worksheet, _
ByVal HeaderRow As Long) As Long
' This could be adjusted for exact max columns Excel allows
Const MAX_XL_COLS As Long = 16000
Excel_GetLastCol = xlSheet.Cells(MAX_XL_COLS, HeaderRow).End(xlToLeft).Column
End Function
Using these values you can now set your complete data range successfully.
top left = Cells(HeaderRow + 1, 1)
bottom right = Cells(LastRow, LastCol)
My complete functions include error handling and allowances for possible merged cells in the footer row and last header column, but you get the idea.
Art
1) you are only clearing as many rows as your have columns, not rows
Also, you may not be clearing anything, so use the UsedRange.Rows for proper rowcount
This line ...
Sheets("Sheet2").Rows(1 & ":" & Sheets("Sheet2").Columns.Count).ClearContents
Should read ...
Sheets("Sheet2").Rows(1 & ":" & Sheets("Sheet2").UsedRange.Rows.Count).ClearContents
Without properly clearing the old data, unpredictable results may occur.
2) Excel VBA seems rather quirky in that many things won't work correctly without specifically "selecting" the object(s) in question AND specifiying complete (not columnar) ranges
3) I also prefer to leave out the header row (note the "A2") and pass Header:=xlNo
4) Unless you have more than 625 columns, BZ should be far enough
So add this to your code ...
Dim tmpAddress as String
tmpAddress = "A2:BZ" & Worksheets("Sheet2").UsedRange.Rows.Count
Worksheets("Sheet2").Activate
Worksheets("Sheet2").Range(tmpAddress).RemoveDuplicates Columns:=15, Header:=xlNo
Hope this helps :)
I don't know why (or if/how) this is any different, but this seems to be the closest I can get to true removal of duplicates. I wanted to add it here as an answer for others in similar situations.
Dim lastrow As Long
With ThisWorkbook.Worksheets("ColScrub")
If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
lastrow = .Cells.Find(What:="*", _
After:=.Range("A1"), _
Lookat:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row 'Change P1 back to A1 if needed
Else
lastrow = 1
End If
.Range("A1:AZ" & lastrow).RemoveDuplicates Columns:=Array(16), Header:=xlYes
End With
I have to go through each row visually to prove this works I think, and to rule out that it isn't deleting things that should not be deleted - but this seems to get rid of the 'double duplicates' (4-6 lines items instead of 2 like the other duplicates).

Macro to Group Rows Throughout Excel Document

I have looked around and I feel like I am going crazy for not understanding how to do this, or what to do. It seems simple and yet I cannot figure out the best method.
I have an excel document that has 8 rows of data and that is supported by individual data from individuals that is 16 rows long. In total, there are 600 individuals in the dataset.
I was trying to locate a macro that would simply allow me to group every 16 rows in my excel sheet together. Whatever I have tried though, has not worked.
I am using Microsoft Excel 20116 for mac.
Range("1:16").Select
Selection.Rows.Group
Range("17:32").Select
Selection.Rows.Group
Repeat!
If there is a predictable data pattern (i.e. always exactly 16 rows) you might wish to put this inside a loop, where Range.Select is offset a further 16 rows each time.
For example:
i = 1
j = 16
While i < "YOUR UPPER LIMIT"
Range(i & ":" & j).Select
Selection.Rows.Group
i = i+16
j = j+16
Next
Note that if you group contiguous rows without a break, your groupings will automatically combine. You will need to play around with the line Range(i & ":" & j).Select to either use i+1 or j-1 depending on what you want as your display row on the grouping.

vba excel macro using too much memory (select.insert seems to be the cause)

I am struggling with a Macro that cannot get to the end because of too much memory usage.
I have a specific line that seems to be responsible for this:
Selection.Insert Shift:=xlToRight
This is within a while loop and an if/else statement like this:
While SearchRow < lastRow
If StrComp(...) = 0 Then
...
End If
If StrComp(...) = 0 Then
...
End If
If StrComp(...) = 0 _
Or StrComp(...) = 0 _
Or StrComp(...) = 0 Then
ActiveSheet.Cells(SearchRow, 1).EntireRow.Delete
Else
ThisWorkbook.Sheets(...).Cells(SearchRow, 1).Select
Selection.Insert Shift:=xlToRight
End If
Wend
I have runned the macro in debug lineby line and so when I get into the If/Or block then to the EndIf I see the memory usage in the task manager of windows jumping.
Do you have any explanation/suggestion on this ?
Many thanks for your help.
BR,
Massimo
Before I would even struggle with finding an elegant solution I would start off with optimizing the macro for refreshing formulas and the screen:
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Please do not forget to turn them back to the original values when you are done with the sub.
This should prevent any recalculations during your insertion and prevent the screen from unnecessarily updating itself. This might actually solve your problem right away.
Now for an elegant solution we would need to see your whole code and see what you are struggling to achieve as most probably you are moving a lot of columns with data and formulas hence Excel has to process a lot in memory to move them to the right.
W/o seeing your code I would probably:
Think or rewriting it to prevent any Shift
Modify the code to do the insertions rather from the right to left than the other way round (to limit the number of columns needed to be moved)
Insert upfront many columns and hide them. Then in the code, when needed instead of inserting them you would unhide one of the hidden columns - most effective

Excel ran out of resources while attempting to calculate one or more formulas

I have a workbook to do 'smart'-graphs on my expenses. It's been running for a year and there are now a lot of graphs and expenses. Excel now throws an out-of-resources error whenever I change anything or open the workbook. Thing is, I have lots of resources and its not using hardly any of them.
Win8 64bit w/ 8 core CPU and 32GB of ram
Office 2013 64bit
I have 2 sheets, the first sheet called Expenses has 3 columns [Date,Description,Amount] and about 1500 rows of data. The second sheet has a LOT (500 or so) of formulas that are all the same and aim to do "Sum all expenses between date X and Y where description matches -some needle-". The formula I have is this:
=
ABS(
SUMPRODUCT(
--(Expenses!A:A >= DATE(2011,12,1)),
--(Expenses!A:A < DATE(2012,1,1)),
--(ISNUMBER(FIND(C50,Expenses!B:B))),
Expenses!C:C
)
)
Can I give Excel more resources? (I'm happy for it to use all my ram, and chug my CPU for a few minutes).
Is there a more efficient way I can do this formula?
I understand that this formula is creating a large grid and masking my expenses list with it, and that for each formula this grid has to get created. Should I create a macro to do this more efficiently instead? If I had a macro, I would want to call it from a cell somehow like
=sumExpenses(<startDate>, <endDate>, <needle>)
Is that possible?
Thanks.
I had a similar problem where there were a few array formulas down about 150 rows and I got this error, which really baffled me because there really aren't that many formulas to calculate. I contacted our IT guy and he explained the following, some of which I understand, most of which I don't:
Generally when the computer tries to process large amounts of data, it uses multi-threaded calculation, where it uses all 8 processors that the computer tricks itself into thinking it has. When multi-threaded calculation is turned off, the computer doesn't throw the 'Excel ran out of resources...' error.
To turn off multi-threaded calculation, got to the 'File' tab in your Excel workbook and select 'Options'. On the right side of the box that appears select 'Advanced' and scroll down to the heading 'Formulas'. Under that heading is a check box that says 'Enable multi-threaded calculation'. Untick it, then select 'OK' and recalculate your formulas.
I had a go at creating a function that hopefully replicates what your current equation does in VBA with a few differences. Since I don't know the specifics of your second sheet the caching might not help at all.
If your second sheet uses the same date range for all calls to sumExpenses then it should be a bit quicker as it pre-sums everything on the first pass, If your date range changes throughout then its just doing a lot of work for nothing.
Public Cache As Object
Public CacheKey As String
Public Function sumExpenses(ByVal dS As Date, ByVal dE As Date, ByVal sN As String) As Variant
Dim Key As String
Key = Day(dS) & "-" & Month(dS) & "-" & Year(dS) & "_" & Day(dE) & "-" & Month(dE) & "-" & Year(dE)
If CacheKey = Key Then
If Not Cache Is Nothing Then
If Cache.Exists(sN) Then
sumExpenses = Cache(sN)
Exit Function
End If
Set Cache = Nothing
End If
End If
CacheKey = Key
Set Cache = CreateObject("Scripting.Dictionary")
Dim Expenses As Worksheet
Dim Row As Integer
Dim Item As String
Set Expenses = ThisWorkbook.Worksheets("Expenses")
Row = 1
While (Not Expenses.Cells(Row, 1) = "")
If Expenses.Cells(Row, 1).Value > dS And Expenses.Cells(Row, 1).Value < dE Then
Item = Expenses.Cells(Row, 2).Value
If Cache.Exists(Item) Then
Cache(Item) = Cache(Item) + Expenses.Cells(Row, 3).Value
Else
Cache.Add Item, Expenses.Cells(Row, 3).Value
End If
End If
Row = Row + 1
Wend
If Cache.Exists(sN) Then
sumExpenses = Cache(sN)
Else
sumExpenses = CVErr(xlErrNA)
End If
End Function
Public Sub resetCache()
Set Cache = Nothing
CacheKey = ""
End Sub
There could be many causes of this. I just wish Excel would tell us which one (or more) of the 'usual suspects' is committing the offence of RAM hogging at this time.
Also look for
Circular references
Fragmented Conditional formatting (caused by cutting, pasting, sorting, deleting and adding cells or rows.
Errors resulting in #N/A, #REF, #DIV/0! etc,
Over-use of the volatile functions TODAY(), NOW(), etc.
Too many different formats used
... in that order
While you're there, check for
Broken links. A formula relying on a fresh value from external data could return an error.
Any formulas containing #REF!. If your formulas are that messed these may well be present also. They will not cause an error flag but may cause some unreported errors. If your formulas are satisfied by an earlier condition the part of the formula containing #REF! will not be evaluated until other conditions prevail.
Fragmented conditional formatting was the case for me.
Older versions of the same workbook did not have an issue. Today, I cut/pasted many cells and the issue started occurring.
Removing the columns where I was cutting/pasting resolved the issue for me.
This is difficult to diagnose since conditional formatting does not immediately standout like normal formulas.

Resources