I am trying to find a method to loop through rows in a named table, copying each row over to another table and adding a value in a blank field on the end of each row which sequences the dates between a date span.
I came across code which can separate a date span successfully into rows, but have been having trouble creating a loop to go through each row of data and copying the rest over.
Example of data from table (w/ headers):
Table Name: TblOGCalendar
Sheet Name: OGCalendarData
Should be copied over to look like the following:
Table Name: TblR2Calendar
Sheet Name: R2CalendarData
This also has implications for another project that I am working on, in which they are wanting staff hours tracked and projected for Project work.
This is by no means a complete answer; it is meant to help you learn the techniques involved.
Loops through Excel data can be done in various ways, some good and some bad. Which one to use depends on your skill level and comfort level with working with code.
I'm only going to outline two methods.
Method #1 - Looping through the rows/columns themselves. I don't like this method, as it's bad practice: interacting with the application's objects is a performance killer.
Dim rng As Range, rcell As Range
' you have to tell the compiler where stuff is at
' this is important, and a common mistake that causes questions here on SO
Set rng = ThisWorkbook.Worksheets("Yoursheetname").Range("yourrange")
For Each rcell In rng.Cells
    ' rcell is the current cell in the range you're looping through.
    ' Will physically loop through the cells row by row: left to right, then top to bottom
    ' do some processing.
Next rcell
Method #2 - Working in memory with arrays. This is the preferred method and the one you should get good at if you plan on using Excel VBA more often in the future.
Dim arr As Variant ' you need a Variant for dumping a sheet range into an array
Dim i As Long, j As Long
arr = ThisWorkbook.Worksheets("yoursheet").UsedRange.Value
' there are many ways to get the desired range of a sheet - pick your favorite (after researching), and use that.
' the UBound and LBound functions literally mean upper and lower bound. The loop basically says: for i equal to the beginning of the array dimension through to the last.
' the number dictates which dimension of the array you want to loop through. Arrays read from Excel ranges are two-dimensional by default. 1 = rows, 2 = columns
For i = LBound(arr, 1) To UBound(arr, 1)
    For j = LBound(arr, 2) To UBound(arr, 2)
        ' do some processing
        ' array values can be accessed in these ways:
        ' arr(i, j)
        ' arr(i, x) x being a number, e.g. if you know you want column 7 of the current iteration/row
        ' arr(i + 1, j - 1) positive offsets move down or to the right (depending on dimension), while negative offsets move up or left
    Next j
Next i
' to put stuff back on your sheet after processing (the target range must be the same size as arr)
ThisWorkbook.Worksheets("yoursheet").Range("yourrange").Value = arr
This should get you going on figuring things out for yourself.
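Applied to the question, method #2 might look roughly like the sketch below. The table and sheet names come from the question. The positions of the start-date and end-date columns (START_COL, END_COL) are assumptions, since the sample data is not shown. The sketch also assumes that TblR2Calendar has one extra column at the end for the sequenced date and that the dates are whole days without a time part.

```vba
Sub ExpandDateSpans()
    ' Assumed (hypothetical) column positions inside TblOGCalendar
    Const START_COL As Long = 2
    Const END_COL As Long = 3

    Dim src As ListObject, dst As ListObject
    Dim data As Variant, result() As Variant
    Dim i As Long, j As Long, r As Long
    Dim d As Long, total As Long, nCols As Long

    Set src = ThisWorkbook.Worksheets("OGCalendarData").ListObjects("TblOGCalendar")
    Set dst = ThisWorkbook.Worksheets("R2CalendarData").ListObjects("TblR2Calendar")
    If src.DataBodyRange Is Nothing Then Exit Sub   ' source table has no rows

    data = src.DataBodyRange.Value                  ' read the whole table once
    nCols = UBound(data, 2)

    ' First pass: count how many output rows the date spans produce
    For i = 1 To UBound(data, 1)
        If CLng(data(i, END_COL)) >= CLng(data(i, START_COL)) Then
            total = total + CLng(data(i, END_COL)) - CLng(data(i, START_COL)) + 1
        End If
    Next i
    If total = 0 Then Exit Sub

    ' Second pass: copy each row once per date; the date goes in the extra last column
    ReDim result(1 To total, 1 To nCols + 1)
    For i = 1 To UBound(data, 1)
        For d = CLng(data(i, START_COL)) To CLng(data(i, END_COL))
            r = r + 1
            For j = 1 To nCols
                result(r, j) = data(i, j)
            Next j
            result(r, nCols + 1) = CDate(d)
        Next d
    Next i

    ' Resize the target table to fit, then write everything in one go
    If Not dst.DataBodyRange Is Nothing Then dst.DataBodyRange.ClearContents
    dst.Resize dst.HeaderRowRange.Resize(total + 1)
    dst.HeaderRowRange.Offset(1).Resize(total, nCols + 1).Value = result
End Sub
```

The two passes exist only so the output array can be sized once. The sheet is touched once for reading and once for writing, which is where the speed comes from.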
Related
My goal is to use parts of several sheets to compute basic stuff of each of the selected rows in those sheets.
The prior way to do this was to create a row with formulas manually. Let's say I have sheets: "Input1", "TranslateTable", "Calculations". Then row 1 in "Calculations" would have in cell A1 Input1!A1*2, B1 would have Input1!B1 / TranslateTable!D1, et cetera. The formula row would be autofilled using VBA, incrementing the rows used. This incrementing would work as all relevant rows would be sorted to the top rows in the input sheet using the VBA. Each month, all the data would be removed except for the row with the formulas, and it would be autofilled again with the amount of relevant input rows that month.
I didn't make this, and I think it's bad practice to change input data only to use it for another sheet.
Moreover, a new sheet "Input2" was added, making this method not usable. Technically, there need to be two slightly different rows autofilled, with the Input2 calculations being right under the Input1 calculations.
I am thinking of ways to make this method more robust. Coming from Python, I'm used to calculations being very quick. So I figured I would remove the formula rows and recreate the calculations directly with VBA. While it's working, it's extremely slow.
As I don't have much experience yet, I might not do things properly. Right now (example code), I have
Sub Calculations()
    Dim row_num As Long
    For row_num = 2 To 20
        Sheets("Calculations").Range("A" & row_num).Value = Sheets("Input1").Range("B" & row_num).Value * 5
        ' and then here is the next calculation for the next cell in the row
        ' again, et cetera
    Next row_num
End Sub
Is this the right way to write this loop? I can imagine there is a more efficient way (I know in Python there is).
I have thought of several options
try to make the loop more efficient
create rows of formula in the sheet and use vba to autofill (but then more robust than before)
Create formulas using VBA and then to autofill.
All help is greatly appreciated!
update: based on Bankeris' answer, I would get the following using arrays:
Dim RangeArray As Variant
RangeArray = Sheets("Input1").Range("A1:AZ10").Value
'and then loop calculations like this
'(the loop stops at the last row actually read into the array)
For row_num = 2 To UBound(RangeArray, 1)
    'my output table starts at a different row
    output_row = row_num + 2
    ValueArrayB1Cell = RangeArray(row_num, 2)
    Range("A" & output_row) = ValueArrayB1Cell * 2
Next row_num
Looping through Excel cells is a very slow thing :)
Probably the best way to solve this is to calculate everything in memory from an array: read everything into an array, loop and calculate, and then write the result back to Excel.
Dim RangeArray As Variant
RangeArray = Sheets("Sheet1").Range("A1:B5").Value
ValueArrayA1Cell = RangeArray(1, 1)
ValueArrayB2Cell = RangeArray(2, 2)
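To make the "loop and calc, then write the result" part concrete, here is a minimal sketch based on the question's own example (Input1 column B multiplied by 5, written to Calculations column A). The procedure name and the fixed row range 2 to 20 are illustrative assumptions taken from the example code, not a definitive implementation.

```vba
Sub CalculationsInMemory()
    Dim inputData As Variant, result() As Variant
    Dim i As Long, n As Long

    ' One read from the sheet: a 2-D, 1-based array with a single column
    inputData = ThisWorkbook.Worksheets("Input1").Range("B2:B20").Value
    n = UBound(inputData, 1)

    ReDim result(1 To n, 1 To 1)
    For i = 1 To n
        ' Skip text and error values; empty cells count as 0
        If IsNumeric(inputData(i, 1)) Then
            result(i, 1) = inputData(i, 1) * 5
        End If
    Next i

    ' One write back to the sheet
    ThisWorkbook.Worksheets("Calculations").Range("A2").Resize(n, 1).Value = result
End Sub
```

Further calculated columns can be added by widening the result array (for example `1 To 3` in the second dimension) and resizing the output range to match.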
What ways are there to test an Excel VBA range variable for references to entire columns?
I'm using Excel 2007 VBA, iterating through Range variables with For-Each loops. The ranges are passed into the function as parameters. References to individual cells, ranges of cells, and entire rows are fine.
For instance, these are okiedokie:
Range("A1") 'One cell
Range("A1:D4") 'Range of cells.
Range("10:20") 'Entire rows 10 through 20.
But if any of the ranges have references to entire columns, it will drag the function down to a screeching halt. For instance, these are not okiedokie, and they need to be tested for and avoided:
Range("A:A")
Range("A:Z")
Range("AA:ZZ")
There are a few ways I've thought of to do this, each of them plausible but with weaknesses. The code contains loops which are used for searching through cells in worksheets with many thousands of rows, so speed is critical.
Here are three ways I can think of, but I'd like to know if there are others.
The simplest and fastest method is to count the rows. If Range(x).Rows.Count=1048576, that's the maximum number of rows in a worksheet. However, this wouldn't work if the actual number of rows turned out to be exactly that number, or if by some wild chance there were multiple overlapping areas/ranges that all added up to that number. Both unlikely, but possible. Also, if the version of Excel changes, so might that number, thus rendering the code broken.
Use a RegEx match against the text of Range.Address(False,False) with a pattern such as ([A-Z]{1,3}):([A-Z]{1,3}). I think this would be a medium on the speed scale.
Use VBA loops, If-Then, and string functions such as InStr() and Mid() to pick at the text of Range.Address(False,False). I think this would be the slowest possible way to do it.
You could test if the range is a reference to a column by checking the Range.Address against the Range.EntireColumn.Address like this:
If Range("AA:ZZ").Address = Range("AA:ZZ").EntireColumn.Address Then
'This returns True
End If
If Range("AA1:ZZ4").Address = Range("AA1:ZZ4").EntireColumn.Address Then
'This returns False
End If
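A variation on the row-count idea from the question, sketched under the assumption that any area as tall as the worksheet should be rejected: comparing against the worksheet's own Rows.Count avoids hard-coding 1048576, and checking each area separately handles multi-area ranges. The function name is hypothetical.

```vba
Function HasEntireColumn(ByVal target As Range) As Boolean
    Dim area As Range
    For Each area In target.Areas
        ' An area as tall as the sheet spans whole columns,
        ' whatever the row limit of this Excel version is
        If area.Rows.Count = area.Worksheet.Rows.Count Then
            HasEntireColumn = True
            Exit Function
        End If
    Next area
End Function
```

Note that this also flags a range such as A1:A1048576, which is not written as a column reference but is just as expensive to loop over.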
Not sure I understand the question completely but this might work for you:
Public Sub Test()
    Debug.Print RowCheck(ThisWorkbook.Worksheets("Sheet1").Range("A1:A10"))
End Sub
Public Function RowCheck(InputRange As Range) As String
    Dim u As Long 'used number of rows
    Dim x As Long 'max number of rows for any column
    Dim r As Long 'number of rows based on input range
    With InputRange
        'qualify Cells and Rows with the range's own worksheet, not the active sheet
        u = .Worksheet.Cells(.Worksheet.Rows.Count, .Columns(1).Column).End(xlUp).Row
        r = .Rows.Count
        x = .Worksheet.Rows.Count
    End With
    If r = x And u < r Then
        RowCheck = "A bad column reference provided"
    Else
        RowCheck = "This is a valid reference"
    End If
End Function
Ok, after reading everyone's suggestions, I realized that no matter what I do, any Range objects passed to my function might include either an entire column reference or any combination of overlapping Range references that result in an entire column being selected.
But in translation, that means...all rows in the data, aka the UsedRange. It's possible with a large amount of data the UsedRange may actually hit the last row at 1048576. And any combination of Range references passed to my Function might result in a huge area that does cover an entire column, all the way to the maximum row.
Of course the likelihood of that happening is very low, but I do like to cover all bases in my code. But the key to this puzzle is UsedRange. This creates a "synthetic maximum last row". If the GrandRange, for lack of a better name, covers all rows in the UsedRange, then my function has nothing to do and no data to return. And so a simple IF-Then-Exit should give me the solution I was looking for:
If Intersect(UsedRange,LeGrandeRange).Rows.Count = UsedRange.Rows.Count Then
'All rows in `UsedRange` are affected.
'Nothing to do.
Exit Function
Else
'Do everything here.
'Then exit normally.
...
...
...
End If
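A related option, not the approach above but built on the same UsedRange idea, is to clip whatever range is passed in to the worksheet's UsedRange before looping, so that an entire-column reference only costs as many cells as are actually in use. This is a sketch; the function and procedure names are hypothetical. Note that Intersect returns Nothing when the ranges do not overlap, so the result has to be checked before use.

```vba
Function ClipToUsedRange(ByVal target As Range) As Range
    ' Intersect returns Nothing when the two ranges do not overlap
    Set ClipToUsedRange = Application.Intersect(target, target.Worksheet.UsedRange)
End Function

Sub DemoClip()
    Dim clipped As Range, c As Range
    Set clipped = ClipToUsedRange(ActiveSheet.Range("A:A"))
    If clipped Is Nothing Then Exit Sub   ' nothing in use inside the range
    For Each c In clipped.Cells
        ' process c here
    Next c
End Sub
```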
I have tried using this formula field and copying to all >100k records I have.
=IF(SUMPRODUCT(--EXACT(A2,$B$1:B1)),"",A2)
where:
column A = column with all data including duplicates
column B = column to display data (from column A) if unique otherwise empty string
However I hit this issue:
Yes my Excel 2016 is 32bit and yes my laptop is only 8GB RAM. But I have read up that people with 64bit and 16GB RAM experienced the same error as me.
I know there is a built-in feature in Excel: Data > Select Column(s) > Remove Duplicates. However, this feature compares values case-INSENSITIVELY, so it would also remove entries that differ only in case.
Please advise me how I can overcome this issue. I am open to using stuff like Crystal Reports or some sort of freeware to solve this issue. Please advise.
You may try something like this.
Before trying this backup your data.
The code below will remove the duplicates from the column A and it is case sensitive.
Sub GetUniqueValues()
Dim x, dict
Dim lr As Long
lr = Cells(Rows.Count, 1).End(xlUp).Row
x = Range("A2:A" & lr).Value
Set dict = CreateObject("Scripting.Dictionary")
For i = 1 To UBound(x, 1)
dict.Item(x(i, 1)) = ""
Next i
Range("A2:A" & lr).ClearContents
Range("A2").Resize(dict.Count).Value = Application.Transpose(dict.keys)
End Sub
Edited Code:
Sub GetUniqueValues()
Dim x, dict, y
Dim lr As Long
Application.ScreenUpdating = False
lr = Cells(Rows.Count, 1).End(xlUp).Row
x = Range("A2:A" & lr).Value
Set dict = CreateObject("Scripting.Dictionary")
For i = 1 To UBound(x, 1)
dict.Item(x(i, 1)) = ""
Next i
ReDim y(1 To dict.Count, 1 To 1)
i = 0
For Each it In dict.keys
i = i + 1
y(i, 1) = it
Next it
Range("A2:A" & lr).ClearContents
Range("A2").Resize(dict.Count).Value = y
Application.ScreenUpdating = True
End Sub
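If the layout from the question should be kept (column B showing the value from column A at its first occurrence and an empty string otherwise), the same dictionary idea can be adapted. The following is a sketch under the assumptions that the data starts in A2 of the active sheet and contains no error values; the procedure name is hypothetical.

```vba
Sub MarkFirstOccurrences()
    Dim src As Variant, result() As Variant
    Dim dict As Object
    Dim lr As Long, i As Long

    With ActiveSheet
        lr = .Cells(.Rows.Count, 1).End(xlUp).Row
        If lr < 3 Then Exit Sub   ' need two or more data rows so .Value returns an array
        src = .Range("A2:A" & lr).Value

        Set dict = CreateObject("Scripting.Dictionary")
        dict.CompareMode = vbBinaryCompare   ' case-sensitive keys (this is the default)

        ReDim result(1 To UBound(src, 1), 1 To 1)
        For i = 1 To UBound(src, 1)
            If dict.Exists(src(i, 1)) Then
                result(i, 1) = ""            ' duplicate: leave the cell empty
            Else
                dict.Add src(i, 1), True     ' first occurrence: remember and show it
                result(i, 1) = src(i, 1)
            End If
        Next i

        .Range("B2").Resize(UBound(src, 1), 1).Value = result
    End With
End Sub
```

Each value is looked up once in the dictionary, so the work grows roughly in proportion to the number of rows rather than with its square as in the SUMPRODUCT formula.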
Using the power of sorting to get unique values. No libraries required. It can easily be converted to select complete rows:
Sub GetUniqueValues()
'Sort once so we can run through the list without nested loops
Sheet1.Range("$A:$A").Sort Key1:=Sheet1.Range("$A:$A"), Header:=xlYes, MatchCase:=True
count = Application.WorksheetFunction.CountA(Sheet1.Range("$A:$A"))
LastCell = 1
For i = 2 To count
If Sheet1.Cells(i, 1).Value = Sheet1.Cells(LastCell, 1).Value Then
'Remove second/third/fourth occurrences
Sheet1.Cells(i, 1).Clear
Else
'If its first occurrence of this value, make a note of its position
LastCell = i
End If
Next
'Sort again to move the cells emptied out to the bottom
Sheet1.Range("$A:$A").Sort Key1:=Sheet1.Range("$A:$A"), Header:=xlYes, MatchCase:=True
End Sub
For a general solution, the VBA approach already suggested is probably preferable. But for something that only has to work once, you can probably make it work the way you intended with only a little bit of adaptation in how you apply =IF(SUMPRODUCT(--EXACT(A2,$B$1:B1)),"",A2). I also tried to use a COUNTIF algorithm, which is much faster than SUMPRODUCT, but that's not case sensitive.
Since I am also running 32-bit Excel with 8GB memory I was curious to see if I could replicate the memory issue. I generated a list of 100,000 random 5-letter strings in column A. Only 10 letters were used (ABCDEFGHJK), so in 100,000 strings some would occur more than once. I then applied the formula suggested by the OP in column B to filter out only unique values. It did indeed work, but it took quite some time. But I never ran into the memory issue that the OP did.
Proposed solution:
Based on these observations, one possible solution to your particular problem might be to copy column A to a new, temporary workbook and run your SUMPRODUCT formula there while all other workbooks are closed. Once it has finished you could just paste the result back to the original column in the original file. Actually removing the duplicates could be done by simply filtering on that column so that all duplicates (empty cells) are grouped together and then removing those rows. Details of my attempt to replicate can be found below.
SUMPRODUCT: Approximately 1 hour
First I tried the same formula as in the OP, =IF(SUMPRODUCT(--EXACT(A2,$B$1:B1)),"",A2), but doing only 10,000 rows at a time (by inserting empty rows at row 10,000, 20,000 etc. and copying down ten thousand rows at a time). Each set of 10,000 rows took a couple of minutes to complete. When I did the whole shebang as one giant copy operation for all 100,000 cells at once, the operation took around one hour to complete and Excel was unresponsive in the meantime. Memory usage was 1.4 GB and the CPU averaged over 50% capacity (monitored with the Windows Task Manager). I also tried to run the formula when I had already manipulated the data in various ways (thus consuming more memory), which pushed CPU capacity to 100% and caused a couple of crashes. I managed to avoid that by simply closing Excel to clear the memory and running the operation again from a fresh restart with no other workbooks open.
As you can see in the following screenshots, the formula worked and the unique entries become rarer further down the list (as expected since they are random). I assigned 1 to cells containing duplicates so I could count them easily. There were 36,843 such instances.
First rows, no duplicates:
Last rows, mostly duplicates (cells with 1):
COUNTIF: 8.5 minutes
Compared to the SUMPRODUCT algorithm, which took around one hour to complete, the following COUNTIF formula completed the same job in only 8.5 minutes, but it would not distinguish between lower and upper case. This approach requires the use of a helper column. COUNTIF returns the number of times that a particular string has occurred in the range from the top down to the current cell, so every time a string is encountered for the first time, it will return 1. Cell B2 contains =COUNTIF($A$2:$A2,A2), and copying this down for all 100,000 rows took around eight and a half minutes. Then, in a separate column I just used a simple IF formula to filter out the unique values from column A; cell C2 contains =IF(B2=1,A2,1), which returns the string in column A if it is unique; otherwise 1 is returned (to allow easy comparison with SUMPRODUCT). Copying this IF formula down for all 100,000 rows is practically instantaneous. The sum of 1s in column C after this operation was, reassuringly, the same as in the case of SUMPRODUCT, 36,843.
INDEX: Failure
I also played around with an array formula using the INDEX and MATCH functions. This formula does the same job as COUNTIF, but also filters out the empty rows:
=INDEX($A$2:$A$100001,MATCH(0,COUNTIF($E$1:E1,$A$2:$A$100001),0)). This should be entered in cell B2 as an array formula (Ctrl + Shift + Enter) and then copied down. Copying individual cells one at a time worked fine for a few dozen rows, but anything more than that caused Excel to crash. I even tried running this overnight, but the operation never finished. (The formula could be extended to become case sensitive, but I didn't bother to try.)
One thing to note, however, with the failed INDEX formula was that the behavior described above occurred when the formula was applied in a separate workbook. I also tried to run this formula in column D in the same workbook as the COUNTIF formula. Then I did actually run into the memory issue described in the OP, which, unsurprisingly, suggests that the problem with memory depends on the rest of the data in the workbook.
I have a worksheet of athletes whose names appear all in one row. I want to grab all of the data underneath these names (aka the whole columns) so that I can manipulate the data. My issue is that I'm not familiar with available methods or functions in Excel VBA so I have only gone as far as this:
Dim MyArray(0 to 9) as String
MyArray(0) = "Molly"
MyArray(1) = "Jane"
MyArray(2) = "Louis"
MyArray(3) = "Omar"
MyArray(4) = "Wendy"
MyArray(5) = "Greg"
MyArray(6) = "Tina"
MyArray(7) = "Andrew"
MyArray(8) = "Jen"
MyArray(9) = "Lucy"
I'm thinking of creating a script that will look through all the names and select the columns whose names match the values in the Array.
EDIT: I've uploaded an example WS here for reference (please forgive me if this is not according to SO standards, still trying to figure out how this site works and I don't have enough rep to post images :D). I'm interested in manipulating the numbers in the "Total" row, and need to showcase it (along with the specific names that it belongs to). I want to iterate this manipulation over all instances that these names pop up in the WS though, so a loop of sorts would be necessary
Thanks again for the tips/help!
You can loop across the columns. The code below assumes the columns and your array are in the same order and that you are interested in the first 10 rows of each column of sheet 1.
With Worksheets(1)
    For x = 1 To (UBound(MyArray) + 1)
        ' Select column (the sheet must be the active sheet for Select to work)
        .Range(.Cells(1, x), .Cells(10, x)).Select
        ' Do whatever you want with the data
        '
    Next x
End With
You can make this more intelligent using If statements to check whether the first cell in each column matches a name etc. but at least it's a start!
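One way to do that header check, sketched under the assumption that the names sit in row 1 of the first worksheet and the data starts in row 2: look each name up with Application.Match and read the column below it into an array. The procedure and variable names are hypothetical, and only three of the names are listed to keep the example short.

```vba
Sub ProcessNamedColumns()
    Dim athleteNames As Variant, nm As Variant
    Dim ws As Worksheet, pos As Variant
    Dim lastRow As Long, colData As Variant

    athleteNames = Array("Molly", "Jane", "Louis")
    Set ws = ThisWorkbook.Worksheets(1)

    For Each nm In athleteNames
        ' Match returns an error value (not a run-time error) when the name is missing
        pos = Application.Match(nm, ws.Rows(1), 0)
        If Not IsError(pos) Then
            lastRow = ws.Cells(ws.Rows.Count, CLng(pos)).End(xlUp).Row
            If lastRow > 2 Then   ' two or more data rows, so .Value returns an array
                colData = ws.Range(ws.Cells(2, CLng(pos)), ws.Cells(lastRow, CLng(pos))).Value
                Debug.Print nm, "rows read: " & UBound(colData, 1)
            End If
        End If
    Next nm
End Sub
```

Match only finds the first column with a given name. If a name appears several times in the header row, looping over the header cells and comparing each one to the array would be needed instead.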
Problem Description:
I have two sheets, one is a blank report template, and the other contains my data. Within the data set there's a location ID in Column M. Each location ID is formatted as City,State (Division). For example, a warehouse located in Miami would be: "Miami,FL (DIST)" as it is part of the Distribution division.
I would like to import all of the unique values into an array, and then trim the contents so that I end up with an array of just the names of the City.
Superfluous Details:
I thought about doing this in my old "go-to" fashion of advanced filtering the data into a separate sheet, trimming it, and referencing the range, but decided against it.
I would like to learn a bit more about programming with Arrays instead of sheet objects, something that I've avoided since I do not like how VBA handles them vs. other languages that I use where arrays have more dynamic properties.
You may use the WorksheetFunction.Transpose method to copy the data in a column into an array. For a single column it gives you a 1-based, one-dimensional array. If you have multiple columns, the method gives you a two-dimensional array.
Dim arrayV as Variant
arrayV = WorkSheetFunction.Transpose(Sheets(1).Range("A2:A20").Value)
To find the last used row in this range, use the following code, which keeps trailing empty cells from being populated into the array:
Dim LastRow as Long
LastRow = Sheets(1).Cells(Sheets(1).Rows.Count, _
Sheets(1).Range("A2:A20").Column).End(xlUp).Row
arrayV = WorkSheetFunction.Transpose(Sheets(1).Range("A2:A20").Resize(LastRow-1).Value)
Next, to get the unique values, you may use a dictionary object as it will only hold unique items.
Dim dc as Object
Set dc = CreateObject("Scripting.Dictionary")
For i = Lbound(arrayV) to Ubound(arrayV)
If Not dc.Exists(arrayV(i)) Then
dc.Add arrayV(i), i
End If
Next i
'--output to Sheet or do whatever you want with this
dc.Keys() '-- gives you an array with the unique values
Set dc = Nothing
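The trimming step from the question (reducing "Miami,FL (DIST)" to "Miami") can be folded into the dictionary loop. The sketch below assumes the location IDs are in column M with a header in M1, and that everything before the first comma is the city name. Both function names are hypothetical.

```vba
' Returns the city part of an ID such as "Miami,FL (DIST)"
Function CityFromLocation(ByVal locationId As String) As String
    Dim commaPos As Long
    commaPos = InStr(1, locationId, ",")
    If commaPos > 0 Then
        CityFromLocation = Trim$(Left$(locationId, commaPos - 1))
    Else
        CityFromLocation = Trim$(locationId)   ' no comma: keep the whole text
    End If
End Function

' Returns a zero-based array of the unique city names found in column M
Function UniqueCities(ByVal ws As Worksheet) As Variant
    Dim raw As Variant, dc As Object
    Dim lastRow As Long, i As Long, city As String

    Set dc = CreateObject("Scripting.Dictionary")
    lastRow = ws.Cells(ws.Rows.Count, "M").End(xlUp).Row
    If lastRow >= 3 Then      ' two or more data rows, so .Value returns an array
        raw = ws.Range("M2:M" & lastRow).Value
        For i = 1 To UBound(raw, 1)
            If Not IsError(raw(i, 1)) Then
                city = CityFromLocation(CStr(raw(i, 1)))
                If Len(city) > 0 Then dc.Item(city) = True   ' duplicates overwrite the same key
            End If
        Next i
    End If
    UniqueCities = dc.Keys
End Function
```

Trimming before adding to the dictionary means two locations in the same city but different divisions collapse into one entry, which appears to be what the question asks for.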