Feeding pairwise compared values into a large table - excel

I've got a few large (500-1000) datasets in the following format using only the first two rows.
id     value
a-b    number
a-c    number
a-d    number
...    number
b-c    number
b-d    number
and so on
They compare two values and save their difference while skipping previously done comparisons. I want to put them in a table like this:
id   a        b        c        d        e
a    /        number   number   number   number
b    number   /        number   number   number
c    number   number   /        number   number
d    number   number   number   /        number
e    number   number   number   number   /
The lower left half of this table is easily prepared with offset, but how do I feed the values into the upper right half?
Is there a way to mostly automate doing this?

If I understand what you are trying to do, I would suggest doing this in two steps:
Set up formulas in the results table to "read" the data
Have a macro to "save" the data
First, fill your results table with this formula (example for cell B3, with the column ids in row 2 and the row ids in column A), keeping your OFFSET formula in the bottom-left half:
=IF(B$2=$A3, "\", VLOOKUP(IF(B$2>$A3,$A3&"-"&B$2, B$2&"-"&$A3), $H$3:$I$35, 2, FALSE))
This gives you \ if the row and column ids are the same; otherwise it does a VLOOKUP on row_id-column_id or column_id-row_id, always ordering the key "low-high", which gives you the mirror across the diagonal. Some might argue that using VLOOKUP in both halves is inefficient, but is it more inefficient than an OFFSET plus figuring out how to paste one formula into one half and a different one into the other? Who really knows. It is simple and it works, IMHO.
Next put the data for the current "pass" into the observations table and your values will appear in the table via the VLOOKUP
Finally you need a little macro to replace the formula with the value and run it after every set of results is acquired, so that you don't lose this data when you put a subsequent set of new values in your results table
Option Explicit
Sub replace_formula_with_value()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Sheets("Sheet1")
    Dim r_in As Range
    ' this is where my input data start
    Set r_in = ws.Range(ws.Range("B3"), ws.Range("B3").End(xlDown).End(xlToRight))
    Dim r As Range
    For Each r In r_in
        If Not IsError(r.Value) Then
            If r.Value <> "\" Then
                ' overwrite the formula with its value by assigning the cell's evaluated result back to it
                r.Value = r.Value
            End If
        End If
    Next r
End Sub
If you don't want to have a macro, then you need to keep ALL data in your observations table and just build it up over time.
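For readers who want to check the mirroring logic outside Excel, here is a minimal sketch in Python of what the formula above does: build the key in "low-high" order, then look it up in the observations. The function name build_table and the sample values are made up for illustration. Python compares strings case-sensitively, unlike Excel, so the sketch assumes ids of consistent case.

```python
def build_table(pairs, ids):
    """Return {row_id: {col_id: value}}, mirrored across the diagonal."""
    table = {}
    for row in ids:
        table[row] = {}
        for col in ids:
            if row == col:
                table[row][col] = "\\"  # diagonal marker, as in the formula
            else:
                low, high = sorted((row, col))  # always look up "low-high"
                # None stands in for the #N/A that VLOOKUP returns for a missing pair
                table[row][col] = pairs.get(f"{low}-{high}")
    return table


# Hypothetical observations in the "id / value" layout from the question
pairs = {"a-b": 1.5, "a-c": 2.0, "b-c": 0.5}
table = build_table(pairs, ["a", "b", "c"])
print(table["a"]["c"], table["c"]["a"])  # 2.0 2.0
```

Because the key is normalised before the lookup, each pair only has to be stored once, which matches the "skip previously done comparisons" layout of the input.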

Related

How to calculate various results on the basis of various input values with one formula (Excel VBA)

I am relatively new to Excel VBA that's why I have some problems.
I have constructed a (Poisson-) matrix with range "EB5:EV25" where each cell contains a formula that basically (not only) multiplies two input values, e.g. they look like
=POISSON(0;$H$2;FALSE)*POISSON(0;$H$2;FALSE)
where "$H$2" and "$H$2" are my first two input values.
The results are presented in "EB33:ED33", where each of the three cells sums a certain part of the matrix above.
Let's say the column with one of the input values is H and the column with the other input values is K (the values are in every 82nd row). Thus, for every 82nd row of columns H and K, my matrix in "EB5:EV25" calculates results and presents them in "EB33:ED33". So far this solution only presents the results for one pair of input values at a time; I need to change the row of the input values in my matrix manually.
However I don't finally want to present my results in "EB33:ED33" but in a different column, say "BA:BC", 2 rows after the row that contains the input values (separately for each row of input values).
I have already tried out some code but I am not even able to work with input variables that are in different columns, neither does my code change the input values - for each row of input values the results are always the same (i does not change).
Dim i As Integer
Dim rng As Range
Set rng = Range("EB5:EV25")
For i = 3 To Cells(Rows.Count, 8).End(xlUp).Row Step 82
rng.Replace "$" & i - 1, "$" & i
Application.Calculate
Range("EB33:ED33").Copy
Cells(i + 1, 63).PasteSpecial xlValues
Next i
Set rng = Nothing
End Sub
Would be great if anybody is able to help!

Excel Match Numbers in 2 Columns to a Number in a 3rd

I've run into a bit of a road block. I get a .PDF output from an accounting program and copy/paste the data into excel, then convert text to columns. I am trying to match the GL code with the totals for that specific account. Columns A, B, and C show the state of my data prior to sorting it, and the lines under Intended Output show how I would like the data to output.
I am trying to automate this process, so I can paste data into columns A, B, & C in the raw format and have it automatically spit out the required numbers in the format of the Intended Output. The GL codes remain the same, but the numbers and the number of rows will change. I've color coded them for ease of review.
Thank you very much in advance!
Using a combination of the following functions you can create a list of filtered results. It works on the principle that the Data1 text you want to pull is the only text with a "-" in it, and that the totals you are pulling from Data2 and Data3 are the only numbers in their columns. Any change to that pattern will most likely break the system. Note that the formulas will not copy formatting.
IFERROR
INDEX
AGGREGATE
ROW
ISNUMBER
FIND
Let's assume the output will be placed in a small table with E2 as the upper-left data cell.
In E2 use the following formula and copy down as needed:
=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$1:$A$30)/ISNUMBER(FIND("-",$A$1:$A$30)),ROW(A1))),"")
In F2 use the following formula and copy to the right 1 column and down as needed:
=IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW($A$1:$A$30)/ISNUMBER(B$1:B$30),ROW(A1))),"")
AGGREGATE performs array-like calculations. As such, do not use full column references such as A:A inside it, as that can lead to excess calculations. Be sure to limit it to the range you are looking at.
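The core trick in the formulas above is "give me the row number of the k-th cell that passes a test", with IFERROR turning a missing k-th match into an empty string. A rough Python sketch of that idea follows; the names kth_matching_row and has_dash and the sample column are hypothetical, not taken from the question's data.

```python
def kth_matching_row(cells, predicate, k):
    """Return the 1-based row of the k-th cell passing predicate, or None."""
    # Keep the row numbers whose cell passes the test; failed rows drop out,
    # much like the division errors that AGGREGATE option 6 ignores.
    rows = [i for i, cell in enumerate(cells, start=1) if predicate(cell)]
    return rows[k - 1] if 1 <= k <= len(rows) else None


def has_dash(cell):
    """Rough equivalent of ISNUMBER(FIND("-", cell))."""
    return isinstance(cell, str) and "-" in cell


# Hypothetical column A: account lines contain "-", other lines do not
column_a = ["Header", "1000-Cash", "Total", "", "2000-Payables", "Total"]

for k in (1, 2, 3):
    row = kth_matching_row(column_a, has_dash, k)
    # Mirror IFERROR(..., ""): show an empty string when there is no k-th match
    print(column_a[row - 1] if row is not None else "")
```

Copying the Excel formula down plays the role of the loop over k here: ROW(A1) becomes 1, 2, 3 and so on.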
Try this procedure:
Public Sub bruce_wayne()
    'Assumptions
    '1. Data spreadsheet will ALWAYS have the structure shown in the question
    '2. The key word "Total" (or whatever else it might be) is otherwise NOT found
    '   anywhere else in the 1st data column
    '3. output is written to the same sheet as the data
    '4. As written, invoked when data sheet is the active sheet
    '5. set the 1st 3 constants to the appropriate values
    Const sData2ReadTopLeft = "A1" 'Top left cell of data to process
    Const sData2WriteTopLeft = "J2" 'Top left cell of where to write output
    Const sSearchText = "Total" 'Keyword for summary data
    '*******************
    Const sReplaceText = "Wakanda"
    Dim r2Search As Range
    Dim sAccountCode As String
    Dim rSearchText As Range
    Dim iRowsProcessed As Integer
    Set r2Search = Range(sData2ReadTopLeft).EntireColumn
    sAccountCode = Range(sData2ReadTopLeft).Offset(1, 0).Value
    iRowsProcessed = 0
    Do While Application.WorksheetFunction.CountIf(r2Search, sSearchText) > 0
        Set rSearchText = r2Search.Find(sSearchText)
        Range(sData2WriteTopLeft).Offset(iRowsProcessed, 0) = sAccountCode
        Range(sData2WriteTopLeft).Offset(iRowsProcessed, 1) = rSearchText.Offset(0, 1).Value
        Range(sData2WriteTopLeft).Offset(iRowsProcessed, 2) = rSearchText.Offset(0, 2).Value ' add this if there are more summary columns to return
        'last two lines could be collapsed into a single line; at the expense of readability..
        rSearchText.Value = sReplaceText 'so that next search will find the next instance of the trigger text
        iRowsProcessed = iRowsProcessed + 1
        sAccountCode = rSearchText.Offset(1, 0).Value
    Loop
    r2Search.Replace what:=sReplaceText, Replacement:=sSearchText
End Sub

Select all rows where all cells equal one another

I have a large table that only shows a single type of information: whether or not a species of plant was present at a particular study site. I have 500+ species listed in the first column, and 30 sites as column names. The table is populated with a simple "Y" or "N" to show presence. Example:
Scientific Name Old Wives Beach Dadi Orote N Airstrip
Abelmoschus moschatus N N N
Abrus precatorius Y N Y
Abutilon indicum N N N
However, the species list contains some species that do not occur at any sites, rendering a row full of "N"s, like the 1st and 3rd rows in the example above. I need to delete those rows in order to make the table more manageable.
Is there any way to achieve this without a long IF AND statement?
Inspired by pnuts' comment: in a new column, use a COUNTIF() formula. For example, =COUNTIF(B2:AE2,"Y"), assuming the headers are in row 1 and column A and the data is in the range B2:AE501+.
If you then select the entire range, including the headers and the new formula column, and add filters, you can show only the rows where the count of Ys is 0. Once only the 0s are showing, you can select those entire rows and delete them (using Right-Click, Delete) without affecting the non-zero rows.
At this point, if you no longer need the count column, you can turn off the filter and delete the column, but I wouldn't be surprised if you find the count comes in handy for some other reason.
Alternatively, you could just use the filter to HIDE the 0 rows rather than delete them; that way you don't remove the data altogether, but it's no longer in your way.
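The count-then-filter idea above can be sketched in a few lines of Python, for anyone who wants to sanity-check it on exported data. The sketch uses the three example rows from the question; the function name drop_absent_species is made up, and the match here is case-sensitive, whereas COUNTIF is not.

```python
def drop_absent_species(rows):
    """Keep only rows with at least one "Y" after the name column."""
    # row[1:] skips the species name, like B2:AE2 skips column A
    return [row for row in rows if row[1:].count("Y") > 0]


rows = [
    ["Abelmoschus moschatus", "N", "N", "N"],
    ["Abrus precatorius", "Y", "N", "Y"],
    ["Abutilon indicum", "N", "N", "N"],
]

kept = drop_absent_species(rows)
print([row[0] for row in kept])  # ['Abrus precatorius']
```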
The code below is one way to do this, assuming there are no gaps in the data. The .Select statements are only there so you can step through the code and watch how it works; you should remove them once you understand it.
Sub deleteIfAllN()
    Dim plantR As Range, r As Range, allN As Boolean
    Set plantR = Range("A2")
    While plantR <> ""
        plantR.Select
        Set r = plantR.Offset(0, 1)
        allN = True
        Do
            r.Select
            If r <> "N" Then
                allN = False
                Exit Do
            End If
            Set r = r.Offset(0, 1)
        Loop Until r = ""
        Set plantR = plantR.Offset(1, 0)
        Rows(plantR.Row - 1).Select
        If allN Then Rows(plantR.Row - 1).Delete
    Wend
End Sub
You can use the Advanced Filter
Set up your data and criterion area as below
For the example you posted, the formula would be:
=COUNTIF($B8:$D8,"N")<>3
For 30 columns, just modify the range and the count.
I chose to filter in place
Note that there is also an option to Copy to another location which would place the results of the filter in another location.

Day/Time Difference VBA

Hello I'm trying to figure out the time difference between data in Column K and J. I need to know how long it took for data to be updated by the amount of days it took, hours, or minutes and am using those columns. I also want to get the average amount of time only if the names in Column A match and enter that information in Column N. This is what I have so far for the first part.
ActiveSheet.Name = "Raw Data"
Range("M2:M").Value = ("K2:K-J2:J" > 1) + ("d:hh:mm")
Thanks for the help.
This should get the results you want. The With ... End With block is used to save having to write Worksheets("Raw Data") all the time. Anything within the block which starts with a . refers to Worksheets("Raw Data").
The formula was adapted from this answer but modified to allow for more than 31 days. We then use the FillDown method to copy the formula in cell M2 down to the same row as the last used cell in column K. The references in the formula adjust automatically to point to the correct row.
edit: updated to calculate average elapsed time for each name listed in column A. This is more complicated because we can't directly average the values we put in column M because we have converted them into a text format. I've chosen to use column Z to hold the numeric time values which we need. You could choose any other unused column and you could hide the column so it doesn't display.
The last used row is stored in a variable because we need to use it in a couple of different places. The numeric difference between columns K and J is stored in column Z. The text value in column M is then calculated from the value in column Z.
Finally, the average in column N is calculated using the AVERAGEIF function. This looks through all the used rows in column A, finds those with the same name as the current row and then averages all the values from column Z for any matching rows.
One important point to note: all of the formulas in column N will need to be rewritten if any rows are added or deleted, because the last row value stored in those formulas will then be incorrect.
With Worksheets("Raw Data")
    Dim lngLastUsedRow As Long
    lngLastUsedRow = .Cells(.Rows.Count, "K").End(xlUp).Row
    .Range("Z2").Formula = "=K2-J2"
    .Range("Z2:Z" & lngLastUsedRow).FillDown
    .Range("M2").Formula = "=FLOOR(Z2,1)&"":""&TEXT(Z2,""hh:mm"")"
    .Range("M2:M" & lngLastUsedRow).FillDown
    .Range("N2").Formula = "=AVERAGEIF(A$2:A$" & lngLastUsedRow & ",A2,Z$2:Z$" & lngLastUsedRow & ")"
    .Range("N2:N" & lngLastUsedRow).FillDown
End With
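As a cross-check of the two calculations described above (a d:hh:mm text value per row and an average per name), here is a small Python sketch. It assumes the update time is never earlier than the start time and simply drops leftover seconds; the names format_elapsed and average_by_name and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import timedelta


def format_elapsed(delta):
    """Format a non-negative timedelta as d:hh:mm, dropping leftover seconds."""
    total_minutes = int(delta.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}:{hours:02d}:{minutes:02d}"


def average_by_name(records):
    """records: iterable of (name, timedelta) -> {name: average timedelta}."""
    grouped = defaultdict(list)
    for name, delta in records:
        grouped[name].append(delta)
    # Average the numeric durations, not the formatted text (the column Z idea)
    return {name: sum(deltas, timedelta(0)) / len(deltas)
            for name, deltas in grouped.items()}


# Made-up elapsed times, i.e. column K minus column J for each row
records = [
    ("Ann", timedelta(days=1, hours=2)),
    ("Ann", timedelta(hours=4)),
    ("Bob", timedelta(minutes=30)),
]

averages = average_by_name(records)
print(format_elapsed(timedelta(days=40, hours=3, minutes=5)))  # 40:03:05
print(format_elapsed(averages["Ann"]))  # 0:15:00
```

As in the VBA version, the average is taken over the raw durations and only formatted afterwards, since the text form cannot be averaged directly.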

Dynamic insert function

It is a requirement that I use Excel to solve this issue.
In col A I have 0s and 1s with various quantities of 0s between the 1s. Every time a 1 appears I want the difference between two numbers given in two columns next to my binary column. However I wish to get the results from this calculation stated next to the previous 1.
I'd cope with different software, but how do I achieve this with Excel?
=IF(A4=1,OFFSET(B4,MATCH(1,A5:A$1000,0),0)-OFFSET(C4,MATCH(1,A5:A$1000,0),),"")
in D4 and copied down to suit seems to work.
Edit:
=IF(A4=1, ,"") is as in: IF(logical_test,value_if_true,value_if_false), where value_if_false is blank, expressed as "".
The value_if_true is the difference between ColumnB and ColumnC values, each ‘located’ from an OFFSET function as in =OFFSET(reference,rows,cols,height,width).
references are to the appropriate column for the row into which the formula is inserted (ie B4 and C4) from which the values required are ‘south’ by a variable amount.
MATCH, as in =MATCH(lookup_value, lookup_array, [match_type]), determines the extent of the offset on a case-by-case basis. In reverse order, the parameters here are match_type = 0 (to require an exact match) and lookup_array, which is as much of ColumnA as required. It was initially chosen as up to Row1000 (by A$1000) but can be extended as far as necessary, subject to the row limit for the relevant Excel version.
The first parameter (lookup_value) is of course 1, since that is the flag for the rows that contain the values to be subtracted.
Without a $ between A and 5 in the MATCH functions the size of the array automatically decreases (top cell row reference increases) as the formula is copied down, hence finds the next instance (rather than the same one over and over again).
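The OFFSET/MATCH combination above amounts to "for each flagged row, take B minus C from the next flagged row". A short Python sketch of that logic, with made-up data and a hypothetical function name, may make the shifting easier to follow. Where the Excel formula would return an error for the last flagged row (there is no later 1 to match), this sketch just leaves an empty string.

```python
def shifted_differences(flags, b, c):
    """For each row flagged 1, return b - c from the NEXT flagged row, "" elsewhere."""
    result = [""] * len(flags)
    previous = None  # index of the most recent flagged row seen so far
    for i, flag in enumerate(flags):
        if flag == 1:
            if previous is not None:
                # Write this row's difference next to the previous 1
                result[previous] = b[i] - c[i]
            previous = i
    return result


# Hypothetical columns A, B and C
flags = [1, 0, 0, 1, 0, 1]
b = [10, 0, 0, 7, 0, 9]
c = [1, 0, 0, 2, 0, 3]

print(shifted_differences(flags, b, c))  # [5, '', '', 6, '', '']
```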
With VBA, I'd first set the formulas to show results in same line as the "ones". (Suppose I used the D column for that.)
= if(A1 = 1; B1 - C1; "")
Then, in the VBA window, do the following:
Sub ShiftResultsUp() ' procedure name is arbitrary
    Dim i As Integer
    Dim Filled As Collection
    Set Filled = New Collection ' this collection stores the filled lines
    ' store filled lines
    For i = 2 To 1000 ' use your table limit
        If Sheet1.Cells(i, 4).Value <> "" Then ' the 4 is for the D column, i for the line
            Filled.Add i
        End If
    Next
    ' now fill the E column with displaced values: each filled line gets the D value of the next filled line
    For i = 1 To Filled.Count - 1
        Sheet1.Cells(Filled(i), 5).Value = Sheet1.Cells(Filled(i + 1), 4).Value
    Next
    ' please note there's a missing line (the last), it's up to you to decide how to fill it
    ' sorry I cannot debug this code
End Sub
I'd associate that to some sheet event or to a button.

Resources