lookup Data in Excel - excel

I have a 2 variable 100x100 data table in excel.
I need to have a function that returns all the possible sets of variables that yield a given target value.
What I am looking for is some kind of recursive two-dimensional lookup function. Can someone point me in the right direction?

It can be done without VBA, fairly compactly, like so.
Suppose your 100x100 table is in B2:CW101, and we put a list of numbers 1 to 100 down the left from A2 to A101, and again 1 to 100 across the top from B1 to CW1
Create a column of cells underneath, starting (say) in B104
B104=MAX(($A$2:$A$101*100+$B$1:$CW$1<B103)*($B$2:$CW$101=TargetValue)*($A$2:$A$101*100+$B$1:$CW$1))
This is an "array" formula, so press Ctrl-Shift-Enter instead of Enter, and curly brackets {} should appear around the formula.
Then copy down for as many rows as you might need. You also need to put a large number above your first formula, i.e. in B103, e.g. 999999.
What the formula does is to calculate Rowx100+Column, but only for each successful cell, and the MAX function finds the largest result, excluding all previous results found, i.e. it finds the target results one at a time, starting from bottom right and working up to top left. (With a little effort you could get it to search the other way).
This will give you results like 9922, which is row 99, column 22, and you can easily extract these values from the number.
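For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same idea: encode each matching cell as row*100+column, then repeatedly take the largest code below the previous result. The function names and the small sample grid are hypothetical and not part of the answer above.

```python
def find_matches_descending(table, target, width=100):
    """Mimic the copied-down MAX formula: each step returns the largest
    code row*width + col that is strictly below the previous result."""
    codes = []
    previous = 999999  # the large seed value placed above the first formula
    while True:
        candidates = [
            r * width + c
            for r, row in enumerate(table, start=1)
            for c, value in enumerate(row, start=1)
            if value == target and r * width + c < previous
        ]
        best = max(candidates, default=0)  # the Excel formula yields 0 when nothing is left
        if best == 0:
            return codes
        codes.append(best)
        previous = best


def decode(code, width=100):
    """Split a code back into (row, col); col runs from 1 to width."""
    row, col = divmod(code - 1, width)
    return row, col + 1


# Hypothetical 2x3 sample table with three cells equal to the target 5
grid = [
    [5, 7, 5],
    [1, 5, 9],
]
hits = [decode(code) for code in find_matches_descending(grid, 5)]
print(hits)
```

Decoding with divmod(code - 1, width) rather than splitting the digits keeps column 100 unambiguous: row 99, column 100 encodes to 10000, which a plain digit split would misread.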

There is no built-in function that will do what you want, I'm 99% sure of that.
A VBA function that returns an array could be built, along the lines of the quick-and-dirty Sub already shown. Create a Variant to hold the output, perhaps ReDim-med to the maximum possible number of results and ReDim Preserve-d down to the actual number at the end. Then return that as the result of the function, which then needs to be called as an array function (Control-Shift-Enter).
One down-side is that you'd have to ensure that the target range was large enough to hold the entire result: Excel won't do that automatically.

Would the Solver suit?
http://office.microsoft.com/en-us/excel/HA011118641033.aspx

I tried this a lot without using VBA, but it doesn't seem to be possible without it.
To solve this, I looped through the entire array and looked for the closest values (cells that equal the target after rounding). Each matching cell was then dereferenced through its Cells and Range properties, and the output was written to a range that moves down one row at each valid match.
The quick and dirty implementation is as follows:
Sub FindTargetMatches()
    ' Lists every cell in the table whose rounded value equals the rounded target in B50
    Dim arr As Range
    Dim tempval As Range
    Dim op As Integer
    Set arr = Worksheets("sheet1").Range("b2:ao41")
    op = 1
    Range("B53:D153").ClearContents
    For Each tempval In arr
        If Round(tempval.Value, 0) = Round(Range("b50").Value, 0) Then
            ' Row label from column A, column label from row 1 (assumed layout), then the value
            Range("b52").Offset(op, 0).Value = Range("a" & tempval.Row).Value
            Range("b52").Offset(op, 1).Value = Cells(1, tempval.Column).Value
            Range("b52").Offset(op, 2).Value = tempval.Value
            op = op + 1
        End If
    Next
    Range("b50").Select
End Sub
I am still looking for an approach without VBA.

I've got a solution that doesn't use VBA, but it's fairly messy. It involves creating a further one-dimensional table in Excel and doing lookups on that. For a 100x100 data table, the new table would need 10,000 rows.
Apologies if this doesn't fit your needs.
A summary is below - let me know if you need more detail. N = the dimension of the data, e.g. 100 in your example.
First, create a new table with five columns and NxN rows. In each case, replace my column names with the appropriate Excel reference
The first column (call it INDEX) simply lists 1, 2... NxN.
The second column (DATAROW) contains a formula to loop through 1, 2... N, 1, 2...N... This can be done using something like =MOD(INDEX-1, N)+1
The third column (DATACOL) contains 1, 1, 1... 2, 2, 2... (N times each).
This can be done with =INT((INDEX-1)/N)+1
The fourth column (VALUE) contains the value from your data table, using something like:
=OFFSET($A$1, DATAROW, DATACOL), assuming the top-left header corner of your data table is at $A$1 (so the data itself starts at B2)
We have now got a one-dimensional table holding all your data.
The fifth column (LOOKUP) contains the formula:
=MATCH(target, OFFSET(VALUERANGE, [LOOKUP-1], 0),0)+ [LOOKUP-1]
where [LOOKUP-1] refers to the cell immediately above (e.g. in cell F4 this refers to F3). You'll need a 0 above the first cell in the LOOKUP column.
VALUERANGE should be a fixed (named or using $ signs) reference to the entire VALUE column.
The LOOKUP column then holds INDEX numbers which can be used to look up DATAROW and DATACOL to find the position of the match in the data.
This works by searching for matches in VALUERANGE, then searching for matches in an adjusted range starting after the previous match.
It's much easier in a spreadsheet than via the explanation above, but that's the best I can do for the moment...
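As a rough illustration of the steps above, the following Python sketch builds the flattened INDEX / DATAROW / DATACOL / VALUE table and then repeats the MATCH on the part of the VALUE column that lies after the previous hit. The function names and the 3x3 sample data are hypothetical, and a square N x N table is assumed.

```python
def flatten_column_major(table):
    """Build (INDEX, DATAROW, DATACOL, VALUE) rows, looping rows fastest,
    matching MOD(INDEX-1, N)+1 and INT((INDEX-1)/N)+1."""
    n = len(table)
    flat = []
    for index in range(1, n * n + 1):
        data_row = (index - 1) % n + 1
        data_col = (index - 1) // n + 1
        flat.append((index, data_row, data_col, table[data_row - 1][data_col - 1]))
    return flat


def chained_matches(flat, target):
    """Repeat MATCH on the slice of VALUE that starts after the previous hit."""
    values = [value for _, _, _, value in flat]
    positions = []
    previous = 0  # the 0 placed above the first LOOKUP cell
    while True:
        try:
            offset = values[previous:].index(target) + 1  # MATCH is 1-based
        except ValueError:  # MATCH would return #N/A here
            return positions
        previous += offset  # this is the INDEX number of the hit
        positions.append((flat[previous - 1][1], flat[previous - 1][2]))


# Hypothetical 3x3 data table
square = [
    [4, 8, 4],
    [2, 4, 6],
    [4, 1, 0],
]
found = chained_matches(flatten_column_major(square), 4)
print(found)
```

Each entry of the result is a (DATAROW, DATACOL) pair, listed in the order the flattened table is scanned (down each column in turn).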

Related

Returning all possible values instead of a VLOOKUP

So I've looked up tutorials on how to do this, and I'm still struggling, so I could use some expert help. I know it involves a very complex nested formula with things like SMALL, ROW, INDEX, etc...
So here are two screenshots that provide a sample of what I'm looking for. In reality there are over 1000 rows, but this makes it easier for you guys.
So here is my first example, lets call this Sheet1!:
Code, ID_1 and ID_2. So as you can see (and just focus on the input in A2) there will be two separate IDs in the linked workbook. That sheet, or at least a tiny sample of it, looks like this:
In the first column we see the code we're looking for (which is what we have in A2 of the first one), each of them with different IDs. So as I'm sure you can tell by now, I'm looking for a formula that will allow me to return those values in ID_1 and ID_2 in the first sheet.
I have been going at this for an hour and I'm stumped, so I would greatly appreciate any help provided!
This is a more general formula for when the IDs are NOT listed consecutively. I built the example for the general case, where the IDs occur anywhere throughout the second dataset AND where there are potentially several of them.
=IFERROR(INDEX($V$2:$V$15, SMALL(IF($U$2:$U$15=$M2, ROW($U$2:$U$15), FALSE), COLUMNS($N2:N2))-ROW($V$1), 1), "")
This formula must be entered with Ctrl-Shift-Enter before copying across and down! Note all the absolute and relative referencing/locking ($ signs).
The logical steps in constructing such a formula:
1) We use IF function to test if the values in the column U match the value in column M.
2) In the 'value-if-true' parameter, we will get the corresponding row number of values in column U. These numbers will be fed later in the SMALL function.
3) In the value-if-false part, we just return false, as that will later be used as a non-number in the SMALL function
Above 3 steps in the part: IF($U$2:$U$15=$M2, ROW($U$2:$U$15), FALSE)
4) We now have an array of mixed row numbers and FALSE values, which we want to feed to the INDEX function to get the corresponding value in column V (our second dataset). BUT as we wish to retrieve the different row matches for each code, we have to fish them out of the mixed array with the SMALL function.
5) using our columns as an incrementer, we apply the SMALL function to the array with a varying k parameter. We USE the COLUMNS function (note carefully the different $ sign usage), so that as we drag the formula across, the column count increments: COLUMNS($N2:N2) - giving K values of 1, 2, 3, 4 as we drag the formula across from column N to column Q. Note that it is useful that the SMALL function disregards FALSE values when looking through the array for the values by size.
6) There is an adjustment to account for the fact that the rows are relative to the 'Ids' range which we will feed into the INDEX function to retrieve the different ids. SMALL(IF($U$2:$U$15=$M2, ROW($U$2:$U$15), FALSE), COLUMNS($N2:N2))-ROW($V$1).
This can be avoided if we use the entire column V as the look-up array parameter in the INDEX function, but that's another way...
7) This resulting value can now be passed to the INDEX function to obtain the various ids. The column_num parameter of 1 which I put in the function isn't necessary in a single-column look-up array, but is there for completeness.
8) The entire construction is then wrapped in an IFERROR function to give an empty string if there is no match, but some people may wish to have error outputs there...
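The steps above can be sketched in Python to make the mechanics easier to follow. This is only an illustration of the IF / SMALL / INDEX / IFERROR chain, not Excel code; the function name and the sample codes and ids are hypothetical.

```python
def kth_match(codes, ids, wanted, k):
    """Return the id on the k-th row (1-based k) whose code equals `wanted`,
    or "" when there are fewer than k matches (the IFERROR branch)."""
    # IF(range=wanted, ROW(range), FALSE): keep the positions of matching rows only
    matching_rows = [pos for pos, code in enumerate(codes) if code == wanted]
    # SMALL(array, k): the k-th smallest position; the list is already ascending
    if k > len(matching_rows):
        return ""
    # INDEX(ids, position)
    return ids[matching_rows[k - 1]]


# Hypothetical second dataset: codes in one column, ids in the next
codes = ["A1", "B2", "A1", "C3", "A1"]
ids = [101, 202, 103, 304, 105]

# Dragging the formula across four columns corresponds to k = 1, 2, 3, 4
row = [kth_match(codes, ids, "A1", k) for k in range(1, 5)]
print(row)
```

The fourth cell comes back empty because "A1" occurs only three times, which is the case the IFERROR wrapper handles in the formula.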
If the two IDs will always be consecutive in the second list, try this:
=INDEX([workbookname]SheetName!columnrangeofIDs,MATCH(A2,[workbookname]SheetName!columnrangeofcodes,0))
Assuming your other workbook is called Serials.xlsx (adjust the extension to match your file), with the codes in column A and the IDs in column B of Sheet1, you would enter the following in B2:
=INDEX([Serials.xlsx]Sheet1!$B$2:$B$1000,MATCH(A2,[Serials.xlsx]Sheet1!$A$2:$A$1000,0))
In C2 enter the following (assuming the IDs show up consecutively):
=INDEX([Serials.xlsx]Sheet1!$B$2:$B$1000,MATCH(A2,[Serials.xlsx]Sheet1!$A$2:$A$1000,0)+1)
As far as I know this only works while the other workbook is open, and only if the two IDs are listed consecutively in the list.

How do I return the intersecting value from 2 partial match lookups? Index/Match

I have two tables in an Excel workbook. I'm trying to gather product info from data on another table in the same workbook. The first table is the product data feed I'm building with the product part numbers. Those part numbers include the variables of the product (in this case the length and the width). On the other sheet, I have partial part numbers in the header column and the rough dimensions in the header row. The intersection gives the final dimensions, which is the data I'm trying to gather on sheet 1. I've been trying to use an Index/Match formula to solve the problem, but since there are only partial part numbers on the 2nd sheet the lookup is inconclusive. I know the lookup value supports wildcards, but it seems I would need some sort of wildcard search within the lookup array instead.
Example product names on sheet 1 column A "EXP81285-150-11 x 14-Flat"
Example of product names on sheet 2 column A "EXP81285-150"
Example of rough dimensions on sheet 2 row 1 "11 x 14"
Here is what I have so far:
=INDEX('sheet 2'!$A$1:$L$87,MATCH($A3,'sheet 2'!$A:$A,0),MATCH($A3,'sheet 2'!$1:$1,0))
Sheet 1
Sheet 2
Any help is greatly appreciated!
Assuming it's always like string1-string2-unused, and string2 and unused don't contain "-", you can get the first string with:
*updated due to misunderstanding*
=MID(A3,4,FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))-1))-4)
While the string2 one is a hell of a formula:
=MID(A3,FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))-1))+1,FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))))-FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))-1))-1)
Assuming the last part is always in Q3, then:
=MID(SUBSTITUTE($A3,"-"&$Q3,""),FIND("|",SUBSTITUTE($A3,"-","|",LEN($A3)-LEN(SUBSTITUTE($A3,"-",""))-1))+1,99)
You may also use an arrayformula for the second part like:
=MID(SUBSTITUTE($A3,"-"&$Q3,""),LARGE((MID($A3,ROW($1:$99),1)="-")*ROW($1:$99),2)+1,99)
This is an array formula and must be confirmed with Ctrl+Shift+Enter.
(the second formula may work faster)
You could use array formulas in a reverse MATCH; however, with lots of entries even one formula will slow down the calculation by ~2-5 seconds.
You are better off using VBA, like:
(in Module)
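Public Function MATCH2(str As String, rng As Range) As Long
    ' Returns the position in rng of the first non-empty cell whose text occurs inside str (0 if none)
    Dim i As Long, var1 As Variant
    i = 0
    For Each var1 In rng
        i = i + 1
        ' Skip empty cells: InStr reports a match for an empty search string
        If Len(var1.Value) > 0 Then
            If InStr(str, var1.Value) > 0 Then MATCH2 = i: Exit Function
        End If
    Next
End Function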
And then use your formula as follows (MATCH2 takes only two arguments, so there is no trailing 0):
=INDEX('sheet 2'!$A$1:$L$87,MATCH2($A3,'sheet 2'!$A:$A),MATCH2($A3,'sheet 2'!$1:$1))
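To show what the reverse match is doing, here is a small Python sketch: instead of looking for the full part number in the header list, it looks for the first header that occurs inside the part number, once for the rows and once for the columns. The names and the dimension values in the grid are made up for illustration.

```python
def match2(text, labels):
    """1-based position of the first non-empty label contained in `text`, else 0."""
    for position, label in enumerate(labels, start=1):
        if label and label in text:
            return position
    return 0


def lookup_dimension(part_number, row_labels, col_labels, grid):
    """INDEX(grid, MATCH2(row labels), MATCH2(column labels)); None on a miss."""
    r = match2(part_number, row_labels)
    c = match2(part_number, col_labels)
    if r == 0 or c == 0:
        return None
    return grid[r - 1][c - 1]


# Hypothetical sheet 2: partial part numbers down the side, rough sizes across the top
row_labels = ["EXP81285-150", "EXP81285-200"]
col_labels = ["8 x 10", "11 x 14"]
grid = [
    ["7.9 x 9.9", "10.9 x 13.9"],
    ["7.8 x 9.8", "10.8 x 13.8"],
]
result = lookup_dimension("EXP81285-150-11 x 14-Flat", row_labels, col_labels, grid)
print(result)
```

Because the first contained label wins, headers that are substrings of one another (for example "8 x 10" inside "18 x 10") can produce a wrong hit, so the order and uniqueness of the headers matter.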
EDIT 2015-11-19
OK... some small problems:
some sizes don't exist (like 6 x 9)
size 7 x 12 was bugged (a space at the end; fixed it)
the function needs to be in a module (also fixed that)
also some items don't exist, like 600823-002
a misunderstanding regarding the formulas (it doesn't matter for the VBA version): everyone assumed the A:A search string starts at the 1st character, but it starts at the 4th (no EXP)
Also there will be an error at each "header" (the ones without the * x *), but that should be OK
You can download the updated workbook here
If you still have questions, just ask :)
Here is one using vlookup:
=VLOOKUP(LEFT(A2,FIND("-",A2,10)-1),Sheet2!A:L,MATCH(MID(A2,FIND("-",A2,10)+1,(FIND("-",A2,15))-(FIND("-",A2,10)+1)),Sheet2!A1:L1,0),FALSE)
But I agree with Dirk: this could be done faster and probably more accurately with VBA.
Edit: I realized that hard-coding 10 and 15 in the formula would not work. I have fixed it, but it relies on the part name containing one and only one "-". Warning: it is quite long.
=VLOOKUP(LEFT(A2,FIND("-",A2,FIND("-",A2,1)+1)-1),Sheet2!A:L,MATCH(MID(A2,FIND("-",A2,FIND("-",A2,1)+1)+1,(FIND("-",A2,FIND("-",A2,FIND("-",A2,FIND("-",A2,1)+1))+1))-(FIND("-",A2,FIND("-",A2,1)+1)+1)),Sheet2!$A$1:$L$1,0),FALSE)
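The nested FIND calls are only locating the second and third dashes. A short Python sketch of the same split may make that easier to see; the function name is hypothetical, and it assumes, like the formula, that the part name contains exactly one "-" and that a third dash follows the size.

```python
def split_part_number(part_number):
    """Return (row key, size) from a string shaped like 'NAME-PART-SIZE-rest'."""
    pieces = part_number.split("-")
    if len(pieces) < 4:
        raise ValueError("expected at least three dashes")
    key = "-".join(pieces[:2])  # text before the second dash (the VLOOKUP value)
    size = pieces[2]            # text between the second and third dash (the MATCH value)
    return key, size


parts = split_part_number("EXP81285-150-11 x 14-Flat")
print(parts)
```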

Random excel function with no duplicates and unfixed data

I need to translate an old program running on an AS/400 that picked random students to work for my city. I can use any program, as long as it works. To keep it simple and fast, I chose Excel.
However, I've run into a small problem. I need to have no duplicates, because the same student can't do 2 jobs in one summer. I also need this to be flexible, since every year new students will be added and some will be removed.
This function does almost what I want:
=INDEX($A:$A,RANDBETWEEN(1,COUNTA($A:$A)),1)
The INDEX over $A:$A covers all the rows in column A, so even if I add 20 names, it will take them into account. It then randomly chooses a value (the name) between row 1 and the total number of rows (COUNTA) in column $A. The problem with this method is that it allows duplicates.
Another approach I found was to create a column full of =ALEA() (the French-locale name for =RAND()) and then rank the names by these numbers. This is not very pretty, but at least there are no duplicates. The problem is my formula, which is static, and which I can't make flexible:
=INDEX($A$2:$A$74,RANK(B2,$B$2:$B$74))
My names are in column $A and my random values in column $B. What the formula says is: rank the value in B2 (then B3, then B4, etc.) among the values in column B.
What I would like is to integrate COUNTA into the second function and (IF POSSIBLE) use RANDBETWEEN instead of the RANK function, so that I don't have ugly numbers.
I am open to using the first function with some kind of duplicate check. As long as the secretary doesn't have to do a lot of manipulation, it should be fine.
Thanks a lot for your help xox
I've created something in VBA that does what I think you want. Now keep in mind I'm very new with VBA, so it probably isn't the prettiest thing. To be clear, I had 10 names in column A from rows 1 to 10 and then simply ran this subroutine and it generated a list of unique names in column F. Here's my code:
Sub getRandom()
Do While Application.WorksheetFunction.CountA(Range("A:A")) > 0
Dim count As Integer
count = Application.WorksheetFunction.CountA(Range("A:A"))
Dim name As String
name = Application.WorksheetFunction.Index(Range("A:A"), Application.WorksheetFunction.RandBetween(1, count))
Dim row As Integer
row = Application.WorksheetFunction.Match(name, Range("A:A"), 0)
Range("F11").Select
Selection = name
Rows(row).EntireRow.Delete
Loop
End Sub
If you wanted to get the names one at a time, just remove the loop. It does exactly what you did with the INDEX and RANDBETWEEN functions: it grabs the name in column A at the generated position, then deletes that row entirely, so no duplicate name can be generated.
I chose column F arbitrarily, and cell F11 specifically because it sits just below my 10 names: each deleted row shifts the names already written up by one, so the picks pile up in column F above F11.
I hope this helps, and if it's not what you were looking for I'll see if I can enhance it a bit.
There are many examples available for doing this in VBA (google Excel Random Unique) One of the best is from CPearson.Com which includes (among others):
Random Elements From A Range Of Worksheet Cells
You can also create a function that returns a number of elements in random order and without duplicates from a range of worksheet cells. The function below does just this. You can call as an array formula with a formula like
=RandsFromRange(A1:A10,5)
where A1:A10 is the list of elements from which to pull the values and 5 is the number of values to return. Enter the formula in a range with as many cells as you specify to return, and press CTRL SHIFT ENTER rather than just ENTER.
If you don't want to use VBA, then it can be done with a few simple steps:
Insert a column next to your student names, with the formula =RAND()
Sort the list of names and random numbers on the number
Pick as many students as you want from the top of the list
Each time you sort you get a new random ordering
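The same recipe, sketched in Python for anyone who wants to see why it cannot produce duplicates: every name gets one random key, the list is sorted on the key, and the first few names are taken. The function name and the optional rng parameter (added only so a run can be reproduced) are hypothetical, and the input names are assumed to be unique.

```python
import random


def pick_students(names, how_many, rng=None):
    """Attach a random key to each name, sort on it, take the top of the list."""
    rng = rng or random.Random()
    keyed = [(rng.random(), name) for name in names]  # the =RAND() helper column
    keyed.sort()                                      # sort names and numbers on the number
    return [name for _, name in keyed[:how_many]]     # pick from the top


students = ["Ann", "Bob", "Cy", "Di", "Ed"]
picks = pick_students(students, 3, random.Random(42))
print(picks)
```

Since each name appears exactly once in the sorted list, a student can be picked at most once, and adding or removing names needs no change to the procedure.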

Excel: Find last value in an array

I have a dataset with
> A b c d...AA,BB
>1,2,3,4
> apple apple apple
> orange pear pear apple pear
> grapefruit,grape, grape,grape
Is there a way to find the final occurrence of a particular fruit in the array automatically via formula in Excel?
You need to use COUNTA to tell you how many items are in the array and INDEX to get the value of the last element.
You can try
=INDEX(1:1,0,COUNTA(1:1))
This will find the last value in row 1 (1:1), provided there are no blank cells before the last entry.
Write a user-defined function to search the data backward from the last cell:
Function LastFruit(r As Range, Fruit As String) As Range
    Dim rw As Long, col As Long
    ' Scan from the bottom-right cell toward the top-left
    For rw = r.Rows.Count To 1 Step -1
        For col = r.Columns.Count To 1 Step -1
            If r.Cells(rw, col) = Fruit Then
                Set LastFruit = r.Cells(rw, col)
                Exit Function ' the first hit in a backward scan is the last occurrence
            End If
        Next
    Next
End Function
You might want to try this. It forces you to add one extra array, and it finds only the first occurrence. Put your fruit data in A1:A10 and add an extra column next to it in B1:B10, holding the numbers 1 to 10 (this placement is important and, alas, mandatory: see the VLOOKUP description, it has to be ?1:?10).
To populate column B you can use, depending on your needs, formulas like
= ROWS($B$1:B1)
= ROW() + offset
Then the formula that will get your information is VLOOKUP (HLookup if your data array is horizontal). It will look for the value in the leftmost column of the argument matrix and return the matching value in the 2nd column (3rd argument, column B in our case). The FALSE is to require an exact match.
= VLOOKUP("orange", A1:B10, 2, FALSE)
Remember the drawbacks:
* You have to add one extra data column, be it convenient or not
* It will look for the first result. period.
(I am still searching for a better way to really find the MIN and MAX of an array of findings, but no success yet, except with Ctrl-Shift-Enter formulas, which are a no-go. Please post back if you find it)
Let's suppose we have a horizontal array of fruit names in A1:J1. The column number for the last occurrence of "apple" would be:
{=MAX(COLUMN(A1:J1)*(A1:J1="apple"))}
Don't forget to press Ctrl+Shift+Enter, it's an array formula.
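In plain terms, the array formula multiplies each column number by 1 where the cell matches and by 0 where it does not, then takes the maximum. A minimal Python sketch of that arithmetic (the function name and sample row are hypothetical):

```python
def last_position(cells, wanted):
    """MAX(COLUMN(range) * (range = wanted)): 1-based position of the last match, 0 if none."""
    return max(
        (pos * (cell == wanted) for pos, cell in enumerate(cells, start=1)),
        default=0,
    )


fruit_row = ["orange", "pear", "pear", "apple", "pear"]
print(last_position(fruit_row, "pear"))
print(last_position(fruit_row, "kiwi"))
```

The position equals the Excel column number only when the range starts in column A, as it does in the A1:J1 example; a result of 0 means the fruit was not found.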
It's the same idea as PPC's concept of a bit mask & sequential numbers, but invented independently and expressed in a much more compact way. :)
I haven't given it a big stress test, but I saw no problem using more complicated formulas in multiple places on hundreds of items in each instance, which is quite enough for me.
Another very different solution, which very few people will like: the goal is to use a huge number that contains all matches as a bitmask. Then, using arithmetics, you can find the last match.
Disclaimer: this solution is
Inelegant
computationally heavy
will overflow with moderately big datasets (for me at ~1015 records). Overflow can cause clean errors (#NUM) and might also give a wrong number due to excessive float rounding (I have not observed it but it's still possible)
You need to have an array of sequential numbers of the same size as your dataset (doesn't have to be close though). If your fruits are in A1:A10, you can put the values (1..10) in Z1:Z10.
=FLOOR(IMLOG2(SUMPRODUCT((A1:A10 = "orange")*1 ; 2^Z1:Z10)) ; 1)
Let's look at it:
SUMPRODUCT will make the bitmask containing 1s wherever you have the orange word
IMLOG2 (Why doesn't Excel have a Real numbers log2?) will get you the (float) log2 of the mask
FLOOR will truncate it, the result is the "biggest index" of 1s in the bitmask
Hopefully you will find other arithmetical operations for finding other matches
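A minimal Python sketch of the bitmask idea follows. It swaps in Python's arbitrary-precision integers and int.bit_length() for the IMLOG2/FLOOR pair, so it does not suffer from the overflow and rounding limits mentioned in the disclaimer; the function name and sample data are hypothetical.

```python
def last_index_via_bitmask(cells, wanted):
    """Set bit i for every 1-based position i that matches, then read the highest set bit."""
    mask = sum(2 ** i for i, cell in enumerate(cells, start=1) if cell == wanted)
    if mask == 0:
        return 0  # no match at all
    # bit_length() - 1 is floor(log2(mask)) computed exactly on integers
    return mask.bit_length() - 1


fruits = ["apple", "orange", "pear", "orange", "apple"]
print(last_index_via_bitmask(fruits, "orange"))
```

Here "orange" sits at positions 2 and 4, so the mask is 2^2 + 2^4 = 20 and its highest set bit is bit 4.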

Copy every nth line from one sheet to another

I have an Excel spreadsheet with 1 column, 700 rows. I care about every seventh line. I don't want to have to go in and delete the 6 rows between each row I care about. So my solution was to create another sheet and specify a reference to each cell I want.
=sheet1!a1
=sheet1!a8
=sheet1!a15
But I don't want to type in each of these formulas ... 100 times. I thought if I selected the three and dragged the box around, it would understand what I was trying to do, but no luck.
Any ideas on how to do this elegantly/efficiently?
In A1 of your new sheet, put this:
=OFFSET(Sheet1!$A$1,(ROW()-1)*7,0)
... and copy down. If you start somewhere other than row 1, change ROW() to ROW(A1) or some other cell on row 1, then copy down again.
If you want to copy the nth line but multiple columns, use the formula:
=OFFSET(Sheet1!A$1,(ROW()-1)*7,0)
This can be copied right too.
In my opinion the answers given to this question are too specific. Here's an attempt at a more general answer with two different approaches and a complete example.
The OFFSET approach
OFFSET takes 3 mandatory arguments. The first is the cell we want to offset from. The next two are the number of rows and columns to offset by (downwards and rightwards). OFFSET returns the contents of the resulting cell. For instance, OFFSET(A1, 1, 2) returns the contents of cell C2, because A1 is cell (1,1) and adding (1,2) to that gives (2,3), which corresponds to cell C2.
To get this to return every nth row from another column, we can make use of the ROW function. When this function is given no argument, it returns the row number of the current cell. We can thus combine OFFSET and ROW to make a function that returns every nth cell by adding a multiplier to the value returned by ROW. For instance OFFSET(A$1,ROW()*3,0). Note the use of $1 in the target cell. If this is not used, the offsetting will offset from different cells, thus in effect adding 1 to the multiplier.
The ADDRESS + INDIRECT approach
ADDRESS takes two integer inputs and returns the address/name of the cell as a string. For instance, ADDRESS(1,1) returns "$A$1". INDIRECT takes the address of a cell and returns the contents. For instance, INDIRECT("A1") returns the contents of cell A1 (it also accepts input with $'s in it). If we use ROW inside ADDRESS with a multiplier, we can get the address of every nth cell. For instance, ADDRESS(ROW(), 1) in row 1 will return "$A$1", in row 2 will return "$A$2" and so on. So, if we put this inside INDIRECT, we can get the contents of every nth cell. For instance, INDIRECT(ADDRESS(1*ROW()*3,1)) returns the contents of every 3rd cell in the first column when dragged downwards.
Example
Consider the following screenshot of a spreadsheet. The headers (first row) contains the call used in the rows below.
Column A contains our example data. In this case, it's just the positive integers (the counting continues outside the shown area). These are the values that we want to get every 3rd of, that is, we want to get 1, 4, 7, 10, and so on.
Column B contains an incorrect attempt at using the OFFSET approach but where we forgot to use $. As can be seen, while we multiply by 3, we actually get every 4th row.
Column C contains an incorrect attempt at using the OFFSET approach where we remembered to use $, but forgot to subtract. So while we do get every 3rd value, we skipped some values (1 and 4).
Column D contains a correct function using the OFFSET approach.
Column E contains an incorrect attempt at using the ADDRESS + INDIRECT approach, but where we forgot to subtract. Thus we skipped some rows initially. The same problem as with column C.
Column F contains a correct function using the ADDRESS + INDIRECT approach.
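The arithmetic behind both approaches is the same, and a small Python sketch may make the off-by-one cases above easier to see. The function and parameter names are hypothetical; the data stands in for a column holding 1, 2, 3, and so on, with both the data and the formulas assumed to start in row 1.

```python
def source_row(current_row, n, first_source_row=1, first_output_row=1):
    """Row the formula in `current_row` should read:
    first_source_row + (ROW() - first_output_row) * n."""
    return first_source_row + (current_row - first_output_row) * n


data = list(range(1, 22))  # stands in for column A holding 1, 2, 3, ...

# Correct: subtract the first output row before multiplying
every_third = [data[source_row(r, 3) - 1] for r in range(1, 8)]

# Forgetting to subtract: offsetting by ROW()*3 from row 1 skips the first value
missing_subtract = [data[(1 + r * 3) - 1] for r in range(1, 6)]

print(every_third)
print(missing_subtract)
```

The first list starts at 1 and steps by 3, while the second has the right step but starts at 4, which is the same symptom as forgetting to subtract in the formulas.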
If I were confronted with extracting every 7th row I would “insert” a column before Column “A”. I would then (assuming that there is a header row in row 1) type in the numbers 1,2,3,4,5,6,7 in rows 2,3,4,5,6,7,8. I would highlight the 1,2,3,4,5,6,7 and paste that block to the end of the sheet (700 rows' worth). The result will be 1,2,3,4,5,6,7,1,2,3,4,5,6,7,1,2,3,4,5,6,7… Now do a data sort ascending on column “A”. After the sort all of the 1’s will be the first in the series, all of the 7’s will be the seventh item.
Insert a new column and fill it with a series 1,2,3,4, etc. Then create another new column and use the formula =IF(INT(A1/7)=(A1/7),1,0). You should get a 1 in every 7th row; filter the column on the 1.
Highlight the 7th line. Paintbrush the format for the first 7 lines a few times. Then do a bigger chunk of paintbrush copying the format until you are done. Every 7th line should be highlighted. Filter by color and then copy and paste (paste the values) from the highlighted cells into a new sheet.
Create a macro and use the following code to grab the data and put it in a new sheet (Sheet2):
Sub CopyEverySeventhRow()
    Dim strValue As String
    Dim strCellNum As String
    Dim i As Long
    Dim x As Long
    x = 1
    ' Rows 1, 8, 15, ... of Sheet1 go to consecutive rows of Sheet2
    For i = 1 To 700 Step 7
        strCellNum = "A" & i
        strValue = Worksheets("Sheet1").Range(strCellNum).Value
        Debug.Print strValue
        Worksheets("Sheet2").Range("A" & x).Value = strValue
        x = x + 1
    Next
End Sub
Let me know if this helps!
JFV
If your original data is in column form with multiple columns and the first entry of your original data in C42, and you want your new (down-sampled) data to be in column form as well, but only every seventh row, then you will also need to subtract out the row number of the first entry, like so:
=OFFSET(C$42,(ROW(C42)-ROW(C$42))*7,0)
Add a new column and fill it with ascending numbers. Then filter by ([column] mod 7 = 0) or something like that (I don't have Excel in front of me to actually try this).
If you can't filter by formula, add one more column with the formula =MOD([column]; 7), then filter on zeros and you'll get every seventh row.

Resources