Random excel function with no duplicates and unfixed data - excel

I need to replace an old AS/400 program that picked random students to work for my city. I can use any program, as long as it works. To keep it simple and fast, I chose Excel.
However, I ran into a small problem. I need to have no duplicates, because the same student can't do two jobs in one summer. I also need this to be flexible, since every year new students will be added and some will be removed.
This function does almost what I want:
=INDEX($A:$A,RANDBETWEEN(1,COUNTA($A:$A)),1)
INDEX over $A:$A takes every row in column A, so even if I add 20 names, they are taken into account. It then picks a random value (the name) between row 1 and the total number of filled rows (COUNTA) in column A. The problem with this method is that it allows duplicates.
Another approach I found was to fill a column with =RAND() (=ALEA() in French-language Excel) and then rank the names by those numbers. This is not very pretty, but at least there are no duplicates. The problem is that my formula uses a fixed range, and I can't make it flexible:
=INDEX($A$2:$A$74,RANK(B2,$B$2:$B$74))
My names are in column A and my random values in column B. The formula ranks the value in B2 (then B3, then B4, etc.) among the values in column B and returns the name at that position.
What I would like is to integrate COUNTA into the second function and (if possible) use RANDBETWEEN instead of the RANK function so that I don't have ugly numbers.
I am open to using the first function with some kind of duplicate check. As long as the secretary doesn't have to do a lot of manipulation, it should be fine.
Thanks a lot for your help xox

I've created something in VBA that does what I think you want. Now keep in mind I'm very new with VBA, so it probably isn't the prettiest thing. To be clear, I had 10 names in column A from rows 1 to 10 and then simply ran this subroutine and it generated a list of unique names in column F. Here's my code:
Sub getRandom()
    ' Assumes the names are in A1:A10 with no blanks. Column A is emptied as this runs.
    Dim remaining As Long, pickedRow As Long
    Dim pickedName As String
    Do While Application.WorksheetFunction.CountA(Range("A:A")) > 0
        remaining = Application.WorksheetFunction.CountA(Range("A:A"))
        ' Pick a random row among the names still left in column A.
        pickedRow = Application.WorksheetFunction.RandBetween(1, remaining)
        pickedName = Range("A" & pickedRow).Value
        ' Write to F11, just below the list; each row deletion shifts earlier results up.
        Range("F11").Value = pickedName
        Rows(pickedRow).EntireRow.Delete
    Loop
End Sub
If you want to get the names one at a time, just remove the loop. It does exactly what you did with the INDEX and RANDBETWEEN functions: it grabs the name in column A at the generated row number, then deletes that row entirely, so no duplicate name can be generated. Note that the macro empties column A, so run it on a copy of the list.
I chose column F arbitrarily and cell F11 specifically because it sits just below the 10-row list: each row deletion shifts the names already written up by one row, so the results end up stacked in F1:F10.
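To show the draw-and-delete idea outside Excel, here is a minimal Python sketch. The function name `draw_all_without_repeats` and the sample names are invented for illustration; it works on a copy of the list rather than deleting the source data.

```python
import random

def draw_all_without_repeats(names, rng=None):
    """Repeatedly pick a random remaining name and remove it from the pool."""
    rng = rng or random.Random()
    pool = list(names)                   # copy, so the caller's list is untouched
    drawn = []
    while pool:
        i = rng.randrange(len(pool))     # like RANDBETWEEN(1, COUNTA(A:A)), but 0-based
        drawn.append(pool.pop(i))        # like copying the name, then deleting its row
    return drawn

if __name__ == "__main__":
    print(draw_all_without_repeats(["Ann", "Bob", "Cleo", "Dev"]))
```

Because each name leaves the pool as soon as it is drawn, a repeat is impossible by construction.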
I hope this helps, and if it's not what you were looking for I'll see if I can enhance it a bit.

There are many examples available for doing this in VBA (google Excel Random Unique). One of the best is from CPearson.Com, which includes (among others):
Random Elements From A Range Of Worksheet Cells
You can also create a function that returns a number of elements in random order and without duplicates from a range of worksheet cells. The function below does just this. You can call as an array formula with a formula like
=RandsFromRange(A1:A10,5)
where A1:A10 is the list of elements from which to pull the values and 5 is the number of values to return. Enter the formula in a range with as many cells as you specify to return, and press CTRL SHIFT ENTER rather than just ENTER.
If you don't want to use VBA, it can be done in a few simple steps:
Insert a column next to your student names with the formula =RAND()
Sort the list of names and random numbers on the random number
Pick as many students as you want from the top of the list
Each time you sort, you get a new random ordering
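The steps above amount to attaching a random key to every name and sorting on it. A small Python sketch of that logic follows; `pick_students` and its parameters are hypothetical names used only for this example.

```python
import random

def pick_students(names, how_many, rng=None):
    """Attach a random key to each name, sort on the key, take the top rows."""
    rng = rng or random.Random()
    keyed = [(rng.random(), name) for name in names]   # the =RAND() helper column
    keyed.sort(key=lambda pair: pair[0])               # sort on the random number
    return [name for _, name in keyed[:how_many]]      # pick from the top of the list

if __name__ == "__main__":
    print(pick_students(["Ann", "Bob", "Cleo", "Dev", "Eve"], 3))
```

Every name appears exactly once in the sorted list, so the selection cannot contain duplicates as long as the source list has none.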

Related

Search for partial match of text string in array and return match of greater length

I have a list of locations, most of which contain a town name within them. I would like to extract the town name. However, some town names are contained within other names, for example, "hadley" and "east hadley". Based on this post, I have found 2 different almost-solutions to my problem (see image below). However, depending on the order of the town names in column D, the result may be the shorter or the longer name. How can I always obtain the more complete match? I have over 18000 records, so I need an automated solution.
Array formula in column B (top) and formula in column C (bottom)
As per my comment, the reason neither formula works is that Excel searches in one direction until it finds a match and then stops, even if there is a better match further along.
Your first formula searches from the top down and the second from the bottom up, which is why you are getting different answers.
To fix this, the search list must be put in order: it must go from the longest string to the shortest along the search path.
To do this, add a helper column in E. Place the formula =LEN(D2) in E2 and copy it down. Then sort columns D and E on column E, descending:
Then you just need to use the first formula:
If you prefer the second, sort columns D and E ascending:
And use the second formula:
The third option is to do both and take the longer result, but that takes more steps than simply sorting the search list.
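As a rough illustration of why sorting by length helps, the following Python sketch searches the town list from longest to shortest and stops at the first hit. The name `longest_town_match` and the sample data are made up, and the comparison is case-insensitive by assumption.

```python
def longest_town_match(location, towns):
    """Return the longest town name contained in location, or None."""
    haystack = location.lower()
    # Longest names first, so "east hadley" is tested before "hadley".
    for town in sorted(towns, key=len, reverse=True):
        if town and town.lower() in haystack:
            return town
    return None

if __name__ == "__main__":
    towns = ["hadley", "east hadley", "amherst"]
    print(longest_town_match("12 main st east hadley", towns))
```

With the list ordered this way, the first match found is also the most complete one, regardless of the original order of the town names.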
I think you can just compare the results of columns B and C in a new column and keep the longer string with: =IF(LEN(B2)>LEN(C2);B2;C2) (replace the semicolons with commas if your locale uses commas as the argument separator)
Just to give you a solution without sorting or helper columns:
=INDEX($D$2:$D$6,MAX((MAX(NOT(ISERROR((FIND($D$2:$D$6,A2)>0)))*LEN($D$2:$D$6))=LEN($D$2:$D$6))*NOT(ISERROR(FIND($D$2:$D$6,A2)))*ROW($1:$5)))
Or a different (slightly faster) way:
=INDEX($D$2:$D$6,MAX((MIN(LEN(SUBSTITUTE(A2,$D$2:$D$6,"")))=LEN(SUBSTITUTE(A2,$D$2:$D$6,"")))*ROW($1:$5)))
However, I do not recommend using these. While they are fine for small tables, the calculation time increases sharply with each additional keyword.
Also, the first formula outputs the first item in the list if no match is found, and the second formula outputs the last entry in the list.
It is better to use Scott Carner's solution with sorting by length (it should be much faster, but you may want to check that for yourself).
Finally, you could also use VBA like this:
Public Function maxMatch(str As String, rng As Range) As String
    ' Returns the longest value in rng that occurs inside str ("" if none does).
    Dim cell As Variant
    For Each cell In rng.Value
        If InStr(str, cell) > 0 And Len(cell) > Len(maxMatch) Then maxMatch = cell
    Next
End Function
and then simply put =maxMatch(A2,$D$2:$D$6) in the cell
(However, you were not going for VBA, so that does not count) ;)

How do I return the intersecting value from 2 partial match lookups? Index/Match

I have two tables in an Excel workbook. I'm trying to gather product info from data on another table in the same workbook. The first table is the product data feed I'm building with the product part numbers. Those part numbers include the variables of the product (in this case the length and the width). On the other sheet, I have partial part numbers in the header column and the rough dimensions in the header row. The intersection gives the final dimensions, which is the data I'm trying to gather on sheet 1. I've been trying to use an Index/Match formula to solve the problem, but since there are only partial part numbers on the 2nd sheet, the lookup is inconclusive. I know the lookup value supports wildcards, but it seems I would need some sort of wildcard search within the lookup array instead.
Example product names on sheet 1 column A "EXP81285-150-11 x 14-Flat"
Example of product names on sheet 2 column A "EXP81285-150"
Example of rough dimensions on sheet 2 row 1 "11 x 14"
Here is what I have so far:
=INDEX('sheet 2'!$A$1:$L$87,MATCH($A3,'sheet 2'!$A:$A,0),MATCH($A3,'sheet 2'!$1:$1,0))
Sheet 1
Sheet 2
Any help is greatly appreciated!
Assuming it's always like string1-string2-unused, and string2 and unused don't contain "-", you can get the first string with:
*updated due to misunderstanding*
=MID(A3,4,FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))-1))-4)
While the string2 one is a hell of a formula:
=MID(A3,FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))-1))+1,FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))))-FIND("|",SUBSTITUTE(A3,"-","|",LEN(A3)-LEN(SUBSTITUTE(A3,"-",""))-1))-1)
Assuming the last part is always in Q3, then:
=MID(SUBSTITUTE($A3,"-"&$Q3,""),FIND("|",SUBSTITUTE($A3,"-","|",LEN($A3)-LEN(SUBSTITUTE($A3,"-",""))-1))+1,99)
You may also use an array formula for the second part, like:
=MID(SUBSTITUTE($A3,"-"&$Q3,""),LARGE((MID($A3,ROW($1:$99),1)="-")*ROW($1:$99),2)+1,99)
This is an array formula and must be confirmed with Ctrl+Shift+Enter.
(the second formula may work faster)
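For comparison, the same split is short when written outside Excel. This Python sketch assumes the string1-string2-unused layout described above, with no "-" inside string2 or the unused tail; `split_part_number` and `prefix_len` are hypothetical names.

```python
def split_part_number(product, prefix_len=0):
    """Split 'string1-string2-unused' from the right into (string1, string2)."""
    # rsplit from the right keeps any "-" inside string1 intact.
    head, size, _unused = product.rsplit("-", 2)
    # prefix_len lets the caller drop a leading prefix such as "EXP".
    return head[prefix_len:], size

if __name__ == "__main__":
    print(split_part_number("EXP81285-150-11 x 14-Flat"))
    print(split_part_number("EXP81285-150-11 x 14-Flat", prefix_len=3))
```

A product name with fewer than two "-" characters raises a ValueError here, which roughly corresponds to the formulas returning an error for rows that do not follow the pattern.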
You could use array formulas for a reverse match. However, with lots of entries, even one such formula will slow down the calculation by ~2-5 seconds.
You are better off using VBA, like this:
(in a module)
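Public Function MATCH2(str As String, rng As Range) As Long
    ' Returns the position of the first non-empty cell in rng whose text occurs inside str (0 if none).
    Dim i As Long, var1 As Variant
    i = 0
    For Each var1 In rng
        i = i + 1
        ' Skip blanks: InStr returns 1 for an empty search string, which would count as a match.
        If Len(var1.Value) > 0 Then
            If InStr(str, var1.Value) > 0 Then MATCH2 = i: Exit Function
        End If
    Next
End Function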
And then use your formula as follows (MATCH2 takes only two arguments, so there is no trailing 0):
=INDEX('sheet 2'!$A$1:$L$87,MATCH2($A3,'sheet 2'!$A:$A),MATCH2($A3,'sheet 2'!$1:$1))
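The reverse lookup can be sketched in Python to make the logic explicit: instead of finding the lookup value in a list, each list entry is tested for being contained in the lookup value. The function names and all sample dimensions below are invented for illustration.

```python
def match2(text, candidates):
    """1-based position of the first non-empty candidate found inside text, else 0."""
    for position, candidate in enumerate(candidates, start=1):
        if candidate and candidate in text:
            return position
    return 0

def lookup_dimension(product, grid):
    """grid includes the header row and header column, like 'sheet 2'!A1:L87."""
    row_labels = [row[0] for row in grid]   # column A of sheet 2
    column_labels = grid[0]                 # row 1 of sheet 2
    r = match2(product, row_labels)
    c = match2(product, column_labels)
    if r == 0 or c == 0:
        return None                         # no partial match on one of the axes
    return grid[r - 1][c - 1]

if __name__ == "__main__":
    sheet2 = [
        ["",             "8 x 10",    "11 x 14"],
        ["EXP81285-150", "7.9 x 9.9", "10.9 x 13.9"],   # made-up values
        ["EXP81285-200", "8.1 x 10.1", "11.1 x 14.1"],
    ]
    print(lookup_dimension("EXP81285-150-11 x 14-Flat", sheet2))
```

Blank header cells are skipped on purpose, since an empty string is contained in every text and would otherwise match first.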
EDIT 2015-11-19
OK... some small problems:
some sizes don't exist (like 6 x 9)
size 7 x 12 was bugged (a space at the end; fixed)
the function needs to be in a module (also fixed)
some items don't exist either, like 600823-002
there was a misunderstanding regarding the formulas (it doesn't matter for the VBA version): they all assumed the A:A search string starts at the 1st character, but it starts at the 4th (no EXP)
Also, there will be an error at each "header" row (the ones without the * x *), but that should be OK
You can download the updated workbook here
If you still have questions, just ask :)
Here is one using vlookup:
=VLOOKUP(LEFT(A2,FIND("-",A2,10)-1),Sheet2!A:L,MATCH(MID(A2,FIND("-",A2,10)+1,(FIND("-",A2,15))-(FIND("-",A2,10)+1)),Sheet2!A1:L1,0),FALSE)
But I agree with Dirk: this could be done faster and probably more accurately with VBA.
Edit: I realized that hard-coding 10 and 15 as start positions in the formula would not work in general. I have fixed it, but it relies on the part number having one and only one "-" in it. Warning: it is quite long.
=VLOOKUP(LEFT(A2,FIND("-",A2,FIND("-",A2,1)+1)-1),Sheet2!A:L,MATCH(MID(A2,FIND("-",A2,FIND("-",A2,1)+1)+1,(FIND("-",A2,FIND("-",A2,FIND("-",A2,FIND("-",A2,1)+1))+1))-(FIND("-",A2,FIND("-",A2,1)+1)+1)),Sheet2!A1:L1,0),FALSE)

Sort Order formula to alphabetise in Excel

I am currently drawing up a spreadsheet that will automatically remove duplicates and alphabetize a list:
I am using the COUNTIF() function in column G to create a sort order and then VLOOKUP() to find the sort in column J.
The problem I am having is that I can't seem to get my SortOrder column to function properly. At the moment it creates an index for two number 1's meaning the cell highlighted in yellow is missed out and the last entry in the sorted list is null:
If anyone can find and rectify this mistake for me I'll be very grateful as it has been driving me insane all day! Many thanks.
I'll provide my usual method for doing an automatic pulling-in of raw data into a sorted, duplicate-removed list:
Assume raw data is in column A. In column B, use this formula to increase the counter each time the row shows a non-duplicate item in column A. Hard-code B2 to be 1, then use this formula in B3 and drag down.
=if(iserror(match(A3,$A$2:A2,0)),B2+1,B2)
This takes advantage of the fact that when we refer to this counter in our revised list, we will use the MATCH function, which only returns the first matching row. Then say you want your new list of data in column D (I usually do this for display purposes, so either group out [hide] the helper columns or do this on another tab). You can avoid this step, but if you are already using helper columns, I usually do each step in a different column, as it is easier to document. In column C, hard-code C2 to 1, then starting in C3 and dragging down, have a simple counter that stops at the end of your list:
=if(C2<max(B:B),C2+1," ")
Then in column D, starting at D2 and dragged down:
=iferror(index(A:A,match(C2,B:B,0)),"")
The index function is like half of the vlookup function - it pulls the result out of a given array, when you provide it with a row number. The match function is like the other half of the vlookup function - it provides you with the row number where an item appears in a given array.
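The three helper columns can be mimicked in a few lines of Python to see how they interact. This is only a sketch of the logic with an invented function name; note that the steps as described remove duplicates in order of first appearance, and sorting would be a further step.

```python
def unique_in_order(raw):
    """Mimic helper columns B, C and D: counter, 1..max, INDEX/MATCH."""
    counters = []                                    # column B
    for i, item in enumerate(raw):
        first_time = item not in raw[:i]             # ISERROR(MATCH(A3,$A$2:A2,0))
        previous = counters[-1] if counters else 0
        counters.append(previous + 1 if first_time else previous)
    top = max(counters) if counters else 0
    result = []
    for k in range(1, top + 1):                      # column C: 1, 2, ... MAX(B:B)
        # Column D: MATCH finds the first row where the counter equals k.
        result.append(raw[counters.index(k)])
    return result

if __name__ == "__main__":
    print(unique_in_order(["b", "a", "b", "c", "a"]))   # ['b', 'a', 'c']
```

The counter only increases on a first occurrence, so the first row holding each counter value is always the row where a new item appeared.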
Hope this helps you in the future as well.
The actual reason this goes wrong, as implied by Jeeped's comment, is that you can't meaningfully compare a string to a number unless you do a conversion, because they are stored differently. So COUNTIF counts numbers and text separately.
20212 will give a count of 1 because it is the only (or lowest) number.
CS10Z002 will give a count of 1 because it is the first text string in alphabetical order.
Another approach is to add the count of numbers to the count if the current cell contains text:-
=COUNTIF(INDIRECT("$D$2:$D$"&$F$3),"<="&D2)+ISTEXT(D2)*COUNT(INDIRECT("$D$2:$D$"&$F$3))
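To make the "numbers first, then text" ranking in the formula above concrete, here is a rough Python sketch. The name `sort_order` and the sample values are made up, and Python's lower-cased string comparison only approximates Excel's text ordering.

```python
def sort_order(values):
    """Rank each value: numbers among numbers, text after all the numbers."""
    numbers = [v for v in values if isinstance(v, (int, float))]
    texts = [v for v in values if isinstance(v, str)]
    ranks = []
    for v in values:
        if isinstance(v, str):
            # COUNTIF over the text entries, plus ISTEXT(D2)*COUNT(range)
            rank = len(numbers) + sum(1 for t in texts if t.lower() <= v.lower())
        else:
            # COUNTIF over the numeric entries only
            rank = sum(1 for n in numbers if n <= v)
        ranks.append(rank)
    return ranks

if __name__ == "__main__":
    print(sort_order([20212, "CS10Z002", 999, "abc", 1000]))   # [3, 5, 1, 4, 2]
```

Without the added offset, both the smallest number and the first text string would get rank 1, which is the duplicate index described in the question.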
It's easier to show the result of three different conversions with some test data:-
(0) No conversion - just use COUNTIF
=COUNTIF(D$2:D$7,"<="&D2)
"999"<"abc"<"def", 999<1000
(1) Count everything as text
=SUMPRODUCT(--(D$2:D$7&""<=D2&""))
"1000"<"999"
(2) Count numbers before text
=COUNTIF(D$2:D$7,"<="&D2)+ISTEXT(D2)*COUNT(D$2:D$7)
999<1000<"999"
(3) Count everything as text but convert numbers with leading zeroes
=SUMPRODUCT(--(TEXT(D$2:D$7,"000000")<=TEXT(D2,"000000")))
"000999" = "000999", "000999"<"001000"

How to count cells that contain text from a certain list

I use a sheet to enter names of people who work at certain shifts, for example
on column A, the people that work from 8am to 4pm,
on column B the people that work from 4pm till midnight
on column C and beyond, special shifts
etc
This table is A1:N24 and it contains titles (of shifts), names of workers and some special notes, about each worker.
On column R I have a list of workers that I use for data validation/drop down lists, to make the entry of workers' names easier
My question is how I can count the number of cells on the A1:N24 table that contain only names from the R column list, leaving out the title cells and the special notes cells.
The COUNTIF function seems like a logical choice but I couldn't make it work with a range of criteria, my workers list. Maybe the DCOUNTA function could be of use in my case?
Any help would be appreciated
Try this (entered as an array formula)
=COUNT(MATCH(A1:N24,R:R,0))
How it works:
MATCH(A1:N24,R:R,0) returns an array holding the position in column R of each A1:N24 entry that is found there, and #N/A errors where it's not
COUNT( ) counts the numbers in that array, i.e. the number of matching cells
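The same idea can be written as a short Python sketch: test every cell for membership in the worker list and count the hits. The name `count_known_names` and the sample shift table are invented, and unlike MATCH the comparison here is case-sensitive.

```python
def count_known_names(table, roster):
    """Count the cells of table whose content appears in the roster."""
    known = {name for name in roster if name}     # ignore blank roster rows
    return sum(1 for row in table for cell in row if cell in known)

if __name__ == "__main__":
    shifts = [
        ["8am-4pm",    "4pm-midnight"],   # title cells, not counted
        ["Ann",        "Bob"],
        ["Ann (late)", "Cleo"],           # a note cell, not counted
        ["",           "Dev"],            # Dev is not on the roster
    ]
    print(count_known_names(shifts, ["Ann", "Bob", "Cleo", ""]))   # 3
```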

lookup Data in Excel

I have a 2 variable 100x100 data table in excel.
I need to have a function that returns all the possible sets of variables that yield a given target value.
What I am looking for is some kind of recursive two-dimensional lookup function. Can someone point me in the right direction?
It can be done without VBA, fairly compactly, like so.
Suppose your 100x100 table is in B2:CW101, and we put a list of numbers 1 to 100 down the left from A2 to A101, and again 1 to 100 across the top from B1 to CW1
Create a column of cells underneath, starting (say) in B104
B104=MAX(($A$2:$A$101*100+$B$1:$CW$1<B103)*($B$2:$CW$101=TargetValue)*($A$2:$A$101*100+$B$1:$CW$1))
This is an "array" formula, so press Ctrl-Shift-Enter instead of Enter, and curly brackets {} should appear around the formula.
Then copy down for as many rows as you might need. You also need to put a large number above your first formula, i.e. in B103, e.g. 999999.
What the formula does is calculate Row x 100 + Column, but only for each cell that equals the target, and the MAX function finds the largest result below all previous results. That is, it finds the matching cells one at a time, starting from bottom right and working up to top left. (With a little effort you could get it to search the other way.)
This will give you results like 9922, which is row 99, column 22, and you can easily extract these values from the number.
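A small Python sketch of the encode-and-decode step is shown below, with hypothetical function names. One detail worth handling when decoding: a match in column 100 gives a code ending in 00 (row 1, column 100 encodes to 200), so a plain integer division by 100 would report the wrong row for that column.

```python
def find_codes(table, target):
    """Row*100 + column (both 1-based) for each matching cell, largest first."""
    codes = [
        r * 100 + c
        for r, row in enumerate(table, start=1)
        for c, value in enumerate(row, start=1)
        if value == target
    ]
    return sorted(codes, reverse=True)   # same order the MAX formula finds them

def decode(code):
    """Turn a code back into (row, column), allowing for column 100."""
    row, col_minus_1 = divmod(code - 1, 100)
    return row, col_minus_1 + 1

if __name__ == "__main__":
    table = [[5, 7, 5], [1, 5, 9]]
    print([decode(code) for code in find_codes(table, 5)])   # [(2, 2), (1, 3), (1, 1)]
```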
There is no built-in function that will do what you want, I'm 99% sure of that.
A VBA function that returns an array could be built, along the lines of the quick-and-dirty Sub already shown. Create a Variant to hold the output, perhaps ReDimmed to the maximum possible number of results and then ReDim Preserve-d down to the actual number at the end. Then return that as the result of the function, which then needs to be called as an array function (Control-Shift-Enter).
One down-side is that you'd have to ensure that the target range was large enough to hold the entire result: Excel won't do that automatically.
Would the Solver suit?
http://office.microsoft.com/en-us/excel/HA011118641033.aspx
I tried this a lot without using VBA, but it doesn't seem to be possible without it.
To solve this, I needed to loop through the entire array and find the closest values. These were then dereferenced using Cells and Range properties, and the output was written to a range that advances by one row at each valid match.
The quick and dirty implementation is as under:
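Sub ListMatches()   ' hypothetical name; the original snippet had no Sub wrapper
    Dim arr As Range
    Dim tempval As Range
    Dim op As Integer
    ' Data table; assumes row labels in column A and column labels in row 1.
    Set arr = Worksheets("sheet1").Range("b2:ao41")
    op = 1
    Range("B53:D153").ClearContents
    For Each tempval In arr
        ' B50 holds the target value; compare after rounding to whole numbers.
        If Round(tempval.Value, 0) = Round(Range("b50").Value, 0) Then
            Range("b52").Offset(op, 0).Value = Range("a" & tempval.Row).Value   ' row label
            Range("b52").Offset(op, 1).Value = Cells(1, tempval.Column).Value   ' column label (was Cells(tempval.Column, 1))
            Range("b52").Offset(op, 2).Value = tempval.Value
            op = op + 1
        End If
    Next
    Range("b50").Select
End Sub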
I am still looking for an approach without VBA.
I've got a solution that doesn't use VBA, but it's fairly messy. It involves creating a further one-dimensional table in Excel and doing lookups on that. For a 100x100 data table, the new table would need 10,000 rows.
Apologies if this doesn't fit your needs.
A summary is below - let me know if you need more detail. N = the dimension of the data, e.g. 100 in your example.
First, create a new table with five columns and NxN rows. In each case, replace my column names with the appropriate Excel reference
The first column (call it INDEX) simply lists 1, 2... NxN.
The second column (DATAROW) contains a formula to loop through 1, 2... N, 1, 2...N... This can be done using something like =MOD(INDEX-1, N)+1
The third column (DATACOL) contains 1, 1, 1... 2, 2, 2... (N times each).
This can be done with =INT((INDEX-1)/N)+1
The fourth column (VALUE) contains the value from your data table, using something like:
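=OFFSET($A$1, DATAROW, DATACOL), assuming the table's header row and column meet at $A$1, so that the first data value is in $B$2 (use DATAROW-1 and DATACOL-1 instead if the data itself starts at $A$1)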
We have now got a one-dimensional table holding all your data.
The fifth column (LOOKUP) contains the formula:
=MATCH(target, OFFSET(VALUERANGE, [LOOKUP-1], 0),0)+ [LOOKUP-1]
where [LOOKUP-1] refers to the cell immediately above (e.g. in cell F4 this refers to F3). You'll need a 0 above the first cell in the LOOKUP column.
VALUERANGE should be a fixed (named or using $ signs) reference to the entire VALUE column.
The LOOKUP column then holds INDEX numbers which can be used to look up DATAROW and DATACOL to find the position of the match in the data.
This works by searching for matches in VALUERANGE, then searching for matches in an adjusted range starting after the previous match.
It's much easier in a spreadsheet than via the explanation above, but that's the best I can do for the moment...
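For readers who find code easier to follow than the column descriptions, here is a Python sketch of the same two stages: flatten the table into INDEX, DATAROW, DATACOL and VALUE rows, then search repeatedly, each time starting just after the previous match. The function names are invented, and the table is assumed to be N x N as in the description.

```python
def flatten(table):
    """Build (INDEX, DATAROW, DATACOL, VALUE) rows for an N x N table."""
    n = len(table)
    flat = []
    for index in range(1, n * n + 1):
        data_row = (index - 1) % n + 1      # =MOD(INDEX-1, N)+1
        data_col = (index - 1) // n + 1     # =INT((INDEX-1)/N)+1
        flat.append((index, data_row, data_col, table[data_row - 1][data_col - 1]))
    return flat

def chained_matches(flat, target):
    """Repeat the MATCH, each time searching only below the previous hit."""
    values = [value for _, _, _, value in flat]
    found = []
    previous = 0                            # the 0 placed above the LOOKUP column
    while True:
        try:
            k = values.index(target, previous)   # 0-based position of the next hit
        except ValueError:
            break                           # like MATCH returning #N/A
        previous = k + 1                    # 1-based INDEX of this match
        found.append((flat[k][1], flat[k][2]))
    return found

if __name__ == "__main__":
    print(chained_matches(flatten([[5, 7], [1, 5]]), 5))   # [(1, 1), (2, 2)]
```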

Resources