Find location of NEW number in list - excel

I'm looking for a way to find where a RAND() number lies on a list of irrational (effectively RAND() as well) numbers from 0 to 1.
So I have a list of numbers 0.1003, 0.1984, 0.3895, 0.4506, 0.4724, 0.4856, 0.5602, 0.8542 in A1:A8
Then I have a RAND() number to check against the list. I've tried RANK.AVG(RAND(),A1:A8), but the rank functions require the lookup value to be in the list.
A simple solution would be to place my RAND() at the bottom of the list (A9) and use RANK.AVG(A9,A1:A9), so that my number is included in the list. However, I would have to do this for every number in my array of thousands of random numbers, which is impractical.
Perhaps there is some way I can join another cell onto an array without actually placing it adjacent?
Eg, for a RAND() in B1, I could write in C1:
=RANK.AVG(B1,ARRAY.JOIN(A1:A8,B1)), but I've tried a few ways (&,+) and can't achieve this array joining function, so I thought I'd ask for help! Perhaps a macro or UDF is required?

Sorry @Chris, I'm sure this is what you meant in your comment:
=IF(B1<A1,1,MATCH(B1,$A$1:$A$10)+1)
where your new random number is in B1 and your existing list is in A1:A10, sorted in ascending order. A new number smaller than the first entry in the list is a special case, which the IF handles. Alternatively, you could allow for it by placing a zero in A1 and moving the pseudo-random numbers down, which would simplify the formula to
=MATCH(B1,A$1:A$10)
If you wanted to allow for ties, you could correct for it:-
=IF(B1<A1,1,MATCH(B1,$A$1:$A$10)+1)-COUNTIF($A$1:$A$10,B1)/2
or just
=MATCH(B1,$A$1:$A$10)-COUNTIF($A$1:$A$10,B1)/2
with a zero in A1.
I'm assuming that ranking runs from 1 = smallest to 8 = largest if there are 8 numbers. To reverse it, subtract the rank from COUNT(A$1:A$10)+2, or from COUNT(A$1:A$10)+1 if the zero is included.
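For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same idea. It assumes the list is already sorted ascending, as the formulas above require; the function name `rank_in_sorted` is made up for illustration. `bisect_right` plays the role of the approximate-match MATCH and `count` the role of COUNTIF.

```python
from bisect import bisect_right

def rank_in_sorted(sorted_values, x):
    """Rank x would take if added to sorted_values (1 = smallest).

    Ties with existing entries share the average rank.
    """
    at_or_below = bisect_right(sorted_values, x)  # like MATCH(x, list), 0 if x is below all
    ties = sorted_values.count(x)                 # like COUNTIF(list, x)
    return at_or_below + 1 - ties / 2

values = [0.1003, 0.1984, 0.3895, 0.4506, 0.4724, 0.4856, 0.5602, 0.8542]
print(rank_in_sorted(values, 0.05))    # below every entry
print(rank_in_sorted(values, 0.46))    # between the 4th and 5th entries
print(rank_in_sorted(values, 0.4506))  # tied with the 4th entry
```

Because `bisect_right` returns 0 when the new number is below every entry, no special case or leading zero is needed here.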

This could do the trick:
=RANK.AVG(INDEX(A1:A8,RANDBETWEEN(1,COUNT(A1:A8))),A1:A8)
This ranks a random choice out of random numbers.

Related

How to pick certain amount of random excel cells with text and combine them in a string

I have a big list with about ~300 hashtags where I want to pick a certain amount (like 10 hashtags) randomly. Is it possible without VBA?
Here's one way of doing it, though it relies on your Excel being recent enough to support these functions:
LET
UNIQUE
RANDARRAY
SEQUENCE
With this formula:
=TEXTJOIN(" ",,LET(rng,$B$3:$B$13,n,ROWS(rng),idx,UNIQUE(RANDARRAY(n*n,1,1,n,TRUE)),s,SEQUENCE($E$1),INDEX(rng,INDEX(idx,s))))
It's a bit lengthy, but works around the observation that RANDARRAY() can return duplicate values by getting many more random numbers than are needed, and then taking the first x values.
NB. If the number of tags you want is small compared to the total available, then you probably don't need the (n*n) and can just use (n).
Hat-tips to @SpencerBarnes and @MayukhBhattacharya
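The oversample-then-deduplicate idea can be sketched in Python as follows. This is only an illustration of the technique, not a translation of the exact formula: the function name `pick_distinct` and the tag list are hypothetical, and indices are zero-based here.

```python
import random

def pick_distinct(items, k, rng=random):
    """Pick up to k distinct items by drawing many indices with replacement."""
    n = len(items)
    draws = [rng.randrange(n) for _ in range(n * n)]  # many more draws than needed
    distinct = list(dict.fromkeys(draws))             # keep first occurrences, in order
    return [items[i] for i in distinct[:k]]           # take the first k distinct picks

tags = [f"#tag{i}" for i in range(1, 12)]  # hypothetical list of 11 hashtags
print(" ".join(pick_distinct(tags, 5)))
```

In Python itself, `random.sample(tags, 5)` samples without replacement directly; the longer version above is shown only because it mirrors what the formula does.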
On Office 365:
=CONCAT(INDEX(A1:A300, RANDARRAY(10, 1, 1, 300, TRUE)))
Where A1:A300 is your list of hashtags, 10 is the number of hashtags to pick, and 1 and 300 are the first and last positions in the list.
Removing the CONCAT() function from the outside of the formula will output a spilled range rather than a concatenated string.

Is it possible to return a dynamic formula in VLOOKUP?

I am looking for a possibility to use VLOOKUP in combination with a dynamic formula (or a different solution if possible).
A simplified problem is provided in the image below. Based on a category the number of outlets in a room is calculated. The number of outlets can either be a fixed amount or based on the total area of a room which is provided in a separate column.
What is the best method to apply this to a (much larger) sample?
If "based on total area" is always x per unit area, then enter it as a proportion, e.g. 0.1a instead of 1 per 10 m^2, and use a formula that checks for the trailing 'a' and responds accordingly, e.g.:
=IF(RIGHT(VLOOKUP(B2,$F$2:$G$3,2,FALSE),1)="a",CEILING.MATH(SUBSTITUTE(VLOOKUP(B2,$F$2:$G$3,2,FALSE),"a","")*C2),VLOOKUP(B2,$F$2:$G$3,2,FALSE))
(I used "Ceiling" to get the integer value. Replace with Floor or Round as needed).
Probably easier from a formula writing (and readability) POV would be a separate column that holds the a (or rather: a "fixed/proportional" column) and key off of that, but the result is the same.
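The trailing-'a' convention can be sketched in Python like this. The category names and table contents are invented for illustration, since the original lookup table was in an image; only the "0.1a means 0.1 per unit area" convention comes from the answer above.

```python
import math

def outlets(category, area, table):
    """Return the outlet count for a room, given its category and area."""
    rule = table[category]
    if isinstance(rule, str) and rule.endswith("a"):
        # Proportional rule: strip the 'a', multiply by the area, round up
        return math.ceil(float(rule[:-1]) * area)
    return int(rule)  # fixed amount

table = {"office": "0.1a", "storage": 2}  # hypothetical lookup table
print(outlets("office", 25, table))
print(outlets("storage", 25, table))
```

Swap `math.ceil` for `math.floor` or `round` to match whichever rounding you chose in the formula.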

Ranking with subsets

I'm trying to rank values and have managed to work out how to sort ties. My data looks at the total number of entries, ranks based on that and if there is a tie it looks to the next column of values to sort them out. However, I have two classes (East and West I've called them) of data within my dataset and want to rank them both separately (but stick to the rules above). So, if I had seven entries, 3 of them West and 4 of the East, I want West to have ranking 1,2,3 based on all the values that lie in that subset and East would have ranking 1,2,3,4. Can you explain what your formula is doing so I can understand how to apply your answer better in the future.
Effectively I'm asking what formula I need to achieve my result.
Cheers
Paul
There are a few related ways to do this, most involving SUMPRODUCT. If you don't like the solution below and would like to research other ways/explanations, try searching for "rankif".
The function looks up the Class and Value columns and, for every row, returns TRUE (1) if that row's Class matches the current Class AND its Value is larger than the current Value, and FALSE (0) otherwise. The SUM adds up all these 1s, and the 1+ makes the top entry rank 1 instead of 0. Remember to enter it as an array formula using Ctrl+Shift+Enter before dragging down.
I used the array formula and SUM above to explain, but the following also works and might even be faster since it's not an array formula. It's the same idea, except we hijack SUMPRODUCT's ability to spit out a single value from an array.
=1+SUMPRODUCT(($A$2:$A$8=A2)*($B$2:$B$8>B2))
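To see what the SUMPRODUCT formula computes, here is a small Python sketch of the same count. The sample data is made up; column A corresponds to `classes` and column B to `values`.

```python
def rank_within_class(classes, values):
    """Rank each row within its own class, 1 = largest value."""
    return [
        # 1 + number of rows in the same class with a strictly larger value
        1 + sum(c == ci and v > vi for c, v in zip(classes, values))
        for ci, vi in zip(classes, values)
    ]

classes = ["West", "East", "West", "East", "East", "West", "East"]
values = [10, 7, 12, 9, 3, 8, 5]
print(rank_within_class(classes, values))  # [2, 2, 1, 1, 4, 3, 3]
```

West gets ranks 1 to 3 and East gets ranks 1 to 4, each computed only against rows of the same class.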
EDIT
To extend the rank-if, you could add more subsets to rank by multiplying more conditions:
You can also easily add tiebreakers by adding another SUMPRODUCT to treat the ties as an additional subset:
The first SUMPRODUCT is the 'base rank', while the second SUMPRODUCT is tiebreaker #1.
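The tiebreaker idea can be sketched in Python as below. This assumes the second count looks at rows in the same class with an equal value and a larger tiebreaker value; that reading, the function name and the sample data are assumptions, not the answer's exact formula.

```python
def rank_with_tiebreak(classes, values, tiebreak):
    """Rank within class by value, breaking ties on a second column."""
    rows = list(zip(classes, values, tiebreak))
    ranks = []
    for ci, vi, ti in rows:
        # Base rank: same class, strictly larger value
        base = sum(c == ci and v > vi for c, v, t in rows)
        # Tiebreaker: same class, same value, strictly larger tiebreak value
        tie = sum(c == ci and v == vi and t > ti for c, v, t in rows)
        ranks.append(1 + base + tie)
    return ranks

print(rank_with_tiebreak(["West", "West", "West", "East"],
                         [5, 5, 3, 5],
                         [1, 2, 9, 0]))  # [2, 1, 3, 1]
```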

nested excel functions with conditional logic

Just getting started in Excel and I was working with a database extract where I need to count values only if items in another column are unique.
So, below is my starting point:
=SUMPRODUCT(COUNTIF(C3:C94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"}))
What I'd like to figure out is the syntax to do something like this:
=sumproduct (Countif range1 criteria..., where range2 criteria="is unique value")
Am I getting this right? The syntax is a bit confusing, and I'm not sure I've chosen the right functions for the task.
I just had to solve this same problem a week ago.
This method works even when you can't always sort on the grouping column (J in your case). If you can keep the data sorted, @MikeD's solution will scale better.
Firstly, do you know the FREQUENCY trick for counting unique numbers? FREQUENCY is designed to create histograms. It takes two arrays, 'data' and 'bins'. It sorts 'bins', then creates an output array that's one longer than 'bins'. Then it takes each value in 'data' and determines which bin it belongs in, incrementing the output array accordingly. It returns the array. Here's the important part: If a value appears in 'bins' more than once, any 'data' value meant for that bin goes in the first occurrence. The trick is to use the same array for both 'data' and 'bins'. Think it through, and you'll see that there's one non-zero value in the output for each unique number in the input. Note that it only counts numbers.
In short, I use this:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to count unique numeric values in <array>
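If the trick is hard to picture, this Python sketch imitates the binning behaviour described above. It is a simplified model rather than Excel's implementation: it returns the counts in sorted-bin order, and the function names are made up.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Simplified model of FREQUENCY: counts per bin, plus one overflow slot."""
    edges = sorted(bins)
    counts = [0] * (len(edges) + 1)
    for x in data:
        # First bin whose edge is >= x; duplicate edges after it stay at zero
        counts[bisect_left(edges, x)] += 1
    return counts

def count_unique(numbers):
    """SUM(SIGN(FREQUENCY(array, array))): one non-zero bin per distinct number."""
    return sum(1 for c in frequency(numbers, numbers) if c > 0)

nums = [3, 1, 3, 2, 1, 3]
print(frequency(nums, nums))  # [2, 0, 1, 3, 0, 0, 0]
print(count_unique(nums))     # 3
```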
From this, we just need to construct arrays containing numbers where appropriate and text elsewhere.
In the example below, I'm counting unique days when the color is red and the fruit is citrus:
This is my conditional array, returning 1 or true for the rows I'm interested in:
($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0))
Note that this requires ctrl-shift-enter to be used as an array formula.
Since the value I'm grouping by for uniqueness is text (as is yours), I need to convert it to numeric. I use:
MATCH($C$2:$C$10,$C$2:$C$10,0)
Note that this also requires ctrl-shift-enter
So, this is the array of numeric values within which I'm looking for uniqueness:
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
Now I plug that into my uniqueness counter:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to get:
=SUM(SIGN(FREQUENCY(
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),""),
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
)))
Again, this must be entered as an array formula using ctrl-shift-enter. Replacing SUM with SUMPRODUCT will not cut it.
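Stripped of the array mechanics, the formula counts distinct grouping values among the rows that pass the condition. A Python sketch with made-up rows (color, fruit, day) standing in for columns A, B and C:

```python
def count_unique_where(rows, keep, key):
    """Count distinct key(row) values among the rows for which keep(row) is true."""
    return len({key(r) for r in rows if keep(r)})

CITRUS = {"orange", "grapefruit", "lemon", "lime"}

rows = [  # hypothetical sample data: (color, fruit, day)
    ("red", "orange", "Mon"),
    ("red", "lemon", "Mon"),
    ("red", "apple", "Tue"),
    ("green", "lime", "Wed"),
    ("red", "lime", "Thu"),
]

red_citrus_days = count_unique_where(
    rows,
    keep=lambda r: r[0] == "red" and r[1] in CITRUS,
    key=lambda r: r[2],
)
print(red_citrus_days)  # 2 (Mon and Thu)
```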
In your example, you'd use something like:
=SUM(SIGN(FREQUENCY(
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),""),
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),"")
)))
I'll note, though, that scaling might be a problem on data sets as large as yours. I tested it on larger data sets, and it was fairly fast on the order of 10k rows, but really slow on the order of 100k rows, such as yours. The internal arrays are plenty fast, but the FREQUENCY function slows down. I'm not sure, but I'd guess it's between O(n log n) and O(n^2) depending on how the sort is implemented.
Maybe this doesn't matter - none of this is volatile, so it'll just need to calculate once upon refreshing the data. If the column data is changing, though, this could be painful.
Assuming the source data is sorted by the key value [A], start by determining the occurrence count of the key column
B2: =IF(A2=A1;B1+1;1)
Next determine a group sum
C2: =SUMIF($A$2:$A$9;A2;$B$2:$B$9)
A key is unique if its group sum is exactly 1
D2: =(C2=1)
To count records which match a certain criterion AND are unique, include column D in a formula such as =IF(AND(D2;[yourcondition]);1;0) and sum this column
Another option is to assume a key is unique within a sorted list if it is unequal to both its predecessor and its successor, so you could find the unique records like
E2: =AND(A2<>A1;A2<>A3)
G2: =IF(AND(E2;F2="this");1;0)
E and G can of course be combined into one single formula (not sure though if that helps ...)
G2(2): =IF(AND(AND(A2<>A1;A2<>A3);F2="this");1;0)
resolving unnecessarily nested AND's:
G2(3): =IF(AND(A2<>A1;A2<>A3;F2="this");1;0)
all formulas in row 2 should be copied down to the end of the list
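The predecessor/successor test can be sketched in Python as follows, assuming the keys are already sorted. The sample keys and the second column (standing in for column F) are made up.

```python
def unique_flags(sorted_keys):
    """True where a key differs from both neighbours, i.e. occurs exactly once."""
    n = len(sorted_keys)
    flags = []
    for i, k in enumerate(sorted_keys):
        differs_prev = i == 0 or sorted_keys[i - 1] != k
        differs_next = i == n - 1 or sorted_keys[i + 1] != k
        flags.append(differs_prev and differs_next)
    return flags

keys = ["a", "b", "b", "c", "d", "d", "d"]                       # column A, sorted
other = ["this", "this", "that", "that", "this", "this", "this"]  # column F

flags = unique_flags(keys)
count = sum(f and o == "this" for f, o in zip(flags, other))
print(flags)
print(count)  # 1: only key "a" is unique and has "this"
```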

How to find the first and second maximum number?

I am trying to find the highest and second-highest numbers in Excel. How can I do that? I have not found the right formula.
Note: I have already used the large and max formula.
=LARGE(E4:E9;1)
Edit: I know that if I write 2 instead of 1 I will get the result, but I have to click the mouse to see all the results.
If you want the second highest number you can use
=LARGE(E4:E9;2)
although that doesn't account for duplicates, so you could get the same result as MAX.
If you want the largest number that is smaller than the maximum number you can use this version
=LARGE(E4:E9;COUNTIF(E4:E9;MAX(E4:E9))+1)
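The difference between the two versions is easiest to see with duplicates at the top. A small Python sketch with made-up data; `largest_below_max` is a hypothetical name that mirrors the COUNTIF-based formula:

```python
def largest_below_max(values):
    """Largest value strictly smaller than the maximum, or None if there is none."""
    top = max(values)
    k = values.count(top) + 1              # skip every copy of the maximum
    ordered = sorted(values, reverse=True)
    return ordered[k - 1] if k <= len(ordered) else None

data = [9, 4, 9, 7, 1, 7]
print(sorted(data, reverse=True)[1])  # 9: plain second largest repeats the maximum
print(largest_below_max(data))        # 7
```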
OK I found it.
=LARGE($E$4:$E$9;A12)
=large(array, k)
Array Required. The array or range of data for which you want to determine the k-th largest value.
K Required. The position (from the largest) in the array or cell range of data to return.
