Interview question about "largest range" makes no sense - seq

Here's the question. I'm actually dumbfounded. I don't even get the question. What are they on about?
What even is a largest range? What do they mean by largest? What's a range? They say a range is a collection of numbers that come right after each other in the set of real integers. Okay, so 1, 2, 3, 4, stuff like that, right? But then they say the numbers need not be ordered or even adjacent.... but then they're not coming right after each other!! They are contradicting their own previous statement. Now I have no idea what a range is.
Their example doesn't help either. Why is [0, 15, 5, 2, 4, 10, 7] the largest range in that vector?
What is going on?

It's not very clear in the question, but I'm pretty sure the interviewer means that a "range" is a set of consecutive integers (n, n+1, n+2, ...).
The range [0,7] is actually [0,1,2,3,4,5,6,7] since all of those appear in the full set.
The actual order doesn't matter.

In the example you were given in the interview, which you list in your question as well, the input array is: [1, 11, 3, 0, 15, 5, 2, 4, 10, 7, 12, 6]. The "largest range" is identified as [0, 7] because all the integers from 0 to 7 are included in that array.
There isn't another range in the input array that is longer than 0 to 7. For instance, there is a [10, 12] range in the input array, but that range has a length of 3, which is smaller than the length of the [0, 7] range, which is 8.
In this case, a range is understood as a continuous list of integers, and the largest range is the list with the most integers.

It means: find the largest continuous range of numbers.
E.g. in the array [0,1,2,5,6,7,8,9,10]
there are 2 continuous lists,
[0,1,2] and [5,6,7,8,9,10]. The larger range is the second one, so the output must be [5,10],
i.e. the smallest and largest values of the largest range.
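The interpretation in the answers above can be sketched in Python. This is only one possible approach (a set lookup, walking forward from each range start); the function name largest_range is made up for illustration and is not part of the interview question.

```python
def largest_range(nums):
    """Return [low, high] of the longest run of consecutive integers in nums."""
    values = set(nums)          # order and duplicates do not matter
    best = []
    best_len = 0
    for n in values:
        if n - 1 in values:
            continue            # n is not the start of a run, skip it
        end = n
        while end + 1 in values:
            end += 1            # extend the run as far as it goes
        if end - n + 1 > best_len:
            best_len = end - n + 1
            best = [n, end]
    return best


print(largest_range([1, 11, 3, 0, 15, 5, 2, 4, 10, 7, 12, 6]))  # [0, 7]
print(largest_range([0, 1, 2, 5, 6, 7, 8, 9, 10]))              # [5, 10]
```

Each value is visited a constant number of times, so this runs in roughly linear time. When two runs tie for longest, this sketch makes no promise about which one is returned.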

Related

Longest “increasing” subsequence with two consecutive numbers whose average is less than the third number

Problem Statement
Given an array of integers, find, in O(n^3) time, the length of the longest subsequence in which the average of every two consecutive numbers is less than the number that follows them.
Example:
[20, 10, 5, 0, 6, 4, 15, 6, 9, 8], the longest subsequence that satisfies the requirement is 5, 0, 6, 4, 6, 9, 8, and the length of that sequence is 7. (5 + 0) / 2 = 2.5 < 6, (0 + 6) / 2 = 3.0 < 4, (6 + 4) / 2 = 5.0 < 6, etc.
What I tried
1st approach: O(n^2)
A generic dynamic programming approach, I define the DP array to be the length of the longest subsequence that satisfies the condition.
If (i-2)th and (i-1)th integers’ average is less than ith integer, we add one to the dp array. The solution is the last element of the DP array.
This didn’t work as I realized it is only considering the numbers in the original array, not the subsequence I am trying to achieve. So, this approach only gave me 5 as the answer for the example input above, and the answer would be 5, 0, 6, 4, 15. The approach did not account for disjoint parts of the original sequence to create the new subsequence.
1.5th approach
While writing out the problem on my notes, I realized the corresponding average subsequence for the example input is the longest. Following the idea of a LIS problem, I created an array of all the average numbers to find the longest increasing subsequence in that array. This solved the example input but failed more complicated inputs.
2nd approach: O(n^3)
Using the hint in the problem statement that the algorithm can be O(n^3), I tried coming up with a definition for a 2D DP array and a loop to make it O(n^3). I defined DP[i][j] to be the length of the longest subsequence from the start element to the ith element, while considering the jth element.
Considering the example input, for instance, DP[2][6] = 3 because the subsequence would be 10, 5, 15. From the first element to the element at index 2, we consider the subsequence 10, 5, and the element at index 6 is 15, so the subsequence here is 10, 5, 15, and the length is 3. Repeat until the half of the table above the main diagonal is filled, and the solution is the last element (last row, last column) in that half.
I thought this was it, but I ran into problems, such as not knowing which part of the DP table I should be reusing and not knowing exactly what the last two numbers of the subsequence I am building are. Ultimately, I didn't know where to go next.
Other thoughts
I think a 3D DP array could also work, but I haven’t really thought about how I would define the array…
Any help would be greatly appreciated!
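One possible direction, offered as a sketch and not as a verified reference solution: key the 2D table on the last two chosen elements, so that dp[j][k] is the length of the longest valid subsequence ending with the elements at indices j and k. That removes the uncertainty about what the last two numbers are. The function name and the choice to count any subsequence of fewer than three elements as valid are assumptions made here.

```python
def longest_avg_subsequence(nums):
    """Length of the longest subsequence where each element (from the third on)
    is greater than the average of the two elements chosen before it."""
    n = len(nums)
    if n < 3:
        return n  # assumption: very short subsequences are trivially valid
    # dp[j][k]: best length of a valid subsequence ending with nums[j], nums[k]
    dp = [[2] * n for _ in range(n)]
    best = 2
    for k in range(n):
        for j in range(k):
            for i in range(j):
                # (nums[i] + nums[j]) / 2 < nums[k], written without division
                if nums[i] + nums[j] < 2 * nums[k]:
                    dp[j][k] = max(dp[j][k], dp[i][j] + 1)
            best = max(best, dp[j][k])
    return best


print(longest_avg_subsequence([20, 10, 5, 0, 6, 4, 15, 6, 9, 8]))  # 7
```

The three nested loops give O(n^3) time, and every dp[i][j] that is read was already finalised in an earlier iteration of the outer loop.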

How to get the second largest value in a column

Recently I discovered the LARGE and SMALL worksheet functions, which one can use to determine the first, second, third, ... largest or smallest value in an array.
At least, that's what I thought:
When having a look at the array [1, 3, 5, 7, 9] (in one column or row), the LARGE(...;2) gives 7 as expected, but:
When having a look at the array [1, 1, 5, 9, 9], I expect LARGE(...;2) to give 5 but instead I get 9.
Now this makes sense: it seems that the function LARGE(...;2) takes the largest entry in the array (the value 9 in the last but one place), deletes it and gives the largest entry of the reduced array (which still contains another 9), but this is not what one might expect intuitively.
In order to get 5 from [1, 1, 5, 9, 9], I would need something like:
=LARGE_OF_UNIQUE_VALUES_OF(...;2)
I didn't find this in LARGE documentation.
Does anybody know an easy way to achieve this?
If you have the new Dynamic Array formulas:
=LARGE(UNIQUE(...),2)
If not, use AGGREGATE (the 0 in MATCH forces an exact match, so the data does not have to be sorted):
=AGGREGATE(14,7,A1:A5/(MATCH(A1:A5,A1:A5,0)=ROW(A1:A5)),2)
This is a bit of a hack.
=LARGE(IF(YOUR_DATA=LARGE(YOUR_DATA,1),SMALL(YOUR_DATA,1)-1,YOUR_DATA),1)
The idea is to (a) take any value in your data that is equal to the largest element and set it to less than the smallest element, then (b) find the (new) largest element. It's OK if you want the 2nd largest, but extending to 3rd largest etc. gets progressively uglier.
Hope that helps
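For readers who want to see the difference outside a worksheet, here is a small Python sketch of the two behaviours discussed above. It only mirrors the idea; the function names are invented here and are not Excel functions.

```python
def kth_largest(values, k):
    """Mimics LARGE(values, k): duplicates each occupy their own rank."""
    return sorted(values, reverse=True)[k - 1]


def kth_largest_unique(values, k):
    """Mimics LARGE(UNIQUE(values), k): duplicates are collapsed first."""
    return sorted(set(values), reverse=True)[k - 1]


data = [1, 1, 5, 9, 9]
print(kth_largest(data, 2))         # 9, the second 9 takes rank 2
print(kth_largest_unique(data, 2))  # 5
```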

Using result of an Excel array function in a calculation

I am attempting to count instances of a particular value in Excel, from the last instance of a prior value.
Assume a vertical list starting in cell A1: 1, 2, 3, 4, 5, 4, 5, 3, 4, 5, 2, 3, 4, 3, 4, 2, 3, 4, 5
I can use an array function in, say, B14 (A14 value: 3), of {=MAX(ROW($1:14)*(A$1:A14=A14-1))} to give me the row number of the last instance of a "2" (row 11 for the list above).
I can then have, in C15, a function =COUNTIF(OFFSET(A14,0,0,B14-ROW(A14),1):A14,A14), which will count the instances of 3's since the last 2.
The question is: how do I integrate that array function directly into the final formula, so as not to have to waste a column with the interim calculation?
Edit
The list of numbers represents a level of indentation, so the end result will be a compound of these calculations with different offset checking to provide section numbering: 1; 1.1; 1.1.1, 1.2, 1.2.1, 1.2.2, etc
I want a single function that can calculate this entire depth level, without having to waste several columns identifying how many rows above the previous indent layer was defined.
Try this array formula in cell B14:
{=COUNTIF(OFFSET($A14,0,0,
MAX(ROW($1:14)*($A$1:$A14=$A14-1))
-ROW($A14),1):$A14,$A14)}
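The logic of the combined formula can be sketched in Python to make the two steps visible: find the last row holding the parent level, then count the current level from there down. The function name and the 0-based indexing are choices made for this sketch only.

```python
def count_since_parent(levels, idx):
    """Count entries equal to levels[idx] between the last occurrence of
    (levels[idx] - 1) and idx, inclusive. idx is 0-based."""
    level = levels[idx]
    start = 0  # if no parent level exists, count from the top of the list
    for i in range(idx, -1, -1):
        if levels[i] == level - 1:
            start = i  # the MAX(ROW(...)*(...)) part of the formula
            break
    # the COUNTIF part of the formula
    return sum(1 for x in levels[start:idx + 1] if x == level)


levels = [1, 2, 3, 4, 5, 4, 5, 3, 4, 5, 2, 3, 4, 3, 4, 2, 3, 4, 5]
print(count_since_parent(levels, 13))  # row 14 in the sheet -> 2
```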

Formula logic for extracting specific cell data

I want to extract the first £ figure, and not the second one, from sample data such as
(24M UNLTD+INS 30GB £347+£30 S6)
The following array formula has been used in Stack Overflow questions.
{=MID(A1,FIND("£",A1),MIN(IF(ISERROR(MID(MID(A1,FIND("£",A1)+1,999),ROW($1:$999),1)+0),ROW($1:$999),999)))}
I attempted an analysis of the formula as represented in the image below. I am not able to grasp the logic of the part
MIN(IF(ISERROR(MID(MID(A1,FIND("£",A1)+1,999),ROW($1:$999),1)+0),ROW($1:$999),999))
and how it leads to a figure of 4. Could this part of the formula be explained, clarifying the role of each of its constituents?
Try the following as a standard (non-array) formula,
=--REPLACE(REPLACE(A2, 1, FIND("£", A2), ""), FIND("+", REPLACE(A2, 1, FIND("£", A2), "")), LEN(A2), "")
First the inside REPLACE(A2, 1, FIND("£", A2), "") erases everything up to the first £ symbol, then the same logic is applied to erase everything in that modified text from the first + to the end. The -- converts text-that-looks-like-a-number to an actual number.
The array formula you provided uses a more convoluted logic.
FIND("£", A2) + 1 finds the starting point of the first number after the first £ symbol. e.g. The first £ is the 20th character so it returns 21.
MID(A2, FIND("£",A2)+1, 999) extracts the text following that first £ symbol. The text might look like it starts with a number but it is text masquerading as a number. e.g. 347+£30 S6
In an array formula, ROW($1:$999) processes as a sequence of numbers from 1 to 999, incrementing by 1 for each cycle of calculation.
MID(MID(A1, FIND("£", A1) + 1, 999), ROW($1:$999), 1) returns an array of text values, each one 1 character long and 1 position deeper into the text than the previous one. e.g. 3, 4, 7, +, £, etc.
+0 is used to try and convert each of these pieces of text to a number. The ISERROR function returns TRUE if a piece of text cannot be converted to a true number. The first one that cannot be turned into a true number is the 4th e.g. +
The IF catches the TRUE on the fourth position and returns 4 from the second ROW($1:$999). It has returned 999 for positions 1, 2 and 3. e.g. 999, 999, 999, 4, 5, etc.
The MIN catches the smallest of these numbers returned as an array. This is where the 4 comes from. e.g. 999, 999, 999, 4, 5, 999, ...
You can see this yourself by changing all of the 999's to 9 then using the Evaluate Formula command. The reason changing to 9 is important is so that the returned arrays of numbers look like 1, 2, 3, 4, 5, 6, 7, 8, 9, which does not obfuscate the results quite as badly as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, ... 998, 999.
This shows the formula evaluated several steps into the process. Note the 4 being returned to the MIN function.
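The same steps can be traced in a short Python sketch. It is an approximation of the array formula, not a translation: str.isdigit stands in for the "+0 inside ISERROR" test, and the function name is invented here.

```python
def first_pound_figure(text, symbol="£"):
    """Return the symbol plus the run of digits that follows its first occurrence."""
    start = text.find(symbol)            # FIND("£", A1), but 0-based
    if start == -1:
        return ""
    tail = text[start + 1:]              # MID(A1, FIND("£", A1) + 1, 999)
    # IF(ISERROR(char + 0), position, 999) for each 1-based position
    flags = [pos if not ch.isdigit() else 999
             for pos, ch in enumerate(tail, start=1)]
    flags.append(len(tail) + 1)          # past the end, MID gives "" and "" + 0 errors
    length = min(flags)                  # MIN(...) -> 4 for the sample text
    return text[start:start + length]    # MID(A1, FIND("£", A1), length)


print(first_pound_figure("24M UNLTD+INS 30GB £347+£30 S6"))  # £347
```

For the sample text the flags list starts 999, 999, 999, 4, 5, which is the array described in the answer, and the minimum of 4 is the number of characters taken starting at the £ sign.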

Conditional Max/Min on Horizontally oriented data

Example data set
Right, above is a link to an image of a sub-segment of my data set. It is oriented in sets of 3 columns, with the first being a concentration, the second a qualifier, and the last an MDL - and continues for up to 95 samples (so a total of 285 columns, making manual entry impractical). How can I calculate the max or min of the concentration values for those that have a qualifier of "u" or, conversely, for those that have no qualifier?
I can't figure out anything, and unfortunately I don't have the time to re-orient the data. Anybody have an idea?
Perhaps something like this will do,
  
The 8 formulas in C7:J7 are,
=AGGREGATE(15, 6, $A2:$AY2/(($A$1:$AY$1=C$6)*($B2:$AZ2="U")), 1)
=AGGREGATE(15, 6, $C2:$BA2/(($C$1:$BA$1=D$6)*($B2:$AZ2="U")), 1)
=AGGREGATE(14, 6, $A2:$AY2/(($A$1:$AY$1=E$6)*($B2:$AZ2="U")), 1)
=AGGREGATE(14, 6, $C2:$BA2/(($C$1:$BA$1=F$6)*($B2:$AZ2="U")), 1)
=AGGREGATE(15, 6, $A2:$AY2/(($A$1:$AY$1=G$6)*($B2:$AZ2<>"U")), 1)
=AGGREGATE(15, 6, $C2:$BA2/(($C$1:$BA$1=H$6)*($B2:$AZ2<>"U")), 1)
=AGGREGATE(14, 6, $A2:$AY2/(($A$1:$AY$1=I$6)*($B2:$AZ2<>"U")), 1)
=AGGREGATE(14, 6, $C2:$BA2/(($C$1:$BA$1=J$6)*($B2:$AZ2<>"U")), 1)
Those cover both minimum and maximum values when either including or excluding the qualifier.
Addendum: Excluding blank cells
One more condition to check the LEN of the values can be added. To change the length of the value into a divide-by-1 (unchanged) or divide-by-0 (#DIV/0! error), wrap the LEN in the SIGN function.
=AGGREGATE(15, 6, $A2:$AY2/(SIGN(LEN($A2:$AY2))*($A$1:$AY$1=C$6)*($B2:$AZ2="U")), 1)
I'm retaining the SMALL sub-function as only AGGREGATE's sub-functions 14 and up process as an array.
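The filtering idea behind these formulas can also be sketched in Python for a single row. The layout (concentration, qualifier, MDL repeated across the row) follows the question, but the sample values, the function name and its parameters are all made up for illustration.

```python
def conditional_extreme(row, flagged, use_max=False, qualifier="U"):
    """Min (or max) of the concentration values in a row laid out as
    concentration, qualifier, MDL, concentration, qualifier, MDL, ...
    flagged=True keeps values whose qualifier matches; False keeps the rest."""
    concentrations = row[0::3]   # every third cell, starting at the first
    qualifiers = row[1::3]       # the cell right after each concentration
    picked = []
    for conc, qual in zip(concentrations, qualifiers):
        if conc is None or conc == "":
            continue             # skip blank cells, like the LEN/SIGN addendum
        has_flag = str(qual or "").strip().upper() == qualifier
        if has_flag == flagged:
            picked.append(conc)
    if not picked:
        return None              # nothing matched the condition
    return max(picked) if use_max else min(picked)


# Hypothetical row with four samples
row = [0.5, "U", 0.5, 2.1, "", 0.4, 1.3, "u", 1.3, 7.0, None, 0.2]
print(conditional_extreme(row, flagged=True))                 # 0.5
print(conditional_extreme(row, flagged=False, use_max=True))  # 7.0
```

The comparison is case-insensitive, matching how ="U" behaves in a worksheet.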

Resources