I'm very confused. The documentation for the Data.List package says:
transpose [[1,2,3],[4,5,6]] -->
[[1,4],[2,5],[3,6]]
Which would mean that every list is a row?
Reading other literature it seems that a list is in fact a column. Which one is it?
Neither. A list is a sequence. We can pretend it is a row, or a column, but that's an arbitrary choice.
transpose takes a list of lists as input. We can think of that as a sequence of rows, forming a matrix (or a "jagged" matrix if rows are of unequal size). The result, interpreted in the same way, is the transposed matrix.
If we want, we can also see the input of transpose as a sequence of columns, forming a matrix. The result, if we interpret it in the same way, is again the transposed matrix.
TL;DR: for transpose it does not matter if we see the list of lists as a list/sequence of rows of a matrix, or a list/sequence of columns of a matrix, as long as we interpret the result in the same way.
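The point can be checked with a small sketch. It is written in Python rather than Haskell, and the helper names (transpose, entry_rows, entry_cols) are made up for illustration. Note that zip truncates jagged input, which Data.List.transpose does not, so the sketch assumes a rectangular list of lists.

```python
def transpose(lists):
    """Rectangular analogue of Data.List.transpose (zip truncates jagged input)."""
    return [list(group) for group in zip(*lists)]


def entry_rows(lists, i, j):
    """Matrix entry (i, j) when each inner list is read as a row."""
    return lists[i][j]


def entry_cols(lists, i, j):
    """Matrix entry (i, j) when each inner list is read as a column."""
    return lists[j][i]


m = [[1, 2, 3], [4, 5, 6]]
t = transpose(m)

# Row reading: m is a 2x3 matrix, and t is its 3x2 transpose.
rows_ok = all(entry_rows(t, j, i) == entry_rows(m, i, j)
              for i in range(2) for j in range(3))

# Column reading: m is a 3x2 matrix, and t is its 2x3 transpose.
cols_ok = all(entry_cols(t, j, i) == entry_cols(m, i, j)
              for i in range(3) for j in range(2))
```

Both checks hold for the same t, which is the sense in which the choice of rows or columns is arbitrary.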
I have a table of data with many data repeating.
I have to sort the rows randomly, but without identical names ending up next to each other, as shown here:
How can I do that in Excel?
Perfect case for a recursive LAMBDA.
In Name Manager, define RandomSort as
=LAMBDA(ζ,
LET(
ξ, SORTBY(ζ, RANDARRAY(ROWS(ζ))),
λ, TAKE(ξ, , 1),
κ, SUMPRODUCT(N(DROP(λ, -1) = DROP(λ, 1))),
IF(κ = 0, ξ, RandomSort(ζ))
)
)
then enter
=RandomSort(A2:B8)
within the worksheet somewhere. Replace A2:B8 - which should be your data excluding the headers - as required.
If no solution is possible then you will receive a #NUM! error. I didn't get round to adding a clause to determine whether a certain combination of names has a solution or not.
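For readers who want to see the idea outside Excel, here is a rough Python sketch of the same reshuffle-until-valid approach. The function name random_sort, the max_tries cap and the sample rows are assumptions for illustration; the cap plays the role of Excel's recursion limit, so an impossible input raises an error instead of looping forever.

```python
import random


def random_sort(rows, max_tries=10_000, rng=random):
    """Shuffle rows until no two neighbours share a name (first field)."""
    rows = list(rows)
    for _ in range(max_tries):
        candidate = rows[:]
        rng.shuffle(candidate)
        names = [row[0] for row in candidate]
        # Accept the shuffle only if every adjacent pair of names differs.
        if all(a != b for a, b in zip(names, names[1:])):
            return candidate
    raise ValueError("no valid ordering found within max_tries shuffles")


# Hypothetical sample data: (name, age) pairs.
sample = [("Ann", 30), ("Bob", 25), ("Ann", 41), ("Cid", 19),
          ("Bob", 33), ("Ann", 52), ("Dee", 27)]
shuffled = random_sort(sample, rng=random.Random(0))
```

As with the LAMBDA, the running time depends on luck: the more duplication in the data, the more reshuffles are needed on average.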
This is just an attempt, because the question might need clarification or more sample data to understand the actual scenario. The main idea is to generate a random list from the input, then distribute it evenly by names. This ensures no repetition of consecutive names. It is not the only possible ordering (the problem may have multiple valid combinations), but it is a valid one. The solution is volatile (every time Excel recalculates, a new output is generated) because RANDARRAY is a volatile function.
In cell D2, you can use the following formula:
=LET(rng, A2:B8, m, ROWS(rng), seq, SEQUENCE(m),
idx, SORTBY(seq, RANDARRAY(m,,1,m, TRUE)), rRng, INDEX(rng, idx,{1,2}),
names, INDEX(rRng,,1), nCnts, MAP(seq, LAMBDA(s, ROWS(FILTER(names,
(names=INDEX(names,s)) * (seq<=s))))), SORTBY(rRng, nCnts))
Here is the output:
Update
Looking at @JosWoolley's approach, the generation of the random sorting can be simplified, so the resulting formula becomes:
=LET(rng, A2:B8, m, ROWS(rng), seq, SEQUENCE(m), rRng,SORTBY(rng, RANDARRAY(m)),
names, TAKE(rRng,,1), nCnts, MAP(seq, LAMBDA(s, ROWS(FILTER(names,
(names=INDEX(names,s)) * (seq<=s))))), SORTBY(rRng, nCnts))
Explanation
The LET function is used for easy reading and composition. The name idx represents a random sequence of the input index positions. The name rRng represents the input rng, sorted randomly. This sorting alone doesn't ensure consecutive names are distinct.
In order to ensure consecutive names are not repeated, we number the occurrences of each repeated name (nCnts), using MAP. This is similar to the idea provided by @cybernetic.nomad in the comment section, but adapted for an array version (we cannot use COUNTIF because it requires a range). Finally, we use SORTBY with the MAP result (nCnts) as the by_array argument, so that names are evenly distributed and no consecutive names are the same. Every time Excel recalculates, you will get an output with the names distributed evenly in a different way.
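The numbering step may be easier to follow as a short Python sketch. The function names are invented for illustration, and Python's stable sorted stands in for SORTBY. One caveat I noticed while writing it: in this sketch a name can still land next to itself where one round of occurrences ends and the next begins, so the output is worth checking on your own data.

```python
import random


def occurrence_numbers(names):
    """For each position, how many times that name has appeared so far (nCnts)."""
    seen = {}
    counts = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        counts.append(seen[name])
    return counts


def spread_rows(shuffled_rows):
    """Stable-sort rows by occurrence number: all 1st occurrences, then 2nd, ..."""
    counts = occurrence_numbers([row[0] for row in shuffled_rows])
    order = sorted(range(len(shuffled_rows)), key=lambda i: counts[i])
    return [shuffled_rows[i] for i in order]


def random_spread(rows, rng=random):
    """Shuffle first (the rRng step), then spread by occurrence number."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return spread_rows(shuffled)
```

For example, spread_rows([("B", 1), ("A", 2), ("A", 3)]) leaves the order unchanged, because the two A rows sit on either side of the boundary between the first and second rounds.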
Not sure if it's worth posting this, but I might as well share the results of my research, such as it is. The problem is similar to that of rearranging the characters in a string so that no two identical characters are adjacent. The method is to insert whichever of the remaining characters (names) has the highest frequency at this point and is not the same as the previous character, then reduce its frequency once it has been used. It's fairly easy to implement this in Excel, even in Excel 2019. So if the initial frequencies are in D2:D8, for convenience using COUNTIF:
=COUNTIF(A$2:A$8,A2)
You can use this formula in (say) F2 and pull it down:
=INDEX(A$2:A$8,MATCH(MAX((D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1)),(D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1),0))
and similarly in G2 to get the ages:
=INDEX(B$2:B$8,MATCH(MAX((D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1)),(D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1),0))
I'm fairly sure this will always produce a correct result if one is possible.
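Here is the same greedy rule as a minimal Python sketch, handling names only. The function name greedy_arrange is made up, and ties are broken by first appearance in the input, which is my reading of how MATCH with match type 0 behaves in the formula above.

```python
from collections import Counter


def greedy_arrange(names):
    """Repeatedly take the most frequent remaining name that differs from the last one."""
    remaining = Counter(names)  # keys keep first-appearance order
    result = []
    prev = None
    for _ in range(len(names)):
        candidates = [n for n in remaining if remaining[n] > 0 and n != prev]
        if not candidates:
            raise ValueError("only the previous name is left; no valid arrangement found")
        # max returns the first maximal candidate, so earlier names win ties.
        choice = max(candidates, key=lambda n: remaining[n])
        remaining[choice] -= 1
        result.append(choice)
        prev = choice
    return result


arranged = greedy_arrange(["A", "A", "A", "B", "B", "C", "C"])
```

The output is fully deterministic, which illustrates the lack of randomness discussed next.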
HOWEVER there is no randomness built in to this method. You can see if I extend it to more data that in the first several rows the most common name simply alternates with the other two names:
Having said that, this is a bit of a worst case scenario (a lot of duplication) and it may not look too bad with real data, so it may be worth considering this approach along with the other two methods.
I have an Excel sheet with some column data that I would like to use for matrix multiplication with the MMULT function. For that purpose I need to reshape the column data first. I would like to do the reshaping with a dynamic array function, since that could then feed directly into MMULT without having to display the reshaped matrix in the sheet (i.e. keeping only the column with the input data visible to the user). I am aware of ideas such as the one outlined here http://www.cpearson.com/excel/VectorToMatrix.aspx however, as far as I can see, that requires having the reshaped data displayed in the sheet, which I do not want. An alternative could be to enter the arrays directly in the formula using curly brackets, but as far as I can see this notation does not allow cell references, i.e. something like MMULT({A1,A2,A3;A4,A5,A6},{A7,A8;A9,A10;A11,A12}) is not allowed. Any ideas for solving this issue?
An example is shown below. Basically, I have the column data in my sheet and do not want to repeat it as reshaped data, but I would still like to be able to display the square of the reshaped matrix.
Reshaped data and matrix multiplication:
For reshaping a 9x1 array into a 3x3 array:
INDEX(B3:B11,SEQUENCE(ROWS(B3:B11)/3,3))
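To make the indexing explicit: SEQUENCE(3,3) produces the positions 1 to 9 laid out row by row, and INDEX picks the column's values at those positions. A small Python sketch of the same row-major reshape, followed by the matrix square, looks like this (the helper names reshape and matmul and the sample values are assumptions for illustration):

```python
def reshape(column, ncols):
    """Split a flat column into rows of ncols values each (row-major order)."""
    if len(column) % ncols != 0:
        raise ValueError("column length must be a multiple of ncols")
    return [column[i:i + ncols] for i in range(0, len(column), ncols)]


def matmul(a, b):
    """Plain matrix product, standing in for MMULT."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


column = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical contents of B3:B11
matrix = reshape(column, 3)            # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
square = matmul(matrix, matrix)
```

The reshaped matrix only exists as an intermediate value here, which mirrors nesting the INDEX expression directly inside MMULT.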
I am given a matrix of ints and I need to create a function that does the following -
If the number in a cell is 0, the whole row and the whole column containing that cell need to be changed to 0.
I need to change the original matrix; I am not allowed to make a copy.
I can only use very basic constructs such as lists of lists, for, if, and split. I don't know arrays or anything more complicated.
I hope you can help me please :)
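One possible approach, sketched here in Python with only lists, for and if: first record which rows and columns contain a 0, then overwrite the cells in a second pass. Doing it in two passes matters, because zeroing cells while still scanning would make freshly written zeros look like original ones. The function name is made up, and I am assuming that keeping two small lists of indices does not count as copying the matrix.

```python
def zero_rows_and_columns(matrix):
    """Set every row and column that contains a 0 to all zeros, in place."""
    zero_rows = []
    zero_cols = []

    # Pass 1: remember where the original zeros are.
    for r in range(len(matrix)):
        for c in range(len(matrix[r])):
            if matrix[r][c] == 0:
                if r not in zero_rows:
                    zero_rows.append(r)
                if c not in zero_cols:
                    zero_cols.append(c)

    # Pass 2: overwrite cells of the original matrix.
    for r in range(len(matrix)):
        for c in range(len(matrix[r])):
            if r in zero_rows or c in zero_cols:
                matrix[r][c] = 0


grid = [[1, 2, 3],
        [4, 0, 6],
        [7, 8, 9]]
zero_rows_and_columns(grid)
```

The function returns nothing; grid itself is modified.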
What is the case
I'm trying to compare two arrays. For simplicity's sake, let's assume we want to know how often the values of one array exist in the other array.
My referenced/lookup array data sits in A1:A3
Apple
Lemon
Pear
My search array is NOT in the worksheet, but written {"Apple","Pear"}
Problem
So to know how often our search values exist in the lookup array, we can apply a formula like:
{=SUMPRODUCT(--(range1=range2))}
However, {=SUMPRODUCT(--({"Apple","Pear"}=A1:A3))} produces an error. In other words the lookup array wasn't working as expected.
What did work was using TRANSPOSE() function to create a horizontal array from my data first using {=SUMPRODUCT(--({"Apple","Pear"}=TRANSPOSE(A1:A3)))} resulting in the correct answer of 2!
It seems as though my typed array is automatically handled as a horizontal array, and my data obviously was originally vertical.
To test my hypothesis I tried another formula:
{=SUMPRODUCT(--({"Apple","Pear"}={"Apple","Lemon","Pear"}))}
Both are typed arrays, so by the above logic both would be horizontal arrays, perfectly able to work without TRANSPOSE(). However, this returns an error: #N/A
Again {=SUMPRODUCT(--({"Apple","Pear"}=TRANSPOSE({"Apple","Lemon","Pear"})))} gave a correct answer of 2.
Question
Can someone please explain to me:
The reasoning why horizontal can't be compared to vertical arrays.
Why a typed array would automatically be handled as horizontal
Why in my test of the hypothesis the second typed array was handled as vertical.
I'm really curious, and would also be happy to be linked to appropriate documentation as so far I have not been able to find any.
This might be an easy one to answer, though I can't seem to get my head around the logic.
Can someone please explain to me:
The reasoning why horizontal can't be compared to vertical arrays.
This is actually possible, and you can also compare horizontal arrays with other horizontal arrays.
The reason you have been getting the error is because of the mismatch in the length of the array. Consider the following arrays:
Doing =SUMPRODUCT(--(B3:D3=F3:G3)) is the same as =SUMPRODUCT(--({"Apple","Lemon","Pear"}={"Apple","Pear"})) (in the English version of Excel; I'm not 100% sure about the delimiters in other versions) and evaluates as =SUMPRODUCT(--(Apple=Apple, Lemon=Pear, Pear=???)). That is, the nth element of the first array is compared to the nth element of the second array, and if there is nothing to match (the 3rd element of the 1st array is Pear, but the 2nd array has no 3rd element), you get #N/A.
When you compare two arrays, one vertical and one horizontal, excel actually 'expands' the final array. Consider the following (1row x 3col and 2row x 1col):
Doing =SUMPRODUCT(--(B3:D3=F3:F4)) is the same as =SUMPRODUCT(--({"Apple","Lemon","Pear"}={"Apple";"Pear"})) and results in =SUMPRODUCT(--(Apple=Apple, Lemon=Apple, Pear=Apple; Apple=Pear, Lemon=Pear, Pear=Pear)). Basically it feels like Excel expanded the two arrays like this (3col x 2row):
This 'expansion' only happens when one array is 1 row high and the other is 1 column wide I believe, so if you take arrays that have something different, then excel will go back to trying to compare an element with 'nothing' to give N/A (you can use the Evaluate Formula feature under Formula tab to help):
So essentially excel is getting something a bit similar to this, where the first array is multiplied to the second array, giving the result array:
But since the last row and last column involve blanks, you get N/A there.
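The behaviour described above can be modelled in a few lines of Python. This is only a sketch of the rule as I understand it, not Excel's actual implementation: arrays are lists of rows, a dimension of size 1 is repeated to fit, and any other position that falls outside an array yields #N/A. The names NA, compare and sumproduct are invented for the sketch.

```python
NA = "#N/A"


def compare(a, b):
    """Element-wise a = b with Excel-style expansion of single rows/columns."""
    rows = max(len(a), len(b))
    cols = max(len(a[0]), len(b[0]))

    def at(arr, r, c):
        rr = 0 if len(arr) == 1 else r        # a single row is repeated downwards
        cc = 0 if len(arr[0]) == 1 else c     # a single column is repeated across
        if rr >= len(arr) or cc >= len(arr[0]):
            return NA                         # nothing there to compare against
        return arr[rr][cc]

    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            x, y = at(a, r, c), at(b, r, c)
            row.append(NA if NA in (x, y) else x == y)
        result.append(row)
    return result


def sumproduct(grid):
    """Count TRUE cells; any #N/A in the grid makes the whole result #N/A."""
    cells = [cell for row in grid for cell in row]
    return NA if NA in cells else sum(cells)


search = [["Apple", "Pear"]]                  # {"Apple","Pear"}: 1 row, 2 columns
vertical = [["Apple"], ["Lemon"], ["Pear"]]   # {"Apple";"Lemon";"Pear"}
horizontal = [["Apple", "Lemon", "Pear"]]     # {"Apple","Lemon","Pear"}

expanded = sumproduct(compare(search, vertical))
mismatch = sumproduct(compare(search, horizontal))
```

Under this model the row-versus-column comparison expands to a 3 x 2 grid and counts two matches, while the two rows of different lengths produce #N/A in the third position.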
Why a typed array would automatically be handled as horizontal
In your question, it would seem that , delimits rows, so with =SUMPRODUCT(--({"Apple","Pear"}=A1:A3)) you are observing something similar to the comparison of two rows in my first example, whereas with =SUMPRODUCT(--({"Apple","Pear"}=TRANSPOSE(A1:A3))) you are getting the 'expansion'.
As stated in the comments, on the English version of excel, , delimits columns and ; delimits rows, as can be observed in this simple example where I supply an array with 2 rows and 3 columns, excel shows {0,0,0;0,0,0}:
Why in my test of the hypothesis the second typed array was handled as vertical.
TRANSPOSE simply switches an array from vertical to horizontal (and vice versa), but depending on what you are trying to do, you'll get different results as per the first part of my answer, so you'll either have N/A when excel cannot match an item of an array with another item of the other array, or 'expansion' of the two arrays that results in a bigger array.
We are given a string and an integer. We have to tell which character would be at that integer position in the string if the characters were placed in sorted order.
For Example
String = LALIT
Index = 3
The sorted string is AILLT, and the character at position 3 is L.
Is it possible to solve this problem without sorting?
If yes, can someone provide pseudocode?
Yes, it's possible to do this. You're looking for something called a selection algorithm which, given a list of elements and a number k, returns what element would be in position k if the elements were to be in sorted order. Amazingly enough, it's possible to do this without sorting the entire list!
The simplest non-sorting algorithm for selection is called quickselect, which runs in expected time O(n) and, provided you're allowed to modify the original array, uses only O(1) auxiliary storage space. The idea behind quickselect is to do a single step of quicksort - pick a pivot element, partition the elements into elements less than the pivot, elements equal to the pivot, and elements greater than the pivot - then to see what happens based on that. If the pivot element ends up in position k after this step, then you're done - that's the element that would be at position k in the final sequence. If the pivot is at a position higher than k, recursively look to the left of the pivot (the kth smallest element is somewhere in there), and if the pivot is at a position lower than k, recursively look to the right of the pivot (the kth smallest element is somewhere in there).
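As a rough illustration, here is a short Python sketch of quickselect using the three-way partition described above. For readability it builds new lists at each step, so unlike the in-place version it uses O(n) extra space. The function name and the 0-based k are choices made for this sketch; the question's position 3 corresponds to k = 2.

```python
import random


def quickselect(items, k, rng=random):
    """Return the element that sorted(items) would hold at 0-based index k."""
    items = list(items)
    if not 0 <= k < len(items):
        raise IndexError("k is out of range")
    while True:
        pivot = rng.choice(items)
        less = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        greater = [x for x in items if x > pivot]
        if k < len(less):
            items = less                      # answer lies left of the pivot
        elif k < len(less) + len(equal):
            return pivot                      # pivot block covers position k
        else:
            k -= len(less) + len(equal)       # answer lies right of the pivot
            items = greater


third = quickselect("LALIT", 2)   # position 3 in the question's 1-based counting
```

Each round discards the part of the input that cannot contain the answer, which is where the expected linear running time comes from.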
Other approaches exist as well, such as the median-of-medians algorithm that always runs in worst-case O(n) time but is a classic "tricky algorithm to wrap your head around."