My goal is to shuffle a matrix using a specified matrix of indexes.
For example, let's say this is my input matrix:
A B C
D E F
G H I
Now I want to shuffle my input matrix. But not in a random way, I want to use a custom set of indexes (0, 1, 2 etc.) that represent the final order of my matrix. Let's say this is the matrix of indexes:
5 3 8
1 0 4
2 7 6
The resulting matrix should be:
F D I
B A E
C H G
since A was in position 0, B was in position 1 etc.
I'm looking for the formula that I should insert in each cell of the resulting matrix. I tried INDEX and MATCH functions but I'm not sure this is the right way.
I would just use QUOTIENT/MOD to get the row and column (this assumes the input matrix is in A1:C3 and the index matrix starts at E1):
=INDEX($A$1:$C$3,QUOTIENT(E1,3)+1,MOD(E1,3)+1)
You could make it more general if you wanted to.
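For readers who want to check the quotient/mod idea outside the spreadsheet, here is a minimal Python sketch of the same mapping. The function name `shuffle_by_index` is made up for illustration; it assumes zero-based indexes that count along each row first, as in the question.

```python
def shuffle_by_index(matrix, indexes):
    """Rearrange `matrix` so each output cell holds the element whose
    zero-based, row-major position is given in `indexes`."""
    n_cols = len(matrix[0])
    # k // n_cols is the source row (QUOTIENT), k % n_cols the source column (MOD)
    return [[matrix[k // n_cols][k % n_cols] for k in row] for row in indexes]


source = [["A", "B", "C"],
          ["D", "E", "F"],
          ["G", "H", "I"]]
order = [[5, 3, 8],
         [1, 0, 4],
         [2, 7, 6]]

for row in shuffle_by_index(source, order):
    print(" ".join(row))
# F D I
# B A E
# C H G
```

Using the number of columns instead of a hard-coded 3 is one way the formula could be made more general.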
So I am given 3 input values, C, D, and V. Where D is a list of integers and C is the number of times that any element in D can be repeated when summing to at most a value V.
Now what will always be the case is that not every combination of sums in D will equal 1 to V. Therefore, my function has to find the minimum amount of integers to include in D that will satisfy summing to at most V.
In other words, minimum number of integers to insert into D such that all integers from 1 <= x <= V can be represented by a summation of combination of numbers in D, where no number is repeated more than C times
For example,
D = [1, 5, 10, 25]
C = 2
V = 100
As you can see, the max value I can get from D and C is:
2(1+5+10+25) = 82 which is less than V = 100.
I solved it manually: I can satisfy every value from 1 to 100 when I include two new integers in D, namely 12 and 2.
So my output would be 12 and 2.
Now the way I saw it was to basically cover all single digit values so since I already have 1 and 5 in D where I can repeat their sums 2 times given by C=2, the values I cannot account for are 3, 4, 8, and 9 which can be resolved by introducing 2 into D and likewise for double digits it would be 12 given C and D.
ie.
< 3=1+2, 4=2+2 or 2+1+1, 8=5+2+1, 9=5+2+2 or 5+2+1+1 >
What becomes obvious now is the fact that computationally it can be extremely expensive for large values of N, especially if I brute force it.
My strategy was this:
Find all possible combinations of sums given D and C, then put them into a list; let's call it dSums.
Create a list ranging from 1 to V, i.e.
vList = list(range(1, V + 1))
Iterate through dSums and remove any value in dSums from vList.
so now I have a list of all integers that are missing but the problem is this.
If D = [1, 5], C=2 and V=10
vList = [3, 4, 8, 9]
I thought I can find the common integer that will satisfy all the values in vList using D and C but I am not sure how to traverse or map the lists in such a way that I ultimately arrive at my desired value which is 2.
Any thoughts?
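The strategy described in the question (build dSums, then strike those values from the list 1..V) can be sketched as below. This only reproduces the "find the missing values" step, not the search for which integers to add; the function names are hypothetical.

```python
def reachable_sums(D, C):
    """All sums obtainable when each value in D is used between 0 and C times."""
    sums = {0}
    for d in D:
        # extend every known sum by 0, 1, ..., C copies of d
        sums = {s + k * d for s in sums for k in range(C + 1)}
    return sums


def missing_values(D, C, V):
    """Values in 1..V that cannot be formed from D with at most C repeats each."""
    d_sums = reachable_sums(D, C)
    return [v for v in range(1, V + 1) if v not in d_sums]


print(missing_values([1, 5], 2, 10))  # [3, 4, 8, 9]
```

Building the set one denomination at a time keeps it at most as large as the largest reachable sum, which is far cheaper than enumerating every combination explicitly.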
Here's a link to a screenshot with the formula used in Column B and some sample data
I have a spreadsheet with 48 rows of data in column A
The values range from 0 to 19
The average of these 48 rows = 8.71
the standard deviation of the population = 3.77
I've used the STANDARDIZE function in excel in column B to return the Z-score of each item in column A given that I know the mean (8.71), std dev (3.77), and x (whatever is in column A).
For example (row 2) has:
x = 2
z = -1.779
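As a cross-check of what STANDARDIZE computes, here is a small Python sketch of the z-score formula, z = (x - mean) / std dev. It uses the rounded mean and standard deviation quoted above, so the last digits can differ slightly from the spreadsheet, which works with unrounded values.

```python
def z_score(x, mean, std_dev):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


print(round(z_score(2, 8.71, 3.77), 2))  # -1.78
```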
Using the z value, I want to create a lower (4) and upper (24) boundary and calculate what the value would be in a third column (column C).
Essentially, if x = 0 (min value), then z = -2.3096, and columnC = 4 (lower boundary condition)
Conversely, if x = 19 (max value), then z = 2.9947, and columnC = 24 (upper boundary condition)
and then all other values between 0 to 19 would be calculated....
Any ideas how I can accomplish this with a formula in the column C?
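So if your lowest original value is 0 and your highest is 19, and you want to redistribute them over the range 4 to 24, and we assume the mapping is linear, that means:
x_min = 0 maps to y_min = 4, and x_max = 19 maps to y_max = 24
Since the mapping is linear (y = m*x + c), both boundary points have to satisfy it:
y_min = m*x_min + c
y_max = m*x_max + c
We solve the first for c, so we get
c = y_min - m*x_min
and replace the c in the second equation with that, so we get
y_max = m*x_max + y_min - m*x_min
and solve this for m as follows
m = (y_max - y_min) / (x_max - x_min)
If we put this together with our third equation above we get:
c = y_min - x_min * (y_max - y_min) / (x_max - x_min)
So we finally have equations for m and c, and we can use the numbers from our old and new lower and upper bounds to get:
m = (24 - 4) / (19 - 0) = 20/19 ≈ 1.0526
c = 4 - 0 * 20/19 = 4
You can use these values with
y = m*x + c
where x is your old value in column A and y is the new distributed value in column B: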
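The linear remapping from the old range (0 to 19) to the new range (4 to 24) can also be sketched in Python. The function name and its default arguments are illustrative only; the slope and intercept are computed from the two boundary points, as in the derivation above.

```python
def linear_rescale(x, old_min=0.0, old_max=19.0, new_min=4.0, new_max=24.0):
    """Map x linearly so that old_min -> new_min and old_max -> new_max."""
    m = (new_max - new_min) / (old_max - old_min)  # slope
    c = new_min - m * old_min                      # intercept
    return m * x + c


for x in (0, 2, 19):
    print(x, round(linear_rescale(x), 3))
```

Because the mapping is linear, the mean is remapped by the same formula, which is why a target mean that does not lie on this line cannot be hit with a linear solution.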
Some visualization if you change the boundaries:
Idea for a non-linear solution
If you want 4 and 24 as boundaries and the mean should be 12 the solution cannot be linear of course. But you could use for example any other formula like
So you can use this formula for column D y2 with the following values a, b, c as well as calculating the mean, min and max over column D y2.
Then use the solver:
Goal is: Mean $M$15 should be 12
secondary conditions: $M$16 = 4 (lower boundary) and $M$17 = 24 (upper boundary)
variable cells are a, b and c: $M$11:$M$13
The solver will now adjust the values a, b and c so that you get very close to your goal and to get these results:
The min is 4, the max is almost 24, and the mean is almost 12; that is probably the closest you can get with a numeric method.
I want to column join
┌─┬─┬─┐
│1│1│2│
│2│4│4│
│3│9│6│
└─┴─┴─┘
and I'd like to put a=.1 2 3 as the fourth row, and then put b=.1 1 1 1 as the first column of the new boxed data. How can I do this easily? Do I have to ravel the whole thing and compute the dimension on my own in order to box it again?
Also, if I want the data i.8 to be 2 rows, do I have to calculate the other dimension 4(=8/2) in order to form a matrix 2 4$i.8? And then box it ;/2 4$i.8? Can I just specify one dimension, either the number of row or columns and ask automatic boxing or forming the matrix?
The answer to your question will involve learning about &. , the 'Under' conjunction, which is tremendously useful in J.
m
┌─┬─┬─┐
│1│1│2│
│2│2│4│
│3│9│6│
└─┴─┴─┘
a=. 1 2 3
b=. 1 1 1 1
So we want to add each item of a to each boxed column of m . It would be perfect if we could unbox the column using unbox(>), append the item of a to the column using append (,) and then rebox the column using box (<). This undo, act, redo cycle is exactly what Under (&.) does. It undoes both its right and left arguments ( m and a ) using the verb to its right, then applies the verb to its left, then uses the reverse of the verb to its right on the result. In practice,
m , &. > a
┌─┬─┬─┐
│1│1│2│
│2│2│4│
│3│9│6│
│1│2│3│
└─┴─┴─┘
The fact that a is unboxed when it was never boxed to begin with means that it is not changed, while m is unboxed before (,) is applied to each a . In fact this is used so often in J that &. > is assigned the name 'each'.
m , each a
┌─┬─┬─┐
│1│1│2│
│2│2│4│
│3│9│6│
│1│2│3│
└─┴─┴─┘
Prepending a boxed version of b requires first giving it an extra dimension with itemize (monadic ,:), then transposing (|:) the result, and finally boxing (<) it. The step of adding the extra dimension is required because transposing swaps the axes and b starts as a one-dimensional list.
(<@|:@,:b)
┌─┐
│1│
│1│
│1│
│1│
└─┘
The rest is easy, as we just use append (,) to join the boxed b with (m , each a)
(<@|:@,: b) , m , each a
┌─┬─┬─┬─┐
│1│1│1│2│
│1│2│2│4│
│1│3│9│6│
│1│1│2│3│
└─┴─┴─┴─┘
Brackets around (<@|:@,: b) are necessary to force the correct order of execution.
For the second question, you can use i. n m to create a n X m array, which may help.
i. 4 2
0 1
2 3
4 5
6 7
i. 2 4
0 1 2 3
4 5 6 7
but perhaps I am misunderstanding your intentions here.
Hope this helps, bob
append a (with rank): ,"x a
You can simply append (,) a to your unboxed (>) input, but you have to be careful with the append rank. You want to append each "item" of a, so you have a right rank of "0". You want to append to a 2-cell, so you have a left rank of "2". Therefore, the , you need has rank "2 0. After the append, you rebox your data to a 2-cell with <"2.
<"2(>in)(,"2 0) a
┌─┬─┬─┐
│1│1│2│
│2│4│4│
│3│9│6│
│1│2│3│
└─┴─┴─┘
prepend b: b,
If your b has the right shape you prepend it with b,. The shape you seem to use is (boxed) 4 1:
b =: < 4 1$ 1
┌─┐
│1│
│1│
│1│
│1│
└─┘
b,in
┌─┬─┬─┬─┐
│1│1│1│2│
│1│2│4│4│
│1│3│9│6│
│1│1│2│3│
└─┴─┴─┴─┘
I have come up with the term loop rolling myself with the hope that it does
not overlap with an existing term. Basically I'm trying to come up with an
algorithm to find loops in a printed text.
Some examples from simple to complicated
Example1
Given:
a a a a a b c d
I want to say:
5x(a) b c d
or algorithmically:
for 1 .. 5
print a
end
print b
print c
print d
Example2
Given:
a b a b a b a b c d
I want to say:
4x(a b) c d
or algorithmically:
for 1 .. 4
print a
print b
end
print c
print d
Example3
Given:
a b c d b c d b c d b c e
I want to say:
a 3x(b c d) b c e
or algorithmically:
print a
for 1 .. 3
print b
print c
print d
end
print b
print c
print e
It didn't remind me of any algorithm that I know of. I feel like some of the
problems can be ambiguous but finding one of the solutions is enough to me for
now. Efficiency is always welcome but not mandatory. How can I do this?
EDIT
First of all, thanks for all the discussion. I have adapted an LZW algorithm
from rosetta and ran it on my
input:
abcdbcdbcdbcdef
which gave me:
a
b
c
d
8 => bc
10 => db
9 => cd
11 => bcd
e
f
where I have a dictionary of:
a a
c c
b b
e e
d d
f f
8 bc
9 cd
10 db
11 bcd
12 dbc
13 cdb
14 bcde
15 ef
7 ab
It looks good for compression but it's not quite what I wanted. What I need
is more like compression in the algorithmic representation from my examples
which would have:
subsequent sequences (if a sequence is repeating, there would be no other
sequence in between)
no dictionary but only loops
irreducible
with maximum sequence sizes (which would minimize the algorithmic
representation)
and let's say nested loops are allowed (contrary to what I said before in
the comment)
I'll start with an algorithm that gives maximum sequence sizes. Though it will not always minimize the algorithmic representation, it may be used as an approximation algorithm, or it may be extended to an optimal algorithm.
1. Start by constructing a suffix array for your text, along with its LCP array.
2. Sort an array of indexes of the LCP array so that indexes of larger LCP elements come first. This groups together repeating sequences of the same length and allows processing sequences in a greedy manner, starting from the maximum sequence sizes.
3. Extract suffix array entries, grouped by LCP value (by group I mean all the entries with the selected LCP value as well as all entries with larger LCP values), and sort them by position in the text.
4. Filter out entries whose positional difference is not equal to the LCP. For the remaining entries, take the prefixes of length equal to the LCP. This gives all possible sequences in the text.
5. Add sequences, sorted by starting position, to an ordered collection (for example, a binary search tree). Sequences are added in order of appearance in the sorted LCP, so longer sequences are added first. Sequences are added only if they are independent or if one of them is completely nested inside the other one. Intersecting intervals are ignored. For example, in caba caba bab the sequence ab intersects with caba and so it is ignored. But in cababa cababa babab one instance of ab is dropped, 2 instances are completely inside the larger sequence, and 2 instances are completely outside of it.
6. At the end, this ordered collection contains all the information needed to produce the algorithmic representation.
Example:
Text ababcabab
Suffix array ab abab ababcabab abcabab b bab babcabab bcabab cabab
LCP array 2 4 2 0 1 3 1 0
Sorted LCP 4 3 2 2 1 1 0 0
Positional difference 5 5 2 2 2 2 - -
Filtered LCP - - 2 2 - - - -
Filtered prefixes (ab ab) (ab ab)
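The example above can be reproduced with a small Python sketch. It is deliberately naive: the suffix array is built by sorting slices (fine for short texts, far from the efficient constructions), and it only compares neighbouring suffix array entries instead of the full grouping by LCP value, so treat it as an illustration of the "positional difference equals LCP" filter rather than the complete algorithm. All function names are made up.

```python
def suffix_array(text):
    """Start positions of all suffixes, in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Longest common prefix length of each pair of neighbouring suffixes."""
    def common(a, b):
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        return n
    return [common(sa[i], sa[i + 1]) for i in range(len(sa) - 1)]


def adjacent_repeats(text):
    """(start, sequence) pairs where a sequence is immediately followed by itself."""
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    found = []
    for i, length in enumerate(lcp):
        # keep the pair only if the two suffixes start exactly `length` apart
        if length > 0 and abs(sa[i] - sa[i + 1]) == length:
            start = min(sa[i], sa[i + 1])
            found.append((start, text[start:start + length]))
    return sorted(found)


text = "ababcabab"
sa = suffix_array(text)
print([text[i:] for i in sa])
print(lcp_array(text, sa))     # [2, 4, 2, 0, 1, 3, 1, 0]
print(adjacent_repeats(text))  # [(0, 'ab'), (5, 'ab')]
```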
Sketch of an algorithm, producing the minimal algorithmic representation.
Start with the first 4 steps of previous algorithm. Fifth step should be modified. Now it is not possible to ignore intersecting intervals, so every sequence is added to the collection. Since the collection now contains intersecting intervals, it is better to implement it as some advanced data structure, for example, Interval tree.
Then recursively determine the length of algorithmic representation for all sequences, that contain any nested sequences, starting from the smallest ones. When every sequence is evaluated, compute optimal algorithmic representation for whole text. Algorithm for processing either a sequence or whole text uses dynamic programming: allocate a matrix with number of columns, equal to text/sequence length and number of rows, equal to the length of algorithmic representation; doing in-order traversal of interval tree, update this matrix with all sequences, possible for each text position; when more than one value for some cell is possible, either choose any of them, or give preference to longer or shorter sub-sequences.
I am trying to generate some ranges for a problem I am working on. These ranges are going to be based on the sum of the ratios of a bunch of numbers. So, for example, the constants are 5, 6 and 7.
The ranges I get will be 5/x + 6/y + 7/z = S
I want x, y, and z to come out of a list of numbers I have - say .5, .6, .7, .8, .9, and 1
So If I run 100 iterations of this, I want the spreadsheet to randomly fill a value in X from that list of numbers, another random selection for y, and yet another for z.
And like I said, I want that sum, S, to be calculated 100 times in such a way that I will get a range of values for S.
I have been trying to figure out how to do this without the use of macros.
Here's one way to do it. Create a table of x, y, and z input values. Put a column to the left of the table with the number of each input value (1...N). Say that you have 10 potential input values for each. So your table is in A1:D10 with 1 through 10 in column A, the x values in B, the y values in C, and the z values in D.
Then you can select a random x value by writing =VLOOKUP(10*RAND()+1,$A$1:$D$10,2,TRUE). This generates a random number from 1 up to (but not including) 11 and looks up the x value in the row whose column A number matches it, rounded down. E.g. if the random number is 4.3, it will select the 4th value. Replace the third parameter of the VLOOKUP with 3 for y values and 4 for z values.
If you don't have any other data in columns A:D, you can generalize this with =VLOOKUP(count($A:$A)*RAND()+1,$A:$D,2,TRUE).
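To sanity-check the range of S that the spreadsheet should produce, the same experiment can be sketched in Python. The function name, its parameters, and the optional seed are assumptions for illustration; the constants (5, 6, 7) and the list of candidate values come from the question.

```python
import random


def simulate_sums(constants=(5, 6, 7),
                  choices=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0),
                  iterations=100,
                  seed=None):
    """Draw a random divisor from `choices` for each constant and sum the ratios,
    repeated `iterations` times."""
    rng = random.Random(seed)
    return [sum(c / rng.choice(choices) for c in constants)
            for _ in range(iterations)]


results = simulate_sums(seed=1)
print(min(results), max(results))
```

With these inputs S can never fall below 18 (all divisors equal to 1) or exceed 36 (all divisors equal to 0.5), which gives a quick check on the spreadsheet results.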