How can I implement a grouping algorithm in J?

I'm trying to implement A006751 in J. It's pretty easy to do in Haskell, something like:
concat . map (\g -> concat [show $ length g, [g !! 0]]) . group . show
(Obviously that's not complete, but it's the basic heart of it. I spent about 10 seconds on that, so treat it accordingly.) I can implement any of this fairly easily in J, but the part that eludes me is a good, idiomatic J algorithm that corresponds to Haskell's group function. I can write a clumsy one, but it doesn't feel like good J.
Can anyone implement Haskell's group in good J?

Groups are usually done with the /. adverb.
1 1 2 1 </. 'abcd'
┌───┬─┐
│abd│c│
└───┴─┘
As you can see, Key collects every item that shares a key, adjacent or not, so the groups are not sequential. To group runs of equal items, make the key itself sequential: mark each item that differs from the previous one, then take a running sum of the resulting 0s and 1s:
neq =. 13 : '0, (}. y) ~: (}: y)'
seqkey =. 13 : '+/\neq y'
(seqkey 1 1 2 1) </. 'abcd'
┌──┬─┬─┐
│ab│c│d│
└──┴─┴─┘
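To see why the running sum makes the key sequential, it can help to look at the intermediate values. This is only a small session sketch that repeats the two definitions above so it runs on its own:

```j
   neq =: 13 : '0, (}. y) ~: (}: y)'
   seqkey =: 13 : '+/\neq y'
   neq 1 1 2 1        NB. 1 marks an item that differs from its predecessor
0 0 1 1
   seqkey 1 1 2 1     NB. running sum: the key steps up at every change
0 0 1 2
```

Because the key never returns to an earlier value, Key can only put adjacent items in the same group.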
What I need then is a function which counts the items (#), and tells me what they are ({. to just pick the first). I got some inspiration from nubcount:
diffseqcount =. 13 : ',(seqkey y) (#,{.)/. y'
diffseqcount 2
1 2
diffseqcount 1 2
1 1 1 2
diffseqcount 1 1 1 2
3 1 1 2
If you want the nth result, just use power:
diffseqcount(^:10) 2 NB. 10th result
1 3 2 1 1 3 2 1 3 2 2 1 1 3 3 1 1 2 1 3 2 1 2 3 2 2 2 1 1 2

I agree that /. (Key) is the best general method for applying verbs to groups in J. An alternative in this case, where we need to group consecutive numbers that are the same, is dyadic ;. (Cut):
1 1 0 0 1 0 1 <(;.1) 3 1 1 1 2 2 3
┌─┬─────┬───┬─┐
│3│1 1 1│2 2│3│
└─┴─────┴───┴─┘
We can form the frets to use as the left argument as follows:
1 , 2 ~:/\ 3 1 1 1 2 2 3 NB. inserts ~: in the running sets of 2 numbers
1 1 0 0 1 0 1
Putting the two together:
(] <;.1~ 1 , 2 ~:/\ ]) 3 1 1 1 2 2 3
┌─┬─────┬───┬─┐
│3│1 1 1│2 2│3│
└─┴─────┴───┴─┘
Using the same mechanism as suggested previously:
,@(] (# , {.);.1~ 1 , 2 ~:/\ ]) 3 1 1 1 2 2 3
1 3 3 1 2 2 1 3
If you are looking for a nice J implementation of the look-and-say sequence then I'd suggest the one on Rosetta Code:
las=: ,@((# , {.);.1~ 1 , 2 ~:/\ ])&.(10x&#.inv)@]^:(1+i.@[)
5 las 1 NB. left arg is sequence length, right arg is starting number
11 21 1211 111221 312211
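The part of las that may look unfamiliar is &.(10x&#.inv) (Under): it turns the number into a list of decimal digits, applies the count-and-first verb to that list, and then turns the digits back into a number. A rough sketch of the three stages for a single step, typed out by hand rather than taken from Rosetta Code:

```j
   10x&#.inv 1211                             NB. number -> decimal digits
1 2 1 1
   ,@((# , {.);.1~ 1 , 2 ~:/\ ]) 1 2 1 1      NB. count and first item of each run
1 1 1 2 2 1
   10x #. 1 1 1 2 2 1                         NB. digits -> number
111221
```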

Related

List of all permutations

The verbs C. and A. are related to permutations,
but their documentation is very complicated.
I just want to get all n! possible permutations.
For example, for the elements 1 2 3:
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
The left argument of A. is a list of permutation indices.
The right argument of A. is the list to be permuted.
The initial (unpermuted) list has index 0 and it goes on from there lexicographically [*].
Examples:
(0) A. 'a';'b';'c'
┌─┬─┬─┐
│a│b│c│
└─┴─┴─┘
(1 0) A. 1 2 3
1 3 2
1 2 3
(0 1 2) A. 5 1 2
5 1 2
5 2 1
1 5 2
To get all permutations of a list y, request all ! # y of them (the factorial of the number of items in y) by passing every index from 0 to (n!)-1, that is i. ! # y:
(i.!#y) A. y
[*]: Lexicographically by the implied list i. # y. That is, A. always permutes the simple list 0 ... n-1 and then applies this permutation to your initial list: permutation { initial_list.
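As a concrete illustration of the above, here is a short session sketch. The name perms is just a hypothetical helper, not something from the J library:

```j
   perms =: 3 : '(i. ! # y) A. y'    NB. hypothetical helper: every permutation of y
   perms 1 2 3
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
   4 A. i. 3                         NB. permutation number 4 of the index list 0 1 2
2 0 1
   (4 A. i. 3) { 'abc'               NB. the same permutation applied with {
cab
   4 A. 'abc'                        NB. A. does both steps at once
cab
```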

Seeming inconsistency in the way transpose |: works

Consider:
|: 2 3 $ 1 2 3
1 1
2 2
3 3
|: 1 2 3
1 2 3
The first one makes sense to me: the rows are now columns. But, by analogy, I expected the output of the 2nd one to be:
|: 1 2 3
1
2
3
Why is it still a row, rather than a column?
Monadic |: reverses the order of the axes of its argument, so:
$ |: 2 3 $ 1 2 3
3 2
$ |: 1 2 3 $ 1 2 3
3 2 1
and naturally
$ |: 1 2 3
3
so the result is still the list 1 2 3: a list has only one axis, and reversing a single axis changes nothing.
The result that you expected has axes 3 1; you would get this for the transpose of the list 1 3 $ 1 2 3
] l =: 1 3 $ 1 2 3
1 2 3
|: l
1
2
3
($ l);($ |: l)
┌───┬───┐
│1 3│3 1│
└───┴───┘
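If the goal is simply to turn a list into a column, the extra axis has to be added first. Two ways that should work, sketched as a session:

```j
   $ ,: 1 2 3        NB. Itemize adds a leading axis of length 1
1 3
   |: ,: 1 2 3       NB. now there are two axes to swap
1
2
3
   ,. 1 2 3          NB. Ravel Items gives the same 3-by-1 table directly
1
2
3
```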

How to remove an element from a list in J by index?

The rather verbose fork I came up with is
({. , (>:@[ }. ]))
E.g.,
3 ({. , (>:@[ }. ])) 0 1 2 3 4 5
0 1 2 4 5
Works great, but is there a more idiomatic way? What is the usual way to do this in J?
Yes, the J-way is to use a 3-level boxing:
(<<<5) { i.10
0 1 2 3 4 6 7 8 9
(<<<1 3) { i.10
0 2 4 5 6 7 8 9
There is a small note about it in the Dictionary entry for { (From):
Note that the result in the very last dyadic example, that is, (<<<_1){m , is all except the last item.
and a bit more in Learning J: Chapter 6 - Indexing: 6.2.5 Excluding Things.
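For comparison with the fork in the question, the triple-boxed index can be wrapped in a small dyadic verb. The name remove is hypothetical and only for illustration:

```j
   remove =: 4 : '(<<<x) { y'    NB. hypothetical helper: x holds the indices to exclude
   3 remove 0 1 2 3 4 5
0 1 2 4 5
   1 3 remove 'abcdef'
acef
   (<<<_1) { 'abcde'             NB. negative indices count from the end
abcd
```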
Another approach is to use the monadic and dyadic forms of # (Tally and Copy). This idiom of using Copy to remove an item is something that I use frequently.
The hook (i. i.@#) uses Tally (monadic #) and monadic and dyadic i. (Integers and Index of) to generate the filter string:
2 (i. i.@#) 'abcde'
1 1 0 1 1
which Copy (dyadic #) uses to omit the appropriate item.
2 ((i. i.@#) # ]) 0 1 2 3 4 5
0 1 3 4 5
2 ((i. i.@#) # ]) 'abcde'
abde

How to reconstruct strings in "edit_distance_problem"?

Suppose you are given the dp table for string X = "AGGGCT" and string Y = "AGGCA", where
m = length of X + 1
n = length of Y + 1
dp[m][n] =
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 0 1 2
4 3 2 1 1 2
5 4 3 2 1 2
6 5 4 3 2 2
and you want to reconstruct three strings as follows
string row1 = "AGGGCT" ;
string row2 = "||| | " ;
string row3 = "AGG-CA" ;
How can I reconstruct the strings row1, row2 and row3? If possible, please post code in C/C++/Java.
I think this page can be a good starting point:
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Java
You have to make a few modifications, but the core idea is to record, alongside each "min", which case was chosen for a given (i,j). Before returning, you can then walk backwards through the matrix step by step, starting at distance[str1.length()][str2.length()]. If a diagonal step leaves the distance unchanged, the characters match and you show a |; if a diagonal step changes the distance, it was a substitution; a vertical or horizontal step is a removal or an insertion.
You can collect this backwards information in a string and display it in reverse order afterwards.
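The backward walk can be sketched as follows. This is written in J to match the rest of this page rather than in C/C++/Java, and it is only a sketch under several assumptions: the verb name align and its boxed argument are made up for the example, both strings are non-empty, the table is a valid Levenshtein table, and instead of storing the chosen case it re-derives each step from the table. Ties are broken in the order deletion, diagonal, insertion, which happens to reproduce the three rows from the question.

```j
NB. dp table from the question (rows follow X, columns follow Y)
D =: > 0 1 2 3 4 5 ; 1 0 1 2 3 4 ; 2 1 0 1 2 3 ; 3 2 1 0 1 2 ; 4 3 2 1 1 2 ; 5 4 3 2 1 2 ; 6 5 4 3 2 2

align =: 3 : 0
  'X Y D' =. y                       NB. unpack the two strings and the table
  i =. # X
  j =. # Y
  r1 =. r2 =. r3 =. ''
  while. 0 < i + j do.
    cur  =. (< i , j) { D
    a    =. (i - 1) { X              NB. index _1 wraps around, so this is harmless at i = 0
    b    =. (j - 1) { Y
    up   =. (0 < i) *. cur = 1 + (< (i - 1) , j) { D
    diag =. (0 < i) *. (0 < j) *. cur = (a ~: b) + (< (i - 1) , j - 1) { D
    if. up do.                       NB. vertical step: a is removed
      r1 =. a , r1
      r2 =. ' ' , r2
      r3 =. '-' , r3
      i  =. i - 1
    elseif. diag do.                 NB. diagonal step: match (|) or substitution
      r1 =. a , r1
      r2 =. ((a = b) { ' |') , r2
      r3 =. b , r3
      i  =. i - 1
      j  =. j - 1
    elseif. 1 do.                    NB. horizontal step: b is inserted
      r1 =. '-' , r1
      r2 =. ' ' , r2
      r3 =. b , r3
      j  =. j - 1
    end.
  end.
  r1 , r2 ,: r3                      NB. three rows as a character table
)

   align 'AGGGCT' ; 'AGGCA' ; D
AGGGCT
||| | 
AGG-CA
```

Prepending each character while walking backwards avoids the final reversal step; a different tie-break order would give another alignment of the same cost.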

Most concise J syntax for creating a numeric matrix

Imagine that I want to take the numbers from 1 to 3 and form a matrix such that each possible pairing is represented, e.g.,
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
Here is the monadic verb I formulated in J to do this:
($~ (-:@# , 2:)) , ,"0/~ 1+i.y
Originally I had thought that ,"0/~ 1+i.y would be sufficient, but unfortunately that produces the following output:
1 1
1 2
1 3

2 1
2 2
2 3

3 1
3 2
3 3
In other words, its shape is 3 3 2 and I want something whose shape is 9 2. The only way I could think of to fix it is to pour all of the data into a new shape. I'm convinced there must be a more concise way to do this. Anyone know?
Reshaping your intermediate result can be simplified. Removing the topmost axis is commonly done with ,/ so in your case the completed phrase could be ,/ ,"0/~ 1+i.y
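A quick check of the shapes, with y fixed at 3 for the sake of the example:

```j
   $ ,"0/~ 1 + i. 3        NB. the table of pairs has three axes
3 3 2
   $ ,/ ,"0/~ 1 + i. 3     NB. ,/ joins the 3-by-2 planes end to end
9 2
```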
One way (which uses monadic { , Catalogue, to form every combination of one item from each box):
>,{ 2#<1+i.y
EDIT:
Some fun to be had with this scheme:
All possible tuples of length y over 1..y (repeats included, so a superset of the permutations):
>,{ y#<1+i.y
Configurable number in sequence:
>,{ x#<1+i.y
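In case the three steps in >,{ are not obvious, here is a small session sketch of what each one contributes, using a two-item example and then y fixed at 3:

```j
   > , { 0 1 ; 5 6         NB. one item from each box, in every combination
0 5
0 6
1 5
1 6
   $ { 2 # < 1 + i. 3      NB. Catalogue returns a 3-by-3 table of boxed pairs
3 3
   $ , { 2 # < 1 + i. 3    NB. Ravel flattens it to a list of 9 boxes
9
   $ > , { 2 # < 1 + i. 3  NB. Open turns the boxes into rows
9 2
```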
I realize this question is old, but there is a simpler way to do it: count to 9 in trinary, and add 1.
1 + 3 3 #: i.9
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
The left argument in 3 3 & #: fixes the result at two base-3 digits. The general 'base 3' verb is 3 & #.^:_1, which uses as many digits as the argument needs.
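The same idea can be wrapped up for any n. The name grid below is hypothetical, and the verb is only a sketch of the counting approach above:

```j
   grid =: 3 : '1 + (2 # y) #: i. y * y'    NB. hypothetical helper: count to y*y in base y, then add 1
   grid 2
1 1
1 2
2 1
2 2
   $ grid 3
9 2
```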
