Mapping of elements by number of occurrences in J

Using J language, I wish to attain a mapping of the counts of elements of an array.
Specifically, I want to input a lowercased English word of two or more letters and get back each pair of adjacent letters in the word along with its count of occurrences.
I need a verb that gives something like this, in whatever J structure you think is appropriate:
For 'cocoa':
co 2
oc 1
oa 1
For 'banana':
ba 1
an 2
na 2
For 'milk':
mi 1
il 1
lk 1
For 'to':
to 1
(For single letter words like 'a', the task is undefined and will not be attempted.)
(Order is not important, that's just how I happened to list them.)
I can easily attain successive pairs of letters in a word as a matrix or list of boxes:
2(] ;._3)'cocoa'
co
oc
co
oa
]
2(< ;._3)'cocoa'
┌──┬──┬──┬──┐
│co│oc│co│oa│
└──┴──┴──┴──┘
But I need help getting from there to a mapping of pairs to counts.
I am aware of ~. and ~: but I don't just want to return the unique elements or indexes of duplicates. I want a mapping of counts.
NuVoc's "Loopless" page indicates that / (or /\. or /\) is where I should be looking for accumulation problems. I am familiar with / for arithmetic operations on numeric arrays, but for u/y I don't know what u would have to be to accumulate the list of letter pairs that make up y.
(NB. I can already do this in "normal" languages like Java or Python without help. Similar questions on SO are for languages with very different syntax and semantics to J. I am interested in the idiomatic J approach to this sort of problem.)

To get the list of 2-letter combinations I'd use dyadic infix (\):
2 ]\ 'banana'
ba
an
na
an
na
To count occurrences, the primitive that immediately comes to mind is key (/.):
#/.~ 2 ]\ 'banana'
1 2 2
If you want to match the counts to the letter combinations you can extend the verb to the following fork:
({. ; #)/.~ 2 ]\ 'banana'
┌──┬─┐
│ba│1│
├──┼─┤
│an│2│
├──┼─┤
│na│2│
└──┴─┘

I think that you are looking to map counts of unique items to the items. You can correct me if I am wrong.
Starting with
[t=. 2(< ;._3)'cocoa'
┌──┬──┬──┬──┐
│co│oc│co│oa│
└──┴──┴──┴──┘
You can use ~. (Nub) to return the unique items in the list
~.t
┌──┬──┬──┐
│co│oc│oa│
└──┴──┴──┘
Then if you compare the boxed list to its nub you get a matrix with one row per pair in your string and one column per item of the nub, where a 1 marks a pair that matches that nub item
t =/ ~.t
1 0 0
0 1 0
1 0 0
0 0 1
Sum the columns of this matrix and you get the number of times each item of the nub shows up
+/ t =/ ~.t
2 1 1
Then box them so that you can combine the integers alongside the boxed characters
<"0 +/ t =/ ~.t
┌─┬─┬─┐
│2│1│1│
└─┴─┴─┘
Combine them by stitching together the nub and the count using ,. (Stitch)
(~.t) ,. <"0 +/ t =/ ~.t
┌──┬─┐
│co│2│
├──┼─┤
│oc│1│
├──┼─┤
│oa│1│
└──┴─┘
[t=. 2(< ;._3)'banana'
┌──┬──┬──┬──┬──┐
│ba│an│na│an│na│
└──┴──┴──┴──┴──┘
(~.t) ,. <"0 +/ t =/ ~.t
┌──┬─┐
│ba│1│
├──┼─┤
│an│2│
├──┼─┤
│na│2│
└──┴─┘
[t=. 2(< ;._3)'milk'
┌──┬──┬──┐
│mi│il│lk│
└──┴──┴──┘
(~.t) ,. <"0 +/ t =/ ~.t
┌──┬─┐
│mi│1│
├──┼─┤
│il│1│
├──┼─┤
│lk│1│
└──┴─┘
Hope this helps.

what will be the dp and transitions in this problem

Vasya has a string s of length n consisting only of digits 0 and 1. Also he has an array a of length n.
Vasya performs the following operation until the string becomes empty: choose some consecutive substring of equal characters, erase it from the string and glue together the remaining parts (any of them can be empty). For example, if he erases substring 111 from string 111110 he will get the string 110. Vasya gets a_x points for erasing a substring of length x.
Vasya wants to maximize his total points, so help him with this!
https://codeforces.com/problemset/problem/1107/E
I was trying to get my head around the editorial, but couldn't understand it. Can anyone explain an easier way to do it?
input:
7
1101001
3 4 9 100 1 2 3
output:
109
Explanation
the optimal sequence of erasings is: 1101001 → 111001 → 11101 → 1111 → ∅.
Here, we consider removing prefixes instead of substrings. Why?
We try to remove a consecutive prefix of a particular state which is actually a substring in the main string. So, our DP states will be start index, end index, prefix length.
Let's consider an example str = "1010110". Here, initially start=0, end=7, and prefix=1 (the first '1' is the only prefix for now). We iterate over all the indices in the current state except the starting index and check whether str[i]==str[start]. Here, for example, str[4]==str[0]. Now we divide the string into "010" (indices 1 to 3) with prefix=1 and "110" (indices 4 to 6) with prefix=2, because once "010" is erased the '1' at index 0 is glued to the '1' at index 4. These are now two independent subproblems. When only a string of length 1 remains, we return a_prefix.
Here is my code.
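The code referred to above is not shown here. As a stand-in, here is a minimal Python sketch of the recurrence as described, not the answerer's own code. It assumes half-open indices [start, end), a 0-based array a (so a_x is a[x - 1]), and memoization through functools.lru_cache. The names max_points and best are made up for illustration.

```python
from functools import lru_cache


def max_points(s, a):
    """Maximum score for erasing the 0/1 string s, where a[x - 1] is the
    reward for erasing a block of length x."""

    @lru_cache(maxsize=None)
    def best(start, end, prefix):
        # State: the characters s[start:end] remain, and `prefix` counts the
        # run of characters equal to s[start] glued at the front, including
        # s[start] itself.
        if start >= end:
            return 0
        # Option 1: erase the current prefix block right now.
        result = a[prefix - 1] + best(start + 1, end, 1)
        # Option 2: clear everything strictly between start and i first,
        # so that s[i] joins the prefix block.
        for i in range(start + 1, end):
            if s[i] == s[start]:
                result = max(result,
                             best(start + 1, i, 1) + best(i, end, prefix + 1))
        return result

    return best(0, len(s), 1)


# Sample from the question.
print(max_points("1101001", [3, 4, 9, 100, 1, 2, 3]))  # 109
```

When a single character remains, option 1 reduces to a[prefix - 1], which is the "return a_prefix" base case from the explanation. There are O(n^3) states with O(n) work each, which should be fine for small n but is not tuned for speed.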

TCL, extract 2 integers from string into list?

I have 2 strings formatted like this:
(1234, 4567)
And I have a list
points {0 1 2 4}
I would like to extract 2 integers from the first string and replace the first two integers in the list, then extract two more integers from the 2nd string and replace the 3rd and 4th integers in the list, so that at the end I have a list of 4 integers taken from the two strings.
So far I have tried all kind of things but always end up with errors or brackets in the list which I do not want. I feel I am missing out on the easy way to do that.
With the first set of values, you can parse with scan or regexp; in this case, I think scan looks better:
set input "(1234, 5678)"
scan $input "(%d,%d)" a b
To update a Tcl list (formally, one in a variable), you use lset; you can give a sequence of (zero-based) indices to it to navigate into the exact place in the list where you want to update:
set workingArea "points {0 1 2 4}"
lset workingArea 1 2 $a
lset workingArea 1 3 $b
puts $workingArea
# prints: points {0 1 1234 5678}

I want to get result in one line. I don't need new dimensions

(1:)`(3:)@.(1&=)"0 i.2
1 3
(1:,2:)`(3:)@.(1&=)"0 i.2
1 2
3 0
I want to get
1 2 3
Without new dimensions. Without zeros.
The shape changes dramatically between (1:) and (1:,2:).
$ 1: 'a'
$ 1 $ 1: 'a'
1
$ (1:,2:) 'a'
2
(1&$ 1:)`(1&$ 3:)@.(1&=)"0 i.2
1
3
There's probably a better way, but to my way of thinking, you're generating arrays of unequal length, which should be boxed, and then you want to turn them into a single list.
Thus:
; ((1:,2:)`(3:))@.(1&=)"0&.> i.2
1 2 3
Which can be refactored and improved a bit:
;@:((1:,2:)`(3:)@.(1&=)each) i.2
1 2 3
You could have used (1:,2:,3:) 'ignored argument' to form the list, but that doesn't address why you were using @.
Dane's comment about boxing intermediate results and then razing the resulting list is relevant if you want to merge irregularly shaped results. (Which might be what you were trying for, here.)

Understanding the Knuth Morris Pratt(KMP) Failure Function

I've been reading the Wikipedia article about the Knuth-Morris-Pratt algorithm and I'm confused about how the values are found in the jump/partial match table.
i    |  0  1  2  3  4  5  6
W[i] |  A  B  C  D  A  B  D
T[i] | -1  0  0  0  0  1  2
Could someone explain the shortcut rule more clearly? I find this sentence confusing:
"let us say that we discovered a proper suffix which is a proper prefix and ending at W[2] with length 2 (the maximum possible)"
If the proper suffix ends at W[2], wouldn't it have size 3?
Also I'm wondering why T[4] isn't 1 when there is a prefix and suffix of size 1: The A.
Thanks for any help that can be offered.
Notice that the failure function T[i] does not use i as an index, but rather as a length. Therefore, T[2] represents the length of the longest proper border (a string that is both a prefix and suffix) of the string formed from the first two characters of W, rather than the longest proper border formed by the string ending at character 2. This is why the maximum possible value of T[2] is 2 rather than 3 - the substring formed from the first two characters of W can't have length any greater than 2.
Using this interpretation, it's also easier to see why T[4] is 0 rather than 1. The substring of W formed from the first four characters of W is ABCD, which has no proper prefix that is also a proper suffix.
Hope this helps!
"let us say that we discovered a proper suffix which is a proper prefix and ending at W[2] with length 2 (the maximum possible)"
Okay, the length can be maximum 2, it's correct, here is why...
One fact: a "proper" prefix can't be the whole string, and the same goes for a "proper" suffix (like a proper subset).
Let W[0]=A, W[1]=A, W[2]=A, i.e. the pattern is "AAA". Then the longest proper prefix is "AA" (reading left to right) and the longest proper suffix is "AA" (reading right to left).
(Yes, the prefix and suffix overlap in the middle "A".)
So the value is 2 rather than 3; it would have been 3 only if the prefix did not have to be proper.
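To check both answers by hand, here is a small brute-force Python sketch of the "longest proper border" definition. It is not the linear-time KMP table construction, only a direct reading of the definition. The function names are made up for illustration, and T[0] = -1 is taken as a convention from the table in the question.

```python
def longest_proper_border(s):
    """Length of the longest string that is both a proper prefix and a
    proper suffix of s (0 if there is none)."""
    # Try candidate lengths from longest to shortest; "proper" means the
    # border must be strictly shorter than s itself.
    for k in range(len(s) - 1, 0, -1):
        if s[:k] == s[-k:]:
            return k
    return 0


def border_table(w):
    """T[i] = longest proper border of the first i characters of w,
    with T[0] = -1 by convention."""
    return [-1] + [longest_proper_border(w[:i]) for i in range(1, len(w))]


print(border_table("ABCDABD"))        # [-1, 0, 0, 0, 0, 1, 2]
print(longest_proper_border("ABCD"))  # 0, which is why T[4] is 0
print(longest_proper_border("AAA"))   # 2, the overlapping "AA" case
```

Reading T[i] as "computed from the first i characters" reproduces the table in the question, including T[4] = 0 for ABCD.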

Counting quantifiers - how

Let's say, I have to model a checkerboard and I want to say that at least 5 squares on the "A" vertical are empty. How do I do that in Alloy? Any other example with numbers different from 0 or 1 would be good. In other words, what do I do when "some" is not precise enough?
Thanks!
You can use the cardinality operator (#) to make assertions about the number of tuples in a relation, e.g.,
#r >= 5
says that the relation r must have at least 5 tuples.
You can also use the cardinality operator with an arbitrary expression, e.g.,
#board.cells >= 5
or
#{c: Cell | c in board.cells and ...} >= 5
