Avoiding a repeated verb name in a train - j

Consider a dyadic verb g, defined in terms of a dyadic verb f:
g=. [ f&.|: f
Is it possible to rewrite g so that the f term appears only once, but the behavior is unchanged?
UPDATE: Local Context
This question came up as part of my solution to this problem, which involves "expanding" a matrix in both directions like so:
Original Matrix
1 2 3
4 5 6
7 8 9
Expanded Matrix
1 1 1 1 2 3 3 3 3
1 1 1 1 2 3 3 3 3
1 1 1 1 2 3 3 3 3
1 1 1 1 2 3 3 3 3
4 4 4 4 5 6 6 6 6
7 7 7 7 8 9 9 9 9
7 7 7 7 8 9 9 9 9
7 7 7 7 8 9 9 9 9
7 7 7 7 8 9 9 9 9
My solution was to expand the matrix rows first using:
f=. ([ # ,:@{.@]) , ] , [ # ,:@{:@]
And then to apply that same solution under the transpose to expand the columns of the already row-expanded matrix:
3 ([ f&.|: f) m
And I noticed that it wasn't possible to write my solution without making the temporary verb f, or repeating its definition inline...
Try it online!
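For readers who don't know J, the expansion itself is just edge padding. A minimal NumPy sketch of the same transformation (only an illustration, not the J code; np.pad with mode='edge' is the standard NumPy call for this):
import numpy as np

m = np.arange(1, 10).reshape(3, 3)    # the original 3x3 matrix
x = 3                                 # how many extra copies on each side

expanded = np.pad(m, x, mode='edge')  # replicate the border values x times
print(expanded)                       # the 9x9 expanded matrix shown above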

Knowing the context helps. You can also approach this using (|:@f)^:(+: x) y. A tacit (and golfed) solution would be 0&(|:{.,],{:)~+:.
(>: i. 3 3) (0&(|:{.,],{:)~+:) 2
1 1 1 2 3 3 3
1 1 1 2 3 3 3
1 1 1 2 3 3 3
4 4 4 5 6 6 6
7 7 7 8 9 9 9
7 7 7 8 9 9 9
7 7 7 8 9 9 9
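Roughly, the idea behind the golfed verb, transcribed to Python/NumPy for comparison (a sketch of the same approach, not the J itself): add one copy of the first row on top and one copy of the last row at the bottom, transpose, and repeat 2*x times.
import numpy as np

def expand(m, x):
    # One pass: prepend a copy of the first row, append a copy of the last
    # row, then transpose. 2*x passes pad both axes by x, and the even
    # number of transposes restores the original orientation.
    for _ in range(2 * x):
        m = np.vstack([m[:1], m, m[-1:]]).T
    return m

print(expand(np.arange(1, 10).reshape(3, 3), 2))   # matches the 7x7 above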

I don't think it is possible. The right tine is going to be the result of x f y and the left tine is x, so the whole fork evaluates as x f&.|: (x f y). The middle tine transposes its arguments, applies f, and then transposes the result back. If you take the right f out, then there is no way to form x f y, and if the middle f is removed, then f is no longer applied under the transpose.
My guess is that you are looking for a primitive that will accomplish the same result with only one mention of f, but I don't know of one.
Knowing the J community someone will prove me wrong!

Related

Excel formulae for max or min of multiple occurrences of vlookup

I need to do a complicated vlookup/maxif type of selection. The data I have is as below
Row Col G Col H Col I Col J Col K
1 Bench Strip Block BenchAbove BenchBelow
2 1 1 4
3 1 1 5
4 1 1 6
5 1 1 7
6 1 1 8
7 8 1 4 ?? ??
8 8 1 5
9 8 1 6
10 8 1 7
11 8 1 8
12 9 1 4
13 9 1 5
14 9 1 6
15 9 1 7
..... this list is long (this is a sample only)
For every combination of (Strip, Block), say (1,4), there are benches such as 1, 8 and 9. The bench above 8 is 1 and the bench below 8 is 9. I need to determine the bench above and bench below for each row. There is no bench above 1 and no bench below 9.
I don't think vlookup is the solution here, and I'm not sure MAX(IF..) can help either. What would be the best formulae to obtain, say on row 7, where the strip/block combination is 1,4 and the bench in question is 8, a bench above of 1 and a bench below of 9? So two formulae will be required, in Col J and Col K above.
The expected answer for the above sample data is :
Row Col G Col H Col I Col J Col K
1 Bench Strip Block BenchAbove BenchBelow
2 1 1 4 - 8
3 1 1 5 - 8
4 1 1 6 - 8
5 1 1 7 - 8
6 1 1 8 - 8
7 8 1 4 1 9
8 8 1 5 1 9
9 8 1 6 1 9
10 8 1 7 1 9
11 8 1 8 1 9
12 9 1 4 8 -
13 9 1 5 8 -
14 9 1 6 8 -
15 9 1 7 8 -
Maybe in J2:
=IFERROR(LOOKUP(2,1/((H$1:H1=H2)*(I$1:I1=I2)),G$1:G1),"-")
In K2:
=IFERROR(INDEX(G3:G$16,MATCH(1,INDEX((H3:H$16=H2)*(I3:I$16=I2),),0)),"-")
However, I find your question a bit confusing so this answer might be a bit off.
If you have Office 365 then you can easily use MAXIFS() and MINIFS() to get BenchAbove and BenchBelow. Try:
=MAXIFS(A2:A15,B2:B15,B7,C2:C15,C7)
=MINIFS(A2:A15,B2:B15,B7,C2:C15,C7)
EDIT: Solution for Excel 2016
Try the formulas below:
=INDEX($A$2:$A$15,AGGREGATE(14,6,ROW($A$2:$A$15)-ROW($A$1)/(($B$2:$B$15=B7)*($C$2:$C$15=C7)),ROW(1:1)))
=INDEX($A$2:$A$15,AGGREGATE(15,6,ROW($A$2:$A$15)-ROW($A$1)/(($B$2:$B$15=B7)*($C$2:$C$15=C7)),ROW(1:1)))
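As a cross-check of what these formulas compute (the largest and smallest Bench among rows sharing the same Strip and Block), here is a rough pandas equivalent. The DataFrame and column names below are reconstructed from the sample data and are assumptions, not part of the original answer:
import pandas as pd

# Sample rows from the question: Bench, Strip, Block.
df = pd.DataFrame({
    'Bench': [1, 1, 1, 1, 1, 8, 8, 8, 8, 8, 9, 9, 9, 9],
    'Strip': [1] * 14,
    'Block': [4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 4, 5, 6, 7],
})
grp = df.groupby(['Strip', 'Block'])['Bench']
df['MaxBench'] = grp.transform('max')   # per-row equivalent of MAXIFS
df['MinBench'] = grp.transform('min')   # per-row equivalent of MINIFS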

Add ordinal number column to output of custom verb in J

If I type !i.10 it gives the factorials of the first 10 numbers.
However, if I try to add a column of ordinal numbers with >/.!i.10, 1+i.10, J freezes or I get an "Out of memory" error. How do I create custom tables?
I think that what is happening is that you are creating something much bigger than you expect. Taking it in steps:
1+ i. 10 NB. list of 1 to 10
1 2 3 4 5 6 7 8 9 10
10 , 1+ i. 10 NB. 10 prepended
10 1 2 3 4 5 6 7 8 9 10
i. 10 , 1+ i. 10 NB. creates an 11-dimensional array with shape 10 1 2 3 4 5 6 7 8 9 10 and largest value 36287999
When you apply ! to that i. 10 , 1+ i. 10 you get some very large numbers. I am not sure what you are trying to do with the leading >/.
Is this what you had in mind?
(!1 + i.10),. (1+i.10) NB. using parentheses to isolate operations
1 1
2 2
6 3
24 4
120 5
720 6
5040 7
40320 8
362880 9
3.6288e6 10
To give extended type and get rid of the 3.6288e6 we can use x:
(x:!1 + i.10),. (1+i.10)
1 1
2 2
6 3
24 4
120 5
720 6
5040 7
40320 8
362880 9
3628800 10
or tacit
(x:@! ,. ]) @ (1+i.) 10
1 1
2 2
6 3
24 4
120 5
720 6
5040 7
40320 8
362880 9
3628800 10
Or a version I find a little better
([: (,.~ !) 1x + i.) 10
1 1
2 2
6 3
24 4
120 5
720 6
5040 7
40320 8
362880 9
3628800 10
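For cross-checking the numbers, the same two-column table in Python (just a sanity check, not J):
from math import factorial

# Factorial alongside its ordinal, mirroring the J output above.
for n in range(1, 11):
    print(factorial(n), n)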

How do I calculate the probability of every value in a dataframe column quickly in Python?

I want to calculate the probability of every value in a DataFrame column according to the column's own distribution. For example, my data looks like this:
data
0 1
1 1
2 2
3 3
4 2
5 2
6 7
7 8
8 3
9 4
10 1
And the output I expect like this:
data pro
0 1 0.155015
1 1 0.155015
2 2 0.181213
3 3 0.157379
4 2 0.181213
5 2 0.181213
6 7 0.048717
7 8 0.044892
8 3 0.157379
9 4 0.106164
10 1 0.155015
I also referred to another question (How to compute the probability ...) and got the example above from it. My code is as follows:
import scipy.stats
import pandas as pd
samples = [1,1,2,3,2,2,7,8,3,4,1]
samples = pd.DataFrame(samples,columns=['data'])
print(samples)
kde = scipy.stats.gaussian_kde(samples['data'].tolist())
samples['pro'] = kde.pdf(samples['data'].tolist())
print(samples)
The problem is that if the column is too long, the operation becomes slow. Is there a better way to do it in pandas? Thanks in advance.
Its own distribution does not necessarily mean a KDE. You can use value_counts with normalize=True:
df.assign(pro=df.data.map(df.data.value_counts(normalize=True)))
data pro
0 1 0.272727
1 1 0.272727
2 2 0.272727
3 3 0.181818
4 2 0.272727
5 2 0.272727
6 7 0.090909
7 8 0.090909
8 3 0.181818
9 4 0.090909
10 1 0.272727
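If the column is very long, two variations may help; this is a sketch under the same assumptions as the question (a DataFrame with a data column). The groupby/transform line gives the same empirical probabilities as the output just above in a single pass, and if the KDE densities are what you actually want, evaluating the KDE only on the unique values and mapping the results back avoids recomputing it per row.
import pandas as pd
import scipy.stats

df = pd.DataFrame({'data': [1, 1, 2, 3, 2, 2, 7, 8, 3, 4, 1]})

# Empirical relative frequencies, same numbers as the value_counts answer.
df['pro'] = df.groupby('data')['data'].transform('count') / len(df)

# If you really want the KDE densities: evaluate the KDE once per unique
# value and map the results back instead of evaluating it for every row.
kde = scipy.stats.gaussian_kde(df['data'])
uniq = df['data'].unique()
df['kde_pro'] = df['data'].map(dict(zip(uniq, kde.pdf(uniq))))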

Average of multiple files with unequal row sizes in Shell

I have 15 data files with unequal numbers of rows, but the number of columns in each file is the same, e.g.
ifile1.dat ifile2.dat ifile3.dat and so on ............
0 0 0 0 1 6
1 2 5 3 2 7
2 5 6 10 4 6
5 2 8 9 5 9
10 2 10 3 8 2
In each file the 1st column represents the index number.
I would like to compute the average across all these files for each index number in column 1, i.e.
ofile.txt
0 0 [This is computed as (0+0)/2]
1 4 [This is computed as (2+6)/2]
2 6 [This is computed as (5+7)/2]
3 [no value]
4 6 [This is computed as (6)/1]
5 4.66 [This is computed as (2+3+9)/3]
6 10
7
8 5.5
9
10 2.5
I can't think of any simple method to do it. I was thinking of a method, but it seems very lengthy: taking the average after converting all the files to the same row size, e.g.
ifile1.dat ifile2.dat ifile3.dat and so on ............
0 0 0 0 0 0
1 2 1 1 6
2 5 2 2 7
3 3 3
4 4 4 6
5 2 5 3 5 9
6 6 10 6
7 7 7
8 8 9 8 2
9 9 9
10 2 10 3 10
$ awk '{s[$1]+=$2; c[$1]++;} END{for (i in s) print i,s[i]/c[i];}' ifile*.dat
0 0
1 4
2 6
4 6
5 4.66667
6 10
8 5.5
10 2.5
In the above code, there are two arrays, s and c. s[i] is the sum of all entries with index i and c[i] is the number of entries with index i. After we have read all the files, we print the average, s[i]/c[i], for each index i.
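The same bookkeeping, sketched in Python for comparison (not part of the original awk answer; it assumes the files match ifile*.dat, with the index in column 1 and the value in column 2):
import glob
from collections import defaultdict

sums = defaultdict(float)    # sums[idx]   = running total for this index
counts = defaultdict(int)    # counts[idx] = number of entries for this index
for path in glob.glob('ifile*.dat'):
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 2:
                sums[fields[0]] += float(fields[1])
                counts[fields[0]] += 1

for idx in sorted(sums, key=float):
    print(idx, sums[idx] / counts[idx])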

Sort a group of data based on a column

I have an input file that contains the following data:
1 2 3 4
4 6
8 9
10
2 1 5 7
3
3 4 2 9
2 7
11
I'm trying to sort the groups of data based on the third column and get output like this:
2 1 5 7
3
1 2 3 4
4 6
8 9
10
3 4 2 9
2 7
11
Could you tell me how to do so?
sort -nk3r
will sort lines in reverse numeric order based on the 3rd column. Note, however, that this outputs
2 1 5 7
1 2 3 4
3 4 2 9
10
11
2 7
3
4 6
8 9
because sort orders each line independently by that key rather than keeping the blocks together. This produces a different result from the output you posted, but it is correct according to the question as literally stated.
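If the blocks really must move as units (each header line together with the shorter lines that follow it), sort alone won't do it. Here is a small Python sketch under that assumption; the four-field rule for detecting a header line and the file name input.txt are guesses based on the sample, not something stated in the question:
groups = []
with open('input.txt') as fh:            # 'input.txt' is a placeholder name
    for line in fh:
        if len(line.split()) == 4:       # a four-field line starts a block
            groups.append([line])
        elif groups:
            groups[-1].append(line)

# Order the blocks by the numeric third field of their header line, descending.
groups.sort(key=lambda g: int(g[0].split()[2]), reverse=True)
for g in groups:
    print(''.join(g), end='')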
