Composing addition and division verbs over lists - j

If data =: 3 1 4 and frac =: % +/, why does % +/ data result in 0.125 but frac data result in 0.375 0.125 0.5?

%+/ 3 1 4 is "sum, then find reciprocal of that sum", that is:
+/ 3 1 4
8
% 8 NB. same as 1%8
0.125
But if you define frac =: %+/, then %+/ becomes a group of two verbs isolated from their arguments (a tacit definition), that is, a hook:
(%+/) 3 1 4
0.375 0.125 0.5
Which reads "sum, then divide original vector by that sum":
+/ 3 1 4
8
3 1 4 % 8
0.375 0.125 0.5
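For comparison, here is a hypothetical Python sketch of what the hook computes, dividing the original vector by its sum:

```python
def frac(data):
    # like the J hook (% +/): divide each element by the sum of the whole list
    total = sum(data)
    return [x / total for x in data]

print(frac([3, 1, 4]))  # → [0.375, 0.125, 0.5]
```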
If you want frac to behave as in the first example, then you need to either use an explicit definition:
frac =: 3 : '%+/y'
frac 3 1 4
0.125
Or to compose % and +/, e.g. with the atop conjunction (@) or a clever use of a fork with a capped ([:) left branch:
%@(+/) 3 1 4
0.125
([:%+/) 3 1 4
0.125
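In Python terms, both compositions amount to ordinary function composition: sum first, then take the reciprocal (a sketch, not J semantics):

```python
def recip_of_sum(data):
    # like J's %@(+/) or ([: % +/): sum, then reciprocal of that sum
    return 1 / sum(data)

print(recip_of_sum([3, 1, 4]))  # → 0.125
```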


Create Multiple rows for each value in given column in pandas df

I have a dataframe with points given in two columns x and y.
Thing x y length_x length_y
0 A 1 3 1 2
1 B 2 3 2 1
These (x,y) points are situated in the middle of one of the sides of a rectangle with side lengths length_x and length_y. What I wish to do is, for each of these points, give the coordinates of the rectangle they are on. That is, the coordinates for Thing A would be:
(1+1*0.5, 3), (1-1*0.5, 3), (1+1*0.5, 3-2*0.5), (1-1*0.5, 3-2*0.5)
The half comes from the fact that the given point is the middle of a side, so half the length is the distance from that point to the corner of the rectangle.
Hence my desired output is:
Thing x y Corner_x Corner_y length_x length_y
0 A 1 3 1.5 2.0 1 2
1 A 1 3 1.5 1.0 1 2
2 A 1 3 0.5 2.0 1 2
3 A 1 3 0.5 1.0 1 2
4 A 1 3 1.5 2.0 1 2
5 B 2 3 3.0 3.0 2 1
6 B 2 3 3.0 2.5 2 1
7 B 2 3 1.0 3.0 2 1
8 B 2 3 1.0 2.5 2 1
9 B 2 3 3.0 3.0 2 1
I tried to do this by defining a lambda returning two values but failed. I even tried to create multiple columns and then stack them, but it's really dirty.
bb = []
for thing in list_of_things:
    new_df = df[df['Thing'] == thing]
    new_df = new_df.sort_values('x', ascending=False)
    new_df['corner 1_x'] = new_df['x'] + new_df['length_x']/2
    new_df['corner 1_y'] = new_df['y']
    new_df['corner 2_x'] = new_df['x'] - new_df['length_x']/2
    new_df['corner 2_y'] = new_df['y'] - new_df['length_y']/2
    .........
Note also that the first corner's coordinates need to be repeated, as I later want to use geopandas to transform each of these sets of coordinates into a POLYGON.
What I am looking for is a way to generate these rows in a fast and clean way.
You can use apply to create your corners as lists and explode them to get the four rows per group. Finally, join the output to the original dataframe:
df.join(df.apply(lambda r: pd.Series({'corner_x': [r['x'] + r['length_x']/2, r['x'] - r['length_x']/2],
                                      'corner_y': [r['y'] + r['length_y']/2, r['y'] - r['length_y']/2],
                                      }), axis=1)
          .explode('corner_x').explode('corner_y'),
        how='right')
output:
Thing x y length_x length_y corner_x corner_y
0 A 1 3 1 2 1.5 4
0 A 1 3 1 2 1.5 2
0 A 1 3 1 2 0.5 4
0 A 1 3 1 2 0.5 2
1 B 2 3 2 1 3 3.5
1 B 2 3 2 1 3 2.5
1 B 2 3 2 1 1 3.5
1 B 2 3 2 1 1 2.5
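An equivalent way to get the same four rows per record, without apply/explode, is to enumerate the four signed offsets directly (a sketch assuming the small example frame built below):

```python
from itertools import product

import pandas as pd

df = pd.DataFrame({'Thing': ['A', 'B'], 'x': [1, 2], 'y': [3, 3],
                   'length_x': [1, 2], 'length_y': [2, 1]})

# one output row per (row, sign pair): corner = centre ± half the side length
rows = []
for _, r in df.iterrows():
    for sx, sy in product([1, -1], repeat=2):
        rows.append({**r.to_dict(),
                     'corner_x': r['x'] + sx * r['length_x'] / 2,
                     'corner_y': r['y'] + sy * r['length_y'] / 2})
corners = pd.DataFrame(rows)
print(corners[['Thing', 'corner_x', 'corner_y']])
```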

In Python: How to convert 1/8th of space to 1/6th of space?

Have got a dataframe at store-product level, as shown in the sample below:
Store Product Space Min Max Total_table Carton_Size
11 Apple 0.25 0.0625 0.75 2 6
11 Orange 0.5 0.125 0.5 2 null
11 Tomato 0.75 0.0625 0.75 2 6
11 Potato 0.375 0.0625 0.75 2 6
11 Melon 0.125 0.0625 0.5 2 null
Scenario: All products here have space in multiples of 1/8. But if a product has a carton_size other than null, then that product's space has to be converted into multiples of 1/(carton_size), respecting Min (space shouldn't be less than Min) and Max (space shouldn't be greater than Max). Space can be taken from non-carton products, but at the end the sum of the 'Space' column should be equal to or less than the 'Total_table' value. Also, these 1/8 and 1/6 values are relative to 'Total_table'; this total_table value is split as Space among the products.
Example: In the dataframe above, three products have a carton size, so we can take 1/8 of space from the non-carton products, selecting from the top, and split it as 1/24 (1/24 + 1/24 + 1/24 = 1/8), which can be added to the three carton products to make each of them 1/6. This forms the expected output shown below, considering the Min and Max values. If a product doesn't satisfy the Min or Max condition, leave that product unchanged (e.g., Tomato).
Roughly Expected Output:
Store Product Space Min Max Total_table Carton_Size
11 Apple 0.292 0.0625 0.75 2 6
11 Orange 0.375 0.125 0.5 2 null
11 Tomato 0.75 0.0625 0.75 2 6
11 Potato 0.417 0.0625 0.75 2 6
11 Melon 0.125 0.0625 0.5 2 null
Need solution in Python.
Thanks in Advance!
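The core arithmetic in the scenario can be checked directly with Python's fractions module: moving a product from a 1/8 share to a 1/6 share requires an extra 1/24, which is exactly a 1/8 share split three ways:

```python
from fractions import Fraction

# converting a 1/8 share into a 1/6 share needs an extra 1/24 per product,
# and a single 1/8 share split three ways supplies exactly that: 3 * 1/24 == 1/8
delta = Fraction(1, 6) - Fraction(1, 8)
print(delta)                         # 1/24
print(3 * delta == Fraction(1, 8))   # True
```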

How to write maximize or minimize function in J

For example, suppose I want to maximize the expectation-of-returns function
E[r] = w1*r1 + w2*r2 and solve for the optimal weights w1 and w2.
The only constraint that you have really given is that w1 + w2 = 1:
w1 =. 0.25
(, -.)w1
0.25 0.75
That takes care of both w1 and w2 given the value of w1.
(r1,r2) +/@:* (w1,w2) calculates r1*w1 + r2*w2:
r1 =. 5
r2 =.10
(r1,r2) (+/@:* (,-.)) w1
8.75
(r1,r2) (+/@:* (,-.)) 0.9
5.5
(r1,r2) (+/@:* (,-.)) 0.01
9.95
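The same weighted expectation can be sketched in Python; like the J hook (+/@:* (,-.)), it derives w2 as 1 - w1:

```python
def expected_return(r, w1):
    # weights are (w1, 1 - w1), mirroring the J phrase (, -.) applied to w1
    r1, r2 = r
    return r1 * w1 + r2 * (1 - w1)

print(expected_return((5, 10), 0.25))  # → 8.75
```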
If you really wanted to maximize you would need to add equations for the value of r1 and r2 and take those into account as well, but perhaps I don't understand your question?
Responding to the comment below: if the constraint w1 + w2 = 1 is still in play, then the matter just becomes summing the values in r1 and r2; whichever sum is bigger gets the w value of 1 and the other gets the w value of 0.
r1=.2 4 6 3 2
r2=.2.1 4 6 3 2
r3=.2 4 6 3 2.3
r1 (,-.)@:>/@:(+/@:,.) r2
0 1
r2 (,-.)@:>/@:(+/@:,.) r1
1 0
r3 (,-.)@:>/@:(+/@:,.) r2
1 0
'w1 w2'=. r3 (,-.)@:>/@:(+/@:,.) r2
w1
1
w2
0
'w1 w2'=. r1 (,-.)@:>/@:(+/@:,.) r2
w1
0
w2
1
(r1,.r2) +/@:,@:(+ . *) (0 1) NB. w1=.0 w2=.1
17.1
(r1,.r2) +/@:,@:(+ . *) (1 0) NB. w1=.1 w2=.0
17
(r1,.r2) +/@:,@:(+ . *) (0.5 0.5) NB. w1=.0.5 w2=.0.5
17.05
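A hypothetical Python rendering of this all-or-nothing weighting: whichever series has the larger sum gets weight 1, the other gets 0:

```python
def pick_weights(r1, r2):
    # like the J phrase (,-.)@:>/@:(+/@:,.): compare the sums of both series,
    # then put all the weight on the larger one
    w1 = 1 if sum(r1) > sum(r2) else 0
    return (w1, 1 - w1)

r1 = [2, 4, 6, 3, 2]      # sums to 17
r2 = [2.1, 4, 6, 3, 2]    # sums to 17.1

print(pick_weights(r1, r2))  # → (0, 1)
print(pick_weights(r2, r1))  # → (1, 0)
```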
Based on the follow-up comment below, I would approach it in one of two ways: I could dig up all my linear programming texts from the 1980s and come up with the definitive mathematical solution (including degenerate cases and local maxima/minima), or I could use the same technique as above for a larger case than n=2. I'm going with the second option.
Let's look first at the r matrix, which will be a set of constants. For this example I am taking a random 5×10 matrix with values from 1 to 10.
r=. >: ? 5 10 $ 10
r
4 4 8 1 4 3 6 9 6 2
2 6 5 4 4 7 5 10 4 6
2 4 9 10 1 1 9 8 2 7
5 6 5 4 7 9 2 6 10 6
10 3 6 2 10 2 7 10 4 2
Now the trick that I am going to use is that I want to find the column with the highest average to be multiplied by the largest value of w. Easy to do with J using (+/ % #)
(+/ % #) r
4.6 4.6 6.6 4.2 5.2 4.4 5.8 8.6 5.2 4.6
Then find the ranking of the list to be able to reorder the columns of the original r matrix. The leading 7 means that column 7 of r has the largest average, etc.
\:@:(+/ % #) r
7 2 6 4 8 0 1 9 5 3
I use this in turn to reorder the columns of the matrix r, using {"1 since I am working on columns. The result is that the columns of r are reordered so that the column with the largest average is on the left and the one with the smallest is on the right.
(\:@:(+/ % #) {"1 ]) r
9 8 6 4 6 4 4 2 3 1
10 5 5 4 4 2 6 6 7 4
8 9 9 1 2 2 4 7 1 10
6 5 2 7 10 5 6 6 9 4
10 6 7 10 4 10 3 2 2 2
Once I have that, then the next thing is to develop the w vector. Since I now have all the largest averages on the left I will just maximize the values to the left of w to be as large as possible within the noted constraints.
w=. 0.2 0.2 0.2 0.2 0.15 0.01 0.01 0.01 0.01 0.01
#w NB. w1 through w10
10
+/w NB. sum of the values in w
1
>./w NB. largest value in w
0.2
<./w NB. smallest value in w
0.01
Because the r matrix has been reordered, + . * gives the products w1r1, w2r2, w3r3 ... w10r10 within each row:
(({"1~ \:@:(+/ % #))r) + . * w
1.8 1.6 1.2 0.8 0.9 0.04 0.04 0.02 0.03 0.01
2 1 1 0.8 0.6 0.02 0.06 0.06 0.07 0.04
1.6 1.8 1.8 0.2 0.3 0.02 0.04 0.07 0.01 0.1
1.2 1 0.4 1.4 1.5 0.05 0.06 0.06 0.09 0.04
2 1.2 1.4 2 0.6 0.1 0.03 0.02 0.02 0.02
To actually get the weighted total of the matrix, ravel all the values and then sum:
+/ , (({"1~ \:@:(+/ % #))r) + . * w
31.22
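A hypothetical NumPy rendering of the whole pipeline (reorder columns by descending mean, weight each column, total everything) reproduces the 31.22:

```python
import numpy as np

r = np.array([[4, 4, 8, 1, 4, 3, 6, 9, 6, 2],
              [2, 6, 5, 4, 4, 7, 5, 10, 4, 6],
              [2, 4, 9, 10, 1, 1, 9, 8, 2, 7],
              [5, 6, 5, 4, 7, 9, 2, 6, 10, 6],
              [10, 3, 6, 2, 10, 2, 7, 10, 4, 2]], dtype=float)
w = np.array([0.2, 0.2, 0.2, 0.2, 0.15, 0.01, 0.01, 0.01, 0.01, 0.01])

order = np.argsort(-r.mean(axis=0), kind='stable')  # like \:@:(+/ % #)
print(order)                       # → [7 2 6 4 8 0 1 9 5 3]
total = (r[:, order] * w).sum()    # weight every reordered column, then sum all
print(round(total, 2))             # → 31.22
```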

Does convolution in Theano rotate the filters?

I have a 3-channel 5-by-5 image like this:
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
And a 3-channel 3-by-3 filter like this:
10 20 30 0.1 0.2 0.3 1 2 3
40 50 60 0.4 0.5 0.6 4 5 6
70 80 90 0.7 0.8 0.9 7 8 9
When convolving the image with the filter, I expect this output:
369.6 514.8 316.8
435.6 594. 356.4
211.2 277.2 158.4
However, Theano (using keras) gives me this output:
158.4 277.2 211.2
356.4 594. 435.6
316.8 514.8 369.6
It seems the output is rotated 180 degrees. I wonder why this happens and how I can get the correct answer. Here is my test code:
import numpy as np
from keras.models import Sequential
from keras.layers import ZeroPadding2D, Convolution2D
from keras.optimizers import SGD

def SimpleNet(weight_array, biases_array):
    model = Sequential()
    model.add(ZeroPadding2D(padding=(1,1), input_shape=(3,5,5)))
    model.add(Convolution2D(1, 3, 3, weights=[weight_array, biases_array],
                            border_mode='valid', subsample=(2,2)))
    return model

im = np.asarray([
    1,1,1,1,1,
    1,1,1,1,1,
    1,1,1,1,1,
    1,1,1,1,1,
    1,1,1,1,1,
    2,2,2,2,2,
    2,2,2,2,2,
    2,2,2,2,2,
    2,2,2,2,2,
    2,2,2,2,2,
    3,3,3,3,3,
    3,3,3,3,3,
    3,3,3,3,3,
    3,3,3,3,3,
    3,3,3,3,3])
weight_array = np.asarray([
    10,20,30,
    40,50,60,
    70,80,90,
    0.1,0.2,0.3,
    0.4,0.5,0.6,
    0.7,0.8,0.9,
    1,2,3,
    4,5,6,
    7,8,9])
im = np.reshape(im, [1,3,5,5])
weight_array = np.reshape(weight_array, [1,3,3,3])
biases_array = np.zeros(1)
model = SimpleNet(weight_array, biases_array)
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy')
out = model.predict(im)
print(out.shape)
print(out)
This is the definition of convolution. It has the advantage that if you convolve an image consisting of all zeros except for a single 1 somewhere, the convolution places a copy of the filter at that position.
Theano does exactly these convolutions, as defined mathematically. This implies flipping the filters (the operation is filter[:, :, ::-1, ::-1]) before taking dot products with the image patches. Flipping a 2-D filter slice along both spatial axes is the same as rotating it by 180 degrees, which is exactly the effect you observed.
It appears that what you are looking for is cross-correlation, which takes dot products with the non-flipped versions of the filters at each point of the image.
See also this answer in which theano.tensor.nnet.conv2d is shown to do exactly the same thing as the scipy counterpart.
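The flip-then-correlate relationship is easy to check in plain NumPy (a sketch, not Theano's actual implementation): with a single-1 "delta" image, convolution reproduces the filter, while cross-correlation reproduces its 180-degree rotation:

```python
import numpy as np

def correlate2d_valid(img, k):
    # cross-correlation: dot product with the un-flipped kernel at each offset
    kh, kw = k.shape
    return np.array([[np.sum(img[i:i + kh, j:j + kw] * k)
                      for j in range(img.shape[1] - kw + 1)]
                     for i in range(img.shape[0] - kh + 1)])

def convolve2d_valid(img, k):
    # mathematical convolution: correlate with the kernel flipped on both axes
    return correlate2d_valid(img, k[::-1, ::-1])

delta = np.zeros((5, 5))
delta[2, 2] = 1.0
k = np.arange(9.0).reshape(3, 3)

print(np.array_equal(convolve2d_valid(delta, k), k))               # → True
print(np.array_equal(correlate2d_valid(delta, k), k[::-1, ::-1]))  # → True
```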

gnuplot: fetching a variable value from different row/column for calculations

I want to get a specific value from another row & column to normalize my data. The tricky part is that this value changes for every data point in my data set.
Here what my data set looks like:
64 22370 1 585 1 10
128 47547 1 4681 1 10
256 291761 1 37449 1 10
128 48446 1.019 4681 1 10
256 480937 1.648 37449 1 10
128 7765 0.163 777 0.166 10
256 7164 0.025 1393 0.037 10
128 37078 0.780 4681 1 10
256 334372 1.146 37449 1 10
128 45543 0.958 4681 1 10
128 5579 0.117 649 0.139 10
128 40121 0.844 4529 0.968 10
128 49494 1.041 4681 1 10
# --> here it starts to repeat
64 48788 1 585 1 20
128 110860 1 4681 1 20
256 717797 1 37449 1 20
128 101666 0.917 4681 1 20
......
......
This data file contains the points for 13 different sets in total, so I plot it with something like this:
plot\
'../logs.dat' every 13::1 u 6:2 title '' with lines lt 3 lc 'black' lw 1,\
'../logs.dat' every 13::3 u 6:2 title '' with lines lt 3 lc 'black' lw 1,\
Now I try to normalize my data. The interesting value is in row 1, column 2 (counting from 0), and then 13 is added to the row index for every further data point.
For example: The first data set I want to plot would be
(10:47547/47547)
(20:110860/110860)
...
The second plot should be
(10:48446/47547)
(20:101666/110860)
...
And so on.
In pseudo code it would read something like
plot\
'../logs.dat' every 13::1 u 6:($2 / take i:$2 for i = i + 13 ) title '' with lines lt 3 lc 'black' lw 1,\
'../logs.dat' every 13::3 u 6:($2 / take i:$2 for i = i + 13 ) title '' with lines lt 3 lc 'black' lw 1,\
I hope I could make clear what I am trying to achieve.
Thank you for any help!
If the value you want to use for normalisation is the very first to be plotted, then something like this is possible:
plot y0=-1e10, "data" using 1:(y0 == -1e10 ? (y0 = $2, 1) : $2/y0)
The normalisation value y0 is initialised to -1e10 on every replot. Check the help for ternary operator and serial evaluation.
But really you'd better pre-process your data.
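If you do decide to pre-process, a short hypothetical Python script can perform the block-wise normalisation before gnuplot ever sees the file (the cycle length and reference row are assumptions matching the question's layout):

```python
def normalize_blocks(rows, cycle=13, ref_row=1, col=1):
    # within each block of `cycle` rows, divide column `col` by the value
    # found at row index `ref_row` of that block
    out = []
    for start in range(0, len(rows), cycle):
        block = [list(r) for r in rows[start:start + cycle]]
        ref = block[ref_row][col]
        for r in block:
            r[col] = r[col] / ref
        out.extend(block)
    return out

# toy data: two blocks of three rows, reference value in the middle row
data = [[10, 2], [10, 4], [10, 8],
        [20, 3], [20, 6], [20, 12]]
print(normalize_blocks(data, cycle=3, ref_row=1))
# → [[10, 0.5], [10, 1.0], [10, 2.0], [20, 0.5], [20, 1.0], [20, 2.0]]
```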
If I understood your question correctly you want to normalize some of your data in a special way.
For the first plot you want to start from the second line (row index 1), divide the value in column 2 by itself, and continue with every 13th row.
So this divides the values of the second column for the row-index pairs 1/1, 14/14, 27/27, ..., (n*13+1)/(n*13+1). This is trivial because the result will always be 1.
For the second plot you want to start with the value in column 2 at row index 3, divide it by the value in column 2 of row index 1, and repeat this for every 13th row,
i.e. the involved row-index pairs are: 3/1, 16/14, 29/27, ..., (n*13+3)/(n*13+1).
For the second case, a construct with every 13 will not work, because you need every 13th value and also every 13th value shifted by 2 rows.
So, what you can do:
If you pass by row-index 1 (and every 13th row later), remember the value in column 2 and when you pass by row-index 3, divide this value by the remembered value and plot it, otherwise plot NaN. Repeat this for all rows cycled by 13. You can use the pseudocolumn 0 (check help pseudocolumns) and the modulo operator (check help operators binary).
If you want a continuous line with lines or linespoints you need to set datafile missing NaN because NaN values would interrupt the lines (check help missing). However, this works only for gnuplot>=5.0.6. For gnuplot 5.0.0 (version at OP's question) you have to use some workaround.
Script:
### special normalization of data
reset session
$Data <<EOD
1 900 3 4 5 10
2 1000 3 4 5 10
3 1050 3 4 5 10
4 1100 3 4 5 10
5 1150 3 4 5 10
6 1200 3 4 5 10
7 1250 3 4 5 10
8 1300 3 4 5 10
9 1350 3 4 5 10
10 1400 3 4 5 10
11 1450 3 4 5 10
12 1500 3 4 5 10
13 1550 3 4 5 10
#
1 1900 3 4 5 20
2 2000 3 4 5 20
3 2050 3 4 5 20
4 2100 3 4 5 20
5 2150 3 4 5 20
6 2200 3 4 5 20
7 2250 3 4 5 20
8 2300 3 4 5 20
9 2350 3 4 5 20
10 2400 3 4 5 20
11 2450 3 4 5 20
12 2500 3 4 5 20
13 2550 3 4 5 20
#
1 2900 3 4 5 30
2 3000 3 4 5 30
3 3050 3 4 5 30
4 3100 3 4 5 30
5 3150 3 4 5 30
6 3200 3 4 5 30
7 3250 3 4 5 30
8 3300 3 4 5 30
9 3350 3 4 5 30
10 3400 3 4 5 30
11 3450 3 4 5 30
12 3500 3 4 5 30
13 3550 3 4 5 30
EOD
M = 13 # cycle of your data
set datafile missing NaN # only for gnuplot>=5.0.6
plot $Data u 6:(1) every M w lp pt 7 lc "red" ti "Normalized 1/1", \
'' u 6:(int($0)%M==1?y0=$2:0,int($0)%M==3?$2/y0:NaN) w lp pt 7 lc "blue" ti "Normalized 3/1"
### end of code
Result: (plot not shown)
