Why is "0 not equivalent to "_1 when the arguments are one-dimensional arrays? - j

Why are these two expressions not equivalent in this situation?
0 1 2 ,"(0)/ 0 1
0 0
0 1
1 0
1 1
2 0
2 1
0 1 2 ,"(_1)/ 0 1
|length error
| 0 1 2 ,"(_1)/0 1
What I'm actually trying to do:
a =: 0 1 2 3 4 5 ; 0 1 2 ; 0 1
I want to get all possible combinations:
,"0/&>/ a
This code doesn't work.
This works though:
0 1 2 3 4 5 ,"(0 1)/ 0 1 2 ,"(0 0)/ 0 1
But, of course, I want to write it in the short form:
,"0/&>/ a
The problem is that every term should use
,"(0 1)/
except the last one, which should use
,"(0 0)/

Maybe this will help: all it is doing is appending at rank 0.
0 ,"(0)/ 0 1
0 0
0 1
1 ,"(0)/ 0 1
1 0
1 1
2 ,"(0)/ 0 1
2 0
2 1
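As a rough cross-check outside J, the same pairing can be sketched in Python. This is only an illustration: the helper name append_table is hypothetical, and the nested lists merely stand in for the array that ,"0/ builds from two vectors.

```python
def append_table(left, right):
    """Pair every atom of `left` with every atom of `right` (hypothetical helper)."""
    # One block per left atom, one row per right atom.
    return [[[x, y] for y in right] for x in left]


table = append_table([0, 1, 2], [0, 1])
for block in table:
    for row in block:
        print(*row)
```

Each left atom meets each right atom exactly once, so the lengths of the two arguments never have to agree.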
For the actual solution to the problem you are investigating, have you looked at Catalogue ({)?
{a
┌─────┬─────┐
│0 0 0│0 0 1│
├─────┼─────┤
│0 1 0│0 1 1│
├─────┼─────┤
│0 2 0│0 2 1│
└─────┴─────┘
┌─────┬─────┐
│1 0 0│1 0 1│
├─────┼─────┤
│1 1 0│1 1 1│
├─────┼─────┤
│1 2 0│1 2 1│
└─────┴─────┘
┌─────┬─────┐
│2 0 0│2 0 1│
├─────┼─────┤
│2 1 0│2 1 1│
├─────┼─────┤
│2 2 0│2 2 1│
└─────┴─────┘
┌─────┬─────┐
│3 0 0│3 0 1│
├─────┼─────┤
│3 1 0│3 1 1│
├─────┼─────┤
│3 2 0│3 2 1│
└─────┴─────┘
┌─────┬─────┐
│4 0 0│4 0 1│
├─────┼─────┤
│4 1 0│4 1 1│
├─────┼─────┤
│4 2 0│4 2 1│
└─────┴─────┘
┌─────┬─────┐
│5 0 0│5 0 1│
├─────┼─────┤
│5 1 0│5 1 1│
├─────┼─────┤
│5 2 0│5 2 1│
└─────┴─────┘
Catalogue matches the Append Table:
(>{a)-: 0 1 2 3 4 5 ,"(0 1)/ 0 1 2 ,"(0 0)/ 0 1
1
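For readers more familiar with Python, Catalogue on a list of boxes corresponds to a Cartesian product. The sketch below uses the standard itertools.product; the variable names a and combos are only illustrative, and the flat list of tuples does not reproduce the boxed 6 by 3 by 2 layout shown above.

```python
from itertools import product

a = [[0, 1, 2, 3, 4, 5], [0, 1, 2], [0, 1]]

# Every way of picking one element from each list; the last list varies fastest.
combos = list(product(*a))

print(len(combos))
print(combos[:4])
```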

"_1 is equivalent to "0"_
In other words, "_1 forms a verb that looks at all of the data to find its rank, and then derives another verb that works at one rank lower than that.

Related

Replace column between two files in Unix

How can I replace the 6th column of my data.fam file with the 3rd column of my .txt file?
My data.fam file looks like this
20481 20481 0 0 2 -9
20483 20483 0 0 1 1
20488 20488 0 0 2 1
20492 20492 0 0 1 1
20493 20493 0 0 1 -9
20498 20498 0 0 2 -9
data.txt file looks like this.
20481 2 1
20488 2 1
20483 2 1
20493 1 0
22822 2 1
20498 -9 -9
22692 1 0
The output.fam should be like
20481 20481 0 0 2 1
20483 20483 0 0 1 1
20488 20488 0 0 2 1
20492 20492 0 0 1 0
20493 20493 0 0 1 0
20498 20498 0 0 2 1
I have tried awk 'FNR==NR{a[$1]=$2;next}{$6=a[$3]}1' data.txt data.fam >output.fam
But the output has no 6th column at all, as shown below. What am I doing wrong?
20481 20481 0 0 2
20483 20483 0 0 1
20488 20488 0 0 2
20492 20492 0 0 1
20493 20493 0 0 1
20498 20498 0 0 2
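One thing stands out in the command above: a[$3] looks up the third field of data.fam, which is always 0, and 0 is never a key taken from data.txt, so $6 is set to an empty string. The lookup key needs to be the ID in the first field. A minimal Python sketch of the intended join follows. It assumes the replacement value is the third column of data.txt and that rows with no match keep their original sixth column; both are guesses, because the expected output in the question is not fully explained by either rule. The function name replace_sixth_column is hypothetical.

```python
def replace_sixth_column(fam_lines, txt_lines):
    """Hypothetical helper: swap column 6 of each .fam row for column 3 of the matching .txt row."""
    # Map the ID (column 1 of data.txt) to that row's third column.
    lookup = {}
    for line in txt_lines:
        fields = line.split()
        if len(fields) >= 3:
            lookup[fields[0]] = fields[2]

    result = []
    for line in fam_lines:
        fields = line.split()
        # Assumption: rows whose ID is missing from data.txt keep their original value.
        if len(fields) >= 6 and fields[0] in lookup:
            fields[5] = lookup[fields[0]]
        result.append(" ".join(fields))
    return result


fam = ["20481 20481 0 0 2 -9", "20483 20483 0 0 1 1", "20492 20492 0 0 1 1"]
txt = ["20481 2 1", "20483 2 1", "20493 1 0"]
for row in replace_sixth_column(fam, txt):
    print(row)
```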

ValueError in clustering evaluation: Expected 2D array, got 1D array instead

df_2D = df[['sepal-length', 'petal-length']]
df_2D = np.array(df_2D)
k_means_2D_model = KMeans(n_clusters=3, max_iter=1000).fit(df_2D)
Error:
ValueError: Expected 2D array, got 1D array instead:
array=[0 1 0 0 2 2 1 2 1 1 0 1 0 2 0 2 1 2 0 2 2 1 0 1 2 0 0 2 1 0 2 0 1 1 0 0 2
1 2 1 0 0 1 1 1 1 2 1 2 2 0 2 2 2 0 1 1 1 1 0 1 2 1 2 1 2 2 1 2 1 0 0 2 1
1 0 2 0 0 1 1 0 1 2 1 1 0 1 0 1 0 0 0 2 1 1 1 2 0 2 0 0 2 2 0 0 1 2 2 1 0
2 2 1 1 2 0 0 2 2 2 0 0 1 0 1 0 2 2 2 0 0 0 0 2 1 2 2 2 2 1 1 1 2 2 0 0 0
1 0].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
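The array printed in the message holds only the values 0, 1 and 2, so it may be a vector of cluster labels rather than the two feature columns; that is a guess, since the call that raised the error is not shown. Whatever the array is, the two reshapes the message suggests can be sketched with NumPy as follows. The sample values are made up for illustration.

```python
import numpy as np

labels = np.array([0, 1, 0, 0, 2, 2])  # made-up 1D data for illustration

as_column = labels.reshape(-1, 1)  # many samples, a single feature
as_row = labels.reshape(1, -1)     # a single sample, many features

print(labels.shape, as_column.shape, as_row.shape)
```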

Pattern identification in a dataset using python

I have a dataframe that looks something like this:
empl_ID day_1 day_2 day_3 day_4 day_5 day_6 day_7 day_8 day_9 day_10
1 1 1 1 1 1 1 0 1 1 1
2 0 0 1 1 1 1 1 1 1 0
3 0 1 0 0 1 1 1 1 1 1
4 1 0 1 0 1 1 1 0 1 0
5 1 0 0 1 1 1 1 1 1 1
6 0 0 0 0 1 1 1 1 1 1
As we can see, we have 6 employees, and a value of 1 indicates their presence on that day. I want to write Python code that detects 2 consecutive absences, i.e. the pattern 0, 0 for day i, day i+1, within a time-frame of 6 days starting from the day the person begins employment.
For example, employee 1 begins work at the day_1 column, which is the first appearance of 1. So, if we do not observe any consecutive 0, 0 from columns day_1 to day_6, that record should be labeled '0'. The same is the case for employee 2 (cols: day_3 to day_8), employee 4 (cols: day_1 to day_6) and employee 6 (cols: day_5 to day_10), and they will be labeled '0'.
However, employee 3 (cols: day_2 to day_7) and employee 5 (cols: day_1 to day_6) contain a 0, 0 pattern within their respective time-frames, counted from their first presence of 1, and thus will be labeled '1'.
It would be really helpful if someone could help me write code to achieve the above objective. Thanks in advance!
The result should look something like this:
empl_ID day_1 day_2 day_3 day_4 day_5 day_6 day_7 day_8 day_9 day_10 label
1 1 1 1 1 1 1 0 1 1 1 0
2 0 0 1 1 1 1 1 1 1 0 0
3 0 1 0 0 1 1 1 1 1 1 1
4 1 0 1 0 1 1 1 0 1 0 0
5 1 0 0 1 1 1 1 1 1 1 1
6 0 0 0 0 1 1 1 1 1 1 0
Check with idxmax and a for loop with shift
s=df.set_index('empl_ID')
idx=s.columns.get_indexer(s.idxmax(1))
# idx+6 so that each slice covers the 6-day window described in the question
l=[(s.iloc[t, x :y].eq(s.iloc[t, x :y].shift())&s.iloc[t, x :y].eq(0)).any() for t , x ,y in zip(df.index,idx,idx+6)]
df['Label']=l
df
empl_ID day_1 day_2 day_3 day_4 ... day_7 day_8 day_9 day_10 Label
0 1 1 1 1 1 ... 0 1 1 1 False
1 2 0 0 1 1 ... 1 1 1 0 False
2 3 0 1 0 0 ... 1 1 1 1 True
3 4 1 0 1 0 ... 1 0 1 0 False
4 5 1 0 0 1 ... 1 1 1 1 True
5 6 0 0 0 0 ... 1 1 1 1 False
[6 rows x 12 columns]
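The same rule can be checked without pandas. The sketch below is a plain-Python restatement of the question's requirement, not the method used in the answer above: find the first 1, take the 6 days starting there, and look for two zeros in a row. The function name has_double_absence and the choice to label a row with no 1 at all as 0 are assumptions.

```python
def has_double_absence(days, window=6):
    """Return 1 if two consecutive 0s occur within `window` days of the first 1, else 0."""
    if 1 not in days:
        return 0  # assumption: someone who never shows up gets label 0
    start = days.index(1)
    span = days[start:start + window]
    # Compare each day with the following day inside the window.
    return int(any(a == 0 and b == 0 for a, b in zip(span, span[1:])))


attendance = {
    1: [1, 1, 1, 1, 1, 1, 0, 1, 1, 1],
    2: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
    3: [0, 1, 0, 0, 1, 1, 1, 1, 1, 1],
    4: [1, 0, 1, 0, 1, 1, 1, 0, 1, 0],
    5: [1, 0, 0, 1, 1, 1, 1, 1, 1, 1],
    6: [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
}
labels = {emp: has_double_absence(days) for emp, days in attendance.items()}
print(labels)
```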

Python: How do I assign string values to numeric data?

Running the following four commands on a dataframe called cancer produces the output below.
$ print("\n target")
$ print(cancer.target)
$ print("\n target_names")
$ print(cancer.target_names)
target
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1 1 0 1 0 0 1 1 1 1 0 1 0 0
1 0 1 0 0 1 1 1 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 1
1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 0 1 1 1 1 0 1
1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 1 0 1 1 0 0 0 1 0
1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 1 1 0 0 1 1
1 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 1 0 1 1 1 1 1 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 0 1 1 1 1 0 0 0 1 1
1 1 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0
0 1 0 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1
1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 1 0 1 1
0 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1
1 1 1 1 1 1 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 1 1 1 1 1 0 1 1 0 1 0 1 0 0
1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 0 0 0 0 0 0 1]
target_names
['malignant' 'benign']
How would I be able to assign "malignant" to 0 and "benign" to 1, and vice versa?
You can map it by using a dictionary
target=[0,0,0,0,0,1,1,1,0,1,0,0,1]
d={0:'malignant',1:'benign'}
target=[d[t] for t in target]
print(target)
['malignant', 'malignant', 'malignant', 'malignant', 'malignant', 'benign', 'benign', 'benign', 'malignant', 'benign', 'malignant', 'malignant', 'benign']
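Since the question also asks for the reverse direction, here is a small sketch that goes both ways. It assumes, as the printed target_names suggests, that position i in target_names is the name for code i; the variable names named, code_of and codes are illustrative, and the target list is a shortened sample.

```python
target = [0, 0, 1, 0, 1]                # shortened sample of the 0/1 codes
target_names = ['malignant', 'benign']  # position i is the name for code i

# Codes to names: index into the list of names.
named = [target_names[t] for t in target]

# Names back to codes: build the reverse lookup once.
code_of = {name: code for code, name in enumerate(target_names)}
codes = [code_of[n] for n in named]

print(named)
print(codes)
```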
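What do you mean by assign?
Something like this?
# copy into a plain list so that strings can replace the integer codes
target = list(cancer.target)
target_names = cancer.target_names
for i in range(len(target)):
    if target[i] == 0:
        target[i] = target_names[0]
    elif target[i] == 1:
        target[i] = target_names[1]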

J Language rank of power function

t=:1
test=: monad define
t=.y
t=. t, 0
)
testloop=: monad def'test^:y t'
testloop 1
1 0
testloop 2
1 0 0
testloop 10
1 0 0 0 0 0 0 0 0 0 0
I want a shorter way to write this:
(testloop 0),(testloop 1), (testloop 2), ...
110100100010000...
I tried
, testloop"0 (i.10)
but it gives
1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0...
It seems I have a problem with rank, but I can't figure out which one to use.
I would be grateful if you could help me on this issue.
Thank you!
This is not so much a rank problem as the fact that the results are padded with zeros so that the row lengths match.
testloop 1
1 0
testloop 2
1 0 0
testloop"0 [ 1 2
1 0 0
1 0 0
testloop"0 [ 1 2 3
1 0 0 0
1 0 0 0
1 0 0 0
If I redefine your test and testloop to append a different digit (2 instead of 0), we can see how the padding works.
test2 =: 3 : 0
t=. y
t=. t,2
)
test2loop=: monad def'test2^:y t'
test2loop"0 [1
1 2
test2loop"0 [2
1 2 2
test2loop"0 [ 1 2 NB. 0 padded in first row
1 2 0
1 2 2
test2loop"0 [ 1 2 3 NB. 0's padded in first two rows
1 2 0 0
1 2 2 0
1 2 2 2
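The fill behaviour can be mimicked in Python to make it explicit. This is only an analogy, with a hypothetical helper pad_rows: J pads automatically when it assembles results of different lengths into one array, whereas here the padding is written out by hand.

```python
def pad_rows(rows, fill=0):
    """Pad ragged rows on the right so they form a rectangle (hypothetical helper)."""
    width = max(len(r) for r in rows)
    return [r + [fill] * (width - len(r)) for r in rows]


ragged = [[1, 2], [1, 2, 2], [1, 2, 2, 2]]  # like the results of test2loop for 1, 2 and 3
for row in pad_rows(ragged):
    print(*row)
```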
To get around the padding issue I will use each=: &.> so that each result is boxed before the results are combined.
testloop each 1 2 3
+---+-----+-------+
|1 0|1 0 0|1 0 0 0|
+---+-----+-------+
testloop each i. 10
+-+---+-----+-------+---------+-----------+-------------+---------------+-----------------+-------------------+
|1|1 0|1 0 0|1 0 0 0|1 0 0 0 0|1 0 0 0 0 0|1 0 0 0 0 0 0|1 0 0 0 0 0 0 0|1 0 0 0 0 0 0 0 0|1 0 0 0 0 0 0 0 0 0|
+-+---+-----+-------+---------+-----------+-------------+---------------+-----------------+-------------------+
Then use ; to unbox and ravel the results:
; testloop each i. 10
1 1 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
To be honest I would be more inclined to use the fact that complex numbers used as the left argument of # introduce 0's for padding. The number of 0's depends on the imaginary value of the complex number.
1j0 # 1
1
1j1 # 1
1 0
1j2 # 1
1 0 0
test3=: monad def '(1 j. y)#1'
test3 1
1 0
test3 2
1 0 0
test3 1 2
1 0 1 0 0
test3 i. 10
1 1 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
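For comparison, the target sequence itself is easy to state in Python: for each n, a 1 followed by n zeros, all joined into one flat list. The function name ones_with_zeros is hypothetical; the sketch only restates what the result should look like and says nothing about how J computes it.

```python
def ones_with_zeros(counts):
    """For each n in counts, emit a 1 followed by n zeros, all in one flat list."""
    flat = []
    for n in counts:
        flat.append(1)
        flat.extend([0] * n)
    return flat


print(*ones_with_zeros(range(10)))
```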

Resources