pytorch how to select channels by mask? - python-3.x

I want to know how to select channels by a mask in PyTorch.
[channel1 channel2 channel3 channel4] x [1,0,0,1] --> [channel1,channel4]
I tried torch.masked_select() and it didn't work.
If the input has shape [B,C,H,W], the output's shape should be [B,masked_C,H,W].
import torch
from torch import nn
input = torch.randn((1,5,3,3))
pool = nn.AdaptiveAvgPool2d(1)
w = torch.sigmoid(pool(input)).view(1,-1)
mask = torch.gt(w,0.5)
print(input)
print(w)
print(mask)
The output is as follows:
tensor([[[[ 0.9129, -0.9763, 1.4460],
[ 0.3608, 0.5561, -1.4612],
[ 1.4953, -1.2474, 0.4069]],
[[-0.9121, 0.1261, 0.4661],
[-1.1624, -1.0266, -1.5419],
[ 1.0644, 1.0039, -0.4022]],
[[-1.8454, -0.2150, 2.3703],
[ 0.5224, 0.3366, 1.7545],
[-0.4624, 1.2639, 1.8032]],
[[-1.1558, -1.9985, -1.1336],
[-0.4400, -0.2092, 0.0677],
[-0.4172, -0.3614, -1.3193]],
[[-0.9441, -0.2944, 0.3381],
[ 1.6562, -0.5623, 0.0599],
[ 0.7229, 0.0472, -0.5122]]]])
tensor([[0.5414, 0.4341, 0.6489, 0.3156, 0.5142]])
tensor([[1, 0, 1, 0, 1]], dtype=torch.uint8)
The result I want looks like this:
tensor([[[[ 0.9129, -0.9763, 1.4460],
[ 0.3608, 0.5561, -1.4612],
[ 1.4953, -1.2474, 0.4069]],
[[-1.8454, -0.2150, 2.3703],
[ 0.5224, 0.3366, 1.7545],
[-0.4624, 1.2639, 1.8032]],
[[-0.9441, -0.2944, 0.3381],
[ 1.6562, -0.5623, 0.0599],
[ 0.7229, 0.0472, -0.5122]]]])

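I believe you can simply do:
input[mask]
Note that a [B,C] mask indexes the batch and channel dimensions together, so the result has shape [masked_C,H,W] rather than [B,masked_C,H,W].
By the way, you don't need to call sigmoid and then .gt(0.5). Since sigmoid(x) > 0.5 exactly when x > 0, you can call .gt(0.0) on the pooled values directly.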
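If the batch dimension has to be kept, one option is to index only the channel axis. The following is a minimal sketch for the single-sample case from the question; the names x, pooled, flat and selected are illustrative, and it assumes one mask is shared by the whole batch (with B > 1 and a different mask per sample, the number of kept channels could differ between samples and the result would no longer fit in one tensor).

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(1, 5, 3, 3)                       # (B, C, H, W)
pooled = nn.AdaptiveAvgPool2d(1)(x).view(1, -1)   # (B, C) per-channel means
mask = pooled.gt(0.0)                             # boolean mask, shape (B, C)

# Indexing with the full (B, C) mask merges batch and channel dims.
flat = x[mask]                                    # (masked_C, H, W)

# Indexing only the channel axis keeps the batch dimension.
selected = x[:, mask[0]]                          # (B, masked_C, H, W)

print(flat.shape, selected.shape)
```

For B = 1, x[mask].unsqueeze(0) gives the same tensor as x[:, mask[0]].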

Related

Point-convolution in PyTorch

I would like to implement a point-convolution to my input tensor.
Let's say we have:
input_ = torch.randn(1,1,8,1)
input_.shape
Out[70]: torch.Size([1, 1, 8, 1])
input_
Out[71]:
tensor([[[[ 0.7656],
[-0.3400],
[-0.2487],
[ 0.6246],
[ 2.0450],
[-0.9588],
[ 1.2221],
[-1.3164]]]])
where the dimensions represent respectively (batch_size, n_channels, height, width).
Then, keeping batch_size and channel fixed, I would like to apply an nn.Conv1d layer to each element, basically to expand the number of channels. What I've tried is:
list(torch.unbind(input_,dim=2))
Out[72]:
[tensor([[[0.7656]]]),
tensor([[[-0.3400]]]),
tensor([[[-0.2487]]]),
tensor([[[0.6246]]]),
tensor([[[2.0450]]]),
tensor([[[-0.9588]]]),
tensor([[[1.2221]]]),
tensor([[[-1.3164]]])]
and then applying nn.Conv1d entrywise to these elements? Would that be a correct way?
EDIT:
Should I use nn.Conv1d directly on my original input_ by doing:
conv = nn.Conv1d(in_channels=1, out_channels=3, kernel_size=(1,1))
conv(input_)
Out[89]:
tensor([[[[-0.1481],
[ 1.5345],
[ 0.3082],
[ 1.8677],
[ 0.7515],
[ 1.2916],
[ 0.0218],
[ 0.5606]],
[[-1.2975],
[-0.3080],
[-1.0292],
[-0.1121],
[-0.7685],
[-0.4509],
[-1.1976],
[-0.8808]],
[[-0.7169],
[ 0.6493],
[-0.3464],
[ 0.9199],
[ 0.0135],
[ 0.4521],
[-0.5790],
[-0.1415]]]], grad_fn=<ThnnConv2DBackward0>)
conv(input_).shape
Out[90]: torch.Size([1, 3, 8, 1])
I'm not sure though if doing this is the same as my original purpose.
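One way to check whether the two approaches agree is to compare them numerically. The sketch below is an illustration rather than an answer from the original thread: it uses nn.Conv2d with kernel_size=1 (the usual layer for a 4-D input, which is an assumption about the intent) and copies its weights into an nn.Conv1d that is then applied to each unbound slice. The names conv2d, conv1d, out and per_slice are hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)
input_ = torch.randn(1, 1, 8, 1)                  # (batch, channels, height, width)

# Pointwise (1x1) convolution applied to the whole tensor at once.
conv2d = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=1)
out = conv2d(input_)                              # (1, 3, 8, 1)

# The same weights in a Conv1d, applied slice by slice along the height axis.
conv1d = nn.Conv1d(in_channels=1, out_channels=3, kernel_size=1)
with torch.no_grad():
    conv1d.weight.copy_(conv2d.weight.view(3, 1, 1))
    conv1d.bias.copy_(conv2d.bias)

slices = torch.unbind(input_, dim=2)              # 8 tensors of shape (1, 1, 1)
per_slice = torch.stack([conv1d(s) for s in slices], dim=2)  # (1, 3, 8, 1)

print(torch.allclose(out, per_slice, atol=1e-6))
```

With shared weights, a kernel of size 1 maps every position independently, so the single call and the slice-by-slice loop produce the same values.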

Is this a bug in xgboost's XGBClassifier?

import numpy as np
from xgboost import XGBClassifier
model = XGBClassifier(
use_label_encoder=False,
label_lower_bound=0, label_upper_bound=1
# setting the bounds doesn't seem to help
)
x = np.array([ [1,2,3], [4,5,6] ], 'ushort' )
y = [ 1, 1 ]
try:
    model.fit(x, y)
    # this fails with ValueError:
    # "The label must consist of integer labels
    # of form 0, 1, 2, ..., [num_class - 1]."
except Exception as e:
    print(e)
y = [ 0, 0 ]
# this works
model.fit(x,y)
model = XGBClassifier()
y = [ 1, 1 ]
# this works, but with UserWarning:
# "The use of label encoder in XGBClassifier is deprecated, etc."
model.fit(x,y)
It seems to me that the label encoder is deprecated, but we are FORCED to use it if our class labels don't happen to contain a zero.
I had the same problem. I solved it by passing use_label_encoder=False as a parameter, and the warning message disappeared.
I think the problem in your case is that y contains only 1, but XGBoost wants the target labels to start from 0. If you replace y = [ 1, 1 ] with y = [ 0, 0 ], the UserWarning should disappear.
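If the labels cannot simply be rewritten by hand, they can be remapped to 0, 1, 2, ... before fitting. The following is a minimal sketch using only NumPy; the variable names are illustrative, and passing y_encoded to model.fit is left out so the snippet does not depend on a particular xgboost version.

```python
import numpy as np

y = np.array([1, 1])

# classes holds the sorted unique labels; y_encoded holds, for each sample,
# the index of its label in classes, so the encoded labels start at 0.
classes, y_encoded = np.unique(y, return_inverse=True)
print(classes)     # [1]
print(y_encoded)   # [0 0]

# Map encoded labels (or predictions) back to the original labels.
decoded = classes[y_encoded]
```

Predictions made on the encoded labels can be translated back the same way, by indexing classes with the predicted integers.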

Trouble creating 3D rotation matrix in Pytorch - ValueError: only one element tensors can be converted to Python scalars

I am trying to create 3D rotation matrices in pytorch as seen on the first page of this pdf, but I am encountering some problems. Here is my code so far:
zero = torch.from_numpy(np.zeros(len(cos)))
one = torch.from_numpy(np.ones(len(cos)))
R_transpose = torch.tensor([cos, -sin, zero, sin, cos, zero, zero, zero, one]).reshape(-1, 3, 3)
The cos and sin are matrices that look like this:
tensor([[[1.]],
[[1.]],
[[1.]],
[[1.]],
[[1.]]], dtype=torch.float64)
My goal is to create any number of rotation matrices (e.g. four matrices with the cos values shown above).
The code I currently have results in a "ValueError: only one element tensors can be converted to Python scalars".
How should I change my code to achieve my goal?
Why don't you use assignment to create R_transpose?
import numpy as np
import torch
# define rotation angles (radians) using numpy
th_np = np.array([np.pi*0.25, np.pi/6, np.pi*0.5, np.pi/3.], dtype=np.float32)
# convert to pytorch
th_t = torch.from_numpy(th_np)
# init to zeros
R_transpose = torch.zeros(th_t.numel(), 3, 3, dtype=torch.float)
# assign the values:
R_transpose[:, 2, 2] = 1.
R_transpose[:, [0,1],[0,1]] = th_t[:, None].cos()
R_transpose[:, 0, 1] = -th_t.sin()
R_transpose[:, 1, 0] = th_t.sin()
Resulting with
tensor([[[ 7.0711e-01, -7.0711e-01, 0.0000e+00],
[ 7.0711e-01, 7.0711e-01, 0.0000e+00],
[ 0.0000e+00, 0.0000e+00, 1.0000e+00]],
[[ 8.6603e-01, -5.0000e-01, 0.0000e+00],
[ 5.0000e-01, 8.6603e-01, 0.0000e+00],
[ 0.0000e+00, 0.0000e+00, 1.0000e+00]],
[[-4.3711e-08, -1.0000e+00, 0.0000e+00],
[ 1.0000e+00, -4.3711e-08, 0.0000e+00],
[ 0.0000e+00, 0.0000e+00, 1.0000e+00]],
[[ 5.0000e-01, -8.6603e-01, 0.0000e+00],
[ 8.6603e-01, 5.0000e-01, 0.0000e+00],
[ 0.0000e+00, 0.0000e+00, 1.0000e+00]]])
Note that we assigned all angles at once, so this solution is applicable to any number of angles you may have.
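An alternative that stays close to the code in the question is to replace torch.tensor([...]) with torch.stack, which accepts a list of tensors. This is a sketch under the assumption that cos and sin are 1-D tensors with one entry per angle (the (N,1,1) tensors from the question would first need flattening, e.g. with .reshape(-1)); theta is an illustrative name.

```python
import math
import torch

theta = torch.tensor([math.pi / 4, math.pi / 6, math.pi / 2, math.pi / 3])
cos, sin = theta.cos(), theta.sin()
zero = torch.zeros_like(theta)
one = torch.ones_like(theta)

# Stack the nine entries of each matrix along a new last axis, giving (N, 9),
# then reshape every row into a 3x3 matrix.
R_transpose = torch.stack(
    [cos, -sin, zero,
     sin,  cos, zero,
     zero, zero, one],
    dim=1,
).reshape(-1, 3, 3)

print(R_transpose.shape)  # torch.Size([4, 3, 3])
```

Because torch.stack operates on tensors directly, it also keeps the result differentiable with respect to theta, which in-place assignment into a zero tensor does as well, so the choice between the two is mostly a matter of taste.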

Keras 'Error when checking input' when trying to predict multiple values

I have a net with a length 4 input vector, length 2 output vector. I am trying to predict multiple inputs simultaneously. If I just want to predict one, I would do the following and it works:
inputs = numpy.array( [ [1,2,3,4] ] )
self.model.predict(inputs)
# prediction = [ [1,2] ]
However, when I try to pass in multiple inputs I get ValueError: Error when checking input: expected dense_1_input to have shape (4,) but got array with shape (1,)
inputs = numpy.array( [
[1,2,3,4],
[1,2,3,4]
]
)
#OR
inputs = numpy.array( [
[ [1,2,3,4] ],
[ [1,2,3,4] ]
]
)
self.model.predict(inputs)
#ERR
What am I doing wrong?
Edit:
Code =
model = Sequential()
model.add(Dense(24, input_dim=4, activation='relu'))
model.add(Dense(24, activation='relu'))
model.add(Dense(4, activation='linear'))
model.compile(loss='mse',
optimizer=Adam(lr=self.learning_rate))
print(batch_arr[:,3][0])
predictions = self.model.predict(batch_arr[:,3][0])
print(predictions)
print(batch_arr[:,3])
predictions = model.predict(batch_arr[:,3])
Output =
[[-0.00441936 -0.20398824 -0.08134908 0.09739554]]
[[ 0.01860509 -0.01136071]]
[array([[-0.00441936, -0.20398824, -0.08134908, 0.09739554]])
array([[-0.00517939, 0.38975933, -0.11951023, -0.9718224 ]])
array([[0.00272119, 0.0025476 , 0.002645 , 0.03973542]])
array([[-0.00421809, -0.01006362, -0.07795483, -0.16971247]])
array([[-0.00904593, 0.19332681, -0.10655871, -0.64757587]])
array([[ 0.00654432, 0.00347247, -0.15332555, -0.47302148]])
array([[-0.01921821, -0.17354519, -0.20207744, -0.58569029]])
array([[ 0.00661377, 0.20038962, -0.16278598, -0.80983334]])
array([[-0.00348096, 0.18171964, -0.07072813, -0.38913168]])
array([[-0.01268919, -0.00548544, -0.08286095, -0.27108632]])
array([[ 0.01077598, -0.19254374, -0.004982 , 0.33175341]])
array([[-4.37101750e-04, -5.68196965e-01, -1.99532537e-01,
1.10581883e-01]])
array([[ 0.00657382, -0.19263146, -0.00402872, 0.33368607]])
array([[ 0.00677398, 0.19760551, -0.00076944, -0.25153403]])
array([[ 0.00261579, 0.19642629, -0.13894668, -0.71894379]])
array([[-0.0221003 , 0.37477368, -0.03765055, -0.63564477]])
array([[-0.0110009 , 0.37599703, -0.0574645 , -0.66318148]])
array([[ 0.00277214, 0.19763152, 0.00343971, -0.25211181]])
array([[-9.31810654e-05, -2.06245307e-01, -8.09019674e-02,
1.47356796e-01]])
array([[ 0.00709025, -0.37636771, -0.19725323, -0.11396513]])
array([[ 0.00015344, -0.01233088, -0.07851076, -0.11956039]])
array([[ 0.01077811, -0.18439307, -0.19043179, -0.34107231]])
array([[-0.01460483, 0.18019651, -0.05036345, -0.35505252]])
array([[-0.0127989 , 0.19071515, -0.08828268, -0.58871071]])
array([[ 0.01072609, 0.00249456, -0.00580012, 0.0409061 ]])
array([[ 0.01062156, 0.00782762, -0.17898265, -0.57245695]])
array([[-0.01180104, -0.37085843, -0.1973209 , -0.23782701]])
array([[-0.00849912, -0.00780031, -0.07940117, -0.21980343]])
array([[ 0.00672477, 0.00246062, -0.00160252, 0.04165408]])
array([[-0.02268911, -0.36534914, -0.21379125, -0.36284594]])
array([[-0.00865513, -0.20170279, -0.08379724, 0.0468145 ]])
array([[-0.0256848 , 0.17922475, -0.03098346, -0.33335449]])]
#ERR
Edit: When I print the shape of batch_arr[:,3] I get (32,), not (32,4) as I expected. So I'm guessing the NumPy array does not know the shape of its inner arrays. Is there an easy way to fix that? It might be the root of the problem.
The issue was the way I had created my NumPy array. I created it from rows of variable size, so NumPy stored it as an array of objects with shape (32,) instead of a numeric array with shape (32,4). Reworking the logic so that the array has a fixed width from the beginning made it (32,4), which allowed the prediction to work.
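When rebuilding the array from scratch is inconvenient, the inner arrays can also be stacked into a regular 2-D array just before calling predict. The sketch below reproduces the situation with random data; states stands in for batch_arr[:,3] and is a hypothetical name.

```python
import numpy as np

# Mimic batch_arr[:,3]: an object array holding 32 separate (1, 4) arrays.
states = np.empty(32, dtype=object)
for i in range(32):
    states[i] = np.random.rand(1, 4)
print(states.shape)   # (32,)

# Stack the inner arrays row-wise into one numeric array.
batch = np.vstack(list(states))
print(batch.shape)    # (32, 4)
```

An array shaped like batch matches the (4,) per-sample input shape that the error message asks for.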

Training a Random Forest on Tensorflow

I am trying to train a TensorFlow-based random forest regression on numerical, continuous data.
When I try to fit my estimator it begins with the message below:
INFO:tensorflow:Constructing forest with params =
INFO:tensorflow:{'num_trees': 10, 'max_nodes': 1000, 'bagging_fraction': 1.0, 'feature_bagging_fraction': 1.0, 'num_splits_to_consider': 10, 'max_fertile_nodes': 0, 'split_after_samples': 250, 'valid_leaf_threshold': 1, 'dominate_method': 'bootstrap', 'dominate_fraction': 0.99, 'model_name': 'all_dense', 'split_finish_name': 'basic', 'split_pruning_name': 'none', 'collate_examples': False, 'checkpoint_stats': False, 'use_running_stats_method': False, 'initialize_average_splits': False, 'inference_tree_paths': False, 'param_file': None, 'split_name': 'less_or_equal', 'early_finish_check_every_samples': 0, 'prune_every_samples': 0, 'feature_columns': [_NumericColumn(key='Average_Score', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), _NumericColumn(key='lat', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), _NumericColumn(key='lng', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)], 'num_classes': 1, 'num_features': 2, 'regression': True, 'bagged_num_features': 2, 'bagged_features': None, 'num_outputs': 1, 'num_output_columns': 2, 'base_random_seed': 0, 'leaf_model_type': 2, 'stats_model_type': 2, 'finish_type': 0, 'pruning_type': 0, 'split_type': 0}
Then the process breaks down and I get a value error below:
ValueError: Shape must be at least rank 2 but is rank 1 for 'concat' (op: 'ConcatV2') with input shapes: [?], [?], [?], [] and with computed input tensors: input[3] = <1>.
This is the code I am using:
import tensorflow as tf
from tensorflow.contrib.tensor_forest.python import tensor_forest
from tensorflow.python.ops import resources
import pandas as pd
from tensorflow.contrib.tensor_forest.client import random_forest
from tensorflow.python.estimator.inputs import numpy_io
import numpy as np
def getFeatures():
    Average_Score = tf.feature_column.numeric_column('Average_Score')
    lat = tf.feature_column.numeric_column('lat')
    lng = tf.feature_column.numeric_column('lng')
    return [Average_Score, lat, lng]
# Import hotel data
Hotel_Reviews=pd.read_csv("./DataMining/Hotel_Reviews.csv")
Hotel_Reviews_Filtered=Hotel_Reviews[(Hotel_Reviews.lat.notnull() |
Hotel_Reviews.lng.notnull())]
Hotel_Reviews_Filtered_Target = Hotel_Reviews_Filtered[["Reviewer_Score"]]
Hotel_Reviews_Filtered_Features = Hotel_Reviews_Filtered[["Average_Score","lat","lng"]]
#Preprocess the data
x=Hotel_Reviews_Filtered_Features.to_dict('list')
for key in x:
    x[key] = np.array(x[key])
y=Hotel_Reviews_Filtered_Target.values
#specify params
params = tf.contrib.tensor_forest.python.tensor_forest.ForestHParams(
feature_columns=getFeatures(),
num_classes=1,
num_features=2,
regression=True,
num_trees=10,
max_nodes=1000)
#build the graph
graph_builder_class = tensor_forest.RandomForestGraphs
est=random_forest.TensorForestEstimator(
params, graph_builder_class=graph_builder_class)
#define input function
train_input_fn = numpy_io.numpy_input_fn(
x=x,
y=y,
batch_size=1000,
num_epochs=1,
shuffle=True)
est.fit(input_fn=train_input_fn, steps=500)
The variable x is a dict of NumPy arrays, each of shape (512470,):
{'Average_Score': array([ 7.7, 7.7, 7.7, ..., 8.1, 8.1, 8.1]),
'lat': array([ 52.3605759, 52.3605759, 52.3605759, ..., 48.2037451,
48.2037451, 48.2037451]),
'lng': array([ 4.9159683, 4.9159683, 4.9159683, ..., 16.3356767,
16.3356767, 16.3356767])}
The variable y is a NumPy array of shape (512470,1):
array([[ 2.9],
[ 7.5],
[ 7.1],
...,
[ 2.5],
[ 8.8],
[ 8.3]])
Force each array in x to be 2-dimensional, for example with ndmin=2. The shapes should then match and concat should be able to operate.
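One detail worth checking when applying this: np.array(..., ndmin=2) prepends the new axis, so a column of N values becomes shape (1, N). If the (N, 1) layout of y is what the estimator needs (an assumption, since the thread does not confirm it), a transpose or reshape(-1, 1) gives that layout. A small sketch with three illustrative values:

```python
import numpy as np

column = [7.7, 7.7, 8.1]

as_1d = np.array(column)               # shape (3,), rank 1 as in the question
as_row = np.array(column, ndmin=2)     # shape (1, 3): new axis is prepended
as_col = as_row.T                      # shape (3, 1)
also_col = np.array(column).reshape(-1, 1)   # shape (3, 1), same layout

print(as_1d.shape, as_row.shape, as_col.shape, also_col.shape)
```

In the question's preprocessing loop this would amount to replacing np.array(x[key]) with np.array(x[key]).reshape(-1, 1).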
