I am trying to extract the feature names using the get_feature_names method of scikit-learn's OneHotEncoder, but it is throwing an error saying
"'OneHotEncoder' object has no attribute 'get_feature_names'".
Below is the code snippet
# Create the one-hot encoder instance
encoder = OneHotEncoder(sparse=False)
onehot_encoded = encoder.fit_transform(df[data_column_category])
onehot_encoded_frame = pd.DataFrame(
    onehot_encoded,
    columns=encoder.get_feature_names(data_column_category))
Apparently, it has been renamed to get_feature_names_out.
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html#sklearn.preprocessing.OneHotEncoder.get_feature_names_out
get_feature_names_out was only added in scikit-learn 1.0, so if your encoder does not have it you might just need to update your scikit-learn version.
You can do it as follows:
pip install -U scikit-learn
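After upgrading, a minimal sketch along the following lines should work. Note that newer scikit-learn releases also renamed the sparse argument to sparse_output; df and data_column_category are the names from the question.

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# sparse_output replaces the old sparse argument in scikit-learn >= 1.2
encoder = OneHotEncoder(sparse_output=False)
onehot_encoded = encoder.fit_transform(df[data_column_category])

# get_feature_names_out accepts the original column names as input_features
onehot_encoded_frame = pd.DataFrame(
    onehot_encoded,
    columns=encoder.get_feature_names_out(data_column_category))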
When trying to define a transformer for a batch transform job for an SKLearn estimator, I am getting the following error: TypeError: __init__() got multiple values for argument 'entry_point'
These are the steps I followed:
STEP 1:
from sagemaker.sklearn.estimator import SKLearn
script_path = 'transformer.py'
sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)
STEP 2:
sklearn_preprocessor.fit({'train': "s3://training-data/train.csv"})
Training was successful.
STEP 3:
transformer = sklearn_preprocessor.transformer(
    instance_count=1,
    instance_type='ml.m4.xlarge',
    assemble_with='Line',
    output_path='s3://training-data/transformed.csv',
    accept='text/csv')
Error at Step3:
TypeError: __init__() got multiple values for argument 'entry_point'
The issue was reported on the AWS SageMaker Python SDK's GitHub repository:
https://github.com/aws/sagemaker-python-sdk/issues/974
It seems the issue has been resolved by PR https://github.com/aws/sagemaker-python-sdk/pull/978
Looking at the CHANGELOG, the fix was released in version v1.36.3. Updating the SDK to that version or later should resolve the issue.
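For example, upgrading with pip should be enough; the version pin below is only there to make sure you end up past the release containing the fix:

pip install -U "sagemaker>=1.36.3"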
I'm building a CNN text classifier using TensorFlow which I want to load in tensorflow-serving and query using the serving APIs. When I call the Predict() method on the gRPC stub I receive this error: AttributeError: 'grpc._cython.cygrpc.Channel' object has no attribute 'unary_unary'
What I've done to date:
I have successfully trained and exported a model suitable for serving (i.e., the signatures are verified and using tf.Saver I can successfully return a prediction). I can also load the model in tensorflow_model_server without error.
Here is a snippet of the client code (simplified for readability):
with tf.Session() as sess:
    host = FLAGS.server
    channel = grpc.insecure_channel('localhost:9001')
    stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'predict_text'
    request.model_spec.signature_name = 'predict_text'

    x_text = ["space"]

    # restore the vocab processor,
    # then build an ndarray with fit_transform using the vocabulary
    vocab = learn.preprocessing.VocabularyProcessor.restore('/some_path/model_export/1/assets/vocab')
    x = np.array(list(vocab.fit_transform(x_text)))

    # data
    temp_data = tf.contrib.util.make_tensor_proto(x, shape=[1, 15], verify_shape=True)
    request.inputs['input'].CopyFrom(temp_data)

    # get classification prediction
    result = stub.Predict(request, 5.0)
Where I'm bending the rules: I am using tensorflow-serving-api with Python 3.5.3, where pip install is not officially supported. Various posts (example: https://github.com/tensorflow/serving/issues/581) have reported that using tensorflow-serving with Python 3 has been successful. I downloaded the tensorflow-serving-api package from PyPI (https://pypi.python.org/pypi/tensorflow-serving-api/1.5.0) and manually copied it into the environment.
Versions: tensorflow: 1.5.0, tensorflow-serving-api: 1.5.0, grpcio: 1.9.0rc3, grpcio-tools: 1.9.0rc3, protobuf: 3.5.1 (all other dependency versions have been verified but are not included for brevity -- happy to add if they have utility)
Environment: Linux Mint 17 Qiana; x64, Python 3.5.3
Investigations:
A GitHub issue (https://github.com/GoogleCloudPlatform/google-cloud-python/issues/2258) indicated that this error has historically been triggered by packages relying on the grpc beta API.
What data or learning or implementation am I missing?
beta_create_PredictionService_stub() is deprecated. Try this:
from tensorflow_serving.apis import prediction_service_pb2_grpc
...
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
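Putting it together, a minimal client sketch could look like the following. It assumes a tensorflow-serving-api version that ships prediction_service_pb2_grpc; the host/port, model names, and the ndarray x are taken from the question.

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# a plain grpc channel works with the non-beta stub
channel = grpc.insecure_channel('localhost:9001')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'predict_text'
request.model_spec.signature_name = 'predict_text'

# x is the ndarray produced by the vocabulary processor in the question
request.inputs['input'].CopyFrom(
    tf.contrib.util.make_tensor_proto(x, shape=[1, 15], verify_shape=True))

result = stub.Predict(request, 5.0)  # 5 second timeout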
Try to use grpc.beta.implementations.insecure_channel instead of grpc.insecure_channel.
See example code here.
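If you would rather stay on the beta API, the beta stub needs a channel created by the beta implementations module rather than grpc.insecure_channel; a rough sketch (host and port split out only for illustration):

from grpc.beta import implementations
from tensorflow_serving.apis import prediction_service_pb2

# beta_create_PredictionService_stub expects a beta channel
channel = implementations.insecure_channel('localhost', 9001)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)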
I'm using CNTK as the backend for Keras. I'm trying to use my model which I have trained using Keras in C++.
I have trained and saved my model using Keras which is in HDF5. How do I now use CNTK API to save it in their model-v2 format?
I tried this:
model = load_model('model2.h5')
cntk.ops.functions.Function.save(model, 'CNTK_model2.pb')
but I got the following error:
TypeError: save() missing 1 required positional argument: 'filename'
If tensorflow were the backend I would have done this:
model = load_model('model2.h5')
sess = K.get_session()
tf_saver = tf.train.Saver()
tf_saver.save(sess=sess, save_path=checkpoint_path)
How can I achieve the same thing?
As per the comments here, I was able to use this:
import cntk as C
from keras.models import load_model

keras_model = load_model('my_keras_model.h5')
C.combine(keras_model.outputs).save('my_cntk_model')
cntk_model = C.load_model('my_cntk_model')
You can do something like this:
model.outputs[0].save('CNTK_model2.pb')
I'm assuming here that you have called model.compile (that's the only case I have tried :-)).
The reason you see this error is that Keras' CNTK backend uses a user-defined function to reshape on the batch axis, which can't be serialized. We have fixed this issue in CNTK v2.2. Please upgrade your CNTK to v2.2 and upgrade Keras to the latest master.
Please see this pull request:
https://github.com/fchollet/keras/pull/7907
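For example, the upgrade could look roughly like this, installing Keras from the GitHub master referenced above:

pip install --upgrade cntk
pip install --upgrade git+https://github.com/fchollet/keras.git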
accuracy = tf.streaming_accuracy(y_pred, y_true, name='acc')
recall = tf.streaming_recall(y_pred, y_true, name='acc')
precision = tf.streaming_precision(y_pred, y_true, name='acc')
confusion = tf.confuson_matrix(Labels, y_pred, num_classes=10, dtype=tf.float32, name='conf')
For the above code, I have been receiving the same error for the past few days.
Isn't the syntax the same as in the TensorFlow API documentation?
Try using this instead (in a fresh Python file; I would suggest creating a /tmp/temp.py and running that):
from tensorflow.contrib.metrics import streaming_accuracy
and if this doesn't work, then
either there is an installation problem (in which case, reinstall)
or you are importing the wrong tensorflow module.
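As a minimal sketch of what that looks like (assuming TensorFlow 1.x, where these metrics live under tf.contrib.metrics, and reusing y_pred, y_true, and Labels from the question), note that each streaming metric returns a (value, update_op) pair:

import tensorflow as tf
from tensorflow.contrib.metrics import streaming_accuracy, streaming_recall, streaming_precision

# each streaming metric returns (metric_value, update_op)
accuracy, accuracy_update = streaming_accuracy(y_pred, y_true, name='acc')
recall, recall_update = streaming_recall(y_pred, y_true, name='rec')
precision, precision_update = streaming_precision(y_pred, y_true, name='prec')

# the confusion matrix lives directly under tf, not under contrib.metrics
confusion = tf.confusion_matrix(Labels, y_pred, num_classes=10, dtype=tf.float32, name='conf')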
The following works in rpy2 2.0.6:
robjects.r('M = lm(...)')
M = robjects.r('M')
coefficients = M.r['coefficients'][0]
But after I upgraded to rpy2 2.3.8, the above fails with the message
AttributeError: 'ListVector' object has no attribute 'r'
What do I need to change to make this work in 2.3.8?
I am not certain that the code snippet you provided worked with rpy2-2.0.x.
The documentation, in its Introduction section, shows how to extract coefficients from linear models:
http://rpy.sourceforge.net/rpy2/doc-2.1/html/introduction.html#linear-models
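As a rough sketch of what this can look like in rpy2 2.3.x (the model formula and the built-in mtcars dataset below are only placeholders), the coefficients can be pulled out of the returned ListVector with rx2:

from rpy2 import robjects

# fit a toy model entirely in R
robjects.r('M <- lm(mpg ~ wt, data = mtcars)')
M = robjects.r['M']

# ListVector elements are accessed with rx2, the equivalent of R's `[[`
coefficients = M.rx2('coefficients')
print(list(coefficients))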