I am trying to use MLflow autologging with CatBoost, but looking at the experiment in the Databricks UI I don't see any parameters or metrics logged.
My code is:
import mlflow
import catboost

mlflow.sklearn.autolog()
model_for_analytics = catboost.CatBoostRegressor(**cb_params)
model_for_analytics.fit(x_train, y_train, **cb_fit_params)
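As far as I can tell, mlflow.sklearn.autolog() only patches scikit-learn estimators, and CatBoostRegressor is not one of them, which would explain why nothing shows up in the UI. A minimal sketch of logging the parameters and a validation metric manually instead (x_valid/y_valid are a hypothetical hold-out set, not from the snippet above):

import mlflow
import catboost
from sklearn.metrics import mean_squared_error

with mlflow.start_run():
    # log the CatBoost hyperparameters by hand, since autologging does not pick them up
    mlflow.log_params(cb_params)
    model_for_analytics = catboost.CatBoostRegressor(**cb_params)
    model_for_analytics.fit(x_train, y_train, **cb_fit_params)
    # x_valid / y_valid are hypothetical hold-out data
    preds = model_for_analytics.predict(x_valid)
    mlflow.log_metric("valid_mse", mean_squared_error(y_valid, preds))
    # mlflow.catboost.log_model(model_for_analytics, "model")  # if your MLflow version ships the catboost flavor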
I am trying to register a model inside one of my Azure ML experiments. I am able to register it via Model.register, but not via run_context.register_model.
These are the two lines of code I use; the second one (run_context.register_model) is the one that fails:
learn.path = Path('./outputs').absolute()
# This works:
Model.register(run_context.experiment.workspace, "outputs/login_classification.pkl", "login_classification", tags=metrics)
# This fails:
run_context.register_model("login_classification", "outputs/login_classification.pkl", tags=metrics)
This is the error I receive:
Message: Could not locate the provided model_path outputs/login_classification.pkl
But the model is stored at that path.
Before calling run_context.register_model(), make sure you first obtain the run context with run_context = Run.get_context().
I was able to fix the problem by explicitly uploading the model into the run history record before trying to register the model:
run.upload_file("output/model.pickle", "output/model.pickle")
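Putting the two answers above together, a minimal sketch (assuming the pickle already exists locally at outputs/login_classification.pkl and that metrics is the tags dictionary from the question):

from azureml.core import Run

run_context = Run.get_context()
# upload the serialized model into the run record first ...
run_context.upload_file("outputs/login_classification.pkl", "outputs/login_classification.pkl")
# ... then register it from the run context
model = run_context.register_model(model_name="login_classification",
                                   model_path="outputs/login_classification.pkl",
                                   tags=metrics)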
For the error "Could not locate the provided model_path outputs/login_classification.pkl", check the documentation for the Run class.
I'm relatively new to Azure ML and trying to run a model via PythonScriptStep.
I can publish pipelines and run the model. However, after it has run once I can't re-submit the step, as it states "This run reused the output from a previous run".
My code declares allow_reuse to be False, but this doesn't seem to make a difference: I simply cannot resubmit the step, even though the underlying data is changing.
train_step = PythonScriptStep(
    name='model_train',
    script_name="model_train.py",
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    source_directory=train_source_dir,
    allow_reuse=False)
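For reference, a minimal sketch of how this step would be wired into a pipeline and submitted; the regenerate_outputs flag on submit is another way to force a re-run regardless of per-step reuse (ws, the experiment name, and the submission pattern here are assumptions, not from the original post):

from azureml.core import Experiment
from azureml.pipeline.core import Pipeline

pipeline = Pipeline(workspace=ws, steps=[train_step])
experiment = Experiment(ws, "model-train")
# regenerate_outputs=True disallows output reuse for this submission,
# independently of allow_reuse on the individual steps
pipeline_run = experiment.submit(pipeline, regenerate_outputs=True)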
Many thanks for your help
I am trying to deploy a custom-trained TensorFlow model using Amazon SageMaker. I trained XLM-RoBERTa with TF 2.2.0 for a multilingual sentiment analysis task (please refer to this notebook: https://www.kaggle.com/mobassir/understanding-cross-lingual-models).
Now, using the trained weight file of my model, I am trying to deploy it in SageMaker. I was following this tutorial: https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/
I converted some Keras code from there to tensorflow.keras for 2.2.0.
But when I do !ls export/Servo/1/variables I can see that the SavedModel export generates an empty variables directory, like this: https://github.com/tensorflow/models/issues/1988
I can't find any documentation on deploying a model trained with TF 2.2.0.
I need an example like this one https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/ but for TF 2.x models, not Keras.
Even though !ls export/Servo/1/variables shows an empty directory, an endpoint was created successfully. Now I am not sure whether my model was actually deployed, because when I test the deployment inside an AWS notebook with predictor = sagemaker.tensorflow.model.TensorFlowPredictor(endpoint_name, sagemaker_session)
and call predictor.predict(data), I get the following error message:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
"error": "Session was not created with a graph before Run()!"
}"
A related problem: Inference error with TensorFlow C++ on iOS: "Invalid argument: Session was not created with a graph before Run()!"
The code I tried can be found here: https://pastebin.com/sGuTtnSD
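Not a confirmed fix, but a minimal sketch of how a tf.keras model is normally exported in TF 2.x SavedModel format so that export/Servo/1/variables actually gets populated (here model stands for the trained tf.keras model object, which is an assumption):

import tensorflow as tf

# tf.saved_model.save writes saved_model.pb plus a variables/ directory
# containing variables.index and variables.data-* shards
tf.saved_model.save(model, "export/Servo/1")
# equivalently: model.save("export/Servo/1")  # SavedModel is the default format in TF 2.x

If variables/ is still empty after the export, the archive uploaded to SageMaker most likely contains no weights, which could explain the serving error above.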
I currently have Google Cloud ML Engine set up to train models created in Keras. When using Keras, ML Engine does not seem to save the logs to a storage bucket automatically. I can see the logs on the ML Engine Jobs page, but they do not show up in my storage bucket, and therefore I am unable to run TensorBoard while training.
I can see that the job completed successfully and produced logs, but no logs are saved in my storage bucket.
I followed this tutorial when setting up my environment: http://liufuyang.github.io/2017/04/02/just-another-tensorflow-beginner-guide-4.html
So, how do I get the logs and run TensorBoard when training a Keras model on ML Engine? Has anyone else had success with this?
You will need to create a keras.callbacks.TensorBoard(...) callback in order to write out the logs. See the TensorBoard callback documentation. You can supply a GCS path (gs://path/to/my/logs) to the callback's log_dir argument and then point TensorBoard at that location. Pass the callback in the callbacks list when calling model.fit_generator(...) or model.fit(...).
from keras import callbacks  # or: from tensorflow.keras import callbacks

tb_logs = callbacks.TensorBoard(
    log_dir='gs://path/to/logs',
    histogram_freq=0,
    write_graph=True,
    embeddings_freq=0)

model.fit_generator(..., callbacks=[tb_logs])
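Once the callback is writing to the bucket, you can point TensorBoard directly at the GCS path, e.g. tensorboard --logdir=gs://path/to/logs, from any machine whose gcloud credentials can read the bucket, even while the ML Engine job is still running.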
We managed to get Spark (2.x) to send metrics to Graphite by changing the metrics.properties file as below:
# Enable Graphite
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite-host
*.sink.graphite.port=2003
*.sink.graphite.period=5
*.sink.graphite.prefix=my-app
However, I noticed that we are getting only a subset of the metrics in Graphite compared to what we see in the Monitoring Web UI (http://localhost:4040). Are there any settings to get all the metrics (including accumulators) into Graphite?
I use this library to sink user-defined metrics from user code into Graphite: spark-metrics
Initialise the metric system on the driver side:
UserMetricsSystem.initialize(sc, "test_metric_namespace")
Then use Counter, Gauge, Histogram, or Meter like Spark accumulators:
UserMetricsSystem.counter("test_metric_name").inc(1L)
For Spark 2.0, you can specify --conf spark.app.id=job_name so that, in Grafana, metrics from different runs of the same job (each with its own application ID) can share the same metric name. E.g. without setting spark.app.id, the metric name may include the application ID, like this:
job_name.application_id_1.metric_namespace.metric_name
But with spark.app.id set, it looks like:
job_name.unique_id.metric_namespace.metric_name
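For completeness, a minimal PySpark sketch of setting spark.app.id from code rather than on spark-submit; builder.config(...) is just another way of passing the same --conf value, and the names here are placeholders:

from pyspark.sql import SparkSession

# Pin spark.app.id to a stable name so the Graphite metric path
# (job_name.<app id>.metric_namespace.metric_name) stays the same across runs
spark = (SparkSession.builder
         .appName("job_name")
         .config("spark.app.id", "job_name")
         .getOrCreate())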