I just started using AWS Lambda and Docker, so I would appreciate any advice.
For reference, I am trying to deploy an ML model to AWS Lambda. The image built from the Dockerfile successfully loads the XLNet model from a local directory; however, it gets stuck when doing the same thing for the tokenizer.
In the pretrained_tokenizer folder, I have 4 files saved from tokenizer.save_pretrained(...) and config.save_pretrained(...)
In Dockerfile, I have tried multiple things, including:
copying the folder: COPY app/pretrained_tokenizer/ opt/ml/pretrained_tokenizer/
copying each file from the folder with a separate COPY command
compressing the folder to .tar.gz and using ADD pretrained_tokenizer.tar.gz /opt/ml/ (which is supposed to extract the archive in the process)
In my Python script, I load the tokenizer using tokenizer = XLNetTokenizer.from_pretrained(tokenizer_file, do_lower_case=True). This works on Colab, but not when I invoke the image through sam local invoke -e events/event.json. The error is:
[ERROR] OSError: Can't load tokenizer for 'opt/ml/pretrained_tokenizer/'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'opt/ml/pretrained_tokenizer/' is the correct path to a directory containing all relev raise EnvironmentError(ers/tokenization_utils_base.py", line 1768, in from_pretrained
END RequestId: bf011045-bed8-41eb-ac21-f98bfcee475a
I have tried to look through past questions but couldn't really fix anything. I will appreciate any help!
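One thing that may be worth checking, as a guess based only on the error message: the path in the error, 'opt/ml/pretrained_tokenizer/', has no leading slash, so it is resolved relative to the working directory of the process rather than from the filesystem root. A minimal diagnostic sketch follows; describe_model_dir is a hypothetical helper (not part of transformers) that reports where a path resolves to and what it contains, so the output can be compared with what was copied into the image.

```python
import os


def describe_model_dir(path):
    """Report how a (possibly relative) directory path resolves at runtime."""
    # A relative path is resolved against the current working directory.
    resolved = os.path.abspath(path)
    exists = os.path.isdir(resolved)
    # Only list the contents when the directory is actually there.
    files = sorted(os.listdir(resolved)) if exists else []
    return {"given": path, "resolved": resolved, "exists": exists, "files": files}


# The path below is the one from the error message (no leading slash).
info = describe_model_dir("opt/ml/pretrained_tokenizer/")
print(info)
```

Logging this dictionary from inside the handler, once for the relative path and once for /opt/ml/pretrained_tokenizer/, should show whether the tokenizer files are where the script expects them.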
Sorry if my question is too basic, but I cannot solve it.
I am currently experimenting with MLflow and facing the following issue:
Even though I have set the tracking_uri, the MLflow artifacts are saved to the ./mlruns/... folder relative to the path from where I run mlflow run path/to/train.py (on the command line). The MLflow server searches for the artifacts following the tracking_uri (mlflow server --default-artifact-root here/comes/the/same/tracking_uri).
Through the following example it will be clear what I mean:
I set the following in the training script before the with mlflow.start_run() as run:
mlflow.set_tracking_uri("file:///home/#myUser/#SomeFolders/mlflow_artifact_store/mlruns/")
My expectation would be that MLflow saves all the artifacts to the place I gave in the tracking URI. Instead, it saves the artifacts relative to the place from where I run mlflow run path/to/train.py, i.e. running the following
/home/#myUser/ mlflow run path/to/train.py
creates the structure:
/home/#myUser/mlruns/#experimentID/#runID/artifacts
/home/#myUser/mlruns/#experimentID/#runID/metrics
/home/#myUser/mlruns/#experimentID/#runID/params
/home/#myUser/mlruns/#experimentID/#runID/tags
and therefore it doesn't find the run artifacts in the tracking_uri, giving the error message:
Traceback (most recent call last):
File "train.py", line 59, in <module>
with mlflow.start_run() as run:
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/fluent.py", line 204, in start_run
active_run_obj = client.get_run(existing_run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/client.py", line 151, in get_run
return self._tracking_client.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/_tracking_service/client.py", line 57, in get_run
return self.store.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 524, in get_run
run_info = self._get_run_info(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 544, in _get_run_info
"Run '%s' not found" % run_uuid, databricks_pb2.RESOURCE_DOES_NOT_EXIST
mlflow.exceptions.MlflowException: Run '788563758ece40f283bfbf8ba80ceca8' not found
2021/07/23 16:54:16 ERROR mlflow.cli: === Run (ID '788563758ece40f283bfbf8ba80ceca8') failed ===
Why is that so? How can I change the place where the artifacts are stored and where this directory structure is created? I have tried mlflow run --storage-dir here/comes/the/path, as well as setting the tracking_uri and registry_uri. If I run mlflow run path/to/train.py from /home/path/to/tracking/uri it works, but I need to run the scripts remotely.
My end goal is to change the artifact URI to an NFS drive, but I cannot do the trick even on my local computer.
Thanks for reading it, even more thanks if you suggest a solution! :)
Have a great day!
This issue was solved by the following:
I had mixed up the tracking_uri with the backend_store_uri.
The tracking_uri is where the MLflow related data (e.g. tags, parameters, metrics, etc.) are saved, which can be a database. On the other hand, the artifact_location is where the artifacts (other, not MLflow related data belonging to the preprocessing/training/evaluation/etc. scripts) are stored.
What led me astray is that when running mlflow server from the command line, one should pass the tracking_uri as --backend-store-uri (and also set it in the script via mlflow.set_tracking_uri()), and pass the location of the artifacts as --default-artifact-root. Somehow I didn't get that tracking_uri = backend_store_uri.
Here's my solution
Launch the server
mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri postgresql://DB_USER:DB_PASSWORD@DB_ENDPOINT:5432/DB_NAME --default-artifact-root s3://S3_BUCKET_NAME
Set the tracking URI to an HTTP URI like
mlflow.set_tracking_uri("http://my-tracking-server:5000/")
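To keep the two locations apart, it can help to build the server command from its pieces. The sketch below only uses the Python standard library; backend_store_uri and mlflow_server_command are hypothetical helper names, the password value is made up for illustration, and the only MLflow flags used are the ones already shown above (-h, -p, --backend-store-uri, --default-artifact-root). It also percent-encodes the credentials, on the assumption that a password may contain characters such as '@' that would otherwise break the URI.

```python
from urllib.parse import quote


def backend_store_uri(user, password, host, port, database):
    """Build a PostgreSQL URI of the form user:password@host:port/database."""
    # Percent-encode credentials so characters like '@' or '/' in a password
    # cannot be mistaken for URI delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def mlflow_server_command(store_uri, artifact_root, host="0.0.0.0", port=5000):
    """Return the `mlflow server` invocation as an argument list."""
    return [
        "mlflow", "server",
        "-h", host,
        "-p", str(port),
        "--backend-store-uri", store_uri,           # metadata: params, metrics, tags
        "--default-artifact-root", artifact_root,   # files logged as artifacts
    ]


# Placeholder values; the password is a made-up example.
store_uri = backend_store_uri("DB_USER", "p@ss/word", "DB_ENDPOINT", 5432, "DB_NAME")
command = mlflow_server_command(store_uri, "s3://S3_BUCKET_NAME")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run, while the training script itself only needs the HTTP tracking URI shown above.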
I'm trying to copy a file from one folder to another using Node-RED. The flow executes, but it throws an error and a syntax error is shown in the console.
My flow:
Node Configuration:
exec command used
copy C:\Users\Karthikeyan.Anbalaga\Downloads\ID_NewOnePulse_results.csv C:\Users\Karthikeyan.Anbalaga\Downloads\Documents\
At the same time, if I execute the same copy command in the terminal, it works well:
>copy C:\Users\Karthikeyan.Anbalaga\Downloads\ID_NewOnePulse_results.csv C:\Users\Karthikeyan.Anbalaga\Downloads\Documents\
1 file(s) copied.
I made a mistake in the exec node configuration: msg.payload was being appended to the command. After removing msg.payload it works fine, and I can see the files in the destination folder.
I am currently using a Node.js app that is deployed to Elastic Beanstalk on AWS. I have a function that writes a PDF and then emails it, but it says the file path can't be found. I've verified that the project directory seems to be /var/app/current/, but changing the file path reference doesn't seem to remove the error. Any idea how to go about fixing this?
The /var/app/current/ folder does not exist initially. It's only created at the very last stage of your deployment.
The deployment happens in the /var/app/staging/ folder, and at the very end, once everything finishes, /var/app/staging/ is moved to /var/app/current/.
Thus, I would not recommend using absolute paths in your project or config files. It's better to use relative paths, or container_commands for config scripts:
The specified commands run as the root user, and are processed in alphabetical order by name. Container commands are run from the staging directory, where your source code is extracted prior to being deployed to the application server.
I have a Node-RED app running in a docker container, with the aim to periodically read contents of a directory where .csv files are constantly updated and new .csv files are sometimes added. The point is to read new entries periodically, parse data, and send it onward.
I have not utilized the numerous 'contrib' nodes, as I have enabled the NodeJS 'fs' module and played with it. Additionally the built-in 'file' and 'file in' Node-RED modules are useful when reading the .csv files' contents, so that is not an issue.
The problem comes with the new .csv files being added into the directory where all the .csv files are. I want to be able to read all the file names and subsequently read all the .csv files.
I have mounted the .csv file directory into the docker container, and when testing whether I'm able to read the file names, weird things happen. Even though the files are visible in the container (viewed using docker exec -it CONTAINER /bin/bash), a piece of code containing fs.readdir does not list the files. When I use fs.readdir to see the contents of the /data directory, which is mounted into the container, it lists the contents only about 10% of the time (I inject a timestamp into the node to run it).
As you can see from the image, the contents of the directory in question are not listed on every execution of the node. The contents of the mounted directory containing the .csv files are never listed when running this node with the correct path as parameter.
The operating system is CentOS 7, where I am not a sudoer. I have managed to make it so that none of the mounted files or directories are owned by root, so they are owned by user node-red within the container. I managed to get this directory listing to work on my Ubuntu machine, where I am a sudoer, but as none of the stuff is root-owned there either, I am not sure if that is the problem. I have a feeling this might be an operating-system-related thing.
Notes:
All relevant files and directories have permissions rwxr-xr-x
I have tried to mount the .csv files containing directory under /data directory, and as its own directory directly under root as /files
I am able to read the file contents with the Node-RED file nodes, just not the directories. Reading static file names is not enough as the directory contents keep changing
I have enabled NodeJS 'fs' module from the settings.js file which is mounted into the container
The Node-RED node (in image) does not output any errors (I tried this by adding an error return to the function in the image)
I have tried to run the Node-RED container as root user and without defining the user
I am running the Node-RED container using docker-compose
I hope this was not too much text or too unclear, I just wanted to make sure at least most of the stuff I have tried would be written here. If someone has some insight on the workings of Node-RED under docker and using the NodeJS fs module, it would be most appreciated :)
The core Watch node should do all of this for you, no need to write function nodes.
If you want to walk subdirectories, make sure you tick the right box in the config.
From the Sidebar docs for the watch node:
The full filename of the file that actually changed is put into
msg.payload and msg.filename, while a stringified version of the watch
list is returned in msg.topic.
msg.file contains just the short filename of the file that changed.
msg.type has the type of thing changed, usually file or directory,
while msg.size holds the file size in bytes.
To answer my own question of why Node-RED was unable to read directory contents most of the time: it was because I was using the asynchronous fs.readdir function, most likely because the function node returned before the callback had fired. When I switched to the synchronous version, fs.readdirSync, Node-RED was able to read directory contents without problems.
Running the pipeline failed with the following error.
User program failed with ValueError: ZIP does not support timestamps before 1980
I created an Azure ML pipeline that calls several child runs. See the code below.
from azureml.core import Environment, Run, ScriptRunConfig, Workspace

# start parent Run
run = Run.get_context()
workspace = run.experiment.workspace
runconfig = ScriptRunConfig(source_directory=".", script="simple-for-bug-check.py")
runconfig.run_config.target = "cpu-cluster"
# Submit the child runs
for i in range(10):
    print("child run ...")
    run.submit_child(runconfig)
It seems the timestamp of the Python script (simple-for-bug-check.py) is invalid.
My Python SDK version is 1.0.83.
Is there any workaround for this?
Regards,
Keita
One workaround to the issue is setting the source_directory_data_store to a datastore pointing to a file share. Every workspace comes with a datastore pointing to a file share by default, so you can change the parent run submission code to:
# workspacefilestore is the datastore that is created with every workspace that points to a file share
run_config.source_directory_data_store = 'workspacefilestore'
That applies if you are using RunConfiguration. If you are using an estimator, you can do the following:
datastore = Datastore(workspace, 'workspacefilestore')
est = Estimator(..., source_directory_data_store=datastore, ...)
The cause of the issue is that the current working directory in a run is a blobfuse-mounted directory, and in the current (1.2.4) as well as prior versions of blobfuse, the last modified date of every directory is set to the Unix epoch (1970/01/01). Changing the source_directory_data_store to a file share changes the current working directory to a CIFS-mounted file share, which has the correct last modified time for directories and thus does not have this issue.
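The error message itself comes from Python's zipfile module, which refuses entries dated before 1980. The sketch below reproduces it locally with the standard library only, assuming Python 3.8+ for the strict_timestamps option. It is not how the Azure ML SDK builds its snapshot; zip_single_file is a hypothetical helper, and a plain file stands in for the directories described above.

```python
import os
import tempfile
import zipfile


def zip_single_file(src_path, zip_path, strict_timestamps=True):
    """Write one file into a new ZIP archive and return its stored date_time."""
    with zipfile.ZipFile(zip_path, "w", strict_timestamps=strict_timestamps) as zf:
        zf.write(src_path, arcname=os.path.basename(src_path))
        return zf.infolist()[0].date_time


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "simple-for-bug-check.py")
    with open(script, "w") as fh:
        fh.write("print('hello')\n")

    # Simulate the mount: force the modification time to the Unix epoch.
    os.utime(script, (0, 0))

    # Default behaviour: zipfile raises ValueError for pre-1980 timestamps.
    try:
        zip_single_file(script, os.path.join(tmp, "strict.zip"))
        strict_error = None
    except ValueError as exc:
        strict_error = str(exc)
    print("strict:", strict_error)

    # Python 3.8+: out-of-range timestamps are clamped instead of raising.
    clamped = zip_single_file(
        script, os.path.join(tmp, "lenient.zip"), strict_timestamps=False
    )
    print("lenient:", clamped)
```

This only illustrates where the ValueError originates; inside an Azure ML run the datastore change above remains the workaround.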