mlflow run git-uri clone to specific directory - python-3.x

I am using mlflow run with a GitHub URI.
When I run the command below:
mlflow run <git-uri>
it sets up a conda environment and then clones the Git repo into a temp directory, but I need it set up in a specific directory.
I checked the entire documentation but couldn't find such an option. Is there no way to do this in one shot?

For non-local URIs, MLflow uses Python's tempfile.mkdtemp function (source code), which creates the temporary directory. You may have some control over it by setting the TMPDIR environment variable as described in the Python docs (they list TEMP and TMP as well, but those didn't work for me on macOS). However, this only sets the base path for temporary directories and files; the directory and file names themselves will still be random.
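The effect of TMPDIR on tempfile.mkdtemp can be checked in isolation. The following is a minimal sketch, not MLflow code: the helper name make_temp_under and the directory name my_mlflow_base are made up for illustration.

```python
import os
import tempfile


def make_temp_under(base_dir):
    """Create a randomly named temp directory below base_dir via TMPDIR.

    Hypothetical helper for illustration; MLflow does not provide it.
    """
    os.makedirs(base_dir, exist_ok=True)
    previous = os.environ.get("TMPDIR")
    os.environ["TMPDIR"] = base_dir
    tempfile.tempdir = None  # forget the cached location so TMPDIR is re-read
    try:
        return tempfile.mkdtemp()
    finally:
        # Restore the environment for the rest of this process.
        if previous is None:
            del os.environ["TMPDIR"]
        else:
            os.environ["TMPDIR"] = previous
        tempfile.tempdir = None


# "my_mlflow_base" is an assumed name; any writable directory works.
base = os.path.join(tempfile.gettempdir(), "my_mlflow_base")
created = make_temp_under(base)

# The parent is the chosen base path; the last path component is random.
print(os.path.dirname(created) == os.path.abspath(base))
print(os.path.basename(created))
```

For mlflow run itself, the variable would have to be set in the environment of the process that runs the command, and only the base path is controlled this way.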

Related

Run a notebook from another notebook in a Repo Databricks

I have a notebook with functions in a repo folder that I am trying to run in another notebook.
Normally I can run it as such: %run /Users/name/project/file_name
So I cloned the two files (function_notebook, processed_notebook) into a Repo in Databricks.
When I try to copy the path where I just cloned it, only this option appears: Copy File Path relative to Root
However, in the Workspace user folder the option is Copy File Path
Evidently I don't quite grasp the difference between the relative path and the workspace path.
How can I run the notebook that has been cloned in the repo?
Hierarchy:
RepoName (has 2 folders):
Folder1 Notebook1
Folder2 Notebook2
I'm in Notebook1 wanting to run Notebook2
%run ../Folder2/Notebook2
It's a UI problem that has already been reported to the development team. Until it is fixed, you need to construct the path yourself. The difference is that it starts with /Repos, not /Users. I have a small demo that shows how to use Repos to perform testing, etc., if you are interested in details.
But if the files are inside the same repository, you don't need full paths, which make the code less portable. You can use relative paths instead: ./file_name to include a notebook in the current folder, ../file_name to include a file from the folder one level up, or ./folder/file_name to include a file from a subfolder. Don't specify the file extension. This way your code is portable and can be used in different checkouts.
Example:
Notebook2:
Notebook1:
The main difference between the workspace path and the relative path is that the former gives you the full path inside the Workspace, while the latter gives you the path relative to the root of the Repo.
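How a relative path maps onto a full /Repos path can be illustrated with plain path arithmetic. This is only a sketch of the idea, not how Databricks implements %run; the function name resolve_run_target and the /Repos/name/RepoName prefix are assumptions.

```python
import posixpath


def resolve_run_target(current_notebook, target):
    """Resolve a %run-style target against the notebook that contains it.

    Hypothetical helper: absolute targets are kept, relative ones are
    joined to the folder of the current notebook and normalized.
    """
    if target.startswith("/"):
        return posixpath.normpath(target)
    folder = posixpath.dirname(current_notebook)
    return posixpath.normpath(posixpath.join(folder, target))


# Assumed checkout location of the repo from the question.
notebook1 = "/Repos/name/RepoName/Folder1/Notebook1"

print(resolve_run_target(notebook1, "../Folder2/Notebook2"))
# -> /Repos/name/RepoName/Folder2/Notebook2
```

The same relative target resolves correctly in any other checkout of the repo, which is what makes it more portable than a hard-coded full path.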
My notebook is called "UserLibraries", and I successfully ran it in a separate cell without any other commands. Maybe that is the issue in your case. If the path is correct, it becomes a hyperlink, and I can open the called notebook in a new browser window by clicking it.

Copying shell file to path

I'm new to WSL and Linux, but I'm trying to follow installation instructions for rhasspy (https://rhasspy.readthedocs.io/en/latest/installation/#windows-subsystem-for-linux-wsl). I have run the make install command successfully and the next step says I should copy rhasspy somewhere in my path but I can't quite figure out what copying to path means.
When installation is finished, copy rhasspy.sh somewhere in your PATH and rename it to rhasspy.
I added it to path but nothing changed so I was wondering if there is something I'm doing wrong. Right now when I run rhasspy on wsl it says rhasspy.sh: command not found. Any help would be really appreciated!
What it says is, put it in some place where the system will look for it when you type its name without full path in the shell.
There is an environment variable PATH that contains all those locations, separated by a :. (Check out echo $PATH.)
So, the author of these instructions leaves it up to you whether...
You want to copy the file to a location of your choice that is already in the PATH, such as /usr/local/bin or ~/bin.
Usually ~/bin is a good choice because it is per-user and doesn't pollute the system.
(Note that the directory ~/bin is added to the PATH by your .profile file only if it exists, so if you don't have this directory yet and create it now, you need to start a new login shell or run . ~/.profile [1] before you can use it.)
- OR -
You want to create a new directory specifically for this application (say for example ~/opt/rhasspy) and append that directory to the PATH variable.
This can be done by adding the line export PATH=$PATH:~/opt/rhasspy to your ~/.profile file. Then, start a new login shell or reload the file using . ~/.profile [1] for the changes to take effect.
If the directory in which this file is currently located is OK for you to keep permanently, then you can also just add that directory to the PATH instead of creating a new one.
Note: The PATH always contains directory paths in which the shell will look for executable files. It does not contain the actual file paths!
1: Yes, technically it is "cleaner" to log into a new shell or to run that one export statement manually instead of using . ~/.profile because the latter will apply things a second time that were already done before, so for example it can end up with the same directory in the PATH multiple times in the current session. In most cases that is fine though.
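The lookup described in this answer can be sketched as follows. It is a simplified model of what a shell does, assuming a POSIX system; the function name find_on_path and the temporary directory standing in for ~/bin are made up for illustration.

```python
import os
import stat
import tempfile


def find_on_path(command, path_value):
    """Return the first executable named `command` in a PATH-style string.

    Simplified model: empty PATH entries are skipped, although real
    shells treat them as the current directory.
    """
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


with tempfile.TemporaryDirectory() as home:
    bin_dir = os.path.join(home, "bin")  # stands in for ~/bin
    os.makedirs(bin_dir)

    # rhasspy.sh copied into the directory and renamed to rhasspy.
    script = os.path.join(bin_dir, "rhasspy")
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\necho hello\n")
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)

    # PATH holds directories; the file is found by name inside them.
    found_with = find_on_path("rhasspy", "/nonexistent-dir" + os.pathsep + bin_dir)
    found_without = find_on_path("rhasspy", "/nonexistent-dir")
    wrong_name = find_on_path("rhasspy.sh", bin_dir)

print(found_with is not None, found_without, wrong_name)
```

The last lookup fails because the file was renamed: after the rename the command to type is rhasspy, not rhasspy.sh.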
PATH is an environment variable. When you run env, you see the list of environment variables defined in your session.
In order to add something to your PATH variable, you need to take the variable, append the desired directory (preceded by a colon as a separator) and store the result as the PATH variable again. This can be done as follows (own example):
export PATH=$PATH:/home/this_user
The "PATH" it refers to on Linux is a list of directories, one of which is usually /usr/bin. When you type a command into the terminal, the shell looks for a program with that name inside those locations. I'm not sure if this is the PATH you are looking for, but I hope it helps.

Dockerizing Node.js app - what does: ENV PATH /app/node_modules/.bin:$PATH

I went through one of the very few good tutorials on dockerizing Vue.js, and there is one thing I don't understand: why is the following mandatory in the Dockerfile?
# add `/app/node_modules/.bin` to $PATH
ENV PATH /app/node_modules/.bin:$PATH
# not sure though how it relates to PATH...
COPY package.json /usr/src/app/package.json
I found only one explanation here which says:
We expose all Node.js binaries to our PATH environment variable and
copy our projects package.json to the app directory. Copying the JSON
file rather than the whole working directory allows us to take
advantage of Docker’s cache layers.
Still, it didn't make me any smarter. Is anyone able to explain it in plain English?
Error prevention
I think this is just a simple method of preventing an error where Docker wasn't able to find the correct executables (or any executables at all). Besides adding another layer to your image, there is in general as far as I know no downside in adding that line to your Dockerfile.
How does it work?
Adding node_modules/.bin to the PATH environment variable ensures that the executables placed there when dependencies are installed (npm install or yarn install) can be found. You could also COPY your locally built node_modules folder to the image, but it's advised to build it inside the Docker container to ensure all binaries are adapted to the underlying OS running in the container. The best practice would be to use multistage builds.
Furthermore, adding node_modules/.bin at the beginning of the PATH environment variable ensures that exactly these executables (from the node_modules folder) are used instead of any other executables with the same name that might also be installed on the system inside the Docker image.
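The precedence effect of putting a directory at the beginning of PATH can be checked with shutil.which. This is a small sketch assuming a POSIX system; the executable name some-tool and the directory layout under a temporary root are made up to mirror the Dockerfile line.

```python
import os
import shutil
import stat
import tempfile


def make_tool(directory, name):
    """Create an empty executable shell script and return its path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


root = tempfile.mkdtemp()
try:
    # Two executables with the same (hypothetical) name in different places.
    system_tool = make_tool(os.path.join(root, "usr", "bin"), "some-tool")
    local_tool = make_tool(
        os.path.join(root, "app", "node_modules", ".bin"), "some-tool"
    )
    system_dir = os.path.dirname(system_tool)
    local_dir = os.path.dirname(local_tool)

    # Like ENV PATH /app/node_modules/.bin:$PATH, the local directory first.
    prepended = local_dir + os.pathsep + system_dir
    # The opposite order, for comparison.
    appended = system_dir + os.pathsep + local_dir

    picked_when_prepended = shutil.which("some-tool", path=prepended)
    picked_when_appended = shutil.which("some-tool", path=appended)
finally:
    shutil.rmtree(root)

print(picked_when_prepended == local_tool)
print(picked_when_appended == system_tool)
```

The first match wins, so prepending lets the project's own binaries shadow system-wide ones with the same name.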
Do I need it?
Short answer: Usually no. It should be optional.
Long answer: It should be enough to set the WORKDIR to the path where node_modules is located so that the RUN, CMD or ENTRYPOINT commands in your Dockerfile find the correct binaries and execute successfully. But I, for example, had a case where Docker wasn't able to find the files (I had a pretty complex setup with a so-called devcontainer in VS Code). Adding the line ENV PATH /app/node_modules/.bin:$PATH solved my problem.
So, if you want to increase the stability of your Docker setup in order to make sure that everything works as expected, just add the line.
So I think the benefit of this line is that it adds the node_modules path from the Docker container to the PATH of that container. If you're on a Mac (or Linux, I think) and run:
$ echo $PATH
You should see a list of paths that are searched when you run global commands from your terminal, e.g. gulp, husky, yarn and so on.
The ENV line adds the node_modules/.bin path to the PATH in your Docker container, so that such commands can be run from anywhere inside the container if needed.
.bin (short for 'binaries') is a hidden directory, the period before the bin indicates that it is hidden. This directory contains executable files of your app's modules.
PATH is just a collection of directories/folders that contains executable files.
When you try to do something that requires a specific executable file, the shell looks for it in the collection of directories in PATH.
ENV PATH /app/node_modules/.bin:$PATH adds the .bin directory to this collection, so that when node tries to do something that requires a specific module's executable, it will look for it in the .bin folder.
For each instruction, like FROM, COPY, RUN, CMD, ..., Docker creates an image with the result of that instruction; these images are called layers. The final image is the result of merging all layers.
If you use the COPY instruction to store all the code in one layer, that layer will be larger than one that only stores an environment variable with the path of the code.
That's why caching layers is a benefit.
For more info about layers, take a look at this very good article.
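Why copying package.json before the rest of the code helps the cache can be shown with a toy model. This is a deliberately simplified sketch, not Docker's actual cache algorithm: the function layer_keys, the base image name and the file names are assumptions, and each layer key simply depends on the previous key, the instruction text and, for COPY, the copied file contents.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction in a toy layer-cache model."""
    keys = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(instruction.encode())
        if instruction.startswith("COPY "):
            source = instruction.split()[1]
            for name in sorted(file_contents):
                # "COPY ." takes every file, otherwise only the named one.
                if source == "." or name == source:
                    digest.update(name.encode())
                    digest.update(file_contents[name].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


dockerfile = [
    "FROM node:lts",  # assumed base image
    "WORKDIR /app",
    "COPY package.json /app/package.json",
    "RUN npm install",
    "COPY . /app",
]

# Only the application source changes between the two builds.
before = layer_keys(dockerfile, {"package.json": "{}", "src/main.js": "v1"})
after = layer_keys(dockerfile, {"package.json": "{}", "src/main.js": "v2"})

reused = [old == new for old, new in zip(before, after)]
print(reused)  # [True, True, True, True, False]
```

In this model the npm install layer is reused because package.json did not change; only the final COPY layer has to be rebuilt.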

Location of default directories list stored for the 'origen new' command

Where is the default directories list stored for the 'origen new' command? I would like to make a PR to add the 'vendor' directory as a default directory for the 'origen new' command.
It is defined here:
https://github.com/Origen-SDK/origen_app_generators/blob/master/lib/origen_app_generators/application.rb#L25
Once you clone that plugin you can run the origen app_gen:test command from inside of it to test out your changes.
If you haven't already seen it, this guide will give some background on how the new application system works: http://origen-sdk.org/origen/guides/advanced/newapps
I've wondered about whether we should actually generate a .keep or .gitkeep file into empty directories like this so that they will stay around once the new app is checked in and cloned but nothing has been added to the dir yet.
Have a think about that for your PR.
Thanks!

Running a program from the source tree

Should it generally be possible to run a program from the source directory (src) after having invoked ./configure and make (but not make install)? I'm trying to fix a bug in an application and it seems unnecessary to run make install after each code change. Unfortunately I can't run the application in the source directory since it tries to access files in the lib installation directory (which do not exist before make install). Is the application wrongly configured or do I have to reinstall it after each change to the source code?
It all depends on the application and what components or files it expects to be visible and where. But assuming no required configuration or dependencies, then yes, you can run the program in-place.
To add a directory to your library search path, append it to the LD_LIBRARY_PATH environment variable, like so:
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/user/myproject/lib" ./someprogram
Note that specifying a variable assignment on the command line in front of the program you run sets that variable for that run only. (Note, no semicolon: this is a single command.) If you want to set the variable for the entire session, use
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/user/myproject/lib"
I'd recommend against this, though. It can lead to problems and confusion.
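The "for that run only" behaviour has a direct counterpart when launching the program from a script. A minimal sketch, assuming the hypothetical helper name run_with_lib_path and reusing the example directory from above; a small Python child process stands in for ./someprogram.

```python
import os
import subprocess
import sys


def run_with_lib_path(command, extra_dir):
    """Run command with extra_dir appended to LD_LIBRARY_PATH for that run only."""
    env = os.environ.copy()  # the child gets a modified copy
    current = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = f"{current}:{extra_dir}" if current else extra_dir
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Stand-in for ./someprogram: a child that prints the variable it sees.
child = [sys.executable, "-c", "import os; print(os.environ['LD_LIBRARY_PATH'])"]
result = run_with_lib_path(child, "/home/user/myproject/lib")

print(result.stdout.strip())
# The calling process itself is left untouched.
print("/home/user/myproject/lib" in os.environ.get("LD_LIBRARY_PATH", ""))
```

Because only the copy passed to the child is changed, nothing leaks into the rest of the session, which avoids the confusion mentioned for the export variant.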

Resources