how to separate source code and data while minimizing directory changes during working? [closed]


Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
This is a general software engineering problem about working on Linux. Suppose I have source code, mainly scripts, that manipulate text data: they take text files as input and produce text files as output. I am thinking about how to separate source code and data appropriately while minimizing directory changes as I work. I see two possibilities:
Mix code and data together. This minimizes directory changes and eliminates the need to type paths to files while working. Most of the time I just call:
script1 data-in data-out # call script
vi data-out # view result
The problem is that as the number of code and data files grows, the directory turns into a long, messy list of both code and data files.
Separate code and data into two folders, say "src" and "data". When I am in the "src" folder, doing the above actions would require:
script1 ../data/data-in ../data/data-out # call script
vi ../data/data-out # view result (or: cd ../data; vi data-out)
The extra typing of the parent-directory prefix "../data" is a hassle, especially when I am doing lots of quick tests of scripts.
You might suggest I do it the other way around and work in the data folder. But then I similarly need to call ../src/script1, again with the hassle of typing the prefix "../src". Yes, we could add "src" to PATH. But what if there are dependencies among scripts across parent-child directories? For example, suppose under "src" there is "subsrc/script2", and script1 calls "./subsrc/script2 ..." internally. Then calling script1 from the "data" folder would throw an error, because there is no "subsrc" folder under the "data" folder.
Separating code and data and minimizing directory changes seem to be conflicting requirements. Do you have any suggestions? Thanks.

I would use the cd - facility of the shell plus setting the PATH to sort this out, possibly with some scripts to help.
I'd ensure that the source directory, where the programs are built, is on my PATH, at the front. I'd cd into either the data directory or the source directory, (maybe capture the directory with d=$PWD for the data directory, or s=$PWD for the source directory), then switch to the other (and capture the directory name again). Now I can switch back and forth between the two directories using cd - to switch.
Depending on whether I'm in 'code work' or 'data work' mode, I'd work primarily in the appropriate directory. I might have a simple script to (cd $source_directory; make "$@") so that if I need to build something, I can do so by running the script. I can edit files in either directory with a minimum of fuss, either with a swift cd - plus vim, or with vim $other_dir/whichever.ext. Because the source directory is on PATH, I don't have to specify full paths to the commands in it.
I use an alias alias r="fc -e -" to repeat a command. For example, to repeat the last vim command, r v; the last make command, r m; and so on.
I do this sort of stuff all the time. The software I work on has about 50 directories for the full build, but I'm usually just working in a couple at a time. I have sets of scripts to rebuild the system based on where I'm working (chk.xyzlib and chk.pqrlib to build in the corresponding sets of directories, for example; two directories for each of the libraries). I prefer scripts to aliases; you can interpolate arguments more easily with scripts whereas with aliases, you can only append the arguments. The (cd $somewhere; make "$@") notation doesn't work with aliases.
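The cd - toggle and the subshell trick described in this answer can be sketched as follows. This is only an illustration: the temporary src and data directories are hypothetical stand-ins for a real project, and pwd stands in for the make command.

```shell
#!/bin/sh
# Hypothetical project layout, created in a temp dir so the sketch runs anywhere.
work=$(cd "$(mktemp -d)" && pwd)
mkdir -p "$work/src" "$work/data"

cd "$work/src"  && s=$PWD    # remember the source directory
cd "$work/data" && d=$PWD    # remember the data directory

cd - >/dev/null              # toggle: now in src
after_toggle=$PWD
cd - >/dev/null              # toggle again: back in data
after_second_toggle=$PWD

# Run a command in src from a subshell; the current directory is untouched.
# A real wrapper script would contain: (cd "$s" && make "$@")
ran_in=$(cd "$s" && pwd)
still_in=$PWD
```

The parentheses (or command substitution) start a subshell, so the cd inside them never leaks into the interactive shell.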

It's a little more coding, but can you set environment variables from the command line to specify the data directory?
export DATA_INPUT_DIR=/path/to/data
export DATA_OUTPUT_DIR=/path/to/outfiles
Then your script can process files relative to these directories:
# Set variables at the top of your scripts (bash):
in_dir="${DATA_INPUT_DIR:-.}" # Default to current directory
out_dir="${DATA_OUTPUT_DIR:-.}" # Default to current directory
# 1st arg is input file. Prepend $in_dir unless path is absolute.
infile="$1" # no spaces around = in a shell assignment
[ "${1:0:1}" = "/" ] || infile="$in_dir/$infile"
# 2nd arg is output file. Prepend $out_dir unless path is absolute.
outfile="$2"
[ "${2:0:1}" = "/" ] || outfile="$out_dir/$outfile"
# Remainder of the script uses $infile and $outfile.
Of course, you could also open several terminal windows: some for working on the code and others for executing it. :-)
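The path-resolution rule from this answer (prepend the data directory unless the argument is already absolute) can also be factored into a small helper. This is a minimal POSIX sh sketch; the function name resolve_path and the sample paths are made up for illustration.

```shell
#!/bin/sh
# resolve_path DIR FILE: print FILE unchanged if it is absolute, else DIR/FILE.
# (resolve_path is a hypothetical helper name, not part of the original answer.)
resolve_path() {
    case $2 in
        /*) printf '%s\n' "$2" ;;       # absolute path: leave alone
        *)  printf '%s\n' "$1/$2" ;;    # relative path: prepend the directory
    esac
}

# Sample values for the demonstration only.
rel=$(resolve_path "/srv/data" "data-in")
abs=$(resolve_path "/srv/data" "/tmp/data-in")
```

Using case with the pattern /* avoids the bash-only substring expansion, so the helper also works in plain sh.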

Related

Linux create global command that immediately takes user to certain directory [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 6 years ago.
When the user types apps at the command line and presses Enter, I want it to immediately take the user to some directory, like /var/foo/bar.
At first I was thinking I could do this with a symlink, but I then realized that wouldn't work because it isn't global.
By "global" I mean that no matter what directory the user is currently in, the command will always move the user to the /var/foo/bar directory.
How might I be able to do this?

You can create an alias to the corresponding cd command:
alias apps="cd /var/foo/bar"

There are two main ways of achieving that.
1. Use a shell alias
Enter this into your shell:
$ alias apps="cd ~/applications"
From now on, in this particular shell session, typing apps and pressing Enter will run cd and take you to applications (~ is your home directory).
Note that here, apps is not a program, just an alias, a name that the shell recognizes and reinterprets.
To make the alias defined above permanent, you should add that line to your shell profile. This is a file, located at a known path, that runs every time you open a new shell. One of these files probably exists (~ is your home directory):
~/.bashrc
~/.bash_profile
~/.profile
So, if you add the alias command at the bottom, it will be available on all new terminals.
2. Write a new program
EDIT: funny thing, this doesn't work for your case. A program runs as a child process, so it cannot change the current directory of the shell that launched it. It's still a nice summary of how to create a program, though. Use it for something else.
Creating new programs to accomplish specific tasks is pretty simple, but it takes some getting used to. We can do it in 3 steps.
1- Open a file named apps in your home directory, and put this in it:
#!/bin/bash
cd ~/applications # or whatever directory you want
The first line of the file is called a hashbang, and it signals that this program should be executed using bash, just like your command-line. The only other line is your bash command to change directory.
Save it in ~/apps.
2- Make the file executable by running:
$ chmod +x ~/apps
3- Last, put this program in your PATH. The PATH is the shell's list of directories that contain programs. You already have some directories in your PATH:
$ echo $PATH
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin
To make your program available to all users in the system, move it to /usr/bin:
$ sudo mv ~/apps /usr/bin/apps
If you can't sudo, you can still make this program available to all of your own terminals by adding a directory you control to the PATH.
$ mkdir ~/bin
$ PATH="$PATH:$HOME/bin"
The second command extends the PATH to include ~/bin ($HOME is used because ~ is not expanded inside double quotes). As explained for the alias, you can make this change permanent by putting it in your bash profile.
Now, move the program to your new bin directory:
$ mv ~/apps ~/bin/apps
You should be able to type apps and press enter to execute your program now.
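The limitation mentioned in the EDIT note can be checked directly: a cd performed by a child process is lost when that process exits, while a shell function runs inside the current shell. The sketch below assumes a temporary directory as a stand-in for the real target; a function like apps would go in the same profile file as the alias.

```shell
#!/bin/sh
# Temporary directory used as a stand-in for the real target directory.
target=$(cd "$(mktemp -d)" && pwd)
start=$PWD

# A separate program (here: a child sh) changes only its own directory.
sh -c 'cd "$1"' sh "$target"
after_child=$PWD          # still the starting directory

# A shell function runs in the current shell, so its cd sticks.
apps() { cd "$target" || return; }
apps
after_function=$PWD       # now the target directory
```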

cd command : how to go back an unknown number of levels from current subdirectory to a particular parent directory (unix and dos)

Ok, so I am trying to resolve a uri in an xmlcatalog and I want to go back from a particular sub-directory back to a parent-directory that is an-unknown-number-of-levels behind.
eg:
file:///D:/Sahil/WorkSpaces1/Cartridges1/Project1/ParticularFolder/Level1/Level2/<so-many-levels>/CurrentFolder
I want to go back from "CurrentFolder" to "ParticularFolder" without typing in the full FilePath.
I want to achieve this because I work in multiple projects that all contain a "ParticularFolder", so the code inside the sub-directories of this folder should dynamically have access to all other files in other sub-directories inside this parent folder. I do not want to specify separate full file paths for my various projects and make the code too rigid.
Is it possible? Please mention how to achieve this in windows, unix as well as linux os.

In UNIX/Linux/OS X/etc.:
while [ "$(basename "$PWD")" != "ParticularFolder" ] && [ "$PWD" != "/" ]; do cd ..; done
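A reusable variant of this loop is sketched below. It walks up the path as a string first and only changes directory if the ancestor is actually found, so a typo in the name does not leave you stranded at /. The function name up_to and the temporary directory tree are assumptions made for the example.

```shell
#!/bin/sh
# up_to NAME: cd to the nearest directory called NAME at or above $PWD.
# Returns non-zero (and stays put) if no such directory exists.
up_to() {
    up_to_dir=$PWD
    while [ "$up_to_dir" != "/" ] && [ "$(basename "$up_to_dir")" != "$1" ]; do
        up_to_dir=$(dirname "$up_to_dir")
    done
    [ "$(basename "$up_to_dir")" = "$1" ] || return 1
    cd "$up_to_dir"
}

# Demonstration in a throwaway tree (hypothetical names from the question).
root=$(cd "$(mktemp -d)" && pwd)
mkdir -p "$root/ParticularFolder/Level1/Level2/CurrentFolder"
cd "$root/ParticularFolder/Level1/Level2/CurrentFolder"

up_to ParticularFolder
landed=$PWD

if up_to NoSuchFolder_xyz; then missing=found; else missing=not-found; fi
```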

Apply bash variables for directories (recursive)

I have a project that requires me to keep a lot of bash files with installation/maintenance/whatever scripts, and most of them need to know where other folders are. Right now that is all done with relative paths, but that forces me to keep the folder structure fixed, which might not be the best idea in the long run.
So, as an example, I have this file (script.sh):
THINGS_DIR=../../things
myprogram $THINGS_DIR
But now I would like to have 2 files, one with global variables from these directories (let's call it conf.sh):
THINGS_DIR=./things/
OTHERS_DIR=./things/others/
and, on script.sh, somehow, I would use those variables.
The best way I could find is to keep conf.sh in a fixed place and have all the other scripts source it before they start, but I was trying to find a better solution.
EDIT
I forgot to say that this is in a Git repository, which is a fair assumption to keep along the way. That being said, and because I wanted to keep this as self-contained as possible, I ended up using this in every script that needs those variables:
. "$(git rev-parse --show-toplevel)/my_conf_file.conf"
This command executes what's inside my_conf_file.conf, located in the Git repository root. It's not ideal (nor completely safe) but it does the trick with minimum configuration.

A common idiom is to have a config file in one (or more) of several directories and have your script check each one of them in order. For instance you could search:
/etc/script.cfg # global
~/.script.cfg # user-specific, hidden with leading dot
You might also search the environment as a last resort. This search strategy is very easy to implement in bash. All you have to do is:
[[ -e /etc/script.cfg ]] && . /etc/script.cfg
[[ -e ~/.script.cfg ]] && . ~/.script.cfg
echo "THINGS_DIR=$THINGS_DIR"
echo "OTHERS_DIR=$OTHERS_DIR"
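It sources the two config files if they exist. If the user copy exists, it overrides the global settings. If neither exists, the script naturally uses any variables inherited from the environment. Note that a plain assignment in a config file replaces an inherited value, so for a command-line override to win, the config files should only set defaults (for example with : "${THINGS_DIR:=./things/}"). The user could then override the config settings like so: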
THINGS_DIR=/overridden/things/dir script.sh
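Whether an environment value survives the sourcing step depends on how the config file assigns its variables. The sketch below assumes a config file written with the := default expansion, which is one way (not the only one) to let the environment win; the temporary file stands in for /etc/script.cfg or ~/.script.cfg.

```shell
#!/bin/sh
# Throwaway config file standing in for a real script.cfg.
cfg=$(mktemp)
cat >"$cfg" <<'EOF'
# Assign only if the variable is unset or empty.
: "${THINGS_DIR:=./things/}"
: "${OTHERS_DIR:=./things/others/}"
EOF

# Simulate: THINGS_DIR=/overridden/things/dir script.sh
THINGS_DIR=/overridden/things/dir
unset OTHERS_DIR

[ -e "$cfg" ] && . "$cfg"

echo "THINGS_DIR=$THINGS_DIR"
echo "OTHERS_DIR=$OTHERS_DIR"
```

With this form, THINGS_DIR keeps the value from the environment while OTHERS_DIR falls back to the default from the file.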

Libraries paths defined by Master bash script, but having to run it in every terminal session, how to make more efficient?

I have built a set of libraries, and many of my Fortran programs will use them. This creates a problem: if I ever need to change the location of the libraries, I will need to individually update the path directories in each makefile.
How is this usually overcome? I have planned instead to have each makefile read a path from a single master path file in the home or root directory (this file's location will never change). This file contains the path for each library, and if any path changes, only this file needs to be updated.
So I wrote a bash script file, called Master_Library_Paths:
export Library1_Name={Library1_Name_Path}
echo $Library1_Name
export Library2_Name={Library2_Name_Path}
echo $Library2_Name
export Library3_Name={Library3_Name_Path}
echo $Library3_Name
And placed it in my home directory. Then in the make files, I have a line:
$(shell . {Path for Master_Library_Paths} ) \
And load the libraries:
-I$(Library1_Name)
-I$(Library2_Name)
-I$(Library3_Name)
This works great if I run ./Master_Library_Paths in the terminal session first and then go to the directory to compile the program. However, that is quite time consuming. How can I fix it so that the variables Library1_Name, Library2_Name, etc. are known throughout the system?

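Directories for the system-wide runtime library search path can be added to /etc/ld.so.conf or to a file under /etc/ld.so.conf.d/ (run ldconfig afterwards).
Environment variables such as Library1_Name can instead be exported from a script in /etc/profile.d/, which login shells read at startup on most Linux distributions.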
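Whichever file ends up holding the exports, it has to be sourced rather than executed for the variables to reach the shell that later runs make. The sketch below illustrates the difference; the temporary file and the path /opt/example/lib1 are hypothetical placeholders, not the asker's real locations.

```shell
#!/bin/sh
# Throwaway stand-in for Master_Library_Paths (hypothetical path value).
paths_file=$(mktemp)
cat >"$paths_file" <<'EOF'
export Library1_Name=/opt/example/lib1
EOF
unset Library1_Name

# Executed: the export happens in a child process and is lost afterwards.
sh "$paths_file"
after_exec=${Library1_Name:-unset}

# Sourced: the export happens in the current shell...
. "$paths_file"
after_source=$Library1_Name

# ...and is inherited by child processes such as make.
seen_by_child=$(sh -c 'printf %s "$Library1_Name"')
```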

How do I modify my user PROFILE file to append a scripts folder I created to the end of my PATH variable? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
How do I modify my user PROFILE file to append a scripts folder I created to the end of my PATH variable?
I am not totally sure what this means. Can anyone explain?

In unix/linux systems, you have a user id ('john') and a home directory ('/home/john'). The home directory has an abbreviation, the tilde: ~ (at the start of a directory path) means the same as your home directory ("/home/john").
In the home directory are several files that begin with a period (aka dot files because they start with a dot, i.e., a period). When you log in, the shell (i.e., the program that processes the command line when you type commands) that is started to supply you a command line looks for these files and reads them, using their content to initialize your shell environment. You can see these files (if they exist) by entering these commands at the command line:
cd
ls -a
The cd with no args means 'change the current directory to be my HOME directory'. The ls command lists files in a directory (among other things); the -a option says 'show hidden files'. Hidden files are those that start with a period - this is the convention used in unix/linux to 'hide' files.
The .profile (said out loud it's often pronounced 'dot profile') file is one such dot file used for initializing your environment.
The PATH environment variable is used by the shell to search for executable files (programs).
You can google for 'how to update PATH in profile' and similar to learn more about the topic.
Here is a typical snippet found in a .profile file; its purpose is to allow you to run programs that are stored in the directory /usr/mypackage/bin.
PATH="/usr/mypackage/bin:$PATH"
export PATH
Putting a directory on the PATH allows you to type just a program name ('myprogram') in place of the longer form ('/usr/mypackage/bin/myprogram').
You can see the effect of this snippet using echo $PATH; it will show the entire value of the PATH variable. The value should be a list of paths (directories) separated by colons. A simple example:
echo $PATH
/usr/mypackage/bin:/usr/bin:/bin
That should give you a foothold to begin investigating the details. Try searching for topics like 'how do I set up my linux/unix login', 'what is .profile file', etc., to learn more.
It's advisable to use double quotes when setting the value of PATH, to protect any unusual characters (such as spaces) that may be in the names of the items in the path. Single quotes are not suitable for this, as they prevent the evaluation of $PATH (which is what supplies your existing path when defining your new path value).
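For the case asked about here (appending a scripts folder to the end of PATH), a guarded version avoids adding the same directory again each time the profile is read. This is a sketch under assumptions: the directory $HOME/scripts is a hypothetical name, and PATH is reset to a short sample value only so the result is easy to inspect.

```shell
#!/bin/sh
scripts_dir="$HOME/scripts"   # hypothetical scripts folder
PATH=/usr/bin:/bin            # sample starting value for the demonstration only

# Append scripts_dir to the end of PATH unless it is already listed.
add_to_path_end() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: do nothing
        *) PATH="$PATH:$1" ;;        # double quotes keep $PATH expandable
    esac
}

add_to_path_end "$scripts_dir"
add_to_path_end "$scripts_dir"       # second call is a no-op
export PATH
```

In a real .profile, only the function (or just the case statement) and the export would be needed; the PATH reset line is there purely for the example.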

Programs like cat and ls simply work by entering the command. However, they are located in a certain folder, such as /usr/bin/. Try for yourself, and see which folder cat is located in, by entering which cat. (cd is different: it is built into the shell itself.)
When you type in such a command, your shell needs a list of folders in which it has to look for the command just entered. It uses the $PATH variable for this, which stores this list. You can see it by entering echo $PATH.
Now, if you close your shell, any changes you made to the $PATH variable are gone. When you open a new shell, it runs a number of startup scripts, one of them being the .profile script. In this script, the $PATH variable is set. Therefore, you can adjust the .profile file to make a $PATH change permanent. To do so, open the file in an editor (e.g. pico ~/.profile) and edit the line where $PATH is defined.
In your particular case, adding your scripts folder to the $PATH like this means you can simply write the name of your script instead of the whole path when you want to launch one.

The PATH variable stores the list of directories the shell searches for programs/commands when you try to run them. You can access its value from the command line by typing:
echo $PATH
Be careful when modifying it, otherwise you could interfere with your ability to run programs from the command line. To add a new directory while keeping the original value, you could put a line in your profile file such as:
PATH=$PATH:/directory_to_add
where 'directory_to_add' is the directory you want to add to the path ($PATH tells the shell to insert the value of PATH). Then, if you type the name of one of the scripts in the folder at the command line, it will run without having to type the full pathname (as long as it has execute permission).
Note - your profile file can be found at ~/.profile, and you can add the line above with a text editor and resave the file. Then, from your home directory, type . ./.profile (note the leading dot and space; running sh ./.profile would only change the PATH of a child shell), and your path should now include the desired directory.
