Redirect program output without changing directory - linux

Problem
I'm writing a set of scripts to help with automated batch job execution on a cluster.
The specific thing I have is a $OUTPUT_DIR, and an arbitrary $COMMAND.
I would like to execute the $COMMAND such that its output ends up in $OUTPUT_DIR.
For example, if COMMAND='cp ./foo ./bar; mv ./bar ./baz', I would like to run it such that the end result is equivalent to cp ./foo ./$OUTPUT_DIR/baz.
Ideally, the solution would look something like eval PWD="./$OUTPUT_DIR" $COMMAND, but that doesn't work.
Known solutions
[And their problems]
Editing $COMMAND: In most cases the command will be a script, or a compiled C or FORTRAN executable. Changing the internals of these isn't an option.
unionfs, aufs, etc.: While this is basically perfect, users running this won't have root, and causing thousands+ of arbitrary mounts seems like a questionable choice.
copying/ hard/soft links: This might be the solution I will have to use: some variety of actually duplicating the entire content of ./ into ./$OUTPUT_DIR
cd $OUTPUT_DIR; ../$COMMAND : Fails if $COMMAND ever reads files
pipes : only works if $COMMAND doesn't directly work with files; which it usually does
Is there another solution that I'm missing, or is this request actually impossible?
[EDIT:] Chosen Solution
I'm going to go with something where each object in the directory is symbolic-linked into the output directory, and the command is then run from there.
This has the downside of creating a lot of symbolic links, but it shouldn't be too bad.
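A minimal sketch of that symlink approach, assuming a POSIX shell. The helper name run_in_output_dir and its argument order are made up for illustration; it is not a tested batch-system tool.

```shell
#!/bin/sh
# Hypothetical helper: run_in_output_dir SRC_DIR OUTPUT_DIR COMMAND
# Symlinks every entry of SRC_DIR into OUTPUT_DIR, then runs COMMAND there.
run_in_output_dir() {
    src=$(cd "$1" && pwd) || return 1
    mkdir -p "$2" || return 1
    out=$(cd "$2" && pwd) || return 1
    cmd=$3

    for entry in "$src"/* "$src"/.[!.]*; do
        # Skip unmatched glob patterns and the output directory itself.
        if [ ! -e "$entry" ] || [ "$entry" = "$out" ]; then
            continue
        fi
        name=$(basename "$entry")
        # Leave alone anything that already exists in the output directory.
        if [ ! -e "$out/$name" ] && [ ! -L "$out/$name" ]; then
            ln -s "$entry" "$out/$name" || return 1
        fi
    done

    # Run in a subshell so the caller's working directory is untouched.
    ( cd "$out" && sh -c "$cmd" )
}
```

With this, run_in_output_dir . "./$OUTPUT_DIR" 'cp ./foo ./bar; mv ./bar ./baz' should leave baz in the output directory. One caveat: a command that modifies an input file in place writes through the symlink and changes the original.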

You can't solve this without making some assumptions about the interface of $COMMAND. There is no single definition of what "output ends up in $OUTPUT_DIR" means. For one program the output may be some files; another might just print to stdout; yet another might send data over the network or display something in a GUI. There is no obvious way of mapping all of these to "output goes to $OUTPUT_DIR".
So you need to define some conventions and require every $COMMAND to follow them. It may then be as simple as requiring that the command accept a parameter such as --target=<DIR>. If the command is an existing program that does not accept it, create a wrapper script around it that translates the parameter into whatever the program does accept. GNU cp, mv and a few other utilities already accept --target-directory=<DIR> (short form -t, which GNU option parsing lets you abbreviate to --target), so that may be a good starting point.
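As an illustration of such a wrapper, here is a sketch that gives sort a --target=<DIR> parameter by translating it into sort's own -o option. The wrapper name sort_to and the .sorted suffix are invented for this example.

```shell
#!/bin/sh
# Hypothetical wrapper: sort_to [--target=DIR] FILE
# Writes the sorted copy of FILE to DIR/<basename of FILE>.sorted.
sort_to() {
    target=.
    case ${1-} in
        --target=*)
            target=${1#--target=}   # strip the option name, keep the directory
            shift
            ;;
    esac
    if [ $# -ne 1 ]; then
        echo "usage: sort_to [--target=DIR] FILE" >&2
        return 2
    fi
    mkdir -p "$target" || return 1
    # sort -o FILE is the interface the wrapped program actually understands.
    sort -o "$target/$(basename "$1").sorted" "$1"
}
```

Every command would need its own small translation like this, which is the price of agreeing on one convention.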

You cannot set the output directory, you can only set the working directory.
The problem is that once you change the working directory, relative references to input files become invalid. For example, in your command, ./foo would no longer resolve:
cp ./foo ./bar
If you have a specific command, there are workarounds (creating a script that alters arguments, prepending the directory to specific arguments), but in general this is not possible.
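One such workaround, sketched under a loud assumption: any argument that names an existing path is an input and can be made absolute, while everything else is left relative so it lands in the output directory. The helper name run_with_abs_inputs is hypothetical, and the heuristic breaks as soon as an output name happens to exist in the starting directory.

```shell
#!/bin/sh
# Hypothetical helper: run_with_abs_inputs OUTPUT_DIR COMMAND [ARG...]
run_with_abs_inputs() {
    out=$1
    shift
    # Rebuild the argument list one argument at a time.
    for arg do
        shift
        case $arg in
            /*) ;;                        # already absolute
            *)
                if [ -e "$arg" ]; then    # existing path: assume it is an input
                    arg=$PWD/$arg
                fi
                ;;
        esac
        set -- "$@" "$arg"
    done
    # Run the rewritten command from the output directory.
    ( cd "$out" && "$@" )
}
```

For example, run_with_abs_inputs "$OUTPUT_DIR" cp ./foo ./bar reads foo from the starting directory and writes bar into $OUTPUT_DIR. It only handles a single simple command, not a string like 'cp ...; mv ...'.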

Related

Add comments next to files in Linux

I'm interested in simply adding a comment next to my files in Linux (Ubuntu). An example would be:
info user ... my_data.csv Raw data which was sent to me.
info user ... my_data_cleaned.csv Raw data with duplicates filtered.
info user ... my_data_top10.csv Cleaned data with only top 10 values selected for each ID.
So sort of the way you can comment commits in Git. I don't particularly care about searching on these tags, filtering them etc. Just seeing them when I list files in a directory. Bonus if the comments/tags follow the document around as I copy or move it.
Most filesystem types support extended attributes where you could store comments.
So for example to create a comment on "foo.file":
xattr -w user.comment "This is a comment" foo.file
The attributes can be copied or moved with the file, but be aware that many utilities require special options to preserve extended attributes (with GNU cp, for example, --preserve=xattr).
Then, to list files with comments, use a script or program that reads the extended attribute. Here is a simple example to use as a starting point; it just lists the files in the current directory:
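#!/bin/sh
# List files in the current directory, appending user.comment when it is set.
# A glob is used instead of parsing ls output, so odd file names are safe.
for FILE in *; do
    [ -e "$FILE" ] || continue    # skip the literal * in an empty directory
    comment=$(xattr -p user.comment "$FILE" 2>/dev/null)
    if [ -n "$comment" ]; then
        echo "$FILE Comment: $comment"
    else
        echo "$FILE"
    fi
done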
The xattr command is really slow and poorly written (it doesn't even return error status) so I suggest something else if possible. Use setfattr and getfattr in a more complex script than what I have provided. Or maybe a custom ls command that is aware of the user.comment attribute.
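A variant using setfattr and getfattr might look like the sketch below. The function names set_comment and list_with_comments are made up for illustration, and it assumes the attr package is installed and the filesystem allows user.* attributes.

```shell
#!/bin/sh
# Hypothetical helpers built on setfattr/getfattr (from the "attr" package).

# set_comment FILE TEXT: store TEXT in the user.comment attribute of FILE.
set_comment() {
    setfattr -n user.comment -v "$2" -- "$1"
}

# list_with_comments [DIR]: list entries of DIR (default: current directory),
# appending the comment where one is set.
list_with_comments() {
    for file in "${1:-.}"/*; do
        [ -e "$file" ] || continue
        # getfattr fails when the attribute is missing; treat that as "no comment".
        comment=$(getfattr --only-values -n user.comment -- "$file" 2>/dev/null) || comment=
        name=${file##*/}
        if [ -n "$comment" ]; then
            printf '%s\tComment: %s\n' "$name" "$comment"
        else
            printf '%s\n' "$name"
        fi
    done
}
```

Usage would be along the lines of set_comment my_data.csv "Raw data which was sent to me." followed by list_with_comments.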
This is a moderately serious challenge. Basically, you want to add attributes to files, keep the attributes when the file is copied or moved, and then modify ls to display the values of these attributes.
So, here's how I would attack the problem.
1) Store the information in a SQLite database. You can probably get away with one table. The table should contain the complete path to the file, and your comment. I'd name the database something like ~/.dirinfo/dirinfo.db. I'd store it in a subfolder, because you may find later on that you need other information in this folder. It'd be nice to use inodes rather than pathnames, but they change too frequently. Still, you might be able to do something where you store both the inode and the pathname, and retrieve by pathname only if the retrieval by inode fails, in which case you'd then update the inode information.
2) Write a bash script to create/read/update/delete the comment for a given file.
3) Write another bash function or script that works with ls. I wouldn't call it "ls" though, because you don't want to mess with all the command line options that are available to ls. You're going to be calling ls always as ls -1 in your script, possibly with some sort options, such as -t and/or -r. Anyway, your script will call ls -1 and loop through the output, displaying the file name, and the comment, which you'll look up using the script from 2). You may also want to add file size, but that's up to you.
4) Write functions to replace mv and cp (and ln??). These would be wrapper functions that would update the information in your table, and then call the regular Unix versions of these commands, passing along any arguments received by the functions (i.e. "$@"). If you're really paranoid, you'd also do it for things like scp, which can be used (inefficiently) to copy files locally. Still, it's unlikely you'll catch all the possibilities. What if someone else does a mv on your file, who doesn't have the function you have? What if some script moves the file by calling /bin/mv? You can't easily get around these kinds of issues.
Or if you really wanted to get adventurous, you'd write some C/C++ code to do this. It'd be faster, and honestly not all that much more challenging, provided you understand fork() and exec(). SQLite is itself a C library, so it has a C API. You'd have to tangle with that, too, but since you only have one database, and one table, that shouldn't be too challenging.
You could do it in perl, too, but I'm not sure that it would be that much easier in perl, than in bash. Your actual code isn't that complex, and you're not likely to be doing any crazy regex stuff or string manipulations. There are just lots of small pieces to fit together.
Doing all of this is much more work than should be expected for a person answering a question here, but I've given you the overall design. Implementing it should be relatively easy if you follow the design above and can live with the constraints.
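Steps 1 and 2 of the design above could start from something like this sketch. It assumes the sqlite3 command-line tool is installed; the table name comments, the DIRINFO_DB variable and all function names are invented here, and the inode fallback is left out.

```shell
#!/bin/sh
# Assumed location, taken from the design above; override DIRINFO_DB to experiment.
DIRINFO_DB=${DIRINFO_DB:-$HOME/.dirinfo/dirinfo.db}

# Double any single quotes so the text is safe inside an SQL string literal.
sql_quote() {
    printf '%s' "$1" | sed "s/'/''/g"
}

# Print the absolute path of FILE (its directory must exist).
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Run one SQL statement, creating the database and table on first use.
dirinfo_sql() {
    mkdir -p "$(dirname "$DIRINFO_DB")" || return 1
    sqlite3 "$DIRINFO_DB" \
        "CREATE TABLE IF NOT EXISTS comments (path TEXT PRIMARY KEY, comment TEXT); $1"
}

comment_set() {      # comment_set FILE TEXT (create or update)
    path=$(abs_path "$1") || return 1
    dirinfo_sql "INSERT OR REPLACE INTO comments VALUES ('$(sql_quote "$path")', '$(sql_quote "$2")');"
}

comment_get() {      # comment_get FILE (read)
    path=$(abs_path "$1") || return 1
    dirinfo_sql "SELECT comment FROM comments WHERE path = '$(sql_quote "$path")';"
}

comment_delete() {   # comment_delete FILE (delete)
    path=$(abs_path "$1") || return 1
    dirinfo_sql "DELETE FROM comments WHERE path = '$(sql_quote "$path")';"
}
```

The listing script from step 3 would then call comment_get for each name it prints, and the mv/cp wrappers from step 4 would update the path column before calling the real command.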

File name multiple extensions order

I want to create some bash scripts. They're actually going to be build scripts for Scala, so I'm going to identify them with my own .bld extension. They will be a sort of sub type of a shell script. Hence I want them to be easily recognised as a shell script. Should I call them
ProjectA.bld.sh //or
ProjectA.sh.bld
Edit: My natural inclination would be to go for the former but .tar.gz files seem to follow the latter naming convention.
A shell script doesn't mind what you call it.
It just needs to:
be executable (chmod +x)
be in your PATH
contain a "shebang" as its first line, e.g. #!/bin/sh
The shebang determines which program is used to execute your script.
Call it ProjectA.bld.sh (or preferably buildProjectA.sh).
The .sh extension (although not necessary for the script to run) will allow you and everyone else to easily recognise it as a shell script.
While for the most part, naming conventions like this don't really matter at all to Unix/Linux, the usual convention is for the "extensions" to be in the order of the steps used to create the file. So, for example, a file named foo.tar.bz2.gpg.part01 would indicate a sequence of operations like the following:
Use tar to create foo.tar, which contains some other files
Use bzip2 to compress foo.tar into foo.tar.bz2
Use gnupg to encrypt foo.tar.bz2 into foo.tar.bz2.gpg
Use split or something similar to break the file into chunks for transmission/storage, resulting in one or more foo.tar.bz2.gpg.part* files.
The naming conventions are mostly just for human semantic meaning, though, and there's nothing stopping you from doing exactly the opposite, or even something completely random, except your own ability to remember exactly what you did...
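To see the names build up step by step, here is a shortened sketch of such a pipeline. It swaps gzip in for bzip2 and leaves out the gnupg step, purely to keep the example dependency-free; the function name pack_in_stages and the 1024-byte chunk size are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: pack_in_stages NAME FILE...
# Each step appends one "extension" to the name of the file it produces.
pack_in_stages() {
    name=$1
    shift
    tar -cf "$name.tar" "$@" || return 1      # NAME.tar
    gzip -f "$name.tar" || return 1           # NAME.tar.gz (replaces NAME.tar)
    # NAME.tar.gz.partaa, NAME.tar.gz.partab, ... in 1024-byte chunks
    split -b 1024 "$name.tar.gz" "$name.tar.gz.part" || return 1
}
```

Reading the extensions right to left tells you how to undo it: cat foo.tar.gz.part* | gzip -dc | tar -xf -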

Bash script ignores arguments when in /bin

I have this bash script that I can pass up to three arguments to. It works like a charm when I call it from the directory ./script -h but when I copy the same file to /bin and call it from anywhere with script -h, it seems to ignore the arguments passed.
Why? or maybe more importantly:
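What can I do to change that?
script is a very useful standard utility program which records a copy of your current terminal session (look for a file called typescript). It starts another shell, so you probably didn't notice it was running. When you type script -h from elsewhere, the shell is probably finding that utility earlier in your PATH than your copy in /bin.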
When you write a new program, use a naming convention, like script.sh.
Edit:
If you don't like using a file suffix (because it looks too much like Windows) then fine, but use some other naming convention which will ensure your script names do not clash with existing commands. test is another favorite, for example. You can use type to check a command, but that only checks your current environment, you might still have a name collision when running from a different username, for example.
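A quick check along those lines, as a sketch. The function name name_is_free is made up, and as noted above it only tells you about the environment it runs in.

```shell
#!/bin/sh
# Hypothetical helper: name_is_free NAME
# Succeeds if NAME is not an alias, function, builtin or program on PATH here.
name_is_free() {
    if type "$1" >/dev/null 2>&1; then
        echo "'$1' already exists: $(type "$1")" >&2
        return 1
    fi
    echo "'$1' is not taken in this environment"
}
```

Running name_is_free script or name_is_free test before choosing a name would have flagged the clash.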

Getting linux terminal value from my application

I am developing a Qt application in Linux. I wanted to pass Linux commands to a terminal. That worked but now I also want to get a response from the terminal for this specific command.
For example,
ls -a
As you know this command lists the directories and files of the current working directory. I now want to pass the returned values from the ls call to my application. What is a correct way to do this?
QProcess is the qt class that will let you spawn a process and read the result. There's an example of usage for reading the result of a command on that page.
popen(), part of the standard C library on Linux, returns a FILE * that you can read from like an ordinary file stream; that may help.
Parsing ls(1) output is dangerous -- make a few files with funny names in a directory and test it out:
touch "one file"
touch "$(printf '\n\n\nhello\n world')"
That creates two files in the current working directory. I expect your attempts to parse ls(1) output won't work. This might be alright if you're showing the results to a human, (though a human will be immensely confused if a filename includes output that looks just like ls(1) output!) but if you're trying to present something like an explorer.exe or Finder.app representation of files in the filesystem, this is horribly broken.
Instead, use opendir(3), readdir(3), and closedir(3) to read directory entries yourself. This will be safer, more portable, and (as a side benefit) slightly better performing.
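The same point can be shown from the shell side, where a glob plays the role of readdir(3) by handing over one whole name per entry. This is only a sketch to make the failure mode visible (the function name count_entries is invented); in the Qt application itself you would use the directory APIs rather than a shell.

```shell
#!/bin/sh
# Hypothetical helper: count_entries DIR
# Counts directory entries without ever splitting names on whitespace or newlines.
count_entries() {
    n=0
    for entry in "$1"/* "$1"/.[!.]*; do
        # Skip unmatched glob patterns; keep dangling symlinks.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}
```

In a directory holding the two files created above, count_entries reports 2, whereas counting the lines of ls output gives a larger number because one name contains newlines.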

change shell directory from a script?

I want to make a script (to) that makes it easier for me to enter folders.
So, e.g., if I type "to apache" I want it to change the current directory to /etc/apache2.
However, when I use the "cd" command inside the script, it seems to change the directory only WITHIN the script, so the directory of the shell has not changed.
How could I make this work?
Use an alias or function, or source the script instead of executing it.
BASH FAQ entry #60.
Use a function:
to_apache(){
    cd /etc/apache2
}
Put it in a file, e.g. mylibrary.sh, and source the file whenever you want to use it, e.g.:
#!/bin/bash
source /path/mylibrary.sh
to_apache
As Ignacio said, make it into a function (or perhaps an alias).
The way I tend to do it is have a shell script that creates the function - and the script and the function have the same name. Then once at some point in time, I will source the script ('. funcname') and thereafter I can simply use the function.
I tend to prefer functions to aliases; it is easier to manage arguments etc.
Also, for the specific case of changing directories, I use CDPATH. The trick with using CDPATH is to have the empty entry at the start:
export CDPATH=:/work4/jleffler:/u/jleffler:/work4/jleffler/src:\
/work4/jleffler/src/perl:/work4/jleffler/src/sqltools:/work4/jleffler/lib:\
/work4/jleffler/doc:/u/jleffler/mail:/work4/jleffler/work:/work4/jleffler/ids
On this machine, my main home directory is /work4/jleffler. I can get to most of the relevant sub-directories in one go with 'cd whatever'.
If you don't put the empty entry (or an explicit '.') first, then you can't 'cd' into a sub-directory of the current directory, which is disconcerting at least.
Ignacio Vazquez-Abrams gave a link to what probably answers the question, although I didn't really follow it. The short answer is to use either "source" or a single dot before the command, eg:
. to apache
But I found there are downsides to this if you have a more complicated script. It seems that the original script filename variable ($0) is lost. I see "-bash" instead, so your script can't echo error text that would include the full filename.
Also, you can't use the "exit" command, or your shell will exit (especially disconcerting over ssh).
Also, the "basename" command gives an error if you use it on $0, presumably because "-bash" looks like an option to it.
So, it seems to me that a function might be the only way to get around some of these problems, like if you are passing parameters.
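Putting the answers together, a function version of "to" could look like this sketch. Only the apache entry comes from the question; the tmp shortcut is a made-up extra to show how the table grows.

```shell
#!/bin/sh
# Define this in ~/.bashrc or another file you source, not in a script you execute.
to() {
    case ${1-} in
        apache) dir=/etc/apache2 ;;
        tmp)    dir=/tmp ;;             # hypothetical extra shortcut
        *)
            echo "to: unknown shortcut '${1-}'" >&2
            return 1                    # return, not exit: exit would close the shell
            ;;
    esac
    cd "$dir" || return 1
}
```

Because a function runs inside the current shell, the cd sticks, parameters arrive as $1 as usual, and return avoids the problem with exit mentioned above.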
