Create symlink from the contents of a file - Linux

Due to an inefficient workflow, I have to copy directories between a Linux machine and a Windows machine. The directories contain symlinks which (after copying Linux > Windows > Linux) contain the link target in plain text (e.g. foobar.C contains the text ../../../Foo/Bar/foobar.C).
Is there an efficient way to recreate the symlinks from the contents of the file recursively for a complete directory?
I have tried:
find . | xargs ln -s ??A?? ??B?? && mv ??B?? ??A??
where I really have no idea how to populate the variables, but ??A?? should be the symlink's destination from the file and ??B?? should be the name of the file with the suffix _temp appended.

If you are certain that all the files contain a symlink, it's not very hard.
find . -print0 | xargs -r0 sh -c '
for f; do ln -s "$(cat "$f")" "${f}_temp" && mv "${f}_temp" "$f"; done' _
The _ dummy argument is necessary because the first argument after the script given to sh -c is used to populate $0 in the subshell. The shell itself is necessary because you cannot directly pass multiple commands to xargs.
The -print0 and the corresponding xargs -0 (combined with -r above as -r0) are GNU extensions that correctly cope with tricky file names. See the find manual for details.
I would perhaps add a simple verification check before proceeding with the symlinking; for example, if grep -cm2 . on the file returns 2, skip the file (it contains more than one line of text). If you can be more specific (say, all the symlinks begin with ../) by all means be more specific.
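A minimal sketch of that check folded into the loop above (assuming GNU grep; -cm2 counts non-empty lines but stops after two, so anything other than exactly one line is skipped):
find . -print0 | xargs -r0 sh -c '
for f; do
    # Skip files that do not contain exactly one non-empty line.
    [ "$(grep -cm2 . "$f")" = 1 ] || continue
    ln -s "$(cat "$f")" "${f}_temp" && mv "${f}_temp" "$f"
done' _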

Related

No such file or directory when piping. Each command works separately, but not when piping

I have 2 folders: folder_a & folder_b. In each of these folders there are a bunch of files. I am trying to use sed to move all of these files out of these folders and into my current working directory.
My folder structure looks like this:
mytest:
  a:
    1.txt
    2.txt
    3.txt
  b:
    4.txt
    5.txt
The command I am trying to use is:
find . -type d ! -iname '*.*' # find all folders other than root
| sed -r 's/.*/&\/*/' # add '/*' to each of the arguments
| sed -r 'p;s/.*/./' # output: a/* . b/* .
| xargs -n 2 mv # should be creating two commands: 'mv a/* .' and 'mv b/* .'
Unfortunately I get an error:
mv: cannot stat './aaa/*': No such file or directory
I also get the same error when I try this other strategy (using ls instead of mv):
for dir in */; do
ls $dir;
done;
Even if I use sed to replace the spaces in each directory name with '\ ', or surround the directory names with quotes, I get the same error.
I'm not sure whether these 2 examples stem from the same misunderstanding of bash, but they both seem to demonstrate my ignorance of how bash translates the output of one command into the input of another.
Can anyone shed some light on this?
Update: Completely rewritten.
As @EtanReisner and @melpomene have noted, mv */* . or, more specifically, mv a/* b/* . is the most straightforward solution, but you state that this is in part a learning exercise, so the remainder of the answer shows an efficient find-based solution and explains the problem with the original command.
An efficient find-based solution
Generally, if feasible, it's best and most efficient to let find itself do the work, without involving additional tools; find's -exec action is like a built-in xargs, with {} representing the path at hand (with terminator \;) / all paths (with +):
find . -type f -exec echo mv -t . {} +
To be safe, this will just print the mv commands that would be executed; remove the echo to actually execute them.
This will execute a single[1] mv command to which all matching files are passed, and -t . moves them all to the current dir.
[1] If the resulting command line is too long (which is unlikely), it is split up into multiple commands, just as with xargs.
Operating on files (-type f) bypasses the need for globbing, as find will then enumerate all files for you (it also bypasses the need to exclude . explicitly).
Note that this solution works on entire subtrees, not just (immediate) subdirectories.
It's tempting to consider turning on Bash 4's globstar option and using mv */** ., but that won't work, because it will attempt to move directories as well, not just the files in them.
A caveat re -exec with +: it only works if {} - the placeholder for all paths - is the token immediately before the +.
Since you're on Linux, we can satisfy this condition by specifying the target folder for mv with option -t before the {}; on BSD-based systems such as OSX, you could not do that, because mv doesn't support -t there, so you'd have to use terminator \;, which means that mv is called once for every path, which is obviously much slower.
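For reference, a sketch of that slower BSD-compatible form (echo included for safety, as above; with \; the {} placeholder may precede the target, because each path is passed individually):
find . -type f -exec echo mv {} . \;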
Why your command didn't work:
As @EtanReisner points out in a comment, xargs invokes the command specified without (implicitly) involving a shell, so globbing won't work; you can verify this with the following command:
echo '*' | xargs echo # -> '*' - NO globbing
If we leave the globbing issue aside, additional work would have been necessary to make your xargs command work correctly with folder names with embedded spaces (or other shell metacharacters):
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -n 2 echo mv # NOTE: still won't work due to lack of globbing
Note how the (combined) sed command now produces a single output line '<input-path>'/* ., with the input path enclosed in embedded single-quotes, which is required for xargs to recognize <input-path> as a single argument, even if it contains embedded spaces.
(If your filenames contain single-quotes, you'd have to do more work; also note that since now all arguments for a given dir. are on a single line, you could use xargs -L 1 ....)
Also note how -mindepth 1 (only process paths at the subdirectory level or below) is used to skip processing of . itself.
The only way to make globbing happen is to get the shell involved:
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -I {} sh -c 'echo mv {}' # works, but is inefficient
Note the use of xargs' -I option to treat each input line as its own argument ({} is a self-chosen placeholder for the input).
sh -c invokes the (default) shell to execute the resulting command, at which globbing does happen.
However, overall, this is quite inefficient:
A pipeline with 3 segments is used.
A shell instance is invoked for every input path, which in turn calls the mv utility.
Compare this to the efficient find-only solution above, which (typically) creates only 2 processes in total.
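As an aside, the quoting gymnastics can be avoided entirely by letting find invoke the shell itself and pass each directory path as an argument - a sketch (echo included for safety; globbing works because the /* part is unquoted, but this still spawns one shell per directory, so the find-only solution above remains preferable):
find . -mindepth 1 -type d -exec sh -c 'echo mv "$1"/* .' _ {} \;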

Removing Colons From Multiple Files on Linux

I am trying to take some directories and transfer them from Linux to Windows. The problem is that the files on Linux have colons in them, and I need to copy these directories (I cannot alter them directly, since they are needed as they are on the server) over to files with a name that Windows can use. For example, the name of a directory on the server might be:
IAPLTR2b-ERVK-LTR_chr9:113137544-113137860_-
while I need it to be:
IAPLTR2b-ERVK-LTR_chr9-113137544-113137860_-
I have about sixty of these directories and I have collected the names of the files with their absolute paths in a file I call directories.txt. I need to walk through this file changing the colons to hyphens. Thus far, my attempt is this:
#!/bin/bash
$DIRECTORIES=`cat directories.txt`
for $i in $DIRECTORIES;
do
cp -r "$DIRECTORIES" "`echo $DIRECTORIES | sed 's/:/-/'`"
done
However I get the error:
./my_shellscript.sh: line 10: =/bigpartition1/JKim_Test/test_bs_1/129c-test-biq/IAPLTR1_Mm-ERVK-LTR_chr10:104272652-104273004_+.fasta: No such file or directory ./my_shellscript.sh: line 14: `$i': not a valid identifier
Can anyone here help me identify what I am doing wrong and maybe what I need to do?
Thanks in advance.
This monstrosity will rename the directories in situ:
find tmp -depth -type d -exec sh -c '[ -d "$1" ] && echo mv "$1" "$(printf %s "$1" | tr : -)"' _ {} \;
I use -depth so it descends down into the deepest subdirectories first.
The [ -d "{}" ] is necessary because as soon as the subdirectory is renamed, its parent directory (as found by find) may no longer exist (having been renamed).
Change "echo mv" to "mv" if you're satisfied it will do what you want.

Synchronize content of directories in Linux

Let's assume I have the following source directory
source/
  subdir1/file1
  subdir1/file2
  subdir2/file3
  subdir3/file4
and the following target directory
target
  subdir1/file5
  subdir2/file6
  subdir4/file7
I would like to move the content of the source subdirectories into the matching target subdirectories, so the result looks like this
target
  subdir1/file1
  subdir1/file2
  subdir1/file5
  subdir2/file6
  subdir2/file3
  subdir3/file4
  subdir4/file7
Is there some Linux command to do this or must I write a script myself?
To summarize, it is important to move, not copy. That rules out cp and rsync but allows mv. mv, however, has the issue that it is not good at merging the old directory into the new.
In the examples that you gave, the target directory had the complete directory tree but lacked files. If that is the case, try:
cd /source ; find . -type f -exec sh -c 'mv "$1" "/target/$1"' _ {} \;
The above starts by selecting the source as the current directory with cd /source. Next, we use find which is the usual *nix utility for finding files/directories. In this case, give find the -type f option to tell it to look only for files. With the -exec option, we tell it to move any such files found to the target directory.
You have choices for how to deal with conflicts between the two directory trees. You can give mv the -f option and it will overwrite files in the target without asking, or you can give it the -n option and it will never overwrite a target file, or your can give it the -i option and it will ask you each time.
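For example, to move everything but never overwrite an existing target file, the earlier command becomes (a sketch using -n):
cd /source ; find . -type f -exec sh -c 'mv -n "$1" "/target/$1"' _ {} \;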
In case the target directory tree is incomplete
If the target directory tree is missing some directories that are in the source, then we have to create them on the fly. This adds just a minor complication:
cd /source ; find . -type f -exec sh -c 'mkdir -p "/target/${1%/*}"; mv "$1" "/target/$1"' _ {} \;
The mkdir -p command assures that the directory we want exists before we try to move the file there.
Additional notes
The form ${1%/*} is an example of one of the shell's powerful features, called "parameter expansion". This particular feature is suffix removal. In general, it looks like ${parameter%word}, which tells bash to expand word and remove the shortest match from the end of parameter. In our case, the name of the parameter is 1, meaning the first argument to the script. We want to remove the file name and leave behind just the directory that the file is in, so the word /* tells the shell to remove the last slash and any characters which follow.
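A quick illustration of that expansion, using a plain variable f to stand in for $1 and one of the sample paths from the question:
f=./subdir1/file1
echo "${f%/*}"    # -> ./subdir1 (the /* suffix, i.e. the file name, is removed)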
The commands above use both single and double quotes. They have to be copied exactly for the command to work properly.
To synchronize directories you can also use rsync.
Example:
rsync -avzh source/ target/
More info: man rsync
To move (rather than copy):
rsync --remove-source-files -avzh source/ target/

Get grandparent directory in bash script - rename files for a directory in their paths

I have the following script, which I normally use when I get a bunch of files that need to be renamed to the directory name which contains them.
The problem now is I need to rename the file to the directory two levels up. How can I get the grandparent directory to make this work?
With the following I get errors like this example:
"mv: cannot move ./48711/zoom/zoom.jpg to ./48711/zoom/./48711/zoom.jpg: No such file or directory". This is running on CentOS 5.6.
I want the final file to be named: 48711.jpg
#!/bin/bash
function dirnametofilename() {
for f in $*; do
bn=$(basename "$f")
ext="${bn##*.}"
filepath=$(dirname "$f")
dirname=$(basename "$filepath")
mv "$f" "$filepath/$dirname.$ext"
done
}
export -f dirnametofilename
find . -name "*.jpg" -exec bash -c 'dirnametofilename "{}"' \;
find .
Another method could be to use
(cd ../../; pwd)
If this were executed in any top-level paths such as /, /usr/, or /usr/share/, you would get a valid directory of /, but when you get one level deeper, you would start seeing results: /usr/share/man/ would return /usr, /my/super/deep/path/is/awesome/ would return /my/super/deep/path, and so on.
You could store this in a variable as well:
GRANDDADDY="$(cd ../../; pwd)"
and then use it for the rest of your script.
Assuming filepath doesn't end in /, which it shouldn't if you use dirname, you can do
parent="${filepath%/*}"
grandparent="${filepath%/*/*}"
So do something like this
[[ "${filepath%/*/*}" == "" ]] && echo "Path isn't long enough" || echo "${filepath%/*/*}"
Also, this likely won't work if you're using relative paths (like find .), in which case you will want to use
filepath=$(dirname "$f")
filepath=$(readlink -f "$filepath")
instead of
filepath=$(dirname "$f")
Also you're never stripping the extension, so there is no reason to get it from the file and then append it again.
Note:
* This answer solves the OP's specific problem, in whose context "grandparent directory" means: the parent directory of the directory containing a file (it is the grandparent path from the file's perspective).
* By contrast, given the question's generic title, other answers here focus (only) on getting a directory's grandparent directory; the succinct answer to the generic question is: grandParentDir=$(cd ../..; printf %s "$PWD") to get the full path, and grandParentDirName=$(cd ../..; basename -- "$PWD") to get the dir. name only.
Try the following:
find . -name '*.jpg' \
-execdir bash -c \
'old="$1"; new="$(cd ..; basename -- "$PWD").${old##*.}"; echo mv "$old" "$new"' - {} \;
Note: echo was prepended to mv to be safe - remove it to perform the actual renaming.
-execdir ... \; executes the specified command in the directory that contains each matching file and expands {} to that file's name.
bash -c is used to execute a small ad-hoc script:
$(cd ..; basename -- "$PWD") determines the parent directory name of the directory containing the file, which is the grandparent path from the file's perspective.
${old##*.} is a Bash parameter expansion that returns the input filename's suffix (extension).
Note how {} - the filename at hand - is passed as the 2nd argument to the command in order to bind to $1, because bash -c uses the 1st one to set $0 (which is set to dummy value _ here).
Note that each file is merely renamed, i.e., it stays in its original directory.
Caveat:
Each directory with a matching file should only contain 1 matching file, otherwise multiple files will be renamed to the same target name in sequence - effectively, only the last file renamed will survive.
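If you want to check for such collisions up front, GNU find can list every directory that contains more than one matching file (a sketch; %h prints the directory part of each match):
find . -name '*.jpg' -printf '%h\n' | sort | uniq -d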
Can't you use realpath ../../ or readlink -f ../../ ? See this, readlink(1), realpath(3), canonicalize_file_name(3), and realpath(1). You may want to install the realpath package on Debian or Ubuntu. Probably CentOS has an equivalent package. (readlink should always be available, it is in GNU coreutils)

Unix: traverse a directory

I need to traverse a directory, starting in one directory and going deeper into different subdirectories. However, I also need access to each individual file in order to modify it. Is there already a command to do this, or will I have to write a script? Could someone provide some code to help me with this task? Thanks.
The find command is just the tool for that. Its -exec flag or -print0 in combination with xargs -0 allows fine-grained control over what to do with each file.
Example: Replace all foo's by bar's in all files in /tmp and subdirectories.
find /tmp -type f -exec sed -i -e 's/foo/bar/g' '{}' ';'
for i in $(find .); do
    if [ -d "$i" ]; then : ; fi   # do something with a directory
    if [ -f "$i" ]; then : ; fi   # do something with a file, etc.
done
This will walk the whole tree (recursively) under the current directory and feed every entry to the loop. Note, though, that this word-splitting form breaks on names containing whitespace.
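A whitespace-safe variant of the same loop, reusing the -print0 idea from the first answer above (this form requires bash, for process substitution and read -d ''):
while IFS= read -r -d '' i; do
    if [ -d "$i" ]; then : ; fi   # do something with a directory
    if [ -f "$i" ]; then : ; fi   # do something with a file, etc.
done < <(find . -print0)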
This can be easily achieved by combining find, xargs, and sed (or another file-modification command).
For example:
$ find /path/to/base/dir -type f -name '*.properties' | xargs sed -i -e '/^#/d'
This will filter all files with file extension .properties.
The xargs command will feed the file paths generated by the find command into the sed command.
The sed command will delete all lines starting with # in those files.
Combining commands in this way is very flexible.
For example, the find command has many parameters, so you can filter by user name, file size, file path (e.g. under a /test/ subfolder), or file modification time.
Another dimension of flexibility is how and what to change in your files. For example, sed allows you to modify files by applying substitutions (specified via regular expressions). Similarly, you could use gzip to compress the files. And so on...
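For instance, a sketch of the gzip variant just mentioned (the path and the *.log pattern are made-up placeholders):
find /path/to/base/dir -type f -name '*.log' | xargs gzip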
You would usually use the find command. On Linux you have the GNU version, of course; it has many extra (and useful) options. Either version will allow you to execute a command (e.g. a shell script) on the files as they are found.
The exact details of how to make changes to the file depend on the change you want to make to the file. That is probably best scripted, with find running the script:
POSIX or GNU:
find . -type f -exec your_script '{}' +
This will run your script once for a group of files with those names provided as arguments. If you want to do it one file at a time, replace the + with ';' (or \;).
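A trivial stand-in for your_script (hypothetical, just to show how the grouped file names arrive as arguments):
#!/bin/sh
# your_script: each file found by find is passed as a positional argument.
for f in "$@"; do
    printf 'processing %s\n' "$f"
done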
I am assuming SearchMe is the example directory name you need to traverse completely.
I am also assuming, since it was not specified, that the files you want to modify are all text files. Is this correct?
In such scenario I would suggest using the command:
find SearchMe -type f -exec vi {} \;
If you are not familiar with vi editor, just use another one (nano, emacs, kate, kwrite, gedit, etc.) and it should work as well.
Bash 4+
shopt -s globstar

for file in **; do
    if [ -f "$file" ]; then
        : # do some processing to your file here,
          # where the find command can't do it conveniently
    fi
done
