Linux script variables to SCP and delete files - linux

I am looking to set up a script to do the following:
1st: SCP a directory on the first day of month to another server
2nd: Delete the directory after successful transfer
The directory I need to move will always have a different name, and the lowest numbered one is always the one that needs to move:
2018/files/02/
2018/files/03/
So what I'm looking to write is something like:
scp /2018/files/% user@host:/backups/2018/files/
{where % = lowest num} &&
rm -rf /2018/files/%
{where % = lowest num} &&
exit
Thanks for any advice

If you are open to using Ruby, you could accomplish it with something like this:
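def file_number(filespec)
  filespec.split('/').last.to_i
end
# The trailing /* lists the entries inside /2018/files
directories = Dir['/2018/files/*'].select { |f| File.directory?(f) }
# sort_by takes a one-argument block that returns the sort key
sorted_dirs = directories.sort_by { |dir| file_number(dir) }
dir_to_copy = sorted_dirs.first
destination_dir = File.join('/', 'backups', File.dirname(dir_to_copy))
# scp needs -r to copy a directory; delete only if the copy succeeded
if system('scp', '-r', dir_to_copy, "user@host:#{destination_dir}")
  system('rm', '-rf', dir_to_copy)
end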
I have not tested this, but if you have any problems, let me know what they are and I can work through it with you.
While using shell scripting eliminates the need for the Ruby interpreter, to me the code is not nearly as straightforward.
In very large directory lists (maybe 10,000's?) the sort might be intolerably slow, and another method would be needed to optimize for speed.
I would caution you against doing an unconditional rm -rf after the backup -- that seems really risky to me.

The big challenge here is to actually find the right files to copy, and shudder, delete. So let us call that step 0.
Let's start with some boilerplate:
sourceD=/2018/files/
targetD=/backups/2018/files/
And a little assertion, which bails out from the script if $1 does not equate to a directory.
assert_directory() { (cd ${1:?directory name}) || exit; }
step 0: Identify directory:
assert_directory $sourceD
to_be_archived=$(
# source must be two characters, hence "??"
# source must be a directory, hence trailing "/"
# glob matches are sorted; set -- stores them in $1, $2, ...
# First match must be our source
set -- $sourceD/??/ &&
assert_directory "$1"
echo ${1:?nothing found}
) || exit
This is only a couple of lines of condensed code. Note that this may
cause trouble if you (accidentally) run this multiple times in a row.
Step 1, Copy files now appears to be the easy part.
scp -r ${to_be_archived:?} user@host:${targetD:?}
This is a simple method for copying files, but also slow and risky.
Look up rsync over ssh for alternatives.
Step 2, Remove
The rm -fr line will do the job, but I won't include that here.
We are missing an essential step, as we need to make sure that our
files have arrived safely. Again, rsync has options for that.
In summary:
assert_directory() { (cd ${1:?directory name}) || exit; }
assert_directory $sourceD
to_be_archived=$(
set -- $sourceD/??/ &&
assert_directory "$1"
echo ${1:?nothing found}
) || exit
This will give you the first two-character name directory (if one exists) in sourceD or abort the running script. It will break if $sourceD contains spaces.
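A minimal sketch of the same selection step written as a function, assuming the subdirectories are always named with two digits as in the question. The name lowest_dir is made up for illustration. It relies on the shell expanding globs in sorted order, and it quotes its argument so that spaces in the path are tolerated.

```shell
#!/bin/bash
# Hypothetical helper: print the lowest-numbered two-digit subdirectory
# of the directory given as $1, or return non-zero if there is none.
lowest_dir() {
    local base="${1%/}" d
    for d in "$base"/[0-9][0-9]/; do
        [ -d "$d" ] || continue     # an unmatched glob stays literal
        printf '%s\n' "${d%/}"      # first match is the lowest name
        return 0
    done
    return 1
}

# Possible use, with the host and paths from the question:
#   dir=$(lowest_dir /2018/files) &&
#   scp -r "$dir" user@host:/backups/2018/files/ &&
#   rm -rf -- "$dir"
```

Chaining with && means the rm only runs when scp reports success, which still does not prove that the remote copy is complete.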

Related

Using bash to loop through nested folders to run script in current working directory

I've got (what feels like) a fairly simple problem but my complete lack of experience in bash has left me stumped. I've spent all day trying to synthesize a script from many different SO threads explaining how to do specific things with unintuitive commands, but I can't figure out how to make them work together for the life of me.
Here is my situation: I've got a directory full of nested folders each containing a file with extension .7 and another file with extension .pc, plus a whole bunch of unrelated stuff. It looks like this:
Folder A
    Folder 1
        Folder x
            data_01.7
            helper_01.pc
            ...
        Folder y
            data_02.7
            helper_02.pc
            ...
        ...
    Folder 2
        Folder z
            data_03.7
            helper_03.pc
            ...
        ...
Folder B
    ...
I've got a script that I need to run in each of these folders that takes in the name of the .7 file as an input.
pc_script -f data.7 -flag1 -other_flags
The current working directory needs to be the folder with the .7 file when running the script and the helper.pc file also needs to be present in it. After the script is finished running, there are a ton of new files and directories. However, I need to take just one of those output files, result.h5, and copy it to a new directory maintaining the same folder structure but with a new name:
Result Folder/Folder A/Folder 1/Folder x/new_result1.h5
I then need to run the same script again with a different flag, flag2, and copy the new version of that output file to the same result directory with a different name, new_result2.h5.
The folders all have pretty arbitrary names, though there aren't any spaces or special characters beyond underscores.
Here is an example of what I've tried:
#!/bin/bash
DIR=".../project/data"
for d in */ ; do
for e in */ ; do
for f in */ ; do
for PFILE in *.7 ; do
echo "$d/$e/$f/$PFILE"
cd "$DIR/$d/$e/$f"
echo "Performing operation 1"
pc_script -f "$PFILE" -flag1
mkdir -p ".../results/$d/$e/$f"
mv "results.h5" ".../project/results/$d/$e/$f/new_results1.h5"
echo "Performing operation 2"
pc_script -f "$PFILE" -flag 2
mv "results.h5" ".../project/results/$d/$e/$f/new_results2.h5"
done
done
done
done
Obviously, this didn't work. I've also tried using find with -execdir but then I couldn't figure out how to insert the name of the file into the script flag. I'd appreciate any help or suggestions on how to carry this out.
Another, perhaps more flexible, approach to the problem is to use the find command with the -exec option to run a short "helper-script" for each file found below a directory path that ends in ".7". The -name option allows find to locate all files ending in ".7" below a given directory using simple file-globbing (wildcards). The helper-script then performs the same operation on each file found by find and handles moving the result.h5 to the proper directory.
The form of the command will be:
find /path/to/search -type f -name "*.7" -exec /path/to/helper-script '{}' \;
Where the -type f option tells find to only return files (not directories), and -name "*.7" restricts the results to names ending in ".7". Your helper-script needs to be executable (e.g. chmod +x helper-script) and unless it is in your PATH, you must provide the full path to the script in the find command. The '{}' will be replaced by the filename (including relative path) and passed as an argument to your helper-script. The \; simply terminates the command executed by -exec.
(note there is another form for -exec called -execdir and another terminator '+' that can be used to process the command on all files in a given directory -- that is a bit safer, but has additional PATH requirements for the command being run. Since you have only one ".7" file per-directory -- there isn't much benefit here)
The helper-script just does what you need to do in each directory. Based on your description it could be something like the following:
#!/bin/bash
dir="${1%/*}" ## trim file.7 from end of path
cd "$dir" || { ## change to directory or handle error
printf "unable to change to directory %s\n" "$dir" >&2
exit 1
}
destdir="/Result_Folder/$dir" ## set destination dir for result.h5
mkdir -p "$destdir" || { ## create with all parent dirs or exit
printf "unable to create directory %s\n" "$destdir" >&2
exit 1
}
ls *.pc >/dev/null 2>&1 || exit 1 ## check .pc file exists or exit
file7="${1##*/}" ## trim path from file.7 name
pc_script -f "$file7" -flag1 -other_flags ## first run
## check result.h5 exists and non-empty and copy to destdir
[ -s "result.h5" ] && cp -a "result.h5" "$destdir/new_result1.h5"
pc_script -f "$file7" -flag2 -other_flags ## second run
## check result.h5 exists and non-empty and copy to destdir
[ -s "result.h5" ] && cp -a "result.h5" "$destdir/new_result2.h5"
Which essentially stores the path part of the file.7 argument in dir and changes to that directory. If unable to change to the directory (due to read-permissions, etc..) the error is handled and the script exits. Next the full directory structure is created below your Result_Folder with mkdir -p with the same error handling if the directory cannot be created.
ls is used as a simple check to verify that a file ending in ".pc" exists in that directory. There are other ways to do this by piping the results to wc -l, but that spawns additional subshells that are best avoided.
(also note that Linux and Mac have files ending in ".pc" for use by pkg-config used when building programs from source -- they should not conflict with your files -- but be aware they exist in case you start chasing why weird ".pc" files are found)
After all tests are performed, the path is trimmed from the current ".7" filename storing just the filename in file7. The file7 variable is then used in your pc_script command (which should also include the full path to the script if not in your PATH). After the pc_script is run [ -s "result.h5" ] is used to verify that result.h5 exists and is non-empty before copying that file to your Result_Folder location.
That should get you started. Using find to locate all .7 files is a simple way to let the tool designed to find the files for you do its job -- rather than trying to hand-roll your own solution. That way you only have to concentrate on what should be done for each file found. (note: I don't have pc_script or the files, so I have not tested this end-to-end, but it should be very close if not right-on-the-money)
There is nothing wrong in writing your own routine, but using find eliminates a lot of area where bugs can hide in your own solution.
Let me know if you have further questions.
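Before pointing find at the real helper, it can be useful to preview what each invocation would receive. The following is only a sketch: show_targets is a hypothetical name, and an inline sh -c stands in for the helper-script, applying the same two parameter expansions to the path that find passes in.

```shell
#!/bin/bash
# Hypothetical dry run: for every *.7 file below the directory in $1,
# print the directory the helper would cd into and the bare file name
# it would hand to pc_script. Nothing is executed or copied.
show_targets() {
    find "$1" -type f -name '*.7' -exec sh -c '
        dir=${1%/*}      # strip the shortest trailing "/name" part
        file=${1##*/}    # strip the longest leading "path/" part
        printf "%s -> %s\n" "$dir" "$file"
    ' sh {} \;
}

# Example: show_targets /path/to/search
```

The extra "sh" after the quoted script fills $0, so the path substituted for {} arrives as $1.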

How can I stop my script to overwrite existing files

I have been learning bash for 6 days, and I think I've got some of the basics.
Anyway, for the wallpapers downloaded from Variety I've written two scripts. One of them moves downloaded photos older than 12 days to a folder and renames them all as "Aday 1,2,3..." and the other lets me select these and moves them to another folder and removes photos I didn't select. The 1st script works just as I intended; my question is about the other.
I think I should write the script down to better explain my problem
Script:
#!/bin/bash
#Move victors of 'Seçme-Eleme' to 'Kazananlar'
cd /home/eurydice/Bulunur\ Bir\ Şeyler/Dosyamsılar/Seçme-Eleme
echo "Select victors"
read vct
for i in $vct; do
mv -i "Aday $i.png" /home/eurydice/"Bulunur Bir Şeyler"/Dosyamsılar/Kazananlar/"Bahar $RANDOM.png" ;
mv -i "Aday $i.jpg" /home/eurydice/"Bulunur Bir Şeyler"/Dosyamsılar/Kazananlar/"Bahar $RANDOM.jpg" ;
done
#Now let's remove the rest
rm /home/eurydice/Bulunur\ Bir\ Şeyler/Dosyamsılar/Seçme-Eleme/*
In this script I originally intended to define another variable (let's call it "n"), which I did by copying and adapting the variable from the first script. It was something like this:
for i in $vct; do
n=1
mv "Aday $i.png" /home/eurydice/"Bulunur Bir Şeyler"/Dosyamsılar/Kazananlar/"Bahar $n.png" ;
mv "Aday $i.jpg" /home/eurydice/"Bulunur Bir Şeyler"/Dosyamsılar/Kazananlar/"Bahar $n.jpg" ;
n=$((n+1))
done
The first time I ran it, the script worked just as I intended. However, in my 2nd test run this script overwrote the files that already existed. For example, in the 1st run I had 5 files named "Bahar 1,2,3,4,5", and the 2nd time I chose 3 files to add. I wanted their names to be "Bahar 6,7,8" but instead, my script made them the new 1, 2 and 3. I tried many solutions and when I couldn't fix that I just assigned random numbers to them.
Is there a way to make this script work as I intended?
This command finds the largest trailing number used in the file names in the current directory. If no matching file is found, the number is set to 0.
biggest_number=$(ls -1 | sed -n 's/^[^0-9]*\([0-9]\+\)\(\.[a-zA-Z]\+\)\?$/\1/p' | sort -r -g | head -n 1)
[[ ! -z "$biggest_number" ]] || biggest_number=0
The regex in sed command assumes that there is no digit in filenames before the trailing number intended for increment.
As soon as you have found the biggest number, you can use it to start your loop to prevent overwrites.
n=$((biggest_number+1))
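The same idea can be sketched without parsing the output of ls, by letting the shell glob the files and trimming each name with parameter expansion. This assumes the files are named "prefix number.ext" (for example "Bahar 5.png"). The function name next_number and its arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the next free number for files named
# "<prefix> <number>.<ext>" in directory $1, with the prefix in $2.
next_number() {
    local dir="$1" prefix="$2" f n max=0
    for f in "$dir"/"$prefix "*.*; do
        [ -e "$f" ] || continue            # glob matched nothing
        n=${f##*/"$prefix "}               # drop directory and prefix
        n=${n%%.*}                         # drop the extension
        case $n in
            ''|*[!0-9]*) continue ;;       # skip non-numeric names
        esac
        # 10# forces base 10 so that "08" is not read as octal
        (( 10#$n > max )) && max=$((10#$n))
    done
    echo $((max + 1))
}

# Example: n=$(next_number "/path/to/Kazananlar" "Bahar")
```

Calling it once before the loop and incrementing n inside the loop keeps new files from colliding with existing ones.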

bash -- copying and change filename

I need to copy all files from
/dirA/[NAME].20151231.txt
to
/dirB/20151231.[NAME].txt
and
/dirC/20151231/[NAME].txt
i.e. I need to copy the files, but change the name.
You can assume that I know the "date" string before hand, so we can assume 20151231 is a supplied argument.
If I have a list of names, I can do something like
for n in $names; do cp "/dirA/$n.$date.txt" "/dirB/$date.$n.txt"; done
But what if I don't have a list of names? I am looking for an elegant solution, as extracting them from dirA sounds a bit cumbersome.
Thanks!
A reasonably reliable way of processing this material is:
date=20151231
cd /dirA || exit 1
mkdir -p "/dirC/$date" || exit 1
for file in *."$date.txt"
do
name="${file%.$date.txt}"
cp "$file" "/dirB/$date.$name.txt"
cp "$file" "/dirC/$date/$name.txt"
done
The cd operation is checked; if it fails, there is no point in continuing. Likewise, the mkdir -p operation ensures that the dated directory under /dirC exists or exits. The relevant error messages were already generated by cd and mkdir.
Using the shell globbing to generate the file names is best; it avoids issues with 'what happens if the file name contains spaces (or newlines, or other unexpected characters)'.
The assignment extracts the '[NAME]' portion of the file name. This is then used to copy the file from /dirA to the relevant locations under /dirB and /dirC. It would be feasible to check that /dirB and /dirC also exist if you thought that was necessary.
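One case the loop does not cover is a date with no matching files: the unmatched glob is then passed through literally and cp reports an error. A sketch of the same loop with that guard, wrapped in a hypothetical function copy_renamed so it can be tried on scratch directories. Unlike the loop above, it also creates the flat destination directory, which is an assumption about what is wanted.

```shell
#!/bin/bash
# Hypothetical wrapper: copy SRC/<name>.<date>.txt to
# FLAT/<date>.<name>.txt and to DATED/<date>/<name>.txt.
# Arguments: SRC FLAT DATED DATE
copy_renamed() {
    local src="$1" flat="$2" dated="$3" date="$4" file name
    mkdir -p "$flat" "$dated/$date" || return 1
    for file in "$src"/*."$date".txt; do
        [ -e "$file" ] || continue       # glob matched nothing
        name=${file##*/}                 # drop the directory part
        name=${name%."$date".txt}        # drop ".<date>.txt"
        cp -- "$file" "$flat/$date.$name.txt" &&
        cp -- "$file" "$dated/$date/$name.txt" || return 1
    done
}

# Example: copy_renamed /dirA /dirB /dirC 20151231
```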
Maybe I am just awful at asking questions. What I was looking for was a "sed for file names". And I found the answer -- that's rename.

One liner to append a file into another file but only if it hasn't already been added

I have an automated process that has a number of lines like the following pattern:
sudo cat /some/path/to/a/file >> /some/other/file
I'd like to transform that into a one liner that will only append to /some/other/file if /some/path/to/a/file has not already been added.
Edit
It's clear I need some examples here.
example 1: Updating a .bashrc script for a specific login
example 2: Creating a .screenrc for different logins
example 3: Appending to the end of a /etc/ config file
Some other caveats. The text is going to be added in a block (>>). Consequently, it should be relatively straight forward to see if the entire code block is added or not near the end of a file. I am trying to come up with a simple method for determining whether or not the file has already been appended to the original.
Thanks!
Example python script...
def check_for_appended(new_file, original_file):
    """ Checks original_file to see if it has the contents of new_file """
    new_lines = reversed(new_file.split("\n"))
    original_lines = reversed(original_file.split("\n"))
    appended = None
    for new_line, orig_line in zip(new_lines, original_lines):
        if new_line != orig_line:
            appended = False
            break
        else:
            appended = True
    return appended
Maybe this will get you started - this GNU awk script:
gawk -v RS='^$' 'NR==FNR{f1=$0;next} {print (index($0,f1) ? "present" : "absent")}' file1 file2
will tell you if the contents of "file1" are present in "file2". It cannot tell you why, e.g. because you previously concatenated file1 onto the end of file2.
Is that all you need? If not update your question to clarify/explain.
Here's a technique to see if a file contains another file
contains_file_in_file() {
local small=$1
local big=$2
# RS='^$' (GNU awk only) makes each file a single record; RS="" would split on blank lines
gawk -v RS='^$' '{small=$0; getline; exit !index($0, small)}' "$small" "$big"
}
if ! contains_file_in_file /some/path/to/a/file /some/other/file; then
sudo cat /some/path/to/a/file >> /some/other/file
fi
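A narrower check than a substring search is to compare only the tail of the target with the file being appended. This is a sketch under the assumption stated in the question, that the block is added with >> and is therefore the last thing in the file; it will not notice a block that has since been followed by other text. The name append_once is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: append file $1 to file $2 unless $2 already
# ends with exactly the bytes of $1.
append_once() {
    local src="$1" dst="$2" size
    [ -r "$src" ] || return 1
    size=$(( $(wc -c < "$src") ))        # arithmetic strips padding
    # Compare the last $size bytes of the target with the source
    if [ -f "$dst" ] && tail -c "$size" "$dst" | cmp -s - "$src"; then
        return 0                         # already there, do nothing
    fi
    cat "$src" >> "$dst"
}

# Example: append_once /some/path/to/a/file /some/other/file
```

When the target needs root, the whole function has to run as root, because sudo applied to cat alone does not cover the >> redirection.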
EDIT: The OP just told me in the comments that the files he wants to concatenate are bash scripts -- this brings us back to the good ole C preprocessor include guard tactics:
prepend every file with
if [ -z "$__<filename>__" ]; then __<filename>__=1
(of course replacing <filename> with the name of the file) and at the end
fi
this way, you surround the script in each file with a test for something that's only true once.
Does this work for you?
sudo bash -c 'set -o noclobber; date > /tmp/testfile'
noclobber prevents a > redirection from overwriting an existing file.
I suspect it doesn't, since you wrote that you want to append something, but this technique might help.
When the appending all occurs in one script, then use a flag:
if [ -z "${appended_the_file}" ]; then
cat /some/path/to/a/file >> /some/other/file
appended_the_file="Yes I have done it except for permission/right issues"
fi
I would continue into writing a function appendOnce { .. }, with the content above. If you really want an ugly oneliner (ugly: pain for the eye and colleague):
test -z "${ugly}" && cat /some/path/to/a/file >> /some/other/file && ugly="dirt"
Combining this with sudo:
test -z "${ugly}" && sudo sh -c 'cat /some/path/to/a/file >> /some/other/file' && ugly="dirt"
It appears that what you want is a collection of script segments which can be run as a unit. Your approach -- making them into a single file -- is hard to maintain and subject to a variety of race conditions, making its implementation tricky.
A far simpler approach, similar to that used by most modern Linux distributions, is to create a directory of scripts, say ~/.bashrc.d and keep each chunk as an individual file in that directory.
The driver (which replaces the concatenation of all those files) just runs the scripts in the directory one at a time:
if [[ -d ~/.bashrc.d ]]; then
for f in ~/.bashrc.d/*; do
if [[ -f "$f" ]]; then
source "$f"
fi
done
fi
To add a file from a skeleton directory, just make a new symlink.
add_fragment() {
if [[ -f "$FRAGMENT_SKELETON/$1" ]]; then
# The following will silently fail if the symlink already
# exists. If you wanted to report that, you could add || echo...
ln -s "$FRAGMENT_SKELETON/$1" ~/.bashrc.d/"$1" 2>/dev/null
else
echo "Not a valid fragment name: '$1'"
exit 1
fi
}
Of course, it is possible to effectively index the files by contents rather than by name. But in most cases, indexing by name will work better, because it is robust against editing the script fragment. If you used content checks (md5sum, for example), you would run the risk of having an old and a new version of the same fragment, both active, and without an obvious way to remove the old one.
But it should be straight-forward to adapt the above structure to whatever requirements and constraints you might have.
For example, if symlinks are not possible (because the skeleton and the instance do not share a filesystem, for example), then you can copy the files instead. You might want to avoid the copy if the file is already present and has the same content, but that's just for efficiency and it might not be very important if the script fragments are small. Alternatively, you could use rsync to keep the skeleton and the instance(s) in sync with each other; that would be a very reliable and low-maintenance solution.

find based filename autocomplete in Bash script

There is a command line feature I've been wanting for a long time, and I've thought about how to best realize it, but I got nothing...
So what I'd like to have is when I start typing a filename and hit tab, for example:
# git add Foo<tab>
I'd like it to run a find . -name "*$1*" and basically autocomplete the complete path to the matched File to my command line.
What I have so far:
I know I'll have to write a function that will call the app with the parameters I want,
for example git add. After that it needs to catch the tab-keystroke event and do the find mentioned above, and display the results if many, or fill in the result if one.
What I haven't been able to figure out:
How to catch the tab key event within a function within function.
So basically in pseudocode:
gadd() {git add autocomplete_file_search($1)}
autocomplete_file_search(keyword) {
if( tab-key-pressed ){
files = find . -name "*$1*";
if( filecount > 1 ) {
show list;
}
if( files == 1 ) {
return files
}
}
}
Any ideas?
thanks.
Matching anywhere in the filename is rather complicated, and I'm not sure it's really all that useful. Matching at the start of filenames makes more sense and is much easier to implement, even recursively.
Now, you mentioned find as a requirement, but bash (since version 4.0) can also find files recursively, and it should be more efficient to let bash do that part. To match recursively in bash, you enable the globstar shell option by running shopt -s globstar, then two consecutive asterisks, **, will match recursively.
Next up, given that you want to match files recursively inside a git repository, we best have a way to detect that we're actually in a git repository, otherwise, if you accidentally trigger it in / for instance, your prompt will hang while waiting for bash to search through your entire filesystem. The following function should be fairly efficient at determining if we're inside a git repository. Given the current working directory, e.g. /foo/bar/baz, it'll look for /foo/bar/baz/.git, /foo/bar/.git, /foo/.git, /.git and return true if it finds one, false otherwise.
isgit() {
local p=$PWD
while [[ $p ]]; do
[[ -d $p/.git ]] && return
p=${p%/*}
done
return 1
}
For simplicity, we'll create a gadd command to add the completions for. A completion function can only be applied to the first word of the command. E.g. we can add completion for git, but not git add, thus we'll make a new command that turns git add into one word.
gadd() {
git add "$@"
}
Now for the actual completion function. When triggered by hitting TAB, the function will be invoked with three arguments. $1 is the command being completed, $2 is the current word of the command line being completed, and $3 is the previous word on the line. So the files we want to search will be matched by the glob **/"$2"*; all files starting with "$2". We iterate these filenames, and append them to the COMPREPLY array. If the COMPREPLY array only contains one value when the function is done, the word will be replaced by that value. If it contains more than one value, hit tab another time to get a list of all the matches.
shopt -s globstar
_git_add_complete() {
local file
isgit || return
for file in **/"$2"*; do
# If the glob doesn't match, we'll get the glob itself, so make sure
# we have an existing file
[[ -e $file ]] || continue
# If it's a directory, add a trailing /
[[ -d $file ]] && file+=/
COMPREPLY+=( "$file" )
done
}
complete -F _git_add_complete gadd
Add the above three code blocks to your ~/.bashrc, then open a new terminal, enter a git repository and try gadd something<tab>.
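A completion function can also be exercised by hand, which is often easier to debug than pressing TAB. The sketch below assumes bash 4.0 or later. _prefix_matches is a hypothetical, stripped-down variant of the function above: it takes the prefix as $1, skips the git check, and clears COMPREPLY itself so that repeated manual calls do not accumulate results.

```shell
#!/bin/bash
shopt -s globstar   # let ** match across directory levels

# Hypothetical helper: fill COMPREPLY with every path below the current
# directory whose last component starts with the prefix given in $1.
_prefix_matches() {
    local file
    COMPREPLY=()
    for file in **/"$1"*; do
        [[ -e $file ]] || continue       # unmatched glob stays literal
        [[ -d $file ]] && file+=/        # mark directories
        COMPREPLY+=( "$file" )
    done
}

# Manual check from inside some directory tree:
#   _prefix_matches Foo
#   printf '%s\n' "${COMPREPLY[@]}"
```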
You should take a look at this introduction to bash completion. Briefly, bash has a system for configuring and extending tab completion. Other shells do this, too, and each one has a different way to set it up. Using this system it is not necessary to do everything yourself and adding custom argument completion to a command is relatively easy.
Does this work?
$ cat .bash_completion
_foo()
{
    local cur files
    cur=${COMP_WORDS[COMP_CWORD]}
    # find needs an explicit starting directory on non-GNU systems
    files=$(find . -type f)
    COMPREPLY=( $( compgen -W "${files}" -- "${cur}" ) )
    return 0
}
complete -F _foo foo
$ . /etc/bash_completion
$ foo ./[tab]
I wrote git-number so that I never have to hit tab when specifying files to git.
With git-number I can use numbers to represent the filenames that I want git to handle.
