Automate SCP copy of files from multiple directories (in braces) to appropriate directories - Linux

I have a bash script used for copying some files from different directories on a remote host. All of them have the same parent, so I put them into a list:
LIST={ADIR, BDIR, CDIR}
and I use the scp command
sshpass -p $2 scp -o LogLevel=debug -r $1@192.168.121.1:/PATH/$LIST/*.txt /home/test/test
That command lets me copy all of the .txt files from ADIR, BDIR, CDIR into my test directory. Is there any option which can put the .txt files into the appropriate directory, like /home/test/test/ADIR or /home/test/test/BDIR ... ?

Have you considered using rsync?
You could try something along these lines:
# Rsync Options
# -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X)
# -D same as --devices --specials
# -g, --group preserve group
# -l, --links copy symlinks as symlinks
# -o, --owner preserve owner (super-user only)
# -O, --omit-dir-times omit directories from --times
# -p, --perms preserve permissions
# -r, --recursive recurse into directories
# -t, --times preserve modification times
# -u, --update skip files that are newer on the receiver
# -v, --verbose increase verbosity
# -z, --compress compress file data during the transfer
for DIR in 'ADIR' 'BDIR' 'CDIR'
do
    rsync -zavu --rsh="ssh -l {username}" 192.168.121.1:/$PATH/$DIR /home/test/test/
done
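If you would rather stay with scp, a per-directory loop also keeps the files separated on the destination side. This is only a sketch reusing the question's placeholders ($1 is the user, $2 the password, /PATH the remote parent directory):
for DIR in ADIR BDIR CDIR
do
    mkdir -p "/home/test/test/$DIR"
    sshpass -p "$2" scp -o LogLevel=debug "$1@192.168.121.1:/PATH/$DIR/*.txt" "/home/test/test/$DIR/"
done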

Finally my working code:
SOURCE='/usr/.../'
DEST='/home/test/test'
DIRS_EXCLUDED='test/ADIR test/BDIR'
EXTENSIONS_EXCLUDED='*.NTX *.EXE'
EXCLUDED_STRING=''
for DIR in $DIRS_EXCLUDED
do
    EXCLUDED_STRING=$EXCLUDED_STRING'--exclude '"$DIR"' '
done
for EXTENSION in $EXTENSIONS_EXCLUDED
do
    EXCLUDED_STRING=$EXCLUDED_STRING'--exclude '"$EXTENSION"' '
done
rsync -zavu $EXCLUDED_STRING --rsh="sshpass -p $2 ssh -l $1" 192.168.xxx.xxx:$SOURCE $DEST
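For what it's worth, collecting the excludes in a bash array avoids the quoting and word-splitting subtleties of a single string; a minimal sketch along the same lines:
EXCLUDES=()
for DIR in $DIRS_EXCLUDED
do
    EXCLUDES+=( --exclude "$DIR" )
done
for EXTENSION in $EXTENSIONS_EXCLUDED
do
    EXCLUDES+=( --exclude "$EXTENSION" )
done
rsync -zavu "${EXCLUDES[@]}" --rsh="sshpass -p $2 ssh -l $1" 192.168.xxx.xxx:"$SOURCE" "$DEST"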

Related

Create Directory, download file and execute command from list of URL

I am working on a Red Hat Linux server. My end goal is to run CRB-BLAST on multiple fasta files and have the results from those in separate directories.
My approach is to download the fasta files using wget and then run CRB-BLAST. I have multiple files and would like to download each one to its own directory (the directory name should perhaps come from the URLs in the list file), then run CRB-BLAST.
Example URLs:
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_3370_chr.v0.1.liftover.CDS.fasta.gz
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_CB_chr.v0.1.liftover.CDS.fasta.gz
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_13_chr.v0.1.liftover.CDS.fasta.gz
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_37_chr.v0.1.liftover.CDS.fasta.gz
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_123_chr.v0.1.liftover.CDS.fasta.gz
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_195_chr.v0.1.liftover.CDS.fasta.gz
http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_31_chr.v0.1.liftover.CDS.fasta.gz
Ideally, the file name determines the directory name, for example, TC_3370/.
I think there might be a solution with cat URL.txt | mkdir | cd | wget | crb-blast
Currently I just run the commands in line:
mkdir TC_3370
cd TC_3370/
wget http://assemblies/Genomes/final_assemblies/10x_meta_assemblies_v1.0/TC_3370_chr.v1.0.maker.CDS.fasta.gz
crb-blast -q TC_3370_chr.v1.0.maker.CDS.fasta.gz -t TCV2_annot_cds.fna -e 1e-20 -h 4 -o rbbh_TC
Try this Shellcheck-clean program:
#! /bin/bash -p

while read -r url; do
    file=${url##*/}
    dir=${file%%_chr.*}
    mkdir -v -- "$dir"
    (
        cd "./$dir" || exit 1
        wget -- "$url"
        crb-blast -q "$file" -t TCV2_annot_cds.fna -e 1e-20 -h 4 -o rbbh_TC
    )
done <URL.txt
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of ${url##*/} etc.
The subshell ( ... ) is used to ensure that the cd doesn't affect the main program.
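For example, with the first URL from the question the two expansions yield:
url='http://assemblies/Genomes/final_assemblies/10x_assemblies_v0.1/TC_3370_chr.v0.1.liftover.CDS.fasta.gz'
file=${url##*/}      # TC_3370_chr.v0.1.liftover.CDS.fasta.gz
dir=${file%%_chr.*}  # TC_3370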
Another implementation
#!/bin/sh

# Read lines from the input as url for as long as there are any
while read -r url
do
    # Get the file name by stripping out everything up to the last / from the url
    file_name=${url##*/}
    # Get the destination dir name by stripping everything from the first _chr
    dest_dir=${file_name%%_chr*}
    # Compose the wget output path
    fasta_path="$dest_dir/$file_name"
    if
        # Successfully created the destination directory AND
        mkdir -p -- "$dest_dir" &&
        # Successfully downloaded the file (note: -O/--output-document, not
        # --output-file, which would only redirect wget's log)
        wget --output-document="$fasta_path" --quiet -- "$url"
    then
        # Process the fasta file into fna
        fna_path="$dest_dir/TCV2_annot_cds.fna"
        crb-blast -q "$fasta_path" -t "$fna_path" -e 1e-20 -h 4 -o rbbh_TC
    else
        # Cleanup: remove the destination directory if either mkdir or wget failed
        rm -fr -- "$dest_dir"
    fi
# reading from the URL.txt file for the whole while loop
done < URL.txt
Downloading files from a list is a task for the -i file option: if you have a file named, say, urls.txt with one URL per line, you can simply do
wget -i urls.txt
Note that this puts all the files in the current working directory, so if you want them in separate dirs, you need to move them after wget finishes.
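A rough sketch of that follow-up move, assuming the TC_*_chr... naming from the question:
wget -i urls.txt
for f in TC_*_chr.*.fasta.gz; do
    d=${f%%_chr*}
    mkdir -p "$d" && mv -- "$f" "$d/"
done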

How to exclude a specific file in scp linux shell command?

I am trying to execute the scp command in such a way that it copies .csv files from source to sink, except a few specific CSV files.
For example in the source folder I am having four files:
file1.csv, file2.csv, file3.csv, file4.csv
Out of those four files, I want to copy all files, except file4.csv, to the sink location.
When I was using the below scp command:
scp /tmp/source/*.csv /tmp/sink/
It would copy all four CSV files to the sink location.
How can I achieve the same by using the scp command or through writing a shell script?
You can use rsync with the --exclude switch, e.g.
rsync /tmp/source/*.csv /tmp/sink/ --exclude file4.csv
Bash has an extended globbing feature which allows for this. On many installations, you have to enable this feature separately with
shopt -s extglob
With that in place, you can
scp /tmp/source/!(fnord*).csv /tmp/sink/
to copy all *.csv files except fnord.csv (or anything else whose name starts with fnord).
This is a shell feature; the shell will expand the glob to a list of matching files - scp will have no idea how that argument list was generated.
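You can preview what the shell would pass to scp by expanding the same glob with echo; with the question's four files this would print everything except file4.csv:
shopt -s extglob
echo /tmp/source/!(file4).csv
# /tmp/source/file1.csv /tmp/source/file2.csv /tmp/source/file3.csv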
As mentioned in your comment, rsync is not an option for you. The solution presented by tripleee works only if the source is on the client side. Here I present a solution using ssh and tar. tar does have the --exclude flag, which allows us to exclude patterns:
from server to client:
$ ssh user@server 'tar -cf - --exclude "file4.csv" /path/to/dir/*csv' \
| tar -xf - --transform='s#.*/##' -C /path/to/destination
This essentially creates a tar-ball which is sent over /dev/stdout, which we pipe into a tar extract. To mimic scp we need to remove the full path using --transform (see U&L). Optionally you can add the destination directory.
from client to server:
We do essentially the same, but reverse the roles:
$ tar -cf - --exclude "file4.csv" /path/to/dir/*csv \
| ssh user@server 'tar -xf - --transform="s#.*/##" -C /path/to/destination'
You could use a bash array to collect your larger set, then remove the items you don't want. For example:
files=( /tmp/src/*.csv )
for i in "${!files[@]}"; do
    [[ ${files[$i]} = *file4.csv ]] && unset "files[$i]"
done
scp "${files[@]}" host:/tmp/sink/
Note that our for loop steps through array indices rather than values, so that we'll have the right input for the unset command if we need it.
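Equivalently, you can build a filtered array up front; a short sketch with the same placeholder names:
files=()
for f in /tmp/src/*.csv; do
    [[ $f == *file4.csv ]] || files+=( "$f" )
done
scp "${files[@]}" host:/tmp/sink/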

scp multiple files with different names from source and destination

I am trying to scp multiple files from source to destination. The scenario is that the source file name is different from the destination file name.
Here is the scp command I am trying:
scp /u07/retail/Bundle_de.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_de.properties
Basically I have more than 7 files which I am currently transferring with separate scps. I want to combine them into a single scp to transfer all the files.
A few of the scp commands I am running:
$ scp /u07/retail/Bundle_de.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_de.properties
$ scp /u07/retail/Bundle_as.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_as.properties
$ scp /u07/retail/Bundle_pt.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_pt.properties
$ scp /u07/retail/Bundle_op.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_op.properties
I am looking for a solution by which I can transfer the above 4 files with a single scp command.
Looks like a straightforward loop in any standard POSIX shell:
for i in de as pt op
do scp "/u07/retail/Bundle_$i.properties" "rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_$i.properties"
done
Alternatively, you could give the files new names locally (copy, link, or move), and then transfer them with a wildcard:
dir=$(mktemp -d)
for i in de as pt op
do cp "/u07/retail/Bundle_$i.properties" "$dir/MultiSolutionBundle_$i.properties"
done
scp "$dir"/* "rgbu_fc#<fc_host>:/u01/projects/"
rm -rf "$dir"
With GNU tar, ssh and bash:
tar -C /u07/retail/ -c Bundle_{de,as,pt,op}.properties | ssh user@remote_host tar -C /u01/projects/ --transform 's/.*/MultiSolution&/' --show-transformed-names -xv
If you want to use globbing (*) with filenames:
cd /u07/retail/ && tar -c Bundle_*.properties | ssh user@remote_host tar -C /u01/projects/ --transform 's/.*/MultiSolution&/' --show-transformed-names -xv
-C: change to directory
-c: create a new archive
Bundle_{de,as,pt,op}.properties: bash is expanding this to Bundle_de.properties Bundle_as.properties Bundle_pt.properties Bundle_op.properties before executing tar command
--transform 's/.*/MultiSolution&/': prepend MultiSolution to filenames
--show-transformed-names: show filenames after transformation
-xv: extract files and verbosely list files processed
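Since --transform takes a sed-style replacement expression, you can check it locally before running the transfer:
echo 'Bundle_de.properties' | sed 's/.*/MultiSolution&/'
# prints: MultiSolutionBundle_de.properties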

linux bash script to create folder and move files

Hello, I need to create a folder based on a filename, create another folder inside it, and then move the file into this second folder.
example:
my_file.jpg
create folder my_file
create folder picture
move my_file.jpg to picture
I have this script but it only works on Windows, and now I'm using Linux:
for %%A in (*.jpg) do mkdir "%%~nA/picture" & move "%%A" "%%~nA/picture"
pause
Sorry if I'm not precise but English is not my native language.
#!/usr/bin/env bash
# Enable bash built-in extglob to ease file matching.
shopt -s extglob
# To deal with the case where nothing matches. (courtesy of mklement0)
shopt -s nullglob
# A pattern to match files with specific file extensions.
# Example for matching additional file types:
#match="*+(.jpg|.png|.gif)"
match="*+(.jpg)"
# By default use the current working directory.
src="${1:-.}"
dest="${2:-/root/Desktop/My_pictures/}"
# Pass an argument to this script to name the subdirectory
# something other than picture.
subdirectory="${3:-picture}"
# For each file matched
for file in "${src}"/$match
do
    # Make a directory with the same name (without the file extension)
    # plus a subdirectory.
    targetdir="${dest}/$(basename "${file%.*}")/${subdirectory}"
    # Remove the echo commands once the script's output fits your use case.
    echo mkdir -p "${targetdir}"
    # Move the file to the subdirectory.
    echo mv "$file" "${targetdir}"
done
Use basename to create the directory name, mkdir to create the folder, and mv the file:
for file in *.jpg; do
    folder=$(basename "$file" ".jpg")"/picture"
    mkdir -p "$folder" && mv "$file" "$folder"
done
Try the following:
for f in *.jpg; do
    mkdir -p "${f%.jpg}/picture"
    mv "$f" "${f%.jpg}/picture"
done
${f%.jpg} extracts the part of the filename before the .jpg to create the directory. Then the file is moved there.
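A quick illustration of that parameter expansion:
f='my_file.jpg'
echo "${f%.jpg}"           # my_file
echo "${f%.jpg}/picture"   # my_file/picture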

rsync with --remove-sent-files option and open files

Every minute I need to copy recorded files from 3 servers to one data storage host. I don't need to keep the original files - data processing happens outside of all of them.
But when I use the --remove-sent-files option, rsync sends and removes files that are not yet finished (not closed).
I've tried to prevent sending these open files with lsof and --exclude-from, but it seems that rsync does not understand full paths in the exclude list:
--exclude-from=FILE read exclude >>patterns<< from FILE
lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq
/projects/recordings/<uid>/<path>/2012-07-16 13:24:32.646970-<id>.WAV
So, the script looks like:
# get open files in src dir and put them into rsync.exclude file
lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq > /tmp/rsync.exclude
# sync without these files
/usr/bin/rsync -raz --progress --size-only --remove-sent-files --exclude-files=/tmp/rsync.excldude /projects/recordings/ site.com:/var/www/storage/recordings/
# change owner
ssh storage@site.com chown -hR storage:storage /var/www/storage/recordings
So, maybe I should try another tool? Or why doesn't rsync honor the excludes?
I'm not sure if this helps you, but here's my solution to only rsync files which are not currently being written to. I use it for tshark captures, writing to a new file every N seconds with the -a flag (e.g. tshark -i eth0 -a duration:30 -w /foo/bar/caps). Watch out for tricky rsync behaviour: the order of the include and exclude options is important, and if we want sub-directories we need to include "*/".
save_path=/foo/bar/
delay_between_syncs=30
while true
do
    sleep $delay_between_syncs
    # Calculate which files are currently open (i.e. the ones currently being written to)
    # and avoid uploading them. This is to ensure that when we process files on the server,
    # they are complete.
    echo "" > /tmp/include_list.txt
    for i in `find $save_path/ -type f`
    do
        op=`fuser $i`
        if [ "$op" == "" ]
        then
            #echo [+] $i is good for upload, will add it to the list.
            c=`echo $i | sed 's/.*\///g'`
            echo $c >> /tmp/include_list.txt
        fi
    done
    echo [+] Syncing...
    rsync -rzt --include-from=/tmp/include_list.txt --include="*/" --exclude \* $save_path user@server:/home/backup/foo/
    echo [+] Sunk...
done
Rsync the files, then remove the ones that have been transferred: capture the list of transferred files, and then remove only those transferred files that are not currently open. Rsync decides what to transfer when it reaches the directory, so your exclude-list approach was bound to fail eventually even if it worked at first, whenever a file opened after rsync started was not yet in the exclude list.
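A rough sketch of that idea, assuming GNU rsync's --out-format to record what was transferred and fuser to test whether a file is still open (paths and host are placeholders):
rsync -rzt --out-format='%n' /foo/bar/ user@server:/home/backup/foo/ > /tmp/transferred.txt
while IFS= read -r rel; do
    f="/foo/bar/$rel"
    # Only delete regular files that are no longer open.
    if [ -f "$f" ] && ! fuser -s "$f"; then
        rm -- "$f"
    fi
done < /tmp/transferred.txt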
An alternate approach would be to do a
find dir -type f -name pattern -mmin +10 | xargs -i rsync -aP {} dest:/path/to/backups
