Does anyone know how to list the files that exist in one remote folder but not in another? I have two servers (say Server1 and Server2) with a similar folder structure, and I keep them in sync with rsync. However, the destination folder now has more files than the source, because some files were deleted from the source. I'm trying to find out which files exist only on Server2 by taking a diff between Server 1 and Server 2.
I can take the diff between two local folders directly using the following command:
diff /home/www/images/test_images /var/www/site/images/test_images
But I was wondering if it is possible to diff folders between two remote servers using ssh. Like this?
diff ubuntu1@images.server1.com:/home/www/images/test_images ubuntu2@images.server2.com:/var/www/site/images/test_images
Say the ssh configurations of Server 1 and Server 2 are as follows:
Server 1
IP: images.server1.com
User: ubuntu1
Password: pa$$word1
Images Path: /home/www/images/test_images
Server 2
IP: images.server2.com
User: ubuntu2
Password: pa$$word2
Images Path: /var/www/site/images/test_images
Hoping for any help to solve this problem. Thanks.
Try this command:
diff -B <(sshpass -p 'pa$$word1' ssh ubuntu1@images.server1.com "find /home/www/images/test_images -type f | sed 's/\/home\/www\/images\/test_images\///g'" | sort -n) <(sshpass -p 'pa$$word2' ssh ubuntu2@images.server2.com "find /var/www/site/images/test_images -type f | sed 's/\/var\/www\/site\/images\/test_images\///g'" | sort -n) | grep "^>" | awk '{print $2}'
Explanation:
You can use diff -B <() <() to take the diff between two streams. The command uses sshpass to ssh into the two servers without having to enter your passwords interactively.
Each argument to diff -B runs find on one server to recursively list all files in the specified directory, uses sed to strip the root path from each file name (the roots differ between the two servers, so they must be removed for diff to match the names), and sorts the result with sort.
diff marks each differing line with either > or <, so grep keeps only the lines marked with >, that is, the files present only on Server 2. Finally, awk prints only the second column, which drops the > marker from the output. Note that this cuts off file names that contain spaces.
NOTE: You need to install sshpass first. Use apt-get to install it as follows:
sudo apt-get install sshpass
You can extend this by piping other commands like rm. Hope this works for you.
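A variant of the same idea, as a rough sketch only: comm -13 prints just the lines that are unique to its second input, which avoids filtering diff output with grep and awk and keeps file names with spaces intact. The function names list_relative and only_in_second are made up for this example, and it compares two local directories so it can be tried without any servers. For the remote case, each list_relative call would be replaced by the corresponding sshpass/ssh find command from above.

```shell
#!/bin/bash
# list_relative DIR: print every regular file under DIR, relative to DIR, sorted.
list_relative() {
    (cd "$1" && find . -type f | sed 's|^\./||' | LC_ALL=C sort)
}

# only_in_second DIR1 DIR2: print the files that exist in DIR2 but not in DIR1.
only_in_second() {
    first_list=$(mktemp) || return 1
    list_relative "$1" > "$first_list"
    # comm -13 suppresses lines unique to input 1 and lines common to both,
    # leaving only the lines unique to input 2 (read from stdin here).
    list_relative "$2" | LC_ALL=C comm -13 "$first_list" -
    rm -f "$first_list"
}
```

Usage would look like only_in_second /path/to/copy_of_server1 /path/to/copy_of_server2; both inputs must be sorted the same way, which is why the sketch forces LC_ALL=C for sort and comm.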
Related
I am trying to copy files from a remote machine to my local machine using scp:
scp -r username@hostname:/directory .
I want only the files to be copied, without the directories.
For example:
directory
|directory2
| file1
| file2
file12
After copying all the files, the structure should look like this:
localdirectory
|file1
|file2
|file12
Is this possible using scp?
Sergius is right, you can use find and scp together to achieve this. However, you need to run find on the remote machine over ssh first and then scp the files it reports.
You can combine find and scp, something like this:
find localdirectory | xargs scp {your parameters}
find returns all files; xargs collects their full paths and passes them as arguments to scp.
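As a rough sketch of running find on the remote side and then copying each result with scp: the function name fetch_flat and its arguments are invented for this example. It assumes key-based ssh login and remote paths without spaces or newlines. Files that share a base name will overwrite each other locally.

```shell
#!/bin/bash
# fetch_flat REMOTE REMOTE_DIR LOCAL_DIR
# Copies every regular file below REMOTE_DIR on REMOTE into LOCAL_DIR,
# dropping the remote directory structure.
fetch_flat() {
    remote=$1 remote_dir=$2 local_dir=$3
    mkdir -p "$local_dir" || return 1
    # find runs on the remote machine; one path per line comes back over ssh.
    ssh "$remote" "find '$remote_dir' -type f" |
    while IFS= read -r path; do
        # stdin is redirected so scp cannot swallow the rest of the file list.
        scp "$remote:$path" "$local_dir/" </dev/null
    done
}
```

For the layout in the question this would be called as fetch_flat username@hostname /directory localdirectory.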
Try:
scp -r username@hostname:{/directory/directory2/file1,/directory/directory2/file2,/directory/file12} localdirectory
Or just scp the files one by one.
I'm trying to back up just one file, which another application generates in dynamically named folders.
For example:
parent_folder/
back_01 -> file_blabla.zip (timestamp 2013.05.12)
back_02 -> file_blabla01.zip (timestamp 2013.05.14)
back_03 -> file_blabla02.zip (timestamp 2013.05.22)
I need to get the latest generated zip, and only that one. The name of the file doesn't matter, as long as it is the latest, it is a zip, and it is inside "parent_folder".
Also, when I run rsync, the folder structure and the original file name are recreated at the destination. I want to avoid that: the file should be backed up into one folder under a fixed name, so that I always know where the latest backup is.
Right now I'm doing this with a Perl script that finds the latest generated folder with
"ls -tAF | grep '/$' | head -1"
and then runs rsync. It does bring over the last zip, but with the folder structure, which I don't want because then it doesn't overwrite my previous latest zip file.
rsync -rvtW --prune-empty-dirs --delay-updates --no-implied-dirs --modify-window=1 --include='*.zip' --exclude='*.*' --progress /source/ /myBackup/
It would also be great if I could do this with rsync alone, without needing Perl or any other script.
Thanks.
Will the file names differ each time?
That makes it hard for any kind of syncing to work.
What you could do is:
create a new folder outside of where the file is found, then:
Before you start, remove the previous symlink in that folder.
When the file is found, i.e. with ls -tAF | grep '/$' | head -1 ....
symlink it into this folder,
then send the file across to the new node with rsync, ssh or unison.
If the symlink is named file-latest.zip, then it will always be this
one file that is sent across.
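The symlink steps above could be sketched roughly as follows. The function names newest_zip and link_latest are invented for this example, and it relies on GNU find (-printf) and GNU readlink (-f). It assumes paths do not contain newlines.

```shell
#!/bin/bash
# newest_zip DIR: print the most recently modified *.zip anywhere below DIR.
newest_zip() {
    # %T@ is the modification time in epoch seconds, %p the path (GNU find).
    find "$1" -type f -name '*.zip' -printf '%T@ %p\n' |
        sort -nr | head -n1 | cut -d' ' -f2-
}

# link_latest DIR LINK: point the symlink LINK at the newest zip below DIR.
link_latest() {
    latest=$(newest_zip "$1")
    [ -n "$latest" ] || return 1    # nothing found, keep the old link
    # -s symbolic, -f replace an existing link, -n do not follow the old link.
    ln -sfn "$(readlink -f "$latest")" "$2"
}
```

A possible call, with /staging as an assumed scratch folder: link_latest /source /staging/file-latest.zip && rsync -vtL /staging/file-latest.zip /myBackup/ (the -L option makes rsync copy the file the symlink points to rather than the link itself).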
But why do all that when you can just scp? You can take a look here:
https://github.com/vahidhedayati/definedscp
for a more long-winded approach. It is not meant for this situation, but it uses the real file date/time stamp and converts it to seconds, which might be useful if you wish to do the stat in a different way.
Use stat to work out the latest file, then simply scp it across. Here is something to get you started:
One liner:
scp $(find /path/to/parent_folder -name \*.zip -exec stat -t {} \;|awk '{print $1" "$13}'|sort -k2nr|head -n1|awk '{print $1}') remote_server:/path/to/name.zip
A more long-winded way, which may help to understand what the above is doing:
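#!/bin/bash
# Collect "path mtime" pairs for every zip below parent_folder.
# Assumes file names without whitespace.
FOUND_ARRAY=()
cd parent_folder || exit 1
for file in $(find . -name '*.zip'); do
    # Field 13 of GNU "stat -t" is the modification time in epoch seconds.
    ptime=$(stat -t "$file" | awk '{print $13}')
    FOUND_ARRAY+=("$file $ptime")
done
# Join the entries with newlines, sort newest first, keep the first path.
IFS=$'\n'
FOUND_FILE=$(echo "${FOUND_ARRAY[*]}" | sort -k2nr | head -n1 | awk '{print $1}')
scp "$FOUND_FILE" remote_host:/backup/new_name.zip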
I have a series of functionally identical servers provided by my school that run various OS and hardware configurations. For the most part, I can use 5 of these interchangeably. Unfortunately, other students tend to bunch up on some machines, and it's a pain to find one that isn't bogged down.
What I want to do is ssh into a machine, run the command:
w | wc -l
to get a rough estimate of the load on that server, and use that information to select the least impacted one. A sort of client-side load balancer.
Is there a way to do this or achieve the same result?
I'd put this in your .bashrc file:
function choose_host(){
    hosts="host1 ... hostn"
    for host in $hosts
    do
        # Print "<logged-in line count> <host>" for every host.
        echo $(ssh "$host" 'w|wc -l') "$host"
    done | sort -n | head -1 | awk '{print $2}'   # -n: compare counts as numbers
}
function ssh_host(){
    ssh "$(choose_host)"
}
choose_host should give you the one you're looking for. This is absolutely overkill, but I was feeling playful :D
sort orders the output by the result of w|wc -l, then head -1 takes the first line and awk prints just the hostname.
You can call ssh_host and it should log you in automatically.
You can use the pdsh command from your desktop; it runs the specified command on the set of machines you specify and returns the results. This way you can find out which one is least loaded, without having to ssh into every single machine and run w | wc -l yourself.
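A small sketch of how the pdsh output could be reduced to a single host name. pdsh prefixes each output line with the host name and a colon; the helper name least_loaded and the host names host1 to host5 are assumptions made for this example.

```shell
#!/bin/bash
# least_loaded: read "host: count" lines (the format pdsh prints) on stdin
# and print the host with the smallest count.
least_loaded() {
    # Split on ":", sort numerically on the count, keep the first host name.
    sort -t: -k2,2n | head -n1 | cut -d: -f1
}

# Hypothetical usage, assuming pdsh is installed and host1..host5 exist:
#   pdsh -w 'host[1-5]' 'w | wc -l' | least_loaded
```

The numeric sort matters here: a plain sort would rank a count of 12 ahead of a count of 3.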
Yes. See e.g.:
ssh me@host "ls /etc | sort" | wc -l
The part inside "" is done remotely. The part afterwards is local.
I am writing a shell script for the first time, and I want to download the most recently created file from an FTP server.
I want to download the latest file in a specific folder. Below is my code for that, but it downloads all the files in the folder, not just the latest one.
ftp -in ftp.abc.com << SCRIPTEND
user xyz xyz
binary
cd Rpts/
mget ls -t -r | tail -n 1
quit
SCRIPTEND
Can you help me with this, please?
Try using the wget or lftp utility instead. lftp compares file time/date, and AFAIR its purpose is FTP scripting. Switch to ssh/rsync if possible; you can read a bit about using lftp instead of rsync here:
https://serverfault.com/questions/24622/how-to-use-rsync-over-ftp
Probably the easiest way is to link the latest version on the server side to "current" and always fetch the file that link points to. If you're not the admin of the server, you need to list all files with their date/time, grab that information, parse it and decide which one is newest. In the meantime the state on the server can change, and you end up with a solution more complicated than it's worth.
The point is that "ls" sorts its output in some way, and sorting by time may not be the default. There are switches to sort it, e.g. by modification time. However, even when the server responds with OK to ls -t, you can't be sure it really supports sorting; it may just ignore all switches and always return the same list. That's why admins usually use a "current" link (ln -s). If there's no "current", you need to parse the list anyway (ls -al) to make sure you have the right file.
http://www.catb.org/esr/writings/unix-koans/shell-tools.html
Looking at the code, the line
mget ls -t -r | tail -n 1
doesn't do what you think. It actually grabs all of the output of ls -t and then tail processes the output of mget. You could replace this line with
mget $(ls -t -r | tail -n 1)
but I am not sure ftp will support such a call: inside the here-document, $(...) is expanded by the local shell, so ls would run on your machine rather than on the FTP server.
Try using an FTP client other than ftp. For example, curlftpfs, available at curlftpfs.sourceforge.net, is a good candidate: it lets you mount an FTP server on a directory as if it were a local folder and then run ordinary commands on the files there (including find, grep, etc.). Take a look at this article.
This way, since the output comes from a local command, you can be more certain that ls -t returns a properly sorted list.
Btw, it's a bit less convoluted to use ls -t | head -1 than ls -t -r | tail -1. They produce the same result but why reverse and grab from the tail when you can just grab the head :)
If you use curlftpfs then your script would be something like this (assuming server ftp.abc.com and user xyz with password xyz).
mkdir /tmp/ftpsession
curlftpfs ftp://xyz:xyz@ftp.abc.com /tmp/ftpsession
cd /tmp/ftpsession/Rpts
cp -Rpf $(ls -t | head -1) /your/destination/folder/or/file
cd -
umount /tmp/ftpsession
My Solution is this:
curl 'ftp://server.de/dir/'$(curl 'ftp://server.de/dir/' 2>/dev/null | tail -1 | awk '{print $(NF)}')
How can I write a bash script to list directory entries in the svn repository?
I want to write a bash script because I have a large number of repositories.
If you are the subversion administrator, the following command will return the directories located in your repository.
svnlook tree $REPO_DIR --full-paths | egrep "/$"
The trick is the grep command, which looks for a trailing "/" character in the name.
The same trick works for the svn command as well:
svn list $REPO_URL -R | egrep "/$"
Extra notes
To run this command repeatedly, you can put it into a shell for loop:
for url in $URL1 $URL2 $URL3
do
    svn list "$url" -R | egrep "/$"
done
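With a large number of repositories it may be easier to keep the URLs in a file than in variables. A minimal sketch, assuming a text file with one repository URL per line; the function name list_repo_dirs and the file name repos.txt are invented for this example.

```shell
#!/bin/bash
# list_repo_dirs FILE: FILE holds one repository URL per line.
list_repo_dirs() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue    # skip blank lines
        echo "== $url"
        # svn list -R marks directories with a trailing "/".
        # stdin is redirected so svn cannot consume the rest of the URL list.
        svn list -R "$url" </dev/null | grep '/$'
    done < "$1"
}
```

It would be called as list_repo_dirs repos.txt and prints a header line for each repository followed by its directories.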