Linux: copy recursively

Every time I copy recursively I end up with the httpdocs folder itself inside the destination (e.g. public_html) instead of just the files within it.
For example I might run something like:
cp -rpf /var/www/vhosts/website/httpdocs /var/www/vhosts/anotherwebsite/httpdocs
I always end up with /var/www/vhosts/anotherwebsite/httpdocs/httpdocs when all I am trying to do is move a website from one user to another.

When you just want to "push" your local website without going offline for long, you can use a temporary directory.
TODAY=$(date +%y%m%d)
NEWCODE=/var/www/vhosts/anotherwebsite/docs_${TODAY}
OLDCODE=/var/www/vhosts/anotherwebsite/docs_old
rm -rf ${NEWCODE} ${OLDCODE}   # leftovers from a previous run would make the mv below nest directories
cp -rpf /var/www/vhosts/website/httpdocs ${NEWCODE} || exit 1
# some checks ?
cd /var/www/vhosts/anotherwebsite/ || exit 1
mv httpdocs ${OLDCODE} || exit 1
mv ${NEWCODE} httpdocs
Between the two mv commands the site will be briefly unavailable. If that is a problem, you could create a work_in_progress.html file, put it in place as httpdocs/index.html, remove all the other files, and copy the new files in after that (the real index.html last).
But that seems too fancy; stick with the short hiccup in the solution above.

What do you want to happen when a file exists under anotherwebsite but not in the source website?
I think you want anotherwebsite to be an exact copy, so make sure the old files are removed first:
rm -r /var/www/vhosts/anotherwebsite/httpdocs
cp -rpf /var/www/vhosts/website/httpdocs /var/www/vhosts/anotherwebsite/httpdocs
Edit: anotherwebsite should stay available.
The general trick is using tar:
tar cf - * | ( cd /target; tar xf -)
In your case:
TARGET=/var/www/vhosts/anotherwebsite/httpdocs
cd /var/www/vhosts/website/httpdocs || exit 1
mkdir -p ${TARGET}
[ -d ${TARGET} ] || exit 1
tar cf - * | ( cd ${TARGET}; tar xf -)
I added "|| exit 1" to make sure you do not copy from the wrong directory and that ${TARGET} really is a directory.
You still have 3 challenges:
1) How do you delete files you no longer use (yesterday_special_action.html)?
2) Should you copy images first, so customers opening new pages do not hit missing images?
3) What website do you have if the copy/tar fails partway through (disk full)?
I will post a new answer for solving these challenges.

You need to tell cp to treat the destination directory as a file instead of as a directory (into which it would copy the folder you listed as the source). That is what the -T (--no-target-directory) option does.
Alternatively you could use a source of /var/www/vhosts/website/httpdocs/* if that glob captures all the files you actually care about; note that * does not match hidden dot-files.
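For example, both variants with the paths from the question (a sketch, assuming GNU cp, since -T is a GNU coreutils option):
# -T: treat the destination as the target itself, not as a
# directory to copy the source INTO
cp -rpfT /var/www/vhosts/website/httpdocs /var/www/vhosts/anotherwebsite/httpdocs
# glob variant: copies the contents instead of the directory; beware
# that * skips hidden dot-files such as .htaccess
cp -rpf /var/www/vhosts/website/httpdocs/* /var/www/vhosts/anotherwebsite/httpdocs/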

Related

Linux - Copy the files (not subfolders) from origin to destination while overwriting/deleting the initial contents in the destination folder

I need to run a command in Linux that copies the files (not folders) in ~/folder1/subfolder1 to ~/folder2/subfolder2, while deleting the initial contents/files in folder2.
Command cp copies files from one folder to another:
cp ~/folder1/* ~/folder2/
But, how can I also delete files that were initially in the folder2, while copying only files from folder1?
Also, is there a rsync command instead of cp that would only copy files and not subfolders?
I have tried with this:
rsync --delete-during folder1/* folder2/
But, I got an error:
rsync: --delete does not work without -r or -d.
And I don't want to use -r or -d flag since that would mean the subfolders would get copied as well, and I only want to copy files.
You want to delete everything in folder2, then copy every file from folder1? Try something like this:
rm folder2/* && cp folder1/* folder2/
(cp will not copy directories by default)
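If you would rather have the rsync variant you asked about, this sketch satisfies the "--delete does not work without -r or -d" complaint while still skipping subfolders (assuming stock rsync filter semantics):
# -a implies -r, which --delete requires; --exclude='*/' skips every
# directory, so only the top-level files are transferred
rsync -a --delete --exclude='*/' ~/folder1/ ~/folder2/
# note: directories already in folder2 are excluded and thus protected
# from --delete; add --delete-excluded if you want them removed too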

Recursively copy contents of directory to all target directories

I have a directory containing a set of subdirectories and files. I need to recursively copy all the content of this directory to all the subdirectories of another directory, also recursively.
How do I achieve this, preferably without using a script and only with the cp command?
You can write this in a script but you don't have to. Just write it line by line in the terminal:
# $TARGET is the directory containing subdirectories where you want to STORE the copies
# $SOURCE is the directory containing the subdirectories you want to COPY
for dir in "$TARGET"/*/; do
    cp -r "$SOURCE"/* "$dir"
done
Only uses cp and runs on both bash and zsh; globbing over "$TARGET"/*/ instead of parsing ls output also keeps it safe for directory names containing spaces.
You can't, with a single invocation. cp can copy multiple sources but will only copy to a single destination. You need to arrange to invoke cp multiple times, once per destination, using, as you say, a loop or some other tool.
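For example, find can drive that per-destination loop for you (a sketch, using the same $SOURCE and $TARGET names as the answer above):
# one cp invocation per immediate subdirectory of $TARGET;
# "$SOURCE"/. copies the contents, dot-files included
find "$TARGET" -mindepth 1 -maxdepth 1 -type d -exec cp -r "$SOURCE"/. {} \;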
The first part of the command before the pipe instructs tar to create an archive of everything in the current directory and write it to standard output (the - in place of a file name frequently indicates stdout).
tar cf - * | ( cd /target; tar xfp -)
The commands within parentheses cause the shell to change directory to the target directory and untar data from standard input. Since the cd and tar commands are contained within parentheses, their actions are performed together.
The -p option in the tar extraction command directs tar to preserve permission and ownership information, if possible given the user executing the command. If you are running the command as superuser, this option is turned on by default and can be omitted.
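One caveat: the * glob does not match hidden files. If the source contains dot-files, archiving . instead is safer:
# "." includes hidden files; && runs tar only if the cd succeeded
tar cf - . | ( cd /target && tar xpf - )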
You can also use the following command, but it seems to be quite a bit slower than tar:
cp -a * /target

How to duplicate a folder exactly

I am trying to copy a filesystem for a device I am programming for. After spending a lot of time trying to figure out why the filesystem I was installing wasn't working, I found out that cp didn't get the job done. I used du -s to check the size of the original filesystem and of the one I copied with cp -r; as it turns out, they differ by about 150 bytes.
Something is telling me that symbolic links or some sort of kernel objects aren't being copied correctly.
Is it possible to copy a folder/file system exactly? If so how would I go about it?
Try doing it the straightforward way:
cp -a src target
from man cp
-a, --archive
same as -dR --preserve=all
It preserves permissions, symlinks, and so on.
I tried all the code here on my Linux box. rsync, as proposed by @seanmcl, seems to be the right one; the others failed to keep owners and/or some special files, or ended in permission denied. The exact code is:
$ sudo rsync -aczvAXHS --progress /var/www/html /var/www/backup
Just remember to use just the directory name and not a wildcard (/*) at the end of the source and target names, otherwise the hidden files right below the source are not copied (the shell glob does not match dot-files). A trailing slash (/) is also significant to rsync: it changes the meaning from "copy the directory" to "copy its contents".
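To illustrate the trailing-slash point with the same paths:
# copies the directory itself: result is /var/www/backup/html
sudo rsync -a /var/www/html /var/www/backup
# trailing slash copies the contents of html (dot-files included)
# directly into /var/www/backup
sudo rsync -a /var/www/html/ /var/www/backup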
Another popular option is to use tar c source | (cd target && tar x). See this linuxdevcenter.com article.
The most accurate way I know of copying files is with cpio:
cd /path/to/source
find . -xdev -print0 | cpio -oa0V | (cd /path/to/target && cpio -idmV)
Not really easy to use, but it is very precise, preserving timestamps, owners, permissions and special files (-d makes cpio create leading directories on the extract side).
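When source and target are on the same machine, cpio's pass-through mode does it in one step (a sketch, assuming GNU cpio):
cd /path/to/source
# -p: pass-through; -d: create leading directories; -m: preserve mtimes
find . -xdev -print0 | cpio --null -pdmV /path/to/target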
rsync is the best way to copy a file system. There are myriad arguments that let you control exactly what is copied.
This is what I do, for example to duplicate directory A -> B:
$ mkdir B
$ cd A
$ cp -a ./ ../B

shell script to increment file names when a directory contents changes (centos)

I have a folder containing 100 pictures from a webcam. When the webcam sends a new picture, I want it to replace number 0 and have all the other jpgs move up one number. I've set up a script where inotify monitors a directory; when a new file is put into this directory, the script renumbers all the files in the picture directory, renames the newly uploaded picture, and puts it in the folder with the rest.
This script 'sort of' works. 'Sort of', because sometimes it does what it's supposed to do and sometimes it complains about missing files:
mv: cannot stat `webcam1.jpg': No such file or directory
Sometimes it complains about only one file, sometimes 4 or 5. Of course I made sure all 100 files were there, properly named before the script was run. After the script is run, the files it complains about are indeed missing.
This is the script, in the version I tested the full paths to the directories are used of course.
#!/bin/bash
dir1=/foo   # directory to be watched
while inotifywait -qqre modify "$dir1"; do
cd /f002   # directory where the images are
for i in {99..1}
do
    j=$((i+1))
    f1="webcam$i.jpg"
    f2="webcam$j.jpg"
    mv "$f1" "$f2"
done
rm webcam100.jpg
mv "$dir1"/*.jpg /f002/webcam0.jpg
done
I also need to implement some error checking, but for now I don't understand why it is missing files that are there.
You are executing the following mv commands:
mv webcam99.jpg webcam100.jpg
...
mv webcam1.jpg webcam2.jpg
The mv of webcam0.jpg to webcam1.jpg is missing. After the first change to "$dir1" you have the following files in /f002:
webcam99.jpg
...
webcam2.jpg
webcam0.jpg
After the next change to "$dir1" you will have:
webcam99.jpg
...
webcam3.jpg
webcam0.jpg
In other words -- you are forgetting to move webcam0.jpg to webcam1.jpg. I would modify your script like this:
rm webcam99.jpg
for i in {98..0}
do
    j=$((i+1))
    f1="webcam$i.jpg"
    f2="webcam$j.jpg"
    mv "$f1" "$f2"
done
mv "$dir1"/*.jpg /f002/webcam0.jpg
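For completeness, the whole corrected script might look like this (a sketch; it keeps the same /foo and /f002 paths from the question, quotes the variables, and guards the cd):
#!/bin/bash
dir1=/foo   # directory to be watched
while inotifywait -qqre modify "$dir1"; do
    cd /f002 || exit 1      # directory where the images are
    rm -f webcam99.jpg      # drop the oldest picture
    for i in {98..0}; do
        mv -f "webcam$i.jpg" "webcam$((i+1)).jpg" 2>/dev/null   # ignore gaps in the numbering
    done
    mv "$dir1"/*.jpg webcam0.jpg   # newest upload becomes number 0
done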

Modifying files nested in tar archive

I am trying to do a grep and then a sed to search for specific strings inside files, which are inside multiple tars, all inside one master tar archive. Right now, I modify the files by
First extracting the master tar archive.
Then extracting all the tars inside it.
Then doing a recursive grep and then sed to replace a specific string in files.
Finally packaging everything again into tar archives, and all the archives inside the master archive.
Pretty tedious. How do I do this automatically using shell scripting?
There isn't going to be much alternative to automating the steps you outline, for the reasons demonstrated by the caveats in the answer by Kimvais.
tar modify operations
The tar command has some options to modify existing tar files. They are, however, not appropriate for your scenario for multiple reasons, one of them being that it is the nested tarballs that need editing rather than the master tarball. So, you will have to do the work longhand.
Assumptions
Are all the archives in the master archive extracted into the current directory or into a named/created sub-directory? That is, when you run tar -tf master.tar.gz, do you see:
subdir-1.23/tarball1.tar
subdir-1.23/tarball2.tar
...
or do you see:
tarball1.tar
tarball2.tar
(Note that nested tars should not themselves be gzipped if they are to be embedded in a bigger compressed tarball.)
master_repackager
Assuming you have the subdirectory notation, then you can do:
for master in "$@"
do
tmp=$(pwd)/xyz.$$
trap "rm -fr $tmp; exit 1" 0 1 2 3 13 15
cat $master |
(
mkdir $tmp
cd $tmp
tar -xf -
cd * # There is only one directory in the newly created one!
process_tarballs *
cd ..
tar -czf - * # There is only one directory down here
) > new.$master
rm -fr $tmp
trap 0
done
If you're working in a malicious environment, use something less predictable than xyz.$$ for the directory name. However, this sort of repackaging is usually not done in a malicious environment, and the chosen name based on process ID is sufficient to give everything a unique name. The use of tar -f - for input and output allows you to switch directories but still handle relative pathnames on the command line. There are likely other ways to handle that if you want. I also used cat to feed the input to the sub-shell so that the top-to-bottom flow is clear; technically, I could improve things by using ) > new.$master < $master at the end, but that hides crucial information until multiple lines later.
The trap commands make sure that (a) if the script is interrupted (signals HUP, INT, QUIT, PIPE or TERM), the temporary directory is removed and the exit status is 1 (not success) and (b) once the subdirectory is removed, the process can exit with a zero status.
You might need to check whether new.$master exists before overwriting it. You might need to check that the extract operation actually extracted stuff. You might need to check whether the sub-tarball processing actually worked. If the master tarball extracts into multiple sub-directories, you need to convert the 'cd *' line into some loop that iterates over the sub-directories it creates.
All these issues can be skipped if you know enough about the contents and nothing goes wrong.
process_tarballs
The second script is process_tarballs; it processes each of the tarballs on its command line in turn, extracting the file, making the substitutions, repackaging the result, etc. One advantage of using two scripts is that you can test the tarball processing separately from the bigger task of dealing with a tarball containing multiple tarballs. Again, life will be much easier if each of the sub-tarballs extracts into its own sub-directory; if any of them extracts into the current directory, make sure you create a new sub-directory for it.
for tarball in "$@"
do
# Extract $tarball into sub-directory
tar -xf $tarball
# Locate appropriate sub-directory.
(
cd $subdirectory
find . -type f -print0 | xargs -0 sed -i 's/name/alternative-name/g'
)
mv $tarball old.$tarball
tar -cf $tarball $subdirectory
rm -f old.$tarball
done
You should add traps to clean up here, too, so the script can be run in isolation from the master script above and still not leave any intermediate directories around. In the context of the outer script, you might not need to be so careful to preserve the old tarball before the new one is created (so rm -f $tarball instead of the move-and-remove pair), but treated in its own right, the script should be careful not to damage anything.
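For instance, something along these lines inside the loop (a sketch in the same style as the master script) would put the original back if the repackaging is interrupted:
mv $tarball old.$tarball
# $tarball expands now, so the trap restores this iteration's file
trap "mv old.$tarball $tarball; exit 1" 1 2 3 13 15
tar -cf $tarball $subdirectory
rm -f old.$tarball
trap 1 2 3 13 15   # reset the signal handling, as with "trap 0" above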
Summary
What you're attempting is not trivial.
For debuggability, split the job into two scripts that can be tested independently.
Handling the corner cases is much easier when you know what is really in the files.
You can probably sed the tar stream directly, since tar itself does not do any compression.
e.g.
zcat archive.tar.gz|sed -e 's/foo/bar/g'|gzip > archive2.tar.gz
However, beware that this will also replace foo with bar in filenames, usernames and group names, and it ONLY works if foo and bar are of equal length.
