I have a bunch of directories that need to be restored, but they have to first be packaged into a .tar. Is there a script that would allow me to package all 100+ directories into their own tar so dir becomes dir.tar.
So far attempt:
for i in *; do tar czf $i.tar $i; done
The script that you wrote will not work if a directory name contains spaces, because the unquoted name will be split into several words. It will also archive regular files if any exist at this level.
You can use this command to list directories not recursively:
find . -maxdepth 1 -mindepth 1 -type d
and this one to perform a tar on each one:
find . -maxdepth 1 -mindepth 1 -type d -exec tar cvf {}.tar {} \;
Do you have any directory names with spaces in them at that level? If not, your script will work just fine.
What I usually do is write a script with the command I want to execute echoed out:
$ for i in *
do
echo tar czf $i.tar $i
done
Then you can look at the output and see if it's doing what you want. After you've determined that the program will work, edit the command line and remove the echo command.
If there are spaces in the directory names, then just put the variables inside double quotes:
for i in *
do
tar czf "$i.tar" "$i"
done
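A minimal sketch combining the two points above: quote the variable, and skip anything that is not a directory. It runs in a scratch directory made with mktemp, the directory and file names are invented for the demonstration, and plain -cf is used (no z) so that the .tar suffix matches the content.

```shell
#!/bin/sh
# Scratch area with hypothetical names: two directories and one stray file.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir "plain" "with space"
touch "plain/a.txt" "with space/b.txt" "stray-file.txt"

# The */ pattern only matches directories (and symlinks to directories).
for d in */; do
    d=${d%/}                 # strip the trailing slash left by the glob
    [ -d "$d" ] || continue  # covers the case where the glob matched nothing
    tar -cf "$d.tar" -- "$d"
done

ls
```

The stray file is left alone, and the directory with a space in its name ends up in a single archive.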
Get them all done simply and in parallel with GNU Parallel:
parallel tar -cf {}.tar {} ::: *
If you want to check what it is going to do without actually doing anything, add --dry-run like this:
parallel --dry-run tar -cf {}.tar {} ::: *
Sample Output
tar -cf ab.tar ab
tar -cf cd.tar cd
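If GNU Parallel is not installed, xargs -P gives a rough equivalent. This is only a sketch under assumptions: -0 and -P exist in GNU and BSD xargs but are not POSIX, the job count of 4 is arbitrary, and the directory names are invented for the demonstration.

```shell
#!/bin/sh
# Scratch area with hypothetical directory names.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir ab cd "e f"
touch ab/1 cd/2 "e f/3"

# Emit one NUL-terminated name per directory; xargs runs up to 4 tar jobs at once.
# Assumes at least one directory exists, otherwise the literal "*/" is passed on.
printf '%s\0' */ |
    xargs -0 -n 1 -P 4 sh -c 'd=${1%/}; tar -cf "$d.tar" -- "$d"' sh

ls
```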
If the number of directories is very large and their names are very long, then after running the first statement
for i in *
do
echo tar czf $i.tar $i
done
you will get the error "string too long".
I was searching for ways to create a bash file that would iterate all the folders in a directory, and create a tar.gz file for each of those directories.
(This is used specifically for ubuntu/drupal website - but could be useful in other scenarios.)
After lots of searching, combining scripts and testing, I found the following works very well when run from within the main directory.
This might differ slightly depending on the version of bash, the version of Ubuntu, and where you schedule or run the script from. (Run it by typing sh createDirectoryTarFiles.sh at the command line from within the parent folder.)
The echo line is not necessary - just for viewing purposes.
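for D in *; do
    if [ -d "${D}" ]; then
        # ${D%????} drops the last four characters of the directory name;
        # use tx="${D}" instead if the archive should keep the full name
        tx="${D%????}"
        echo "Directory is ${D} - and name of file would be $tx"
        tar -zcvf "$tx.tar.gz" "${D}"
    fi
done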
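You can use find to search for directories between mindepth and maxdepth and then create the tars. The basename call has to run inside sh -c; written directly on the find command line, $(basename {}) is expanded by the shell before find starts, so it never sees the real directory name:
find . -maxdepth 1 -mindepth 1 -type d -exec sh -c 'tar czf "$(basename "$1").tar.gz" "$1"' sh {} \;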
Let's say I have a bunch of *.tar.gz files located in a hierarchy of folders. What would be a good way to find those files, and then execute multiple commands on it.
I know if I just need to execute one command on the target file, I can use something like this:
$ find . -name "*.tar.gz" -exec tar xvzf {} \;
But what if I need to execute multiple commands on the target file? Must I write a bash script here, or is there any simpler way?
Samples of commands that need to be executed a A.tar.gz file:
$ tar xvzf A.tar.gz # assume it untars to folder logs
$ mv logs logs_A
$ rm A.tar.gz
Here's what works for me (thanks to Etan Reisner's suggestions)
#!/bin/bash
# the target folder (to search for tar.gz files) is taken from the command line
find "$1" -name "*.tar.gz" -print0 | while IFS= read -r -d '' file; do # this does the magic of reading each tar.gz file name into the shell variable `file`
    echo "$file" # then we can do everything with the `file` variable
    tar xvzf "$file"
    # mv untar_folder "$file.suffix" # untar_folder is the name of the folder after untar
    rm "$file"
done
As pointed out, the array approach is unsafe if a file name contains spaces, and it also doesn't seem to work properly in this case.
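For the extract / rename / delete sequence in the question, a POSIX sh alternative is to let find hand the names to a small inline script. This is a sketch rather than the method used above: the _extracted suffix and the sample tree are hypothetical, and each archive is unpacked into its own new directory instead of being renamed afterwards.

```shell
#!/bin/sh
# Build a sample tree (hypothetical names) holding one archive.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p "src/logs" "deep/nested dir"
echo "hello" > src/logs/app.log
tar -czf "deep/nested dir/A.tar.gz" -C src logs
rm -r src

# Run several commands per archive; "$@" keeps names with spaces intact.
find . -name '*.tar.gz' -type f -exec sh -c '
    for archive in "$@"; do
        dir=$(dirname "$archive")
        base=$(basename "$archive" .tar.gz)
        mkdir -p "$dir/${base}_extracted" &&
        tar -xzf "$archive" -C "$dir/${base}_extracted" &&
        rm -- "$archive"
    done
' sh {} +
```

Because the steps are chained with &&, an archive is only deleted after it has been extracted successfully.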
Writing a shell script is probably easiest. Take a look at sh for loops. You could use the output of a find command in an array, and then loop over that array to perform a set of commands on each element.
For example,
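# note: the unquoted $(...) is split on whitespace, so names containing spaces will break
arr=( $(find . -name "*.tar.gz") )
for i in "${arr[@]}"; do
    # $i now holds each of the filenames output by find
    tar xvzf "$i"
    # mv untar_folder "$i.suffix" # rename whatever the archive extracted to
    rm "$i"
    # etc., etc.
done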
I am trying to compress the files named in a list with the command
tar -czvf compress_file.tar.gz $(cat file_list.txt)
and I get an error
-bash: /bin/tar: Argument list too long
The number of files is too large. How can I resolve this?
Use the "-T" option to pass a file to tar that contains the filenames to tar up.
tar -czv -T file_list.txt -f tarball.tar.gz
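A small self-contained sketch of the -T approach, assuming GNU tar (bsdtar accepts -T as well). The file names are invented for the demonstration; the point is that the names never appear on tar's command line, so the argument-length limit does not apply.

```shell
#!/bin/sh
# Scratch area with hypothetical file names.
work=$(mktemp -d)
cd "$work" || exit 1
touch report.txt "name with space.txt" notes.md

# One path per line; spaces are fine, embedded newlines are not.
printf '%s\n' report.txt "name with space.txt" > file_list.txt

tar -czf tarball.tar.gz -T file_list.txt
tar -tzf tarball.tar.gz
```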
To make the list of files to tar up, first create the list:
ls > temp
then run tar (here -X FILE names a file of exclude patterns; drop it if nothing needs excluding):
tar cvzf dicionario_ultra.tgz -X FILE -T temp
and finally
rm temp
You can use find to avoid the issue: it lists the files under the current folder, -print writes one name per line, and tar reads that list from standard input via -T -
find . -type f -print | tar -cvf somefile.tar -T -
Alright, so simple problem here. I'm working on a simple backup script. It works fine except if the files have spaces in their names. This is how I'm finding files and adding them to a tar archive:
find . -type f | xargs tar -czvf backup.tar.gz
The problem is when the file has a space in the name because tar thinks that it's a folder. Basically is there a way I can add quotes around the results from find? Or a different way to fix this?
Use this:
find . -type f -print0 | tar -czvf backup.tar.gz --null -T -
It will:
deal with files with spaces, newlines, leading dashes, and other funniness
handle an unlimited number of files
not repeatedly overwrite your backup.tar.gz, as tar -c with xargs does when you have a large number of files
Also see:
GNU tar manual
How can I build a tar from stdin?, search for null
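The command above can be tried out in a scratch directory. The sketch below assumes GNU tar and a find that supports -print0; the file names are made up to include a space and a leading dash, and the archive is written outside the directory being scanned so that it does not pick itself up.

```shell
#!/bin/sh
# Scratch area with awkward, hypothetical file names.
work=$(mktemp -d)
mkdir "$work/data"
cd "$work/data" || exit 1
touch "plain.txt" "has space.txt" ./-leading-dash.txt

# NUL-terminated names from find, read by tar through --null -T -
find . -type f -print0 | tar -czf "$work/backup.tar.gz" --null -T -

count=$(tar -tzf "$work/backup.tar.gz" | wc -l)
echo "archived $count files"
```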
There could be another way to achieve what you want. Basically,
Use the find command to output path to whatever files you're looking for. Redirect stdout to a filename of your choosing.
Then tar with the -T option which allows it to take a list of file locations (the one you just created with find!)
find . -name "*.whatever" > yourListOfFiles
tar -cvf yourfile.tar -T yourListOfFiles
Try running:
find . -type f | xargs -d "\n" tar -czvf backup.tar.gz
Why not:
tar czvf backup.tar.gz *
Sure it's clever to use find and then xargs, but you're doing it the hard way.
Update: Porges has commented with a find-option that I think is a better answer than my answer, or the other one: find -print0 ... | xargs -0 ....
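If you have multiple files or directories and you want to compress each into its own *.gz file, you can do this. The -type f and -mtime tests are optional. (Note that tar -z produces a gzipped tar archive, so each .gz here is really a .tar.gz.)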
find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;
This will compress
httpd-log01.txt
httpd-log02.txt
to
httpd-log01.txt.gz
httpd-log02.txt.gz
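If plain gzip files are wanted rather than one-member tar archives, a sketch along the following lines may be closer to the intent. It is an assumption about what is wanted, not the command above: it calls gzip directly, leaves out the -mtime test so that the freshly created sample files match, and uses invented log contents.

```shell
#!/bin/sh
# Scratch area with two sample log files.
work=$(mktemp -d)
cd "$work" || exit 1
echo "first log" > httpd-log01.txt
echo "second log" > httpd-log02.txt

# gzip -c writes to stdout, so the original file is kept next to the .gz
find . -name 'httpd-log*.txt' -type f -exec sh -c '
    for f in "$@"; do
        gzip -c -- "$f" > "$f.gz"
    done
' sh {} +

ls
```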
I would add a comment to @Steve Kehlet's post but I need 50 rep (RIP).
For anyone that has found this post through numerous googling, I found a way to not only find specific files given a time range, but also NOT include the relative paths OR whitespaces that would cause tarring errors. (THANK YOU SO MUCH STEVE.)
find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -
. relative directory
-name "*.pdf" look for pdfs (or any file type)
-type f type to look for is a file
-mtime 0 look for files modified in the last 24 hours
-printf "%f\0" Regular -print0 OR -printf "%f" did NOT work for me. From man pages:
This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.
-czvf create archive, filter the archive through gzip , verbosely list files processed, archive name
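Because %f prints only the base name, tar has to be run from (or pointed at) the directory that holds the files. The sketch below assumes GNU find and GNU tar, limits the search to one level with -maxdepth 1 so base names stay unambiguous, and uses -C to tell tar where the files live. The docs directory and the file names are hypothetical.

```shell
#!/bin/sh
# Scratch area with hypothetical PDF names.
work=$(mktemp -d)
mkdir "$work/docs"
touch "$work/docs/a report.pdf" "$work/docs/b.pdf" "$work/docs/skip.txt"

# %f emits base names only, so -C tells tar which directory they are relative to.
find "$work/docs" -maxdepth 1 -name '*.pdf' -type f -printf '%f\0' |
    tar -czf "$work/pdfs.tar.gz" --null -C "$work/docs" -T -

tar -tzf "$work/pdfs.tar.gz"
```

The archive then lists the PDFs without any leading directory component.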
Edit 2019-08-14:
I would like to add that I was also able to use essentially the same command as above, using just tar itself:
tar -czvf /archiveDir/test.tar.gz --newer-mtime=0 --ignore-failed-read *.pdf
Needed --ignore-failed-read in-case there were no new PDFs for today.
Why not give something like this a try (quote the pattern so the shell does not expand it before find runs): tar cvf scala.tar `find src -name '*.scala'`
Another solution as seen here:
find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +
The best solution seems to be to create a file list and then archive the files from it, because you can build the list from other sources and do something else with it.
For example this allows using the list to calculate size of the files being archived:
#!/bin/sh
backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""
archivePath=$backupOutPath$backupFileName.tar.gz
listOfFilesPath=$backupOutPath$backupFileName.filelist
#
# Make a list of files/directories to archive
#
echo "" > $listOfFilesPath
echo "${backupRoot}/uploads" >> $listOfFilesPath
echo "${backupRoot}/extra/user/data" >> $listOfFilesPath
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> $listOfFilesPath
#
# Size calculation
#
sizeForProgress=`
cat $listOfFilesPath | while read nextFile;do
if [ ! -z "$nextFile" ]; then
du -sb "$nextFile"
fi
done | awk '{size+=$1} END {print size}'
`
#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf $archivePath -T $listOfFilesPath
## progress bar
sizeForShow=$(($sizeForProgress/1024/1024))
# printf instead of echo -e: not every /bin/sh (e.g. dash) understands echo -e
printf '\nRunning backup [source files are %s MiB]\n\n' "$sizeForShow"
tar -cPp -T $listOfFilesPath | pv -s $sizeForProgress | gzip > $archivePath
Big warning on several of the solutions (and your own test) :
When you do : anything | xargs something
xargs will try to fit "as many arguments as possible" after "something", but then you may end up with multiple invocations of "something".
So your attempt: find ... | xargs tar czvf file.tgz
may end up overwriting "file.tgz" at each invocation of "tar" by xargs, and you end up with only the last invocation! (the chosen solution uses a GNU -T special parameter to avoid the problem, but not everyone has that GNU tar available)
You could do instead:
find . -type f -print0 | xargs -0 tar -rvf backup.tar
gzip backup.tar
Proof of the problem on cygwin:
$ mkdir test
$ cd test
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs touch
# create the files
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar czvf archive.tgz
# will invoke tar several times as it can't fit 10000 long filenames into 1 command line
$ tar tzvf archive.tgz | wc -l
60
# on my own machine, I end up with only the last 60 filenames,
# as the last invocation of tar by xargs overwrote the previous one(s)
# proper way to invoke tar: with -r (which appends to an existing tar file, whereas c would overwrite it)
# caveat: you can't have it compressed (you can't add to a compressed archive)
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar rvf archive.tar #-r, and without z
$ gzip archive.tar
$ tar tzvf archive.tar.gz | wc -l
10000
# we have all our files, despite xargs making several invocations of the tar command
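The same effect can be reproduced deterministically with a handful of files by forcing xargs to split its input. In this sketch the -n 2 option and the file names f1 to f5 are chosen only for the demonstration; it relies on tar -r creating the archive when it does not exist yet, as in the transcript above.

```shell
#!/bin/sh
# Scratch area with five empty files.
work=$(mktemp -d)
cd "$work" || exit 1
touch f1 f2 f3 f4 f5

# -n 2 makes xargs run tar three times: (f1 f2), (f3 f4), (f5)
printf '%s\n' f1 f2 f3 f4 f5 | xargs -n 2 tar -cf overwritten.tar
printf '%s\n' f1 f2 f3 f4 f5 | xargs -n 2 tar -rf appended.tar

created=$(tar -tf overwritten.tar | wc -l)
appended=$(tar -tf appended.tar | wc -l)
echo "tar -c kept $created entries, tar -r kept $appended entries"
```

With -c only the last batch survives, while -r accumulates every batch.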
Note: that behavior of xargs is a well-known difficulty, and it is also why, when someone wants to do:
find .... | xargs grep "regex"
they instead have to write:
find ..... | xargs grep "regex" /dev/null
That way, even if the last invocation of grep by xargs appends only 1 filename, grep sees at least 2 filenames (each time it has /dev/null, where it won't find anything, plus the filename(s) appended by xargs after it) and thus will always display the file names when something matches "regex". Otherwise you may end up with the last results showing matches without a filename in front.
I tried this:
DIR=/path/tar/*.gz
if [ "$(ls -A $DIR 2> /dev/null)" == "" ]; then
echo "not gz"
else
tar -zxvf /path/tar/*.gz -C /path/tar
fi
If the folder has one tar file, it works. If it has several, I get an error.
How can I do this?
I have an idea to run a loop to untar, but I don't know how to solve this problem
for f in *.tar.gz
do
tar zxvf "$f" -C /path/tar
done
I find the find exec syntax very useful:
find . -name '*.tar.gz' -exec tar -xzvf {} \;
{} gets replaced with each file found and the line is executed.
for a in /path/tar/*.gz
do
tar -xzvf "$a" -C /path/tar
done
Notes
This presumes that files ending in .gz are gzipped tar files. Usually .tgz or .tar.gz is used to signify this, however tar will fail if something is not right.
You may find it easier to cd /path/tar first, then you can drop the -C /path/tar from the untar command, and the /path/tar/ in the loop.
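A self-contained sketch of the loop, with a guard for the case where no archive matches the pattern (without it, the loop would run once with the literal pattern as the file name). The in, out and src directories and the archive names are hypothetical.

```shell
#!/bin/sh
# Build two sample archives, one with a space in its name.
work=$(mktemp -d)
mkdir "$work/in" "$work/out" "$work/src"
echo one > "$work/src/one.txt"
echo two > "$work/src/two.txt"
tar -czf "$work/in/first.tar.gz" -C "$work/src" one.txt
tar -czf "$work/in/second archive.tar.gz" -C "$work/src" two.txt

# tar extracts one archive per invocation, hence the loop.
for f in "$work"/in/*.tar.gz; do
    [ -e "$f" ] || { echo "no archives found"; break; }
    tar -xzf "$f" -C "$work/out"
done

ls "$work/out"
```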
The accepted answer worked for me with a slight modification
for f in *.tar.gz
do
tar zxvf "$f" -C \name_of_destination_folder_inside_current_path
done
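I had to change the leading forward slash to a backslash and then it worked for me. (In the shell, \n is simply n, so this is the same as writing the relative path name_of_destination_folder_inside_current_path; with a leading / the path was absolute and pointed somewhere else.)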