Where I started / The problem.
I am trying to run a fairly complex kubectl command to copy files newer than a specific date from kubernetes to a local drive.
I am trying to take advantage of this command.
$ kubectl cp <file-spec-src> <file-spec-dest>
Which according to https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#cp
Is a "shorthand" for this command.
kubectl exec -n <some-namespace> <some-pod> -- tar cf - /tmp/foo | tar xf - -C /tmp/bar
This works, but it has no parameters for restricting by date.
What commands I came up with (which didn't work).
However, I do not just want to copy files, I want to copy specific files. In that pursuit I came up with 2 commands that both work on my local machine, but not when used with kubectl.
Command 1
My thought with command 1 was to solve the date problem first, then get the result into tar, and then out of tar. Pipes seemed appropriate until I got to the kubectl command.
Command 1 local.
find /foo -type d -maxdepth 3 -newermt '2/25/2021 0:00:00' -print0 | xargs -0 tar cf - | tar xf - -C /bar
Command 1 kubectl.
kubectl exec -n <namespace> <pod_name> -- "find /foo -type d -maxdepth 3 -newermt '2/25/2021 0:00:00' -print0 | xargs -0 tar cf - " | tar xf - -C /bar
The quotation marks ("") around find and the first pipe are there because those parts need to run in the kubernetes pod. The last pipe is there, as in the official command, to pipe to local disk.
Error
The command only returns an error, and not a useful one at that. What I can say is that removing the last pipe returns the same error.
no such file or directory: unknown
Command 2
My thought behind command 2 was that if pipes create too many problems for me, why not take advantage of find's -exec option and have only 1 pipe.
Command 2 local.
find /foo -type d -maxdepth 3 -newermt '2/25/2021 0:00:00' -exec tar -rvf - {} \; | tar xf - -C /bar
Command 2 kubectl.
kubectl exec -n <pod_name> -- find /foo -type d -maxdepth 3 -newermt '4/1/2021 0:00:00' -exec tar -cf - {} ; | tar xf - -C /bar
Error
This time the command does not return an error, but instead proceeds to copy every file it can find, even those outside the "-type", "-maxdepth" and "-newermt" parameters. So this command essentially does the same as just copying the entire folder.
Finally
I have no clue how to proceed from here. Is there any other combination that I could try, or is there some error in my commands that anyone could help me with?
Thanks :)
For now I am running it with a compromise that works.
kubectl exec -n <namespace> <pod_name> -- find /foo -type f -newermt '4/1/2021 0:00:00' -exec tar -cf - {} + | tar xf - -C ./bar --strip-components=3
This, however, takes longer to run, since it looks at every file and not just the top-level folders.
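One likely cause of the "no such file or directory" error in command 1: kubectl exec runs its arguments in the container as a program, without a shell, so a quoted pipeline is looked up as one long executable name. A common workaround is to hand the pipeline to sh -c inside the pod. The sketch below is only a local stand-in, with sh -c playing the role of kubectl exec ... -- sh -c; the temporary directories and dates are made up for illustration, and it assumes GNU find and GNU tar (a minimal container image may ship neither).

```shell
#!/bin/sh
# Local stand-in for the pod. In the real case the line starting with
# `sh -c` would be:
#   kubectl exec -n <namespace> <pod_name> -- sh -c '<same pipeline>'
# All paths are temporary and hypothetical.
work=$(mktemp -d)
mkdir -p "$work/foo/old" "$work/foo/new" "$work/bar"
echo old > "$work/foo/old/a.txt"
echo new > "$work/foo/new/b.txt"
touch -d '2021-01-01' "$work/foo/old/a.txt"
touch -d '2021-05-01' "$work/foo/new/b.txt"

# The find | tar pipeline is ONE argument to sh -c, so its pipe is
# interpreted by the shell on the "remote" side. The final pipe is
# outside the quotes and therefore runs locally.
sh -c 'cd "$1" && find . -type f -newermt "2021-04-01 00:00:00" -print0 | tar -cf - --null -T -' sh "$work/foo" \
  | tar -xf - -C "$work/bar"

copied=$(cd "$work/bar" && find . -type f | sort)
echo "$copied"
```

The sketch selects files rather than directories on purpose: tar archives a directory recursively, which would explain why command 2 copied files that the find tests should have excluded.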
It seems I've got a problem. I've got some different file types in my current directory, and I want to just tar the .png files. I started with this:
find -name "*.png" | tar -cvf backupp.tar
It wouldn't work because I didn't specify which files, so looking on how others did it, I added xargs:
find -name "*.png" | xargs tar -cvf backupp.tar
It did work this time, and backupp.tar file was created, but here is the problem. I can't seem to extract it. Whenever I type:
tar -xvf backupp.tar
Nothing happens. I've tried changing permissions with chmod and running it with sudo, but nothing helps.
So, did I type the wrong command completely, or is there something I just missed?
tar expects a list of names as arguments. Your use of xargs can be improved by adding the -print0 option to find and the -0 option to xargs, to ensure that find emits filenames separated by a nul-character and that xargs splits its input on the same character. This prevents any whitespace or other stray characters in the filenames from causing problems, e.g.
find dir -type f -name "*.png" -print0 | xargs -0 tar -cf tarfile.tar
The above will find all files in or below dir matching name "*.png" and provide a list of filenames separated by the nul-character to xargs for use by tar. You can list the files contained in the resulting archive with:
tar -tf tarfile.tar
Consider using compression (if wanted) by adding the z (gzip), j (bzip2) or J (xz) option and the appropriate extension to reduce your archive size, e.g.
... | xargs -0 tar -czf tarfile.tar.gz
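To see the nul-separated pipeline cope with awkward names, here is a small self-contained sketch. The directory layout and file names are invented for the demonstration; it assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do).

```shell
#!/bin/sh
# Hypothetical test tree with spaces in file and directory names.
work=$(mktemp -d)
mkdir -p "$work/dir/sub dir"
: > "$work/dir/plain.png"
: > "$work/dir/with space.png"
: > "$work/dir/sub dir/nested one.png"
: > "$work/dir/notes.txt"          # must NOT end up in the archive
cd "$work" || exit 1

# nul-separated names survive the trip through xargs intact
find dir -type f -name '*.png' -print0 | xargs -0 tar -cf tarfile.tar

# list what was archived
listing=$(tar -tf tarfile.tar)
printf '%s\n' "$listing"
```

With only a handful of files xargs runs tar once; with a very large number of files it may run tar several times, and each -c run would replace the archive written by the previous one.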
I'm on a RedHat Linux 6 machine, running Elasticsearch and Logstash. I have a bunch of log files that were rotated daily from back in June until August. I am trying to figure out the best way to tar them up to save some disk space, without manually tarring up each one. I'm a bit of a newbie at scripting, so I was wondering if someone could help me out? The files have the name elasticsearch-cluster.log.datestamp. Ideally they would all be in their individual tar files, so that it'd be easier to go back and take a look at that particular day's logs if needed.
You could use a loop :
for file in elasticsearch-cluster.log.*
do
tar zcvf "$file".tar.gz "$file"
done
Or if you prefer a one-liner (this is recursive):
find . -name 'elasticsearch-cluster.log.*' -print0 | xargs -0 -I {} tar zcvf {}.tar.gz {}
or as @chepner mentions with the -exec option:
find . -name 'elasticsearch-cluster.log.*' -exec tar zcvf {}.tar.gz {} \;
or if you want to exclude already zipped files:
find . -name 'elasticsearch-cluster.log.*' -not -name '*.tar.gz' -exec tar zcvf {}.tar.gz {} \;
If you don't mind all the files being in a single tar.gz file, you can do:
tar zcvf backups.tar.gz elasticsearch-cluster.log.*
All these commands leave the original files in place. After you validate the tar.gz files, you can delete them manually.
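If you would rather have the validation and deletion done for you, the loop can be extended along these lines. This is a sketch, not a tested backup procedure: the two datestamped file names are invented, and "validation" here only means that tar can list the freshly written archive without error.

```shell
#!/bin/sh
# Hypothetical log files standing in for the real rotated logs.
work=$(mktemp -d)
cd "$work" || exit 1
echo "june" > elasticsearch-cluster.log.2014-06-01
echo "july" > elasticsearch-cluster.log.2014-07-01

for file in elasticsearch-cluster.log.*; do
    # skip archives left over from an earlier run
    case $file in *.tar.gz) continue ;; esac
    # create the archive, then list it; listing fails on a corrupt archive
    if tar -zcf "$file.tar.gz" "$file" && tar -ztf "$file.tar.gz" > /dev/null; then
        rm -- "$file"
    else
        echo "keeping $file: archive failed verification" >&2
    fi
done
ls
```

The original is removed only when both tar commands succeed, so a full disk or an interrupted run leaves the log file in place.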
I found what I thought was a solution in this forum for finding my specific LOG files and then creating a TAR.GZ of them as a backup. However, when I execute the command I get an error. The part of the command before the pipe works great and finds the files that I need, but creating the backup file fails. Any suggestions/direction would be appreciated. Thanks.
Here is the command:
find /var/log/provenir -type f -name "*2014-09-08.log" | tar -cvzf backupProvLogFiles_20140908.tar.gz
Here is the error I'm getting:
find /var/log/provenir -type f -name "*2014-09-08.log" | tar -czvf backupProvLogFiles_20140908.tar.gz --null -T -
tar: Removing leading `/' from member names
tar: /var/log/provenir/BureauDE32014-09-08.log\n/var/log/provenir/DE_HTTP2014-09
-08.log\n/var/log/provenir/BureauDE22014-09-08.log\n/var/log/provenir/DE_HTTP220
14-09-08.log\n: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
You can also use gzip to do so
find /var/log/provenir -type f -name "*2014-09-08.log" | gzip > tar -cvzf backupProvLogFiles_20140908.tar
EDIT
A better solution would be to use command substitution
tar -cvzf backupProvLogFiles_20140908.tar $(find /var/log/provenir -type f -name "*2014-09-08.log")
I think you mean something like this:
find . -name "*XYZ*" -type f -print | tar -cvz -T - -f SomeFile.tgz
I was finally able to find a solution just in case someone else might be looking for another option to answer this question:
find /var/log/provenir -type f -name "*2014-09-08.log" -print0 | tar -czvf /var/log/provenir/barchive/backupProvLogFile_20140908.tar.gz --null -T -
This worked great. The answer came from this post: Find files and tar them (with spaces)
Thanks again for the help I received.
Regards.
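The "Cannot stat" error in the question is what you would expect when the separators on the two sides of the pipe do not match: find printed newline-separated names, but --null told tar to split on nul, so tar saw one long name with embedded \n. A minimal sketch of both cases, using made-up log names in a temporary directory and assuming GNU tar:

```shell
#!/bin/sh
# Hypothetical log files; one has a space in its name.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir logs
: > "logs/a 2014-09-08.log"
: > "logs/b2014-09-08.log"

# Mismatch: newline-separated names, but tar expects nul separators.
if find logs -type f -name '*2014-09-08.log' -print \
     | tar -czf bad.tar.gz --null -T - 2>/dev/null; then
    mismatch_status=ok
else
    mismatch_status=failed
fi

# Match: nul-separated on both sides of the pipe.
find logs -type f -name '*2014-09-08.log' -print0 \
  | tar -czf good.tar.gz --null -T -
good_count=$(tar -tzf good.tar.gz | wc -l)

echo "mismatched separators: $mismatch_status"
echo "matched separators archived $good_count files"
```

Either pairing works as long as it is consistent: -print with plain -T -, or -print0 with --null -T -. Only the nul pairing is safe for names that contain newlines.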
Alright, so simple problem here. I'm working on a simple back up code. It works fine except if the files have spaces in them. This is how I'm finding files and adding them to a tar archive:
find . -type f | xargs tar -czvf backup.tar.gz
The problem is when the file has a space in the name because tar thinks that it's a folder. Basically is there a way I can add quotes around the results from find? Or a different way to fix this?
Use this:
find . -type f -print0 | tar -czvf backup.tar.gz --null -T -
It will:
deal with files with spaces, newlines, leading dashes, and other funniness
handle an unlimited number of files
won't repeatedly overwrite your backup.tar.gz like using tar -c with xargs will do when you have a large number of files
Also see:
GNU tar manual
How can I build a tar from stdin?, search for null
There could be another way to achieve what you want. Basically,
Use the find command to output path to whatever files you're looking for. Redirect stdout to a filename of your choosing.
Then tar with the -T option which allows it to take a list of file locations (the one you just created with find!)
find . -name "*.whatever" > yourListOfFiles
tar -cvf yourfile.tar -T yourListOfFiles
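A quick sketch of why the intermediate list can be handy: the same file can be inspected or counted before tar ever runs. The file names are invented, and because the list is newline-separated this assumes none of the names contain a newline.

```shell
#!/bin/sh
# Hypothetical files in a temporary directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir data
: > data/one.whatever
: > data/two.whatever
: > data/skip.txt

# 1. write the list, 2. inspect it, 3. archive exactly what it names
find . -name '*.whatever' > yourListOfFiles
file_count=$(wc -l < yourListOfFiles)
tar -cf yourfile.tar -T yourListOfFiles
archived_count=$(tar -tf yourfile.tar | wc -l)

echo "listed $file_count files, archived $archived_count"
```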
Try running:
find . -type f | xargs -d "\n" tar -czvf backup.tar.gz
Why not:
tar czvf backup.tar.gz *
Sure it's clever to use find and then xargs, but you're doing it the hard way.
Update: Porges has commented with a find-option that I think is a better answer than my answer, or the other one: find -print0 ... | xargs -0 ....
If you have multiple files or directories and you want to compress each of them into its own *.gz file, you can do this. The -type f and time tests (-mtime, -atime) are optional.
find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;
This will compress
httpd-log01.txt
httpd-log02.txt
to
httpd-log01.txt.gz
httpd-log02.txt.gz
I would add a comment to @Steve Kehlet's post, but that needs 50 rep (RIP).
For anyone who has found this post after much googling: I found a way to not only find specific files in a given time range, but also NOT include the relative paths OR trip over whitespace that would cause tarring errors. (THANK YOU SO MUCH STEVE.)
find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -
. relative directory
-name "*.pdf" look for pdfs (or any file type)
-type f type to look for is a file
-mtime 0 look for files modified in the last 24 hours
-printf "%f\0" Regular -print0 OR -printf "%f" did NOT work for me. From man pages:
This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.
-czvf create archive, filter the archive through gzip , verbosely list files processed, archive name
Edit 2019-08-14:
I would like to add that I was also able to get essentially the same result as the command above using tar by itself:
tar -czvf /archiveDir/test.tar.gz --newer-mtime=0 --ignore-failed-read *.pdf
I needed --ignore-failed-read in case there were no new PDFs for today.
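One caveat with -printf "%f\0": %f keeps only the base name, so tar can find the files only if they all sit in the directory tar is run from. A variation (my assumption, not part of the commands above) is to print the path relative to the search root with %P and point tar at that root with -C. The sketch uses invented PDF names in a temporary directory and assumes GNU find and GNU tar.

```shell
#!/bin/sh
# Hypothetical tree: one PDF at the top, one in a subdirectory.
work=$(mktemp -d)
mkdir -p "$work/docs/sub" "$work/out"
: > "$work/docs/top report.pdf"
: > "$work/docs/sub/nested.pdf"

# %P prints each path relative to the starting point ($work/docs).
# tar -C changes into that directory before resolving the names it
# reads from stdin, so the archive holds no leading directories.
find "$work/docs" -name '*.pdf' -type f -mtime 0 -printf '%P\0' \
  | tar -czf "$work/out/zip.tar.gz" -C "$work/docs" --null -T -

members=$(tar -tzf "$work/out/zip.tar.gz")
printf '%s\n' "$members"
```

This keeps subdirectory structure below the search root while still dropping everything above it, and it works no matter which directory the command is started from.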
Why not give something like this a try: tar cvf scala.tar `find src -name '*.scala'`
Another solution as seen here:
find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +
The best solution seems to be to create a file list first and then archive the files, because you can build the list from other sources and do something else with it.
For example this allows using the list to calculate size of the files being archived:
#!/bin/sh
backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""
archivePath=$backupOutPath$backupFileName.tar.gz
listOfFilesPath=$backupOutPath$backupFileName.filelist
#
# Make a list of files/directories to archive
#
echo "" > $listOfFilesPath
echo "${backupRoot}/uploads" >> $listOfFilesPath
echo "${backupRoot}/extra/user/data" >> $listOfFilesPath
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> $listOfFilesPath
#
# Size calculation
#
sizeForProgress=`
cat $listOfFilesPath | while read nextFile;do
if [ ! -z "$nextFile" ]; then
du -sb "$nextFile"
fi
done | awk '{size+=$1} END {print size}'
`
#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf $archivePath -T $listOfFilesPath
## progress bar
sizeForShow=$(($sizeForProgress/1024/1024))
printf '\nRunning backup [source files are %s MiB]\n\n' "$sizeForShow"
tar -cPp -T $listOfFilesPath | pv -s $sizeForProgress | gzip > $archivePath
A big warning about several of the solutions (and your own test):
When you do: anything | xargs something
xargs will try to fit "as many arguments as possible" after "something", but then you may end up with multiple invocations of "something".
So your attempt: find ... | xargs tar czvf file.tgz
may end up overwriting "file.tgz" at each invocation of "tar" by xargs, and you end up with only the last invocation! (the chosen solution uses a GNU -T special parameter to avoid the problem, but not everyone has that GNU tar available)
You could do instead:
find . -type f -print0 | xargs -0 tar -rvf backup.tar
gzip backup.tar
Proof of the problem on cygwin:
$ mkdir test
$ cd test
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs touch
# create the files
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar czvf archive.tgz
# will invoke tar several times, as it can't fit 10000 long filenames into 1 invocation
$ tar tzvf archive.tgz | wc -l
60
# on my own machine, I end up with only the last 60 filenames,
# as the last invocation of tar by xargs overwrote the previous one(s)
# proper way to invoke tar: with -r (which appends to an existing tar file, whereas c would overwrite it)
# caveat: you can't have it compressed (you can't add to a compressed archive)
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar rvf archive.tar #-r, and without z
$ gzip archive.tar
$ tar tzvf archive.tar.gz | wc -l
10000
# we have all our files, despite xargs making several invocations of the tar command
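The effect can also be reproduced with just a few files by forcing xargs to split its input. In this sketch -n 2 (at most two arguments per invocation) stands in for the argument-length limit; the file names are made up, and it relies on tar -r creating the archive when it does not exist yet, as GNU tar does.

```shell
#!/bin/sh
# Five hypothetical files in a temporary directory.
work=$(mktemp -d)
cd "$work" || exit 1
touch f1 f2 f3 f4 f5

# -n 2 makes xargs run tar three times: (f1 f2) (f3 f4) (f5).
# With -c each run recreates the archive, so only f5 survives.
printf '%s\n' f1 f2 f3 f4 f5 | xargs -n 2 tar -cf overwritten.tar
overwritten_count=$(tar -tf overwritten.tar | wc -l)

# With -r each run appends, so all five files are kept.
printf '%s\n' f1 f2 f3 f4 f5 | xargs -n 2 tar -rf appended.tar
appended_count=$(tar -tf appended.tar | wc -l)

echo "create mode kept $overwritten_count, append mode kept $appended_count"
```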
Note: this behavior of xargs is a well-known difficulty, and it is also why, when someone wants to do:
find .... | xargs grep "regex"
they instead have to write:
find ..... | xargs grep "regex" /dev/null
That way, even if the last invocation of grep by xargs appends only 1 filename, grep sees at least 2 filenames (each time it gets /dev/null, where it won't find anything, plus the filename(s) appended by xargs after it) and thus always displays the file name when something matches "regex". Otherwise you may end up with the last results showing matches without a filename in front.
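A tiny demonstration of the difference, using a single made-up file so that grep receives exactly one name from xargs:

```shell
#!/bin/sh
# One hypothetical file in a temporary directory.
work=$(mktemp -d)
cd "$work" || exit 1
echo "hello world" > only.txt

# grep gets one file name: the match is printed without a prefix
without_devnull=$(printf '%s\n' only.txt | xargs grep "hello")

# grep gets /dev/null plus one file name: the match is prefixed
with_devnull=$(printf '%s\n' only.txt | xargs grep "hello" /dev/null)

echo "without: $without_devnull"
echo "with:    $with_devnull"
```

Where it is available (GNU and BSD grep both have it), grep -H forces the file name prefix without the /dev/null trick.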