for loops returned with a ">" - linux

I would like to extract all the tar.gz files in a directory.
I have searched in the internet and used the command:
for i in *.tar.gz; do tar -xzvf $i; done
However, the terminal returned with a > at the following line.
Can somebody enlighten me with what is the situation here? I would really appreciate your help.

Does your tar.gz files have spaces or special charas in their names? just a guess, try this:
for i in *.tar.gz; do tar -xzvf "$i"; done

I think you might need to specify the output folder for your zip files.
This is what the > stands for normally.
for i in *.tar.gz; do tar xvzf $i -C path/to/output/directory; done
This did the trick when i used it some weeks ago.

Hmm your solution should work in my opinion. You can try something simpler
ls *.tar.gz | xargs -L 1 tar -xzvf

Try this
find -name '*.tar.*' | xargs -I% tar -xvf %

No one suggested find -print0 | xargs -0 for filenames with spaces, here it is:
/tmp/foo$ ls -1
a.tar.gz
b.tar.gz
c and d.tar.gz
/tmp/foo$ find . -name "*.tar.gz" -print0 | xargs -0 -I{} tar -xzvf {}
a
c
d
b
/tmp/foo$ ls -1
a
a.tar.gz
b
b.tar.gz
c
c and d.tar.gz
d
Your solution will fail on my input data because of the file "c and d.tar.gz" which contains a space but maybe that's not the error you experienced. The one from A.M.D will work.

Related

Extract and delete all .gz in a directory- Linux

I have a directory. It has about 500K .gz files.
How can I extract all .gz in that directory and delete the .gz files?
This should do it:
gunzip *.gz
#techedemic is correct but is missing '.' to mention the current directory, and this command go throught all subdirectories.
find . -name '*.gz' -exec gunzip '{}' \;
There's more than one way to do this obviously.
# This will find files recursively (you can limit it by using some 'find' parameters.
# see the man pages
# Final backslash required for exec example to work
find . -name '*.gz' -exec gunzip '{}' \;
# This will do it only in the current directory
for a in *.gz; do gunzip $a; done
I'm sure there's other ways as well, but this is probably the simplest.
And to remove it, just do a rm -rf *.gz in the applicable directory
Extract all gz files in current directory and its subdirectories:
find . -name "*.gz" | xargs gunzip
If you want to extract a single file use:
gunzip file.gz
It will extract the file and remove .gz file.
for foo in *.gz
do
tar xf "$foo"
rm "$foo"
done
Try:
ls -1 | grep -E "\.tar\.gz$" | xargs -n 1 tar xvfz
Then Try:
ls -1 | grep -E "\.tar\.gz$" | xargs -n 1 rm
This will untar all .tar.gz files in the current directory and then delete all the .tar.gz files. If you want an explanation, the "|" takes the stdout of the command before it, and uses that as the stdin of the command after it. Use "man command" w/o the quotes to figure out what those commands and arguments do. Or, you can research online.

Use grep -lr output to add files to tar

In UBUNTU and CENTOS.
I have some files I want to tar based on their contents.
$ grep -rl "123.45" .
returns a list of about 10 files in this kind of format:
./somefolder/someotherfolder/somefile.txt
./anotherfolder/anotherfile.txt
etc...
I want to tar.gz all of them.
I tried:
$ grep -rl "123.45" . | tar -czf files.tar.gz
Doesn't work. That's why I'm here. Any ideas? Thanks.
Just tried this, and it worked in Ubuntu, but in CentOS I get "tar: 02: Cannot stat: No such file or directory".
$ tar -czf test.tar.gz `grep -rl "123.45" .`
If anyone else has a better way, let me know. That above one works great in Ubuntu, at least.
Like this:
... | tar -T - -czf files.tar.gz
"-T -" causes tar to read filenames from stdin. Second minus stands for stdin. –
grep -rl "123.45" . | xargs tar -czf files.tar.gz
Tar wants to be told what files to process, not given the names of the files via stdin.
It does however have a -T / --files-from option. So I'd suggest using that. Output your list of selected files to a temp file and then have tar read that, like this:
T=$(mktemp)
grep -rl "123.45" . > $T
tar cfz files.tar.gz -T $T
rm -f $T
If you want, you can also use shell expansion to do it like this:
tar cfz files.tar.gz -- $(grep -rl "123.45" .)
But that will fail if you have too many files or if any of the files have strange names (like spaces etc).

How to delete all files that were recently created in a directory in linux?

I untarred something into a directory that already contained a lot of things. I wanted to untar into a separate directory instead. Now there are too many files to distinguish between. However the files that I have untarred have been created just now (right ?) and the original files haven’t been modified for long (at least a day). Is there a way to delete just these untarred files based on their creation information ?
Tar usually restores file timestamps, so filtering by time is not likely to work.
If you still have the tar file, you can use it to delete what you unpacked with something like:
tar tf file.tar --quoting-style=shell-always |xargs rm -i
The above will work in most cases, but not all (filenames that have a carriage return in them will break it), so be careful.
You could remove the directories by adding -r to that, but it's probably safer to just remove the toplevel directories manually.
find . -mtime -1 -type f | xargs rm
but test first with
find . -mtime -1 -type f | xargs echo
There are several different answers to this question in order of increasing complexity.
First, if this is a one off, and in this particular instance you are absolutely sure that there are no weird characters in your filenames (spaces are OK, but not tabs, newlines or other control characters, nor unicode characters) this will work:
tar -tf file.tar | egrep '^(\./)?[^/]+(/)?$' | egrep -v '^\./$' | tr '\n' '\0' | xargs -0 rm -r
All that egrepping is to skip out on all the subdirectories of the subdirectories.
Another way to do this that works with funky filenames is this:
mkdir foodir
cd foodir
tar -xf ../file.tar
for file in *; do rm -rf ../"$file"; done
That will create a directory in which your archive has been expanded, but it sounds like you wanted that already anyway. It also will not handle any files who's names start with ..
To make that method work with files that start with ., do this:
mkdir foodir
cd foodir
tar -xf ../file.tar
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 sh -c 'for file in "$#"; do rm -rf ../"$file"; done' junk
Lastly, taking from Mat's answer, you can do this and it will work for any filename and not require you to untar the directory again:
tar -tf file.tar | egrep '^(\./)?[^/]+(/)?$' | grep -v '^\./$' | tr '\n' '\0' | xargs -0 bash -c 'for fname in "$#"; do fname="$(echo -ne "$fname")"; echo -n "$fname"; echo -ne "\0"; done' junk | xargs -0 rm -r
You can handle files and directories in one pass with:
tar -tf ../test/bob.tar --quoting-style=shell-always | sed -e "s/^\(.*\/\)'$/rmdir \1'/; t; s/^\(.*\)$/rm \1/;" | sort | bash
You can see what is going to happen leave off the pipe to 'bash'
tar -tf ../test/bob.tar --quoting-style=shell-always | sed -e "s/^\(.*\/\)'$/rmdir \1'/; t; s/^\(.*\)$/rm \1/;" | sort
to handle filenames with linefeeds you need more processing.

Linux: Find a List of Files in a Dictionary recursively

I have a Textfile with one Filename per row:
Interpret 1 - Song 1.mp3
Interpret 2 - Song 2.mp3
...
(About 200 Filenames)
Now I want to search a Folder recursivly for this Filenames to get the full path for each Filename in Filenames.txt.
How to do this? :)
(Purpose: Copied files to my MP3-Player but some of them are broken and i want to recopy them all without spending hours of researching them out of my music folder)
The easiest way may be the following:
cat orig_filenames.txt | while read file ; do find /dest/directory -name "$file" ; done > output_file_with_paths
Much faster way is run the find command only once and use fgrep.
find . -type f -print0 | fgrep -zFf ./file_with_filenames.txt | xargs -0 -J % cp % /path/to/destdir
You can use a while read loop along with find:
filecopy.sh
#!/bin/bash
while read line
do
find . -iname "$line" -exec cp '{}' /where/to/put/your/files \;
done < list_of_files.txt
Where list_of_files.txt is the list of files line by line, and /where/to/put/your/files is the location you want to copy to. You can just run it like so in the directory:
$ bash filecopy.sh
+1 for #jm666 answer, but the -J option doesn't work for my flavor of xargs, so i chaned it to:
find . -type f -print0 | fgrep -zFf ./file_with_filenames.txt | xargs -0 -I{} cp "{}" /path/to/destdir/

Find files and tar them (with spaces)

Alright, so simple problem here. I'm working on a simple back up code. It works fine except if the files have spaces in them. This is how I'm finding files and adding them to a tar archive:
find . -type f | xargs tar -czvf backup.tar.gz
The problem is when the file has a space in the name because tar thinks that it's a folder. Basically is there a way I can add quotes around the results from find? Or a different way to fix this?
Use this:
find . -type f -print0 | tar -czvf backup.tar.gz --null -T -
It will:
deal with files with spaces, newlines, leading dashes, and other funniness
handle an unlimited number of files
won't repeatedly overwrite your backup.tar.gz like using tar -c with xargs will do when you have a large number of files
Also see:
GNU tar manual
How can I build a tar from stdin?, search for null
There could be another way to achieve what you want. Basically,
Use the find command to output path to whatever files you're looking for. Redirect stdout to a filename of your choosing.
Then tar with the -T option which allows it to take a list of file locations (the one you just created with find!)
find . -name "*.whatever" > yourListOfFiles
tar -cvf yourfile.tar -T yourListOfFiles
Try running:
find . -type f | xargs -d "\n" tar -czvf backup.tar.gz
Why not:
tar czvf backup.tar.gz *
Sure it's clever to use find and then xargs, but you're doing it the hard way.
Update: Porges has commented with a find-option that I think is a better answer than my answer, or the other one: find -print0 ... | xargs -0 ....
If you have multiple files or directories and you want to zip them into independent *.gz file you can do this. Optional -type f -atime
find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;
This will compress
httpd-log01.txt
httpd-log02.txt
to
httpd-log01.txt.gz
httpd-log02.txt.gz
Would add a comment to #Steve Kehlet post but need 50 rep (RIP).
For anyone that has found this post through numerous googling, I found a way to not only find specific files given a time range, but also NOT include the relative paths OR whitespaces that would cause tarring errors. (THANK YOU SO MUCH STEVE.)
find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -
. relative directory
-name "*.pdf" look for pdfs (or any file type)
-type f type to look for is a file
-mtime 0 look for files created in last 24 hours
-printf "%f\0" Regular -print0 OR -printf "%f" did NOT work for me. From man pages:
This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.
-czvf create archive, filter the archive through gzip , verbosely list files processed, archive name
Edit 2019-08-14:
I would like to add, that I was also able to use essentially use the same command in my comment, just using tar itself:
tar -czvf /archiveDir/test.tar.gz --newer-mtime=0 --ignore-failed-read *.pdf
Needed --ignore-failed-read in-case there were no new PDFs for today.
Why not give something like this a try: tar cvf scala.tar `find src -name *.scala`
Another solution as seen here:
find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +
The best solution seem to be to create a file list and then archive files because you can use other sources and do something else with the list.
For example this allows using the list to calculate size of the files being archived:
#!/bin/sh
backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""
archivePath=$backupOutPath$backupFileName.tar.gz
listOfFilesPath=$backupOutPath$backupFileName.filelist
#
# Make a list of files/directories to archive
#
echo "" > $listOfFilesPath
echo "${backupRoot}/uploads" >> $listOfFilesPath
echo "${backupRoot}/extra/user/data" >> $listOfFilesPath
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> $listOfFilesPath
#
# Size calculation
#
sizeForProgress=`
cat $listOfFilesPath | while read nextFile;do
if [ ! -z "$nextFile" ]; then
du -sb "$nextFile"
fi
done | awk '{size+=$1} END {print size}'
`
#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf $archivePath -T $listOfFilesPath
## progress bar
sizeForShow=$(($sizeForProgress/1024/1024))
echo -e "\nRunning backup [source files are $sizeForShow MiB]\n"
tar -cPp -T $listOfFilesPath | pv -s $sizeForProgress | gzip > $archivePath
Big warning on several of the solutions (and your own test) :
When you do : anything | xargs something
xargs will try to fit "as many arguments as possible" after "something", but then you may end up with multiple invocations of "something".
So your attempt: find ... | xargs tar czvf file.tgz
may end up overwriting "file.tgz" at each invocation of "tar" by xargs, and you end up with only the last invocation! (the chosen solution uses a GNU -T special parameter to avoid the problem, but not everyone has that GNU tar available)
You could do instead:
find . -type f -print0 | xargs -0 tar -rvf backup.tar
gzip backup.tar
Proof of the problem on cygwin:
$ mkdir test
$ cd test
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs touch
# create the files
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar czvf archive.tgz
# will invoke tar several time as it can'f fit 10000 long filenames into 1
$ tar tzvf archive.tgz | wc -l
60
# in my own machine, I end up with only the 60 last filenames,
# as the last invocation of tar by xargs overwrote the previous one(s)
# proper way to invoke tar: with -r (which append to an existing tar file, whereas c would overwrite it)
# caveat: you can't have it compressed (you can't add to a compressed archive)
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar rvf archive.tar #-r, and without z
$ gzip archive.tar
$ tar tzvf archive.tar.gz | wc -l
10000
# we have all our files, despite xargs making several invocations of the tar command
Note: that behavior of xargs is a well know diccifulty, and it is also why, when someone wants to do :
find .... | xargs grep "regex"
they intead have to write it:
find ..... | xargs grep "regex" /dev/null
That way, even if the last invocation of grep by xargs appends only 1 filename, grep sees at least 2 filenames (as each time it has: /dev/null, where it won't find anything, and the filename(s) appended by xargs after it) and thus will always display the file names when something maches "regex". Otherwise you may end up with the last results showing matches without a filename in front.

Resources