Linux tar command -t option to show max-depth=1

A tar.gz file may contain many folders, and each of those folders may in turn contain many files and subfolders. I want to list only the first level of the archive's contents.
For example, for this tar.gz file I only want to see auth, help, xa, install.txt, license.txt, release.txt, sqljdbc.jar and sqljdbcr.jar.
How do I write this command?

Since tar t itself does not have an option to limit the depth to which the tarball contents are listed, you need to take the full listing and reduce it to what you want.
Since this means tar will list the full archive in any case, it will not be faster than a full listing.
tar tzf <tarball> | sed "s#/.*##" | sort -u
Note that, for all well-behaved tarballs, this will only give one entry, of the same name as the tarball.
Real-world example:
$ tar tjf gcc-5.2.0.tar.bz2 | sed "s#/.*##" | sort -u
gcc-5.2.0
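If you want the same kind of listing down to a deeper level, a small variation of the above (a sketch; the 2 stands for the desired depth) is to keep the first N path components instead of only the first one:
tar tzf <tarball> | cut -d/ -f1-2 | sort -u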
Tarballs that splatter the extraction directory with files and subfolders are commonly called tarbombs.

Related

Bash Scripting with xargs to BACK UP files

I need to copy a file from multiple locations to a backup directory while retaining its directory structure. For example, I have a file "a.txt" at the following locations: /a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt, and I now need to copy this file from those locations to the backup directory /tmp/backup. The end result should be:
When I list /tmp/backup/a, it should contain /b/a.txt, /c/a.txt, /d/a.txt and /e/a.txt.
For this, I used the command: echo /a/*/a.txt | xargs -I {} -n 1 sudo cp --parent -vp {} /tmp/backup. It throws the error "cp: cannot stat '/a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt': No such file or directory".
The -I option is taking the complete output of echo as a single value instead of splitting it into individual values (the way -n 1 would). If someone can help debug this issue rather than providing an alternative command, that would be very helpful.
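For reference, a sketch of one way to make the original xargs approach behave (assuming GNU cp, for --parents, and that the four example paths really exist under /a): feed it one path per line, because -I consumes whole input lines and echo puts everything on a single line:
printf '%s\n' /a/*/a.txt | xargs -I {} sudo cp --parents -vp {} /tmp/backup
Here --parents is the full spelling of the cp option that recreates the source path under the destination.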
Use rsync with the --relative (-R) option to keep (parts of) the source paths.
I've used a wildcard for the source to match your example command rather than the explicit list of directories mentioned in your question.
rsync -avR /a/*/a.txt /tmp/backup/
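With --relative, the full source path is reproduced under the destination, so with the example wildcard the backup tree should end up looking like:
/tmp/backup/a/b/a.txt
/tmp/backup/a/c/a.txt
/tmp/backup/a/d/a.txt
/tmp/backup/a/e/a.txt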
Do the backups need to be exactly the same as the originals? In most cases, I'd prefer a little compression. [tar](https://man7.org/linux/man-pages/man1/tar.1.html) does a great job of bundling things including the directory structure.
tar cvzf /path/to/backup/tarball.tgz /source/path/
tar can't update compressed archives, so to use updates you have to skip the compression:
tar uf /path/to/backup/tarball.tar /source/path/
This gives you versioning of a sort, since it only appends files that are newer than the copies already in the archive, while keeping the earlier versions as well.
If you have time and cycles and still want the compression, you can decompress before and recompress after.
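A sketch of that decompress/update/recompress cycle (the file names follow the examples above and are only placeholders):
gunzip /path/to/backup/tarball.tar.gz             # leaves tarball.tar
tar uvf /path/to/backup/tarball.tar /source/path/ # append anything newer than the archived copy
gzip -9 /path/to/backup/tarball.tar               # back to tarball.tar.gz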

GZip an entire directory

I used the following:
gzip -9 -c -r <some_directory> > directory.gz
How do I decompress this directory? I have tried
gunzip directory.gz
but I am just left with a single file and not a directory structure.
As others have already mentioned, gzip is a file compression tool and not an archival tool. It cannot work with directories. When you run it with -r, it will find all files in a directory hierarchy and compress them, i.e. replacing path/to/file with path/to/file.gz. When you pass -c the gzip output is written to stdout instead of creating files. You have effectively created one big file which contains several gzip-compressed files.
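As an illustration of what that flat stream is (a sketch using the file name from the question), decompressing it simply concatenates every compressed member into one output, with no file names or paths:
zcat directory.gz > all-contents.txt    # every file's contents, back to back, no structure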
Now, you could look for the gzip file header/magic number, which is 1f8b and then reconstruct your files manually.
The sensible thing to do now is to create backups (if you haven't already). Backups always help (especially with problems such as yours). Create a backup of your directory.gz file now. Then read on.
Fortunately, there's an easier way than manually reconstructing all files: using binwalk, a forensics utility which can be used to extract files from within other files. I tried it with a test file, which was created the same way as yours. Running binwalk -e file.gz will create a folder with all extracted files. It even manages to reconstruct the original file names. The hierarchy of the directories is probably lost. But at least you have your file contents and their names back. Good luck!
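A sketch of that recovery, assuming binwalk is installed (the extraction directory name is binwalk's usual default and may differ between versions):
cp directory.gz directory.gz.bak    # work on a copy, as advised above
binwalk -e directory.gz             # extracted files usually land under _directory.gz.extracted/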
Remember: backups are essential.
(For completeness' sake: What you probably intended to run: tar czf directory.tar.gz directory and then tar xf directory.tar.gz)
gzip will compress one or more files, but it is not meant to function as an archive utility. The posted command line yields N compressed file images concatenated to stdout, redirected into the named output file; unfortunately the directory structure is not recorded (the individual file names may survive inside the gzip headers, as noted above). A pair of tar commands like this should work:
(create)
tar -czvf dir.tar.gz <some-dir>
(extract)
tar -xzvf dir.tar.gz
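(list, to check what went into the archive; nothing is written)
tar -tzvf dir.tar.gz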

How to exclude binaries when using rsync

I want to rsync a directory from a Mac machine to a Linux server while excluding compiled files such as .o files and binary executables. How do I exclude the binary files?
What I am using at the moment:
rsync -av --compress --exclude="*.o" dir server:dir
This is a sticky problem because a Unix system does not have a hard and fast definition of the distinction between "binary" and "text" files. You can do a pretty good job by using the file command and searching for text in the output (see How to tell binary from text files in linux), so I'd run find to generate a list of files which file considers to be text, and use that as the list of files to rsync:
find dir | xargs file | awk -F: '$2 ~ /text/ { print $1 }' | \
rsync --files-from=- -av --compress dir server:dir
This will require some tweaking to make sure the pathnames are correct relative to the source dir, and so on, but it should get close to what you want.
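One way that tweaking might look (a sketch, assuming file names without colons or newlines): run everything relative to the source directory, so the list rsync reads matches what it transfers:
cd dir && find . -type f -exec file {} + | \
    awk -F: '$2 ~ /text/ { print substr($1, 3) }' | \
    rsync --files-from=- -av --compress ./ server:dir/
Here substr strips the leading ./ so the listed paths are relative to the source argument ./ (rsync's --files-from also implies --relative, so the directory structure is preserved).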
In the long term, I'd want to rework my build process to put generated files in a dir/build directory, but this might help for now :-)
You can add a .cvsignore file in the directories and use the option -C to rsync.
But this only roughly matches what you asked for. Maybe it suits you well, maybe it makes assumptions that differ from yours. So be careful and test it properly.
Also, you can run a find before the rsync, scanning the complete tree for files matching your idea of "binary" (compiled executables, perhaps?), and place all their names in an exclude file, which you then use with the option --exclude-from.
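A sketch of that --exclude-from variant, reusing the file(1) heuristic from the answer above (same caveat about file names containing colons or newlines; /tmp/binary-excludes.txt is just an illustrative name):
cd dir
find . -type f -exec file {} + | \
    awk -F: '$2 !~ /text/ { print substr($1, 2) }' > /tmp/binary-excludes.txt
rsync -av --compress --exclude-from=/tmp/binary-excludes.txt ./ server:dir/
Keeping the leading / on each pattern (substr strips only the dot) anchors the exclusions at the top of the transfer.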

Getting contents of a particular file in the tar archive

This script lists the names of the files (in a tar archive) that contain a pattern.
tar tf test.tar | while read -r FILE
do
    if tar xf test.tar "$FILE" -O | grep "pattern"; then
        echo "found pattern in : $FILE"
    fi
done
My question is:
Where is this feature documented, i.e. passing the name of a file in the archive ($FILE) as an extra operand:
tar xf test.tar $FILE
This is usually documented in man pages, try running this command:
man tar
Unfortunately, Linux does not have the best set of man pages. There is an online copy of the tar man page from this OS: http://linux.die.net/man/1/tar, and it is terrible. But it points to the info documentation (accessed with the info command), the documentation system widely used in the GNU world (many programs in Linux user space come from GNU projects, for example gcc). Here is a direct link to the section of the online tar manual about extracting specific files: http://www.gnu.org/software/tar/manual/html_node/extracting-files.html#SEC27
I can also recommend the documentation from BSD (e.g. FreeBSD) or opengroup.org. The utilities differ in detail but behave the same in general.
For example, there is a rather old but good man page from opengroup (XCU means 'Commands and Utilities' of the Single UNIX Specification, Version 2, 1997):
http://pubs.opengroup.org/onlinepubs/7908799/xcu/tar.html
tar key [file...]
The following operands are supported:
key --
The key operand consists of a function letter followed immediately by zero or more modifying letters. The function letter is one of the following:
x --
Extract the named file or files from the archive. If a named file matches a directory whose contents had been written onto the archive, this directory is (recursively) extracted. If a named file in the archive does not exist on the system, the file is created with the same mode as the one in the archive, except that the set-user-ID and set-group-ID modes are not set unless the user has appropriate privileges. If the files exist, their modes are not changed except as described above. The owner, group, and modification time are restored (if possible). If no file operand is given, the entire content of the archive is extracted. Note that if several files with the same name are in the archive, the last one overwrites all earlier ones.
And to fully understand command tar xf test.tar $FILE you should also read about f option:
f --
Use the first file operand (or the second, if b has already been specified) as the name of the archive instead of the system-dependent default.
So, test.tar in your command is used by the f key as the archive name; then x uses the second argument ($FILE) as the name of the file or directory to extract from the archive.
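With GNU tar, the script's tar xf test.tar "$FILE" -O can also be spelled with long options, which makes each part's role explicit (docs/notes.txt stands in for whatever $FILE holds):
tar --extract --file=test.tar --to-stdout docs/notes.txt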

Tarballing in Bash: Is there a way to only dereference links pointing outside the tar'd directory?

I regularly use the -h flag of tar to create tarballs that contain all the libraries linked to from the directory I am archiving, but it has the side effect of storing a second copy of everything reached through the internal links within the directory.
For example, I have two very large datasets, and use a symbolic link to choose which one my test app uses, so I end up getting one of them twice. This makes the tarball way bigger than it needs to be.
So, is there any way to get tar to only dereference links if they point to a file that's not already included in the tarball?
Thanks.
Use find instead to make the list of files to be included in the tarball: realpath resolves every symlink to its target, so links into the tree become duplicates that sort | uniq removes. A runnable form (GNU tar assumed; tarball.tar is only a placeholder name, and --no-recursion keeps tar from re-adding directory contents that find already listed):
find . -exec realpath '{}' ';' | sort | uniq | tar -cf tarball.tar --no-recursion -T -
tar strips the leading / from the absolute paths realpath produces, so the stored member names have no leading slash.
