Compress files and directories - linux

I need to compress certain files and a directory. Suppose they're placed in /root/project.
I need to compress them into a gzipped tarball with a specific name (name.tar.gz), with everything at the root of the archive; that is, as soon as I open the .tar.gz, all the files and directories I wanted to compress should be right there.
I have tried using the following commands:
tar czfv name.tar.gz /root/project
tar czfv name.tar.gz /root/project/*
but then the whole directory structure gets compressed (when I open the .tar.gz I have to navigate through the directories root/project, which I don't want; I need the files to be there as soon as I open it, at "/" I suppose).
I hope I've explained myself. Excuse my bad English, and thanks in advance!

If you are using GNU tar you can use the following command:
tar -C /root/project -zcvf /root/name.tar.gz .
The -C causes tar to change to the /root/project directory before adding the . directory to the archive. Make sure the destination directory for your archive is not in the directory you are archiving (/root/project). This example creates the archive in /root.
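To see the effect of -C on the stored names, here is a minimal sketch that builds a throwaway tree and lists the resulting archive. The temporary directory and the file names in it are assumptions standing in for /root/project.

```shell
# Build a throwaway tree that stands in for /root/project.
work=$(mktemp -d)
mkdir -p "$work/project/sub"
echo one > "$work/project/a.txt"
echo two > "$work/project/sub/b.txt"

# -C switches into the project directory first, so "." is archived
# without any leading path; the archive itself lives outside the tree.
tar -C "$work/project" -czf "$work/name.tar.gz" .

# List the members: every name is relative to the project directory.
listing=$(tar -tzf "$work/name.tar.gz")
printf '%s\n' "$listing"
```

None of the listed names should contain the project directory itself; they start with ./ instead.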

If you want the filenames in the tarball to all have a project/ prefix (which is often advisable), do
(cd /root && tar czvf "$OLDPWD"/name.tar.gz project)
If you don't want a prefix at all, do
(cd /root/project && tar czvf "$OLDPWD"/name.tar.gz .)
($OLDPWD holds the working directory from before the cd, so the archive is written in the directory you started from; the parentheses execute the command in a subshell so your current shell stays in the same directory.)
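One way to check both claims (the subshell leaves the calling shell's directory alone, and the prefix ends up in the member names) is a throwaway run like the one below. The temporary directory stands in for /root and is an assumption of this sketch; an absolute output path is used so the result does not depend on where you start.

```shell
work=$(mktemp -d)
mkdir -p "$work/project"
echo hello > "$work/project/readme.txt"

before=$(pwd)
# The parentheses run cd and tar in a subshell, so only the subshell
# changes directory.
(cd "$work" && tar -czf "$work/name.tar.gz" project)
after=$(pwd)

# Member names carry the project/ prefix.
members=$(tar -tzf "$work/name.tar.gz")
printf '%s\n' "$members"
[ "$before" = "$after" ] && echo "working directory unchanged"
```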

Related

Tar command keeps bundling up entire directory path

I have a few sub-directories with files inside each of them in /home/user/archived/myFiles that I'm trying to bundle into a single tar file. The issue is, it keeps bundling a full directory path instead of just everything in the myFiles folder.
When I untar the file, I just want all the bundled sub-directories/files inside to appear in the directory I extracted the file rather than having to go through a series of folders that get created.
Instead, when I currently untar the file, I get a "home" folder and I have to go through /home/user/archived/myFiles to reach all the files.
I tried using the -C flag that I saw suggested online in "Tar a directory, but don't store full absolute paths in the archive", where you pass the full directory path minus the last folder, and then the name of the last folder, which contains everything you want bundled. But that tar command doesn't work: I get a "no such file or directory" error.
#!/bin/bash
archivedDir="/home/user/archived/myFiles"
tar -czvf "archived-files.tar.gz" "${archivedDir}"/*
rm -vrf "${archivedDir}"/*
# Attempt with -C flag
#tar -cvf "${archivedDir}/archived-files.tar.gz" -C "${archivedDir}" "/*"
So for example, if I did an ls on /home/user/archived/myFiles, and it listed two directories called folderOne and folderTwo, and I ran this bash script and did an ls on /home/user/archived/myFiles again, that directory should only contain archived-files.tar.gz.
If I extracted the tar file, then folderOne and folderTwo would appear.
As I have already explained, you should first change into this directory and then create the archive.
So change your script to something like:
archivedDir="/home/user/archived/myFiles"
cd "$archivedDir" || exit 1
tar -czvf "../archived-files.tar.gz" *
This creates the archive in the parent directory, so the next command (the rm) will not remove it.
The extraction should be something like:
archivedDir="/home/user/archived/myFiles"
cd "$archivedDir" || exit 1
tar -xzvf "../archived-files.tar.gz"
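Putting the pieces together, a minimal sketch of the archive-then-clean-up script might look like this. It runs against a temporary directory standing in for /home/user/archived/myFiles (an assumption, so that it is safe to try), writes the archive outside the directory first, and deletes the originals only if tar succeeded.

```shell
# Stand-in for /home/user/archived/myFiles so the sketch is safe to run.
base=$(mktemp -d)
archivedDir="$base/myFiles"
mkdir -p "$archivedDir/folderOne" "$archivedDir/folderTwo"
echo data > "$archivedDir/folderOne/file.txt"

archive="$archivedDir/archived-files.tar.gz"
tmpArchive="$base/archived-files.tar.gz"

# Archive from inside the directory so the stored paths are relative,
# but write the archive to the parent directory for now.
if (cd "$archivedDir" && tar -czf "$tmpArchive" -- *); then
    # Only delete the originals once tar has reported success.
    rm -rf -- "${archivedDir:?}"/*
    mv "$tmpArchive" "$archive"
fi

ls "$archivedDir"
```

The ${archivedDir:?} expansion aborts the rm if the variable is ever empty, which guards against accidentally expanding to /*.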

Exclude directories from tar archive with a .tarignore file

In a similar way to .gitignore, is it possible to have a .tarignore file in a subdirectory cause that subdirectory to be excluded from the archive when running
tar cfjvh backup.tar.bz2 .
(I know the --exclude parameter of tar but here it's not what I want).
Using the --exclude-ignore=.tarignore option causes tar to use .tarignore files in a similar way to how git uses .gitignore files. The patterns are written much as you would write simple .gitignore entries, so if you want to ignore all sub-directories and files in a particular directory, add a .tarignore file containing * to that directory. For example:
$ find dir
dir
dir/a
dir/b
dir/c
dir/dir1
dir/dir1/1a
dir/dir1/1b
dir/dir2
dir/dir2/2a
dir/dir2/2b
$ echo "*" > dir/dir1/.tarignore
$ tar -c --exclude-ignore=.tarignore -vjf archive.tar.bz2 dir
dir/
dir/a
dir/b
dir/c
dir/dir1/
dir/dir2/
dir/dir2/2a
dir/dir2/2b
If you want to ignore all directories that have an empty .tarignore file, take a look at the --exclude-tag-all=.tarignore option.
$ find dir
dir
dir/a
dir/b
dir/c
dir/dir1
dir/dir1/1a
dir/dir1/1b
dir/dir2
dir/dir2/2a
dir/dir2/2b
$ touch dir/dir1/.tarignore
$ tar -c --exclude-tag-all=.tarignore -vjf archive.tar.bz2 dir
tar: dir/dir1/: contains a cache directory tag .tarignore; directory not dumped
dir/
dir/a
dir/b
dir/c
dir/dir2/
dir/dir2/2a
dir/dir2/2b
Also take a look at the --exclude-tag= and --exclude-tag-under= options, which are variations with slightly different behaviors.
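The difference between the three tag options is easiest to see side by side. The sketch below assumes GNU tar and uses a throwaway tree with the same shape as the example above; the archives are piped straight into a listing instead of being written to disk.

```shell
work=$(mktemp -d)
mkdir -p "$work/dir/dir1" "$work/dir/dir2"
touch "$work/dir/a" "$work/dir/dir1/1a" "$work/dir/dir2/2a"
touch "$work/dir/dir1/.tarignore"

# --exclude-tag keeps the tagged directory and the tag file itself,
# but drops everything else inside it.
tag=$(tar -C "$work" -cf - --exclude-tag=.tarignore dir 2>/dev/null | tar -tf -)

# --exclude-tag-under keeps only the (now empty) tagged directory.
under=$(tar -C "$work" -cf - --exclude-tag-under=.tarignore dir 2>/dev/null | tar -tf -)

# --exclude-tag-all drops the tagged directory entirely.
all=$(tar -C "$work" -cf - --exclude-tag-all=.tarignore dir 2>/dev/null | tar -tf -)

printf 'tag:\n%s\nunder:\n%s\nall:\n%s\n' "$tag" "$under" "$all"
```

The 2>/dev/null only hides tar's informational "directory not dumped" messages so that the three listings are easier to compare.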
You can do something like
$ COPYFILE_DISABLE=true tar -c --exclude-from=.tarignore -vzf archiveName.tar.gz
The breakdown:
COPYFILE_DISABLE=true makes sure that files starting with ._ are not included in the .tar file. If you frequently do tar operations on your Mac, enable it permanently by adding export COPYFILE_DISABLE=true to ~/.bash_profile.
--exclude-from=.tarignore: ignore all files and folders listed in .tarignore.
You can put content like the following in your .tarignore; here is mine:
$ cat .tarignore
.DS_Store
.git
.gitignore
I just found out that git has a tool to package the tracked files itself:
git archive. Using that is probably the easiest way.
EDIT: If you want to make a package out of files or data that are tracked but not yet committed, using tar -czf files.tar.gz $(git ls-files) works.

Structure of .tar files

I am trying to run some benchmarks which take in input a tar file.
Now there is a runme.sh inside the tar file and that needs to be modified and the folder has to be made a .tar again.
The original benchmark works, but the modified one doesn't. I believe that it is a problem with the file format.
Note: My modification is not creating the problem. If I just uncompress the working tar and tar it again without modification, it does not work. Surprisingly, the size of the new file changes.
What I tried:
1. Ran the file command on the working and non-working tar files. Both returned the same result: POSIX tar archive.
2. Tried to run the command tar cvf folder_name.tar folder_name/. This does not work.
What works:
I am on Ubuntu (14.04) and I double clicked on the tar, directly edited the file I wanted and updated it. This works, but is not a feasible solution as I have a large number of files and I want to write a script to automate it.
Screenshot of how it works with GUI :
Does the original tar file include the top-level directory name? It doesn't look like it from your screenshot. If you re-create the tar file with a top level directory, as indicated by point 2 in the things you tried, the structure won't be the same, and whatever program is trying to consume the tar file won't be able to parse it.
How do you test "If I just uncompress the working tar and tar it again without modification, it does not work"? In a GUI or in a shell? If in a shell, what exact commands do you use?
In a shell, you can get the contents of the tarball with the command tar -tf filename.tar. If all the files it lists start with the same folder name, your tarball includes a top-level directory. If it just lists various files and subdirectories, it doesn't. (Tarballs that don't are an abomination, but if whatever you are using them for requires it, you'll just have to cope.)
I'm guessing that if you do this on your original tar file and your modified, non-working tar file, the results will differ.
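The comparison can be scripted. The sketch below builds two small archives from the same throwaway tree (the directory and file names are assumptions) and prints the distinct first path components of each; the top_level helper is a hypothetical name introduced for this example.

```shell
work=$(mktemp -d)
mkdir -p "$work/bench/lib"
echo 'echo run' > "$work/bench/runme.sh"
echo x > "$work/bench/lib/data"

# One archive stores the directory name, the other only its contents.
tar -C "$work" -cf "$work/with-top.tar" bench
(cd "$work/bench" && tar -cf "$work/flat.tar" -- *)

# Print the distinct first path components of an archive's members,
# ignoring a leading "./" if the archive was created from ".".
top_level() {
    tar -tf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

echo "with-top.tar: $(top_level "$work/with-top.tar" | tr '\n' ' ')"
echo "flat.tar:     $(top_level "$work/flat.tar" | tr '\n' ' ')"
```

An archive with a top-level directory yields exactly one name; a flat archive yields one name per file or subdirectory at its root.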
The following should work in a shell if you have/need a tarball without a toplevel directory:
$ mkdir workdir
$ cd workdir
$ tar -xf ../tarball.tar
<edit your file however you like>
$ tar -cf ../tarball-new.tar *
$ cd ..
$ rm -r workdir
In case you have/need a tarball with a toplevel directory, the following should suffice:
$ tar -xf tarball.tar
$ cd toplevel_directory
<edit your file however you like>
$ cd ..
$ tar -cf tarball-new.tar toplevel_directory
$ rm -r toplevel_directory
Edit: I'm glad it worked for you. The point is, of course, that tar includes the paths of the files it stores, not just the filenames. So if you need a flat list of files, you need to run tar in the directory containing those files, giving all of them as arguments to tar. If you try to take the shortcut of going up a level and only specifying the directory name to pack up, tar will include the directory name in the archive.

Extract tar archive excluding a specific folder and its contents

With PHP I am using exec("tar -xf archive.tar -C /home/user/target/folder") to extract the contents of a specific archive (archive.tar) into the target directory (/home/user/target/folder), so that all existing contents of the target directory will be overwritten by the new ones that are contained in the archive.
It works fine and all files in the target directory are being overwritten after extract, but there is one directory in the archive that I would like to omit (from extracting and thus overwriting the existing one in the target folder)...
For example, the archive.tar contains:
folderA/
folderB/
folderC/
folderD/
fileA.php
fileB.php
fileC.xml
How could I extract (and overwrite) all except (for example) folderC/? In other words, I want folderC and its contents to remain intact in the user's directory and not be overwritten by the one contained in the tar archive.
Any suggestions?
(Tar on the hosting server is GNU version 1.23.)
You can use '--exclude' to omit a folder:
tar -xf archive.tar -C /home/user/target/folder --exclude="folderC"
There is the --exclude PATTERN option in the tar tool.
Check: tar on linuxcommand.org
To be on the safe side, you could remove all write permissions from the folder. For example:
$ chmod 000 folderC/
And then do a normal tar extract (as a regular user). You'll get some error messages on the console, but your folder will remain untouched. When tar has finished, restore the folder's original permissions. For example:
$ chmod 775 folderC/
Of course, the '--exclude' tar option is the right solution to this particular problem, but if you are not completely sure about a command's syntax and you're handling critical data, my solution puts you on the safe side :-).
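Write the --exclude option at the beginning of the tar command, and make the pattern match the member names as tar -tf lists them (./folderC if they are stored with a leading ./, plain folderC otherwise).
In your case, with the listing shown in the question, that is
exec("tar -x --exclude='folderC' -f archive.tar -C /home/user/target/folder")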
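Before running this against real data, the behavior can be rehearsed on a throwaway copy. The sketch below assumes an archive whose members are stored without a leading ./ (as in the question's listing) and a target directory that already holds older versions of the files; all paths are temporary stand-ins.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/folderA" "$work/src/folderC" "$work/target/folderC"
echo new > "$work/src/folderA/a.txt"
echo new > "$work/src/folderC/c.txt"
echo new > "$work/src/fileA.php"
echo old > "$work/target/folderC/c.txt"
echo old > "$work/target/fileA.php"

# Members are stored as folderA/..., folderC/..., fileA.php (no leading ./).
(cd "$work/src" && tar -cf "$work/archive.tar" -- *)

# Extract everything except folderC; other existing files are overwritten.
tar -xf "$work/archive.tar" -C "$work/target" --exclude='folderC'

# folderC keeps its old content, fileA.php has the new content.
cat "$work/target/folderC/c.txt" "$work/target/fileA.php"
```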

Is it possible to create a folder with the filename into the tar file you are creating?

Let's say I'm trying to tar.gz all the files and folders in /usr/local/bin/data/*
The file name would be data-2015-10-01.tar.gz. When I untar it, is it possible that the root directory would be data-2015-10-01 followed by the contents of whatever is inside of data/* ?
If not, how can I tar /usr/local/bin/data/* but start at the /data/ folder level?
I can't do this unfortunately since the program spits out /usr/local/bin/data/ and I'm unable to change it.
cd /usr/local/bin
tar ... /data/*
There are a couple of ways to do what I think you're trying to accomplish. First, you can use the -C option to tar when creating the archive. That changes tar's current working directory to that directory before creating the archive. Not strictly required in your case, but probably helpful.
# tar -C /usr/local/bin -czf data-2015-10-01.tar.gz data
That at least gets you to a single directory named data. If you have control of the extraction (manually or via a script you provide to whomever is unpacking this), then you can do something like this on the extraction:
# mkdir -p data-2015-10-01 && tar -C data-2015-10-01 --strip-components=1 -xzf data-2015-10-01.tar.gz
This removes the first path component, which is "data", and extracts everything below it into the data-2015-10-01 directory created in your current working directory. So it isn't tar itself that does the renaming, but you effectively end up with the same result.
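The create-and-extract round trip can be tried end to end on a throwaway tree. In this sketch the temporary bin/data directory stands in for /usr/local/bin/data, and the file names are assumptions.

```shell
work=$(mktemp -d)
mkdir -p "$work/bin/data/logs"
echo x > "$work/bin/data/report.txt"
echo y > "$work/bin/data/logs/run.log"

# Archive the data directory by name, so members start with "data/".
tar -C "$work/bin" -czf "$work/data-2015-10-01.tar.gz" data

# On extraction, drop that first component and unpack into a directory
# named after the archive.
mkdir -p "$work/out/data-2015-10-01"
tar -C "$work/out/data-2015-10-01" --strip-components=1 \
    -xzf "$work/data-2015-10-01.tar.gz"

find "$work/out" -type f | sort
```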
I've accomplished something similar with a symlink. This is not a great solution if you have (or might have) symlinks in the directory structure you're trying to archive. I have to say that I prefer @geis' solution of stripping out the top-level directory on extraction, but this gives you another option.
ln -s /usr/local/bin/data data-2015-10-01
tar -czvhf data-2015-10-01.tar.gz data-2015-10-01/
rm data-2015-10-01
(Note the additional -h option in the tar invocation.)
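A further option that none of the answers above use: GNU tar can rename members while the archive is being written, via --transform, which avoids both the symlink and the extra step at extraction time. The sketch below assumes GNU tar; the temporary bin/data directory stands in for /usr/local/bin/data.

```shell
work=$(mktemp -d)
mkdir -p "$work/bin/data"
echo x > "$work/bin/data/report.txt"

# GNU tar only: rewrite the leading "data" component of each member
# name while the archive is being written.
tar -C "$work/bin" -czf "$work/data-2015-10-01.tar.gz" \
    --transform='s,^data,data-2015-10-01,' data

# The stored names now start with data-2015-10-01/.
renamed=$(tar -tzf "$work/data-2015-10-01.tar.gz")
printf '%s\n' "$renamed"
```

By default the expression is also applied to symlink targets, so check the result if the tree contains symlinks.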
