Structure of .tar files - linux

I am trying to run some benchmarks which take a tar file as input.
There is a runme.sh inside the tar file that needs to be modified, and then the folder has to be packed into a .tar again.
The original benchmark works, but the modified one doesn't. I believe that it is a problem with the file format.
Note : My modification is not creating the problem. If I just uncompress the working tar and tar it again without modification, it does not work. Surprisingly the size of the new file changes.
What I tried :
1. Ran the file command on the working and non-working tar files. Both returned the same result: POSIX tar archive.
2. Ran the command tar cvf folder_name.tar folder_name/. This does not work.
What works :
I am on Ubuntu (14.04) and I double clicked on the tar, directly edited the file I wanted and updated it. This works, but is not a feasible solution as I have a large number of files and I want to write a script to automate it.
Screenshot of how it works with GUI :

Does the original tar file include the top-level directory name? It doesn't look like it from your screenshot. If you re-create the tar file with a top level directory, as indicated by point 2 in the things you tried, the structure won't be the same, and whatever program is trying to consume the tar file won't be able to parse it.
How do you test "If I just uncompress the working tar and tar it again without modification, it does not work." In a GUI or in a shell? If in a shell - what exact commands do you use?
In a shell, you can get the contents of the tarball with the command tar -tf filename.tar. If all the files it lists start with the same folder name, your tarball includes a top level directory. If it just lists various files and subdirectories, it doesn't. (Tarballs that don't are an abomination, but if whatever you are using them for requires it, you'll just have to cope.)
I'm guessing that if you do this on your original tar file and your modified, non-working tar file, the results will differ.
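To make that comparison concrete, here is a minimal sketch that builds two small sample archives and prints the distinct first path component of each. The names bench, with-top.tar, flat.tar and the helper top_levels are made up for illustration; on your own files you would only need the tar -tf pipeline.

```shell
#!/bin/sh
# Build two small sample archives in a scratch directory (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/bench/sub"
echo 'echo hello' > "$work/bench/runme.sh"
echo data > "$work/bench/sub/input.txt"

# Archive that stores the top-level directory name.
tar -cf "$work/with-top.tar" -C "$work" bench
# Archive that stores only the contents of that directory.
tar -cf "$work/flat.tar" -C "$work/bench" runme.sh sub

# Print the distinct first path components of an archive on one line.
top_levels() {
    tar -tf "$1" | sed -e 's|^\./||' -e 's|/.*||' | sort -u | paste -sd ' ' -
}

with_top=$(top_levels "$work/with-top.tar")
flat=$(top_levels "$work/flat.tar")
echo "with-top.tar: $with_top"
echo "flat.tar: $flat"
```

A single name in the output means the archive has a top-level directory; several names mean the files sit directly at the root of the archive.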
The following should work in a shell if you have/need a tarball without a toplevel directory:
$ mkdir workdir
$ cd workdir
$ tar -xf ../tarball.tar
<edit your file however you like>
$ tar -cf ../tarball-new.tar *
$ cd ..
$ rm -r workdir
In case you have/need a tarball with a toplevel directory, the following should suffice:
$ tar -xf tarball.tar
$ cd toplevel_directory
<edit your file however you like>
$ cd ..
$ tar -cf tarball-new.tar toplevel_directory
$ rm -r toplevel_directory
Edit: I'm glad it worked for you. The point is, of course, that tar includes the paths of the files it stores, not just the filenames. So if you need a flat list of files, you need to run tar in the directory containing those files, giving all of them as arguments to tar. If you try to take the shortcut of going up a level and only specifying the directory name to pack up, tar will include the directory name in the archive.
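Since the goal was to automate this for many files, here is a rough sketch of how the flat-tarball case could be scripted. It is not the asker's actual script: the fixture, the function name repack and the appended line are placeholders, and the real edit to runme.sh would go where the comment says.

```shell
#!/bin/sh
set -e

# Fixture: a flat tarball (no top-level directory) standing in for a real benchmark.
work=$(mktemp -d)
mkdir "$work/src"
printf '#!/bin/sh\necho original\n' > "$work/src/runme.sh"
echo data > "$work/src/input.txt"
tar -cf "$work/benchmark.tar" -C "$work/src" runme.sh input.txt

# Rewrite one tarball in place, keeping its flat layout.
# $1 must be an absolute path, because tar runs from another directory.
repack() {
    tarball=$1
    tmp=$(mktemp -d)
    tar -xf "$tarball" -C "$tmp"
    # Placeholder edit: replace this line with the real change to runme.sh.
    echo 'echo modified' >> "$tmp/runme.sh"
    # Run tar inside the directory so members are stored without a prefix.
    # Note: * skips dotfiles, as in the answer above.
    (cd "$tmp" && tar -cf "$tarball.new" *)
    mv "$tarball.new" "$tarball"
    rm -r "$tmp"
}

for t in "$work"/*.tar; do
    repack "$t"
done

members=$(tar -tf "$work/benchmark.tar" | sort | paste -sd ' ' -)
last_line=$(tar -xOf "$work/benchmark.tar" runme.sh | tail -n 1)
echo "members: $members"
echo "last line of runme.sh: $last_line"
```

The member list printed at the end should contain bare file names with no directory prefix, which is what the consuming program expects in this case.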

Related

Tar command keeps bundling up entire directory path

I have a few sub-directories with files inside each of them in /home/user/archived/myFiles that I'm trying to bundle into a single tar file. The issue is, it keeps bundling a full directory path instead of just everything in the myFiles folder.
When I untar the file, I just want all the bundled sub-directories/files inside to appear in the directory I extracted the file rather than having to go through a series of folders that get created.
Instead, when I currently untar the file, I get a "home" folder and I have to go through /home/user/archived/myFiles to reach all the files.
I tried using the -C flag that I saw suggested online (in "Tar a directory, but don't store full absolute paths in the archive"), where you pass the full directory path minus the last folder, and then the name of the last folder, which contains everything you want bundled. But the tar command doesn't work: I get a "No such file or directory" error.
#!/bin/bash
archivedDir="/home/user/archived/myFiles"
tar -czvf "archived-files.tar.gz" "${archivedDir}"/*
rm -vrf "${archivedDir}"/*
# Attempt with -C flag
#tar -cvf "${archivedDir}/archived-files.tar.gz" -C "${archivedDir}" "/*"
So for example, if I did an ls on /home/user/archived/myFiles, and it listed two directories called folderOne and folderTwo, and I ran this bash script and did an ls on /home/user/archived/myFiles again, that directory should only contain archived-files.tar.gz.
If I extracted the tar file, then folderOne and folderTwo would appear.
As I have already explained elsewhere, you should first change to this directory and then create the archive.
So change your script to something like:
archivedDir="/home/user/archived/myFiles"
cd "$archivedDir"
tar -czvf "../archived-files.tar.gz" *
This will create the archive in the parent directory, so you will not remove it with the next command.
The extraction should be something like:
archivedDir="/home/user/archived/myFiles"
cd "$archivedDir"
tar -xzvf "../archived-files.tar.gz"
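As a variation that avoids the cd, the same result can probably be had with GNU tar's -C option and "." as the member to archive. The sketch below is self-contained and uses a scratch directory in place of /home/user/archived/myFiles; the folder and file names are only sample data. It also moves the archive into myFiles at the end, which is my reading of what the question asks for.

```shell
#!/bin/sh
set -e

# Scratch stand-in for /home/user/archived/myFiles (sample data only).
base=$(mktemp -d)
archivedDir="$base/archived/myFiles"
mkdir -p "$archivedDir/folderOne" "$archivedDir/folderTwo"
echo one > "$archivedDir/folderOne/a.txt"
echo two > "$archivedDir/folderTwo/b.txt"

# Write the archive outside the directory being archived.
# -C makes tar change into archivedDir before reading ".", so member
# names are stored relative to it (./folderOne/..., ./folderTwo/...).
archive="$base/archived/archived-files.tar.gz"
tar -czf "$archive" -C "$archivedDir" .

# Because of set -e, the originals are removed only if tar succeeded.
rm -rf "${archivedDir:?}"/*
mv "$archive" "$archivedDir/"
remaining=$(ls "$archivedDir")

# Extract somewhere else to check what comes out.
out=$(mktemp -d)
tar -xzf "$archivedDir/archived-files.tar.gz" -C "$out"
extracted=$(ls "$out" | paste -sd ' ' -)
echo "left in myFiles: $remaining"
echo "extracted: $extracted"
```

The quoted "/*" in the commented-out attempt from the question is never expanded by the shell, and even unquoted it would be expanded relative to the wrong directory, which would explain the "No such file or directory" error.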

How to create a tarball that includes only part of the specific path

I need to create a tarball for directory:
/opt/myuser/userContents/JDK-17
into
/opt/myuser/userContents/jdk-17-linux-x64.tar.gz
I would like only the JDK-17 directory to be packaged into the jdk-17-linux-x64.tar.gz file.
However, for some reason, I would not be able to cd into any directory. My current directory is always /.
The command I issued was:
tar -czf /opt/myuser/userContents/jdk-17-linux-x64.tar.gz /opt/myuser/userContents/JDK-17
However, the jdk-17-linux-x64.tar.gz file I created packages the whole directory structure
/opt/myuser/userContents/JDK-17
rather than just
JDK-17
I know it will be straightforward if I can cd into the directory /opt/myuser/userContents. However, because I cannot cd, how I can create a tarball as required?
Based on @Bodo's suggestion, the following command works for me:
tar -czf /opt/myuser/userContents/jdk-17-linux-x64.tar.gz -C /opt/myuser/userContents JDK-17

Is it possible to create a folder with the filename into the tar file you are creating?

Let's say I'm trying to tar.gz all the files and folders in /usr/local/bin/data/*
The file name would be data-2015-10-01.tar.gz. When I untar it, is it possible that the root directory would be data-2015-10-01 followed by the contents of whatever is inside of data/* ?
If not, how can I tar /usr/local/bin/data/* but start at the /data/ folder level?
I can't do this unfortunately since the program spits out /usr/local/bin/data/ and I'm unable to change it.
cd /usr/local/bin
tar ... /data/*
There are a couple of ways to do what I think you're trying to accomplish. First, you can use the -C option to tar when creating the archive. That changes tar's current working directory to that directory before creating the archive. Not strictly required in your case, but probably helpful.
# tar -C /usr/local/bin -czf data-2015-10-01.tar.gz data
That at least gets you to a single directory named data. If you have control of the extraction (manually or via a script you provide to whomever is unpacking this), then you can do something like this on the extraction:
# mkdir -p data-2015-10-01 && tar -C data-2015-10-01 --strip-components=1 -xzf data-2015-10-01.tar.gz
This strips the first path component, which is "data", and extracts everything below it into data-2015-10-01, the directory that -C makes tar change into. So it isn't specifically tar that's doing the renaming, but you will effectively end up with the same result.
I've accomplished something similar with a symlink. This is not a great solution if you have (or might have) symlinks in the directory structure you're trying to archive. I have to say that I prefer @geis' solution to strip out the top-level directory on extract, but this gives you another option.
ln -s /usr/local/bin/data data-2015-10-01
tar -czvhf data-2015-10-01.tar.gz data-2015-10-01/
rm data-2015-10-01
(Note the additional -h option in the tar invocation.)
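If you are on GNU tar, a third option is to rename the top-level directory while the archive is being created, using its --transform option. The sketch below assumes GNU tar (bsdtar spells this differently) and uses a scratch directory in place of /usr/local/bin; the sample files are made up.

```shell
#!/bin/sh
set -e

# Scratch stand-in for /usr/local/bin with a data directory (sample files).
root=$(mktemp -d)
mkdir -p "$root/bin/data/logs"
echo x > "$root/bin/data/file.txt"
echo y > "$root/bin/data/logs/app.log"

name=data-2015-10-01

# GNU tar only: --transform applies a sed-style substitution to each member
# name, here replacing the leading "data" with the dated name.
tar -czf "$root/$name.tar.gz" --transform "s,^data,$name," -C "$root/bin" data

# Extract into an empty directory to see the resulting root folder.
out=$(mktemp -d)
tar -xzf "$root/$name.tar.gz" -C "$out"
top=$(ls "$out")
echo "top-level directory after extraction: $top"
```

With this, no special flags are needed on extraction, and symlinks inside data are archived as symlinks rather than followed.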

How to tar several files without containing their whole path in tar file

Let's say I am in /tmp. I want to tar /tmp/aaa/123 and /tmp/bbb/222.
The command I use is: tar -vczf /tmp/my.tar.gz /tmp/aaa/123 /tmp/bbb/222
This command works fine.
However, when I use tar -zxvf my.tar.gz to decompress the tar file, it gives me two files (let's say I copied my.tar.gz to a path called pwd and now I am in pwd): pwd/tmp/aaa/123 and pwd/tmp/bbb/222. The result I expect is pwd/my/123 and pwd/my/222.
Are there any commands I can use to achieve that?
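Just make use of the --directory option (short form -C), which makes tar change to the given directory before it adds the file names that follow.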
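A minimal sketch of what that could look like for the two files in the question, using a scratch directory in place of /tmp. The option can be given more than once, and each occurrence applies to the file names after it. The my directory is created by hand at extraction time, since the archive itself only stores 123 and 222.

```shell
#!/bin/sh
set -e

# Scratch stand-in for /tmp with the two files from the question.
tmp=$(mktemp -d)
mkdir -p "$tmp/aaa" "$tmp/bbb"
echo first > "$tmp/aaa/123"
echo second > "$tmp/bbb/222"

# Each -C (--directory) changes directory for the names that follow it,
# so the members are stored as plain "123" and "222".
tar -czf "$tmp/my.tar.gz" -C "$tmp/aaa" 123 -C "$tmp/bbb" 222
members=$(tar -tzf "$tmp/my.tar.gz" | paste -sd ' ' -)
echo "members: $members"

# Extract into pwd/my to get pwd/my/123 and pwd/my/222.
mkdir -p "$tmp/pwd/my"
tar -xzf "$tmp/my.tar.gz" -C "$tmp/pwd/my"
ls "$tmp/pwd/my"
```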

Compress files and directories

I need to compress certain files and a directory. Suppose they're placed in /root/project.
The thing is that I need to compress them in gzip-tarball format with a certain name (name.tar.gz) and at the "root" of the archive. I mean that as soon as I open the .tar.gz, all the files and directories I want to compress are there.
I have tried using the following commands:
tar czfv name.tar.gz /root/project
tar czfv name.tar.gz /root/project/*
but then the whole substructure gets compressed (I mean, when I open the .tar.gz I have to navigate through the directories root/project, which I don't want; I need the files to be there as soon as I open it, in "/" I suppose).
I hope I've explained myself... excuse me for my bad English and thanks in advance!
If you are using GNU tar you can use the following command:
tar -C /root/project -zcvf /root/name.tar.gz .
The -C causes tar to change to the /root/project directory before adding the . directory to the archive. Make sure the destination directory for your archive is not in the directory you are archiving (/root/project). This example creates the archive in /root.
If you want the filenames in the tarball to all have a project/ prefix (which is often advisable), do
(dest=$PWD; cd /root && tar czvf "$dest/name.tar.gz" project)
If you don't want a prefix at all, do
(dest=$PWD; cd /root/project && tar czvf "$dest/name.tar.gz" .)
(The dest variable captures the present working directory prior to doing the cd, so the archive lands in the directory you started from; a `pwd` placed after the cd would expand to the new directory instead. The parentheses execute the commands in a subshell so your current shell stays in the same directory.)
