How to combine multiple tar files into a single tar file - linux

According to the GNU documentation, to add one or more archives to the end of another archive, I can use the ‘--concatenate’ operation.
But in my testing, I found that I can't add more than one archive at a time.
# ls -al
total 724
drwxr-xr-x. 3 root root 60 Oct 14 17:40 .
dr-xr-xr-x. 32 root root 4096 Oct 14 16:28 ..
-rw-r--r--. 1 root root 245760 Oct 14 18:07 1.tar
-rw-r--r--. 1 root root 245760 Oct 14 18:07 2.tar
-rw-r--r--. 1 root root 245760 Oct 14 18:07 3.tar
# tar tvf 1.tar
-rw-r--r-- root/root 238525 2021-10-14 17:28 1.txt
# tar tvf 2.tar
-rw-r--r-- root/root 238525 2021-10-14 17:29 2.txt
# tar tvf 3.tar
-rw-r--r-- root/root 238525 2021-10-14 17:29 3.txt
It appears that it only picked up the first archive argument and ignored the rest:
# tar -A -f 1.tar 2.tar 3.tar
# tar tvf 1.tar
-rw-r--r-- root/root 238525 2021-10-14 17:28 1.txt
-rw-r--r-- root/root 238525 2021-10-14 17:29 2.txt

As described in an excellent and comprehensive Super User answer, this is a known bug in GNU tar (reported in August 2008).
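Until that is fixed, a simple workaround is to append the archives one at a time. A minimal sketch, using the same archive names as above:
for f in 2.tar 3.tar; do
    tar -A -f 1.tar "$f"    # --concatenate behaves correctly with one archive per call
done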

Related

Set the permissions of all files copied in a folder the same

I would like to create a folder (in Linux) that can be used as a cloud-like storage location, where all files copied there automatically get g+rw permissions (without needing to chmod them), such that they are readable and writable by people belonging to that specific group.
You can use the setfacl command, e.g.:
setfacl -d -m g::rwx test/
This sets a default ACL on the test/ directory, so every new file created inside it automatically gets the group permissions (rw for regular files, as the demo below shows).
$ touch test/test
$ ls -la test/
total 48
drwxr-xr-x 2 manu manu 4096 Jan 28 08:39 .
drwxrwxrwt 20 root root 40960 Jan 28 08:39 ..
-rw-r--r-- 1 manu manu 0 Jan 28 08:39 test
$ setfacl -d -m g::rwx test/
$ ls -la test/
total 48
drwxr-xr-x+ 2 manu manu 4096 Jan 28 08:39 .
drwxrwxrwt 20 root root 40960 Jan 28 08:39 ..
-rw-r--r-- 1 manu manu 0 Jan 28 08:39 test
$ touch test/test2
$ ls -la test/
total 48
drwxr-xr-x+ 2 manu manu 4096 Jan 28 08:40 .
drwxrwxrwt 20 root root 40960 Jan 28 08:39 ..
-rw-r--r-- 1 manu manu 0 Jan 28 08:39 test
-rw-rw-r-- 1 manu manu 0 Jan 28 08:40 test2
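If you want to verify that the default ACL is actually in place (the trailing + in the directory listing above hints at it), getfacl will show it. A quick illustrative check (test3 is just a hypothetical file name):
getfacl test/        # the output should include a "default:group::rwx" entry
touch test/test3
ls -l test/test3     # the group write bit is set automatically on the new file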

How do I grep the contents of files returned by ls and grep?

How do I grep on files returned from an ls and grep command?
e.g.
# ls -alrth /app/splunk_export/*HSS* | grep 'Nov 24 11:*'
-rw-r--r-- 1 root root 63K Nov 24 11:17 /app/splunk_export/A20171124.1000+1300-1100+1300_HSS01HAM_CGP.csv
-rw-r--r-- 1 root root 40K Nov 24 11:17 /app/splunk_export/A20171124.1000+1300-1100+1300_HSS01HAM_USCDB.csv
-rw-r--r-- 1 root root 138K Nov 24 11:17 /app/splunk_export/A20171124.1000+1300-1100+1300_HSS01HAM.csv
-rw-r--r-- 1 root root 167K Nov 24 11:17 /app/splunk_export/A20171124.1000+1300-1100+1300_HSS01KPR_FE.csv
-rw-r--r-- 1 root root 71K Nov 24 11:17 /app/splunk_export/A20171124.1000+1300-1100+1300_HSS01KPR_USCDB.csv
-rw-r--r-- 1 root root 63K Nov 24 11:17 /app/splunk_export/A20171124.1000+1300-1100+1300_HSS01KPR.csv
-rw-r--r-- 1 root root 25K Nov 24 11:17 /app/splunk_export/A20171124.1030+1300-1100+1300_HSS01HAM_CGP.csv
-rw-r--r-- 1 root root 75K Nov 24 11:17 /app/splunk_export/A20171124.1030+1300-1100+1300_HSS01HAM.csv
-rw-r--r-- 1 root root 90K Nov 24 11:17 /app/splunk_export/A20171124.1030+1300-1100+1300_HSS01KPR_FE.csv
-rw-r--r-- 1 root root 28K Nov 24 11:17 /app/splunk_export/A20171124.1030+1300-1100+1300_HSS01KPR.csv
-rw-r--r-- 1 root root 15K Nov 24 11:17 /app/splunk_export/A20171124.1045+1300-1100+1300_HSS01HAM.csv
-rw-r--r-- 1 root root 140K Nov 24 11:17 /app/splunk_export/A20171124.1045+1300-1100+1300_HSS01KPR_FE.csv
-rw-r--r-- 1 root root 15K Nov 24 11:34 /app/splunk_export/A20171124.1100+1300-1115+1300_HSS01HAM.csv
-rw-r--r-- 1 root root 140K Nov 24 11:34 /app/splunk_export/A20171124.1100+1300-1115+1300_HSS01KPR_FE.csv
-rw-r--r-- 1 root root 25K Nov 24 11:34 /app/splunk_export/A20171124.1100+1300-1130+1300_HSS01HAM_CGP.csv
-rw-r--r-- 1 root root 75K Nov 24 11:34 /app/splunk_export/A20171124.1100+1300-1130+1300_HSS01HAM.csv
-rw-r--r-- 1 root root 91K Nov 24 11:34 /app/splunk_export/A20171124.1100+1300-1130+1300_HSS01KPR_FE.csv
-rw-r--r-- 1 root root 28K Nov 24 11:34 /app/splunk_export/A20171124.1100+1300-1130+1300_HSS01KPR.csv
-rw-r--r-- 1 root root 15K Nov 24 11:34 /app/splunk_export/A20171124.1115+1300-1130+1300_HSS01HAM.csv
-rw-r--r-- 1 root root 139K Nov 24 11:34 /app/splunk_export/A20171124.1115+1300-1130+1300_HSS01KPR_FE.csv
I would like to search the above files for the following string 1693701622
I have tried using xargs, but need some guidance.
# ls -alrth /app/splunk_export/*HSS* | grep 'Nov 24 11:*' | xargs grep -l 1693701622
grep: invalid option -- '-'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
NOTE: possible duplicate here but I think mine is slightly different
You are not extracting the file names: the whole ls line (starting with the permission dashes) is passed on to grep by xargs, which is why you get that error.
Use awk to do the filtering instead of grep; it handles the repeated spaces in the ls output gracefully:
ls -alrth | awk 'match($6$7$8, /Nov2411:.*/) { print $9 }' | xargs grep -l 1693701622
In general, it is not a good idea to parse the output of ls. See this post for why.
For your requirement, it might be better to use find to pick up the files based on their timestamp and then pass them to xargs grep ....
See this related post:
Recursively find all files newer than a given time
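For instance, a rough sketch of that find-based approach, assuming GNU find (which supports -newermt) and the same path and search string as above:
# Pick up *HSS* files modified between 11:00 and 12:00 on 24 Nov 2017,
# then grep them for the string, handling odd file names safely:
find /app/splunk_export -name '*HSS*' -newermt '2017-11-24 11:00' \
     ! -newermt '2017-11-24 12:00' -print0 | xargs -0 grep -l 1693701622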

How to tar multiple files with pattern match and remove the same files which are being tarred?

> ls -lrt
-rw-r--r-- 1 akash akash 480074 Feb 19 16:56 FP_ATL_EXTCRP003FTR_INPUTAS.txt.160219054239
-rw-r--r-- 1 akash akash 1745 Feb 19 16:56 FP_ATLR03_FTR_INPUT_ZD.txt
-rw-r--r-- 1 akash akash 480074 Feb 19 16:56 FP_ATL_EXTCRP003FTR_INPUTCT.txt.160219033636
-rw-r--r-- 1 akash akash 11501 Feb 19 16:56 FP_ATL_EXTCRP003FTR_INPUTCB.txt.160219113017
-rw-r--r-- 1 akash akash 1745 Feb 19 16:56 FP_ATLR03_FTR_INPUT_CG.txt
> tar cvf my_path.tar FP_ATL*
a FP_ATL_EXTCRP003FTR_INPUTM6.txt.160219011039 29 blocks.
a FP_ATL_EXTCRP003FTR_INPUTST.txt.160218130018 266 blocks.
a FP_ATL_EXTCRP003FTR_INPUTZK.txt.151224122755 4 blocks.
a FP_ATL_EXTCRP003FTR_INPUTZP.txt.160218102356 4 blocks.
a FP_ATL_EXTCRP003FTR_INPUTZT.txt.160218191832 4 blocks.
> ls -lrt
-rw-r--r-- 1 akash akash 480074 Feb 19 16:56 FP_ATL_EXTCRP003FTR_INPUTAS.txt.160219054239
-rw-r--r-- 1 akash akash 1745 Feb 19 16:56 FP_ATLR03_FTR_INPUT_ZD.txt
-rw-r--r-- 1 akash akash 480074 Feb 19 16:56 FP_ATL_EXTCRP003FTR_INPUTCT.txt.160219033636
-rw-r--r-- 1 akash akash 11501 Feb 19 16:56 FP_ATL_EXTCRP003FTR_INPUTCB.txt.160219113017
-rw-r--r-- 1 akash akash 1745 Feb 19 16:56 FP_ATLR03_FTR_INPUT_CG.txt
-rw-r--r-- 1 akash akash 1413120 Feb 22 16:30 my_path.tar
I want to remove the files which are being compressed by the tar command. How can I achieve this in a single command line?
To closely replicate the behavior of zip -mj with tar, you can use this command:
tar --transform 's/.*\///' --remove-files -cvf my_path.tar FP_ATL*
Note that this doesn't compress the files that are added to the tarball. You will need to add one of the compression options like -z for that.
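For example, keeping the path-stripping --transform and adding gzip compression would look something like this (same file pattern as in the question):
tar --transform 's/.*\///' --remove-files -czvf my_path.tar.gz FP_ATL*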
Thanks Kurt Stutsman.
This command worked: tar --remove-files -czvf my_fiels.tar de_*

Trap to extract file 5 times only using bash script

I created a script to extract all tar.gz files; it is composed of 5 tar files and the last tar file should remain a tar, it should not be extracted. My problem is that it extracts all of the tar.gz files.
while true; do
    for f in *.tar.gz; do
        case $f in '*.tar.gz') exit 0;; esac
        tar zxf "$f"
        rm -v "$f"
    done
done
What is the correct way to my solve problem?
You say:
I created a script to extract all tar.gz files; it is composed of 5 tar files and the last tar file should remain a tar, it should not be extracted.
I can see two interpretations for that:
There are five files with the .tar.gz extension in a directory. The first four of those files should be extracted and removed; the fifth should be left unextracted.
There is one .tar.gz file which contains 5 .tar files. The first four of those .tar files should be extracted from the .tar.gz; the fifth should be left unextracted.
There are many ways to deal with each scenario. I'm assuming that the tar file names do not contain spaces or newlines or other oddball characters (which is plausible). If you have oddball file names, it is probably simplest to sanitize them first. Amongst other things, these assumptions mean that you could safely use ls output. However, it is still best not to do so. I am going to assume that 5 is a fixed and magic number.
Scenario 1
i=0
for tarfile in *.tar.gz
do
    i=$((i + 1))              # count the archives as we go
    [ "$i" -eq 5 ] && break   # stop before extracting the fifth one
    tar -xf "$tarfile"        # GNU tar auto-detects the gzip compression when reading
done
This counts the files it extracts, and stops after the count reaches 5 (so it only extracts the first 4 files).
Scenario 2
tarfiles=$(tar -tf *.tar.gz | sed '$d')
tar -xf *.tar.gz $tarfiles
This collects the list of tar files contained inside the compressed tar file, and deletes the last one listed. It then requests tar to extract the remaining files — so the directory will contain the original .tar.gz file and all the files except the last extracted from the .tar.gz file. If the extracted files are tar files, you can then extract those individually:
for tarfile in $tarfiles
do tar -xf "$tarfile"
done
If you want to remove the extracted tar files too, you can do that.
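For example, a minimal sketch that extracts each inner tar file and removes it once extraction succeeds:
for tarfile in $tarfiles
do
    tar -xf "$tarfile" && rm "$tarfile"
done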
Demo of Scenario 2
$ mkdir junk
$ cp *.c junk
$ cd junk
$ ls
bo.c zigzag.c
$ for i in {1..5}; do tar -cf tarfile-$i.tar *.c; done
$ ls -l
total 144
-rw-r--r-- 1 jleffler staff 546 May 7 22:34 bo.c
-rw-r--r-- 1 jleffler staff 10752 May 7 22:35 tarfile-1.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-2.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-3.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-4.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-5.tar
-rw-r--r-- 1 jleffler staff 7305 May 7 22:34 zigzag.c
$ tar -czf tarfile-N.tar.gz *.tar
$ ls -l
total 152
-rw-r--r-- 1 jleffler staff 546 May 7 22:34 bo.c
-rw-r--r-- 1 jleffler staff 10752 May 7 22:35 tarfile-1.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-2.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-3.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-4.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-5.tar
-rw-r--r-- 1 jleffler staff 2787 May 7 22:36 tarfile-N.tar.gz
-rw-r--r-- 1 jleffler staff 7305 May 7 22:34 zigzag.c
$ tar -tvf tarfile-N.tar.gz
-rw-r--r-- 0 jleffler staff 10752 May 7 22:35 tarfile-1.tar
-rw-r--r-- 0 jleffler staff 10752 May 7 22:36 tarfile-2.tar
-rw-r--r-- 0 jleffler staff 10752 May 7 22:36 tarfile-3.tar
-rw-r--r-- 0 jleffler staff 10752 May 7 22:36 tarfile-4.tar
-rw-r--r-- 0 jleffler staff 10752 May 7 22:36 tarfile-5.tar
$ rm *.tar *.c
$ ls -l
total 8
-rw-r--r-- 1 jleffler staff 2787 May 7 22:36 tarfile-N.tar.gz
$ tarfiles=$(tar -tf *.tar.gz | sed '$d')
$ echo $tarfiles
tarfile-1.tar tarfile-2.tar tarfile-3.tar tarfile-4.tar
$ tar -xf *.tar.gz $tarfiles
$ ls -l
total 104
-rw-r--r-- 1 jleffler staff 10752 May 7 22:35 tarfile-1.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-2.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-3.tar
-rw-r--r-- 1 jleffler staff 10752 May 7 22:36 tarfile-4.tar
-rw-r--r-- 1 jleffler staff 2787 May 7 22:36 tarfile-N.tar.gz
$ for file in $tarfiles; do echo $file; tar -tvf $file; rm $file; done
tarfile-1.tar
-rw-r--r-- 0 jleffler staff 546 May 7 22:34 bo.c
-rw-r--r-- 0 jleffler staff 7305 May 7 22:34 zigzag.c
tarfile-2.tar
-rw-r--r-- 0 jleffler staff 546 May 7 22:34 bo.c
-rw-r--r-- 0 jleffler staff 7305 May 7 22:34 zigzag.c
tarfile-3.tar
-rw-r--r-- 0 jleffler staff 546 May 7 22:34 bo.c
-rw-r--r-- 0 jleffler staff 7305 May 7 22:34 zigzag.c
tarfile-4.tar
-rw-r--r-- 0 jleffler staff 546 May 7 22:34 bo.c
-rw-r--r-- 0 jleffler staff 7305 May 7 22:34 zigzag.c
$ rm -f *
$ cd ..
$ rmdir junk
$
The problem is you are using this command to extract the file:
tar zxf "$f"
What that does is decompress the gzip file and then extract the tar archive all at once, so all of the tarred contents get extracted. Instead, use gzip -d:
gzip -d "$f"
That will only decompress the gzip. As for the logic of only extracting the contents of the first four tar archives but not the fifth, it seems like there is more code logic you are not sharing?

Get ONLY sym links to a file

I looked into "symbolic link: find all files that link to this file" and https://stackoverflow.com/questions/6184849/symbolic-link-find-all-files-that-link-to-this-file but they didn't seem to solve the problem.
if I do find -L -samefile path/to/file
the result contains hard links as well as sym links.
I've been trying to come up with a solution to fetch ONLY sym links, but can't seem to figure it out.
I've been trying to combine -samefile and -type l but that got me nowhere.
man find
says you can combine some options into an expression, but I failed to do it properly.
Any help greatly appreciated!
Ok, I completely misread the question at first.
To find only symlinks to a certain file, I think it's still a good approach to combine multiple commands.
So you know the file you want to link to, let's call it targetfile.txt. We have our directory structure like this:
$ ls -laR
.:
total 24
drwxrwxr-x 4 telorb telorb 4096 Mar 28 09:51 .
drwxrwxr-x 57 telorb telorb 4096 Mar 28 09:49 ..
-rw-rw-r-- 1 telorb telorb 21 Mar 28 09:51 another_file.txt
drwxrwxr-x 2 telorb telorb 4096 Mar 28 09:52 folder1
drwxrwxr-x 2 telorb telorb 4096 Mar 28 09:53 folder2
-rw-rw-r-- 3 telorb telorb 28 Mar 28 09:52 targetfile.txt
./folder1:
total 12
drwxrwxr-x 2 telorb telorb 4096 Mar 28 09:52 .
drwxrwxr-x 4 telorb telorb 4096 Mar 28 09:51 ..
-rw-rw-r-- 3 telorb telorb 28 Mar 28 09:52 hardlink
lrwxrwxrwx 1 telorb telorb 17 Mar 28 09:49 symlink1 -> ../targetfile.txt
./folder2:
total 12
drwxrwxr-x 2 telorb telorb 4096 Mar 28 09:57 .
drwxrwxr-x 4 telorb telorb 4096 Mar 28 09:51 ..
-rw-rw-r-- 3 telorb telorb 28 Mar 28 09:52 hardlink2
lrwxrwxrwx 1 telorb telorb 17 Mar 28 09:57 symlink2_to_targetfile -> ../targetfile.txt
lrwxrwxrwx 1 telorb telorb 19 Mar 28 09:53 symlink_to_anotherfile -> ../another_file.txt
The file folder1/hardlink is a hard link to targetfile.txt, folder1/symlink1 is a symbolic link we are interested in, and the same goes for folder2/symlink2_to_targetfile. There is also another symlink pointing to a different file, which we are not interested in.
The approach I would take is to first use find . -type l to get the symbolic links recursively from the specified folder (keeping the full path information).
Then pipe that to xargs and ls -l to see which file each link points to, and finally grep for our targetfile.txt, so that links not pointing to our desired file are filtered out. The command in full:
find . -type l | xargs -I % ls -l % | grep targetfile.txt
lrwxrwxrwx 1 telorb telorb 17 Mar 28 09:57 ./folder2/symlink2_to_targetfile -> ../targetfile.txt
lrwxrwxrwx 1 telorb telorb 17 Mar 28 09:49 ./folder1/symlink1 -> ../targetfile.txt
The xargs -I % ls -l % sometimes confuses people. With -I %, you are telling xargs that the % sign marks every place where you want xargs to insert the input it receives. So it effectively runs ls -l <path_from_find> for each path that find produces.
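As an aside, GNU find can also match symbolic links by the text of their target with -lname, which avoids parsing ls output entirely. A hedged alternative sketch (it matches the stored link text, so it depends on how the links were created):
find . -lname '*targetfile.txt'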
