How can I pack only the last directory of a path into a tar file?

For example, I have several paths
/usr/local/files/dir1/
file1.txt
file2.txt
file3.txt
/usr/local/files/dir2/
file3.txt
file4.txt
file5.txt
If I run the command:
tar czf my_arch.tar.gz -C /usr/local/files/dir1 .
I get only the contents of dir1 in the archive, without the directory itself.
So the archive contains file1.txt, file2.txt, file3.txt at the top level, but I need this structure inside my archive:
my_arch.tar.gz/dir1/file1.txt, file2.txt, file3.txt
How can I do this?
Thank you.

Try:
cd /usr/local/files
tar -cvzf my_arch.tar.gz dir1
The -C option makes tar change into dir1 before archiving, so it stores the directory's contents rather than the directory itself:
-C, --directory DIR
change to directory DIR

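tar has no option that keeps only the last path component, but you can split the path yourself and combine -C with the directory name.
Here's my suggestion:
#!/bin/bash
mydir=/my_dir/whit/long_and/complicated_path/the_stuff_is_here
# Split the path into its parent and its last component.
dirname=$(dirname "$mydir")
basename=$(basename "$mydir")
# Change into the parent, then archive only the last component.
tar cvf "/tmp/$basename.tar" -C "$dirname" "$basename"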
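To check that an archive built this way really starts at the last directory, you can list it with tar -t. The following is only a sketch under assumptions: the scratch directory from mktemp and the files/dir1 layout are hypothetical stand-ins for /usr/local/files/dir1.

```shell
#!/bin/sh
# Hypothetical throwaway layout standing in for /usr/local/files/dir1.
work=$(mktemp -d)
mkdir -p "$work/files/dir1"
touch "$work/files/dir1/file1.txt" "$work/files/dir1/file2.txt"

target="$work/files/dir1"
parent=$(dirname "$target")    # .../files
name=$(basename "$target")     # dir1

# -C switches to the parent; the member name "dir1" is what gets stored.
tar -czf "$work/my_arch.tar.gz" -C "$parent" "$name"

# List the archive; every entry should start with dir1/.
listing=$(tar -tzf "$work/my_arch.tar.gz")
printf '%s\n' "$listing"
```

Nothing above dir1 (such as files/) should show up in the listing.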

$ tar vczf tmp/export/files.tar.gz -C tmp/export src
structure for files.tar.gz
src
src/app
src/app/main.js
src/app/util
src/app/util/runtime.js

If I understand what you are asking correctly, you want your tar file to contain the directory.
Try it without the -C flag, as in:
tar -czf my_arch.tar.gz /usr/local/files/dir1
Note that this stores the whole path (usr/local/files/dir1/...) in the archive, not just dir1.

If you specify -C, then your directory path is ./. Probably the following works like you want:
$ touch asdf/foo/bar/{1,2,3}
$ tree asdf/
asdf/
└── foo
    └── bar
        ├── 1
        ├── 2
        └── 3
2 directories, 3 files
$ tar -cv -C asdf/foo/bar/ -f asdf.tar ./
./
./3
./2
./1
$ tar tf asdf.tar
./
./3
./2
./1

Related

How to copy folders and subfolders which have selected files

I have a directory oridir with structure as follows:
.
├── DIRA
│   ├── DIRA1
│   │   └── file2.txt
│   └── DIRA2
│       ├── file1.xls
│       └── file1.txt
└── DIRB
    ├── DIRB1
    │   └── file1.txt
    └── DIRB2
        └── file2.xls
I have to copy files which have extension .xls while maintaining the directory structure. So I need to have following directory and files in a newdir folder:
.
├── DIRA
│   └── DIRA2
│       └── file1.xls
└── DIRB
    └── DIRB2
        └── file2.xls
I tried following command but it copies all files and folders:
cp -r oridir newdir
Finding required files can be done as follows:
$ find oridir | grep xls$
oridir/DIRB/DIRB2/file2.xls
oridir/DIRA/DIRA2/file1.xls
Also as follows:
$ find oridir -type f -iname '*.xls'
./oridir/DIRB/DIRB2/file2.xls
./oridir/DIRA/DIRA2/file1.xls
But how do I create these folders and copy the files? How can I achieve this selective creation of directories and copying of files with bash in Linux?
Edit: There are also spaces in some file and directory names.
cp's --parents flag makes cp "use full source file name under DIRECTORY", that is, it recreates each source path below the destination directory.
For example, if recursive glob ** is enabled (shopt -s globstar):
cp --parents origin/**/*.xls target
If recursive glob is not enabled, you have to add a wildcard for each level of the directory hierarchy:
cp --parents origin/*/*/*.xls target
Suppose the destination directory is "dest".
foo.sh
#!/bin/bash
dest=./dest
# IFS= and -r keep leading spaces and backslashes in file names intact.
find . -type f -name "*.xls" | while IFS= read -r f
do
    d=$(dirname "${f}")
    d="${dest}/${d}"
    mkdir -p "${d}"
    cp "${f}" "${d}"
done
Make dirs and files.
$ mkdir -p DIRA/DIRA1
$ mkdir -p DIRA/DIRA2
$ mkdir -p DIRB/DIRB1
$ mkdir -p DIRB/DIRB2
$ touch DIRA/DIRA1/file2.txt
$ touch DIRA/DIRA2/file1.xls
$ touch DIRA/DIRA2/file1.txt
$ touch DIRB/DIRB1/file1.txt
$ touch DIRB/DIRB1/file2.xls
A result is
$ find dest
dest
dest/DIRB
dest/DIRB/DIRB1
dest/DIRB/DIRB1/file2.xls
dest/DIRA
dest/DIRA/DIRA2
dest/DIRA/DIRA2/file1.xls
See Yuji's excellent answer first, but I think tar is also a good option here:
cd oridir; find . -name "*.xls" | xargs tar c | (cd ../newdir; tar x)
You may need to adjust oridir and/or ../newdir depending on the precise paths of your directories.
Possible improvement: Here is a version that may be better in that it will handle files (and paths) with spaces (or other strange characters) in their names, and that uses tar's own options instead of xargs and cd:
cd oridir; find . -name "*.xls" -print0 | tar -c --null -T- | tar -C ../newdir -x
Explanation:
The -print0 and the --null cause the respective commands to separate filenames by the null (ASCII 0) character only.
-T- causes tar to read filenames from standard input.
-C causes tar to cd before extracting.
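Here is a small self-contained sketch of that pipeline, in case it helps to try it on throwaway data first. The scratch tree and its names (with spaces on purpose) are made up for illustration, and -f - is spelled out so that both tar processes use stdin/stdout regardless of the build's default archive.

```shell
#!/bin/sh
# Hypothetical sample tree; directory and file names contain spaces on purpose.
work=$(mktemp -d)
mkdir -p "$work/oridir/DIR A/DIR A2" "$work/oridir/DIRB/DIRB1" "$work/newdir"
touch "$work/oridir/DIR A/DIR A2/file 1.xls" \
      "$work/oridir/DIR A/DIR A2/file1.txt" \
      "$work/oridir/DIRB/DIRB1/file1.txt"

cd "$work/oridir" || exit 1

# The -name test comes before -print0, so only matching paths are emitted.
find . -type f -name '*.xls' -print0 |
    tar -cf - --null -T - |
    tar -xf - -C ../newdir

# Show what arrived in newdir.
copied=$(cd ../newdir && find . -type f | sort)
printf '%s\n' "$copied"
```

Only the .xls file should arrive in newdir, under its original subdirectories.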

Copy files and preserving directory structure

Here's what I have to do: Find all files which are in the directory src (or in its subdirectories) and have str in their name and copy them to dest preserving the subdirectory structure. For example, I have the directory dir1 which contains foo.txt and the subdirectory subdir which also contains foo.txt. After running my script (with str=txt and dest=dir2) dir2 should contain foo.txt and subdir/foo.txt. So far I have come up with this code:
while read -r line; do
cp --parents $line $dest
done <<< "$(find $src -name "*$str*")"
which almost does the job except that it creates dir1 inside of dir2 and the desired files are inside dir2/dir1. I also tried doing it with the -exec option of find but didn't get better results.
IIUC, this can be done with find ... -exec. Let's say we have the following directory:
$ tree
.
└── src
    ├── dir1
    │   └── yet_another_file_src
    └── file_src
2 directories, 2 files
We can copy all files that contain *src* to /tmp/copy-here like this:
$ find . -type f -name "*src*" -exec sh -c 'echo mkdir -p /tmp/copy-here/$(dirname {})' \; -exec sh -c 'echo cp {} /tmp/copy-here/$(dirname {})' \;
mkdir -p /tmp/copy-here/./src
cp ./src/file_src /tmp/copy-here/./src
mkdir -p /tmp/copy-here/./src/dir1
cp ./src/dir1/yet_another_file_src /tmp/copy-here/./src/dir1
Notice that I used echo instead of actually running the commands. Read the output and make sure it is what you want to achieve. If it is, just remove echo like this:
$ find . -type f -name "*src*" -exec sh -c 'mkdir -p /tmp/copy-here/$(dirname {})' \; -exec sh -c 'cp {} /tmp/copy-here/$(dirname {})' \;
$ tree /tmp/copy-here
/tmp/copy-here
└── src
    ├── dir1
    │   └── yet_another_file_src
    └── file_src
2 directories, 2 files
EDIT:
And of course, you can always use rsync:
$ rsync -avz --include "*/" --include="*src*" --exclude="*" "$PWD" /tmp/copy-here
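Another possible variant stays close to the cp --parents loop from the question: if find runs from inside src, the paths start with ./ rather than with the source directory's name, so no extra dir1 level appears under dest. This is a sketch assuming GNU cp (for --parents) and an absolute dest; the scratch paths are hypothetical stand-ins for the question's src, dest and str.

```shell
#!/bin/sh
# Hypothetical stand-ins for the question's src, dest and str variables.
work=$(mktemp -d)
src="$work/dir1"
dest="$work/dir2"      # must be absolute, because we cd before copying
str=txt
mkdir -p "$src/subdir" "$dest"
touch "$src/foo.txt" "$src/subdir/foo.txt" "$src/subdir/other.dat"

# Run find from inside src so the paths handed to cp start with "./"
# instead of "dir1/"; --parents then recreates only the part below src.
(
    cd "$src" || exit 1
    find . -type f -name "*$str*" -exec cp --parents {} "$dest" \;
)

result=$(cd "$dest" && find . -type f | sort)
printf '%s\n' "$result"
```

Because find passes each path as a single argument via -exec, names with spaces need no special quoting here.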

How to exclude specific files with the tar command?

Let's assume I have the following directory structure
dir/
├── subdir
│   ├── dir
│   │   └── TODO.txt
│   └── TODO.txt
└── TODO.txt
I wish to bundle dir/ recursively into a tarball with the command tar on Linux, but I want to exclude the root TODO.txt. How can I specify this with a relative path ?
Attempt #1
tar -czf dir.tar.gz dir/ --exclude='TODO.txt'
Doesn't work : it gets rid of every TODO.txt in the resulting tarball :
dir/
└── subdir
    └── dir
Attempt #2
tar -czf dir.tar.gz dir/ --exclude='dir/TODO.txt'
This also fails, because the dir subdirectory is also matched by this pattern. The resulting tarball hence contains
dir/
└── subdir
    ├── dir
    └── TODO.txt
Is there any way I can specify exactly that I want to exclude the root TODO.txt with a relative path ?
Instead of using dir/ to name what you archive, cd into dir (tar's -C option does this), then name it as "." instead. The folder name . will never appear later in any file's path, so it serves as a robust anchor. Then use --transform= to make the paths in the archive begin with dir/ again.
Demonstration
without filter:
$ tar -czf dir.tar.gz -v dir
dir/
dir/TODO.txt
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
cd into dir, name it as .:
$ tar -czf dir.tar.gz -v -C dir .
./
./TODO.txt
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
exclude, anchoring on .:
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' .
./
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
change . back to dir inside the archive (--show-transformed-names makes tar show the names as they go into the archive):
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' --transform='s/^\./dir/g' --show-transformed-names .
dir/
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
From @arkascha's answer:
find dir/ -type f | grep -v '^dir/TODO\.txt$' > files.txt
then
tar -czf dir.tar.gz -T files.txt
In the first line, there are two tricks to pay attention to:
The -type f option. Without it, directories are included in find's output. This is bad, because tar archives a listed directory recursively, so each file would be included as many times as its depth in the file hierarchy.
The ^ in grep's regex: it anchors the pattern at the beginning of the path, so only the root TODO.txt is excluded (the escaped dot and the trailing $ keep the match exact).
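If file names may contain spaces or newlines, the same idea can be tried without the intermediate files.txt by letting find do the filtering and passing NUL-separated names to tar. This is a sketch under assumptions: it relies on tar accepting --null together with -T - (GNU tar does), and the scratch tree is a hypothetical copy of the one in the question.

```shell
#!/bin/sh
# Hypothetical copy of the question's tree, built in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p dir/subdir/dir
touch dir/TODO.txt dir/subdir/TODO.txt dir/subdir/dir/TODO.txt

# -path compares against the whole path as find prints it, so only the
# top-level dir/TODO.txt is skipped. NUL separators keep odd names intact.
find dir -type f ! -path 'dir/TODO.txt' -print0 |
    tar -czf dir.tar.gz --null -T -

members=$(tar -tzf dir.tar.gz)
printf '%s\n' "$members"
```

As with the files.txt approach, only regular files are listed, so empty directories would not end up in the archive.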

diff to output only the file names

I'm looking to run a Linux command that will recursively compare two directories and output only the file names of what is different. This includes anything that is present in one directory and not the other or vice versa, and text differences.
From the diff man page:
-q Report only whether the files differ, not the details of the differences.
-r When comparing directories, recursively compare any subdirectories found.
Example command:
diff -qr dir1 dir2
Example output (depends on locale):
$ ls dir1 dir2
dir1:
same-file different only-1
dir2:
same-file different only-2
$ diff -qr dir1 dir2
Files dir1/different and dir2/different differ
Only in dir1: only-1
Only in dir2: only-2
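Since the wording of these messages depends on the locale, a script that wants to match them can pin the locale first. A minimal sketch, using hypothetical scratch directories that mirror the example:

```shell
#!/bin/sh
# Hypothetical scratch directories mirroring the example above.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1 dir2
echo same > dir1/same-file;  echo same > dir2/same-file
echo one  > dir1/different;  echo two  > dir2/different
touch dir1/only-1 dir2/only-2

# LC_ALL=C pins the message language so the text can be matched reliably.
# diff exits with status 1 when differences exist, which is not an error here.
status=0
report=$(LC_ALL=C diff -qr dir1 dir2) || status=$?
printf '%s\n' "$report"
```

An exit status of 0 means the trees are identical, 1 means differences were found, and anything higher indicates trouble.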
You can also use rsync
rsync -rv --size-only --dry-run /my/source/ /my/dest/ > diff.out
If you want to get a list of files that are only in one directory and not their sub directories and only their file names:
diff -q /dir1 /dir2 | grep /dir1 | grep -E "^Only in*" | sed -n 's/[^:]*: //p'
If you want to recursively list all the files and directories that are different with their full paths:
diff -rq /dir1 /dir2 | grep -E "^Only in /dir1*" | sed -n 's/://p' | awk '{print $3"/"$4}'
This way you can apply different commands to all the files.
For example I could remove all the files and directories that are in dir1 but not dir2:
diff -rq /dir1 /dir2 | grep -E "^Only in /dir1*" | sed -n 's/://p' | awk '{print $3"/"$4}' | xargs -I {} rm -r {}
The approach of running diff -qr old/ new/ has one major drawback: it may miss files in newly created directories. E.g. in the example below the file data/pages/playground/playground.txt is not in the output of diff -qr old/ new/ whereas the directory data/pages/playground/ is (search for playground.txt in your browser to quickly compare). I also posted the following solution on Unix & Linux Stack Exchange, but I'll copy it here as well:
To create a list of new or modified files programmatically the best solution I could come up with is using rsync, sort, and uniq:
(rsync -rcn --out-format="%n" old/ new/ && rsync -rcn --out-format="%n" new/ old/) | sort | uniq
Let me explain with this example: we want to compare two dokuwiki releases to see which files were changed and which ones were newly created.
We fetch the tars with wget and extract them into the directories old/ and new/:
wget http://download.dokuwiki.org/src/dokuwiki/dokuwiki-2014-09-29d.tgz
wget http://download.dokuwiki.org/src/dokuwiki/dokuwiki-2014-09-29.tgz
mkdir old && tar xzf dokuwiki-2014-09-29.tgz -C old --strip-components=1
mkdir new && tar xzf dokuwiki-2014-09-29d.tgz -C new --strip-components=1
Running rsync one way might miss newly created files as the comparison of rsync and diff shows here:
rsync -rcn --out-format="%n" old/ new/
yields the following output:
VERSION
doku.php
conf/mime.conf
inc/auth.php
inc/lang/no/lang.php
lib/plugins/acl/remote.php
lib/plugins/authplain/auth.php
lib/plugins/usermanager/admin.php
Running rsync in only one direction misses the newly created files, and the other way round would miss deleted files. Compare the output of diff:
diff -qr old/ new/
yields the following output:
Files old/VERSION and new/VERSION differ
Files old/conf/mime.conf and new/conf/mime.conf differ
Only in new/data/pages: playground
Files old/doku.php and new/doku.php differ
Files old/inc/auth.php and new/inc/auth.php differ
Files old/inc/lang/no/lang.php and new/inc/lang/no/lang.php differ
Files old/lib/plugins/acl/remote.php and new/lib/plugins/acl/remote.php differ
Files old/lib/plugins/authplain/auth.php and new/lib/plugins/authplain/auth.php differ
Files old/lib/plugins/usermanager/admin.php and new/lib/plugins/usermanager/admin.php differ
Running rsync both ways and sorting the output to remove duplicates reveals that the directory data/pages/playground/ and the file data/pages/playground/playground.txt were missed initially:
(rsync -rcn --out-format="%n" old/ new/ && rsync -rcn --out-format="%n" new/ old/) | sort | uniq
yields the following output:
VERSION
conf/mime.conf
data/pages/playground/
data/pages/playground/playground.txt
doku.php
inc/auth.php
inc/lang/no/lang.php
lib/plugins/acl/remote.php
lib/plugins/authplain/auth.php
lib/plugins/usermanager/admin.php
rsync is run with these arguments:
-r to "recurse into directories",
-c to also compare files of identical size and only "skip based on checksum, not mod-time & size",
-n to "perform a trial run with no changes made", and
--out-format="%n" to "output updates using the specified FORMAT", which is "%n" here for the file name only
The output (list of files) of rsync in both directions is combined and sorted using sort, and this sorted list is then condensed by removing all duplicates with uniq
On my Linux system, to get just the file names:
diff -q /dir1 /dir2|cut -f2 -d' '
I have a directory.
$ tree dir1
dir1
├── a
│   └── 1.txt
├── b
│   └── 2.txt
└── c
    ├── 3.txt
    ├── 4.txt
    └── d
        └── 5.txt
4 directories, 5 files
I have another directory.
$ tree dir2
dir2
├── a
│   └── 1.txt
├── b
└── c
    ├── 3.txt
    ├── 5.txt
    └── d
        └── 5.txt
4 directories, 4 files
I can diff two directories.
$ diff -u <(cd dir1; find . -type f | sort) <(cd dir2; find . -type f | sort)
--- /dev/fd/11 2022-01-21 20:27:15.000000000 +0900
+++ /dev/fd/12 2022-01-21 20:27:15.000000000 +0900
@@ -1,5 +1,4 @@
 ./a/1.txt
-./b/2.txt
 ./c/3.txt
-./c/4.txt
+./c/5.txt
 ./c/d/5.txt
rsync -rvc --delete --size-only --dry-run source dir target dir

How to use 'mv' command to move files except those in a specific directory?

I am wondering - how can I move all the files in a directory except those files in a specific directory (as 'mv' does not have a '--exclude' option)?
Let's assume the directory structure is like this:
|parent
|--child1
|--child2
|--grandChild1
|--grandChild2
|--grandChild3
|--grandChild4
|--grandChild5
|--grandChild6
And we need to move files so that it would appear like,
|parent
|--child1
| |--grandChild1
| |--grandChild2
| |--grandChild3
| |--grandChild4
| |--grandChild5
| |--grandChild6
|--child2
In this case, you need to exclude the two directories child1 and child2, and move the rest of the directories into the child1 directory.
Use (in bash this needs extended globbing, enabled with shopt -s extglob):
mv !(child1|child2) child1
This moves all of the remaining directories into the child1 directory.
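The !(...) pattern depends on bash's extglob option. Where that is not available, a plain loop with a case statement can do roughly the same filtering in any POSIX shell. This is only a sketch; the scratch copy of the layout and the entry names are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch copy of the question's layout.
work=$(mktemp -d)
parent="$work/parent"
mkdir -p "$parent/child1" "$parent/child2" \
         "$parent/grandChild1" "$parent/grand Child2"

# Walk the top-level entries and skip the two names that must stay put.
for entry in "$parent"/*; do
    case "${entry##*/}" in
        child1|child2) continue ;;
    esac
    mv -- "$entry" "$parent/child1/"
done

top=$(ls "$parent")
moved=$(ls "$parent/child1")
printf 'top level:\n%s\nunder child1:\n%s\n' "$top" "$moved"
```

Note that the * glob skips hidden entries (names starting with a dot), so those would stay where they are.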
Since find does have an exclude option, use find + xargs + mv:
find /source/directory -mindepth 1 -maxdepth 1 -name ignore-directory-name -prune -o -print0 | xargs -0 mv --target-directory=/target/directory
Note that this is almost copied from the find man page (I think using mv --target-directory is better than cpio).
First get the names of files and folders and exclude whichever you want:
ls --ignore=file1 --ignore=folder1 --ignore=regular-expression1 ...
Then pass filtered names to mv as the first parameter and the second parameter will be the destination:
mv $(ls --ignore=file1 --ignore=folder1 --ignore=regular-expression1 ...) destination/
This isn't exactly what you asked for, but it might do the job:
mv the-folder-you-want-to-exclude somewhere-outside-of-the-main-tree
mv the-tree where-you-want-it
mv the-excluded-folder original-location
(Essentially, move the excluded folder out of the larger tree to be moved.)
So, if I have a/ and I want to exclude a/b/c/*:
mv a/b/c ../c
mv a final_destination
mkdir -p a/b
mv ../c a/b/c
Or something like that. Otherwise, you might be able to get find to help you.
This will move all files at or below the current directory not in the ./exclude/ directory to /wherever...
find -E . -not -type d -and -not -regex '\./exclude/.*' -exec echo mv {} /wherever \;
ls | grep -v exclude-dir | xargs -t -I '{}' mv {} exclude-dir
rename your directory to make it hidden so the wildcard does not see it:
mv specific_dir .specific_dir
mv * ../other_dir
#!/bin/bash
touch apple banana carrot dog cherry
mkdir fruit
F="apple banana carrot dog cherry"
mv ${F/dog/} fruit
# This removes 'dog' from the list F, so it stays in the current directory and is not moved to 'fruit'.
Inspired by @user13747357's answer.
First you can list the files and filter them with:
ls | egrep -v '(dir_name|file_name.ext)'
Then you can run the following command to move the files except the specific ones:
mv $(ls | egrep -v '(dir_name|file_name.ext)') target_dir
* Note that I tested this inside a specific directory. Cross-directory operation should be more carefully executed :)
Suppose your directory is
.
├── dir1
│ └── a.txt
├── dir2
│ ├── b.txt
│ └── hello.c
├── file1.txt
├── file2.txt
└── file3.txt
and you want to move file1.txt, file2.txt and file3.txt into dir2.
You can use
mv $(ls -p | grep -v /) dir2
because ls -p | grep -v / prints everything in the current directory except directories.
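Parsing ls output breaks on names that contain spaces. One possible alternative is to let find pick the regular files in the current directory. This is a sketch, assuming a find that supports -maxdepth (GNU, BSD and busybox do); the scratch layout and the extra "file 3.txt" are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch copy of the layout above, plus one name with a space.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1 dir2
touch dir1/a.txt dir2/b.txt file1.txt file2.txt "file 3.txt"

# -maxdepth 1 keeps find in the current directory; -type f selects regular
# files only, so dir1 and dir2 themselves are left alone.
find . -maxdepth 1 -type f -exec mv {} dir2/ \;

remaining=$(find . -maxdepth 1 -type f | wc -l | tr -d ' ')
in_dir2=$(ls dir2)
printf '%s\n' "$in_dir2"
```

Afterwards no regular files should be left at the top level, and dir1 is untouched.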
For example, if I want to move all files/directories - except a specified file or directory - inside "/var/www/html" to a sub-folder named "my_sub_domain", then I use "mv" with the pattern "!(what_to_exclude)":
$ cd /var/www/html
$ mv !(my_sub_domain) my_sub_domain
To exclude more, I use "|" to separate file/directory names:
$ mv !(my_sub_domain|test1.html) my_sub_domain
mv * exclude-dir
was the perfect solution for me
