Linux command to find the difference between two folders

I have two folders, each with subfolders, and I want to see whether any subfolders in one folder do not exist in the other folder. I have tried this command:
diff -r file1 file2
but it does not provide the results that I want.
For example, if file1 contains three folders A, B, and C, and file2 contains one folder B, then the output should be folders A and C.

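diff -r dir1 dir2 | grep '^Only in dir1' | awk '{print $4}' > difference1.txt
Explanation:
diff -r dir1 dir2 shows which files exist only in dir1, which exist only in dir2, and the changes in files present in both directories, if any.
grep '^Only in dir1' keeps only the lines for files that exist only in dir1 (a plain grep dir1 would also match the lines reported for files that differ).
awk prints only the file name (the fourth field, so names containing spaces are cut short).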
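A small variation on that pipeline, as a sketch rather than a definitive fix: it strips the "Only in ...: " prefix with sed instead of picking a field with awk, so names containing spaces survive. The scratch directory and the file1/file2 layout are assumptions taken from the question's example, and LC_ALL=C is set because diff's messages are translated in other locales.

```shell
#!/bin/sh
# Rebuild the question's example in a scratch directory (hypothetical layout).
work=$(mktemp -d)
mkdir -p "$work/file1/A" "$work/file1/B" "$work/file1/C" "$work/file2/B"

# Keep only the "Only in file1..." lines and strip everything up to the name.
# LC_ALL=C pins diff's messages to English so the pattern matches.
only_in_file1=$(
  cd "$work" &&
  LC_ALL=C diff -rq file1 file2 | sed -n 's/^Only in file1[^:]*: //p'
)

printf '%s\n' "$only_in_file1"
rm -rf "$work"
```

For the layout above this prints A and C, one per line. For entries deeper in the tree it prints only the last path component, not the full path.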

Related

Find main directory name which doesn't contain a file in its sub directory

I have three folders: test1, test2, and test3. Each of these three parent folders has the same two subfolders, dir1 and dir2. A text file (file.txt) will be present in dir1 of some of the parent directories.
Is there a single-line find or grep command that prints the parent directory name if file.txt doesn't exist in its dir1 subfolder?
Example:
ls test1
o/p: dir1, dir2
ls test2
o/p: dir1/file.txt,dir2
ls test3
o/p: dir1/file.txt,dir2
I need a command that gives test1 as output.
Something like the following should do what you need:
for f in test*/dir1/
do
    [[ $(find "$f" -name "file.txt") ]] || dirname "$f"
done
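If bash is not available, a plain sh variant might look like the sketch below. It is not the same check: it only tests for file.txt directly inside dir1, whereas the find-based test above searches dir1 recursively. The scratch directory and its contents are made up to mirror the question.

```shell
#!/bin/sh
# Hypothetical layout from the question: only test1/dir1 lacks file.txt.
work=$(mktemp -d)
mkdir -p "$work/test1/dir1" "$work/test1/dir2" \
         "$work/test2/dir1" "$work/test2/dir2" \
         "$work/test3/dir1" "$work/test3/dir2"
touch "$work/test2/dir1/file.txt" "$work/test3/dir1/file.txt"

# Print each parent whose dir1 has no file.txt directly inside it.
missing=$(
  cd "$work" &&
  for d in test*/dir1/; do
    [ -e "${d}file.txt" ] || dirname "$d"
  done
)

printf '%s\n' "$missing"
rm -rf "$work"
```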

Compare directory structures

Is it possible to compare the directory structures of two different servers? I need to compare the directory structure of a test server with that of a production server and list the directories that exist on prod but not on test (the test server has a lot less data).
I am using the following rsync command:
rsync -rvnc --delete userid@servername:/directory /directory
Besides the rsync command above, I have also tried running find on both servers and comparing the two outputs with sdiff:
find directory1 -type d -printf "%P\n" | sort > file1
find directory2 -type d -printf "%P\n" | sort > file2
sdiff file1 file2 > file3
Which approach would be better?
You can use rsync -ai dir1/ dir2/ --dry-run to create a machine-readable list of changes between dir1 and dir2.
source: https://stackoverflow.com/a/42160545/2536029
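If only the directories missing on test are wanted, the find approach from the question can be finished with comm instead of sdiff. This is a rough sketch: the prod and test trees below are local stand-ins for the two servers, and the directory names are invented. In practice each listing would be produced on its own server and the two files brought together.

```shell
#!/bin/sh
# Local stand-ins for the two servers (hypothetical paths and names).
work=$(mktemp -d)
mkdir -p "$work/prod/app/logs" "$work/prod/app/data" "$work/test/app/data"

# On each server: list directories relative to the root, sorted.
(cd "$work/prod" && find . -type d | LC_ALL=C sort) > "$work/prod.list"
(cd "$work/test" && find . -type d | LC_ALL=C sort) > "$work/test.list"

# comm -23 keeps lines that appear only in the first (prod) list.
only_on_prod=$(LC_ALL=C comm -23 "$work/prod.list" "$work/test.list")

printf '%s\n' "$only_on_prod"
rm -rf "$work"
```

Both lists must be sorted with the same collation for comm to work, which is why LC_ALL=C is set on sort and comm alike.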

Linux (ubuntu): copying directory to another directory which does not exist

I am going through a Linux tutorial, which says that cp -r dir1 dir2 will copy directory dir1 into dir2, and that if dir2 does not exist, dir2 will be created and then dir1 will be copied into it.
However, when I try this on my Ubuntu 12.04 system, that is not what happens.
sps@sps-Inspiron-N5110:~$ ls | grep dir
sps@sps-Inspiron-N5110:~$ mkdir dir1
sps@sps-Inspiron-N5110:~$ ls | grep dir
dir1
sps@sps-Inspiron-N5110:~$ cp -r dir1 dir2
sps@sps-Inspiron-N5110:~$ ls | grep dir
dir1
dir2
sps@sps-Inspiron-N5110:~$ ls ./dir2
sps@sps-Inspiron-N5110:~$
So you see that although dir2 is created, it is empty. Confirming this below.
sps@sps-Inspiron-N5110:~$ cd dir2
sps@sps-Inspiron-N5110:~/dir2$ ls
sps@sps-Inspiron-N5110:~/dir2$
Could someone tell me whether this is expected or whether something is going wrong? Why is dir1 not copied to dir2?
Thanks.
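The transcript looks consistent with how cp -r usually behaves: dir1 was empty when it was copied, so dir2 came out as an empty copy of it. The sketch below tries to make the two cases visible. The file a.txt and the scratch directory are illustrative additions, not part of the original session.

```shell
#!/bin/sh
# Scratch area; dir1, dir2 and a.txt are illustrative names.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1
touch dir1/a.txt

# dir2 does not exist yet: cp creates it as a copy of dir1.
cp -r dir1 dir2
first_copy=$(ls dir2)
echo "after first copy, dir2 contains:"
echo "$first_copy"

# dir2 exists now: the same command copies dir1 into it, as dir2/dir1.
cp -r dir1 dir2
second_copy=$(ls dir2)
echo "after second copy, dir2 contains:"
echo "$second_copy"

cd / && rm -rf "$work"
```

So the first run gives dir2/a.txt, and only a second run (with dir2 already present) produces dir2/dir1.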

Diff between files in same directory structure in linux

I have two directories that contain the same directory structure, with the same directory names (but possibly a different number of files). How can I find the differences between all the files and their contents in Linux?
Here is an example
\dir1
\subdir1
\file1
\subdir2
\file2
\file3
dir2
\subdir1
\file1
\subdir2
\file2
\file3
\file4
The contents of file1 in dir1 and file1 in dir2 are different. The contents of file2 in dir1 and file2 in dir2 are different. I can use
$ diff dir1/subdir1/file1 dir2/subdir1/file1
$ diff dir1/subdir2/file2 dir2/subdir2/file2
But then I have to run diff manually for each file. I would like an automatic way.
If you want to list the files that differ, try:
diff -rq dir1 dir2
If you want to list the difference inside each file, remove the q:
diff -r dir1 dir2

diff to output only the file names

I'm looking to run a Linux command that will recursively compare two directories and output only the file names of what is different. This includes anything that is present in one directory and not the other or vice versa, and text differences.
From the diff man page:
-q Report only whether the files differ, not the details of the differences.
-r When comparing directories, recursively compare any subdirectories found.
Example command:
diff -qr dir1 dir2
Example output (depends on locale):
$ ls dir1 dir2
dir1:
same-file different only-1
dir2:
same-file different only-2
$ diff -qr dir1 dir2
Files dir1/different and dir2/different differ
Only in dir1: only-1
Only in dir2: only-2
You can also use rsync
rsync -rv --size-only --dry-run /my/source/ /my/dest/ > diff.out
If you want only the names of the files that exist in /dir1 but not in /dir2, without descending into subdirectories:
diff -q /dir1 /dir2 | grep /dir1 | grep -E "^Only in" | sed -n 's/[^:]*: //p'
If you want to recursively list, with their full paths, all the files and directories that exist only in /dir1:
diff -rq /dir1 /dir2 | grep -E "^Only in /dir1" | sed -n 's/://p' | awk '{print $3"/"$4}'
This way you can apply different commands to all the files.
For example, I could remove all the files and directories that are in dir1 but not in dir2:
diff -rq /dir1 /dir2 | grep -E "^Only in /dir1" | sed -n 's/://p' | awk '{print $3"/"$4}' | xargs -I {} rm -r {}
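Since that last command deletes things, it may be worth trying a dry run first. The sketch below rebuilds the full paths with a single sed substitution, which keeps names with spaces intact where awk's field splitting would not. The trees and file names are invented for the demo, and it still assumes no directory name contains ": " or a newline.

```shell
#!/bin/sh
# Hypothetical trees; one entry has a space in its name on purpose.
work=$(mktemp -d)
mkdir -p "$work/dir1/sub/old stuff" "$work/dir2/sub"
touch "$work/dir1/extra.txt" "$work/dir1/sub/keep.txt" "$work/dir2/sub/keep.txt"

# Turn "Only in DIR: NAME" into "DIR/NAME" for entries under dir1 only.
# Splitting on the first ": " keeps spaces in NAME intact.
only_in_dir1=$(
  cd "$work" &&
  LC_ALL=C diff -rq dir1 dir2 | sed -n 's|^Only in \(dir1[^:]*\): |\1/|p'
)

# Dry run: show what would be removed instead of removing it.
printf '%s\n' "$only_in_dir1" | while IFS= read -r path; do
  printf 'would remove: %s\n' "$path"
done
rm -rf "$work"
```

Once the printed list looks right, the printf inside the loop could be swapped for rm -r -- "$path".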
The approach of running diff -qr old/ new/ has one major drawback: it may miss files in newly created directories. E.g. in the example below the file data/pages/playground/playground.txt is not in the output of diff -qr old/ new/ whereas the directory data/pages/playground/ is (search for playground.txt in your browser to quickly compare). I also posted the following solution on Unix & Linux Stack Exchange, but I'll copy it here as well:
To create a list of new or modified files programmatically the best solution I could come up with is using rsync, sort, and uniq:
(rsync -rcn --out-format="%n" old/ new/ && rsync -rcn --out-format="%n" new/ old/) | sort | uniq
Let me explain with this example: we want to compare two dokuwiki releases to see which files were changed and which ones were newly created.
We fetch the tars with wget and extract them into the directories old/ and new/:
wget http://download.dokuwiki.org/src/dokuwiki/dokuwiki-2014-09-29d.tgz
wget http://download.dokuwiki.org/src/dokuwiki/dokuwiki-2014-09-29.tgz
mkdir old && tar xzf dokuwiki-2014-09-29.tgz -C old --strip-components=1
mkdir new && tar xzf dokuwiki-2014-09-29d.tgz -C new --strip-components=1
Running rsync one way might miss newly created files as the comparison of rsync and diff shows here:
rsync -rcn --out-format="%n" old/ new/
yields the following output:
VERSION
doku.php
conf/mime.conf
inc/auth.php
inc/lang/no/lang.php
lib/plugins/acl/remote.php
lib/plugins/authplain/auth.php
lib/plugins/usermanager/admin.php
Running rsync in only one direction misses the newly created files, and running it the other way round would miss deleted files. Compare the output of diff:
diff -qr old/ new/
yields the following output:
Files old/VERSION and new/VERSION differ
Files old/conf/mime.conf and new/conf/mime.conf differ
Only in new/data/pages: playground
Files old/doku.php and new/doku.php differ
Files old/inc/auth.php and new/inc/auth.php differ
Files old/inc/lang/no/lang.php and new/inc/lang/no/lang.php differ
Files old/lib/plugins/acl/remote.php and new/lib/plugins/acl/remote.php differ
Files old/lib/plugins/authplain/auth.php and new/lib/plugins/authplain/auth.php differ
Files old/lib/plugins/usermanager/admin.php and new/lib/plugins/usermanager/admin.php differ
Running rsync both ways and sorting the output to remove duplicates reveals that the directory data/pages/playground/ and the file data/pages/playground/playground.txt were missed initially:
(rsync -rcn --out-format="%n" old/ new/ && rsync -rcn --out-format="%n" new/ old/) | sort | uniq
yields the following output:
VERSION
conf/mime.conf
data/pages/playground/
data/pages/playground/playground.txt
doku.php
inc/auth.php
inc/lang/no/lang.php
lib/plugins/acl/remote.php
lib/plugins/authplain/auth.php
lib/plugins/usermanager/admin.php
rsync is run with these arguments:
-r to "recurse into directories",
-c to also compare files of identical size and only "skip based on checksum, not mod-time & size",
-n to "perform a trial run with no changes made", and
--out-format="%n" to "output updates using the specified FORMAT", which is "%n" here for the file name only
The output (list of files) of rsync in both directions is combined and sorted using sort, and this sorted list is then condensed by removing all duplicates with uniq
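Where rsync is not installed, roughly the same list can be approximated with find and cmp. To be clear, this is a substitute technique and not the rsync method described above: changed_files is a hypothetical helper name, the tiny old/new trees only borrow a few file names from the dokuwiki example, and directories themselves are not listed, only files.

```shell
#!/bin/sh
# changed_files OLD NEW (hypothetical helper): print relative paths of files
# that differ in content or exist on only one side.
changed_files() {
  old=$1
  new=$2
  {
    # Files in OLD that are missing from NEW or differ from their counterpart.
    (cd "$old" && find . -type f) | while IFS= read -r f; do
      cmp -s "$old/$f" "$new/$f" 2>/dev/null || printf '%s\n' "$f"
    done
    # Files that exist only in NEW, including those in brand-new directories.
    (cd "$new" && find . -type f) | while IFS= read -r f; do
      [ -e "$old/$f" ] || printf '%s\n' "$f"
    done
  } | sed 's|^\./||' | LC_ALL=C sort -u
}

# Tiny stand-in for the two releases (README is an invented unchanged file).
work=$(mktemp -d)
mkdir -p "$work/old" "$work/new/data/pages/playground"
echo 1 > "$work/old/VERSION"
echo 2 > "$work/new/VERSION"
echo same > "$work/old/README"
echo same > "$work/new/README"
echo hi > "$work/new/data/pages/playground/playground.txt"

changed=$(changed_files "$work/old" "$work/new")
printf '%s\n' "$changed"
rm -rf "$work"
```

Like rsync -c, cmp compares file contents rather than timestamps, so it reads every file and can be slow on large trees.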
On my Linux system, to get just the file names:
diff -q /dir1 /dir2|cut -f2 -d' '
I have a directory.
$ tree dir1
dir1
├── a
│   └── 1.txt
├── b
│   └── 2.txt
└── c
    ├── 3.txt
    ├── 4.txt
    └── d
        └── 5.txt
4 directories, 5 files
I have another directory.
$ tree dir2
dir2
├── a
│   └── 1.txt
├── b
└── c
    ├── 3.txt
    ├── 5.txt
    └── d
        └── 5.txt
4 directories, 4 files
I can diff two directories.
$ diff -u <(cd dir1; find . -type f | sort) <(cd dir2; find . -type f | sort)
--- /dev/fd/11 2022-01-21 20:27:15.000000000 +0900
+++ /dev/fd/12 2022-01-21 20:27:15.000000000 +0900
@@ -1,5 +1,4 @@
 ./a/1.txt
-./b/2.txt
 ./c/3.txt
-./c/4.txt
+./c/5.txt
 ./c/d/5.txt
rsync -rvc --delete --size-only --dry-run source_dir target_dir
