How to compare the contents of two directories in bash? - linux

Let's say there are two directories,
/path1 and /path2
for example:
/path1/bin
/path1/lib
/path1/...
/path2/bin
/path2/lib
/path2/...
One needs to know whether they are identical in content (both the names of the files and the contents of the files) and, if not, to have the differences listed.
How to do this in Linux?
Is there some Bash/Zsh command for it?

The diff command can show all the differences between two directories:
diff -qr /path1 /path2
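The -q flag limits the report to which files differ or are missing, and -r recurses into subdirectories. On the layout above, the output might look something like this (with hypothetical file names):
Files /path1/bin/foo and /path2/bin/foo differ
Only in /path1/lib: bar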

Someone suggested this already but deleted their answer, not sure why. Try using rsync:
rsync -avni /path1/ /path2
This program normally syncs two folders, but with -n it does a dry run instead, reporting what would change without touching anything.
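Here -a makes rsync compare permissions, ownership, and timestamps as well as content, -v is verbose, and -i itemizes each difference it finds. One thing to keep in mind: a single run only looks in one direction, so files that exist only in /path2 go unreported. A sketch of a two-way check over the same paths:
# Run the comparison in both directions to catch files unique to either side
rsync -avni /path1/ /path2/
rsync -avni /path2/ /path1/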

I'm using this script for such a task:
diff <(cd "$dir1"; find . -type f -printf "%p %s\n" | sort) \
     <(cd "$dir2"; find . -type f -printf "%p %s\n" | sort)
Feel free to adjust the command in the <(...) parts to your specific needs. This version uses find to print each file's path and size; other fields are possible, of course.
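If matching sizes isn't a strong enough signal, a variant of the same idea (an adjustment of mine, not part of the original answer) compares content checksums instead:
# Compare md5 checksums of every file, sorted by path so both sides line up
diff <(cd "$dir1"; find . -type f -exec md5sum {} + | sort -k2) \
     <(cd "$dir2"; find . -type f -exec md5sum {} + | sort -k2)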

Related

linux diff on folder and file structure [duplicate]

I have two directories with the same list of files. I need to compare all the files present in both the directories using the diff command. Is there a simple command line option to do it, or do I have to write a shell script to get the file listing and then iterate through them?
You can use the diff command for that:
diff -bur folder1/ folder2/
This will output a recursive diff that ignores whitespace, with unified context:
-b ignores changes in the amount of whitespace
-u produces unified context (3 lines before and after)
-r means recursive
If you are only interested in seeing which files differ, you may use:
diff -qr dir_one dir_two | sort
Option "q" will only show the files that differ but not the content that differ, and "sort" will arrange the output alphabetically.
Diff has an option -r which is meant to do just that.
diff -r dir1 dir2
diff can not only compare two files, it can, by using the -r option, walk entire directory trees, recursively checking differences between subdirectories and files that occur at comparable points in each tree.
$ man diff
...
-r --recursive
Recursively compare any subdirectories found.
...
Another nice option is the über-diff-tool diffoscope:
$ diffoscope a b
It can also emit diffs as JSON, HTML, Markdown, and more.
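For example, to write an HTML report (using diffoscope's --html option, to the best of my knowledge):
diffoscope --html report.html a b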
If you specifically don't want to compare the contents of files and only want to check which ones are not present in both directories, you can compare lists of files generated by another command.
diff <(find DIR1 -printf '%P\n' | sort) <(find DIR2 -printf '%P\n' | sort) | grep '^[<>]'
-printf '%P\n' tells find to not prefix output paths with the root directory.
I've also added sort to make sure the order of files will be the same in both calls of find.
The grep at the end keeps only the lines starting with < or >, dropping diff's positional headers.
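Lines starting with < name files present only in DIR1, and lines starting with > name files present only in DIR2. A hypothetical sample:
< only_in_dir1.txt
> only_in_dir2.txt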
If it's GNU diff then you should just be able to point it at the two directories and use the -r option.
Otherwise, try using
for i in $(\ls -d ./dir1/*); do diff ${i} dir2; done
N.B. As pointed out by Dennis in the comments section, you don't actually need to do the command substitution on the ls. I've been doing this for so long that I'm pretty much doing this on autopilot and substituting the command I need to get my list of files for comparison.
Also I forgot to add that I do '\ls' to temporarily disable my alias of ls to GNU ls so that I lose the colour formatting info from the listing returned by GNU ls.
When working with git/svn or multiple git/svn instances on disk this has been one of the most useful things for me over the past 5-10 years, that somebody might find useful:
diff -burN /path/to/directory1 /path/to/directory2 | grep +++
or:
git diff /path/to/directory1 | grep +++
It gives you a snapshot of the different files that were touched without having to "less" or "more" the output. Then you just diff on the individual files.
In practice the question often arises together with some constraints. In that case, the following solution template may come in handy.
cd dir1
find . \( -name '*.txt' -o -iname '*.md' \) | xargs -i diff -u '{}' 'dir2/{}'
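Note that -i is a deprecated GNU xargs spelling; the same template with the current -I option looks like this:
find . \( -name '*.txt' -o -iname '*.md' \) | xargs -I{} diff -u '{}' 'dir2/{}'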
Here is a script to show differences between files in two folders. It works recursively. Change dir1 and dir2.
(
  search() {
    for i in "$1"/*; do
      # regular file: diff it against its counterpart in the second tree
      [ -f "$i" ] && (diff "$1/${i##*/}" "$2/${i##*/}" || echo "files: $1/${i##*/} $2/${i##*/}")
      # directory: recurse into both trees
      [ -d "$i" ] && search "$1/${i##*/}" "$2/${i##*/}"
    done
  }
  search "dir1" "dir2"
)
Try this:
diff -rq /path/to/folder1 /path/to/folder2

Counting number of files in a directory with an OSX terminal command

I'm looking for a command that returns a specific directory's file count as a number; I would type it into the terminal and it would give me the count.
I've already tried echo find "'directory' | wc -l" but that didn't work. Any ideas?
You seem to have the right idea. I'd use -type f to find only files:
$ find some_directory -type f | wc -l
If you only want files directly under this directory and not to search recursively through subdirectories, you could add the -maxdepth flag:
$ find some_directory -maxdepth 1 -type f | wc -l
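If the count needs to be robust against file names that contain newlines, one adjustment (my variant, not part of the original answer) is to count NUL-terminated names instead of lines:
# -print0 separates names with NUL bytes; count the NULs rather than newlines
$ find some_directory -type f -print0 | tr -dc '\0' | wc -c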
Open the terminal and switch to the location of the directory.
Type in:
find . -type f | wc -l
This searches inside the current directory (that's what the . stands for) for all files, and counts them.
The fastest way to obtain the number of files within a directory is by obtaining the value of that directory's kMDItemFSNodeCount metadata attribute.
mdls -name kMDItemFSNodeCount directory_name -raw | xargs
The above command has a major advantage over find . -type f | wc -l in that it returns the count almost instantly, even for directories which contain millions of files.
Please note that the command counts all file-system nodes, not just regular files.
I don't understand why folks are using 'find' because for me it's a lot easier to just pipe in 'ls' like so:
ls *.png | wc -l
to find the number of png images in the current directory.
I'm using tree; this is the way:
tree ph

How to recursively remove different files in two directories

I have two directory trees; one contains 200 .txt files and the other contains 210 .txt files. I need a script to find the file names that differ and remove those files from the directory.
There are probably better ways, but this is what comes to mind:
find directory1 directory2 -name \*.txt -printf '%f\n' |
sort | uniq -u |
xargs -I{} find directory1 directory2 -name {} -delete
find directory1 directory2 -name \*.txt -printf '%f\n':
print basename of each file matching the glob *.txt
sort | uniq -u:
only print unique lines (if you wanted to keep only the duplicated lines instead, it would have been uniq -d)
xargs -I{} find directory1 directory2 -name {} -delete:
remove them (re-specify the path to narrow the search and avoid deleting files outside the initial search path)
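Before running the destructive version, it may be worth previewing what will match by swapping -delete for -print (a cautionary variant of the same pipeline, not part of the original answer):
# Dry run: print the files that would be removed, delete nothing
find directory1 directory2 -name \*.txt -printf '%f\n' |
sort | uniq -u |
xargs -I{} find directory1 directory2 -name {} -print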
Notes
Thanks to @KlausPrinoth for all the suggestions.
Obviously I'm assuming a GNU userland, I suppose people running with the tools providing bare minimum POSIX compatibility will be able to adapt it.
Yet another way is to use diff, which is more than capable of finding differences between the files in two directories. For instance, if you have d1 and d2 that contain your 200 and 210 files respectively (with the first 200 files being the same), you could use diff and process substitution to provide the names to remove to a while loop:
( while read -r line; do printf "rm %s\n" "${line##*: }"; done < <(diff -q d1 d2) )
Output (of d1 with 10 files, d2 with 12 files)
rm file11.txt
rm file12.txt
diff will not fit all circumstances, but it does a great job finding directory differences and is quite flexible.
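If the printed commands look right, one way to actually run them is to pipe the whole thing into a shell. This is a sketch under the same assumption as above, namely that every reported line is an "Only in d2:" line; note the added d2/ prefix so rm can find the files:
# Execute the rm commands instead of just printing them
( while read -r line; do printf "rm d2/%s\n" "${line##*: }"; done < <(diff -q d1 d2) ) | sh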

How to list files in a directory with a shell script

I need to list the files in a directory, but only the actual files, not the folders.
I just couldn't find a way to test whether an entry is a file or a directory...
Could someone provide a piece of script for that?
Thanks
How about using find?
To find regular files in the current directory and output a sorted list:
$ find -maxdepth 1 -type f | sort
To find anything that is not a directory (Note: there are more things than just regular files and directories in Unix-like systems):
$ find -maxdepth 1 ! -type d | sort
In the bash shell, test -f "$file" will tell you whether $file is a regular file:
if test -f "$file"; then echo "File"; fi
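A minimal sketch building on the same test, listing only the regular files in the current directory:
for file in *; do
  # print the name only when the entry is a regular file
  if test -f "$file"; then printf '%s\n' "$file"; fi
done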
You can use ls -l | grep -v ^d to get what you want. I tested it on Red Hat 9.0; it lists only the regular files (add -a to ls if you also want hidden files).
If you want to list the folders in a directory, but only the folders and not the regular files, you can use ls -l | grep ^d

Finding executable files using ls and grep

I have to write a script that finds all executable files in a directory. So I tried several ways to implement it and they actually work. But I wonder if there is a nicer way to do so.
So this was my first approach:
ls -Fla | grep \*$
This works fine because the -F flag does the work for me, appending an asterisk to each executable file, but let's say I don't like the asterisk sign.
So this was the second approach:
ls -la | grep -E ^-.{2}x
This too works fine: I want a dash as the first character, then I'm not interested in the next two characters, and the fourth character must be an x.
But there's a bit of ambiguity in the requirements, because I don't know whether I have to check for user, group or other executable permission. So this would work:
ls -la | grep -E ^-.{2}x\|^-.{5}x\|^-.{8}x
So I'm testing the fourth, seventh and tenth character to be a x.
Now my real question, is there a better solution using ls and grep with regex to say:
I want to grep only those files that have at least one x in the first ten characters of a line produced by ls -la
Do you need to use ls? You can use find to do the same:
find . -maxdepth 1 -perm -111 -type f
will return all executable files in the current directory. Remove the -maxdepth flag to traverse all child directories.
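One caveat: -perm -111 only matches files that have the execute bit set for user, group, and other all at once. To match files executable by anyone at all, GNU find accepts a slash form instead:
# /111 matches files with ANY of the three execute bits set
find . -maxdepth 1 -perm /111 -type f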
You could try this terribleness but it might match files that contain strings that look like permissions.
ls -lsa | grep -E "[d\-](([rw\-]{2})x){1,3}"
If you absolutely must use ls and grep, this works:
ls -Fla | grep '^\S*x\S*'
It matches lines where the first word (non-whitespace) contains at least one 'x'.
Find is the perfect tool for this. This finds all files (-type f) that are executable:
find . -type f -executable
If you don't want it to recursively list all executables, use -maxdepth:
find . -maxdepth 1 -type f -executable
Perhaps with test -x?
for f in $(\ls) ; do test -x "$f" && echo "$f" ; done
The \ on ls will bypass shell aliases.
for i in `ls -l | awk '{ if ( $1 ~ /x/ ) {print $NF}}'`; do echo `pwd`/$i; done
This gives absolute paths to the executables.
While the question is very old and was answered a long time ago, I want to add a version for anyone who is using the fd utility (which I personally highly recommend; see https://github.com/sharkdp/fd if you want to try it). You get the same result as find . -type f -executable by running:
fd -tx
or
fd --type executable
One can also add the -d or --max-depth argument, just as with the original find.
Maybe someone will find this useful.
file * | grep "ELF 32-bit LSB executable" | awk '{print $1}'
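Note that this pattern only matches 32-bit binaries and will miss 64-bit ones. A slightly broader variant (my adjustment, not part of the original answer) that also strips the trailing colon file appends to each name:
# Match both 32- and 64-bit ELF binaries; split on ':' to drop file's suffix
file * | grep -E 'ELF (32|64)-bit' | awk -F: '{print $1}'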
