Find specific string in subdirectories and order top directories by modification date - linux

I have a directory structure containing some files. I'm trying to find the names of the top-level directories that contain a file with a specific string in it.
I've got this:
grep -r abcdefg . | grep commit_id | sed -r 's/\.\/(.+)\/.*/\1/';
Which returns something like:
topDir1
topDir2
topDir3
I would like to be able to take this output and somehow feed it into this command:
ls -t | grep -e topDir1 -e topDir2 -e topDir3
which would return the output filtered by the first command and ordered by modification date.
I'm hoping for a one liner. Or maybe there is a better way of doing it?

This should work as long as none of the directory names contain whitespace or wildcard characters:
ls -td $(grep -r abcdefg . | grep commit_id | cut -d: -f1 | xargs -n1 dirname | sort -u)
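If the directory names may contain whitespace, a null-delimited variant along these lines should be safer (just a sketch, assuming GNU grep and coreutils; note that here commit_id is matched against the file paths rather than against the matched lines):
grep -rlZ abcdefg . \
    | grep -z commit_id \
    | xargs -0 dirname -z \
    | sort -zu \
    | xargs -0 ls -td --    # assumes the list fits into a single ls invocation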

Related

Filter directories in piped input

I have a bash command that lists a number of files and directories. I want to remove everything that is not an existing directory. Is there any way I can do this without creating a script of my own? I.e. I want to use pre-existing programs available in Linux.
E.g. Given that I have this folder:
dir1/
dir2/
file.txt
I want to be able to run something like:
echo dir1 dir2 file.txt somethingThatDoesNotExist | xargs [ theCommandIAmLookingFor]
and get
dir1
dir2
It would be better if the command generating the putative paths used a more robust delimiter, but you might be looking for something like:
... | xargs -n 1 sh -c 'test -d "$0" && echo "$0"'
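For example, with the dir1, dir2 and file.txt layout from the question:
echo dir1 dir2 file.txt somethingThatDoesNotExist | xargs -n 1 sh -c 'test -d "$0" && echo "$0"'
prints:
dir1
dir2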
You can use this command line with grep -v:
your_command | grep -vxFf <(printf '%s\n' */ | sed 's/.$//') -
This will filter out of your list everything that matches a sub-directory of the current path.
If instead you want to list only the existing directories, drop the -v:
your_command | grep -xFf <(printf '%s\n' */ | sed 's/.$//') -
Note that the glob */ expands to all sub-directories in the current path with a trailing /, and sed is used to remove that trailing /.
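To see what the process substitution feeds to grep, run it on its own:
printf '%s\n' */ | sed 's/.$//'
With the dir1/ and dir2/ example above this prints:
dir1
dir2
grep -xFf then only lets through input lines that are an exact, fixed-string match for one of those names (that is what -x and -F do).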

Modify ls output to display [+] in front of directories

I am looking for a way to modify the ls output so that every directory is displayed with [+] in front of its name. Ideally this would be done via .bashrc.
me#computer[~]$ ls
[+]directory [+]directory
[+]directory file.png
file file.txt
readme
Currently I am just customizing the color output:
LS_COLORS=$LS_COLORS:'di=1;37;4' ; export LS_COLORS
This might help you, but it gives you only one column output:
ls | sed -r "$(find -maxdepth 1 -type d | cut -d/ -f2 | sed "1 d; 2~1 { s:.*:s/^\\(&\\)$/[+]\\\\1/;:g}")"
It works by piping the output of ls through sed; the sed script is dynamically built by a pipeline that converts the list of directories into a list of s/^dirname$/[+]dirname/; sed script lines.
Just try out all the parts individually to see how it works.
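To illustrate, in a directory containing only the sub-directories foo and bar (hypothetical names), the inner $(...) pipeline would expand to a sed program roughly like:
s/^\(bar\)$/[+]\1/;
s/^\(foo\)$/[+]\1/;
and the ls output is then filtered through that script.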
For example, when run in /etc the output starts like this:
[+]acpi
adduser.conf
[+]adobe
[+]akonadi
aliases
aliases.db
You might want to alias the command in your bashrc.
And you might want to look into the tree command.
You can use:
ls -l : directories will start with d.
ls -p : a slash will be added after each directory name, like dir/
ls -F : will also add a slash after directory names, and other marks for other file types (* for executables, etc.)
ls -d */ : as advised in the comments, will list only directory names with a slash at the end. Remove -d to also list the contents of each directory.
In terms of manipulating ls output, you could go with:
ls -l |awk '/^d/{print "[+]"$NF}; /^[^d]/{print $NF}' |column
You can also use find and avoid parsing ls, since, as has been said, parsing ls can break if file names contain strange characters like newlines.
find with this format will produce output identical to the ls version above:
find . -maxdepth 1 -printf '%Y %f\n' |awk '/^d/{print "[+]"$NF}; /^[^d]/{print $NF}' |column
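If file names may contain spaces, a tab-separated variation of the same idea should hold up better (a sketch, GNU find assumed; names containing tabs or newlines would still break it):
find . -mindepth 1 -maxdepth 1 -printf '%Y\t%f\n' \
    | awk -F'\t' '$1 == "d" {print "[+]" $2; next} {print $2}' \
    | column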
You could also try this using a bash script:
#!/usr/bin/env bash
myls() {
    for i in *; do
        [[ -d "${i}" ]] && {
            printf "%s\n" "[+] ${i}"
            continue
        }
        printf "%s\n" "${i}"
    done
}
Source the script in your .bashrc file; whenever you want to use this, just call myls in the directory.
Note that it does not give you colored output.
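For example (the path to the script is just a placeholder for wherever you saved it):
# in ~/.bashrc
source ~/scripts/myls.sh
# then, in any directory
myls
Directories come out as [+] name, one entry per line; everything else is printed as-is.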

How to find files with the same name part in directories using the diff command?

I have two directories with files in them. Directory A contains a list of photos with numbered endings (e.g. janet1.jpg laura2.jpg) and directory B has the same files except with different numbered endings (e.g. janet41.jpg laura33.jpg). How do I find the files that do not have a corresponding file in the other directory, ignoring the numbered endings? For example, there is a rachael3 in directory A but no rachael\d in directory B. I think there's a way to do this with the diff command in bash but I do not see an obvious way to do it.
I can't see a way to use diff for this directly. It will probably be easier to use a sums tool (md5, sha1, etc.) on both directories, sort both outputs on the first (sum) column, and then diff/compare those output files.
Alternatively, something like findimagedupes (which isn't as simple a comparison as diff or a sums check) might be a simpler (and possibly more useful) solution.
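A minimal sketch of that checksum idea, assuming the two directories are A and B as in the question ($hash below is just a placeholder for a hash you want to trace back to a file name):
# hashes present on only one side show up as < or > lines
diff <(md5sum A/*.jpg | awk '{print $1}' | sort) \
     <(md5sum B/*.jpg | awk '{print $1}' | sort)
# map a stray hash back to its file name
md5sum A/*.jpg B/*.jpg | grep "$hash"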
It seems you know that the files are identical when they exist in both directories, and that there is only one of each kind per directory.
So to diff the contents of the two directories on that basis, you need to extract only the relevant part of each file name ("laura", "janet").
This can be done by simply grepping the appropriate part from the output of ls, like this:
ls dir1/ | egrep -o '^[a-zA-Z]+'
Then to compare, let's say dir1 and dir2, you can use:
diff <(ls dir1/ | egrep -o '^[a-zA-Z]+') <(ls dir2/ | egrep -o '^[a-zA-Z]+')
Assuming the files are simply renamed and otherwise identical, a simple solution to find the missing ones is to use md5sum (or sha or somesuch) and uniq:
#!/bin/bash
md5sum A/*.jpg B/*.jpg >index
awk '{print $1}' <index | sort >sums # delete dir/file
# list unique files (missing from one directory)
uniq -u sums | while read s; do
    grep "$s" index | sed 's/^[a-z0-9]\{32\} *//'
done
This fails in the case where a folder contains several copies of the same file renamed (such that the hash matches multiple files in one folder), but that is easily fixed:
#!/bin/bash
md5sum A/*.jpg B/*.jpg > index
sed 's/\/.*//' <index | sort >sums # just delete /file
# list unique files (missing from one directory)
uniq sums | awk '{print $1}' |
    uniq -u | while read s junk; do
    grep "$s" index | sed 's/^[a-z0-9]\{32\} *//'
done

grep -o and display part of filenames using ls

I have a directory which has many directories inside it with the pattern of their name as :
YYYYDDMM_HHMISS
Example: 20140102_120202
I want to extract only the YYYYDDMM part.
I tried ls -l|awk '{print $9}'|grep -o ^[0-9]* and got the answer.
However, I have the following questions:
Why doesn't this return any results: ls -l|awk '{print $9}'|grep -o [0-9]* ? In fact it should have returned all the directories.
Strangely, just including '^' before [0-9] works fine:
ls -l|awk '{print $9}'|grep -o ^[0-9]*
Any other (simpler) way to achieve the result?
Why doesn't this return any results: ls -l|awk '{print $9}'|grep -o [0-9]*
If there are files in your current directory whose names start with a digit, the shell will expand [0-9]* to those names before calling grep. For example, if I have three files a1, a2 and a3 and run this:
ls | grep a*
After the filenames are expanded, the shell will run this:
ls | grep a1 a2 a3
The result of which is that it will print the lines in a2 and a3 that match the text "a1". It will also ignore whatever is coming from stdin, because when you specify filenames for grep (2nd argument and beyond), it will ignore stdin.
Next, consider this:
ls | grep ^a*
Here, ^ has no special meaning to the shell, so it uses it verbatim. Since I don't have filenames starting with ^a, it will use ^a* as the pattern. If I did have filenames like ^asomething or ^another, then again, ^a* would be expanded to those filenames and grep would do something I didn't really intend.
This is why you have to quote search patterns, to prevent the shell from expanding them. The same goes for patterns in find /path -name 'pattern'.
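Applied to the commands from the question, quoting is the only change needed (parsing ls is still fragile, but that is a separate issue):
# the regex now reaches grep intact
ls -l | awk '{print $9}' | grep -o '^[0-9]*'
# or require at least one digit instead of anchoring (GNU grep)
ls -l | awk '{print $9}' | grep -o '[0-9]\+'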
As for a simpler way to do what you want, I think this should do it:
ls | sed -ne 's/_.*//p'
To show only the YYYYDDMM part of the directory names:
for i in ./*; do echo $(basename "${i%%_*}"); done
Not sure what you want to do with it once you've got it though...
You must avoid parsing ls output.
Simpler is to use printf with a glob:
printf "%s\n" [0-9]*_[0-9]*|egrep -o '^[0-9]+'

grep search for a string with a hyphen in the middle

I am looking for files that would contain the string abc-def in a folder.
I am using grep -l -r abc-def *, but I am not sure that is the right way (no files were found with this command, though perhaps that simply means no file contains the string). I have also tried grep -l -r 'abc-def' * (it found files, but when I looked for the string manually it was not there, only the individual parts of it, i.e. abc and def). Since the pattern does not start with a hyphen, -e would not seem to apply here.
What would be the proper way to grep search for a string with a hyphen in the middle?
Try grep -r abc-def first to see which lines match. grep -r abc-def * and grep -r 'abc-def' * should really yield the same result.
fgrep (f is for 'fixed string') is not necessary here.
This should work:
fgrep -r -l 'abc-def' .
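fgrep is simply grep with fixed-string matching and is considered deprecated by current GNU grep, so the equivalent spelled with plain grep is:
grep -rlF 'abc-def' .   # -F fixed string, -l list matching files, -r recurse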
