Skip characters in "find" output - linux

I'm writing a bash script that, at a certain point, should list the files in a directory older than 1 day and write the list to a text file to work with later. This is the current command I have:
find . -mtime +0 > list.txt
The problem with this command is that it prints the filenames preceded by "./", e.g.:
./file1
./file2
./file3
How can I print only the filenames, like this?
file1
file2
file3

Use basename:
find . -mtime +0 -type f -exec basename {} \; > list.txt
(the -type f is there because otherwise the search directory itself is also printed).
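If your basename supports -a (GNU coreutils does), a small variation on the above avoids spawning one basename process per file; this is just a sketch, not part of the original answer:
find . -mtime +0 -type f -exec basename -a {} + > list.txt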

There is no need to call an extra binary if your find supports -printf:
find . -mtime +0 -printf '%f\n' > list.txt
When targeting files, just add -type f:
find . -mtime +0 -printf '%f\n' -type f > list.txt
Or if you intend to show the files and directories in a specified directory:
find some_dir -mtime +0 -printf '%f\n' -mindepth 1 > list.txt
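Another way to drop the leading ./, shown only as a sketch (it assumes filenames contain no newlines), is to post-process the output with sed:
find . -mtime +0 -type f | sed 's|^\./||' > list.txt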

Related

Get n level parent with find command

I want to find the n-th level parent with the find command.
Initially, when I used this command, it gave me the whole file path:
Modified_files_users="$(find /var/lib/abcccc/tamm/acb-Beta-DB-abcc/abc-central/src/main/taff/com/hifinite/components/user
-type f -mtime -5;)";
Output:
/var/lib/abcccc/tamm/acb-Beta-DB-abcc/abc-central/src/main/taff/com/hifinite/components/user/file/foo.ext
Hence I used GNU basename with find, but that gives only the file name.
Modified_files_users="$(find /var/lib/abcccc/tamm/acb-Beta-DB-abcc/abc-central/src/main/taff/com/hifinite/components/user
-type f -mtime -5 -exec basename \{} \;)";
Output:
foo.ext
but the Output I expect is
/file/foo.ext
Is there any way I can get this by adding anything to the -exec command?
Basically, I should either be able to specify the n-th parent to be included in the output, or get the whole path after
/var/lib/abcccc/tamm/acb-Beta-DB-abcc/abc-central/src/main/taff/com/hifinite/components/user
You need to use -printf with %P:
find somedirectory -type f -printf '%P\n'
Document:
%P File’s name with the name of the command line argument under which it was found removed.
Example:
$ find /home/abc/temp -type f
/home/abc/temp/A2018001.txt
/home/abc/temp/myfiles.zip
/home/abc/temp/org/springframework/boot/loader/PropertiesLauncher$PrefixMatchingArchiveFilter.class
With printf %P:
$ find /home/abc/temp -type f -printf '%P\n'
A2018001.txt
myfiles.zip
org/springframework/boot/loader/PropertiesLauncher$PrefixMatchingArchiveFilter.class
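Applied to the command from the question (a sketch; the leading slash in the format string just reproduces the /file/foo.ext form the asker expects):
Modified_files_users="$(find /var/lib/abcccc/tamm/acb-Beta-DB-abcc/abc-central/src/main/taff/com/hifinite/components/user -type f -mtime -5 -printf '/%P\n')"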

Formatting md5sum differently?

I need to find the md5sum of files recursively and list these files alphabetically. However, in my final output I don't want the sum to actually show up. For example if I issue:
find -not -empty -type f -exec md5sum "{}" \;
I get this:
0df8724ef24b15e54cc9a26e7679bb90 ./doc1.txt
d453430ce039863e242365eecaad7888 ./doc2.txt
53b2e8ae1dfaeb64ce894f75dd6b957c ./test.sh~
1ba03849883277c3c315d5132d10d6f0 ./md5file.txt
6971b4dbbd6b5b8d1eefbadc0ecd1382 ./test.sh
Is there a simple way to make this command show only the files, like this:
./doc1.txt
./doc2.txt
./test.sh~
./md5file.txt
./test.sh
thx!
As Cyrus and Sriharsha say, simply using:
find -not -empty -type f
will give you the result you need.
Pass the output of the find command to awk or cut.
find -not -empty -type f -exec md5sum "{}" \; | awk '{print $2}'
OR
Use sed if the filename contains spaces.
find -not -empty -type f -exec md5sum "{}" \; | sed 's/^[^ ]\+ \+//'
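Since the question also asks for an alphabetical listing, piping through sort covers that as well (a sketch, assuming filenames contain no newlines):
find -not -empty -type f | sort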

cat files in subdirectories using linux commands

I have the following directories:
P922_101
P922_102
.
.
Each directory, for instance P922_101 has following subdirectories:
140311_AH8MHGADXX 140401_AH8CU4ADXX
Each subdirectory, for instance 140311_AH8MHGADXX has the following files:
1_140311_AH8MH_P922_101_1.fastq.gz 1_140311_AH8MH_P922_101_2.fastq.gz
2_140311_AH8MH_P922_101_1.fastq.gz 2_140311_AH8MH_P922_101_2.fastq.gz
And files in 140401_AH8CU4ADXX are:
1_140401_AH8CU_P922_101_1.fastq.gz 1_140401_AH8CU_P922_4001_2.fastq.gz
2_140401_AH8CU_P922_101_1.fastq.gz 2_140401_AH8CU_P922_4001_2.fastq.gz
I want to do 'cat' for the files in the subdirectories in the following way:
cat 1_140311_AH8MH_P922_101_1.fastq.gz 2_140311_AH8MH_P922_101_1.fastq.gz
1_140401_AH8CU_P922_101_1.fastq.gz 2_140401_AH8CU_P922_101_1.fastq.gz > P922_101_1.fastq.gz
which means that files ending with _1.fastq.gz should be concatenated into a single file and files ending with _2.fastq.gz into another file.
It should be run for all files in subdirectories in all directories. Could someone suggest a Linux solution for this?
Since they're compressed, you should probably use gzip -dc (decompress and write to stdout):
find /somePath -type f -name "*.fastq.gz" -exec gzip -dc {} \; | \
tee -a /someOutFolder/out.txt
You can use find for this:
find /top/path -mindepth 2 -type f -name "*_1.fastq.gz" -exec cat {} \; > one_file
find /top/path -mindepth 2 -type f -name "*_2.fastq.gz" -exec cat {} \; > another_file
This will look for all the files under /top/path whose names match the pattern _1.fastq.gz / _2.fastq.gz and cat them into the desired file. -mindepth 2 makes find only consider entries at least two levels below the starting point; this way, files sitting directly in /top/path won't be matched.
Note that you will probably need zcat instead of cat, for gz files.
As you keep adding details in comments, let's see what else we can do:
Say you have the list of directories in a file directories_list, each line containing one:
while read -r directory
do
find "$directory" -mindepth 2 -type f -name "*_1.fastq.gz" -exec cat {} \; > "$directory/output"
done < directories_list
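To get one concatenated file per top-level directory with the P922_101_1.fastq.gz naming from the question, something along these lines could work (a sketch that assumes the P922_* directories sit in the current directory; concatenating gzip members with cat yields a valid gzip stream):
# loop over each sample directory and merge its _1 and _2 reads separately
for dir in P922_*/; do
name=${dir%/}
find "$dir" -mindepth 2 -type f -name "*_1.fastq.gz" -exec cat {} + > "${name}_1.fastq.gz"
find "$dir" -mindepth 2 -type f -name "*_2.fastq.gz" -exec cat {} + > "${name}_2.fastq.gz"
done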

cronjob to remove files older than N days with special characters

I'm trying to create a job to delete files on a linux box older than X days. Pretty straightforward with:
find /path/to/files -mtime +X -exec rm {} \;
The problem is that all my files have special characters because they are pictures from a webcam; most contain parentheses, so the above command fails with "no such file or directory".
Have you tried this:
find /path/to/files -mtime +X -exec rm '{}' \;
Or perhaps:
rm $(find /path/to/files -mtime +X);
Or even this method using xargs instead of -exec:
find /path/to/files -mtime +X | xargs rm -f;
Another twist on xargs is to use -print0, which helps distinguish spaces inside filenames from the separators between entries in the returned list by using the ASCII NUL character as the file separator:
find /path/to/files -mtime +X -print0 | xargs -0 rm -f;
Or as man find explains under -print0:
This primary always evaluates to true. It prints the pathname of
the current file to standard output, followed by an ASCII NUL
character (character code 0).
I would also recommend adding the -maxdepth and -type flags to better control what the script does. So I would use this for a dry-run test:
find /path/to/files -maxdepth 1 -type f -mtime +1 -exec echo '{}' \;
The -maxdepth flag controls how many directory levels deep find will descend, and -type f limits the search to regular files, so the script is focused on files only. This will simply echo the results. Then, when you are comfortable with it, change the echo to rm.
Does
find /path/to/files -mtime +X -print | tr '()' '?' | xargs rm -f
work?
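One more option, not mentioned in the answers above (a sketch; -delete is available in GNU find and modern BSD find): the -delete action never passes file names through a shell, so parentheses and other special characters are not an issue:
find /path/to/files -maxdepth 1 -type f -mtime +X -delete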

How to copy all the files with the same suffix to another directory? - Unix

I have a directory with an unknown number of subdirectories and unknown levels of sub*directories within them. How do I copy all the files with the same suffix to a new directory?
E.g. from this directory:
> some-dir
>> foo-subdir
>>> bar-subsubdir
>>>> file-adx.txt
>> foobar-subdir
>>> file-kiv.txt
Copy all the *.txt files to:
> new-dir
>> file-adx.txt
>> file-kiv.txt
One option is to use find:
find some-dir -type f -name "*.txt" -exec cp \{\} new-dir \;
find some-dir -type f -name "*.txt" would find *.txt files under the directory some-dir. The -exec option builds a command line (e.g. cp some-dir/foobar-subdir/file-kiv.txt new-dir) for every matching file, denoted by {}.
Use find with xargs as shown below:
find some-dir -type f -name "*.txt" -print0 | xargs -0 cp --target-directory=new-dir
For a large number of files, this xargs version is more efficient than using find some-dir -type f -name "*.txt" -exec cp {} new-dir \; because xargs will pass multiple files at a time to cp, instead of calling cp once per file. So there will be fewer fork/exec calls with the xargs version.
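With GNU cp you can get the same batching without xargs by combining find's -exec ... {} + with cp's -t option, which names the target directory up front (a minimal sketch):
find some-dir -type f -name "*.txt" -exec cp -t new-dir {} +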
