Using Perl how can I clean up left over directories with no files? - linux

There is a specific directory which is used as a temp/scratch directory by some program.
E.g. /a/b/c/work
Under work multiple hierarchical directories may exist e.g.
/a/b/c/work/
    d1/
        d1.1/
    d2/
        d2.2/
What I want is to clean up this work directory as there are left over files that take space.
Essentially I need to delete all subdirectory trees under work whose leaf directories are empty.
So if d1.1 is empty but d2.2 has files then delete everything under d1 (including d1) but not d2.
What is the cleanest/standard way to do this in perl?
I thought to use a solution with backticks, e.g. rm -rf etc., but I figured there must be a better way than coding sequences of ls followed by rm.
Note: just to be clear, I want a solution in Perl, as this is not a one-time thing and I don't want to do it manually each time.

If you use find command this way you can achieve it.
find /path/to/dir -empty -type d -delete
Where,
-empty Match only empty files and directories.
-type d Match only directories.
-delete Delete everything that matched.
Always put -delete option at the end of find command as find command line is evaluated as an expression, so putting -delete first will make find try to delete everything below the starting points you specified.
To automate this in shell script follow below code:
path=$(pwd)
find "$path" -empty -type d -delete
or you can pass the path as an argument of the shell script, like myShell.sh /path/to/mydir; in that case the following code will do the work:
path=$1
find "$path" -empty -type d -delete
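To see why putting -delete last works on the directory layout from the question, here is a self-contained rehearsal in a throwaway directory (the tree and the file name keep.txt are made up for illustration; assumes GNU find):

```shell
# set up a scratch tree: d1/d1.1 is empty, d2/d2.2 holds a file
tmp=$(mktemp -d)
mkdir -p "$tmp/work/d1/d1.1" "$tmp/work/d2/d2.2"
touch "$tmp/work/d2/d2.2/keep.txt"

# -delete implies -depth, so find visits d1.1 before d1:
# d1.1 is removed first, which leaves d1 empty and removable in the same pass
find "$tmp/work" -empty -type d -delete

ls "$tmp/work"    # only d2 remains
rm -rf "$tmp"
```

Because -delete implies depth-first traversal, a whole chain of nested empty directories disappears in a single run, which is exactly the "delete everything under d1 including d1" behavior the question asks for.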
If you really want to do it in Perl, you can find your answer as follows:
use strict;
use warnings;
use File::Util;
my $path = '...';
my $fu = File::Util->new();
my @all_dirs = $fu->list_dir($path, '--recurse', '--dirs-only');
my @empty_dirs = grep { not $fu->list_dir($_) } @all_dirs;
There is also a shorter one-liner:
perl -MFile::Find -e"finddepth(sub{rmdir},'.')"
which is explained well here.

Related

Why find's -exec option is including 'non-matched' items?

I'm trying to use find to find and exclude/filter a few directories from being copied to another backup directory.
My attempts to do so using find's '-exec' option end up copying every processed file instead of only the matches, so I'm quite confused about what the expected behavior should be and would appreciate help gaining better understanding.
Starting point:
me@computer>ls
AddMonitorsOnEntry MantisCoreFormatting MantisGraph PastePicture XmlImportExport
Make sure find excludes the unwanted 'files' as expected
me@computer>find . -maxdepth 1 -not -regex '.*MantisCoreFormatting\|.*MantisGraph\|.*XmlImportExport'
.
./AddMonitorsOnEntry
./PastePicture
Now to copy those 2 directories to a backup dir:
me@computer>find . -maxdepth 1 -not -regex '.*MantisCoreFormatting\|.*MantisGraph\|.*XmlImportExport' -exec cp -dr '{}' ~/backup \;
Now to see if it worked...
me@computer>cd ~/backup
me@computer>ls
AddMonitorsOnEntry backup MantisCoreFormatting MantisGraph PastePicture XmlImportExport
WTH??
I thought '-exec' only operated on the matches, according to this snippet from the man page: " ...The specified command is run once for each matched file..."
I know there are other ways to accomplish this task, but '-exec' seems to work well enough for the poster here https://unix.stackexchange.com/questions/50612/how-to-combine-2-name-conditions-in-find/50633. I'm looking for help understanding how to make use of "-exec" versus using xargs or something else. Thanks.
Now to copy those 2 directories to a backup dir
You don't have 2 matches. Your command shows 3:
.
./AddMonitorsOnEntry
./PastePicture
. is the current directory, so your cp command copies everything.
Instead of find . you can use find * to skip the current directory ., but still process all the (non-hidden) files/dirs within it.
Silly of me..
My initial find expression includes the current directory as a result, so any files in the current dir will be operated on by "-exec".
To fix I added the current dir among the ones excluded.
me@computer>find . -maxdepth 1 -not -regex '.*MantisCoreFormatting\|.*MantisGraph\|.*XmlImportExport\|\.'
./AddMonitorsOnEntry
./PastePicture
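Besides excluding . in the regex, GNU and BSD find also offer -mindepth 1, which skips the starting point itself so -exec never sees "."; a minimal sketch with made-up directory names:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir AddMonitorsOnEntry MantisGraph PastePicture

# plain find lists the starting point "." itself ...
find . -maxdepth 1 -not -name 'Mantis*'
# ... while -mindepth 1 skips it, so an -exec would only run on real matches
find . -mindepth 1 -maxdepth 1 -not -name 'Mantis*'
```

The first command prints "." plus the two non-Mantis directories; the second prints only the two directories.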

Customized deleting files from a folder

I have a folder where different files can be located. I would like to check if it contains files other than .gitkeep and delete them, while keeping .gitkeep. How can I do this? (I'm a newbie when it comes to bash)
As always, there are multiple ways to do this, I am just sharing what little I know of linux :
1) find <path-to-the-folder> -maxdepth 1 -type f ! -iname '.gitkeep' -delete
maxdepth of 1 specifies to search only the current directory. If you remove maxdepth, it will recursively find all files other than '.gitkeep' in all directories under your path. You can increase maxdepth to however deep you want find to go into directories from your path.
'-type f' specifies that we are just looking for files. If you want to find directories as well (or links, or other types) then you can omit this option.
-iname '.gitkeep' specifies a case-insensitive match for '.gitkeep'. Note that -iname takes a shell glob pattern, not a regular expression, so the '.' is literal and does not need escaping.
You can use -name instead of -iname for a case-sensitive match.
The '!' before -iname inverts the match, i.e. it finds all files whose name is not '.gitkeep'; if you remove the '!', you will get only the files that do match '.gitkeep'.
finally, '-delete' will delete the files that match this specification.
If you want to see what all files will be deleted before executing -delete, you can remove that flag and it will show you all the files :
find <path-to-the-folder> -maxdepth 1 -type f ! -iname '.gitkeep'
(you can also use -print at the end, which is just redundant)
2) for i in `ls -a | grep -v '\.gitkeep'` ; do rm -rf $i ; done
Not really recommended to do it this way: rm -rf is always a bad idea (IMO), parsing ls output breaks on file names with spaces, and ls -a also lists . and .. themselves. You can change rm -rf to rm -f (to ensure it only works on files and not directories).
To be on the safe side, it is recommended to do an echo of the file list first to see if you are ready to delete all the files shown :
for i in `ls -a | grep -v '\.gitkeep'` ; do echo $i ; done
This will iterate through all the files that don't match '.gitkeep' and delete them one by one... not the best way, I suppose, to delete files.
3)rm -rf $(ls -a | grep -v '\.gitkeep')
Again, careful with rm -rf, instead of rm -rf above, you can again do an echo to find out the files that will get deleted
I am sure there are more ways, but just a glimpse of the array of possibilities :)
Good Luck,
Ash
================================================================
EDIT :
=> manpages are your friends when you are trying to learn something new; if you don't understand how a command works or what options it can take, always look up the man page for details.
ex : man find
=> I understand that you are trying to learn something out of your comfort zone, which is always commendable, but stack overflow doesn't like people asking questions without researching.
If you did research, you are expected to mention it in your question, letting people know what you have done to find answers on your own.
A simple google search or a deep dive into stack overflow questions would have provided you with a similar or even a better answer to your question. So be careful :)
Forewarned is forearmed :)
You can use find:
find /path/to/folder -maxdepth 1 ! -name .gitkeep -delete
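Both answers can be rehearsed safely in a throwaway directory first (the file names notes.txt and build.log below are made up for illustration):

```shell
tmp=$(mktemp -d)
touch "$tmp/.gitkeep" "$tmp/notes.txt" "$tmp/build.log"

# dry run: drop -delete to preview what would go
find "$tmp" -maxdepth 1 -type f ! -name .gitkeep

# then delete for real; .gitkeep survives
find "$tmp" -maxdepth 1 -type f ! -name .gitkeep -delete
ls -A "$tmp"    # only .gitkeep remains
rm -rf "$tmp"
```

Running the same expression once without -delete and once with it is the safest habit: the preview shows exactly the set of files the second command will remove.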

Remove files for a lot of directories - Linux

How can I remove all .txt files present in several directories
Dir1 >
Dir11/123.txt
Dir12/456.txt
Dir13/test.txt
Dir14/manifest.txt
In my example I want to run the remove command from Dir1.
I know the Linux command rm, but I don't know how to make this work in my case.
PS.: I'm using ubuntu.
To do what you want recursively, find is the most used tool for this. Combined with the -delete switch, you can do it with a single command (no need to use -exec (and forks) in find like other answers in this thread):
find Dir1 -type f -name "*.txt" -delete
if you use bash4, you can do too :
( shopt -s globstar; rm Dir1/**/*.txt )
We're not going to enter subdirectories, so no need to use find; everything is at the same level. I think this is what you're looking for: rm */*.txt
Before you run this you can try echo */*.txt to see if the correct files are going to be removed.
Using find would be useful if you want to search subfolders of subfolders, etc.
There is no Dir1 in the current folder so don't do find Dir1 .... If you run the find from the prompt above this will work:
find . -type f -name "*.txt" -delete
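A self-contained rehearsal of the find approach on the layout from the question (keep.md is an extra made-up file, added to show that non-.txt files survive):

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir -p Dir1/Dir11 Dir1/Dir12
touch Dir1/Dir11/123.txt Dir1/Dir12/456.txt Dir1/Dir12/keep.md

# preview first: the same expression without -delete lists the victims
find Dir1 -type f -name '*.txt'

find Dir1 -type f -name '*.txt' -delete
find Dir1 -type f    # only Dir1/Dir12/keep.md is left
```

Note that find descends into Dir11 and Dir12 automatically, which is exactly what rm */*.txt cannot do once the tree gets deeper than one level.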

parsing ls -R output into a variable in Unix

I am executing ls -R /files/
I got the following output
./:
nvision

./nvision:
layout

./nvision/layout:
abcd.txt
I am looking to get paths in the listing, like
/nvision
/nvision/layout/
/nvision/layout/abcd.txt
and I should be able to copy the required path to a variable
PS: I am not searching for nvision;
I am trying to get the list of folders and files under the files folder.
Can anyone help me with that?
Have you tried using find (see reference)?
It would be as easy as find . to get the list of files and folders inside the current directory. Change the . to any path to obtain the list of files and directories inside that path:
nvision
nvision/layout
nvision/layout/abcd.txt
To save it to a variable
var=`find .`
And to add the initial slash to every line (if required)
var=`find . -exec echo /{} \;`
Here var has no special meaning, it's just the variable name.
To later use the variable you can use $var or ${var}. For example, to print it or save it to file:
# Print the variable content
echo $var
# Save the content of var to a file
echo $var > /tmp/file.txt
You should really use find for this kind of thing. Simply use find directory. If you require more specific output formatting, you can make use of find's -printf option. find is a really powerful tool that also allows all kinds of filtering. Make sure you check the documentation for more information: GNU FindUtils.
To store the results in a variable use one of the following statements:
result=`find ...`
or
result=$(find ...)
You can also use find to directly execute a command for each match using find's -exec option. Again, make sure to check out the documentation. It's really comprehensive.
Update (Mac / UNIX users – Linux users are not affected)
BSD find requires a path. Use
find .
instead of just
find
if you require a listing of all files in your working directory.
Well, the answer is all over this page: you should be using find, which lists all files found. Here . is the current folder; otherwise replace . with the path you wish to search.
find . -type d -print
lists directories only, while
find . -type f -print
lists files only.
If you are looking for both, then
find . -print
and if you wish to limit the recursion depth, try
find . -maxdepth 1 -print
and here is a script
#!/bin/bash
# note: word splitting in $( ) breaks on file names with spaces
for names in $(find . -type f -print); do
    echo "$names"
done
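The for loop above word-splits the output of find, so it breaks on file names containing spaces; a null-delimited loop (supported by GNU and BSD find, with bash's read) is a safer sketch:

```shell
#!/bin/bash
# -print0 separates names with NUL, and read -d '' consumes them,
# so spaces and even newlines in file names survive intact
find . -type f -print0 | while IFS= read -r -d '' name; do
    printf '%s\n' "$name"
done
```

The same pattern works for anything you want to do per file: replace the printf with your own command, always quoting "$name".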

Problem using 'find' in BASH

I'm following this guide to get some basic skills in Linux.
At the exercises of chapter 3 section, there are two exercises:
*Change to your home directory. Create a new directory and copy all
the files of the /etc directory into it. Make sure that you also copy
the files and directories which are in the subdirectories of /etc!
(recursive copy)
*Change into the new directory and make a directory for files starting
with an upper case character and one for files starting with a lower
case character. Move all the files to the appropriate directories. Use
as few commands as possible.
The first part was simple but I have encountered problems in the second part (although I thought it should be simple as well).
I did the first part successfully - that is, I have a copy of the /etc folder in ~/newetc - with all the files copied recursively into subdirectories.
I've created ~/newetc/upper and ~/newetc/lower directories.
My intention was to do something like mv `find ... ` ./upper, for example.
But first I thought I should make sure that I can find all the files with Upper/Lower case seperately. At this I failed.
I thought that find ~/newetc [A-Z].* (I also tried: find ~/newetc -name [A-Z].*) would find all the upper case files - but it simply returns no results.
What's even stranger: find ~/newetc -name [a-z].* returns only two files, although of course there are a lot more than that...
any idea what am I doing wrong?
Thank you for your time!
Edit: (I have tried to read the man page for the find command, btw, but didn't come up with anything)
The -name argument does not take a full regular expression by default. So [A-Z].* will match only if the second character is a dot.
Use the expression [A-Z]*, or use -regex and -regextype to match using a real regex.
You need to use quotes
find ~/new_etc -name "[A-Z]*"
find ~/new_etc -name "[a-z]*"
If you want to use regexp, then you must use -regex (or -iregex).
For finding stuff, the other answers tell you how to do it.
For moving the results of find, use the -exec flag (while being in newetc):
find -name "[A-Z]*" -exec mv {} upper/{} \;
find -name "[a-z]*" -exec mv {} lower/{} \;
The -name parameter takes a glob, not a regular expression (those are both very useful pages). So the dot does not have a special meaning for this parameter - It is interpreted as a literal dot character. Also, in a regular expression the * means "0 or more of the previous expression" while in a glob it means "any number of any character." So, as others have pointed out, the following should get you any files below the current directory which start with an uppercase character:
find . -name '[A-Z]*'
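To see the glob/regex difference side by side, find's -regex matches a real regular expression against the whole path (GNU emacs-style syntax is assumed here; BSD find needs -E for extended regexes). The file names below are invented for illustration:

```shell
tmp=$(mktemp -d); cd "$tmp"
touch Makefile A.txt hosts

# glob: * means "anything" and dot is literal, so both upper-case names match
find . -name '[A-Z]*' | sort

# regex against the whole path: dot must be escaped, .* means "anything",
# so only a name of the form <capital><dot><rest> matches
find . -regex './[A-Z]\..*'    # matches only ./A.txt
```

This is exactly why [A-Z].* as a glob silently required a literal dot as the second character.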
If you want to find all the names beginning with a capital letter, you have to use
find . -name "[A-Z]*"
NOT
find [A-Z].*
otherwise you will try to locate all the files that begin with a capital letter and have a . just after.
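Putting the exercise together, here is a throwaway rehearsal of the sort-into-upper/lower step (file names invented; note that under non-C locales a [A-Z] glob can behave surprisingly, so a stricter solution would use [[:upper:]]):

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir upper lower
touch Apple Zebra notes readme

# quotes keep the shell from expanding the glob before find sees it;
# -maxdepth 1 -type f keeps find from matching upper/ and lower/ themselves
find . -maxdepth 1 -type f -name '[A-Z]*' -exec mv {} upper/ \;
find . -maxdepth 1 -type f -name '[a-z]*' -exec mv {} lower/ \;

ls upper    # Apple  Zebra
ls lower    # notes  readme
```

Moving into upper/ rather than upper/{} is slightly simpler and also portable, since POSIX only guarantees {} substitution when it is a word on its own.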
