I have a function which generates a shell command that uses find to delete all files that are no longer useful:
var spawn = require('child_process').spawn;

var DOWNLOAD_DIR = '/home/user/directory';

function purge(psmil, callback) {
    var arg = [DOWNLOAD_DIR, '\\(', '-name', '"*.mp4"', '-o', '-name', '"*.zip"', '\\)', '!', '\\('],
        file = [],
        i = 0,
        cpurge;
    // Fill file with the names of the files to keep
    arg.push('-name');
    arg.push('"' + file[i] + '"');
    i = i + 1;
    while (i < file.length) {
        arg.push('-o');
        arg.push('-name');
        arg.push('"' + file[i] + '"');
        i = i + 1;
    }
    arg.push('\\)');
    arg.push('-ls');
    arg.push('-delete');
    cpurge = spawn('find', arg);
    cpurge.stdout.on('data', function(data) {
        console.log('data: ' + data);
    });
    cpurge.stderr.on('data', function(data) {
        console.log('err: ' + data);
    });
    cpurge.on('close', function() {
        callback();
    });
}
For example, it will generate this command:
find /home/user/directory \( -name "*.mp4" -o -name "*.zip" \) ! \( -name "tokeep.mp4" -o -name "tokeep2.mp4" \) -ls -delete
Which, put in a .sh file and run, works fine: it lists all the .mp4 and .zip files in /home/user/directory, prints them, and deletes them.
But when I look at my app's log, it lists everything on the disk and deletes every .mp4 and .zip in the directory.
Why?
EDIT: using find directly
I've tried strace, and I got this line:
2652 execve("/usr/bin/find", ["find", "/home/user/directory/", "\\(", "-name", "\"*.mp4\"", "-o", "-name", "\"*.zip\"", "\\)", "!", "\\(", "-name", "\"filetokeep.mp4", "-o", "-name", "\"filetokeep2.mp4\"", ...], [/* 17 vars */]) = 0
With Bash
When you pass arguments to bash using -c, the argument just after -c must contain the whole command you want bash to run. To illustrate, assuming NONEXISTENT does not exist:
$ bash -c ls NONEXISTENT
Will just ls all the files in your directory, no error.
$ bash -c 'ls NONEXISTENT'
Will launch ls NONEXISTENT and will give an error.
So your arg list must be built something like this:
['-c', 'find /home/user/directory \( -name "*.mp4" -o -name "*.zip" \) ! \( -name "tokeep.mp4" -o -name "tokeep2.mp4" \) -ls -delete']
The argument that comes after -c is the whole command you want bash to run.
Without Bash
But as I've said in the comment, I do not see anything in your use of find that requires passing it through bash. You could reduce your arg list to just what you want find to execute and spawn find directly. If you do this, you must not quote or escape the arguments you pass to find: "*.mp4" must become *.mp4 (remove the quotes), and \( must become (. The quotes and backslashes are there only for bash's benefit; if you no longer use bash, you must remove them. For instance, this:
'\\(', '-name', '"*.mp4"', '-o', '-name', '"*.zip"', '\\)', '!', '\\('
must become:
'(', '-name', '*.mp4', '-o', '-name', '*.zip', ')', '!', '('
and the same transformation must be applied to the rest of your arguments.
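A quick way to see the difference (hypothetical demo; any system with bash installed) is to compare what a program receives when the tokens go through bash versus when they are passed directly, as spawn does:

```shell
# Through bash: the shell consumes backslashes and quotes before the
# program runs, so it receives the bare operators ( and *.mp4.
bash -c 'printf "[%s]" \( -name "*.mp4" \)'; echo
# prints: [(][-name][*.mp4][)]

# Passed directly: the escapes and quotes survive as literal text, so
# find would treat \( as a filename pattern, not as an operator.
printf "[%s]" '\(' '-name' '"*.mp4"' '\)'; echo
# prints: [\(][-name]["*.mp4"][\)]
```

This is exactly what the strace output above shows: find received the literal strings `\\(` and `\"*.mp4\"`.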
Related
I've got a bash command that works:
find content/posts \( -iname '*.jpg' \) -print0 | xargs -0 -P8 -n2 mogrify -auto-orient
I'm trying to convert it into a Rust command:
let mut cmd = Command::new("/usr/bin/find")
    .args([
        post.dest_dir(&dest_dir).as_os_str().to_str().unwrap(),
        "(-iname '*.jpg')",
        "-print0",
    ])
    .stdout(Stdio::piped())
    .spawn()
    .unwrap();
let output = Command::new("/usr/bin/xargs")
    .args(["-0", "-P8", "-n2", "mogrify", "-auto-orient"])
    .stdin(cmd.stdout.take().unwrap())
    .output()
    .unwrap_or_else(|e| panic!("failed to execute process: {}", e));
However I'm getting the following output:
find: (-iname '*.jpg'): No such file or directory
exit code: exit status: 1
mogrify: no decode delegate for this image format `' # error/constitute.c/ReadImage/741.
mogrify: no decode delegate for this image format `MD' # error/constitute.c/ReadImage/741.
mogrify: insufficient image data in file `content/posts/2021-06-26-united-states/DSCF1507.JPG' # error/jpeg.c/ReadJPEGImage_/1120.
It appears to be reading the right directory, but it's not respecting the pattern passed to find, \( -iname '*.jpg' \), and it's including Markdown files.
Is there a proper way to pass those arguments into Command?
Two rules will make this work:
Everything that's space-separated in the original bash command needs to be passed as a separate argument.
Quoting with ' and escaping with \ are shell syntax and don't need to be carried over to the Rust version.
You can also use post.dest_dir(&dest_dir) directly by passing it in a separate .arg() call. This gets rid of the string conversions and the possibility of failure if the path is not valid UTF-8.
let mut cmd = Command::new("/usr/bin/find")
    .arg(post.dest_dir(&dest_dir))
    .args(["(", "-iname", "*.jpg", ")", "-print0"]);
For example: I have a project directory. It contains 3 subdirectories, each holding one text file. Now I am using scandir() to count how many files and directories are present in the project, but scandir() only scans one level; it does not descend into the subdirectories. How can I scan those as well?
If you are using the command line, you can use find and wc.
To count all files recursively:
find . -type f | wc -l
To find directory count:
find . -type d | wc -l
where:
-type f matches regular files
-type d matches directories
wc prints newline, word, or byte counts; the -l flag gives you the line count
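For instance, with a hypothetical throwaway tree matching the question (3 subdirectories, one file each):

```shell
# Build the tree described in the question.
mkdir -p /tmp/demo_proj/sub1 /tmp/demo_proj/sub2 /tmp/demo_proj/sub3
touch /tmp/demo_proj/sub1/a.txt /tmp/demo_proj/sub2/b.txt /tmp/demo_proj/sub3/c.txt

find /tmp/demo_proj -type f | wc -l   # 3 files
find /tmp/demo_proj -type d | wc -l   # 4 directories (the starting directory is counted too)

rm -rf /tmp/demo_proj
```

Note that the directory count includes the starting directory itself.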
If you are referring to using scandir in PHP, you can try something like this:
<?php
function dirToArray($dir) {
    $result = array();
    $cdir = scandir($dir);
    foreach ($cdir as $key => $value)
    {
        if (!in_array($value, array(".", "..")))
        {
            if (is_dir($dir . DIRECTORY_SEPARATOR . $value))
            {
                $result[$value] = dirToArray($dir . DIRECTORY_SEPARATOR . $value);
            }
            else
            {
                $result[] = $value;
            }
        }
    }
    return $result;
}
?>
Source: comment #88 http://php.net/manual/en/function.scandir.php
We have a typical directory structure, and we need to navigate into a directory.
The problem is that the name of the directory changes every time, and I am trying to handle that with a script. Below is the directory structure:
/home/km5001731/cxs/ratc/1670/RATC1670/xxxxx
I want to navigate into that "xxxxx" directory, but I do not know its name. There are some more directories present inside it, and I do know the names of those.
How can I navigate to the one I want?
You can use this to find all the directories available (note that -maxdepth must come before -type, or GNU find will warn):
find /home/km5001731/cxs/ratc/1670/RATC1670/ -maxdepth 1 -type d
Then, you can iterate through them looking for the one in question
#!/bin/bash

base_path='/home/km5001731/cxs/ratc/1670/RATC1670/'
correct_directory=''

for directory in $(find "$base_path" -maxdepth 1 -type d)
do
    subdirectories=$(find "${directory}" -maxdepth 1 -type d)
    if grep -q "known_dir1" <<< "$subdirectories" && grep -q "known_dir2" <<< "$subdirectories"
    then
        correct_directory="${directory}"
        break
    fi
done

if [[ "$correct_directory" = "" ]]
then
    echo "Didn't find it!"
    exit
fi

cd "$correct_directory"
Or you can write a small piece of C code that recursively calls opendir() and readdir(), taking a regular-expression argument (a POSIX regex_t here) for an inclusive or exclusive folder-name pattern:
#include <dirent.h>
#include <limits.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

void examinedir(const char *dir, const regex_t *p)
{
    DIR *dp;
    struct dirent *entry;
    struct stat statbuf;

    if ((dp = opendir(dir)) == NULL)
    {
        /* error */
        return;
    }
    while ((entry = readdir(dp)) != NULL)
    {
        char abspath[PATH_MAX] = {0};

        /* Skip . and .., or the recursion never terminates. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        snprintf(abspath, sizeof abspath, "%s/%s", dir, entry->d_name);
        lstat(abspath, &statbuf);
        if (S_ISDIR(statbuf.st_mode))
        {
            /* It is a folder: examine its name with regexec(p, ...),
               and call examinedir(abspath, p) to recurse if you want. */
        }
        else
        {
            /* file */
        }
    }
    closedir(dp);
}
The script below navigates into the most recently modified subdirectory:
#!/bin/bash
cd /home/km5001731/cxs/ratc/1670/RATC1670/
Out_dir=$(ls -Art | tail -n 1)
cd /home/km5001731/cxs/ratc/1670/RATC1670/"$Out_dir"
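A quick sanity check (hypothetical throwaway directory) that `ls -Art | tail -n 1` really picks the most recently modified entry:

```shell
# Hypothetical throwaway directory with an old and a new file.
mkdir -p /tmp/demo_nav
touch /tmp/demo_nav/old
sleep 1
touch /tmp/demo_nav/new
# -A includes hidden entries; -r -t sorts oldest first, so tail -n 1 is the newest.
(cd /tmp/demo_nav && ls -Art | tail -n 1)   # prints: new
rm -rf /tmp/demo_nav
```

Keep in mind this picks the newest entry, not necessarily the "xxxxx" directory you want; it only works if that directory is the most recently modified one.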
Here is my command:
for i in `find . -name '*Source*.dat'`; do cp "$i" $INBOUND/$RANDOM.dat; done;
Here are the files (just a sample):
./(12)SA1 (Admitting Diagnosis) --_TA1-1 + TA1-2/Source.dat
./(12)SA1 (Admitting Diagnosis) --_TA1-1 + TA1-2/Source_2000C.dat
./(13)SE1 (External Cause of Injury) --_ TE1-1+TE1-2/Source.dat
./(13)SE1 (External Cause of Injury) --_ TE1-1+TE1-2/Source_2000C.dat
./(13)SE1 (External Cause of Injury) --_ TE1-1+TE1-2/Source_POATest.dat
./(14)SP1(Primary)--_ TP1-1 + TP1-2/Source.dat
./(14)SP1(Primary)--_ TP1-1 + TP1-2/Source_2000C.dat
./(14)SP1(Primary)--_ TP1-1 + TP1-2/Source_ProcDateTest.dat
./(15)SP1(Primary)--_ TP1-1 + TP1-2 - SP2 -- TP2-1 + TP2-2/Source.dat
./(16)SP1(Primary)--_ TP1-1 + TP1-2 +TP1-3- SP2 -- TP2-1 + TP2-2/Source.dat
./(17)SP1(Primary)--_ TP1-1 + TP1-2 +TP1-3/Source.dat
./(18)SP1(Primary)--_ TP1-1 + TP1-2 - SP2 -- TP2-1 + TP2-2 - Copy/Source.dat
./(19)SD1 (Primary)+SD2 (Other Diagnosis)--_ TD12/Source.dat
./(19)SD1 (Primary)+SD2 (Other Diagnosis)--_ TD12/Source_2000C.dat
./(19)SD1 (Primary)+SD2 (Other Diagnosis)--_ TD12/Source_POATest.dat
./(2)SD3--_TD4 SD4--_TD4/Source.dat
./(2)SD3--_TD4 SD4--_TD4/Source2.dat
Those spaces are getting tokenized by bash, so this doesn't work.
In addition, I want to append some randomness to the end of these file names so they don't collide in the destination directory, but that's another story.
find . -name '*Source*.dat' -exec bash -c 'cp "$1" "$2/$RANDOM.dat"' -- {} "$INBOUND" \;
Using -exec to execute commands is whitespace-safe. Running cp through a shell is necessary to get a different $RANDOM for each copy; since $RANDOM is a bash feature, use bash -c rather than sh -c.
If all the files are at the same directory level, as in your example, you don't need find. For example,
for i in */*Source*.dat; do
    cp "$i" "$INBOUND/$RANDOM.dat"
done
will tokenize correctly and will find the correct files provided they are all in directories which are children of the current directory.
As @chepner points out in a comment, if you have bash v4 you can use ** (after enabling it with shopt -s globstar):
shopt -s globstar
for i in **/*Source*.dat; do
    cp "$i" "$INBOUND/$RANDOM.dat"
done
which should find exactly the same files as find would, without the tokenizing issue.
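As a hypothetical check that ** recurses the way find does (bash 4+; ** is inert until globstar is enabled):

```shell
# Hypothetical layout: one matching file two levels down.
mkdir -p /tmp/demo_glob/a/b
touch /tmp/demo_glob/a/b/Source_1.dat
# Run in bash explicitly so shopt is available.
bash -c 'cd /tmp/demo_glob && shopt -s globstar && printf "%s\n" **/*Source*.dat'
# prints: a/b/Source_1.dat
rm -rf /tmp/demo_glob
```

Without `shopt -s globstar`, `**` behaves like a single `*` and would miss files below the first level.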
How about:
find . -name '*file*' -print0 | xargs -0 -I {} cp {} $INBOUND/{}-$RANDOM.dat
xargs is a handy way of constructing an argument list and passing it to a command.
find -print0 and xargs -0 go together; they are an agreement between the two commands to terminate arguments with a NUL byte rather than whitespace. In this case, it means a space won't be interpreted as the end of an argument.
-I {} sets up the {} as an argument placeholder for xargs.
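A small demonstration of the difference, using a hypothetical throwaway directory with one space-containing filename:

```shell
# A file whose name contains a space.
mkdir -p /tmp/demo_space
touch '/tmp/demo_space/a b.dat'

# Unquoted word splitting breaks the single name into two words...
set -- $(find /tmp/demo_space -name '*.dat')
echo "$#"   # prints: 2

# ...while NUL termination delivers it to the command as one argument.
find /tmp/demo_space -name '*.dat' -print0 | xargs -0 sh -c 'echo "$#"' sh
# prints: 1

rm -rf /tmp/demo_space
```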
As for randomising the file name to avoid a collision, there are obviously lots of things you could do to generate a random string to attach. The most important part, though, is that you verify that your new file name also does not exist. You might use a loop something like this to attempt that:
# Use an ordinary variable name: $RANDOM is special in bash, and $var=... is not valid assignment syntax.
rand=$(date | md5)   # md5 on macOS; use md5sum on Linux
filename=$INBOUND/$rand.dat
while [ -e "$filename" ]; do
    rand=$(date | md5)
    filename=$INBOUND/$rand.dat
done
I'm not necessarily advocating for or against generating a random filename with a hash of the current time: the main point is that you want to check for existence of that file first, just in case.
There are several ways of treating files with spaces. You can use find in a pipe with while and read:
find . -name '*Source*.dat' | while IFS= read -r file; do cp "$file" "$INBOUND/$RANDOM.dat"; done
try something like
while IFS= read -r i; do
    echo "file is $i"
    cp "$i" "$INBOUND/$RANDOM.dat"
done < <(find . -name '*Source*.dat')
I am trying to write a Perl script which checks all the directories in the current directory and then descends into the subdirectories, down to the deepest level. This is what I have written:
#!/usr/bin/perl -w
use strict;
my @files = <*>;
foreach my $file (@files){
    if (-d $file){
        my $cmd = qx |chown deep:deep $file|;
        my $chdir = qx |cd $file|;
        my @subfiles = <*>;
        foreach my $subfile (@subfiles){
            if (-d $subfile){
                my $cmd = qx |chown deep:deep $subfile|;
                my $chdir = qx |cd $subfile|;
                . # So on, in subdirectories
                .
                .
            }
        }
    }
}
Now, some of the directories I have contain around 50 subdirectories. How can I descend through them all without writing 50 nested if conditions? Please suggest. Thank you.
Well, a CS101 way (if this is just an exercise) is to use a recursive function
sub dir_perms {
    my $path = shift;
    opendir(my $dh, $path) or return;
    my @files = grep { !/^\.{1,2}$/ } readdir($dh); # ignore . and ..
    closedir($dh);
    for my $file (@files) {
        if ( -d "$path/$file" ) {
            dir_perms("$path/$file");
        }
        else {
            system('chown', 'deep:deep', "$path/$file");
        }
    }
}

dir_perms(".");
But I'd also look at File::Find for something more elegant and robust (this version can get caught in a circular-symlink trap, errors out if you don't call it on a directory, etc.), and for that matter I'd look at plain old UNIX find(1), which can do exactly what you're trying to do with the -exec option, e.g.
/bin/bash$ find /path/to/wherever -type f -exec chown deep:deep {} \;
perldoc File::Find has examples for what you are doing. Eg,
use File::Find;
finddepth(\&wanted, @directories_to_search);
sub wanted { ... }
Further down, the doc says you can use find2perl to generate the wanted() subroutine.
find2perl / -name .nfs\* -mtime +7 \
-exec rm -f {} \; -o -fstype nfs -prune
NOTE: The OS usually won't let you change ownership of a file or directory unless you are the superuser (i.e. root).
Now that we've got that out of the way...
The File::Find module does what you want. Use use warnings; instead of -w:
use strict;
use warnings;
use feature qw(say);
use autodie;
use File::Find;
my $uid = getpwnam('deep');
my $gid = getgrnam('deep');

finddepth sub {
    return unless -d; # You want only directories...
    chown $uid, $gid, $File::Find::name
        or warn qq(Couldn't change ownership of "$File::Find::name"\n);
}, ".";
The File::Find package imports a find and a finddepth subroutine into your Perl program.
Both work pretty much the same. They both recurse deeply into your directory and both take as their first argument a subroutine that's used to operate on the found files, and list of directories to operate on.
The name of the file is placed in $_ and you are placed in the directory of that file. That makes it easy to run the standard tests on the file. Here, I'm rejecting anything that's not a directory. It's one of the few places where I'll use $_ as the default.
The full name of the file (relative to the directory you're searching) is placed in $File::Find::name, and the name of that file's directory is in $File::Find::dir.
I prefer to put my subroutine embedded in my find, but you can also put a reference to another subroutine in there too. Both of these are more or less equivalent:
my @directories;
find sub {
    return unless -d;
    push @directories, $File::Find::name;
}, ".";

my @directories;
find \&wanted, ".";

sub wanted {
    return unless -d;
    push @directories, $File::Find::name;
}
In both of these, I'm gathering the names of all of the directories in my path and putting them in @directories. I like the first one because it keeps my wanted subroutine and my find together. Plus, the mysteriously undeclared @directories in my subroutine doesn't look so mysterious and undeclared: I declared my @directories; right above the find.
By the way, this is how I usually use find: I find what I want and place the names into an array. Otherwise, you're stuck putting all of your code into your wanted subroutine.