No such file or directory when piping. Each command works separately, but not when piping - linux

I have 2 folders: folder_a & folder_b. In each of these folders there are a bunch of files. I am trying to use sed to move all of these files out of these folders and into my current working directory.
My folder structure looks like this:
mytest:
a:
1.txt
2.txt
3.txt
b:
4.txt
5.txt
The command I am trying to use is:
find . -type d ! -iname '*.*' # find all folders other than root
| sed -r 's/.*/&\/*/' # add '/*' to each of the arguments
| sed -r 'p;s/.*/./' # output: a/* . b/* .
| xargs -n 2 mv # should be creating two commands: 'mv a/* .' and 'mv b/* .'
Unfortunately I get an error:
mv: cannot stat './aaa/*': No such file or directory
I also get the same error when I try this other strategy (using ls instead of mv):
for dir in */; do
ls $dir;
done;
Even if I use sed to replace the spaces in each directory name with '\ ', or surround the directory names with quotes I get the same error.
I'm not sure if these 2 examples are related in my misunderstanding of bash but they both seem to demonstrate my ignorance of how bash translates the output from one command into the input of another command.
Can anyone shed some light on this?

Update: Completely rewritten.
As @EtanReisner and @melpomene have noted, mv */* . or, more specifically, mv a/* b/* . is the most straightforward solution, but you state that this is in part a learning exercise, so the remainder of the answer shows an efficient find-based solution and explains the problem with the original command.
An efficient find-based solution
Generally, if feasible, it's best and most efficient to let find itself do the work, without involving additional tools; find's -exec action is like a built-in xargs, with {} representing the path at hand (with terminator \;) / all paths (with +):
find . -type f -exec echo mv -t . {} +
To be safe, this will just print the mv commands that would be executed; remove the echo to actually execute them.
This will execute a single[1] mv command to which all matching files are passed, and -t . moves them all to the current dir.
[1] If the resulting command line is too long (which is unlikely), it is split up into multiple commands, just as with xargs.
Operating on files (-type f) bypasses the need for globbing, as find will then enumerate all files for you (it also bypasses the need to exclude . explicitly).
Note that this solution works on entire subtrees, not just (immediate) subdirectories.
It's tempting to consider turning on Bash 4's globstar option and using mv */** ., but that won't work, because it will attempt to move directories as well, not just the files in them.
A caveat re -exec with +: it only works if {} - the placeholder for all paths - is the token immediately before the +.
Since you're on Linux, we can satisfy this condition by specifying the target folder for mv with option -t before the {}; on BSD-based systems such as OSX, you could not do that, because mv doesn't support -t there, so you'd have to use terminator \;, which means that mv is called once for every path, which is obviously much slower.
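As a minimal sketch of the find-only approach, the following builds a throwaway directory that mirrors the question's layout and runs the command inside it. The sandbox variable and the use of mktemp are assumptions made for illustration, and GNU mv is assumed for -t.

```shell
# Throwaway sandbox mirroring the question's layout (names are illustrative).
sandbox=$(mktemp -d)
mkdir "$sandbox/a" "$sandbox/b"
touch "$sandbox/a/1.txt" "$sandbox/a/2.txt" "$sandbox/a/3.txt" \
      "$sandbox/b/4.txt" "$sandbox/b/5.txt"

# Run in a subshell so the caller's working directory stays unchanged.
(
  cd "$sandbox" || exit 1
  # GNU mv: -t names the target directory first, so {} can sit right before +.
  find . -type f -exec mv -t . {} +
)

# List what now sits directly in the sandbox root.
moved=$(cd "$sandbox" && echo *.txt)
printf '%s\n' "$moved"
rm -rf "$sandbox"
```

Because -exec ... + only runs mv once the paths have been collected, the moved files are not picked up a second time in this small tree.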
Why your command didn't work:
As @EtanReisner points out in a comment, xargs invokes the command specified without (implicitly) involving a shell, so globbing won't work; you can verify this with the following command:
echo '*' | xargs echo # -> '*' - NO globbing
If we leave the globbing issue aside, additional work would have been necessary to make your xargs command work correctly with folder names with embedded spaces (or other shell metacharacters):
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -n 2 echo mv # NOTE: still won't work due to lack of globbing
Note how the (combined) sed command now produces a single output line '<input-path>'/* ., with the input path enclosed in embedded single-quotes, which is required for xargs to recognize <input-path> as a single argument, even if it contains embedded spaces.
(If your filenames contain single-quotes, you'd have to do more work; also note that since now all arguments for a given dir. are on a single line, you could use xargs -L 1 ....)
Also note how -mindepth 1 (only process paths at the subdirectory level or below) is used to skip processing of . itself.
The only way to make globbing happen is to get the shell involved:
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -I {} sh -c 'echo mv {}' # works, but is inefficient
Note the use of xargs' -I option to treat each input line as its own argument ({} is a self-chosen placeholder for the input).
sh -c invokes the (default) shell to execute the resulting command, at which point globbing does happen.
However, overall, this is quite inefficient:
A pipeline with 3 segments is used.
A shell instance is invoked for every input path, which in turn calls the mv utility.
Compare this to the efficient find-only solution above, which (typically) creates only 2 processes in total.
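The difference between the two xargs variants can be checked in a scratch directory. This is only a sketch: the sandbox variable and the file names are made up for illustration, and splicing input into sh -c like this is only reasonable for trusted input.

```shell
# Scratch directory with two files to glob against (names are illustrative).
sandbox=$(mktemp -d)
touch "$sandbox/x.txt" "$sandbox/y.txt"

# xargs runs echo directly, so no shell ever sees the pattern: it stays literal.
no_glob=$(cd "$sandbox" && echo '*.txt' | xargs echo)

# Here xargs starts sh, and sh expands the pattern before running echo.
with_glob=$(cd "$sandbox" && echo '*.txt' | xargs -I {} sh -c 'echo {}')

printf 'without a shell: %s\n' "$no_glob"
printf 'with sh -c:      %s\n' "$with_glob"
rm -rf "$sandbox"
```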

Related

Syntacticaly error while writing a Unix Shell script (Bash shell) [duplicate]

Say I want to copy the contents of a directory excluding files and folders whose names contain the word 'Music'.
cp [exclude-matches] *Music* /target_directory
What should go in place of [exclude-matches] to accomplish this?
In Bash you can do it by enabling the extglob option, like this (replace ls with cp and add the target directory, of course)
~/foobar> shopt extglob
extglob off
~/foobar> ls
abar afoo bbar bfoo
~/foobar> ls !(b*)
-bash: !: event not found
~/foobar> shopt -s extglob # Enables extglob
~/foobar> ls !(b*)
abar afoo
~/foobar> ls !(a*)
bbar bfoo
~/foobar> ls !(*foo)
abar bbar
You can later disable extglob with
shopt -u extglob
The extglob shell option gives you more powerful pattern matching in the command line.
You turn it on with shopt -s extglob, and turn it off with shopt -u extglob.
In your example, you would initially do:
$ shopt -s extglob
$ cp !(*Music*) /target_directory
The full available extended globbing operators are (excerpt from man bash):
If the extglob shell option is enabled using the shopt builtin, several extended pattern matching operators are recognized. A pattern-list is a list of one or more patterns separated by a |. Composite patterns may be formed using one or more of the following sub-patterns:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
So, for example, if you wanted to list all the files in the current directory that are not .c or .h files, you would do:
$ ls -d !(*@(.c|.h))
Of course, normal shell globbing works, so the last example could also be written as:
$ ls -d !(*.[ch])
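A small way to try these operators without touching real files is sketched below. It assumes bash is installed and enables extglob at startup with bash -O extglob; the sandbox variable and the file names are illustrative only.

```shell
# Scratch directory with the four sample names from the session above.
sandbox=$(mktemp -d)
touch "$sandbox/abar" "$sandbox/afoo" "$sandbox/bbar" "$sandbox/bfoo"

# -O extglob turns the option on before bash parses the command string.
not_b=$(cd "$sandbox" && bash -O extglob -c 'echo !(b*)')
not_foo=$(cd "$sandbox" && bash -O extglob -c 'echo !(*foo)')
one_of=$(cd "$sandbox" && bash -O extglob -c 'echo @(abar|bfoo)')

printf '%s\n' "$not_b" "$not_foo" "$one_of"
rm -rf "$sandbox"
```

Enabling the option at startup sidesteps the fact that bash must already have extglob on when it parses a line containing !( ... ).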
Not in bash (that I know of), but:
cp `ls | grep -v Music` /target_directory
I know this is not exactly what you were looking for, but it will solve your example.
If you want to avoid starting a new process for every file, as -exec ... \; does, I believe you can do better with xargs. I think the second command below is a more efficient alternative to the first:
find foo -type f ! -name '*Music*' -exec cp {} bar \; # new proc for each exec
find . -maxdepth 1 -name '*Music*' -prune -o -print0 | xargs -0 cp -t dest/ # GNU cp: -t lets xargs pass a whole batch of names to one cp
A trick I haven't seen on here yet that doesn't use extglob, find, or grep is to treat two file lists as sets and "diff" them using comm:
comm -23 <(ls) <(ls *Music*)
comm is preferable over diff because it doesn't have extra cruft.
This returns all elements of set 1, ls, that are not also in set 2, ls *Music*. This requires both sets to be in sorted order to work properly. No problem for ls and glob expansion, but if you're using something like find, be sure to invoke sort.
comm -23 <(find . | sort) <(find . | grep -i '.jpg' | sort)
Potentially useful.
You can also use a pretty simple for loop:
for f in $(find . -not -name "*Music*") # NOTE: breaks on file names containing whitespace
do
cp "$f" /target/dir
done
In bash, an alternative to shopt -s extglob is the GLOBIGNORE variable. It's not really better, but I find it easier to remember.
An example that may be what the original poster wanted:
GLOBIGNORE="*techno*"; cp *Music* /only_good_music/
When done, unset GLOBIGNORE to be able to rm *techno* in the source directory.
My personal preference is to use grep and the while command. This allows one to write powerful yet readable scripts ensuring that you end up doing exactly what you want. Plus by using an echo command you can perform a dry run before carrying out the actual operation. For example:
ls | grep -v "Music" | while read filename
do
echo $filename
done
will print out the files that you will end up copying. If the list is correct the next step is to simply replace the echo command with the copy command as follows:
ls | grep -v "Music" | while read filename
do
cp "$filename" /target_directory
done
One solution for this can be found with find.
$ mkdir foo bar
$ touch foo/a.txt foo/Music.txt
$ find foo -type f ! -name '*Music*' -exec cp {} bar \;
$ ls bar
a.txt
Find has quite a few options, you can get pretty specific on what you include and exclude.
Edit: Adam in the comments noted that this is recursive. find options mindepth and maxdepth can be useful in controlling this.
The following loops over all *.txt files in a directory, except those whose names begin with a digit.
This works in bash, dash, zsh and all other POSIX compatible shells.
for FILE in /some/dir/*.txt; do # for each *.txt file
case "${FILE##*/}" in # if file basename...
[0-9]*) continue ;; # starts with digit: skip
esac
## otherwise, do stuff with $FILE here
done
In line one, the pattern /some/dir/*.txt causes the for loop to iterate over all files in /some/dir whose names end with .txt.
In line two, a case statement is used to weed out undesired files. The ${FILE##*/} expression strips off any leading directory component from the filename (here /some/dir/) so that patterns match against only the basename of the file. (If you're only weeding out filenames based on suffixes, you can shorten this to $FILE instead.)
In line three, all files matching the case pattern [0-9]* are skipped (the continue statement jumps to the next iteration of the for loop). If you want to, you can do something more interesting here, such as skipping all files that do not start with a letter (a-z) using [!a-z]*, or you could use multiple patterns to skip several kinds of filenames, e.g. [0-9]*|*.bak to skip both .bak files and files that start with a digit.
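To see the filter in action, here is a self-contained sketch that runs the same loop against a scratch directory. The sandbox and kept variables and the file names are invented for the example.

```shell
# Scratch directory with a mix of names (all illustrative).
sandbox=$(mktemp -d)
touch "$sandbox/1a.txt" "$sandbox/b.txt" "$sandbox/c.bak" "$sandbox/d.txt"

kept=""
for FILE in "$sandbox"/*; do
  case "${FILE##*/}" in            # match against the basename only
    [0-9]*|*.bak) continue ;;      # starts with a digit, or is a .bak file: skip
  esac
  kept="$kept ${FILE##*/}"         # otherwise remember the basename
done
kept=${kept# }                     # drop the leading separator

printf 'kept: %s\n' "$kept"
rm -rf "$sandbox"
```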
this would do it excluding exactly 'Music'
cp -a ^'Music' /target
this and that for excluding things like Music?* or *?Music
cp -a ^\*?'complete' /target
cp -a ^'complete'?\* /target

Only list files; iconv and directories?

I want to convert the encoding of some CSV files with iconv. It has to be a script, so I am working with while; do; done. The script lists every item in a specific directory and converts each one to another encoding (UTF-8).
Currently, my script lists EVERY item, including directories... So here are my questions:
Does iconv have a problem with directories, or does it ignore them?
And if there is a problem, how can I list/search for files only?
I tried the approach from "How to list only files in Bash?", but it puts a ./ at the beginning of every item, which is kind of annoying (and my program doesn't like it, either).
Another possibility is ls -p | grep -v / but this would also affect files with / in the name, wouldn't it?
I hope you can help me. Thank you.
Here is the code:
for item in $(ls directory/); do
FileName=$item
iconv -f "windows-1252" -t "UTF-8" FileName -o FileName
done
Yea, i know, the input and output file cannot be the same^^
Use find directly:
find . -maxdepth 1 -type f -exec bash -c 'iconv -f "windows-1252" -t "UTF-8" "$1" > "$1.converted" && mv "$1.converted" "$1"' -- {} \;
find . -maxdepth 1 -type f finds all files in the working directory
-exec ... executes a command on each such file (including correct handling of e.g. spaces or newlines in the filename)
bash -c '...' executes the command in '...' in a new shell (this makes the subsequent steps, which expand the filename several times, easier)
-- is the first argument after the command string, so bash assigns it to $0; it is only a placeholder that makes the filename land in $1.
{} is replaced by find with the file name(s) found
$1 in the bash command is replaced with the first (and only) argument, which is the {} replaced by the filename (see above)
\; tells find where the -exec'ed command ends.
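The pieces above can be tried on a scratch file. The sketch below assumes iconv and od are available; the sandbox variable and the file name are made up, and sh is used as the $0 placeholder (the word right after the inline script always becomes $0).

```shell
# Scratch directory with one windows-1252 file whose name contains a space.
sandbox=$(mktemp -d)
printf 'caf\351\n' > "$sandbox/my file.csv"   # \351 is the byte 0xE9 (e-acute in windows-1252)

# Convert each regular file via a temporary file, then replace the original.
find "$sandbox" -maxdepth 1 -type f -exec sh -c '
  iconv -f windows-1252 -t UTF-8 "$1" > "$1.converted" && mv "$1.converted" "$1"
' sh {} \;

# Dump the resulting bytes as hex, without separators.
hex=$(od -An -tx1 "$sandbox/my file.csv" | tr -d ' \n')
printf '%s\n' "$hex"
rm -rf "$sandbox"
```

Quoting "$1" everywhere is what keeps the filename with the space intact inside the inline script.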
Building upon the existing question that you referenced, why don't you just remove the first 2 characters, i.e. ./?
find . -maxdepth 1 -type f | cut -c 3-
Edit: I agree with @DevSolar about the space-based problem in the for-loop. While I think that his solution is better for this problem, I just want to give an alternative way to get out of the space-based for-loop issue.
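OLD_IFS=$IFS
IFS=$'\n' # split the find output on newlines only
for item in $(find . -maxdepth 1 -type f | cut -c 3-); do
FileName=$item
# write to a temporary file first: the input and output file cannot be the same
iconv -f "windows-1252" -t "UTF-8" "$FileName" -o "$FileName.converted" && mv "$FileName.converted" "$FileName"
done
IFS=$OLD_IFS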

Linux: how to replace all instances of a string with another in all files of a single type

I want to replace for example all instances of "123" with "321" contained within all .txt files in a folder (recursively).
I thought of doing this
sed -i 's/123/321/g' | find . -name \*.txt
but before possibly screwing all my files I would like to ask if it will work.
You have the sed and the find back to front. With GNU sed and the -i option, you could use:
find . -name '*.txt' -type f -exec sed -i s/123/321/g {} +
The find finds files with extension .txt and runs the sed -i command on groups of them (that's the + at the end; it's standard in POSIX 2008, but not all versions of find necessarily support it). In this example substitution, there's no danger of misinterpretation of the s/123/321/g command so I've not enclosed it in quotes. However, for simplicity and general safety, it is probably better to enclose the sed script in single quotes whenever possible.
You could also use xargs (and again using GNU extensions -print0 to find and -0 and -r to xargs):
find . -name '*.txt' -type f -print0 | xargs -0 -r sed -i 's/123/321/g'
The -r means 'do not run if there are no arguments' (that is, when the find doesn't find anything). The -print0 and -0 work in tandem, generating file names ending with the C null byte '\0' instead of a newline, and avoiding misinterpretation of file names containing newlines, blanks and so on.
Note that before running the script on the real data, you can and should test it. Make a dummy directory (I usually call it junk), copy some sample files into the junk directory, change directory into the junk directory, and test your script on those files. Since they're copies, there's no harm done if something goes wrong. And you can simply remove everything in the directory afterwards: rm -fr junk should never cause you anguish.
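Such a dry run might look like the sketch below, which uses a temporary directory in place of junk. GNU sed and GNU xargs are assumed; the sandbox variable, file names and file contents are invented for the test.

```shell
# Temporary "junk" directory with sample files (names are illustrative).
sandbox=$(mktemp -d)
mkdir "$sandbox/sub dir"
printf 'abc 123 123\n' > "$sandbox/sub dir/a b.txt"   # should be rewritten
printf '123\n' > "$sandbox/keep.log"                  # not a .txt file: untouched

# NUL-separated names survive the blanks in the directory and file name.
find "$sandbox" -name '*.txt' -type f -print0 | xargs -0 -r sed -i 's/123/321/g'

txt=$(cat "$sandbox/sub dir/a b.txt")
log=$(cat "$sandbox/keep.log")
printf 'txt: %s\nlog: %s\n' "$txt" "$log"
rm -rf "$sandbox"
```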

Remove files not containing a specific string

I want to find the files not containing a specific string (in a directory and its sub-directories) and remove those files. How I can do this?
The following will work:
find . -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null rm
This will first use find to print the names of all the files in the current directory and any subdirectories. These names are printed with a null terminator rather than the usual newline separator (try piping the output to od -c to see the effect of the -print0 argument).
Then the --null parameter to xargs tells it to accept null-terminated inputs. xargs will then call grep on a list of filenames.
The -Z argument to grep works like the -print0 argument to find, so grep will print out its results null-terminated (which is why the final call to xargs needs a --null option too). The -L argument to grep causes grep to print the filenames of those files on its command line (that xargs has added) which don't match the regular expression:
my string
If you want simple matching without regular expression magic then add the -F option. If you want more powerful regular expressions then give a -E argument. It's a good habit to use single quotes rather than double quotes, as this protects you against any shell magic being applied to the string (such as variable substitution).
Finally you call xargs again to get rid of all the files that you've found with the previous calls.
The problem with calling grep directly from the find command with -exec ... \; is that grep then gets invoked once per file rather than once for a whole batch of files, as xargs does. Batching is much faster if you have lots of files. Also don't be tempted to do stuff like:
rm $(some command that produces lots of filenames)
It's always better to pass it to xargs as this knows the maximum command-line limits and will call rm multiple times each time with as many arguments as it can.
Note that this solution would have been simpler without the need to cope with files containing white space and new lines.
Alternatively
grep -r -L -Z 'my string' . | xargs --null rm
will work too (and is shorter). The -r argument to grep causes it to read all files in the directory and recursively descend into any subdirectories. Use the find ... approach if you want to do some other tests on the files as well (such as age or permissions).
Note that any of the single letter arguments, with a single dash introducer, can be grouped together (for instance as -rLZ). But note also that find does not use the same conventions and has multi-letter arguments introduced with a single dash. This is for historical reasons and hasn't ever been fixed because it would have broken too many scripts.
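Since the pipeline deletes files, it is worth trying on disposable data first. The following sketch assumes GNU grep and GNU xargs (-0 is the short form of --null); the sandbox variable and the file names are made up.

```shell
# Disposable directory: one file contains the string, one does not.
sandbox=$(mktemp -d)
printf 'has my string here\n' > "$sandbox/keep me.txt"
printf 'nothing relevant\n' > "$sandbox/drop me.txt"

# grep -L lists files WITHOUT a match; -Z ends each name with a NUL byte.
find "$sandbox" -type f -print0 |
  xargs -0 grep -Z -L 'my string' |
  xargs -0 -r rm

# Only the file containing the string should be left.
remaining=$(cd "$sandbox" && echo *)
printf 'remaining: %s\n' "$remaining"
rm -rf "$sandbox"
```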
GNU grep and bash.
grep -rLZ "$str" . | while IFS= read -rd '' x; do rm "$x"; done
Use a find solution if portability is needed. This is slightly faster.
EDIT: This is how you SHOULD NOT do this! Reason is given here. Thanks to @ormaaj for pointing it out!
find . -type f | grep -v "exclude string" | xargs rm
Note: grep pattern will match against full file path from current directory (see find . -type f output)
One possibility is
find . -type f '!' -exec grep -q "my string" {} \; -exec echo rm {} \;
You can remove the echo if the output of this preview looks correct.
The equivalent with -delete is
find . -type f '!' -exec grep -q "my string" {} \; -delete
but then you don't get the nice preview option.
To remove files not containing a specific string:
Bash:
These examples match on file names (not file contents) and use extended glob patterns; enable the extglob shell option first:
shopt -s extglob
Then remove all files whose names do not contain the string "fix":
rm !(*fix*)
To remove all files whose names contain neither "fix" nor "class":
rm !(*fix*|*class*)
Zsh:
Enable the extended_glob shell option first:
setopt extended_glob
Remove all files whose names do not contain the string, in this example "fix":
rm -- ^*fix*
To remove all files whose names contain neither "fix" nor "class":
rm -- ^(*fix*|*class*)
This also works for extensions; you only need to change the pattern, e.g. *.zip or *.doc.
Here are the sources:
https://www.tecmint.com/delete-all-files-in-directory-except-one-few-file-extensions/
https://codeday.me/es/qa/20190819/1296122.html
I can think of a few ways to approach this. Here's one: find and grep to generate a list of files with no match, and then xargs rm them.
find yourdir -type f -exec grep -F -L 'yourstring' '{}' + | xargs -d '\n' rm
This assumes GNU tools (grep -L and xargs -d are non-portable) and of course no filenames with newlines in them. It has the advantage of not running grep and rm once per file, so it'll be reasonably fast. I recommend testing it with "echo" in place of "rm" just to make sure it picks the right files before you unleash the destruction.
This worked for me, you can remove the -f if you're okay with deleting directories.
myString="keepThis"
for x in `find ./`
do if [[ -f $x && ! $x =~ $myString ]]
then rm $x
fi
done
Another solution (although not as fast). The top solution didn't work in my case because the string I needed to use in place of 'my string' has special characters.
find -type f ! -name "*my string*" -exec rm {} \; -print

Unix: traverse a directory

I need to traverse a directory, starting in one directory and going deeper into different subdirectories. However, I also need to be able to access each individual file in order to modify it. Is there already a command to do this or will I have to write a script? Could someone provide some code to help me with this task? Thanks.
The find command is just the tool for that. Its -exec flag or -print0 in combination with xargs -0 allows fine-grained control over what to do with each file.
Example: Replace every foo with bar in all files in /tmp and its subdirectories.
find /tmp -type f -exec sed -i -e 's/foo/bar/g' '{}' ';'
for i in $(find .) ; do # NOTE: breaks on names containing whitespace
if [ -d "$i" ] ; then echo "directory: $i" ; fi # do something with a directory
if [ -f "$i" ] ; then echo "file: $i" ; fi # do something with a file etc.
done
This will return the whole tree (recursively) in the current directory in a list that the loop will go through.
This can be easily achieved by mixing find, xargs, sed (or other file modification command).
For example:
$ find /path/to/base/dir -type f -name '*.properties' | xargs sed -i -e '/^#/d'
The find command selects all files with the file extension .properties.
The xargs command feeds the file paths generated by find to the sed command.
The sed command deletes all lines starting with # in those files. Note that -i and -e are written separately: with GNU sed, -ie would be read as -i with the backup suffix e.
Combining commands in this way is very flexible.
For example, the find command has many parameters, so you can filter by user name, file size, file path (e.g. under a /test/ subfolder), or file modification time.
Another dimension of flexibility is how and what to change in your file. For example, sed lets you change a file by applying substitutions (specified via regular expressions). Similarly, you can use gzip to compress the file. And so on ...
You would usually use the find command. On Linux, you have the GNU version, of course, which has many extra (and useful) options. Both the POSIX and the GNU version allow you to execute a command (e.g. a shell script) on the files as they are found.
The exact details of how to make changes to the file depend on the change you want to make to the file. That is probably best scripted, with find running the script:
POSIX or GNU:
find . -type f -exec your_script '{}' +
This will run your script once for a group of files with those names provided as arguments. If you want to do it one file at a time, replace the + with ';' (or \;).
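As a rough illustration of the batched form, the sketch below substitutes an inline sh -c script for your_script and simply prints each basename. The sandbox variable, the file names and the inline script are assumptions for the example, not part of the answer's setup.

```shell
# Scratch tree three levels deep (names are illustrative).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/x/y"
touch "$sandbox/top.txt" "$sandbox/x/mid.txt" "$sandbox/x/y/deep one.txt"

# With +, find passes a batch of paths; the inline script loops over "$@".
# The word after the script (sh) becomes $0, the paths become $1, $2, ...
count=$(find "$sandbox" -type f -exec sh -c '
  for f in "$@"; do
    printf "%s\n" "${f##*/}"    # stand-in for real per-file work
  done
' sh {} + | wc -l)

printf 'files processed: %s\n' "$count"
rm -rf "$sandbox"
```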
I am assuming SearchMe is the example directory name you need to traverse completely.
I am also assuming, since it was not specified, that the files you want to modify are all text files. Is this correct?
In such scenario I would suggest using the command:
find SearchMe -type f -exec vi {} \;
If you are not familiar with vi editor, just use another one (nano, emacs, kate, kwrite, gedit, etc.) and it should work as well.
Bash 4+
shopt -s globstar
for file in **
do
if [ -f "$file" ]; then
# do some processing on your file here
# that the find command can't do conveniently
: # no-op placeholder; an if body containing only comments is a syntax error
fi
done
