Identifying multiple file types with bash

I'm pretty sure I've seen this done before but I can't remember the exact syntax.
Suppose you have a couple of files with different file extensions:
foo.txt
bar.rtf
index.html
and instead of doing something with all of them (cat *), you only want to run a command on 2 of the 3 file extensions.
Can't you do something like this?
cat ${*.txt|*.rtf}
I'm sure there's some find trickery to identify the files first and pipe them to a command, but I think bash supports what I'm talking about without having to do that.

The syntax you want is cat *.{txt,rtf}. A comma is used instead of a pipe.
$ echo foo > foo.txt
$ echo bar > bar.rtf
$ echo "bar txt" > bar.txt
$ echo "test" > index.html
$ cat *.{txt,rtf}
bar txt
foo
bar
$ ls *.{txt,rtf}
bar.rtf bar.txt foo.txt
But as Anthony Geoghegan said in their answer, there's a simpler approach you can use.

Shell globbing is much more basic than regular expressions. If you want to cat all the files which have a .txt or .rtf suffix, you'd simply use:
cat *.txt *.rtf
The glob patterns will be expanded to list all the filenames that match the pattern. In your case, the above command would call the cat command with foo.txt and bar.rtf as its arguments.
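As a rough sketch of why the brace form and the two-pattern form are equivalent: bash performs brace expansion before globbing, so *.{txt,rtf} is first rewritten into the two patterns *.txt *.rtf, and each one is then matched separately. The scratch directory and array names below are only for illustration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the example does not touch real files.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
echo foo > foo.txt
echo bar > bar.rtf
echo "<html>" > index.html

# Brace expansion happens first: *.{txt,rtf} becomes the words *.txt *.rtf,
# and each word is then expanded as an ordinary glob.
brace_files=( *.{txt,rtf} )
plain_files=( *.txt *.rtf )

printf 'brace: %s\n' "${brace_files[*]}"
printf 'plain: %s\n' "${plain_files[*]}"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```

If one of the patterns matches nothing, bash by default passes it through unchanged, so cat would complain about a missing file literally named *.rtf. Running shopt -s nullglob first makes unmatched patterns disappear instead.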

Here's a simple way I use to do it, using command substitution.
cat $(find . -type f \( -name "*.txt" -o -name "*.rtf" \))
But Anthony Geoghegan's answer is much simpler. I learned from it too.
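One caveat with command substitution is that its output is split on whitespace, so a file name containing a space is broken into several arguments. A hedged alternative, assuming your find supports the -exec ... + form, is to let find hand the names to cat itself. The file names below are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Scratch directory with one awkward file name (it contains a space).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
echo one > "my notes.txt"
echo two > plain.rtf
echo skip > index.html

# find passes each matching path to cat as a single argument,
# so the space in "my notes.txt" causes no trouble.
# sort is only here to make the order of the output predictable.
out=$(find . -type f \( -name '*.txt' -o -name '*.rtf' \) -exec cat {} + | sort)
printf '%s\n' "$out"

cd / && rm -rf "$tmp"
```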

Related

How to match file extensions regardless of case in bash script

Let's just say I want to match the .abc extension, but it could come over as .ABC or .AbC. How can I identify all of these variations of the .abc extension in order to process .abc files?
Right now i'm using:
ls | grep -i .abc
but I've heard that piping to grep is usually not the best idea. Is there a better way to do this?

If you enter the extension literally, you can use character classes:
ls *.[Aa][Bb][Cc]
You can also use the -iname option of find:
find . -maxdepth 1 -iname '*.abc'

You can use the nocaseglob option to the shopt builtin to make globs not pay attention to case.
$ touch foo.abc foo.ABC
$ echo *.abc
foo.abc
$ shopt -s nocaseglob
$ echo *.abc
foo.ABC foo.abc
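Since nocaseglob stays on for the rest of the shell session, a script may want to switch it on only around the glob it needs. A minimal sketch, assuming bash; the file names and the matches array are invented for the example:

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch foo.abc bar.ABC baz.AbC other.txt

# Turn on case-insensitive globbing (and nullglob, so that an unmatched
# pattern expands to nothing) just for this one expansion.
shopt -s nocaseglob nullglob
matches=( *.abc )
shopt -u nocaseglob nullglob

# Process each match; here we only print it.
for f in "${matches[@]}"; do
    printf 'processing %s\n' "$f"
done

cd / && rm -rf "$tmp"
```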

What is the difference between something and 'something' in the Linux shell?

I want to find all the .pdf files recursively by using find.
So I typed in find . -name *.pdf
The output was weird: it only contained the PDF files in the current directory, and the PDF files in subdirectories were omitted.
Then I tried find . -name '*.pdf'
This time everything was fine, and I got what I wanted: all the PDF files, including those located in subdirectories.
So here is the question: what is the difference between find . -name *.pdf and find . -name '*.pdf'?

Yes, as you've found, quoting makes all the difference here.
Without quotes, the shell's glob expansion replaces *.pdf with the matching file names in the current directory before find even runs, so find is given those names instead of the pattern and looks only for those exact names.
In other words, in this find command:
find . -name *.pdf
the *.pdf is replaced by the same list of names that this prints:
printf "%s\n" *.pdf
So the right way to use find is:
find . -name '*.pdf'
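To see the effect in isolation, here is a small sketch with one PDF in the current directory and one in a subdirectory (the directory layout and variable names are assumptions made for the demonstration):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir sub
touch top.pdf sub/nested.pdf

# Unquoted (deliberately): the shell replaces *.pdf with top.pdf before find
# starts, so find is really asked for files named exactly top.pdf.
unquoted=$(find . -name *.pdf | sort)

# Quoted: find receives the pattern itself and applies it in every directory.
quoted=$(find . -name '*.pdf' | sort)

printf 'unquoted:\n%s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"

cd / && rm -rf "$tmp"
```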

In the Linux shell (bash, csh, sh, and probably many others I'm not as familiar with), different quotes mean different things.
Fundamentally, quotes do two things for you:
Limiting text substitution: You're experiencing the shell substituting references to all PDF files in your current directory for *.pdf. That's an example of text substitution. Text substitution can also occur with variable names--for example:
bash$ MYVAR=test
bash$ echo $MYVAR
test
bash$ echo '$MYVAR'
$MYVAR
The unquoted $MYVAR gives you the output test, because $MYVAR is substituted with the value the variable is set to. Inside single quotes no substitution happens, so $MYVAR is printed literally.
Overriding space as an argument separator: Let's pretend you have a directory with these files
bash$ ls -1
file1
file1 file2
file2
If you type ls file1, you'd get file1 as you expect. Similarly, ls file2 gives you file2. The following commands show the significance of quotes overriding space as an argument separator:
bash$ ls -1 file1 file2
file1
file2
bash$ ls -1 "file1 file2"
file1 file2
Notice how the first example displays two files file1 and file2, while the second example displays one file file1 file2. That's because the quotes stop " " (a single space) from being used as an argument separator.
One final note: your original question asks about the difference between something and 'something'. It's worth noting that there is actually a difference between something, 'something', and "something". Consider this:
bash$ MYVAR=test
bash$ echo $MYVAR
test
bash$ echo '$MYVAR'
$MYVAR
bash$ echo "$MYVAR"
test
Note the difference between '$MYVAR' and "$MYVAR". The ' (single quote) is considered a "strong quote," meaning everything inside it is taken literally. The " (double quote) is a "weak quote," which does not expand * or ?, but does expand variable names and command substitutions.
The Grymoire provides a tremendous amount of information about quotes. Have fun learning!
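As a compact illustration of strong versus weak quoting, the sketch below puts a variable, a command substitution, and a wildcard inside each kind of quote (the variable names single and double are only for this example):

```shell
#!/usr/bin/env bash
MYVAR=test

# Single quotes: nothing is expanded, the text is kept exactly as written.
single='$MYVAR $(echo hi) *'

# Double quotes: the variable and the command substitution are expanded,
# but the * is left alone instead of being matched against file names.
double="$MYVAR $(echo hi) *"

printf '%s\n' "$single"
printf '%s\n' "$double"
```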

List the first few lines of every file in a directory

I'm trying to create a really simple bash script, which will list the first few lines of every file in a specific directory. The directory should be specified by the argument.
I think that the Grep command should be used, but I have really no idea how.
My existing script does not seem to work at all, so it's no use putting it in here.

Use the head command:
head -n 3 /path/to/dir/*

For any answer using head and *, redirect stderr to /dev/null unless you want to see errors like:
head: error reading ‘tmp’: Is a directory

for file in dir/*; do
    echo "-- $file --"
    head "$file"
    echo
done
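Since the question asks for the directory to be given as an argument, the loop could be wrapped up roughly like this. This is only a sketch: the function name preview_dir and the default of 3 lines are my own choices, not anything from the question.

```shell
#!/usr/bin/env bash
# preview_dir DIR [LINES]
# Print the first LINES lines (default 3) of every regular file in DIR.
preview_dir() {
    local dir=${1:?usage: preview_dir DIR [LINES]}
    local lines=${2:-3}
    local file
    for file in "$dir"/*; do
        # Skip subdirectories and anything else that is not a regular file,
        # which avoids the "Is a directory" errors from head.
        [ -f "$file" ] || continue
        printf -- '-- %s --\n' "$file"
        head -n "$lines" "$file"
        echo
    done
}
```

In a script you would call it with the script's own first argument, for example preview_dir "$1" 5.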

If you want the first few lines of all files ending in .txt, try
head *.txt
or
head --lines=3 *.txt

Because bash does filename expansion (globbing) by default, you can just let your shell expand input and let head do the rest:
head *
The * wildcard expands to all the filenames in the working directory. In zsh you can see this nicely: it expands the wildcard on your command line when you press Tab.
You can change the number of lines with the -n argument to head.
If you want to do this recursively:
find . \! -type d -exec head '{}' +

About the composition of Linux command

Assuming:
the path of file f is ~/f
"which f" shows "~/f",
Then,
which f | cat shows ~/f. So cat here is applied to the quotation of ~/f, which is different from cat ~/f.
My question is: how can I use one command composed of which and cat to achieve the result of cat ~/f? When I don't know the result of which f in advance, this kind of composition would be very convenient. Currently, if I don't know the result of which f in advance, I have to invoke which f first, then copy and paste the result to feed it to less.
A related question: how can I assign the result of which f to a variable?
Thanks a lot!

Try:
cat `which f`
For the related question:
foo=`which f`
echo "$foo"

cat "`which f`"

Like so in bash:
cat "$(which f)"
var="$(which f)"

What you want is:
cat `which f`

In which f | cat, the cat program gets the output of which f on standard input. cat then just passes that standard input through, so the result is the same as a plain which f. In the call cat ~/f, the file name is passed as a parameter to the command. cat then opens the file ~/f and displays its contents.
To get the output of which f as a parameter to cat you can, as others have answered, use backticks or $():
cat `which f`
cat $(which f)
Here the shell takes the output of which f and inserts it as a parameter for cat.
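The difference between the two forms can be sketched with a stand-in for which. Everything named below (the scratch file and the print_path function that plays the role of which f) is hypothetical and only there to make the example self-contained. The last line shows a pipe-based composition with xargs, which turns standard input into arguments.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
printf 'file contents\n' > "$tmp/f"

# Hypothetical stand-in for `which f`: it just prints a path.
print_path() { printf '%s\n' "$tmp/f"; }

# Piped: cat copies its standard input, so we get the path text back.
piped=$(print_path | cat)

# Command substitution: the path becomes an argument, so cat opens the file.
as_arg=$(cat "$(print_path)")

# xargs reads the path from the pipe and passes it to cat as an argument.
via_xargs=$(print_path | xargs cat)

printf 'piped:     %s\n' "$piped"
printf 'as_arg:    %s\n' "$as_arg"
printf 'via_xargs: %s\n' "$via_xargs"
```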

In bash, you can use:
cat "$(which f)"
to output the contents of the f that which finds. This, like the backtick solution, takes the output of the command within $(...) and uses that as a parameter to the cat command.
I prefer the $(...) to the backtick method since the former can be nested in more complex situations.
Assigning the output of which to a variable is done similarly:
full_f="$(which f)"
In both cases, it's better to use the quotes in case f, or its path, contains spaces, as heinous as that crime is :-)
I've often used a similar trick when I want to edit a small group of files with similar names under a given sub-directory:
vim $(find . -type f -name Makefile)
which will give me a single vim session for all the makefiles (obviously, if there were a large number, I'd be using sed or perl to modify them en masse instead of vim).

cat echoes the contents of files to the standard output. When you write stuff | cat, the file cat works on is the standard input, which is connected to the output of stuff (because pipes are files, just like nearly everything else in Unix).
There is no quoting going on in the sense that a lisp programmer would use the word.

how to manipulate strings with shell-script

This is what I used:
for i in `find some -type f -name *.class`
I got:
some/folder/subOne/fileOne.class
some/folder/subOne/fileTwo.class
some/other/sub/file.class
Next, I would like to get rid of the "some/" for each value of $i. What command can I use? Do I HAVE to save them into a file first?
Thanks

$ i=some/other/sub/file.class
$ echo "${i#some/}"
other/sub/file.class
Bash has simple string manipulation built in. See also ${i%.class} and the basename and dirname commands.
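Applied to the whole find result, the same expansion can be used inside a loop with no temporary file. The following is a sketch under a few assumptions: it recreates the directory layout from the question in a scratch directory, quotes the -name pattern so the shell does not expand it, and collects the results in an array I have called stripped.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p some/folder/subOne some/other/sub
touch some/folder/subOne/fileOne.class \
      some/folder/subOne/fileTwo.class \
      some/other/sub/file.class

stripped=()
# Read find's output line by line; sort only makes the order predictable.
while IFS= read -r path; do
    # ${path#some/} removes the leading "some/" from the value.
    stripped+=( "${path#some/}" )
done < <(find some -type f -name '*.class' | sort)

printf '%s\n' "${stripped[@]}"

cd / && rm -rf "$tmp"
```

Reading line by line like this copes with spaces in path names, though not with newlines in them.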

awk :)
http://en.wikipedia.org/wiki/AWK
EDIT: Oh, and you can pipe commands together, so the output of the first command acts as the input for the second. For example, 'cat example.txt | less' will send the file through a pager.
