How do I exclude a character in Linux - linux

Write a wildcard to match all files (it does not matter which directory the files are in; only the wildcard is asked for) named according to the following rule: the name starts with the string “image”, immediately followed by a one-digit number (in the range 0-9), then a non-digit character plus anything else, and ends with either “.jpg” or “.png”. For example, image7.jpg and image0abc.png should be matched by your wildcard, while image2.txt and image11.png should not.
My folder contains these files: imag2gh.jpeg image11.png image1agb.jpg image1.png image2gh.jpg image2.txt image5.png image70.jpg image7bn.jpg Screenshot .png
If my command works, it should display only image1agb.jpg image1.png image2gh.jpg image5.png image70.jpg image7bn.jpg
This is the command I used (ls -ad image[0-9][^0-9]*{.jpg,.png}), but I'm only getting image1agb.jpg image2gh.jpg image7bn.jpg, so I'm missing image1.png and image5.png.
ls -ad image[0-9][!0-9]*{.jpg,.png}

Info
Character ranges such as [0-9] and negations such as [!0-9] do work in shell globs (wildcards). The command misses image1.png and image5.png because [!0-9] must consume one non-digit character before *{.jpg,.png} can match, so a name in which the extension follows the digit directly never matches.
Possible solution
Pipe the output of ls -a1 to the standard input of grep, which supports regular expressions, and let an anchored regular expression filter the file names:
ls -a1 | grep -E '^image[0-9]([^0-9].*)?\.(jpg|png)$'
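The same filter can also be sketched with globs alone. The following is a minimal example, assuming bash (for brace expansion, arrays and nullglob); the temporary directory and its files are test fixtures copied from the question. It follows the stated rule literally, so image70.jpg is not matched.

```shell
#!/usr/bin/env bash
# Sketch: "image" + one digit + optional (non-digit + anything) + .jpg or .png,
# written with globs only. The directory below is a throwaway test fixture.
shopt -s nullglob            # patterns without a match expand to nothing

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch imag2gh.jpeg image11.png image1agb.jpg image1.png image2gh.jpg \
      image2.txt image5.png image70.jpg image7bn.jpg

# Brace expansion runs first and produces four separate glob patterns:
#   image[0-9].jpg  image[0-9].png  image[0-9][!0-9]*.jpg  image[0-9][!0-9]*.png
matches=( image[0-9]{,[!0-9]*}.{jpg,png} )

printf '%s\n' "${matches[@]}"
```

The empty alternative in {,[!0-9]*} is what lets the extension follow the digit directly. Without nullglob, a pattern that matches nothing would be passed to ls unchanged and produce a "No such file" error.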

Related

Pick the specific file in the folder

I want to pick files of a specific format from the list of files in a directory. Please see the example below.
I have the following list of files (6 files).
Set-1
1) MAG_L_NT_AA_SUM_2017_01_20.dat
2) MAG_L_NT_AA_2017_01_20.dat
Set-2
1) MAG_L_NT_BB_SUM_2017_01_20.dat
2) MAG_L_NT_BB_2017_01_20.dat
Set-3
1) MAG_L_NT_CC_SUM_2017_01_20.dat
2) MAG_L_NT_CC_2017_01_20.dat
From the above three sets I need only 3 files.
1) MAG_L_NT_AA_2017_01_20.dat
2) MAG_L_NT_BB_2017_01_20.dat
3) MAG_L_NT_CC_2017_01_20.dat
Note: The solution can span multiple lines of commands, because I have to create a script for the above requirement. Thanks.
Probably the easiest and least complex solution to your problem is to combine find (a tool for searching for files in a directory hierarchy) and grep (a tool for printing lines that match a pattern). You can also read those tools' manuals by typing man find and man grep.
Before going straight to the solution, we need to understand how we will approach your problem. To match a pattern against the name of a file, we will use the find command with the option -name:
-name pattern
Base of file name (the path with the leading directories removed) matches shell pattern pattern. The metacharacters ('*', '?', and '[]')
match a '.' at the start of the base name (this is a change in
findutils-4.2.2; see section STANDARDS CONFORMANCE below). To ignore a
directory and the files under it, use -prune; see an example in the
description of -path. Braces are not recognised as being special,
despite the fact that some shells including Bash imbue braces with a
special meaning in shell patterns. The filename matching is performed
with the use of the fnmatch(3) library function. Don't forget to
enclose the pattern in quotes in order to protect it from expansion by
the shell.
For instance, if we want to search for a file whose name contains the string 'abc' in a directory called 'words_directory', we will enter the following:
$ find words_directory -name "*abc*"
find already descends into subdirectories. To start the search from every non-hidden entry inside the directory instead, we can write:
$ find words_directory/* -name "*abc*"
So first, we will need to find all files whose names begin with "MAG_L_NT_" and end with ".dat". To find all matching names under /your/specified/path/, including any of its subdirectories:
$ find /your/specified/path/* -name "MAG_L_NT_*.dat"
This prints all matching file names, but the list still contains names with the "SUM" string; this is where grep comes in. To exclude names containing the unwanted string we will use the option -v:
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v is
specified by POSIX .)
To use grep to filter the first command's output we will use a pipe (|):
The standard shell syntax for pipelines is to list multiple commands,
separated by vertical bars ("pipes" in common Unix verbiage). For
example, to list files in the current directory (ls), retain only the
lines of ls output containing the string "key" (grep), and view the
result in a scrolling page (less), a user types the following into the
command line of a terminal:
ls -l | grep key | less
"ls -l" produces a process, the output (stdout) of which is piped to
the input (stdin) of the process for "grep key"; and likewise for the
process for "less". Each process takes input from the previous process
and produces output for the next process via standard streams. Each
"|" tells the shell to connect the standard output of the command on
the left to the standard input of the command on the right by an
inter-process communication mechanism called an (anonymous) pipe,
implemented in the operating system. Pipes are unidirectional; data
flows through the pipeline from left to right.
process1 | process2 | process3
Now that you are acquainted with the commands and options needed to achieve your goal, you are ready for the solution:
$ find /your/specified/path/* -name "MAG_L_NT_*.dat" | grep -v "SUM"
The find command outputs all names which begin with "MAG_L_NT_" and end with ".dat". grep -v takes that output as its input and removes all lines containing the "SUM" string.
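One caveat of the pipeline is that grep -v filters whole path lines, not just file names. A hedged sketch of the difference follows; the directory layout (including the SUM_archive name) is invented for illustration, and the ! -name test is an alternative that is not part of the answer above.

```shell
#!/usr/bin/env bash
# Throwaway tree standing in for /your/specified/path (hypothetical layout).
base=$(mktemp -d)
cd "$base" || exit 1
mkdir -p SUM_archive
touch MAG_L_NT_{AA,BB}_{SUM_,}2017_01_20.dat
touch SUM_archive/MAG_L_NT_CC_{SUM_,}2017_01_20.dat

# Variant 1: exclude inside find; -name tests only the base name of each file.
with_find=$(find . -type f -name 'MAG_L_NT_*.dat' ! -name '*SUM*' \
            | sed 's|.*/||' | sort)

# Variant 2: the pipeline from the answer; grep -v sees the whole path, so
# everything below SUM_archive/ is dropped too.
with_grep=$(find . -type f -name 'MAG_L_NT_*.dat' | grep -v 'SUM' \
            | sed 's|.*/||' | sort)

printf 'find with ! -name:\n%s\n' "$with_find"
printf 'find | grep -v:\n%s\n' "$with_grep"
```

If no directory name can ever contain "SUM", both variants return the same three files and the pipeline is perfectly adequate.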

What is file globbing?

I was just wondering what is file globbing? I have never heard of it before and I couldn't find a definition when I tried looking for it online.
Globbing is the * and ? and some other pattern matchers you may be familiar with.
Globbing interprets the standard wild card characters * and ?, character lists in square brackets, and certain other special characters (such as ^ for negating the sense of a match).
When the shell sees a glob, it will perform pathname expansion and replace the glob with matching filenames when it invokes the program.
For an example of the * operator, say you want to copy all files with a .jpg extension in the current directory to somewhere else:
cp *.jpg /some/other/location
Here *.jpg is a glob pattern that matches all files ending in .jpg in the current directory. It's equivalent to (and much easier than) listing the current directory and typing in each file you want manually:
$ ls
cat.jpg dog.jpg drawing.png recipes.txt zebra.jpg
$ cp cat.jpg dog.jpg zebra.jpg /some/other/location
Note that glob patterns may look similar to regular expressions, but they are not the same.
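To make that difference concrete, here is a small sketch that reads the same pattern text once as a glob and once as a regular expression. The pattern, the helper function names and the sample file names are all made up for this illustration.

```shell
#!/usr/bin/env bash
pattern='file?.txt'   # hypothetical pattern text, interpreted two ways

# As a glob: "file", exactly one arbitrary character, then ".txt".
glob_match() {
    case $1 in
        $pattern) echo yes ;;
        *)        echo no ;;
    esac
}

# As an extended regular expression: "fil", an optional "e",
# any one character, then "txt".
regex_match() {
    if printf '%s\n' "$1" | grep -Eq "^${pattern}\$"; then
        echo yes
    else
        echo no
    fi
}

for name in file1.txt filxtxt; do
    printf '%s: glob=%s regex=%s\n' "$name" \
        "$(glob_match "$name")" "$(regex_match "$name")"
done
```

The glob accepts file1.txt and rejects filxtxt, while the regular expression does the opposite, because ? and . carry different meanings in the two notations.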

Understanding sed

I am trying to understand how
sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/' /var/log/boot
worked and what the pieces mean. The man page I read just confused me more, and I tried info as it said, but had no idea how to work it! I'm pretty new to Linux. Debian is my first distro, but it seemed like a rather logical place to start, as it is the root of many others and has been around a while, so it is probably doing things well and is fairly standardized. I am running Wheezy 64-bit, FYI, if needed.
The sed command is a stream editor, reading its file (or STDIN) for input, applying commands to the input, and presenting the results (if any) to the output (STDOUT).
The general syntax for sed is
sed [OPTIONS] COMMAND FILE
In the shell command you gave:
sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/' /var/log/boot
the sed command is s/\^\[/\o33/g;s/\[1G\[/\[27G\[/ and /var/log/boot is the file.
The given sed command is actually two separate commands:
1) s/\^\[/\o33/g
2) s/\[1G\[/\[27G\[/
The intent of #1, the s (substitute) command, is to replace all occurrences of the two literal characters '^[' with the character whose octal value is 033 (the ESC character). The \o33 in the replacement is GNU sed's escape for an octal character code, so on GNU sed (the default sed on Debian) the command works as written; other sed implementations may not understand \o33.
Notice the trailing g after the replacement string? It means to perform a global replacement; without it, only the first occurrence on each line would be changed.
The purpose of #2 is to replace the literal string [1G[ with [27G[ (the backslashes only stop sed from treating [ as the start of a bracket expression). This command has no trailing g, so it changes only the first occurrence on each line. If every occurrence should be replaced, it needs to be written like this:
s/\[1G\[/\[27G\[/g
Finally, putting all this together, the two sed commands are applied to each line of /var/log/boot, and the output has every ^[ converted into the ESC character (octal 033) and the string [1G[ converted to [27G[.
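The effect can be checked on a single line without touching /var/log/boot. This is a minimal sketch, assuming GNU sed (for the \o33 escape); the sample log line is invented, and cat -v is used only to make the resulting ESC byte visible.

```shell
#!/usr/bin/env bash
# Hypothetical log line: the two literal characters "^[" stand in for ESC.
sample='^[[1G[ ok ] Starting service'

# The same two substitutions as in the question.
# GNU sed reads \o33 as the character with octal code 033 (ESC).
converted=$(printf '%s\n' "$sample" | sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/')

# cat -v shows a real ESC byte as the two visible characters "^[".
visible=$(printf '%s\n' "$converted" | cat -v)
printf '%s\n' "$visible"
```

The visible form looks almost like the input, but the first two characters are now one real ESC byte, so a terminal would interpret the following [27G as a "move to column 27" sequence instead of printing it.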

Find all PHP files in the current folder that contain a string

How could I show names of all PHP files in the current folder that contain the string "Form.new" in a Linux system?
I have tried grep "Form.new" .
You need to search recursively or use * instead of ., depending on whether you want to search only the files directly inside that directory or also those at deeper levels. So:
grep -r "Form\.new" .
or
grep "Form\.new" *
Assuming that your PHP files have a .php extension, the following will do the trick:
grep "Form\.new" *.php
As @LaughDonor mentioned, it's good practice to escape the dot; otherwise, the dot is interpreted as “any character” by grep, so "Form.new" also matches "Form_new", "Form-new", "Form:new", "FormAnew", etc.
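Since the question asks for file names rather than matching lines, the -l option of grep is worth adding. A small sketch follows, with made-up file names and contents, comparing the escaped, unescaped and fixed-string (-F) forms:

```shell
#!/usr/bin/env bash
# Throwaway fixture; the file names and contents are invented for the demo.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '%s\n' '<?php $f = Form.new();' > a.php
printf '%s\n' '<?php $f = Form_new();' > b.php
printf '%s\n' 'Form.new'               > notes.txt

# -l prints only the names of files that contain a match.
escaped=$(grep -l 'Form\.new' *.php)      # literal dot
unescaped=$(grep -l 'Form.new' *.php)     # dot matches any character
fixed=$(grep -lF 'Form.new' *.php)        # -F: treat the pattern as plain text

printf 'escaped:   %s\n' $escaped
printf 'unescaped: %s\n' $unescaped
printf 'fixed:     %s\n' $fixed
```

Only the unescaped pattern also reports b.php; notes.txt is never searched because the *.php glob does not expand to it.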

Using wildcards to exclude files with a certain suffix

I am experimenting with wildcards in bash and tried to list all the files that start with "xyz" but do not end with ".TXT", but I am getting incorrect results.
Here is the command that I tried:
$ ls -l xyz*[!\.TXT]
It is not listing the files with names "xyz" and "xyzTXT" that I have in my directory. However, it lists "xyz1", "xyz123".
It seems like adding [!\.TXT] after "xyz*" made the shell look for something that start with "xyz" and has at least one character after it.
Any ideas why this is happening and how to correct the command? I know it can be achieved using other commands, but I am especially interested in knowing why it is failing and whether it can be done using just wildcards.
These commands will do what you want
shopt -s extglob
ls -l xyz!(*.TXT)
shopt -u extglob
The reason why your command doesn't work is that xyz*[!\.TXT], which is equivalent to xyz*[!\.TX], means xyz followed by any sequence of characters (*) and finally exactly one character that is not in the set {., T, X} (the leading ! negates the set, and the backslash only quotes the dot). That final character is mandatory, so 'xyz' alone cannot match, and 'xyzTXT' is rejected because it ends in T, while 'xyz1' and 'xyz123' match.
EDIT: the part matched by * may contain any characters; only the last character is restricted.
I don't think this is doable with only wildcards.
Your command isn't working because it means:
Match everything that starts with xyz, followed by whatever you want, and that does not end with any of these characters: ., T and X. The second T doesn't count, because what you have inside [] is read as a set of characters and not as a string, as you thought.
You don't need to escape the . either, as it has no special meaning inside a wildcard.
At least, this is my knowledge of wildcards.
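Both behaviours can be reproduced side by side. The following sketch assumes bash with the extglob option available; the temporary directory and the two extra *.TXT file names are invented so that the exclusion has something to exclude.

```shell
#!/usr/bin/env bash
# extglob must be enabled before the line that uses !( ... ) is parsed.
shopt -s extglob nullglob

demo=$(mktemp -d)
cd "$demo" || exit 1
touch xyz xyz1 xyz123 xyzTXT xyz.TXT xyzreport.TXT

# Original attempt: one extra, mandatory character that is not ".", "T" or "X".
attempt=( xyz*[!\.TXT] )

# extglob: "xyz" followed by any remainder that does not match *.TXT.
fixed=( xyz!(*.TXT) )

printf 'attempt: %s\n' "${attempt[*]}"
printf 'fixed:   %s\n' "${fixed[*]}"
```

The first pattern finds only xyz1 and xyz123, which is the result described in the question, while the extglob pattern also keeps xyz and xyzTXT and drops only the two names ending in .TXT.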
