Complex shell wildcard - linux

I want to use echo to display (not their contents) directories whose names are at least 2 characters long but don't begin with "an".
For example, if I had the following in the directory:
a as an23 an23 blue
I would only get
as blue back
I tried echo ^an* but that returns the directory with 1 character too.
Is there any way I can do this in the form of echo globpattern?

You can use the shell's extended globbing feature. In bash:
bash$ shopt -s extglob
bash$ echo !(@(?|an*))
The !() construct inverts the pattern inside it; see the Pattern Matching section of the bash manual for more.
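For example, with the sample names from the question in an otherwise empty directory, the expansion would look like this (a quick check):
bash$ shopt -s extglob
bash$ echo !(@(?|an*))
as blue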
In zsh:
zsh$ setopt extendedglob
zsh$ print *~(?|an*)
Here the ~ operator excludes anything matching the pattern after the tilde from the matches of the pattern before it. See the zsh manual for more.

Since you want at least two characters in the names, you can use printf '%s\n' ??* to echo each such name on a separate line. You can then eliminate those names that start with an with grep -v '^an', leading to:
printf '%s\n' ??* | grep -v '^an'
The quotes aren't strictly necessary in the grep command with modern shells. Once upon a quarter of a century or so ago, the Bourne shell had ^ as a synonym for | so I still use quotes around carets.
If you absolutely must use echo instead of printf, then you'll have to map white space to newlines (assuming you don't have any names that contain white space).
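For example, with the sample names from the first question (a sketch under that same no-whitespace assumption):
$ echo ??* | tr ' ' '\n' | grep -v '^an'
as
blue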
I'm trying with just the echo command, no grep either?
What about:
echo [!a]?* a[!n]*
The first term lists all the two-plus character names not beginning with a; the second lists all the two-plus character names where the first is a and the second is not n.
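With the question's sample names that expands to the following (each glob keeps its own sorted order, and a glob that matches nothing is left as literal text):
$ echo [!a]?* a[!n]*
blue as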

This should do it, but you'd likely be better off with ls or even find:
echo * | tr ' ' '\012' | egrep '..' | egrep -v '^an'
Shell globbing is a pattern-matching notation related to regexes, but it is not as powerful as egrep's regexes.

Related

Problem with using grep to match the whole word

I am trying to match a whole string in a list of newline-separated strings. Here is my example:
[hemanth.a@gateway ~]$ echo $snapshottableDirs
/user/hemanth.a/dummy1 /user/hemanth.a/dummy3
[hemanth.a@gateway ~]$ echo $snapshottableDirs | tr -s ' ' '\n'
/user/hemanth.a/dummy1
/user/hemanth.a/dummy3
[hemanth.a@gateway ~]$ echo $snapshottableDirs | tr -s ' ' '\n' | grep -w '/user/hemanth.a'
/user/hemanth.a/dummy1
/user/hemanth.a/dummy3
My aim is to only find a match if and only if the string /user/hemanth.a exists as a whole word (on its own line) in the list of strings. But the above command is also returning strings that contain /user/hemanth.a.
This is a sample scenario. There is no guarantee that all the strings that I would want to match will be in the form of /user/xxxxxx.x. Ideally I would want to match the exact string if it exists in a new line as a whole word in the list.
Any help would be appreciated. Thank you.
Update: Using fgrep -x '/user/hemanth.a' is probably a better solution here, as it avoids having to escape characters such as $ to prevent grep from interpreting them as meta-characters. fgrep performs a literal string match as opposed to a regular expression match, and the -x option tells it to only match whole lines.
Example:
> cat testfile.txt
foo
foobar
barfoo
barfoobaz
> fgrep foo testfile.txt
foo
foobar
barfoo
barfoobaz
> fgrep -x foo testfile.txt
foo
Original answer:
Try adding the $ regex metacharacter to the end of your grep expression, as in:
echo $snapshottableDirs | tr -s ' ' '\n' | grep -w '/user/hemanth.a$'.
The $ metacharacter matches the end of the line.
While you're at it, you might also want to use the ^ metacharacter, which matches the beginning of the line, so that grep '/user/hemanth.a$' doesn't accidentally also match something like /user/foo/user/hemanth.a.
So you'd have this:
echo $snapshottableDirs | tr -s ' ' '\n' | grep '^/user/hemanth\.a$'.
Edit: You probably don't actually want the -w here, so I've removed that from my answer.
Edit 2: @U. Windl brings up a good point. The . character in a regular expression is a metacharacter that matches any character, so grep /user/hemanth.a might end up matching things you're not expecting, such as /user/hemanthxa, etc. Or perhaps more likely, it would also match the line /user/hemanth/a. To fix that, you need to escape the . character. I've updated the grep line above to reflect this.
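To see why the escaping matters, compare the two patterns against a line that has a / where the . is:
> echo '/user/hemanth/a' | grep '^/user/hemanth.a$'
/user/hemanth/a
> echo '/user/hemanth/a' | grep '^/user/hemanth\.a$'    # prints nothing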
Update: In response to your question in the comments about how to escape a string so that it can be used in a grep regular expression...
Yes, you can escape a string so that it should be able to be used in a regular expression. I'll explain how to do so, but first I should say that attempting to escape strings for use in a regex can become very complicated with lots of weird edge cases. For example, an escaped string that works with grep won't necessarily work with sed, awk, perl, bash's =~ operator, or even grep -e.
On top of that, if you change from single quotes to double quotes, you might then have to add another level of escaping so that bash will expand your string properly.
For example, if you wanted to search for the literal string 'foo [bar]* baz$' using grep, you'd have to escape the [, *, and $ characters, resulting in the regular expression:
'foo \[bar]\* baz\$'
But if for some reason you decided to pass that expression to grep as a double-quoted string, you would then have to escape the escapes. Otherwise, bash would interpret some of them as escapes. You can see this if you do:
echo "foo \[bar]\* baz\$"
foo \[bar]\* baz$
You can see that bash interpreted \$ as an escape sequence representing the character $, and thus swallowed the \ character. This is because normally, in double quoted strings $ is a special character that begins a parameter expansion. But it left \[ and \* alone because [ and * aren't special inside a double-quoted string, so it interpreted the backslashes as literal \ characters. To get this expression to work as an argument to grep in a double-quoted string, then, you would have to escape the last backslash:
# This command prints nothing, because bash expands `\$` to just `$`,
# which grep then interprets as an end-of-line anchor.
> echo 'foo [bar]* baz$' | grep "foo \[bar]\* baz\$"
# Escaping the last backslash causes bash to expand `\\$` to `\$`,
# which grep then interprets as matching a literal $ character
> echo 'foo [bar]* baz$' | grep "foo \[bar]\* baz\\$"
foo [bar]* baz$
But note that "foo \[bar]\* baz\\$" will not necessarily work with sed or awk, because each tool's regex dialect treats some escape sequences differently, so an expression escaped for grep may need different escaping elsewhere.
So again, yes, you can escape a literal string for use as a grep regular expression. But if you need to match literal strings containing characters that will need to be escaped, it turns out there's a better way: fgrep.
The fgrep command is really just shorthand for grep -F, where the -F tells grep to match "fixed strings" instead of regular expression. For example:
> echo '[(*\^]$' | fgrep '[(*\^]$'
[(*\^]$
This works because fgrep doesn't know or care about regular expressions. It's just looking for the exact literal string '[(*\^]$'. However, this sort of puts you back at square one, because fgrep will match on substrings:
> echo '/users/hemanth/dummy' | fgrep '/users/hemanth'
/users/hemanth/dummy
Thankfully, there's a way around this, which it turns out was probably a better approach than my initial answer, considering your specific needs. The -x option to fgrep tells it to only match the entire line. Note that -x is not specific to fgrep (since fgrep is really just grep -F anyway). For example:
> echo '/users/hemanth/dummy' | fgrep -x '/users/hemanth' # prints nothing
This is equivalent to what you would have gotten by escaping the grep regex, and is almost certainly a better answer than my previous answer of enclosing your regex in ^ and $.
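And the matching case, where the whole line is exactly the fixed string:
> echo '/users/hemanth' | fgrep -x '/users/hemanth'
/users/hemanth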
Now, as promised, just in case you want to go this route, here's how you would escape a fixed string to use as a grep regex:
# Suppose we want to match the literal string '^foo.\ [bar]* baz$'
# It contains lots of stuff that grep would normally interpret as
# regular expression meta-characters. We need to escape those characters
# so grep will interpret them as literals.
> str='^foo.\ [bar]* baz$'
> echo "$str"
^foo.\ [bar]* baz$
> regex=$(sed -E 's,[.*^$\\[],\\&,g' <<< "$str")
> echo "$regex"
\^foo\.\\ \[bar]\* baz\$
> echo "$str" | grep "$regex"
^foo.\ [bar]* baz$
# Success
Again, for the reasons cited above, I don't recommend this approach, especially not when fgrep -x exists.
Read "Anchoring" in man grep:
Anchoring
The caret ^ and the dollar sign $ are meta-characters that respectively
match the empty string at the beginning and end of a line.
Also be aware that . matches any character (from said manual page):
The period . matches any single character.
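Putting the two anchors together with the escaped dot, for example:
$ printf '%s\n' /user/hemanth.a /user/hemanth.a/dummy1 | grep '^/user/hemanth\.a$'
/user/hemanth.a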

Bash Script - Nested $(..) Commands - Not working correctly

I was trying to do these few operations/commands on a single line and assign it to a variable. I have it working about 90% of the way except for one part of it.
I was unaware you could do this, but I read that you can nest $(..) inside other $(..).... So I was trying to do that to get this working, but can't seem to get it the rest of the way.
So basically, what I want to do is:
1. Take the output of a file and assign it to a variable
2. Then pre-pend some text to the start of that output
3. Then append some text to the end of the output
4. And finally remove newlines and replace them with "\n" character...
I can do this just fine in multiple steps but I would like to try and get this working this way.
So far I have tried the following:
My 1st attempt, before reading about nested $(..):
MY_VAR=$(echo -n "<pre style=\"display:inline;\">"; cat input.txt | sed ':a;N;$!ba;s/\n/\\n/g'; echo -n "</pre>")
This one worked 99% of the way except there was a newline being added between the cat command's output and the last echo command. I'm guessing this is from the cat command since sed removed all newlines except for that one, maybe...?
Other tries:
MY_VAR=$( $(echo -n "<pre style=\"display:inline;\">"; cat input.txt; echo -n "</pre>") | sed ':a;N;$!ba;s/\n/\\n/g')
MY_VAR="$( echo $(echo -n "<pre style=\"display:inline;\">"; cat input.txt; echo "</pre>") | sed ':a;N;$!ba;s/\n/\\n/g' )"
MY_VAR="$( echo "$(echo -n "<pre style=\"display:inline;\">"; cat input.txt; echo "</pre>")" | sed ':a;N;$!ba;s/\n/\\n/g' )"
*Most of these others were tried with and without the extra double-quotes surrounding the different $(..) parts...
I had a few other attempts, but they didn't have any luck either... On a few of the other attempts above, it seemed to work except sed was NOT inserting the replacement part of it. The output was correct for the most part, except instead of seeing "\n" between lines it just showed each of the lines smashed together into one line without anything to separate them...
I'm thinking there is something small I am missing here if anyone has any idea..?
*P.S. Does Bash have a name for the $(..) structure? It's hard trying to Google for that since it doesn't really search symbols...
You have no need to nest command substitutions here.
your_var='<pre style="display:inline;">'"$(<input.txt)"'</pre>'
your_var=${your_var//$'\n'/'\n'}
"$(<input.txt)" expands to the contents of input.txt, but without any trailing newline. (Command substitution always strips trailing newlines; printf '%s' "$(cat ...)" has the same effect, albeit less efficiently as it requires a subshell, whereas cat ... alone does not).
${foo//bar/baz} expands to the contents of the shell variable named foo, with all instances of bar replaced with baz.
$'\n' is bash syntax for a literal newline.
'\n' is bash syntax for a two-character string, beginning with a backslash.
Thus, tying all this together, it first generates a single string with the prefix, the contents of the file, and the suffix; then replaces literal newlines inside that combined string with '\n' two-character sequences.
Granted, this is multiple lines as implemented above -- but it's also much faster and more efficient than anything involving a command substitution.
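As a quick sanity check, with a hypothetical two-line input.txt containing foo and bar (reasonably recent bash assumed):
$ printf 'foo\nbar\n' > input.txt
$ your_var='<pre style="display:inline;">'"$(<input.txt)"'</pre>'
$ your_var=${your_var//$'\n'/'\n'}
$ echo "$your_var"
<pre style="display:inline;">foo\nbar</pre>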
However, if you really want a single, nested command substitution, you can do that:
your_var=$(printf '%s' '<pre style="display:inline;">' \
"$(sed '$ ! s/$/\\n/g' <input.txt | tr -d '\n')" \
'</pre>')
The printf %s combines its arguments without any delimiter between them
The sed operation adds a literal \n to the end of each line except the last
The tr -d '\n' operation removes literal newlines from the file
However, even this approach could be done more efficiently without the nesting:
printf -v your_var '%s' '<pre style="display:inline;">' \
"$(sed '$ ! s/$/\\n/g' <input.txt | tr -d '\n')" \
'</pre>'
...which has the printf assign its results directly to your_var, without any outer command substitution required (and thus saving the expense of the outer subshell).

Delete _ and - characters using sed

I am trying to convert 2015-06-03_18-05-30 to 20150603180530 using sed.
I have this:
$ var='2015-06-03_18-05-30'
$ echo $var | sed 's/\-\|\_//g'
$ echo $var | sed 's/-|_//g'
None of these are working. Why is the alternation not working?
As long as your script has a #!/bin/bash (or ksh, or zsh) shebang, don't use sed or tr: Your shell can do this built-in without the (comparatively large) overhead of launching any external tool:
var='2015-06-03_18-05-30'
echo "${var//[-_]/}"
That said, if you really want to use sed, the GNU extension -r enables ERE syntax:
$ sed -r -e 's/-|_//g' <<<'2015-06-03_18-05-30'
20150603180530
See http://www.regular-expressions.info/posix.html for a discussion of differences between BRE (default for sed) and ERE. That page notes, in discussing ERE extensions:
Alternation is supported through the usual vertical bar |.
If you want to work on POSIX platforms -- with /bin/sh rather than bash, and no GNU extensions -- then reformulate your regex to use a character class (and, to avoid platform-dependent compatibility issues with echo[1], use printf instead):
printf '%s\n' "$var" | sed 's/[-_]//g'
[1] - See the "APPLICATION USAGE" section of that link, in particular.
Something like this ought to do.
sed 's/[-_]//g'
This reads as:
s: Search
/[-_]/: for any single character matching - or _
//: replace it with nothing
g: and do that for every match on the line
Sed operates on every line by default, so this covers every instance in the file/string.
I know you asked for a solution using sed, but I offer an alternative in tr:
$ var='2015-06-03_18-05-30'
$ echo $var | tr -d '_-'
20150603180530
tr should be a little faster.
Explained:
tr stands for translate and it can be used to replace certain characters with others.
-d option stands for delete and it removes the specified characters instead of replacing them.
'_-' specifies the set of characters to be removed (you could also put the - first, but then you'd need to escape it or use --, because an argument starting with - would otherwise be taken as an option).
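For example, with -- ending option parsing so the leading - isn't mistaken for an option (same result either way):
$ echo $var | tr -d -- '-_'
20150603180530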
Easy:
sed 's/[-_]//g'
The character class [-_] matches any of the characters in the set.
Or delete everything that is not a digit:
sed 's/[^[:digit:]]//g' YourFile
Could you tell me what failed with echo $var | sed 's/\-\|\_//g'? It works here (even though escaping - and _ is not needed, and it assumes GNU sed, since \| only works in that enhanced version of sed).

How to replace special characters with another character using the shell

I have a string variable x=tmp/variable/custom-sqr-sample/test/example
in the script, what I want to do is to replace all the "-" with "/",
after that, I should get the following string:
x=tmp/variable/custom/sqr/sample/test/example
Can anyone help me?
I tried the following syntax, but it did not work:
exa=tmp/variable/custom-sqr-sample/test/example
exa=$(echo $exa|sed 's/-///g')
sed basically supports any delimiter, which comes in handy when one tries to match a /. The most common alternatives are |, @ and #; pick one that's not in the string you need to work on.
$ echo $x
tmp/variable/custom-sqr-sample/test/example
$ sed 's#-#/#g' <<< $x
tmp/variable/custom/sqr/sample/test/example
In the command you tried above, all you need to do is escape the slash, i.e.
echo $exa | sed 's/-/\//g'
but choosing a different delimiter is nicer.
The tr tool may be a better choice than sed in this case:
x=tmp/variable/custom-sqr-sample/test/example
echo "$x" | tr -- - /
(The -- isn't strictly necessary, but keeps tr (and humans) from mistaking - for an option.)
In bash, you can use parameter substitution:
$ exa=tmp/variable/custom-sqr-sample/test/example
$ exa=${exa//-/\/}
$ echo $exa
tmp/variable/custom/sqr/sample/test/example

Linux command line: split a string

I have a long file with the following list:
/drivers/isdn/hardware/eicon/message.c//add_b1()
/drivers/media/video/saa7134/saa7134-dvb.c//dvb_init()
/sound/pci/ac97/ac97_codec.c//snd_ac97_mixer_build()
/drivers/s390/char/tape_34xx.c//tape_34xx_unit_check()
(PROBLEM)/drivers/video/sis/init301.c//SiS_GetCRT2Data301()
/drivers/scsi/sg.c//sg_ioctl()
/fs/ntfs/file.c//ntfs_prepare_pages_for_non_resident_write()
/drivers/net/tg3.c//tg3_reset_hw()
/arch/cris/arch-v32/drivers/cryptocop.c//cryptocop_setup_dma_list()
/drivers/media/video/pvrusb2/pvrusb2-v4l2.c//pvr2_v4l2_do_ioctl()
/drivers/video/aty/atyfb_base.c//aty_init()
/block/compat_ioctl.c//compat_blkdev_driver_ioctl()
....
It contains all the functions in the kernel code. The notation is file//function.
I want to copy some 100 files from the kernel directory to another directory, so I want to strip the function name from every line, leaving just the filename.
It's super-easy in Python; any idea how to write a one-liner at the bash prompt that does the trick?
Thanks,
Udi
cat "func_list" | sed "s#//.*##" > "file_list"
Didn't run it :)
You can use pure Bash:
while read -r line; do echo "${line%//*}"; done < funclist.txt
Edit:
The syntax of the echo command is doing the same thing as the sed command in Eugene's answer: deleting the "//" and everything that comes after.
Broken down:
"echo ${line}" is the same as "echo $line"
the "%" deletes the pattern that follows it if it matches the trailing portion of the parameter
"%" makes the shortest possible match, "%%" makes the longest possible
"//*" is the pattern to match, "*" is similar to sed's ".*"
See the Parameter Expansion section of the Bash man page for more information, including:
using ${parameter#word} for matching the beginning of a parameter
${parameter/pattern/string} to do sed-style replacements
${parameter:offset:length} to retrieve substrings
etc.
Here's a one-liner in (g)awk:
awk -F"//" '{print $1}' file
Here's one using cut and rev
cat file | rev | cut -d'/' -f3- | rev
