Using the tr command to change a single word into uppercase - linux

I'm having difficulty interpreting the output of tr --help.
I know that
tr '[:lower:]' '[:upper:]' < inputfile
turns all the characters in the file into uppercase.
How do I turn a single word into uppercase?
I am not limited to using tr. I am looking for a way to scan a file (or input) for a set sequence of letters and, once it finds a match, turn those letters into uppercase.

Can sed solve your problem?
sed 's/sequence/SEQUENCE/g' < inputfile

If ghost is the word you are looking to upper-case, the following might do the trick with GNU sed. Here \< and \> represent word boundaries, \( and \) delineate the capture group, and \U upper-cases the captured group \1.
sed 's/\(\<ghost\>\)/\U\1/g'
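If GNU sed's \U is not available, tr can build the upper-case form first and plain sed can substitute it. This is only a sketch: upcase_seq is a made-up helper name, it assumes the sequence contains nothing but letters, and it does not check word boundaries.

```shell
# Upper-case every occurrence of a fixed letter sequence read from stdin.
# upcase_seq is a hypothetical helper; $1 must contain only letters.
upcase_seq() {
    upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    sed "s/$1/$upper/g"
}

printf 'a ghost and two ghosts\n' | upcase_seq ghost
```

Because no word boundary is checked, the sequence is also upper-cased inside longer words such as ghosts.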

Related

How to correctly detect and replace apostrophe (') with sed?

I have a directory with many files whose names contain special characters and spaces. I want to perform an operation on all these files, so I'm trying to store all the filenames in a list.txt and then run the command with this list.
The special characters in my list are & []'.
So basically I want to use sed to replace each occurrence with \ + the character in question.
E.g. : filename .txt => filename\ .txt etc...
The thing is I have trouble handling apostrophes.
Here is my command as of now :
ls | sed 's/\ /\\ /g' | sed 's/\&/\\&/g' | sed "s/\'/\\'/g" | sed 's/\[/\\[/g' | sed 's/\]/\\]/g'
At first I had issues with, I believe, the apostrophes in the string command in conflict with the apostrophes surrounding the string. So I used double quotes instead, but it still doesn't work.
I've tried all these and nothing worked :
sed "s/\'/\\'/g" (escaping the apostrophe)
sed "s/'/\'/g" (escaping nothing)
sed "s/'/\\'/g" (escaping the backslash)
sed 's/"'"/\"'"/g' (double quoting single quote)
As a disclaimer, I must say I'm completely new to sed. I just ran my first sed command today, so maybe I'm doing something wrong without realizing it.
PS: I've seen these threads, but no answer worked for me:
https://unix.stackexchange.com/questions/157076/how-to-remove-the-apostrophe-and-delete-the-space
How to replace to apostrophe ' inside a file using SED
This may do:
cat file
avbadf
test&rr
more [ yes
this ]
and'df
sed -r 's/(\x27|&|\[|\])/\\\1/g' file
avbadf
test\&rr
more \[ yes
this \]
and\'df
\x27 is equal to single quote '
\x22 is equal to double quote "
Whoops, I found the answer to my question. Here is the working input :
sed "s/'/\\\'/g"
This will effectively replace any ' with \'.
However I'm having trouble understanding exactly what's happening here.
So if I understand correctly, we are escaping the backslash and the apostrophe in the replacement string. Now, if somebody could answer some of these questions, I would be grateful:
Why don't we need to escape the first quote (the one in the pattern to find) ?
Why do we have to escape the backslash whereas for the other characters, there's no need ?
Why do we need to escape the second quote (the one in the replacement string) ?
I think all of your sed matches actually need that replacement pattern. This one seems to work for all examples:
ls | sed "s/\ /\\\ /g" | sed "s/\&/\\\&/g" | sed "s/\[/\\\[/g" | sed "s/\]/\\\]/g" | sed "s/'/\\\'/g"
So it is s/regex/replacement/flags, and 'regex' and 'replacement' have different sets of special characters.
The only one that's different is s/'/\\\'/g, and only on the regex side: ' has no special meaning in a regex, so it needs no escape there. On the replacement side the backslashes are consumed twice. Inside double quotes the shell turns \\ into \ and leaves \' alone, so sed receives \\' as the replacement, which it reads as a literal backslash followed by '. With "s/'/\\'/g" sed only receives \', which is just a quote.
The same happens with \5, which is a backreference in the replacement expression, so to replace:
filename5.txt -> filename\5.txt
You would also need, as with the apostrophe:
sed "s/5/\\\5/g"
None of this is specific to sed's parsing; the shell simply processes the double-quoted string first, and sed then processes what is left.
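One way to see what each layer does is to print the double-quoted scripts before handing them to sed. A small sketch; the variable names three and two are only for illustration:

```shell
# What the shell hands to sed after double-quote processing.
three="s/'/\\\'/g"   # sed receives: s/'/\\'/g
two="s/'/\\'/g"      # sed receives: s/'/\'/g
printf '%s\n' "$three" "$two"

printf '%s\n' "and'df" | sed "$three"   # a backslash is inserted before the quote
printf '%s\n' "and'df" | sed "$two"     # the quote is replaced by itself
```

The first script reaches sed with two backslashes in the replacement, which sed reduces to one literal backslash. The second reaches it with one, which only escapes the quote.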
Please try the following:
sed 's/[][ &'\'']/\\&/g' file
Using the same example as @Jotne, the result will be:
avbadf
test\&rr
more\ \[\ yes
this\ \]
and\'df
[How it works]
The regex part in the sed s command above just defines a character
class of & []', which should be escaped with a backslash.
The right square bracket ] does not need escaping when put
immediately after the left square bracket [.
The confusing part is the handling of the single quote.
We cannot put a single quote within single quotes, even if we escape it.
The workaround is as follows: say we have an assignment str='aaabbb'.
To put a single quote between "aaa" and "bbb", we can write
str='aaa'\''bbb'.
It may look puzzling but it just concatenates the three sequences;
1) to close the single-quoted string as 'aaa'.
2) to put a single quote with an escaping backslash as \'.
3) to restart the single-quoted string as 'bbb'.
Hope this helps.
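The quoting trick and the character class can be tried in isolation. In this sketch escape_name is a made-up wrapper name and the sample filename is invented for the demonstration:

```shell
# Closing the quote, adding \' and reopening yields one word containing a quote.
str='aaa'\''bbb'
printf '%s\n' "$str"

# escape_name is a hypothetical wrapper around the sed command above.
escape_name() {
    sed 's/[][ &'\'']/\\&/g'
}
printf '%s\n' "it's [a] b&c" | escape_name
```

The first printf shows aaa'bbb. The second shows the sample name with a backslash in front of every space, bracket, ampersand and quote.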

How to cut characters from a string and put them at the end, in shell

I want to be able to do the following:
String1= "HELLO 3002_3322 3.2.1.log"
And get output like:
output = "3002_3322 3.2.1.log HELLO"
I know the command sed is able to do this but I need some guidance.
Thanks!
AWK
awk is one tool to do something like that:
echo "HELLO 3002_3322 3.2.1.log" | awk '{print $2" "$3" "$1}'
What it does:
awk, without the -F delimiter flag, splits on runs of whitespace
that means HELLO, 3002_3322 and 3.2.1.log are seen as three fields
HELLO is referred to by $1; 3002_3322 is $2 and so on
we print $2, one space, $3, one space, then $1
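If the number of words is not known in advance, the same field idea can be generalised by shifting every field one place to the left. A sketch only; first_to_end is a made-up name, and awk rejoins the fields with single spaces:

```shell
# first_to_end is a hypothetical helper: moves the first field of each line to the end.
first_to_end() {
    awk '{
        first = $1
        for (i = 1; i < NF; i++) $i = $(i + 1)   # shift fields left by one
        $NF = first                              # old first field goes last
        print
    }'
}

echo "HELLO 3002_3322 3.2.1.log" | first_to_end
```

Assigning to a field makes awk rebuild the line, so runs of several spaces in the input collapse to one.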
SED
I have an unpretty looking sed example for you:
echo "HELLO 3002_3322 3.2.1.log" | sed 's_\(.*\)\s\(.*\)\s\(.*\)_\2 \3 \1_'
What it does:
nomenclature is s_<pattern>_<replacement>_
first s stands for substitute
_ is the delimiter
(.*) is parenthesis dot star parenthesis. That is the first group of characters we are asking sed to match. .* means match any sequence of characters or no characters at all. Ignore the \ before ( and ) for now
Notice the \s after the group. In GNU sed, \s matches one whitespace character. So, we are asking sed to separate out (.*)\s - i.e. (group1) followed by a space
We repeat that to tell sed - (group1) (group2) (group3)
First group's shorthand is \1, group2's shorthand is \2 etc.
For replacement, we tell sed to arrange \2 (group2) first, then \3 (group3) and then \1 (group1), which gives 3002_3322 3.2.1.log HELLO
In sed's basic regular expressions a plain ( is a literal character, so the grouping parentheses have to be written with a backslash. So, (.*)\s(.*)\s(.*) becomes \(.*\)\s\(.*\)\s\(.*\). Oh so pretty!
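With extended regular expressions (sed -E) the groups need no backslashes, which makes the three-group idea easier to read. A sketch, assuming exactly three space-separated words per line; rotate3 is a made-up name:

```shell
# rotate3 is a hypothetical helper; it assumes exactly three space-separated words.
rotate3() {
    sed -E 's/^([^ ]+) ([^ ]+) ([^ ]+)$/\2 \3 \1/'
}

echo "HELLO 3002_3322 3.2.1.log" | rotate3
```

Using [^ ]+ instead of .* keeps each group from running across a space, so the split does not depend on backtracking.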
In sed you can do:
sed 's/\([^[:blank:]]*\)[[:blank:]]*\(.*\)/\2 \1/'
Which outputs 3002_3322 3.2.1.log HELLO.
Explanation
The first word is captured by
\([^[:blank:]]*\)
The \(\) means I want to capture this group to use later. [:blank:] is a POSIX character class that matches a space or a tab. You can see the other POSIX character classes here:
http://www.regular-expressions.info/posixbrackets.html
The outer [] means match any one of the characters, and the ^ means any character except those listed in the character class. Finally the * means any number of occurrences (including 0) of the previous item. So in total [^[:blank:]]* means match a run of characters that are not blanks, i.e. the first word. We have to use this somewhat complicated regex because POSIX sed only offers greedy matching, and to find the first word with .* we would need non-greedy matching.
[[:blank:]]*, as explained above, means match a run of consecutive blanks.
\(.*\) This means capture the rest of the line. The . means any single character, so combined with the * it means match the rest of the characters.
For the replacement, the \2 \1 means replace the pattern we matched with the 2nd capture group, a space, then the first capture group.
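The effect of greedy matching is easy to see by running both variants on the same line. A small sketch; the variable names are only for illustration:

```shell
line="HELLO 3002_3322 3.2.1.log"

# Greedy .* takes as much as it can, so group 1 swallows the first two words.
greedy=$(printf '%s\n' "$line" | sed 's/\(.*\) \(.*\)/\2 \1/')

# The negated class stops at the first blank, so group 1 is exactly one word.
first=$(printf '%s\n' "$line" | sed 's/\([^[:blank:]]*\)[[:blank:]]*\(.*\)/\2 \1/')

printf '%s\n' "$greedy" "$first"
```

The greedy version moves only the last word to the front, while the negated-class version moves the first word to the end as intended.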
This might work for you (GNU sed):
sed -r 's/^(\S+)(\s+)(.*)/\3\2\1/' file
Pattern match non-spaces, spaces and what is left and then use the remembered patterns (back references) in the replacement part of the substitution command.
N.B. The -r argument just removes the need for copious backslashes, so the same solution may be written as:
sed 's/^\(\S\S*\)\(\s\s*\)\(.*\)/\3\2\1/' file
This also removes the syntactic sugar of the metacharacter +, which means one or more of the preceding pattern.
Further note that \S and \s may be replaced by [^[:space:]] and [[:space:]] respectively, leading to:
sed 's/^\([^[:space:]][^[:space:]]*\)\([[:space:]][[:space:]]*\)\(.*\)/\3\2\1/' file
You can do this too (without awk or sed):
#!/bin/sh
String1="HELLO 3002_3322 3.2.1.log"
start="${String1%% *}"
end="${String1#* }"
output="$end $start"
echo "$output"
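For reference, the four prefix and suffix removal operators behave as follows on the same string. A short sketch; the variable names are only for illustration:

```shell
s="HELLO 3002_3322 3.2.1.log"

first=${s%% *}     # remove the longest suffix matching " *"  -> first word
but_last=${s% *}   # remove the shortest suffix matching " *" -> all but the last word
rest=${s#* }       # remove the shortest prefix matching "* " -> all but the first word
last=${s##* }      # remove the longest prefix matching "* "  -> last word

printf '%s\n' "$first" "$but_last" "$rest" "$last"
```

A doubled operator removes the longest match and a single one removes the shortest, which is why the script above uses %% for the first word and a single # for the rest.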
Or using cut (in Bash):
#!/bin/bash
String1="HELLO 3002_3322 3.2.1.log"
rstr="$(echo "$String1" |cut -d" " -f1)"
output="${String1/$rstr /} $rstr"
echo "$output"

fedora sed command replace special characters

I am totally new to sed, and as part of writing a script I am trying to replace a specific string in a file. I know that special characters need to be escaped with a backslash, but the problem is that if the special character comes first in the word, it is not replaced.
For example, my file contains
sldgfkls $bdxcv sldflksd
Now if I run the command below
sed -i 's/\b\$bdxcv\b/abcd/' filename
then the word is not replaced. But if the file contains
sldgfkls a$bdxcv sldflksd
and I run the command below
sed -i 's/\ba\$bdxcv\b/abcd/' filename
then the word is replaced.
Please help me here.
Clearly, \b does not consider a dollar sign to be a word character, so there is no word boundary for it to match between space and $.
Perhaps you want this instead:
sed -i 's/\(^\|[\t ]\)\$bdxcv\b/\1abcd/' filename
Assuming yours is GNU sed, see https://www.gnu.org/software/sed/manual/sed.html which contains this definition:
A “word” character is any letter or digit or the underscore character.
and thus not dollar sign.
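The difference can be checked without touching a file by piping the sample line through both patterns. A sketch that assumes GNU sed, since \b and \| are GNU extensions; the variable names are only for illustration:

```shell
line='sldgfkls $bdxcv sldflksd'

# No word boundary exists between the space and the $, so nothing is replaced.
with_b=$(printf '%s\n' "$line" | sed 's/\b\$bdxcv\b/abcd/')

# Anchoring on start-of-line or whitespace instead lets the match succeed.
anchored=$(printf '%s\n' "$line" | sed 's/\(^\|[[:space:]]\)\$bdxcv\b/\1abcd/')

printf '%s\n' "$with_b" "$anchored"
```

The first result is the unchanged line, and the second has $bdxcv replaced by abcd.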
sed cannot operate on strings, only on regular expressions. Trying to figure out which characters need to be escaped to disable their regexp (or sed delimiter or sed backreference) meaning, so that a regexp in sed behaves as if it were a string, is a fool's errand. Just use a tool that can operate on strings, e.g. awk.
$ awk '{for (i=1;i<=NF;i++) if ($i == "$bdxcv") $i="abcd"} 1' file
sldgfkls abcd sldflksd
The above uses string comparison and string assignment - no need to escape anything unless one of the strings contained the string delimiter, ".
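To avoid hard-coding the strings in the awk program, they can be passed through the environment, which awk reads verbatim via ENVIRON. A sketch only; replace_field and the variable names old and new are made up for the example:

```shell
# replace_field is a hypothetical helper: $1 = field to find, $2 = replacement.
# The values travel through the environment so awk never interprets them.
replace_field() {
    old=$1 new=$2 awk '{
        for (i = 1; i <= NF; i++)
            if (($i "") == (ENVIRON["old"] "")) $i = ENVIRON["new"]   # "" forces string comparison
        print
    }'
}

echo 'sldgfkls $bdxcv sldflksd' | replace_field '$bdxcv' 'abcd'
```

ENVIRON is used rather than -v because -v interprets backslash escapes in the value.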

How can I use sed to get an xml value

How can I use sed to get the SOMETHING in <version.suffix>SOMETHING</version.suffix>?
I tried sed 's#.*>\(.*\)\<version\.suffix\>#\1#', but it fails.
Try this one:
sed 's/<.*>\(.*\)<.*>/\1/'
It should be general enough to get the value of any element that sits alone on its line.
If you need to eliminate the indentation add \s* at the beginning like this:
sed 's/\s*<.*>\(.*\)<.*>/\1/'
Alternatively if you only want version.suffix's value, you can make the command more specific like this:
sed 's/<version\.suffix>\(.*\)<.*>/\1/'
You could use the below sed command,
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#^<[^>]*>\(.*\)<\/[^>]*>$#\1#'
SOMETHING
^<[^>]*> Matches the first tag string <version.suffix>.
\(.*\)<\/[^>]*>$ Characters up to the next closing tag are captured, and the remaining closing tag is matched by the <\/[^>]*> regex.
Finally all the matched characters are replaced by the characters which are present inside the group index 1.
Your regex is nearly correct; the only problem is that you forgot the / in the closing tag and wrote \< and \> where plain < and > were needed:
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#.*>\(.*\)</version\.suffix>#\1#'
SOMETHING
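When the element sits inside a larger file, sed -n with the p flag prints only the lines where the substitution matched. A sketch; xml_value is a made-up name, it assumes one element per line, and the tag name is used as a regex, so a . in it matches any character:

```shell
# xml_value is a hypothetical helper; $1 is the tag name (treated as a regex).
# Assumes the whole element is on a single line.
xml_value() {
    sed -n "s#.*<$1>\(.*\)</$1>.*#\1#p"
}

printf '%s\n' '<project>' '  <version.suffix>SOMETHING</version.suffix>' '</project>' |
    xml_value version.suffix
```

For anything beyond simple one-line elements, a real XML parser is the safer choice.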
Many ways possible, e.g:
with sed
echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#<[^>]*>##g'
or grep
echo '<version.suffix>SOMETHING</version.suffix>' | grep -oP '<version.suffix>\KSOMETHING(?=</version.suffix>)'
Assuming the formatting of the question is accurate, when I run the example in the question as-is:
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#.*>\(.*\)\<version\.suffix\>#\1#'
I see the following output:
SOMETHING</>
In case my formatting skills fail me, this output ends with the trailing left angle bracket, a forward slash, and finally the right angle bracket.
So, why this "failure"? Well, on my system (Linux with GNU grep 2.14), grep(1) includes the following snippet:
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end of a word.
Other answers suggest good alternatives to extract the value in XML tag syntax; use them.
I just wanted to point out why the RE in the original problem fails on current Linux systems: some symbols match no actual characters, but instead match empty boundaries in tools such as GNU sed and GNU grep, which support \< and \> as extensions. So, in this example, the brackets in the source are matched in unexpected ways:
the \(.*\) has matched SOMETHING</, to be printed by the \1 back-reference
the left-hand side of version.suffix is matched by \<
version.suffix is matched by version\.suffix
the right-hand side of version.suffix is matched by \>
the trailing > character remains in sed's pattern space and is printed.
TL;DR -"\X" does not mean "just match an X" for all X!
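The boundary behaviour can be confirmed with grep -c, which counts matching lines. A sketch assuming GNU grep; the variable names are only for illustration:

```shell
# \< and \> match empty word boundaries, not angle brackets.
n1=$(printf '%s\n' '<tag>' | grep -c '\<tag\>' || true)   # the word tag is found inside the brackets
n2=$(printf '%s\n' 'tag'   | grep -c '\<tag\>' || true)   # no literal < or > is needed to match
n3=$(printf '%s\n' 'tag'   | grep -c '<tag>'   || true)   # unescaped brackets are literal characters

printf '%s %s %s\n' "$n1" "$n2" "$n3"
```

The || true is only there because grep exits non-zero when it finds no match.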

grep is not working as expected

I have tried to grep in a file.
The file has 5 entries:
vivek
vivek.a
a.vivek
vivek_a
a_vivek
When I run grep -iw vivek filename, I expect it to give me
vivek only, but it gives
vivek
vivek.a
a.vivek
Looks fine to me. . is a non-word character. If you meant something else then you should have used a more-specific regex instead of using -w.
It does that because the definition of a word (which is what the w option chooses) permits . to separate words, though _ is considered part of a word. This definition is useful for programming languages, but not so useful for English text.
A set of characters with letters, underscore and digits is considered as a word. So any other character apart from that set denotes the word boundary. Therefore, in the line "vivek.a", the dot denotes end of word, and all the characters before that form a word "vivek", which matches with the word you are trying to match using option -w.
So, one way is to define your own word boundaries like this:
$ grep -i -e "[[:space:]]vivek[[:space:]]" -e "^vivek[[:space:]]" -e "[[:space:]]vivek$" -e "^vivek$" file
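The four -e patterns can be folded into one extended regex with alternation. A sketch; whole_token is a made-up name, and it assumes the word contains no regex metacharacters:

```shell
# whole_token is a hypothetical helper; $1 must not contain regex metacharacters.
# A token counts as whole when it is bounded by whitespace or the line ends.
whole_token() {
    grep -iE "(^|[[:space:]])$1([[:space:]]|\$)"
}

printf '%s\n' vivek vivek.a a.vivek vivek_a a_vivek | whole_token vivek
```

Only the line containing vivek on its own is printed, since . and _ are neither whitespace nor a line end.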
