Embedding quotation marks in command string generated by AWK? - linux

I need to match all instances of strings in one file, with a master list in another. However, if my string is abc I want only that, not abcdef, abc1234 and so on.
So, a word boundary for the regex? Right now, I'm using a simple awk one liner:
cat results_file| sort -k 1| awk -F" " '{ print $1" /home/owner/file_2_search"}'|
xargs -L 1 /bin/grep -i
However, to force a word boundary, I'd need to grep string\b and the quotes (single or double) seem to be required.
In awk, \b is a special character, you need \\b ... And the quoted quotes ... (arg) ... Or am I missing something and overdoing this?
This is a Linux box, so presumably gawk. I have gone over quoting rules for awk, and realize this has got to be simple (and not complex ... but), but am not seeing it.

Had meant to post as an answer, not a comment. Will try to pose a more readable question, but confess to having second thoughts about doing this as a one-liner in the first place -- may be best to follow an alternate method. Appreciate the willingness to help.
--Joe
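One alternate method, sketched here under assumptions rather than as the asker's final solution: grep's -w option already enforces whole-word matches, so no \b (and no embedded quoting) has to survive the trip through awk and xargs. The function name match_whole_words and the two file arguments are hypothetical; -F treats the strings as literal text, and "-f -" (as supported by GNU grep) reads the pattern list from standard input.

```shell
# Hypothetical helper: print the lines of MASTER that contain, as a whole
# word, any first-column string from RESULTS (case-insensitive, literal).
match_whole_words() {
  results=$1
  master=$2
  # NF skips blank lines, which would otherwise become match-everything patterns.
  awk 'NF { print $1 }' "$results" | sort -u |
    grep -w -i -F -f - "$master"
}

# Usage (paths taken from the question):
#   match_whole_words results_file /home/owner/file_2_search
```

Unlike the xargs version, this runs grep once over the master list, so the output is in master-file order and does not show which search string matched.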

Related

How to capitalize and replace characters in shell script in one echo

I am trying to find a way to capitalize and replace dashes of a string in one echo. I do not have the ability to use multiple lines for reassigning the string value.
For example:
string='test-e2e-uber' needs to echo $string as TEST_E2E_UBER
I currently can do one or the other by utilizing
${string^^} for capitalization
${string//-/_} for replacement
However, when I try to combine them it does not appear to work (bad substitution error).
Is there a correct syntax to achieve this?
echo ${string^^//-/_}
This does not directly answer your question, but the following script achieves what you want:
declare -u string='test-e2e-uber'
echo ${string//-/_}
You can do that directly with the 'tr' command, in just one 'echo'
echo "$string" | tr "-" "_" | tr "[:lower:]" "[:upper:]"
TEST_E2E_UBER
I used two 'tr' calls joined by a pipe for readability; a single call can also do both conversions if the hyphen and underscore are added to the sets, e.g. tr 'a-z-' 'A-Z_'
or you could do something similar with 'awk'
echo "$string" | awk '{gsub("-","_",$0)} {print toupper($0)}'
TEST_E2E_UBER
In this case, I'm replacing the hyphens with 'gsub', then printing the whole record in uppercase.
Why do you dislike it so much to have two successive assignment statements? If you really hate it, you will have to revert to some external program to do the task for you, such as
string=$(tr a-z- A-Z_ <<<$string)
but I would consider it a waste of resources to create a child process for such a simple operation.
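Bash does not allow nesting two parameter expansions such as ^^ and //, which is why the combined form fails. If the call site really must stay a single echo, one option is to hide the conversion in a small function. This is only a sketch: the name to_const_name is hypothetical, and it uses tr (one child process) so that it also works in shells without the ^^ expansion.

```shell
# Hypothetical helper: uppercase the argument and turn '-' into '_'.
# The trailing '-' in the first set is a literal hyphen, paired with '_'.
to_const_name() {
  printf '%s\n' "$1" | tr 'a-z-' 'A-Z_'
}

string='test-e2e-uber'
echo "$(to_const_name "$string")"
```

With the sample value this prints TEST_E2E_UBER.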

I am trying to replace a text for example

Example:
"word" -nothing
To
word" - nothing
in gvim.
I tried
:%s/^.*\"/
But what I get is: -nothing
Well I am new to scripting so I would like to know if it can be done in any other way like using gvim or awk or sed.
In vim... Check for \(word + quote + space + hyphen\) as first reference, followed directly by another \(word\) as second reference... replace by first reference + space + second reference... Make sure the find/replace can happen multiple times on a line with g suffix.
:%s/\(\w" -\)\(\w\)/\1 \2/g
Note that I left out the leading quote... I suppose it is possible you might have spaces in the quoted text - and I think this form might be better for you. Now in sed, that is the really cool thing about the relationship between *nix tools - they all use similar (or the same) regular expressions pattern language. So, the same exact pattern as above can be done in sed (using : as delimiter for clarity).
sed 's:\(\w" -\)\(\w\):\1 \2:g'
POSIX awk does not support back references in the replacement text (GNU awk offers gensub() for that); so, not to say it can't be done, but it is not so convenient.
Could you please try the following and let me know if this helps you.
awk '{sub(/^"/,"");sub(/-/,"- ")} 1' Input_file
Second solution, with sed:
sed 's/^"//;s/-/- /' Input_file
Since you also tagged grep: GNU grep has the -P switch for PCRE (Perl compatible reg ex) which has \K: Keep the stuff left of the \K, don't include it in $&, so:
$ echo \"word\" | grep -oP "\"\Kword\""
word"
If I understand your question correctly you want to replace first " in each line with empty string. So in sed it is just:
sed 's/"//'
Without g flag it will replace only first occurrence in each line.
EDIT:
The same way it will work in Vim (unless you have 'gdefault' option set), so in Vim you can:
:%s/"//
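Try this (in Vim's default "magic" mode the capture group must be written \( \); unescaped parentheses match literally):
:%s/"\(.*\)"/\1"/gc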
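For reference, here is a sed sketch that produces exactly the output shown in the question for the sample line. It assumes the leading quote is the first character of the line and that the hyphen is preceded by a space; the function name fix_quote_line is hypothetical.

```shell
# Hypothetical helper: drop a leading double quote and insert a space
# after the first " -" that is directly followed by a non-space character.
fix_quote_line() {
  sed 's/^"//; s/ -\([^ ]\)/ - \1/'
}

printf '%s\n' '"word" -nothing' | fix_quote_line
```

For the sample input this prints word" - nothing.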

Delete Repeated Characters without back-referencing with SED

Let's say we have a file that contains
HHEELLOO
HHYYPPOOTTHHEESSIISS
and we want to delete repeated characters. To my knowledge we can do this with
s/\([A-Z]\)\1/\1/g
This is a homework problem and the professor said he wants us to try the exercises without back-referencing or extended regular expressions. Is that possible on this one? I would appreciate it if anyone could point me in the right direction, thanks!
The only reasonable way to do this is to use the right tool for the job, in this case tr:
$ tr -s 'A-Z' < file
HELO
HYPOTHESIS
If you were going to use sed for that specific problem though then it'd just be:
$ sed 's/\(.\)./\1/g' file
HELO
HYPOTHESIS
If that's not what you're looking for then edit your question to show more truly representative sample input and expected output.
Here's one way:
s/AA/A/g
s/BB/B/g
...
s/ZZ/Z/g
As a one-liner:
sed 's/AA/A/g; s/BB/B/g; ...'
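Typing out all 26 substitutions is tedious, so the same idea can be generated with a loop. This is a sketch assuming uppercase A-Z input as in the question; the function name squeeze_pairs is hypothetical, and the generated script contains neither back references nor extended regular expressions.

```shell
# Hypothetical helper: build "s/AA/A/g", "s/BB/B/g", ... "s/ZZ/Z/g"
# (one command per line) and hand the whole script to sed.
squeeze_pairs() {
  script=$(for c in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
    printf 's/%s%s/%s/g\n' "$c" "$c" "$c"
  done)
  sed "$script" "$@"
}

printf 'HHEELLOO\nHHYYPPOOTTHHEESSIISS\n' | squeeze_pairs
```

Each substitution only halves a run of identical letters (AAAA becomes AA), which is enough for the doubled letters in the sample but would need repeating for longer runs.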

How to reverse each word in a text file with linux commands without changing order of words

There's lots of questions indicating how to reverse each word in a sentence, and I could readily do this in Python or Javascript for example, but how can I do it with Linux commands? It looks like tac might be an option, but seems like this would likely reverse lines as well as words, rather than just words? What other tools can do this? I literally have no idea. I know rev and tac and awk all seem like contenders...
So I'd like to go from:
cat dog sleep
pillow green blue
to:
tac god peels
wollip neerg eulb
Slight follow-up:
From this reference it looks like I could use awk to break each field up into an array of single characters and then write a for loop to reverse manually each word in this way. This is quite awkward. Surely there's a better/more succinct way to do this?
Try this on for size:
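sed -e 's/\s\+/ /g' -e 's/ /\n/g' < file.txt | rev | tr '\n' ' ' ; echo
It collapses runs of whitespace (in GNU sed's basic regex syntax the + has to be written \+), counts punctuation as part of "words", and joins all input lines into a single output line, but otherwise it looks like it (at least mostly) works. Hooray for sh!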
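If the line breaks need to survive, as in the expected output of the question, a variation is to let rev reverse each whole line and then have awk put the words back in their original order. This is a sketch, assuming rev (from util-linux) is installed; the function name reverse_words is hypothetical.

```shell
# Hypothetical helper: reverse the letters of every word, keeping word
# order and line breaks. rev flips each line end to end; awk then emits
# the fields last-to-first, which restores the original word order.
reverse_words() {
  rev "$@" | awk '{
    line = ""
    for (i = NF; i > 0; i--)
      line = line $i (i > 1 ? " " : "")
    print line
  }'
}

printf 'cat dog sleep\npillow green blue\n' | reverse_words
```

Like the one-liner above, this collapses runs of whitespace to single spaces and treats punctuation as part of a word.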

sed regex with variables to replace numbers in a file

I'm trying to replace numbers in my text file by adding one to them, i.e.
sed 's/3/4/g' path.txt
sed 's/2/3/g' path.txt
sed 's/1/2/g' path.txt
Instead of this, can I automate it, i.e. find a \d and add one to it in the replacement?
Something like
sed 's/\([0-8]\)/\1+1/g' path.txt
I also want to capture more than one digit, i.e. ([0-9])\t([0-9]), and change each one, keeping the tab in between.
Thanks
edited #2
Using the perl example,
I also would like it to work with more digits i.e.
perl -pi~ -e 's/(\d+)\.(\d+)\.(\d+)\.(\d+)/ ($1+1)\.($2+1)\.($3+1)\.($4+1) /ge' output.txt
Any tips on making the above work?
There is no support for arithmetic in sed, but you can easily do this in Perl.
perl -pe 's/(\d+)/ $1+1 /ge'
With the /e option, the replacement expression needs to be valid Perl code. So to handle your final updated example, you need
perl -pi~ -e 's/(\d+)\.(\d+)\.(\d+)\.(\d+)/ ($1+1) . "." . ($2+1) . "." . ($3+1) . "." . ($4+1) /ge' output.txt
where strings are properly quoted and adjacent strings are concatenated together with the . Perl string concatenation operator. The parentheses around each addition are needed because + and . have the same precedence in Perl and are evaluated left to right. (The arithmetic numbers are coerced into strings as well when they are concatenated with a string.)
... Though of course, the first script already does that more elegantly, since with the /g flag it already increments every sequence of digits with one, anywhere in the string.
Triplee's perl solution is the more generic answer, but Michal's sed solution works well for this particular case. However, Michal's sed solution is more easily written:
sed y/12345678/23456789/ path.txt
and is better implemented as
tr 12345678 23456789 < path.txt
This utterly fails to handle 2 digit numbers (as in the edited question).
You can do it with sed but it's not easy, see this thread.
And it's hard with awk too, see this.
I'd rather use perl for this (something like this can be seen in action at ideone):
perl -pe 's/([0-8])/$1+1/e'
(The ideone.com example needs an explicit loop, as ideone does not set -pe by default.)
You can't do addition directly in sed - you could do it in awk by matching numbers using a regex in each line and increasing the value, but it's quite complicated. If you do not need to handle arbitrary numbers but only a limited set, like single-digit numbers from 0 to 8, you can just put several replacement commands on a single sed command line by separating them with semicolons:
sed 's/8/9/g ; s/7/8/g; s/6/7/g; s/5/6/g; s/4/5/g; s/3/4/g; s/2/3/g; s/1/2/g; s/0/1/g' path.txt
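The awk route mentioned above can be sketched as follows. It walks each line with match(), copies the text before every run of digits unchanged, and appends the run plus one. The function name increment_numbers is hypothetical, and the sketch assumes plain non-negative integers small enough for awk to handle exactly; leading zeros are lost (007 becomes 8).

```shell
# Hypothetical helper: add 1 to every run of digits on every line.
increment_numbers() {
  awk '{
    out = ""
    # match() sets RSTART/RLENGTH for the leftmost run of digits.
    while (match($0, /[0-9]+/)) {
      out = out substr($0, 1, RSTART - 1) (substr($0, RSTART, RLENGTH) + 1)
      $0 = substr($0, RSTART + RLENGTH)   # continue after the match
    }
    print out $0
  }' "$@"
}

printf '1\t2\n192.168.1.9\n' | increment_numbers
```

Because whole digit runs are incremented, 9 becomes 10 and 19 becomes 20, and tabs or dots between the numbers are left untouched.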
This might work for you (GNU sed & Bash):
sed 's/[0-9]/$((&+1))/g;s/.*/echo "&"/e' file
This will add one to every individual digit, to increment numbers:
sed 's/[0-9]\+/$((&+1))/g;s/.*/echo "&"/e' file
N.B. This method is fraught with problems and may cause unexpected results.
