sed delete match within quotes on line containing several quotes

I have a file called names.xml that looks like this:
NAME="Stacey" SURNAME="Ford"
blah blah blah
NAME="Stacey" SURNAME="Ford"
blah blah blah
I need to find every occurrence of NAME="..." and replace the name inside the quotes with another value.
So the output needs to look like this:
NAME="Jack" SURNAME="Ford"
blah blah blah
NAME="Jack" SURNAME="Ford"
blah blah blah
I am using: sed 's/NAME=".*"/NAME="Jack"/g' names.xml
But this is the result it gives me:
NAME="Jack"
blah blah blah
NAME="Jack"
blah blah blah
The match runs all the way to the last " on the line, the one that closes the SURNAME value.
Your time and assistance is greatly appreciated.

Use a negated character class, [^"]*, which matches zero or more characters other than ". The .* in your regex is greedy: it consumes everything up to the last double quote on the line, so the match runs from Stacey through the quote that closes Ford. You also need a word boundary \b before NAME so that the pattern does not match the NAME inside SURNAME; \b matches at a position between a word character and a non-word character.
sed 's/\bNAME="[^"]*"/NAME="Jack"/g' names.xml
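Note that \b is a GNU sed extension. On a sed without it, the same effect can be sketched with two POSIX expressions: one for NAME at the start of the line and one for NAME preceded by a non-word character. This is only a minimal sketch; the helper name rename_name is made up for illustration.

```shell
#!/bin/sh
# rename_name (hypothetical helper): read lines on stdin and set every
# NAME="..." value to Jack, leaving SURNAME="..." untouched.
rename_name() {
    # 1st expression: NAME at the very start of the line.
    # 2nd expression: NAME preceded by a non-word character, which is
    #                 captured in \1 and put back by the replacement.
    sed -e 's/^NAME="[^"]*"/NAME="Jack"/' \
        -e 's/\([^[:alnum:]_]\)NAME="[^"]*"/\1NAME="Jack"/g'
}

printf '%s\n' 'NAME="Stacey" SURNAME="Ford"' 'blah blah blah' | rename_name
```

The second expression needs the g flag because a line may hold several NAME attributes; the first can match at most once per line.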

Here is an awk version:
awk -F\" -vOFS=\" '$1~/NAME=/ {$2="Jack"}1' file
NAME="Jack" SURNAME="Ford"
blah blah blah
NAME="Jack" SURNAME="Ford"
blah blah blah
Use " as the field separator. If field 1 contains NAME=, replace field 2 with Jack; the trailing 1 prints every line.
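The command above only looks at field 1, so it assumes NAME is the first attribute on the line. If that is not guaranteed, a possible variation is to loop over the fields. This is a sketch under that assumption, and rename_name_awk is a hypothetical helper name:

```shell
#!/bin/sh
# rename_name_awk (hypothetical helper): check every attribute on the line,
# not only the first one, and set the value of NAME to Jack.
rename_name_awk() {
    awk -F'"' -v OFS='"' '
    {
        # With " as separator, odd fields hold the text before each quoted
        # value (for example " NAME=") and even fields hold the values.
        for (i = 1; i < NF; i += 2)
            if ($i ~ /(^|[^A-Za-z0-9_])NAME=$/)
                $(i + 1) = "Jack"
    }
    1   # print every line, modified or not
    '
}

printf '%s\n' 'ID="1" NAME="Stacey" SURNAME="Ford"' 'blah blah blah' | rename_name_awk
```

The (^|[^A-Za-z0-9_]) part plays the role of a word boundary, so a field ending in SURNAME= is left alone.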

Related

Can I remove a substring from a string starting at a known position and ending at a given character?

I need to extract sections of a string but I won't always know the length/content.
I've tried converting the string to XML or JSON for instance, and can't come up with any other way to achieve what I'm looking for.
Example string:
'Other parts of the string Name="SomeRandomAmountOfCharacters" blah blah'
What I need to remove always starts with an attribute name and ends with a closing double quote. So can I say I'd like to remove substring starting at Name=" and go until we reach the closing "?
Expected result:
'Other parts of the string blah blah'

You'll want to do something like this
$s = 'Other parts of the string Name="SomeRandomAmountOfCharacters" blah blah'
$s -replace ' Name=".*?"'
or like this:
$s = 'Other parts of the string Name="SomeRandomAmountOfCharacters" blah blah'
$s -replace ' Name="[^"]*"'
to avoid unintentionally removing other parts of your string in case it contains multiple attributes or additional double quotes. .*? is a non-greedy match for a sequence of any character except newlines, so it'll match up to the next double quote. [^"]* is a character class matching the longest consecutive sequence of characters that aren't double-quotes, so it'll also match up to the next double quote.
If the string spans multiple lines, you may also want to prefix the expression with the inline options (?ms): s lets . match newlines, and m makes ^ and $ match at line breaks.

Here is a good reference: https://www.regular-expressions.info/powershell.html
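In your case:
$s = 'Other parts of the string Name="SomeRandomAmountOfCharacters" blah blah'
$s -replace '\W*Name=".*"\W*', " "
or, to store the result in a new variable:
$newString = $s -replace '\W*Name=".*"\W*', " "
This replaces the matching string, including any surrounding non-word characters (here, the spaces), with a single space. Because .* is greedy, this assumes the string contains no further double quotes after the attribute.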

Here is an approach using -match with capture groups: the two (.*) groups capture the text before and after the attribute, and $Matches holds them.
$pattern = '(.*)Name=".*" (.*)'
$str = 'Other parts of the string Name="SomeRandomAmountOfCharacters" blah blah'
$ret = $str -match $pattern
$out = $Matches[1]+$Matches[2]
$str
"===>"
$out
See also: https://regex101.com/r/wM2xlc/1

Serialize multiline string with |?

Using YamlDotNet, the following string:
"blah blah blah \n blah blah blah"
gets serialized as:
test: >-
blah blah blah
blah blah blah
Is it possible to have this serialized as
test: |
blah blah blah
blah blah blah
dotnet fiddle:
https://dotnetfiddle.net/zT1Ujs

Found it by searching GitHub: adding a [YamlMember(ScalarStyle = ScalarStyle.Literal)] attribute to the property works.

Text in file manipulation

I have a text file with text like this:
{"id":2705,"status":"Analyze","severity":"Critical",Blah Blah ... "file":"/home/foo.c","message":"Message is...","url":"http://aaa..."}
{"id":2706,"status":"Fix","severity":"Low",Blah Blah ... "file":"/home/foo1.h","message":"Message2 is...","url":"http://bbb..."}
I would like a bash script that reads the file and, for each line, uses every key/value pair as a variable (for example id=2705, status="Analyze", ...) and echoes them.

awk 'BEGIN{RS=",";FS=":";OFS="="}{$1=$1;gsub("}|{|\"","")}1' infile
id=2705
status=Analyze
severity=Critical
Blah Blah ... file=/home/foo.c
message=Message is...
url=http=//aaa...
id=2706
status=Fix
severity=Low
Blah Blah ... file=/home/foo1.h
message=Message2 is...
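url=http=//bbb...
Note that every colon is treated as a field separator, so the colons inside the URLs are also turned into =, and a comma inside a value would split that value across two lines.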
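The output above shows url=http=//aaa... because every colon in a record is rewritten. One way around that, sketched here with tr and sed rather than awk, is to replace only the first colon of each pair. The helper name split_pairs is hypothetical, and the sketch still assumes that no value contains a comma; it is not a real JSON parser.

```shell
#!/bin/sh
# split_pairs (hypothetical helper): read JSON-like lines on stdin and
# print one key=value pair per line.
split_pairs() {
    # tr puts every comma-separated pair on its own line; sed then strips
    # braces and double quotes and turns only the FIRST colon into "=",
    # so colons inside a value (such as a URL) are kept.
    tr ',' '\n' | sed -e 's/[{}"]//g' -e 's/:/=/'
}

printf '%s\n' '{"id":2705,"status":"Analyze","url":"http://aaa..."}' | split_pairs
```

For the sample line this should print id=2705, status=Analyze and url=http://aaa... on separate lines.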

grep -A <num> until a string

Assume we have a file containing the following:
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
chapter 2 blah blah
and we want to grep this file to get the lines
from chapter 1 blah blah to blah num
(the line before the next chapter).
The only things we know are:
the starting string chapter 1 blah blah
somewhere after that there is another line starting with chapter
A naive way to do this is
grep -A <num> -i "chapter 1" <file>
with a large enough <num> that the whole chapter is included.

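sed -ne '/^chapter 1/,/^chapter/{/^chapter 1/p;/^chapter/!p;}' file
The range runs from the chapter 1 line to the next line starting with chapter. Inside it, the chapter 1 line is printed explicitly, and every other line is printed only if it does not start with chapter, which drops the closing chapter 2 line.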

This is easy to do with awk
awk '/chapter/ {f=0} /chapter 1/ {f=1} f' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
A line is printed whenever the flag f is set. Any line matching chapter clears the flag, and a line matching chapter 1 then sets it again.
You can also use a range pattern with awk, but it's less flexible if you have other conditions to test.
awk '/chapter 1/,/chapter [^1]/ {if (!/chapter [^1]/) print}' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
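Both awk commands match the text chapter 1, which would also match a line such as chapter 10. A possible refinement is to compare the chapter number as a field instead. The sketch below assumes the number is always the second whitespace-separated field on a chapter line; print_chapter is a hypothetical helper name.

```shell
#!/bin/sh
# print_chapter (hypothetical helper): read a file on stdin and print the
# chapter whose number is given as $1, up to (not including) the next
# line that starts with "chapter ".
print_chapter() {
    # On every chapter heading, set the flag only if the number matches;
    # any other heading clears it. Lines are printed while the flag is set.
    awk -v start="$1" '/^chapter / { f = ($2 == start) } f'
}

printf '%s\n' 'chapter 1 blah blah' 'blah num' 'chapter 2 blah blah' | print_chapter 1
```

Because the flag is recomputed on every heading, asking for chapter 1 does not pick up chapter 10.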

You can also do this with grep itself, but you need the -P (Perl regexp) and -z (null data) options.
$ grep -oPz '^chapter 1[\s\S]*?(?=\nchapter)' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
[\s\S]*? will do a non-greedy match of zero or more characters until the line which has the string chapter at the start is reached.
From man grep
-z, --null-data a data line ends in 0 byte, not newline
-P, --perl-regexp PATTERN is a Perl regular expression
-o, --only-matching show only the part of a line matching PATTERN

Find a String occurrence between Two occurrences of other

I've a very long file as follows.
Input file :-
Text Point
Blah
Blah
Blah
Blah
Blah
Blah
String
Blah
Blah
Blah
Blah
Blah
Blah
Text Point
Blah
Blah
Blah
Blah
Blah
Blah
Text Point
String
Blah
Blah
Text Point
Blah
Blah
Blah
String
Blah
Blah
Blah
Text Point
Blah
Blah
String
Blah
After each occurrence of 'Text Point' and before the next one, I expect 'String' to occur at most once. I have to extract 'String' to an output file if it occurs between two consecutive 'Text Point's, or write a dash if it does not.
In this case, I need a output like this
String
-
String
String
String
I tried using following command
sed -n '/Text point/{:a;N;/^\n/s/^\n//;/Text point/{p;s/.*//;};ba};' $1 | grep "String" >> Outfile
But the problem with this is that when String isn't found, nothing is appended to Outfile.
So please help me out with the code. Thanks.

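I have a solution with perl:
use strict;
use warnings;
# Read the input in records that each end at a "Text Point".
$/ = "Text Point";
while (<>) {
    # The first record is whatever precedes the first "Text Point"; skip it.
    next if $. == 1;
    if (/String/) {
        print "String\n";
    }
    else {
        print "-\n";
    }
}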

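awk '/^Text Point/{if (seen) print p; seen=1; p="-"} /String/{p=$0} END{if (seen) print p}' input
Each Text Point line prints the result for the previous block and resets p to a dash; a line containing String overwrites p. The seen flag keeps the first Text Point from printing an empty line, and the END block prints the result for the last block.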

Using a perl one-liner
perl -0777 -ne 'print /(.*String.*\n)/ ? $1 : "-\n" for split /(?=Text Point)/' file
Explanation:
Switches:
-0777: Slurp entire file
-n: Creates a while(<>){...} loop for each “line” in your input file.
-e: Tells perl to execute the code on command line.
