Text in file manipulation

I have a text file with text like this:
{"id":2705,"status":"Analyze","severity":"Critical",Blah Blah ... "file":"/home/foo.c","message":"Message is...","url":"http://aaa..."}
{"id":2706,"status":"Fix","severity":"Low",Blah Blah ... "file":"/home/foo1.h","message":"Message2 is...","url":"http://bbb..."}
I would like a bash script that reads the file and, for each line, uses every key/value pair as a variable (for example id=2705, status="Analyze", ...) and echoes them.

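Split the records on commas and newlines, strip the braces and quotes, and replace only the first colon of each pair so that the URL stays intact (a regular expression in RS needs gawk or mawk):
awk 'BEGIN{RS="[,\n]"}{gsub(/[{}"]/,"");sub(/:/,"=")}1' infile
id=2705
status=Analyze
severity=Critical
Blah Blah ... file=/home/foo.c
message=Message is...
url=http://aaa...
id=2706
status=Fix
severity=Low
Blah Blah ... file=/home/foo1.h
message=Message2 is...
url=http://bbb...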
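
The question asks for a loop over each line, so here is a minimal sketch of that in plain sh. It assumes flat records with no commas or escaped quotes inside the values. The sample data, the show_pairs function name and the "record N:" prefix are illustrative and not part of the original post.

```shell
#!/bin/sh
# Hypothetical sample input: flat records, no commas inside values.
infile=$(mktemp)
cat > "$infile" <<'EOF'
{"id":2705,"status":"Analyze","file":"/home/foo.c","url":"http://aaa"}
{"id":2706,"status":"Fix","file":"/home/foo1.h","url":"http://bbb"}
EOF

# show_pairs FILE: print "record N: key=value" for every pair on every line.
show_pairs() {
  n=0
  while IFS= read -r line; do
    n=$((n + 1))
    # Put one pair on each line and strip braces and quotes.
    printf '%s\n' "$line" | tr ',' '\n' | sed 's/[{}"]//g' |
    # read splits at the first colon only; the rest stays in $value.
    while IFS=: read -r key value; do
      echo "record $n: $key=$value"
    done
  done < "$1"
}

result=$(show_pairs "$infile")
printf '%s\n' "$result"
rm -f "$infile"
```

Inside the inner loop, $key and $value are ordinary shell variables, so they can be tested or reused instead of echoed.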

Related

Using gawk to Replace a Pattern of Text with the Contents of a File Whose Filename is Inside the Text

I am trying to replace text inside a text file according to a certain criteria.
For example, if I have three text files, with outer.txt containing:
Blah Blah Blah
INCLUDE inner1.txt
Etcetera Etcetera
INCLUDE inner2.txt
end of file
And inner1.txt containing:
contents of inner1
And inner2.txt containing:
contents of inner2
At the end of the replacement, the outer.txt file would look like:
Blah Blah Blah
contents of inner1
Etcetera Etcetera
contents of inner2
end of file
The overall pattern would be that for every instance of the word "INCLUDE", replace that entire line with the contents of the file whose filename immediately follows that instance of "INCLUDE", which in one case would be inner1.txt and in the second case would be inner2.txt.
Put more simply, is it possible for gawk to be able to determine which text file is to be embedded into the outer text file based on the very contents to be replaced in the outer text file?

With gnu sed
sed -E 's/( *)INCLUDE(.*)/printf "%s" "\1";cat \2/e' outer.txt

If you set the +x bit on the edit-file ('chmod +x edit-file'), then you can do:
g/include/s//cat/\
.w\
d\
r !%
w
q
Explanation:
g/include/s//cat/\
Starts a global command.
.w\
(from within the global context) overwrites the edit-file with the current line only (effectively 'cat included_file', where included_file stands for the filename in question).
d\
(from within the global context) deletes the current line from the buffer (i.e. deletes 'include included_file', with included_file again standing for the file in question).
r !%
(from within the global context) reads the output of executing the default file (which is the file we are editing, overwritten above with 'cat ...').
w
(finally, outside the global context). Writes (saves) the buffer back to the edit-file.
q
quit.

With GNU awk:
awk --load readfile '{if ($1=="INCLUDE") {printf readfile($2)} else print}' outer.txt
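
If the readfile extension is not available, the same idea can be sketched with a plain getline loop, which should work in any POSIX awk. The scratch directory and the variable names are assumptions for the demo, and the file contents are the ones from the question.

```shell
#!/bin/sh
# Recreate the three sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'Blah Blah Blah' 'INCLUDE inner1.txt' 'Etcetera Etcetera' \
  'INCLUDE inner2.txt' 'end of file' > "$dir/outer.txt"
echo 'contents of inner1' > "$dir/inner1.txt"
echo 'contents of inner2' > "$dir/inner2.txt"

# For each INCLUDE line, copy the named file line by line instead of
# printing the INCLUDE line itself; every other line passes through.
expanded=$(cd "$dir" && awk '
  $1 == "INCLUDE" {
    while ((getline line < $2) > 0) print line
    close($2)
    next
  }
  { print }
' outer.txt)

printf '%s\n' "$expanded"
rm -rf "$dir"
```

A missing include file makes getline return -1, so the INCLUDE line is dropped without any replacement text.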

Another ed approach would be something like:
#!/bin/sh
ed -s outer.txt <<-'EOF'
/Blah Blah Blah/+1kx
/end of file/-1ky
'xr inner1.txt
'xd
'yr inner2.txt
'yd
%p
Q
EOF
Change Q to w if in-place editing is required.
Remove the %p to silence the output.

Serialize multiline string with |?

Using YamlDotNet, the following string:
"blah blah blah \n blah blah blah"
gets serialized as:
test: >-
blah blah blah
blah blah blah
Is it possible to have this serialized as
test: |
blah blah blah
blah blah blah
dotnet fiddle:
https://dotnetfiddle.net/zT1Ujs

Found it by searching GitHub: adding a [YamlMember(ScalarStyle = ScalarStyle.Literal)] attribute to the property works.

grep -A <num> until a string

Assume that we have a file containing the following:
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
chapter 2 blah blah
and we want to grep this file so we take the lines
from chapter 1 blah blah to blah num
(the line before the next chapter).
The only things we know are:
the starting string chapter 1 blah blah
somewhere after that there is another line starting with chapter
A naive way to do this is
grep -A <num> -i "chapter 1" <file>
with large enough <num> so the whole chapter will be in it.

sed -ne '/^chapter 1/,/^chapter/{/^chapter 1/p;/^chapter/!p;}' file

This is easy to do with awk
awk '/chapter/ {f=0} /chapter 1/ {f=1} f' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
It prints a line whenever the flag f is true.
Any chapter line clears the flag, and a chapter 1 line then sets it again.
You can use a range with awk, but it's less flexible if you have other things to test.
awk '/chapter 1/,/chapter [^1]/ {if (!/chapter [^1]/) print}' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
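
One caveat with the patterns above is that /chapter 1/ also matches a line such as chapter 10. A small sketch of the flag approach that compares the chapter number as a field avoids this. The chapter function name and the sample file are hypothetical, and the chapter number is assumed to be the second field.

```shell
#!/bin/sh
# Hypothetical sample with chapters 1, 2 and 10.
file=$(mktemp)
printf '%s\n' 'chapter 1 intro' 'a 1' 'b 2' 'chapter 2 next' 'c 3' \
  'chapter 10 last' 'd 4' > "$file"

# chapter N FILE: print chapter N up to, but not including, the next
# chapter line. The flag f is recomputed on every chapter heading.
chapter() {
  awk -v n="$1" '/^chapter / { f = ($2 == n) } f' "$2"
}

ch1=$(chapter 1 "$file")
ch10=$(chapter 10 "$file")
printf '%s\n---\n%s\n' "$ch1" "$ch10"
rm -f "$file"
```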

You could also do this with grep itself, but you need to enable the -P (Perl regexp) and -z options.
$ grep -oPz '^chapter 1[\s\S]*?(?=\nchapter)' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
[\s\S]*? does a non-greedy match of zero or more characters (newlines included) up to the line that starts with the string chapter.
From man grep
-z, --null-data a data line ends in 0 byte, not newline
-P, --perl-regexp PATTERN is a Perl regular expression
-o, --only-matching show only the part of a line matching PATTERN

sed delete match within quotes on line containing several quotes

I have a file called names.xml
That looks like the below:
NAME="Stacey" SURNAME="Ford"
blah blah blah
NAME="Stacey" SURNAME="Ford"
blah blah blah
I need to find all occurrences of NAME=" and replace the name within the quotes with another value.
So the output needs to look like this:
NAME="Jack" SURNAME="Ford"
blah blah blah
NAME="Jack" SURNAME="Ford"
blah blah blah
I am using: sed 's/NAME=".*"/NAME="Jack"/g' names.xml
But this is the result it gives me:
NAME="Jack"
blah blah blah
NAME="Jack"
blah blah blah
It is looking at everything up until the last " on SURNAME.
Your time and assistance is greatly appreciated.

You need to use a negated character class, [^"]*, which matches any character except " zero or more times. The .* in your regex is greedy, so it consumes everything up to the last double quote on the line; that is why the match runs from Stacey to the closing quote after Ford. You should also add a word boundary \b before NAME so that it won't match the NAME inside SURNAME. \b matches between a word character and a non-word character (it is a GNU sed extension).
sed 's/\bNAME="[^"]*"/NAME="Jack"/g' names.xml
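
To see the difference between the greedy and the negated-class pattern side by side, here is a small sketch. Because \b is a GNU extension, the fixed variant uses two plain BRE expressions instead; that is a substitution for illustration, not the answer's own command. The second input line, with SURNAME first, is a hypothetical addition.

```shell
#!/bin/sh
# Sample line from the question plus one where SURNAME comes first.
input='NAME="Stacey" SURNAME="Ford"
SURNAME="Ford" NAME="Stacey"'

# Greedy: .* runs to the last quote on the line.
greedy=$(printf '%s\n' "$input" | sed 's/NAME=".*"/NAME="Jack"/g')

# Negated class: stop at the next quote. NAME must sit at the start of
# the line or follow a character that cannot be part of a word.
fixed=$(printf '%s\n' "$input" | sed \
  -e 's/^NAME="[^"]*"/NAME="Jack"/' \
  -e 's/\([^[:alnum:]_]\)NAME="[^"]*"/\1NAME="Jack"/g')

printf 'greedy:\n%s\nfixed:\n%s\n' "$greedy" "$fixed"
```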

Here is an awk version:
awk -F\" -vOFS=\" '$1~/NAME=/ {$2="Jack"}1' file
NAME="Jack" SURNAME="Ford"
blah blah blah
NAME="Jack" SURNAME="Ford"
blah blah blah
Use " as the field separator. If field 1 contains NAME=, replace field 2 with Jack, then print the line.

Find a String occurrence between Two occurrences of other

I've a very long file as follows.
Input file :-
Text Point
Blah
Blah
Blah
Blah
Blah
Blah
String
Blah
Blah
Blah
Blah
Blah
Blah
Text Point
Blah
Blah
Blah
Blah
Blah
Blah
Text Point
String
Blah
Blah
Text Point
Blah
Blah
Blah
String
Blah
Blah
Blah
Text Point
Blah
Blah
String
Blah
After each occurrence of 'Text Point', and before the next one, I expect 'String' to occur at most once. I have to extract String to an output file if it occurs between two consecutive 'Text Point's, or put a dash if it does not.
In this case, I need output like this:
String
-
String
String
String
I tried using following command
sed -n '/Text point/{:a;N;/^\n/s/^\n//;/Text point/{p;s/.*//;};ba};' $1 | grep "String" >> Outfile
But the problem with this is when string isn't found it will not append anything to outfile.
So please help me out with the code. Thanks.

I have a solution with perl
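use strict;
use warnings;
# Read the input in chunks that each end at a "Text Point" marker.
$/ = "Text Point";
# The first chunk is whatever precedes the first marker; it is not a section.
my $lead = <>;
while (<>) {
    if (/String/) {
        print "String\n";
    }
    else {
        print "-\n";
    }
}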

awk '/^Text Point/{if (seen) print p; seen=1; p="-"} /String/{p=$0} END{if (seen) print p}' input
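
A worked run of the awk approach on a shortened input may help. The three-section sample and the seen guard are assumptions for this sketch. The guard keeps awk from printing an empty line when it meets the first Text Point, before any section has been read.

```shell
#!/bin/sh
# Shortened hypothetical input: three sections, the middle one has no String.
input=$(mktemp)
printf '%s\n' 'Text Point' 'Blah' 'String' 'Blah' \
  'Text Point' 'Blah' \
  'Text Point' 'String' 'Blah' > "$input"

# p holds the result for the current section: "-" until a String line is
# seen. It is flushed at every later Text Point and once at end of input.
out=$(awk '
  /^Text Point/ { if (seen) print p; seen = 1; p = "-"; next }
  /^String/     { p = $0 }
  END           { if (seen) print p }
' "$input")

printf '%s\n' "$out"
rm -f "$input"
```

Redirecting the awk output with >> Outfile, as in the question, appends one line per section.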

Using a perl one-liner
perl -0777 -ne 'print /(.*String.*\n)/ ? $1 : "-\n" for split /(?=Text Point)/' file
Explanation:
Switches:
-0777: Slurp entire file
-n: Creates a while(<>){...} loop for each “line” in your input file.
-e: Tells perl to execute the code on command line.
