Grep for string and read content until next match string

I am trying to read a file and search for a string using grep. Once I find the string, I want to read everything after it until I match another string. In my example, I am searching for ...SUMMARY... and I want to read everything until the next occurrence of three dots (...). Here is an example:
**...SUMMARY...**
Severe thunderstorms are most likely across north-central/northeast
Texas and the Ark-La-Tex region during the late afternoon and
evening. Destructive hail and wind, along with a few tornadoes are
possible. Severe thunderstorms are also expected across the
Mid-South and Ohio Valley.
**...**North-central/northeast TX and southeast OK/ArkLaTex...
In the wake of a decaying MCS across the Lower Mississippi River
Valley, a northwestward-extending outflow boundary will continue to
modify/drift northward with rapid/strong destabilization this
afternoon particularly along and south of it. A quick
reestablishment of lower/some middle 70s F surface dewpoints will
occur into prior-MCS-impacted areas, with MLCAPE in excess of 4000
J/kg expected for parts of north-central/northeast Texas into far
southeast Oklahoma and the nearby ArkLaTex. Special 19Z observed
soundings are expected from Fort Worth/Shreveport to help better
gauge/confirm this destabilization trend and the degree of capping.
I have tried using the following code, but it only displays the ...SUMMARY... line and the next line.
sed -n '/...SUMMARY.../,/.../p'
What can I do to solve this?
=======================================================================
Followup:
This is the result I am trying to get. Only show the paragraph under ...SUMMARY... and end at the next ... so this is what I should get in the end:
Severe thunderstorms are most likely across north-central/northeast
Texas and the Ark-La-Tex region during the late afternoon and
evening. Destructive hail and wind, along with a few tornadoes are
possible. Severe thunderstorms are also expected across the
Mid-South and Ohio Valley.
I have tried the following, based on a recommendation from Shellter:
sed -n '/...SUMMARY.../,/**...**/p'
But I get everything.

You may use
sed -n '/^[[:blank:]]*\.\.\.SUMMARY\.\.\./,/^[[:blank:]]*\.\.\./{//!p;}' file
See this online sed demo.
NOTES:
Escape literal dots
Literal asterisks, if the file contains them, should also be escaped. The escapes are necessary because an unescaped /.../ matches any line that contains at least 3 characters
^ matches the start of a line and [[:blank:]]* matches zero or more spaces or tabs
{//!p;} gets you the contents between two lines excluding those lines (see How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?)
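As a rough illustration of the command above, here is a minimal sketch run against a hypothetical sample file. The file contents are shortened from the question, and the marker lines are assumed to start with plain dots, with no asterisks.

```shell
# Hypothetical sample input mirroring the layout in the question
# (assumption: marker lines start with literal dots, no asterisks).
sample=$(mktemp)
cat > "$sample" <<'EOF'
...SUMMARY...
Severe thunderstorms are most likely across north-central/northeast
Texas and the Ark-La-Tex region.
...North-central/northeast TX and southeast OK/ArkLaTex...
In the wake of a decaying MCS across the Lower Mississippi River
EOF

# Select the range from the SUMMARY marker to the next line starting with
# three dots, then print only the lines that are not markers themselves.
summary=$(sed -n '/^[[:blank:]]*\.\.\.SUMMARY\.\.\./,/^[[:blank:]]*\.\.\./{//!p;}' "$sample")
printf '%s\n' "$summary"
rm -f "$sample"
```

With this input, only the two lines between the markers should be printed.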

Related

convert the numbers to their names(1=one etc) with SED command

I need to convert digits to their name ex: 1=one 2=two etc but I can only use SED command. Just the one-digit numbers should change.
sed 's/[1]/one/g; s/[2]/two/g; s/[3]/three/g; s/[4]/four/g; s/[5]/five/g; s/[6]/six/g; s/[7]/seven/g; s/[8]/eight/g; s/[9]/nine/g; s/[0]/zero/g'
Text:
Lo5se eyes get fat shew. Win4ter can indeed letter oppose way change te5nded now. So is imp6rove my charmed picture exposed adapt5ed demands. Received had en4d prod4uced prepared dive5rted strictly off man br55anched. Known 72ye money 6so large decay voice t6here to. Preserved be mr cordially incom88888mode as an. He 3doors qui03ck child an point at. Had sh2a9re vexed front least style off why him.
The result should be:
Lofivese eyes get fat shew. Winfourter can indeed letter oppose way change tefivended now. So is impsixrove my charmed picture exposed adaptfiveed demands. Received had enfourd prodfouruced prepared divefiverted strictly off man br55anched. Known 72ye money sixso large decay voice tsixhere to. Preserved be mr cordially incom88888mode as an. He threedoors qui03ck child an point at. Had shtwoaninere vexed front least style off why him.

With a sed that has -E to enable EREs and recognizes \n as meaning a newline (e.g. GNU sed):
sed -E 's/(^|[^0-9])([0-9])([^0-9]|$)/\1\n\2\n\3/g; s/\n1\n/one/g; s/\n2\n/two/g; s/\n3\n/three/g; s/\n4\n/four/g; s/\n5\n/five/g; s/\n6\n/six/g; s/\n7\n/seven/g; s/\n8\n/eight/g; s/\n9\n/nine/g; s/\n0\n/zero/g' file
Lofivese eyes get fat shew. Winfourter can indeed letter oppose way change tefivended now. So is impsixrove my charmed picture exposed adaptfiveed demands. Received had enfourd prodfouruced prepared divefiverted strictly off man br55anched. Known 72ye money sixso large decay voice tsixhere to. Preserved be mr cordially incommode as an. He threedoors qui03ck child an point at. Had shtwoa9re vexed front least style off why him.
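To try the command above on shorter input, here is a minimal sketch assuming GNU sed. The function name digits_to_words and the test strings are hypothetical, chosen only for illustration.

```shell
# Assumes GNU sed: -E enables EREs and \n in the replacement is a newline.
digits_to_words() {
  # Step 1: wrap every isolated digit in newlines.
  # Step 2: replace each newline-wrapped digit with its name.
  sed -E 's/(^|[^0-9])([0-9])([^0-9]|$)/\1\n\2\n\3/g; s/\n1\n/one/g; s/\n2\n/two/g; s/\n3\n/three/g; s/\n4\n/four/g; s/\n5\n/five/g; s/\n6\n/six/g; s/\n7\n/seven/g; s/\n8\n/eight/g; s/\n9\n/nine/g; s/\n0\n/zero/g'
}

printf '%s\n' 'Lo5se 72ye qui03ck' | digits_to_words
printf '%s\n' 'sh2a9re' | digits_to_words
```

With GNU sed the first line should come out as Lofivese 72ye qui03ck. The second comes out as shtwoa9re: the character after the 2 is consumed by the first match, so it is no longer available as the leading non-digit for the 9. Two single digits separated by exactly one character are therefore not both converted.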

This might work for you (GNU sed):
sed -E 's/[0-9]+/\n&/g;s/\n(.[0-9])/\1/g;s/$/\n1one2two3three4four5five6six7seven8eight9nine0zero/;:a;s/\n(.)(.*\n.*\1([^0-9]+))/\3\2/;ta;P;d' file
Prepend a newline to each group of numbers.
Remove the prepended newline for numbers with two or more digits.
Append a newline and a lookup table to the end of the line.
Use a loop and pattern matching to replace each single digit with its literal, ensuring the lookup table is maintained.
Print the amended current line less the lookup table.

How to edit this file using grep or using cat or using vim or using another tool?

My elder brother, who is studying statistics, is writing his thesis in LaTeX. Almost all of the content is written. He used 5 digits after the decimal point (e.g. 5.55534) for every value in his calculations, but at the last minute his instructor asked him to change these to 3 digits after the decimal point (e.g. 5.555), which has put my brother in a difficult position. Finding and correcting the values manually is not easy, so he asked me to help.
I believe there is an easy solution that is not known to me. A portion of the thesis looks like this:
&se($\hat\beta_1$)&0.35581&0.35573&0.35573\\
&mse($\hat\beta_1$)&.12945&.12947&.12947\\
\addlinespace
&$\hat\beta_2$&0.03329&0.03331&0.03331 \\
&se($\hat\beta_2$)&0.01593&0.01592&0.01591\\
&mse($\hat\beta_2$)&.000265&.000264&.000264 \\
\midrule
{n=100} & $\hat\beta_1$&-.52006&-.52001&-.51946\\
&se($\hat\beta_1$)&.22819&.22814&.22795\\
&mse($\hat\beta_1$)&.05247&.05244&.05234\\
\addlinespace
&$\hat\beta_2$&0.03134&0.03134&0.03133 \\
&se($\hat\beta_2$)&0.00979&0.00979&0.00979\\
&mse($\hat\beta_2$)&.000098&.000098&.000098
I want -
&se($\hat\beta_1$)&0.355&0.355&0.355\\
&mse($\hat\beta_1$)&.129&.129&.129\\
......................................................................
........................................................................
........................................................................
Note: Don't be put off by the syntax (it is LaTeX syntax).
If anybody has a solution or suggestion, please share it. Thank you.

In sed:
$ sed 's/\(\.[0-9]\{3\}\)[0-9]*/\1/g' file
&se($\hat\beta_1$)&0.355&0.355&0.355\\
&mse($\hat\beta_1$)&.129&.129&.129\\
i.e. for each numeric string that starts with a period followed by at least three digits, keep the period and the first three digits and drop any digits after them.
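The same substitution can be wrapped in a small function and checked on a single row. This is only a sketch; the name truncate_decimals is hypothetical.

```shell
# Keep a period plus the first three digits after it, drop any further digits.
truncate_decimals() {
  sed 's/\(\.[0-9]\{3\}\)[0-9]*/\1/g'
}

# Single quotes keep the LaTeX backslashes and dollar signs literal.
printf '%s\n' '&mse($\hat\beta_1$)&.12945&.12947&.12947\\' | truncate_decimals
```

Note that this truncates rather than rounds: 0.35581 becomes 0.355, not 0.356. That matches the output requested in the question, but it is worth checking against what the instructor expects.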

Here is the command in vim:
:%s/\.\d\{3}\zs\d\+//g
Explanation:
: entering command-mode
% is the range of all lines of the file
s substitution command
\.\d\{3}\zs\d\+ pattern you would like to change
\. literal point (.)
\d\{3} match 3 consecutive digits
\zs start substitution from here
\d\+ one or more digits
g Replace all occurrences in the line
As for grep and cat, they have nothing to do with replacing text; these commands only search and print the contents of files.
What you are looking for is substitution, and many commands in Linux can do that, mainly sed, perl, awk, ex, etc.

Swapping character locations in a string (bash)

I have encountered an issue where I need to change the order of the characters in a string in bash, in order to get the ICCID of a sim card.
The number I acquire from the modem looks for example like this; 980136010000006187F5. What I need to do is to take the characters in the string in pairs and recite them in reverse. In this example 98 would be 89, 01 would be 10 and so on, finally adding up to 8910631000000016785F which is a correct ICCID number.
I am thinking that this might be possible using sed or a for-loop of sorts, but I've gotten stuck in how to achieve this. Help would be very appreciated!
Regards,
Carl

sed can do it easily.
sed 's/\(.\)\(.\)/\2\1/g' <<<980136010000006187F5
The sed command finds every match to .. (i.e. a pair of characters), capturing them independently, and then replaces them with the two captures in inverse order.
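For use in a script, the substitution can be wrapped in a function and its result stored in a variable. This is a minimal sketch; the names swap_pairs and iccid are hypothetical, and printf is used in place of the here-string so that it also runs in shells other than bash.

```shell
# Swap the two characters of every consecutive pair in the argument.
swap_pairs() {
  printf '%s\n' "$1" | sed 's/\(.\)\(.\)/\2\1/g'
}

iccid=$(swap_pairs 980136010000006187F5)
printf '%s\n' "$iccid"
```

For the number in the question this should print 8910631000000016785F. If the input has an odd number of characters, the last character has no partner and is left where it is.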

sed regex not being greedy?

In bash I have a string variable tempvar, which is created thus:
tempvar=`grep -n 'Mesh Tally' ${meshtalfile}`
meshtalfile is a (large) input file which contains some header lines and a number of blocks of data lines, each marked by a beginning line which is searched for in the grep above.
In the case at hand, the variable tempvar contains the following string:
5: Mesh Tally Number 4 977236: Mesh Tally Number 14 1954467: Mesh Tally Number 24 4354479: Mesh Tally Number 34
I now wish to extract the line number relating to a particular mesh tally number - so I define a variable meshnum1 as equal to 24, and run the following sed command:
echo ${tempvar} | sed -r "s/^.*([0-9][0-9]*):\sMesh\sTally\sNumber\s${meshnum1}.*$/\1/"
This is where things go wrong. I expect the output 1954467, but instead I get 7. Trying with number 34 instead returns 9 instead of 4354479. It seems that sed is returning only the last digit of the number - which surely violates the principle of greedy matching? And oddly, when I move the open parenthesis ( left a couple of characters to include .*, it returns the whole line up to and including the single character it was previously returning. Surely it cannot be greedy in one situation and antigreedy in another? Hopefully I have just done something stupid with the syntax...

The problem is that the leading .* is greedy too, so it also consumes digits. Since the [0-9][0-9]* part requires only one digit, the .* before it takes as much as it can and leaves just one digit for the group after it.
A solution could be:
echo ${tempvar} | sed -r "s/^.*\s([0-9][0-9]*):\sMesh\sTally\sNumber\s${meshnum1}.*$/\1/"
Where now the \s between the .* and the [0-9][0-9]* explicitly forces there to be a whitespace character before the digits you want to match.
Hope this helps =)
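One caveat with requiring whitespace before the digits: the first entry (5: Mesh Tally Number 4) starts the string and has no whitespace in front of it, so it would not match. The following sketch is a variation, not the answer's exact command. It assumes GNU sed (for -r and \s), makes the leading part optional, and requires whitespace or end of line after the tally number. The function name line_for_tally is hypothetical.

```shell
tempvar='5: Mesh Tally Number 4 977236: Mesh Tally Number 14 1954467: Mesh Tally Number 24 4354479: Mesh Tally Number 34'

# Print the line number recorded for the given mesh tally number ($1).
# (.*\s)? lets the wanted entry be the first one in the string;
# (\s.*)?$ stops 4 from matching inside a longer number such as 41.
line_for_tally() {
  printf '%s\n' "$tempvar" |
    sed -r "s/^(.*\s)?([0-9]+):\sMesh\sTally\sNumber\s$1(\s.*)?$/\2/"
}

line_for_tally 24
line_for_tally 4
```

With the string from the question, these calls should print 1954467 and 5. If the requested tally number is not present, sed leaves the line unchanged and prints it as is, so a caller may want to check for that.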

Are the values in $tempvar supposed to be multiple or a single line? Because if it is a single line, ".*$" should match to the end of line, meaning all the other values too, right?

There's no need for sed, here's one way using GNU grep:
echo "$tempvar" | grep -oP "[0-9]+(?=:\sMesh\sTally\sNumber\s${meshnum1}\b)"

Format text with regard to punctuation

How can I format text in a natural language taking punctuation into account? The built-in gq command of Vim, or command line tools, such as fmt or par break lines without regard to punctuation. Let me give you an example,
fmt -w 40 gives not what I want:
we had everything before us, we had
nothing before us, we were all going
direct to Heaven, we were all going
direct the other way
smart_formatter -w 40 would give:
we had everything before us,
we had nothing before us,
we were all going direct to Heaven,
we were all going direct the other way
Of course, there are cases when no punctuation mark is found within the given text width, then it can fallback to the standard text formatting behavior.
The reason why I want this is to get a meaningful diff of text where I can spot which sentence or subsentence changed.

Here is a not very elegant, but working method I finally came up with. Suppose, a line break at a punctuation mark is worth 6 characters. It means, I'll accept a result which is more ragged but contains more lines ending in a punctuation mark if the "raggedness" is less than 6 characters long. For example, this is OK ("raggedness" is 3 characters).
Wait!
He said.
This is not OK ("raggedness" is more than 6 characters)
Wait!
He said to them.
The method is to add 6 dummy characters after each punctuation mark, format the text, then remove the dummy characters.
Here is the code for this
sed -e 's/\([.?!,]\)/\1 _ _ _/g' | fmt -w 34 | sed -e 's/ _//g' -e 's/_ //g'
I used _ (space + underscore) as a pair of dummy characters, supposing they're not contained in the text. The result looks quite good,
we had everything before us,
we had nothing before us,
we were all going direct to
Heaven, we were all going
direct the other way
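The pipeline above can be wrapped in a function that takes the desired width and subtracts the 6 padding characters itself. This is a minimal sketch: the name punct_fmt is hypothetical, and it assumes an fmt that accepts -w and text that contains no underscores. The exact line breaks depend on the fmt implementation.

```shell
# Format stdin to roughly $1 columns, preferring breaks after punctuation.
punct_fmt() {
  width=$(( $1 - 6 ))   # 6 = length of the dummy padding " _ _ _"
  sed -e 's/\([.?!,]\)/\1 _ _ _/g' |
    fmt -w "$width" |
    sed -e 's/ _//g' -e 's/_ //g'
}

text='we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way'
out=$(printf '%s\n' "$text" | punct_fmt 40)
printf '%s\n' "$out"
```

Calling punct_fmt 40 corresponds to the fmt -w 34 used above.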
