Why doesn't the negated character class work as expected? - vim

xyz mnl pqt aaaa ccc
yz mn ats aa cbc ddd eee ggg
I want to match the first two columns with:
[^\s]*\s[^\s]*\s
But this pattern matches every column except the last one. That is:
xyz mnl pqt aaaa
yz mn ats aa cbc ddd eee
I don't understand why Vim behaves this way.

Two things:
\s isn't recognized inside a character class in Vim: [^\s] matches any character except a backslash or the letter s. Use \S instead.
Prefix the regex with ^ to anchor it at the beginning of each line.
^\S*\s\S*\s
Which matches:
xyz mnl pqt aaaa ccc
^^^^^^^^
yz mn ats aa cbc ddd eee ggg
^^^^^^
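The same behavior can be reproduced outside Vim. The following is a minimal sketch using Python's re module, which is a different engine from Vim's: in Python, [^\s] really does mean "not whitespace", so the class [^\\s] (anything but a backslash or the letter s) is an assumption used here only to mimic how Vim reads [^\s].

```python
import re

line1 = "xyz mnl pqt aaaa ccc"
line2 = "yz mn ats aa cbc ddd eee ggg"

# Mimics Vim's reading of [^\s]: any character except a backslash or "s".
# Spaces are allowed inside the class, so the greedy match runs far to the right.
vim_like = re.compile(r"[^\\s]*\s[^\\s]*\s")

# The corrected pattern: anchored at the line start, with \S for non-whitespace.
fixed = re.compile(r"^\S*\s\S*\s")

greedy_match = vim_like.search(line1).group()
first_two_1 = fixed.search(line1).group()
first_two_2 = fixed.search(line2).group()

print(repr(greedy_match))
print(repr(first_two_1))
print(repr(first_two_2))
```

The first result spans everything but the last column, which is what the question observed; the other two stop after the second column.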

Related

Regex for simple pattern in Python

I have a string containing exactly one pair of parentheses (and some words between them), and lots of other words.
How would one create a regex to split the string into [ words before (, words between (), words after )]?
e.g.
line = "a bbbb cccc dd ( ee fff ggg ) hhh iii jk"
would be split into
[ "a bbbb cccc dd", "ee fff ggg", "hhh iii jk" ]
I've tried
line = re.compile("[^()]+").split(line)
but it doesn't work.
It seems that in the process you also want to remove the leading and trailing whitespace, i.e., the whitespace before and after ( and ). You could try:
>>> import re
>>> line = "a bbbb cccc dd ( ee fff ggg ) hhh iii jk"
>>> re.split(r'\s*[\(\)]\s*', line)
['a bbbb cccc dd', 'ee fff ggg', 'hhh iii jk']
>>>
>>> # to make it look as in your description ...
>>> line = re.compile(r'\s*[\(\)]\s*').split(line)
>>> line
['a bbbb cccc dd', 'ee fff ggg', 'hhh iii jk']
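For contrast, here is a small sketch of why the original attempt fails: re.split treats the pattern as the delimiter, so [^()]+ consumes the words and leaves only the parentheses behind. The variable names are illustrative.

```python
import re

line = "a bbbb cccc dd ( ee fff ggg ) hhh iii jk"

# The pattern describes what to throw away. Here it matches the words,
# so only the parentheses (and empty strings at both ends) survive.
attempt = re.split(r"[^()]+", line)

# Splitting on the parentheses plus surrounding whitespace keeps the words.
parts = re.split(r"\s*[()]\s*", line)

print(attempt)
print(parts)
```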
To split the string into three parts, I think the simplest option is to use three capture groups: (some_regex)(another_regex)(yet_another_regex). In your case, the first part is any run of characters that are not (, followed by a literal (, then any run of characters that are not ), followed by a literal ), and finally whatever remains.
Therefore the regex is ([^(]*)\(([^)]*)\)(.*), which you can then use to retrieve the groups (your desired output, apart from the whitespace next to the parentheses):
>>> import re
>>> pattern = re.compile(r'([^(]*)\(([^)]*)\)(.*)')
>>> pattern.match(line).groups()
('a bbbb cccc dd ', ' ee fff ggg ', ' hhh iii jk')
With:
([^(]*) the first group
([^)]*) the second group
(.*) the last group
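The groups above still carry the spaces that sit next to the parentheses. A minimal sketch of one way to get the exact list from the question is to strip each group; the helper name split_around_parens is hypothetical, and the sketch assumes exactly one pair of parentheses on a single line.

```python
import re

PATTERN = re.compile(r"([^(]*)\(([^)]*)\)(.*)")

def split_around_parens(line):
    """Return [before, inside, after] for a line with one pair of parentheses."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("no parenthesized section found")
    # strip() removes the spaces that the groups capture next to the parentheses.
    return [group.strip() for group in match.groups()]

result = split_around_parens("a bbbb cccc dd ( ee fff ggg ) hhh iii jk")
print(result)
```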

How can I join consecutive non-empty lines using sed/awk?

How can I join consecutive non-empty lines into a single line using sed or awk?
Here is an example of what I am trying to do.
Input:
aaa ff gg
bbb eee eee
ss gg dd

aaa ff gg
bbb eee eee
ss gg dd

aaa ff gg
bbb eee eee
ss gg dd
Converts to
aaa ff gg bbb eee eee ss gg dd
aaa ff gg bbb eee eee ss gg dd
aaa ff gg bbb eee eee ss gg dd
Not sure if you REALLY want a blank line between each data line or not so here's both:
$ awk -v RS= '{$1=$1}1' file
aaa ff gg bbb eee eee ss gg dd
aaa ff gg bbb eee eee ss gg dd
aaa ff gg bbb eee eee ss gg dd
$ awk -v RS= -v ORS='\n\n' '{$1=$1}1' file
aaa ff gg bbb eee eee ss gg dd

aaa ff gg bbb eee eee ss gg dd

aaa ff gg bbb eee eee ss gg dd
This might work for you (GNU sed):
sed ':a;N;/\n$/!s/\n/ /;ta' file
Unless the last line appended is empty, replace a newline by a space and repeat. Otherwise print and repeat.
If you want empty lines deleted, then:
sed ':a;N;/\n$/!s/\n/ /;ta;P;d' file
If perl is okay:
$ perl -00 -pe 's/\n(?!$)/ /g' ip.txt
aaa ff gg bbb eee eee ss gg dd

aaa ff gg bbb eee eee ss gg dd

aaa ff gg bbb eee eee ss gg dd
-00 reads input in paragraph mode
See http://perldoc.perl.org/perlrun.html#Command-Switches for more info and for the -p and -e options
Use perl -i -00 -pe for in-place editing
s/\n(?!$)/ /g replaces every newline with a space, except the newlines that end the paragraph
@Schon: try:
awk '{ORS=/^$/?RS RS:FS} {$1=$1} 1;END{print RS}' Input_file
EDIT: Adding explanation too now.
awk '
{
  # ORS is the output record separator. On an empty line, set it to
  # RS RS (two newlines, since RS defaults to a newline); otherwise
  # set it to FS (the field separator, a space by default).
  ORS = /^$/ ? RS RS : FS
}
{ $1 = $1 }       # Reassigning a field makes awk rebuild the record, joining fields with OFS.
1                 # A true condition with no action prints the current record, followed by ORS.
END { print RS }  # Print a final newline once all input has been read.
' Input_file
A more readable example, less Perl-like:
awk '{ if ($0 == "") { print line "\n"; line = "" } else line = (line == "" ? $0 : line " " $0) } END { if (line != "") print line }' file
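For comparison, the same idea can be sketched in Python with itertools.groupby. This is only an illustration under stated assumptions: the function name join_paragraphs and the sample text are made up here, and the sample assumes the groups are separated by blank lines, as the question's title implies.

```python
from itertools import groupby

def join_paragraphs(text):
    """Join each run of consecutive non-empty lines into one line."""
    joined = []
    # groupby yields alternating runs of blank and non-blank lines.
    for is_text, run in groupby(text.splitlines(), key=lambda s: s.strip() != ""):
        if is_text:
            joined.append(" ".join(run))
    return joined

sample = (
    "aaa ff gg\nbbb eee eee\nss gg dd\n"
    "\n"
    "aaa ff gg\nbbb eee eee\nss gg dd\n"
)
for line in join_paragraphs(sample):
    print(line)
```

Printing a blank line between the joined lines, as the second awk command does, would only need "\n\n".join(...) on the returned list.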

How to format text, such that single line will not exceed N characters [duplicate]

This question already has answers here:
vim command to restructure/force text to 80 columns
I want to know the best way to format selected text so that no line exceeds N characters.
For example, I had this text at the beginning (note that the text doesn't exceed 80 columns):
aaaaaaa aaaaa aaaaaa aaaaa aaaaaaa aaaaaaaaaaaaaaa aaaaaaaaa aaaaaaaa
aaaaa aaaaaaa aaaaaa aaaaaaaaa a aaaa aaaaaaaaa aaaaaaa aaaaaaa aaaa
aaaaa aaaaaaa aaaaaaaaa aaaaaaaa aaaaaa aaaaaaaa aaaaa a aaaaaaaaaaaaaa
aaaaaaa aaaa aaaaaaaaaaa aaaaaa aaaaaaaa aaaaaaa aaaaaaaaa aaaaaaaaaa
aaaaaa aaaaaaaaaa aaaaaaaaaaaa aaaaa aaaaa aaaaa aaaaaaaa aaaaa aaa aaaaa
And then suddenly I had to change the first line and add text like:
BBBBBB BB B BBB BB BBB BBB BBB BBBB BBBBBBBBBBBB
So that the text becomes similar to this:
aaaaaaa aaaaa BBBBBB BB B BBB BB BBB BBB BBB BBBB BBBBBBBBBBBB aaaaaa aaaaa aaaaaaa aaaaaaaaaaaaaaa aaaaaaaaa aaaaaaaa
aaaaa aaaaaaa aaaaaa aaaaaaaaa a aaaa aaaaaaaaa aaaaaaa aaaaaaa aaaa
aaaaa aaaaaaa aaaaaaaaa aaaaaaaa aaaaaa aaaaaaaa aaaaa a aaaaaaaaaaaaaa
aaaaaaa aaaa aaaaaaaaaaa aaaaaa aaaaaaaa aaaaaaa aaaaaaaaa aaaaaaaaaa
aaaaaa aaaaaaaaaa aaaaaaaaaaaa aaaaa aaaaa aaaaa aaaaaaaa aaaaa aaa aaaaa
So what is the easiest way to reformat the text so that no line exceeds 80 characters?
P.S.
I don't want to format every line manually.
See here.
Basically, :set tw=80, then use the gq command to reformat preexisting text. To auto-wrap the entire file, go to the first line and type gqG (note capital G).
See this question.
Set textwidth to 80, move to the start of the file (can be done with
Ctrl-Home or gg), and type gqG.
gqG formats the text from the current position to the end
of the file. It will automatically join consecutive lines when
possible. You can place a blank line between two lines if you don't
want those two to be joined together.
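Outside Vim, a similar re-wrap can be sketched with Python's textwrap module. This is an approximation, not a reimplementation of gq: the helper name reflow is hypothetical, it handles a single paragraph, and textwrap.fill splits words longer than the width by default, which may differ from what gq does.

```python
import textwrap

def reflow(text, width=80):
    """Re-wrap a paragraph so that no line is longer than `width` characters."""
    # Collapse existing line breaks and runs of spaces before wrapping.
    words = " ".join(text.split())
    return textwrap.fill(words, width=width)

paragraph = (
    "aaaaaaa aaaaa BBBBBB BB B BBB BB BBB BBB BBB BBBB BBBBBBBBBBBB "
    "aaaaaa aaaaa aaaaaaa aaaaaaaaaaaaaaa aaaaaaaaa aaaaaaaa\n"
    "aaaaa aaaaaaa aaaaaa aaaaaaaaa a aaaa aaaaaaaaa aaaaaaa aaaaaaa aaaa\n"
)
wrapped = reflow(paragraph)
print(wrapped)
```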

How to replace the character I want in a line

1 aaa bbb aaa
2 aaa ccccccccc aaa
3 aaa xx aaa
How do I replace the second aaa with yyy on each line?
1 aaa bbb yyy
2 aaa ccccccccc yyy
3 aaa xx yyy
Issuing the following command will solve your problem.
:%s/\(aaa.\{-}\)aaa/\1yyy/g
Another way would be with \zs and \ze, which mark the beginning and end of a match in a pattern. So you could do:
:%s/aaa.*\zsaaa\ze/yyy
In other words, find "aaa" followed by anything and then another "aaa", and replace only that last "aaa" (the part after \zs) with "yyy".
Because .* is greedy, this replaces the last "aaa" on the line, so it won't work if a line has three "aaa"s; in that case, use \{-} instead of *. (See :h non-greedy)
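The difference between the greedy and non-greedy forms can be illustrated with a short sketch in Python's re module, where .*? plays the role of Vim's .\{-}. The function names replace_second and replace_last are made up for this example.

```python
import re

def replace_second(line, old="aaa", new="yyy"):
    """Replace the second occurrence of `old` on the line with `new`."""
    # .*? is non-greedy, so the match stops at the nearest following
    # occurrence of `old`. count=1 limits it to one replacement per line.
    pattern = "(" + re.escape(old) + ".*?)" + re.escape(old)
    return re.sub(pattern, lambda m: m.group(1) + new, line, count=1)

def replace_last(line, old="aaa", new="yyy"):
    """Greedy variant: .* runs on to the last occurrence of `old` instead."""
    pattern = "(" + re.escape(old) + ".*)" + re.escape(old)
    return re.sub(pattern, lambda m: m.group(1) + new, line, count=1)

print(replace_second("2 aaa ccccccccc aaa"))
print(replace_second("aaa x aaa y aaa"))
print(replace_last("aaa x aaa y aaa"))
```

With only two occurrences on a line both functions give the same result; they diverge once a third occurrence appears.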

Delete whole line containing given string

Is there a way to delete the whole line if it contains a specific word using sed? i.e.
I have the following:
aaa bbb ccc
qqq fff yyy
ooo rrr ttt
kkk ccc www
I want to delete lines that contain 'ccc' and leave other lines intact. In this example the output would be:
qqq fff yyy
ooo rrr ttt
All this using sed. Any hints?
sed -n '/ccc/!p'
or
sed '/ccc/d'
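As a point of comparison, the same filter can be sketched in Python. The function name drop_lines_containing is hypothetical, and it uses a plain substring test rather than a regular expression, which is equivalent to the sed address /ccc/ only because the word contains no special characters.

```python
def drop_lines_containing(lines, word):
    """Keep only the lines that do not contain `word`."""
    return [line for line in lines if word not in line]

rows = ["aaa bbb ccc", "qqq fff yyy", "ooo rrr ttt", "kkk ccc www"]
kept = drop_lines_containing(rows, "ccc")
print("\n".join(kept))
```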

Resources