I have a million lines of text that look like this:
HELLO random1 WORLD
HELLO random2 WORLD
HELLO random3 WORLD
Using the tools Sublime provides, how can I extract only the text that I need, so that the result is:
random1
random2
random3
Search using a regex with the pattern HELLO (\w+) WORLD and replace it with \1 (or $1).
\w matches a single word character, so \w+ matches a whole word. The parentheses around it capture the match and make it available as \1 (or $1).
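To see the capture group at work outside the editor, here is a minimal sketch of the same substitution in Python. The function name `extract_middle` and the sample string are made up for illustration. Sublime's regex engine is not Python's, but `(\w+)` and `\1` should behave the same way for this simple pattern.

```python
import re

def extract_middle(text: str) -> str:
    """Replace each 'HELLO <word> WORLD' with just the captured word."""
    # (\w+) captures one or more word characters; \1 refers back to that capture.
    return re.sub(r"HELLO (\w+) WORLD", r"\1", text)

sample = "HELLO random1 WORLD\nHELLO random2 WORLD\nHELLO random3 WORLD"
print(extract_middle(sample))
```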
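Use the regex option in search and replace. This regex, without the quotes, selects both words (along with the adjacent space), so replacing the matches with nothing leaves only the middle text: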
"HELLO | WORLD"
This is probably an oversimplified solution. You need to post more realistic examples for us to provide an exact one.
I have a paragraph like this:
"Nothing is worth more than the truth.
I like to say hello to him.
Give me peach or give me liberty.
Please say hello to strangers.
It's ok to say Hello to strangers."
I want the result to be:
"Nothing is worth more than the truth.
Give me peach or give me liberty."
If a line contains the word "hello", remove that line and keep only the lines without that word.
Based on some information I found in a reference, I think it should be as follows:
regexp "^[^hello]" $line
but it doesn't work.
There are several problems with your attempt:
regexp "^[^hello]" $line
A good practice with Tcl regular expressions is to put your regex inside curly braces instead of double quotes. Square brackets inside double quotes will be evaluated by Tcl as a command.
In a regular expression, ^ anchors the match to the beginning of the line.
Characters inside square brackets in a regular expression are considered a "character-class". [^hello] does not mean the opposite of matching "hello". Instead, it matches a single character that is not h, e, l, or o.
Do you care about case? If not, then add -nocase.
A Tcl expression that you can use in an if statement to check that a line does not include "hello" (or "Hello") is simply this:
![regexp -nocase {hello} $line]
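The same filtering idea, sketched in Python rather than Tcl for illustration (the variable names are hypothetical). It also shows why the character-class attempt fails: `^[^hello]` only checks the first character of the line, so it accepts every line of the sample.

```python
import re

text = """Nothing is worth more than the truth.
I like to say hello to him.
Give me peach or give me liberty.
Please say hello to strangers.
It's ok to say Hello to strangers."""

# Keep only the lines that do NOT contain "hello", ignoring case.
kept = [line for line in text.splitlines()
        if not re.search(r"hello", line, re.IGNORECASE)]
print("\n".join(kept))

# For contrast: ^[^hello] only tests that the first character
# is not one of h, e, l, o, which is true for every sample line.
first_char_ok = [bool(re.search(r"^[^hello]", line))
                 for line in text.splitlines()]
print(first_char_ok)
```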
I want to extract the first instance of a string per line in Linux. I am currently trying grep, but it yields all the instances on each line. Below, I want the strings (numbers and letters) after "tn=", but only the first one per line. The actual characters could be any combination of numbers or letters. There is a space after them, and also a space before the tn=.
Given the following file:
hello my name is dog tn=12g3 fun 23k3 hello tn=1d3i9 cheese 234kd dks2 tn=6k4k ksk
1263 chairs are good tn=k38493kd cars run vroom it95958 tn=k22djd fair gold tn=293838 tounge
Desired output:
12g3
k38493kd
Here's one way you can do it if you have GNU grep, which (mostly) supports Perl Compatible Regular Expressions with -P. Also, the non-standard switch -o is used to only print the part matching the pattern, rather than the whole line:
grep -Po '^.*?tn=\K\S+' file
The pattern matches the start of the line ^, followed by any characters .*?, where the ? makes the match non-greedy. After the first match of tn=, \K "kills" the previous part so you're only left with the bit you're interested in: one or more non-space characters \S+.
As in Ed's answer, you may wish to add a space before tn to avoid accidentally matching something like footn=.... You might also prefer to use something like \w to match "word" characters (equivalent to [[:alnum:]_]).
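For comparison, a minimal Python sketch of the same "first match per line" idea. Python's `re` module has no `\K`, so this version uses a capture group instead; the helper name `first_tn` is made up for illustration, and the leading space in the pattern is the safeguard against `footn=` mentioned above.

```python
import re

lines = [
    "hello my name is dog tn=12g3 fun 23k3 hello tn=1d3i9 cheese 234kd dks2 tn=6k4k ksk",
    "1263 chairs are good tn=k38493kd cars run vroom it95958 tn=k22djd fair gold tn=293838 tounge",
]

def first_tn(line: str):
    """Return the token after the first ' tn=' on the line, or None."""
    # re.search returns the leftmost match, so later tn= fields are ignored.
    m = re.search(r" tn=(\S+)", line)
    return m.group(1) if m else None

for line in lines:
    print(first_tn(line))
```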
Just split each line on the tn= separator and pick the second field. Then split that field again to get everything up to the first space:
$ awk -F"tn=" '{split($2,a, " "); print a[1]}' file
12g3
k38493kd
$ awk 'match($0,/ tn=[[:alnum:]]+/) {print substr($0,RSTART+4,RLENGTH-4)}' file
12g3
k38493kd
The sample text for the substitution:
hello world (one) hello world (two two) hello world (three three three)
The result I want:
hello world $one# hello world $two two# hello world $three three three#
I've tried to use:
s/(\(\w\\+\s*\))/$\1#/g
but it does not work.
Doing these two simple substitutions is a lot more intuitive and a lot faster than wasting your time trying to come up with a single one:
:s/(/\$/g
:s/)/#/g
Anyway:
:s/(\([^)]\+\))/\$\1#/g
The search part: we are looking for an opening parenthesis, followed by one or more characters that are not closing parentheses, which we put in a capture group, followed by a closing parenthesis.
The replacement part: we replace with a dollar sign, followed by our capture group, followed by an octothorpe.
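As a cross-check of the search and replacement parts described above, here is a small sketch in Python. Note that the escaping is the other way round from Vim's default syntax: in Python, `\(` is a literal parenthesis and a bare `(` opens a capture group.

```python
import re

line = "hello world (one) hello world (two two) hello world (three three three)"

# \( and \) are literal parentheses; ([^)]+) captures everything between them.
# In the replacement, $ and # are literal and \1 is the captured text.
result = re.sub(r"\(([^)]+)\)", r"$\1#", line)
print(result)
```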
Thanks very much, everyone. I have worked it out. The following command gives the result I want:
:s/(\(\(\w*\s*\)*\))/$\1#/g
Say I have a file called hello.txt which contains "Hello World!".
If I wanted to make a script which opened the file and read the contents (I know how to do that) and added stuff to the string, how would I go about doing this?
For example: Hello World would have '..' inserted at the start of the string/content, and then every 2 characters later, except at the end. Also consider the contents will not always be "Hello World".
Since you already know how to read from a file, I take it your only real question is how to add .. after every 2 characters of any given string:
my $string = "Hello World";
$string =~ s/^|(..)(?!$)/$1../g;
print "$string\n";
Output:
..He..ll..o ..Wo..rl..d
Though I can't imagine how that would ever be useful.
The regex looks for the start of string or two characters not followed by the end of the string, using negative look-ahead, and replaces all matches with any captured characters followed by two periods.
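A rough Python equivalent, for readers who want to experiment with the look-ahead. This sketch makes one deliberate change: instead of matching the start of the string with `^`, it prepends the leading `..` by plain concatenation and keeps only the `(..)(?!$)` part of the pattern. The function name `dotted` is hypothetical.

```python
import re

def dotted(s: str) -> str:
    """Insert '..' at the start and after every 2 characters, except at the end."""
    # (..) grabs two characters; (?!$) rejects the pair that ends the string.
    return ".." + re.sub(r"(..)(?!$)", r"\1..", s)

print(dotted("Hello World"))
```

With an even-length input such as "Hello World!", the final pair is left alone, so no trailing `..` is added.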
I am copy-pasting an example from a PDF into Vim, and I have to replace all “ and ” with " and all ‘ and ’ with ' so that the code works.
Put another way, which is probably easier to understand:
I want to replace all foo and bar with foobar simultaneously.
Try this in vi:
:1,$s/[“”]/"/g
then
:1,$s/[‘’]/'/g
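The Unix way is to use tr as a filter:
:%!tr “”‘’ \"\"\'\'
This relies on a tr that understands multibyte characters. GNU tr works byte by byte, so it may mangle the UTF-8 curly quotes.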
If you want to replace all "foo"s and all "bar"s with "foobar" you can use this:
%s/\v<(foo|bar)>/foobar/g
This will replace the "foo"s and the "bar"s but will leave any "foobar"s alone.
%s/ - substitute across the whole file
\v - use very magic regex syntax (see :help magic for more info)
< - match a left word boundary
(foo|bar) - foo or bar
> - match a right word boundary
/foobar/ - replacement string
g - globally (will happen for every occurrence, not just the first on the line)
Note that if you are just dealing with punctuation you'll probably want to remove the word boundary parts of this regex or it won't work.
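Both replacements can also be sketched in Python, in case the text needs cleaning before it reaches Vim. The names `QUOTE_MAP`, `straighten_quotes`, and `merge_foo_bar` are made up for this example. The quote mapping needs no word boundaries, in line with the note above, while `\b` stands in for Vim's `<` and `>` in the foo/bar case.

```python
import re

# Map each curly quote to its straight equivalent in a single pass.
QUOTE_MAP = str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"})

def straighten_quotes(text: str) -> str:
    return text.translate(QUOTE_MAP)

def merge_foo_bar(text: str) -> str:
    # \b is a word boundary, so an existing "foobar" is left alone.
    return re.sub(r"\b(?:foo|bar)\b", "foobar", text)

print(straighten_quotes("print(“hi”, ‘there’)"))
print(merge_foo_bar("foo bar foobar"))
```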