Vim alternatives file - vim

:[range]s/{pattern}/{string}/[flags] [count]
For each line in [range] replace a match of {pattern}
with {string}.
The "or" operator in a pattern is "\|". Example:
/foo\|bar
This matches "foo" or "bar". More alternatives can be concatenated:
/one\|two\|three
Matches "one", "two" and "three".
Can we use a pattern/alternatives file with 3 lines?
one
two
three

The following command works on my system:
let @a = system('cat repl.vim | tr "\n" "|"') | exe '%s/\v'.@a.'<bs>/x/g'
Here, I have a list of words in the file repl.vim. The first part of the command uses let to save the list of words in register a, replacing every newline \n with the "or" operator | (with \v, "very magic", a bare | is the alternation operator). In the second part, exe runs the %s substitution.
In practice, if repl.vim contains:
pattern1
pattern2
pattern3
Running the command will result in:
%s/\vpattern1|pattern2|pattern3/x/g
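The same join can be tried outside Vim. The following is a minimal shell sketch, not part of the answer above: it assumes a POSIX paste and a sed that supports -E, uses a temporary directory as a stand-in for the real repl.vim, and the sample text is made up. paste -s joins the lines without a trailing separator, so no trailing | has to be removed.

```shell
# Temporary stand-in for repl.vim (hypothetical contents).
tmp=$(mktemp -d)
printf '%s\n' pattern1 pattern2 pattern3 > "$tmp/repl.vim"

# Join the lines with "|" to form one alternation.
alts=$(paste -sd'|' "$tmp/repl.vim")
printf '%s\n' "$alts"

# Apply it with sed -E, where a bare | is the "or" operator.
replaced=$(printf '%s\n' 'pattern1 and pattern3 but not pattern4' | sed -E "s/$alts/x/g")
printf '%s\n' "$replaced"

rm -rf "$tmp"
```

This only works as intended when the words in the file contain no regex metacharacters or slashes.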

Related

How to grep the string with specific pattern

I am trying to grep file.txt for lines that contain two strings, cp and (target file name), where each line in the file looks like this:
cp (source file name) (target file name)
The problem is that (target file name) follows a specific pattern: /path/to/file/TC_12_IT_(6 digits)_(6 digits)_TC_12_TEST _(2 digits).tc12.tc12
I am using the grep command below to find lines with these two strings:
grep -E cp.*/path/to/file/TC_12_IT_ file.txt
How can I be more specific about (target file name) in the grep command so that it matches the whole pattern? Something like this:
grep -E 'cp.*/path/to/file/TC_12_IT_*_*_TC_12_TEST_*.tc12.tc12' file.txt
Can we use wildcards in grep to search for a string in a file, the way we can use a wildcard like * when listing files? For example:
ls -lrt TC_12_*_12345678.txt
Please suggest any other ways to achieve this.
More specifically:
grep -P '^cp\s+.+\s+\S+/TC_12_IT_\d{6}_\d{6}_TC_12_TEST _\d{2}[.]tc12[.]tc12$' in_file > out_file
^ : beginning of the line.
\s+ : 1 or more whitespace characters.
.+ : 1 or more any characters.
\S+ : 1 or more non-whitespace characters.
\d{6} : exactly 6 digits.
[.] : literal dot (.). Note that just plain . inside a regular expression means any character, unless it is inside a character class ([.]) or escaped (\.).
$ : end of the line.
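A quick way to check a pattern like this is to run it against a few made-up lines. The sketch below is an illustration under assumptions: the file names and digits are invented, the target name is assumed to have no space before the final underscore, and the pattern is rewritten in ERE form ([0-9] for \d, [[:space:]] for \s) so that it also runs on a grep built without -P.

```shell
# Sample lines; the paths and digits are made up for illustration.
sample='cp /src/a.txt /path/to/file/TC_12_IT_123456_654321_TC_12_TEST_01.tc12.tc12
cp /src/b.txt /path/to/file/TC_12_IT_12345_654321_TC_12_TEST_01.tc12.tc12
mv /src/c.txt /path/to/file/TC_12_IT_123456_654321_TC_12_TEST_01.tc12.tc12'

# ERE form of the pattern: [0-9]{6} for \d{6}, [[:space:]] for \s,
# [^[:space:]] for \S.
pattern='^cp[[:space:]]+.+[[:space:]]+[^[:space:]]+/TC_12_IT_[0-9]{6}_[0-9]{6}_TC_12_TEST_[0-9]{2}[.]tc12[.]tc12$'

# Only the first line should survive: the second has 5 digits in the first
# group, and the third does not start with cp.
matches=$(printf '%s\n' "$sample" | grep -E "$pattern")
printf '%s\n' "$matches"
```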
SEE ALSO:
GNU grep manual
perlre - Perl regular expressions
Like this, using GNU grep:
grep -P 'cp.*TC_12_IT_\d{6}_\d{6}_TC_12_TEST_\d{2}.tc12.tc12' file
The regular expression matches as follows:
cp : the literal text 'cp'.
.* : any character except \n, 0 or more times (matching as much as possible).
TC_12_IT_ : the literal text 'TC_12_IT_'.
\d{6} : digits (0-9), 6 times.
_ : a literal underscore.
\d{6} : digits (0-9), 6 times.
_TC_12_TEST_ : the literal text '_TC_12_TEST_'.
\d{2} : digits (0-9), 2 times.
. : any character except \n.
tc12 : the literal text 'tc12'.
. : any character except \n.
tc12 : the literal text 'tc12'.
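Because an unescaped . matches any character, a pattern written this way also accepts names where the dots are something else. The sketch below illustrates the difference with a made-up line and ERE syntax; it is not part of the answer above.

```shell
# A made-up line where the dots before tc12 are replaced by X and Y.
line='cp a b/TC_12_IT_123456_654321_TC_12_TEST_01Xtc12Ytc12'

# Unescaped dots: counts 1 matching line.
loose=$(printf '%s\n' "$line" | grep -cE 'TEST_[0-9]{2}.tc12.tc12')

# Dots inside a character class: counts 0 matching lines.
# grep exits non-zero when nothing matches, hence the "|| true".
strict=$(printf '%s\n' "$line" | grep -cE 'TEST_[0-9]{2}[.]tc12[.]tc12' || true)

echo "loose=$loose strict=$strict"
```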

Partial replace with sed command

We have a file with some UTF-16 decimal characters and we would like to replace them in the following manner:
Test Line in a file \u343- ? some random words \u1233? 300 \u241? \u208?\cell
The required output is:
Test Line in a file \u343- ? some random words UTF16-1233| 300 UTF16-241| UTF16-208|\cell
The requirement is to change \u[0-9]+? to UTF16-[0-9]+|
That is, replace the initial \u with UTF16- and the ending ? with a pipe |.
Please note that if there is any non-digit character between \u and ?, the sequence should not be replaced.
Using sed to modify the file in place, you can:
Match \\u([0-9]+)\?:
Match a literal \u, match and capture one or more digits, match a literal ?.
Replace with UTF16-\1|:
Replace with the string UTF16-, followed by the captured group, followed by a pipe |.
$ sed -i -E 's/\\u([0-9]+)\?/UTF16-\1|/g' file
$ cat file
Test Line in a file \u343- ? some random words UTF16-1233| 300 UTF16-241| UTF16-208|\cell
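To check the rule about non-digit characters before touching a real file, the same expression can be run on a stream without -i. This is a small sketch with a made-up input line; the variable names are only for illustration.

```shell
# Made-up input: two sequences that must stay as they are, one that must change.
input='keep \u12a4? and \u? but change \u77?'

# Same expression as above, applied to standard input instead of a file.
result=$(printf '%s\n' "$input" | sed -E 's/\\u([0-9]+)\?/UTF16-\1|/g')
printf '%s\n' "$result"
```

\u12a4? is left alone because a letter sits between the digits and the ?, and \u? is left alone because [0-9]+ needs at least one digit.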

How to remove all data from a file before a line containing string by passing variable in linux

I am trying to trim the data above a line in a file, where the line contains a string that is passed in through a variable:
varfile=$(cat variable.txt)
echo "$varfile"
if [ -z "$varfile" ]; then
echo "null"
else
echo "data"
sed "1,/$varfile/d" fileee.txt
fi
Here I am taking a string from the variable.txt file, trying to find that text in the fileee.txt file, and removing all the data above the matching line.
Example: variable.txt contains 3.
I am finding 3 in fileee.txt and removing the data above it.
INPUT:
1
2
3
4
OUTPUT:
3
4
I suppose the issue here is that you want to remove all lines before the match, but not the matching line itself?
One way, with GNU sed, is to explicitly add a print for the matching line first:
pattrn=3
seq 1 4 | sed -e "/$pattrn/p;1,/$pattrn/d"
Though this will duplicate any further lines that match the pattern.
Better, invert the sense of the match:
seq 1 4 | sed -ne "/$pattrn/,\$p"
That is, don't print by default (-n), but print (p) anything from a match to the end ($, escaped because of the double-quoted string)
Even better would be to use awk:
pattrn=3
seq 1 4 | awk -vpat="$pattrn" '$0 ~ pat {p=1} p'
This sets a flag on the line where the whole line ($0) matches the pattern (~ is a regex match), then prints the lines whenever that flag is set.
The awk solution is also better in that special characters in the pattern don't cause issues (at least not as many); in the sed case, if the pattern contains a slash /, it will terminate the regex in the sed code, and cause syntax errors or allow for code injection.
I used seq from GNU coreutils here only to make up the sequence of numbers for input.
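The point about slashes can be seen with a pattern such as a/b. The sketch below uses made-up input; the second command is a variation that is not in the answer above, using awk's index() so that the value is treated as a fixed string instead of a regex.

```shell
pattrn='a/b'
input=$(printf '%s\n' x 'a/b' y)

# Regex match as in the awk command above: the slash needs no escaping,
# because the pattern arrives as a variable, not as part of the awk code.
regex_out=$(printf '%s\n' "$input" | awk -v pat="$pattrn" '$0 ~ pat {p=1} p')

# Variation: index() looks for the value as a literal substring.
literal_out=$(printf '%s\n' "$input" | awk -v pat="$pattrn" 'index($0, pat) {p=1} p')

printf '%s\n' "$regex_out"
```

The fixed-string form is worth considering when the variable may contain characters such as . or *, which the regex form would still interpret.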

Command to insert lines before first match

I have file with the below info
testing
testing
testing
I want to insert a word (tested) before the first testing line, using sed or any other Linux command.
I need output like this:
tested
testing
testing
testing
Thanks
For lines consisting of "testing" exactly:
sed '0,/^testing$/s/^testing$/tested\n&/' file
For lines containing "testing":
sed '0,/.*testing.*/s/.*testing.*/tested\n&/' file
For lines starting with "testing":
sed '0,/^testing.*/s/^testing.*/tested\n&/' file
For lines ending with "testing":
sed '0,/.*testing$/s/.*testing$/tested\n&/' file
To update the content of the file with the result add "-i", example:
sed -i '0,/testing/s/testing/tested\n&/' file
To provide an awk-based alternative that is easier to understand:
awk '!found && /testing/ { print "tested"; found=1 } 1' file
found is used to keep track of whether the first instance of testing has been found (variable found, as any Awk variable, defaults to 0, i.e., false in a Boolean context).
/testing/ therefore matches the first line containing testing, and processes the associated block:
{ print "tested"; found=1 } prints the desired text and sets the flag that the first testing line has been found
1 is a common shorthand for { print }, i.e., simply printing the current input line as is.
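As a sketch of how the awk approach generalizes, the pattern and the inserted text can be passed in with -v. The variable names pat and new are made up for this illustration, and the input is the three-line file from the question.

```shell
# Insert the value of "new" before the first line matching "pat".
result=$(printf '%s\n' testing testing testing |
  awk -v pat='testing' -v new='tested' '!found && $0 ~ pat { print new; found=1 } 1')
printf '%s\n' "$result"
```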
This might work for you (GNU sed):
sed -e '/testing/{itested' -e ':a;n;ba}' file
Insert tested before the first match of testing and then use a loop to read/print the remainder of the file.
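Or use the GNU-specific 0,/regexp/ address, branching past every line outside the range and inserting only on the matching line:
sed '0,/testing/!b;/testing/itested' file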
Consider file:
apple
banana
bar TEST qux
TEST foo
baz TEST
TEST
TEST
Insert a line before the first occurrence of a line consisting solely of the string 'TEST':
sed '0,/^TEST$/s//INSERTED LINE\n&/' <file
Result:
apple
banana
bar TEST qux
TEST foo
baz TEST
INSERTED LINE
TEST
TEST
0,/^TEST$/ is an address range. This GNU sed range begins before the first line of input (so the pattern may already match on line 1) and extends until a line consisting solely of TEST is reached. The use of a range restricts the region of applicability of the following command, a substitution command in our case.
A note about the meaning of //:
The empty regular expression ‘//’ repeats the last regular expression match
– https://www.gnu.org/software/sed/manual/html_node/Regexp-Addresses.html
A note about the meaning of the ampersand, &, used in the replacement portion of the s command:
the replacement can contain unescaped & characters which reference the whole matched portion of the pattern space
– https://www.gnu.org/software/sed/manual/html_node/The-_0022s_0022-Command.html
For completeness, here are three additional cases to consider...
Insert a line before the first occurrence of a line containing 'TEST':
sed '0,/.*TEST/s//INSERTED LINE\n&/' <file
Insert a line before the first occurrence of a line beginning with 'TEST':
sed '0,/^TEST/s//INSERTED LINE\n&/' <file
Insert a line before the first occurrence of a line ending with 'TEST':
sed '0,/.*TEST$/s//INSERTED LINE\n&/' <file
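To see where each of the three variants puts the new line, they can be run against the sample file. This is a sketch that assumes GNU sed (for the 0,/regexp/ address and \n in the replacement); the helper name first_after_insert is made up for the illustration.

```shell
# The sample file from above, held in a variable.
input=$(printf '%s\n' apple banana 'bar TEST qux' 'TEST foo' 'baz TEST' TEST TEST)

# Hypothetical helper: run the substitution with the given regex and
# print the line that directly follows the inserted one.
first_after_insert() {
  printf '%s\n' "$input" |
    sed "0,/$1/s//INSERTED LINE\n&/" |
    awk 'prev == "INSERTED LINE"; { prev = $0 }'
}

containing=$(first_after_insert '.*TEST')
beginning=$(first_after_insert '^TEST')
ending=$(first_after_insert '.*TEST$')

printf '%s\n' "$containing" "$beginning" "$ending"
```

The three results are the first line containing TEST, the first line beginning with TEST, and the first line ending with TEST.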

Working with sed linux command

In my shell script I saw a line that handles telephone numbers using the sed command:
sed "s~<Telephone type[ ]*=[ ]*\"fax\"[ ]*><Number>none[ ]*</Number></Telephone>~~g" input.xml > output.xml
I do not understand what the regular expression actually does:
<Telephone type[ ]*=[ ]*\"fax\"[ ]*><Number>none[ ]*</Number></Telephone>
I am reverse engineering the script to get this working.
My XML structure looks like this:
<ContactMethod>
<InternetEmailAddress>donald.francis#lexisnexis.com</InternetEmailAddress>
<Telephone type = "work">
<Number>215-639-9000 x3281</Number>
</Telephone>
<Telephone type = "home">
<Number>484-231-1141</Number>
</Telephone>
<Telephone type = "fax">
<Number>N/A</Number>
</Telephone>
<Telephone type = "work">
<Number>215-639-9000 x3281</Number>
</Telephone>
<Telephone type = "home">
<Number>484-231-1141</Number>
</Telephone>
<Telephone type = "fax">
<Number>none</Number>
</Telephone>
<Telephone type1 = "fax12234">
<Number>484-231-1141sadsadasdasdaasd</Number>
</Telephone>
</ContactMethod>
That regex recognises <Telephone type = "fax"> entries where the number is given as none, and deletes them.
Breakdown:
s sed command for "substitution".
~ pattern separator. You can choose any character for this. sed recognizes it because it comes right after the s.
<Telephone type This matches the literal text "<Telephone type".
[ ]* matches zero or more spaces.
= matches a literal "="
[ ]* matches zero or more spaces.
\"fax\" matches the literal text "fax", quotes included. The quotes are escaped because the whole expression appears inside double quotes; the shell removes the escape characters (\) before sed sees them.
[ ]* matches zero or more spaces.
><Number>none matches literal text.
[ ]* matches zero or more spaces.
</Number></Telephone> matches the literal text.
~~ the pattern separators end the search pattern, and surround an empty replace pattern.
g is a flag that means the substitution is applied to every match on each line, not just the first.
The only thing that confuses me is that this pattern won't match anything that has line breaks in it, so I presume your input.xml isn't actually formatted like you have in your example data?
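The line-break point can be checked with two made-up inputs, one with the fax entry on a single line and one split across lines as in the example data. This is a sketch: the expression is kept in a single-quoted variable, so the \" escapes from the original command are not needed here.

```shell
# Same expression as in the question, single-quoted so " needs no escaping.
expr='s~<Telephone type[ ]*=[ ]*"fax"[ ]*><Number>none[ ]*</Number></Telephone>~~g'

# Made-up input with everything on one line.
one_line='<Telephone type = "fax"><Number>none</Number></Telephone><Telephone type = "home"><Number>484-231-1141</Number></Telephone>'

# The same fax entry split across lines, as in the example data.
multi_line='<Telephone type = "fax">
<Number>none</Number>
</Telephone>'

flat=$(printf '%s\n' "$one_line" | sed "$expr")
split=$(printf '%s\n' "$multi_line" | sed "$expr")

printf '%s\n' "$flat"
printf '%s\n' "$split"
```

On the single-line input the fax entry is removed and the home entry remains. The multi-line input comes back unchanged, because sed applies the expression to one line at a time.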

Resources