Vim substitute a Regex with randomly generated numbers

Is it possible to substitute a regular expression with a randomly generated number in Vim? The (random) replacement number should be different for each match of the regular expression. Here's an example of what I need.
Input File:
<a>XYZ</a>
<a>XYZ</a>
<a>XYZ</a>
<a>XYZ</a>
After substituting XYZ with random numbers, the output could be:
<a>599</a>
<a>14253</a>
<a>1718</a>
<a>3064</a>

If you don't mind a little perl in your vim, you can use
:%! perl -pne 's/XYZ/int(rand 1000)/ge'
Edit: updated to allow unlimited substitutions on a given line, per suggestion by @hobbes3, so
XYZ XYZ
XYZ XYZ XYZ
XYZ XYZ XYZ XYZ XYZ XYZ
XYZ XYZ
Becomes something like
86 988
677 477 394
199 821 193 649 502 471
732 208
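For comparison, the same per-match trick can be sketched in Python with re.sub, which accepts a function as the replacement and calls it once per match (a sketch of the idea, not part of the original answer):

```python
import random
import re

def randomize(text, upper=1000):
    # re.sub calls the lambda once per match, so every XYZ
    # gets its own freshly generated number.
    return re.sub(r"XYZ", lambda m: str(random.randrange(upper)), text)

print(randomize("<a>XYZ</a>\n<a>XYZ</a>"))
```

Here random.randrange(upper) plays the role of Perl's int(rand 1000).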

Try this: put the code below into a buffer, then source it (:source %).
let rnd = localtime() % 0x10000
function! Random()
let g:rnd = (g:rnd * 31421 + 6927) % 0x10000
return g:rnd
endfun
function! Choose(n) " returns a number in the range [0, n)
return (Random() * a:n) / 0x10000
endfun
Then you can do:
:s_\(<a>\).*\(</a>\)_\=submatch(1).Choose(line('.')*100).submatch(2)_
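The generator above is a simple linear congruential generator. Its behavior can be modeled in Python, with the same constants as the Vimscript, to check that Choose(n) always lands in [0, n) (my sketch; the Lcg class name and seed handling are mine):

```python
import time

class Lcg:
    """Mirror of the Vimscript: rnd = (rnd * 31421 + 6927) % 0x10000."""

    def __init__(self, seed=None):
        # localtime() in the Vimscript corresponds to seeding from the clock.
        self.rnd = (int(time.time()) if seed is None else seed) % 0x10000

    def random(self):
        self.rnd = (self.rnd * 31421 + 6927) % 0x10000
        return self.rnd

    def choose(self, n):
        # Scale the 16-bit state down to the range 0..n-1.
        return (self.random() * n) // 0x10000

lcg = Lcg(seed=42)
samples = [lcg.choose(100) for _ in range(1000)]
```

Since random() is always below 0x10000, (random() * n) // 0x10000 can never reach n, which is why the result stays in range.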

Related

Extract & store Strings with uneven spaces using AWK

I have a file containing data like the sample below. I want to cut the first and last columns and store them in variables. I am able to print them using the command "awk -F" {2,}" '{print $1,$NF}' filename.txt", but I am unable to store them in variables using awk's -v option.
The main problem is that the first column contains spaces between words, and awk treats it as 3 columns when I use the -v option.
Please suggest how I can achieve this.
XML 2144 11270 2846 3385074
Java 7356 272651 242949 1350596
C++ 671 46497 42702 179366
C/C++ Header 671 16932 57837 44248
XSD 216 3131 807 27634
Korn Shell 129 3686 4279 12431
IDL 90 1098 0 8697
Perl 17 717 795 5698
Python 37 1102 786 4640
Ant 62 596 154 4015
XSLT 18 117 13 2153
make 14 414 1659 1833
Bourne Again Shell 32 532 469 1830
JavaScript 10 204 35 1160
CSS 5 95 45 735
SKILL 2 77 0 523
HTML 11 70 49 494
SQL 9 39 89 71
C Shell 3 13 25 31
D 1 5 15 10
SUM: 11498 359246 355554 5031239
The -v VAR=value parameter is evaluated before the awk code executes. It's not actually part of the code, so you can't reference fields because they don't exist yet. Instead, set the variable in code:
awk '{ Lang=$1; Last=$NF; print Lang, Last; }'
Also, setting those variables within awk won't affect bash's variables. Environments are hierarchical--each child environment inherits some state from the parent environment, but it never flows back upwards. The only way to get state from a child is for the child to print it in a format that the parent can handle. For example, you can pipe the above command to while read LANG LAST; do ...; done to read the awk output into variables.
It seems from your comment that you're trying to mix awk and shell in a way that doesn't quite make sense. So the correct full code (for getting the variables in a bash loop) would be:
cat loc.txt | awk '{ Lang=$1; Last=$NF; print Lang, Last; }' | while read LANG LAST; do ...; done
Or if it's a fixed number of fields, you can skip awk entirely:
cat loc.txt | while read LANG _ _ _ LAST; do ...; done
where each "_" is just an ordinary variable that is assigned and then ignored. It's a bit of a convention that underscores represent placeholders in some programming languages, and in this case it really is a variable, which could even be printed with echo $_. You'd give each field a real, distinct name if you cared about the middle values.
Neither of these solutions cares about how much whitespace there is. Awk doesn't care unless you tell it to, and neither does the shell.
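The same extraction can also be sketched in Python, splitting on runs of two or more spaces just like the asker's -F" {2,}", so that a multi-word first column such as "Bourne Again Shell" survives intact (my sketch; first_and_last is a made-up name):

```python
import re

def first_and_last(line):
    # Split on runs of two or more spaces, like awk -F" {2,}",
    # so a multi-word first column stays in one piece.
    fields = re.split(r" {2,}", line.strip())
    return fields[0], fields[-1]

print(first_and_last("Bourne Again Shell   32   532   469   1830"))
```

fields[0] and fields[-1] correspond directly to awk's $1 and $NF.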

Edit part of a line in text without losing other lines

I tried to replace the tstop parameter in the text from 120 to 80, but what I got was a single line of output, tstop 80, losing the rest of the text. I used
sed -i -rne 's/(tstop)\s+\w+/\1 80/gip'
I want to change only the tstop line and keep the rest of the text as it is.
Part of the text is:
[Grid]
X1-grid 1 -6.0 24 u 6.0
X2-grid 1 -24. 96 u 24.
X3-grid 1 -18.0 72 u 18.0
[Chombo Refinement]
Levels 4
Ref_ratio 2 2 2 2 2
Regrid_interval 2 2 2 2
Refine_thresh 0.3
Tag_buffer_size 3
Block_factor 8
Max_grid_size 64
Fill_ratio 0.75
[Time]
CFL 0.3
CFL_max_var 1.1
tstop 120
first_dt 1.e-5
[Solver]
Solver tvdlf
with GNU sed:
sed -E 's/^(tstop +)[^ ]*/\180/' file
or
sed -E '/^tstop/s/[^ ]+$/80/' file
If you want to edit your file "in place" use sed's option -i.
See: The Stack Overflow Regular Expressions FAQ
The n flag in -rne suppresses sed's automatic printing of every line; only the lines you explicitly print with the p flag are output, which is why the rest of the file disappeared. Drop both and try this:
sed -i -re 's/(tstop)\s+\w+/\1 80/gi'
A more portable version using BREs (Basic Regular Expressions) could be:
sed -i -e 's/\(tstop\)\( *\)[[:alnum:]]*/\1\280/' file
Note that the spaces after tstop are also captured here, to preserve the file format. Also, the i and g modifiers seem unnecessary in your case.
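The anchored key-and-value replacement can also be sketched with Python's re.sub; the MULTILINE flag makes ^ match at every line start, so every other line passes through untouched, and \g<1> is used instead of \1 so the group reference isn't ambiguous when immediately followed by the digits 80 (my sketch; set_tstop is a made-up name):

```python
import re

def set_tstop(text, value):
    # ^ with re.MULTILINE anchors at each line start; only the value
    # after "tstop" is rewritten, all other lines are kept verbatim.
    return re.sub(r"^(tstop +)\S+", rf"\g<1>{value}", text, flags=re.MULTILINE)

config = "[Time]\nCFL 0.3\ntstop 120\nfirst_dt 1.e-5\n"
print(set_tstop(config, 80))
```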

Remove lines from a file which do not appear in another file, error

I have two files, similar to the ones below:
File 1 - with phenotype information; the first column is the individual ID. The original file has 400 rows:
215 2 25 13.8354303 15.2841303
222 2 25.2 15.8507278 17.2994278
216 2 28.2 13.0482192 14.4969192
223 11 15.4 9.2714745 11.6494745
File 2 - with SNP information; the original file has 400 lines and 42,000 characters per line.
215 20211111201200125201212202220111202005111102
222 20111011212200025002211001111120211015112111
216 20210005201100025210212102210212201005101001
223 20222120201200125202202102210121201005010101
217 20211010202200025201202102210121201005010101
218 02022000252012021022101212010050101012021101
And I need to remove from file 2 the individuals that do not appear in file 1, giving for example:
215 20211111201200125201212202220111202005111102
222 20111011212200025002211001111120211015112111
216 20210005201100025210212102210212201005101001
223 20222120201200125202202102210121201005010101
I could do this with this code:
awk 'NR==FNR{a[$1]; next} $1 in a{print $0}' file1 file2 > file3
However, when I do my main analysis with the generated file, the following errors appear:
*** Error in `./airemlf90': free(): invalid size: 0x00007f5041cc2010 ***
*** Error in `./postGSf90': free(): invalid size: 0x00007fec4a04f010 ***
airemlf90 and postGSf90 are software packages. When I use the original file this problem does not occur. Is the command I used to delete individuals adequate? Another detail I did not mention: some individuals have IDs of 4 characters; could that be causing the error?
Thanks
I wrote a small python script in a few minutes. It works well; I have tested it with 42,000-character lines and it works fine.
import sys, re

# rudimentary argument parsing
file1 = sys.argv[1]
file2 = sys.argv[2]
file3 = sys.argv[3]

present = set()

# first read file 1, discard all fields except the first one (the key)
with open(file1, "r") as f1:
    for l in f1:
        toks = re.split(r"\s+", l)  # same as awk fields
        if toks:  # robustness against empty lines
            present.add(toks[0])

# now read the second file and write to the third only if the id is in the set
with open(file2, "r") as f2:
    with open(file3, "w") as f3:
        for l in f2:
            toks = re.split(r"\s+", l)
            if toks and toks[0] in present:
                f3.write(l)
(First install python if not already present.)
Call my sample script mytool.py and run it like this:
python mytool.py file1.txt file2.txt file3.txt
Processing several sets of files at once from a bash script (to replace the original solution) is easy, although not optimal, since the looping could also be done directly in Python:
<whatever the for loop you need>; do
    python mytool.py "$1" "$2" "$3"
done
exactly like you would call awk with 3 files.

How can I remove the string and use the number in front of it

I have a file like this. I use Ubuntu and the terminal.
1345345 dfgdfg
1345 dfgdfg
13445 dfgdfg
1345345 ddfg
15345 df
145 dfgdfg
45 dfgdfg
15 dfgdfg
I want to create a script with which I can remove the strings and divide or multiply the numbers, printing the result next to each one, like this:
1345345 *3 or /3 result =
1345 *3 or /3
13445 *3 or /3
1345345 *3 or /3
15345 *3 or /3
145 *3 or /3
45 *3 or /3
15 *3 or /3
This is for a file with 50 or more entries, and the output should then go to a new text file.
All of this is on Ubuntu.
Thanks
a very basic example would be something like this:
cat input | sed -r 's/ *([0-9]+).*/\1/' | xargs perl -e 'for($c=0;$c<=$#ARGV;$c++) {print ($ARGV[$c] . ": " . $ARGV[$c] * 3 . "\n");}'
(input is a file that contains your data)
gives:
1345345: 4036035
1345: 4035
13445: 40335
1345345: 4036035
15345: 46035
145: 435
45: 135
15: 45
You'll need to flesh it out more to serve your complete purpose no doubt, but that's supposed to be part of the fun
So let's break it down.
we pipe (using |) the contents of our input file into a sed regular expression that only extracts the first numbers and ignores everything else: cat input | sed -r 's/ *([0-9]+).*/\1/'
it skips any leading spaces with " *" (since the example had a few when I copied it),
captures any number it can find with ([0-9]+),
allows it to be followed by anything else with .*,
and replaces the complete string with what it captured; that's what the s/pattern/replacement/ construct is about.
this would land you with the following result:
1345345
1345
13445
1345345
15345
145
45
15
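That extraction stage maps directly onto Python's re.sub, if you prefer to prototype the pattern there (a sketch; extract_number is a made-up name):

```python
import re

def extract_number(line):
    # Same pattern as the sed stage: optional leading spaces,
    # capture the digits, discard the rest of the line.
    return re.sub(r" *([0-9]+).*", r"\1", line)

print(extract_number("1345345 dfgdfg"))
```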
you wish to perform an operation on this data: for this you generally need some programming language, like python, perl, ruby or whatever else suits your needs (some things your shell will do just fine for you). I used perl here, which begets us | xargs perl -e 'for($c=0;$c<=$#ARGV;$c++) {print ($ARGV[$c] . ": " . $ARGV[$c] * 3 . "\n");}'
So again we pipe the data to our next command with |
to send output from a previous pipe to another program as arguments we use xargs; it's as simple as prepending your command with it
next, your program loops (for($c=0;$c<=$#ARGV;$c++) { ... }) through the command-line arguments provided, performs your action (here * 3) and prints the result (print ($ARGV[$c] . ": " . $ARGV[$c] * 3 . "\n");).
Once you have your data, redirect it to a new file; that part is not done here yet.
Alternatively you could also use grep or many other programs; that's the beauty of *nix: it has many tools. The basic concept you're looking for, however, is filtering your data, working on it and spitting it out again.
Using perl (here the substitution removes everything from the first whitespace onwards, leaving just the number):
perl -lne 's/\s.*//; print "$_ x 3 = ", $_ * 3' file.txt | tee newfile.txt
Using awk :
awk '{print $1 " x 3 = " $1 * 3}' file.txt | tee newfile.txt
Using only bash :
while read n x; do echo "$n x 3 = $((n * 3))"; done < file.txt | tee newfile.txt
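The whole strip-and-multiply pipeline, including the "N x 3 = ..." formatting from the one-liners above, could also be sketched in one short Python function (my sketch; multiply_file is a made-up name, and division would work the same way with n / factor):

```python
def multiply_file(lines, factor=3):
    # For each line: keep the leading number, drop the string,
    # and format the product next to it.
    out = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        n = int(fields[0])
        out.append(f"{n} x {factor} = {n * factor}")
    return out

results = multiply_file(["1345345 dfgdfg", "1345 dfgdfg", "45 dfgdfg"])
```

Writing the list to a new file then replaces the | tee newfile.txt step.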

How do I use the "substitute" command using a provided pattern in vimscript?

I have the following function defined in my .vimrc. For a given file, this should change the beginning of each line from the 3rd line onwards with the line number.
function Foo()
3,$ s/^/ /g
3
let i=1 | ,$ g/^/ s//\=i/ | let i+=1
1
endfunction
However, I want to change the function so that it accepts one argument and inserts that word, so that the function will look something like the following:
function Foo(chr)
3,$ s/^/ /g
3
let i=1 | ,$ g/^/ s//\=i/ | let i+=1
1
3,$ s/^/chr /g
endfunction
EDIT: Providing an example.
My input file looks something like this:
some text1
some text 2
0000
0000
0001
0002
I want to make the file look as follows:
sm1 1 0000
sm1 2 0000
sm1 3 0001
.
.
So i want to be able to give that "sm1" as a argument to the function, so that for another file i might want to have "sm2" instead of "sm1".
You probably don't need a function since
:3,$s/^/chr /
should work. However, if you wanted to make a command for this you could make one like this:
command! -nargs=1 Example 3,$s/^/<args> /
This would allow you to use :Example chr to insert chr at the beginning of lines 3 and above.
Also, you said that your original function inserts the "line number", but instead it will insert 1 on line 3 and so on. I'm sure you know that you can enable line numbers with :set nu, but if you want to insert line numbers on each line 3 and above you can do:
fun! Foo()
3,$s/^/\=line('.')." "
endfun
or if you want to keep your previous functionality, this is more succinct:
fun! Foo()
3,$s/^/\=(line('.')-2)." "
endfun
If you want to combine all of it into one command you can do
com! -nargs=1 Example 3,$s/^/\="<args> ".(line('.')-2)." "
This will give you an :Example <argument> command. So now you can do :Example sm1 like you wanted.
If you want to keep your function as is, to make it work you should use a:chr like this:
function Foo(chr)
3,$ s/^/ /g
3
let i=1 | ,$ g/^/ s//\=i/ | let i+=1
1
exe "3,$s/^/".a:chr." /g"
endfunction
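For clarity, the transformation the finished function performs (leave the first two lines alone, then prefix each following line with the given word and a counter starting at 1) can be modeled outside Vim in a few lines of Python (my sketch; prefix_lines is a made-up name, not Vimscript):

```python
def prefix_lines(lines, word):
    # Lines 1-2 pass through; from line 3 on, prepend "<word> <counter> ".
    out = []
    for i, line in enumerate(lines, start=1):
        if i >= 3:
            out.append(f"{word} {i - 2} {line}")
        else:
            out.append(line)
    return out

print(prefix_lines(["some text1", "some text 2", "0000", "0000", "0001"], "sm1"))
```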
