Is there a command to determine the length of the longest line in Vim? And to insert that length at the beginning of the file?
GNU's wc command has a -L (--max-line-length) option, which prints the length of the longest line in the file. See the GNU man page for wc. The FreeBSD wc also has -L, but not --max-line-length; see the FreeBSD man page for wc.
How to use these from vim? The command:
:%!wc -L
filters the open file through wc -L and replaces the file's contents with the maximum line length.
To retain the file contents and put the maximum line length on the first line do:
:%yank
:%!wc -L
:put
Instead of using wc, the question "Find length of longest line - awk bash" describes how to use awk to find the length of the longest line.
Ok, now for a pure Vim solution. I'm somewhat new to scripting, but here goes. What follows is based on the FilterLongestLineLength function from textfilter.
function! PrependLongestLineLength ( )
let maxlength = 0
let linenumber = 1
while linenumber <= line("$")
exe ":".linenumber
let linelength = virtcol("$")
if maxlength < linelength
let maxlength = linelength
endif
let linenumber = linenumber+1
endwhile
exe ':0'
exe 'normal O'
exe 'normal 0C'.maxlength
endfunction
command PrependLongestLineLength call PrependLongestLineLength()
Put this code in a .vim file (or your .vimrc) and :source the file. Then use the new command:
:PrependLongestLineLength
Thanks, figuring this out was fun.
If you work with tabulations expanded, a simple
:0put=max(map(getline(1,'$'), 'len(v:val)'))
is enough.
Otherwise, I guess we will need the following (that you could find as the last example in :h virtcol(), minus the -1):
:0put=max(map(range(1, line('$')), "virtcol([v:val, '$'])-1"))
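The two one-liners differ only in whether a tab counts as one character or as the screen columns it occupies. For comparison, here is a minimal Python sketch of the same computation outside Vim. The function names and the tabsize parameter are made up for illustration, and Python's len() counts characters rather than bytes, so results match Vim's len() only for ASCII text.

```python
def longest_line_length(text, tabsize=None):
    """Return the length of the longest line in text.

    With tabsize=None each tab counts as a single character.
    With a tab size, tabs are expanded to spaces first, which
    approximates the virtcol()-based variant.
    """
    lines = text.splitlines()
    if tabsize is not None:
        lines = [line.expandtabs(tabsize) for line in lines]
    # default=0 covers the empty-input case
    return max((len(line) for line in lines), default=0)


def prepend_longest_line_length(text, tabsize=None):
    """Return text with the maximum line length added as a new first line."""
    return f"{longest_line_length(text, tabsize)}\n{text}"


sample = "ab\n\tcd\nabcde\n"
print(longest_line_length(sample))             # tabs counted as one character
print(longest_line_length(sample, tabsize=8))  # tabs expanded to 8-column stops
```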
:!wc -L %
rather than
:%!wc -L
To append that length at the beginning of the file:
:0r !wc -L % | cut -d' ' -f1
Here is a simple, hence easily-remembered approach:
select all text: ggVG
substitute each character (.) with "a": :'<,'>s/./a/g
sort, unique: :'<,'>sort u
count the characters in the longest line (if too many characters to easily count, just look at the column position in the Vim status bar)
I applied this to examine Enzyme Commission (EC) numbers, prior to making a PostgreSQL table:
I copied the ec_numbers data to Calc, then took each column in Neovim, replaced each character with "a",
:'<,'>s/./a/g
and then sorted for unique lines
:'<,'>sort u
aaaaaaa
aaaaaaaa
aaaaaaaaa
aaaaaaaaaa
aaaaaaaaaaa
... so the longest EC number entry [x.x.x.x] is 11 char, VARCHAR(11).
Similarly applied to the Accepted Names, we get
aaaaa
aaaaaa
...
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i.e. the longest name is 147 char: VARCHAR(200) should cover it!
For neovim users:
local bufnr = 0 -- 0 refers to the current buffer
local lines = vim.api.nvim_buf_get_lines(bufnr, 0, -1, false)
local width = #(lines[1])
for _, line in ipairs(lines) do
if #line > width then
width = #line
end
end
Consider the following Vim ex command,
:let i=1 | '<,'>g/^/ s/^\ *-/\=i/ | let i+=1
It replaces the heading dash with ordered number in selected lines.
I don't understand why this command works as a loop from the first line to the last line of the selected lines. That is, how g can repeat let i+=1 over and over again.
The pattern of a global command is:
:range g[lobal][!]/pattern/cmd
The global command works by first scanning through the lines of the [range] and marking each line where a match occurs. In a second scan, the [cmd] is executed for each marked line, with its line number prepended. If a line is changed or deleted, its mark disappears. The default for the [range] is the whole file. (See http://vimregex.com/#global for more details.)
Now let's analyse
:let i=1 | '<,'>g/^/ s/^\ *-/\=i/ | let i+=1
step by step.
let i=1 is a single command executed setting the basic number for the loop. We can just execute it alone at the very beginning. Then '<,'>g/^/ s/^\ *-/\=i/ | let i+=1 looks a little more like a global command.
'<,'>g defines the range. '< represents the first line and '> represents the last line of the selected area. (:help '< for more details)
^ of course matches every line in range.
s/^\ *-/\=i/ | let i+=1 is the [cmd]. It is executed once for each line in the selected area, and this is the main reason the loop takes place.
The part before | is a typical substitute command :range s[ubstitute]/pattern/string/ (see http://vimregex.com/#substitute for more details)
^\ *- matches 0 or more whitespace followed by a dash at the beginning of a line. We substitute \=i for this pattern. (:help :s\= for more details)
After s/^\ *-/\=i/, let i+=1 is executed. Then the next line, ... , till the last line of selected area.
To see that s/^\ *-/\=i/ | let i+=1 is a single [cmd] as a whole, we can swap the order of the two sub-commands, obtaining let i+=1 | s/^\ *-/\=i/. To get the same effect, however, the counter must then start at 0 (let i=0) at the very beginning.
This is the general pattern of a :global command:
:g/foo/command
Because everything after the second separator is considered as one command, the counter is incremented each time the command is executed: one time for each matching line.
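The two-pass behaviour can be mimicked outside Vim. The following is only a sketch in Python of the idea described above (mark the matching lines first, then run the whole command once per marked line). The function name is made up, and it models nothing of Vim beyond this one command.

```python
import re


def number_dashes(lines):
    r"""Mimic  :let i=1 | g/^/ s/^\ *-/\=i/ | let i+=1  on a list of lines."""
    # First pass: mark every line matching the :g pattern.
    # The pattern ^ matches every line, so all lines get marked.
    marked = [idx for idx, line in enumerate(lines) if re.search(r"^", line)]

    # Second pass: run the whole command (substitute, then increment)
    # once for each marked line. The counter lives outside the loop,
    # just as i lives outside the :g command.
    i = 1
    result = list(lines)
    for idx in marked:
        result[idx] = re.sub(r"^ *-", str(i), result[idx], count=1)
        i += 1
    return result


print(number_dashes(["- apples", "  - pears", "- plums"]))
```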
Here you can see an output of "cat tcl.log":
Discovered serial numbers for slot 1 olt 1:
sernoID Vendor Serial Number sernoID Vendor Serial Number
5 ZNTS 032902A6
And that's how it looks in VIM:
^MDiscovered serial numbers for slot 1 olt 1:
^MsernoID Vendor Serial Number sernoID Vendor Serial Number
^M<SPACE> for next page, <CR> for next line, A for all, Q to quit^H ^H^H ^H^...
5 ZNTS 032902A6
I don't mind the ^M and ^H characters, I know how to get rid of them. The problem is that for some reason my C++ program (unlike cat) is seeing the line starting with "< SPACE >". What can I do about it? I'm using the fstream library to read the log file and I want it to ignore the line I mentioned. I tried to do something like this:
std::ofstream logFinal("logFinal");
std::ifstream log("tcl.log");
std::string temp;
while (std::getline(log, temp)){
if (temp.find("SPACE") != std::string::npos){
temp = "";
}
logFinal << temp << std::endl;
}
But for some reason it doesn't find any "SPACE" in the temp variable. It looks like the "< SPACE >" is some kind of special character that I've never heard of.
You're obtaining that log file from/via some sort of program that does paging. (It could be buried inside things; these things happen.) That paging program prints a message like this at the end of a page:
<SPACE> for next page, <CR> for next line, A for all, Q to quit
The <SPACE> is just part of some message with human-readable text; it's seven very ordinary characters. HOWEVER, the ^H that follow it are more interesting, as they're really backspace characters; it's where the preceding characters are deleted again to make way for the next line of real output.
The easiest way (assuming you're on — or have easy access to — a Unix/Linux system) is to feed that log file through col -b (the col program with the -b option, to do backspace elimination). Check out this little cut-n-paste from a shell session:
bash$ echo -e 'abc\b\b\bdef'
def
bash$ echo -e 'abc\b\b\bdef' | od -c
0000000 a b c \b \b \b d e f \n
0000012
bash$ echo -e 'abc\b\b\bdef' | col -b | od -c
0000000 d e f \n
0000004
(The \b should be the same as ^H in your log file.)
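If col is not at hand, the effect of backspace elimination on a single line can be sketched in a few lines. This Python illustration is not a reimplementation of col: it handles only backspaces, it ignores carriage returns and other control characters, and the function name is made up.

```python
def apply_backspaces(line):
    """Resolve backspaces the way a terminal overwrites characters.

    Each backspace moves the cursor one column to the left. A character
    written afterwards replaces whatever was in that column, so only
    the last character written to each column survives.
    """
    cells = []  # characters currently "on screen"
    pos = 0     # cursor column
    for ch in line:
        if ch == "\b":
            pos = max(pos - 1, 0)   # never move left of column 0
        elif pos < len(cells):
            cells[pos] = ch         # overwrite an existing column
            pos += 1
        else:
            cells.append(ch)        # extend the line
            pos += 1
    return "".join(cells)


print(apply_backspaces("abc\b\b\bdef"))
```

Applied to the shell example above, "abc" followed by three backspaces and "def" leaves only "def".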
I have a file that contains lines as follows:
one one
one one
two two two
one one
three three
one one
three three
four
I want to remove all occurrences of the duplicate lines from the file and leave only the non-duplicate lines. So, in the example above, the result should be:
two two two
four
I saw this answer to a similar looking question. I tried to modify the ex one-liner as given below:
:syn clear Repeat | g/^\(.*\)\n\ze\%(.*\n\)*\1$/exe 'syn match Repeat "^' . escape(getline ('.'), '".\^$*[]') . '$"' | d
But it does not remove all occurrences of the duplicate lines, it removes only some occurrences.
How can I do this in vim? or specifically How can I do this with ex in vim?
To clarify, I am not looking for sort u.
If you have access to UNIX-style commands, you could do:
:%!sort | uniq -u
The -u option to the uniq command performs the task you require. From the uniq command's help text:
-u, --unique
only print unique lines
I should note however that this answer assumes that you don't mind that the output doesn't match any sort order that your input file might have already.
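If you are on a Linux box with awk available, this line does what you need (note that awk's for-in loop does not guarantee that the remaining lines keep their original order):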
:%!awk '{a[$0]++}END{for(x in a)if(a[x]==1)print x}'
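The counting idea behind the awk one-liner can also be written so that the surviving lines keep their original order. This is a minimal Python sketch of that idea, with a made-up function name. It is not something Vim runs directly, although it could be saved as a script and used as a filter.

```python
from collections import Counter


def keep_unique_lines(lines):
    """Drop every line that occurs more than once, keeping input order."""
    counts = Counter(lines)  # first pass: count each distinct line
    # second pass: keep only the lines seen exactly once
    return [line for line in lines if counts[line] == 1]


example = [
    "one one",
    "one one",
    "two two two",
    "one one",
    "three three",
    "one one",
    "three three",
    "four",
]
print(keep_unique_lines(example))
```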
Assuming you are on a UNIX derivative, the command below should do what you want:
:sort | %!uniq -u
uniq only works on sorted lines, so we must sort them first. Vim's built-in :sort command saves some typing: it works on the whole buffer by default, so we don't need to pass it a range, and it's a built-in command, so we don't need the !.
Then we filter the whole buffer through uniq -u.
My PatternsOnText plugin version 1.30 now has a
:DeleteAllDuplicateLinesIgnoring
command. Without any arguments, it'll work as outlined in your question.
It does not preserve the order of the remaining lines, but this seems to work:
:sort|%s/^\(.*\)\n\%(\1\n\)\+//
(This version is @Peter Rincker's idea, with a little correction from me.) On Vim 7.3, the following even shorter version works:
:sort | %s/^\(.*\n\)\1\+//
Unfortunately, due to differences between the regular-expression engines, this no longer works in vim 7.4 (including patches 1-52).
Taking the code from here and modifying it to delete the lines instead of highlighting them, you'll get this:
function! DeleteDuplicateLines() range
let lineCounts = {}
let lineNum = a:firstline
while lineNum <= a:lastline
let lineText = getline(lineNum)
if lineText != ""
if has_key(lineCounts, lineText)
execute lineNum . 'delete _'
if lineCounts[lineText] > 0
execute lineCounts[lineText] . 'delete _'
let lineCounts[lineText] = 0
let lineNum -= 1
endif
else
let lineCounts[lineText] = lineNum
let lineNum += 1
endif
else
let lineNum += 1
endif
endwhile
endfunction
command! -range=% DeleteDuplicateLines <line1>,<line2>call DeleteDuplicateLines()
This is not any simpler than @Ingo Karkat's answer, but it is a little more flexible. Like that answer, this leaves the remaining lines in the original order.
function! RepeatedLines(...)
let first = a:0 ? a:1 : 1
let last = (a:0 > 1) ? a:2 : line('$')
let lines = []
for line in range(first, last - 1)
if index(lines, line) != -1
continue
endif
let newlines = []
let text = escape(getline(line), '\')
execute 'silent' (line + 1) ',' last
\ 'g/\V' . text . '/call add(newlines, line("."))'
if !empty(newlines)
call add(lines, line)
call extend(lines, newlines)
endif
endfor
return sort(lines)
endfun
:for x in reverse(RepeatedLines()) | execute x 'd' | endfor
A few notes:
My function accepts arguments instead of handling a range. It defaults to the entire buffer.
This illustrates some of the functions for manipulating lists. :help list-functions
I use /\V (very no magic) so the only character I need to escape in a search pattern is the backslash itself. :help /\V
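1. Add a line number to each line so that you can restore the order after sorting:
:%s/^/\=printf("%d ", line("."))/
2. Sort on the text that follows the number:
:sort /^\d\+ /
3. Remove the duplicate lines (every copy of them):
:%s/\v^(\d+ )(.*)\n(\d+ \2\n)+//
4. Restore the original order by sorting numerically:
:sort n
5. Remove the line numbers added in step 1:
:%s/^\d\+ //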
Perl can do this easily:
use strict;use warnings;use diagnostics;
#read input file
open(File1,'<input.txt') or die "can not open file:$!\n";my @data1=<File1>;close(File1);
#save each row in a hash and count how often it occurs
my %rownum;
foreach my $line1 (@data1)
{
if (exists($rownum{$line1}))
{
$rownum{$line1}++;
}
else
{
$rownum{$line1}=1;
}
}
#print each row whose count in the hash is 1
open(File2,'>output.txt') or die "can not open file:$!\n";
foreach my $line1 (@data1)
{
if($rownum{$line1}==1)
{
print File2 $line1;
}
}
close(File2);
By "outside", I mean solutions that do not use Vim's scripting hacks but instead reuse basic *nix tools. By "inside", I mean solutions that do the column increment within Vim itself, for example with scripting.
1 1
1 2
1 3
1 ---> 4
1 5
1 6
. .
. .
Vim has a script that does column-wise incrementing, VisIncr. It has gathered roughly equal numbers of up-votes and down-votes, perhaps because it smacks a bit of reinventing the wheel. How do you column-increment stuff in Vim without using such a script? Then the other question is, how do you column-increment stuff without/outside Vim?
Most elegant, reusable and preferably-small wins the race!
I don't see a need for a script, a simple macro would do
"a yyp^Ayy
then play it, or map to play it.
Of course, there is always the possibility that I misunderstood the question entirely...
The best choice of technique depends heavily on the actual circumstances
of the transformation. At least two points affect the
implementation:
Are the lines to operate on the only ones in the file? If not,
is the range of lines defined by context (i.e. is it separated by blank
lines, like a paragraph), or is it arbitrary and to be specified by the
user?
Do those lines already contain numbers that should be changed, or is
it necessary to insert new ones, leaving the text on the lines intact?
Since there is no information to answer these questions, below we will try to
construct a flexible solution.
A general solution is a substitution operating on the beginnings of the lines
in the range specified by the user. Visual mode is probably the simplest way
of selecting an arbitrary range of lines, so we assume here that boundaries of
the range are defined by the visual selection.
:'<,'>s/^\d\+/\=line(".")-line("'<")+1/
If it is necessary to number every line in a buffer, the command can be
simplified as follows.
:%s/^\d\+/\=line('.')/
In any case, if the number should be merely inserted at the beginnings of the
lines (without modifying the ones that already exist), one can change the
pattern from ^\d\+ to ^, and optionally add a separator:
:'<,'>s/^/\=(line(".")-line("'<")+1).' '/
or
:%s/^/\=line('.').' '/
respectively.
For a solution based on command-line tools, one can consider using stream
editors like Sed or text extraction and reporting tools like AWK.
To number each of the lines in a file using Sed, run the commands
$ sed = filename | sed 'N;s/\n/ /'
In order to do the same in AWK, use the command
$ awk '{print NR " " $0}' filename
which could easily be modified to limit numbering to a particular range of lines
satisfying a certain condition. For example, the following command numbers the
lines two through eight.
$ awk '{print (2<=NR && NR<=8 ? ++n " " : "") $0}' filename
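For readers who prefer a scripting language to AWK, the same range-limited numbering can be sketched in Python. This is only an illustration of the logic in the AWK command above; the function name and its parameters are made up.

```python
def number_lines(lines, first=1, last=None):
    """Prefix lines first..last (1-based, inclusive) with a running counter.

    Lines outside the range are passed through unchanged, and the
    counter starts at 1 on the first line inside the range.
    """
    if last is None:
        last = len(lines)
    numbered = []
    n = 0
    for lineno, line in enumerate(lines, start=1):
        if first <= lineno <= last:
            n += 1
            numbered.append(f"{n} {line}")
        else:
            numbered.append(line)
    return numbered


# Number lines two through three of a four-line input.
print(number_lines(["a", "b", "c", "d"], first=2, last=3))
```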
Having an interest in how commands similar to those from the script linked in
the question statement are implemented, one can use the following command as
a reference.
vnoremap <leader>i :call EnumVisualBlock()<cr>
function! EnumVisualBlock() range
if visualmode() != "\<c-v>"
return
endif
let [l, r] = [virtcol("'<"), virtcol("'>")]
let [l, r] = [min([l, r]), max([l, r])]
let start = matchstr(getline("'<"), '^\d\+', col("'<")-1)
let off = start - line("'<")
let w = max(map([start, line("'>") + off], 'len("".v:val)'))
exe "'<,'>" 's/\%'.l.'v.*\%<'.(r+1).'v./'.
\ '\=printf("%'.w.'d",line(".")+off).repeat(" ",r-l+1-w)'
endfunction
If you want to change 1 1 1 1 ... to 1 2 3 4 ... (where the numbers are on different lines):
:let i=1 | g/1/s//\=i/g | let i+=1
If some of 1 1 1 1 ... are in the same line:
:let g:i = 0
:func! Inc()
: let g:i+=1
: return g:i
:endfun
:%s/1/\=Inc()/g
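The Inc() trick (a substitution whose replacement is computed by a function with a side effect) has a close counterpart outside Vim. This is a hedged Python sketch of the same idea. The function name renumber and its parameters are made up, and the default pattern simply matches every literal 1, including a 1 inside a longer number.

```python
import itertools
import re


def renumber(text, pattern=r"1", start=1):
    """Replace each match of pattern with the next value of a counter."""
    counter = itertools.count(start)
    # re.sub calls the function once per match, from left to right,
    # so every match receives the next number, even on the same line.
    return re.sub(pattern, lambda match: str(next(counter)), text)


print(renumber("1 1 1\n1 1"))
```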
Does anyone know how to replace line a with line b and line b with line a in a text file using the sed editor?
I can see how to replace a line in the pattern space with a line that is in the hold space (i.e., /^Paco/x or /^Paco/g), but what if I want to take the line starting with Paco and replace it with the line starting with Vinh, and also take the line starting with Vinh and replace it with the line starting with Paco?
Let's assume for starters that there is one line with Paco and one line with Vinh, and that the line Paco occurs before the line Vinh. Then we can move to the general case.
#!/bin/sed -f
/^Paco/ {
:notdone
N
s/^\(Paco[^\n]*\)\(\n\([^\n]*\n\)*\)\(Vinh[^\n]*\)$/\4\2\1/
t
bnotdone
}
After matching /^Paco/ we read into the pattern buffer until s// succeeds (or EOF: the pattern buffer will be printed unchanged). Then we start over searching for /^Paco/.
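For comparison, here is the same swap under the simplifying assumptions stated above (one Paco line, occurring before one Vinh line), sketched in Python. Unlike the sed script, it handles only the first such pair, and the function name is made up.

```python
def swap_lines(lines, first_prefix, second_prefix):
    """Swap the first line starting with first_prefix with the first
    later line starting with second_prefix. Return the input unchanged
    if either line is missing."""
    out = list(lines)
    # Find the first line that starts with first_prefix.
    i = next((k for k, line in enumerate(out)
              if line.startswith(first_prefix)), None)
    if i is None:
        return out
    # Find the first line after it that starts with second_prefix.
    j = next((k for k in range(i + 1, len(out))
              if out[k].startswith(second_prefix)), None)
    if j is None:
        return out
    out[i], out[j] = out[j], out[i]
    return out


print(swap_lines(["a", "Paco 1", "x", "Vinh 2", "b"], "Paco", "Vinh"))
```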
cat input | tr '\n' 'ç' | sed 's/\(ç__firstline__\)\(ç__secondline__\)/\2\1/g' | tr 'ç' '\n' > output
Replace __firstline__ and __secondline__ with your desired regexps. Be sure to substitute any instances of . in your regexp with [^ç]. If your text actually has ç in it, substitute with something else that your text doesn't have.
try this awk script.
s1="$1"
s2="$2"
awk -vs1="$s1" -vs2="$s2" '
{ a[++d]=$0 }
$0~s1{ h=$0;ind=d}
$0~s2{
a[ind]=$0
for(i=1;i<d;i++ ){ print a[i]}
print h
delete a;d=0;
}
END{ for(i=1;i<=d;i++ ){ print a[i] } }' file
output
$ cat file
1
2
3
4
5
$ bash test.sh 2 3
1
3
2
4
5
$ bash test.sh 1 4
4
2
3
1
5
Use sed (or not at all) for only simple substitution. Anything more complicated, use a programming language
A simple example from the GNU sed texinfo doc:
Note that on implementations other than GNU `sed' this script might
easily overflow internal buffers.
#!/usr/bin/sed -nf
# reverse all lines of input, i.e. first line became last, ...
# from the second line, the buffer (which contains all previous lines)
# is *appended* to current line, so, the order will be reversed
1! G
# on the last line we're done -- print everything
$ p
# store everything on the buffer again
h
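To make the role of the hold space easier to follow, the three sed commands can be traced step by step in a general-purpose language. The Python sketch below only mirrors what the script does with its two buffers; it is not how sed is implemented, and the function name is made up.

```python
def reverse_lines(lines):
    """Trace the sed script: pattern space, hold space, print on last line."""
    hold = ""       # sed's hold space
    printed = None  # what "$ p" prints, if anything
    last = len(lines)
    for lineno, pattern in enumerate(lines, start=1):
        if lineno != 1:
            # 1! G : append a newline and the hold space to the pattern space
            pattern = pattern + "\n" + hold
        if lineno == last:
            # $ p : on the last line, print the pattern space
            printed = pattern
        # h : copy the pattern space into the hold space
        hold = pattern
    return printed.split("\n") if printed is not None else []


print(reverse_lines(["first", "second", "third"]))
```

Each new line is placed in front of everything collected so far, which is why the final pattern space holds the lines in reverse order.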