I have a specific data format: 'n' rows (arbitrary) and '4' columns. If 'n' is '10', example data looks like this.
1.01e+00 -2.01e-02 -3.01e-01 4.01e+02
1.02e+00 -2.02e-02 -3.02e-01 4.02e+02
1.03e+00 -2.03e-02 -3.03e-01 4.03e+02
1.04e+00 -2.04e-02 -3.04e-01 4.04e+02
1.05e+00 -2.05e-02 -3.05e-01 4.05e+02
1.06e+00 -2.06e-02 -3.06e-01 4.06e+02
1.07e+00 -2.07e-02 -3.07e-01 4.07e+02
1.08e+00 -2.08e-02 -3.08e-01 4.07e+02
1.09e+00 -2.09e-02 -3.09e-01 4.09e+02
1.10e+00 -2.10e-02 -3.10e-01 4.10e+02
The constraints on this input are:
The data must have '4' columns.
The columns are separated by white space.
I want to implement a feature that checks whether the input file has '4' columns in every row, and I built my own based on M.S.B's answer in the post "Reading data file in Fortran with known number of lines but unknown number of entries in each line".
program readtest
use :: iso_fortran_env
implicit none
character(len=512) :: buffer
integer :: i, i_line, n, io, pos, pos_tmp, n_space
integer,parameter :: max_len = 512
character(len=max_len) :: filename
filename = 'data_wrong.dat'
open(42, file=trim(filename), status='old', action='read')
print *, '+++++++++++++++++++++++++++++++++++'
print *, '+ Count lines +'
print *, '+++++++++++++++++++++++++++++++++++'
n = 0
i_line = 0
do
pos = 1
pos_tmp = 1
i_line = i_line+1
read(42, '(a)', iostat=io) buffer
(*1)! Count blank spaces.
n_space = 0
do
pos = index(buffer(pos+1:), " ") + pos
if (pos /= 0) then
if (pos > pos_tmp+1) then
n_space = n_space+1
pos_tmp = pos
else
pos_tmp = pos
end if
endif
if (pos == max_len) then
exit
end if
end do
pos_tmp = pos
if (io /= 0) then
exit
end if
print *, '> line : ', i_line, ' n_space : ', n_space
n = n+1
end do
print *, ' >> number of line = ', n
end program
If I run the above program on an input file with some wrong rows, as follows,
1.01e+00 -2.01e-02 -3.01e-01 4.01e+02
1.02e+00 -2.02e-02 -3.02e-01 4.02e+02
1.03e+00 -2.03e-02 -3.03e-01 4.03e+02
1.04e+00 -2.04e-02 -3.04e-01 4.04e+02
1.05e+00 -2.05e-02 -3.05e-01 4.05e+02
1.06e+00 -2.06e-02 -3.06e-01 4.06e+02
1.07e+00 -2.07e-02 -3.07e-01 4.07e+02
1.0 2.0 3.0
1.08e+00 -2.08e-02 -3.08e-01 4.07e+02 1.00
1.09e+00 -2.09e-02 -3.09e-01 4.09e+02
1.10e+00 -2.10e-02 -3.10e-01 4.10e+02
The output is like this,
+++++++++++++++++++++++++++++++++++
+ Count lines +
+++++++++++++++++++++++++++++++++++
> line : 1 n_space : 4
> line : 2 n_space : 4
> line : 3 n_space : 4
> line : 4 n_space : 4
> line : 5 n_space : 4
> line : 6 n_space : 4
> line : 7 n_space : 4
> line : 8 n_space : 3 (*2)
> line : 9 n_space : 5 (*3)
> line : 10 n_space : 4
> line : 11 n_space : 4
>> number of line = 11
You can see that the wrong rows are detected as I intended (see (*2) and (*3)), and I can write 'if' statements to produce error messages.
But I think my code is 'extremely' ugly, since I had to do something like (*1) to count consecutive white spaces as one space. I think there should be a much more elegant way to ensure that each row contains only '4' columns, say,
read(*,'4(X, A)') line
(which didn't work)
My program would also fail if a line is longer than 'max_len', which is set to '512' in this case. Although '512' should be enough for most practical purposes, I also want my checking subroutine to be robust in this respect.
So, I want to improve my subroutine in at least these aspects:
Make it more elegant (not like (*1))
Make it more general (especially with regard to 'max_len')
Does anyone have experience building this kind of input-checking subroutine?
Any comments would be highly appreciated.
Thank you for reading the question.
Without knowledge of the exact data format, I think it would be rather difficult to achieve what you want (or at least, I wouldn't know how to do it).
In the most general case, I think your space counting idea is the most robust and correct.
It can be adapted to avoid the maximum string length problem you describe.
In the following code, I go through the data as an unformatted, stream-access file.
Basically, you read every character and take note of newlines and spaces.
As you did, you use spaces to count the columns (skipping repeated spaces) and new_line characters to count the rows.
However, here we do not read the entire line into a string and then search it for spaces; we read character by character, which avoids the fixed string length problem and leaves a single loop. Hope it helps.
EDIT: now handles white space at the beginning and end of a line, and empty lines.
program readtest
use :: iso_fortran_env
implicit none
character :: old_char, new_char
integer :: line, io, cols
logical :: beg_line
integer,parameter :: max_len = 512
character(len=max_len) :: filename
filename = 'data_wrong.txt'
! Output format to be used later
100 format (a, 3x, i0, a, 3x , i0)
open(42, file=trim(filename), status='old', action='read', &
form="unformatted", access="stream")
! set utils
old_char = " "
line = 0
beg_line = .true.
cols = 0
! Start scanning char by char
do
read(42, iostat = io) new_char
! Exit if EOF
if (io < 0) then
exit
end if
! Deal with empty lines
if (beg_line .and. new_char==new_line(new_char)) then
line = line + 1
write(*, 100, advance="no") "Line number:", line, &
"; Columns number:", cols
! Use an unsized 'a' edit descriptor so "EMPTYLINE" is not truncated
write(*,'(6x, a)') "EMPTYLINE"
! Deal with beginning of line for white spaces
elseif (beg_line) then
beg_line = .false.
! this indicates new columns
elseif (new_char==" " .and. old_char/=" ") then
cols = cols + 1
! End of line: time to print
elseif (new_char==new_line(new_char)) then
if (old_char/=" ") then
cols = cols+1
endif
line = line + 1
! Printing out results
write(*, 100, advance="no") "Line number:", line, &
"; Columns number:", cols
if (cols == 4) then
write(*,'(6x, a5)') "OK"
else
write(*,'(6x, a5)') "ERROR"
end if
! Restart with a new line (reset counters)
cols = 0
beg_line = .true.
end if
old_char = new_char
end do
end program
This is the output of this program:
Line number: 1; Columns number: 4 OK
Line number: 2; Columns number: 4 OK
Line number: 3; Columns number: 4 OK
Line number: 4; Columns number: 4 OK
Line number: 5; Columns number: 4 OK
Line number: 6; Columns number: 4 OK
Line number: 7; Columns number: 4 OK
Line number: 8; Columns number: 3 ERROR
Line number: 9; Columns number: 5 ERROR
Line number: 10; Columns number: 4 OK
Line number: 11; Columns number: 4 OK
If you knew your data format, you could read your lines in a vector of dimension 4 and use iostat variable to print out an error on each line where iostat is an integer greater than 0.
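As a rough illustration of that last idea, here is a sketch in Python rather than Fortran. The function name `check_columns` and the sample text are made up for this example. It tries to convert each whitespace-separated line into exactly four floats and reports the lines where that fails, which is roughly what checking iostat after a four-value read would give you.

```python
def check_columns(text, expected=4):
    """Return (line_number, problem) pairs for rows that are not `expected` floats."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # split() with no argument collapses runs of whitespace,
        # so repeated spaces never create empty fields.
        fields = line.split()
        if len(fields) != expected:
            problems.append((lineno, f"{len(fields)} columns"))
            continue
        try:
            for field in fields:
                float(field)
        except ValueError:
            problems.append((lineno, "non-numeric field"))
    return problems


# Hypothetical sample: one good row followed by three bad ones.
sample = """1.01e+00 -2.01e-02 -3.01e-01 4.01e+02
1.0 2.0 3.0
1.08e+00  -2.08e-02 -3.08e-01 4.07e+02 1.00
1.09e+00 -2.09e-02 abc 4.09e+02
"""
print(check_columns(sample))
```

Because the text is processed line by line without a fixed-size buffer, there is no equivalent of the 'max_len' limit here.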
Instead of counting whitespace you can use manipulation of substrings to get what you want. A simple example follows:
program foo
implicit none
character(len=512) str ! Assume str is sufficiently long buffer
integer fd, cnt, m, n
open(newunit=fd, file='test.dat', status='old')
do
cnt = 0
read(fd,'(A)',end=10) str
str = adjustl(str) ! Eliminate possible leading whitespace
do
n = index(str, ' ') ! Find first space
if (n /= 0) then
write(*, '(A)', advance='no') str(1:n)
str = adjustl(str(n+1:))
end if
if (len_trim(str) == 0) exit ! Trailing whitespace
cnt = cnt + 1
end do
if (cnt /= 3) then ! cnt is one less than the number of columns
write(*,'(A)') ' Error'
else
write(*,*)
end if
end do
10 close(fd)
end program foo
This should read any line of reasonable length (up to your compiler's default line limit, which is generally 2 GB nowadays). You could change it to stream I/O to have no limit, but most Fortran compilers have trouble reading stream I/O from stdin, which is what this example reads from. If the line looks anything like a list of numbers, it should read them, tell you how many it read, and let you know if it failed to read any value as a number (character strings, strings bigger than the size of a REAL value, ...). All the parts are explained on the Fortran Wiki; to keep it short, this is a stripped-down version that just puts the pieces together. The oddest behavior is that if you entered something like this, with a slash in it,
10 20,,30,40e4 50 / this is a list of numbers
it would treat everything after the slash as a comment: it returns five values and a zero status. For a more detailed explanation of the code, see the annotated pieces on the Wiki; search for "getvals" and "readline".
So with this program you can read a line, and if the return status is zero and the number of values read is four, you should be good, except for a few dusty corners where the lines would definitely not look like a list of numbers.
module M_getvals
implicit none
private
public getvals, readline
contains
subroutine getvals(line,values,icount,ierr)
character(len=*),intent(in) :: line
real :: values(:)
integer,intent(out) :: icount, ierr
character(len=:),allocatable :: buffer
character(len=len(line)) :: words(size(values))
integer :: ios, i
ierr=0
words=' '
buffer=trim(line)//"/"
read(buffer,*,iostat=ios) words
icount=0
do i=1,size(values)
if(words(i).eq.'') cycle
read(words(i),*,iostat=ios)values(icount+1)
if(ios.eq.0)then
icount=icount+1
else
ierr=ios
write(*,*)'*getvals* WARNING:['//trim(words(i))//'] is not a number'
endif
enddo
end subroutine getvals
subroutine readline(line,ier)
character(len=:),allocatable,intent(out) :: line
integer,intent(out) :: ier
integer,parameter :: buflen=1024
character(len=buflen) :: buffer
integer :: last, isize
line=''
ier=0
INFINITE: do
read(*,iostat=ier,fmt='(a)',advance='no',size=isize) buffer
if(isize.gt.0)line=line//buffer(:isize)
if(is_iostat_eor(ier))then
last=len(line)
if(last.ne.0)then
if(line(last:last).eq.char(92))then ! char(92) is a backslash: continue onto the next line
line=line(:last-1)
cycle INFINITE
endif
endif
ier=0
exit INFINITE
elseif(ier.ne.0)then
exit INFINITE
endif
enddo INFINITE
line=trim(line)
end subroutine readline
end module M_getvals
program tryit
use M_getvals, only: getvals, readline
implicit none
character(len=:),allocatable :: line
real,allocatable :: values(:)
integer :: icount, ier, ierr
INFINITE: do
call readline(line,ier)
if(allocated(values))deallocate(values)
allocate(values(len(line)/2+1))
if(ier.ne.0)exit INFINITE
call getvals(line,values,icount,ierr)
write(*,'(*(g0,1x))')'VALUES=',values(:icount),'NUMBER OF VALUES=',icount,'STATUS=',ierr
enddo INFINITE
end program tryit
Honestly, it should work reasonably with just about any line you throw at it.
PS:
If you are always reading four values, using list-directed I/O and checking the iostat= value on the READ (and whether you hit EOR) would be very simple, just a few lines. But since you said you wanted to read lines of arbitrary length, I am assuming that four values per line was just an example and that you wanted something very generic.
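To make the "values, count, status" contract concrete, here is a small sketch in Python rather than Fortran. It is not a port of the module above: the function name is reused only for readability, and the parsing rules (commas and white space as separators, a slash ending the data) are simplifying assumptions that cover the example line but not every list-directed input rule, such as repeat counts or quoted strings.

```python
import re


def getvals(line):
    """Parse a line of numbers; return (values, status).

    status is 0 when every token parsed as a number, 1 otherwise.
    """
    # Assumption: a slash ends the data, so the rest acts like a comment.
    data = line.split("/", 1)[0]
    # Treat any run of commas and white space as a single separator.
    tokens = [tok for tok in re.split(r"[,\s]+", data) if tok]
    values, status = [], 0
    for tok in tokens:
        try:
            values.append(float(tok))
        except ValueError:
            status = 1  # remember the failure but keep reading
    return values, status


values, status = getvals("10 20,,30,40e4 50 / this is a list of numbers")
print(len(values), status)  # prints: 5 0
```

A caller that wants exactly four columns would then accept a line only when `status == 0` and `len(values) == 4`.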
I have the following code (line numbers included):
1 def test():
2     a = 1
3     b = 1
4     c = 1
5     d = 1
6     if a == 1:
7         print('This is a sample program.')
The cursor is on line 7, the last line. Is there a fast, ideally one-key, way to navigate up to line 6, which is one indentation level up, and then, on the next key press, to line 1, one indentation level up again? Conversely, is there a matching method to "drill down" that way?
There is a plugin for that: https://github.com/jeetsukumaran/vim-indentwise
The mappings it provides that match what you are looking for are:
[- : Move to previous line of lesser indent than the current line.
[+ : Move to previous line of greater indent than the current line.
]- : Move to next line of lesser indent than the current line.
]+ : Move to next line of greater indent than the current line.
Then, if you really wanted to do what you asked for in a single keypress, you can remap them like so, for example:
nmap - [-
nmap + ]+
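For readers who want to see the rule behind `[-` spelled out, here is a sketch in Python. It is not the plugin's implementation, only the behavior stated in the mapping description ("previous line of lesser indent"), and the helper names are made up for this example.

```python
def indent_of(line):
    """Number of leading whitespace characters (assumes tabs and spaces are not mixed)."""
    return len(line) - len(line.lstrip())


def prev_lesser_indent(lines, current):
    """Index of the nearest non-blank line above `current` with a smaller indent, or None."""
    target = indent_of(lines[current])
    for i in range(current - 1, -1, -1):
        if lines[i].strip() and indent_of(lines[i]) < target:
            return i
    return None


source = [
    "def test():",
    "    a = 1",
    "    b = 1",
    "    c = 1",
    "    d = 1",
    "    if a == 1:",
    "        print('This is a sample program.')",
]
first = prev_lesser_indent(source, 6)        # start on line 7 (index 6)
second = prev_lesser_indent(source, first)   # repeat from where we landed
print(first + 1, second + 1)                 # prints: 6 1
```

Two jumps take the cursor from line 7 to line 6 and then to line 1, which is the navigation asked for in the question.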
Previously, I received help in the following link:
Lua Line Wrapping excluding certain characters
In short, I was looking for a way to run a line-wrap function that ignores certain characters when counting the line length.
Now I've come across another issue. I want to be able to carry the last colour code over to the new line. For example:
If this line #Rwere over 79 characters, I would want to #Breturn the last known colour code #Yon the line break.
Running the function I have in mind would result in:
If this line #Rwere over 79 characters, I would want to #Breturn the last known
#Bcolour code #Yon the line break.
instead of
If this line #Rwere over 79 characters, I would want to #Breturn the last known
colour code #Yon the line break.
I wish for it to do so because in many cases, the MUD will default back to the #w colour code, so it would make colourizing text rather difficult.
I've figured the easiest way to do that would be a reverse match, so I've written a reverse_text function:
function reverse_text(str)
local text = {}
for word in str:gmatch("[^%s]+") do
table.insert(text, 1, word)
end
return table.concat(text, " ")
end
and it turns:
#GThis #Yis #Ba #Mtest.
to
#Mtest. #Ba #Yis #GThis
The issue I'm running into with creating the string.match is the fact that colour codes can be in one of two formats:
#%a or #x%d%d%d
Additionally, I don't want it to return a colour code that doesn't colour, which is indicated as:
##%a or ##x%d%d%d
What's the best way to accomplish my end goal without compromising my requirements?
function wrap(str, limit, indent, indent1)
indent = indent or ""
indent1 = indent1 or indent
limit = limit or 79
local here = 1-#indent1
local last_color = ''
return indent1..str:gsub("(%s+)()(%S+)()",
function(sp, st, word, fi)
local delta = 0
local color_before_current_word = last_color
word:gsub('()#([#%a])',
function(pos, c)
if c == '#' then
delta = delta + 1
elseif c == 'x' then
delta = delta + 5
last_color = word:sub(pos, pos+4)
else
delta = delta + 2
last_color = word:sub(pos, pos+1)
end
end)
here = here + delta
if fi-here > limit then
here = st - #indent + delta
return "\n"..indent..color_before_current_word..word
end
end)
end
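The same idea, restated as a sketch in Python to show the bookkeeping: measure each word by its visible width, remember the last real colour code seen, and prepend it to the first word of every new line. The names `CODE`, `visible_len` and this `wrap` are made up for the sketch, and it simplifies the Lua version (no indent arguments, and single spaces between words).

```python
import re

# '##' is an escaped '#'; '#x' plus three digits or '#' plus a letter is a colour code.
CODE = re.compile(r"##|#x\d{3}|#[A-Za-z]")


def visible_len(word):
    """Length of `word` as displayed: colour codes take no room, '##' shows as '#'."""
    return len(CODE.sub(lambda m: "#" if m.group() == "##" else "", word))


def wrap(text, limit=79):
    lines, current, width = [], [], 0
    last_color = ""  # most recent real colour code seen so far
    for word in text.split():
        w = visible_len(word)
        if current and width + 1 + w > limit:
            lines.append(" ".join(current))
            # Start the new line with the colour that was active before this word.
            current, width = [last_color + word], w
        else:
            width += w + (1 if current else 0)
            current.append(word)
        for m in CODE.finditer(word):
            if m.group() != "##":
                last_color = m.group()
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines)


text = ("If this line #Rwere over 79 characters, I would want to #Breturn "
        "the last known colour code #Yon the line break.")
print(wrap(text))
```

On the example sentence from the question, the second line comes out as `#Bcolour code #Yon the line break.`, so the `#B` colour is carried across the break.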
Let us say I have the following three paragraphs of text (separated
from each other by empty lines—number 3 and 7, here):
This is my first paragraph line 1
This is my first paragraph line 2
This is my second paragraph line 4
This is my second paragraph line 5
This is my second paragraph line 6
This is my third paragraph line 8
This is my third paragraph line 9
Question 1: How can I number these paragraphs automatically,
to obtain this result:
1 This is my first paragraph line 1
This is my first paragraph line 2
2 This is my second paragraph line 4
This is my second paragraph line 5
This is my second paragraph line 6
3 This is my third paragraph line 8
This is my third paragraph line 9
(I managed to do this, but only via a clumsy macro.)
Question 2: Is it possible to refer to these paragraphs? For
instance, is it possible to index a text file as answered (by Prince
Goulash and Herbert Sitz) in the earlier question, but this time
with the paragraph numbers and not the line numbers?
Thanks in advance.
Here's one way to do the ref numbers, with a pair of functions:
function! MakeRefMarkers()
" Remove spaces from empty lines:
%s/^ \+$//
" Mark all spots for ref number:
%s/^\_$\_.\zs\(\s\|\S\)/_parref_/
" Initialize ref val:
let s:i = 0
" Replace with ref nums:
%s/^_parref_/\=GetRef()/
endfunction
function! GetRef()
let s:i += 1
return s:i . '. '
endfunction
Then just call MakeRefMarkers(). It doesn't remove existing ref numbers if they're there; that would require another step. And it doesn't catch the first paragraph if it starts on the first line of the file (i.e., without a preceding blank line). But it does handle more than one blank line between paragraphs.
Question One
Here is a function to enumerate paragraphs. Simply do :call EnumeratePara() anywhere in your file. The variable indent can be adjusted as you wish. Let me know if anything needs correcting or explaining.
function! EnumeratePara()
let indent = 5
let lnum = 1
let para = 1
let next_is_new_para = 1
while lnum <= line("$")
let this = getline(lnum)
if this =~ "^ *$"
let next_is_new_para=1
elseif next_is_new_para == 1 && this !~ "^ *$"
call cursor(lnum, 1)
sil exe "normal i" . para . repeat(" ", indent-len(para))
let para+=1
let next_is_new_para = 0
else
call cursor(lnum, 1)
sil exe "normal i" . repeat(" ", indent)
endif
let lnum += 1
endwhile
endfunction
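The logic of the function, separated from the Vim editing commands, looks like this as a Python sketch (the name `number_paragraphs` is made up for this example). A flag records that the previous line was blank, so the next non-blank line starts a new paragraph and gets the number; every other non-blank line is only indented.

```python
def number_paragraphs(text, indent=5):
    """Prefix the first line of each paragraph with its number, indent the rest."""
    out = []
    para = 0
    next_is_new_para = True
    for line in text.splitlines():
        if not line.strip():
            out.append(line)              # blank lines are left untouched
            next_is_new_para = True
        elif next_is_new_para:
            para += 1
            out.append(str(para).ljust(indent) + line)
            next_is_new_para = False
        else:
            out.append(" " * indent + line)
    return "\n".join(out)


sample = "first a\nfirst b\n\nsecond a\n\n\nthird a"
print(number_paragraphs(sample))
```

Runs of several blank lines do not disturb the count, because the flag is simply set again on each of them.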
Question Two
This isn't a very elegant approach, but it seems to work. First of all, here's a function that maps each line in the file to a paragraph number:
function! MapLinesToParagraphs()
let lnum = 1
let para_lines = []
let next_is_new_para = 1
let current_para = 0
while lnum <= line("$")
let this = getline(lnum)
if this =~ "^ *$"
let next_is_new_para = 1
elseif next_is_new_para == 1
let current_para += 1
let next_is_new_para = 0
endif
call add(para_lines, current_para)
let lnum += 1
endwhile
return para_lines
endfunction
So para_lines[i - 1] gives the paragraph number of line i (Vim lists are zero-indexed).
Now we can use the existing IndexByWord() function, and use MapLinesToParagraphs() to convert the line numbers into paragraph numbers before we return them:
function! IndexByParagraph(wordlist)
let temp_dict = {}
let para_lines = MapLinesToParagraphs()
for word in a:wordlist
redir => result
sil! exe ':g/' . word . '/#'
redir END
let tmp_list = split(strtrans(result), "\\^\# *")
let res_list = []
call map(tmp_list, 'add(res_list, str2nr(matchstr(v:val, "^[0-9]*")))')
call map(res_list, 'para_lines[v:val - 1]')
let temp_dict[word] = res_list
endfor
let result_list = []
for key in sort(keys(temp_dict))
call add(result_list, key . ' : ' . string(temp_dict[key])[1:-2])
endfor
return join(result_list, "\n")
endfunction
I have not tested these functions very thoroughly, but they seem to work okay, at least in your example text. Let me know how you get on!
Both problems can be solved much more easily than the other
two answers suggest.
1. In order to solve the first problem of numbering paragraphs,
the following two steps are ample.
Indent the paragraphs (using tabs, here):
:v/^\s*$/s/^/\t/
Insert paragraph numbering (see also my answer to
the question on substitution with counter):
:let n=[0] | %s/^\s*\n\zs\ze\s*\S\|\%1l/\=map(n,'v:val+1')
2. The second problem, creating an index, requires some scripting in
order to be solved by Vim means only. Below is the listing of a small
function, WordParIndex(), that is supposed to be run after the paragraphs
are numbered according to the first problem's description.
function! WordParIndex()
let [p, fq] = [0, {}]
let [i, n] = [1, line('$')]
while i <= n
let l = getline(i)
if l !~ '^\s*$'
let [p; ws] = ([p] + split(l, '\s\+'))[l=~'^\S':]
for w in ws
let t = get(fq, w, [p])
let fq[w] = t[-1] != p ? t + [p] : t
endfor
endif
let i += 1
endwhile
return fq
endfunction
The return value of the WordParIndex() function is the target index
dictionary. To append its text representation at the end of a buffer,
run
:call map(WordParIndex(), 'append(line("$"),v:key.": ".join(v:val,","))')
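For comparison, here is the shape of such an index as a Python sketch. Unlike WordParIndex(), which reads the paragraph numbers already inserted in the buffer, this sketch counts paragraphs itself from the blank lines, so it works on unnumbered text; the function name is made up for the example.

```python
def word_paragraph_index(text):
    """Map each word to the sorted list of paragraph numbers it occurs in."""
    index = {}
    para = 0
    in_para = False
    for line in text.splitlines():
        if not line.strip():
            in_para = False           # a blank line closes the paragraph
            continue
        if not in_para:
            para += 1
            in_para = True
        for word in line.split():
            paras = index.setdefault(word, [])
            # Paragraphs are visited in order, so checking the last entry
            # is enough to avoid duplicates.
            if not paras or paras[-1] != para:
                paras.append(para)
    return index


sample = "apple pear\napple\n\npear fig\n\napple"
for word, paras in sorted(word_paragraph_index(sample).items()):
    print(word + ": " + ",".join(map(str, paras)))
```

The duplicate check mirrors the `t[-1] != p` test in the Vim function.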
My approach would be macro based:
Yank the number "0" somehow and move to the start of the first paragraph.
Record a macro to
Indent the paragraph with >}
Paste the stored number at the correct position p
Increment the number by one with <ctrl>-a
Yank the pasted number with yiw
Move to the next paragraph with }l or /^\S
Execute the macro as many times as needed to reach the end of the document.
The method of pasting a number, incrementing it, and then reyanking it inside a macro is quite a useful technique that comes in handy whenever you need to number things. And it's simple enough to just do it in a throw-away fashion. I mainly use it for carpet logging, but it has other uses as your question demonstrates.
I am trying to get the number of lines to show up on the right side of the screen, instead of near the left with the other text. Is this possible? My current .vimrc foldtext function concatenates the first two lines and keeps the current indent, followed by some dashes and then the number of lines:
function! MyFoldText()
let line = getline(v:foldstart)
let line2 = getline(v:foldstart + 1)
let sub = substitute(line . "|" . line2, '/\*\|\*/\|{{{\d\=', '', 'g')
let ind = indent(v:foldstart)
let lines = v:foldend-v:foldstart + 1
let i = 0
let spaces = ''
while i < (ind - ind/4)
let spaces .= ' '
let i = i+1
endwhile
return spaces . sub . ' --------(' . lines . ' lines)'
endfunction
So, using '|' as a screen edge, instead of
| line1 | line2 --------(5 lines)-----------------|
the foldtext would be like this
| line1 | line2 -------------------------(5 lines)|
p.s.
It would also be nice to get a few extra fixes, such as pulling current tabstop setting instead of hardcoding it as 4, and getting it to show the next actual code (skipping comments, whitespace, brackets, etc), instead of just concatenating the first two lines.
Something like the line below is what I use, sort of tailored to your code. You will need to set offset to some value that fits your situation; I think you might want offset of around 8 or 9:
let offset = 8
return spaces . sub . repeat('-', winwidth(0)-strlen(spaces . sub) - offset) . '('. lines .')'
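The arithmetic behind that line is easier to see outside Vim. The sketch below is in Python, with a made-up function name and a plain `width` argument standing in for `winwidth(0) - offset`. It fills the gap between the label and the line count with dashes so that the count ends flush with the right edge.

```python
def fold_text(label, n_lines, width, fill="-"):
    """Pad `label` with `fill` so that '(N lines)' ends at column `width`."""
    suffix = f"({n_lines} lines)"
    # One column is reserved for the space between the label and the fill.
    room = width - len(label) - len(suffix) - 1
    return label + " " + fill * max(room, 0) + suffix


line = fold_text("line1 | line2", 5, 50)
print(line)
print(len(line))  # prints: 50
```

In Vim the usable text width is smaller than `winwidth(0)` when the number, sign or fold columns are shown, which is why the answer needs an `offset` that you tune by hand.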
Here is an example from the help of the EightHeader plugin which does exactly what you want:
If you don't like the default 'foldtext' you can customize it by setting to
EightHeaderFolds().
For example the closed folds looks like this by default:
+-- 45 lines: Fold level one
+--- 67 lines: Fold level two
If you would like to change it to this kind:
Fold level one................45 lines
Fold level two..............67 lines
... then you can use this function:
let &foldtext = "EightHeaderFolds( '\\=s:fullwidth-2', 'left', [ repeat( ' ', v:foldlevel - 1 ), '.', '' ], '\\= s:foldlines . \" lines\"', '' )"