How to read all strings from a file - C++ - string

I am trying to read all the strings from the file "Text.txt" and add them to a vector using this code:
std::ifstream in;
in.open("Text.txt");
std::vector<std::string> vec;
std::string str;
while (!in.eof()) {
    in >> str;
    vec.push_back(str);
}
The problem is that I read the last string twice.
Any idea why this is happening?
Thank you!

It's explained elsewhere on this site already.
In this order:
1. in.eof() checks eofbit, which is false. No read operation has reached the end of the file yet, so the loop continues.
2. in >> str encounters the end of the file and sets eofbit. This also leaves str unchanged from the last iteration.
3. You push the old (unchanged) str, which is now in your vector twice.
4. You exit the loop when in.eof() checks eofbit.
Your misunderstanding is that in.eof() is doing something to detect the
end-of-file condition, but it's not. It just checks eofbit, and
eofbit isn't set until a >> operation has actually run into the end of the file.
Read the documentation for ios::eof carefully; I excerpt it here:
std::ios::eof
Returns true if the eofbit error state flag is set for the stream.
This flag is set by all standard input operations when the End-of-File
is reached in the sequence associated with the stream.
Note that the value returned by this function depends on the last
operation performed on the stream (and not on the next).
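The point about "the last operation" can be tried out in a small sketch. It assumes an in-memory std::istringstream stands in for Text.txt, and the helper names read_with_eof_loop and demo_duplicate are made up for illustration; the loop body itself is the one from the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the loop from the question, unchanged in logic,
// but reading from any std::istream so it can be tried without a file.
std::vector<std::string> read_with_eof_loop(std::istream& in) {
    std::vector<std::string> vec;
    std::string str;
    while (!in.eof()) {      // only reports what an earlier read already hit
        in >> str;           // on the final pass this fails and leaves str as it was
        vec.push_back(str);  // so the previous word is pushed a second time
    }
    return vec;
}

// Hypothetical helper: feeds the loop a text that ends with a newline,
// as most text files do.
std::vector<std::string> demo_duplicate() {
    std::istringstream in("one two\n");
    return read_with_eof_loop(in);
}
```

With the input "one two\n", demo_duplicate() returns three elements: one, two, two. If the text ends directly after the last word, with no trailing whitespace, reading that word already reaches the end of the input and sets eofbit, so the duplicate does not appear.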
To fix the problem
in >> str returns the stream itself, which converts to true only if a string was actually read. Just base your loop condition on that.
while (in >> str)
    vec.push_back(str);
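Put together, the fix might look like the sketch below. The function names read_words and read_words_from_file are hypothetical, and returning an empty vector for a file that cannot be opened is an assumption made here, not something the question specifies.

```cpp
#include <fstream>
#include <istream>
#include <string>
#include <vector>

// Hypothetical helper: reads every whitespace-separated string from a stream.
std::vector<std::string> read_words(std::istream& in) {
    std::vector<std::string> vec;
    std::string str;
    while (in >> str) {       // true only when a string was actually extracted
        vec.push_back(str);
    }
    return vec;
}

// Hypothetical helper: the same for a file name. If the file cannot be
// opened, the very first extraction fails and the result is empty.
std::vector<std::string> read_words_from_file(const std::string& path) {
    std::ifstream in(path);
    return read_words(in);
}
```

Calling read_words_from_file("Text.txt") then yields each string exactly once, whether or not the file ends with a newline, because the push only happens after a successful read.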

Related

Strange character in Fortran write output

I want to time some subroutines. Here is the template I use to write the name and duration of execution:
SUBROUTINE get_sigma_vrelp
...declarations...
real(8) :: starttime, endtime
CHARACTER (LEN = 200) timebuf
starttime = MPI_Wtime()
...do stuff...
endtime = MPI_Wtime()
write (timebuf, '(30a,e20.10e3)') 'get_sigma_vrelp',endtime-starttime
call pout(timebuf)
END SUBROUTINE get_sigma_vrelp
And here is a sample output:
(thread 4):get_sigma_vrelp �>
Why is a strange character printed instead of a numerical value for endtime-starttime? Incidentally, pout() simply writes the buffer to a process-specific file in a threadsafe manner. It shouldn't have anything to do with the problem, but if there is nothing else here that would cause the erroneous output then I can post its body.
You have it the wrong way round! The line should read
write (timebuf, '(a30,e20.10e3)') 'get_sigma_vrelp',endtime-starttime
This way, the format expects one string that is 30 characters long (a30) instead of 30 strings of arbitrary length (30a). With 30a, the a descriptor is also applied to the item that follows the first string, so the write statement outputs the raw bytes of the real value as characters instead of formatting it as a number. Hence the garbage.
Your character literal is only 15 chars long, so you could write the line as
write (timebuf, '(a15,e20.10e3)') 'get_sigma_vrelp',endtime-starttime
or let the compiler decide the length on its own:
write (timebuf, '(a,e20.10e3)') 'get_sigma_vrelp',endtime-starttime

(F)Lex checking symbol without "consuming" it

The purpose of this is to concatenate strings (with (f)lex if possible) if they're written consecutively separated only by whitespace.
Strings start and end with "s.
The thing is, I used start states, and while this concatenates the strings, it also consumes the next character/symbol that comes right after them.
For example, "this " "is only " "1 string"id will be concatenated into one string ("this is only 1 string"), but the scanner will also "consume" the i in id, thus destroying one token.
Is there a way to check the next char/symbol without actually "consuming/disposing" (can't really think of a term) it?
\" yy_push_state(X_STRING); yylval.s = new std::string("");
<X_STRING>\" yy_push_state(X_CONC);
<X_STRING>. *yylval.s += yytext;
<X_STRING>\n yyerror("newline in string");
<X_CONC>[ ^\n] ;
<X_CONC>\" yy_pop_state();
<X_CONC>. yy_pop_state(); yy_pop_state(); return STRING;
Any way to do it?
You can use yyless(0) to push the matched text back so that it is rescanned. Make sure you change the start condition before it is rescanned, or you'll end up in an endless loop.
By the way, I think your code would be more readable if you switched start conditions with BEGIN rather than using the state stack. In fact, you could easily avoid start conditions, but that would make interpreting escape sequences more complicated. Possibly better would be to just avoid X_CONC by using a rule in the string state for \"[[:space:]]*\" that does nothing: it matches a closing quote, optional whitespace, and the next opening quote, so consecutive strings simply continue as one.

Handling piped io in Lua

I have searched a lot for this but can't find answers anywhere. I am trying to do something like the following:
cat somefile.txt | grep somepattern | ./script.lua
I haven't found a single resource on handling piped io in Lua, and can't figure out how to do it. Is there a good, non-hackish way to tackle it? Preferably buffered for lower memory usage, but I'll settle for reading the whole file at once if that's the only alternative.
It would be really disappointing to have to write it into a temp file and then load it into the program.
Thanks in advance.
The standard library has io.stdin and io.stdout, which you can use for input and output without having to resort to temporary files. You can also use io.read instead of someFile:read, and it will read from stdin by default.
http://www.lua.org/pil/21.1.html
Buffering is the responsibility of the operating system that provides the pipes. You don't need to worry too much about it when writing your programs.
Edit: Apparently, when you mentioned buffering you were thinking about reading part of the file as opposed to loading the whole file into a string. io.read can take a numeric parameter to read up to a certain number of bytes from input, returning nil if no characters could be read.
local size = 2^13 -- good buffer size (8K)
while true do
    local block = io.read(size)
    if not block then break end
    io.write(block)
end
Another (simpler) alternative is the io.lines() iterator but without a filename inside the parentheses. Example:
for line in io.lines() do
    print(line)
end
UPDATE: To get a number of characters you can write a wrapper around this. Example:
function io.chars(n,filename)
    n = n or 1 --default number of characters to read at a time
    local chars = ''
    local wrap, yield = coroutine.wrap, coroutine.yield
    return wrap(function()
        for line in io.lines(filename) do
            line = chars .. line .. '\n'
            while #line >= n do
                yield(line:sub(1,n))
                line = line:sub(n+1)
            end
            chars = line
        end
        if chars ~= '' then yield(chars) end
    end)
end
for text in io.chars(30) do
    io.write(text)
end

How do I concatenate a string stored in variable and a number in MATLAB

I am trying to read a tag from XML and then want to concatenate a number to it.
Firstly, I am saving the value of the string to a variable and trying to concatenate it
with the variable in the for loop. But it throws an error.
for i = 0:tag.getLength-1
node = tag.item(i);
disp([node.getTextContent]);
str=node.getTextContent;
str= strcat(str, num2str(i))
new_loads = cat(2,loads,[node.getTextContent]);
end
Error thrown is
Operands to the || and && operators must be
convertible to logical scalar values.
Error in strcat (line 83)
if ~isempty(str) && (str(end) == 0 ||
isspace(str(end)))
Error in SMERCGUI>pushbutton1_Callback (line 182)
str= strcat(str,' morning')
Error in gui_mainfcn (line 96)
feval(varargin{:});
Error in SMERCGUI (line 44)
gui_mainfcn(gui_State, varargin{:});
Error in
@(hObject,eventdata)SMERCGUI('pushbutton1_Callback',hObject,eventdata,guidata(hObject))
Error while evaluating uicontrol Callback
The error suggests that your string is not a string. It's not clear to me whether it's throwing an error at the strcat line, or at the later cat line.
At any rate, keep in mind that you cannot freely mix elements of different types in a regular array (a cell array can hold them, a regular array cannot). So the line
new_loads = cat(2,loads,[node.getTextContent]);
may also give a problem. The 2 is only the dimension argument of cat (concatenate horizontally); what actually gets joined is loads and node.getTextContent, which is a string, or maybe a cell array or something else. I can't see what loads is, so I can't tell if that is involved in the problem.
Usually a good way to combine numbers and strings into a single string is
newString = sprintf('%s %d', oldString, number);
You can then use all the formatting tricks of printf to produce output exactly as you want. But before you do anything, make sure you understand the type of all the elements you are trying to string together. The easiest way to do this for all the elements in memory is
whos
Or if you just want it for one variable,
whos str
Or all variables starting with s:
whos s*
The output is self-explanatory. If you still can't figure it out after this, leave a comment and I'll try to help you out.
EDIT: Based on what I read at http://blogs.mathworks.com/community/2010/11/01/xml-and-matlab-navigating-a-tree/ , it is possible that you just need to convert your str variable to a MATLAB string (apparently it's a java.lang.String). So try adding
str = char(str);
before using str. It may be what you need.

Parsing strings in Fortran

I am reading from a file in Fortran which has an undetermined number of floating point values on each line (for now, there are about 17 values on a line). I would like to read the 'n'th value on each line into a given floating point variable. How should I go about doing this?
In C the way I wrote it was to read the entire line onto the string and then do something like the following:
for(int il = 0; il < l; il++)
{
for(int im = -il; im <= il; im++)
pch = strtok(NULL, "\t ");
}
for(int im = -l; im <= m; im++)
pch = strtok(NULL, "\t ");
dval = atof(pch);
Here I am continually reading a value and throwing it away (thus shortening the string) until I am ready to accept the value I am trying to read.
Is there any way I can do this in Fortran? Is there a better way to do this in Fortran? The problem with my Fortran code seems to be that read(tline, '(f10.15)') tline1 does not shorten tline (tline is my string holding the entire line and tline1 is what I am trying to parse it into), thus I cannot use the same method as I did in my C routine.
Any help?
The issue is that Fortran is a record-based I/O system while C is stream-based.
If you have access to a Fortran 2003 compliant compiler (modern versions of gfortran should work), you can open the file with the ACCESS="stream" specifier to do what you want.
An example can be found here.
Of course, if you were really inclined, you could just use your C function directly from Fortran. Interfacing the two languages is generally simple, typically only requiring a wrapper with a lowercase name and an appended underscore (depending on compiler and platform of course). Passing arrays or strings back and forth is not so trivial typically; but for this example that wouldn't be needed.
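You can also pick off one value at a time with non-advancing input (ADVANCE="no"). Note that ADVANCE= is only allowed when reading from an external file unit, not from an internal file such as the character variable tline, so the read has to be done on the file itself (file_unit below stands for whatever unit the file was opened on), and the explicit format assumes fixed-width fields (10 characters each in this sketch), i.e.
do i = 1, numberIWant
    read(file_unit, '(F10.5)', ADVANCE="no") tline1
end do
where tline1 should contain your number at the end of the loop.
Because of the record-based I/O, a READ statement will typically discard whatever is left of the record once it finishes. ADVANCE="no" tells it not to, so the next READ continues where the previous one stopped.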
If you know exactly at what position the value you want starts, you can use the T edit descriptor to initiate the next read from that position.
Let's say, for instance, that the width of each field is 10 characters and you want to read the fifth value. The read statement will then look something like the following.
read(file_unit, '(t41, f10.5)') value1
P.S.: You can dynamically create a format string at runtime, with the correct number after the t, by using a character variable as the format and using an internal file write to put this number into it.
Let's say you want the value that starts at position n. It will then look something like this (I alternated between single and double quotes to try to make it more clear where each string starts and stops):
write(my_format, '(a, i0, a)') "(t", n, ', f10.5)'
read(file_unit, my_format) value1
