Why is my string nil? - nim-lang

I made this simple program that reads characters until the enter key is pressed
var data: string
while true:
  var c = readChar(stdin) # read char
  case c
  of '\r': # if enter, stop
    break
  else: discard
  data.add(c) # add the read character to the string
echo data
But when it tries to echo data, it crashes
> ./program
hello
Traceback (most recent call last)
program.nim(11) program
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
This means data is nil. But every time I enter a character, it adds the character to data. Something goes wrong, but where?

data is initially nil when you define it as var data: string. Instead you can use var data = "" to make it an initialized string.

The stream stdin buffers all the characters until the newline key is pressed, then it will submit the character(s). I expected the behavior to be reading direct characters.
That means that \r will never be the case, it will try to add a character to data but data is nil, so that fails. I thought it failed at the echo statement.
To demonstrate, this code works:
var data = ""
while true:
  var c = readChar(stdin) # read char
  case c
  of '\e': # if escape, stop
    break
  else:
    data.add(c) # add the read character to the string
echo data
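For comparison, the same stop-at-the-line-ending loop can be sketched in Python. This is only an illustration under assumptions: `read_until_newline` is a hypothetical helper, and an in-memory `io.StringIO` stands in for a line-buffered stdin. Since a terminal typically hands over the whole line ending in `\n` once Enter is pressed, the loop tests for `\n` rather than `\r`.

```python
import io


def read_until_newline(stream):
    """Collect characters from stream until a newline or end of input."""
    chars = []
    while True:
        c = stream.read(1)  # one character at a time, like readChar
        if c == "" or c == "\n":  # "" signals end of input
            break
        chars.append(c)
    return "".join(chars)


# StringIO simulates what a line-buffered terminal delivers after Enter.
print(read_until_newline(io.StringIO("hello\nworld\n")))
```

Starting from an empty list (or an empty string) plays the same role as `var data = ""` above: the accumulator is initialized before anything is appended to it.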

Related

python3 file.readline EOF?

I am having trouble determining when I have reached the end of a file in python with file.readline
fi = open('myfile.txt', 'r')
line = fi.readline()
if line == EOF:  # or something similar
    dosomething()
c = fp.read()
if c is None:
will not work because then I will lose data on the next line, and if a line only has a carriage return I will miss an empty line.
I have looked at dozens of related posts, and they all just use the inherent loops that break when they are done. I am not looping, so this doesn't work for me. Also, I have file sizes in the GB range with hundreds of thousands of lines. A script could spend days processing a file. So I need to know how to tell when I am at the end of the file in Python 3. Any help is appreciated. Thank you!
I ran into this exact same problem. My specific issue was iteration over two files, where the shorter one was only supposed to read a line on specific reads of the longer file.
As some mentioned here the natural pythonic way to iterate line by line is to, well, just iterate. My solution to stick with this 'naturalness' was to just utilize the iterator property of a file manually. Something like this:
with open('myfile') as lines:
    try:
        while True:  # Just to fake a lot of readlines and hit the end
            current = next(lines)
    except StopIteration:
        print('EOF!')
You can of course embellish this with your own IOWrapper class, but this was enough for me. Just replace all calls to readline to calls of next, and don't forget to catch the StopIteration.
The simplest way to check whether you've reached EOF with fi.readline() is to check the truthiness of the return value:
line = fi.readline()
if not line:
    dosomething()  # EOF reached
Reasoning
According to the official documentation
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
and the only falsy string in Python is the empty string ('').
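The distinction between a blank line and end of file can be checked with a small sketch. The helper name `classify_lines` is made up for illustration, and `io.StringIO` is used in place of a real file so the example is self-contained.

```python
import io


def classify_lines(stream):
    """Label each readline() result as 'text', 'blank', or 'eof'."""
    labels = []
    while True:
        line = stream.readline()
        if line == "":  # only returned once the end of the stream is reached
            labels.append("eof")
            break
        labels.append("blank" if line == "\n" else "text")
    return labels


# Second line is empty; last line has no trailing newline.
sample = io.StringIO("first\n\nlast")
print(classify_lines(sample))  # ['text', 'blank', 'text', 'eof']
```

The empty line comes back as `'\n'`, which is truthy, so `if not line` does not mistake it for the end of the file.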
You can use the output of the tell() function to determine if the last readline changed the current position of the stream.
fi = open('myfile.txt', 'r')
pos = fi.tell()
while True:
    li = fi.readline()
    newpos = fi.tell()
    if newpos == pos:  # stream position hasn't changed -> EOF
        break
    else:
        pos = newpos
According to the Python Tutorial:
f.tell() returns an integer giving the file object’s current position in the file represented as number of bytes from the beginning of the file when in binary mode and an opaque number when in text mode.
...
In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)) and the only valid offset values are those returned from the f.tell(), or zero.
Since the value returned from tell() can be used to seek(), they would have to be unique (even if we can't guarantee what they correspond to). Therefore, if the value of tell() before and after a readline() is unchanged, the stream position is unchanged, and the EOF has been reached (or some other I/O exception of course). Reading an empty line will read at least the newline and advance the stream position.
This is a demonstrative example using f.tell() and f.read() with a chunk of data:
Assuming my input.txt file contains:
hello
hi
hoo
foo
bar
Test:
with open('input.txt', 'r') as f:
    # Read chunk of data
    chunk = 4
    while True:
        line = f.read(chunk)
        if not line:
            line = "i've read Nothing"
            print("EOF reached. What i read when i reach EOF:", line)
            break
        else:
            print('Read: {} at position: {}'.format(line.replace('\n', ''), f.tell()))
Will output:
Read: hell at position: 4
Read: ohi at position: 9
Read: hoo at position: 14
Read: foo at position: 19
Read: bar at position: 24
EOF reached. What i read when i reach EOF: i've read Nothing
with open(FILE_PATH, 'r') as fi:
    for line in iter(fi.readline, ''):
        parse(line)
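The snippet above relies on the two-argument form of `iter()`, which calls a function repeatedly until it returns a sentinel value. A minimal sketch of that behavior follows; `count_lines` is a hypothetical helper and `io.StringIO` stands in for the opened file.

```python
import io


def count_lines(stream):
    """Count lines using the two-argument form of iter()."""
    total = 0
    # iter(callable, sentinel) calls stream.readline() repeatedly and stops
    # as soon as it returns the sentinel '', which happens only at EOF.
    for _line in iter(stream.readline, ""):
        total += 1
    return total


print(count_lines(io.StringIO("a\n\nb\n")))  # 3
```

The blank middle line is counted, because `readline()` returns `'\n'` for it rather than the sentinel.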

How to compare upper and lowercase letters in a conditional in Swift

Apologies if this is a duplicate. I have a helper function called inputString() that takes user input and returns a String. I want to proceed based on whether an upper or lowercase character was entered. Here is my code:
print("What do you want to do today? Enter 'D' for Deposit or 'W' for Withdrawal.")
operation = inputString()
if operation == "D" || operation == "d" {
    print("Enter the amount to deposit.")
My program quits after the first print function, but gives no compiler errors. I don't know what I'm doing wrong.
It's important to keep in mind that there is a whole slew of purely whitespace characters that show up in strings, and sometimes, those whitespace characters can lead to problems just like this.
So, whenever you are certain that two strings should be equal, it can be useful to print them with some sort of non-whitespace character on either end of them.
For example:
print("Your input was <\(operation)>")
That should print the user input with angle brackets on either side of the input.
And if you stick that line into your program, you'll see it prints something like this:
Your input was <D
>
So it turns out that your inputString() method is capturing the newline character (\n) that the user presses to submit their input. You should improve your inputString() method to go ahead and trim that newline character before returning its value.
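The same debugging trick and the trimming fix can be sketched in Python (used here only for illustration; the thread itself is about Swift). The helper `normalize_choice` and the sample value `raw` are assumptions, not part of the original program.

```python
def normalize_choice(raw):
    """Strip surrounding whitespace (including the newline) and uppercase."""
    return raw.strip().upper()


raw = "d\n"  # what an untrimmed read of the user's input might look like

# Wrapping the value in angle brackets makes the stray newline visible.
print("Your input was <{}>".format(raw))

# After trimming, a single comparison covers both 'D' and 'd'.
print(normalize_choice(raw) == "D")  # True
```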
I feel it's really important to mention here that your inputString method is really clunky and requires importing modules. But there's a way simpler pure Swift approach: readLine().
Swift's readLine() method does exactly what your inputString() method is supposed to be doing, and by default, it strips the newline character off the end for you (there's an optional parameter you can pass to prevent the method from stripping the newline).
My version of your code looks like this:
func fetchInput(prompt: String? = nil) -> String? {
    if let prompt = prompt {
        print(prompt, terminator: "")
    }
    return readLine()
}
if let input = fetchInput("Enter some input: ") {
    if input == "X" {
        print("it matches X")
    }
}
The cause of the error that you experienced is explained in "Swift how to compare string which come from NSString". Essentially, we need to remove any whitespace or non-printing characters, such as the newline.
I also used .uppercaseString to simplify the comparison.
The amended code is as follows:
func inputString() -> String {
    var keyboard = NSFileHandle.fileHandleWithStandardInput()
    var inputData = keyboard.availableData
    let str: String = (NSString(data: inputData, encoding: NSUTF8StringEncoding)?.stringByTrimmingCharactersInSet(
        NSCharacterSet.whitespaceAndNewlineCharacterSet()))!
    return str
}
print("What do you want to do today? Enter 'D' for Deposit or 'W' for Withdrawal.")
let operation = inputString()
if operation.uppercaseString == "D" {
    print("Enter the amount to deposit.")
}

Lua: Capturing String Based on Number of Symbols Received

I currently have a string that can be any length in size based on a single digit in one or two specific locations (based on the first digit captured). For example:
Changed
First digit captured tells me IF a file name is to follow: "1" = Object Name Follows. "0" = Next input captured is Length Multiplier.
"1" is not always received. But "0" is always received.
With "1" Capture it looks like this:
START|(1)|NAMEOFGRAPHIC|(0)|(#)|INPUT|INPUT|INPUT|INPUT|... etc
With "0" (no "1" captured)
START|(0)|(#)|INPUT|INPUT|INPUT|INPUT|... etc
The Length Multiplier bit (always follows "0") is the number of INPUT groups to follow. A "group" is a set of 4xINPUT's. So, if it was a "4", the string I want to completely capture looks like this:
With a "1":
START|(1)|NAMEOFGRAPHIC|(0)|(4)|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|
With a "0":
START|(0)|(4)|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|INPUT|
As each INPUT is received, a pipe symbol is added after. I want to use the pipes to monitor the length of the input based on the digit. If the digit is 5, for example, it would capture the 3x INPUT, 5, then 5x INPUT after (with all pipes included). Once this is done, the function would send the fully captured string to other function(s) for use.
I am having problems working out the receiving function to capture this full string. I have tried to count the number of pipes in different loop functions and all are resulting in errors.
Attempts include (please understand I'm pretty new to all of this):
local buffer = ""
function pipe_count(input)
  a = "|"
  buffer = buffer..input.."|"
  while #a < 5 do
    buffer = buffer..input.."|"
    return buffer
  end
end
local buffer = ""
function pipe_count(input)
  buffer = buffer..input.."|"
  mult = tonumber(buffer:match("(.-|.-|.-|(%d)|.*)"))
  while buffer do
    for i = 1, mult do
      buffer = buffer..input.."|"
    end
    return buffer
  end
Those were two examples I tried. I deleted my other futile attempts to capture the exact string length. My current issue is that it takes the INPUT captures, as each one is received, and sends them to the next function before the entire string has been captured. So, if I had received the string at the top, it would look like this:
`INPUT`
`INPUT|INPUT`
`INPUT|INPUT|INPUT`
`INPUT|INPUT|INPUT|5`
`INPUT|INPUT|INPUT|5|INPUT`
`INPUT|INPUT|INPUT|5|INPUT|INPUT` etc
until finally the string below is received:
`INPUT|INPUT|INPUT|5|INPUT|INPUT|INPUT|INPUT|INPUT|`
At this point, my file runs as it should. But up until this point, I'm getting errors since the parameters of the function(s) aren't fully met.
Ideally, I want that last string before moving on.
Any ideas would be very welcomed and appreciated.
Cheers
ETA: These INPUT's are filling a buffer. I want that check digit to be responsible for the string to only be used if the length value is met. Again, I really appreciate all input. Thank you.
ETA: Example code tried and more input details.
All strings in Lua are internalized, so it's usually a better idea to push strings onto an array than to repeatedly rebuild the same string. This example takes input line by line from stdin. 3 data inputs, followed by a number, followed by that number of data inputs. There are plenty of other ways to do it, but this is pretty easy to follow.
local buffer = {}
function process_input(input)
  if #buffer == 3 then
    input = tonumber(input)
  end
  table.insert(buffer,input)
  if #buffer > 4 and #buffer == buffer[4] + 4 then
    local pipe_delim = table.concat(buffer,'|')
    buffer = {}
    return pipe_delim
  end
end
repeat
  local input = io.read()
  local pipe_delim = process_input( input )
  if pipe_delim then
    print('Got:', pipe_delim)
  end
until false
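The buffering idea can also be sketched in Python, purely as an illustration and not as an exact port of the Lua code. It assumes the layout used in the example above: three data fields, then a count N, then N more data fields. The class name `FrameCollector` and its `feed` method are hypothetical. Unlike the Lua version, this sketch also completes a frame whose count is 0.

```python
class FrameCollector:
    """Buffer fields until a complete frame has arrived.

    Assumed layout: three data fields, a count N, then N more data fields.
    """

    HEADER_FIELDS = 3  # data fields that precede the count

    def __init__(self):
        self.fields = []

    def feed(self, field):
        """Add one field; return the pipe-delimited frame once it is complete."""
        self.fields.append(field)
        if len(self.fields) <= self.HEADER_FIELDS:
            return None  # the count has not arrived yet
        # int() raises ValueError if the count field is not numeric.
        expected = self.HEADER_FIELDS + 1 + int(self.fields[self.HEADER_FIELDS])
        if len(self.fields) < expected:
            return None  # still waiting for more data fields
        frame = "|".join(self.fields)
        self.fields = []  # start fresh for the next frame
        return frame


collector = FrameCollector()
for item in ["a", "b", "c", "2", "x", "y"]:
    frame = collector.feed(item)
    if frame is not None:
        print("Got:", frame)  # Got: a|b|c|2|x|y
```

Downstream functions only ever see the value returned once the frame is complete, which is the behavior the question asks for.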

Python 3- check if buffered out bytes form a valid char

I am porting some code from Python 2.7 to 3.4.2, and I am stuck on the bytes vs. string complication.
I read this 3rd point in the wolf's answer
Exactly n bytes may cause a break between logical multi-byte characters (such as \r\n in binary mode and, I think, a multi-byte character in Unicode) or some underlying data structure not known to you;
So, when I buffer-read a file (say, 1 byte at a time) and the very first character happens to be a 6-byte Unicode character, how do I figure out how many more bytes need to be read? If I do not read up to the end of the character, it will be skipped from processing, because the next read(x) will read x bytes relative to its last position (i.e. partway through that character's bytes).
I tried the following approach:
import sys, os
def getBlocks(inputFile, chunk_size=1024):
    while True:
        try:
            data = inputFile.read(chunk_size)
            if data:
                yield data
            else:
                break
        except IOError as strerror:
            print(strerror)
            break
def isValid(someletter):
    try:
        someletter.decode('utf-8', 'strict')
        return True
    except UnicodeDecodeError:
        return False
def main(src):
    aLetter = bytearray()
    with open(src, 'rb') as f:
        for aBlock in getBlocks(f, 1):
            aLetter.extend(aBlock)
            if isValid(aLetter):
                # print("char is now a valid one") # just for acknowledgement
                # do more
                pass
            else:
                aLetter.extend( getBlocks(f, 1) )
Questions:
1. Am I doomed if I try fileHandle.seek(-ve_value_here, 1)?
2. Python must have something built in to deal with this. What is it?
3. How can I really test whether the program meets its purpose of ensuring complete characters are read (right now I have only simple English files)?
4. How can I determine the best chunk_size to make the program faster? I mean, reading 1024 bytes where the first 1023 bytes were 1-byte-representable chars and the last was a 6-byter leaves me with the only option of reading 1 byte each time.
Note: I can't prefer buffered reading as I do not know range of input file sizes in advance
The answer to #2 will solve most of your issues. Use an IncrementalDecoder via codecs.getincrementaldecoder. The decoder maintains state and only outputs fully decoded sequences:
#!python3
import codecs
import sys
byte_string = '\u5000\u5001\u5002'.encode('utf8')
# Get the UTF-8 incremental decoder.
decoder_factory = codecs.getincrementaldecoder('utf8')
decoder_instance = decoder_factory()
# Simple example, read two bytes at a time from the byte string.
result = ''
for i in range(0,len(byte_string),2):
    chunk = byte_string[i:i+2]
    result += decoder_instance.decode(chunk)
    print('chunk={} state={} result={}'.format(chunk,decoder_instance.getstate(),ascii(result)))
result += decoder_instance.decode(b'',final=True)
print(ascii(result))
Output:
chunk=b'\xe5\x80' state=(b'\xe5\x80', 0) result=''
chunk=b'\x80\xe5' state=(b'\xe5', 0) result='\u5000'
chunk=b'\x80\x81' state=(b'', 0) result='\u5000\u5001'
chunk=b'\xe5\x80' state=(b'\xe5\x80', 0) result='\u5000\u5001'
chunk=b'\x82' state=(b'', 0) result='\u5000\u5001\u5002'
'\u5000\u5001\u5002'
Note after the first two bytes are processed the internal decoder state just buffers them and appends no characters to the result. The next two complete a character and leave one in the internal state. The last call with no additional data and final=True just flushes the buffer. It will raise an exception if there is an incomplete character pending.
Now you can read your file in whatever chunk size you want, pass them all through the decoder and be sure that you only have complete code points.
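Applied to a binary stream instead of a byte string, the approach might look like the sketch below. The function name `decode_in_chunks` and its parameters are assumptions made for illustration, and `io.BytesIO` stands in for a file opened with mode `'rb'`.

```python
import codecs
import io


def decode_in_chunks(byte_stream, chunk_size=2, encoding="utf-8"):
    """Yield decoded text from a binary stream read chunk_size bytes at a time."""
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        chunk = byte_stream.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)  # '' while a character is still incomplete
        if text:
            yield text
    # final=True raises UnicodeDecodeError if part of a character is left over.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


data = io.BytesIO("\u5000\u5001\u5002".encode("utf-8"))
print([ascii(piece) for piece in decode_in_chunks(data)])
```

Because the decoder carries the incomplete bytes over to the next call, the chunk size can be chosen for speed alone; it does not need to line up with character boundaries.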
Note that with Python 3, you can just open the file and declare the encoding. The chunk you read will actually be processed Unicode code points using an IncrementalDecoder internally:
input.txt (saved in UTF-8 without BOM)
我是美国人。
Normal text.
Code:
with open('input.txt',encoding='utf8') as f:
    while True:
        data = f.read(2) # reads 2 Unicode codepoints, not bytes.
        if not data: break
        print(ascii(data))
Result:
'\u6211\u662f'
'\u7f8e\u56fd'
'\u4eba\u3002'
'\nN'
'or'
'ma'
'l '
'te'
'xt'
'.'

unexpected symbol near '\' when loadstring byte

While trying to create a tool that converts Lua code into byte escapes and then passes it to string.dump, I got an error.
Code used :
s = [[
print("hello lua user")
]]
local byte = ""
for i = 1, s:len() do
  byte = byte.."\\"..tostring(s:byte(i))
end
-- Creating the function to use in string.dump
f, err = loadstring(byte)
print(err)
local output = string.dump(f)
The error in the title comes from printing err.
The weird thing is that if I print(byte) and then manually paste it inside loadstring as a quoted string, it works.
Manually pasting it won't do, since I need this to be automated.
You are confusing escape sequences in Lua. Let's check a simpler example:
In a system using ASCII, '\97' is equivalent to 'a', so
print('\97')
print('a')
Both lines print the character a, but what you are converting is like this:
print('\\97')
This prints \97 itself, not a.
To make your code work, add these lines after you get byte.
local f1, err1 = loadstring("return '" .. byte .. "'")
byte = f1()
This call to loadstring converts a string like '\\97' back to '\97'.
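What that extra loadstring call achieves can be mimicked outside Lua. The Python sketch below is only an illustration of the idea: `expand_decimal_escapes` is a hypothetical helper that handles nothing but backslash-plus-decimal-digits escapes, and it assumes plain ASCII source text.

```python
import re


def expand_decimal_escapes(text):
    """Replace each backslash followed by 1-3 decimal digits with that character."""
    return re.sub(r"\\(\d{1,3})", lambda m: chr(int(m.group(1))), text)


source = 'print("hi")'

# Build the escaped form the same way the loop in the question does:
# a literal backslash followed by the decimal byte value of each character.
escaped = "".join("\\" + str(ord(ch)) for ch in source)
print(escaped)

# Interpreting the escapes gives back the original source text.
print(expand_decimal_escapes(escaped) == source)  # True
```

Until the escapes are interpreted, the string is just backslashes and digits, which is why passing it directly to loadstring fails with the unexpected symbol error.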
