Linux - Named pipes - losing data

I am using named pipes for IPC. At times the data sent between the processes can be large and frequent, and during those times I see a lot of data loss. Are there any obvious problems in the code below that could cause this?
Thanks
#!/usr/bin/env groovy
import java.io.FileOutputStream;

def bytes = new File('/etc/passwd').bytes
def pipe = new File('/home/mohadib/pipe')

1000.times {
    def fos = new FileOutputStream(pipe)
    fos.write(bytes)
    fos.flush()
    fos.close()
}
#!/usr/bin/env groovy
import java.io.FileInputStream;
import java.io.ByteArrayOutputStream;

def pipe = new File('/home/mohadib/pipe')
def bos = new ByteArrayOutputStream()
def len = -1
byte[] buff = new byte[8192]
def i = 0

while (true) {
    def fis = new FileInputStream(pipe)
    while ((len = fis.read(buff)) != -1) bos.write(buff, 0, len)
    fis.close()
    bos.reset()
    i++
    println i
}

Named pipes lose their contents when the last process closes them. In your example, this can happen if the writer process does another iteration while the reader process is about to do fis.close(). No error is reported in this case.
A possible fix is to arrange that the reader process never closes the fifo. To get rid of the EOF condition when the last writer disconnects, open the fifo for writing, close the read end, reopen the read end and close the temporary write end.
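A simpler variant of the same idea can be sketched in Python (hypothetical path, assuming a Linux-style fifo): if the reader opens the fifo O_RDWR, it always counts as one writer itself, so it never sees EOF and the fifo's contents are never discarded when an external writer disconnects.

```python
import os
import tempfile

# Hypothetical fifo path; assumes a Linux-style fifo.
path = os.path.join(tempfile.mkdtemp(), "pipe")
os.mkfifo(path)

# Open the fifo O_RDWR in the reader. The reader then always counts as
# one writer, so it never sees EOF when an external writer closes.
fd = os.open(path, os.O_RDWR)

# Simulate an external writer that connects, writes, and disconnects.
wfd = os.open(path, os.O_WRONLY)
os.write(wfd, b"hello")
os.close(wfd)

# The data is still there: no EOF, the read returns the bytes.
data = os.read(fd, 5)
print(data)

os.close(fd)
os.unlink(path)
```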

This section worries me:
1000.times {
    def fos = new FileOutputStream(pipe)
    fos.write(bytes)
    fos.flush()
    fos.close()
}
I know that the underlying Unix write() system call does not always write the requested number of bytes. You have to check the return value to see what number was actually written.
I checked the docs for Java, and it appears fos.write() has no return value; it just throws an IOException if anything goes wrong. What does Groovy do with exceptions? Are there any exceptions happening?
If you can, run this under strace and view the results of the read and write system calls. It's possible that the Java VM isn't doing the right thing with the write() system call. I know this can happen because I caught glibc's fwrite implementation doing that (ignoring the return value) two years ago.
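The point about checking write()'s return value is easy to demonstrate from Python, where os.write is a thin wrapper over the system call: a single write to a pipe can accept far fewer bytes than requested (a sketch, assuming Linux).

```python
import os

r, w = os.pipe()
os.set_blocking(w, False)  # so the demo doesn't block when the pipe fills

# Ask the kernel to write 1 MB in one call; a pipe only accepts up to its
# capacity (commonly 64 KiB on Linux), so the return value must be checked.
n = os.write(w, b"x" * 1_000_000)
print(n)

os.close(r)
os.close(w)
```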

Related

Problem with using CreatePseudoConsole to create a pty pipe in python

I am trying to make a pty pipe. For that I have to use the CreatePseudoConsole function from the Windows API. I am loosely copying this sample (which is in C) in Python.
I don't know if it's relevant but I am using Python 3.7.9 and Windows 10.
This is my code:
from ctypes.wintypes import DWORD, HANDLE, SHORT
from ctypes import POINTER, HRESULT
import ctypes
import msvcrt
import os

# The COORD data type used for the size of the console
class COORD(ctypes.Structure):
    _fields_ = [("X", SHORT),
                ("Y", SHORT)]

# HPCON is the same as HANDLE
HPCON = HANDLE

CreatePseudoConsole = ctypes.windll.kernel32.CreatePseudoConsole
CreatePseudoConsole.argtypes = [COORD, HANDLE, HANDLE, DWORD, POINTER(HPCON)]
CreatePseudoConsole.restype = HRESULT

def create_console(width: int, height: int) -> HPCON:
    read_pty_fd, write_fd = os.pipe()
    read_pty_handle = msvcrt.get_osfhandle(read_pty_fd)
    read_fd, write_pty_fd = os.pipe()
    write_pty_handle = msvcrt.get_osfhandle(write_pty_fd)
    # Create the console
    size = COORD(width, height)
    console = HPCON()
    result = CreatePseudoConsole(size, read_pty_handle, write_pty_handle,
                                 DWORD(0), ctypes.byref(console))
    # Check if any errors occurred
    if result != 0:
        raise ctypes.WinError(result)
    # Add references for the fds to the console
    console.read_fd = read_fd
    console.write_fd = write_fd
    # Return the console object
    return console

if __name__ == "__main__":
    consol = create_console(80, 80)
    print("Writing...")
    os.write(consol.write_fd, b"abc")
    print("Reading...")
    print(os.read(consol.read_fd, 1))
    print("Done")
The problem is that it isn't able to read from the pipe. I expected it to print "a", but it just gets stuck on the os.read. Please note that this is the first time I've used the WinAPI, so the problem is likely there.
There is nothing wrong with the code: what you got wrong is your expectations.
What you are doing is writing to the pipe meant to feed ‘keyboard’ input to the program, and reading from another pipe that returns ‘screen’ output from the program. But there is no actual program at the other end of either pipe, and so there is nothing the read call can ever return.
The HPCON handle returned by the CreatePseudoConsole API is supposed to be passed in a thread attribute list to a newly-spawned process via CreateProcess. Pretty much nothing can be done with it other than that. (You cannot even connect yourself to the pseudoconsole, as you would be able to on Unix.) After the handle is passed in this manner, you can communicate with the process using the read_fd and write_fd descriptors.
An article on MSDN provides a full sample in C that creates a pseudoconsole and passes it to a new process; the exact same thing is done by the very source you linked.

Python avoid partial writes with non-blocking write to named pipe

I am running python3.8 on linux.
In my script, I create a named pipe, and open it as follows:
import os
import posix
import time
file_name = 'fifo.txt'
os.mkfifo(file_name)
f = posix.open(file_name, os.O_RDWR | os.O_NONBLOCK)
os.set_blocking(f, False)
Without yet having opened the file for reading elsewhere (for instance, with cat), I start to write to the file in a loop.
base_line = 'abcdefghijklmnopqrstuvwxyz'
s = base_line * 10000 + '\n'
while True:
    try:
        posix.write(f, s.encode())
    except BlockingIOError as e:
        print("Exception occurred: {}".format(e))
        time.sleep(.5)
When I then go to read from the named pipe with cat, I find that a partial write took place.
I am confused how I can know how many bytes were written in this instance. Since the exception was thrown, I do not have access to the return value (num bytes written). The documentation suggests that BlockingIOError has a property called characters_written, however when I try to access this field an AttributeError is raised.
In summary: How can I either avoid this partial write in the first place, or at least know how much was partially written in this instance?
os.write performs an unbuffered write. The docs state that BlockingIOError only has a characters_written attribute when a buffered write operation would block.
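The buffered case can be demonstrated with a short sketch (assuming CPython on Linux): wrapping the non-blocking write end of a pipe in a buffered writer makes the failed write raise BlockingIOError with characters_written set.

```python
import io
import os

r, w = os.pipe()
os.set_blocking(w, False)

# A buffered writer over the non-blocking write end of the pipe.
buffered = io.open(w, "wb", closefd=False)
try:
    # More bytes than the pipe can hold, so the write must eventually block.
    buffered.write(b"x" * 1_000_000)
except BlockingIOError as e:
    # In the buffered case, the exception carries characters_written.
    partial = e.characters_written
    print(partial)

os.read(r, 1_000_000)  # drain the pipe so the leftover buffer can flush
buffered.close()
os.close(r)
os.close(w)
```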
If any bytes were successfully written before the pipe became full, that number of bytes will be returned from os.write. Otherwise, you'll get an exception. Of course, something like a drive failure will also cause an exception, even if some bytes were written. This is no different from how POSIX write works, except instead of returning -1 on error, an exception is raised.
If you don't like dealing with the exception, you can use a wrapper around the file descriptor, such as an io.FileIO object. I've modified your code, since it tries to write the entire buffer every time the loop comes back to the os.write call (if it failed once, it will fail every time):
import io
import os
import time

base_line = 'abcdefghijklmnopqrstuvwxyz'
data = (base_line * 10000 + '\n').encode()

file_name = 'fifo.txt'
os.mkfifo(file_name)
fd = os.open(file_name, os.O_RDWR | os.O_NONBLOCK)
# os.O_NONBLOCK makes os.set_blocking(fd, False) unnecessary.

with io.FileIO(fd, 'wb') as f:
    written = 0
    while written < len(data):
        n = f.write(data[written:])
        if n is None:
            time.sleep(.5)
        else:
            written += n
BTW, you might use the selectors module instead of time.sleep; I noticed a slight delay when trying to read from the pipe because of the sleep delay, which shouldn't happen if you use the selectors module:
import selectors  # additionally needed for this variant

with io.FileIO(fd, 'wb') as f:
    written = 0
    sel = selectors.DefaultSelector()
    sel.register(f, selectors.EVENT_WRITE)
    while written < len(data):
        n = f.write(data[written:])
        if n is None:
            # Wait here until we can start writing again.
            sel.select()
        else:
            written += n
    sel.unregister(f)
Some useful information can also be found in the answer to POSIX named pipe (fifo) drops record in nonblocking mode.

Read on pipe being blocked when child (writer) is put to sleep on Python 3

In the following code, I'm trying to make the child send a string and then sleep for a while. Every time the child sleeps, for some reason, the parent's read blocks. The read works fine if the sleep is removed, but the sleep is essential for the application I'm making.
import os

nameout, namein = os.pipe()
cpid = os.fork()
if cpid == 0:  # Child process
    os.close(nameout)
    namein = os.fdopen(namein, 'w')
    namein.write("Empire of the Clouds")
    print("I've just written")
    os.system('sleep 60')  # Removing this makes everything work
else:  # Parent process
    print("I'm inside parent")
    os.close(namein)
    songname = os.fdopen(nameout)
    print("I'm going to read")
    songAndNum = songname.read()
    print("Song name read")
    print(songAndNum)
Could you please tell me where I'm going wrong, or offer some alternative? Thanks in advance.
There are two issues here. Firstly the string probably has not really been written into the pipe when the child goes to sleep. The string is probably still sitting in an internal buffer. To fix, call namein.flush() after namein.write().
Secondly, songname.read() will block until a large amount of data is collected from the pipe or until the pipe gets closed as a side effect of the child process exiting. To fix, change songname.read() to songname.readline() and in the child write a newline \n after the title string before calling flush().
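A sketch of the program with both fixes applied (the sleep is dropped here only to keep the demo quick; with flush() and readline() in place, a sleeping child no longer blocks the parent):

```python
import os

nameout, namein = os.pipe()
cpid = os.fork()
if cpid == 0:  # Child process
    os.close(nameout)
    w = os.fdopen(namein, 'w')
    w.write("Empire of the Clouds\n")  # newline so the parent can readline()
    w.flush()  # push the data out of the file object's buffer immediately
    # The child could now sleep; the parent is no longer blocked on it.
    os._exit(0)
else:  # Parent process
    os.close(namein)
    reader = os.fdopen(nameout)
    song = reader.readline().rstrip('\n')
    print(song)
    os.waitpid(cpid, 0)
```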

RxScala: How to keep the thread doing Observable.interval alive?

I am trying to write a simple RxScala program:
import rx.lang.scala.Observable
import scala.concurrent.duration.DurationInt
import scala.language.{implicitConversions, postfixOps}

object Main {
  def main(args: Array[String]): Unit = {
    val o = Observable.interval(1 second)
    o.subscribe(println(_))
  }
}
When I run this program, I do not see anything printed out. I suspect that this is because that thread producing the numbers in Observable.interval dies. I noticed a call to waitFor(o) in the RxScalaDemo, but I can't figure out where that is imported from.
How do I keep this program running for ever printing the number sequence?
Here is one way to block the main thread from exiting:
import java.util.concurrent.CountDownLatch

val o = Observable.interval(1 second)
val latch = new CountDownLatch(1)
o.subscribe(i => {
  print(i)
  if (i >= 5) latch.countDown()
})
latch.await()
This is a fairly common pattern: use CountDownLatch.await to block the main thread, then countDown the latch when you are done with what you are doing, thus releasing the main thread.
You're not seeing anything because your main method exits immediately after you subscribe to the Observable. At that point, your program is done.
A common trick for test programs like this is to read a byte from stdin once you've subscribed.

Groovy: Appending to large files

How do I append to large files efficiently? I have a process that has to continually append to a file, and as the file grows, performance seems to degrade as well. Is there any way to specify a large buffer size for the append?
While Don's approach is valid in general (though, as posted, it will throw an exception because of a syntax error, and you need to flush() a BufferedOutputStream), I'd like to elaborate further.
Groovy does not provide special objects for I/O operations. Thus, you would use Java's FileOutputStream (for writing bytes) or FileWriter (for writing strings). Both provide a constructor that takes a boolean append parameter.
For both, there exist decorators (BufferedOutputStream and BufferedWriter), which are buffered. "Buffering", in this scope, means, that contents are not necessarily written instantly to the underlying stream, and thus, there's a potential for I/O optimization.
Don already provided a sample for the BufferedOutputStream, and here's one for the BufferedWriter:
File file = new File("foo")
if (file.exists()) {
    assert file.delete()
    assert file.createNewFile()
}
boolean append = true
FileWriter fileWriter = new FileWriter(file, append)
BufferedWriter buffWriter = new BufferedWriter(fileWriter)
100.times { buffWriter.write "foo" }
buffWriter.flush()
buffWriter.close()
While Groovy does not provide its own I/O objects, the Groovy JDK (GDK) enhances several Java types by adding convenience methods. In the scope of I/O outputting, the OutputStream and the File types are relevant.
So, finally, you can work with those the "Groovy way":
new File("foo").newOutputStream().withWriter("UTF-8") { writer ->
    100.times { writer.write "foo" + it }
}
EDIT: As per your further inquiry:
None of the GDK methods allows for setting a buffer size.
The above "Groovy" code will overwrite the file if called repeatedly. - In contrast, the following piece of code will append the string to an existing file and, thus, can be called repeatedly:
new File("foo").withWriterAppend("UTF-8") { it.write("bar") }
def file = new File('/path/to/file')

// Create an output stream that writes to the file in 'append' mode
def fileOutput = new FileOutputStream(file, true)

// Buffer the output - set bufferSize to whatever size buffer you want to use
def bufferSize = 512
def bufferedOutput = new BufferedOutputStream(fileOutput, bufferSize)

try {
    byte[] contentToAppend = // Get the content to write to the file
    bufferedOutput.write(contentToAppend)
} finally {
    bufferedOutput.close()
}
In the JVM on Windows, the append flag has been implemented inefficiently with a seek operation. This is neither atomic nor very performant when opening the file multiple times. It is supposed to be fixed somewhere in Java VM 7: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6631352
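For contrast, on POSIX systems O_APPEND itself is specified to position at end-of-file atomically for every write, so independently opened append-mode descriptors interleave safely. A Python sketch (hypothetical scratch file, assuming Linux):

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

# Two independent descriptors, both in O_APPEND mode.
fd1 = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
fd2 = os.open(path, os.O_WRONLY | os.O_APPEND)

os.write(fd1, b"one\n")
os.write(fd2, b"two\n")    # lands after "one\n", not over it
os.write(fd1, b"three\n")  # each O_APPEND write seeks to EOF atomically

os.close(fd1)
os.close(fd2)

with open(path) as f:
    content = f.read()
print(content)
```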
