Interprocess communication via Pipes - linux

It is known that, in Linux interprocess communication, processes can communicate with each other through a special file called a "pipe".
It is also known that one process writes to that file and another process reads from it.
Now, the questions are:
Are these write and read operations performed in parallel during the communication? And if not,
what happens when one of the processes enters the SLEEP state during the communication? Does it perform its write first so that the second process can read, or does it go to sleep without performing any write or read?

The sending process can write until the pipe buffer is full (64k on Linux since 2.6.11). After that, write(2) will block.
The receiving process will block until data is available to read(2).
For a more detailed look into pipe buffering, look at https://unix.stackexchange.com/a/11954.
For example, this program
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int
main(void)
{
    int pipefd[2];
    pid_t cpid;
    char wbuf[32768];
    char buf[16384];
    /* Initialize writer buffer with 012...89 sequence */
    for (size_t i = 0; i < sizeof(wbuf); i++)
        wbuf[i] = '0' + i % 10;
    if (pipe(pipefd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    cpid = fork();
    if (cpid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (cpid == 0) { /* Child reads from pipe */
        close(pipefd[1]); /* Close unused write end */
        while (read(pipefd[0], buf, sizeof(buf)) > 0)
            ; /* Discard the data until EOF */
        close(pipefd[0]);
        _exit(EXIT_SUCCESS);
    } else { /* Parent writes sequence to pipe */
        close(pipefd[0]); /* Close unused read end */
        for (int i = 0; i < 5; i++) {
            if (write(pipefd[1], wbuf, sizeof(wbuf)) == -1) {
                perror("write");
                exit(EXIT_FAILURE);
            }
        }
        close(pipefd[1]); /* Reader will see EOF */
        wait(NULL); /* Wait for child */
        exit(EXIT_SUCCESS);
    }
}
will produce the following sequence when run with gcc pipes.c && strace -e trace=open,close,read,write,pipe,clone -f ./a.out:
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
close(3) = 0
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\3\2\0\0\0\0\0"..., 832) = 832
close(3) = 0
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f32117489d0) = 21114
close(3) = 0
write(4, "01234567890123456789012345678901"..., 32768) = 32768
write(4, "01234567890123456789012345678901"..., 32768) = 32768
write(4, "01234567890123456789012345678901"..., 32768strace: Process 21114 attached
<unfinished ...>
[pid 21114] close(4) = 0
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, <unfinished ...>
[pid 21113] <... write resumed> ) = 32768
[pid 21114] <... read resumed> "45678901234567890123456789012345"..., 16384) = 16384
[pid 21113] write(4, "01234567890123456789012345678901"..., 32768 <unfinished ...>
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, <unfinished ...>
[pid 21113] <... write resumed> ) = 32768
[pid 21114] <... read resumed> "45678901234567890123456789012345"..., 16384) = 16384
[pid 21113] write(4, "01234567890123456789012345678901"..., 32768 <unfinished ...>
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, <unfinished ...>
[pid 21113] <... write resumed> ) = 32768
[pid 21114] <... read resumed> "45678901234567890123456789012345"..., 16384) = 16384
[pid 21113] close(4) = 0
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, "45678901234567890123456789012345"..., 16384) = 16384
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, "45678901234567890123456789012345"..., 16384) = 16384
[pid 21114] read(3, "", 16384) = 0
[pid 21114] close(3) = 0
[pid 21114] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21114, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
You'll notice that the reads and writes are interleaved and that the writing and reading processes will block a few times as either the pipe is full or not enough data is available for reading.

Related

How to properly pass buffer pointers to Linux system calls in x86_64 assembly?

Hello I am trying to learn how to make system calls with x86_64 assembly on Linux. I am running into an issue where I can't seem to figure out how to properly pass the argument for getpeername.
In this link using C it looks like they are using the address-of operator to pass the arguments. I can't figure out how to replicate this in assembly. Below are my definitions, followed by the strace output with and without brackets around my buffer.
First I defined my buffer in the section .data
ip_buff: times 14 db 0
.length: equ $-ip_buff
This is a macro
%define SYS_getpeername 52
r12 stores the return value from the socket accept call
syscall getpeername,r12,ip_buff,15
Here is the strace not using brackets
[pid 749] accept(3, NULL, NULL <unfinished ...>
[pid 761] read(4, "", 1024) = 0
[pid 761] write(1, "", 0) = 0
[pid 761] getpeername(4, 0x600733, 0xf) = -1 EFAULT (Bad address)
Here is the strace for when I do use brackets.
[pid 749] accept(3, NULL, NULL <unfinished ...>
[pid 745] read(4, "GET / HTTP/1.1\r\nHost: 127.0.0.1:"..., 1024) = 78
[pid 745] write(1, "GET / HTTP/1.1\r\nHost: 127.0.0.1:"..., 78) = 78
[pid 745] getpeername(4, NULL, 0xf) = -1 EFAULT (Bad address)
How can I properly make this system call?
The actual problem is not with the buffer but with its length. Notice that the prototype has socklen_t *addrlen, so that argument must be a pointer: the kernel reads the buffer size through it and writes the actual address size back. The value 15 that you pass is not a valid pointer, hence the EFAULT.
You should change .length: equ $-ip_buff to ip_length: dd $-ip_buff and then use syscall getpeername,r12,ip_buff,ip_length

az acs kubernetes get-credentials | Invalid EC key. | ssh known_hosts corrupted

I ran into this issue, and it took me a while, some intuition, an educated guess, and an strace -f run to find the bug.
I believe it is an exception from the paramiko library that is caught incorrectly and hidden by some Azure CLI exception handler.
Anyway, I am leaving this here so that future me, and future you, can find it.
az acs kubernetes get-credentials
Invalid EC key.
$ strace -f az acs kubernetes get-credentials
(the interesting part: MissingHostKeyPolicy and hostkeys; my guess was the known_hosts file)
[pid 9035] open("/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/paramiko/client.py", O_RDONLY) = 4
[pid 9035] fstat(4, {st_mode=S_IFREG|0664, st_size=30983, ...}) = 0
[pid 9035] fstat(4, {st_mode=S_IFREG|0664, st_size=30983, ...}) = 0
[pid 9035] read(4, "# Copyright (C) 2006-2007 Robey"..., 8192) = 8192
[pid 9035] read(4, " sock=None,\n gss_auth=Fal"..., 4096) = 4096
[pid 9035] read(4, "t be\n verified\n "..., 4096) = 4096
[pid 9035] read(4, " )\n else:\n "..., 4096) = 4096
[pid 9035] read(4, " chan = self._transport.open_"..., 4096) = 4096
[pid 9035] read(4, " allowed_types = "..., 4096) = 4096
[pid 9035] read(4, " MissingHostKeyPolicy (object):\n"..., 4096) = 2311
[pid 9035] read(4, "", 4096) = 0
[pid 9035] close(4) = 0
[pid 9035] stat("/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/paramiko/hostkeys.py", {st_mode=S_IFREG|0664, st_size=13135, ...}) = 0
(the part just before the crash)
[pid 9035] read(4, " bn_ptr = self._lib.BN_bin2bn"..., 4096) = 4096
[pid 9035] read(4, "lf._lib.BIO_new_mem_buf(\n "..., 4096) = 4096
[pid 9035] read(4, " hashes.SHA1,\n "..., 4096) = 4096
[pid 9035] read(4, "DSA_free)\n\n p = self._int"..., 4096) = 4096
[pid 9035] read(4, "ror(\n \"MD5 is not"..., 4096) = 4096
[pid 9035] read(4, " CRL version. We only support v2"..., 4096) = 4096
[pid 9035] read(4, ": {0}'.format(extension.oid)\n "..., 4096) = 4096
[pid 9035] read(4, " return self._evp_pkey_to_priv"..., 4096) = 4096
[pid 9035] read(4, "eturn _CertificateRevocationList"..., 4096) = 4096
[pid 9035] read(4, " _Reasons.UNSUPPORTED_CIPHER\n "..., 4096) = 4096
[pid 9035] read(4, "i.NULL)\n ec_cdata = self."..., 4096) = 4096
[pid 9035] read(4, "res != 1:\n self._cons"..., 4096) = 4096
[pid 9035] read(4, "ding must be an item from the En"..., 4096) = 4096
[pid 9035] read(4, " write_bio = self._li"..., 4096) = 4096
[pid 9035] read(4, " parameter_numbers = numb"..., 4096) = 4096
[pid 9035] read(4, " self._lib.NID_X25519, se"..., 4096) = 1791
[pid 9035] read(4, "", 4096) = 0
I was right: two lines of my known_hosts file were glued together. Strangely, only the az CLI failed because of it.
|1|YDdg1mMCRjdmiJt7MkMpelWDk2o=|i1EMCbgw/5my5flPsw2BiFa8mUM= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTAAABBBCpdyijGVsvUtMdlLoB5ekaQHQ2ZzQ0Z8UY5xdOAx9qqb3cYCYJgv8mc32yUzSu8D4iKfW2E5JXB8fG5otZsi3E=
|1|bssRIVCpG+vfNtdM4RAwH6zUCW8=|7AFIFRTmvoqO12bTZ0CyTgTqKdw= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBaXNnBKKBlQ1WDqy90c1zNjklBL7zXqDIB|1|AOjIgeSGPSh32t33uEGOX3iycrc=|7LupvcIR6QL8USA193kRORnA1rQ= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyTYAAAAIbmlzdHAyNTYAAABBBBaXNnBKKBlQ1WDqy90c1zNjklBL7zXqDIBbAp0NBe9dYmuyTytpGxOWvmWoA1gjbNd/ekXW+m8gd6Yf8pDE/Cg=
|1|67+OBFoZyiXGx6mDl+lu/3SpBOc=|K6GLNh6ztZ9eb8cNGV64Rn3/yIM= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBD7n79Vhwqw8zmRFFQvjnE2UB24vl8JWAN0ZPPFDOtr9jBd90AKsbZEXmqZhP1GennphesTU1cdHayQrQGbjV8=
A side topic:
After splitting the line, the error message changed to something more readable. In the end I removed the corrupted line to make it work.
Edited known_hosts file:
|1|YDdg1mMCRjdmiJt7MkMpelWDk2o=|i1EMCbgw/5my5flPsw2BiFa8mUM= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTAAABBBCpdyijGVsvUtMdlLoB5ekaQHQ2ZzQ0Z8UY5xdOAx9qqb3cYCYJgv8mc32yUzSu8D4iKfW2E5JXB8fG5otZsi3E=
|1|bssRIVCpG+vfNtdM4RAwH6zUCW8=|7AFIFRTmvoqO12bTZ0CyTgTqKdw= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBaXNnBKKBlQ1WDqy90c1zNjklBL7zXqDIB
|1|AOjIgeSGPSh32t33uEGOX3iycrc=|7LupvcIR6QL8USA193kRORnA1rQ= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyTYAAAAIbmlzdHAyNTYAAABBBBaXNnBKKBlQ1WDqy90c1zNjklBL7zXqDIBbAp0NBe9dYmuyTytpGxOWvmWoA1gjbNd/ekXW+m8gd6Yf8pDE/Cg=
|1|67+OBFoZyiXGx6mDl+lu/3SpBOc=|K6GLNh6ztZ9eb8cNGV64Rn3/yIM= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBD7n79Vhwqw8zmRFFQvjnE2UB24vl8JWAN0ZPPFDOtr9jBd90AKsbZEXmqZhP1GennphesTU1cdHayQrQGbjV8=
az acs kubernetes get-credentials --resource-group=myResourcGroup --name=myK8sCluster
('|1|bssRIVCpG+vfNtdM4RAwH6zUCW8=|7AFIFRTmvoqO12bTZ0CyTgTqKdw= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBaXNnBKKBlQ1WDqy90c1zNjklBL7zXqDIB', Error('Incorrect padding',))
Traceback (most recent call last):
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/main.py", line 36, in main
cmd_result = APPLICATION.execute(args)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/core/application.py", line 216, in execute
result = expanded_arg.func(params)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 377, in __call__
return self.handler(*args, **kwargs)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 620, in _execute_command
reraise(*sys.exc_info())
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 602, in _execute_command
result = op(client, **kwargs) if client else op(**kwargs)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/command_modules/acs/custom.py", line 776, in k8s_get_credentials
_k8s_get_credentials_internal(name, acs_info, path, ssh_key_file)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/command_modules/acs/custom.py", line 797, in _k8s_get_credentials_internal
'.kube/config', path_candidate, key_filename=ssh_key_file)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/azure/cli/command_modules/acs/acs_client.py", line 70, in secure_copy
ssh.load_system_host_keys()
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/paramiko/client.py", line 102, in load_system_host_keys
self._system_host_keys.load(filename)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/paramiko/hostkeys.py", line 97, in load
e = HostKeyEntry.from_line(line, lineno)
File "/home/kuba/lib/azure-cli/local/lib/python2.7/site-packages/paramiko/hostkeys.py", line 366, in from_line
raise InvalidHostKey(line, e)
InvalidHostKey: ('|1|bssRIVCpG+vfNtdM4RAwH6zUCW8=|7AFIFRTmvoqO12bTZ0CyTgTqKdw= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBaXNnBKKBlQ1WDqy90c1zNjklBL7zXqDIB', Error('Incorrect padding',))

why does shell save, dup, and restore redirected descriptor in the parent, rather than redirect in the child?

I wanted to understand how redirection is implemented in a (POSIX) shell, so I looked at dash (the simplest shell) code, and here is how it works.
A dash script like:
date > foobar.txt
date
is (as an SSCCE) handled like this:
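int fd;
int saved;
fd = open64("foobar.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666); /* O_CREAT requires a mode */
saved = fcntl(1, F_DUPFD, 10); /* keep a copy of the original stdout */
dup2(fd, 1); /* stdout now refers to foobar.txt */
if (!fork()) {
    execl("/bin/date", "date", (char *)NULL);
    _exit(127); /* only reached if execl fails */
}
dup2(saved, 1); /* restore the original stdout */
if (!fork()) {
    execl("/bin/date", "date", (char *)NULL);
    _exit(127);
}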
This is strange. Why save, dup, and dup again to restore the descriptors in the parent, when it would be much simpler to just dup in the child, with no need to save and restore? This version is simpler, and I checked that it works the same:
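int fd;
if (!fork()) {
    fd = open64("foobar.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666); /* O_CREAT requires a mode */
    dup2(fd, 1);
    execl("/bin/date", "date", (char *)NULL);
    _exit(127); /* only reached if execl fails */
}
if (!fork()) {
    execl("/bin/date", "date", (char *)NULL);
    _exit(127);
}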
I am sure there must be a good reason and I am not understanding something deeper. What is it?
No good reason as far as I can tell. Bash does things the other way around (it forks first and redirects in the child), and the externally observable behavior is the same.
I didn't bother reading the source code, it's easy enough to see what happens using strace. (The : is to prevent the shell from optimizing away the fork.)
$ strace -fetrace=dup2,file,process dash -c 'date > foobar.txt; :'
execve("/usr/bin/dash", ["dash", "-c", "date > foobar.txt; :"], [/* 71 vars */]) = 0
open("foobar.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
dup2(3, 1) = 1
stat("/usr/bin/date", {st_mode=S_IFREG|0755, st_size=105280, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6650bf2750) = 1948
strace: Process 1948 attached
[pid 1947] wait4(-1, <unfinished ...>
[pid 1948] execve("/usr/bin/date", ["date"], [/* 71 vars */]) = 0
[pid 1948] exit_group(0) = ?
[pid 1948] +++ exited with 0 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1948
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1948, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
dup2(10, 1) = 1
exit_group(0) = ?
+++ exited with 0 +++
$ strace -fetrace=dup2,file,process bash -c 'date > foobar.txt; :'
execve("/usr/bin/bash", ["bash", "-c", "date > foobar.txt; :"], [/* 71 vars */]) = 0
stat("/usr/bin/date", {st_mode=S_IFREG|0755, st_size=105280, ...}) = 0
access("/usr/bin/date", R_OK) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd5b78349d0) = 2026
strace: Process 2026 attached
[pid 2025] wait4(-1, <unfinished ...>
[pid 2026] open("foobar.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
[pid 2026] dup2(3, 1) = 1
[pid 2026] execve("/usr/bin/date", ["date"], [/* 71 vars */]) = 0
[pid 2026] exit_group(0) = ?
[pid 2026] +++ exited with 0 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 2026
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2026, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, 0x7ffeffe03c50, WNOHANG, NULL) = -1 ECHILD (No child processes)
exit_group(0) = ?
+++ exited with 0 +++

Implicit system calls in UNIX commands

I've been studying UNIX and system calls and I came across a low-level, tricky question. The question asks which system calls are made for this command:
grep word1 word2 > file.txt
I did some research but could not find many resources on the underlying UNIX calls. It seems to me that the answer would be open (to open file.txt and obtain a file descriptor for it), then dup2 (to make the STDOUT of grep refer to that file descriptor), then write (to write grep's output to STDOUT, which is now file.txt), and finally close (to close the file descriptor of file.txt). However, I have no idea whether I am right or on the correct path. Can anyone with experience in UNIX enlighten me on this topic?
Your research is heading in the right direction. This command is very helpful for tracing the system calls of any program:
strace
On my PC it shows the following output (without stream redirection):
$ strace grep abc ss.txt
execve("/bin/grep", ["grep", "abc", "ss.txt"], [/* 237 vars */]) = 0
brk(0) = 0x13de000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1785694000
close(3) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
stat("ss.txt", {st_mode=S_IFREG|0644, st_size=13, ...}) = 0
open("ss.txt", O_RDONLY) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffa0e4f370) = -1 ENOTTY (Inappropriate ioctl for device)
read(3, "abc\n123\n321\n\n", 32768) = 13
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f178568c000
write(1, "abc\n", 4abc
) = 4
read(3, "", 32768) = 0
close(3) = 0
close(1) = 0
munmap(0x7f178568c000, 4096) = 0
close(2) = 0
exit_group(0) = ?

Different pipe size between Ubuntu and Debian

I was testing file transfer with OpenBSD netcat and noticed that it takes a bit more time to transfer the same file on Ubuntu than on Debian. Using strace, I found that data is transferred in 64k blocks on Ubuntu.
mgamal@ubuntu:~$ strace cat test | nc -vvvv 10.10.172.11 8888
...
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
On Debian, on the other hand:
mgamal@ubuntu:~$ strace cat test | nc -vvvv 10.10.172.11 8888
....
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
I wrote the following piece of code on Debian to check the pipe size:
#define _GNU_SOURCE /* needed for F_GETPIPE_SZ */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main(void)
{
    int pipefd[2];
    int size;
    if (pipe(pipefd) == -1) {
        perror("pipe");
        return 1;
    }
    size = fcntl(pipefd[0], F_GETPIPE_SZ);
    printf("%d\n", size);
    size = fcntl(pipefd[1], F_GETPIPE_SZ);
    printf("%d\n", size);
    return 0;
}
Running it still reports 64k:
mgamal@debian:~$ ./test
65536
65536
I also tried using something other than netcat to check, and I still see the pipe size being 128k:
root@debian:~# strace cat foo | less
...
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
I checked the source packages for netcat, the kernel, and glibc to see whether the pipe size is set to 128k anywhere, or whether any calls to fcntl() change the pipe size, but could find no trace.
Why is the pipe size reported as 64k, while the actual size is 128k?
GNU cat is in the coreutils package. GNU cat does a stat or fstat on its input and output and looks at st_blksize, the optimal block size for filesystem I/O. It then takes the maximum of that number and a hardwired minimum and uses the result as the buffer size for input and output. This is done in io_blksize. The 64k and 128k figures in your strace output are therefore cat's buffer sizes, not the pipe capacity, which your test program correctly reports as 64k.
Ubuntu 14 comes with coreutils 8.21. The minimum block size in that version is 64KiB.
Debian 8 comes with coreutils 8.23. The minimum block size in that version is 128KiB.

Resources