syscall lseek() is slower in ubuntu 2.6.32 vs 2.6.24 - linux

I am running the following code on Ubuntu 14 and Ubuntu 18. It is 6x slower on 18 on the same hardware. Am I doing something wrong?
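#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    int fd;
    off_t m;
    time_t start, ed;
    int i, k;
    if (argc < 2) exit(0);
    /* O_CREAT requires a mode argument */
    fd = open(argv[1], O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        printf("cannot open file %s\n", argv[1]);
        exit(0);
    }
    start = time(0L);
    /* 100 x 500000 = 50 million lseek() calls */
    for (k = 0; k < 100; ++k) {
        for (i = 0; i < 500000; ++i) {
            m = lseek(fd, 0, SEEK_SET);
            if (m == -1) {
                printf("lseek failed\n");
                exit(0);
            }
        }
    }
    ed = time(0L);
    printf("Time: %ld\n", (long)(ed - start));
    return 0;
}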
On Ubuntu 14 this takes 4 seconds.
On Ubuntu 18 it takes 24 seconds.
The hardware is the same.

zx485 was right: the Spectre/Meltdown mitigations in the kernel were slowing it down by a factor of 6.
Disabling the protections with the following commands on Red Hat 7 brought the timing back to normal:
# echo 0 > /sys/kernel/debug/x86/pti_enabled
# echo 0 > /sys/kernel/debug/x86/retp_enabled
# echo 0 > /sys/kernel/debug/x86/ibrs_enabled

Related

Only one thread created with MPI & no speed-up with OpenMP

I actually have two questions but it seems they may be connected:
1) I've tried to run a basic MPI example:
#include <mpi.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char* argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("I am %d from %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
It should output something like:
I am 0 from 2
I am 1 from 2
However, I'm getting the following:
$ mpicc mpi_hello.c -o hello
$ mpirun -np 4 ./hello
I am 0 from 1
I am 0 from 1
I am 0 from 1
I am 0 from 1
$ mpirun -np 2 ./hello
I am 0 from 1
I am 0 from 1
Is it somehow connected to how threads are defined in Linux? I'm running it on Ubuntu 16.04.
2) My OpenMP program:
#include <omp.h>
#include <math.h>
#include <time.h>
#include <iostream>
#include <stdio.h>
const int N = 10000;
int matrix[N][N];
int main()
{
    #pragma omp parallel num_threads(2)
    #pragma omp for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            matrix[i][j] = 1+i;
    clock_t t;
    t = clock();
    #pragma omp parallel num_threads(2)
    #pragma omp for
    for (int i = 0; i < N; i++)
    {
        matrix[i][i] = 0;
        for (int j = 0; j< N; j++)
            if (j != i)
                matrix[i][i] += sin(cos(log(matrix[i][j] + matrix[j][i])));
    }
    t = clock() - t;
    std::cout << "It took " << ((float)t)/CLOCKS_PER_SEC << " sec" << std::endl;
    return 0;
}
It works correctly and uses 2 threads. However, it loads 2 processors (~100% CPU) and takes the same time (~34 seconds) as the similar sequential version (which loads 1 processor, ~50% CPU). I know that OpenMP may need some time to start, but how can that result in the same run time for both programs?
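Answering the MPI part of the question.
Every process reporting rank 0 of 1 usually means that the mpirun launcher and the library the program was compiled against come from different MPI implementations. Do you have MPICH installed? If so, try compiling and running with the matching MPICH pair, like this: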
$ mpicc.mpich mpi_hello.c -o hello
$ mpirun.mpich -np 4 ./hello
I am 0 from 4
I am 1 from 4
I am 2 from 4
I am 3 from 4
It should work.

Real parallelism in Linux shell

I am trying to get real parallelism in the Linux shell, but I can't achieve it.
I have two programs: allones, which only prints '1' characters, and allzeros, which only prints '0' characters.
When I execute "./allones & ./allzeros &", I get long runs of '0's and long runs of '1's that mix in big chunks (e.g. "1111....111000...0000111...111000...000"). My processor has 8 cores.
However, when I ran my own program on a multi-core FPGA (with no OS), distributing the programs across different cores, I got something like "011000101000011010...".
How can I run it on Linux to get a result similar to what I get on the multi-core FPGA?
Sounds like you're experiencing libc's default line buffering:
Here's a test program spam.c:
#include <stdio.h>
int main(int argc, char** argv) {
    while(1) {
        printf("%s", argv[1]);
    }
}
We can run it with:
$ ./spam 0 & ./spam 1 & sleep 1; killall spam
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111(...)000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000(...)
On my systems, each block is exactly 1024 bytes long, strongly hinting at a buffering issue.
Here's the same code with a fflush to prevent buffering:
#include <stdio.h>
int main(int argc, char** argv) {
    while(1) {
        printf("%s", argv[1]);
        fflush(stdout);
    }
}
This is the new output:
100111001100110011001100110011001100110011100111001110011011001100110011001100110011001100110011001100110011001100110011001100011000110001100110001100100110011001100111001101100110011001100110011001100110000000000110010011000110011

My strace program gave wrong output under 64bit linux system

I am building an online judge system. The key point is to trace system calls.
I chose ptrace. The subprocess is stopped by SIGTRAP when it is about to enter a system call; the parent process then reads the subprocess's orig_rax (orig_eax on 32-bit) register to get the system call number.
When the code runs under openSUSE 13.1 32-bit there is no problem: its output is the same as that of the Linux command strace.
But when I test the code under openSUSE 13.1 64-bit or Ubuntu 12.04 64-bit, its output is wrong.
The demo code can be downloaded here : https://gist.github.com/kainwen/41a7bd0198099d766bda
Under a 64-bit Linux system, save the code as strace_ls.c, compile it with gcc strace_ls.c, and run it with ./a.out 2>out.
My code's output is strange. The code and its output are below:
#include <sys/resource.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <sys/user.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
void judge_exe()
{
    pid_t pid ;
    int insyscall = 0 ;
    struct user context ;
    pid = fork() ;
    if (pid == 0) { //child
        ptrace(PTRACE_TRACEME,0,NULL,NULL) ;
        execl("/usr/bin/ls","ls",NULL) ;
    }
    else {//parent
        int status ;
        ptrace(PTRACE_SYSCALL, pid, NULL, NULL);
        while (1) {
            wait(&status) ;
            if (WIFEXITED(status)) //normally terminated
                break;
            else if (WIFSTOPPED(status) && (WSTOPSIG(status)==SIGTRAP)) {
                if (!insyscall) {
                    insyscall = 1;
                    ptrace(PTRACE_GETREGS,pid,NULL,&context.regs);
                    // orig_rax is an unsigned long long on x86-64, so %d is the wrong format
                    fprintf(stderr,"syscall num: %lld \n",(long long)context.regs.orig_rax);
                } else
                    insyscall = 0;
            }
            ptrace(PTRACE_SYSCALL,pid,NULL,NULL);
        }
        return;
    }
}
int main(int argc, char *argv[]) {
    judge_exe();
}
The first lines of output from the code above are:
syscall num: 59
syscall num: 12
syscall num: 9
syscall num: 21
syscall num: 2
syscall num: 4
The first lines of output from the Linux command strace ls are:
execve("/usr/bin/ls", ["ls"], [/* 101 vars */]) = 0
brk(0) = 0x1bab000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efff4326000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib64/mpi/gcc/openmpi/lib64/tls/x86_64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/mpi/gcc/openmpi/lib64/tls/x86_64", 0x7fff00c49b80) = -1 ENOENT (No such file or directory)
I expected the number of the execve system call to be 11, not 59 (which is what my code prints).
Solved
My method is right. System call numbers differ between 32-bit and 64-bit: execve is 11 on 32-bit x86 and 59 on x86-64.

Execute a command in another terminal via /dev/pts

I have a terminal whose stdin is /dev/pts/3 (/proc/xxxx/fd/0 -> /dev/pts/3).
So if (in another terminal) I do:
echo 'do_something_command' > /dev/pts/3
The command is shown in my first (pts/3) terminal, but the command is not executed. And if (in this terminal pts/3) I'm in a program waiting for some data from stdin, the data is written on screen but the program does not capture it from stdin.
What I want to do is execute the command "do_something_command" and not only show it.
Can someone explain this behavior to me? How do I achieve my intention?
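I completely get what you are asking. Writing to /dev/pts/3 only sends text to that terminal's display; for the shell to read it as input, the characters have to be pushed into the terminal's input queue, which is what the TIOCSTI ioctl does. You can achieve this by writing and running a small piece of C code yourself. This should give you some idea.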
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <string.h>
#include <unistd.h>
void print_help(char *prog_name) {
    printf("Usage: %s [-n] DEVNAME COMMAND\n", prog_name);
    printf("Usage: '-n' is an optional argument if you want to push a new line at the end of the text\n");
    printf("Usage: Will require 'sudo' to run if the executable is not setuid root\n");
    exit(1);
}
int main (int argc, char *argv[]) {
    char *cmd = NULL, *nl = "\n";
    int i, fd;
    int devno, commandno, newline;
    int mem_len;
    devno = 1; commandno = 2; newline = 0;
    if (argc < 3) {
        print_help(argv[0]);
    }
    if (argc > 3 && argv[1][0] == '-' && argv[1][1] == 'n') {
        devno = 2; commandno = 3; newline=1;
    } else if (argc > 3 && argv[1][0] == '-' && argv[1][1] != 'n') {
        printf("Invalid Option\n");
        print_help(argv[0]);
    }
    fd = open(argv[devno],O_RDWR);
    if(fd == -1) {
        perror("open DEVICE");
        exit(1);
    }
    /* Join the remaining arguments into one space-separated string */
    mem_len = 0;
    for (i = commandno; i < argc; i++) {
        mem_len += strlen(argv[i]) + 2;
        if (i > commandno) {
            cmd = (char *)realloc((void *)cmd, mem_len);
        } else { // i == commandno
            cmd = (char *)malloc(mem_len);
            cmd[0] = '\0'; /* strcat needs an initialized, empty string */
        }
        strcat(cmd, argv[i]);
        strcat(cmd, " ");
    }
    if (newline == 0)
        usleep(225000);
    /* Push each character into the terminal's input queue */
    for (i = 0; cmd[i]; i++)
        ioctl (fd, TIOCSTI, cmd+i);
    if (newline == 1)
        ioctl (fd, TIOCSTI, nl);
    close(fd);
    free((void *)cmd);
    exit (0);
}
Compile it and run it with sudo. For example, to execute a command on /dev/pts/3, sudo ./a.out -n /dev/pts/3 whoami runs whoami on /dev/pts/3.
This code was completely taken from this page.
You seem to use the wrong quotes around the command.
Either remove the quotes and the echo command, or use echo and back-ticks (`).
Try:
echo `date` > /dev/pts/3
or just
date > /dev/pts/3
Note that whatever runs on /dev/pts/3 wouldn't be able to read what pops up "from behind".

Write log in stdout (MPI)

I am using MPI on Windows with Cygwin. I am trying to use a critical section so that the processes write their logs one at a time, but whatever I do, I always get a mixed log.
setbuf(stdout, 0);
int totalProcess;
MPI_Comm_size(MPI_COMM_WORLD, &totalProcess);
int processRank;
MPI_Comm_rank(MPI_COMM_WORLD, &processRank);
int rank = 0;
while (rank < totalProcess) {
    if (processRank == rank) {
        printf("-----%d-----\n", rank);
        printf("%s", logBuffer);
        printf("-----%d-----\n", rank);
        //fflush(stdout);
    }
    rank ++;
    MPI_Barrier(MPI_COMM_WORLD);
}
I run MPI on a single machine (emulation mode):
mpirun -v -np 2 ./bin/main.out
I want a dedicated block of log output per process. What am I doing wrong?
(When I wrote it, I suspected it would not work correctly...)
This is the same problem asked about here; there is enough buffering going on at various different layers that there's no guarantee that the final output will reflect the order that the individual processes wrote, although in practice it can work for "small enough" outputs.
But if the goal is something like a logfile, MPI-IO provides mechanisms for you to write to a file in exactly such a way - MPI_File_write_ordered, which writes output in order of processors to the file. As an example:
#include <string.h>
#include <stdio.h>
#include "mpi.h"
int main(int argc, char** argv)
{
    int rank, size;
    MPI_File logfile;
    char mylogbuffer[1024];
    char line[128];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_File_open(MPI_COMM_WORLD, "logfile.txt", MPI_MODE_WRONLY | MPI_MODE_CREATE,
                  MPI_INFO_NULL, &logfile);
    /* write initial message */
    sprintf(mylogbuffer,"-----%d-----\n", rank);
    sprintf(line,"Message from proc %d\n", rank);
    for (int i=0; i<rank; i++)
        strcat(mylogbuffer, line);
    sprintf(line,"-----%d-----\n", rank);
    strcat(mylogbuffer, line);
    MPI_File_write_ordered(logfile, mylogbuffer, strlen(mylogbuffer), MPI_CHAR, MPI_STATUS_IGNORE);
    /* write another message */
    sprintf(mylogbuffer,"-----%d-----\nAll done\n-----%d-----\n", rank, rank);
    MPI_File_write_ordered(logfile, mylogbuffer, strlen(mylogbuffer), MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&logfile);
    MPI_Finalize();
    return 0;
}
Compiling and running gives:
$ mpicc -o log log.c -std=c99
$ mpirun -np 5 ./log
$ cat logfile.txt
-----0-----
-----0-----
-----1-----
Message from proc 1
-----1-----
-----2-----
Message from proc 2
Message from proc 2
-----2-----
-----3-----
Message from proc 3
Message from proc 3
Message from proc 3
-----3-----
-----4-----
Message from proc 4
Message from proc 4
Message from proc 4
Message from proc 4
-----4-----
-----0-----
All done
-----0-----
-----1-----
All done
-----1-----
-----2-----
All done
-----2-----
-----3-----
All done
-----3-----
-----4-----
All done
-----4-----

Resources