How can I get the Linux PID for a thread?

I'm interested in obtaining the PID of a thread created inside a Rust program. As stated in the documentation, thread::id() does not work for this purpose. I found "Get the current thread id and process id as integers?", which seemed like the answer, but my experiments show it doesn't work.
This is the code:
extern crate rand;
extern crate libc;

use std::thread::{self, Builder};
use std::process::{self, Command};
use rand::thread_rng;
use rand::RngCore;
use std::time::Duration;
use std::os::unix::thread::JoinHandleExt;
use libc::pthread_t;

fn main() {
    let main_pid = process::id();
    println!("This PID {}", main_pid);

    let b = Builder::new()
        .name(String::from("LongRunningThread"))
        .spawn(move || {
            let mut rng = thread_rng();
            let spawned_pid = process::id();
            println!("Spawned PID {}", spawned_pid);
            loop {
                let u = rng.next_u64() % 1000;
                println!("Processing request {}", u);
                thread::sleep(Duration::from_millis(u));
            }
        })
        .expect("Could not spawn worker thread");

    let p_threadid: pthread_t = b.as_pthread_t();
    println!("Spawned p_threadid {}", p_threadid);

    let thread_id = b.thread().id();
    println!("Spawned thread_id {:?}", thread_id);

    thread::sleep(Duration::from_millis(60_000));
}
The output from running the program on a Linux machine is the following:
This PID 8597
Spawned p_threadid 139858455706368 <-- Clearly wrong
Spawned thread_id ThreadId(1) <-- Clearly wrong
Spawned PID 8597
Processing request 289
Processing request 476
Processing request 361
Processing request 567
The following is an excerpt from the output of htop on my system:
6164 1026 root 20 0 98M 7512 6512 S 0.0 0.0 0:00.03 │ ├─ sshd: dash [priv]
6195 6164 dash 20 0 98M 4176 3176 S 0.0 0.0 0:00.20 │ │ └─ sshd: dash#pts/11
6196 6195 dash 20 0 22964 5648 3408 S 0.0 0.0 0:00.09 │ │ └─ -bash
8597 6196 dash 20 0 2544 4 0 S 0.0 0.0 0:00.00 │ │ └─ ./process_priorities
8598 6196 dash 20 0 2544 4 0 S 0.0 0.0 0:00.00 │ │ └─ LongRunningThre
The PID I want from the spawned thread is 8598, but I can't figure out how to obtain it in a Rust program. Any ideas?

I found the answer using an existing crate called palaver. It includes a gettid() function that works across platforms. The only caveat is that the crate's default configuration uses nightly features, so if you are on stable, make sure to disable them in Cargo.toml like this: palaver = { version = "*", default-features = false }
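On Linux, gettid() corresponds to the gettid(2) system call, which returns exactly the kernel thread ID that htop lists for each thread. If you'd rather not pull in another crate, here is a minimal sketch doing the same lookup directly through libc (Linux only; the trimmed-down thread body and variable names are my own illustration, not the original program):

extern crate libc;

use std::thread::Builder;

fn main() {
    let handle = Builder::new()
        .name(String::from("LongRunningThread"))
        .spawn(|| {
            // gettid(2) returns the kernel thread ID: the number htop shows
            // as this thread's "PID", distinct from the process-wide PID.
            let tid = unsafe { libc::syscall(libc::SYS_gettid) };
            println!("Spawned TID {}", tid);
        })
        .expect("Could not spawn worker thread");

    handle.join().expect("Worker thread panicked");
}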
Now, when I run the code that uses gettid(), this is the output:
This PID 9009
Priority changed for thread
Spawned PID 9010
Processing request 803
Processing request 279
Processing request 624
And the output from my htop:
6164 1026 root 20 0 98M 7512 6512 S 0.0 0.0 0:00.03 │ ├─ sshd: dash [priv]
6195 6164 dash 20 0 98M 4176 3176 S 0.0 0.0 0:00.21 │ │ └─ sshd: dash#pts/11
6196 6195 dash 20 0 22964 5648 3408 S 0.0 0.0 0:00.10 │ │ └─ -bash
9009 6196 dash 20 0 2544 4 0 S 0.0 0.0 0:00.00 │ │ └─ ./process_priorities
9010 6196 dash 20 0 2544 4 0 S 0.0 0.0 0:00.00 │ │ └─ LongRunningThre
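As an additional cross-check (my own illustration, not part of the original answer), on Linux every thread of a process appears as a directory under /proc/self/task, named by its kernel TID, and the name set via Builder::name can be read from the comm file inside it:

use std::fs;

fn main() -> std::io::Result<()> {
    // Each directory under /proc/self/task is named after a kernel thread ID (TID).
    for entry in fs::read_dir("/proc/self/task")? {
        let entry = entry?;
        let tid = entry.file_name().into_string().unwrap_or_default();
        // The thread's name (as set by Builder::name) lives in .../comm.
        let comm = fs::read_to_string(entry.path().join("comm")).unwrap_or_default();
        println!("TID {} -> {}", tid, comm.trim());
    }
    Ok(())
}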

Related

What are llvm_pipe threads?

I'm writing a Rust app that uses a lot of threads. I noticed the CPU usage was high, so I ran top and then hit H to see the threads:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
247759 root 20 0 3491496 104400 64676 R 32.2 1.0 0:02.98 my_app
247785 root 20 0 3491496 104400 64676 S 22.9 1.0 0:01.89 llvmpipe-0
247786 root 20 0 3491496 104400 64676 S 21.9 1.0 0:01.71 llvmpipe-1
247792 root 20 0 3491496 104400 64676 S 20.9 1.0 0:01.83 llvmpipe-7
247789 root 20 0 3491496 104400 64676 S 20.3 1.0 0:01.60 llvmpipe-4
247790 root 20 0 3491496 104400 64676 S 20.3 1.0 0:01.64 llvmpipe-5
247787 root 20 0 3491496 104400 64676 S 19.9 1.0 0:01.70 llvmpipe-2
247788 root 20 0 3491496 104400 64676 S 19.9 1.0 0:01.61 llvmpipe-3
What are these llvmpipe-n threads? Why does my_app launch them? Are they even from my_app for sure?
As the page HHK links to explains, the llvmpipe threads come from your OpenGL driver, which is Mesa.
You said you are running this in a VM. VMs usually don't virtualize GPU hardware, so the Mesa OpenGL driver is doing software rendering. To achieve better performance, Mesa spawns threads to do parallel computations on the CPU.

Can't explain this Node clustering behavior

I'm learning about threads and how they interact with Node's native cluster module. I saw some behavior I can't explain that I'd like some help understanding.
My code:
process.env.UV_THREADPOOL_SIZE = 1;
const cluster = require('cluster');

if (cluster.isMaster) {
  cluster.fork();
} else {
  const crypto = require('crypto');
  const express = require('express');
  const app = express();

  app.get('/', (req, res) => {
    crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
      res.send('Hi there');
    });
  });

  app.listen(3000);
}
I benchmarked this code with one request using ApacheBench.
ab -c 1 -n 1 localhost:3000/ yielded these connection times:
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 605 605 0.0 605 605
Waiting: 605 605 0.0 605 605
Total: 605 605 0.0 605 605
So far so good. I then ran ab -c 2 -n 2 localhost:3000/ (doubling the number of calls from the benchmark). I expected the total time to double, since I limited the libuv thread pool to one thread per child process and I only started one child process. But nothing really changed. Here are those results.
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 0
Processing: 608 610 3.2 612 612
Waiting: 607 610 3.2 612 612
Total: 608 610 3.3 612 612
For extra info, when I further increase the number of calls with ab -c 3 -n 3 localhost:3000/, I start to see a slowdown.
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 599 814 352.5 922 1221
Waiting: 599 814 352.5 922 1221
Total: 599 815 352.5 922 1221
I'm running all this on a quad-core Mac using Node v14.13.1.
tl;dr: how did my benchmark not use up all my threads? I forked one child process with one thread in its libuv pool, so the single call in my benchmark should have been all it could handle without taking longer. And yet the second test (the one that doubled the number of calls) took the same amount of time as the benchmark.

Node.js - spawn is cutting off the results

I'm creating a Node program to return the output of the Linux top command. It's working fine; the only issue is that the command name is cut off: instead of the full command name like /usr/local/libexec/netdata/plugins.d/apps.plugin 1, it returns /usr/local+.
My code
const topparser = require("topparser")
const spawn = require('child_process').spawn

let proc = null
let startTime = 0

exports.start = function (pid_limit, callback) {
    startTime = new Date().getTime()
    proc = spawn('top', ['-c', '-b', '-d', '3'])
    console.log("started process, pid: " + proc.pid)

    let top_data = ""
    proc.stdout.on('data', function (data) {
        console.log('stdout: ' + data);
    })
    proc.on('close', function (code) {
        console.log('child process exited with code ' + code);
    });
} //start

exports.stop = function () {
    console.log("stopped process...")
    if (proc) { proc.kill('SIGINT') } // SIGHUP - linux, SIGINT - windows
} //stop
The results
14861 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kworker/1+
14864 root 20 0 0 0 0 S 0.0 0.0 0:00.02 [kworker/0+
15120 root 39 19 102488 3344 2656 S 0.0 0.1 0:00.09 /usr/bin/m+
16904 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kworker/0+
19031 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kworker/u+
21500 root 20 0 0 0 0 Z 0.0 0.0 0:00.00 [dsc] <def+
22571 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kworker/0+
Any way to fix it?
Best regards
From a top manpage:
In Batch mode, when used without an argument top will format output using the COLUMNS= and LINES= environment variables, if set. Otherwise, width will be fixed at the maximum 512 columns. With an argument, output width can be decreased or increased (up to 512) but the number of rows is considered unlimited.
Add '-w', '512' to the arguments passed to spawn.
Since you work with node, you can query netdata running on localhost for this.
Example:
http://london.my-netdata.io/api/v1/data?chart=apps.cpu&after=-1&options=ms
For localhost netdata:
http://localhost:19999/api/v1/data?chart=apps.cpu&after=-1&options=ms
You can also get systemd services:
http://london.my-netdata.io/api/v1/data?chart=services.cpu&after=-1&options=ms
If you are not planning to update the screen per second, you can instruct netdata to return the average of a longer duration:
http://london.my-netdata.io/api/v1/data?chart=apps.cpu&after=-5&points=1&group=average&options=ms
The above returns the average of the last 5 seconds.
Finally, you can get the latest values of all the metrics netdata monitors with this:
http://london.my-netdata.io/api/v1/allmetrics?format=json
For completeness, netdata can export all the metrics in BASH format for shell scripts. Check this: https://github.com/firehol/netdata/wiki/receiving-netdata-metrics-from-shell-scripts

Why do hGetBuf, hPutBuf, etc. allocate memory?

In the process of doing some simple benchmarking, I came across something that surprised me. Take this snippet from Network.Socket.Splice:
hSplice :: Int -> Handle -> Handle -> IO ()
hSplice len s t = do
  a <- mallocBytes len :: IO (Ptr Word8)
  finally
    (forever $! do
       bytes <- hGetBufSome s a len
       if bytes > 0
         then hPutBuf t a bytes
         else throwRecv0)
    (free a)
One would expect that hGetBufSome and hPutBuf here would not need to allocate memory, as they write into and read from a pre-allocated buffer. The docs seem to back this intuition up... But alas:
individual inherited
COST CENTRE %time %alloc %time %alloc bytes
hSplice 0.5 0.0 38.1 61.1 3792
hPutBuf 0.4 1.0 19.8 29.9 12800000
hPutBuf' 0.4 0.4 19.4 28.9 4800000
wantWritableHandle 0.1 0.1 19.0 28.5 1600000
wantWritableHandle' 0.0 0.0 18.9 28.4 0
withHandle_' 0.0 0.1 18.9 28.4 1600000
withHandle' 1.0 3.8 18.8 28.3 48800000
do_operation 1.1 3.4 17.8 24.5 44000000
withHandle_'.\ 0.3 1.1 16.7 21.0 14400000
checkWritableHandle 0.1 0.2 16.4 19.9 3200000
hPutBuf'.\ 1.1 3.3 16.3 19.7 42400000
flushWriteBuffer 0.7 1.4 12.1 6.2 17600000
flushByteWriteBuffer 11.3 4.8 11.3 4.8 61600000
bufWrite 1.7 6.9 3.0 9.9 88000000
copyToRawBuffer 0.1 0.2 1.2 2.8 3200000
withRawBuffer 0.3 0.8 1.2 2.6 10400000
copyToRawBuffer.\ 0.9 1.7 0.9 1.7 22400000
debugIO 0.1 0.2 0.1 0.2 3200000
debugIO 0.1 0.2 0.1 0.2 3200016
hGetBufSome 0.0 0.0 17.7 31.2 80
wantReadableHandle_ 0.0 0.0 17.7 31.2 32
wantReadableHandle' 0.0 0.0 17.7 31.2 0
withHandle_' 0.0 0.0 17.7 31.2 32
withHandle' 1.6 2.4 17.7 31.2 30400976
do_operation 0.4 2.4 16.1 28.8 30400880
withHandle_'.\ 0.5 1.1 15.8 26.4 14400288
checkReadableHandle 0.1 0.4 15.3 25.3 4800096
hGetBufSome.\ 8.7 14.8 15.2 24.9 190153648
bufReadNBNonEmpty 2.6 4.4 6.1 8.0 56800000
bufReadNBNonEmpty.buf' 0.0 0.4 0.0 0.4 5600000
bufReadNBNonEmpty.so_far' 0.2 0.1 0.2 0.1 1600000
bufReadNBNonEmpty.remaining 0.2 0.1 0.2 0.1 1600000
copyFromRawBuffer 0.1 0.2 2.9 2.8 3200000
withRawBuffer 1.0 0.8 2.8 2.6 10400000
copyFromRawBuffer.\ 1.8 1.7 1.8 1.7 22400000
bufReadNBNonEmpty.avail 0.2 0.1 0.2 0.1 1600000
flushCharReadBuffer 0.3 2.1 0.3 2.1 26400528
I have to assume this is on purpose... but I have no idea what that purpose might be. Even worse: I'm just barely clever enough to get this profile, but not quite clever enough to figure out exactly what's being allocated.
Any help along those lines would be appreciated.
UPDATE: I've done some more profiling with two drastically simplified testcases. The first testcase directly uses the read/write ops from System.Posix.Internals:
echo :: Ptr Word8 -> IO ()
echo buf = forever $ do
  threadWaitRead $ Fd 0
  len <- c_read 0 buf 1
  c_write 1 buf (fromIntegral len)
  yield
As you'd hope, this allocates no memory on the heap each time through the loop. The second testcase uses the read/write ops from GHC.IO.FD:
echo :: Ptr Word8 -> IO ()
echo buf = forever $ do
  len <- readRawBufferPtr "read" stdin buf 0 1
  writeRawBufferPtr "write" stdout buf 0 (fromIntegral len)
UPDATE #2: I was advised to file this as a bug in GHC Trac... I'm still not sure it actually is a bug (as opposed to intentional behavior, a known limitation, or whatever) but here it is: https://ghc.haskell.org/trac/ghc/ticket/9696
I'll try to guess based on the code.
The runtime tries to optimize small reads and writes, so it maintains an internal buffer. If your buffer is 1 byte long, it would be inefficient to use it directly, so the internal buffer is used to read a bigger chunk of data. It is probably ~32Kb long. Plus something similar for writing. Plus your own buffer.
The code has an optimization: if you provide a buffer bigger than the internal one, and the latter is empty, it will use your buffer directly. But the internal buffer is already allocated, so that will not lower memory usage. I don't know how to disable the internal buffer, but you can open a feature request if it is important for you.
(I realize that my guess may be totally wrong.)
ADD:
This one does seem to allocate, but I still don't know why.
What is your concern, max memory usage or number of allocated bytes?
c_read is a C function; it doesn't allocate on Haskell's heap (though it may allocate on the C heap).
readRawBufferPtr is a Haskell function, and it is usual for Haskell functions to allocate a lot of memory that quickly becomes garbage, simply because of immutability. It is common for a Haskell program to allocate e.g. 100Gb while memory usage stays under 1Mb.
It seems like the conclusion is: it's a bug.

How do I change the process name?

Does anyone know how to change the process name in top?
top - 05:02:47 up 182 days, 10:38, 1 user, load average: 14.53, 13.11, 11.95
Tasks: 4 total, 2 running, 2 sleeping, 0 stopped, 0 zombie
Cpu(s): 57.2%us, 14.8%sy, 0.0%ni, 26.2%id, 1.3%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 24736852k total, 22519688k used, 2217164k free, 132268k buffers
Swap: 8386552k total, 741900k used, 7644652k free, 12416224k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6230 user 20 0 47540 6856 1164 R 41.5 0.0 0:03.10 perl
6430 user 20 0 14900 1156 936 R 0.3 0.0 0:00.02 top
6227 user 20 0 47276 7552 2088 S 0.0 0.0 0:00.07 perl
14577 user 20 0 11588 1808 1340 S 0.0 0.0 0:00.46 bash
I have figured out how to change the top -c name
$0 = 'new name.';
However this doesn't accomplish my goal.
I found a non-standard module, and it looks very promising. However, I can't use any non-standard modules.
http://metacpan.org/pod/Sys::Prctl
# instead of "perl helloworld.pl"
$0 = "helloworld"
prctl_name("helloworld");
I was hoping someone had some input, or knowledge on changing the title/name of a process.
I feel I have gone through perlvar pretty thoroughly; however, I may have missed a simple $^0. Hoping it's that simple.
Edit
@user2783897, not sure why I didn't think of that; here is the basic example I made.
sub prctl_name {
    my $TASK_COMM_LEN = 16;
    my $SYS_prctl = 157;
    my $SYS_PR_SET_NAME = 15;
    my $SYS_PR_GET_NAME = 16;
    my ($str) = @_;
    if (defined $str) {
        my $rv = prctl($SYS_PR_SET_NAME, $str);
        if ($rv == 0) {
            return 1;
        } else {
            return;
        }
    } else {
        $str = "\x00" x ($TASK_COMM_LEN + 1); # allocate $str
        my $ptr = unpack('L', pack('P', $str));
        my $rv = prctl($SYS_PR_GET_NAME, $ptr);
        if ($rv == 0) {
            return substr($str, 0, index($str, "\x00"));
        } else {
            return;
        }
    }
}

sub prctl {
    my $SYS_prctl = 157;
    my ($option, $arg2, $arg3, $arg4, $arg5) = @_;
    syscall($SYS_prctl, $option,
        ($arg2 or 0), ($arg3 or 0), ($arg4 or 0), ($arg5 or 0));
}
Why don't you copy the Sys/Prctl.pm code into your own? It's only a few dozen lines.
Furthermore, most of that code is dedicated to finding out which kind of kernel the process is running on, to select the appropriate SYS_prctl parameter. If you know which kernel you're running on, you can cut the code down to its bare bones.
