How can I track a thread's CPU usage in Delphi - multithreading

I have a program running several threads, but some threads sometimes overload the CPU, so I need to limit those threads' CPU usage to about 50%. Is that possible in Delphi?
Edit: sorry guys, my question was not clear.
What I actually want to know is how I can track threads (at least build a list of threads with their thread IDs) and see how much CPU each thread uses. I want to do this so I can see which thread is responsible for the CPU overload.
Sorry for the inconvenience again.

I think the answer to your question can be found in the following Stack Overflow question: How to get the cpu usage per thread on windows (win32).
However, I would advise you to endeavour to understand why your program is behaving as it does and attack the root of the problem rather than killing any threads that you take a dislike to. Of course, if the program in question is purely for your own private use then your approach may be perfectly expedient and pragmatic. But if you are writing professional software then I can't see a situation where killing busy threads sounds like a reasonable approach.
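For reference, the linked answer comes down to the GetThreadTimes API. Here is a minimal sketch of the idea in Delphi (my own illustration, not the linked answer's code; it assumes a console application and a Delphi version whose Windows unit declares OpenThread) that lists every thread of the current process with its ID and accumulated CPU time:

uses
  Windows, TlHelp32, SysUtils;

procedure DumpThreadCpuTimes;
var
  Snap: THandle;
  Entry: TThreadEntry32;
  ThreadHandle: THandle;
  CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
  Kernel100ns, User100ns: Int64;
begin
  // Snapshot of all threads in the system; filtered to our own process below.
  Snap := CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
  if Snap = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    Entry.dwSize := SizeOf(Entry);
    if Thread32First(Snap, Entry) then
      repeat
        if Entry.th32OwnerProcessID = GetCurrentProcessId then
        begin
          ThreadHandle := OpenThread(THREAD_QUERY_INFORMATION, False,
            Entry.th32ThreadID);
          if ThreadHandle <> 0 then
          try
            if GetThreadTimes(ThreadHandle, CreationTime, ExitTime,
              KernelTime, UserTime) then
            begin
              // FILETIME values are counts of 100-nanosecond intervals.
              Kernel100ns := (Int64(KernelTime.dwHighDateTime) shl 32) or
                KernelTime.dwLowDateTime;
              User100ns := (Int64(UserTime.dwHighDateTime) shl 32) or
                UserTime.dwLowDateTime;
              Writeln(Format('Thread %d: kernel %.3f s, user %.3f s',
                [Entry.th32ThreadID, Kernel100ns / 1e7, User100ns / 1e7]));
            end;
          finally
            CloseHandle(ThreadHandle);
          end;
        end;
      until not Thread32Next(Snap, Entry);
  finally
    CloseHandle(Snap);
  end;
end;

Sampling these values twice, some interval apart, and dividing the difference by the elapsed wall-clock time gives a per-thread CPU percentage, which is how you find the thread responsible for the overload.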

As far as I know, you cannot "limit CPU usage" as such, neither in Delphi nor in Windows itself.
You likely want something else: not to interfere with user actions or with other threads. But if nothing else is going on and the user isn't doing anything, why run slower than you could? Just use 100% of the CPU; nobody else needs it.
So, if you need those threads not to interfere with user actions, just set them to a lower priority with the Windows function SetThreadPriority. They'll then only run when the user doesn't need the processor power.
Another trick to give other threads more chance to run is to call Sleep(0) from time to time in your thread body. Every time you call Sleep, you ask the OS to switch to another ready thread, simply speaking.
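As a rough sketch of both suggestions (a hypothetical worker thread, not code from the question):

uses
  Windows, Classes;

type
  // A worker that lowers its own priority so it only consumes CPU time
  // that higher-priority (e.g. UI) threads do not need, and yields its
  // time slice between work items.
  TBackgroundWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TBackgroundWorker.Execute;
var
  Counter: Int64;
begin
  // Lowest priority: the scheduler prefers normal-priority threads.
  SetThreadPriority(Handle, THREAD_PRIORITY_LOWEST);
  Counter := 0;
  while not Terminated do
  begin
    Inc(Counter);  // stand-in for one unit of real work
    Sleep(0);      // offer the rest of the time slice to other ready threads
  end;
end;

In VCL code the same thing is usually written as Priority := tpLowest (or tpIdle) on the TThread instance, which wraps SetThreadPriority. Note that Sleep(0) returns immediately when no other thread is ready to run, so Sleep(1) is sometimes used when you want the thread to back off more aggressively.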

I track a rolling CPU usage per thread for every thread in all my applications using some code in my framework (http://www.csinnovations.com/framework/framework.htm). A log output looks like:
15/01/2011 11:17:59.631,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,Memory Check,Verbose,Globals,"System allocated memory = 8282615808 bytes (change since last check = 4872478720 bytes)"
15/01/2011 11:17:59.632,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,Memory Check,Verbose,Globals,"Process allocated memory = 152580096 bytes (change since last check = -4579328 bytes)"
15/01/2011 11:17:59.633,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"System CPU usage = 15.6 % (average over lifetime = 3.0 %)"
15/01/2011 11:17:59.634,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Process CPU usage = 0.5 % (average over lifetime = 0.7 %)"
15/01/2011 11:17:59.634,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.634,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.634,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.635,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.1 % (average over lifetime = 0.1 %)"
15/01/2011 11:17:59.635,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.635,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.3 % (average over lifetime = 0.5 %)"
15/01/2011 11:17:59.635,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.635,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.636,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.0 % (average over lifetime = 0.0 %)"
15/01/2011 11:17:59.636,Misha,MISHA-DCDEL,Scores Client,V0.2.0.1,Main Thread,CPU Check,Verbose,Globals,"Thread CPU usage = 0.1 % (average over lifetime = 0.1 %)"
The time period is configurable, and I tend to use either 10 seconds, a minute, or 10 minutes. Have a look in the CsiSystemUnt.pas and AppGlobalsUnt.pas files to see how it is done.
Cheers, Misha
PS: I check memory usage as well.
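For what it's worth, the underlying calculation is straightforward even without the framework: sample GetThreadTimes twice and divide the CPU time consumed by the elapsed wall-clock time. A rough sketch of that calculation (my own illustration, not the framework's code):

uses
  Windows, SysUtils;

// Percentage of one CPU core used by the given thread over IntervalMs
// milliseconds. Divide by the processor count for a whole-machine figure.
// ThreadHandle needs THREAD_QUERY_INFORMATION access; use a non-zero interval.
function ThreadCpuPercentOverInterval(ThreadHandle: THandle;
  IntervalMs: Cardinal): Double;

  function ThreadCpu100ns: Int64;
  var
    CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
  begin
    if not GetThreadTimes(ThreadHandle, CreationTime, ExitTime,
      KernelTime, UserTime) then
      RaiseLastOSError;
    Result :=
      ((Int64(KernelTime.dwHighDateTime) shl 32) or KernelTime.dwLowDateTime) +
      ((Int64(UserTime.dwHighDateTime) shl 32) or UserTime.dwLowDateTime);
  end;

var
  StartCpu, EndCpu: Int64;
  StartTick, EndTick: Cardinal;
begin
  StartCpu := ThreadCpu100ns;
  StartTick := GetTickCount;
  Sleep(IntervalMs);  // sampling period
  EndCpu := ThreadCpu100ns;
  EndTick := GetTickCount;
  // CPU time is in 100 ns units, tick counts in ms; 1 ms = 10000 * 100 ns.
  Result := 100.0 * (EndCpu - StartCpu) / (Int64(EndTick - StartTick) * 10000);
end;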

Related

node.js global queue memory leak no matter what is tried

I am using Node.js 14 LTS and Express 4.17.1. I am trying to build a global queue that can take a JSON payload, store it in the queue, and dispatch the items as a group when the queue reaches a given size.
The issue I'm facing is that the garbage collection time vastly increases, which means the Node.js app eventually becomes unresponsive and may crash. It is the same whether I use the code below, simple arrays, or contained classes.
Essentially this is a memory leak, even though I'm clearing the arrays, classes or singleton array each time the queue is dispatched.
I believe this is because the reference lives at the root of the file, so Node's garbage collector can't release it from memory and the heap therefore grows.
I wonder if you have an alternative that accomplishes something similar without growing the heap?
Here is a simplified example of what I'm trying to get working without leaks:
"use strict";
import Singleton from '../services/Singleton.js'
import Queue from '../services/Queue.js'
import dotenv from 'dotenv'
dotenv.config()
const bulkESNumber = parseInt(process.env.BULK_ES_NUMER)
let ts_queue = null
let ts_singleton = null
router.post('/:es_indice/:id?', async (req, res) => {
if (!ts_queue) {
ts_queue = new Queue();
ts_singleton = new Singleton(ts_queue);
ts_queue = null
}
let ts_q = ts_singleton.getInstance()
if ( bulkESNumber > await ts_q.size()) {
await ts_q.add(req)
console.log(await ts_q.size())
} else {
console.log('dispatche t2_s queue')
ts_q.clear()
}
})
The Queue Class:
"use strict";
class Queue {
    queue = null

    constructor() {
        this.queue = []
    }

    async add(item) {
        this.queue.push(item)
    }

    async clear() {
        this.queue = []
    }

    async get() {
        return this.queue
    }

    async size() {
        return this.queue.length
    }
}

export default Queue
The Singleton Class:
class Singleton {
    constructor(obj_instance) {
        if (!Singleton.instance) {
            Singleton.instance = obj_instance;
        }
    }

    getInstance() {
        return Singleton.instance;
    }
}

export default Singleton
If you are brave enough to scroll through the section below, you'll see that the time between garbage collections and the heap size both keep increasing throughout the process. This means the app becomes less responsive to the POSTs and will eventually crash.
I trigger this by POSTing a JSON payload to the endpoint, which should be stored in the queue.
This is how I start the server with a heap trace:
>$ node --trace_gc --use_strict index.js
server running on port: http://127.0.0.1:3232
[14071:0x110008000] 376 ms: Mark-sweep 16.6 (28.7) -> 11.1 (28.8) MB, 2.3 / 0.1 ms (+ 0.1 ms in 3 steps since start of marking, biggest step 0.1 ms, walltime since start of marking 16 ms) (average mu = 1.000, current mu = 1.000) finalize incremental marking via task GC in old space requested
1
2021-06-14T22:53:20.233Z info: Event elasped time 3ms
2
2021-06-14T22:53:21.717Z info: Event elasped time 1ms
3
2021-06-14T22:53:22.659Z info: Event elasped time 1ms
4
2021-06-14T22:53:23.649Z info: Event elasped time 1ms
...
2021-06-14T22:53:25.307Z info: Event elasped time 1ms
[14101:0x110008000] 8531 ms: Mark-sweep 12.4 (29.0) -> 11.4 (14.3) MB, 7.7 / 0.1 ms (+ 0.8 ms in 4 steps since start of marking, biggest step 0.5 ms, walltime since start of marking 35 ms) (average mu = 0.999, current mu = 0.999) finalize incremental marking via task GC in old space requested
7
2021-06-14T22:53:26.035Z info: Event elasped time 2ms
[14101:0x110008000] 9168 ms: Mark-sweep 11.5 (14.3) -> 11.4 (14.3) MB, 5.1 / 0.0 ms (+ 0.5 ms in 4 steps since start of marking, biggest step 0.3 ms, walltime since start of marking 32 ms) (average mu = 0.998, current mu = 0.991) finalize incremental marking via task GC in old space requested
8
2021-06-14T22:53:28.032Z info: Event elasped time 5ms
...
9
2021-06-14T22:53:38.027Z info: Event elasped time 0ms
10
2021-06-14T22:53:38.643Z info: Event elasped time 1ms
dispatche t2_s queue
2021-06-14T22:53:39.316Z info: Event elasped time 1ms
1
2021-06-14T22:53:40.029Z info: Event elasped time 1ms
2
[14101:0x110008000] 23392 ms: Scavenge 12.5 (14.3) -> 11.7 (14.3) MB, 0.7 / 0.0 ms (average mu = 0.998, current mu = 0.991) allocation failure
2021-06-14T22:53:40.753Z info: Event elasped time 2ms
...
5
2021-06-14T22:53:49.131Z info: Event elasped time 1ms
[14101:0x110008000] 32322 ms: Scavenge 12.5 (14.3) -> 11.8 (14.3) MB, 0.9 / 0.0 ms (average mu = 0.998, current mu = 0.991) allocation failure

Java eden space is not 8 times larger than s0 space

According to Oracle's docs, the default value for SurvivorRatio is 8, which means each survivor space will be one-eighth the size of the eden space.
But in my application it doesn't work that way:
$ jmap -heap 48865
Attaching to process ID 48865, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.45-b02
using thread-local object allocation.
Parallel GC with 8 thread(s)
Heap Configuration:
MinHeapFreeRatio = 0
MaxHeapFreeRatio = 100
MaxHeapSize = 4294967296 (4096.0MB)
NewSize = 89128960 (85.0MB)
MaxNewSize = 1431306240 (1365.0MB)
OldSize = 179306496 (171.0MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 67108864 (64.0MB)
used = 64519920 (61.53099060058594MB)
free = 2588944 (2.4690093994140625MB)
96.14217281341553% used
From Space:
capacity = 11010048 (10.5MB)
used = 0 (0.0MB)
free = 11010048 (10.5MB)
0.0% used
To Space:
capacity = 11010048 (10.5MB)
used = 0 (0.0MB)
free = 11010048 (10.5MB)
0.0% used
PS Old Generation
capacity = 179306496 (171.0MB)
used = 0 (0.0MB)
free = 179306496 (171.0MB)
0.0% used
7552 interned Strings occupying 605288 bytes.
But in VisualVM the eden space is 1.332 GB and S0 is 455 MB, so eden is only about 3 times larger than S0, not 8 times.
You have neither disabled -XX:-UseAdaptiveSizePolicy nor set -Xms equal to -Xmx, so the JVM is free to resize the heap generations (and survivor spaces) at runtime. In this case the estimated maximum survivor size is
MaxSurvivor = NewGen / MinSurvivorRatio
where -XX:MinSurvivorRatio=3 by default. Note: this is an estimated maximum, not the actual size.
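To put numbers on it using the jmap output above: MaxNewSize is 1365 MB, so the estimated maximum survivor size is 1365 MB / 3 = 455 MB, which matches the 455 MB S0 you see in VisualVM, while the 10.5 MB From/To spaces in the jmap dump are the current, adaptively resized values.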
See also this answer.

I am sufferring JAVA G1 issue

Has anyone encountered this kind of issue with the Java G1 GC?
In the first highlighted line the user time is about 4.5 seconds, but in the second one the user time is 0 and the system time is about 4 seconds.
With G1, system time shouldn't be that high; is this a bug in G1?
Below are my GC arguments:
-Xms200g -Xmx200g -Xmn30g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=5000 -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -verbose:gc -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
2018-01-07T04:54:39.995+0800: 906650.864: [GC (Allocation Failure) 2018-01-07T04:54:39.996+0800: 906650.865: [ParNew
Desired survivor size 1610612736 bytes, new threshold 6 (max 6)
- age 1: 69747632 bytes, 69747632 total
- age 2: 9641544 bytes, 79389176 total
- age 3: 10522192 bytes, 89911368 total
- age 4: 11732392 bytes, 101643760 total
- age 5: 9158960 bytes, 110802720 total
- age 6: 10917528 bytes, 121720248 total
: 25341731K->170431K(28311552K), 0.2088528 secs] 153045380K->127882325K(206569472K), 0.2094236 secs] [Times: **user=4.53 sys=0.00, real=0.21 secs]**
Heap after GC invocations=32432 (full 10):
par new generation total 28311552K, used 170431K [0x00007f6058000000, 0x00007f67d8000000, 0x00007f67d8000000)
eden space 25165824K, 0% used [0x00007f6058000000, 0x00007f6058000000, 0x00007f6658000000)
from space 3145728K, 5% used [0x00007f6658000000, 0x00007f666266ffe0, 0x00007f6718000000)
to space 3145728K, 0% used [0x00007f6718000000, 0x00007f6718000000, 0x00007f67d8000000)
concurrent mark-sweep generation total 178257920K, used 127711893K [0x00007f67d8000000, 0x00007f9258000000, 0x00007f9258000000)
Metaspace used 54995K, capacity 55688K, committed 56028K, reserved 57344K
}
2018-01-07T04:54:40.205+0800: 906651.074: Total time for which application threads were stopped: 0.2269738 seconds, Stopping threads took: 0.0001692 seconds
{Heap before GC invocations=32432 (full 10):
par new generation total 28311552K, used 25336255K [0x00007f6058000000, 0x00007f67d8000000, 0x00007f67d8000000)
eden space 25165824K, 100% used [0x00007f6058000000, 0x00007f6658000000, 0x00007f6658000000)
from space 3145728K, 5% used [0x00007f6658000000, 0x00007f666266ffe0, 0x00007f6718000000)
to space 3145728K, 0% used [0x00007f6718000000, 0x00007f6718000000, 0x00007f67d8000000)
concurrent mark-sweep generation total 178257920K, used 127711893K [0x00007f67d8000000, 0x00007f9258000000, 0x00007f9258000000)
Metaspace used 54995K, capacity 55688K, committed 56028K, reserved 57344K
2018-01-07T04:55:02.541+0800: 906673.411: [GC (Allocation Failure) 2018-01-07T04:55:02.542+0800: 906673.411: [ParNew
Desired survivor size 1610612736 bytes, new threshold 6 (max 6)
- age 1: 93841912 bytes, 93841912 total
- age 2: 11310104 bytes, 105152016 total
- age 3: 8967160 bytes, 114119176 total
- age 4: 10278920 bytes, 124398096 total
- age 5: 11626160 bytes, 136024256 total
- age 6: 9077432 bytes, 145101688 total
: 25336255K->195827K(28311552K), 0.1926783 secs] 153048149K->127918291K(206569472K), 0.1932366 secs] [Times: **user=0.00 sys=4.07, real=0.20 secs]**
Heap after GC invocations=32433 (full 10):
par new generation total 28311552K, used 195827K [0x00007f6058000000, 0x00007f67d8000000, 0x00007f67d8000000)
eden space 25165824K, 0% used [0x00007f6058000000, 0x00007f6058000000, 0x00007f6658000000)
from space 3145728K, 6% used [0x00007f6718000000, 0x00007f6723f3cf38, 0x00007f67d8000000)
to space 3145728K, 0% used [0x00007f6658000000, 0x00007f6658000000, 0x00007f6718000000)
concurrent mark-sweep generation total 178257920K, used 127722463K [0x00007f67d8000000, 0x00007f9258000000, 0x00007f9258000000)
Metaspace used 54995K, capacity 55688K, committed 56028K, reserved 57344K
}
2018-01-07T04:55:02.735+0800: 906673.604: Total time for which application threads were stopped: 0.2149603 seconds, Stopping threads took: 0.0002262 seconds
2018-01-07T04:55:14.673+0800: 906685.542: Total time for which application threads were stopped: 0.0183883 seconds, Stopping threads took: 0.0002046 seconds
2018-01-07T04:55:14.797+0800: 906685.666: Total time for which application threads were stopped: 0.0135349 seconds, Stopping threads took: 0.0002472 seconds
2018-01-07T04:55:14.810+0800: 906685.679: Total time for which application threads were stopped: 0.0129019 seconds, Stopping threads took: 0.0001014 seconds
2018-01-07T04:55:14.823+0800: 906685.692: Total time for which application threads were stopped: 0.0125939 seconds, Stopping threads took: 0.0002915 seconds
2018-01-07T04:55:21.597+0800: 906692.466: Total time for which application threads were stopped: 0.0137018 seconds, Stopping threads took: 0.0001683 seconds
{Heap before GC invocations=32433 (full 10):
Your command line specifies -XX:+UseConcMarkSweepGC, so you are running the CMS collector rather than G1; this isn't a G1 issue.

How to Get Free Swap Memory for Matrix Computation in Linux Matlab?

Situation: I want to estimate whether I can compute a big matrix with my RAM and swap in Linux Matlab.
I need the sum of Mem and Swap, i.e. the corresponding values reported by free -m under the "total" heading in Linux:
total used free shared buff/cache available
Mem: 7925 3114 3646 308 1164 4220
Swap: 28610 32 28578
I get the free RAM memory in Matlab with:
% http://stackoverflow.com/a/12350678/54964
[r,w] = unix('free | grep Mem');
stats = str2double(regexp(w, '[0-9]*', 'match'));
memsize = stats(1)/1e6;
freeRamMem = (stats(3)+stats(end))/1e6;
Free Swap memory in Matlab: ...
Relation between Memory requirement and Matrix size of Matlab: ...
Testing Suever's 2nd iteration
Suever's command gives me 29.2 GB, which corresponds to free's output, so it is correct:
$ free
total used free shared buff/cache available
Mem: 8115460 4445520 1956672 350692 1713268 3024604
Swap: 29297656 33028 29264628
System: Linux Ubuntu 16.04 64 bit
Linux kernel: 4.6
Linux kernel options: wl, zswap
Matlab: 2016a
Hardware: Macbook Air 2013-mid
Ram: 8 GB
Swap: 28 Gb on SSD (set up like in the thread How to Allocate More Space to Swap and Increase its Size Greater than Ram?)
SSD: 128 GB
You can just make a slight modification to the code that you've posted to get the swap amount.
function freeMem = freeMemory(type)
    [r, w] = unix(['free | grep ', type]);
    stats = str2double(regexp(w, '[0-9]*', 'match'));
    memsize = stats(1)/1e6;
    if numel(stats) > 3
        freeMem = (stats(3)+stats(end))/1e6;
    else
        freeMem = stats(3)/1e6;
    end
end

totalFree = freeMemory('Mem') + freeMemory('Swap')
To figure out how much memory a matrix takes up, use the size of the datatype and multiply by the number of elements as a first approximation.
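As a rough example (ignoring any temporary copies Matlab may make internally): a 50000-by-50000 matrix of doubles needs 50000 * 50000 * 8 bytes, i.e. about 20 GB, so on the 8 GB RAM / 28 GB swap machine described above it would only fit by spilling into swap.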

Why do hGetBuf, hPutBuf, etc. allocate memory?

In the process of doing some simple benchmarking, I came across something that surprised me. Take this snippet from Network.Socket.Splice:
hSplice :: Int -> Handle -> Handle -> IO ()
hSplice len s t = do
  a <- mallocBytes len :: IO (Ptr Word8)
  finally
    (forever $! do
       bytes <- hGetBufSome s a len
       if bytes > 0
         then hPutBuf t a bytes
         else throwRecv0)
    (free a)
One would expect that hGetBufSome and hPutBuf here would not need to allocate memory, as they write into and read from a pre-allocated buffer. The docs seem to back this intuition up... But alas:
individual inherited
COST CENTRE %time %alloc %time %alloc bytes
hSplice 0.5 0.0 38.1 61.1 3792
hPutBuf 0.4 1.0 19.8 29.9 12800000
hPutBuf' 0.4 0.4 19.4 28.9 4800000
wantWritableHandle 0.1 0.1 19.0 28.5 1600000
wantWritableHandle' 0.0 0.0 18.9 28.4 0
withHandle_' 0.0 0.1 18.9 28.4 1600000
withHandle' 1.0 3.8 18.8 28.3 48800000
do_operation 1.1 3.4 17.8 24.5 44000000
withHandle_'.\ 0.3 1.1 16.7 21.0 14400000
checkWritableHandle 0.1 0.2 16.4 19.9 3200000
hPutBuf'.\ 1.1 3.3 16.3 19.7 42400000
flushWriteBuffer 0.7 1.4 12.1 6.2 17600000
flushByteWriteBuffer 11.3 4.8 11.3 4.8 61600000
bufWrite 1.7 6.9 3.0 9.9 88000000
copyToRawBuffer 0.1 0.2 1.2 2.8 3200000
withRawBuffer 0.3 0.8 1.2 2.6 10400000
copyToRawBuffer.\ 0.9 1.7 0.9 1.7 22400000
debugIO 0.1 0.2 0.1 0.2 3200000
debugIO 0.1 0.2 0.1 0.2 3200016
hGetBufSome 0.0 0.0 17.7 31.2 80
wantReadableHandle_ 0.0 0.0 17.7 31.2 32
wantReadableHandle' 0.0 0.0 17.7 31.2 0
withHandle_' 0.0 0.0 17.7 31.2 32
withHandle' 1.6 2.4 17.7 31.2 30400976
do_operation 0.4 2.4 16.1 28.8 30400880
withHandle_'.\ 0.5 1.1 15.8 26.4 14400288
checkReadableHandle 0.1 0.4 15.3 25.3 4800096
hGetBufSome.\ 8.7 14.8 15.2 24.9 190153648
bufReadNBNonEmpty 2.6 4.4 6.1 8.0 56800000
bufReadNBNonEmpty.buf' 0.0 0.4 0.0 0.4 5600000
bufReadNBNonEmpty.so_far' 0.2 0.1 0.2 0.1 1600000
bufReadNBNonEmpty.remaining 0.2 0.1 0.2 0.1 1600000
copyFromRawBuffer 0.1 0.2 2.9 2.8 3200000
withRawBuffer 1.0 0.8 2.8 2.6 10400000
copyFromRawBuffer.\ 1.8 1.7 1.8 1.7 22400000
bufReadNBNonEmpty.avail 0.2 0.1 0.2 0.1 1600000
flushCharReadBuffer 0.3 2.1 0.3 2.1 26400528
I have to assume this is on purpose... but I have no idea what that purpose might be. Even worse: I'm just barely clever enough to get this profile, but not quite clever enough to figure out exactly what's being allocated.
Any help along those lines would be appreciated.
UPDATE: I've done some more profiling with two drastically simplified testcases. The first testcase directly uses the read/write ops from System.Posix.Internals:
echo :: Ptr Word8 -> IO ()
echo buf = forever $ do
  threadWaitRead $ Fd 0
  len <- c_read 0 buf 1
  c_write 1 buf (fromIntegral len)
  yield
As you'd hope, this allocates no memory on the heap each time through the loop. The second testcase uses the read/write ops from GHC.IO.FD:
echo :: Ptr Word8 -> IO ()
echo buf = forever $ do
  len <- readRawBufferPtr "read" stdin buf 0 1
  writeRawBufferPtr "write" stdout buf 0 (fromIntegral len)
UPDATE #2: I was advised to file this as a bug in GHC Trac... I'm still not sure it actually is a bug (as opposed to intentional behavior, a known limitation, or whatever) but here it is: https://ghc.haskell.org/trac/ghc/ticket/9696
I'll try to guess, based on the code.
The runtime tries to optimize small reads and writes, so it maintains an internal buffer. If your buffer were 1 byte long, it would be inefficient to use it directly, so the internal buffer is used to read a bigger chunk of data. It is probably ~32 KB long. Plus something similar for writing. Plus your own buffer.
The code has an optimization: if you provide a buffer bigger than the internal one, and the latter is empty, your buffer is used directly. But the internal buffer is already allocated, so that doesn't reduce memory usage. I don't know how to disable the internal buffer, but you can open a feature request if it is important to you.
(I realize that my guess can be totally wrong.)
ADD:
This one does seem to allocate, but I still don't know why.
What is your concern: maximum memory usage or the number of allocated bytes?
c_read is a C function; it doesn't allocate on Haskell's heap (though it may allocate on the C heap).
readRawBufferPtr is a Haskell function, and it is usual for Haskell functions to allocate a lot of memory that quickly becomes garbage, simply because of immutability. It is common for a Haskell program to allocate e.g. 100 GB over its lifetime while memory usage stays under 1 MB.
It seems like the conclusion is: it's a bug.
