Reading /dev/urandom as early as possible - linux

I am performing research in the field of random number generation and I need to demonstrate the "boot-time entropy hole" from the well-known "P's and Q's" paper (here). We will be spooling up two copies of the same minimal Linux virtual machine at the same time and we are expecting their /dev/urandom values to be the same at some early point in the boot process.
However, I have been unable to read /dev/urandom early enough in the boot process to spot the issue. We need to read it earlier in the boot process.
How can I get the earliest possible values of /dev/urandom? We will likely need to modify the kernel, but we have very little experience there and need some pointers. Or, if there is a kernel-instrumenting tool available that could do it without recompiling a kernel, that would be great, too.
Thanks in advance!

urandom is provided via a device driver, and the first thing the kernel does with a driver is call its init function.
If you take a look here: http://lxr.free-electrons.com/source/drivers/char/random.c#L1401
/*
 * Note that setup_arch() may call add_device_randomness()
 * long before we get here. This allows seeding of the pools
 * with some platform dependent data very early in the boot
 * process. But it limits our options here. We must use
 * statically allocated structures that already have all
 * initializations complete at compile time. We should also
 * take care not to overwrite the precious per platform data
 * we were given.
 */
static int rand_initialize(void)
{
        init_std_data(&input_pool);
        init_std_data(&blocking_pool);
        init_std_data(&nonblocking_pool);
        return 0;
}
early_initcall(rand_initialize);
So, the init function for this driver is rand_initialize. However, note that the comment says setup_arch() may call add_device_randomness() before this device is even initialized. Calling that function does not add any actual entropy (it feeds the pool with things like MAC addresses, so if you have two exactly identical VMs, you're good there). From the comment:
* add_device_randomness() is for adding data to the random pool that
* is likely to differ between two devices (or possibly even per boot).
* This would be things like MAC addresses or serial numbers, or the
* read-out of the RTC. This does *not* add any actual entropy to the
* pool, but it initializes the pool to different values for devices
* that might otherwise be identical and have very little entropy
* available to them (particularly common in the embedded world).
Also note that the entropy pools are saved on shutdown and restored at boot time via an init script (on my Ubuntu 14.04 it is /etc/init.d/urandom), so you might want to call your script from that init script before the following block
(
        date +%s.%N

        # Load and then save $POOLBYTES bytes,
        # which is the size of the entropy pool
        if [ -f "$SAVEDFILE" ]
        then
                cat "$SAVEDFILE"
        fi
        # Redirect output of subshell (not individual commands)
        # to cope with a misfeature in the FreeBSD (not Linux)
        # /dev/random, where every superuser write/close causes
        # an explicit reseed of the yarrow.
) >/dev/urandom
or a similar call is made.
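If all you need is the earliest values the init scripts can see, a minimal probe hooked in at that point might look like the sketch below. The log path and sample size are arbitrary choices for illustration, not anything defined by the init script:
#!/usr/bin/env python3
# Hypothetical early-boot probe: dump the first bytes visible in
# /dev/urandom so the output of two identical VMs can be compared.
import time

SAMPLE_BYTES = 32                          # arbitrary sample size
LOG_PATH = '/var/log/early-urandom.log'    # arbitrary log location

with open('/dev/urandom', 'rb') as src:
    sample = src.read(SAMPLE_BYTES)

with open(LOG_PATH, 'a') as log:
    log.write('%.6f %s\n' % (time.time(), sample.hex()))
Anything earlier than that (before userspace starts at all) would indeed require instrumenting the kernel itself, for example around the rand_initialize() call shown above.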

Related

Linux. Can packets pass libpcap by?

I am writing a Linux program that monitors internet traffic, in other words how many bytes I have used over some period of time. I use Pcap4J for Java (an implementation of libpcap) and I have a question about it. What happens if my program has not yet processed a packet when a new one arrives?
1. Does it slow down the download (upload) rate for the whole OS?
2. Does it skip the new one, so that my program will never know it passed by?
In other words, if I have downloaded 1 GB of data on my computer, how many of those bytes does my program see: 100%, or can some bytes pass my program by and still reach their destination?
Also, let me know if it is a bad idea to write a traffic-monitoring app using this library!
Your application loses packets. In your words, they pass by.
However, if your idea is to have a metric of how many packets went in and out of your system in a given time, there are definitely better ways to achieve it.
On Linux you can just do a script that does something like this:
DEVICE=eth0
RX0=$(cat /sys/class/net/$DEVICE/statistics/rx_bytes)
TX0=$(cat /sys/class/net/$DEVICE/statistics/tx_bytes)
while : ; do
    sleep 5
    RX1=$(cat /sys/class/net/$DEVICE/statistics/rx_bytes)
    TX1=$(cat /sys/class/net/$DEVICE/statistics/tx_bytes)
    # Bytes transferred during the last interval
    echo "RX bytes: $(($RX1-$RX0))"
    echo "TX bytes: $(($TX1-$TX0))"
    RX0=$RX1
    TX0=$TX1
done
You can adjust the interval or make it a parameter; I think you get the idea.
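If Python is more convenient than shell, the same counters can be read directly; below is a rough Python equivalent of the loop above (device name and interval are just placeholders):
import time

DEVICE = 'eth0'        # placeholder; pick your interface
INTERVAL = 5           # seconds, same as the shell example

def read_counter(name):
    # The statistics files contain a single decimal integer.
    with open(f'/sys/class/net/{DEVICE}/statistics/{name}') as f:
        return int(f.read())

rx0, tx0 = read_counter('rx_bytes'), read_counter('tx_bytes')
while True:
    time.sleep(INTERVAL)
    rx1, tx1 = read_counter('rx_bytes'), read_counter('tx_bytes')
    print(f'RX bytes: {rx1 - rx0}')
    print(f'TX bytes: {tx1 - tx0}')
    rx0, tx0 = rx1, tx1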

PEBS records far fewer memory-access samples than are actually present

I have been trying to log the memory accesses made by a program using perf and PEBS counters. My intention was to log all of the memory accesses made by a program (I chose programs from SPEC CPU2006). By tweaking certain parameters, I seem to record far fewer samples than there actually are for the program. I know, as has been said previously, that it is tough to record all of the memory access samples, but leaving that aside, I want to know how PEBS can record fewer samples than there actually are.
I followed the below steps :-
First of all, I modified the /proc/sys/kernel/perf_cpu_time_max_percent value. Initially it was 25%; I changed it to 95%. This was because I wanted to see whether I could record the maximum number of memory access samples. It would also allow me to use a much higher perf_event_max_sample_rate, which is usually capped at 100,000, without the kernel lowering it.
I then used a much higher value for perf_event_max_sample_rate, namely 244,500, instead of the usual maximum of 100,000.
Next, I used perf stat to record the total count of the memory-store events in a program. I got the data below:
./perf stat -e cpu/mem-stores/u ../../.././libquantum_base.arnab 100
N = 100, 37 qubits required
Random seed: 33
Measured 3277 (0.200012), fractional approximation is 1/5.
Odd denominator, trying to expand by 2.
Possible period is 10.
100 = 4 * 25
Performance counter stats for '../../.././libquantum_base.arnab 100':
158,115,509 cpu/mem-stores/u
0.591718162 seconds time elapsed
There are roughly ~158 million events as indicated by perf stat, which should be a correct figure, since it comes directly from the hardware counter values.
But when I run perf record -e and use PEBS counters to capture all of the possible memory-store events:
./perf record -e cpu/mem-stores/upp -c 1 ../../.././libquantum_base.arnab 100
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
N = 100, 37 qubits required
Random seed: 33
Measured 3277 (0.200012), fractional approximation is 1/5.
Odd denominator, trying to expand by 2.
Possible period is 10.
100 = 4 * 25
[ perf record: Woken up 32 times to write data ]
[ perf record: Captured and wrote 7.827 MB perf.data (254125 samples) ]
I can see 254,125 samples being recorded. This is far fewer than what was reported by perf stat. I am recording all of these accesses in user space only (I am using the u modifier in both cases).
Why does this happen? Am I recording the memory-store events in a wrong way? Or is there a problem with the CPU's behavior?

Questions about Python3.6 os.urandom/os.getrandom/secrets

Referring to documentation for os and secrets:
os.getrandom(size, flags=0)
Get up to size random bytes. The function can return less bytes than requested.
getrandom() relies on entropy gathered from device drivers and other sources of environmental noise.
So does this mean it's from /dev/random?
On Linux, if the getrandom() syscall is available, it is used in blocking mode: block until the system urandom entropy pool is initialized (128 bits of entropy are collected by the kernel).
So, to ensure a kernel CSPRNG with bad internal state is never used, I should use os.getrandom()? Since the function can return fewer bytes than requested, should I run the application-level CSPRNG as something like
import os

def rng():
    r = bytearray()
    while len(r) < 32:
        r += os.getrandom(1)
    return bytes(r)
to ensure maximum security? I explicitly want all systems that do not support blocking until the urandom entropy pool is initialized to be unable to run the program, and systems that do support it to wait. This is because the software must be secure even if it is run from a live CD that has zero entropy at start.
Or does the blocking mean that if I do os.getrandom(32), the program waits, if necessary forever, until the 32 bytes are collected?
The flags argument is a bit mask that can contain zero or more of the following values ORed together: os.GRND_RANDOM and GRND_NONBLOCK.
Can someone please ELI5 how this works?
os.urandom(size)
On Linux, if the getrandom() syscall is available, it is used in blocking mode: block until the system urandom entropy pool is initialized (128 bits of entropy are collected by the kernel).
So on older Linux kernel versions urandom quietly falls back to a non-blocking CSPRNG that doesn't know its internal seeding state?
Changed in version 3.6.0: On Linux, getrandom() is now used in blocking mode to increase the security.
Does this have to do with os.getrandom()? Is it a lower level call? Are the two the same?
os.GRND_NONBLOCK
By default, when reading from /dev/random, getrandom() blocks if no random bytes are available, and when reading from /dev/urandom, it blocks if the entropy pool has not yet been initialized.
So it's the 0-flag in os.getrandom(size, flag=0)?
os.GRND_RANDOM
If this bit is set, then random bytes are drawn from the /dev/random pool instead of the /dev/urandom pool.
What does ORing the os.getrandom() flags mean? How does os.getrandom(flags=1) tell whether I meant to enable os.GRND_NONBLOCK or os.GRND_RANDOM? Or do I need to set it beforehand like this:
os.GRND_RANDOM = 1
os.getrandom(32) # or use the rng() defined above
secrets module
The secrets module is used for generating cryptographically strong random numbers suitable for managing data such as passwords, account authentication, security tokens, and related secrets.
The only clear way to generate random bytes is
secrets.token_bytes(32)
The secrets module provides access to the most secure source of randomness that your operating system provides.
So that should mean it's os.getrandom() with a fallback to os.urandom()? So it's not a good choice if you desire a graceful exit when the internal state cannot be evaluated?
To be secure against brute-force attacks, tokens need to have sufficient randomness. Unfortunately, what is considered sufficient will necessarily increase as computers get more powerful and able to make more guesses in a shorter period. As of 2015, it is believed that 32 bytes (256 bits) of randomness is sufficient for the typical use-case expected for the secrets module.
Yet the blocking stops at 128 bits of internal state, not 256. Most symmetric ciphers have 256-bit versions for a reason.
So I should probably make sure the /dev/random is used in blocking mode to ensure internal state has reached 256 bits by the time the key is generated?
So tl;dr
What's the most secure way in Python3.6 to generate a 256-bit key on a Linux (3.17 or newer) live distro that has zero entropy in kernel CSPRNG internal state at the start of my program's execution?
After doing some research, I can answer my own question.
os.getrandom() is a wrapper for the getrandom() syscall offered in Linux kernel 3.17 and newer. The flag is a number (0, 1, 2 or 3) that corresponds to a bit mask in the following way:
GETRANDOM with ChaCha20 DRNG
os.getrandom(32, flags=0)
GRND_NONBLOCK = 0 (=Block until the ChaCha20 DRNG seed level reaches 256 bits)
GRND_RANDOM = 0 (=Use ChaCha20 DRNG)
= 00 (=flag 0)
This is a good default to use with all Python 3.6 programs on all platforms (including live distros) when no backwards compatibility with Python 3.5 and pre-3.17 kernels is needed.
PEP 524 is incorrect when it claims
On Linux, getrandom(0) blocks until the kernel initialized urandom with 128 bits of entropy.
According to page 84 of the BSI report, the 128-bit limit is used during boot time for callers of the kernel's get_random_bytes() function, provided the code properly waits for the triggering of the add_random_ready_callback() function. (Not waiting means get_random_bytes() might return insecure random numbers.) According to page 112:
When reaching the state of being fully seeded and thus having the ChaCha20 DRNG seeded with 256 bits of entropy -- the getrandom system call unblocks and generates random numbers.
So, GETRANDOM() never returns random numbers until the ChaCha20 DRNG is fully seeded.
os.getrandom(32, flags=1)
GRND_NONBLOCK = 1 (=If the ChaCha20 DRNG is not fully seeded, raise BlockingIOError instead of blocking)
GRND_RANDOM = 0 (=Use ChaCha20 DRNG)
= 01 (=flag 1)
Useful if the application needs to do other tasks while it waits for the ChaCha20 DRNG to be fully seeded. The ChaCha20 DRNG is almost always fully seeded during boot time, so flags=0 is most likely a better choice. Needs the try-except logic around it.
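A minimal sketch of that try-except logic, assuming you simply want to poll until the DRNG is fully seeded (the retry interval is an arbitrary choice):
import os
import time

def get_key_nonblocking(n=32):
    # flags=1: GRND_NONBLOCK raises BlockingIOError instead of blocking.
    while True:
        try:
            return os.getrandom(n, os.GRND_NONBLOCK)
        except BlockingIOError:
            # The ChaCha20 DRNG is not fully seeded yet; do other work or wait.
            time.sleep(0.1)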
GETRANDOM with blocking_pool
The blocking_pool is also accessible via the /dev/random device file. The pool was designed with the idea in mind that entropy runs out. This idea applies only when trying to create one-time pads (that strive for information theoretic security). The quality of entropy in blocking_pool for that purpose is not clear, and the performance is really bad. For every other use, properly seeded DRNG is enough.
The only situation where blocking_pool might be more secure is with pre-4.17 kernels that have the CONFIG_RANDOM_TRUST_CPU flag set at compile time, and only if the CPU HWRNG happens to have a backdoor. Since in that case the ChaCha20 DRNG is initially seeded with the RDSEED/RDRAND instruction, a bad CPU HWRNG would be a problem. However, according to page 134 of the BSI report:
[As of kernel version 4.17] The Linux-RNG now considers the ChaCha20 DRNG fully seeded after it received 128 bit of entropy from the noise sources. Previously it was sufficient that it received at least 256 interrupts.
Thus the ChaCha20 DRNG wouldn't be considered fully seeded until entropy is also mixed in from input_pool, which pools and mixes random events from all LRNG noise sources together.
By using os.getrandom() with flags 2 or 3, the entropy comes from blocking_pool, which receives entropy from input_pool, which in turn receives entropy from several additional noise sources. The ChaCha20 DRNG is also reseeded from input_pool, so the CPU RNG does not have permanent control over the DRNG state. Once this happens, the ChaCha20 DRNG is as secure as blocking_pool.
os.getrandom(32, flags=2)
GRND_NONBLOCK = 0 (=Return 32 bytes or less if entropy counter of blocking_pool is low. Block if no entropy is available.)
GRND_RANDOM = 1 (=Use blocking_pool)
= 10 (=flag 2)
This needs an external loop that runs the function and stores the returned bytes into a buffer until the buffer holds 32 bytes (sketched below). The major problem here is that, due to the blocking behavior of blocking_pool, obtaining the bytes needed might take a very long time, especially if other programs are also requesting random numbers from the same syscall or from /dev/random. Another issue is that a loop using os.getrandom(32, flags=2) spends more time idle waiting for random bytes than it would with flag 3 (see below).
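A rough sketch of such an accumulation loop (nothing assumed beyond the 32-byte buffer):
import os

def get_key_from_blocking_pool(n=32):
    # flags=2: GRND_RANDOM draws from blocking_pool; each call may block
    # and may return fewer than the requested number of bytes.
    buf = bytearray()
    while len(buf) < n:
        buf += os.getrandom(n - len(buf), os.GRND_RANDOM)
    return bytes(buf)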
os.getrandom(32, flags=3)
GRND_NONBLOCK = 1 (=return 32 bytes or less if entropy counter of blocking_pool is low. If no entropy is available, raise BlockingIOError instead of blocking).
GRND_RANDOM = 1 (=use blocking_pool)
= 11 (=flag 3)
Useful if the application needs to do other tasks while it waits for blocking_pool to have some amount of entropy. Needs the try-except logic around it plus an external loop that runs the function and stores returned bytes into a buffer until the buffer size is 32 bytes.
Other
open('/dev/urandom', 'rb').read(32)
To ensure backwards compatibility, unlike GETRANDOM() with the ChaCha20 DRNG, reading from the /dev/urandom device file never blocks. There is no guarantee of the quality of the random numbers, which is bad. This is the least recommended option.
os.urandom(32)
os.urandom(n) provides best effort security:
Python3.6
On Linux 3.17 and newer, os.urandom(32) is the equivalent of os.getrandom(32, flags=0). On older kernels it quietly falls back to the equivalent of open('/dev/urandom', 'rb').read(32) which is not good.
os.getrandom(32, flags=0) should be preferred as it can not fall back to insecure mode.
Python3.5 and earlier
Always the equivalent of open('/dev/urandom', 'rb').read(32) which is not good. As os.getrandom() is not available, Python3.5 should not be used.
secrets.token_bytes(32) (Python 3.6 only)
Wrapper for os.urandom(). Default length of keys is 32 bytes (256 bits). On Linux 3.17 and newer, secrets.token_bytes(32) is the equivalent of os.getrandom(32, flags=0). On older kernels it quietly falls back to the equivalent of open('/dev/urandom', 'rb').read(32) which is not good.
Again, os.getrandom(32, flags=0) should be preferred as it can not fall back to insecure mode.
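If you want the "fail instead of falling back" behavior made explicit, a hedged sketch along these lines (the error message and exit path are arbitrary) refuses to run wherever os.getrandom() is unavailable:
import os
import sys

if not hasattr(os, 'getrandom'):
    # Python < 3.6 or a platform without the getrandom() syscall:
    # refuse to run rather than silently fall back to /dev/urandom.
    sys.exit('os.getrandom() is unavailable; refusing to generate keys.')

key = os.getrandom(32, 0)   # blocks until the urandom pool is initialized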
tl;dr
Use os.getrandom(32, flags=0).
What about other RNG sources, random, SystemRandom() etc?
import random
random.<anything>()
is never safe for creating passwords, cryptographic keys etc.
import random
sys_rand = random.SystemRandom()
is safe for cryptographic use WITH EXCEPTIONS!
sys_rand.sample()
Generating a random password with sys_rand.sample(list_of_password_chars, password_length) is not safe because, to quote the documentation, the sample() method is used for "random sampling without replacement". This means each character in the password is guaranteed to be different from all the others (no character ever repeats), which leads to passwords that are not uniformly random.
sys_rand.choices()
The sample() method performs random sampling without replacement, whereas the choices() method performs random sampling with replacement. However, to quote the documentation on choices():
The algorithm used by choices() uses floating point arithmetic for internal consistency and speed. The algorithm used by choice() defaults to integer arithmetic with repeated selections to avoid small biases from round-off error.
The floating-point arithmetic that choices() uses thus introduces cryptographically non-negligible biases into the sampled passwords. Therefore random.choices() must not be used for password/key generation!
sys_rand.choice()
As per the previously quoted piece of documentation, the choice() method uses integer arithmetic as opposed to floating-point arithmetic, so generating passwords/keys with repeated calls to sys_rand.choice() is safe.
secrets.choice()
secrets.choice() is a wrapper for random.SystemRandom().choice(), and the two can be used interchangeably: they are the same thing.
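For character-based passwords (as opposed to the word-based passphrase recipe that follows), a short sketch that stays on the integer-arithmetic choice() path could look like this; the alphabet and length are arbitrary choices:
import secrets
import string

ALPHABET = string.ascii_letters + string.digits   # arbitrary character set
LENGTH = 20                                        # arbitrary length

# Independent draws with replacement; no floating-point arithmetic involved.
password = ''.join(secrets.choice(ALPHABET) for _ in range(LENGTH))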
The recipe for best practice to generate a passphrase with secrets.choice() is
import secrets

# On standard Linux systems, use a convenient dictionary file.
# Other platforms may need to provide their own word-list.
with open('/usr/share/dict/words') as f:
    words = [word.strip() for word in f]
passphrase = ' '.join(secrets.choice(words) for i in range(4))
How can I ensure the generated passphrase meets some security level, e.g. 128 bits?
Here's a recipe for that
import math
import secrets

def generate_passphrase() -> str:
    PASSWORD_MIN_BIT_STRENGTH = 128  # Set desired minimum bit strength here
    with open('/usr/share/dict/words') as f:
        wordlist = [word.strip() for word in f]
    word_space = len(wordlist)
    word_count = math.ceil(math.log(2 ** PASSWORD_MIN_BIT_STRENGTH, word_space))
    passphrase = ' '.join(secrets.choice(wordlist) for _ in range(word_count))
    # pwd_bit_strength = math.floor(math.log2(word_space ** word_count))
    # print(f"Generated {pwd_bit_strength}-bit passphrase.")
    return passphrase
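To illustrate the arithmetic in the recipe above, here is a worked example that assumes a hypothetical 100,000-word dictionary:
import math

word_space = 100_000                                   # assumed word-list size
bits_per_word = math.log2(word_space)                  # about 16.6 bits per word
word_count = math.ceil(128 / bits_per_word)            # -> 8 words for 128 bits
total_bits = math.floor(bits_per_word * word_count)    # -> about 132 bits
print(word_count, total_bits)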
As #maqp suggested...
Using os.getrandom(32, flags=0) is the logical choice unless you're using the new secrets module AND the Linux kernel (3.17 or newer) does NOT fall back to open('/dev/urandom', 'rb').read(32).
Workaround, Secrets on Python 3.5.x
I installed the secrets backport for Python 2 even though I'm running Python 3, and at a glance it works in a Python 3.5.2 environment. Perhaps if I get time, or someone else does, we can learn whether this one falls back; I suppose it may if the Linux kernel is below a certain version.
pip install python2-secrets
Once completed you can import secrets just like you would have with the Python 3 flavor.
Or just make sure to use Linux kernel 3.17 or newer. Knowing one's kernel is always good practice, but in reality we count on smart people like maqp to find and share these things. Great job.
Were we complacent, having a false sense of security?
1st, 2nd, 4th... where is the outrage? It's not about 'your' security; that would be selfish to assume. It's about the ability to spy on those who represent you in government, those who have skeletons and weaknesses (humans). Be sure to correct those selfish ones who say, "I've got nothing to hide".
How Bad Was It?
The strength of encryption increases exponentially as the length of the key increases, so is it reasonable to assume that a reduction of at least half, say from 256 down to 128 bits, would equate to a decrease in strength by factors of tens, hundreds, thousands or more? Did it make big brother's job pretty easy, or just a tiny bit easier? I am leaning towards the former.
The Glass Half Full?
Oh well, at least Linux is open source and we can see the insides for the most part. We still need chip hackers to find the secret stuff, and the chips and hardware drivers are where you'll probably find the things that will keep you from sleeping at night.

Is there any alterntive to wait3 to get rusage structure in shell scripting?

I was trying to monitor the peak memory usage of a child process. time -v is an option, but it does not work on Solaris. So is there any way to get the details that are in the rusage structure from shell scripting?
You can use /usr/bin/timex
From the /usr/bin/timex man page:
The given command is executed; the elapsed time, user time and system
time spent in execution are reported in seconds. Optionally, process
accounting data for the command and all its children can be listed or
summarized, and total system activity during the execution interval
can be reported.
...
-p List process accounting records for command and all its children. This option works only if the process accounting software is installed. Suboptions f, h, k, m, r, and t modify the data items
reported. The options are as follows:
...
Start with the man page for acctadm to get process accounting enabled.
Note that on Solaris, getrusage() and wait3() do not return memory usage statistics. See the (somewhat dated) getrusage() source code at http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/syscall/rusagesys.c and the wait3() source code at http://src.illumos.org/source/xref/illumos-gate/usr/src/lib/libbc/libc/sys/common/wait.c#158 (That's actually OpenSolaris source, which Oracle dropped support for, and it may not represent the current Solaris implementation, although a few tests on Solaris 11.2 show that the RSS data is in fact still zero.)
Also, from the Solaris getrusage() man page:
The ru_maxrss, ru_ixrss, ru_idrss, and ru_isrss members of the
rusage structure are set to 0 in this implementation.
There are almost certainly other ways to get the data, such as dtrace.
Edit:
dtrace doesn't look to be much help, unfortunately. Attempting to run this dtrace script with dtrace -s memuse.d -c bash
#!/usr/sbin/dtrace -s

#pragma D option quiet

profile:::profile-1001hz
/ pid == $target /
{
        @pct[ pid ] = max( curpsinfo->pr_pctmem );
}

dtrace:::END
{
        printa( "pct: %@u %a\n", @pct );
}
resulted in the following error message:
dtrace: failed to compile script memuse.d: line 8: translator does not define conversion for member: pr_pctmem
dtrace on Solaris doesn't appear to provide access to process memory usage. In fact, the Solaris 11.2 /usr/lib/dtrace/procfs.d translator for procfs data has this comment in it:
/*
* Translate from the kernel's proc_t structure to a proc(4) psinfo_t struct.
* We do not provide support for pr_size, pr_rssize, pr_pctcpu, and pr_pctmem.
* We also do not fill in pr_lwp (the lwpsinfo_t for the representative LWP)
* because we do not have the ability to select and stop any representative.
* Also, for the moment, pr_wstat, pr_time, and pr_ctime are not supported,
* but these could be supported by DTrace in the future using subroutines.
* Note that any member added to this translator should also be added to the
* kthread_t-to-psinfo_t translator, below.
*/
Browsing the illumos.org source code and searching for pr_rssize indicates that the procfs data is computed only when needed, and not updated continually as the process runs. (See http://src.illumos.org/source/search?q=pr_rssize&defs=&refs=&path=&hist=&project=illumos-gate)

Linux display average CPU load for last week

On a Linux box, I need to display the average CPU utilisation per hour for the last week. Is that information logged somewhere? Or do I need to write a script that wakes up every 15 minutes to copy /proc/loadavg to a logfile?
EDIT: I'm not allowed to use any tools other than those that come with Linux.
You might want to check out sar (man page), it fits your use case nicely.
System Activity Reporter (SAR) - capture important system performance metrics at
periodic intervals.
Example from IBM Developer Works Article:
Add an entry to your root crontab
# Collect measurements at 10-minute intervals
0,10,20,30,40,50 * * * * /usr/lib/sa/sa1
# Create daily reports and purge old files
0 0 * * * /usr/lib/sa/sa2 -A
Then you can simply query this information using a sar command (display all of today's info):
root ~ # sar -A
Or just for a certain days log file:
root ~ # sar -f /var/log/sa/sa16
You can usually find sar in the sysstat package for your Linux distro.
As far as I know it's not stored anywhere... It's a trivial thing to write, anyway. Just add something like
*/15 * * * * cat /proc/loadavg >> /var/log/loads
to your crontab.
Note that there are monitoring tools (like Munin) which can do this kind of thing for you, and generate pretty graphs of it to boot... they might be overkill for your situation though.
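If you would rather keep the sampling and the hourly averaging in one place, here is a rough self-contained sketch in Python; the sampling interval, output path and the use of the 1-minute loadavg field are all arbitrary choices:
import time
from collections import defaultdict

SAMPLE_INTERVAL = 15 * 60                      # seconds, like the 15-minute cron idea
REPORT_PATH = '/var/log/loadavg-hourly.txt'    # arbitrary output file

hourly = defaultdict(list)                     # "YYYY-mm-dd HH:00" -> loadavg samples

while True:
    with open('/proc/loadavg') as f:
        load_1min = float(f.read().split()[0])
    hourly[time.strftime('%Y-%m-%d %H:00')].append(load_1min)

    # Rewrite a small report on every sample; fine for a week of data.
    with open(REPORT_PATH, 'w') as out:
        for hour in sorted(hourly):
            samples = hourly[hour]
            out.write(f'{hour}  avg load: {sum(samples) / len(samples):.2f}\n')

    time.sleep(SAMPLE_INTERVAL)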
I would recommend looking at Multi Router Traffic Grapher (MRTG).
Using snmpd to read the load average, it will automatically calculate averages at any time interval and length, along with nice charts for analysis.
Someone has already posted a CPU usage example.
