Testing whether memory is accessible in Linux - linux

Given an untrusted memory address, is there a way in Linux to test whether it points to valid, accessible memory?
For example, in mach you can use vm_read_overwrite() to attempt to copy data from the specified location. If the address is invalid or inaccessible, it will return an error code rather than crashing the process.

Call write() with that memory as the source buffer (writing into a pipe, for example; /dev/null might not work as expected), and you'll receive an EFAULT error if the address is inaccessible.
I have no idea how to test for writable memory without destroying its content if it is writable.
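A minimal sketch of the pipe-based probe described above, assuming Linux and a single-byte check. The helper name probe_readable() is made up for illustration; only pipe(), write() and close() are real calls. The result only describes the moment of the call, since the mapping can change afterwards.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * probe_readable() is a hypothetical helper name, not a library call.
 * It asks the kernel to copy one byte from addr into a pipe. The kernel
 * reports an unreadable address as EFAULT instead of raising SIGSEGV.
 */
static bool probe_readable(const void *addr)
{
    int fds[2];

    if (pipe(fds) == -1)
        return false;            /* could not set up the probe */

    ssize_t n = write(fds[1], addr, 1);
    int saved_errno = errno;     /* close() may overwrite errno */

    close(fds[0]);
    close(fds[1]);

    errno = saved_errno;         /* EFAULT if addr was not readable */
    return n == 1;
}
```

A caller would use it as `if (!probe_readable(p)) { /* treat p as invalid */ }`. Creating a pipe per probe is slow, so a real program would probably keep one pipe open and drain it.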

This is a typical case of TOCTOU (time of check to time of use): you check at some point that the memory is writeable, then later on you try to write to it, and in the meantime (e.g. because the application deallocated it) the memory has become inaccessible.
There is only one valid way to actually do this, and that is, trap the fault you get from writing to it when you actually need to use it.
Of course, you can use tricks to try to figure out if the memory "may be writeable", but there is no way you can actually ensure it is writeable.
You may want to explain slightly more what you are actually trying to do, and maybe we can have some better ideas if you are more specific.

You can try msync:
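/* Needs <errno.h>, <stdint.h>, <sys/mman.h> and <unistd.h>. */
uintptr_t page_size = (uintptr_t)getpagesize();
void *aligned = (void *)((uintptr_t)p & ~(page_size - 1));
if (msync(aligned, page_size, MS_ASYNC) == -1 && errno == ENOMEM) {
    /* The page containing p is not mapped. */
}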
But this function may be slow and should not be used in performance-critical code.

Related

"shmop_open(): unable to attach or create shared memory segment 'No error':"?

I get this every time I try to create an account to ask this on Stack Overflow:
Oops! Something Bad Happened!
We apologize for any inconvenience, but an unexpected error occurred while you were browsing our site.
It’s not you, it’s us. This is our fault.
That's the reason I post it here. I literally cannot ask it on Overflow, even after spending hours of my day (on and off) repeating my attempts and solving a million reCAPTCHA puzzles. Can you maybe fix this error soon?
With no meaningful/complete examples, and basically no documentation whatsoever, I've been trying to use the "shmop" part of PHP for many years. Now I must find a way to send data between two different CLI PHP scripts running on the same machine, without abusing the database for this. It must work without database support, which means I'm trying to use shmop, but it doesn't work at all:
$shmopid = shmop_open(1, 'w', 0644, 99999); // I have no idea what the "key" is supposed to be. It says: "System's id for the shared memory block. Can be passed as a decimal or hex.", so I've given it a 1 and also tried with 123. It gave an error when I set the size to 64, so I increased it to 99999. That's when the error changed to the one I now face above.
shmop_write($shmopid, 'meow 123', 0); // Write "meow 123" to the shared variable.
while (1)
{
$shared_string = shmop_read($shmopid, 0, 8); // Read the "meow 123", even though it's the same script right now (since this is an example and minimal test).
var_dump($shared_string);
sleep(1);
}
I get the error for the first line:
shmop_open(): unable to attach or create shared memory segment 'No error':
What does that mean? What am I doing wrong? Why is the manual so insanely cryptic for this? Why isn't this just a built-in "superarray" that can be accessed across the scripts?
About CLI:
It cannot work in standalone CLI processes, as an answer here says:
https://stackoverflow.com/a/34533749
The master process is the one to hold the shared memory block, so you will have to use php-fpm or mod_php or some other web/service-running version, and maybe even start/request/stop it all from a CLI php script.
About shmop usage itself:
Use "c" mode in shmop_open() for creating the segment before it can be used with "a" or "w".
I stumbled on this error in a different scenario where shared memory is completely optional to speed up some repeated operations. So I wanted to try reading first without knowing memory size and then allocate from actual data when needed. In my case I had to call it as @shmop_open() to hide this error output.
About shmop on Windows:
PHP 7 crashed Apache worker process causing its restart with status 3221225477 when trying to reallocate a segment with the same predefined (arbitrary number) key but different size, even after shmop_delete(). As a workaround for this, I took filemtime() of the source file which contains data to be stored in memory, and used that timestamp as the key in shmop_open(). It still was not flawless IIRC, and I don't know if it would cause memory leaks, but it was enough to test my code which would mainly run on Linux anyway.
Finally, as of PHP version 8.0.7, shmop seems to work fine with Apache 2.4.46 and mod_php in Windows 10.

linux write(): does it try to write as many bytes as possible?

If I use write in this way: write (fd, buf, 10000000 /* 10MB */) where fd is a socket that uses blocking I/O, will the kernel try to flush as many bytes as possible so that one call is enough? Or do I have to call write several times according to its return value? If that happens, does it mean something is wrong with fd?
============================== EDITED ================================
Thanks for all the answers. Furthermore, if I put fd into poll and it returns successfully with POLLOUT, does that mean a call to write cannot block and will write all the data unless something is wrong with fd?
In blocking mode, write(2) normally returns only once the specified number of bytes has been written. If it cannot write, it waits.
In non-blocking (O_NONBLOCK) mode it will not wait; it returns right away. If it can write all of the bytes, it succeeds; otherwise it sets errno accordingly. You then have to check errno, and if it is EWOULDBLOCK or EAGAIN you have to invoke the same write again.
From manual of write(2)
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
So yes, there can be something wrong with fd.
Also note this
A successful return from write() does not make any guarantee that data has been committed to disk. In fact, on some buggy implementations, it does not even guarantee that space has successfully been reserved for the data. The only way to be sure is to call fsync(2) after you are done writing all your data.
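Since the manual allows write() to return fewer bytes than requested, callers commonly wrap it in a loop. A minimal sketch, assuming a blocking descriptor; write_all() is a hypothetical helper name, not part of any library:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all count bytes are
 * accepted or a real error occurs. Returns count on success, -1 on error
 * (errno is left as set by write()).
 */
static ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before writing: retry */
            return -1;           /* genuine error, e.g. EPIPE or EBADF */
        }
        p += n;                  /* short write: advance past what was taken */
        left -= (size_t)n;
    }
    return (ssize_t)count;
}
```

On a non-blocking descriptor the same loop would also need to handle EAGAIN/EWOULDBLOCK, typically by waiting in poll() for POLLOUT before retrying.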
/etc/sysctl.conf is used in Linux to set parameters for the TCP protocol, which is what I assume you mean by a socket. There may be a lot of parameters there, but when you dig through it, basically there is a limit to the amount of data the TCP buffers can hold at one time.
So if you try to write 10 MB of data in one go, always check the return value of the write() system call: it is a ssize_t giving the number of bytes that were actually accepted. Only if the system allowed the full 10 MB would write return that value.
The value is
net.core.wmem_max = [some number]
If you change some number to a value large enough to allow 10MB you can write that much. DON'T do that! You could cause other problems. Research settings before you do anything. Changing settings can decrease performance. Be careful.
http://linux.die.net/man/7/tcp
has basic C information for TCP settings. Also check out /proc/sys/net on your box.
One other point: TCP is a two-way door, so just because you can send a zillion bytes at one time does not mean the other side can read it or even handle it. Your socket may just block for a while. And possibly your write() return value may be less than you hoped for.
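Rather than changing system-wide settings, a program can simply ask how large the send buffer of a given socket currently is. A small sketch using getsockopt() with SO_SNDBUF; send_buffer_size() is a hypothetical helper name:

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: return the kernel's current send-buffer size for
 * sockfd in bytes, or -1 if the query fails (errno set by getsockopt()).
 */
static int send_buffer_size(int sockfd)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &size, &len) == -1)
        return -1;
    return size;
}
```

The number is only a hint about how much one write() might accept at once. The return value of write() remains the authoritative answer.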

"cat" command killed when reading from a Linux device driver

I have an assignment in my Operating Systems class to make a simple pseudo-stack Linux device driver. So for an example, if I was to write "Hello" to the device driver, it would return "olleH" when I read from it. We have to construct a tester program in C to just call upon the read/write functions of the device driver to just demonstrate that it functions in a FILO manner. I have done all of this, and my tester program, in my opinion, demonstrates the purpose of the assignment; however, out of curiosity, inside BASH I execute the following commands:
echo "Test" > /dev/driver
cat /dev/driver
where /dev/driver is the special file I created using "mknod". However, when I do this, I get a black screen full of errors. After I swap back to the GUI view using Ctrl+Alt+F7, I see that BASH has returned "Killed".
Does anyone know what could be causing this to happen? I am confused since my tester program calls open(), read(), and write() with everything functioning as it should.
If I need to show some code, just ask.
The function in your device driver that writes to the buffer you are providing it is most likely causing this issue.
To debug, you can do the following:
First, make sure the read part is fine. You can printk your internal buffer after you read from input to ensure this.
Second, in your write function, printk some information instead of actually writing anything and make sure everything is fine.
Also, make sure the driver makes it clear that the data has ended. cat keeps calling read() until it gets 0 back, so your read function must eventually report end of file. I'm not particularly sure about device drivers, but you either need to return 0 as the number of bytes when it is called a second time, or set an eof variable (if that is one of the arguments to your function).
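To see why the end-of-data signal matters, it helps to look at the reading side. The following userspace sketch reads a descriptor the way cat does, looping until read() returns 0. The name drain() and the 4096-byte buffer are assumptions for illustration, and real cat implementations may use larger buffers.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: read fd until end of file, as cat does.
 * Returns the total number of bytes seen, or -1 on a read error.
 */
static long drain(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            return total;        /* 0 means EOF: the only normal way out */
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        total += n;
    }
}
```

A tester program that calls read() once with a small, known size never exercises this loop, which may be why it behaves while cat does not. Note also that the buffer passed to read() here is much larger than a short string like "Test", so the driver has to cope with a count bigger than the data it holds.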

Opening device file and error code in linux-kernel

I am new to kernel programming and I have two questions:
My device is getting registered (by dynamic registration) but my application is not able to open the device file. What could be the possible reasons?
What would be the appropriate error code to return when my device driver detects a divide by zero?
My code implements simple arithmetic operations in the kernel. I use an ioctl() based interface to communicate between user space and the kernel.
if(out.b==0) /*checking for divide by zero*/
out.res=-EINVAL;
else
out.res=out.a/out.b;
copy_to_user((values*)ioctl_param,&out,sizeof(values));
break;
We can't possibly answer the first question if you don't show us your code.
As for the second, EINVAL or perhaps ERANGE.
In your case you need to make the distinction between the information you return in the ioctl_param structure (that's a really bad variable name by the way) and the return status of the ioctl() call itself.
Remember that, as seen from user space, ioctl() returns 0 if it completes successfully, and returns -1 and sets errno if it fails.
The kernel and C library take care of most of that for you. Usually all you have to do is return -EINVAL or similar from your ioctl() function.
Something like this:
if (out.b == 0) /* checking for divide by zero */
    return -EINVAL;
out.res = out.a / out.b;
/* copy_to_user() returns the number of bytes it could NOT copy */
if (copy_to_user((values *)ioctl_param, &out, sizeof(values)))
    return -EFAULT;
break;
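On the user-space side, the negative value returned by the driver shows up as a -1 return with errno set. A minimal sketch of a caller that turns this back into a single status code; checked_ioctl() is a hypothetical wrapper name, and the request code and argument would be whatever your driver defines:

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical wrapper: returns 0 on success or a negative errno value
 * (e.g. -EINVAL for the divide-by-zero case) on failure.
 */
static int checked_ioctl(int fd, unsigned long request, void *arg)
{
    if (ioctl(fd, request, arg) == -1)
        return -errno;           /* errno was set by the C library */
    return 0;
}
```

With this split, the result field in the structure only ever carries a valid quotient, and the error travels through the return status.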

Using assertion in the Linux kernel

I have a question about assert() in Linux: can I use it in the kernel?
If no, what techniques do you usually use if, for example I don't want to enter NULL pointer?
The corresponding kernel macros are BUG_ON and WARN_ON. The former is for when you want to make the kernel panic and bring the system down (i.e., unrecoverable error). The latter is for when you want to log something to the kernel log (viewable via dmesg).
As @Michael says, in the kernel, you need to validate anything that comes from userspace and just handle it, whatever it is. BUG_ON and WARN_ON are to catch bugs in your own code or problems with the hardware.
One option would be to use the macro BUG_ON(). It will printk a message, and then panic() (i.e. crash) the kernel.
http://kernelnewbies.org/KernelHacking-HOWTO/Debugging_Kernel
Of course, this should only be used as an error handling strategy of last resort (just like assert)...
No. Unless you're working on the kernel core rather than on a module, you should do your best to never crash (technically, abort()) the kernel. If you don't want to use a NULL pointer, just don't do it. Check it before using it, and produce an error log if it is.
The closest thing you might want to do if you're actually handling a fatal case is the panic() function or the BUG_ON macro, which abort execution and produce diagnostic messages, a stack trace and a list of modules. WARN_ON produces similar diagnostics but lets execution continue.
Well, dereferencing null pointer will produce an oops, which you can use to find the offending code. Now, if you want to assert() a given condition, you can use
BUG_ON(condition)
A less lethal mechanism is WARN_ON, which will produce a backtrace without crashing the kernel.
I use this macro, it uses BUG() but adds some more info I normally use for debugging, and of course you can edit it to include more info if you wish:
#define ASSERT(x)                                                        \
do {                                                                     \
    if (x)                                                               \
        break;                                                           \
    printk(KERN_EMERG "### ASSERTION FAILED %s: %s: %d: %s\n",           \
           __FILE__, __func__, __LINE__, #x);                            \
    dump_stack();                                                        \
    BUG();                                                               \
} while (0)
BUG_ON() is the appropriate way to do it. It checks whether the condition is true and, if so, calls the macro BUG().
How BUG() handles the rest is explained very well in the following article:
http://kernelnewbies.org/FAQ/BUG
