Can exec*'s argv contain a value of 0 in multiple places? - linux

I was reading that after exec creates a new process,
argv is an array of argument strings, with argv[argc] == 0
What happens if one of the other values within the array argv happens to be 0? Will the number of arguments (argc) be incorrectly calculated when the child process runs?
I read this on page 34 of the ABI of AMD64 (https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf).

The execve system call (which is used by all exec* functions) has an argument of the form char *const argv[]. The kernel calculates argc by iterating over the supplied argv as follows:
static int count(struct user_arg_ptr argv, int max)
{
    int i = 0;
    if (argv.ptr.native != NULL) {
        for (;;) {
            const char __user *p = get_user_arg_ptr(argv, i);
            if (!p)
                break;
            if (IS_ERR(p))
                return -EFAULT;
            if (i >= max)
                return -E2BIG;
            ++i;
            if (fatal_signal_pending(current))
                return -ERESTARTNOHAND;
            cond_resched();
        }
    }
    return i;
}
The function get_user_arg_ptr essentially calculates an index into the argv array and returns the pointer stored at that index. The loop exits under four conditions, two of which are pertinent to your question:
On the first NULL seen in the argv array. Any pointers following the first NULL in argv are ignored. Having more than one NULL suggests a bug in the program that constructed argv.
When the number of pointers is larger than or equal to MAX_ARG_STRINGS, which is defined as 0x7FFFFFFF. In this case, the system call fails with E2BIG.
The value of i returned by count is then assigned to argc.
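To see the first point in isolation, here is a minimal userspace sketch of the same counting rule. The helper count_args and its -1 error value are made up for illustration and are not kernel code. It only shows that counting stops at the first NULL, so anything after it never contributes to argc.

```c
#include <stddef.h>

/* Hypothetical userspace analogue of the kernel's count(): counts the
 * pointers in argv up to, but not including, the first NULL.
 * Returns -1 (standing in for -E2BIG) if more than max pointers are found. */
int count_args(char *const argv[], int max)
{
    int i = 0;

    if (argv != NULL) {
        while (argv[i] != NULL) {
            if (i >= max)
                return -1;
            ++i;
        }
    }
    return i;
}
```

With an array such as { "prog", "a", NULL, "ignored", NULL }, this returns 2: the entry after the first NULL is never looked at.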
Another case where the terminating NULL in argv matters is when the application itself uses argv as follows:
for (char **p = argv; *p != NULL; ++p)
{
    // ...
}
It's part of the Linux ABI that argv terminates with NULL, so such code is legal and portable across all Linux implementations. It is also legal on Windows. In that sense, argc is provided for convenience only.
In addition, both the C and C++ standards state (in 5.1.2.2.1 and 3.6.1, respectively) that argc shall be non-negative, that argv[argc] shall be a null pointer, and that if argc is greater than zero, argv[0] through argv[argc-1] shall be non-null pointers to null-terminated strings. See also this answer.

Related

Reading array of regs using Verilator and VPI

So I have the following register array defined in my Verilog:
reg [31:0] register_mem [0:15]/* verilator public */;
My goal is to read each of the 16 values stored in it from my Verilator C++ code.
I have found that documentation for VPI is rather difficult to find. I still cannot figure out what a t_vpi_vecval is and what its fields are, or whether it is even the right approach.
Here is my attempt at reading the 5th value in the register array:
unsigned int read_regs() {
    const std::string path = "TOP.TOP.cpu.reg_file.register_mem";
    vpiHandle vh1 = vpi_handle_by_name((PLI_BYTE8*)path.c_str(), NULL);
    if (!vh1) {
        printf("Name %s", path.c_str());
        vl_fatal(__FILE__, __LINE__, "sim_main", "No handle found: ");
    }
    const char* name = vpi_get_str(vpiName, vh1);
    s_vpi_value v;
    v.format = vpiVectorVal;
    vpi_get_value(vh1, &v);
    return v.value.vector[4].aval;
}
No matter what I do here, the method returns 0, suggesting that I am not looking at the register_mem array.
What am I doing wrong?
To get the values of an array, you need to read each element separately. VPI does not return a value for the array as a whole.
The handle you get for the array is of type vpiRegArray. It can be iterated to access each element.
Here is a simple example that iterates over the array and prints the value of every element:
#include "vpi_user.h"
PLI_INT32 preg_calltf( char *txt ) {
    vpiHandle hreg = vpi_handle_by_name("rarr.register_mem", 0);
    vpi_printf("reg type: %s\n", vpi_get_str(vpiType, hreg)); // vpiRegArray
    s_vpi_value val = {vpiDecStrVal}; // struct t_vpi_value
    vpiHandle arrayIterator = vpi_iterate( vpiReg, hreg);
    if( arrayIterator != NULL ) {
        vpiHandle item = NULL;
        while( NULL != ( item = vpi_scan( arrayIterator ) ) ) {
            vpi_get_value(item, &val);
            vpi_printf("item type: %s = %s\n", vpi_get_str(vpiType, item), val.value.str); // vpiReg
            vpi_free_object( item );
        }
    }
    return 0;
}
In this case I initialized val with vpiDecStrVal. It instructs the simulator to return the value as a decimal string, which is then accessible as val.value.str. There are several other formats that return strings or binary data in 2-state or 4-state representation.
For 2-state values up to 32 bits wide you can use the integer format (vpiIntVal). For wider or 4-state values, you need vpiVectorVal. It asks the simulator for an array of s_vpi_vecval structures, each holding two 32-bit integers, aval and bval, with enough elements to cover all bits of the value. For each bit position, the combination of the aval bit and the bval bit encodes the 4-state value of that bit.
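As a rough illustration of the aval/bval encoding, here is a small standalone sketch. The vecval_t type is a stand-in I defined so the code compiles without a simulator; in real VPI code you would use s_vpi_vecval from vpi_user.h. The helper names are made up. The mapping used is the one from the LRM: (aval, bval) of 0/0 is 0, 1/0 is 1, 0/1 is z, and 1/1 is x.

```c
#include <stdint.h>

/* Stand-in for s_vpi_vecval (assumption: same aval/bval layout). */
typedef struct {
    uint32_t aval;
    uint32_t bval;
} vecval_t;

/* Decodes one bit of a 4-state vector; each array element covers 32 bits. */
char decode_bit(const vecval_t *vec, unsigned bit)
{
    uint32_t a = (vec[bit / 32].aval >> (bit % 32)) & 1u;
    uint32_t b = (vec[bit / 32].bval >> (bit % 32)) & 1u;

    if (b == 0)
        return a ? '1' : '0';
    return a ? 'x' : 'z';
}

/* Writes the low `width` bits, most significant first, into out.
 * out must have room for width + 1 bytes. */
void decode_vector(const vecval_t *vec, unsigned width, char *out)
{
    for (unsigned i = 0; i < width; ++i)
        out[i] = decode_bit(vec, width - 1 - i);
    out[width] = '\0';
}
```

For a purely 2-state value, bval is all zeros and aval alone holds the number.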
All VPI information is available in the LRM (the language reference manual), including relation diagrams and data structures. There are also some books, for example "The Verilog PLI Handbook" by Sutherland.

Visual Studio '13 (Access Violation)

When I compile and run this program via gcc(g++)/Cygwin it compiles and acts as expected.
#include <iostream>
using namespace std;
int main(int argc, char* argv[]) {
    for (int arg = 1; arg <= argc; arg++)
    {
        cout << argv[arg] << endl;
    }
    return 0;
}
However, when compiling with Visual Studio 13, the program compiles but I am given an access violation upon execution. What gives?
Unhandled exception at 0x000B5781 in demo.exe: 0xC0000005: Access violation reading location 0x00000000.
argv is a pointer to the first element of an array containing argc+1 elements. The first argc elements of this array contain pointers to first elements of null terminated strings representing the arguments given to the program by the environment (commonly the first of these strings is the name of the program, followed by the command line arguments).
The last element of this array (the argc+1th element, which argv[argc] refers to) is a null pointer. Your code dereferences this null pointer, leading to undefined behaviour.
The important thing to note here is that array indexing in C++ is zero based, rather than one based. This means that the first element of an array arr of length n is arr[0], and the last element is arr[n-1]. Your code appears to assume that the first element of such an array is arr[1] and that the last element is arr[n].
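A minimal sketch of the corrected loop, written here in C rather than with iostream. The helper print_args and its FILE* parameter are my own additions for illustration. The only change that matters is the loop condition: arg < argc instead of arg <= argc, so argv[argc] is never dereferenced.

```c
#include <stdio.h>

/* Hypothetical helper: prints argv[1] .. argv[argc - 1], one per line,
 * and returns how many arguments were printed. */
int print_args(int argc, char *argv[], FILE *out)
{
    int printed = 0;

    for (int arg = 1; arg < argc; ++arg) {   /* stop before argv[argc] */
        fprintf(out, "%s\n", argv[arg]);
        ++printed;
    }
    return printed;
}
```

Calling print_args(argc, argv, stdout) from main prints every command line argument except the program name.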

How can I get argv from "struct linux_binprm"?

I want to extract all argv from an existing struct linux_binprm. On kernel 3.4, I tried this piece of code: http://www.mail-archive.com/kernelnewbies#nl.linux.org/msg00278.html in do_execve_common, but it doesn't work. It returns (null). What is the problem, and how can I get ALL the arguments as a char * string?
If you want to get the full command line before the binary loader runs in do_execve_common(), you can try the following:
do_execve_common() already takes argv as a parameter, so why bother getting it from struct linux_binprm? You can use argv directly by inserting code like the following into do_execve_common():
argc = count(argv, MAX_ARG_STRINGS);
i = 0;
while (i < argc)
{
    const char __user *str;
    int len;
    ret = -EFAULT;
    str = get_user_arg_ptr(argv, i);
    if (IS_ERR(str))
        goto out;
    /* strnlen_user() returns the length including the trailing \0, or 0 on fault */
    len = strnlen_user(str, MAX_ARG_STRLEN);
    if (!len)
        goto out;
    /* Copy the string to temporary kernel storage.
     * NOTE: tmp[] is a string array; its memory must already be
     * allocated, with room for len bytes per string.
     * str is a user-space pointer, so use copy_from_user(), not memcpy(). */
    if (copy_from_user(tmp[i], str, len))
        goto out;
    i++;
}
After this code runs, the argv strings should all be saved in the tmp[] array.
If instead you want the full command line after the binary loader has run, the argument page should have been set up correctly by then, and you can try the following approach:
There is a function proc_pid_cmdline() in fs/proc/base.c; you can reuse most of its code to get the full command line from the argument page.

VC++ read variable length char*

I'm trying to read a variable length char* from the user input. I want to be able to specify the length of the string to read when the function is called:
char *get_char(char *message, unsigned int size) {
    bool correct = false;
    char *value = (char*)calloc(size+1, sizeof(char));
    cout << message;
    while(!correct) {
        int control = scanf_s("%s", value);
        if (control == 1)
            correct = true;
        else
            cout << "Enter a correct value!" <<endl
                 << message;
        while(cin.get() != '\n');
    }
    return value;
}
So, upon running the program and trying to enter a string, I get a memory access violation, so I figured something has gone wrong when accessing the allocated space. My first idea was it went wrong because the size of the scanned char * is not specified within scanf(), but it doesn't work with correct length strings either. Even if I give the calloc a size of 1000 and try to enter one character, the program crashes.
What did I do wrong?
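You have to pass the size of value to scanf_s:
int control = scanf_s("%s", value, size);
does the trick. Without it, scanf_s looks for a buffer size argument that was never passed, which is undefined behavior and explains the crash.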
See the documentation of scanf_s for an example of how to use the function:
Unlike scanf and wscanf, scanf_s and wscanf_s require the buffer size to be specified for all input parameters of type c, C, s, S, or [. The buffer size is passed as an additional parameter immediately following the pointer to the buffer or variable.
I omit the rest of the MSDN description here because the example they provide uses scanf instead of scanf_s, which is quite irritating...
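As an aside, scanf_s is not available on every toolchain. A portable way to get the same bound is to put the maximum field width into the format string itself. This is a different technique from scanf_s, not a drop-in replacement. Below is a minimal sketch; the helper parse_word is made up for illustration, and it parses an already-read line (for example one obtained with fgets) rather than reading stdin directly.

```c
#include <stdio.h>

/* Hypothetical helper: copies the first whitespace-delimited word of line
 * into buf, storing at most size characters plus the terminating '\0'.
 * buf must have room for size + 1 bytes. Returns 1 on success, 0 otherwise. */
int parse_word(const char *line, char *buf, unsigned size)
{
    char fmt[32];

    if (size == 0)
        return 0;
    /* Builds a format such as "%15s"; standard scanf only accepts the
     * field width as part of the format string. */
    snprintf(fmt, sizeof fmt, "%%%us", size);
    return sscanf(line, fmt, buf) == 1;
}
```

With a buffer from calloc(size + 1, 1), as in the question, parse_word(line, value, size) cannot write past the allocation.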

String manipulation in Linux kernel module

I am having a hard time manipulating strings while writing a module for Linux. I have an int Array[10] with different values in it, and I need to produce a string to send to the buffer in my my_read procedure. If my array is {0,1,112,20,4,0,0,0,0,0}
then my output should be:
0:(0)
1:-(1)
2:-------------------------------------------------------------------------------------------------------(112)
3:--------------------(20)
4:----(4)
5:(0)
6:(0)
7:(0)
8:(0)
9:(0)
When I try to place the above strings in char[] arrays, somehow weird characters end up there.
Here is the code:
int my_read (char *page, char **start, off_t off, int count, int *eof, void *data)
{
    int len;
    if (off > 0){
        *eof =1;
        return 0;
    }
    /* get process tree */
    int task_dep=0; /* depth of a task from INIT*/
    get_task_tree(&init_task,task_dep);
    char tmp[1024];
    char A[ProcPerDepth[0]],B[ProcPerDepth[1]],C[ProcPerDepth[2]],D[ProcPerDepth[3]],E[ProcPerDepth[4]],F[ProcPerDepth[5]],G[ProcPerDepth[6]],H[ProcPerDepth[7]],I[ProcPerDepth[8]],J[ProcPerDepth[9]];
    int i=0;
    for (i=0;i<1024;i++){ tmp[i]='\0';}
    memset(A, '\0', sizeof(A));memset(B, '\0', sizeof(B));memset(C, '\0', sizeof(C));
    memset(D, '\0', sizeof(D));memset(E, '\0', sizeof(E));memset(F, '\0', sizeof(F));
    memset(G, '\0', sizeof(G));memset(H, '\0', sizeof(H));memset(I, '\0', sizeof(I));memset(J, '\0', sizeof(J));
    printk("A:%s\nB:%s\nC:%s\nD:%s\nE:%s\nF:%s\nG:%s\nH:%s\nI:%s\nJ:%s\n",A,B,C,D,E,F,G,H,I,J);
    memset(A,'-',sizeof(A));
    memset(B,'-',sizeof(B));
    memset(C,'-',sizeof(C));
    memset(D,'-',sizeof(D));
    memset(E,'-',sizeof(E));
    memset(F,'-',sizeof(F));
    memset(G,'-',sizeof(G));
    memset(H,'-',sizeof(H));
    memset(I,'-',sizeof(I));
    memset(J,'-',sizeof(J));
    printk("A:%s\nB:%s\nC:%s\nD:%s\nE:%s\nF:%s\nG:%s\nH:%s\nI:%s\nJ:%\n",A,B,C,D,E,F,G,H,I,J);
    len = sprintf(page,"0:%s(%d)\n1:%s(%d)\n2:%s(%d)\n3:%s(%d)\n4:%s(%d)\n5:%s(%d)\n6:%s(%d)\n7:%s(%d)\n8:%s(%d)\n9:%s(%d)\n",A,ProcPerDepth[0],B,ProcPerDepth[1],C,ProcPerDepth[2],D,ProcPerDepth[3],E,ProcPerDepth[4],F,ProcPerDepth[5],G,ProcPerDepth[6],H,ProcPerDepth[7],I,ProcPerDepth[8],J,ProcPerDepth[9]);
    return len;
}
It worked out with this:
char s[500];
memset(s,'-',498);
for (i=len=0;i<10;++i){
    len+=sprintf(page+len,"%d:%.*s(%d)\n",i,ProcPerDepth[i],s,ProcPerDepth[i]);
}
I wonder if there is an easy flag to repeat a character in sprintf. Thanks.
Here are some issues:
You have entirely filled the A, B, C ... arrays with characters. Then, you pass them to an I/O routine that is expecting null-terminated strings. Because your strings are not null-terminated, printk() will keep printing whatever is in stack memory after your object until it finds a null by luck.
Multi-threaded kernels like Linux have strict and relatively small constraints regarding stack allocations. All instances in the kernel call chain must fit into a specific size or something will be overwritten. You may not get any detection of this error, just some kind of downstream crash as memory corruption leads to a panic or a wedge. Allocating large and variable arrays on a kernel stack is just not a good idea.
If you are going to write the tmp[] array and properly nul-terminate it, there is no reason to also initialize it. But if you were going to initialize it, you could do so with compiler-generated code by just saying: char tmp[1024] = { 0 }; (C99 requires that when an aggregate is partially initialized, the remaining elements are zero-initialized.) A similar observation applies to the other arrays.
How about getting rid of most of those arrays and most of that code and just doing something along the lines of:
for(i = j = 0; i < n; ++i)
    j += sprintf(page + j, "...", ...)
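One way to flesh out that loop is sketched below as ordinary userspace C, not module code. The function name format_histogram, the 128-byte dash buffer and the clamping are my own assumptions. As far as I know, printf-style formats have no flag for repeating a character, so the sketch uses the "%.*s" precision to print only the first N bytes of a buffer full of dashes, and snprintf so that the output buffer is never overrun.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for the read handler: writes one
 * "index:----(count)" line per entry into out, which holds cap bytes.
 * Returns the number of bytes written, not counting the final '\0'. */
size_t format_histogram(char *out, size_t cap, const int *counts, size_t n)
{
    char dashes[128];
    size_t len = 0;

    memset(dashes, '-', sizeof dashes);
    if (cap > 0)
        out[0] = '\0';
    for (size_t i = 0; i < n && len < cap; ++i) {
        /* Clamp the bar so the precision never exceeds the dash buffer. */
        int bar = counts[i] < 0 ? 0 : counts[i];
        if (bar > (int)sizeof dashes)
            bar = (int)sizeof dashes;
        int w = snprintf(out + len, cap - len, "%zu:%.*s(%d)\n",
                         i, bar, dashes, counts[i]);
        if (w < 0 || (size_t)w >= cap - len) {
            out[len] = '\0';   /* drop the line that did not fit */
            break;
        }
        len += (size_t)w;
    }
    return len;
}
```

Because the precision bounds how many bytes are read, the dash buffer does not need a terminating nul, and no per-depth arrays are needed on the stack.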
