What is the valid address space for a user process? (OS X and Linux) - linux

The mmap system call documentation says that the function will fail if:
MAP_FIXED was specified and the addr
argument was not page aligned, or part
of the desired address space resides
out of the valid address space for a
user process.
I can't find documentation anywhere saying what would be a valid address to map. (I'm interested in doing this on OS X and linux, ideally the same address would be valid for both...).

Linux kernel reserves part of virtual address space for itself to where userspace have (almost) no access and can't map anything. You're looking for what's called "userspace/kernelspace split".
On i386 arch default is 3G/1G one -- userspace gets lower 3 GB of virtual address space, kernel gets upper 1 GB, additionally there are 2G/2G and 1G/3G splits:
config PAGE_OFFSET
hex
default 0xB0000000 if VMSPLIT_3G_OPT
default 0x80000000 if VMSPLIT_2G
default 0x78000000 if VMSPLIT_2G_OPT
default 0x40000000 if VMSPLIT_1G
default 0xC0000000
depends on X86_32
On x86_64, userspace lives in lower half of (currently) 48-bit of virtual address space:
/*
* User space process size. 47bits minus one guard page.
*/
#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)

This varies based on a number of factors, many of which aren't under your control. As adobriyan mentioned, depending on the OS, you have various fixed upper limits, beyond which kernel code and data lies. Usually this upper limit is at least 2GB on 32-bit OSes; some OSes provide additional address space. 64-bit OSes generally provide an upper limit controlled by the number of virtual address bits supported by your CPU (usually at least 40 bits worth of address space). However there are yet other factors beyond your control:
On recent version of linux, mmap mappings below the address configured in /proc/sys/vm/mmap_min_addr will be denied.
You cannot make mappings which overlap with any existing mappings. Since the dynamic linker is free to map anywhere that does not overlap with your executable's fixed sections, this means potentially any address may be denied.
The kernel may inject other additional mappings, such as the system call gate.
malloc may perform mmaps on its own, which are placed in a somewhat arbitrary location
As such, there is no way to absolutely guarentee that MAP_FIXED will succeed, and so it should normally be avoided.
The only place I've seen where MAP_FIXED is necessary is in the wine startup code, which reserves (using MAP_FIXED) all addresses above 2G, in order to avoid confusing windows code which assumes no mappings will ever show up with a negative address. This is, of course, a highly specialized use of the flag.
If you're trying to do this in order to avoid having to deal with offsets in shared memory, one option would be to wrap pointers in a class to automatically handle offsets:
template<typename T>
class offset_pointer {
private:
ptrdiff_t offset;
public:
typedef T value_type, *ptr_type;
typedef const T const_type, *const_ptr_type;
offset_ptr(T *p) { set(p); }
offset_ptr() { set(NULL); }
void set(T *p) {
if (p == NULL)
offset = 1;
else
offset = (char *)p - (char *)this;
}
T *get() {
if (offset == 1) return NULL;
return (T*)( (char *)this + offset );
}
const T *get() const { return const_cast<offset_pointer>(this)->get(); }
T &operator*() { return *get(); }
const T &operator*() const { return *get(); }
T *operator->() { return get(); }
const T *operator->() const { return get(); }
operator T*() { return get(); }
operator const T*() const { return get(); }
offset_pointer operator=(T *p) { set(p); return *this; }
offset_pointer operator=(const offset_pointer &other) {
offset = other.offset;
return *this;
}
};
Note: This is untested code, but should give you the basic idea.

Related

Kernel API to get Physical RAM Offset

I'm writing a device driver (for Linux kernel 2.6.x) that interacts directly with physical RAM using physical addresses. For my device's memory layout (according to the output of cat /proc/iomem), System RAM begins at physical address 0x80000000; however, this code may run on other devices with different memory layouts so I don't want to hard-code that offset.
Is there a function, macro, or constant which I can use from within my device driver that gives me the physical address of the first byte of System RAM?
Is there a function, macro, or constant which I can use from within my device driver that gives me the physical address of the first byte of System RAM?
It doesn't matter, because you're asking an XY question.
You should not be looking for or trying to use the "first byte of System RAM" in a device driver.
The driver only needs knowledge of the address (and length) of its register block (that is what this "memory" is for, isn't it?).
In 2.6 kernels (i.e. before Device Tree), this information was typically passed to drivers through struct resource and struct platform_device definitions in a board_devices.c file.
The IORESOURCE_MEM property in the struct resource is the mechanism to pass the device's memory block start and end addresses to the device driver.
The start address is typically hardcoded, and taken straight from the SoC datasheet or the board's memory map.
If you change the SoC, then you need new board file(s).
As an example, here's code from arch/arm/mach-at91/at91rm9200_devices.c to configure and setup the MMC devices for a eval board (AT91RM9200_BASE_MCI is the physical memory address of this device's register block):
#if defined(CONFIG_MMC_AT91) || defined(CONFIG_MMC_AT91_MODULE)
static u64 mmc_dmamask = DMA_BIT_MASK(32);
static struct at91_mmc_data mmc_data;
static struct resource mmc_resources[] = {
[0] = {
.start = AT91RM9200_BASE_MCI,
.end = AT91RM9200_BASE_MCI + SZ_16K - 1,
.flags = IORESOURCE_MEM,
},
[1] = {
.start = AT91RM9200_ID_MCI,
.end = AT91RM9200_ID_MCI,
.flags = IORESOURCE_IRQ,
},
};
static struct platform_device at91rm9200_mmc_device = {
.name = "at91_mci",
.id = -1,
.dev = {
.dma_mask = &mmc_dmamask,
.coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = &mmc_data,
},
.resource = mmc_resources,
.num_resources = ARRAY_SIZE(mmc_resources),
};
void __init at91_add_device_mmc(short mmc_id, struct at91_mmc_data *data)
{
if (!data)
return;
/* input/irq */
if (data->det_pin) {
at91_set_gpio_input(data->det_pin, 1);
at91_set_deglitch(data->det_pin, 1);
}
if (data->wp_pin)
at91_set_gpio_input(data->wp_pin, 1);
if (data->vcc_pin)
at91_set_gpio_output(data->vcc_pin, 0);
/* CLK */
at91_set_A_periph(AT91_PIN_PA27, 0);
if (data->slot_b) {
/* CMD */
at91_set_B_periph(AT91_PIN_PA8, 1);
/* DAT0, maybe DAT1..DAT3 */
at91_set_B_periph(AT91_PIN_PA9, 1);
if (data->wire4) {
at91_set_B_periph(AT91_PIN_PA10, 1);
at91_set_B_periph(AT91_PIN_PA11, 1);
at91_set_B_periph(AT91_PIN_PA12, 1);
}
} else {
/* CMD */
at91_set_A_periph(AT91_PIN_PA28, 1);
/* DAT0, maybe DAT1..DAT3 */
at91_set_A_periph(AT91_PIN_PA29, 1);
if (data->wire4) {
at91_set_B_periph(AT91_PIN_PB3, 1);
at91_set_B_periph(AT91_PIN_PB4, 1);
at91_set_B_periph(AT91_PIN_PB5, 1);
}
}
mmc_data = *data;
platform_device_register(&at91rm9200_mmc_device);
}
#else
void __init at91_add_device_mmc(short mmc_id, struct at91_mmc_data *data) {}
#endif
ADDENDUM
i'm still not seeing how this is an xy question.
I consider it an XY question because:
You conflate "System RAM" with physical memory address space.
"RAM" would be actual (readable/writable) memory that exists in the address space.
"System memory" is the RAM that the Linux kernel manages (refer to your previous question).
Peripherals can have registers and/or device memory in (physical) memory address space, but this should not be called "System RAM".
You have not provided any background on how or why your driver "interacts directly with physical RAM using physical addresses." in a manner that is different from other Linux drivers.
You presume that a certain function is the solution for your driver, but you don't know the name of that function. That's a prototype for an XY question.
can't i call a function like get_platform_device (which i just made up) to get the struct platform_device and then find the struct resource that represents System RAM?
The device driver would call platform_get_resource() (in its probe function) to retrieve its struct resource that was defined in the board file.
To continue the example started above, the driver's probe routune has:
static int __init at91_mci_probe(struct platform_device *pdev)
{
struct mmc_host *mmc;
struct at91mci_host *host;
struct resource *res;
int ret;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res)
return -ENXIO;
if (!request_mem_region(res->start, resource_size(res), DRIVER_NAME))
return -EBUSY;
...
/*
* Map I/O region
*/
host->baseaddr = ioremap(res->start, resource_size(res));
if (!host->baseaddr) {
ret = -ENOMEM;
goto fail1;
}
that would allow me to write code that can always access the nth byte of RAM, without assumptions of how RAM is arranged in relation to other parts of memory.
That reads like a security hole or a potential bug.
I challenge you to to find a driver in the mainline Linux kernel that uses the "physical address of the first byte of System RAM".
Your title is "Kernel API to get Physical RAM Offset".
The API you are looking would seem to be the struct resource.
What you want to do seems to fly in the face of Linux kernel conventions. For the integrity and security of the system, drivers do not try to access any/every part of memory.
The driver will request and can be given exclusive access to the address space of its registers and/or device memory.
All RAM under kernel management is only accessed through well-defined conventions, such as buffer addresses for copy_to_user() or the DMA API.
A device driver simply does not have free reign to access any part of memory it chooses.
Once a driver is started by the kernel, there is absolutely no way it can disregard "assumptions of how RAM is arranged".
RAM memory mapping is based identical to each SOC or processor. Vendor usually provide their memory mapping related documents to the user.
Probably you need to refer your processor datasheet related document.
As you said memory is hard-coded. In most cases memory mapping is hard-coded over device-tree or in SDRAM driver itself.
Maybe you could look into memblock struct and memblock_start_of_DRAM().
/sys/kernel/debug/memblock/memory represent memory banks.
0: 0x0000000940000000..0x0000000957bb5fff
RAM bank 0: 0x0000000940000000 0x0000000017bb6000
1: 0x0000000980000000..0x00000009ffffffff
RAM bank 1: 0x0000000980000000 0x0000000080000000

How to get the page directory of a pointer in xv6

Here is my 'translate()' in proc.c
I want to get the physical address given the virtual address of a pointer, but I don't know how to get the pointers pgdir(page directory)...
int translate(void* vaddr)
{
cprintf("vaddr = %p\n",vaddr);
int paddr;
pde_t *pgdir;
pte_t *pgtab;
pde_t *pde;
pte_t *pte;
pgdir = (pde_t*)cpu->ts.cr3;
cprintf("page directory base is: %p\n",cpu->ts.cr3);
pde = &pgdir[PDX(vaddr)];
if(*pde & PTE_P){
pgtab = (pte_t*)P2V(PTE_ADDR(*pde));
}else{
cprintf("pde = %d\n",*pde);
cprintf("PTE_P = %d\n",PTE_P);
cprintf("pte not present\n");
return -1;
}
pte = &pgtab[PTX(vaddr)];
paddr = PTE_ADDR(*pte);
cprintf("the virtual address is %p\n",vaddr);
cprintf("the physical address is %d\n",paddr);
return 0;
}
You need to use argint() or argptr() to read the argument.
There's a global variable proc that lives in proc.h.
proc.h, lines 34 to 43
// Per-CPU variables, holding pointers to the
// current cpu and to the current process.
// The asm suffix tells gcc to use "%gs:0" to refer to cpu
// and "%gs:4" to refer to proc. seginit sets up the
// %gs segment register so that %gs refers to the memory
// holding those two variables in the local cpu's struct cpu.
// This is similar to how thread-local variables are implemented
// in thread libraries such as Linux pthreads.
extern struct cpu *cpu asm("%gs:0"); // &cpus[cpunum()]
extern struct proc *proc asm("%gs:4"); // cpus[cpunum()].proc
You can reference proc->pgdir in proc.c or anywhere else that proc.h is included.
For what it's worth your translate function looks good to me.

Ripping out the hidden kernel module by reading kernel memory directly?

Is it possible to find hidden kernel modules by reading kernel memory directly?
By hiding I mean a LKM that removes itself from the kernel module list.
If so, what structure should I expect, or what document should I read?
following #Eugene, I find a way to read kernel memory directly to find the so called not-so-clever hidden module: just compare the module from both procfs perspective and sysfs perspective:
static int detect_hidden_mod_init(void)
{
char *procfs_modules[MAX_MODULE_SIZE];
char *sysfs_modules[MAX_MODULE_SIZE];
int proc_module_index = 0, sys_module_index = 0;
struct module *mod;
struct list_head *p;
// get modules from procfs perspective
list_for_each(p, &__this_module.list){
mod = list_entry(p, struct module, list);
procfs_modules[proc_module_index++] = mod->name;
}
// get modules from sysfs perspective
struct kobject *kobj;
struct kset *kset = __this_module.mkobj.kobj.kset;
list_for_each(p, &kset->list) {
kobj = container_of(p, struct kobject, entry);
sysfs_modules[sys_module_index++] = kobj->k_name;
}
//compare the procfs_modules and sysfs_modules
...
}
Actually it can detect most of current module-hidden rootkit, however as Eugene said, "A clever rootkit could try to hide that data as well". So it is not a perfect way.

Why the bad_alloc(const char*) was made private in Visual C++ 2012?

I am just trying to compile a bit bigger project using the Visual Studio 2012 Release Candidate, C++. The project was/is compiled using the VS2010 now. (I am just greedy to get the C++11 things, so I tried. :)
Apart of things that I can explain by myself, the project uses the code like this:
ostringstream ostr;
ostr << "The " __FUNCTION__ "() failed to malloc(" << i << ").";
throw bad_alloc(ostr.str().c_str());
The compiler now complains
error C2248: 'std::bad_alloc::bad_alloc' : cannot access private member declared
in class 'std::bad_alloc'
... which is true. That version of constructor is now private.
What was the reason to make that version of constructor private? Is it recommended by C++11 standard not to use that constructor with the argument?
(I can imagine that if allocation failed, it may cause more problems to try to construct anything new. However, it is only my guess.)
Thanks,
Petr
The C++11 Standard defines bad_alloc as such (18.6.2.1):
class bad_alloc : public exception {
public:
bad_alloc() noexcept;
bad_alloc(const bad_alloc&) noexcept;
bad_alloc& operator=(const bad_alloc&) noexcept;
virtual const char* what() const noexcept;
};
With no constructor that takes a string. A vendor providing such a constructor would make the code using it not portable, as other vendors are not obliged to provide it.
The C++03 standard defines a similar set of constructors, so VS didn't follow this part of the standard even before C++11. MS does try to make VS as standard compliant as possible, so they've probably just used the occasion (new VS, new standard) to fix an incompatibility.
Edit: Now that I've seen VS2012's code, it is also clear why the mentioned constructor is left private, instead of being completely removed: there seems to be only one use of that constructor, in the bad_array_new_length class. So bad_array_new_length is declared a friend in bad_alloc, and can therefore use that private constructor. This dependency could have been avoided if bad_array_new_length just stored the message in the pointer used by what(), but it's not a lot of code anyway.
If you are accustomed to passing a message when you throw a std::bad_alloc, a suitable technique is to define an internal class that derives from std::bad_alloc, and override ‘what’ to supply the appropriate message.
You can make the class public and call the assignment constructor directly, or make a helper function, such as throw_bad_alloc, which takes the parameters (and additional scalar information) and stores them in the internal class.
The message is not formatted until ‘what’ is called. In this way, stack unwinding may have freed some memory so the message can be formatted with the actual reason (memory exhaustion, bad request size, heap corruption, etc.) at the catch site. If formatting fails, simply assign and return a static message.
Trimmed example:
(Tip: The copy constructor can just assign _Message to nullptr, rather than copy the message since the message is formatted on demand. The move constructor, of course can just confiscate it :-).
class internal_bad_alloc: public std::bad_alloc
{
public:
// Default, copy and move constructors....
// Assignment constructor...
explicit internal_bad_alloc(int errno, size_t size, etc...) noexcept:
std::bad_alloc()
{
// Assign data members...
}
virtual ~internal_bad_alloc(void) noexcept
{
// Free _Message data member (if allocated).
}
// Override to format and return the reason:
virtual const char* what(void) const noexcept
{
if (_Message == nullptr)
{
// Format and assign _Message. Assign the default if the
// format fails...
}
return _Message;
}
private:
// Additional scalar data (error code, size, etc.) pass into the
// constructor and used when the message is formatted by 'what'...
mutable char* _Message;
static char _Default[];
}
};
//
// Throw helper(s)...
//
extern void throw_bad_alloc(int errno, size_t size, etc...)
{
throw internal_bad_alloc(errno, size, etc...);
}

Maximum number of nested conditions allowed

Does anyone knows the limit of nested conditions (I mean conditions under another, several times)? In, let's say, Java and Visual Basic.
I remembered when I was beginning with my developing trace, I make, I think 3 nested conditions in VB 6, and the compiler, just didn't enter the third one, now that I remember I never, knew the maximun nested coditions a language can take.
No limit should exist for a REAL programming language. For VB.NET and Java I would be shocked if there is any limit. The limit would NOT be memory, because we are talking about COMPILE TIME constraints, not executing environment constraints.
This works just find in C#: It should be noted that the compiler might optimize this to not even use the IFs.
static void Main(string[] args)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{
if (true)
{ Console.WriteLine("It works"); }
}
}
}
}
}
}
}
}
}
}
}
}
}
}
This should not be optimized too much:
static void Main(string[] args)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{
if (DateTime.Now.Month == 1)
{
if (DateTime.Now.Year == 2011)
{ Console.WriteLine("It works"); }
}
}
}
}
}
}
}
}
}
}
}
}
}
Console.ReadKey();
}
I agree with most of the people here that there is no limit on writing the if blocks. But there is a max limit on the java method size. I believe its 64K.
If you mean nested if blocks then there is no theoretical limit. The only bound is the available disk space to store the source code and/or compiled code. There may also be a runtime limit if each block generates a new stack frame, but again that is just a memory limit.
The only explanation for your empirical result of 3 is either an error in programming or an error in interpreting the results.
I must agree that the limit is purely based on memory limit. If you reached it, I would expect that you would reach some kind of stack overflow limit, however I doubt there is a possibility that you would reach this limit.
I could not find a source of reference to back this up, but a quick test of 40+ nested if statements compiled and ran fine.
The limit to the number of nested conditionals will almost certainly be based upon the size of the compiler's stack and data structures, and not anything to do with the run-time environment except possibly in cases where the code space of the target environment is severely constrained relative to the memory available to the compiler (e.g. using a modern PC to compile a program for a small microcontroller with 512 bytes of flash). Note that no RAM (beyond any used to store the code itself) is required at run-time to evaluate a deeply-nested combination of logical operators, other than whatever would be required by the most complex term thereof (i.e. memory required to compute '(foo() || bar()) && boz()' is the largest of the memory required to compute foo(), bar(), or boz()).
In practical terms, there is no way one would reach a limit using a modern compiler on a modern machine, unless one were writing a "program" for the specific purpose of exceeding it (I'd expect the limit would probably be between 1,000 and 1,000,000 levels of nesting, but even if it's "only" 1,000 there's no reason to nest things that deep).
Logically I would think that the limit would be based on the memory available to the application in Java.

Resources