How to understand the "PAGE_ALIGN" macro in the kernel? - linux

The code is:
#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
#define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)
I know this macro aligns any address to a page boundary. How should I understand this implementation?

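It rounds addr up to the nearest multiple of 4096 (i.e. 2^12) by adding 4095 (i.e. 2^12 - 1) to it and then clearing the lowest 12 bits. An address that is already a multiple of 4096 is left unchanged, because adding 4095 does not carry into bit 12.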

Related

How to evaluate the flags field of mtd_info_user?

The MTD driver placed inside the Linux kernel source has a definition as below.
struct mtd_info_user {
__u8 type;
__u32 flags;
__u32 size; /* Total size of the MTD */
__u32 erasesize;
__u32 writesize;
__u32 oobsize; /* Amount of OOB data per block (e.g. 16) */
__u64 padding; /* Old obsolete field; do not use */
};
I'm trying to understand what the flags field stands for. My ultimate purpose is to find a way to check if the external MTD device is healthy. I thought that the flags field can represent the actual status of the device.
Inside the mtdchar_open(..) function from mtdchar.c source code, there is a comparison as below.
/* You can't open it RW if it's not a writeable device */
if ((file->f_mode & FMODE_WRITE) && !(mtd->flags & MTD_WRITEABLE)) {
ret = -EACCES;
goto out1;
}
So, I guess the flags field can be evaluated by using the macros from the mtd-abi.h header.
#define MTD_ABSENT 0
#define MTD_RAM 1
#define MTD_ROM 2
#define MTD_NORFLASH 3
#define MTD_NANDFLASH 4 /* SLC NAND */
#define MTD_DATAFLASH 6
#define MTD_UBIVOLUME 7
#define MTD_MLCNANDFLASH 8 /* MLC NAND (including TLC) */
#define MTD_WRITEABLE 0x400 /* Device is writeable */
#define MTD_BIT_WRITEABLE 0x800 /* Single bits can be flipped */
#define MTD_NO_ERASE 0x1000 /* No erase necessary */
#define MTD_POWERUP_LOCK 0x2000 /* Always locked after reset */
What I don't understand is why some of those definitions, e.g. MTD_ABSENT, are declared as plain integers rather than as bit masks.

Ext2/3: Block Type Clarification: IND vs DIND vs TIND

I'm seeing references to "IND" vs "DIND" vs "TIND" block-types in a few places, whereas the definition in the code is very terse:
(https://github.com/torvalds/linux/blob/master/fs/ext4/ext4.h#L362)
#define EXT4_NDIR_BLOCKS 12
#define EXT4_IND_BLOCK EXT4_NDIR_BLOCKS
#define EXT4_DIND_BLOCK (EXT4_IND_BLOCK + 1)
#define EXT4_TIND_BLOCK (EXT4_DIND_BLOCK + 1)
#define EXT4_N_BLOCKS (EXT4_TIND_BLOCK + 1)
Can someone clarify what they are and, potentially, why the definitions seem to imply that a TIND block includes a DIND block, and a DIND block includes an IND block?
I've looked, feverishly, but there aren't any obvious discussions or comments on the subject and it's going to take me a bit more time to figure it out from the code.
#define EXT4_NDIR_BLOCKS /* number of direct blocks */
#define EXT4_IND_BLOCK /* single indirect block */
#define EXT4_DIND_BLOCK /* double indirect block */
#define EXT4_TIND_BLOCK /* triple indirect block */
#define EXT4_N_BLOCKS /* total number of blocks */
NDIR is the number of direct blocks.
IND is the single indirect block.
DIND is the double indirect block.
TIND is the triple indirect block.
N is the total number of block slots: the direct ones plus the three indirect ones.

Kmalloc Alignment

Let's say I allocate an array of uint64_t with kmalloc (and let's assume the size of the array is 32kB). I have the following questions:
1) Is the array guaranteed to be page aligned?
2) Is the array guaranteed to be cache / block aligned?
3) Is there no guarantee at all?
When I allocate the array and use virt_to_phys to get its physical address, I get physical addresses like 00000040142d5c00 and virtual addresses like fffffe07df400000.
Is there any chance that I will end up with an alignment smaller than that of uint64_t, say 4-byte alignment?
Thank you in advance
The alignment is defined by the preprocessor constant ARCH_KMALLOC_MINALIGN,
which is calculated like this:
#if defined(ARCH_DMA_MINALIGN) && ARCH_DMA_MINALIGN > 8
#define ARCH_KMALLOC_MINALIGN ARCH_DMA_MINALIGN
#define KMALLOC_MIN_SIZE ARCH_DMA_MINALIGN
#define KMALLOC_SHIFT_LOW ilog2(ARCH_DMA_MINALIGN)
#else
#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
#endif
So in theory __alignof__(unsigned long long) may return something smaller than 8
in some exotic case,
but in practice __alignof__(unsigned long long) >= 8, and so ARCH_KMALLOC_MINALIGN will be >= 8.

User can define the virtual address ignoring API?

It was strange to me when I saw this:
#define HI3518_IOCH1_PHYS 0x10000000 /* 0x1000_0000 ~ 0x1020_0000 */
#define HI3518_IOCH2_PHYS 0x20000000 /* 0x2000_0000 ~ 0x2020_0000 */
#define HI3518_IOCH1_SIZE 0x200000
#define HI3518_IOCH2_SIZE 0x700000
#define HI3518_IOCH1_VIRT 0xFE000000
#define HI3518_IOCH2_VIRT (HI3518_IOCH1_VIRT + HI3518_IOCH1_SIZE)
This is located at arch/arm/mach-hi3518/include/mach/io.h
The last two lines puzzle me. Does that mean users can choose the virtual address themselves, without using any API?
Thanks in advance!

arch/x86/include/asm/unistd.h vs. include/asm-generic/unistd.h

What's the difference between these two files? I can't really get it. I should mention that the first file should really be arch/x86/include/asm/unistd_32.h (or unistd_64.h). Here is a quick preview of what they contain:
arch/x86/include/asm/unistd.h:
#ifndef _ASM_X86_UNISTD_32_H
#define _ASM_X86_UNISTD_32_H
/*
* This file contains the system call numbers.
*/
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
#define __NR_execve 11
#define __NR_chdir 12
#define __NR_time 13
#define __NR_mknod 14
#define __NR_chmod 15
#define __NR_lchown 16
#define __NR_break 17
#define __NR_oldstat 18
#define __NR_lseek 19
#define __NR_getpid 20
#define __NR_mount 21
#define __NR_umount 22
include/asm-generic/unistd.h
#if !defined(_ASM_GENERIC_UNISTD_H) || defined(__SYSCALL)
#define _ASM_GENERIC_UNISTD_H
#include <asm/bitsperlong.h>
/*
* This file contains the system call numbers, based on the
* layout of the x86-64 architecture, which embeds the
* pointer to the syscall in the table.
*
* As a basic principle, no duplication of functionality
* should be added, e.g. we don't use lseek when llseek
* is present. New architectures should use this file
* and implement the less feature-full calls in user space.
*/
#ifndef __SYSCALL
#define __SYSCALL(x, y)
#endif
#if __BITS_PER_LONG == 32
#define __SC_3264(_nr, _32, _64) __SYSCALL(_nr, _32)
#else
#define __SC_3264(_nr, _32, _64) __SYSCALL(_nr, _64)
#endif
#define __NR_io_setup 0
__SYSCALL(__NR_io_setup, sys_io_setup)
#define __NR_io_destroy 1
__SYSCALL(__NR_io_destroy, sys_io_destroy)
#define __NR_io_submit 2
__SYSCALL(__NR_io_submit, sys_io_submit)
#define __NR_io_cancel 3
__SYSCALL(__NR_io_cancel, sys_io_cancel)
#define __NR_io_getevents 4
__SYSCALL(__NR_io_getevents, sys_io_getevents)
/* fs/xattr.c */
#define __NR_setxattr 5
__SYSCALL(__NR_setxattr, sys_setxattr)
#define __NR_lsetxattr 6
__SYSCALL(__NR_lsetxattr, sys_lsetxattr)
#define __NR_fsetxattr 7
__SYSCALL(__NR_fsetxattr, sys_fsetxattr)
#define __NR_getxattr 8
__SYSCALL(__NR_getxattr, sys_getxattr)
#define __NR_lgetxattr 9
__SYSCALL(__NR_lgetxattr, sys_lgetxattr)
#define __NR_fgetxattr 10
__SYSCALL(__NR_fgetxattr, sys_fgetxattr)
#define __NR_listxattr 11
__SYSCALL(__NR_listxattr, sys_listxattr)
#define __NR_llistxattr 12
I don't have a definite answer, but it's not uncommon for redundant files to exist while developers move from an old mechanism to a newer one. Your case here looks quite similar.
If you check out the 3.4 kernel, you will find that both arch/x86/include/asm/unistd_32.h and arch/x86/include/asm/unistd_64.h are gone. Instead, they are generated using arch/x86/syscalls.
Check out the latest kernel (3.4.2 stable works for me), run "git log --stat arch/x86/include/asm", and search for unistd_64.h, unistd_32.h or unistd.h.
The following commit might be interesting to you:
commit 303395ac3bf3e2cb488435537d416bc840438fcb
I've never touched syscalls before, so I'd rather not say too much. git log is usually how I sort out confusing files. You can also get into makefiles if you are good at it. (I am not, so I rely on git log. )
