Getting the number of files in a specific directory in Linux - linux

Is there any way to get the total number of files in a specific directory without iterating with readdir(3)?
I mean only the direct members of that directory.
It seems that the only way to get the count is to call readdir(3) repeatedly until it returns NULL.
Is there any other way to get the number in O(1)? I need a solution that works on Linux.
Thanks.

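scandir() example (requires dirent.h, stdio.h and stdlib.h):
struct dirent **namelist;
int n = scandir(".", &namelist, NULL, alphasort); // "." == current directory
if (n < 0)
{
    perror("scandir");
    exit(1);
}
printf("entries found = %d\n", n); // the count includes "." and ".."
for (int i = 0; i < n; i++)
    free(namelist[i]); // each entry is allocated separately
free(namelist);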
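If only the count is needed, a plain readdir(3) loop avoids the allocation and sorting that scandir() performs. The following is a minimal sketch, not a definitive implementation; count_dir_entries is a hypothetical helper name, and the choice to skip "." and ".." is an assumption about what "number of files" should mean. It is still O(n) in the number of entries.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Count the direct entries of `path`, skipping "." and "..".
 * Returns the count, or -1 if the directory cannot be read.
 * count_dir_entries is a hypothetical helper name. */
long count_dir_entries(const char *path)
{
    assert(path != NULL);

    DIR *dp = opendir(path);
    if (dp == NULL)
        return -1;

    long count = 0;
    struct dirent *de;
    errno = 0;
    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, ".") != 0 && strcmp(de->d_name, "..") != 0)
            count++;
    }
    int failed = (errno != 0);   /* readdir returns NULL on both EOF and error */
    closedir(dp);
    return failed ? -1 : count;
}
```

Calling count_dir_entries(".") would give the number of direct members of the current directory, whatever their type (regular files, subdirectories, symlinks and so on).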

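I don't think it is possible in O(1).
Just think about the inode structure: it has no field that holds the number of entries, so there's no clue for this.
But if it's OK to get the number of files in the whole filesystem,
you can use statfs(2) (or its POSIX counterpart, statvfs(3)).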
#include <sys/vfs.h> /* or <sys/statfs.h> */
int statfs(const char *path, struct statfs *buf);
struct statfs {
__SWORD_TYPE f_type; /* type of file system (see below) */
__SWORD_TYPE f_bsize; /* optimal transfer block size */
fsblkcnt_t f_blocks; /* total data blocks in file system */
fsblkcnt_t f_bfree; /* free blocks in fs */
fsblkcnt_t f_bavail; /* free blocks available to
unprivileged user */
fsfilcnt_t f_files; /* total file nodes in file system */
fsfilcnt_t f_ffree; /* free file nodes in fs */
fsid_t f_fsid; /* file system id */
__SWORD_TYPE f_namelen; /* maximum length of filenames */
__SWORD_TYPE f_frsize; /* fragment size (since Linux 2.6) */
__SWORD_TYPE f_spare[5];
};
You can easily get the number of inodes in use (roughly, the number of files in the filesystem) via f_files - f_ffree.
BTW, this is a very interesting question, so I voted it up.

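In a shell script it's simple (this still reads the whole directory, so it is not O(1)). Note that ll is usually an alias for ls -l, whose output starts with a "total" line, so plain ls gives a more accurate count:
ls -A directory_path | wc -l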

Related

Where do we define the type of structure returned by the kernel when using the "perf_event_open" system call using mmap?

I'm trying to use the perf_event_open syscall to get some performance data from the system.
I am currently working on periodic data retrieval using shared memory with a ring buffer.
But I can't find which structure is returned in each record of the ring buffer. The manual page enumerates all the possibilities, but that's all.
I can't figure out which member of the perf_event_attr structure to fill in to control what type of structure will be written to the ring buffer.
If you have some information about that, I'll be happy to read it!
The https://github.com/torvalds/linux/blob/master/tools/perf/design.txt documentation describes the mmaped ring buffer, and perf script / perf script -D can decode ring data once it has been saved as a perf.data file. Some parts of the doc are outdated, but it is still a useful description of the perf_event_open syscall.
The first mmap page is a metadata page; the remaining 2^n pages are filled with events, each of which starts with an 8-byte struct perf_event_header.
Like stated, asynchronous events, like counter overflow or PROT_EXEC
mmap tracking are logged into a ring-buffer. This ring-buffer is
created and accessed through mmap().
The mmap size should be 1+2^n pages, where the first page is a
meta-data page (struct perf_event_mmap_page) that contains various
bits of information such as where the ring-buffer head is.
/*
* Structure of the page that can be mapped via mmap
*/
struct perf_event_mmap_page {
__u32 version; /* version number of this structure */
__u32 compat_version; /* lowest version this is compat with */
...
}
The following 2^n pages are the ring-buffer which contains events of the form:
#define PERF_RECORD_MISC_KERNEL (1 << 0)
#define PERF_RECORD_MISC_USER (1 << 1)
#define PERF_RECORD_MISC_OVERFLOW (1 << 2)
struct perf_event_header {
__u32 type;
__u16 misc;
__u16 size;
};
enum perf_event_type
The design.txt doc has incorrect values for enum perf_event_type; check the actual kernel source of the perf_events subsystem instead: https://github.com/torvalds/linux/blob/master/include/uapi/linux/perf_event.h#L707. That uapi/linux/perf_event.h file also has some hints about the record layout in its comments, like
* #
* # The RAW record below is opaque data wrt the ABI
* #
* # That is, the ABI doesn't make any promises wrt to
* # the stability of its content, it may vary depending
* # on event, hardware, kernel version and phase of
* # the moon.
* #
* # In other words, PERF_SAMPLE_RAW contents are not an ABI.
* #
*
* { u32 size;
* char data[size];}&& PERF_SAMPLE_RAW
*...
* { u64 size;
* char data[size];
* u64 dyn_size; } && PERF_SAMPLE_STACK_USER
*...
PERF_RECORD_SAMPLE = 9,
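To make the record framing concrete, here is a rough sketch of walking a block of record data header by header. It assumes the bytes have already been copied out of the ring into a linear buffer (wrap-around and the data_head/data_tail bookkeeping of the metadata page are deliberately left out), and walk_records is a hypothetical helper name. As far as I can tell, the body that follows a PERF_RECORD_SAMPLE header is determined by the bits set in perf_event_attr.sample_type, which is what the "&& PERF_SAMPLE_..." annotations in the header comments refer to.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <string.h>

/* Walk `len` bytes of already-linearised record data and count the records.
 * If `samples` is non-NULL it receives the number of PERF_RECORD_SAMPLE records.
 * Wrap-around of the real ring buffer is deliberately not handled here.
 * walk_records is a hypothetical helper name. */
size_t walk_records(const unsigned char *data, size_t len, size_t *samples)
{
    assert(data != NULL);

    size_t off = 0, total = 0, nsamples = 0;
    while (len - off >= sizeof(struct perf_event_header)) {
        struct perf_event_header hdr;
        memcpy(&hdr, data + off, sizeof hdr);   /* avoid unaligned access */

        /* hdr.size covers the header plus the payload that follows it. */
        if (hdr.size < sizeof hdr || hdr.size > len - off)
            break;                              /* truncated or corrupt record */

        if (hdr.type == PERF_RECORD_SAMPLE)
            nsamples++;                         /* payload layout depends on attr.sample_type */
        total++;
        off += hdr.size;
    }
    if (samples != NULL)
        *samples = nsamples;
    return total;
}
```

The key point is that hdr.type says which record this is and hdr.size says how far to skip to reach the next one, so unknown record types can always be stepped over.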

How to read file time stamps in a Directory

I want to read the stat of all the files within a directory in C (Linux: Fedora).
I have declared this structure:
struct stat st = {0};
Then I check for the existence of the directory.
if(stat("/home/gadre/Source",&st) == -1)
{
status = mkdir("/home/gadre/Source", 0777);
}
syslog(LOG_INFO, "Source Directory stage completed\n");
where stat is:
struct stat {
dev_t st_dev; /* ID of device containing file */
ino_t st_ino; /* inode number */
mode_t st_mode; /* protection */
nlink_t st_nlink; /* number of hard links */
uid_t st_uid; /* user ID of owner */
gid_t st_gid; /* group ID of owner */
dev_t st_rdev; /* device ID (if special file) */
off_t st_size; /* total size, in bytes */
blksize_t st_blksize; /* blocksize for file system I/O */
blkcnt_t st_blocks; /* number of 512B blocks allocated */
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last modification */
time_t st_ctime; /* time of last status change */
};
Now, once I enter the directory, I want to check the last modification time st_mtime
of each file.
Any ideas what data structure I should be using? Should I store the fds in a list first and then iterate over it, checking each one? What is the most efficient approach?
Thanks.
The generic approach is a plain loop, without any list-like container:
1) Open the directory by its full path, dp = opendir(fullpath), to get the directory pointer.
2) Loop over its entries: while ((dirp = readdir(dp)) != NULL)
3) Get each file name from the dirent structure: dirp->d_name
4) Construct the full path of the file, i.e. something like filepath = fullpath + "/" + dirp->d_name
5) Finally, call lstat on that path to get the timestamp info.
P.S. I would prefer lstat because one of the files in your directory might be a symbolic link; in that case lstat returns the timestamp of the symbolic link itself and not the timestamp of the file it points to.
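The loop described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: print_mtimes is a hypothetical helper name, the 4096-byte path buffer is an arbitrary assumption, and entries that cannot be stat'ed are simply skipped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Print the modification time of every direct entry of `dir`.
 * Returns the number of entries reported, or -1 if `dir` cannot be opened.
 * print_mtimes is a hypothetical helper name. */
int print_mtimes(const char *dir)
{
    assert(dir != NULL);

    DIR *dp = opendir(dir);
    if (dp == NULL)
        return -1;

    int reported = 0;
    struct dirent *dirp;
    while ((dirp = readdir(dp)) != NULL) {
        if (strcmp(dirp->d_name, ".") == 0 || strcmp(dirp->d_name, "..") == 0)
            continue;

        char filepath[4096];
        int n = snprintf(filepath, sizeof filepath, "%s/%s", dir, dirp->d_name);
        if (n < 0 || (size_t)n >= sizeof filepath)
            continue;                       /* path too long, skip it */

        struct stat st;
        if (lstat(filepath, &st) != 0)
            continue;                       /* entry vanished or is unreadable */

        /* st_mtime is printed as seconds since the epoch. */
        printf("%s\t%lld\n", filepath, (long long)st.st_mtime);
        reported++;
    }
    closedir(dp);
    return reported;
}
```

No list of file descriptors is needed: each entry is examined and forgotten within one iteration, so memory use stays constant regardless of directory size.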

Configure kern.log to give more info about a segfault

Currently I can find in kern.log entries like this:
[6516247.445846] ex3.x[30901]: segfault at 0 ip 0000000000400564 sp 00007fff96ecb170 error 6 in ex3.x[400000+1000]
[6516254.095173] ex3.x[30907]: segfault at 0 ip 0000000000400564 sp 00007fff0001dcf0 error 6 in ex3.x[400000+1000]
[6516662.523395] ex3.x[31524]: segfault at 7fff80000000 ip 00007f2e11e4aa79 sp 00007fff807061a0 error 4 in libc-2.13.so[7f2e11dcf000+180000]
(As you can see, the apps causing the segfaults are named ex3.x, meaning "exercise 3 executable".)
Is there a way to ask kern.log to log the complete path? Something like:
[6...] /home/user/cclass/ex3.x[3...]: segfault at 0 ip 0564 sp 07f70 error 6 in ex3.x[4...]
So that I can easily figure out whose (which user's/student's) ex3.x this is?
Thanks!
Beco
That log message comes from the kernel in a fixed format that includes only the executable name, without the path and truncated to 15 characters (TASK_COMM_LEN is 16, including the terminating NUL), as per show_signal_msg; see other relevant lines for the segmentation fault message on non-x86 architectures.
As mentioned by Makyen, without significant changes to the kernel and a recompile, the message given to klogd which is passed to syslog won't have the information you are requesting.
I am not aware of any log transformation or injection functionality in syslog or klogd which would allow you to take the name of the file and run either locate or file on the filesystem in order to find the full path.
The best way to get the information you are looking for is to use crash interception software like apport, abrt or corekeeper. These tools store the process metadata from the /proc filesystem, including the process's command line, which would include the directory it was run from, assuming the binary was run with a full path and was not found via PATH.
The other, more generic way would be to enable core dumps and set /proc/sys/kernel/core_pattern to include %E, so that the core file name includes the path of the binary (with each '/' replaced by '!').
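As an illustration of the kind of /proc metadata such tools can collect, the sketch below resolves the full executable path of a process by reading its /proc/<pid>/exe symlink. It only works while the process still exists, so a crash handler would presumably have to do this at crash time; exe_path_of is a hypothetical helper name and is not taken from apport, abrt or corekeeper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve the full path of the executable of the live process `pid`
 * by reading the /proc/<pid>/exe symlink into `buf` (capacity `cap`).
 * Returns the path length, or -1 on error (process gone, no permission).
 * exe_path_of is a hypothetical helper name. */
ssize_t exe_path_of(pid_t pid, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    char link[64];
    snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);

    ssize_t n = readlink(link, buf, cap - 1);   /* readlink does not NUL-terminate */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

Unlike the command line, the exe link does not depend on how the binary was invoked, so it gives the full path even when the program was started via PATH or a relative path.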
The short answer is: No, it is not possible without making code changes and recompiling the kernel. The normal solution to this problem is to instruct your students to name their executable <student user name>_ex3.x so that you can easily have this information.
However, it is possible to get the information you desire from other methods. Appleman1234 has provided some alternatives in his answer to this question.
How do we know the answer is "Not possible to get the full path in the kern.log segfault messages without recompiling the kernel"?
We look in the kernel source code to find out how the message is produced and if there are any configuration options.
The files in question are part of the kernel source. You can download the entire kernel source as an rpm package (or other type of package) for whatever version of linux/debian you are running from a variety of places.
Specifically, the output that you are seeing is produced from whichever of the following files is for your architecture:
linux/arch/sparc/mm/fault_32.c
linux/arch/sparc/mm/fault_64.c
linux/arch/um/kernel/trap.c
linux/arch/x86/mm/fault.c
An example of the relevant function from one of the files(linux/arch/x86/mm/fault.c):
/*
* Print out info about fatal segfaults, if the show_unhandled_signals
* sysctl is set:
*/
static inline void
show_signal_msg(struct pt_regs *regs, unsigned long error_code,
unsigned long address, struct task_struct *tsk)
{
if (!unhandled_signal(tsk, SIGSEGV))
return;
if (!printk_ratelimit())
return;
printk("%s%s[%d]: segfault at %lx ip %p sp %p error %lx",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->ip, (void *)regs->sp, error_code);
print_vma_addr(KERN_CONT " in ", regs->ip);
printk(KERN_CONT "\n");
}
From that we see that the variable used to print the process name is tsk->comm, where tsk is a struct task_struct *, and that the instruction pointer comes from regs->ip, where regs is a struct pt_regs *.
Then from linux/include/linux/sched.h
struct task_struct {
...
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- initialized normally by setup_new_exec */
The comment makes it clear that the path for the executable is not stored in the structure.
For regs->ip where struct pt_regs *regs, it is defined in whichever of the following are appropriate for your architecture:
arch/arc/include/asm/ptrace.h
arch/arm/include/asm/ptrace.h
arch/arm64/include/asm/ptrace.h
arch/cris/include/arch-v10/arch/ptrace.h
arch/cris/include/arch-v32/arch/ptrace.h
arch/metag/include/asm/ptrace.h
arch/mips/include/asm/ptrace.h
arch/openrisc/include/asm/ptrace.h
arch/um/include/asm/ptrace-generic.h
arch/x86/include/asm/ptrace.h
arch/xtensa/include/asm/ptrace.h
From there we see that struct pt_regs is defining registers for the architecture. ip is just: unsigned long ip;
Thus, we have to look at what print_vma_addr() does. It is defined in mm/memory.c
/*
* Print the name of a VMA.
*/
void print_vma_addr(char *prefix, unsigned long ip)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
/*
* Do not print if we are in atomic
* contexts (in exception stacks, etc.):
*/
if (preempt_count())
return;
down_read(&mm->mmap_sem);
vma = find_vma(mm, ip);
if (vma && vma->vm_file) {
struct file *f = vma->vm_file;
char *buf = (char *)__get_free_page(GFP_KERNEL);
if (buf) {
char *p;
p = d_path(&f->f_path, buf, PAGE_SIZE);
if (IS_ERR(p))
p = "?";
printk("%s%s[%lx+%lx]", prefix, kbasename(p),
vma->vm_start,
vma->vm_end - vma->vm_start);
free_page((unsigned long)buf);
}
}
up_read(&mm->mmap_sem);
}
Which shows us that a path was available. We would need to check that it was the path, but looking a bit further in the code gives a hint that it might not matter. We need to see what kbasename() did with the path that is passed to it. kbasename() is defined in include/linux/string.h as:
/**
* kbasename - return the last part of a pathname.
*
* @path: path to extract the filename from.
*/
static inline const char *kbasename(const char *path)
{
const char *tail = strrchr(path, '/');
return tail ? tail + 1 : path;
}
Which, even if the full path is available prior to it, chops off everything except for the last part of a pathname, leaving the filename.
Thus, no amount of runtime configuration options will permit printing out the full pathname of the file in the segmentation fault messages you are seeing.
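The effect of kbasename() is easy to reproduce in userspace. The sketch below is a copy of the same logic under a hypothetical name (kbasename_demo), using a path in the style of the example from the question:

```c
#include <assert.h>
#include <string.h>

/* Userspace copy of the kernel's kbasename(): return the part of
 * `path` after the last '/', or the whole string if there is none.
 * kbasename_demo is a hypothetical name for this illustration. */
const char *kbasename_demo(const char *path)
{
    assert(path != NULL);
    const char *tail = strrchr(path, '/');
    return tail ? tail + 1 : path;
}
```

Given "/home/user/cclass/ex3.x" this returns a pointer to "ex3.x", which is all that print_vma_addr() passes on to printk.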
NOTE: I've changed all of the links to kernel source to point to archives, rather than the original locations. Those links will get close to the code as it was at the time I wrote this, 2014-09. As should be no surprise, the code does evolve over time, so the code which is current when you're reading this may or may not be similar or behave in the way described here.

Program based on shared memory

I am running the shared memory code given below, but what should I do if I have to take the number of strings and the string patterns from the command line? Subsequently I also have to read the strings and string patterns back from the shared memory region.
Also, what should I do if I have to reverse the strings and store them at the same location?
Please help me with this problem.
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#define SHMSIZE 500 /* Shared memory size chosen by us */
int main(int argc, char *argv[])
{
    int shmid;
    key_t key;
    char *shm;
    key = 5876;
    shmid = shmget(key, SHMSIZE, IPC_CREAT | 0666); /* Create the shared memory segment */
    if (shmid < 0)
    {
        perror("shmget");
        exit(1);
    }
    shm = shmat(shmid, NULL, 0); /* Attach the shared memory segment */
    if (shm == (char *) -1)
    {
        perror("shmat");
        exit(1);
    }
    printf("Memory attached at %p\n", (void *) shm); /* Print the address where the memory is attached */
    snprintf(shm, SHMSIZE, "God is Great"); /* Write a string to the shared memory */
    shmdt(shm); /* Detach the shared memory segment */
    shm = shmat(shmid, (void *) 0x50000000, 0); /* Reattach the segment at a fixed address */
    if (shm == (char *) -1)
    {
        perror("shmat");
        exit(1);
    }
    printf("Memory reattached at %p\n", (void *) shm);
    printf("%s\n", shm); /* Print the stored string */
    return 0;
}
To take input from the user, you need to parse what is passed in through argv, then copy the values into the shared memory region. In your code you can do the following:
snprintf(shm, SHMSIZE, "%s", argv[1]);
to copy the first command-line argument into the shared memory region (check argc first, and never pass argv[1] as the format string itself). To reverse the string, copy it from shared memory into a local variable, reverse it and finally write it back into the shared memory region from your client code. Since you've created the segment with 0666 permissions, the client is allowed to write to it.
Take a look here in case you need to understand the concept properly (http://www.cs.cf.ac.uk/Dave/C/node27.html)
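The copy-and-reverse step could look roughly like the sketch below. It reverses the string in place rather than through a temporary copy, which is a swapped-in simplification of the approach described above; reverse_in_place and store_reversed are hypothetical helper names, and the region pointer stands in for the address returned by shmat().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reverse the NUL-terminated string `s` in place. */
void reverse_in_place(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len < 2)
        return;
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}

/* Copy `src` into the region `shm` of `cap` bytes, then reverse it there.
 * Returns 0 on success, -1 if `src` (plus its NUL) does not fit.
 * store_reversed is a hypothetical helper name. */
int store_reversed(char *shm, size_t cap, const char *src)
{
    assert(shm != NULL && src != NULL);
    if (strlen(src) >= cap)
        return -1;
    snprintf(shm, cap, "%s", src);   /* src is data, never the format string */
    reverse_in_place(shm);
    return 0;
}
```

In the program from the question this would be called as store_reversed(shm, SHMSIZE, argv[1]) after checking that argc > 1. Storing several strings would additionally need some layout convention inside the segment (for example fixed-size slots), which is left out here.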

How to find definition of structure when reading c program on linux?

I am reading the source code of xl2tpd and am running into lots of problems. For example, I cannot find where the structure lac is defined. How do I find the definition of this structure?
I have used ctags and vim to read this code, but failed to find the structure. I googled and could not find it either. Is there any method that can make the code reading process more comfortable, so that I can jump to the definition of most variables, functions and structures?
Try cscope with vim. Follow the steps below:
1) Run cscope -R in the xl2tpd directory. It will create the file cscope.out.
2) With vim, open a file where the structure lac is used.
3) Use :cs f g lac. It will show the files where lac is defined.
4) Choose file.h. It contains the definition.
If you are particularly interested in the definition of struct lac, it is below:
struct lac
{
struct lac *next;
struct host *lns; /* LNS's we can connect to */
struct schedule_entry *rsched;
int tun_rws; /* Receive window size (tunnel) */
int call_rws; /* Call rws */
int rxspeed; /* Tunnel rx speed */
int txspeed; /* Tunnel tx speed */
int active; /* Is this connection in active use? */
int hbit; /* Permit hidden AVP's? */
int lbit; /* Use the length field? */
int challenge; /* Challenge authenticate the peer? */
unsigned int localaddr; /* Local IP address */
unsigned int remoteaddr; /* Force remote address to this */
char authname[STRLEN]; /* Who we authenticate as */
char password[STRLEN]; /* Password to authenticate with */
char peername[STRLEN]; /* Force peer name to this */
char hostname[STRLEN]; /* Hostname to report */
char entname[STRLEN]; /* Name of this entry */
int authpeer; /* Authenticate our peer? */
int authself; /* Authenticate ourselves? */
int pap_require; /* Require PAP auth for PPP */
int chap_require; /* Require CHAP auth for PPP */
int pap_refuse; /* Refuse PAP authentication for us */
int chap_refuse; /* Refuse CHAP authentication for us */
int idle; /* Idle timeout in seconds */
int autodial; /* Try to dial immediately? */
int defaultroute; /* Use as default route? */
int redial; /* Redial if disconnected */
int rmax; /* Maximum # of consecutive redials */
int rtries; /* # of tries so far */
int rtimeout; /* Redial every this many # of seconds */
char pppoptfile[STRLEN]; /* File containing PPP options */
int debug;
struct tunnel *t; /* Our tunnel */
struct call *c; /* Our call */
};
When going through third-party code, there are a few tools that I have found invaluable:
Source Navigator
lxr
ctags
and, of course, the oldest and greatest of all: grep
I believe that the Eclipse CDT also allows you to quickly find the definition of any variable you are looking at, but I have not actually used it - I prefer using console programs for my actual C coding.
None of those are vim-based, although at least ctags can be used via vim or emacs. Nevertheless, they can be very useful when exploring a new codebase that you know nothing about...
Are you talking about this?
The source code already comes with a tags file.
Loading any file (common.h in my case) in Vim you can use :tag lac to jump to the first definition of lac or :tselect lac to choose between the 3 occurrences in this project and :tag gconfig to jump to the unique definition of gconfig.
See :help tags.
I'm using vim + cscope and had the same issue as you. I found a way to work around it.
In vim, search for the text instead of the definition. For example, in the Linux kernel source code, if you're trying to find "struct file",
use this command:
cs find t struct file {
You will get an accurate definition quickly in most cases. Take care: no quotation marks around the text "struct file {".
Hope it helps.
