I am implementing a virtual filesystem using the fuse, and need some understanding regarding the offset parameter in readdir.
Earlier we were ignoring the offset and passing 0 in the filler function, in which case the kernel should take care.
Our filesystem database, is storing: directory name, filelength, inode number and parent inode number.
How do i calculate get the offset?
Then is the offset of each components, equal to their size sorted in incremental form of their inode number? What happens is there is a directory inside a directory, is the offset in that case equal to the sum of the files inside?
Example: in case the dir listing is - a.txt b.txt c.txt
And inode number of a.txt=3, b.txt=5, c.txt=7
Offset of a.txt= directory offset
Offset of b.txt=dir offset + size of a.txt
Offset of c.txt=dir offset + size of b.txt
Is the above assumption correct?
P.S: Here are the callbacks of fuse
The selected answer is not correct
Despite the lack of upvotes on this answer, this is the correct answer. Cracking into the format of the void buffer should be discouraged, and that's the intent behind declaring such things void in C code - you shouldn't write code that assumes knowledge of the format of the data behind void pointers, use whatever API is provided properly instead.
The code below is very simple and straightforward, as it should be. No knowledge of the format of the Fuse buffer is required.
Fictitious API
This is a contrived example of what some device's API could look
like. This is not part of Fuse.
// get_some_file_names() -
// returns a struct with buffers holding the names of files.
// PARAMETERS
// * path - A path of some sort that the fictitious device groks.
// * offset - Where in the list of file names to start.
// RETURNS
// * A name_list, it has some char buffers holding the file names
// and a couple other auxiliary vars.
//
name_list *get_some_file_names(char *path, size_t offset);
Listing the files in parts
Here's a Fuse callback that can be registered with the Fuse system to
list the filenames provided by get_some_file_names(). It's arbitrarily named readdir_callback() so its purpose is obvious.
int readdir_callback( char *path,
void *buf, // This is meant to be "opaque".
fuse_fill_dir_t *filler, // filler takes care of buf.
off_t off, // Last value given to filler.
struct fuse_file_info *fi )
{
// Call the fictitious API to get a list of file names.
name_list *list = get_some_file_names(path, off);
for (int i = 0; i < list->length; i++)
{
// Feed the file names to filler() one at a time.
if (filler(buf, list->names[i], NULL, off + i + 1))
{
break; // filler() returned 1, requesting a break.
}
incr_num_files_listed(list);
}
if (all_files_listed(list))
{
return 1; // Tell Fuse we're done.
}
return 0;
}
The off (offset) value is not used by the filler function to fill its opaque buffer, buf. The off value is, however, meaningful to the callback as an offset base as it provides file names to filler(). Whatever value was last passed to filler() is what gets passed back to readdir_callback() on its next invocation. filler()
itself only cares whether the off value is 0 or not-0.
Indicating "I'm done listing!" to Fuse
To signal to the Fuse system that your readdir_callback() is done listing file names in parts (when the last of the list of names has been given to filler()), simply return 1 from it.
How off Is Used
The off, offset, parameter should be non-0 to perform the partial listings. That's its only requirement as far as filler() is concerned. If off is 0, that indicates to Fuse that you're going to do a full listing in one shot (see below).
Although filler() doesn't care what the off value is beyond it being non-0, the value can still be meaningfully used. The code above is using the index of the next item in its own file list as its value. Fuse will keep passing the last off value it received back to the read dir callback on each invocation until the listing is complete (when readdir_callback() returns 1).
Listing the files all at once
int readdir_callback( char *path,
void *buf,
fuse_fill_dir_t *filler,
off_t off,
struct fuse_file_info *fi )
{
name_list *list = get_all_file_names(path);
for (int i = 0; i < list->length; i++)
{
filler(buf, list->names[i], NULL, 0);
}
return 0;
}
Listing all the files in one shot, as above, is simpler - but not by much. Note that off is 0 for the full listing. One may wonder, 'why even bother with the first approach of reading the folder contents in parts?'
The in-parts strategy is useful where a set number of buffers for file names is allocated, and the number of files within folders may exceed this number. For instance, the implementation of name_list above may only have 8 allocated buffers (char names[8][256]). Also, buf may fill up and filler() start returning 1 if too many names are given at once. The first approach avoids this.
The offset passed to the filler function is the offset of the next item in the directory. You can have the entries in the directory in any order you want. If you don't want to return an entire directory at once, you need to use the offset to determine what gets asked for and stored. The order of items in the directory is up to you, and doesn't matter what order the names or inodes or anything else is.
Specifically, in the readdir call, you are passed an offset. You want to start calling the filler function with entries that will be at this callback or later. In the simplest case, the length of each entry is 24 bytes + strlen(name of entry), rounded up to the nearest multiple of 8 bytes. However, see the fuse source code at http://sourceforge.net/projects/fuse/ for when this might not be the case.
I have a simple example, where I have a loop (pseudo c-code) in my readdir function:
int my_readdir(const char *path, void *buf, fuse_fill_dir_t filler, off_t offset, struct fuse_file_info *fi)
{
(a bunch of prep work has been omitted)
struct stat st;
int off, nextoff=0, lenentry, i;
char namebuf[(long enough for any one name)];
for (i=0; i<NumDirectoryEntries; i++)
{
(fill st with the stat information, including inode, etc.)
(fill namebuf with the name of the directory entry)
lenentry = ((24+strlen(namebuf)+7)&~7);
off = nextoff; /* offset of this entry */
nextoff += lenentry;
/* Skip this entry if we weren't asked for it */
if (off<offset)
continue;
/* Add this to our response until we are asked to stop */
if (filler(buf, namebuf, &st, nextoff))
break;
}
/* All done because we were asked to stop or because we finished */
return 0;
}
I tested this within my own code (I had never used the offset before), and it works fine.
I would like to know if these 2 headers have the same meaning nor why?
From wikipedia :
offset 06h : Set to 1 for the original version of ELF.
offset 14h : Set to 1 for the original version of ELF.
reference : http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
You may want to read a more detailed document which is likely to include the information you're looking for:
http://www.skyfree.org/linux/references/ELF_Format.pdf
The header structure
#define EINIDENT 16
typedefstruct{
unsigned char e_ident[EINIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
} Elf32Ehdr;
The 2nd e_version which defines the version as 1 (i.e. "current")
e_version This member identifies the object file version.
Name Value Meaning
EV_NONE 0 Invalid version
EV_CURRENT 1 Current version
The value 1 signifies the original file format; extensions will
create new versions with higher numbers. The value of EV_CURRENT,
though given as 1 above, will change as necessary to reflect the
current version number.
The version in the e_ident part is also EV_CURRENT, so exactly the same version:
EI_VERSION Byte e_ident[EI_VERSION] specifies the ELF header version
number. Currently, this value must be EV_CURRENT, as
explained above for e_version.
From what I understand, I would say that the version has not changed yet so it is still 1 in both places, but that could change in the future...
As I know, a ELF object consists of a number of segments, each of which has a corresponding program header describing the segment. In libelf, a program header is defined as a Elf64_Phdr (or Elf32_Phdr) structure, and a Elf64_Phdr structure is defined like this:
typedef struct {
Elf32_Word p_type; /* Segment type */
Elf32_Off p_offset; /* Segment file offset */
Elf32_Addr p_vaddr; /* Segment virtual address */
Elf32_Addr p_paddr; /* Segment physical address */
Elf32_Word p_filesz; /* Segment size in file */
Elf32_Word p_memsz; /* Segment size in memory */
Elf32_Word p_flags; /* Segment flags */
Elf32_Word p_align; /* Segment alignment */
} Elf32_Phdr;
However, segments have names (don't they?) and Elf64_Phdr structures don't have a field which points to their corresponding names. So, how to get a name of a segment of an ELF file from its corresponding program header? Or is the p_type field enough to identify a segment, so that segments don't have names?
However, segments have names (don't they?)
No, they don't.
Or is the p_type field enough to identify a segment, so that segments don't have names?
Correct.
ok...
so im suppose to write a program that prints all of the sections name in an elf file using only mmap (thats not important...)
so what i did so far is this -
maped the file into the stat structure =
map_start = mmap(0, fd_stat.st_size, PROT_READ | PROT_WRITE , MAP_SHARED, fd, 0)) <0 )
casted it into the write format from the starting point i got =
header = (Elf32_Ehdr *) map_start;
gotten the section header offset from the file =
secoff = header->e_shoff;
now - i know i need to go to the map_start+secoff location - that will give me the section table, and the sh_name will give me an index for the string table...
how to i go to the sting table?
how is it represented?
how do i use it? and is the value in sh_name the index in the string table (if it is represented as an array) , or an offset..
anyway - lets say i want to print the first two section's name - how do i do it givven the code i wrote above
help please?
header = (Elf32_Ehdr *) map_start;
secoff = header->e_shoff;
This is probably wrong. Unless the Elf32_Ehdr structure is explicitly declared __attribute__((packed)), the compiler will eventually insert padding between the members of the structure, so sizeof(Elf32_Ehdr) != (the actual size of an ELF header section). Why not simply use the libelf accessor functions instead?
Update: if you're not allowed to use accessor functions, you'll have to do something like this:
Elf32_Ehdr hdr;
memcpy(&hdr.e_ident, map_start + 0, EI_NIDENT);
memcpy(&hdr.e.type, map_start + 0 + sizeof(Elf32_half), sizeof(Elf32_Half));
et cetera.
I used readelf on several binaries on my linux box and saw something that surprised me in the program headers. This eample is from the 'ld' utility, but it also occurs with anything I compile with gcc.
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
This segment spans the entirety of the program headers. Why is is marked as executable? It doesn't contain machine code. But also, why is even this present in the headers? I don't really want it in my program image.
The PHDR pointing to the PHDRs tells the loader that the PHDRs themselves should be mapped to the process address space, in order to make them accessible to the program itself.
This is useful mainly for dynamic linking.
The reason the memory is marked as executable is because the PHDRs are smaller than one page, and live right next to the start of the executable code. If the permissions for the PHDRs were different from those of the program text, the linker would have to insert padding between them.
The main File ELF headers are there to easily find the offset in the file where other sections are stored. Then each subheader describes the data in it's section.
Main ELF header looks like this:
/* ELF File Header */
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf32_Half e_type; /* Object file type */
Elf32_Half e_machine; /* Architecture */
Elf32_Word e_version; /* Object file version */
Elf32_Addr e_entry; /* Entry point virtual address */
Elf32_Off e_phoff; /* Program header table file offset */
Elf32_Off e_shoff; /* Section header table file offset */
Elf32_Word e_flags; /* Processor-specific flags */
Elf32_Half e_ehsize; /* ELF header size in bytes */
Elf32_Half e_phentsize; /* Program header table entry size */
Elf32_Half e_phnum; /* Program header table entry count */
Elf32_Half e_shentsize; /* Section header table entry size */
Elf32_Half e_shnum; /* Section header table entry count */
Elf32_Half e_shstrndx; /* Section header string table index */
} Elf32_Ehdr;
The program header(s) are there because they describe the executable parts of the ELF executable.
The next portion of the program are
the ELF program headers. These
describe the sections of the program
that contain executable program code
to get mapped into the program address
space as it loads.
/* Program segment header. */
typedef struct
{
Elf32_Word p_type; /* Segment type */
Elf32_Off p_offset; /* Segment file offset */
Elf32_Addr p_vaddr; /* Segment virtual address */
Elf32_Addr p_paddr; /* Segment physical address */
Elf32_Word p_filesz; /* Segment size in file */
Elf32_Word p_memsz; /* Segment size in memory */
Elf32_Word p_flags; /* Segment flags */
Elf32_Word p_align; /* Segment alignment */
} Elf32_Phdr;
This is taken from here