Logical block number to address (Linux filesystem) - linux

int block_count;
struct stat statBuf;
int block;
int fd = open("file.txt", O_RDONLY);
fstat(fd, &statBuf);
block_count = (statBuf.st_size + statBuf.st_blksize - 1) / statBuf.st_blksize;
int i,add;
for(i = 0; i < block_count; i++) {
block = i;
if (ioctl(fd, FIBMAP, &block)) {
perror("FIBMAP ioctl failed");
}
printf("%3d %10d\n", i, block);
add = block;
}
char buffer[255];
int fd2 = open("/dev/sda1", O_RDONLY);
lseek(fd2, add, SEEK_SET);
read(fd2, buffer, 20);
printf("%s%s\n","ss ",buffer);
Output:
0 5038060
1 5038061
2 5038062
3 5038063
4 5038064
5 5038065
ss
I am using the above code to get the logical block number of a file. Lets suppose I want to read the contents of the last block number, How would I do that?
Is there a way to get the address of a block from logical block number?
PS: I am using linux and filesystem is ext4

The above code actually is getting you the physical block number. The FIBMAP ioctl takes as input the logical block number, and returns the physical block number. You can then multiply that by the blocksize (which is 4k for most ext4 file systems, but you can get that by using the BLKBSZGET ioctl) to get the byte offset if you want to read that disk block using lseek and read.
Note that the more modern interface that people tend to use today is the FIEMAP interface. This doesn't require root, and returns the physical byte offset, plus a lot more data. For more information, please see:
https://www.kernel.org/doc/Documentation/filesystems/fiemap.txt
Or you can look at the source code for the filefrag command, which is part of e2fsprogs:
https://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/misc/filefrag.c

Related

How to pass efficiently the hugepages-backed buffer to the BM DMA device in Linux?

I need to provide a huge circular buffer (a few GB) for the bus-mastering DMA PCIe device implemented in FPGA.
The buffers should not be reserved at the boot time. Therefore, the buffer may be not contiguous.
The device supports scatter-gather (SG) operation, but for performance reasons, the addresses and lengths of consecutive contiguous segments of the buffer are stored inside the FPGA.
Therefore, usage of standard 4KB pages is not acceptable (there would be up to 262144 segments for each 1GB of the buffer).
The right solution should allocate the buffer consisting of 2MB hugepages in the user space (reducing the maximum number of segments by factor of 512).
The virtual address of the buffer should be transferred to the kernel driver via ioctl. Then the addresses and the length of the segments should be calculated and written to the FPGA.
In theory, I could use get_user_pages to create the list of the pages, and then call sg_alloc_table_from_pages to obtain the SG list suitable to program the DMA engine in FPGA.
Unfortunately, in this approach I must prepare the intermediate list of page structures with length of 262144 pages per 1GB of the buffer. This list is stored in RAM, not in the FPGA, so it is less problematic, but anyway it would be good to avoid it.
In fact I don't need to keep the pages maped for the kernel, as the hugepages are protected against swapping out, and they are mapped for the user space application that will process the received data.
So what I'm looking for is a function sg_alloc_table_from_user_hugepages, that could take such a user-space address of the hugepages-based memory buffer, and transfer it directly into the right scatterlist, without performing unnecessary and memory-consuming mapping for the kernel.
Of course such a function should verify that the buffer indeed consists of hugepages.
I have found and read these posts: (A), (B), but couldn't find a good answer.
Is there any official method to do it in the current Linux kernel?
At the moment I have a very inefficient solution based on get_user_pages_fast:
int sgt_prepare(const char __user *buf, size_t count,
struct sg_table * sgt, struct page *** a_pages,
int * a_n_pages)
{
int res = 0;
int n_pages;
struct page ** pages = NULL;
const unsigned long offset = ((unsigned long)buf) & (PAGE_SIZE-1);
//Calculate number of pages
n_pages = (offset + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
printk(KERN_ALERT "n_pages: %d",n_pages);
//Allocate the table for pages
pages = vzalloc(sizeof(* pages) * n_pages);
printk(KERN_ALERT "pages: %p",pages);
if(pages == NULL) {
res = -ENOMEM;
goto sglm_err1;
}
//Now pin the pages
res = get_user_pages_fast(((unsigned long)buf & PAGE_MASK), n_pages, 0, pages);
printk(KERN_ALERT "gupf: %d",res);
if(res < n_pages) {
int i;
for(i=0; i<res; i++)
put_page(pages[i]);
res = -ENOMEM;
goto sglm_err1;
}
//Now create the sg-list
res = sg_alloc_table_from_pages(sgt, pages, n_pages, offset, count, GFP_KERNEL);
printk(KERN_ALERT "satf: %d",res);
if(res < 0)
goto sglm_err2;
*a_pages = pages;
*a_n_pages = n_pages;
return res;
sglm_err2:
//Here we jump if we know that the pages are pinned
{
int i;
for(i=0; i<n_pages; i++)
put_page(pages[i]);
}
sglm_err1:
if(sgt) sg_free_table(sgt);
if(pages) kfree(pages);
* a_pages = NULL;
* a_n_pages = 0;
return res;
}
void sgt_destroy(struct sg_table * sgt, struct page ** pages, int n_pages)
{
int i;
//Free the sg list
if(sgt->sgl)
sg_free_table(sgt);
//Unpin pages
for(i=0; i < n_pages; i++) {
set_page_dirty(pages[i]);
put_page(pages[i]);
}
}
The sgt_prepare function builds the sg_table sgt structure that i can use to create the DMA mapping. I have verified that it contains the number of entries equal to the number of hugepages used.
Unfortunately, it requires that the list of the pages is created (allocated and returned via the a_pages pointer argument), and kept as long as the buffer is used.
Therefore, I really dislike that solution. Now I have 256 2MB hugepages used as a DMA buffer. It means that I have to create and keeep unnecessary 128*1024 page structures. I also waste 512 MB of kernel address space for unnecessary kernel mapping.
The interesting question is if the a_pages may be kept only temporarily (until the sg-list is created)? In theory it should be possible, as the pages are still locked...

Write data to an SD card through a buffer without a race condition

I am writing firmware for a data logging device. It reads data from sensors at 20 Hz and writes data to an SD card. However, the time to write data to SD card is not consistent (about 200-300 ms). Thus one solution is writing data to a buffer at a consistent rate (using a timer interrupt), and have a second thread that writes data to the SD card when the buffer is full.
Here is my naive implementation:
#define N 64
char buffer[N];
int count;
ISR() {
if (count < N) {
char a = analogRead(A0);
buffer[count] = a;
count = count + 1;
}
}
void loop() {
if (count == N) {
myFile.open("data.csv", FILE_WRITE);
int i = 0;
for (i = 0; i < N; i++) {
myFile.print(buffer[i]);
}
myFile.close();
count = 0;
}
}
The code has the following problems:
Writing data to the SD card is blocking reading when the buffer is full
It might have a race conditions.
What is the best way to solve this problem? Using a circular buffer, or double buffering? How do I ensure that a race condition does not happen?
You have rather answered your own question; you should use either double buffering or a circular buffer. Double-buffering is probably simpler to implement and appropriate for devices such as an SD card for which block-writes are generally more efficient.
Buffer length selection may need some consideration; generally you would make the buffer the same as the SD sector buffer size (typically 512 bytes), but that may not be practical, and with a sample rate as low as 20 sps, optimising SD write performance is perhaps not an issue.
Another consideration is that you need to match your sample rate to the file-system latency by selecting an appropriate buffer size. In this case the 64 sample buffer buffer will fill in a little more than three seconds, but the block write takes only up-to 300 ms - so you could use a much smaller buffer if required - 8 samples would be sufficient - although be careful, you may have observed latency of 300 ms, but it may be larger when specific boundaries are crossed in the physical flash memory - I have seen significant latency on some cards at 1 Mbyte boundaries - moreover card performance varies significantly between sizes and manufacturers.
An adaptation of your implementation with double-buffering is below. I have reduced the buffer length to 32 samples, but with double-buffering the total is unchanged at 64, but the write lag is reduced to 1.6 seconds.
// Double buffer and its management data/constants
static volatile char buffer[2][32];
static const int BUFFLEN = sizeof(buffer[0]);
static const unsigned char EMPTY = 0xff;
static volatile unsigned char inbuffer = 0;
static volatile unsigned char outbuffer = EMPTY;
ISR()
{
static int count = 0;
// Write to current buffer
char a = analogRead(A0);
buffer[inbuffer][count] = a;
count++ ;
// If buffer full...
if( count >= BUFFLEN )
{
// Signal to loop() that data available (not EMPTY)
outbuffer = inbuffer;
// Toggle input buffer
inbuffer = inbuffer == 0 ? 1 : 0;
count = 0;
}
}
void loop()
{
// If buffer available...
if( outbuffer != EMPTY )
{
// Write buffer
myFile.open("data.csv", FILE_WRITE);
for( int i = 0; i < BUFFLEN; i++)
{
myFile.print(buffer[outbuffer][i]);
}
myFile.close();
// Set the buffer to empty
outbuffer = EMPTY;
}
}
Note the use of volatile and unsigned char for the shared data. It is important that data shared between concurrent execution contexts is accessed explicitly and atomically; access to an int on 8-bit AVR based Arduino requires multiple machine instructions and the interrupt may occur part way through a read/write in loop() and cause an incorrect value to be read.

ioctl() call resets file descriptor to 0

Consider the following code:
file_fd = open(device, O_RDWR);
if (file_fd < 0) {
perror("open");
return -1;
}
printf("File descriptor: %d\n", file_fd);
uint32_t DskSize;
if (ioctl(file_fd, BLKGETSIZE, &DskSize) < 0) {
perror("ioctl");
return -1;
}
printf("File descriptor after: %d\n", file_fd);
This snippet yields this:
File descriptor: 3
File descriptor after: 0
Why does my file descriptor get reset to 0? The program writes the stuff out to stdout instead of my block device.
This should not happen. I expect my file_fd to be non-zero and retain its value.
Looks like you smash your stack.
Since there are only two stack variables file_fd and DskSize and changing DskSize changes file_fd suggests that DiskSize must be unsigned long or size_t (a 64-bit value), not uint32_t.
Looking at BLKGETSIZE implementation confirms that the value type is unsigned long.
You may like to run your applications under valgrind, it reports this kind of errors.

Accessing another process virtual memory in Linux (debugging)

How does gdb access another process virtual memory on Linux? Is it all done via /proc?
How does gdb access another process virtual memory on Linux? Is it all done via /proc?
On Linux for reading memory:
1) If the number of bytes to read is fewer than 3 * sizeof (long) or the filesystem /proc is unavailable or reading from /proc/PID/mem is unsuccessful then ptrace is used with PTRACE_PEEKTEXT to read data.
These are these conditions in the function linux_proc_xfer_partial():
/* Don't bother for one word. */
if (len < 3 * sizeof (long))
return 0;
/* We could keep this file open and cache it - possibly one per
thread. That requires some juggling, but is even faster. */
xsnprintf (filename, sizeof filename, "/proc/%d/mem",
ptid_get_pid (inferior_ptid));
fd = gdb_open_cloexec (filename, O_RDONLY | O_LARGEFILE, 0);
if (fd == -1)
return 0;
2) If the number of bytes to read is greater or equal to 3 * sizeof (long) and /proc is available then pread64 or (lseek() and read() are used:
static LONGEST
linux_proc_xfer_partial (struct target_ops *ops, enum target_object object,
const char *annex, gdb_byte *readbuf,
const gdb_byte *writebuf,
ULONGEST offset, LONGEST len)
{
.....
/* If pread64 is available, use it. It's faster if the kernel
supports it (only one syscall), and it's 64-bit safe even on
32-bit platforms (for instance, SPARC debugging a SPARC64
application). */
#ifdef HAVE_PREAD64
if (pread64 (fd, readbuf, len, offset) != len)
#else
if (lseek (fd, offset, SEEK_SET) == -1 || read (fd, readbuf, len) != len)
#endif
ret = 0;
else
ret = len;
close (fd);
return ret;
}
On Linux for writing memory:
1) ptrace with PTRACE_POKETEXT or PTRACE_POKEDATA is used.
As for your second question:
where can I find information about ... setting hardware watchpoints
gdb, Internals Watchpoint:s http://sourceware.org/gdb/wiki/Internals%20Watchpoints
Reference:
http://linux.die.net/man/2/ptrace
http://www.alexonlinux.com/how-debugger-works

Reading boot sector of a FAT32 file system

I am writing a program and I need to access some information in the boot sector about the FAT32 file system that I have mounted.
Here is what I did, fully commented.
int main (void) {
unsigned short *i; //2 byte unsigned integer pointer
char tmp[1024]; //buffer
int fd; //file descriptor from opening device
fd = open("/dev/sdf", O_RDONLY); //open the device file
lseek(fd, 14, SEEK_SET); //set offset to 14 (0x0E), i.e. storing "no. of reserved sectors" of the FAT32 system
read(fd, tmp, 2); //read off 2 bytes, "no. of reserved sectors" is represented by 2 bytes
i = &tmp; //point j at those 2 bytes
printf("*j: %d\n", *j); //print *j out
close(fd); //close device
return 0;
}
The output of *i is 38, which is nonsense. I formatted the file system with mkfs.vfat. I set the "no. of reserved sectors" to 32.
What I have tried:
i = (unsigned short *) &tmp, do a casting, this removes the warning when I compile, but doesn't help
read(fd, tmp, 512), load the whole boot sector (512 bytes) into tmp, then read from the buffer, but doesn't help, result still 38 though.
fiddle with the offset, i.e. change 14 to 13 or 15, in case I get the index wrong. It prints out 9744 for 13 and 512 for 15 respectively, so doesn't work.
I'm not sure whether I'm doing it the right way. Can someone please point me in the right direction?
Thanks in advance,
Felastine.
Try running:
$ dd if=/dev/sdf of=/tmp/x.bin bs=512 count=1
And then:
$ hd /tmp/x.bin
Or
$ od -tx2 /tmp/x.bin
And post the first lines.
Chances are that your fattools are adding 6 extra reserved sectors of their own. And then they substract them before showing the data.
unsigned short *i; //2 byte unsigned integer pointer
char tmp[1024];
[...]
i = &tmp; //point j at those 2 bytes
tmp is char[], &tmp something of order char**.
Think again, you don't want the & here.

Resources