Get the volume serial number of a UDF CD/DVD disc? - linux

I'm writing a program on Linux that computes the volume serial number (xxxx-xxxx) of a CD as Windows 7 reports it. My program correctly determines the volume serial number for discs with the ISO 9660 and Joliet file systems. But how do I determine the volume serial number for a disc with a UDF file system? Can someone tell me?
P.S. In case it is unclear, I'm talking about a serial number of this kind: https://extra-torrent.jimdo.com/2016/01/23/hard-disk-volume-serial-number-change/
#include <QCoreApplication>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/cdrom.h>
#include <string.h>
#include <szi/szimac.h>
#include <qfile.h>
#include <iostream>
#include <QDir>
#include <unistd.h>
#define SEC_SIZE 2048
#define VD_N 16
#define VD_TYPE_SUPP 2
#define VD_TYPE_END 255
#define ESC_IDX 88
#define ESC_LEN 3
#define ESC_UCS2L1 "%/#"
#define ESC_UCS2L2 "%/C"
#define ESC_UCS2L3 "%/E"
using namespace std;
int cdid(unsigned char pvd[SEC_SIZE])
{
unsigned char part[4] = {0};
int i;
for(i = 0; i < SEC_SIZE; i += 4)
{
part[3] += pvd[i + 0];
part[2] += pvd[i + 1];
part[1] += pvd[i + 2];
part[0] += pvd[i + 3];
}
return (part[3] << 24) + (part[2] << 16) + (part[1] << 8) + part[0];
}
int main(int argc, char *argv[])
{
FILE *in;
unsigned char buf[SEC_SIZE];
struct cdrom_multisession msinfo;
long session_start;
int id;
QString home=QString(getenv("HOME"))+QString("/chteniestorm");
QFile file(home);
QString ustr = "/dev/sr0"; /* device path; ustr was never declared in the original */
in = fopen(ustr.toLocal8Bit().data(), "rb");
if(in == NULL)
{
if (file.open(QIODevice::WriteOnly))
{
file.write("sernom=1");
file.close();
}
cout<<"netdiska"<<endl;
return 0;
}
/* Get session info */
msinfo.addr_format = CDROM_LBA;
if(ioctl(fileno(in), CDROMMULTISESSION, &msinfo) != 0)
{
fprintf(stderr, "WARNING: Can't get multisession info\n");
perror(NULL);
session_start = 0;
}
else
{
session_start = msinfo.addr.lba;
}
fseek(in, 0, SEEK_SET); // to the beginning
/* Seek to primary volume descriptor */
if(fseek(in, (session_start + VD_N) * SEC_SIZE, SEEK_SET) != 0)
{
if (file.open(QIODevice::WriteOnly))
{
file.write("sernom=2");
file.close();
}
fclose(in);
return 0;
}
/* Read descriptor */
if(fread(buf, 1, SEC_SIZE, in) != SEC_SIZE)
{
if (file.open(QIODevice::WriteOnly))
{
file.write("sernom=3");
file.close();
}
fclose(in);
return 0;
}
/* Calculate disc id */
id = cdid(buf);
/* Search for Joliet extension */
while(buf[0] != VD_TYPE_END)
{
/* Read descriptor */
if(fread(buf, 1, SEC_SIZE, in) != SEC_SIZE)
{
perror(NULL);
return 0;
}
if(buf[0] == VD_TYPE_SUPP
&& (memcmp(buf + ESC_IDX, ESC_UCS2L1, ESC_LEN) == 0
|| memcmp(buf + ESC_IDX, ESC_UCS2L2, ESC_LEN) == 0
|| memcmp(buf + ESC_IDX, ESC_UCS2L3, ESC_LEN) == 0)
)
{
/* Joliet found */
id = cdid(buf);
}
}
fclose(in);
}

It looks like this question has been asked in several places [1], [2], [3], [4], but it has not been answered anywhere yet. So I will answer it here.
In some of those posts, people decoded the serial number generation algorithm. It is just a checksum, which you have already found and put into your cdid() function. The same checksum algorithm is used for both ISO9660 and UDF filesystems on Windows. You have already figured out which ISO9660 structures the checksum is calculated from.
So your question remains only for the UDF filesystem. For UDF on Windows, the checksum is calculated from the 512-byte File Set Descriptor (FSD) structure. I suggest reading the OSTA UDF specification to learn how to locate the FSD on a UDF disc.
Basically, for plain UDF that does not use a Virtual Allocation Table (VAT), Sparing Table or Metadata Partition, the location of the FSD is stored in the Logical Volume Descriptor (LVD) structure, in the field LogicalVolumeContentsUse (which is of type long_ad). The LVD is stored in the Volume Descriptor Sequence (VDS). The location of the VDS is stored in the Anchor Volume Descriptor Pointer (AVDP), in the field MainVolumeDescriptorSequenceExtent. The AVDP itself is located at sector 256 of the medium. Optical media have a sector size of 2048 bytes, and common hard disks 512 bytes.
For UDF with a VAT (e.g. on CD-R/DVD-R/BD-R), a Sparing Table (e.g. on CD-RW/DVD-RW) or a Metadata Partition (e.g. on Blu-ray), it is much more complicated. You need to look into the Virtual, Sparable or Metadata Partition to figure out how to translate the logical location of the FSD to a physical location on the medium.
The udftools project, starting with version 2.0, includes a new tool, udfinfo, which provides various information about a UDF filesystem. It also shows the Windows-specific Volume Serial Number from your question under the winserialnum key. Note that udfinfo cannot yet read the FSD from a UDF filesystem with VAT or Metadata.

Related

Why does making concurrent random writes to a single file on an NVMe SSD not lead to throughput increases?

I've been experimenting with a random write workload where I use multiple threads to write to disjoint offsets within one or more files on an NVMe SSD. I'm using a Linux machine and the writes are synchronous and are made using direct I/O (i.e., the files are opened with O_DSYNC and O_DIRECT).
I noticed that if the threads write concurrently to a single file, the achieved write throughput does not increase when the number of threads increases (i.e., the writes appear to be applied serially and not in parallel). However if each thread writes to its own file, I do get throughput increases (up to the SSD's manufacturer-advertised random write throughput). See the graph below for my throughput measurements.
I was wondering if anyone knows why I'm not able to get throughput increases if I have multiple threads concurrently writing to non-overlapping regions in the same file?
Here are some additional details about my experimental setup.
I'm writing 2 GiB of data (random write) and varying the number of threads used to do the write (from 1 to 16). Each thread writes 4 KiB of data at a time. I'm considering two setups: (1) all threads write to a single file, and (2) each thread writes to its own file. Before starting the benchmark, the file(s) used are opened and are initialized to their final size using fallocate(). The file(s) are opened with O_DIRECT and O_DSYNC. Each thread is assigned a random disjoint subset of the offsets within the file (i.e., the regions the threads write to are non-overlapping). Then, the threads concurrently write to these offsets using pwrite().
Here are the machine's specifications:
Linux 5.9.1-arch1-1
1 TB Intel NVMe SSD (model SSDPE2KX010T8)
ext4 file system
128 GiB of memory
2.10 GHz 20-core Xeon Gold 6230 CPU
The SSD is supposed to be capable of delivering up to 70000 IOPS of random writes.
I've included a standalone C++ program that I've used to reproduce this behavior on my machine. I've been compiling using g++ -O3 -lpthread <file> (I'm using g++ version 10.2.0).
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <random>
#include <string>
#include <thread>
#include <vector>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
constexpr size_t kBlockSize = 4 * 1024;
constexpr size_t kDataSizeMiB = 2048;
constexpr size_t kDataSize = kDataSizeMiB * 1024 * 1024;
constexpr size_t kBlocksTotal = kDataSize / kBlockSize;
constexpr size_t kRngSeed = 42;
void AllocFiles(unsigned num_files, size_t blocks_per_file,
std::vector<int> &fds,
std::vector<std::vector<size_t>> &write_pos) {
std::mt19937 rng(kRngSeed);
for (unsigned i = 0; i < num_files; ++i) {
const std::string path = "f" + std::to_string(i);
fds.push_back(open(path.c_str(), O_CREAT | O_WRONLY | O_DIRECT | O_DSYNC,
S_IRUSR | S_IWUSR));
write_pos.emplace_back();
auto &file_offsets = write_pos.back();
int fd = fds.back();
for (size_t blk = 0; blk < blocks_per_file; ++blk) {
file_offsets.push_back(blk * kBlockSize);
}
fallocate(fd, /*mode=*/0, /*offset=*/0, blocks_per_file * kBlockSize);
std::shuffle(file_offsets.begin(), file_offsets.end(), rng);
}
}
void ThreadMain(int fd, const void *data, const std::vector<size_t> &write_pos,
size_t offset, size_t num_writes) {
for (size_t i = 0; i < num_writes; ++i) {
pwrite(fd, data, kBlockSize, write_pos[i + offset]);
}
}
int main(int argc, char *argv[]) {
assert(argc == 3);
unsigned num_threads = strtoul(argv[1], nullptr, 10);
unsigned files = strtoul(argv[2], nullptr, 10);
assert(num_threads % files == 0);
assert(num_threads >= files);
assert(kBlocksTotal % num_threads == 0);
void *data_buf;
posix_memalign(&data_buf, 512, kBlockSize);
*reinterpret_cast<uint64_t *>(data_buf) = 0xFFFFFFFFFFFFFFFF;
std::vector<int> fds;
std::vector<std::vector<size_t>> write_pos;
std::vector<std::thread> threads;
const size_t blocks_per_file = kBlocksTotal / files;
const unsigned threads_per_file = num_threads / files;
const unsigned writes_per_thread_per_file =
blocks_per_file / threads_per_file;
AllocFiles(files, blocks_per_file, fds, write_pos);
const auto begin = std::chrono::steady_clock::now();
for (unsigned thread_id = 0; thread_id < num_threads; ++thread_id) {
unsigned thread_file_offset = thread_id / files;
threads.emplace_back(
&ThreadMain, fds[thread_id % files], data_buf,
write_pos[thread_id % files],
/*offset=*/(thread_file_offset * writes_per_thread_per_file),
/*num_writes=*/writes_per_thread_per_file);
}
for (auto &thread : threads) {
thread.join();
}
const auto end = std::chrono::steady_clock::now();
for (const auto &fd : fds) {
close(fd);
}
std::cout << kDataSizeMiB /
std::chrono::duration_cast<std::chrono::duration<double>>(
end - begin)
.count()
<< std::endl;
free(data_buf);
return 0;
}
In this scenario, the underlying reason was that ext4 was taking an exclusive lock when writing to the file. To get the multithreaded throughput scaling that we would expect when writing to the same file, I needed to make two changes:
The file needed to be "preallocated." This means that we need to make at least one actual write to every block in the file that we plan on writing to (e.g., writing zeros to the whole file).
The buffer used for making the write needs to be aligned to the file system's block size. In my case the buffer should have been aligned to 4096.
// What I had
posix_memalign(&data_buf, 512, kBlockSize);
// What I actually needed
posix_memalign(&data_buf, 4096, kBlockSize);
With these changes, using multiple threads to make non-overlapping random writes to a single file leads to the same throughput gains as if the threads each wrote to their own file.

How do limits on shared memory work on Linux?

I was looking into the Linux kernel limits on shared memory.
/proc/sys/kernel/shmall
specifies the maximum number of pages that can be allocated. Taking this number as x and the page size as p, I assume that x * p bytes is the limit on system-wide shared memory.
I then wrote a small program that creates a shared memory segment and attaches to it twice, as below:
shm_id = shmget(IPC_PRIVATE, 4*sizeof(int), IPC_CREAT | 0666);
if (shm_id < 0) {
printf("shmget error\n");
exit(1);
}
printf("\n The shared memory created is %d",shm_id);
ptr = shmat(shm_id,NULL,0);
ptr_info = shmat(shm_id,NULL,0);
In the above program, ptr and ptr_info were different, so the shared memory is mapped at two virtual addresses in my process address space.
When I run ipcs, the output looks like this:
...
0x00000000 1638416 sun 666 16000000 2
...
Now, back to the shmall limit x * p noted above. Does this limit apply to the sum of all the virtual memory mapped for every shared memory segment, or does it apply to physical memory?
There is only one piece of physical memory here (the shared memory segment), but after the two shmat() calls in the program above, twice that amount is mapped into my process address space. So will this limit be hit quickly if I keep calling shmat() on a single shared memory segment?
The limit only applies to physical memory, that is, the real shared memory allocated for all segments, because shmat() just maps an already allocated segment into the process address space.
You can trace it in the kernel: there is only one place where this limit is checked, in the newseg() function that allocates new segments (the ns->shm_ctlall comparison). The shmat() implementation does a lot of work, but it does not look at the shmall limit at all, so you can map one segment as many times as you want (address space is also limited, but in practice you rarely hit that limit).
You can also run a test from userspace with a simple program like this one:
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>
unsigned long int get_shmall() {
FILE *f = NULL;
char buf[512];
unsigned long int value = 0;
if ((f = fopen("/proc/sys/kernel/shmall", "r")) != NULL) {
if (fgets(buf, sizeof(buf), f) != NULL)
value = strtoul(buf, NULL, 10); // no proper checks
fclose(f); // no return value check
}
return value;
}
int set_shmall(unsigned long int value) {
FILE *f = NULL;
char buf[512];
int retval = 0;
if ((f = fopen("/proc/sys/kernel/shmall", "w")) != NULL) {
if (snprintf(buf, sizeof(buf), "%lu\n", value) >= sizeof(buf) ||
fwrite(buf, 1, strlen(buf), f) != strlen(buf))
retval = -1;
fclose(f); // fingers crossed
} else
retval = -1;
return retval;
}
int main()
{
int shm_id1 = -1, shm_id2 = -1;
unsigned long int shmall = 0, shmused, newshmall;
void *ptr1, *ptr2;
struct shm_info shminf;
if ((shmall = get_shmall()) == 0) {
printf("can't get shmall\n");
goto out;
}
printf("original shmall: %lu pages\n", shmall);
if (shmctl(0, SHM_INFO, (struct shmid_ds *)&shminf) < 0) {
printf("can't get SHM_INFO\n");
goto out;
}
shmused = shminf.shm_tot * getpagesize();
printf("shmused: %lu pages (%lu bytes)\n", shminf.shm_tot, shmused);
newshmall = shminf.shm_tot + 1;
if (set_shmall(newshmall) != 0) {
printf("can't set shmall\n");
goto out;
}
if (get_shmall() != newshmall) {
printf("something went wrong with shmall setting\n");
goto out;
}
printf("new shmall: %lu pages (%lu bytes)\n", newshmall, newshmall * getpagesize());
printf("shmget() for %u bytes: ", (unsigned int) getpagesize());
shm_id1 = shmget(IPC_PRIVATE, (size_t)getpagesize(), IPC_CREAT | 0666);
if (shm_id1 < 0) {
printf("failed: %s\n", strerror(errno));
goto out;
}
printf("ok\nshmat 1: ");
ptr1 = shmat(shm_id1, NULL, 0);
if (ptr1 == (void *)-1) { /* shmat() returns (void *)-1 on failure, not NULL */
printf("failed\n");
goto out;
}
printf("ok\nshmat 2: ");
ptr2 = shmat(shm_id1, NULL, 0);
if (ptr2 == (void *)-1) { /* shmat() returns (void *)-1 on failure, not NULL */
printf("failed\n");
goto out;
}
printf("ok\n");
if (ptr1 == ptr2) {
printf("ptr1 and ptr2 are the same with shm_id1\n");
goto out;
}
printf("shmget() for %u bytes: ", (unsigned int) getpagesize());
shm_id2 = shmget(IPC_PRIVATE, (size_t)getpagesize(), IPC_CREAT | 0666);
if (shm_id2 < 0)
printf("failed: %s\n", strerror(errno));
else
printf("ok, although it's wrong\n");
out:
if (shmall != 0 && set_shmall(shmall) != 0)
printf("failed to restore shmall\n");
if (shm_id1 >= 0 && shmctl(shm_id1, IPC_RMID, NULL) < 0)
printf("failed to remove shm_id1\n");
if (shm_id2 >= 0 && shmctl(shm_id2, IPC_RMID, NULL) < 0)
printf("failed to remove shm_id2\n");
return 0;
}
What it does: it sets the shmall limit to just one page above what the system currently uses, then gets a new page-sized segment and maps it twice (both succeed), then tries to get one more page-sized segment and fails. Run the program as superuser because it writes to /proc/sys/kernel/shmall:
$ sudo ./a.out
original shmall: 18446744073708503040 pages
shmused: 21053 pages (86233088 bytes)
new shmall: 21054 pages (86237184 bytes)
shmget() for 4096 bytes: ok
shmat 1: ok
shmat 2: ok
shmget() for 4096 bytes: failed: No space left on device
I did not find any physical memory allocation in the do_shmat() function (linux/ipc/shm.c):
https://github.com/torvalds/linux/blob/5469dc270cd44c451590d40c031e6a71c1f637e8/ipc/shm.c
So shmat() consumes only virtual memory (your process address space);
its main job is to mmap the segment.

Why does a file with a hole use fewer disk blocks than a file without a hole?

#include "apue.h" /* from the APUE book sources: provides err_sys() and FILE_MODE */
#include <fcntl.h>
#include <unistd.h>
char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";
char buf3[10];
int
main(void)
{
int fd;
if ((fd = creat("file.hole", FILE_MODE)) < 0) {
err_sys("creat error");
}
if (write(fd, buf1, 10) != 10) { // offset is now = 10
err_sys("buf1 write error");
}
if (lseek(fd, 16380, SEEK_SET) == -1) { // offset now = 16380
err_sys("lseek error");
}
if (write(fd, buf2, 10) != 10) { // offset now = 16390
err_sys("buf2 write error");
}
close(fd);
if ((fd = open("file.hole", O_RDWR)) == -1) {
err_sys("failed to re-open file");
}
ssize_t n;
ssize_t m;
while ((n = read(fd, buf3, 10)) > 0) {
if ((m = write(STDOUT_FILENO, buf3, n)) != n) { // write only the bytes actually read
err_sys("stdout write error");
}
}
if (n == -1) {
err_sys("buf3 read error");
}
exit(0);
}
I'm a newbie in Unix system programming.
The code above creates a file with a hole.
The output is:
$ls -ls file.hole file.nohole
8 -rw-r--r-- 1 sar 16394 time file.hole
20 -rw-r--r-- 1 sar 16394 time file.nohole
Why does the file with a hole use fewer disk blocks than the file without a hole?
I would have expected the file without a hole to take fewer disk blocks,
because the file with a hole is more spread out than the one without.
From "Advanced Programming in the UNIX Environment", 3rd edition, Stevens and Rago, example 3.2.
Why do you think that a file without a hole takes less space? It is exactly the opposite.
If the file has holes, there is no need to reserve disk blocks for that space.
The number of disk blocks is not related to how spread out the file is; it is directly related to the amount of data you wrote to the file.
How the data blocks are distributed on the disk does not count toward the number of blocks the file system needs to store the data. It does not matter whether the blocks are close together or far apart, since the file system can use the blocks in between for other files.
So the output shows that file.hole occupies only 8 blocks on the disk, not where they are.

Program to see the bytes from a file internally

Do you know of a program or method to see the (sequences of) bytes of a text or HTML file?
I don't want to see the characters, but rather the complete sequence of bytes.
Any recommendations?
Yes, it is called a hex editor. Hundreds of them exist.
Here are some: http://en.wikipedia.org/wiki/Comparison_of_hex_editors
A common hex editor allows you to view any file's byte sequence.
If you just want to see the existing bytes (without changing them) you can use a hex-dump program, which is much smaller and simpler than a hex editor. For example, here's one I wrote several years ago:
/* public domain by Jerry Coffin
*/
#include <ctype.h> /* isprint() */
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
unsigned long offset = 0;
FILE *input;
int bytes, i, j;
unsigned char buffer[16];
char outbuffer[60];
if ( argc < 2 ) {
fprintf(stderr, "\nUsage: dump filename [filename...]");
return EXIT_FAILURE;
}
for (j=1;j<argc; ++j) {
if ( NULL ==(input=fopen(argv[j], "rb")))
continue;
printf("\n%s:\n", argv[j]);
while (0 < (bytes=fread(buffer, 1, 16, input))) {
sprintf(outbuffer, "%8.8lx: ", offset); /* print the offset of this row */
offset += bytes; /* then advance, so the first row starts at 0 */
for (i=0;i<bytes;i++) {
sprintf(outbuffer+10+3*i, "%2.2X ",buffer[i]);
if (!isprint(buffer[i]))
buffer[i] = '.';
}
printf("%-60s %*.*s\n", outbuffer, bytes, bytes, buffer);
}
fclose(input);
}
return 0;
}

How can I find the size of an ELF file/image with Header information?

I need to find the size of an ELF image for some computation. I have tried the readelf utility on Linux, which gives information about the headers and sections. I need the exact file size of the ELF (as a whole).
How do I find the size of the ELF from the header information, or is there any other way to find the size of an ELF without reading the full image?
The answer to the specific question is a little tricky for ELF files.
The following will compute the size of the "descriptive" information in an ELF file using the header: e_ehsize + (e_phnum * e_phentsize) + (e_shnum * e_shentsize)
The above is based on the ELF documentation.
The next piece to add to the above sum is the size in the file of the section entries. Intuitively we would like to compute this using sh_size for each of the sections in the file -- e_shnum of them. HOWEVER, this doesn't yield the correct answer due to alignment issues. If you use an ordered list of sh_offset values you can compute the exact number of bytes that the section entry occupies (I found some strange alignments where using sh_addralign isn't as useful as you would like); for the last section entry use the file header's e_shoff since the section header table is last. This worked for the couple I checked.
update.c in libelf has the details it uses when updating an elf file.
Example:
ls -l gives 126584
Calculation using the values also reported by readelf -h:
Start of section headers e_shoff 124728
Size of section headers e_shentsize 64
Number of section headers e_shnum 29
e_shoff + ( e_shentsize * e_shnum ) = 126584
This assumes that the section header table (SHT) is the last part of the ELF. This is usually the case but it could also be that the last section is the last part of the ELF. This should be checked for, but is not in this example.
Here is a working implementation in C, compile with gcc elfsize.c -o elfsize:
#include <elf.h>
#include <byteswap.h>
#include <stdio.h>
#include <stdint.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
typedef Elf32_Nhdr Elf_Nhdr;
static char *fname;
static Elf64_Ehdr ehdr;
static Elf64_Phdr *phdr;
#if __BYTE_ORDER == __LITTLE_ENDIAN
#define ELFDATANATIVE ELFDATA2LSB
#elif __BYTE_ORDER == __BIG_ENDIAN
#define ELFDATANATIVE ELFDATA2MSB
#else
#error "Unknown machine endian"
#endif
static uint16_t file16_to_cpu(uint16_t val)
{
if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
val = bswap_16(val);
return val;
}
static uint32_t file32_to_cpu(uint32_t val)
{
if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
val = bswap_32(val);
return val;
}
static uint64_t file64_to_cpu(uint64_t val)
{
if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
val = bswap_64(val);
return val;
}
static long unsigned int read_elf32(int fd)
{
Elf32_Ehdr ehdr32;
ssize_t ret, i;
ret = pread(fd, &ehdr32, sizeof(ehdr32), 0);
if (ret < 0 || (size_t)ret != sizeof(ehdr32)) { /* compare against the 32-bit header size */
fprintf(stderr, "Read of ELF header from %s failed: %s\n",
fname, strerror(errno));
exit(10);
}
ehdr.e_shoff = file32_to_cpu(ehdr32.e_shoff);
ehdr.e_shentsize = file16_to_cpu(ehdr32.e_shentsize);
ehdr.e_shnum = file16_to_cpu(ehdr32.e_shnum);
return(ehdr.e_shoff + (ehdr.e_shentsize * ehdr.e_shnum));
}
static long unsigned int read_elf64(int fd)
{
Elf64_Ehdr ehdr64;
ssize_t ret, i;
ret = pread(fd, &ehdr64, sizeof(ehdr64), 0);
if (ret < 0 || (size_t)ret != sizeof(ehdr64)) {
fprintf(stderr, "Read of ELF header from %s failed: %s\n",
fname, strerror(errno));
exit(10);
}
ehdr.e_shoff = file64_to_cpu(ehdr64.e_shoff);
ehdr.e_shentsize = file16_to_cpu(ehdr64.e_shentsize);
ehdr.e_shnum = file16_to_cpu(ehdr64.e_shnum);
return(ehdr.e_shoff + (ehdr.e_shentsize * ehdr.e_shnum));
}
long unsigned int get_elf_size(char *fname)
/* TODO, FIXME: This assumes that the section header table (SHT) is
the last part of the ELF. This is usually the case but
it could also be that the last section is the last part
of the ELF. This should be checked for.
*/
{
ssize_t ret;
int fd;
long unsigned int size = 0;
fd = open(fname, O_RDONLY);
if (fd < 0) {
fprintf(stderr, "Cannot open %s: %s\n",
fname, strerror(errno));
return(1);
}
ret = pread(fd, ehdr.e_ident, EI_NIDENT, 0);
if (ret != EI_NIDENT) {
fprintf(stderr, "Read of e_ident from %s failed: %s\n",
fname, strerror(errno));
return(1);
}
if ((ehdr.e_ident[EI_DATA] != ELFDATA2LSB) &&
(ehdr.e_ident[EI_DATA] != ELFDATA2MSB))
{
fprintf(stderr, "Unknown ELF data order %u\n",
ehdr.e_ident[EI_DATA]);
return(1);
}
if(ehdr.e_ident[EI_CLASS] == ELFCLASS32) {
size = read_elf32(fd);
} else if(ehdr.e_ident[EI_CLASS] == ELFCLASS64) {
size = read_elf64(fd);
} else {
fprintf(stderr, "Unknown ELF class %u\n", ehdr.e_ident[EI_CLASS]);
return(1);
}
close(fd);
return size;
}
int main(int argc, char **argv)
{
ssize_t ret;
int fd;
if (argc != 2) {
fprintf(stderr, "Usage: %s <ELF>\n", argv[0]);
return 1;
}
fname = argv[1];
long unsigned int size = get_elf_size(fname);
fprintf(stderr, "Estimated ELF size on disk: %lu bytes \n", size);
return 0;
}
Perhaps gelf could be useful.
GElf is a generic, ELF class-independent API for manipulating ELF object files. GElf provides a single, common interface for handling 32-bit and 64-bit ELF format object files.
specifically these functions:
elf32_fsize, elf64_fsize - return the size of an object file type
Have you tried using the gnu "readelf" utility?
http://sourceware.org/binutils/docs/binutils/readelf.html
All you have to do is to sum the last section's file offset and its size.
fseek(fileHandle, elfHeader.e_shoff + (elfHeader.e_shnum-1) * elfHeader.e_shentsize, SEEK_SET);
Elf64_Shdr sectionHeader; // or Elf32_Shdr
fread(&sectionHeader, 1, elfHeader.e_shentsize, fileHandle);
int fileSize = sectionHeader.sh_offset + sectionHeader.sh_size;
elfHeader used values:
e_shoff = Section header table file offset
e_shnum = Section header table entry count
e_shentsize = Section header table entry size
sectionHeader used values:
sh_offset = Section file offset
sh_size = Section size in bytes
You can use the stat family of functions (stat(), lstat(), fstat()) to get the size of any file (using the st_size member of struct stat).
Do you need something more specific?
If you really want to use the ELF structure, use the elf.h header which contains that structure:
typedef struct {
unsigned char e_ident[EI_NIDENT];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
ElfN_Addr e_entry;
ElfN_Off e_phoff;
ElfN_Off e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
} Elf32_Ehdr;
It's the header of an ELF32 file (replace 32 with 64 for a 64-bit file).
e_ehsize is the size of the file in bytes.
I'll copy verbatim the comment that was posted as an edit suggestion:
This answer is incorrect. e_ehsize is merely the size of the elf header, not the elf file.
