Linux Lazarus: Wrong FileSize Reported by TFileStream

I am trying to read my file through TFileStream to send it over the network, and I have noticed something odd that I cannot explain.
My actual file size is 44.7 KB, but when TFileStream reads the same file it tells me that the file size is 45228 bytes, i.e. 45.2 KB. Why is that? Is there a way to fix it?
fs: TFileStream;
fs := TFileStream.Create('myfile.dat', fmOpenRead or fmShareDenyWrite);
ShowMessage(IntToStr(fs.Size));

One possibility is that whatever you are using to report the file size is measuring the size using kibibytes (1024 bytes) rather than kilobytes (1000 bytes).
Divide 45228 by 1024 to get 44.2 KiB. This still doesn't match exactly, but I would not be surprised if there were a transcription error in your question. There is at least one, where you wrote FileSize rather than Size (now corrected by a question edit), so my guess is that some of the other specific details are inaccurate too.
Other than that I think it very likely that the problem is in your other method of obtaining the file size. TFileStream.Size can be trusted to give an accurate value. If that doesn't tally with some other measure then that other measure is probably wrong.
On Linux you can use the stat command to get a definitive report of the file size. I would expect that to yield the same value as TFileStream.Size.
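For example, with GNU coreutils (BSD stat takes different flags, e.g. stat -f %z):
stat --format=%s myfile.dat   # prints the size in bytes, e.g. 45228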

Related

Node.js Streams readable.readableLength not returning file size

I have been trying to upload a file by creating a read stream with fs.createReadStream(filePath) and piping it to a write stream. I need the file size to implement a progress bar.
In the docs I see there is the readable.readableLength field, which to my understanding should return the file size, but it returns zero in my case. I already got the file size with fs.statSync(), but I was curious.
The explanation in the docs was not really clear to me, so I wanted to ask: what does the readable.readableLength field represent? And if it is the file size, why is it zero?
readableLength represents the number of bytes currently in the stream's internal queue that can still be read from it. It is not the file size: if it is zero, the stream has probably already been read, and thus there are zero bytes left for reading.
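For the progress bar itself, a minimal sketch along the lines you describe, using fs.statSync() for the total size (the file path is a placeholder):
const fs = require('fs');

const filePath = 'upload.bin';                 // placeholder path
const totalBytes = fs.statSync(filePath).size; // reliable total size

let sent = 0;
fs.createReadStream(filePath).on('data', (chunk) => {
  sent += chunk.length;                        // bytes read so far
  console.log(`progress: ${(100 * sent / totalBytes).toFixed(1)}%`);
});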

What is a quick way to check if file contents are null?

I have a rather large file (32 GB) which is an image of an SD card, created using dd.
I suspected that the file is empty (i.e. filled with the null byte \x00) starting from a certain point.
I checked this using python in the following way (where f is an open file handle with the cursor at the last position I could find data at):
for i in xrange(512):
    if set(f.read(64*1048576)) != set(['\x00']):
        print i
        break
This worked well (in fact it revealed some data at the very end of the image), but took >9 minutes.
Has anyone got a better way to do this? There must be a much faster way, I'm sure, but cannot think of one.
Looking at a guide about memory buffers in Python here, I suspected that the comparison itself was the issue. In most dynamically typed languages, memory copies are not very obvious, despite being a killer for performance.
In this case, as Oded R. established, reading into a preallocated buffer and comparing the result with a previously prepared NUL-filled one is much more efficient.
size = 512
data = bytearray(size)   # reusable read buffer
cmp = bytearray(size)    # reference buffer, already all NUL bytes (note: shadows Python 2's built-in cmp)
And when reading:
f = open(FILENAME, 'rb')
f.readinto(data)   # fills `data` in place instead of allocating a new string
Two things need to be taken into account:
The compared buffers should be of equal size, but comparing bigger buffers should be faster up to some point (I would expect memory fragmentation to be the main limit).
The last read may be shorter than the buffer size; reading the file into the prepared buffer keeps the trailing zeroes where we want them.
Here the comparison of the two buffers will be quick, there will be no attempt at converting the bytes to strings (which we don't need), and since we reuse the same memory all the time, the garbage collector won't have much work to do either... :)
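Putting those pieces together, a minimal end-to-end sketch (Python 3 here; the chunk size and file name are placeholders, and the short-read case is handled explicitly rather than relying on the buffer's leftover zeroes):
CHUNK = 64 * 1024 * 1024              # 64 MiB per read; tune to taste
FILENAME = 'sdcard.img'               # placeholder path

data = bytearray(CHUNK)               # reusable read buffer
zeros = bytes(CHUNK)                  # reference buffer, all NUL bytes

with open(FILENAME, 'rb') as f:
    offset = 0
    while True:
        n = f.readinto(data)          # fills `data` in place, no new allocation
        if not n:
            break                     # EOF
        if n == CHUNK:
            differs = data != zeros   # full chunk: compare without copying
        else:
            differs = data[:n] != zeros[:n]   # short final read: compare only what was read
        if differs:
            print('non-NUL data found in the chunk at offset', offset)
            break
        offset += n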

xentop VBD_RD & VBD_WR output

I'm writing a Perl script that tracks the output of the xentop tool, and I'm not sure what VBD_RD and VBD_WR mean.
Following http://support.citrix.com/article/CTX127896
VBD_RD number displays read requests
VBD_WR number displays write requests
Does anyone know whether read and write requests are measured in bytes, kilobytes, or megabytes?
Any ideas?
Thank you
As far as I understand, xentop shows you two different measures each for read and write.
VBD_RD and VBD_WR are plain request counts: the number of times the domain accessed the block device. They say nothing about the number of bytes read or written.
The second pair, VBD_RSECT (read) and VBD_WSECT (write), is measured in "sectors". You can find the sector size by using xenstore-ls (https://serverfault.com/questions/153196/xen-find-vbd-id-for-physical-disks); in my case it was 512.
The size of a sector is given in bytes (http://xen.1045712.n5.nabble.com/xen-3-3-testing-blkif-Clarify-units-for-sector-sized-blkif-request-params-td2620172.html).
So if the VBD_WR value is 2, the VBD_WSECT value is 10, and the sector size is 512, then 10 * 512 bytes were written across two requests (the device was accessed twice, but we only know the total, not how many bytes each individual request wrote). To measure disk I/O you can sample these values periodically and take the difference between consecutive samples.
I suppose the sector size might change for each block device somehow, so it might be worthy to check the xenstore-ls output for each domain but I'm not sure. You can probably define it in the cfg file too.
This is what I found out and understood so far. I hope this helps.
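To make that arithmetic concrete, here is a small sketch (Python for brevity, though the question mentions Perl; parsing the xentop output is left out, and the sector size is an assumption you should confirm with xenstore-ls):
SECTOR_SIZE = 512   # bytes; confirm per device with xenstore-ls

def io_bytes_per_sec(sect_prev, sect_now, interval_s):
    # rate between two samples of VBD_RSECT or VBD_WSECT
    return (sect_now - sect_prev) * SECTOR_SIZE / float(interval_s)

# e.g. VBD_WSECT went from 1000 to 1010 over a 1-second sampling interval:
print(io_bytes_per_sec(1000, 1010, 1))   # 5120.0 bytes per second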

VimL: Get extra KB from function that outputs file size

Right now I'm creating a plugin of sorts for Vim, it's meant to simply have all kinds of utility functions to put in your statusline, here's the link: https://github.com/Greduan/vim-usefulstatusline
Right now I have this function: https://github.com/Greduan/vim-usefulstatusline/blob/master/autoload/usefulstatusline_filesize.vim
It simply outputs the file size, in units ranging from bytes to megabytes. Currently, if the file size reaches 1MB, for example, it outputs 1MB. This is fine, but I would also like it to output the extra bytes or KB on top of that.
For example, instead of outputting 1MB it would output 1MB-367KB, see what I mean? It would output the largest unit, then the remainder in the next unit down. It's hard to explain.
So how would I modify the current function(s) to output the size this way?
Thanks for your help! Any of it is appreciated. :)
Who needs this? I doubt it would be convenient for anyone (especially with small remainders like 1MB + 3KB); using 1.367MB is much better. I see in your code that you have neither MB (1000*1000 B) nor MiB (1024*1024 B); 1000*1024 bytes is very strange. Also, don't use getfsize(): it is wrong for the non-file buffers you constantly see in plugins. Use line2byte(line('$')+1)-1 instead.
For 1.367MB you can just rewrite the humanize_bytes function in VimL, if you are fine with depending on the +float feature.
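A rough VimL sketch of that float-based approach (the function name is borrowed from the suggestion above, not a built-in; requires +float):
function! s:HumanizeBytes(bytes) abort
  " decimal (1000-based) units, three decimals as in 1.367MB
  if a:bytes >= 1000000
    return printf('%.3fMB', a:bytes / 1000000.0)
  elseif a:bytes >= 1000
    return printf('%.1fKB', a:bytes / 1000.0)
  endif
  return a:bytes . 'B'
endfunction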
Using integer arithmetic you can get the remainder with
let kbytes_remainder = kbytes % 1000
And do switch to either MiB/KiB (M/K, without the B, is the common shortcut used by ls) or MB/KB.
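And a rough sketch of the integer-only variant, using 1024-based units consistently (names are illustrative):
function! s:SizeWithRemainder(bytes) abort
  " integer arithmetic only; M/K shortcuts as in ls, without the B
  let kbytes = a:bytes / 1024
  if kbytes == 0
    return a:bytes . 'B'
  endif
  let mbytes = kbytes / 1024
  if mbytes == 0
    return kbytes . 'K'
  endif
  " largest unit plus the remainder in the next unit down, e.g. 1M-367K
  return mbytes . 'M-' . (kbytes % 1024) . 'K'
endfunction

" byte count of the current buffer, as recommended above:
echo s:SizeWithRemainder(line2byte(line('$') + 1) - 1)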

Doing file operations with 64-bit addresses in C + MinGW32

I'm trying to read in a 24 GB XML file in C, but it won't work. I'm printing out the current position using ftell() as I read the file in, but once it gets to a big enough number, it goes back to a small number and starts over, never even getting 20% through the file. I assume this is a problem with the range of the variable used to store the position (a long), which can go up to about 4,000,000,000 according to http://msdn.microsoft.com/en-us/library/s3f49ktz(VS.80).aspx, while my file is 25,000,000,000 bytes in size. A long long should work, but how would I get my compiler (Cygwin/mingw32) to use it, or get it to have fopen64?
The ftell() function returns a long, which on this platform is only 32 bits wide, so it tops out at 2^32 bytes (4 GB) even when treated as unsigned. So you can't get the file offset for a 24 GB file to fit into a 32-bit long.
You may have the ftell64() function available, or the standard fgetpos() function may return a larger offset to you.
You might try using the OS-provided file functions CreateFile and ReadFile. According to the File Pointers topic, the position is stored as a 64-bit value.
Unless you can use a 64-bit method as suggested by Loadmaster, I think you will have to break the file up.
This resource seems to suggest it is possible using _telli64(). I can't test this though, as I don't use mingw.
I don't know of any way to do this with one file. It's a bit of a hack, but if splitting the file up properly isn't a real option, you could write a few functions that temporarily split the file: one that uses ftell() to move through the file and switches to a new file on reaching the split point, and another that stitches the files back together before exiting. An absolutely botched approach, but if no better solution comes to light it could be a way to get the job done.
I found the answer. Instead of using fopen, fseek, fread, fwrite... I'm using _open, _lseeki64, read, write. And I am able to write and seek in > 4 GB files.
Edit: It seems the latter functions are about 6x slower than the former ones. I'll give the bounty to anyone who can explain that.
Edit: Oh, I learned here that read() and friends are unbuffered. What is the difference between read() and fread()?
Even if the ftell() in the Microsoft C library returns a 32-bit value and thus obviously will return bogus values once you reach 2 GB, just reading the file should still work fine. Or do you need to seek around in the file, too? For that you need _ftelli64() and _fseeki64().
Note that unlike some Unix systems, you don't need any special flag when opening the file to indicate that it is in some "64-bit mode". The underlying Win32 API handles large files just fine.
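For completeness, a minimal sketch of the _fseeki64()/_ftelli64() route (this assumes your MinGW headers declare these msvcrt functions, which very old mingw32 runtimes may not; the file name is a placeholder):
#include <stdio.h>

int main(void)
{
    long long size;
    FILE *f = fopen("huge.xml", "rb");   /* placeholder file name */

    if (!f) {
        perror("fopen");
        return 1;
    }
    _fseeki64(f, 0, SEEK_END);           /* seeks past the 4 GB mark without truncation */
    size = _ftelli64(f);                 /* 64-bit offset, unlike ftell()'s long */
    printf("file size: %I64d bytes\n", size);   /* msvcrt printf wants %I64d; newer runtimes also accept %lld */
    fclose(f);
    return 0;
}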
