Need advice for disk access program - programming-languages

Need advice for disk access program - programming-languages

I'm envisioning a program I will need to write and need some advice on the language. I will need to be doing raw disk access so I can display hex data, scroll or jump around on the disk, and do calculations from the data. I have been using Java the most and it's portability between OSes for my other projects is certainly a benefit, but raw disk access either isn't possible, would require JNI, or may be possible on *nix when you can access disks as "files". I keep reading different things. By the way I can handle this type of work using Files in Java, but in this project I need to be able to access the disk so disk imaging to files beforehand isn't needed.
It would be nice to make it as portable as I could since there is a real benefit to using different OSes, but it may not be worth it and I should just stick with Windows and a native compiling language. Is there any existing JNI code that could help? I have experience in other languages but I haven't used C++ in a long time. Should I forget about Java and tryout C#? Someone told me that Python has libraries available for this type of thing despite it being an interpreted language so what about Python? What would be best for the project? What would be good for me to learn?
Searching around for raw disk access, Java, Python, does not seem to give any useful results. Thanks for any help!
EDIT
It seems like this will be quite involved, learning what I need to know, and then learning that. It's too bad I couldn't use disk images instead because then I'd be able to start working on it immediately in Java, which I'm comfortable with and I know I could make a good product. I've gotten great throughput in other raw data processing projects with Java so that doesn't worry me. Plus it would be truly portable. Hmm might have to consider it more. I'd probably need a big azz storage system to hold all the images though :)
UPDATE
Just a note for anyone that finds this question... I have figured out this works just by specifying the disk for the File using the PhysicalDrive notation (in Windows) like the answer below by hunsricker. However there are some issues. First if you do a "exists" check File.exists(), it says the file does not exist. Also, the file size is zero, and when I get a "java.io.IOException: The drive cannot find the sector requested" is the way I know I'm at the end of the file. And the worst part- I was getting some odd runtime errors doing this when I was reading some bytes and skipping some (64) bytes in a loop. I altered my program a bit to read different amounts and that changed where the error occurred. I was using BufferedInputStream instead of RandomAccessFile like hunsricker below by the way, not sure if it makes a difference. My only answer for this issue is that since I'm doing physical disk access, it doesn't like that I am not reading in even 512 byte sectors or 1K blocks or such. Indeed when I read even 1K, 2K, 512bytes, etc., and don't skip anything, it works fine and runs to the end. The errors I saw were java.io.ioexception "incorrect function" and java.io.ioexception "the parameter is incorrect". There was no rhyme or reason to them. Then I made image files of the same data and ran my program on those and it would do any combination of reading and skipping bytes with no problem. Physical disk access was more picky I guess.

I was looking by myself for a possibility to access raw data of a physical drive. And now as I got it to work, I just want to tell you how. You can access raw disk data directly from within java ... just run the following code with administrator priviliges:
File diskRoot = new File ("\\\\.\\PhysicalDrive0");
RandomAccessFile diskAccess = new RandomAccessFile (diskRoot, "r");
byte[] content = new byte[1024];
diskAccess.readFully (content);
So you will get the first kB of your first physical drive on the system. To access logical drives - as mentioned above - just replace 'PhysicalDrive0' with the drive letter e.g. 'D:'
oh yes ... I tried with Java 1.7 on a Win 7 system ...
RageDs link brougth me to the solution ... thank you :-)

Disk access will depend on the disk's particular drivers. And since this is such a low-level task, I doubt Java/Python would have such support (these languages are generally used for fast, high-level software package development). Since you will probably not be aware of the disks' particular hardware implementations, you will probably have to end up using an operating system API (which is OS-dependent of course). I would recommend looking into C and/or the particular assembly language for the architecture you plan to do this work on. Then, I would recommend continuing your search to find the appropriate API for your target OS.
EDIT
For Windows, a good place to start is here. More specifically, MSDN's CreateFile() is probably a function you would be interested in.

Related

Persistant storage values handling in Linux

I have a QSPI flash on my embedded board.
I have a driver + process "Q" to handle reading and writing into.
I want to store variables like SW revisions, IP, operation time, etc.
I would like to ask for suggestions how to handle the different access rights to write and read values from user space and other processes.
I was thinking to have file for each variable. Than I can assign access rights for those files and process Q can change the value in file if value has been changed. So process Q will only write into and other processes or users can only read.
But I don't know about writing. I was thinking about using message queue or zeroMQ and build the software around it but i am not sure if it is not overkill. But I am not sure how to manage access rights anyway.
What would be the best approach? I would really appreciate if you could propose even totally different approach.
Thanks!

This question will probably be downvoted / flagged due to the "Please suggest an X" nature.
That said, if a file per variable is what you're after, you might want to look at implementing a FUSE file system that wraps your SPI driver/utility "Q" (or build it into "Q" if you get to compile/control source to "Q"). I'm doing this to store settings in an EEPROM on a current work project and its turned out nicely. So I have, for example, a file, that when read, retrieves 6 bytes from EEPROM (or a cached copy) provides a MAC address in std hex/colon-separated notation.
The biggest advantage here, is that it becomes trivial to access all your configuration / settings data from shell scripts (e.g. your init process) or other scripting languages.
Another neat feature of doing it this way is that you can use inotify (which comes "free", no extra code in the fusefs) to create applications that efficiently detect when settings are changed.
A disadvantage of this approach is that it's non-trivial to do atomic transactions on multiple settings and still maintain normal file semantics.

Can regular file reading benefited from nonblocking-IO?

It seems not to me and I found a link that supports my opinion. What do you think?

The content of the link you posted is correct. A regular file socket, opened in non-blocking mode, will always be "ready" for reading; when you actually try to read it, blocking (or more accurately as your source points out, sleeping) will occur until the operation can succeed.
In any case, I think your source needs some sedatives. One angry person, that is.

I've been digging into this quite heavily for the past few hours and can attest that the author of the link you cited is correct. However, the appears to be "better" (using that term very loosely) support for non-blocking IO against regular files in native Linux Kernel for v2.6+. The "libaio" package contains a library that exposes the functionality offered by the kernel, but it has some caveats about the different types of file systems which are supported and it's not portable to anything outside of Linux 2.6+.
And here's another good article on the subject.

You're correct that nonblocking mode has no benefit for regular files, and is not allowed to. It would be nice if there were a secondary flag that could be set, along with O_NONBLOCK, to change this, but due to the way cache and virtual memory work, it's actually not an easy task to define what correct "non-blocking" behavior for ordinary files would mean. Certainly there would be race conditions unless you allowed programs to lock memory associated with the file. (In fact, one way to implement a sort of non-sleeping IO for ordinary files would be to mmap the file and mlock the map. After that, on any reasonable implementation, read and write would never sleep as long as the file offset and buffer size remained within the bounds of the mapped region.)

how to copy nk.bin to partition on wince 6.0

i want to copy nk.bin to partition on wince 6.0.
i want that when i restart device then using redboot cammand it should be able to load nk.bin from partion. how to do this?

This is a broad and pretty platform-specific question. Forst, you've not told us much about your platform, so we have to make assumptions. I'll assume, based on you using redboot and talking about "partitions" that you'r running on ARM and that your OS image is stored in persistent storage (i.e. Flash).
The next question is "How and where is the OS stored?" This is platform specific, so only you (or your OEM) can say. It might be inside a FAT 32 volume or it might be written raw to a specific location in flash outside of any file system. If it's the former (it's probably not, or you likely wouldn't be asking the question), you could copy it. If it's just at some location raw, you're going to need APIs to directly access the flash. See if the OEM provided them (apps can't map direct to hardware in 6.0, so if there's no OEM-provided API, you'll have to write a driver).
You also need to know if you're XIP. If so, I don't think you're going to be able to copy the OS while it's running - at least I'd consider it a high-risk operation. In that case you likely need to set some sort of bit somewhere outside the existing file system (an EEPROM, scratch-pad regiter, raw flash, etc) and reboot, then modify the bootloader to make the copy.
THis all assumes you mean you want to copy it from on the device itself. You could mean you want to copy it using a JTAG tool as well, in which case everything I've said is irrelevant (except the location of the OS - and even that's not relevant if you're thinking you want to copy it from an outside source).

Recently, I need a memory-file swapping algorithm

My boss asked me to find some algorithms or existed libraries.
coz our application runs on linux, and it need lots of files, maybe over 5G-20G....but we dont need to load the files at one time, but at anytime when the file is needed. btw, we have maybe over 100-1000 files stored in our drive.
However, this application is kinda realtime, at least. Simple and ordinary reading or loading can not meet our needs.
I know in linux and windows, there is mechanism virture memory..in linux we use mmap to realize our swapping demands...
But boss is boss, who said we dont take that into consideration presently..
So, i am here prey for helps..
thanx

Your operating system can handle caching and virtual memory (’n stuff) a lot better than you (or any library) can. Apart from simply keeping all files in memory (I heard RAM was cheap :) there’s not much you can do.

Can running 'cat' speed up subsequent file random access on a linux box?

on a linux box with plenty of memory (a few Gigs), I need to access randomly to a big file as fast as possible.
I was thinking about doing a cat myfile > /dev/null before accessing it so my file pages go in memory sequentially, hence faster than with a dry random access.
Does this approach make sense to you?

While doing that may force the contents of the file into the system's cache, you are better off using posix_fadvise() (with the POSIX_FADV_WILLNEED advice) or the (blocking)readahead() call to make the kernel precache the data you will need.
EDIT:
You might also want to try using the POSIX_FADV_RANDOM advice to disable readahead altogether.
There's an article with a decent explanation of usage here: Advising the Linux Kernel on File I/O

As the others said, you'll need to benchmark it in your particular case.
It is quite possible it will result in a significant performance increase though.
On traditional rotating media (i.e. a hard disk) sequential access (cat file > /dev/null/fadvise) is much faster than random access.

Only one way to be sure that any (possibly premature?) optimization is worthwhile: benchmark it.

It could theoretically speed up the access (especially if you access almost everything from the file), but I wouldn't bet on a big difference.
The only really useful approach is to benchmark it for your specific case.

If you really want the speed I'd recommend trying memory-mapped IO instead of trying to hack something up with cat. Of course, it depends on the size of file you're trying to access and the type of access you want.. this may not be possible...
readahead is a good call too...

Doing "cat" on a big file might bring the data in and blow more valuable data out of the cache; this is not what you want.
If performance is at all important to you, you'll be doing regular performance testing anyway (and soak tests etc), so continue to do that and watch your graphs, figures etc.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string