MapDB file types - mapdb

I have a problem with MapDB version 1.0.6. When I create a database, I end up with two files with the same name but different file types.
One is, for example, IRTree with file type FILE, and the other is IRTree with file type .p.
Whenever I try to read my database by providing the filename IRTree, I get an exception:
either a NullPointerException from the command DBMaker.newFileDB(new File(filename)).readOnly().make(); or an IOException: storage header is invalid.
Can anyone explain to me what's going on?

MapDB uses two files: the file without an extension and the .p file, which is used to store the data. Always open the file without the extension; otherwise MapDB will try to open the wrong file.

Related

caret ^ is converting to some special symbol

I'm transferring a file with the content shown below from a mainframe system to a Unix instance. The file uses ^&* as a delimiter. That is what I send from the mainframe, but on the Unix side we receive Ø&* instead.
I'm using Connect Direct to transfer the file from one system to the other.
File Type: Flat File, File transfer: CD (Connect Direct)
File content:
H^&*20220407^&*160009^&*2006
T^&*1
But when I receive the file on the Unix server, I can see that the file content has changed. Mainly, ^ is converted to Ø.
HØ&*20220407Ø&*160009Ø&*2006
TØ&*1
This is almost certainly a code page problem.
The data in the file on the mainframe is (most probably) in some EBCDIC code page. ConnectDirect performs a code page transformation when sending the file to the UNIX system. This is what XLATE(YES) means.
However, some default "from"-"to" code page pair is configured and is used with XLATE(YES), and it is probably not the correct pair. You need to:
Find out which EBCDIC code page the data on the mainframe is encoded in: IBM-037, IBM-1047, IBM-500, IBM-273, etc. There are many.
Find out which code page the data should be in on the UNIX side: UTF-8, ISO8859-1, 437, etc. There are many.
Make sure ConnectDirect transforms using the correct source and target code pages.
Ask your ConnectDirect support people to help you with this.
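Before changing any configuration, it can help to look at the raw bytes that actually arrived, since the character shown on screen depends on how the terminal decodes them. The following is a minimal diagnostic sketch and is not part of ConnectDirect; the function names hex_dump and find_non_ascii are made up for illustration. A correctly translated ^ would arrive as the ASCII byte 0x5e, so any byte of 0x80 or above at the delimiter position points to a mismatched code page pair.

```rust
/// Format bytes as lowercase two-digit hex values separated by spaces.
fn hex_dump(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Return (offset, value) for every byte outside the 7-bit ASCII range.
/// In a file that should contain only ASCII delimiters and digits, these
/// are the positions where the translation went wrong.
fn find_non_ascii(bytes: &[u8]) -> Vec<(usize, u8)> {
    bytes
        .iter()
        .copied()
        .enumerate()
        .filter(|&(_, b)| b >= 0x80)
        .collect()
}
```

For example, calling hex_dump on the first line read with std::fs::read shows whether the delimiter arrived as a single byte such as d8 (Ø in ISO8859-1) or as the two bytes c3 98 (Ø in UTF-8), which narrows down the target code page that is in effect.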

How do I get the filename of an open std::fs::File in Rust?

I have an open std::fs::File, and I want to get its filename, e.g. as a PathBuf. How do I do that?
The simple solution would be to just save the path used in the call to File::open. Unfortunately, this does not work for me. I am trying to write a program that reads log files, and the program that writes the logs keeps changing the filenames as part of its log rotation. So the file may very well have been renamed since it was opened. This is on Linux, so renaming open files is possible.
How do I get around this issue, and get the current filename of an open file?
On a typical Unix filesystem, a file may have multiple filenames at once, or even none at all. The file metadata is stored in an inode, which has a unique inode number, and this inode number can be linked from any number of directory entries. However, there are no reverse links from the inode back to the directory entries.
Given an open File object in Rust, you can get the inode number using the ino() method. If you know the directory the log file is in, you can use std::fs::read_dir() to iterate over all entries in that directory, and each entry will also have an ino() method, so you can find the one(s) matching your open file object. Of course this approach is subject to race conditions – the directory entry may already be gone again once you try to do anything with it.
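The directory scan described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name find_names_in_dir is made up, the log directory is assumed to be known, and the code is Unix-only because ino() and dev() come from std::os::unix::fs. It also compares the device number, since inode numbers are only unique within one filesystem.

```rust
use std::fs::{self, File};
use std::io;
use std::os::unix::fs::{DirEntryExt, MetadataExt};
use std::path::{Path, PathBuf};

/// Return every entry in `dir` that currently refers to the same inode
/// as the open file `f`. The result may be empty (file unlinked or moved
/// elsewhere) or contain several paths (hard links).
fn find_names_in_dir(f: &File, dir: &Path) -> io::Result<Vec<PathBuf>> {
    let meta = f.metadata()?;
    let (target_dev, target_ino) = (meta.dev(), meta.ino());

    let mut matches = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // Cheap check first: the inode number stored in the directory entry.
        if entry.ino() != target_ino {
            continue;
        }
        // Confirm with a stat that does not follow symlinks. The entry may
        // already be gone again, so a failure here is simply skipped.
        if let Ok(m) = fs::symlink_metadata(entry.path()) {
            if m.dev() == target_dev && m.ino() == target_ino {
                matches.push(entry.path());
            }
        }
    }
    Ok(matches)
}
```

The returned paths are only a snapshot; as noted above, a rotation can rename the file again between the scan and any later use of the path.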
On Linux, file handles held by the current process can be found under /proc/self/fd. These look and act like symlinks to the original files (though I think they may technically be something else - perhaps someone who knows more can chip in).
You can therefore recover the (possibly changed) file name by constructing the correct path in /proc/self/fd using your file descriptor, and then following the symlink back to the filesystem.
This snippet shows the steps:
use std::fs::{read_link, File};
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::PathBuf;

// Linux-only: look up the current path of an open std::fs::File.
fn current_path(f: &File) -> io::Result<PathBuf> {
    // first construct the path to the symlink under /proc
    let path_in_proc = PathBuf::from(format!("/proc/self/fd/{}", f.as_raw_fd()));
    // ...and follow it back to the original file
    read_link(path_in_proc)
}

Thermocycle library and OpenModelica

I want to load the Thermocycle library in the OpenModelica Connection Editor, but I get the message "The file was not encoded in UTF-8".
To fix this problem I am told to "add a file package.encoding at the top-level", but I don't understand what I have to do. What is this file called "package.encoding", what should it contain, and where should I put it?
The error message says it all. "add a file package.encoding at the top-level."
Put the file where your library's package.mo is located.
The file must contain the name of the encoding used by the library.
Note that you can also use OMEdit's encoding conversion feature: File->Open/Convert Modelica File(s) With Encoding.

Proper way to differentiate pst and dbx files in bash shell

I want to identify the file format of the input file given to my shell script: whether it is a .pst or a .dbx file. I checked "How to check the extension of a filename in a bash script?". That question deals with txt files, and two methods are given there:
check if the extension is txt
check if the mime type is application/text etc.
I tried file -ib <filename> on a .pst and a .dbx file, and it showed application/octet-stream for both. However, if I just run file <filename>, I get
this for the dbx file -
file1.dbx: Microsoft Outlook Express DBX File Message database
and this for the pst file -
file2.pst: Microsoft Outlook binary email folder (Outlook >=2003)
So, my questions are -
Is it better to use mime type detection every time, when the input can be anything and we need a proper check?
How can a mime type check be applied in this case, where both files return "application/octet-stream"?
Update
I didn't want to do extension-based detection because on a Unix system we can't be sure that a .dbx file truly is a dbx file. file <filename> returns a line which contains the correct information about the file (e.g. "Microsoft Outlook Express DBX File Message database"), so the file command is able to identify the file type properly. Then why does it not report the correct information with file -ib <filename>?
Will parsing the string output of file <filename> be fine? Is it advisable, assuming I only need to identify a narrow set of data storage files of the Outlook family (MS Outlook Express, MS Office Outlook 2003, 2007, 2010 etc.)? A small text identifier like application/dbx which could be compared would be all I need.
The file command relies on having a file type detection database which includes rules for the file types that you expect to encounter. It may not be possible to recognize these file types if the file content doesn't have a unique code near the beginning of the file.
Note that the -i option to emit mime types actually uses a separate "magic" numbers file to recognize file types rather than translating long descriptions to mime types. It is quite possible for these two databases to be out of sync. If your application really needs to recognize these two file types, I suggest that you look at the source code of the "file" command to see how it recognizes them and then code this recognition algorithm right into your app.
If you want to do the equivalent of DOS file type detection, then strip the extension off the filename (everything after the last period) and look up that string in your own table where you define the types that you need.
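The extension lookup described in the last paragraph could look roughly like this. It is a sketch only: the MailStore enum and the type_from_extension function are names invented for illustration, and the table holds just the two types from the question. As the question points out, this trusts the file name and says nothing about the actual content.

```rust
use std::path::Path;

/// The narrow set of types this hypothetical tool cares about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MailStore {
    Pst,
    Dbx,
}

/// Take everything after the last period in the file name and look it up
/// in a small table. The comparison ignores ASCII case, so "FILE.PST"
/// and "file.pst" are treated the same.
fn type_from_extension(path: &Path) -> Option<MailStore> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "pst" => Some(MailStore::Pst),
        "dbx" => Some(MailStore::Dbx),
        _ => None,
    }
}
```

Returning Option keeps the "unknown type" case explicit, so the caller has to decide whether to reject the file or fall back to a content-based check.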

What is the standard way to handle users opening incorrect file types?

I hope my Q was clear ... I am curious about the typical way to code for someone clicking File|Open, and selecting a file that is inappropriate for the program--like someone using a word processing program and trying to open a binary file.
In my case, my files have multiple streams streamed together. I'm unsure how to have the code validate whether an improper file was selected before the app throws a stream read exception. (Or is the way to handle the situation to just write code to catch a stream read exception?)
Thanks, as always.
I think it's quite usual to have code that just tries to open the file and, if that fails, shows an error to the user. Most file formats have some kind of header with a "magic number", so that the reader can tell that it's not the right file very quickly after reading the first few bytes.
Magic number at the start of the file generally helps -- if you have control of the file format.
Otherwise, yeah -- catch the exception and put up a dialog.
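A magic number check of the kind mentioned in these answers might be sketched like this. Everything specific here is an assumption for illustration: the 4-byte value MYF1 is a made-up signature, and has_magic and open_checked are hypothetical helper names. The idea is to read the first few bytes, compare them, and turn a mismatch into an ordinary error that the UI can show in a dialog.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical signature written at the start of every valid file.
const MAGIC: &[u8; 4] = b"MYF1";

/// Read the first four bytes and compare them with MAGIC.
/// A file shorter than the signature is reported as "not ours",
/// not as an I/O failure.
fn has_magic<R: Read>(mut reader: R) -> io::Result<bool> {
    let mut header = [0u8; 4];
    match reader.read_exact(&mut header) {
        Ok(()) => Ok(&header == MAGIC),
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}

/// Open a file and reject it early if the signature does not match.
/// On success the returned file is positioned just after the signature.
fn open_checked(path: &Path) -> io::Result<File> {
    let mut file = File::open(path)?;
    if !has_magic(&mut file)? {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "not a recognized file: bad magic number",
        ));
    }
    Ok(file)
}
```

The caller still needs to handle errors from later reads, since a file can carry the right signature and be truncated or corrupt further on.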

Resources