How to store multiple files in one file in Python? - python-3.x

How can I store multiple files in one file using Python?
I mean my own file format, not a zip or a rar.
For example, I want to create an archive from a folder, but with my own file format (like 'Files.HR').
Or just store the files in one file without any dictionary or file format ('Files', no file format).

You may want to use "tar" files. In Python, you can use the tarfile module to write files into a single archive file and later extract them back into real files.
You do not have to name the file *.tar. You can give it a name related to your specific application, such as Files.HR.
Please see this nice tutorial or read the official docs to see how to use tarfile.
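As a minimal sketch of the tarfile approach described above: the helper names pack_folder and unpack_archive are made up for illustration, and the archive is written uncompressed under an arbitrary name such as Files.HR.

```python
import tarfile
from pathlib import Path


def pack_folder(folder, archive_path):
    """Write every file under `folder` into a single uncompressed tar archive.

    The archive name is arbitrary (for example "Files.HR"); tarfile does not
    care about the extension.
    """
    folder = Path(folder)
    with tarfile.open(archive_path, "w") as tar:
        # arcname keeps only the folder's own name inside the archive,
        # not the full path leading up to it.
        tar.add(folder, arcname=folder.name)


def unpack_archive(archive_path, dest):
    """Extract all members of the archive into `dest`.

    Only extract archives you created or trust: member paths come from
    the archive itself.
    """
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(dest)
```

Calling pack_folder("my_folder", "Files.HR") followed by unpack_archive("Files.HR", "restored") should recreate my_folder inside restored; both folder names here are placeholders.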

Related

How to open .nlm files? (UMLS data)

I am working with UMLS (Unified Medical Language System) data. It is distributed as zip files that contain files with a .nlm extension and a .md5 file (the .md5 files are probably for storing the keys to open the .nlm files, but I am not sure), and I don't know how to open them. In Notepad they show up as unreadable symbols. Can I convert .nlm files to text or another readable format, or is there any software or library that can display their encoded data?
I want to see tables or data like this after I open a .nlm file:
3|ENG||||||8717795||58488005||SNOMEDCT_US|PT|58488005|1,4-alpha-Glucan branching enzyme||N||
3|ENG||||||8717796||58488005||SNOMEDCT_US|FN|58488005|1,4-alpha-Glucan branching enzyme (substance)||N||
3|ENG||||||8717808||58488005||SNOMEDCT_US|SY|58488005|Amylo-(1,4,6)-transglycosylase||N||
3|ENG||||||8718164||58488005||SNOMEDCT_US|SY|58488005|Branching enzyme||N||
19|ENG||||||10794494||112116001||SNOMEDCT_US|SY|112116001|17-hydrocorticosteroid||N||
19|ENG||||||10794495||112116001||SNOMEDCT_US|SY|112116001|17-hydroxycorticoid||N||
19|ENG||||||10794496||112116001||SNOMEDCT_US|PT|112116001|17-hydroxycorticosteroid||N||
19|ENG||||||10794497||112116001||SNOMEDCT_US|FN|112116001|17-hydroxycorticosteroid (substance)||N||

Read only specific csv files in azure dataflow source

I have a data flow source, a delimited text dataset that points to a folder (folder2) containing many CSV files.
So the source reads all the CSV files inside folder2. The files inside folder2 are:
abc.csv
someFile.csv
otherFile_2021.csv
predicted_file_1.csv
predicted_file_2.csv
predicted_file_99.csv
The aim is to read data only from the files matching predicted_file_*.csv, i.e. to read only the last three files. Is it possible to add dynamic content to the dataset so that it reads only files matching a specific pattern?
In source transformation, under source options, you can provide the wildcard path with filename prefix to read the required files.
Example (for debugging purposes, I added a column that stores the filename to verify which files were read):
[Screenshots: source settings and source preview]
Refer to this document for more information.
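To illustrate which of the listed files a wildcard such as predicted_file_*.csv would select, here is a small Python sketch using the standard fnmatch module. This is only an analogy: it assumes the data flow wildcard treats * the same way fnmatch does, which is not confirmed by the answer above.

```python
from fnmatch import fnmatchcase

# File names taken from the question.
files = [
    "abc.csv",
    "someFile.csv",
    "otherFile_2021.csv",
    "predicted_file_1.csv",
    "predicted_file_2.csv",
    "predicted_file_99.csv",
]

# Assumed wildcard pattern: "*" matches any run of characters.
pattern = "predicted_file_*.csv"

selected = [name for name in files if fnmatchcase(name, pattern)]
print(selected)
# ['predicted_file_1.csv', 'predicted_file_2.csv', 'predicted_file_99.csv']
```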

Importing a whole folder of python files

In the current Python program I'm working on, I need to access a lot of stored data. I store it in the form of a bunch of dictionaries, each in its own file. Each file has a single function: giveArchive(). So to access one of the files, I use:
import fileName
return fileName.giveArchive()
And this has worked well so far, but as the number of files I need grows, I want to streamline this a little bit. I'd like to store all of these files in the same folder, and that folder in the same directory as my main file. Is there some way I can import every file in a folder? And if I do, how can I use 'giveArchive()' from specific files in it?
You can do something like:
from folder.subfolder.deepersubfolder import filename
return filename.giveArchive()
This assumes that folder can be accessed from the directory your script is running in.
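If the goal is to load every file in the folder without writing one import per file, one possible approach (not the one from the answer above) is to load the modules dynamically with importlib. The sketch below assumes each .py file in the folder defines giveArchive(); the helper name load_archives is hypothetical.

```python
import importlib.util
from pathlib import Path


def load_archives(folder):
    """Import every .py file in `folder` and collect its giveArchive() result.

    Returns a dict mapping the file name (without .py) to whatever that
    file's giveArchive() returns.
    """
    archives = {}
    for path in sorted(Path(folder).glob("*.py")):
        if path.name == "__init__.py":
            continue  # package marker, not a data file
        # Build a module object directly from the file path.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        archives[path.stem] = module.giveArchive()
    return archives
```

With this, load_archives("data")["fileName"] would give the dictionary from data/fileName.py, where data is a placeholder folder name.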

About hydra, crunch and 7-zip

I want to create a password list using crunch. As you know, the file will grow to more than 1 petabyte. I know that 7-zip can archive a file in the "7z" format and compress it considerably. If I create a text file and then compress it to the "7z" format, is it possible to get crunch to access the text file while it is compressed and add to the list?
Also, is it possible to get hydra to read an archive in the "7z" format when you want to use a password list?

Find data file in Zip file using microcontroller

I need a microcontroller to read a data file that's been stored in a Zip file (actually a custom OPC-based file format, but that's irrelevant). The data file has a known name, path, and size, and is guaranteed to be stored uncompressed. What is the simplest & fastest way to find the start of the file data within the Zip file without writing a complete Zip parser?
NOTE: I'm aware of the Zip comment section, but the data file in question is too big to fit in it.
I ended up writing a simple parser that finds the file data in question using the central directory.
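The answer does not include its parser, so the following is only a sketch of the general central-directory technique, written in Python for readability rather than for a microcontroller. It assumes the whole archive is available as a bytes object, that the archive is not ZIP64, and that the end-of-central-directory signature does not also appear inside the trailing comment. The function name find_stored_file is hypothetical.

```python
import struct

EOCD_FMT = "<4sHHHHIIH"          # end of central directory record, 22 bytes
CDH_FMT = "<4sHHHHHHIIIHHHHHII"  # central directory file header, 46 bytes
LFH_FMT = "<4sHHHHHIIIHH"        # local file header, 30 bytes


def find_stored_file(buf, wanted_name):
    """Return (data_offset, size) of an uncompressed entry in a ZIP image.

    `buf` is the whole archive as bytes; `wanted_name` is the entry path
    as bytes, exactly as stored in the archive.
    """
    # 1. Locate the end of central directory record, searching from the end.
    eocd = buf.rfind(b"PK\x05\x06")
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    fields = struct.unpack_from(EOCD_FMT, buf, eocd)
    total_entries, cd_offset = fields[4], fields[6]

    # 2. Walk the central directory entries until the name matches.
    pos = cd_offset
    for _ in range(total_entries):
        cdh = struct.unpack_from(CDH_FMT, buf, pos)
        if cdh[0] != b"PK\x01\x02":
            raise ValueError("bad central directory header")
        method, comp_size = cdh[4], cdh[8]
        name_len, extra_len, comment_len = cdh[10], cdh[11], cdh[12]
        local_header_offset = cdh[16]
        name = buf[pos + 46 : pos + 46 + name_len]
        if name == wanted_name:
            if method != 0:
                raise ValueError("entry is compressed, expected stored")
            # 3. The local header has its own name/extra lengths, which
            #    may differ from those in the central directory.
            lfh = struct.unpack_from(LFH_FMT, buf, local_header_offset)
            if lfh[0] != b"PK\x03\x04":
                raise ValueError("bad local file header")
            data_offset = local_header_offset + 30 + lfh[9] + lfh[10]
            return data_offset, comp_size
        pos += 46 + name_len + extra_len + comment_len
    raise KeyError(wanted_name)
```

Because the entry is stored uncompressed, the bytes buf[data_offset : data_offset + size] are the file contents themselves.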
