How to read text file within library package - rust

I’m trying to read some text files located within my lib crate. File structure looks like this:
workspace
|
|-- MyBin
|   `-- src
|       `-- main.rs
|
`-- MyLib
    `-- src
        |-- lib.rs
        `-- text.txt
Alright, so MyBin has MyLib as one of its dependencies. Within lib.rs I am using the std::fs::read_to_string function to get access to text.txt's contents. But when I run cargo run on MyBin, the relative path now resolves inside MyBin's directory rather than MyLib's.
Any way to read text.txt even when calling read_to_string from another crate?

When using libraries, you need to understand that your library code never actually runs on its own: it runs only after being linked into an executable, and it is that executable that gets run.
read_to_string does not embed your file's contents in the library or executable at compile time; it only tells the program to go read the file at runtime.
read_to_string takes either a relative or an absolute path to the file. I assume in your case you used a relative path. Relative paths are always interpreted at runtime, relative to the current working directory the executable is run from.
As the comment on your post suggested, if you need text.txt to be available when you run the program, you either have to ship it alongside the executable or embed its contents in the library at compile time (for example with the include_str! macro).
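For example, a minimal sketch using include_str! (assuming text.txt sits next to lib.rs in MyLib/src; adjust the path if it lives elsewhere):
// MyLib/src/lib.rs
// include_str! resolves its argument relative to this source file and
// embeds the file's contents into the compiled artifact at compile time,
// so nothing needs to be read from disk at runtime.
pub fn text() -> &'static str {
    include_str!("text.txt")
}
MyBin can then call the function (e.g. my_lib::text(), crate name assumed) without caring about working directories.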

Related

How to create an npm package that supports both local and global (-g) install with webpack and typescript?

I am trying to create an npm package that will have both a local and a global (-g) installation option, but I am confused about the differences between the various directories (src vs. lib, and the purpose of bin).
Before, I would have used the src directory and had webpack transpile and bundle (ts-loader + babel-loader) to the dist directory (with dist hidden via .gitignore). However, from what I have seen online, I should instead bundle to a lib directory and manually create a bin directory that points to lib for executable CLI (or global) packages?
Can someone tell me if my understanding is correct? Should lib be added to .gitignore, or should it be committed? What should I do about the bin directory? Is there a resource you can point to for learning more about this?
I tried searching online for help with creating npm packages, and searched through npmjs.com too, but I could not figure out what to do.
I also tried taking a quick look on GitHub at what other projects did, but all I have been able to derive so far is that the bin directory should include one index.js file that imports the main JS file from the lib directory.
Thanks!
It doesn't really matter much when it comes to directory structure. Further, I would highly recommend that you use just the compiler (Babel or TypeScript) to generate the required output (CJS or ESM). This way, you easily get to preserve your source code directory structure.
If you cannot use just the compiler, then prefer Rollup over Webpack, as Rollup is excellent when it comes to bundling libraries. ESM output is still experimental in Webpack (as of version 5.x.x).
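For comparison, a minimal rollup.config.js sketch for a library build (entry and output paths are assumptions, matching the structure shown below):
// rollup.config.js
export default {
  input: 'src/app/index.js',
  output: [
    { file: 'dist/library/index.cjs', format: 'cjs' }, // for require() consumers
    { file: 'dist/library/index.mjs', format: 'es' }   // for import consumers
  ]
};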
If you still decide to go ahead with Webpack, then this is what I would do. For a given directory structure:
<ROOT>
|-- src
|   |-- app (actual library)
|   |   `-- index.js
|   `-- cli (cli related code)
|       `-- index.js
And my content is:
// Library barrel file: src/app/index.js
export function main() {
  console.log('Hello from library');
}

// CLI barrel file: src/cli/index.js
import { main } from '../app/index';

function cli() {
  main();
}

cli();
// webpack.config.js
module.exports = {
  mode: 'production',
  entry: {
    'cli/index': './src/cli/index.js',
    'library/index': './src/app/index.js'
  },
  output: {
    filename: '[name].js',
    library: {
      type: 'commonjs'
    }
  },
  externals: {
    '../app/index': {
      commonjs: '../app/index',
    },
  },
  // USE ONLY IF LIBRARY TARGET IS MODULE
  // experiments: {
  //   outputModule: true,
  // },
  // Rest of the configuration...
};
By default, it would create a dist folder with a directory structure like this:
<ROOT>
|-- dist
|   |-- cli
|   |   `-- index.js
|   `-- library
|       `-- index.js
Then, I can make use of the package.json bin and main fields to point to the respective compiled files:
{
  "name": "my-lib",
  "version": "1.0.0",
  "main": "dist/library/index.js",
  "types": "dist/library/index.d.ts",
  "bin": "dist/cli/index.js"
}
In the above webpack configuration, note the use of externals, which matches the imports I have used. Webpack's default behavior is to always bundle, so without this externals configuration you would end up with the library bundled twice across the two entry points: once for the CLI and once as the library. (This is where Rollup helps greatly.) Further, you will have to do this for literally every third-party module you import, or use webpack-node-externals.
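If you go the webpack-node-externals route, a sketch of the idea (webpack 5 options):
// webpack.config.js fragment
const nodeExternals = require('webpack-node-externals');

module.exports = {
  // ...entries and output as above...
  externalsPresets: { node: true }, // treat Node built-ins (path, fs, ...) as external
  externals: [nodeExternals()],     // treat everything in node_modules as external
};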
A few things to note:
Generally avoid having both (CLI and library) in the same package. Use something like my-library and my-library-cli. The my-library-cli package can specify my-library as a peerDependency.
As said earlier, what matters is the bin and main fields. The folder structure doesn't matter.
Also, prefer a plain compiler over a bundler. A bundler is only required if you need to import custom assets like PNG, CSS, etc. In that case, prefer Rollup over webpack due to the clean output that Rollup generates. Use webpack for application-level bundling.
To generate TS definition files, use tsc; webpack won't do that for you (see the tsconfig sketch after this list).
The setup can get complicated easily if you intend to use the new exports field instead of the main field, so I would prefer having a barrel pattern instead of allowing multiple sub-imports. Many things can go wrong: TypeScript, Jest setup, etc. on the consumer side.
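For the tsc point above, a minimal tsconfig.json sketch that emits only declaration files into the dist layout shown earlier (paths are assumptions, and it presumes TypeScript sources under src/app):
// tsconfig.json (tsconfig allows comments)
{
  "compilerOptions": {
    "declaration": true,
    "emitDeclarationOnly": true,
    "outDir": "dist/library"
  },
  "include": ["src/app"]
}
Run tsc -p tsconfig.json alongside the webpack build to produce the index.d.ts referenced by the types field.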

node.js: resolve a path relative to a test file

I am working on a function implementation that is going to be distributed as an npm package.
Once this package has been installed by a consumer as a devDependency, they should be able to load a configuration file by calling a function (let's call it setConfig) from their test files.
I am trying to understand how to resolve a relative path (relative to the test file) when the consumer invokes my function (setConfig('./../some/relative/path/to/config')).
For instance, in the following project structure:
|-- node_modules
|   `-- my-published-package
|       `-- dist/file.js  (<- this holds setConfig)
`-- src
    |-- configToLoad.json
    `-- __tests__
        `-- someTest.spec.js
The config could be specified as ../configToLoad.json, and setConfig should be able to resolve the path relative to the test file's path. In other words, the test author would write `importedClassFromPackage.setConfig('../configToLoad.json')`.
The setConfig function has the following signature:
setConfig(configPath: string): void {}
Where configPath can either be:
a relative path (from the test file that invokes that function)
an absolute path (absolute path of the configuration file)
a module specifier (e.g. my-module/path/to/config.json)
First off, when your setConfig() function runs, you have no idea at all where it was called from. You do not and cannot know the directory of the file it was called from.
As such, it is the responsibility of the caller to give you a path that you can just use directly. It needs to be either a fully qualified path or a relative path that is relative to the current working directory. It cannot be a path that is relative to the directory of the caller's module (unless that happens to be the same as the current working directory), because you have no idea what the caller's module directory is.
So, in your desired spec where you want configPath to be any of these:
a relative path (from the test file that invokes that function)
an absolute path (absolute path of the configuration file)
a module specifier (e.g. my-module/path/to/config.json)
Only the absolute path option will work or can be made to work.
The relative path won't work because your function does not know the directory of the test file you were called from.
The module specifier won't work because you don't know how to get the directory for that module name, and it's ambiguous because the same named module can be loaded from more than one place (this happens in the real world when dependencies share a common module but use different versions of it).
Note, the caller can use path.join() to join their own module directory with the name of the file and pass you that fully qualified path. In a CommonJS module, the caller gets their own directory from __dirname. In an ESM module, the caller has to extract the directory from import.meta.url.
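For illustration, a caller-side sketch (CommonJS; the package name and import shape are assumptions based on the question, and an ESM caller would derive its directory from import.meta.url via fileURLToPath instead):
// someTest.spec.js
const path = require('path');
const { importedClassFromPackage } = require('my-published-package');

// __dirname is the directory of this test file, so the joined path is
// absolute and independent of the current working directory.
importedClassFromPackage.setConfig(
  path.join(__dirname, '../configToLoad.json')
);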

Multi Level import in Python not working with Airflow

My file structure looks like this:
Module
|
|-- Common
|   |-- utils.py
|   `-- credentials.ini
|-- Folder2
|-- Folder3
`-- Folder4
    `-- Folder5
        `-- comp.py
I need to import utils.py's functions in the comp.py file, but the problem is that utils itself needs the credentials.ini file in order to work.
I solved the problem in utils.py by giving it an absolute path, like this: path = join(dirname(os.path.realpath(__file__)), 'credentials.ini')
and in the comp.py file I added the utils path to the environment using
import sys
sys.path.append("../../")
While this worked when I ran comp.py directly, I need to schedule it on Airflow. Whenever Airflow schedules comp.py to run, it can't find utils.py (Airflow and the module package are in different paths). Any idea how I can resolve this? I don't want to manually add the utils.py path to the environment.
P.S. The whole directory is initialized as a package. I have added __init__.py to the main Module folder as well as all the subdirectories in it.
EDIT: Fixed Formatting
Airflow loads DAGs in a sandboxed environment, and it does not handle all the various ways importing works when you run a Python file as a script. This is due to security and the way the different components of the distributed system work.
See https://airflow.apache.org/docs/apache-airflow/stable/modules_management.html, and for more detailed information, especially the "development" version of the documentation that will be released in 2.2 (especially the "best practices" section):
http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow/latest/modules_management.html#best-practices-for-module-loading
There are some best practices to follow:
Place all your Python files in one of the modules that are already on the PYTHONPATH.
Always use absolute imports; do not use "relative" references.
Don't rely on your current working directory (this is likely what your problem really was: your current working directory was different than you expected).
In your case, what will likely work is the following (sketched in code below):
write a method in your utils.py, for example get_credentials_folder()
in this method, use __file__ to derive the path of utils.py, and find the absolute path of the folder containing it (use pardir and abspath)
add that absolute path to sys.path
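A sketch of that approach, assuming utils.py lives in Common/ next to credentials.ini (get_credentials_folder is the name suggested above; the other helpers are illustrative):
# Common/utils.py
import os
import sys

def get_credentials_folder():
    # Absolute path of the folder containing utils.py (i.e. Common),
    # derived from __file__ so it does not depend on the CWD.
    return os.path.dirname(os.path.abspath(__file__))

def get_credentials_path():
    return os.path.join(get_credentials_folder(), 'credentials.ini')

def add_package_root_to_sys_path():
    # The parent of Common is the Module package root; put it on
    # sys.path so absolute imports like "from Common import utils"
    # work under Airflow regardless of the working directory.
    root = os.path.abspath(os.path.join(get_credentials_folder(), os.pardir))
    if root not in sys.path:
        sys.path.append(root)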

Output the binaries in the project's root bin subfolder using cmake

I'm currently on a BF interpreter project. I decided to use CMake, and it works properly.
I settled on an out-of-source build, with the following tree:
+ Project root
|
+-- src/
+-- bin/
+-- build/ <- holds the "rubbish" generated by CMake when generating the Makefile
+-- CMakeLists.txt
When I want to build the project, I run the following from the project's root folder:
cd build
cmake ..
make
In the CMakeLists.txt, I added the following line :
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_BINARY_DIR}/bin)
But I've found that it outputs the binaries in the build/bin folder, so I edited the line to:
SET(EXECUTABLE_OUTPUT_PATH "../bin/")
It works perfectly fine, but it is, IMHO, kind of ugly. Is there any "clean" way of doing this, that is, without making assumptions about the project's structure, instead using something like set(EXECUTABLE_OUTPUT_PATH "${PROJECT_ROOT}/bin")?
Thanks in advance for your replies, and sorry for any English errors I may have made, as English isn't my first language.
You can set the variable CMAKE_RUNTIME_OUTPUT_DIRECTORY to achieve this - something like:
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/bin)
Basically, the ${PROJECT_ROOT} variable you are looking for is PROJECT_SOURCE_DIR, CMAKE_SOURCE_DIR, or CMAKE_CURRENT_SOURCE_DIR. Each has a slightly different meaning, but for a basic project, these could well all point to the same directory.
Note that for the CMAKE_RUNTIME_OUTPUT_DIRECTORY variable to take effect, it must be set before the target is created, i.e. before the add_executable call in the CMakeLists.txt.
Also note that multi-configuration generators (such as the Visual Studio generators) will still append a per-configuration subdirectory to this project-root/bin folder.
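Putting it together, a minimal CMakeLists.txt sketch (project and file names are illustrative):
cmake_minimum_required(VERSION 3.10)
project(bf_interpreter C)

# Must be set before add_executable for the target to pick it up.
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/bin)

add_executable(bf src/main.c)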

Organizing School Workspace/Writing Makefile

I am doing some C programming for school and I have found myself reusing libraries that I have created, over and over again (stacks, user input, error handling, etc).
Right now, my personal SVN directory structure looks like this:
trunk/
|-- 2520
|   `-- assignments
|       `-- A2
|           |-- Makefile
|           |-- README
|           |-- calculator.c
|           |-- calculatorlib.c
|           `-- calculatorlib.h
`-- libs
    |-- misc
    |   |-- errorlib.c
    |   |-- errorlib.h
    |   |-- userinputlib.c
    |   `-- userinputlib.h
    `-- stacks
        |-- stacklib.c
        `-- stacklib.h
Some of these files (userinputlib and errorlib) get used in almost every project I work on, for obvious reasons. I want to be able to include these files in a project's workspace (2520/assignments/A2) without having to copy them, because I don't want to maintain two copies of a file, and I don't want to check two copies of the same file into SVN.
I would like to have the library files in the project workspace so that my Makefile works without too much manual configuration (or hard-coded paths).
At first, I thought about symbolic links (which SVN and tar support), but I can't seem to compile my assignment with my headers in another directory.
I can manually compile each library source file to an object file and do the final linking, but I'm not sure how to do this in a Makefile automatically.
Any help, or any alternative to how I have my environment set up, is appreciated.
Thanks!
EDIT: I forgot to mention that I have searched Google and found a few pages describing automatic dependency generation (perhaps I want this?) using gcc -MM, and I have read over the GNU Make manual, but nothing jumped out at me.
You could use Subversion's externals feature to link a dynamic copy of the libs tree as a subdirectory of your project.
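For example, hypothetical commands (repository URL layout assumed) to map a libs/ subdirectory of A2 onto the shared libs tree:
# Run from a working copy containing A2; ^/ refers to the repository root.
svn propset svn:externals "libs ^/trunk/libs" A2
svn commit -m "Pull shared libs into A2 via svn:externals" A2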
But in the end, it may be better to just use a copy (a Subversion copy, so it would effectively be a branch of the library code), so that you don't need to worry about changes to the library affecting existing projects, and you can merge changes around as needed.
Why wouldn't you want to (svn) copy the necessary libs into your project? You'll essentially be creating a branch of your lib for use in your project. If you end up discovering a bug in your library code, you can fix it in place and commit it back (to the copy). Once you're wrapping up, you can merge your fixes back to the library's trunk location.
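For the Makefile side of the question, a minimal sketch (compiler flags and file names are assumptions; it presumes the shared libs tree is visible at libs/ inside A2, e.g. via the externals mapping above, or at ../../../libs in a plain checkout — adjust LIBS_DIR to match):
# A2/Makefile
LIBS_DIR := libs

CC     := gcc
CFLAGS := -Wall -I$(LIBS_DIR)/misc -I$(LIBS_DIR)/stacks

# Let make search the library directories for .c prerequisites.
VPATH := $(LIBS_DIR)/misc:$(LIBS_DIR)/stacks

OBJS := calculator.o calculatorlib.o errorlib.o userinputlib.o stacklib.o

calculator: $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^

# One pattern rule builds any object from the matching .c file,
# wherever VPATH finds it.
%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<

clean:
	rm -f calculator $(OBJS)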
