Tesseract not using path variable - linux

Why does my Tesseract instance require me to explicitly set my datapath, but doesn't want to read the environment variable?
Let me clarify: running the code
ITesseract tesseract = new Tesseract();
String result = tesseract.doOCR(myImage);
throws an error:
Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the
parent directory of your "tessdata" directory.
I have already set my environment variable, i.e. running
echo $TESSDATA_PREFIX returns /usr/share/tessdata/
Now, setting the data path explicitly in my code, i.e.:
ITesseract tesseract = new Tesseract();
tesseract.setDatapath("/usr/share/tessdata/");
String result = tesseract.doOCR(myImage);
WORKS PERFECTLY. Why?
I'm using Manjaro 17.0.5

The library was initially designed to use the data files bundled in its tessdata folder. In your case, if you want to read from the standard tessdata directory, you would want to set datapath as follows:
tesseract.setDatapath(System.getenv("TESSDATA_PREFIX"));

Related

env file format weirdly

I'm working on a Node project and using a .env file to hide some data. The .env file worked normally, but after I added more information the variable names became colored and my variables are no longer detected when I use:
mongodb+srv://MONGODB_USERNAME:MONGODB_PASSWORD@cluster0.ekxmb.mongodb.net/MONGODB_DATABASE_NAME?retryWrites=true&w=majority
I'm having following code in my .env file:
MONGODB_PASSWORD = testpassword;
MONGODB_USERNAME = testuser;
MONGODB_DATABASE_NAME = testdbname;
It works when I hard-code the values in app.js like this:
"mongodb+srv://testuser:testpassword@cluster0.ekxmb.mongodb.net/testdbname?retryWrites=true&w=majority"
After I added some additional variables, the file was displayed oddly, with colors on the variable names.
Note: the code that imports variables from the .env file worked before, but it stopped working once the .env file started looking like this.
Putting a variable name inside a string just produces that literal name; nothing is substituted. In addition, you don't load the environment variables anywhere.
Install the dotenv npm package for processing the .env file. Then, add this to your code to load the config file:
require('dotenv').config();
Create variables for each of your environment variables from process.env.YOUR_VARIABLE_NAME. An easy way to do this is with destructuring:
let {MONGODB_USERNAME, MONGODB_PASSWORD, MONGODB_DATABASE_NAME} = process.env;
Properly insert these variables into the string. You can use template literals for this:
`mongodb+srv://${MONGODB_USERNAME}:${MONGODB_PASSWORD}@cluster0.ekxmb.mongodb.net/${MONGODB_DATABASE_NAME}?retryWrites=true&w=majority`
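Putting those steps together, here is a minimal sketch of how the connection string could be built. The function name buildMongoUri is hypothetical, and the percent-encoding of the credentials is an addition not present in the answer above; it assumes the variables have already been loaded into process.env (for example by dotenv), or are passed in explicitly.

```javascript
// Build the connection string from environment-style values.
// `env` defaults to process.env; the host is the one used in the question.
function buildMongoUri(env = process.env) {
  const { MONGODB_USERNAME, MONGODB_PASSWORD, MONGODB_DATABASE_NAME } = env;
  const required = { MONGODB_USERNAME, MONGODB_PASSWORD, MONGODB_DATABASE_NAME };

  // Fail early with a clear message instead of connecting with "undefined".
  for (const [name, value] of Object.entries(required)) {
    if (!value) throw new Error(`Missing environment variable: ${name}`);
  }

  // Percent-encode the credentials so characters such as "@" or ":" do not break the URI.
  const user = encodeURIComponent(MONGODB_USERNAME);
  const pass = encodeURIComponent(MONGODB_PASSWORD);
  return `mongodb+srv://${user}:${pass}@cluster0.ekxmb.mongodb.net/${MONGODB_DATABASE_NAME}?retryWrites=true&w=majority`;
}
```

Calling buildMongoUri() with no arguments reads from process.env; passing a plain object instead makes the function easy to check without touching the real environment.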

Access to ENV variables defined in ${workspaceFolder}/.env files

For a project I need to define the ENVIRO variable (and some others) with the values prod/stage/dev.
This variable is used in .devcontainer/docker-compose.yml, .devcontainer/Dockerfile, some shell scripts and the Python source to set paths and the like.
Therefore I defined the file ${workspaceFolder}/.env which is imported by the Python extension like described here:
ENVIRO=dev
...
To avoid executing or debugging my code in the wrong environment, I wanted to create a small VSC extension that does nothing but show the value of the ENVIRO variable in the Status Bar at the bottom.
Now the problem: in the extension's activate function I can't access the ENV variables defined in the .env file:
const envValue = process.env["ENVIRO"];
// gives: undefined
In a terminal in the same VSC instance:
echo $ENVIRO
# gives: dev
ENV variables defined by the system (not the .env file) can be accessed without problems in the extension's activate function:
export function activate(context: vscode.ExtensionContext) {
    const envValue = process.env["NVM_BIN"];
    // gives: '/Users/andi/.nvm/versions/node/v14.15.1/bin'
}
Is there no way to access this variable?
My suspicion is the following:
The Python extension extends the environment with these variables using the EnvironmentVariableCollection.
This adds them to the terminal environment, but prevents other extensions from accessing them.
Or am I (hopefully) missing something?
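One possible way around this, sketched under assumptions: instead of relying on process.env, an extension could read ${workspaceFolder}/.env itself. The helper names parseEnvText and readEnvVar are hypothetical, and the parser handles only plain KEY=value lines. In a real extension the folder path would presumably come from vscode.workspace.workspaceFolders; here it is passed in as a plain string so the sketch runs outside VS Code.

```javascript
const fs = require('fs');
const path = require('path');

// Parse KEY=value lines from .env-style text.
// Blank lines and lines starting with "#" are skipped; quoting is not handled.
function parseEnvText(text) {
  const vars = Object.create(null);
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // not a KEY=value line
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Read one variable from <folder>/.env; returns undefined if the file or key is missing.
function readEnvVar(folder, name) {
  const envPath = path.join(folder, '.env');
  if (!fs.existsSync(envPath)) return undefined;
  return parseEnvText(fs.readFileSync(envPath, 'utf8'))[name];
}
```

With the .env file from the question, readEnvVar(folder, 'ENVIRO') would return 'dev', which could then be shown in the status bar.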

environment variable in a config.properties file

I'm trying to compile a Maven project that has a config.properties file. The file references a set of environment variables that I have to set before compiling.
In the config.properties file the variables are called like this:
${sys:rdfstore.host}:${sys:rdfstore.port}/openrdf-sesame/repositories/iserve/rdf-graphs/service
How do I set the variable rdfstore.host, and what value should it have?
I have tried to solve this with:
export rdfstore.host="localhost"
However, the shell rejects this as an invalid identifier because the name contains a dot ("."). How can I solve this problem?
You are probably confusing environment variables with system properties:
Variables exported from your shell, as you did with the export command, are called environment variables, and their names should not contain dots.
They are then referred to using ${env.XXX}, meaning in your case you should change the variable name to:
export RDFSTORE_HOST="localhost"
It can then be referred to as below:
${env.RDFSTORE_HOST}
System properties are those introduced on the command line when invoking a Maven phase; their names can contain dots:
mvn -Drdfstore.host="localhost"
They can be referred to as follows:
${rdfstore.host}
You can find more information in the Maven properties manual.

Require file somewhere in the directory node.js

I have a file that is required in many other files, that are on different folders, inside the main directory.
Is there a way to just require the filename without having to write the relative path, or the absolute path? Like require('the_file'). And without having to go to npm and install it?
Create a folder inside your main directory, put the_file.js inside it, and set the NODE_PATH variable to this folder.
Example:
Let's say you create a ./libs folder within your main directory; you can then use:
export NODE_PATH=/.../main/libs
After that, you can require any module inside this directory using just:
var thefile = require('the_file')
To avoid doing that every time, add the variable to your .bashrc (assuming you're running a Unix system).
Alternatively, you can set a global variable inside your app.js file and store the path of the directory containing 'the_file' in it, like so:
global.rootPath = __dirname;
Then you can require it from any of your files using:
var thefile = require(rootPath+'/the_file')
These are the most convenient methods for me, short of creating a private npm package, but there are a few other alternatives that I found while looking up an answer to your question; have a look here: https://gist.github.com/branneman/8048520
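As a rough sketch of the root-path approach: the helper requireFromRoot is hypothetical, and the temporary directory merely stands in for the project root so the example runs anywhere. In a real app, rootPath would be set from __dirname in app.js, as described above. Using path.join rather than string concatenation avoids hand-written separators.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for the project root: a throwaway directory containing the_file.js.
const rootPath = fs.mkdtempSync(path.join(os.tmpdir(), 'root-'));
fs.writeFileSync(
  path.join(rootPath, 'the_file.js'),
  "module.exports = { name: 'the_file' };\n"
);

// Hypothetical helper: require a module by its path relative to the project root.
function requireFromRoot(relativePath) {
  return require(path.join(rootPath, relativePath));
}

const theFile = requireFromRoot('the_file');
console.log(theFile.name); // prints: the_file
```

A helper like this keeps the root path in one place without adding a global, at the cost of importing the helper wherever it is needed.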

How do you get the path of the running script in groovy?

I'm writing a groovy script that I want to be controlled via a properties file stored in the same folder. However, I want to be able to call this script from anywhere. When I run the script it always looks for the properties file based on where it is run from, not where the script is.
How can I access the path of the script file from within the script?
You are correct that new File(".").getCanonicalPath() does not work. That returns the working directory.
To get the script directory
scriptDir = new File(getClass().protectionDomain.codeSource.location.path).parent
To get the script file path
scriptFile = getClass().protectionDomain.codeSource.location.path
As of Groovy 2.3.0 the @SourceURI annotation can be used to populate a variable with the URI of the script's location. This URI can then be used to get the path to the script:
import groovy.transform.SourceURI
import java.nio.file.Path
import java.nio.file.Paths
@SourceURI
URI sourceUri
Path scriptLocation = Paths.get(sourceUri)
Note that this will only work if the URI is a file: URI (or another URI scheme type with an installed FileSystemProvider), otherwise a FileSystemNotFoundException will be thrown by the Paths.get(URI) call. In particular, certain Groovy runtimes such as groovyshell and nextflow return a data: URI, which will not typically match an installed FileSystemProvider.
This makes sense if you are running the Groovy code as a script, otherwise the whole idea gets a little confusing, IMO. The workaround is here: https://issues.apache.org/jira/browse/GROOVY-1642
Basically this involves changing startGroovy.sh to pass in the location of the Groovy script as an environment variable.
As long as this information is not provided directly by Groovy, it's possible to modify the groovy.(sh|bat) starter script to make this property available as system property:
For unix boxes just change $GROOVY_HOME/bin/groovy (the sh script) to do
export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"
before calling startGroovy
For Windows:
In startGroovy.bat add the following 2 lines right after the line with the :init label (just before the parameter slurping starts):
@rem get name of script to launch with full path
set GROOVY_SCRIPT_NAME=%~f1
A bit further down in the batch file, after the line that says
set JAVA_OPTS=%JAVA_OPTS% -Dgroovy.starter.conf="%STARTER_CONF%"
add the line
set JAVA_OPTS=%JAVA_OPTS% -Dscript.name="%GROOVY_SCRIPT_NAME%"
For Gradle users:
I had the same issue when I started working with Gradle. I wanted to compile my Thrift files with a remote Thrift compiler (a custom one from my company).
Below is how I solved my issue:
task compileThrift {
    doLast {
        def projectLocation = projectDir.getAbsolutePath(); // HERE is what you've been looking for.
        ssh.run {
            session(remotes.compilerServer) {
                // Delete existing thrift file.
                cleanGeneratedFiles()
                new File("$projectLocation/thrift/").eachFile() { f ->
                    def fileName=f.getName()
                    if(f.absolutePath.endsWith(".thrift")){
                        put from: f, into: "$compilerLocation/$fileName"
                    }
                }
                execute "mkdir -p $compilerLocation/gen-java"
                def compileResult = execute "bash $compilerLocation/genjar $serviceName", logging: 'stdout', pty: true
                assert compileResult.contains('SUCCESSFUL')
                get from: "$compilerLocation/$serviceName" + '.jar', into: "$projectLocation/libs/"
            }
        }
    }
}
One more solution. It works perfectly even if you run the script using GroovyConsole:
File getScriptFile(){
    new File(this.class.classLoader.getResourceLoader().loadGroovySource(this.class.name).toURI())
}
println getScriptFile()
Workaround: in our case the script ran in an Ant environment, so we stored a known ancestor directory (we knew the subpath from there) in the Java system properties (System.setProperty( "dirAncestor", "/foo" )) and could then read it from Groovy via properties.get('dirAncestor').
This may help in some of the scenarios mentioned here.
