How to configure the heron-core file in Heron 0.17.5? - heron

When I used Heron 0.17.1, I could configure the heron-core file as follows:
# location of the core package
heron.package.core.uri: "/heron/dist/heron-core.tar.gz"
# Whether role/env is required to submit a topology. Default value is False.
heron.config.is.role.required: True
heron.config.is.env.required: True
With this config I deployed Heron 0.17.1 using HDFS as its uploader, so I copied the local /heron/dist/heron-core.tar.gz file to hdfs://heron/disk in HDFS, and it worked.
However, when I upgraded Heron from 0.17.1 to 0.17.5, I found there is no heron-core.tar.gz file in the local /heron/dist directory, yet the heron-core.tar.gz URI still needs to be configured in Client.yaml:
# location of the core package
# heron.package.core.uri: "file:///vagrant/.herondata/dist/heron-core-release.tar.gz"
# Whether role/env is required to submit a topology. Default value is False.
heron.config.is.role.required: True
heron.config.is.env.required: True
So how should I configure the heron-core URI in Client.yaml when using Heron 0.17.5? To be specific, I tested a Heron cluster without the heron-core URI configured, and it didn't work. The relevant change in 0.17.5 is #2684.
Thanks for your answer.

"I found there is no core-core.tar.gz file in local /heron/dist directory.". You mean "heron-core.tar.gz"? I checked the centos build (https://github.com/apache/incubator-heron/releases/download/0.17.5/heron-0.17.5-centos.tar.gz) and I believe heron-core.tar.gz is in the dist directory.

Related

Unable to read config file using configparser in Databricks

I want to read some values as parameters using configparser in Databricks.
I can import the configparser module in Databricks, but I am unable to read the parameters from the config file; it fails with a KeyError.
The problem is that your file is located on DBFS (the /FileStore/... path), and this file system isn't understood by configparser, which works with the "local" file system. To get this working, you need to prepend the /dbfs prefix to the file path: /dbfs/FileStore/....
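For example, a minimal sketch of that approach (the /dbfs/FileStore/configs/config.ini path and the final print are just placeholders for illustration):

import configparser

config = configparser.ConfigParser()
# Read the DBFS file through the local /dbfs mount point
# (placeholder path; substitute the real location of your .ini file).
config.read("/dbfs/FileStore/configs/config.ini")

print(config.sections())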
P.S. It may not work on Community Edition with DBR 7.x. In that case, just copy the config file to the local file system with dbutils.fs.cp before reading it, like this:
dbutils.fs.cp("/FileStore/...", "file:///tmp/config.ini")  # copy from DBFS to the driver's local disk
config = configparser.ConfigParser()
config.read("/tmp/config.ini")

Logstash 5 configure log4j logging for itself (not as plugin)

This is just for future reference since I solved it myself.
When I switched from Logstash 2.x to 5.x, I was dealing with this warning (while running my Logstash from the path D:\somepath\logstash-5.0.1):
Could not find log4j2 configuration at path /somepath/logstash-5.0.1/config/log4j2.properties. Using default config which logs to console
After some searching on the internet and digging into the Ruby code (in the extracted Logstash), I found out the following:
it is necessary to use path.settings (as mentioned many times) correctly;
the file or directory must be given correctly as a URL path.
Finally, I now run my Logstash as:
logstash.bat --path.settings=file://D:/somepath/logstash-5.0.1/config

Which directory contains third party libraries for Spark

When we use
spark-submit
which directory contains the third-party libraries that will be loaded on each of the slaves? I would like to scp one or more libraries to each slave instead of shipping their contents in the application uber-jar.
Note: I did try adding to
$SPARK_HOME/lib_managed/jars
But spark-submit still results in a ClassNotFoundException for classes included in the added library.
Hope these points will help you.
$SPARK_HOME/lib/ [contains the jar files]
$SPARK_HOME/bin/ [contains the launch scripts: spark-submit, spark-class, pyspark, compute-classpath.sh, etc.]
spark-submit ---will call---> spark-class.
spark-class internally calls compute-classpath.sh before executing/launching the job.
compute-classpath.sh adds the jars available in $SPARK_HOME/lib to the CLASSPATH.
(Run ./compute-classpath.sh to see the jars it picks up from the lib dir.)
So try these options.
Option 1: placing user-specific jars in $SPARK_HOME/lib/ works.
Option 2: tweak compute-classpath.sh so that it can pick up your jars from a user-specific jars directory.

How to make Tornado auto-restart when certain files change

I'm using Tornado with Python 3 on a Linux server. When I edit and save some text or XML files, I want Tornado to restart itself. I checked the documentation and found the autoreload module and the watch function.
It seems it only works for Python files. What can I do if I want it to reload when a certain URI is modified?
Setting the debug flag to True in settings forces Tornado to reload whenever a file is modified or whenever a URI is changed in app.py (or wherever you have defined your handlers). Tornado also automatically reloads template files, so any changes there will be seen instantly.
settings = {
    'debug': True,
    # other stuff
}
tornado.web.Application.__init__(self, handlers, **settings)
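For reference, a minimal self-contained sketch of the debug approach (the handler, route, and port 8888 are just assumptions for illustration):

import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

if __name__ == "__main__":
    settings = {'debug': True}  # enables autoreload, among other debug features
    app = tornado.web.Application([(r"/", MainHandler)], **settings)
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()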
The file added must be given as an absolute path.
import os
from tornado import autoreload

def addwatchfiles(*paths):
    for p in paths:
        autoreload.watch(os.path.abspath(p))

addwatchfiles('config.xml')
config.xml is in the same directory the server's Python file is started from.
You need to turn autoreload on:
tornado.autoreload.start()
tornado.autoreload.watch('myfile')
Complete example at https://gist.github.com/renaud/10356841
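Putting the pieces above together, here is a minimal sketch (the handler, port 8888, and the config.xml file name are assumptions) that restarts the server whenever config.xml changes:

import os

import tornado.autoreload
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello")

if __name__ == "__main__":
    app = tornado.web.Application([(r"/", MainHandler)])
    app.listen(8888)
    # Start autoreload explicitly and also watch a non-Python file.
    tornado.autoreload.start()
    tornado.autoreload.watch(os.path.abspath("config.xml"))
    tornado.ioloop.IOLoop.current().start()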

Configuration does not have any version information - assuming the configuration is for version 1.5

Where in ccnet.config do I set the version? I searched and read the docs, but there is no version mentioned.
You can add this line in your config file:
<cruisecontrol xmlns:cb="urn:ccnet.config.builder" xmlns="http://thoughtworks.org/ccnet/1/5">
It is mentioned in this tracker issue, CCNET-1870:
http://jira.public.thoughtworks.org/browse/CCNET-1870
@Andy I got the same error in one of my configurations.
If you use cb:include in your configuration file, you should also add the version xmlns URL to the cb:config-template element in the configuration files included by cb:include.
