Spark Custom Logging - apache-spark

I have multiple Spark projects in my IDE. By default, Spark picks up the log4j.properties file in the spark/conf folder.
Because I have multiple Spark projects, I want multiple log4j.properties files (one per project), ideally as part of the project code (in the resources folder).
Is there a way to pick up a specified log4j.properties file instead of the default one?
Note:
I tried this:
    --driver-java-options "-Dlog4j.configuration=file:///usr/local/Cellar/apache-spark/2.4.1/libexec/conf/driver_log4j.properties"
and it worked without any issues. However, I want to load the log4j.properties file from the resources folder when creating the Spark logger, as in the class below.
    class SparkLogger():
        def __init__(self, app_name, sparksession=None):
            self._spark = sparksession
            self.log4jLogger = None
            if self._spark is not None:
                sparkContext = self._spark.sparkContext
                self.log4jLogger = sparkContext._jvm.org.apache.log4j
                self.log4jLogger = self.log4jLogger.LogManager.getLogger(app_name)

        def info(self, info):
            if self.log4jLogger:
                self.log4jLogger.info(str(info))

        def error(self, info):
            if self.log4jLogger:
                self.log4jLogger.error(str(info))

        def warn(self, info):
            if self.log4jLogger:
                self.log4jLogger.warn(str(info))

        def debug(self, info):
            if self.log4jLogger:
                self.log4jLogger.debug(str(info))

You have to define logger properties for your application name in the log4j.properties file. When you then call getLogger with that application name, you get the logger configured for that application and can generate logs on a per-application basis.
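As a rough illustration of this answer, the sketch below assumes Spark 2.4.x with log4j 1.x (as in the question) and only deals with the driver JVM. It generates a properties file in which a logger named after the application gets its own file appender, then loads that file with log4j's PropertyConfigurator through the same _jvm gateway the question's class uses. The helper names (build_log4j_properties, write_log4j_properties, configure_driver_logging), the appender name appFile, and the conversion pattern are illustrative assumptions, not part of the original answer. It also assumes the application name contains no spaces or "=" characters.

```python
import os


def build_log4j_properties(app_name, log_file, level="INFO"):
    """Return log4j 1.x properties text that defines a logger named app_name."""
    appender = "appFile"  # hypothetical appender name, chosen for this sketch
    lines = [
        "log4j.logger.%s=%s, %s" % (app_name, level, appender),
        # Keep this logger's messages out of the root logger's appenders.
        "log4j.additivity.%s=false" % app_name,
        "log4j.appender.%s=org.apache.log4j.FileAppender" % appender,
        "log4j.appender.%s.File=%s" % (appender, log_file),
        "log4j.appender.%s.layout=org.apache.log4j.PatternLayout" % appender,
        "log4j.appender.%s.layout.ConversionPattern=" % appender
        + "%d{yyyy-MM-dd HH:mm:ss} %p %c: %m%n",
    ]
    return "\n".join(lines) + "\n"


def write_log4j_properties(path, app_name, log_file, level="INFO"):
    """Write the generated properties to path and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_log4j_properties(app_name, log_file, level))
    return path


def configure_driver_logging(spark, properties_path):
    """Load a properties file into the driver JVM's log4j 1.x configuration."""
    if not os.path.isfile(properties_path):
        raise FileNotFoundError(properties_path)
    log4j = spark.sparkContext._jvm.org.apache.log4j
    # PropertyConfigurator.configure(String) is part of log4j 1.x.
    log4j.PropertyConfigurator.configure(os.path.abspath(properties_path))
```

With a live session, configure_driver_logging(spark, path) would be called before LogManager.getLogger(app_name). This only affects the driver JVM after it has started. Executors run in their own JVMs and would still need a JVM option such as -Dlog4j.configuration.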

I attempted to build custom logging just as you described in your question, but in the end it failed and the effort was wasted.
Finally I chose java.util.logging instead of log4j. It is the logging utility built into the JDK. I use it because I want to log information only for myself into a specified file.
The class is shown below.
    package org.apache.spark.internal

    import java.io.File
    import java.text.SimpleDateFormat
    import java.util.Date
    import java.util.logging._
    import scala.collection.mutable

    protected [spark] object YLogger extends Serializable with Logging {
      private var ylogs_ = new mutable.HashMap[String, Logger]()

      private def initializeYLogging(className: String): Unit = {
        // Here we set log file onto user's home.
        val rootPath = System.getProperty("user.home")
        val logPath = rootPath + File.separator + className + ".log"
        logInfo(s"Create ylogger for class [${className}] with log file named [${logPath}]")
        val log_ = Logger.getLogger(className)
        val fileHandler = new FileHandler(logPath, false)
        val formatter = new Formatter {
          override def format(record: LogRecord): String = {
            val time = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date)
            new StringBuilder()
              .append("[")
              .append(className)
              .append("]")
              .append("[")
              .append(time)
              .append("]")
              .append(":: ")
              .append(record.getMessage)
              .append("\r\n")
              .toString
          }
        }
        fileHandler.setFormatter(formatter)
        log_.addHandler(fileHandler)
        ylogs_.put(className, log_)
      }

      private def ylog(logName: String): Logger = {
        if (!ylogs_.contains(logName)) {
          initializeYLogging(logName)
        }
        ylogs_.get(logName).get
      }

      def ylogInfo(logName: String)(info: String): Unit = {
        if (ylog(logName).isLoggable(Level.INFO)) ylog(logName).info(info)
      }

      def ylogWarning(logName: String)(warning: String): Unit = {
        if (ylog(logName).isLoggable(Level.WARNING)) ylog(logName).warning(warning)
      }
    }
You can use it like this:
    YLogger.ylogInfo("logFileName") ("This is a log.")
It's quite simple to use. I hope my answer helps you.
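For PySpark users, roughly the same idea (one dedicated log file per name, independent of log4j) can be sketched with Python's standard logging module instead of java.util.logging. This is a swapped-in technique rather than a port of the Scala class. The names get_file_logger and log_dir are hypothetical, and the sketch only captures messages logged from the Python driver process.

```python
import logging
import os

# Cache so that repeated calls do not attach a second handler to the same logger.
_loggers = {}


def get_file_logger(name, log_dir=None):
    """Return a logger writing to <log_dir>/<name>.log (home directory by default)."""
    if name in _loggers:
        return _loggers[name]
    log_dir = log_dir or os.path.expanduser("~")
    log_path = os.path.join(log_dir, name + ".log")
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these messages out of the root logger
    # mode="w" truncates the file on creation, like FileHandler(logPath, false) in the Scala class.
    handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    handler.setFormatter(
        logging.Formatter(
            "[%(name)s][%(asctime)s]:: %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
    )
    logger.addHandler(handler)
    _loggers[name] = logger
    return logger
```

Usage mirrors the Scala call: get_file_logger("logFileName").info("This is a log.")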

Related

Groovy call another script to set variables

I'm trying to define variables in another groovy script that I want to use in my current script. I have two scripts like this:
script1.groovy
    thing = evaluate(new File("script2.groovy"))
    thing.setLocalEnv()
    println(state)
script2.groovy
    static def setLocalEnv(){
        def state = "hi"
        def item = "hey"
    }
When I println(state), I get a missing property exception. Basically I want script2 to have config variables that I can load in the context of script1. How can I do this?

I'm not sure exactly what you want to do or how, but I guess you can achieve your goal using one of the classes in Groovy's dynamic scripting capabilities: groovy.lang.Binding, GroovyClassLoader, or GroovyScriptEngine. Here is an example using the GroovyShell class:
    abstract class MyScript extends Script {
        String name
        String greet() {
            "Hello, $name!"
        }
    }

    import org.codehaus.groovy.control.CompilerConfiguration

    def config = new CompilerConfiguration()
    config.scriptBaseClass = 'MyScript'
    def shell = new GroovyShell(this.class.classLoader, new Binding(), config)
    def script = shell.parse('greet()')
    assert script instanceof MyScript
    script.setName('covfefe')
    assert script.run() == 'Hello, covfefe!'
This is one way to bind a variable to an external script file. More examples are in the documentation:
http://docs.groovy-lang.org/latest/html/documentation/guide-integrating.html
P.S. Loading an external file can be done with GroovyClassLoader:
    def gcl = new GroovyClassLoader()
    def clazz2 = gcl.parseClass(new File(file.absolutePath))
Hope this helps.

Getting variables from a different file in Jenkins Pipeline

I have a constants.groovy file as below:
    def testFilesList = 'test-package.xml'
    def testdataFilesList = 'TestData.xml'
    def gitId = '9ddfsfc4-fdfdf-sdsd-bd18-fgdgdgdf'
I have another Groovy file that is called in a Jenkins pipeline job:
    def constants
    node ('auto111') {
        stage("First Stage") {
            container('alpine') {
                script {
                    constants = evaluate readTrusted('jenkins_pipeline/constants.groovy')
                    def gitCredentialsId = constants."${gitId}"
                }
            }
        }
    }
But constants."${gitId}" fails with "cannot get gitID from null object". How do I get it?

It's because they are local variables and cannot be referenced from outside. Use @Field to turn them into fields.
    import groovy.transform.Field

    @Field
    def testFilesList = 'test-package.xml'
    @Field
    def testdataFilesList = 'TestData.xml'
    @Field
    def gitId = '9ddfsfc4-fdfdf-sdsd-bd18-fgdgdgdf'
    return this;
Then, in the main script, load it using the load step.
    script {
        // make sure that the file exists on this node
        checkout scm
        def constants = load 'jenkins_pipeline/constants.groovy'
        def gitCredentialsId = constants.gitId
    }
You can find more details about variable scope in this answer.

Parsing xml file at build time and modify its values/content

I want to parse an XML file at build time in build.gradle and modify some of its values. I followed this SO question and answer, "Load, modify, and write an XML document in Groovy", but I am not able to get any change in my XML file. Can anyone help me out? Thanks.
Code in build.gradle:
    def overrideLocalyticsValues(String token) {
        def xmlFile = "/path_to_file_saved_in_values/file.xml"
        def locXml = new XmlParser().parse(xmlFile)
        locXml.resources[0].each {
            it.@ll_app_key = "ll_app_key"
            it.value = "123456"
        }
        XmlNodePrinter nodePrinter = new XmlNodePrinter(new PrintWriter(new FileWriter(xmlFile)))
        nodePrinter.preserveWhitespace = true
        nodePrinter.print(locXml)
    }
XML file:
    <resources>
        <string name="ll_app_key" translatable="false">MY_APP_KEY</string>
        <string name="ll_gcm_sender_id" translatable="false">MY_GCM_SENDER_ID</string>
    </resources>

In your code, is this right? Where are the node name and the attribute?
    locXml.resources[0].each { // wrongly entered, without a node name
        it.@ll_app_key = "ll_app_key" // the attribute name is @name
        it.value = "123456" // you can't change or load values here
    }
You tried to read and write the same file. Try this code, which replaces the exact node of the XML file. XmlSlurper provides this facility.
Updated:
    import groovy.util.XmlSlurper
    import groovy.xml.XmlUtil

    def xmlFile = "test.xml"
    def locXml = new XmlSlurper().parse(xmlFile)
    locXml.'**'.findAll{ if (it.name() == 'string' && it.@name == "ll_app_key") it.replaceBody 12345 }
    new File (xmlFile).text = XmlUtil.serialize(locXml)

Groovy has a better method for this than the basic replacement you're attempting: the SimpleTemplateEngine.
    import groovy.text.SimpleTemplateEngine

    static void main(String[] args) {
        def templateEngine = new SimpleTemplateEngine()
        def template = '<someXml>${someValue}</someXml>'
        def templateArgs = [someValue: "Hello world!"]
        println templateEngine.createTemplate(template).make(templateArgs).toString()
    }
Output:
    <someXml>Hello world!</someXml>

How to run a Data Import Script in Separate Thread Grails/Groovy?

I import data from Excel, but I wrote the code in BootStrap.groovy, so my import method is called when the application starts.
The scenario is that I have 8000 related records to import once if they are not already in my database. Also, when I deploy to tomcat6, it blocks other apps from deploying until the import finishes. So I want to run the script in a separate thread, without affecting performance and without blocking other deployments.
Code excerpt ...
    class BootStrap {
        def grailsApplication
        def sessionFactory
        def excelService

        def importStateLgaArea(){
            String fileName = grailsApplication.mainContext.servletContext.getRealPath('filename.xlsx')
            ExcelImporter importer = new ExcelImporter(fileName)
            def listState = importer.getStateLgaTerritoryList() // get the map from Excel
            log.info "List from excel:${listState}"
            def checkPreviousImport = Area.findByName('Osusu')
            if(!checkPreviousImport) {
                int i = 0
                int j = 0 // update cases
                def beforeTime = System.currentTimeMillis()
                for(row in listState){
                    def state = State.findByName(row['state'])
                    if(!state) {
                        // log.info "Saving State:${row['state']}"
                        row['state'] = row['state'].toString().toLowerCase().capitalize()
                        // log.info "after capitalized" + row['state']
                        state = new State(name:row['state'])
                        if(!state.save(flush:true)){
                            log.info "${state.errors}"
                            break;
                        }
                    }
                }
            }

For importing large amounts of data, I suggest considering Spring Batch. It is easy to integrate into Grails. You can try this plugin or integrate it manually.

groovy script classpath

I'm writing a script in Groovy and I would like someone to be able to execute it simply by running ./myscript.groovy. However, this script requires a 3rd party library (MySQL JDBC), and I don't know of any way to provide this to the script other than via a -classpath or -cp argument, e.g.
`./monitor-vouchers.groovy -cp /path/to/mysql-lib.jar`
For reasons I won't go into here, it's not actually possible to provide the JAR location to the script using the -classpath/-cp argument. Is there some way that I can load the JAR from within the script itself? I tried using @Grab:
    import groovy.sql.Sql

    @Grab(group='mysql', module='mysql-connector-java', version='5.1.19')
    def getConnection() {
        def dbUrl = 'jdbc:mysql://database1.c5vveqm7rqgx.eu-west-1.rds.amazonaws.com:3306/vouchers_prod'
        def dbUser = 'pucaroot'
        def dbPassword = 'password'
        def driverClass = "com.mysql.jdbc.Driver"
        return Sql.newInstance(dbUrl, dbUser, dbPassword, driverClass)
    }
    getConnection().class
But this causes the following error:
    Caught: java.sql.SQLException: No suitable driver
    java.sql.SQLException: No suitable driver
        at monitor-vouchers.getConnection(monitor-vouchers.groovy:13)
        at monitor-vouchers.run(monitor-vouchers.groovy:17)
Is there a way I can execute this script using just ./monitor-vouchers.groovy?

You should be able to do:
    import groovy.sql.Sql

    @GrabConfig(systemClassLoader=true)
    @Grab('mysql:mysql-connector-java:5.1.19')
    def getConnection() {
        def dbUrl = 'jdbc:mysql://database1.c5vveqm7rqgx.eu-west-1.rds.amazonaws.com:3306/vouchers_prod'
        def dbUser = 'pucaroot'
        def dbPassword = 'bigsecret'
        def driverClass = "com.mysql.jdbc.Driver"
        return Sql.newInstance(dbUrl, dbUser, dbPassword, driverClass)
    }
    getConnection().class

Two more options:
1. Put the JAR in ${user.home}/.groovy/lib.
2. If the JAR is in a known location, use this code to load it into the current class loader:
    this.class.classLoader.rootLoader.addURL( new URL() )
