How to pass parameters into your ETL job? - kiba-etl

I am building an ETL job which will be run on different sources, selected by a variable.
How can I execute my job (rake task)
Kiba.run(Kiba.parse(IO.read(etl_file),etl_file))
and pass in parameters for my etl_file to then use for its sources?
source MySourceClass(variable_from_rake_task)

Author of Kiba here.
EDIT: the solution below still applies, but if you need more flexibility, you can call Kiba.parse with a block. See https://github.com/thbar/kiba/wiki/Considerations-for-running-Kiba-jobs-programmatically-(from-Sidekiq,-Faktory,-Rake,-...) for a detailed explanation.
Since you are using a Rake task (and not calling Kiba in a parallel environment, like Resque or Sidekiq), what you can do right now is leverage ENV variables, like this:
CUSTOMER_IDS=10,11,12 bundle exec kiba etl/upsert-customers.etl
Or, if you are using a rake task you wrote, you can do:
task :upsert_customers => :environment do
ENV['CUSTOMER_IDS'] = [10, 11, 12].join(',')
etl_file = 'etl/upsert-customers.etl'
Kiba.run(Kiba.parse(IO.read(etl_file),etl_file))
end
Then in upsert-customers.etl:
# quick parsing
ids = ENV['CUSTOMER_IDS'].split(',').map { |c| Integer(c) }
source Customers, ids: ids
As I stated before, this will only work for command line mode, where ENV can be leveraged safely.
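The quick parsing above can also be hardened a little, so malformed input fails fast. A minimal, self-contained Ruby sketch (the helper name parse_ids is illustrative, not part of Kiba; the ENV variable name matches the rake task above):

```ruby
# Parse a comma-separated list of integer IDs from an ENV variable.
# Integer() raises ArgumentError on malformed input, so bad values
# fail fast instead of being silently coerced to 0 (as to_i would).
def parse_ids(raw)
  raise ArgumentError, 'CUSTOMER_IDS is not set' if raw.nil? || raw.empty?
  raw.split(',').map { |c| Integer(c.strip) }
end

ENV['CUSTOMER_IDS'] = '10,11,12'
ids = parse_ids(ENV['CUSTOMER_IDS'])
# ids is now [10, 11, 12], ready to hand to the source
```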
For parallel executions, please indeed track https://github.com/thbar/kiba/issues/18 since I'm going to work on it.
Let me know if this properly answers your need!

Looks like this is tracked here: https://github.com/thbar/kiba/issues/18 and was already asked here: Pass Parameters to Kiba run Method

Declarative Pipeline using env var as choice parameter value

Disclaimer: I can achieve the behavior I’m looking for with the Active Choices plugin, BUT I really want this to work in a Jenkinsfile and be controlled with SCM, because it’s tedious to configure Active Choices on each job that may need them. And with it being separate from the Jenkinsfile creation, it’s then one job defined in multiple places. :(
I am looking to verify whether this is possible, because I can’t get the syntax right, and I haven’t been able to find any examples online:
pipeline {
environment {
ARTIFACTS = lib.myfunc() // this works well
}
parameters {
choice(name: "Artifacts", choices: ARTIFACTS) // I can’t get this to work
}
}
I cannot use the function inline in the declaration of the parameter. The errors were clear about that, but it seems as though I should be able to do what I’ve written out above.
I am not home, so I do not have the exceptions handy, but I will add them soon. They did not seem very helpful while I was working on this yesterday.
What have I tried?
I’ve tried having the function return a List, because the docs say it requires a list, and I’ve also tried (illogically) returning a String in the precise syntax of a list of strings. (It was hacky, like return "['" + artifacts.join("', '") + "']" to look like ['artifact1.zip', 'artifact2.zip'].)
I also tried things like "$ARTIFACTS" and ${ARTIFACTS} in desperation.
The list of choices has to be supplied as a String containing newline characters (\n): choices: 'TESTING\nSTAGING\nPRODUCTION'
I was tipped off by this article:
https://st-g.de/2016/12/parametrized-jenkins-pipelines
Related to a bug:
https://issues.jenkins.io/plugins/servlet/mobile#issue/JENKINS-40358
:shrug:
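Put together, the fix looks like this in a Jenkinsfile (a sketch; the artifact names are illustrative, and newer versions of the plugin also accept a plain list for choices):

```groovy
pipeline {
    agent any
    parameters {
        // choices must be one newline-separated String on affected versions
        choice(name: 'Artifacts',
               choices: ['artifact1.zip', 'artifact2.zip'].join('\n'))
    }
    stages {
        stage('Use') {
            steps {
                echo "Selected: ${params.Artifacts}"
            }
        }
    }
}
```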
First, we need to understand that Jenkins starts running your pipeline code by presenting you with Parameters page. Once you've set up the parameters, and pressed Build, then a node is allocated, variables are set, and your code starts to run.
But in your pipeline, as presented above, you want to run some code to prepare the parameters.
This is not how Jenkins usually works. It's definitely not doing the following: allocating a node, setting the variables, running some of your code until parameters clause is reached, stopping all that, presenting you with GUI, and then continuing where it left off. Again, it's not how Jenkins works.
This is why, when writing a new pipeline, your first option to build it is Build and not Build with Parameters. Jenkins hasn't run your code yet; it doesn't have any idea if there are any parameters. When running for the first time, it will remember the parameters (and any choices, if there were any) as configured for this (first) run, so in the second run you will see the parameters as configured in the first run. (Generally, in run number n you will see the result of configuration in run number n-1.)
There are a number of ways to overcome this.
If having a "somewhat recent" (and not "current and absolutely up-to-date") situation fits you, your code may need minor changes to work — second time. (I don't know what exactly lib.myfunc() returns but if it's a choice of Development/Staging/Production this might be good enough.)
If having a "somewhat recent" situation is an absolute no-no (e.g. your lib.myfunc() returns the list of git branches, and "list of branches as of yesterday" is unacceptable), then your only solution is ActiveChoice. ActiveChoice allows you to run some code before showing you the Build with Parameters GUI (with script approval etc.).

abas-ERP: Execute an FO service program from a cronjob

I need to run a service program, written in FO for abas-ERP, continuously.
I heard about some already existing scripts for calling service programs from the shell. If that is possible I could simply use a cronjob for starting this script.
But I don't know exactly where to find a template for these shell scripts, which conditions have to be met, and whether there are any restrictions.
For example: is it possible to call several FO programs successively? (This might be important with regard to blocking licences.)
You can use edpinfosys.sh and execute the infosystem TEXTZEIGEN via cronjob.
You could also use batchlg.sh
batchlg.sh 'FOP-Name' [ -PASSARGS ] [Parameter ...]
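Once batchlg.sh works from the shell, a cron entry can run it on a schedule. A sketch (the installation path, FOP name, parameter, and log file are assumptions; adjust them to your environment):

```
# m h dom mon dow  command
# Run the service FOP every night at 02:30 from the abas installation directory.
30 2 * * * cd /abas/erp && ./batchlg.sh 'MYFOP' -PASSARGS PARAM1 >> /var/log/myfop.log 2>&1
```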

Cucumber feature outlines

Is it possible to parameterise a feature file in the same way it is a scenario? So each scenario in the feature could refer to some variables which are later defined by a single table for the entire feature file?
All of the answers I've found so far (Feature and scenario outline name in cucumber before hook for example) use Ruby meta-programming, which doesn't inspire much hope for the jvm setup I'm using.
No it's not, and for good reason. Feature files are meant to be simple and readable; they are not for programming. Even using scenario outlines and tables is generally not a good thing, so taking this further and having a feature that cannot be understood without reading some other thing that defines variables is counterproductive.
You can however put all your variables and stuff in step definitions and write your feature at a higher level of abstraction. You'll find implementing this much easier, as you can use a programming language (which is good at this stuff).
One way of parameterising a feature file is to generate it from a template at compile-time. Then at runtime your cucumber runner executes the generated feature file.
This is fairly easy to do if you are using gradle. Here is an example:
In build.gradle, add groovy code like this:
import groovy.text.GStringTemplateEngine
task generateFeatureFiles {
doFirst {
File featuresDir = new File(sourceSets.main.output.resourcesDir, "features")
File templateFile = new File(featuresDir, "myFeature.template")
def(String bestDay, String currentDay) = ["Friday", "Sunday"]
File featureFile = new File(featuresDir, "${bestDay}-${currentDay}.feature")
Map bindings = [bestDay: bestDay, currentDay: currentDay]
String featureText = new GStringTemplateEngine().createTemplate(templateFile).make(bindings)
featureFile.text = featureText
}
}
processResources.finalizedBy(generateFeatureFiles)
myFeature.template is in the src/main/resources/features directory and might look like this:
Feature: Is it $bestDay yet?
Everybody wants to know when it's $bestDay
Scenario: $currentDay isn't $bestDay
Given today is $currentDay
When I ask whether it's $bestDay yet
Then I should be told "Nope"
Running the build task will create a Friday-Sunday.feature file in the build output resources directory (typically build/resources/main/features) with the bestDay and currentDay parameters filled in.
The generateFeatureFiles custom task runs immediately after the processResources task. The generated feature file can then be executed by the cucumber runner.
You could generate any number of feature files from the feature template file. The code could read in parameters from a config file in your resources directory for example.
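For instance, the bindings could be loaded from a properties file in the resources directory instead of being hard-coded (a sketch to drop into the doFirst block above; the file name and keys are assumptions):

```groovy
// src/main/resources/features/days.properties contains:
//   bestDay=Friday
//   currentDay=Sunday
Properties props = new Properties()
new File(featuresDir, "days.properties").withInputStream { props.load(it) }
Map bindings = [bestDay: props.getProperty("bestDay"),
                currentDay: props.getProperty("currentDay")]
```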

Can I alter Python source code while executing?

What I mean by this is:
I have a program that an end user is currently using. Can I submit a new piece of source code and have it run as if it had always been there?
I can't find an answer that specifically answers the point.
I'd like to be able to, say, "extend" or add new features (rather than fix something that's already there) on the fly, without requiring the program to terminate (e.g. restart or exit).
Yes, you can definitely do that in python.
It does open a security hole, though, so be very careful.
You can do this by setting up a "loader" class that collects the source code you want to use and then calls the exec built-in function: pass some Python source code in and it will be evaluated.
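A minimal sketch of such a loader (the names load_plugin and greet are illustrative, not from any particular library):

```python
# A tiny "plugin loader": executes new source code in its own namespace
# and exposes whatever names it defines, without restarting the host program.
def load_plugin(source: str) -> dict:
    namespace: dict = {}
    # SECURITY: exec runs arbitrary code; only load sources you trust.
    exec(source, namespace)
    return namespace

# Simulate receiving a new feature while the program is running.
new_feature = """
def greet(name):
    return f"Hello, {name}!"
"""
plugins = load_plugin(new_feature)
print(plugins["greet"]("world"))  # prints "Hello, world!"
```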
Check out the sauna.reload package: http://opensourcehacker.com/2011/11/08/sauna-reload-the-most-awesomely-named-python-package-ever/ . It helps overcome some rough edges of plain exec. It may also be worth checking Dynamically reload a class definition in Python

Passing build parameters downstream using groovy. Jenkins build pipeline

I've seen a number of examples that execute a pre build system groovy script to the effect of
import hudson.model.*
def thr = Thread.currentThread()
def build = thr?.executable
printf "Setting SVN_UPSTREAM as "+ build.getEnvVars()['SVN_REVISION'] +"\n" ;
build.addAction(new ParametersAction(new StringParameterValue('SVN_UPSTREAM', build.getEnvVars()['SVN_REVISION'])))
Which is intended to make SVN_UPSTREAM available to all downstream jobs.
With this in mind I attempt to use $SVN_UPSTREAM in a manually executed downstream job like
https://code.mikeyp.com/svn/mikeyp/client/trunk#$SVN_UPSTREAM
Which is not resolved causing an error.
Can anyone spot the problem here?
The bleeding-edge Jenkins build pipeline plugin now supports parameter passing, which eliminated the need for the groovy workaround for me.
Make sure that the parameter you are passing downstream is not set as a parameter in the downstream job where you wish to use it. That is, in the downstream job, if you have "This build is parametrized" checked, do not add SVN_UPSTREAM to the list of parameters. If you do, it overrides the preset value.
