Yahoo - caffe on Spark library dependency? - apache-spark

Yahoo released a version of Caffe yesterday that runs on the latest version of Apache Spark. The git repo is not well documented yet: git link
There is a Scala test file which is supposed to run an example: Scala Example
but it requires the dependency com.yahoo.ml.caffe.{Config, CaffeOnSpark, DataSource}, which I assume contains the data, the config, and the API. Has this been made into a library yet? How could I build this using sbt?

Our CaffeOnSpark release contains all the code that you need to run it. com.yahoo.ml.caffe.* is a collection of Scala classes in the caffe-grid folder. Please follow our guides on the CaffeOnSpark wiki page, and ask questions on our mailing list.

How to write Google DataFlow templates in NodeJS?

I'm looking for information on writing DataFlow jobs in NodeJS. The tutorials are all referring to Java or Python.
Any ideas if it's possible?
As of Beam 2.13.0, pipelines can be written in Python, Java, and Go.
The Beam community is working on a cross-language pipeline feature that allows calling PTransforms written in another language, but NodeJS is not on the roadmap.
See [1] for more information.
[1] https://beam.apache.org/roadmap/portability
There is support for TypeScript now.
As of February 2023 it is still experimental, but the main functionality is covered. From the Apache Beam documentation:
The Typescript SDK supports Node v16+ and is still experimental.
You can install it via npm:
npm install apache-beam
More instructions and documentation can be found on the official npm and GitHub pages.
Link npm: https://www.npmjs.com/package/apache-beam
Link github (example): https://github.com/apache/beam-starter-typescript

What is the difference between io.cucumber and info.cukes?

I am trying to integrate BDD using Cucumber, but I am really confused about the difference between the io.cucumber and info.cukes libraries, and about which one to use when.
I tried to read the GitHub README.md file but still can't make heads or tails of it.
Furthermore, I am not sure what cucumber-jvm is, or why we need cucumber-junit (can't the standalone JUnit library suffice?).
Thanks in advance. Any help is much appreciated.
Refer to the release notes for more details: https://github.com/cucumber/cucumber-jvm/blob/master/CHANGELOG.md
There have been substantial changes in Cucumber 2. For more, see https://cucumber.io/blog/2017/08/29/announcing-cucumber-jvm-2-0-0
io.cucumber and info.cukes are Maven group IDs. info.cukes was used for Cucumber versions up to 1.2.5. Newer versions, starting from 2.0.0, are published under io.cucumber. There is also a new version 3 with more goodies on the master branch on GitHub, as mentioned in the release notes.
The group ID was changed because Gherkin changed its group ID in the same way.
cucumber-jvm is the Java implementation of the Cucumber framework. There are many other implementations in other languages: https://github.com/cucumber
When you put @RunWith(Cucumber.class) on top of the test class, a specialized runner is used which executes the feature files. The default JUnit runner will not get you anywhere, though it might cough up some exceptions.

Why does an HDInsight cluster not come with Scala pre-installed?

On HDInsight's master node, $scala -version returns an error. It is easily installed via
$apt-get install scala
but shouldn't Scala be installed there by default?
Thank you for the suggestion. What's the scenario where you need Scala installed directly on the node? For example, in Spark there are a couple of common scenarios that already work:
Running Spark commands on the command line. This is accomplished through spark-shell, which has a built-in Scala interpreter.
Building a Spark project. This is usually done through a Maven or sbt project definition file. Those tools automatically download the correct Scala version and compiler based on the project dependencies.
As you said, it's not hard to preinstall Scala, but we would like to understand the need for it. In our discussions with customers this hasn't come up before.

Looking for source code for Crafter Deployer 2.5.3

I have an instance of Crafter running with crafter-studio-publishing-receiver-2.5.3-aio.jar, and I need to locate the source code for the jar file.
Is this the right repository?
What is the significance of the word "legacy" in the name of the project?
You can find the source code here:
https://github.com/craftercms/legacy-deployer
The specific version can be found by checking the manifest of the jar.
- unzip the jar
- open ./META-INF/MANIFEST.MF
- locate the property Implementation-Build: 87c84d58313b2bcbdca306de69758320aee174d0
This value is the commit hash of the build. It can be placed in a GitHub URL to get the exact code you are looking for.
Example:
https://github.com/craftercms/legacy-deployer/blob/87c84d58313b2bcbdca306de69758320aee174d0/cstudio-publishing-receiver-zip/pom.xml
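The manual steps above can also be scripted. The following is a minimal sketch, assuming the jar carries an Implementation-Build attribute in the main section of its manifest as described. The function names (parse_manifest, build_commit, source_url) are hypothetical helpers made up for this illustration and are not part of any Crafter tooling.

```python
import zipfile

# Repository named in the answer above.
REPO_URL = "https://github.com/craftercms/legacy-deployer"


def parse_manifest(text):
    """Parse the main section of a JAR manifest into a dict (hypothetical helper)."""
    attrs = {}
    key = None
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the main section
        if line.startswith(" ") and key is not None:
            attrs[key] += line[1:]  # continuation of the previous value
            continue
        key, _, value = line.partition(":")
        attrs[key] = value.strip()
    return attrs


def build_commit(jar_path):
    """Return the Implementation-Build value from the jar's manifest, or None."""
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    return parse_manifest(text).get("Implementation-Build")


def source_url(commit, path=""):
    """Build a GitHub URL for the repository, or one file in it, at that commit."""
    kind = "blob" if path else "tree"
    return f"{REPO_URL}/{kind}/{commit}/{path}".rstrip("/")
```

Calling build_commit on the path to crafter-studio-publishing-receiver-2.5.3-aio.jar and passing the result to source_url should give a link to the repository at the commit the jar was built from. If the attribute is missing, build_commit returns None and the jar was probably not built with that manifest entry.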
The reason we renamed the project "legacy-deployer" on GitHub is that with Crafter 3.x we are moving to a new deployment system. Without going too deep into this: the new system is based on Git pulls, and as you can imagine, this approach has many benefits. It will support the same concepts (callbacks etc.) as the now "legacy" deployer.

Where's the NodaTime.Serialization.JsonNet?

In the latest API documentation, NodaTime.Serialization.JsonNet is shown as part of the NodaTime library.
But I can't find it anywhere. Here's NodaTime in the Object Browser in my Visual Studio.
I even looked into NodaTime.Testing and haven't found it.
I don't know where to look for it anymore. These two (NodaTime and NodaTime.Testing) are the only packages available on NuGet.
From the page you linked to:
Code in this namespace is not currently included in Noda Time NuGet packages; it is still deemed "experimental". To use these serializers, please download and build the Noda Time source code from the project home page.
For 1.2, we'll be distributing a separate pre-built assembly and NuGet package, but that's not quite ready yet, so for now you'll have to build your own.
