I am trying to connect the StreamingFileSink to Azure Blob Storage. There is currently no mention of Azure in the documentation, but I hoped it would work via the file system abstraction.
After analysing the error, I assume this feature is out of scope right now for Azure Blob Storage.
Now I'd like to be sure that I did not make any mistake and would appreciate any pointers if there is a way to make it work.
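For context, a minimal sketch of the kind of job this is about looks like the following (the container/account names and the String type are placeholders, not my exact code):

// Minimal sketch: StreamingFileSink writing to Azure Blob Storage via the wasbs:// scheme.
// Assumes the flink-azure-fs-hadoop plugin is on the plugins path; names are placeholders.
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class AzureSinkJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(
                        new Path("wasbs://container@account.blob.core.windows.net/out"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();

        env.fromElements("a", "b", "c").addSink(sink); // fails in initializeState, see the stack trace below

        env.execute("azure-streaming-file-sink");
    }
}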
So far I have found out:
This is the exception I see:
java.lang.NoClassDefFoundError: org/apache/flink/fs/azure/common/hadoop/HadoopRecoverableWriter
at org.apache.flink.fs.azure.common.hadoop.HadoopFileSystem.createRecoverableWriter(HadoopFileSystem.java:202)
at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.createRecoverableWriter(PluginFileSystemFactory.java:129)
at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.createRecoverableWriter(SafetyNetWrapperFileSystem.java:69)
at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.<init>(Buckets.java:117)
at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink$RowFormatBuilder.createBuckets(StreamingFileSink.java:288)
at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.initializeState(StreamingFileSink.java:402)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:284)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
at java.lang.Thread.run(Thread.java:748)
Now the HadoopRecoverableWriter class seems to be missing.
But in the flink-azure-fs-hadoop module's pom.xml I can find this:
<!-- shade Flink's Hadoop FS adapter classes -->
<relocation>
<pattern>org.apache.flink.runtime.fs.hdfs</pattern>
<shadedPattern>org.apache.flink.fs.azure.common.hadoop</shadedPattern>
</relocation>
So, according to this relocation, the shaded package should include that class.
However, looking at the contents of flink-azure-fs-hadoop-1.10.0.jar, the shaded package only contains the following classes, and the RecoverableWriter is missing:
HadoopBlockLocation.class
HadoopDataInputStream.class
HadoopDataOutputStream.class
HadoopFileStatus.class
HadoopFileSystem.class
HadoopFsFactory.class
HadoopFsRecoverable.class
More digging shows there is actually a filter in the shade-plugin section of the pom.xml which excludes the RecoverableWriter classes:
<filter>
<artifact>org.apache.flink:flink-hadoop-fs</artifact>
<excludes>
<exclude>org/apache/flink/runtime/util/HadoopUtils</exclude>
<exclude>org/apache/flink/runtime/fs/hdfs/HadoopRecoverable*</exclude>
</excludes>
</filter>
I'm using:
Quarkus 1.6.1.Final
Vertx 3.9.1 (provided by quarkus-vertx dependency, see pom.xml below)
And I can't get the clustered EventBus working. I've followed the instructions listed here:
https://vertx.io/docs/vertx-hazelcast/java/
I've also enabled clustering in Quarkus:
quarkus.vertx.cluster.clustered=true
quarkus.vertx.cluster.port=8081
quarkus.vertx.prefer-native-transport=true
quarkus.http.port=8080
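The EventBus usage itself is nothing special; a minimal consumer/sender pair along these lines is what I am trying to get working across the cluster (the address, payload and class names are just examples):

// Minimal sketch of the clustered EventBus usage; address and message are illustrative.
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;
import io.quarkus.vertx.ConsumeEvent;
import io.vertx.mutiny.core.eventbus.EventBus;

@ApplicationScoped
public class GreetingBus {

    @Inject
    EventBus bus; // Mutiny EventBus managed by Quarkus

    // Expected to also receive messages published by other cluster members.
    @ConsumeEvent("greetings")
    public void onGreeting(String name) {
        System.out.println("Hello " + name);
    }

    public void send(String name) {
        bus.publish("greetings", name);
    }
}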
And here is my pom.xml:
<dependencies>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-resteasy</artifactId>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-resteasy-mutiny</artifactId>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-vertx</artifactId>
</dependency>
<dependency>
<groupId>io.vertx</groupId>
<artifactId>vertx-hazelcast</artifactId>
<version>3.9.2</version>
<exclusions>
<exclusion>
<groupId>io.vertx</groupId>
<artifactId>vertx-core</artifactId>
</exclusion>
<!-- <exclusion>-->
<!-- <groupId>com.hazelcast</groupId>-->
<!-- <artifactId>hazelcast</artifactId>-->
<!-- </exclusion>-->
</exclusions>
</dependency>
<!-- <dependency>-->
<!-- <groupId>com.hazelcast</groupId>-->
<!-- <artifactId>hazelcast-all</artifactId>-->
<!-- <version>3.9</version>-->
<!-- </dependency>-->
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-transport-native-epoll</artifactId>
<classifier>linux-x86_64</classifier>
</dependency>
</dependencies>
And the error I get is the following:
Caused by: java.lang.ClassNotFoundException: com.hazelcast.core.MembershipListener
As you can see in my pom.xml (the parts commented out above), I have also tried adding the hazelcast-all:3.9 dependency and excluding the hazelcast dependency from vertx-hazelcast:3.9.2. Then this error disappears, but another one comes up:
Caused by: com.hazelcast.config.InvalidConfigurationException: cvc-complex-type.2.4.a: Invalid content was found starting with element '{"http://www.hazelcast.com/schema/config":memcache-protocol}'. One of '{"http://www.hazelcast.com/schema/config":public-address, "http://www.hazelcast.com/schema/config":reuse-address, "http://www.hazelcast.com/schema/config":outbound-ports, "http://www.hazelcast.com/schema/config":join, "http://www.hazelcast.com/schema/config":interfaces, "http://www.hazelcast.com/schema/config":ssl, "http://www.hazelcast.com/schema/config":socket-interceptor, "http://www.hazelcast.com/schema/config":symmetric-encryption, "http://www.hazelcast.com/schema/config":member-address-provider}' is expected.
Am I doing something wrong or forgetting something, or is this simply a bug in Quarkus or in Vert.x?
Thanks in advance for any help.
I think the most probable cause of your issue is that you are using the quarkus-universe-bom, which enforces a version of Hazelcast (we have a Hazelcast extension there) that is not compatible with vertx-hazelcast.
Check your dependency tree with mvn dependency:tree and make sure the Hazelcast artifacts are of the version required by vertx-hazelcast.
Another option would be to simply use the quarkus-bom, which does not enforce a Hazelcast version, and let vertx-hazelcast pull in the dependency by itself.
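For reference, switching BOMs would look roughly like this in the <dependencyManagement> section (the version property is whatever you already use):

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-bom</artifactId>
            <version>${quarkus.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>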
It seems like a bug in Quarkus and this issue is related to:
https://github.com/quarkusio/quarkus/issues/10889
Bringing this from its winter sleep...
I am looking to use Quarkus 2 + Vert.x 4 and use either the Vert.x shared data API or the Vert.x cluster manager in order to achieve an in-process, distributed cache (as opposed to an external distributed cache cluster).
What's unclear to me, also after looking at the GitHub issue linked above (which is still open), is whether I can count on these APIs working at this time with the versions I mentioned.
Any comments will be great!
Thanks in advance...
[UPDATE]: It looks like the clustered cache works with no issues using the shared data API along with Quarkus, Vert.x, Hazelcast and the Mutiny bindings for Vert.x (all at their latest versions).
All I needed to do was set quarkus.vertx.cluster.clustered=true in the Quarkus properties file, use the vertx.sharedData().getClusterWideMap implementation for the distributed cache, and add the 'io.vertx:vertx-hazelcast:4.3.1' Gradle/Maven dependency.
In general, that's all it took for a small proof-of-concept.
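A minimal sketch of that proof-of-concept, assuming the standard Quarkus Mutiny bindings (map and key names are illustrative):

// Minimal sketch of the cluster-wide map usage described above; names are illustrative.
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;

import io.smallrye.mutiny.Uni;
import io.vertx.mutiny.core.Vertx;

@ApplicationScoped
public class DistributedCache {

    @Inject
    Vertx vertx; // the Mutiny Vertx instance managed by Quarkus

    public Uni<Void> put(String key, String value) {
        return vertx.sharedData()
                .<String, String>getClusterWideMap("my-cache")
                .onItem().transformToUni(map -> map.put(key, value));
    }

    public Uni<String> get(String key) {
        return vertx.sharedData()
                .<String, String>getClusterWideMap("my-cache")
                .onItem().transformToUni(map -> map.get(key));
    }
}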
thanks
TL;DR
I am trying to shade a version of the akka library and bundle it with my application (to be able to run a spray-can server on the CDH 5.7 version of Spark 1.6). The shading process messes up akka's default configuration, and after manually providing a separate version of akka's reference.conf for the shaded akka, it still looks like the 2 versions get mixed up somehow.
Is shading akka versions known to cause problems? What am I doing wrong?
Background
I have a Scala/Spark application currently running on Spark 1.6.1 standalone. The application runs a spray-can http server using spray 1.3.3, which requires akka 2.3.9 (Spark 1.6.1 standalone includes a compatible akka 2.3.11).
I am trying to migrate the application to a new Cloudera-based Spark cluster running the CDH 5.7 version of Spark 1.6. The problem is that Spark 1.6 in CDH 5.7 is bundled with akka 2.2.3 which is not sufficient for spray 1.3.3 to function properly.
Attempted solution
Following the suggestion in this post, I decided to shade akka 2.3.9 and bundle it along with my application. This time, though, I stumbled upon a new problem: akka has its default configuration defined in a reference.conf file, which should be located on the application's classpath. Due to a known issue in sbt-assembly's shading feature, it seems that the shaded akka library requires a separate configuration.
So, I ended up shading akka with the following shade rule:
ShadeRule.rename("akka.**" -> "akka_2_3_9_shade.#1")
.inLibrary("com.typesafe.akka" % "akka-actor_2.10" % "2.3.9")
.inAll
and including an additional reference.conf file in my project, which is identical to akka's original reference.conf, but with all occurrences of "akka" replaced by "akka_2_3_9_shade".
Now, though, it seems that the Spark-provided akka gets mixed up somehow with the shaded akka, as I'm getting the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Cannot instantiate MailboxType [akka.dispatch.UnboundedMailbox], defined in [akka.actor.default-mailbox], make sure it has a public constructor with [akka.actor.ActorSystem.Settings, com.typesafe.config.Config] parameters
at akka_2_3_9_shade.dispatch.Mailboxes$$anonfun$1.applyOrElse(Mailboxes.scala:197)
at akka_2_3_9_shade.dispatch.Mailboxes$$anonfun$1.applyOrElse(Mailboxes.scala:195)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Failure.recover(Try.scala:185)
at akka_2_3_9_shade.dispatch.Mailboxes.lookupConfiguration(Mailboxes.scala:195)
at akka_2_3_9_shade.dispatch.Mailboxes.lookup(Mailboxes.scala:78)
at akka_2_3_9_shade.actor.LocalActorRefProvider.akka$actor$LocalActorRefProvider$$defaultMailbox$lzycompute(ActorRefProvider.scala:561)
at akka_2_3_9_shade.actor.LocalActorRefProvider.akka$actor$LocalActorRefProvider$$defaultMailbox(ActorRefProvider.scala:561)
at akka_2_3_9_shade.actor.LocalActorRefProvider$$anon$1.<init>(ActorRefProvider.scala:568)
at akka_2_3_9_shade.actor.LocalActorRefProvider.rootGuardian$lzycompute(ActorRefProvider.scala:564)
at akka_2_3_9_shade.actor.LocalActorRefProvider.rootGuardian(ActorRefProvider.scala:563)
at akka_2_3_9_shade.actor.LocalActorRefProvider.init(ActorRefProvider.scala:618)
at akka_2_3_9_shade.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:619)
at akka_2_3_9_shade.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:616)
at akka_2_3_9_shade.actor.ActorSystemImpl._start(ActorSystem.scala:616)
at akka_2_3_9_shade.actor.ActorSystemImpl.start(ActorSystem.scala:633)
at akka_2_3_9_shade.actor.ActorSystem$.apply(ActorSystem.scala:142)
at akka_2_3_9_shade.actor.ActorSystem$.apply(ActorSystem.scala:109)
at akka_2_3_9_shade.actor.ActorSystem$.apply(ActorSystem.scala:100)
at MyApp.api.Boot$delayedInit$body.apply(Boot.scala:45)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at MyApp.api.Boot$.main(Boot.scala:28)
at MyApp.api.Boot.main(Boot.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassCastException: interface akka_2_3_9_shade.dispatch.MailboxType is not assignable from class akka.dispatch.UnboundedMailbox
at akka_2_3_9_shade.actor.ReflectiveDynamicAccess$$anonfun$getClassFor$1.apply(DynamicAccess.scala:69)
at akka_2_3_9_shade.actor.ReflectiveDynamicAccess$$anonfun$getClassFor$1.apply(DynamicAccess.scala:66)
at scala.util.Try$.apply(Try.scala:161)
at akka_2_3_9_shade.actor.ReflectiveDynamicAccess.getClassFor(DynamicAccess.scala:66)
at akka_2_3_9_shade.actor.ReflectiveDynamicAccess.CreateInstanceFor(DynamicAccess.scala:84)
... 34 more
The relevant code from my application's Boot.scala file is the following:
[45] implicit val system = ActorSystem()
...
[48] val service = system.actorOf(Props[MyAppApiActor], "MyApp.Api")
...
[52] val port = config.getInt("MyApp.server.port")
[53] IO(Http) ? Http.Bind(service, interface = "0.0.0.0", port = port)
OK, so eventually I managed to solve this.
Turns out akka loads (some of) its configuration settings from the config file using keys that are defined as string literals. You can find a lot of these in akka/actor/ActorSystem.scala, for example.
And it seems that sbt-assembly does not rewrite package-name references that appear inside string literals.
Also, some configuration keys are being changed by sbt-assembly's shading. I haven't really taken the time to find where and how exactly they are defined in akka's source, but the following exception, which is being thrown during the ActorSystem init code, proves that this is indeed the case:
ConfigException$Missing: No configuration setting found for key 'akka_2_3_9_shade'
So, the solution is to include a custom config file (call it, for example, akka_spray_shade.conf) and copy the following configuration sections into it (a trimmed sketch of the resulting file follows the list):
The contents of akka's original reference.conf, but having the akka prefix in the configuration values changed to akka_2_3_9_shade. (this is required for the hard-coded string literal config keys)
The contents of akka's original reference.conf, but having the akka prefix in the configuration values changed to akka_2_3_9_shade and having the root configuration key changed from akka to akka_2_3_9_shade. (this is required for the config keys which do get modified by sbt-assembly)
The contents of spray's original reference.conf, but having the akka prefix in the configuration values changed to akka_2_3_9_shade. (this is required to make sure that spray always refers to the shaded akka)
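A heavily trimmed illustration of what akka_spray_shade.conf ends up looking like; only a couple of representative keys are shown, while the real file contains the full copied reference.conf contents:

# (1) + (2): akka's reference.conf duplicated under both root keys, with the
#            "akka." prefixes inside the values rewritten to the shaded package.
akka {
  actor {
    provider = "akka_2_3_9_shade.actor.LocalActorRefProvider"
    default-mailbox {
      mailbox-type = "akka_2_3_9_shade.dispatch.UnboundedMailbox"
    }
  }
}
akka_2_3_9_shade {
  actor {
    provider = "akka_2_3_9_shade.actor.LocalActorRefProvider"
    default-mailbox {
      mailbox-type = "akka_2_3_9_shade.dispatch.UnboundedMailbox"
    }
  }
}

# (3): spray's reference.conf with "akka." prefixes in its values rewritten the same way.
spray.can {
  # ...
}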
Now, this custom config file must be provided explicitly during the initialization of the ActorSystem in application's Boot.scala code:
val akkaShadeConfig = ConfigFactory.load("akka_spray_shade")
implicit val system = ActorSystem("custom-actor-system-name", akkaShadeConfig)
A small addition to the accepted answer.
It is not necessary to put this configuration in a custom-named file like akka_spray_shade.conf. The configuration can be placed into application.conf which is being loaded by default during ActorSystem creation when no custom configuration is explicitly specified: ActorSystem("custom-actor-system-name") effectively means ActorSystem("custom-actor-system-name", ConfigFactory.load("application")).
I struggled with this for a long time as well. It turns out that the default merge strategy in sbt-assembly excludes all the reference.conf files. Adding this to build.sbt solved it for me:
assemblyMergeStrategy in assembly := {
  case PathList("reference.conf") => MergeStrategy.concat
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
I'm running PowerMock 1.6.4 and all the latest versions (JUnit 4.11, though).
I use the JaCoCo Ant task to instrument only the classes, not the test classes. I also use the JaCoCo Ant task to run the JUnit tests and then generate the reports.
Now I'm hitting a problem that I can't figure out...
I have a test class that tests one member function of class Foo.
One of the members of Foo is static, so I've wrapped it in a static function so that I can control the execution via a mock, but the side effect is that I now need mockStatic.
What I've noticed is that as soon as PowerMockito.mockStatic(Foo.class) is involved, all tests fail with instrumentation problems.
I have another test class that tests another member function of Foo. That test class works fine, but as soon as I introduce a mockStatic call, it fails with instrumentation failures.
Has anyone seen this failure, and does anyone know of any workarounds? I can't change the static member variable.
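For context, the test setup being described is roughly the following (class, method and value names are illustrative, not the real code):

// Sketch of a PowerMock test that stubs a static wrapper on Foo; names are illustrative.
import static org.mockito.Mockito.when;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

@RunWith(PowerMockRunner.class)
@PrepareForTest(Foo.class) // Foo is the class from the question
public class FooTest {

    @Test
    public void memberFunctionUsesStubbedStaticWrapper() {
        PowerMockito.mockStatic(Foo.class);
        when(Foo.staticWrapper()).thenReturn("stubbed"); // hypothetical static wrapper method

        // ... call the member function of Foo under test and assert on the result
    }
}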
I finally figured out what I believe is the issue. JaCoCo instrumentation injects data into your byte code, and so does PowerMock when it attempts to mock statics. This wreaks havoc, since they step on each other, and you will get really odd behavior as a result. I was getting an assortment of NPEs in code that shouldn't throw NPEs.
The easy solution was to refactor out the unnecessary statics. If you plan to use statics to control data flow, you should probably rethink the architecture for testability if you also plan to use JaCoCo for coverage.
You can still run JaCoCo instrumentation on statics, but you can't mock statics at the same time; at least not with the way PowerMock with Mockito does it. I'm not sure whether EasyMock would behave differently, so YMMV.
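A hedged sketch of the kind of refactoring meant above: hide the static call behind a small interface so plain Mockito can stub it and PowerMock (with its custom class loader) is no longer needed. All names are illustrative; each type would live in its own file.

// Seam interface replacing the direct static call.
interface ClockSource {
    long currentTimeMillis();
}

// Production implementation: the static call now lives in exactly one place.
class SystemClockSource implements ClockSource {
    @Override
    public long currentTimeMillis() {
        return System.currentTimeMillis();
    }
}

// The class under test takes the seam via its constructor instead of calling the static.
class Foo {
    private final ClockSource clock;

    Foo(ClockSource clock) {
        this.clock = clock;
    }

    boolean isExpired(long deadline) {
        return clock.currentTimeMillis() > deadline;
    }
}

// In a test, no PowerMock is needed: new Foo(() -> 0L), or a plain Mockito mock of ClockSource.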
I had similar issues, but instead of having to refactor the statics out, I believe there is another solution. I did this in a Maven pom file, but I'll explain what is going on. JaCoCo does inject data into your byte code, and PowerMock uses a custom class loader, which JaCoCo hates. So here's a way to work around it.
In your JaCoCo executions, you need JaCoCo to use default instrumentation (offline instrumentation) for your tests (you can specify only the PowerMock tests or just include all tests; it works either way). Here's an explanation of what's going on with default instrumentation: Offline-instrumentation with Jacoco.
You must then have the restore step for those tests. Now here's the interesting part: you have to run the normal JaCoCo prepare-agent step while EXCLUDING all the tests that were run with offline instrumentation (if you don't, you will get a bunch of warnings along the lines of "JaCoCo execution data already exists for xTest").
This will solve your problem, and you don't need to refactor out your static methods. Though if they were unnecessary you should probably still take them out ;)
<plugin>
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
<version>${jacoco.version}</version>
<configuration>
<append>true</append>
</configuration>
<executions>
<execution>
<id>default-instrument</id>
<goals>
<goal>instrument</goal>
</goals>
<configuration>
<includes>
<include>**/*test*</include>
</includes>
</configuration>
</execution>
<execution>
<id>default-restore-instrumented-classes</id>
<goals>
<goal>restore-instrumented-classes</goal>
</goals>
<configuration>
<includes>
<include>**/*test*</include>
</includes>
</configuration>
</execution>
<execution>
<id>Prepare-Jacoco</id>
<goals>
<goal>prepare-agent</goal>
</goals>
<configuration>
<excludes>
<exclude>**/*test*</exclude>
</excludes>
</configuration>
</execution>
</executions>
</plugin>
Every example I've seen (e.g., ElasticSearch: aggregation on _score field?) for doing aggregations on or related to the _score field seems to require the usage of scripting. With ElasticSearch disabling dynamic scripting by default for security reasons, is there any way to accomplish this without resorting to loading a script file onto every ES node or re-enabling dynamic scripting?
My original aggregation looked like the following:
"aggs": {
"terms_agg": {
"terms": {
"field": "field1",
"order": {"max_score": "desc"}
},
"aggs": {
"max_score": {
"max": {"script": "_score"}
},
"top_terms": {
"top_hits": {"size": 1}
}
}
}
Trying to specify expression as the lang doesn't seem to work as ES throws an error stating the score can only be accessed when being used to sort. I can't figure out any other method of ordering my buckets by the score field. Anyone have any ideas?
Edit: To clarify, my restriction is not being able to modify the server-side. I.e., I cannot add or edit anything as part of the ES installation or configuration.
One possible approach is to use the other scripting options available. mvel does not seem usable unless dynamic scripting is enabled, and unless more fine-grained control over enabling/disabling scripting arrives in version 1.6, I don't think it is possible to enable dynamic scripting for mvel but not for groovy.
That leaves native and mustache (used for templates), which are enabled by default. I don't think custom scripting can be done with mustache (if it is possible, I didn't find a way), so we are left with native (Java) scripting.
Here's my take on this:
create an implementation of NativeScriptFactory:
package com.foo.script;
import java.util.Map;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;
public class MyScriptNativeScriptFactory implements NativeScriptFactory {
@Override
public ExecutableScript newScript(Map<String, Object> arg0) {
return new MyScript();
}
}
an implementation of AbstractFloatSearchScript for example:
package com.foo.script;
import java.io.IOException;
import org.elasticsearch.script.AbstractFloatSearchScript;
public class MyScript extends AbstractFloatSearchScript {
@Override
public float runAsFloat() {
try {
return score();
} catch (IOException e) {
e.printStackTrace();
}
return 0;
}
}
Alternatively, build a simple Maven project to tie it all together. pom.xml:
<properties>
<elasticsearch.version>1.5.2</elasticsearch.version>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${elasticsearch.version}</version>
<scope>compile</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>src</sourceDirectory>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
build it and get the resulting jar file.
place the jar inside [ES_folder]/lib
edit elasticsearch.yml and add
script.native.my_script.type: com.foo.script.MyScriptNativeScriptFactory
restart ES nodes.
use it in aggregations:
{
"aggs": {
"max_score": {
"max": {
"script": "my_script",
"lang": "native"
}
}
}
}
My sample above just returns the _score as a script but, of course, it can be used in more advanced scenarios.
EDIT: if you are not allowed to touch the instances, then I don't think you have any options.
ElasticSearch, at least as of version 1.7.1 and possibly earlier, also offers the use of Lucene's Expression scripting language, and as Expression is sandboxed by default it can be used for dynamic inline scripts in much the same way that Groovy was. In our case, where our production ES cluster has just been upgraded from 1.4.1 to 1.7.1, we decided not to use Groovy anymore because of its non-sandboxed nature, although we really still want to make use of dynamic scripts because of the ease of deployment and the flexibility they offer as we continue to fine-tune our application and its search layer.
While writing a native Java script as a replacement for our dynamic Groovy function scores may also have been a possibility in our case, we wanted to look at the feasibility of using Expression as our dynamic inline scripting language instead. After reading through the documentation, I found that we were simply able to change the "lang" attribute from "groovy" to "expression" in our inline function_score scripts, and with the script.inline: sandbox property set in the .../config/elasticsearch.yml file, the function score script worked without any other modification. As such, we can now continue to use dynamic inline scripting within ElasticSearch, and do so with sandboxing enabled (as Expression is sandboxed by default). Obviously, other security measures such as running your ES cluster behind an application proxy and firewall should also be implemented to ensure that outside users have no direct access to your ES nodes or the ES API. However, this was a very simple change that, for now, has solved a problem with Groovy's lack of sandboxing and the concerns over enabling it to run without sandboxing.
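For illustration, the kind of change described above amounts to no more than this in the query body (the field name is illustrative, not from our actual mapping):

{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "script_score": {
        "lang": "expression",
        "script": "_score * doc['popularity'].value"
      }
    }
  }
}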
While switching your dynamic scripts to Expression may only work or be applicable in some cases (depending on the complexity of your inline dynamic scripts), it seemed it was worth sharing this information in the hopes it could help other developers.
As a note, Mustache, one of the other supported ES scripting languages, only appears to be usable for creating templates within your search queries. It does not appear to be usable for any of the more complex scripting needs such as function_score, etc., although I am not sure this was entirely apparent during a first read-through of the updated ES documentation.
Lastly, one further issue to be mindful of is that the use of Lucene Expression scripts is marked as an experimental feature in the latest ES release, and the documentation notes that, as this scripting extension is undergoing significant development work at this time, its usage or functionality may change in later versions of ES. Thus, if you do switch over to using Expression for any of your scripts (dynamic or otherwise), it should be noted in your documentation/developer notes to revisit these changes before upgrading your ES installation next time, to ensure your scripts remain compatible and work as expected.
For our situation at least, unless we were willing to allow non-sandboxed dynamic scripting to be enabled again in the latest version of ES (via the script.inline: on option) so that inline Groovy scripts could continue to run, switching over to Lucene Expression scripting seemed like the best option for now.
It will be interesting to see what changes occur to the scripting choices for ES in future releases, especially given that the (apparently ineffective) sandboxing option for Groovy will be completely removed by version 2.0. Hopefully other protections can be put in place to enable dynamic Groovy usage, or perhaps Lucene Expression scripting will take Groovy's place and will enable all the types of dynamic scripting that developers are already making use of.
For more notes on Lucene Expression see the ES documentation here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html#_lucene_expressions_scripts – this page is also the source of the note regarding the planned removal of Groovy's sandboxing option from ES v2.0+. Further Lucene Expression documentation can be found here: http://lucene.apache.org/core/4_9_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html
Please bear with me as this is my first question. Let me know what I can do to improve future questions.
I am trying to learn Spring Webflow, and am slowly wrapping my head around it. I am finding that there are a lot of conventions that the programmer is expected to "just know", and the examples online don't seem to work.
I have cobbled together one example that works as expected, but now I am trying to extend my understanding to the next level in another small project, with my long term goal being a much more complex application. The goal of this exercise is to build a login system that supports different types of client (phone, desktop, etc) with different webflows.
As near as I can tell, I am having trouble configuring the flow registry, probably because I am misunderstanding the convention.
The textbook example I am emulating is this:
<!-- The registry of executable flow definitions -->
<webflow:flow-registry flow-builder-services="flowBuilderServices"
id="flowRegistry" base-path="/WEB-INF/flows/">
<webflow:flow-location path="/welcome/welcome.xml" />
</webflow:flow-registry>
My configuration is this:
<!-- The registry of executable flow definitions -->
<webflow:flow-registry id="flowRegistry"
flow-builder-services="flowBuilderServices"
base-path="/WEB-INF/pages/">
<webflow:flow-location path="d/login.xml" />
<webflow:flow-location path="d/signup.xml" />
</webflow:flow-registry>
The log states:
DEBUG o.s.w.d.r.FlowDefinitionRegistryImpl - Registering flow definition 'ServletContext resource [/WEB-INF/pages/d/login.xml]' under id 'd'
DEBUG o.s.w.d.r.FlowDefinitionRegistryImpl - Registering flow definition 'ServletContext resource [/WEB-INF/pages/d/signup.xml]' under id 'd'
Since, under the covers, the flow registry is a simple HashMap, only one of the flow files ends up registered, and not under the id I would expect.
What am I missing?
Change the configuration as shown below; this might help you:
<webflow:flow-registry id="flowRegistry"
flow-builder-services="flowBuilderServices"
base-path="/WEB-INF/pages">
<webflow:flow-location path="/d/login.xml" />
<webflow:flow-location path="/d/signup.xml" />
</webflow:flow-registry>
also see Spring Webflow - How to Get List of FLOW IDs
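If the two flows should end up under distinct ids, another layout worth trying (sketch only, and it means moving each flow file into its own directory) follows the base-path convention where the flow id is the path segment between the base path and the file name, so the ids below would become "d/login" and "d/signup":

<!-- Sketch: one directory per flow definition -->
<webflow:flow-registry id="flowRegistry"
    flow-builder-services="flowBuilderServices"
    base-path="/WEB-INF/pages">
    <webflow:flow-location path="/d/login/login.xml" />
    <webflow:flow-location path="/d/signup/signup.xml" />
</webflow:flow-registry>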