Error while dropping an index in Cassandra

While creating a new index in Cassandra, I mistakenly gave the same directory path (directory_path) to both indexes, which might be why dropping them fails. I am able to delete other indexes that point to unique directory paths. Please share your thoughts on how we can delete the new index that points to the same directory path as the existing index.
New index
CREATE CUSTOM INDEX inbound_idx ON inventory.inbound (p_index) USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = {'schema': '{
fields : {
symbol : {type : "string",case_sensitive: false},
destination : {type : "string",case_sensitive: false},
ticket_id : {type : "string",case_sensitive: false}
}
}', 'refresh_seconds': '1', 'directory_path': '/c05/stratio_index/inventory/inbound'};
CREATE CUSTOM INDEX inbound_idx_new ON inventory.inbound (p_index_new) USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = {'schema': '{
fields : {
symbol : {type : "string",case_sensitive: false},
destination : {type : "string",case_sensitive: false},
ticket_id : {type : "string",case_sensitive: false}
}
}', 'refresh_seconds': '1', 'directory_path': '/c05/stratio_index/inventory/inbound'};
Error:
inventory_admin#cqlsh:inventory> drop index inbound_idx_new;
ServerError: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
inventory_admin#cqlsh:inventory> drop INDEX inbound_idx;
ServerError: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: attempted to delete non-existing file inbound
inventory_admin#cqlsh:inventory> drop index inbound_idx1;
inventory_admin#cqlsh:inventory>

Related

Query Custom Record using Custom Field in Mulesoft

I want to query a custom record based on a custom field in the custom record in Anypoint Studio. I tried using the Search operation in NetSuite, but it seems that I am unable to write the DataWeave code accordingly.
I used the code below:
%dw 2.0
output application/java
---
{
customFieldList: {
customField: [{
internalId: "8",
scriptId: "abc"
} as Object {
class : "org.mule.module.netsuite.extension.api.ListOrRecordRef"
} ]
} as Object {
class : "org.mule.module.netsuite.extension.api.CustomFieldList"
},
recType: {
internalId: "10078"
}
} as Object {
class : "org.mule.module.netsuite.extension.api.CustomRecordSearchBasic"
}
I used a basic custom record search. I am getting the error below:
Message : null
Element : testFlow2/processors/0 # test:test.xml:18 (Transform Message)
Element DSL : <ee:transform doc:name="Transform Message" doc:id="c6dbb0ee-4979-4667-bab1-e0b5d72a6b2b">
<ee:message>
<ee:set-payload>%dw 2.0
output application/java
---
{
customFieldList: {
customField: [{
internalId: "8",
scriptId: "abc"
} as Object {
class : "org.mule.module.netsuite.extension.api.ListOrRecordRef"
} ]
} as Object {
class : "org.mule.module.netsuite.extension.api.CustomFieldList"
},
recType: {
internalId: "10078"
}
} as Object {
class : "org.mule.module.netsuite.extension.api.CustomRecordSearchBasic"
}</ee:set-payload>
</ee:message>
</ee:transform>
Error type : MULE:UNKNOWN
FlowStack : at testFlow2(testFlow2/processors/0 # test:test.xml:18 (Transform Message))
--------------------------------------------------------------------------------
Root Exception stack trace:
java.lang.NullPointerException
at org.mule.weave.v2.module.pojo.exception.CannotInstantiateException.message(CannotInstantiateException.scala:7)
at org.mule.weave.v2.parser.exception.LocatableException.getMessage(LocatableException.scala:18)
at org.mule.weave.v2.parser.exception.LocatableException.getMessage$(LocatableException.scala:15)
at org.mule.weave.v2.module.pojo.exception.CannotInstantiateException.getMessage(CannotInstantiateException.scala:6)
at org.mule.runtime.core.internal.el.dataweave.DataWeaveExpressionLanguageAdaptor$1.handledException(DataWeaveExpressionLanguageAdaptor.java:298)
at org.mule.runtime.core.internal.el.dataweave.DataWeaveExpressionLanguageAdaptor$1.evaluate(DataWeaveExpressionLanguageAdaptor.java:309)
at org.mule.runtime.core.internal.el.DefaultExpressionManagerSession.evaluate(DefaultExpressionManagerSession.java:105)
at com.mulesoft.mule.runtime.core.internal.processor.SetPayloadTransformationTarget.process(SetPayloadTransformationTarget.java:32)
at com.mulesoft.mule.runtime.core.internal.processor.TransformMessageProcessor.lambda$0(TransformMessageProcessor.java:92)
at java.util.Optional.ifPresent(Optional.java:159)
at com.mulesoft.mule.runtime.core.internal.processor.TransformMessageProcessor.process(TransformMessageProcessor.java:92)
at org.mule.runtime.core.api.util.func.CheckedFunction.apply(CheckedFunction.java:25)
at org.mule.runtime.core.api.rx.Exceptions.lambda$checkedFunction$2(Exceptions.java:84)
at org.mule.runtime.core.internal.util.rx.Operators.lambda$nullSafeMap$0(Operators.java:47)
at reactor.core.* (1 elements filtered from stack; set debug level logging or '-Dmule.verbose.exceptions=true' for everything)(Unknown Source)
at org.mule.runtime.core.privileged.processor.chain.* (2 elements filtered from stack; set debug level logging or '-Dmule.verbose.exceptions=true' for everything)(Unknown Source)
at reactor.core.* (6 elements filtered from stack; set debug level logging or '-Dmule.verbose.exceptions=true' for everything)(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.mule.service.scheduler.internal.AbstractRunnableFutureDecorator.doRun(AbstractRunnableFutureDecorator.java:111)
at org.mule.service.scheduler.internal.RunnableFutureDecorator.run(RunnableFutureDecorator.java:54)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I don't know what I am doing wrong.
Any help is appreciated.
Thank you so much, but I figured it out somehow and got the solution. I can still accept other solutions, though. I just added a class and it worked.
Thank you again.

"[circuit_breaking_exception] [parent]" Data too large, data for "[<http_request>]" would be error

After working smoothly for more than 10 months, I suddenly started getting this error in production while doing simple search queries.
{
"error" : {
"root_cause" : [
{
"type" : "circuit_breaking_exception",
"reason" : "[parent] Data too large, data for [<http_request>] would be [745522124/710.9mb], which is larger than the limit of [745517875/710.9mb]",
"bytes_wanted" : 745522124,
"bytes_limit" : 745517875
}
],
"type" : "circuit_breaking_exception",
"reason" : "[parent] Data too large, data for [<http_request>] would be [745522124/710.9mb], which is larger than the limit of [745517875/710.9mb]",
"bytes_wanted" : 745522124,
"bytes_limit" : 745517875
},
"status" : 503
}
Initially, I was getting this circuit_breaking_exception error while doing simple term queries. To debug it, I ran a _cat/health query on the Elasticsearch cluster, but I still got the same error; even the simplest request, localhost:9200, gives the same error. I'm not sure what suddenly happened to the cluster.
Here is my circuit breaker status:
"breakers" : {
"request" : {
"limit_size_in_bytes" : 639015321,
"limit_size" : "609.4mb",
"estimated_size_in_bytes" : 0,
"estimated_size" : "0b",
"overhead" : 1.0,
"tripped" : 0
},
"fielddata" : {
"limit_size_in_bytes" : 639015321,
"limit_size" : "609.4mb",
"estimated_size_in_bytes" : 406826332,
"estimated_size" : "387.9mb",
"overhead" : 1.03,
"tripped" : 0
},
"in_flight_requests" : {
"limit_size_in_bytes" : 1065025536,
"limit_size" : "1015.6mb",
"estimated_size_in_bytes" : 560,
"estimated_size" : "560b",
"overhead" : 1.0,
"tripped" : 0
},
"accounting" : {
"limit_size_in_bytes" : 1065025536,
"limit_size" : "1015.6mb",
"estimated_size_in_bytes" : 146387859,
"estimated_size" : "139.6mb",
"overhead" : 1.0,
"tripped" : 0
},
"parent" : {
"limit_size_in_bytes" : 745517875,
"limit_size" : "710.9mb",
"estimated_size_in_bytes" : 553214751,
"estimated_size" : "527.5mb",
"overhead" : 1.0,
"tripped" : 0
}
}
I found a similar GitHub issue that suggests increasing the circuit breaker memory or disabling it altogether, but I am not sure which to choose. Please help!
Elasticsearch Version 6.3
After some more research, I finally found a solution:
We should not disable the circuit breaker, as that might result in an OOM error and eventually crash Elasticsearch.
Dynamically increasing the circuit breaker memory percentage helps, but it is only a temporary solution, because the increased limit might eventually fill up as well.
Finally, there is a third option: increase the overall JVM heap size, which is 1 GB by default. The recommendation for production is to go up to around 30-32 GB at most, and to keep it below 50% of the total available memory.
For good JVM memory configuration of Elasticsearch in production, see Heap: Sizing and Swapping.
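As a rough illustration of the third option: on a self-managed node the heap is usually raised by editing the JVM options file (or via ES_JAVA_OPTS, as a later answer shows for Docker). The 4 GB value below is only an example, not a sizing recommendation for every cluster:
# config/jvm.options (example values; size the heap to your own hardware)
-Xms4g
-Xmx4g
Keep -Xms and -Xmx equal so the heap is not resized at runtime.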
In my case I have an index with large documents; each document is ~30 KB and has more than 130 fields (nested objects, arrays, dates and ids).
I was searching all fields using this DSL query:
query_string: {
query: term,
analyze_wildcard: true,
fields: ['*'], // search all fields
fuzziness: 'AUTO'
}
Full-text searches are expensive, and searching through multiple fields at once is even more expensive, in terms of computing power rather than storage.
Therefore:
The more fields a query_string or multi_match query targets, the
slower it is. A common technique to improve search speed over multiple
fields is to copy their values into a single field at index time, and
then use this field at search time.
Please refer to the Elasticsearch docs, which recommend searching as few fields as possible with the help of the copy_to directive.
After I changed my query to search one field:
query_string: {
query: term,
analyze_wildcard: true,
fields: ['search_field'] // search in one field
}
everything worked like a charm.
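For reference, a minimal mapping sketch of the copy_to approach; the index name and the title/description field names are hypothetical, and on 6.x the properties sit under the mapping type:
PUT /my-index
{
  "mappings": {
    "_doc": {
      "properties": {
        "title": { "type": "text", "copy_to": "search_field" },
        "description": { "type": "text", "copy_to": "search_field" },
        "search_field": { "type": "text" }
      }
    }
  }
}
With such a mapping, the query above can target search_field alone instead of fields: ['*'].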
I got this error with my Docker container, so I increased ES_JAVA_OPTS to 1 GB, and now it works without any errors.
Here is the docker-compose.yml:
version: '1'
services:
  elasticsearch-cont:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: elasticsearch
    environment:
      - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
      - discovery.type=single-node
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - elastic
networks:
  elastic:
    driver: bridge
In my case, I also have an index with large documents that store system logs, and I searched the index across all fields. I use the Java client API, like this:
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("uid", uid);
searchSourceBuilder.query(termQueryBuilder);
When I changed my code like this:
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("uid", uid);
searchSourceBuilder.fetchField("uid");
searchSourceBuilder.fetchSource(false);
searchSourceBuilder.query(termQueryBuilder);
the error disappeared.

Mongoose validation failed without giving the field name with the issue

When I try to save my "Rating" schema, it gives the error
error: ValidationError: Rating validation failed
without giving the field name with the problem, nor the kind of issue (whether it's about casting, a null or missing value, or something else).
In my schema I don't have any field marked "required", nor "index: true"... I just specify the types this way:
{
ex_string_field : {type:String},
ex_number_field : {type:Number},
ex_array_field : [
{
sub_field_string : {type:String}
}
],
ex_object_field : {
sub_field_string : {type:String}
}
}
How can I find which field is raising the error? I have many fields. Without any required fields, it should be pretty easy to validate an object; I assumed that if a field is not compliant, it would just be ignored...
This is my system:
"mongoose": "^4.11.12",
"node" : "v8.1.3",
"npm" : "5.3.0"
And this is the full error stack track:
2017-12-5 - error: ValidationError: Rating validation failed
at ValidationError (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/error/validation.js:28:11)
at model.Document.invalidate (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:1658:32)
at model.Document.$set (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:760:10)
at model._handleIndex (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:590:14)
at model.Document.$set (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:550:24)
at model._handleIndex (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:574:12)
at model.Document.$set (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:550:24)
at model.Document (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/document.js:77:12)
at model.Model (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/model.js:55:12)
at new model (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/mongoose/lib/model.js:3885:13)
at /Volumes/HD Daniele/Sites/cashinvoice/server/server/routes/admin/admin-utils.js:370:13
at _fulfilled (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/q/q.js:854:54)
at self.promiseDispatch.done (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/q/q.js:883:30)
at Promise.promise.promiseDispatch (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/q/q.js:816:13)
at /Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/q/q.js:624:44
at runSingle (/Volumes/HD Daniele/Sites/cashinvoice/server/node_modules/q/q.js:137:13)
================== UPDATE ====================
After some testing, I found that the error comes from the complex field 'main_info'. Specifically, two types were swapped (a number passed for a String field, and a String passed for a Number field). They were defined as follows:
this.schema = new Schema({
main_info :
{
rea : {type:String},
name : {type:Number},
}
});
This is an example of the passed data that raises the validation error:
main_info :
{
"rea": 649087,
"name": "NUTIS S.R.L."
}
Anyway, I think it's crazy to manually parse all the data to find the problem. Does Mongoose really not provide any further info about the validation failure?
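For what it's worth, a minimal sketch of how the per-field details could be surfaced, assuming the document is saved through a model instance (the rating/data variables and the logged message are illustrative; Mongoose keys err.errors by the offending field path):
var rating = new Rating(data);
rating.save(function (err) {
  if (err && err.name === 'ValidationError') {
    // err.errors maps each failing path to a CastError/ValidatorError
    Object.keys(err.errors).forEach(function (path) {
      // e.g. main_info.name: Cast to Number failed for value "NUTIS S.R.L."
      console.log(path + ': ' + err.errors[path].message);
    });
  }
});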

Does saveToCassandra work with the Cassandra Lucene plugin?

I am implementing the example on the Lucene plugin for Cassandra page (https://github.com/Stratio/cassandra-lucene-index), and when I try to save the data using saveToCassandra I get a NoSuchElementException.
If I use CassandraConnector.withSessionDo I am able to add elements into Cassandra and no exception is raised.
The tables:
CREATE KEYSPACE demo
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};
USE demo;
CREATE TABLE tweets (
id INT PRIMARY KEY,
user TEXT,
body TEXT,
time TIMESTAMP,
latitude FLOAT,
longitude FLOAT
);
CREATE CUSTOM INDEX tweets_index ON tweets ()
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
'refresh_seconds' : '1',
'schema' : '{
fields : {
id : {type : "integer"},
user : {type : "string"},
body : {type : "text", analyzer : "english"},
time : {type : "date", pattern : "yyyy/MM/dd", sorted : true},
place : {type : "geo_point", latitude:"latitude", longitude:"longitude"}
}
}'
};
The code:
import org.apache.spark.{SparkConf, SparkContext, Logging}
import com.datastax.spark.connector.cql.CassandraConnector
import com.datastax.spark.connector._

object App extends Logging {
  def main(args: Array[String]) {
    // Get the Cassandra IP and create the Spark context
    val cassandraIP = System.getenv("CASSANDRA_IP")
    val sparkConf = new SparkConf(true)
      .set("spark.cassandra.connection.host", cassandraIP)
      .set("spark.cleaner.ttl", "3600")
      .setAppName("Simple Spark Cassandra Example")
    val sc = new SparkContext(sparkConf)

    // Works
    CassandraConnector(sparkConf).withSessionDo { session =>
      session.execute("INSERT INTO demo.tweets(id, user, body, time, latitude, longitude) VALUES (19, 'Name', 'Body', '2016-03-19 09:00:00-0300', 39, 39)")
    }

    // Does not work
    val demo = sc.parallelize(Seq((9, "Name", "Body", "2016-03-29 19:00:00-0300", 29, 29)))
    // Raises the exception
    demo.saveToCassandra("demo", "tweets", SomeColumns("id", "user", "body", "time", "latitude", "longitude"))
  }
}
The exception:
16/03/28 14:15:41 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
Exception in thread "main" java.util.NoSuchElementException: Column not found in demo.tweets
at com.datastax.spark.connector.cql.StructDef$$anonfun$columnByName$2.apply(Schema.scala:60)
at com.datastax.spark.connector.cql.StructDef$$anonfun$columnByName$2.apply(Schema.scala:60)
at scala.collection.Map$WithDefault.default(Map.scala:52)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at com.datastax.spark.connector.cql.TableDef$$anonfun$9.apply(Schema.scala:153)
at com.datastax.spark.connector.cql.TableDef$$anonfun$9.apply(Schema.scala:152)
at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:722)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:721)
at com.datastax.spark.connector.cql.TableDef.<init>(Schema.scala:152)
at com.datastax.spark.connector.cql.Schema$$anonfun$com$datastax$spark$connector$cql$Schema$$fetchTables$1$2.apply(Schema.scala:283)
at com.datastax.spark.connector.cql.Schema$$anonfun$com$datastax$spark$connector$cql$Schema$$fetchTables$1$2.apply(Schema.scala:271)
at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:722)
at scala.collection.immutable.Set$Set4.foreach(Set.scala:137)
at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:721)
at com.datastax.spark.connector.cql.Schema$.com$datastax$spark$connector$cql$Schema$$fetchTables$1(Schema.scala:271)
at com.datastax.spark.connector.cql.Schema$$anonfun$com$datastax$spark$connector$cql$Schema$$fetchKeyspaces$1$2.apply(Schema.scala:295)
at com.datastax.spark.connector.cql.Schema$$anonfun$com$datastax$spark$connector$cql$Schema$$fetchKeyspaces$1$2.apply(Schema.scala:294)
at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:722)
at scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:153)
at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:306)
at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:721)
at com.datastax.spark.connector.cql.Schema$.com$datastax$spark$connector$cql$Schema$$fetchKeyspaces$1(Schema.scala:294)
at com.datastax.spark.connector.cql.Schema$$anonfun$fromCassandra$1.apply(Schema.scala:307)
at com.datastax.spark.connector.cql.Schema$$anonfun$fromCassandra$1.apply(Schema.scala:304)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withClusterDo$1.apply(CassandraConnector.scala:121)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withClusterDo$1.apply(CassandraConnector.scala:120)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:109)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:139)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:120)
at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:304)
at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:275)
at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:36)
at com.webradar.spci.spark.cassandra.App$.main(App.scala:27)
at com.webradar.spci.spark.cassandra.App.main(App.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
EDITED:
Versions
Spark 1.6.0
Cassandra 3.0.3
Lucene plugin 3.0.3.1
For JAR creation I used the maven-assembly-plugin to get a fat JAR.
If I remove the custom index, I am able to use saveToCassandra.
It seems that the problem is caused by the Cassandra Spark driver, not by the plugin.
Since CASSANDRA-10217, Cassandra 3.x per-row indexes no longer need to be created on a fake column. Thus, from Cassandra 3.x the "CREATE CUSTOM INDEX %s ON %s(%s)" column-based syntax is replaced by the new "CREATE CUSTOM INDEX %s ON %s()" row-based syntax. However, the DataStax Spark driver doesn't seem to support this new feature yet.
When "com.datastax.spark.connector.RDDFunctions.saveToCassandra" is called, it tries to load the table schema and the index schema related to a table column. Since the new index syntax no longer has the fake column, this results in a NoSuchElementException due to an empty column name.
However, saveToCassandra works well if you execute the same example with the prior fake-column syntax:
CREATE KEYSPACE demo
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};
USE demo;
CREATE TABLE tweets (
id INT PRIMARY KEY,
user TEXT,
body TEXT,
time TIMESTAMP,
latitude FLOAT,
longitude FLOAT,
lucene TEXT
);
CREATE CUSTOM INDEX tweets_index ON tweets (lucene)
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
'refresh_seconds' : '1',
'schema' : '{
fields : {
id : {type : "integer"},
user : {type : "string"},
body : {type : "text", analyzer : "english"},
time : {type : "date", pattern : "yyyy/MM/dd", sorted : true},
place : {type : "geo_point", latitude:"latitude", longitude:"longitude"}
}
}'
};

ElasticSearch 1.5 upgrade: JodaTime conversion script error

I'm working through upgrading our Elasticsearch version from 1.3 to 1.5. We use the Java API heavily, including the following script in an ES query:
{
"script" : {
"script" : "values contains (int)doc['timestamp'].date.toDateTime(DateTimeZone.forID('America/New_York')).getMonthOfYear()",
"params" : {
"values" : [ 1 ]
},
"lang" : "groovy"
}
}
This works with 1.3, but gives the following error in 1.5:
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to execute phase [query_fetch], all shards failed; shardFailures {[XwOu9zq0TMi2uOptdfIS7w][eventdata][0]: QueryPhaseExecutionException[[eventdata][0]: query[filtered(ConstantScore(ScriptFilter(values contains (int)doc['timestamp'].date.toDateTime(DateTimeZone.forID('America/New_York')).getMonthOfYear().toString())))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter#2a8c0465)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: GroovyScriptExecutionException[MissingMethodException[No signature of method: Script1.contains() is applicable for argument types: (java.lang.Class) values: [int]
Possible solutions: toString(), toString(), notify()]]; }
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:238)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:184)
at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:565)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
What's the best way to address this?
That looks like a bracketing issue with the cast, so the parser thinks you're passing int to contains. Try adding brackets to help the parser out:
values.contains((int)doc['timestamp'].date.toDateTime(DateTimeZone.forID('America/New_York')).getMonthOfYear())
