Error with CASSANDRA + PIG + CQL + Counter Column - cassandra

I'm using Pig to access a column family with a counter column in Cassandra. When I try to dump the data, I get the error below:
cqlsh:pollkan> CREATE TABLE votes_count_period_1 (
... period int,
... poll text,
... votes counter,
... PRIMARY KEY (period, poll)
... );
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130831 AND poll = '405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130831 AND poll = '405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130831 AND poll = '505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130831 AND poll = '505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130831 AND poll = '505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_1 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> select * from votes_count_period_1;
period | poll | votes
----------+--------------------------------------+-------
20130830 | 605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a | 5
20130831 | 405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a | 2
20130831 | 505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a | 3
root@batch:/usr/share/cassandra# pig -x local
2013-08-31 23:02:06,135 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1 (r1459164) compiled Mar 21 2013, 06:14:38
2013-08-31 23:02:06,136 [main] INFO org.apache.pig.Main - Logging error messages to: /usr/share/cassandra/pig_1377982926133.log
2013-08-31 23:02:06,154 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2013-08-31 23:02:06,252 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
grunt> register /usr/share/cassandra/apache-cassandra-1.2.9.jar
grunt> register /usr/share/cassandra/apache-cassandra-thrift-1.2.9.jar
grunt> register /usr/share/cassandra/lib/libthrift-0.7.0.jar
grunt> A = LOAD 'cql://pollkan/votes_count_period_1' USING org.apache.cassandra.hadoop.pig.CqlStorage();
grunt> DUMP A;
The error:
2013-08-31 23:01:35,397 [pool-4-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed ColumnFamilySplit((-69569900416187863, '-54603788994328078] #[cassandra001, cassandra002, cassandra003])
2013-08-31 23:01:35,417 [pool-4-thread-1] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2013-08-31 23:01:35,418 [pool-4-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: A[2,4] C: R:
2013-08-31 23:01:35,424 [Thread-10] INFO org.apache.hadoop.mapred.LocalJobRunner - Map task executor complete.
2013-08-31 23:01:35,428 [Thread-10] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local712790083_0002
java.lang.Exception: java.lang.IndexOutOfBoundsException
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:538)
at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:410)
at org.apache.cassandra.db.context.CounterContext.total(CounterContext.java:477)
at org.apache.cassandra.db.marshal.AbstractCommutativeType.compose(AbstractCommutativeType.java:34)
at org.apache.cassandra.db.marshal.AbstractCommutativeType.compose(AbstractCommutativeType.java:25)
at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.columnToTuple(AbstractCassandraStorage.java:137)
at org.apache.cassandra.hadoop.pig.CqlStorage.getNext(CqlStorage.java:110)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
I read that https://issues.apache.org/jira/browse/CASSANDRA-5234 resolved issues with CQL3 tables and counter columns, but I'm still having problems.
By the way, I tried re-creating the table with the old-style COMPACT STORAGE, which got me a little further, but then I got stuck on a new issue with the error below:
cqlsh:pollkan> CREATE TABLE votes_count_period_2 (
... period int,
... poll text,
... votes counter,
... PRIMARY KEY (period, poll)
... ) WITH COMPACT STORAGE;
cqlsh:pollkan>
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130831 AND poll = '405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130831 AND poll = '405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130831 AND poll = '505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130831 AND poll = '505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130831 AND poll = '505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan> UPDATE votes_count_period_2 SET votes = votes + 1 WHERE period = 20130830 AND poll = '605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a';
cqlsh:pollkan>
cqlsh:pollkan> select * from votes_count_period_2;
period | poll | votes
----------+--------------------------------------+-------
20130830 | 605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a | 5
20130831 | 405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a | 2
20130831 | 505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a | 3
root@batch:/usr/share/cassandra# pig -x local
2013-08-31 23:02:06,135 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1 (r1459164) compiled Mar 21 2013, 06:14:38
2013-08-31 23:02:06,136 [main] INFO org.apache.pig.Main - Logging error messages to: /usr/share/cassandra/pig_1377982926133.log
2013-08-31 23:02:06,154 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2013-08-31 23:02:06,252 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
grunt> register /usr/share/cassandra/apache-cassandra-1.2.9.jar
grunt> register /usr/share/cassandra/apache-cassandra-thrift-1.2.9.jar
grunt> register /usr/share/cassandra/lib/libthrift-0.7.0.jar
grunt> A = LOAD 'cql://pollkan/votes_count_period_2' USING org.apache.cassandra.hadoop.pig.CqlStorage();
grunt> DUMP A;
2013-08-31 23:05:59,454 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-08-31 23:05:59,458 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2013-08-31 23:05:59,465 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-08-31 23:05:59,466 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
((period,20130830),(poll,605bd9c0-aa05-11e3-8c9a-4d42ba09ab2a),(votes,5))
((period,20130831),(poll,405bd9c0-0d05-11e3-8c9a-4d42ba09ab2a),(votes,2))
((period,20130831),(poll,505bd9c0-ff05-11e3-8c9a-4d42ba09ab2a),(votes,3))
grunt> A = LOAD 'cql://pollkan/votes_count_period_2' USING org.apache.cassandra.hadoop.pig.CqlStorage();
grunt> B = FOREACH A GENERATE poll, votes;
grunt> describe B;
B: {poll: chararray,votes: long}
grunt> C = GROUP B BY poll;
grunt> describe C;
C: {group: chararray,B: {(poll: chararray,votes: long)}}
grunt> D = FOREACH C GENERATE group AS pollgroup, SUM(B.votes);
grunt> describe D;
D: {pollgroup: chararray,long}
grunt> dump D;
2013-08-31 23:53:32,577 [pool-33-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map - Aliases being processed per job phase (AliasName[line,offset]): M: A[13,4],B[14,4],D[18,4],C[17,4] C: D[18,4],C[17,4] R: D[18,4]
2013-08-31 23:53:32,586 [pool-33-thread-1] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
2013-08-31 23:53:32,589 [Thread-65] INFO org.apache.hadoop.mapred.LocalJobRunner - Map task executor complete.
2013-08-31 23:53:32,591 [Thread-65] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local814297309_0018
java.lang.Exception: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to java.lang.String
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to java.lang.String
at org.apache.pig.backend.hadoop.HDataType.getWritableComparableTypes(HDataType.java:76)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:285)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
My versions are Pig 0.11.1 and Cassandra 1.2.9.
Any help?
Thanks

I found the same problem earlier today while testing the latest Pig CQL3 integration with similar data structures.
The JIRA issue you mentioned, https://issues.apache.org/jira/browse/CASSANDRA-5234, does contain a patch that one of the commenters has verified to work. However, a quick look through the Cassandra git history shows that it has not been applied to either the 1.2 branch or trunk. I have added a comment to that effect on the JIRA issue.
Until the patch is committed and a new stable version is released, a workaround would be to apply the patch to a fresh checkout of 1.2.9, recompile, and deploy the result to your Hadoop nodes, if that is an option for you.
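In case it is useful, a rough sketch of what that could look like follows; the checkout tag, patch file name, and paths are assumptions to adapt to your environment and to whatever is actually attached to CASSANDRA-5234:
# sketch only: adjust the tag, patch name and paths to your setup
git clone https://github.com/apache/cassandra.git
cd cassandra
git checkout cassandra-1.2.9
git apply /path/to/CASSANDRA-5234.patch   # patch downloaded from the JIRA issue
ant jar                                   # Cassandra 1.2.x builds with ant
# then replace the apache-cassandra-1.2.9.jar you register in grunt (and the copy
# deployed on the Hadoop nodes) with the jar produced under build/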

Related

JSR223 Sampler returns Response same as Request for EDIFACT message in JMeter

The JSR223 Sampler returns a Response that is identical to the Request for an EDIFACT message.
Request:
def payload = "UNB+IATA:1+1S+XX+121103+FF168019110033++ETK1+O'\n" +
"UNH+1+TKCREQ:00:1:IA'\n" +
"MSG+:131'\n" +
"ORG+1S+99999999:X7HH+VZX++T+GR+CXN'\n" +
"TKT+676713121:T'\n" +
"UNT+5+1'\n" +
"UNZ+1+FF168019110033
Response:
"UNB+IATA:1+1S+XX+121103+FF168019110033++ETK1+O'\n" +
"UNH+1+TKCREQ:00:1:IA'\n" +
"MSG+:131'\n" +
"ORG+1S+99999999:X7HH+VZX++T+GR+CXN'\n" +
"TKT+676713121:T'\n" +
"UNT+5+1'\n" +
"UNZ+1+FF168019110033
Log:
2023-02-07 15:33:39,890 DEBUG o.a.j.p.t.s.TCPSampler: Created org.apache.jmeter.protocol.tcp.sampler.TCPSampler#4a6c4b41
2023-02-07 15:33:39,912 DEBUG o.a.j.p.t.s.TCPSampler: Created org.apache.jmeter.protocol.tcp.sampler.TCPSampler#7bd45f7b
2023-02-07 15:33:39,939 INFO o.a.j.e.StandardJMeterEngine: Running the test!
2023-02-07 15:33:39,939 INFO o.a.j.s.SampleEvent: List of sample_variables: []
2023-02-07 15:33:39,942 INFO o.a.j.g.u.JMeterMenuBar: setRunning(true, *local*)
2023-02-07 15:33:40,147 INFO o.a.j.e.StandardJMeterEngine: Starting ThreadGroup: 1 : Thread Group
2023-02-07 15:33:40,147 INFO o.a.j.e.StandardJMeterEngine: Starting 1 threads for group Thread Group.
2023-02-07 15:33:40,147 INFO o.a.j.e.StandardJMeterEngine: Thread will start next loop on error
2023-02-07 15:33:40,147 INFO o.a.j.t.ThreadGroup: Starting thread group... number=1 threads=1 ramp-up=1 perThread=1000.0 delayedStart=false
2023-02-07 15:33:40,150 INFO o.a.j.t.ThreadGroup: Started thread group number 1
2023-02-07 15:33:40,150 INFO o.a.j.e.StandardJMeterEngine: All thread groups have been started
2023-02-07 15:33:40,150 INFO o.a.j.t.JMeterThread: Thread started: Thread Group 1-1
2023-02-07 15:33:40,162 INFO o.a.j.t.JMeterThread: Thread is done: Thread Group 1-1
2023-02-07 15:33:40,162 INFO o.a.j.t.JMeterThread: Thread finished: Thread Group 1-1
2023-02-07 15:33:40,162 INFO o.a.j.e.StandardJMeterEngine: Notifying test listeners of end of test
2023-02-07 15:33:40,162 INFO o.a.j.g.u.JMeterMenuBar: setRunning(false, *local*)
SampleResult fields:
ContentType:
DataEncoding: windows-1252
Steps followed:
Set up TCP in JMeter properties:
tcp.handler=TCPClientImpl
eolByte = 111
tcp.eolByte=1000
tcp.charset=
tcp.status.prefix=Status=
tcp.status.suffix=.
tcp.binarylength.prefix.length=2
TCP Sampler Config
TCPClient classname=TCPClientImpl
Servername=xxxxxx
Port: 3432
Timeouts: Connect(2000ms,)Response: 2000ms
Reuse Connection -enabled
JSR223 Sampler
Payload Request :
def payload = "UNB+IATA:1+1S+XX+121103+FF168019110033++ETK1+O'\n" +
"UNH+1+TKCREQ:00:1:IA'\n" +
"MSG+:131'\n" +
"ORG+1S+99999999:X7HH+VZX++T+GR+CXN'\n" +
"TKT+676713121:T'\n" +
"UNT+5+1'\n" +
"UNZ+1+FF168019110033'
If this code:
def payload = "UNB+IATA:1+1S+XX+121103+FF168019110033++ETK1+O'\n" +
"UNH+1+TKCREQ:00:1:IA'\n" +
"MSG+:131'\n" +
"ORG+1S+99999999:X7HH+VZX++T+GR+CXN'\n" +
"TKT+676713121:T'\n" +
"UNT+5+1'\n" +
"UNZ+1+FF168019110033"
is the complete code for the JSR223 Sampler - it does absolutely nothing.
If you want to send it over TCP, you need to put the payload into the "Text to send" field of the TCP Sampler.
If you want to send the payload using the JSR223 Sampler, you need to add the relevant code to do it; the simplest option is:
def payload = "UNB+IATA:1+1S+XX+121103+FF168019110033++ETK1+O'\n" +
"UNH+1+TKCREQ:00:1:IA'\n" +
"MSG+:131'\n" +
"ORG+1S+99999999:X7HH+VZX++T+GR+CXN'\n" +
"TKT+676713121:T'\n" +
"UNT+5+1'\n" +
"UNZ+1+FF168019110033"
def socket = new Socket('xxxxxx', 3432)
socket.withStreams { input, output ->
    // send the payload first, then read and log one line of the server's response
    output << payload
    output.flush()
    log.info(input.newReader().readLine())
}
The output will go to jmeter.log file.
More information on Groovy scripting in JMeter: Apache Groovy: What Is Groovy Used For?
Sending the request in JMeter as below:
import org.apache.jmeter.util.JsseSSLManager;
import org.apache.jmeter.util.SSLManager;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
int your_app_port = 7XX0;
String your_app_host = 'XXXXXXX';
JsseSSLManager sslManager = (JsseSSLManager) SSLManager.getInstance();
SSLSocketFactory sslsocketfactory = sslManager.getContext().getSocketFactory();
SSLSocket sslsocket = (SSLSocket) sslsocketfactory.createSocket('XXXXXXX', 7XX0)
def sessInboundQueue = System.getProperties().get("SessionInbound")
def destinationInboundQueue = System.getProperties().get("DestinationInbound")
def payload = """UNB+IATA:1+1S+1D+201112:0204+FF168019119008++ETK1+O'
UNH+1+TKCREQ:00:1:IA'
MSG+:131'
ORG+1S+99999909:X7HG+ATH++T+GR+ABK'
TKT+9992170003108:T'
UNT+5+1'
UNZ+1+FF168019119008'"""
def msg = sessInboundQueue.createTextMessage(payload)
Response:
Response code:500
Response message:javax.script.ScriptException: java.lang.NullPointerException: Cannot invoke method createTextMessage() on null object
Can someone help me?
#java #jmeter

Spark stage taking too long - 2 executors doing "all" the work

I've been trying to figure this out for the past day, but have not been successful.
Problem I am facing
I'm reading a parquet file that is about 2 GB. The initial read has 14 partitions, which eventually gets split into 200 partitions. I perform a seemingly simple SQL query that runs for 25+ minutes, and about 22 minutes of that is spent on a single stage. Looking at the Spark UI, I see that all the computation is eventually pushed to about 2 to 4 executors, with lots of shuffling. I don't know what is going on and would appreciate any help.
Setup
Spark environment - Databricks
Cluster mode - Standard
Databricks Runtime Version - 6.4 ML (includes Apache Spark 2.4.5, Scala 2.11)
Cloud - Azure
Worker Type - 56 GB, 16 cores per machine. Minimum 2 machines
Driver Type - 112 GB, 16 cores
Notebook
Cell 1: Helper functions
load_data = function(path, type) {
input_df = read.df(path, type)
input_df = withColumn(input_df, "dummy_col", 1L)
createOrReplaceTempView(input_df, "__current_exp_data")
## Helper function to run query, then save as table
transformation_helper = function(sql_query, destination_table) {
createOrReplaceTempView(sql(sql_query), destination_table)
}
## Transformation 0: Calculate max date, used for calculations later on
transformation_helper(
"SELECT 1L AS dummy_col, MAX(Date) max_date FROM __current_exp_data",
destination_table = "__max_date"
)
## Transformation 1: Make initial column calculations
transformation_helper(
"
SELECT
cId AS cId
, date_format(Date, 'yyyy-MM-dd') AS Date
, date_format(DateEntered, 'yyyy-MM-dd') AS DateEntered
, eId
, (CASE WHEN isnan(tSec) OR isnull(tSec) THEN 0 ELSE tSec END) AS tSec
, (CASE WHEN isnan(eSec) OR isnull(eSec) THEN 0 ELSE eSec END) AS eSec
, approx_count_distinct(eId) OVER (PARTITION BY cId) AS dc_eId
, COUNT(*) OVER (PARTITION BY cId, Date) AS num_rec
, datediff(Date, DateEntered) AS analysis_day
, datediff(max_date, DateEntered) AS total_avail_days
FROM __current_exp_data
CROSS JOIN __max_date ON __main_data.dummy_col = __max_date.dummy_col
",
destination_table = "current_exp_data_raw"
)
## Transformation 2: Drop row if Date is not valid
transformation_helper(
"
SELECT
cId
, Date
, DateEntered
, eId
, tSec
, eSec
, analysis_day
, total_avail_days
, CASE WHEN analysis_day == 0 THEN 0 ELSE floor((analysis_day - 1) / 7) END AS week
, CASE WHEN total_avail_days < 7 THEN NULL ELSE floor(total_avail_days / 7) - 1 END AS avail_week
FROM current_exp_data_raw
WHERE
isnotnull(Date) AND
NOT isnan(Date) AND
Date >= DateEntered AND
dc_eId == 1 AND
num_rec == 1
",
destination_table = "main_data"
)
cacheTable("main_data_raw")
cacheTable("main_data")
}
spark_sql_as_data_table = function(query) {
data.table(collect(sql(query)))
}
get_distinct_weeks = function() {
spark_sql_as_data_table("SELECT week FROM current_exp_data GROUP BY week")
}
Cell 2: Call helper function that triggers the long running task
library(data.table)
library(SparkR)
spark = sparkR.session(sparkConfig = list())
load_data_pq("/mnt/public-dir/file_0000000.parquet")
set.seed(1234)
get_distinct_weeks()
Long running stage DAG
Stats about long running stage
Logs
I trimmed the logs down and show only entries that appeared multiple times below:
BlockManager: Found block rdd_22_113 locally
CoarseGrainedExecutorBackend: Got assigned task 812
ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
InMemoryTableScanExec: Predicate (dc_eId#61L = 1) generates partition filter: ((dc_eId.lowerBound#622L <= 1) && (1 <= dc_eId.upperBound#621L))
InMemoryTableScanExec: Predicate (num_rec#62L = 1) generates partition filter: ((num_rec.lowerBound#627L <= 1) && (1 <= num_rec.upperBound#626L))
InMemoryTableScanExec: Predicate isnotnull(Date#57) generates partition filter: ((Date.count#599 - Date.nullCount#598) > 0)
InMemoryTableScanExec: Predicate isnotnull(DateEntered#58) generates partition filter: ((DateEntered.count#604 - DateEntered.nullCount#603) > 0)
MemoryStore: Block rdd_17_104 stored as values in memory (estimated size <VERY SMALL NUMBER < 10> MB, free 10.0 GB)
ShuffleBlockFetcherIterator: Getting 200 non-empty blocks including 176 local blocks and 24 remote blocks
ShuffleBlockFetcherIterator: Started 4 remote fetches in 1 ms
UnsafeExternalSorter: Thread 254 spilling sort data of <Between 1 and 3 GB> to disk (3 times so far)

How to find the process that is causing deadlock in MSSQL

I am seeing latency in the arrival and departure messages going from an on-board computer (OBC) to a third-party application that generates an XML file with the information gathered from the OBC.
I tried updating the statuses of some of the transactions in the DB to allow the third-party application to process the information, but it hasn't helped.
Below is the high-level, chronological order of the logic in the MIF scheduler code for the Arrival and Depart OBC interfaces (MPNL.Services:mapPacosEvent):
Perform:
Select uniqkey, status, vehicle_number, created_datetime, msn, base_msn, message_type, form_id, message_text from ALLINBOUNDMESSAGES where
status IN (0,6,7,8,9) AND (form_id = '003' OR form_id = '005') AND ((message_type = 'pacos') OR (message_type = 'form'))
ORDER BY created_datetime
For each row returned by the above query, get the dispatch_number from the message_text. E.g.
Vehicle 103184 has arrived at Stop #5 (310466-000) at 2019-07-19 08:00:14 (local time).^886727^5^310466-000^41.412191^-73.454032^68139.250^401773.3
(here the dispatch_number is 886727)
Perform:
select dispatch_id, driver from dispatch where dispatch_number='${dispatch_number}' ORDER BY created_datetime DESC
Perform:
select status,dispatch_number,stop_number,scheduled_arrival,stopprofile_id,planned5, planned4 ,planned3 ,planned2 from dispstops where dispatch_id='${dispatch_id}' AND stop_number='${stop_number}' AND status IN (${StopStatusValue}) ORDER BY Status Asc ,stop_number DESC.
Here the stop_number and StopStatusValues are read from the MPNLInboundInterfaces_ConfigFile.xml file.
Perform:
select DISTINCT t1.login ,t1.login_datime_gmt from vehicleloginxref t1
join [elogevents] t2 ON ((t1.vehicle_number = t2.vehicle_number) AND (t1.login = t2.driverid))
where((t1.status = '${status}') AND (t1.userstatus = '${userstatus}') AND (t2.eid = '${eid}') AND (t1.vehicle_number='${vehicle_number}') AND (t1.logout_datime_gmt IS NULL)) ORDER BY login_datime_gmt DESC
Here the input is status, eid and vehicle_number.
Perform:
Select the primary driver details from TPNE using the query below:
select secondary_driver, carrier_driver_id from driver where secondary_driver != '' and secondary_driver is not null AND mark_for_deletion!='1'
Perform:
Select TP_company_ID,LAST_KNOWN_FACILITY_ALIAS_ID,CARRIER_ID from DRIVER where Carrier_Driver_ID ='${DriverID}' AND mark_for_deletion!='1'
Perform:
Select Last known FacilityAliasID from TPE
Select TP_company_ID,LAST_KNOWN_FACILITY_ALIAS_ID,CARRIER_ID from DRIVER where Carrier_Driver_ID ='${DriverID}' AND mark_for_deletion!='1'
select carrier_code from carrier_code where TP_company_ID= '${TPCompanyID}' and CARRIER_ID='${CARRIER_ID}'
select gmtoffset from opcenters where opcenter = (select opCenter from vehicles where vehicle_number = '${vehicle_number}' )
select time_zone_name from time_zone where time_zone_id = (select min(time_zone_id) from time_zone where gmt_offset = '${gmtoffset}')
select uniqkey,vehicle_number,created_datetime,msn,base_msn,form_id,dispatchstop,on_dock_time,trailer,pro,seal,detention_hours,auto_latlong,
auto_location,auto_odometer,userflag1,userflag2,userfield1,userfield2,dispatch_number,stop_number from ${tablename1} where msn='${msn}' AND
base_msn='${base_msn}' AND vehicle_number='${vehicle_number}' AND userflag1 NOT in ('${successStatus}','${errorStatus}')
The scheduler service MPNL.Services:mapPacosEvent composes the Arrival_OBC and Depart_OBC XMLs using the information pulled from the TPNE and PeopleNet databases and sends them to MIF.
Once the message is sent to MIF successfully, the scheduler updates the ALLINBOUNDMESSAGES table using the query below:
Update status='1' where uniqkey='' and vehicle_number='' and message_type='' and form_id='';
If sending the Arrival_OBC and Depart_OBC XMLs fails, the scheduler updates the ALLINBOUNDMESSAGES table using the query below:
Update status='10' where uniqkey='' and vehicle_number='' and message_type='' and form_id=''
The error:
2019-07-17 14:02:00 EDT [ART.0114.1007E] Adapter Runtime: Error Logged. See Error log for details. Error: [ADA.1.316] Cannot execute the SQL statement "UPDATE allinboundmessages SET status = ? WHERE uniqkey = ? AND vehicle_number = ? AND message_type = ? AND form_id = ?". "
(40001/1205) Transaction (Process ID 78) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction."
Transaction (Process ID 78) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
2019-07-17 14:02:00 EDT [ART.0114.1007E] Adapter Runtime: Error Logged. See Error log for details. Error: [ART.117.4002] Adapter Runtime (Adapter Service): Unable to invoke adapter service MPNL.AdapterServices:updateAllInboundMessageRecs.
[ADA.1.316] Cannot execute the SQL statement "UPDATE allinboundmessages SET status = ? WHERE uniqkey = ? AND vehicle_number = ? AND message_type = ? AND form_id = ?". "
(40001/1205) Transaction (Process ID 78) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction."
Transaction (Process ID 78) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
If you run the stored proc sp_who2, it will show you what processes are blocked and what process is doing the blocking.
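As a sketch of what that can look like in practice, together with pulling the deadlock graphs that the built-in system_health Extended Events session records by default on SQL Server 2008 and later (the procedures and views below are standard system objects):
-- Live view of sessions; the BlkBy column shows which session is blocking which.
EXEC sp_who2;

-- The same information from the DMVs: requests currently waiting on another session.
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;

-- For deadlocks specifically, pull the deadlock graphs already captured by the
-- built-in system_health Extended Events session.
SELECT XEvent.query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM (
    SELECT CAST(st.target_data AS XML) AS TargetData
    FROM sys.dm_xe_session_targets st
    JOIN sys.dm_xe_sessions s ON s.address = st.event_session_address
    WHERE s.name = 'system_health' AND st.target_name = 'ring_buffer'
) AS t
CROSS APPLY TargetData.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS q(XEvent);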

Dse\Exception\RuntimeException: All connections on all I/O threads are busy

We have a facility in our web app to delete large quantities of data. We do this by paginating through all records found against u_id.
The keys we have are designed for other queries we have in the application - ideally, it would be great to have a primary key for u_id but this would break all our other queries.
The method below works well most of the time; however, after deleting approximately 6-8 million records, we get:
Dse\Exception\RuntimeException: All connections on all I/O threads are busy
We also sometimes get a slightly different error message:
Dse\Exception\ReadTimeoutException: Operation timed out - received only 0 responses
You'll notice usleep(2500000) in the code below, which pauses the script. This has been our workaround, but it would be good to get this resolved, as Cassandra should be able to handle this number of deletes.
$cluster = \Dse::cluster()
->withDefaultTimeout(3600)
->withContactPoints(env('CA_HOST'))
->build();
$session = $cluster->connect(env('CONNECT'));
$options = array('page_size' => 50);
$results = $session->execute("SELECT * FROM datastore WHERE u_id = $u_id;", $options);
$future_deletes = array();
while (true) {
    foreach ($results as $result) {
        $future_deletes[] = $session->executeAsync("DELETE FROM datastore WHERE record_id = '" . $result['record_id'] . "' AND record_version = " . $result['record_version'] . " AND user_id = " . $result['user_id']);
        $future_deletes[] = $session->executeAsync("UPDATE data_count set u_count = u_count - 1 WHERE u_id = " . $u_id);
    }

    if (!empty($future_deletes)) {
        foreach ($future_deletes as $future_delete) {
            // we will not wait for each result for more than 5 seconds
            $future_delete->get(5);
        }
        //usleep(2500000); //2.5 seconds
    }

    $future_deletes = array();

    if ($results->isLastPage()) {
        break;
    }

    $results = $results->nextPage();
}
//Disconnect
$session = NULL;
For your reference, here are our tables:
CREATE TABLE datastore (id uuid,
record_id varchar,
record_version int,
user_id int,
u_id int,
column_1 varchar,
column_2 varchar,
column_3 varchar,
column_4 varchar,
column_5 varchar,
PRIMARY KEY((record_id), record_version, user_id)
);
CREATE INDEX u_id ON datastore (u_id);
CREATE TABLE data_count (u_id int PRIMARY KEY, u_count counter);
We are running a server with 8GB RAM.
The version of the DSE driver is 6.0.1.
Thank you in advance!
You need to control how many "in-flight" requests you have at the same point in time. There is a limit on the number of queries per connection and on the number of connections; both are controlled by the corresponding functions of the Cluster class (I can't find them quickly in the PHP docs, but they should be similar to the Cluster functions in the C++ driver, because the PHP driver is built on top of the C++ driver).
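I cannot quote the exact PHP driver option names from memory, but even without touching the cluster configuration you can throttle the loop yourself. A minimal sketch using only the calls already present in your code (the limit of 100 outstanding requests is an arbitrary illustrative value, not a driver constant):
// Inside the existing pagination loop: cap the number of in-flight async requests
// instead of sleeping for a fixed 2.5 seconds.
$max_in_flight = 100;   // illustrative value, tune for your cluster
$future_deletes = array();

foreach ($results as $result) {
    $future_deletes[] = $session->executeAsync(
        "DELETE FROM datastore WHERE record_id = '" . $result['record_id'] .
        "' AND record_version = " . $result['record_version'] .
        " AND user_id = " . $result['user_id']);
    $future_deletes[] = $session->executeAsync(
        "UPDATE data_count SET u_count = u_count - 1 WHERE u_id = " . $u_id);

    if (count($future_deletes) >= $max_in_flight) {
        // wait for the current batch to finish before queueing more work
        foreach ($future_deletes as $future_delete) {
            $future_delete->get(5);
        }
        $future_deletes = array();
    }
}

// drain whatever is still outstanding for this page
foreach ($future_deletes as $future_delete) {
    $future_delete->get(5);
}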

Problem when using Hector 0.8.0 to addCounter

I'm using hector-core 0.8.0-1 and Cassandra 0.8.0 to test an addCounter operation, but I found that my code cannot insert any data into the CF. Can anyone tell me the reason?
StringSerializer ser = StringSerializer.get();
Mutator<String> mutator = HFactory.createMutator(keyspace, ser);
List<HCounterColumn<String>> counterColumns = Arrays.asList(
        HFactory.createCounterColumn("1", 30L, ser),
        HFactory.createCounterColumn("2", 20L, ser));
for (HCounterColumn c : counterColumns) {
    mutator.addCounter("testKey1", "CounterColumn", c);
    mutator.addCounter("testKey2", "CounterColumn", c);
}
mutator.execute();
and I found the following info in my log:
2011-06-21 17:17:00,025 [Thread-3] INFO me.prettyprint.cassandra.hector.TimingLogger - Tag                Avg(ms)  Min    Max    Std Dev  95th   Count
2011-06-21 17:17:00,030 [Thread-3] INFO me.prettyprint.cassandra.hector.TimingLogger - WRITE.fail_        4.84     4.84   4.84   0.00     4.84   1
2011-06-21 17:17:00,031 [Thread-3] INFO me.prettyprint.cassandra.hector.TimingLogger - META_WRITE.fail_   17.20    11.31  23.09  5.89     23.09  2
2011-06-21 17:17:00,031 [Thread-3] INFO me.prettyprint.cassandra.hector.TimingLogger -
It seems something goes wrong while doing mutator.execute().
Thanks in advance!
Currently, in Cassandra 0.8.0, you cannot create counter columns in a Column Family that is not specifically created to handle counters:
create column family Counter1 with default_validation_class = CounterColumnType;
Here is the JIRA reference: https://issues.apache.org/jira/browse/CASSANDRA-2614
You can also set this programmatically from Hector:
cfDef.setDefaultValidationClass(...)
This is available in the latest version of trunk and in the 0.7.0 and 0.8.0 branches, so you would need to build Hector from source.
Or, if you want to do that and assuming you are using the latest Hector available at Maven Central, you can do this:
ThriftCfDef cfDef = new ThriftCfDef(keyspaceName, columnFamilyName, comparatorType);
cfDef.setDefaultValidationClass(ComparatorType.COUNTERTYPE.getClassName());
cluster.addColumnFamily(cfDef);
