I am using the DataStax driver to do Cassandra writes asynchronously and would like to know whether there is a way to retry writes on failure. The Throwable passed to onFailure does not seem to contain the request.
public void onQueryComplete(final ResultSetFuture rsf)
{
Futures.addCallback(rsf, new FutureCallback<ResultSet>()
{
@Override
public void onSuccess(ResultSet resultSet)
{
totalRecordsWritten.incrementAndGet();
jobContext.putLong("MDPREC_WRITE_CNT", totalRecordsWritten.get());
System.out.println("Ingestion succesful " + totalRecordsWritten.get());
Logging.log(Logging.INFO, "CassandraPersistence.java ingest() Ingestion succesful");
}
@Override
public void onFailure(Throwable throwable)
{
jobContext.putInt("MDP_WRITE_FAILED", 1);
Logging.log(Logging.INFO, "CassandraPersistence.java ingest() Ingestion failed");
throw new UnexpectedJobExecutionException("Exception while inserting data Job terminated" + throwable.getMessage());
}
});
}
You can implement your own retry strategy in case you don't want to save each query along with the result handler:
http://christopher-batey.blogspot.de/2013/10/cassandra-datastax-java-driver-retry.html
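If you would rather retry at the application level, one option is to keep the request in scope of the callback so it can be re-submitted on failure. The following is only a minimal sketch of that idea: it uses the JDK's CompletableFuture as a stand-in for ResultSetFuture, and AsyncRetry, withRetry and the flakyWrite function are hypothetical names for illustration, not part of the DataStax driver.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class AsyncRetry {
    /**
     * Runs an async operation for the given request and re-submits the same
     * request when the returned future fails, up to maxAttempts in total.
     */
    static <Q, R> CompletableFuture<R> withRetry(Function<Q, CompletableFuture<R>> executeAsync,
                                                 Q request, int maxAttempts) {
        CompletableFuture<R> result = new CompletableFuture<>();
        attempt(executeAsync, request, maxAttempts, result);
        return result;
    }

    private static <Q, R> void attempt(Function<Q, CompletableFuture<R>> executeAsync,
                                       Q request, int attemptsLeft, CompletableFuture<R> result) {
        executeAsync.apply(request).whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (attemptsLeft > 1) {
                // The request is still in scope here, so it can be sent again.
                attempt(executeAsync, request, attemptsLeft - 1, result);
            } else {
                // Out of attempts: surface the last failure to the caller.
                result.completeExceptionally(error);
            }
        });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Hypothetical flaky write: fails on the first two calls, then succeeds.
        Function<String, CompletableFuture<String>> flakyWrite = query -> {
            CompletableFuture<String> future = new CompletableFuture<>();
            if (calls.incrementAndGet() < 3) {
                future.completeExceptionally(new RuntimeException("write timeout"));
            } else {
                future.complete("applied: " + query);
            }
            return future;
        };
        System.out.println(withRetry(flakyWrite, "INSERT ...", 5).join());
        System.out.println("attempts: " + calls.get());
    }
}
```

With the real driver the same shape should apply: the callback is created per statement, so the statement can be captured by the callback and passed to executeAsync again. A retry limit (and probably a delay between attempts) is worth keeping so a dead cluster does not cause an endless loop.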
I am attempting to close a stream coming from an HTTP request using Retrofit and RxJava, either because it timed out or because I need to change details that went into the request. Both appear to work: when I cancel the subscription I get the doOnCancel debug message, and when onNext completes I get the doOnTerminate message. I also do not receive input lines from multiple threads. However, my thread count rises every single time either of these actions happens. It appears that responseBody.close() is not releasing its resources, so the thread is not dying (I have also gotten error messages along the lines of "OkHttp leaked. Did you close your responseBody?").
Does anyone have any suggestions?
public boolean closeSubscription() {
flowableAlive = false;
subscription.cancel();
return true;
}
public void subscribeToFlowable() {
streamFlowable.observeOn(Schedulers.newThread()).subscribeOn(Schedulers.newThread())
.doOnTerminate(() -> log.debug("TERMINATED")).doOnCancel(() -> log.debug("FLOWABLE CANCELED"))
.subscribe(new Subscriber<ResponseBody>() {
@Override
public void onSubscribe(Subscription s) {
subscription = s;
subscription.request(Long.MAX_VALUE);
}
@Override
public void onNext(ResponseBody responseBody) {
log.debug("onNext called");
String inputLine;
try (InputStream inputStream = responseBody.byteStream()) {
BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
while (flowableAlive && ((inputLine = br.readLine()) != null)) {
log.debug("stream receive input line for thread " + name);
log.debug(inputLine);
}
} catch (IOException e) {
log.debug("error occurred");
log.debug(e.getMessage());
}
}
@Override
public void onError(Throwable t) {
log.debug("error");
flowableAlive = false;
}
@Override
public void onComplete() {
log.debug("completed");
closeSubscription();
flowableAlive = false;
}
});
}
The result of subscribe() is a Disposable object. You should store it as a field and call Disposable.dispose() on it later, as shown here:
https://proandroiddev.com/disposing-on-android-the-right-way-97bd55cbf970
Your OkHttp call will be interrupted properly because dispose() interrupts the thread on which the call runs, and OkHttp regularly checks whether the thread was interrupted so that it can stop the transfer. This is called cooperative cancellation/interruption.
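To illustrate what cooperative cancellation means in general, here is a minimal JDK-only sketch. It does not use RxJava or OkHttp: Future.cancel(true) plays the role of dispose(), and the worker loop stands in for a streaming read that checks the interrupt flag. CooperativeCancel and cancelAndAwaitStop are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CooperativeCancel {
    /** Starts a worker that loops until interrupted, cancels it, and reports whether it stopped. */
    static boolean cancelAndAwaitStop() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);
        try {
            Future<?> task = pool.submit(() -> {
                started.countDown();
                try {
                    // Stand-in for a streaming read loop that checks the interrupt flag regularly.
                    while (!Thread.currentThread().isInterrupted()) {
                        Thread.yield();
                    }
                } finally {
                    // This is where resources (e.g. the stream) would be closed before the thread exits.
                    stopped.countDown();
                }
            });
            started.await();
            task.cancel(true); // interrupts the worker thread
            return stopped.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + cancelAndAwaitStop());
    }
}
```

The point of the sketch is that interruption only sets a flag; the thread ends only if the running code checks that flag (or blocks in an interruptible call) and then cleans up after itself.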
I have a test project with a Room database. Using AsyncTask I can successfully insert an object with some test data into the database. I'm trying to learn RxJava and replace AsyncTask with RxJava's Observer, but it doesn't work. I have read a lot of documentation and watched tutorials, but I don't think I quite get it. Here's the relevant code:
Here I set my Room object with the data from my List:
for(ObjectForArray item: listToDatabase) {
myRoomEntity.setName( item.getName() );
Log.d( "TAG", myRoomEntity.getName() );
}
Then I try to use an RxJava Observable to insert the data into the database. This was originally (and successfully) done using AsyncTask:
Observable<MyRoomEntity> myRX = Observable
.just(myRoomEntity)
.subscribeOn( Schedulers.io() )
.observeOn( AndroidSchedulers.mainThread() );
myRX.subscribe( new Observer<MyRoomEntity>() {
@Override
public void onSubscribe(Disposable d) {
Log.d("TAG ONSUBSCRIBE", d.toString());
try {
myViewModel.insertDatabase( myRoomEntity );
Log.d( "TAG", "Populating database Success" );
}catch(Error error) {
Log.d( "TAG", error.toString() );
}
}
The onNext, onError and onComplete methods are empty.
When I run the project it crashes with the error:
Cannot access database on the main thread since it may potentially lock the UI for a long period of time.
I'm obviously using RxJava wrong since the point is to do asynchronous tasks away from the main thread.
I used RxJava as a replacement for AsyncTask, since AsyncTask was deprecated in Android 11 (API level 30).
Android offers several replacements, such as Executors, threads, ListenableFuture, and coroutines. Since you are asking how to do this with RxJava, first add these dependencies in Gradle:
implementation "io.reactivex.rxjava2:rxjava:2.2.20"
implementation "io.reactivex.rxjava2:rxandroid:2.1.1"
Once the dependencies are imported, you can start working with RxJava. The comments in the code below show where the background task, pre-execute, and post-execute steps from AsyncTask go:
Observable.fromCallable(new Callable<Boolean>() {
@Override
public Boolean call() throws Exception {
/// here is your background task
return true;
}
}).subscribeOn(Schedulers.io()).observeOn(AndroidSchedulers.mainThread())
.subscribe(new Observer<Boolean>() {
@Override
public void onSubscribe(Disposable d) {
//// pre execute here is my progress dialog
showProgressDialog(getString(R.string.scanning));
}
@Override
public void onNext(Boolean aBoolean) {
//// here is on sucess you can do anystuff here like
if (aBoolean){
/// if its value true you can go ahead with this
}
}
@Override
public void onError(Throwable e) {
/// this helps you to go if there is any error show dialog whatever you wants here
Log.e("error of kind",e.getMessage() );
}
@Override
public void onComplete() {
/// when your task done means post execute
}
});
With that skeleton in place, here is a concrete implementation:
Observable.fromCallable(new Callable<Boolean>() {
@Override
public Boolean call() throws Exception {
/// here is your background task
uribitmap = getScannedBitmap(original, points);
uri = Utils.getUri(getActivity(), uribitmap);
scanner.onScanFinish(uri);
return true;
}
}).subscribeOn(Schedulers.io()).observeOn(AndroidSchedulers.mainThread())
.subscribe(new Observer<Boolean>() {
@Override
public void onSubscribe(Disposable d) {
//// pre execute here is my progress dialog
showProgressDialog(getString(R.string.scanning));
}
@Override
public void onNext(Boolean aBoolean) {
//// here is on sucess you can do anystuff here like
if (aBoolean){
/// if its value true you can go ahead with this
}
}
@Override
public void onError(Throwable e) {
/// this helps you to go if there is any error show dialog whatever you wants here
Log.e("error of kind",e.getMessage() );
}
@Override
public void onComplete() {
/// when your task done means post execute
uribitmap.recycle();
dismissDialog();
}
});
Now the same thing with executors:
/// pre execute you can trigger to progress dialog
showProgressDialog(getString(R.string.scanning));
ExecutorService executors = Executors.newSingleThreadExecutor();
executors.execute(new Runnable() {
@Override
public void run() {
//// do background heavy task here
final Bitmap uribitmap = getScannedBitmap(original, points);
uri = Utils.getUri(getActivity(), uribitmap);
scanner.onScanFinish(uri);
new Handler(Looper.getMainLooper()).post(new Runnable() {
@Override
public void run() {
//// Ui thread work like
uribitmap.recycle();
dismissDialog();
}
});
}
});
You are getting this error because you are trying to insert an Object on the main (UI) thread.
You should do something like this:
Observable.fromCallable(() -> myViewModel.insertDatabase( myRoomEntity ))
.subscribeOn( Schedulers.io() )
.observeOn( AndroidSchedulers.mainThread() );
And then use an Observer to subscribe to the Observable.
Please try restructuring your code like this:
Completable.fromAction(() -> myViewModel.insertDatabase(myRoomEntity))
.subscribeOn(Schedulers.io())
.observeOn(AndroidSchedulers.mainThread())
.subscribe(() -> Log.d("TAG", "Populating database Success"),
throwable -> Log.d("TAG", throwable.toString()));
Considerations:
If your myRoomEntity is not available before this whole construct gets subscribed, make sure you use defer http://reactivex.io/documentation/operators/defer.html
Your subscribe section handlers are operating on "main", that's why you were receiving a crash.
If possible, avoid unnecessary just calls
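The difference between just and defer mentioned in the considerations above comes down to when the value is read. Here is a small JDK-only sketch of that idea, using Supplier as a stand-in for the Rx types; EagerVsDeferred, compare and the sample values are made up for illustration and nothing here is RxJava API.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class EagerVsDeferred {
    /** Returns {eagerResult, deferredResult} when the source value changes after assembly. */
    static String[] compare() {
        AtomicReference<String> entityName = new AtomicReference<>("not set yet");

        // Eager, in the spirit of just(value): the value is read now, while the pipeline is assembled.
        String captured = entityName.get();
        Supplier<String> eager = () -> captured;

        // Deferred, in the spirit of defer/fromCallable: the value is read when the pipeline runs.
        Supplier<String> deferred = entityName::get;

        // The entity is populated only after both pipelines were built.
        entityName.set("Alice");

        return new String[] { eager.get(), deferred.get() };
    }

    public static void main(String[] args) {
        String[] results = compare();
        System.out.println("eager: " + results[0] + ", deferred: " + results[1]);
    }
}
```

The eager variant keeps the stale value, while the deferred one sees the populated entity, which is why deferring matters if myRoomEntity is filled in after the chain is built.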
The Kafka documentation describes the following approach:
One Consumer Per Thread: A simple option is to give each thread its own consumer instance.
My code:
public class KafkaConsumerRunner implements Runnable {
private final AtomicBoolean closed = new AtomicBoolean(false);
private final CloudKafkaConsumer consumer;
private final String topicName;
public KafkaConsumerRunner(CloudKafkaConsumer consumer, String topicName) {
this.consumer = consumer;
this.topicName = topicName;
}
@Override
public void run() {
try {
this.consumer.subscribe(topicName);
ConsumerRecords<String, String> records;
while (!closed.get()) {
synchronized (consumer) {
records = consumer.poll(100);
}
for (ConsumerRecord<String, String> tmp : records) {
System.out.println(tmp.value());
}
}
} catch (WakeupException e) {
// Ignore exception if closing
System.out.println(e);
//if (!closed.get()) throw e;
}
}
// Shutdown hook which can be called from a separate thread
public void shutdown() {
closed.set(true);
consumer.wakeup();
}
public static void main(String[] args) {
CloudKafkaConsumer kafkaConsumer = KafkaConsumerBuilder.builder()
.withBootstrapServers("172.31.1.159:9092")
.withGroupId("test")
.build();
ExecutorService executorService = Executors.newFixedThreadPool(5);
executorService.execute(new KafkaConsumerRunner(kafkaConsumer, "log"));
executorService.execute(new KafkaConsumerRunner(kafkaConsumer, "log.info"));
executorService.shutdown();
}
}
but it doesn't work and throws an exception:
java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access
Furthermore, I read the source of Flink (an open source platform for distributed stream and batch data processing). Flink's multi-threaded use of the consumer looks similar to mine:
long pollTimeout = Long.parseLong(flinkKafkaConsumer.properties.getProperty(KEY_POLL_TIMEOUT, Long.toString(DEFAULT_POLL_TIMEOUT)));
pollLoop: while (running) {
ConsumerRecords<byte[], byte[]> records;
//noinspection SynchronizeOnNonFinalField
synchronized (flinkKafkaConsumer.consumer) {
try {
records = flinkKafkaConsumer.consumer.poll(pollTimeout);
} catch (WakeupException we) {
if (running) {
throw we;
}
// leave loop
continue;
}
}
Flink code for the multi-threaded consumer
What's wrong?
The Kafka consumer is not thread safe. As you pointed out in your question, the documentation states:
A simple option is to give each thread its own consumer instance
But in your code, the same consumer instance is wrapped by different KafkaConsumerRunner instances, so multiple threads are accessing the same consumer instance. The Kafka documentation clearly states:
The Kafka consumer is NOT thread-safe. All network I/O happens in the
thread of the application making the call. It is the responsibility of
the user to ensure that multi-threaded access is properly
synchronized. Un-synchronized access will result in
ConcurrentModificationException.
That's exactly the exception you received.
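To show the "one instance per thread" shape without needing a broker, here is a minimal JDK-only sketch. GuardedClient is a hypothetical stand-in that rejects concurrent use (similar in spirit to what the consumer does, but it is not Kafka code), and pollAll builds a fresh client inside each thread's task instead of sharing one.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class OneClientPerThread {
    /** Hypothetical client that throws if two threads use it at the same time. */
    static class GuardedClient {
        private final AtomicReference<Thread> owner = new AtomicReference<>();

        String poll(String topic) {
            if (!owner.compareAndSet(null, Thread.currentThread())) {
                throw new ConcurrentModificationException("client is not safe for multi-threaded access");
            }
            try {
                return "record from " + topic;
            } finally {
                owner.set(null);
            }
        }
    }

    /** Polls each topic on its own thread, creating the client inside that thread's task. */
    static List<String> pollAll(List<String> topics, Supplier<GuardedClient> clientFactory) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, topics.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String topic : topics) {
                // Each task gets its own client; nothing is shared between threads.
                Callable<String> task = () -> clientFactory.get().poll(topic);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(pollAll(List.of("log", "log.info"), GuardedClient::new));
    }
}
```

Applied to the code in the question, this would mean calling the builder once per KafkaConsumerRunner rather than passing the same kafkaConsumer to both runners.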
The exception is thrown by your call to subscribe: this.consumer.subscribe(topicName);
Move that call into a synchronized block like this:
@Override
public void run() {
try {
synchronized (consumer) {
this.consumer.subscribe(topicName);
}
ConsumerRecords<String, String> records;
while (!closed.get()) {
synchronized (consumer) {
records = consumer.poll(100);
}
for (ConsumerRecord<String, String> tmp : records) {
System.out.println(tmp.value());
}
}
} catch (WakeupException e) {
// Ignore exception if closing
System.out.println(e);
//if (!closed.get()) throw e;
}
}
This may not be your case, but if you are merging the processing of data from several topics, you can read from multiple topics with the same consumer. If not, it is preferable to create separate jobs, each consuming one topic.
I'm using nested asynchronous query execution with Cassandra. Data is continuously streamed in, and for each incoming item the block of Cassandra operations below is executed. It works fine for a while but then starts throwing a lot of NoHostAvailableException.
Please help me out here.
Cassandra Session Connection code :
I use separate sessions for read and write. Each of these sessions connect to a different seed as I was told this would improve performance.
final com.datastax.driver.core.Session readSession = CassandraManager.connect("10.22.1.144", "fr_repo",
"READ");
final com.datastax.driver.core.Session writeSession = CassandraManager.connect("10.1.12.236", "fr_repo",
"WRITE");
The CassandraManager.connect method is below :
public static Session connect(String ip, String keySpace,String type) {
PoolingOptions poolingOpts = new PoolingOptions();
poolingOpts.setCoreConnectionsPerHost(HostDistance.REMOTE, 2);
poolingOpts.setMaxConnectionsPerHost(HostDistance.REMOTE, 400);
poolingOpts.setMaxSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE, 128);
poolingOpts.setMinSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE, 2);
cluster = Cluster
.builder()
.withPoolingOptions( poolingOpts )
.addContactPoint(ip)
.withRetryPolicy( DowngradingConsistencyRetryPolicy.INSTANCE )
.withReconnectionPolicy( new ConstantReconnectionPolicy( 100L ) ).build();
Session s = cluster.connect(keySpace);
return s;
}
Database operation code :
ResultSetFuture resultSetFuture = readSession.executeAsync(selectBound.bind(fr.getHashcode()));
Futures.addCallback(resultSetFuture, new FutureCallback<ResultSet>() {
public void onSuccess(com.datastax.driver.core.ResultSet resultSet) {
try {
Iterator<Row> rows = resultSet.iterator();
if (!rows.hasNext()) {
ResultSetFuture resultSetFuture = readSession.executeAsync(selectPrimaryBound
.bind(fr.getPrimaryKeyHashcode()));
Futures.addCallback(resultSetFuture, new FutureCallback<ResultSet>() {
public void onFailure(Throwable arg0) {
}
public void onSuccess(ResultSet arg0) {
Iterator<Row> rows = arg0.iterator();
if (!rows.hasNext()) {
writeSession.executeAsync(insertBound.bind(fr.getHashcode(), fr,
System.currentTimeMillis()));
writeSession.executeAsync(insertPrimaryBound.bind(
fr.getHashcode(),
fr.getCombinedPrimaryKeys(), System.currentTimeMillis()));
produceintoQueue(new Gson().toJson(frCompleteMap));
} else {
writeSession.executeAsync(updateBound.bind(fr,
System.currentTimeMillis(), fr.getHashcode()));
produceintoQueue(new Gson().toJson(frCompleteMap));
}
}
});
} else {
writeSession.executeAsync(updateLastSeenBound.bind(System.currentTimeMillis(),
fr.getHashcode()));
}
} catch (Exception e) {
e.printStackTrace();
}
}
It sounds like you're sending more requests than your pool/cluster can handle. This is pretty easy to do when you're never actually waiting for a result, as is the case in your code. You're essentially just throwing as many requests as you can into the pipeline with no blocking, and there's no natural back pressure to slow down your app if the pool or cluster get backed up. So if your request volume is too high, eventually all the hosts will be busy with the backed up work queue. You can use nodetool tpstats to see what your request queues look like on each node.
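One common way to add the missing back pressure is to cap the number of in-flight requests with a semaphore. The following is only a sketch under assumptions: it uses the JDK's CompletableFuture in place of ResultSetFuture and a thread pool with a short sleep in place of Cassandra, and BoundedAsync, submit and maxObservedInFlight are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class BoundedAsync {
    private final Semaphore permits;

    BoundedAsync(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks the caller while maxInFlight requests are outstanding, then starts the request. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> request) throws InterruptedException {
        permits.acquire();
        try {
            // The permit is returned when the request finishes, whether it succeeded or failed.
            return request.get().whenComplete((value, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release();
            throw e;
        }
    }

    /** Fires simulated requests through the limiter and returns the highest concurrency seen. */
    static int maxObservedInFlight(int limit, int requests) {
        BoundedAsync limiter = new BoundedAsync(limit);
        ExecutorService pool = Executors.newFixedThreadPool(8);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        try {
            CompletableFuture<?>[] all = new CompletableFuture<?>[requests];
            for (int i = 0; i < requests; i++) {
                all[i] = limiter.submit(() -> {
                    maxSeen.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    return CompletableFuture.supplyAsync(() -> {
                        sleepQuietly(5); // simulated database latency
                        inFlight.decrementAndGet();
                        return "ok";
                    }, pool);
                });
            }
            CompletableFuture.allOf(all).join();
            return maxSeen.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            pool.shutdown();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("max in flight: " + maxObservedInFlight(3, 20));
    }
}
```

With the driver, the equivalent would be to acquire a permit before each executeAsync call and release it in both onSuccess and onFailure of the callback. The right limit depends on the cluster and is something to find by measurement.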
I am confused by the async feature introduced in the Servlet 3.0 spec.
From the Oracle site (http://docs.oracle.com/javaee/7/tutorial/doc/servlets012.htm):
To create scalable web applications, you must ensure that no threads
associated with a request are sitting idle, so the container can use
them to process new requests.
There are two common scenarios in which a thread associated with a
request can be sitting idle.
1- The thread needs to wait for a resource to become available or process data before building the response. For example, an application
may need to query a database or access data from a remote web service
before generating the response.
2- The thread needs to wait for an event before generating the response. For example, an application may have to wait for a JMS
message, new information from another client, or new data available in
a queue before generating the response.
The first item happens a lot (nearly always: we always query a DB or call a remote web service to get some data), and calling an external resource will always consume some time.
Does it mean that we should ALWAYS use the servlet async feature for ALL our servlets and filters?!
Put another way: if I write all my servlets and filters as async, will I lose anything (performance)?!
If the above is correct, the skeleton of ALL our servlets will be:
public class Work implements ServletContextListener {
private static final BlockingQueue<AsyncContext> queue = new LinkedBlockingQueue<>();
private volatile Thread thread;
@Override
public void contextInitialized(ServletContextEvent servletContextEvent) {
thread = new Thread(new Runnable() {
@Override
public void run() {
while (true) {
try {
ServiceFecade.doBusiness();
AsyncContext context;
while ((context = queue.poll()) != null) {
try {
ServletResponse response = context.getResponse();
PrintWriter out = response.getWriter();
out.printf("Bussiness done");
out.flush();
} catch (Exception e) {
throw new RuntimeException(e.getMessage(), e);
} finally {
context.complete();
}
}
} catch (InterruptedException e) {
return;
}
}
}
});
thread.start();
}
public static void add(AsyncContext c) {
queue.add(c);
}
@Override
public void contextDestroyed(ServletContextEvent servletContextEvent) {
thread.interrupt();
}
}