How do RSocket issue lease to multiple clients? - backpressure

I could create a server lease to a single client as follows:
#Slf4j
public class LeaseServer {
private static final String SERVER_TAG = "server";
public static void main(String[] args) throws InterruptedException {
// Queue for incoming messages represented as Flux
// Imagine that every fireAndForget that is pushed is processed by a worker
int queueCapacity = 50;
BlockingQueue<String> messagesQueue = new ArrayBlockingQueue<>(queueCapacity);
// emulating a worker that process data from the queue
Thread workerThread =
new Thread(
() -> {
try {
while (!Thread.currentThread().isInterrupted()) {
String message = messagesQueue.take();
System.out.println("consume message:" + message);
Thread.sleep(100000); // emulating processing
}
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
});
workerThread.start();
CloseableChannel server = getFireAndForgetServer(messagesQueue, workerThread);
TimeUnit.MINUTES.sleep(10);
server.dispose();
}
private static CloseableChannel getFireAndForgetServer(BlockingQueue<String> messagesQueue, Thread workerThread) {
CloseableChannel server =
RSocketServer.create((setup, sendingSocket) ->
Mono.just(new RSocket() {
#Override
public Mono<Void> fireAndForget(Payload payload) {
// add element. if overflows errors and terminates execution
// specifically to show that lease can limit rate of fnf requests in
// that example
try {
if (!messagesQueue.offer(payload.getDataUtf8())) {
System.out.println("Queue has been overflowed. Terminating execution");
sendingSocket.dispose();
workerThread.interrupt();
}
} finally {
payload.release();
}
return Mono.empty();
}
}))
.lease(() -> Leases.create().sender(new LeaseCalculator(SERVER_TAG, messagesQueue)))
.bindNow(TcpServerTransport.create("localhost", 7000));
return server;
}
}
But how do I issue a lease to multiple clients connected to that server?
Otherwise my queue will be written multiple times by multiple clients, resulting in an overflow of the service.
I can't find the details in the public documents and materials.
Your help was very much appreciated.

Related

How to properly close a flowable and close response body using rxjava and retrofit

I am attempting to close a stream coming from an http request using Retrofit and rxjava, either because it timedOut, or because I need to change details that went into the request. Both appear to work perfectly, as when I cancel subscription I get the doOnCancel debug message and when doOnNext is completed I get the doOnTerminate message. I also do not receive inputLines from multiple threads. However, my thread count rises every single time either of the above actions happen. It appears that responsebody.close is not releasing their resources and therefore the thread is not dying (I also have gotten error messages along the lines of "OKHTTP leaked. did you close youre responseBody?")
Does anyone have any suggestions?
public boolean closeSubscription() {
flowableAlive = false;
subscription.cancel();
return true;
}
public void subscribeToFlowable() {
streamFlowable.observeOn(Schedulers.newThread()).subscribeOn(Schedulers.newThread())
.doOnTerminate(() -> log.debug("TERMINATED")).doOnCancel(() -> log.debug("FLOWABLE CANCELED"))
.subscribe(new Subscriber<ResponseBody>() {
#Override
public void onSubscribe(Subscription s) {
subscription = s;
subscription.request(Long.MAX_VALUE);
}
#Override
public void onNext(ResponseBody responseBody) {
log.debug("onNext called");
String inputLine;
try (InputStream inputStream = responseBody.byteStream()) {
BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
while (flowableAlive && ((inputLine = br.readLine()) != null)) {
log.debug("stream receive input line for thread " + name);
log.debug(inputLine);
}
} catch (IOException e) {
log.debug("error occurred");
log.debug(e.getMessage());
}
}
#Override
public void onError(Throwable t) {
log.debug("error");
flowableAlive = false;
}
#Override
public void onComplete() {
log.debug("completed");
closeSubscription();
flowableAlive = false;
}
});
}
The result of subscribe() is Disposable object. You should store it as a filed and call Disposable.dispose() on it later as shown here:
https://proandroiddev.com/disposing-on-android-the-right-way-97bd55cbf970
Tour OkHttp call will be interrupted properly because dispose() interrupts thread on which the call runs and OkHttp checks regularly if Thread was interrupted to stop transfer when that happened - it's called cooperative cancelling/interruption.

How can I parallel consumption kafka with spark streaming? I set concurrentJobs but something error [duplicate]

The doc of kafka give an approach about with following describes:
One Consumer Per Thread:A simple option is to give each thread its own consumer > instance.
My code:
public class KafkaConsumerRunner implements Runnable {
private final AtomicBoolean closed = new AtomicBoolean(false);
private final CloudKafkaConsumer consumer;
private final String topicName;
public KafkaConsumerRunner(CloudKafkaConsumer consumer, String topicName) {
this.consumer = consumer;
this.topicName = topicName;
}
#Override
public void run() {
try {
this.consumer.subscribe(topicName);
ConsumerRecords<String, String> records;
while (!closed.get()) {
synchronized (consumer) {
records = consumer.poll(100);
}
for (ConsumerRecord<String, String> tmp : records) {
System.out.println(tmp.value());
}
}
} catch (WakeupException e) {
// Ignore exception if closing
System.out.println(e);
//if (!closed.get()) throw e;
}
}
// Shutdown hook which can be called from a separate thread
public void shutdown() {
closed.set(true);
consumer.wakeup();
}
public static void main(String[] args) {
CloudKafkaConsumer kafkaConsumer = KafkaConsumerBuilder.builder()
.withBootstrapServers("172.31.1.159:9092")
.withGroupId("test")
.build();
ExecutorService executorService = Executors.newFixedThreadPool(5);
executorService.execute(new KafkaConsumerRunner(kafkaConsumer, "log"));
executorService.execute(new KafkaConsumerRunner(kafkaConsumer, "log.info"));
executorService.shutdown();
}
}
but it doesn't work and throws an exception:
java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access
Furthermore, I read the source of Flink (an open source platform for distributed stream and batch data processing). Flink using multi-thread consumer is similar to mine.
long pollTimeout = Long.parseLong(flinkKafkaConsumer.properties.getProperty(KEY_POLL_TIMEOUT, Long.toString(DEFAULT_POLL_TIMEOUT)));
pollLoop: while (running) {
ConsumerRecords<byte[], byte[]> records;
//noinspection SynchronizeOnNonFinalField
synchronized (flinkKafkaConsumer.consumer) {
try {
records = flinkKafkaConsumer.consumer.poll(pollTimeout);
} catch (WakeupException we) {
if (running) {
throw we;
}
// leave loop
continue;
}
}
flink code of mutli-thread
What's wrong?
Kafka consumer is not thread safe. As you pointed out in your question, the document stated that
A simple option is to give each thread its own consumer instance
But in your code, you have the same consumer instance wrapped by different KafkaConsumerRunner instances. Thus multiple threads are accessing the same consumer instance. The kafka documentation clearly stated
The Kafka consumer is NOT thread-safe. All network I/O happens in the
thread of the application making the call. It is the responsibility of
the user to ensure that multi-threaded access is properly
synchronized. Un-synchronized access will result in
ConcurrentModificationException.
That's exactly the exception you received.
It is throwing the exception on your call to subscribe. this.consumer.subscribe(topicName);
Move that block into a synchronized block like this:
#Override
public void run() {
try {
synchronized (consumer) {
this.consumer.subscribe(topicName);
}
ConsumerRecords<String, String> records;
while (!closed.get()) {
synchronized (consumer) {
records = consumer.poll(100);
}
for (ConsumerRecord<String, String> tmp : records) {
System.out.println(tmp.value());
}
}
} catch (WakeupException e) {
// Ignore exception if closing
System.out.println(e);
//if (!closed.get()) throw e;
}
}
Maybe is not your case, but if you are mergin processing of data of serveral topics, then you can read data from multiple topics with the same consumer. If not, then is preferable to create separate jobs consuming each topic.

Azure EventProcessorHost and Worker role

I was hoping for some guidance on how to use the EventProcessorHost with a worker role. Basically I am hoping to have the EventProcessorHost process the partitions in parallel and I'm wondering where I should go about placing this type of code within the worker role and if I'm missing anything key.
var manager = NamespaceManager.CreateFromConnectionString(connectionString);
var desc = manager.CreateEventHubIfNotExistsAsync(path).Result;
var client = Microsoft.ServiceBus.Messaging.EventHubClient.CreateFromConnectionString(connectionString, path);
var host = new EventProcessorHost(hostname, path, consumerGroup, connectionString, blobStorageConnectionString);
EventHubProcessorFactory<EventData> factory = new EventHubProcessorFactory<EventData>();
host.RegisterEventProcessorFactoryAsync(factory);
Everything I've read says the EventProcessorHost will divide up the partitions on its own, but is the above code sufficient to process all the partitions asynchronously?
Here's a simplified version of how we process our event hub from an Worker Role. We keep the instance in the mainWorker role and call the IEventProcessor to start processing it.
This way we can call it and close it down when the Worker Responds to shutdown events etc.
EDIT:
As for the processing it in parallel, the IEventProcessor class will just grab 10 more events from the event hub when it's finished processing the current one. Handling all the fancy partition leasing for you.
It's a synchronous workflow, When I scale to multiple worker roles I start to see the partitions get split between instances and it gets faster etc. You'd have to roll your own solution if you wanted it to process the event hub in a different way.
public class WorkerRole : RoleEntryPoint
{
private readonly CancellationTokenSource _cancellationTokenSource = new CancellationTokenSource();
private readonly ManualResetEvent _runCompleteEvent = new ManualResetEvent(false);
private EventProcessorHost _eventProcessorHost;
public override bool OnStart()
{
ThreadPool.SetMaxThreads(4096, 2048);
ServicePointManager.DefaultConnectionLimit = 500;
ServicePointManager.UseNagleAlgorithm = false;
ServicePointManager.Expect100Continue = false;
var eventClient = EventHubClient.CreateFromConnectionString("consumersConnectionString",
"eventHubName");
_eventProcessorHost = new EventProcessorHost(Dns.GetHostName(), eventClient.Path,
eventClient.GetDefaultConsumerGroup().GroupName,
"consumersConnectionString", "blobLeaseConnectionString");
return base.OnStart();
}
public override void Run()
{
try
{
RunAsync(this._cancellationTokenSource.Token).Wait();
}
finally
{
_runCompleteEvent.Set();
}
}
private async Task RunAsync(CancellationToken cancellationToken)
{
// starts processing here
await _eventProcessorHost.RegisterEventProcessorAsync<EventProcessor>();
while (!cancellationToken.IsCancellationRequested)
{
await Task.Delay(TimeSpan.FromMinutes(1));
}
}
public override void OnStop()
{
_eventProcessorHost.UnregisterEventProcessorAsync().Wait();
_cancellationTokenSource.Cancel();
_runCompleteEvent.WaitOne();
base.OnStop();
}
}
I have multiple processors for the specific partitions (you can guarantee FIFO this way), but you can implement you're own logic easily i.e. skip the use of a EventDataProcessor class and Dictionary lookup in my example and just implement some logic within the ProcessEventsAsync method.
public class EventProcessor : IEventProcessor
{
private readonly Dictionary<string, IEventDataProcessor> _eventDataProcessors;
public EventProcessor()
{
_eventDataProcessors = new Dictionary<string, IEventDataProcessor>
{
{"A", new EventDataProcessorA()},
{"B", new EventDataProcessorB()},
{"C", new EventDataProcessorC()}
}
}
public Task OpenAsync(PartitionContext context)
{
return Task.FromResult<object>(null);
}
public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
foreach(EventData eventData in messages)
{
// implement your own logic here, you could just process the data here, just remember that they will all be from the same partition in this block
try
{
IEventDataProcessor eventDataProcessor;
if(_eventDataProcessors.TryGetValue(eventData.PartitionKey, out eventDataProcessor))
{
await eventDataProcessor.ProcessMessage(eventData);
}
}
catch (Exception ex)
{
_//log exception
}
}
await context.CheckpointAsync();
}
public async Task CloseAsync(PartitionContext context, CloseReason reason)
{
if (reason == CloseReason.Shutdown)
await context.CheckpointAsync();
}
}
Example of one of our EventDataProcessors
public interface IEventDataProcessor
{
Task ProcessMessage(EventData eventData);
}
public class EventDataProcessorA : IEventDataProcessor
{
public async Task ProcessMessage(EventData eventData)
{
// Do Something specific with data from Partition "A"
}
}
public class EventDataProcessorB : IEventDataProcessor
{
public async Task ProcessMessage(EventData eventData)
{
// Do Something specific with data from Partition "B"
}
}
Hope this helps, it's been rock solid for us so far and scales easily to multiple instances

On servlet 3.0 webserver, is it good to make all servlets and filters async?

I am confused with Async feature introduced in Servlet 3.0 spec
From Oracle site (http://docs.oracle.com/javaee/7/tutorial/doc/servlets012.htm):
To create scalable web applications, you must ensure that no threads
associated with a request are sitting idle, so the container can use
them to process new requests.
There are two common scenarios in which a thread associated with a
request can be sitting idle.
1- The thread needs to wait for a resource to become available or process data before building the response. For example, an application
may need to query a database or access data from a remote web service
before generating the response.
2- The thread needs to wait for an event before generating the response. For example, an application may have to wait for a JMS
message, new information from another client, or new data available in
a queue before generating the response.
The first item happens a lot (nearly always, we always query db or call a remote webservice to get some data). And calling an external resource will always consume some time.
Does it mean that we should ALWAYS use servelt async feature for ALL our servelts and filter ?!
I can ask this way too, if I write all my servelts and filters async, will I lose anything (performance)?!
If above is correct the skeleton of ALL our servlets will be:
public class Work implements ServletContextListener {
private static final BlockingQueue queue = new LinkedBlockingQueue();
private volatile Thread thread;
#Override
public void contextInitialized(ServletContextEvent servletContextEvent) {
thread = new Thread(new Runnable() {
#Override
public void run() {
while (true) {
try {
ServiceFecade.doBusiness();
AsyncContext context;
while ((context = queue.poll()) != null) {
try {
ServletResponse response = context.getResponse();
PrintWriter out = response.getWriter();
out.printf("Bussiness done");
out.flush();
} catch (Exception e) {
throw new RuntimeException(e.getMessage(), e);
} finally {
context.complete();
}
}
} catch (InterruptedException e) {
return;
}
}
}
});
thread.start();
}
public static void add(AsyncContext c) {
queue.add(c);
}
#Override
public void contextDestroyed(ServletContextEvent servletContextEvent) {
thread.interrupt();
}
}

Threading multiple async calls

Part of my Silverlight application requires data from three service requests. Up until now I've been chaining the requests so as one completes the other starts... until the end of the chain where I do what I need to do with the data.
Now, I know thats not the best method(!). I've been looking at AutoResetEvent (link to MSDN example) to thread and then synchronize the results but cannot seem to get this to work with async service calls.
Does anyone have any reason to doubt this method or should this work? Code samples gratefully received!
Take a look at this example:
Will fire Completed event and print 'done' to Debug Output once both services returned.
Key thing is that waiting for AutoResetEvents happens in background thread.
public partial class MainPage : UserControl
{
public MainPage()
{
InitializeComponent();
Completed += (s, a) => { Debug.WriteLine("done"); };
wrk.DoWork += (s, a) =>
{
Start();
};
wrk.RunWorkerAsync();
}
public event EventHandler Completed;
private void Start()
{
auto1.WaitOne();
auto2.WaitOne();
Completed(this, EventArgs.Empty);
}
public AutoResetEvent auto1 = new AutoResetEvent(false);
public AutoResetEvent auto2 = new AutoResetEvent(false);
BackgroundWorker wrk = new BackgroundWorker();
private void Button_Click(object sender, RoutedEventArgs e)
{
ServiceReference1.Service1Client clien = new SilverlightAsyncTest.ServiceReference1.Service1Client();
clien.DoWorkCompleted += new EventHandler<SilverlightAsyncTest.ServiceReference1.DoWorkCompletedEventArgs>(clien_DoWorkCompleted);
clien.DoWork2Completed += new EventHandler<SilverlightAsyncTest.ServiceReference1.DoWork2CompletedEventArgs>(clien_DoWork2Completed);
clien.DoWorkAsync();
clien.DoWork2Async();
}
void clien_DoWork2Completed(object sender, SilverlightAsyncTest.ServiceReference1.DoWork2CompletedEventArgs e)
{
Debug.WriteLine("2");
auto1.Set();
}
void clien_DoWorkCompleted(object sender, SilverlightAsyncTest.ServiceReference1.DoWorkCompletedEventArgs e)
{
Debug.WriteLine("1");
auto2.Set();
}
}
It could be done using the WaitHandle in the IAsyncResult returned by each async method.
The code is simple. In Silverlight I just do 10 service calls that will add an item to a ListBox. I'll wait until all the service calls end to add another message to the list (this has to run in a different thread to avoid blocking the UI). Also note that adding items to the list have to be done through the Dispatcher since they will modify the UI. There're a bunch of lamdas, but it's easy to follow.
public MainPage()
{
InitializeComponent();
var results = new ObservableCollection<string>();
var asyncResults = new List<IAsyncResult>();
resultsList.ItemsSource = results;
var service = new Service1Client() as Service1;
1.To(10).Do(i=>
asyncResults.Add(service.BeginDoWork(ar =>
Dispatcher.BeginInvoke(() => results.Add(String.Format("Call {0} finished: {1}", i, service.EndDoWork(ar)))),
null))
);
new Thread(()=>
{
asyncResults.ForEach(a => a.AsyncWaitHandle.WaitOne());
Dispatcher.BeginInvoke(() => results.Add("Everything finished"));
}).Start();
}
Just to help with the testing, this is the service
public class Service1
{
private const int maxMilliSecs = 500;
private const int minMillisSecs = 100;
[OperationContract]
public int DoWork()
{
int millisSecsToWait = new Random().Next(maxMilliSecs - minMillisSecs) + minMillisSecs;
Thread.Sleep(millisSecsToWait);
return millisSecsToWait;
}
}

Resources