I want to limit the number of concurrent threads when performing an operation in Cassandra. With direct use of cassandraTemplate it can be limited like this:
public void insert(List<MyEntity> entities, int maxConcurrentThreadsAllowed) {
    Flux.fromIterable(entities)
        .flatMap(this::insert, maxConcurrentThreadsAllowed)
        .subscribe();
}

public Mono<MyEntity> insert(MyEntity e) {
    return cassandraTemplate.insert(e);
}
Is it possible to achieve the same using reactive repositories? Or some kind of configuration at the cassandraSession level?
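In other words, I would expect the same pattern to work with a repository too. A minimal sketch, assuming a myEntityRepository extending ReactiveCassandraRepository, whose save also returns a Mono per entity (the repository name is illustrative):

public void insert(List<MyEntity> entities, int maxConcurrentThreadsAllowed) {
    Flux.fromIterable(entities)
        // save returns Mono<MyEntity>, so flatMap can cap the in-flight inserts
        .flatMap(myEntityRepository::save, maxConcurrentThreadsAllowed)
        .subscribe();
}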
Using Spring Integration and ZooKeeper, one can implement a leader to perform activities such as polling.
However, how do we distribute the leader responsibility to all nodes in the cluster to load balance?
Given the code below, once the application starts, I see that the same node keeps the leader role and fetches events. I want to distribute this activity across every node in the cluster to balance the load better.
Is there any way I can schedule each node in the cluster to gain and give up leadership in a round-robin manner?
@Bean
public LeaderInitiatorFactoryBean fooLeaderInitiator(CuratorFramework client) {
    return new LeaderInitiatorFactoryBean()
            .setClient(client)
            .setPath("/foofeed")
            .setRole("foo");
}
@Bean
@InboundChannelAdapter(channel = "fooIncomingEvents", autoStartup = "false", poller = @Poller(fixedDelay = "5000"))
@Role("foo")
public FooTriggerMessageSource fooInboundChannelAdapter() {
    return new FooTriggerMessageSource("foo");
}
I could simulate load balancing using the code below, though I'm not sure this is the correct approach. I could see the "fetching events" log statement from only one node at a time in the cluster. This code yields leadership after gaining it and performing its job.
@Bean
public LeaderInitiator fooLeaderInitiator(CuratorFramework client,
        FooPollingCandidate fooPollingCandidate) {
    LeaderInitiator leader = new LeaderInitiator(client, fooPollingCandidate, zooKeeperNamespace);
    leader.start();
    return leader;
}
@Component
class FooPollingCandidate extends DefaultCandidate {

    final Logger log = LoggerFactory.getLogger(this.getClass());

    FooPollingCandidate() {
        super("fooPoller", "foo");
    }

    @Override
    public void onGranted(Context ctx) {
        log.debug("Leadership granted {}", ctx);
        pullEvents();
        ctx.yield();
    }

    @Override
    public void onRevoked(Context ctx) {
        log.debug("Leadership revoked");
    }

    @Override
    public void yieldLeadership() {
        log.debug("yielding Leadership");
    }

    // pull events and drop them on any channel needed
    void pullEvents() {
        log.debug("fetching events");
        // simulate delay
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
What you are suggesting is an abuse of the leader election technology, which is intended for warm failover when the current leader fails; manually yielding leadership after each event is an anti-pattern.
What you probably want is competing pollers, where all pollers are active but use a shared store to prevent duplicate processing.
For example, if you are polling a shared directory for files to process, you would use a FileSystemPersistentAcceptOnceFileListFilter with a shared MetadataStore (such as the ZooKeeper implementation) to prevent multiple instances from processing the same file.
You can use the same technique (shared metadata store) for any polled message source.
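For instance, a minimal sketch of such a competing poller, using the classes from spring-integration-file and spring-integration-zookeeper (the directory, the "foo-" key prefix, and the bean names are illustrative):

@Bean
public ConcurrentMetadataStore metadataStore(CuratorFramework client) {
    // shared store: every node in the cluster sees the same "already processed" keys
    return new ZookeeperMetadataStore(client);
}

@Bean
@InboundChannelAdapter(channel = "fooIncomingEvents", poller = @Poller(fixedDelay = "5000"))
public MessageSource<File> fileSource(ConcurrentMetadataStore metadataStore) {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setDirectory(new File("/shared/inbound"));
    // each file is accepted once, cluster-wide, so concurrent pollers never duplicate work
    source.setFilter(new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, "foo-"));
    return source;
}

Every node runs this poller at the same time; the shared metadata store, not leadership, is what prevents duplicates.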
I came across the following code:
public class ShippingSaga : Saga<ShippingSagaData>,
    ISagaStartedBy<OrderAccepted>,
    ISagaStartedBy<CustomerBilledForOrder>
{
    public void Handle(CustomerBilledForOrder message)
    {
        this.Data.CustomerHasBeenBilled = true;
        this.Data.CustomerId = message.CustomerId;
        this.Data.OrderId = message.OrderId;
        this.CompleteIfPossible();
    }

    public void Handle(OrderAccepted message)
    {
        this.Data.ProductIdsInOrder = message.ProductIdsInOrder;
        this.Data.CustomerId = message.CustomerId;
        this.Data.OrderId = message.OrderId;
        this.CompleteIfPossible();
    }

    private void CompleteIfPossible()
    {
        if (this.Data.ProductIdsInOrder != null && this.Data.CustomerHasBeenBilled)
        {
            this.Bus.Send<ShipOrderToCustomer>(m =>
            {
                m.CustomerId = this.Data.CustomerId;
                m.OrderId = this.Data.OrderId;
                m.ProductIdsInOrder = this.Data.ProductIdsInOrder;
            });
            this.MarkAsComplete();
        }
    }
}
By the look of the above code, sagas seem to be some kind of higher-level coordinator/controller of events. Is this true? If so, are they used only in event-driven architectures? And lastly, are sagas part of the INFRASTRUCTURE?
The first query seems to be answered, but where do they really belong in terms of responsibility, i.e. infrastructure or domain? And are they applicable only to EDAs?
Warning: there's some confusion, especially around NServiceBus, on the definition of "Saga"; see below.
Process managers are, fundamentally, read models: you rehydrate them from a history of events and query them for a list of commands that should be run.
They are analogous to a human being looking at a view, and sending commands to the write model. See Rinat Abdullin's essay Evolving Business Processes for more on this viewpoint.
They serve as a description of the business process, which is to say that they identify additional decisions (commands) that should be run by the aggregates. In implementation, they are very much state machines: given event X and event Y, the process manager is in state(XY), and the commands that it will recommend are fixed.
I find them easier to think about if you tease apart the state machine (which is pure logic) from the side effects (interactions with the bus).
public class ShippingSaga : Saga,
    ISagaStartedBy<OrderAccepted>,
    ISagaStartedBy<CustomerBilledForOrder>
{
    public void Handle(CustomerBilledForOrder message)
    {
        this.process.apply(message);
        this.CompleteIfPossible();
    }

    public void Handle(OrderAccepted message)
    {
        this.process.apply(message);
        this.CompleteIfPossible();
    }

    private void CompleteIfPossible()
    {
        this.process.pendingCommands().each(m =>
            this.Bus.Send(m));
    }
}
Or equivalently, if you prefer to think about immutable data structures:
public class ShippingSaga : Saga,
    ISagaStartedBy<OrderAccepted>,
    ISagaStartedBy<CustomerBilledForOrder>
{
    public void Handle(CustomerBilledForOrder message)
    {
        this.process = this.process.apply(message);
        this.CompleteIfPossible();
    }

    public void Handle(OrderAccepted message)
    {
        this.process = this.process.apply(message);
        this.CompleteIfPossible();
    }

    private void CompleteIfPossible()
    {
        this.process.pendingCommands().each(m =>
            this.Bus.Send(m));
    }
}
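The process object itself is never shown above; here is a hypothetical sketch of that pure state-machine half, in Java-style pseudocode matching the mutable variant. Every name in it (ShippingProcess, the getters, the command constructor) is illustrative, not from NServiceBus:

// Pure logic only: apply events, then ask which commands now follow.
public class ShippingProcess {
    private boolean customerHasBeenBilled;
    private List<String> productIdsInOrder;
    private String customerId;
    private String orderId;

    public void apply(CustomerBilledForOrder message) {
        this.customerHasBeenBilled = true;
        this.customerId = message.getCustomerId();
        this.orderId = message.getOrderId();
    }

    public void apply(OrderAccepted message) {
        this.productIdsInOrder = message.getProductIdsInOrder();
        this.customerId = message.getCustomerId();
        this.orderId = message.getOrderId();
    }

    // Fixed by the state: given "accepted" and "billed", recommend shipping.
    public List<Object> pendingCommands() {
        if (productIdsInOrder != null && customerHasBeenBilled) {
            return List.of(new ShipOrderToCustomer(customerId, orderId, productIdsInOrder));
        }
        return List.of();
    }
}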
So the shipping process is defined in terms of the business domain, and the NServiceBus "Saga" interfaces that bit of business domain with the bus infrastructure. Isn't separation of concerns wonderful?
I use "Saga" in quotes because -- the NService bus sagas aren't a particularly good fit for the prior use of the term
The term saga is commonly used in discussions of CQRS to refer to a piece of code that coordinates and routes messages between bounded contexts and aggregates. However, for the purposes of this guidance we prefer to use the term process manager to refer to this type of code artifact. There are two reasons for this:
1. There is a well-known, pre-existing definition of the term saga that has a different meaning from the one generally understood in relation to CQRS.
2. The term process manager is a better description of the role performed by this type of code artifact.
A Storm topology reads data from Kafka and writes it into Cassandra tables.
In Storm I am creating the Cassandra cluster connection and session in the prepare method:
cassandraCluster = Cluster.builder().withoutJMXReporting().withoutMetrics()
        .addContactPoints(nodes)
        .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
        .withReconnectionPolicy(new ExponentialReconnectionPolicy(100L,
                TimeUnit.MINUTES.toMillis(5)))
        .withLoadBalancingPolicy(new TokenAwarePolicy(new RoundRobinPolicy()))
        .build();
session = cassandraCluster.connect(keyspace);
In the execute method I can process the tuple and save it in a Cassandra table.
Suppose I want to write data from a single tuple into multiple tables.
Writing a separate bolt for each table would be a good choice, but then I have to create a cluster connection and session for each table in each bolt.
However, according to this link, a single connection per cluster is a good idea for performance:
http://www.datastax.com/dev/blog/4-simple-rules-when-using-the-datastax-drivers-for-cassandra
Does anyone have an idea how to create the cluster connection in one bolt and use that connection in another bolt?
It depends on how Storm allocates the bolts and spouts to the workers. You can't assume that you can share connections between bolts, because they might be running in different workers (read: JVMs) or on different nodes entirely.
See my answer here: Mongo connection pooling for Storm topology
It might look something like this pseudocode:
public class CassandraBolt extends BaseRichBolt {
    private static final long serialVersionUID = 1L;
    private static final Logger LOG = LoggerFactory.getLogger(CassandraBolt.class);

    OutputCollector _collector;

    // whatever your Cassandra session is;
    // has to be transient because Cluster and Session are not serializable
    protected transient Cluster _cassandraCluster;
    protected transient Session _session;

    @SuppressWarnings("rawtypes")
    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
        // maybe get properties (nodes, keyspace) from stormConf instead of hard coding them
        _cassandraCluster = Cluster.builder().withoutJMXReporting().withoutMetrics()
                .addContactPoints(nodes)
                .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
                .withReconnectionPolicy(new ExponentialReconnectionPolicy(100L,
                        TimeUnit.MINUTES.toMillis(5)))
                .withLoadBalancingPolicy(new TokenAwarePolicy(new RoundRobinPolicy()))
                .build();
        _session = _cassandraCluster.connect(keyspace);
    }

    @Override
    public void execute(Tuple input) {
        try {
            // use _session to talk to Cassandra
        } catch (Exception e) {
            LOG.error("CassandraBolt error", e);
            _collector.reportError(e);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // no output fields; this bolt only writes to Cassandra
    }
}
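If you do want to reuse one session across all bolt instances that land in the same worker JVM, a lazily initialized static holder is a common workaround. A sketch, with the contact points and keyspace passed in as assumptions (uses the same com.datastax.driver.core classes as above):

public final class CassandraSessionHolder {

    private static Session session;

    private CassandraSessionHolder() {
    }

    // One Session per worker JVM; bolts running in other workers
    // (other JVMs or nodes) build their own on first use.
    public static synchronized Session get(String[] nodes, String keyspace) {
        if (session == null) {
            Cluster cluster = Cluster.builder()
                    .addContactPoints(nodes)
                    .build();
            session = cluster.connect(keyspace);
        }
        return session;
    }
}

Each bolt would then call CassandraSessionHolder.get(...) from prepare() instead of building its own Cluster, so at most one connection exists per worker.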
I have a huge customer profile page. If two or more users open the same page and start editing, big changes will happen in my database, so I am planning to implement a threading concept where only one user can use that customer page at a time.
I'm aware of the threads concept but confused about how to implement it.
I expect I need to use a singleton class as well.
Any suggestions or logic would be helpful.
I'm using the Struts and Hibernate frameworks.
You can use the application context to store a flag variable. The action will use its value to allow only one simultaneous execution.
public class TestAction extends ActionSupport implements ApplicationAware {

    private static final String APP_BUSY_KEY = "APP_BUSY";

    private Map<String, Object> map;

    @Override
    public void setApplication(Map<String, Object> map) {
        this.map = map;
    }

    @Override
    public String execute() throws Exception {
        // Note: containsKey followed by put is not atomic; synchronize on the map
        // (or use putIfAbsent on a ConcurrentMap) if two requests may race here.
        if (map.containsKey(APP_BUSY_KEY)) {
            return ERROR;
        } else {
            map.put(APP_BUSY_KEY, "1");
            try {
                // action logic here
            } finally {
                map.remove(APP_BUSY_KEY);
            }
            return SUCCESS;
        }
    }
}
If you plan to implement similar logic across two requests (lock after displaying the values and release the lock after submitting the new values), then the logic will be more complex and you will also need to handle lock release after a timeout.
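A rough sketch of that two-request variant, storing the acquisition time so a stale lock eventually expires. The 60-second timeout and method name are illustrative:

// In the "display" action: try to take the lock, stealing it when stale.
private static final long LOCK_TIMEOUT_MS = 60_000L; // illustrative

public String lockForEdit() {
    Long lockedAt = (Long) map.get(APP_BUSY_KEY);
    long now = System.currentTimeMillis();
    if (lockedAt != null && now - lockedAt < LOCK_TIMEOUT_MS) {
        return ERROR; // another user holds a fresh lock
    }
    map.put(APP_BUSY_KEY, now); // acquire, or steal a stale lock
    return SUCCESS;
    // the submit action calls map.remove(APP_BUSY_KEY) to release early
}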
I use Netty for a multithreaded TCP server with a single persistent client connection.
The client sends many binary messages (10000 in my use case) and is supposed to receive an answer for each message. I added an OrderedMemoryAwareThreadPoolExecutor to the pipeline to handle the execution of DB calls on multiple threads.
If I run a DB call in the messageReceived() method (or simulate it with Thread.sleep(50)), then all events are handled by a single thread:
5 count of {main}
1 count of {New
10000 count of {pool-3-thread-4}
For a simple implementation of messageReceived() the server creates many executor threads as expected.
How should I configure the ExecutionHandler to get multiple executor threads for the business logic?
Here is my code:
public class MyServer {

    public void run() {
        OrderedMemoryAwareThreadPoolExecutor eventExecutor = new OrderedMemoryAwareThreadPoolExecutor(
                16, 1048576L, 1048576L, 1000, TimeUnit.MILLISECONDS, Executors.defaultThreadFactory());
        ExecutionHandler executionHandler = new ExecutionHandler(eventExecutor);
        bootstrap.setPipelineFactory(new ServerChannelPipelineFactory(executionHandler));
    }
}
public class ServerChannelPipelineFactory implements ChannelPipelineFactory {

    public ChannelPipeline getPipeline() throws Exception {
        ChannelPipeline pipeline = Channels.pipeline();
        pipeline.addLast("encoder", new MyProtocolEncoder());
        pipeline.addLast("decoder", new MyProtocolDecoder());
        pipeline.addLast("executor", executionHandler);
        pipeline.addLast("myHandler", new MyServerHandler(dataSource));
        return pipeline;
    }
}
public class MyServerHandler extends SimpleChannelHandler {

    public void messageReceived(ChannelHandlerContext ctx, final MessageEvent e) throws DBException {
        // long-running DB call simulation
        try {
            Thread.sleep(50);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        // a simple message
        final MyMessage answerMsg = new MyMessage();
        if (e.getChannel().isWritable()) {
            e.getChannel().write(answerMsg);
        }
    }
}
OrderedMemoryAwareThreadPoolExecutor guarantees that events from a single channel are processed in order. You can think of it as binding a channel to a specific thread in the pool and then processing all events on that thread, although it's a bit more complex than that, so don't depend on a channel always being processed by the same thread.
If you start up a second client you'll see it (most likely) being processed on another thread from the pool. If you really can process a single client's requests in parallel, then you probably want MemoryAwareThreadPoolExecutor, but be aware that this offers no guarantees on the order of channel events.
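A sketch of that swap, reusing the constructor arguments from the question; only the executor class changes:

// Same pool settings, but no per-channel ordering: events from one channel
// may now be processed in parallel on different pool threads.
MemoryAwareThreadPoolExecutor eventExecutor = new MemoryAwareThreadPoolExecutor(
        16, 1048576L, 1048576L, 1000, TimeUnit.MILLISECONDS, Executors.defaultThreadFactory());
ExecutionHandler executionHandler = new ExecutionHandler(eventExecutor);
bootstrap.setPipelineFactory(new ServerChannelPipelineFactory(executionHandler));

With this in place, your handler must tolerate out-of-order responses per connection, e.g. by carrying a correlation id in each message.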