In RestKit 0.10.3, using delegate methods, I was able to send simultaneous GET requests, even though the mapping of each one was executed sequentially. My GET requests take a long time to receive a response, so I would like to send all requests simultaneously but respect an order in the mapping (I have relationships crossing the three requests). What I need:
1) send GET request no. 1, send GET request no. 2, send GET request no. 3
2) start mapping of request 1 as soon as response 1 is received
3) when mapping of response 1 did finish, wait for response 2 and map
(or just start mapping if already received)
4) when mapping of response 2 did finish, wait for response 3 and map
(or just start mapping if already received)
What seems to happen (if the operationQueue on the objectManager is NOT limited to 1 concurrent operation):
1) send GET request no. 1, send GET request no. 2, send GET request no. 3
2) map a response as soon as it is received and the previous mapping has finished
Question 1: is it possible to respect an order in mapping?
Question 2: can the mapping of multiple responses (point 2) occur simultaneously? In other words, is this possible:
1) send GET request no. 1, send GET request no. 2
2) start mapping response 1
3) start mapping response 2
4) mapping response 1 ends
5) mapping response 2 ends
If this is not possible, I would have a "half" solution: enqueue each request in willMapDeserializedResponseBlock. The requests would not be sent simultaneously, but at least I would be able to send each request before the previous mapping finishes.
Question 3: if I duplicate the "addConnectionForRelationship" of the entityMappings on both sides of the crossed relationships, the three mappings could be executed simultaneously and the order would no longer matter. Is this possible without creating some evil behavior (duplicate objects, orphaned objects, missing relationships, low performance)?
Am I crazy?
:D
If you don't have existing objects in the database, then concurrently running multiple operations in the background will cause issues, because you will be trying to connect relationships or prevent duplicates across multiple contexts on different threads.
The 2 general solutions are:
Don't run the requests concurrently
Have all of the stub objects created up front, and then have the responses populate and relate them
If you need the order to be explicit, then you should use AFNetworking to execute the downloads concurrently and then use RestKit mapping operations with the response data, specifying dependencies between them.
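Not RestKit itself, but the shape of that solution can be sketched in TypeScript (fetchJson and mapResponse are hypothetical stand-ins for the download and for the sequential mapping step):

// hypothetical stand-ins for the download and for one mapping step
const fetchJson = async (url: string): Promise<unknown> => (await fetch(url)).json();
const mapResponse = (response: unknown): void => { /* run one mapping step */ };

async function run(urls: string[]): Promise<void> {
  // all GETs go out at once
  const pending = urls.map((url) => fetchJson(url));

  // mapping runs strictly in order: each await either uses a response that
  // has already arrived or waits for it, exactly the ordering asked for above
  for (const response of pending) {
    mapResponse(await response);
  }
}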
I have a background in Java and I am relatively new to Node. I am trying to understand how Node, being single-threaded, can still handle multiple requests at the same time.
I have read about the single thread and the event loop, as well as the related stackoverflow questions, but I am still not sure I have understood it correctly, hence this question.
I have a simple http service that takes an id as an input. There can be multiple requests at almost the same time with the same id, and of course also other requests at almost the same time with other ids.
When the service is called, the following happens:
Look up the id in the DB (in a blocking manner, i.e. await)
If the DB lookup did not find a result, insert the id into the DB
Let's say there are two requests at almost the same time, with the same id.
My question is whether the following is possible:
Request 1 makes the lookup in the DB -> no result
Request 2 makes the lookup in the DB -> no result
Request 1 inserts a new row
Request 2 inserts a new row
The blocking manner of the lookup makes me guess the answer is "no, that is not possible", but then I read that the await does not block the single thread. What makes me want to answer "yes, it is possible" is that I do not understand how several requests could be handled at all if the above were not possible.
Thanks,
-Louise
As far as I can determine the answer is "yes, that is possible". The "await" on the call to the DB ensures that the query has finished before we continue to the next line of code, but it does not block the thread.
The thread continues with other tasks while awaiting the DB operation to finish, and those other tasks might be handling another request. This means that a race condition can happen between multiple requests.
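A minimal TypeScript sketch of that race, with an in-memory array standing in for the DB (delay simulates query latency; the awaits are where the event loop can interleave the other request):

const rows: string[] = []; // stand-in for the DB table
const delay = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function handleRequest(id: string): Promise<void> {
  await delay(10);                 // simulated lookup latency
  const found = rows.includes(id); // step 1: look up the id
  if (!found) {
    await delay(10);               // simulated insert latency
    rows.push(id);                 // step 2: insert if not found
  }
}

// two near-simultaneous requests with the same id: both lookups see
// "no result" before either insert lands, so both insert
await Promise.all([handleRequest("42"), handleRequest("42")]);
console.log(rows); // ["42", "42"] - two rows for the same id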
I'm using JMeter 4.0 to create a stress test. The purpose is to emulate the types of requests we receive in production, which is generally a mix of requests of different types at a certain frequency, and occasionally (1 in 1000) duplicate requests of the same type within milliseconds of each other.
I've managed to create a thread group emulating frequent requests of different types and a second thread group emulating duplicate requests (using synchronizing timer to ensure the requests fire off together).
I'm almost finished. My only problem is that there is no relationship between the thread groups whatsoever. If I wanted to perform a duplicate request once every 1000 requests, I'd need to know how long an average request takes (which is complicated by the fact that there are several request types), calculate the time required for roughly 1000 requests to be made, and add an appropriate constant timer in the other thread group.
This isn't ideal. I'll settle for it if I must, but I was hoping the bright minds of Stack Overflow could share some insight on my issue.
Some ideas I've had:
Add a run counter which cycles every 1000 normal requests; once the counter hits 1000, I perform a second request (though it would be under the same thread and after I've received the response to the first). Could this be made to work using a synchronizing timer?
Use a constant throughput timer with "all active threads (shared)" selected, whose target throughput (in samples per minute) is set to 1000.
Is there a better way still? The actual requests are HTTP requests, though there are several preparatory steps before the message is sent. I'm already using a constant throughput timer in the first thread group (random service requests) to maintain a specific number of requests per minute, so I'm not sure whether adding a second constant throughput timer in the other thread group would create issues.
Thank you for your time.
You can add an If Controller with a condition that holds for 1 in every 1000 threads:
${__jexl3(${__threadNum} % 1000 == 0)}
and inside the If Controller execute your duplicate HTTP Request.
__threadNum returns the current thread/user number.
I'm using the cluster module to have multiple workers that fetch data from an API, process it, and write an aggregate to the DB. The problem is that the API limits the number of requests per second. Now I'm searching for a solution to synchronize the limit across all workers.
I'd be thankful for any hint that helps solve this.
If you have a limit on the number of requests per second, you could keep track of how many requests you have left in the master thread. Each child could then ask the master thread whether it may send before sending a request, and the master thread would only fulfill the request when it has requests available for the current second. Here is another answer showing how master -> slave communication works.
At the end of each second, you would reset the master thread's count back to the number of requests available.
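A minimal sketch of that with node:cluster (the message strings and MAX_PER_SECOND are made-up names; a real version would also queue denied workers and notify them when the tokens refill):

import cluster from "node:cluster";

const MAX_PER_SECOND = 10;

if (cluster.isPrimary) {
  let tokens = MAX_PER_SECOND;
  setInterval(() => { tokens = MAX_PER_SECOND; }, 1000); // refill each second

  for (let i = 0; i < 4; i++) cluster.fork();

  // a worker asks before each request; grant only while tokens remain
  cluster.on("message", (worker, msg) => {
    if (msg === "may-i-send" && tokens > 0) {
      tokens--;
      worker.send("go-ahead");
    }
  });
} else {
  process.on("message", (msg) => {
    if (msg === "go-ahead") {
      // fire the actual API request here
    }
  });
  process.send?.("may-i-send"); // ask the master before sending
}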
This approach would be best for achieving the maximum. However, a much simpler approach is to start N threads and allow each of them to make K requests per second, where K * N is just under the number of requests allowed per second. The safest variant, and the one least likely to hit the limit, is to setTimeout between the end of one request and the start of the next, though the effective rate then also absorbs the time spent processing each request. The next best option is for each thread to fire its K requests at the start of each second and not fire again until the next second.
Your safest solution is to not go close to the limit at all and instead stick to a maximum of N/2 requests per second, where N is the maximum number of requests allowed per second.
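A sketch of that end-to-start pacing inside one worker (fetchFromApi is a stand-in):

// one request, then a fixed pause, then the next: the gap can never shrink
// below INTERVAL_MS, so this worker stays safely under its share of the limit
const INTERVAL_MS = 1000;
const fetchFromApi = async (): Promise<void> => { /* call the API here */ };

async function pacedWorker(): Promise<void> {
  for (;;) {
    await fetchFromApi();
    await new Promise((resolve) => setTimeout(resolve, INTERVAL_MS));
  }
}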
I'm trying to build a simple orchestration engine in a functional test like the following:
object Engine {
  def orchestrate(apiSequence: Seq[Any]) {
    val execUnitList = getExecutionUnits(apiSequence) // build a specific list
    schedule(execUnitList)                            // call multiple APIs
  }
}
In the methods called underneath (getExecutionUnits and schedule), the pattern I've applied is one where I incrementally build a list (hence, not a val but a var), iterate over the list, call specific APIs, and run some custom validation on each one.
I'm aware that an object in Scala is sort of equivalent to a singleton (so there's only one instance of Engine, in my case). I'm wondering if this is an appropriate pattern if I'm expecting hundreds of invocations of the orchestrate method concurrently. I'm not managing any other internal variables within the Engine object and I'm simply acting on the provided arguments in the method. Assuming that the schedule method can take up to 10 seconds, I'm worried about the behavior when it comes to concurrent access. If client1, client2 and client3 call this method at the same time, will 2 of the clients get queued up and be blocked by the current client being processed?
Is there a safer idiomatic way to handle the use-case? Do you recommend using actors to wrap up the "orchestrate" method to handle concurrent requests?
Edit: To clarify, it is absolutely essential that the 2 methods (getExecutionUnits and schedule) are called in sequence. Moreover, the schedule method in turn calls multiple APIs (anywhere between 1 and 10), and it is important that they too get executed in sequence. As of right now I have a simple for loop that tackles one API at a time, waits for the response, then moves on to the next one if appropriate.
I'm not managing any other internal variables within the Engine object and I'm simply acting on the provided arguments in the method.
If you are using any vars in Engine at all, this won't work. However, from your description it seems like you don't: you have a local var in getExecutionUnits method and (possibly) a local var in schedule which is initialized with the return value of getExecutionUnits. This case should be fine.
If client1, client2 and client3 call this method at the same time, will 2 of the clients get queued up and be blocked by the current client being processed?
No, if you don't add any synchronization (and if Engine itself has no state, you shouldn't).
Do you recommend using actors to wrap up the "orchestrate" method to handle concurrent requests?
If you wrap it in one actor, then the clients will be blocked waiting while the engine is handling one request.
I am redesigning the messaging system for my app to use Intel Threading Building Blocks and am stumped trying to decide between two possible approaches.
Basically, I have a sequence of message objects and, for each message type, a sequence of handlers. For each message object, I apply each handler registered for that message object's type.
The sequential version would be something like this (pseudocode):
for each message in message_sequence <- SEQUENTIAL
    for each handler in (handler_table for message.type)
        apply handler to message <- SEQUENTIAL
The first approach which I am considering processes the message objects in turn (sequentially) and applies the handlers concurrently.
Pros:
predictable ordering of messages (i.e., we are guaranteed a FIFO processing order)
(potentially) lower latency of processing each message
Cons:
there are more processing resources available than handlers for a single message type (poor parallelization)
bad use of processor cache since message objects need to be copied for each handler to use
large overhead for small handlers
The pseudocode of this approach would be as follows:
for each message in message_sequence <- SEQUENTIAL
    parallel_for each handler in (handler_table for message.type)
        apply handler to message <- PARALLEL
The second approach is to process the messages in parallel and apply the handlers to each message sequentially.
Pros:
better use of processor cache (keeps the message object local to all handlers which will use it)
small handlers don't impose as much overhead (as long as there are other handlers also to be run)
more messages are expected than there are handlers, so the potential for parallelism is greater
Cons:
Unpredictable ordering - if message A is sent before message B, they may both be processed at the same time, or B may finish processing before all of A's handlers are finished (order is non-deterministic)
The pseudocode is as follows:
parallel_for each message in message_sequence <- PARALLEL
    for each handler in (handler_table for message.type)
        apply handler to message <- SEQUENTIAL
The second approach has more advantages than the first, but non-deterministic ordering is a big disadvantage.
Which approach would you choose and why? Are there any other approaches I should consider (besides the obvious third approach: parallel messages and parallel handlers, which has the disadvantages of both and no real redeeming factors as far as I can tell)?
Thanks!
EDIT:
I think what I'll do is use #2 by default, but allow a "conversation tag" to be attached to each message. Any messages with the same tag are ordered and handled sequentially within their conversation. Handlers are passed the conversation tag alongside the message, so they may continue the conversation if they need to. Something like this:
Conversation c = new_conversation()
send_message(a, c)
...
send_message(b, c)
...
send_message(x)

handler foo(msg, conv)
    send_message(z, conv)
...
register_handler(foo, a.type)
a is handled before b, which is handled before z. x can be handled in parallel with a, b and z. Once all messages in a conversation have been handled, the conversation is destroyed.
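One way to sketch that scheduling in TypeScript (handle and the per-conversation map are illustrative): messages with the same tag are chained onto one promise, so they run in order, while other conversations and untagged messages proceed in parallel.

const handle = async (msg: string): Promise<void> => { /* apply handlers */ };

// tail of the pending work for each conversation
const conversations = new Map<string, Promise<void>>();

function send(msg: string, conv?: string): void {
  if (conv === undefined) {
    void handle(msg); // untagged: runs in parallel with everything else
    return;
  }
  const prev = conversations.get(conv) ?? Promise.resolve();
  conversations.set(conv, prev.then(() => handle(msg))); // FIFO within conv
}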
I'd say do something different altogether. Don't send work to the threads. Have the threads pull work when they finish their previous work.
Maintain a fixed number of worker threads (the optimal amount is equal to the number of CPU cores in the system) and have each of them sequentially pull the next task from the global queue after it finishes the previous one. Obviously, you would need to keep track of dependencies between messages to defer handling of a message until its dependencies are fully handled.
This could be done with very little synchronization overhead - possibly with only atomic operations, no heavy primitives like mutexes or semaphores.
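A minimal sketch of the pull model (cooperative async tasks in TypeScript stand in for real threads; queue and handle are illustrative names, and the dependency tracking is omitted):

const queue: string[] = ["msg1", "msg2", "msg3", "msg4", "msg5"];

async function handle(msg: string): Promise<void> {
  // apply every handler registered for this message's type, sequentially
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulated work
}

// each worker pulls its next message when it finishes the previous one,
// instead of having work pushed to it up front
async function worker(): Promise<void> {
  for (let msg = queue.shift(); msg !== undefined; msg = queue.shift()) {
    await handle(msg);
  }
}

// fixed-size pool, roughly one worker per core
await Promise.all(Array.from({ length: 4 }, () => worker()));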
Also, if you pass a message to each handler by reference, instead of making a copy, having the same message handled simultaneously by different handlers on different CPU cores can actually improve cache performance, as higher levels of cache (usually from L2 upwards) are often shared between CPU cores: when one handler reads a message into the cache, the handler on the second core will already have the message in L2. So think carefully - do you really need to copy the messages?
If possible I would go for number two, with some tweaks. Do you really need every message to be in order? I find that to be an unusual case. Some messages just need to be handled as soon as possible, and some messages need to be processed before another message, but not before every message.
If there are some messages that have to be in order, then mark them in some way. You can mark them with a conversation code that lets the processor know each one must be processed in order relative to the other messages in that conversation. Then you can process all conversation-less messages and one message from each conversation concurrently.
Give your design a good look and make sure that only the messages that need to be in order are ordered.
I suppose it comes down to whether or not the order is important. If the order is unimportant, you can go for method 2. If the order is important, you go for method 1. Depending on what your application is supposed to do, you can still go for method 2 but use a sequence number so all the messages are processed in the correct order (unless of course it is the processing part you are trying to optimize).
The first method also has unpredictable ordering. The processing of message 1 on thread 1 could take very long, making it possible that messages 2, 3 and 4 have long been processed by the time it finishes.
This would tip the balance to method 2.
Edit:
I see what you mean.
However, why in method 2 would you run the handlers sequentially? In method 1 the ordering doesn't matter and you're fine with that.
E.g. method 3: handle both the messages and the handlers in parallel.
Of course, here too, the ordering is not guaranteed.
Given that the handlers produce some result, you might just store the results in an ordered list, this way restoring the ordering eventually.
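For instance, a small resequencer (the names are illustrative) that buffers out-of-order results and emits each one as soon as its turn arrives:

// emits values in sequence order, however they arrive
class Resequencer<T> {
  private buffer = new Map<number, T>();
  private next = 0;

  constructor(private emit: (value: T) => void) {}

  push(seq: number, value: T): void {
    this.buffer.set(seq, value);
    while (this.buffer.has(this.next)) {
      this.emit(this.buffer.get(this.next)!);
      this.buffer.delete(this.next);
      this.next++;
    }
  }
}

// usage: handlers finish in any order, output is still in sequence order
const r = new Resequencer<string>((v) => console.log(v));
r.push(1, "second"); // buffered
r.push(0, "first");  // emits "first", then "second"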