Function/Code Design with Concurrency in Swift - multithreading

I'm trying to create my first app in Swift which involves making multiple requests to a website. These requests are each done using the block
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in ... })
task.resume()
From what I understand this block uses a thread different to the main thread.
My question is, what is the best way to design code that relies on the values in that block? For instance, the ideal design (however not possible due to the fact that the thread executing these blocks is not the main thread) is
func prepareEmails() {
var names = getNames()
var emails = getEmails()
...
sendEmails()
}
func getNames() -> NSArray? {
var names: NSArray? = nil
....
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
names = ...
})
task.resume()
return names
}
func getEmails() -> NSArray? {
var emails: NSArray? = nil
....
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
emails = ...
})
task.resume()
return emails
}
However, in the above design, getNames() and getEmails() will most likely return nil, as the tasks will not have updated names/emails by the time the functions return.
The alternative design (which I currently implement) is by effectively removing the 'prepareEmails' function and doing everything sequentially in the task functions
func prepareEmails() {
getNames()
}
func getNames() {
...
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
getEmails(names)
})
task.resume()
}
func getEmails(names: NSArray) {
...
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
sendEmails(emails, names)
})
task.resume()
}
Is there a more effective design than the latter? This is my first experience with concurrency, so any advice would be greatly appreciated.

The typical pattern when calling an asynchronous method that takes a completionHandler parameter is to adopt the same completionHandler closure pattern in your own methods. So the methods don't return anything, but rather call a closure with the retrieved information as a parameter:
func getNames(completionHandler:(NSArray!)->()) {
....
let task = NSURLSession.sharedSession().dataTaskWithRequest(request) {data, response, error -> Void in
let names = ...
completionHandler(names)
}
task.resume()
}
func getEmails(completionHandler:(NSArray!)->()) {
....
let task = NSURLSession.sharedSession().dataTaskWithRequest(request) {data, response, error -> Void in
let emails = ...
completionHandler(emails)
}
task.resume()
}
Then, if you need to perform these sequentially, as suggested by your code sample (i.e. if the retrieval of emails was dependent upon the names returned by getNames), you could do something like:
func prepareEmails() {
getNames() { names in
getEmails() {emails in
sendEmails(names, emails) // I'm assuming the names and emails are in the input to this method
}
}
}
Or, if they can run concurrently, then you should do so, as it will be faster. The trick is how to make a third task dependent upon two other asynchronous tasks. The two traditional approaches are:
1. Wrap each of these asynchronous tasks in its own asynchronous NSOperation, and then create a third operation dependent upon those other two. This is probably beyond the scope of the question, but you can refer to the Operation Queues section of the Concurrency Programming Guide or see the Asynchronous vs Synchronous Operations and Subclassing Notes sections of the NSOperation Class Reference.
2. Use dispatch groups: enter the group before each request, leave the group within the completion handler of each request, and then add a dispatch group notification block (called when all of the group "enter" calls are matched by their corresponding "leave" calls):
func prepareEmails() {
let group = dispatch_group_create()
var emails: NSArray!
var names: NSArray!
dispatch_group_enter(group)
getNames() { results in
names = results
dispatch_group_leave(group)
}
dispatch_group_enter(group)
getEmails() {results in
emails = results
dispatch_group_leave(group)
}
dispatch_group_notify(group, dispatch_get_main_queue()) {
if names != nil && emails != nil {
self.sendEmails(names, emails)
} else {
// one or both of those requests failed; tell the user
}
}
}
Frankly, if there's any way to retrieve both the emails and names in a single network request, that's going to be far more efficient. But if you're stuck with two separate requests, you could do something like the above.
Note, I wouldn't generally use NSArray in my Swift code, but rather use an array of String objects (e.g. [String]). Furthermore, I'd put in error handling where I return the nature of the error if either of these fail. But hopefully this illustrates the concepts involved in (a) writing your own methods with completionHandler blocks; and (b) invoking a third bit of code dependent upon the completion of two other asynchronous tasks.
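The dispatch group code above uses the C-style GCD function names from early Swift. Since Swift 3 the same calls are spelled as methods on DispatchGroup, so a rough modern equivalent might look like the sketch below. The bodies of getNames and getEmails here are hypothetical stand-ins that simulate a network response with a delay, Fetched is a helper class invented for this sketch, and the notifyOn parameter is an addition so the notification queue can be swapped out (it defaults to the main queue, as in the code above).

```swift
import Foundation
import Dispatch

// Hypothetical stand-ins for the two network calls; each one "responds"
// on a background queue after a short delay.
func getNames(completionHandler: @escaping ([String]?) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) {
        completionHandler(["Alice", "Bob"])
    }
}

func getEmails(completionHandler: @escaping ([String]?) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.02) {
        completionHandler(["alice@example.com", "bob@example.com"])
    }
}

// Holds the two results. Each property is written by exactly one callback
// and read only after both callbacks have left the group.
final class Fetched {
    var names: [String]?
    var emails: [String]?
}

// Starts both requests at once and calls `completion` on `queue` when both
// have finished. A nil value means that request failed.
func prepareEmails(notifyOn queue: DispatchQueue = .main,
                   completion: @escaping ([String]?, [String]?) -> Void) {
    let group = DispatchGroup()
    let fetched = Fetched()

    group.enter()
    getNames { names in
        fetched.names = names
        group.leave()
    }

    group.enter()
    getEmails { emails in
        fetched.emails = emails
        group.leave()
    }

    group.notify(queue: queue) {
        completion(fetched.names, fetched.emails)
    }
}
```

In an app you would typically call prepareEmails { names, emails in ... } and keep the default main queue, checking both values for nil before sending anything.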

The answers above (particularly Rob's DispatchQueue based answer) describe the concurrency concepts necessary to run two tasks in parallel and then respond to the result. The answers lack error handling for clarity because traditionally, correct solutions to concurrency problems are quite verbose.
Not so with HoneyBee.
HoneyBee.start()
.setErrorHandler(handleErrorFunc)
.branch {
$0.chain(getNames)
+
$0.chain(getEmails)
}
.chain(sendEmails)
This code snippet manages all of the concurrency, routes all errors to handleErrorFunc and looks like the concurrent pattern that is desired.

Related

Node.js: How to implement a simple and functional Mutex mechanism to avoid racing conditions that bypass the guard statement in simultaneous actions

In the following class, the _busy field acts as a semaphore; but, in "simultaneous" situations it fails to guard!
class Task {
_busy = false;
async run(s) {
try {
if (this._busy)
return;
this._busy = true;
await payload();
} finally {
this._busy = false;
}
}
}
The sole purpose of run() is to execute payload() exclusively, denying all other invocations while it's still in progress. In other words, when any invocation reaches the run() method, I want it to let only the first one through and lock it down (denying all the others) until it's done with its payload; "finally", it opens up once it's done.
In the implementation above, a race condition does occur when the run() method is invoked simultaneously from various parts of the app. Some of the invocations (more than 1) make it past the "guarding" if statement, since none of them has yet reached this._busy = true to lock it down (they get past simultaneously). So, the current implementation doesn't cut it!
I just want to deny the simultaneous invocations while one of them is already being carried out. I'm looking for a simple solution to only resolve this issue. I've designated the async-mutex library as a last resort!
So, how to implement a simple "locking" mechanism to avoid racing conditions that bypass the guard statement in simultaneous actions?
For more clarification, as per the comments below, the following is almost the actual Task class (without the irrelevant parts).
class Task {
_cb;
_busy = false;
_count = 0;
constructor(cb) {
this._cb = cb;
}
async run(params = []) {
try {
if (this._busy)
return;
this._busy = true;
this._count++;
if (this._count > 1) {
console.log('Race condition!', 'count:', this._count);
this._count--;
return;
}
await this._cb(...params);
} catch (err) {
await someLoggingRoutine();
} finally {
this._busy = false;
this._count--;
}
}
}
I do encounter the Race condition! log. Also, all the task instances are local to a simple driver file (the instances are not passed down to any other function; they only exist as local instances in a single function). They are created in the following form:
const t1 = new Task(async () => { await doSth1(); /*...*/ });
const t2 = new Task(async () => { await doSth2(); /*...*/ });
const t3 = new Task(async () => { await doSth3(); /*...*/ });
// ...
I call them in various library events, some of which happen concurrently and cause the "race condition" issue; e.g.:
someLib.on('some-event', async function() { /*...*/ t1.run().then(); /*...*/ });
anotherLib.on('event-2', async function() { /*...*/ t1.run().then(); /*...*/ });
Oh god, now I see it. How could I have missed this for so long! Here is your implementation:
async run() {
try {
if (this._busy)
return;
...
} finally {
this._busy = false;
}
}
As per the documentation:
The Statements in the finally block are executed before control flow exits the try...catch...finally construct. These statements execute regardless of whether an exception was thrown or caught.
Thus, when the task is busy, the flow reaches the guarding if and, logically, encounters the return statement. The return statement causes the flow to exit the try...catch...finally construct; thus, as per the documentation, the statements in the finally block are executed regardless, setting this._busy = false and opening the thing up!
So, the first call of run() sets this._busy to true, then enters the critical section with its long-running callback. While this callback is running, another event causes run() to be invoked. This second call is rightly blocked from entering the critical section by the guarding if statement:
if (this._busy) return;
Encountering the return statement to exit the function (and thus exiting the try...catch...finally construct) causes the statements in the finally block to be executed; thus, this._busy = false resets the flag, even though the first callback is still running! Now suppose a third call to run() is invoked from yet another event. Since this._busy has just been set to false, the flow happily enters the critical section again, even though the first callback is still running! In turn, it sets this._busy back to true. In the meantime, the first callback finishes and reaches the finally block, where it sets this._busy = false again, even though the other callback is still running. So the next call to run() can enter the critical section again with no problems... And so on and so forth...
So to resolve the issue, the check for the critical section should be outside of the try block:
async run() {
if (this._busy) return;
this._busy = true;
try { ... }
finally {
this._busy = false;
}
}

Hi, is there a better way to wait for an async execution to finish in Rust?

Just to explain, I send a command and an id to the other thread; the other thread executes the thing and returns the command and id as a tuple. Because multiple functions might be called at nearly the same time and they are IO operations, the results might come back in a different order than the requests went out. So, I have this function to wait for the right return value and put the rest into a map (the variable called results) so that the other calling functions can retrieve theirs as well. However, I think that if two functions call this one at the same time, they might race to retrieve the value from the channel. I just think there should be a more elegant way of doing this.
For now I have this function to wait for an async execution in another thread:
async fn wait_for(&mut self, id : u64) -> (u64, io::Result<SomeResult>) {
loop {
let temp = self.channels.1.recv().await.unwrap();
if temp.0 == id {
return temp
} else {
self.results.insert(temp.0, temp.1);
}
if let Some(res) = self.results.remove(&id) {
return (id, res)
}
}
}
P.S. I also had a thought of having another thread do the checking and return the results in order, but that pretty much defeats the purpose of async/await (as far as I know).

dataTaskWithURL for dummies

I keep learning iDev but I still can't deal with http requests.
It seems crazy, but nobody I talk to about synchronous requests understands me. Okay, it's really important to stay on a background queue as much as possible to provide a smooth UI. But in my case I load JSON data from a server and I need to use this data immediately.
The only way I have achieved this is with semaphores. Is that okay? Or do I have to use something else? I tried NSOperation, but I have too many little requests, so creating a class for each of them does not seem to make for easy-to-read code.
func getUserInfo(userID: Int) -> User {
var user = User()
let linkURL = URL(string: "https://server.com")!
let session = URLSession.shared
let semaphore = DispatchSemaphore(value: 0)
let dataRequest = session.dataTask(with: linkURL) { (data, response, error) in
let json = JSON(data: data!)
user.userName = json["first_name"].stringValue
user.userSurname = json["last_name"].stringValue
semaphore.signal()
}
dataRequest.resume()
semaphore.wait(timeout: DispatchTime.distantFuture)
return user
}
You wrote that people don't understand you, but on the other hand it reveals that you don't understand how asynchronous network requests work.
For example imagine you are setting an alarm for a specific time.
Now you have two options for spending the time until then:
1. Do nothing but sit in front of the alarm clock and wait until the alarm goes off. Have you ever done that? Certainly not, but this is exactly what you have in mind regarding the network request.
2. Do several useful things, ignoring the alarm clock until it rings. That is how asynchronous tasks work.
In terms of a programming language you need a completion handler which is called by the network request when the data has been loaded. In Swift you are using a closure for that purpose.
For convenience declare an enum with associated values for the success and failure cases and use it as the return value in the completion handler
enum RequestResult {
case Success(User), Failure(Error)
}
Add a completion handler to your function, including the error case. It is highly recommended to always handle the error parameter of an asynchronous task. When the data task returns, it calls the completion closure, passing the user or the error depending on the situation.
func getUserInfo(userID: Int, completion: @escaping (RequestResult) -> ()) {
let linkURL = URL(string: "https://server.com")!
let session = URLSession.shared
let dataRequest = session.dataTask(with: linkURL) { (data, response, error) in
if error != nil {
completion(.Failure(error!))
} else {
let json = JSON(data: data!)
var user = User()
user.userName = json["first_name"].stringValue
user.userSurname = json["last_name"].stringValue
completion(.Success(user))
}
}
dataRequest.resume()
}
Now you can call the function with this code:
getUserInfo(userID: 12) { result in
switch result {
case .Success(let user) :
print(user)
// do something with the user
case .Failure(let error) :
print(error)
// handle the error
}
}
In practice the point in time right after your semaphore and the switch result line in the completion block is exactly the same.
Never use semaphores as an alibi not to deal with asynchronous patterns.
I hope the alarm clock example clarifies how asynchronous data processing works and why it is much more efficient to get notified (active) rather than waiting (passive).
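Since Swift 5 the standard library ships a generic Result type, so the custom RequestResult enum above can be replaced by Result<User, Error>. The sketch below shows one way to turn the (data, error) pair from a data task into such a value without force-unwrapping either argument. Everything in it is an assumption made for illustration: the User struct is a minimal stand-in for yours, UserRequestError and makeUserResult are hypothetical names, and parsing uses Foundation's JSONSerialization rather than the JSON(data:) wrapper from the question, treating a missing field as a failure.

```swift
import Foundation

// Minimal stand-in for the User type from the question.
struct User: Equatable {
    var userName: String
    var userSurname: String
}

// Hypothetical error cases for this sketch.
enum UserRequestError: Error, Equatable {
    case noData
    case malformedJSON
}

// Folds the possible outcomes of a data task (transport error, missing
// data, unparseable data, success) into a single Result value.
func makeUserResult(data: Data?, error: Error?) -> Result<User, Error> {
    if let error = error {
        return .failure(error)
    }
    guard let data = data else {
        return .failure(UserRequestError.noData)
    }
    guard
        let object = try? JSONSerialization.jsonObject(with: data),
        let json = object as? [String: Any],
        let first = json["first_name"] as? String,
        let last = json["last_name"] as? String
    else {
        return .failure(UserRequestError.malformedJSON)
    }
    return .success(User(userName: first, userSurname: last))
}
```

Inside the data task closure the call would then be completion(makeUserResult(data: data, error: error)), with the completion parameter typed as (Result<User, Error>) -> Void, and the caller switches over .success and .failure just as with the custom enum.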
Don't try to force network connections to work synchronously. It invariably leads to problems. Whatever code is making the above call could potentially be blocked for up to 90 seconds (30 second DNS timeout + 60 second request timeout) waiting for that request to complete or fail. That's an eternity. And if that code is running on your main thread on iOS, the operating system will kill your app outright long before you reach the 90 second mark.
Instead, design your code to handle responses asynchronously. Basically:
1. Create data structures to hold the results of various requests, such as obtaining info from the user.
2. Kick off those requests.
3. When each request comes back, check to see if you have all the data you need to do something, and then do it.
For a really simple example, if you have a method that updates the UI with the logged in user's name, instead of:
[self updateUIWithUserInfo:[self getUserInfoForUser:user]];
you would redesign this as:
[self getUserInfoFromServerAndRun:^(NSDictionary *userInfo) {
[self updateUIWithUserInfo:userInfo];
}];
so that when the response to the request arrives, it performs the UI update action, rather than trying to start a UI update action and having it block waiting for data from the server.
If you need two things—say the userInfo and a list of books that the user has read, you could do:
[self getUserInfoFromServerAndRun:^(NSDictionary *userInfo) {
self.userInfo = userInfo;
[self updateUI];
}];
[self getBookListFromServerAndRun:^(NSDictionary *bookList) {
self.bookList = bookList;
[self updateUI];
}];
...
- (void)updateUI
{
if (!self.bookList) return;
if (!self.userInfo) return;
...
}
or whatever. Blocks are your friend here. :-)
Yes, it's a pain to rethink your code to work asynchronously, but the end result is much, much more reliable and yields a much better user experience.

Return from asynchronous request in for loop

I’m trying to get data from my REST API. Specifically, I’m getting an array of integers (which are IDs of other users), and I want to loop through this array and download the data for all the other users. A simplified version of the code is shown below.
func asyncFunc(completion: (something:[Int])->Void){
//Get a JSON array asynchronously from my REST API
let jsonArray = [1,2,3,4,5]
var resultingArray:[Int] = []
for myThing in jsonArray{
anotherAsyncFunc(myThing, completion: { (somethingElse) -> Void in
resultingArray.append(somethingElse)
})
}
}
func anotherAsyncFunc(data:Int, completion: (somethingElse:Int)->Void){
//Get some more jsonData from RestApi/data
let myLoadedData:Int = data*13356
completion(somethingElse: myLoadedData)
}
How would I make my asyncFunc return an array with all the items it has gotten from the second (inner) async request.
I have tried getting the count of the array which is first requested from the REST API and just “blocking” the UI thread with a while loop to see if the “new” array has collected all the data (i.e. its count is equal to the count of the first requested array). This has two major disadvantages: it blocks the UI thread and, furthermore, it will fail and crash the app if the data connection breaks while I’m getting the data for the other users (the inner async request), because the while loop will never complete.
My question is how would I use a completion handler to return all the data that it should return without blocking the main thread and/or having to worry about badly timed data connection losses.
You can use a dispatch group notification. So create a dispatch group, enter the group for each item in the array, exit in the completion handler of the anotherAsyncFunc asynchronous process, and then create a notification that will trigger the final completion closure when all of the dispatch_group_enter calls have been offset by a corresponding dispatch_group_leave call:
func asyncFunc(completion: (something:[Int])->Void){
//Get a JSON array asynchronously from my REST API
let jsonArray = [1,2,3,4,5]
var resultingArray:[Int] = []
let group = dispatch_group_create()
for myThing in jsonArray {
dispatch_group_enter(group)
anotherAsyncFunc(myThing) { somethingElse in
resultingArray.append(somethingElse)
dispatch_group_leave(group)
}
}
dispatch_group_notify(group, dispatch_get_main_queue()) {
completion(something: resultingArray)
}
}
Note, you will want to make sure you synchronize the updates to resultingArray that the anotherAsyncFunc completion handlers perform. The easiest way is to make sure that anotherAsyncFunc dispatches its completion calls back to the main queue (if your REST API doesn't do that already).
func anotherAsyncFunc(data:Int, completion: (somethingElse:Int)->Void){
//Get some more jsonData from RestApi/data asynchronously
let myLoadedData:Int = data*13356
dispatch_async(dispatch_get_main_queue()) {
completion(somethingElse: myLoadedData)
}
}
This is just an example. You can use whatever synchronization mechanism you want, but make sure you synchronize updates on resultingArray accordingly.
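As an illustration of "whatever synchronization mechanism you want", here is a rough sketch in current Swift syntax (DispatchGroup methods rather than the dispatch_group_* functions) that guards the shared results with a lock instead of hopping to the main queue. It also stores each value at the index of the item that produced it, so the output order matches the input order even when the inner requests finish out of order. The OrderedResults class, the ids and notifyOn parameters, and the body of anotherAsyncFunc are all assumptions made for this sketch.

```swift
import Foundation
import Dispatch

// Hypothetical stand-in for the per-item request; it calls back on a
// background queue, so callbacks for different items may overlap.
func anotherAsyncFunc(_ data: Int, completion: @escaping (Int) -> Void) {
    DispatchQueue.global().async {
        completion(data * 13356)
    }
}

// Lock-protected storage with one slot per input item.
final class OrderedResults {
    private let lock = NSLock()
    private var slots: [Int?]

    init(count: Int) {
        slots = Array(repeating: nil, count: count)
    }

    func set(_ value: Int, at index: Int) {
        lock.lock()
        defer { lock.unlock() }
        slots[index] = value
    }

    // Returns the filled slots in input order.
    func values() -> [Int] {
        lock.lock()
        defer { lock.unlock() }
        return slots.compactMap { $0 }
    }
}

// Fires one request per id and calls `completion` on `queue` once every
// request has called back.
func asyncFunc(ids: [Int],
               notifyOn queue: DispatchQueue = .main,
               completion: @escaping ([Int]) -> Void) {
    let group = DispatchGroup()
    let results = OrderedResults(count: ids.count)

    for (index, id) in ids.enumerated() {
        group.enter()
        anotherAsyncFunc(id) { value in
            results.set(value, at: index)
            group.leave()
        }
    }

    group.notify(queue: queue) {
        completion(results.values())
    }
}
```

Note that every code path in the inner request, including failures, still has to call its completion handler exactly once, otherwise the group never empties and the final completion never runs.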

Parallel.Invoke - Exception handling

My code runs 4 functions (using Parallel.Invoke) to fill in information in a class such as:
class Person
{
int Age;
string name;
long ID;
bool isVegetarian;
public static Person GetPerson(int LocalID)
{
Person person;
Parallel.Invoke(() => { GetAgeFromWebServiceX(person); },
() => { GetNameFromWebServiceY(person); },
() => { GetIDFromWebServiceZ(person); },
() =>
{
// connect to my database and get information if vegetarian (using LocalID)
....
if (!person.isVegetarian)
return null;
....
});
}
}
My question is: I cannot return null if he's not a vegetarian, but I want to be able to stop all threads, stop processing and just return null. How can this be achieved?
To exit the Parallel.Invoke as early as possible you'd have to do three things:
1. Schedule the action that detects whether you want to exit early as the first action. It's then scheduled sooner (maybe as first, but that's not guaranteed) so you'll know sooner whether you want to exit.
2. Throw an exception when you detect the error and catch an AggregateException as Jon's answer indicates.
3. Use cancellation tokens. However, this only makes sense if you have an opportunity to check their IsCancellationRequested property.
Your code would then look as follows:
var cts = new CancellationTokenSource();
try
{
Parallel.Invoke(
new ParallelOptions { CancellationToken = cts.Token },
() =>
{
if (!person.IsVegetarian)
{
cts.Cancel();
throw new PersonIsNotVegetarianException();
}
},
() => { GetAgeFromWebServiceX(person, cts.Token); },
() => { GetNameFromWebServiceY(person, cts.Token); },
() => { GetIDFromWebServiceZ(person, cts.Token); }
);
}
catch (AggregateException e)
{
var cause = e.InnerExceptions[0];
// Check if cause is a PersonIsNotVegetarianException.
}
However, as I said, cancellation tokens only make sense if you can check them. So there should be an opportunity inside GetAgeFromWebServiceX to check the cancellation token and exit early, otherwise, passing tokens to these methods doesn't make sense.
Well, you can throw an exception from your action, catch AggregateException in GetPerson (i.e. put a try/catch block around Parallel.Invoke), check for it being the right kind of exception, and return null.
That fulfils everything except stopping all the threads. I think it's unlikely that you'll easily be able to stop already running tasks unless you start getting into cancellation tokens. You could stop further tasks from executing by keeping a boolean value to indicate whether any of the tasks so far has failed, and make each task check that before starting... it's somewhat ugly, but it will work.
I suspect that using "full" tasks instead of Parallel.Invoke would make all of this more elegant though.
Surely you need to load your Person from the database first anyway? As it is your code calls the Web services with a null.
If your logic really is sequential, do it sequentially and only do in parallel what makes sense.
