Node.js readers/writers concurrency

Here's some simple code that demonstrates what I'm trying to do
myVar = 1

reader = () ->
  getDataFromServer1().then ->
    # uses myVar and does stuff according to its value
    # returns promise

writer = () ->
  getDataFromServer2().then ->
    # assigns to myVar
    # returns promise

Q.all([reader(), reader(), reader(), writer(), writer(), writer()]).done ->
  console.log 'done'
So I have multiple asynchronous operations running at the same time. Some of them change the value of myVar and some read the value and rely on it. I don't want a writer to write while another writer is writing or a reader is reading; readers can read at the same time, though. This is similar to the readers–writers problem.
I tried to solve this by defining a sharedResource function as follows
sharedResource = (initialValue) ->
  readLock = Q.fcall ->
  writeLock = Q.fcall ->
  value: initialValue
  read: (action) ->
    # wait for pending writes, then run the reader; readers may overlap
    newPromise = writeLock.then action
    readLock = Q.all([newPromise, readLock]).then -> null
    newPromise
  write: (action) ->
    # wait for pending reads and writes, then run the writer exclusively
    newPromise = Q.all([readLock, writeLock]).then action
    writeLock = Q.all([newPromise, writeLock]).then -> null
    newPromise
and then changed my code to use it
myVar = sharedResource 1

reader = () ->
  myVar.read ->
    # no one is writing to myVar while doing this request:
    getDataFromServer1().then (data) ->
      # read myVar.value instead of myVar, e.g.
      expect(data == myVar.value)

writer = () ->
  myVar.write ->
    # no one reads or writes myVar while doing this request:
    getDataFromServer2().then (data) ->
      # change myVar.value instead of myVar, e.g.
      myVar.value = data

Q.all([reader(), reader(), reader(), writer(), writer(), writer()]).done ->
  console.log 'done'
This worked perfectly when I had only one sharedResource. Here's where the problem occurs
myVar1 = sharedResource 1
myVar2 = sharedResource 2

action1 = () ->
  myVar1.read ->
    myVar2.write ->
      getDataFromServer1().then (data) ->
        myVar2.value = data + myVar1.value

action2 = () ->
  myVar2.read ->
    myVar1.write ->
      getDataFromServer2().then (data) ->
        myVar1.value = data + myVar2.value

Q.all([action1(), action1(), action1(), action2(), action2(), action2()]).done ->
  console.log 'done'
Here a deadlock occurs: each promise waits for the other one to resolve, so none of them is ever resolved and the program stops.
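A standard way to break such a cycle, independent of the promise machinery above, is to acquire the two resources in one fixed global order, so that no action ever holds the second resource's lock while waiting for the first. A minimal F# sketch of that idea, assuming two hypothetical async-friendly locks (lock1, lock2) standing in for the promise-based locks:
open System.Threading

// Hypothetical locks standing in for the read/write queues of myVar1 and myVar2.
let lock1 = new SemaphoreSlim(1, 1)
let lock2 = new SemaphoreSlim(1, 1)

// Every action acquires lock1 before lock2, so a circular wait cannot form.
let withBothLocks (body: unit -> Async<'T>) = async {
    do! lock1.WaitAsync() |> Async.AwaitTask
    try
        do! lock2.WaitAsync() |> Async.AwaitTask
        try
            return! body ()
        finally
            lock2.Release() |> ignore
    finally
        lock1.Release() |> ignore
}
The same discipline would fix the promise version: if both action1 and action2 acquired myVar1 before myVar2, the circular wait could not form.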
Edit
I'll try my best to explain:
This is actually code to test my server, to see how it performs when multiple clients send multiple requests at the same time. Say, for example, every time action1 sends a request, the server increments a value stored in its database. On the client side (the code that you are seeing) I also increment a variable which contains the value I expect to be on the server. Then, when action2 sends a request, the server responds with that value and I assert that the value in the response is the same as my local variable.
So I have to get the lock before I send a request to make sure action2 doesn't ask for the variable while it is being changed.
I hope that helps.

Related

Is it possible to provide a setter function in my computational expression?

I'm trying to write an F# computational expression that permits reading and writing thread-safe variables only from within a critical section.
I've got a type, ThreadSafeVar<'t> that wraps a value, a CriticalSection<'t> and a computational expression builder, LockContext, as follows:
// wraps a value and restricts access to it
type ThreadSafeVar<'t> (value: 't) =
    member val internal Value = value with get, set

// Encapsulates a critical section
type CriticalSection<'t> =
    private { LockObj: obj; fn: unit -> 't }
    static member Lock(lc: CriticalSection<'t>) = lock lc.LockObj lc.fn

// Expression builder for a locked context
type LockContext () =
    member internal this.SyncRoot = obj()
    member this.Return(value: 'v) = value
    member this.ReturnFrom(value: ThreadSafeVar<'t>) = value.Value
    member __.Bind(value: ThreadSafeVar<'t>, fn: 't -> 'u) = fn value.Value
    // returns a CriticalSection
    member this.Run(fn : unit -> 'u) = { LockObj = this.SyncRoot; fn = fn }
.
.
.
Reading the thread-safe values from within a lock context is simple enough thanks to Bind. e.g.
let lockedInt = ThreadSafeVar(1) // create a thread-safe variable
let context = LockContext()
let wrapperVal =
    context {
        let! i = lockedInt // get the wrapped value inside lockedInt
        return i
    } |> CriticalSection.Lock
But I'm struggling to understand how to implement a means of setting the value from within a LockContext instance. The approach I've taken so far is to implement a custom operation called, for instance, setVal. I haven't included my attempts here, as I'm afraid they'd just muddy the waters. It seems that custom operations operate upon the computation built so far within the expression, encoded as a tuple, but I don't see that this is required in my case.
Any hints, pointers to resources, or direct help would be appreciated.
I'm not at all sure of the wisdom of this, but I came up with something based on the State monad that might work for you. First, define a "stateful" function as one that takes a ThreadSafeVar and returns some type of result:
ThreadSafeVar<'state> -> 'result
We then put that signature into a type that represents a stateful computation:
type Stateful<'state, 'result> =
    MkStateful of (ThreadSafeVar<'state> -> 'result)
Now we need a way to run such a computation safely using a given TSV:
let run (tsv : ThreadSafeVar<_>) (MkStateful f) =
    lock tsv (fun () -> f tsv)
Note that I've gotten rid of your CriticalSection type and instead just lock the TSV itself.
Next, we need a way to lift a pure value into a stateful computation:
let lift value =
    MkStateful (fun _ -> value)
And a way to bind two stateful computations together:
let bind binder stateful =
    // The nested 'run' calls re-acquire the same lock; .NET monitors are
    // re-entrant, so the whole bound computation stays in one critical section.
    MkStateful (fun tsv ->
        run tsv stateful
        |> binder
        |> run tsv)
Defining the builder is then trivial:
type LockContext () =
    member __.Return(value) = lift value
    member __.Bind(stateful, binder) = bind binder stateful

let context = LockContext()
We also need helper computations for setting and getting a value safely:
let getValue =
    MkStateful (fun tsv ->
        tsv.Value)

let setValue value =
    MkStateful (fun tsv ->
        tsv.Value <- value)
Putting it all together, we can define a computation that increments the value of a TSV:
let comp =
    context {
        let! oldValue = getValue
        let newValue = oldValue + 1
        do! setValue newValue
        return newValue
    }
And we can run it like this:
let lockedInt = ThreadSafeVar(1)
let result = comp |> run lockedInt
printfn "%A" result // output is: 2
You can see the full solution and try it yourself here.
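As a small, hypothetical extension of the same pattern (not part of the answer above), a modify helper keeps read-modify-write computations down to a single step:
// Apply a function to the wrapped value; like getValue/setValue, this only
// runs inside the lock taken by 'run'.
let modifyValue f =
    MkStateful (fun tsv ->
        tsv.Value <- f tsv.Value)

let incr =
    context {
        do! modifyValue ((+) 1)
        let! v = getValue
        return v
    }

printfn "%A" (incr |> run (ThreadSafeVar(1))) // prints 2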

How to update the WPF GUI from non-blocking async methods in F#

System.InvalidOperationException: 'The calling thread cannot access this object because a different thread owns it.'
I have a WPF GUI with a button that when clicked does:
starts a control animation (on the GUI), and
starts a background process to obtain the local printer queues.
I do not want to block the main thread (GUI). However, the code I have gives the above error when I try to update the main thread with the results of the background process.
How do I have a background async process update the main thread without a context violation and not blocking the main thread?
open System.Printing
let GetPrinters =
    let LocalPrintServer = new PrintServer()
    let printQueues = LocalPrintServer.GetPrintQueues [|EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections|]
    let printerList =
        printQueues
        |> Seq.cast<PrintQueue>
        |> Seq.toList
    printerList

let GetPrintersAsync() =
    async {
        let! token = Async.StartChild(GetPrinters)
        let! p = token
        return p }
This is the update procedure I'm using:
let asyncUpper =
    async {
        let! printerQues = GetPrintersAsync ()
        return printerQues
    }

// This is where the error is being displayed.
let getprinters (printers: PrintQueue list) =
    printers
    |> List.map (fun pq ->
        { fullname = pq.FullName; comment = pq.Comment; defaultPrintTicket = Some pq.DefaultPrintTicket;
          description = pq.Description; isInError = pq.IsInError; isOffline = pq.IsOffline; Id = Guid.NewGuid() })

{ m with Printers = getprinters; IsRefreshing = false }
Edit #1: The above is a short version of the complete listing. Please see
https://github.com/awaynemd/AsyncAndElmish for the complete source code using Elmish.wpf. Thank you.
I've had a chance to look at your source on GitHub now, and even run it.
The problem is that the print queues are retrieved in an async function, which means on another thread than the GUI thread. The list of queues is then returned to the GUI thread and accessed from there; that's why you get the error message. The queues are only mapped to the Printer type after they have been returned to the GUI thread, and by then it is too late. If you move that mapping into the async instead, the data returned to the GUI thread is a plain Printer list, and there is no crash. I am not one hundred percent sure this is entirely safe, because the record contains a field of type PrintTicket, and the question is whether it's OK to pull that object across to another thread. If you need data from it, maybe it too should be mapped to a record inside the async before being returned to the GUI thread.
While trying to get it running without the error, this is the code I ended up with. I am not that knowledgeable about async either, and I'm not sure whether there's any point using async for this case. But maybe you're just trying out stuff.
| GetPrintersMsg ->
    let getPrinters () = async {
        use ps = new PrintServer()
        return
            ps.GetPrintQueues [| EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections |]
            |> Seq.cast<PrintQueue>
            |> Seq.toList
            |> List.map (fun pq ->
                {
                    Id = Guid.NewGuid()
                    fullname = pq.FullName
                    comment = pq.Comment
                    defaultPrintTicket = Some pq.DefaultPrintTicket
                    description = pq.Description
                    isInError = pq.IsInError
                    isOffline = pq.IsOffline
                })
    }
    m, Cmd.OfAsync.either getPrinters () OnPrintersResult OnPrintersError

| OnPrintersResult printers ->
    { m with Printers = printers; IsRefreshing = false }, Cmd.none
I haven't looked at your code, but I think the basic answer to your question for WPF is the Dispatcher class. You can also use F#'s Async.SwitchToContext. See this SO question, for example.
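For illustration, here is a minimal sketch of the SwitchToContext approach (refreshPrinters and updateUi are hypothetical names, and it assumes the workflow is started on the GUI thread, e.g. with Async.StartImmediate, so that SynchronizationContext.Current is the WPF dispatcher context):
open System.Printing
open System.Threading

let refreshPrinters (updateUi: string list -> unit) = async {
    // Capture the GUI context before leaving the GUI thread.
    let ctx = SynchronizationContext.Current
    do! Async.SwitchToThreadPool ()    // do the slow work off the GUI thread
    use ps = new PrintServer()
    let names =
        ps.GetPrintQueues [| EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections |]
        |> Seq.cast<PrintQueue>
        |> Seq.toList
        |> List.map (fun pq -> pq.FullName)    // map to plain data before crossing threads
    do! Async.SwitchToContext ctx    // hop back to the GUI thread
    updateUi names }    // safe to touch GUI-owned state here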
@BentTranberg actually answered the hard part of this question. I'm posting this as the completed answer since editing the question seems redundant. The code below is Bent's answer with a few modifications; the printers are now being read on a separate thread, as shown by the printfn statements:
| GetPrintersMsg ->
    let getPrinters () = async {
        printfn "1: %i" Thread.CurrentThread.ManagedThreadId
        let getprinters = async {
            printfn "11: %i" Thread.CurrentThread.ManagedThreadId
            use ps = new PrintServer()
            return
                ps.GetPrintQueues [| EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections |]
                |> Seq.cast<PrintQueue>
                |> Seq.toList
                |> List.map (fun pq ->
                    {
                        Id = Guid.NewGuid()
                        fullname = pq.FullName
                        comment = pq.Comment
                        defaultPrintTicket = Some pq.DefaultPrintTicket
                        description = pq.Description
                        isInError = pq.IsInError
                        isOffline = pq.IsOffline
                    }) }
        let! d = getprinters |> Async.StartChild
        return! d
    }
    m, Cmd.OfAsync.either getPrinters () OnPrintersResult OnPrintersError

What are the differences between Lwt.async and Lwt_main.run on OCaml/Node.JS?

I am experimenting with js_of_ocaml and node.js. As you know, node.js makes extensive use of callbacks to implement asynchronous requests without introducing explicit threads.
In OCaml we have a very nice threading library, Lwt, coming with a very useful syntax extension. I wrote a prototype with a binding to some node library (an AWS S3 client) and added an Lwt-ish layer to hide the callbacks.
open Lwt.Infix
open Printf
open Js

let require_module s =
  Js.Unsafe.fun_call
    (Js.Unsafe.js_expr "require")
    [|Js.Unsafe.inject (Js.string s)|]

let _js_aws = require_module "aws-sdk"

let array_to_list a =
  let ax = ref [] in
  begin
    for i = 0 to a##.length - 1 do
      Optdef.iter (array_get a i) (fun x -> ax := x :: !ax)
    done;
    !ax
  end

class type error = object
end

class type bucket = object
  method _Name : js_string t readonly_prop
  method _CreationDate : date t readonly_prop
end

class type listBucketsData = object
  method _Buckets : (bucket t) js_array t readonly_prop
end

class type s3 = object
  method listBuckets :
    (error -> listBucketsData t -> unit) callback -> unit meth
end

let createClient : unit -> s3 t = fun () ->
  let constr_s3 = _js_aws##.S3 in
  new%js constr_s3 ()

module S3 : sig
  type t
  val create : unit -> t
  val list_buckets : t -> (string * string) list Lwt.t
end = struct
  type t = s3 Js.t

  let create () =
    createClient ()

  let list_buckets client =
    let cell_of_bucket_data data =
      ((to_string data##._Name),
       (to_string data##._CreationDate##toString))
    in
    let mvar = Lwt_mvar.create_empty () in
    let callback error buckets =
      let p () =
        if true then
          Lwt_mvar.put mvar
            (`Ok(List.map cell_of_bucket_data (array_to_list buckets##._Buckets)))
        else
          Lwt_mvar.put mvar (`Error("Ups"))
      in
      Lwt.async p
    in
    begin
      client##listBuckets (wrap_callback callback);
      Lwt.bind
        (Lwt_mvar.take mvar)
        (function
          | `Ok(whatever) -> Lwt.return whatever
          | `Error(mesg) -> Lwt.fail_with mesg)
    end
end

let () =
  let s3 = S3.create() in
  let dump lst =
    Lwt_list.iter_s
      (fun (name, creation_date) ->
        printf "%32s\t%s\n" name creation_date;
        Lwt.return_unit)
      lst
  in
  let t () =
    S3.list_buckets s3
    >>= dump
  in
  begin
    Lwt.async t
  end
Since there is no binding to Lwt_main for node.js, I had to run my code with Lwt.async. What is the difference between running the code with Lwt.async rather than with Lwt_main.run (the latter not existing in node.js)? Is it guaranteed that the program will wait until the asynchronous threads have completed before exiting, or is this rather a lucky but random behaviour of my code?
The Lwt_main.run function recursively polls the thread whose execution it supervises. At each iteration, if this thread is still running, the scheduler uses one engine (from Lwt_engine) to execute threads waiting for I/Os, either by selecting or synchronising on events.
The natural way to translate this in Node.JS is to use the process.nextTick method, which relies on Node.JS own scheduler. Implementing the Lwt_main.run function in this case can be as simple as:
let next_tick (callback : unit -> unit) =
  Js.Unsafe.(fun_call
               (js_expr "process.nextTick")
               [| inject (Js.wrap_callback callback) |])

let rec run t =
  Lwt.wakeup_paused ();
  match Lwt.poll t with
  | Some x -> x
  | None -> next_tick (fun () -> run t)
This function only runs threads of type unit Lwt.t, but this is the main case for a program. It is possible to compute arbitrary values by using an Lwt_mvar.t to communicate.
It is also possible to extend this example to support all sorts of hooks, as in the original Lwt_main.run implementation.

Joining on the first finished thread?

I'm writing up a series of graph-searching algorithms in F# and thought it would be nice to take advantage of parallelization. I wanted to execute several threads in parallel and take the result of the first one to finish. I've got an implementation, but it's not pretty.
Two questions: is there a standard name for this sort of function? Not a Join or a JoinAll, but a JoinFirst? Second, is there a more idiomatic way to do this?
//implementation
let makeAsync (locker: obj) (shared: 'a option ref) (f: unit -> 'a) =
    async {
        let result = f()
        Monitor.Enter locker
        shared := Some result
        Monitor.Pulse locker
        Monitor.Exit locker
    }
let firstFinished test work =
    let result = ref Option.None
    let locker = new obj()
    let cancel = new CancellationTokenSource()
    work
    |> List.map (makeAsync locker result)
    |> List.map (fun a -> Async.StartAsTask(a, TaskCreationOptions.None, cancel.Token))
    |> ignore
    Monitor.Enter locker
    while (result.Value.IsNone || (not <| test result.Value.Value)) do
        Monitor.Wait locker |> ignore
    Monitor.Exit locker
    cancel.Cancel()
    match result.Value with
    | Some x -> x
    | None -> failwith "Don't pass in an empty list"
//end implementation
//testing
let delayReturn (ms: int) value =
    fun () ->
        Thread.Sleep ms
        value

let test () =
    let work = [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
    let result = firstFinished (fun _ -> true) work
    printfn "%s" result
Would it work to pass the CancellationTokenSource and test to each async and have the first that computes a valid result cancel the others?
let makeAsync (cancel: CancellationTokenSource) test f =
    let rec loop() =
        async {
            if cancel.IsCancellationRequested then
                return None
            else
                let result = f()
                if test result then
                    cancel.Cancel()
                    return Some result
                else return! loop()
        }
    loop()

let firstFinished test work =
    match work with
    | [] -> invalidArg "work" "Don't pass in an empty list"
    | _ ->
        let cancel = new CancellationTokenSource()
        work
        |> Seq.map (makeAsync cancel test)
        |> Seq.toArray
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Array.pick id
This approach makes several improvements:
1) It uses only async (it's not mixed with Task, which is an alternative for doing the same thing; async is more idiomatic in F#).
2) There's no shared state, other than the CancellationTokenSource, which was designed for that purpose.
3) The clean function-chaining approach makes it easy to add additional logic/transformations to the pipeline, including trivially enabling/disabling parallelism.
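For example, reusing the question's delayReturn helper, the call below should evaluate to "First!" (note that the slower workflow still runs to completion before Async.Parallel returns, since Thread.Sleep is not interrupted by the cancellation token):
let quickest =
    firstFinished (fun _ -> true) [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
// quickest = "First!"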
With the Task Parallel Library in .NET 4, this is called WaitAny. For example, the following snippet creates 10 tasks and waits for any of them to complete:
open System.Threading

Array.init 10 (fun _ ->
    Tasks.Task.Factory.StartNew(fun () ->
        Thread.Sleep 1000))
|> Tasks.Task.WaitAny
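WaitAny only returns the index of the task that finished first. If the result itself is needed, a similar sketch using Task.WhenAny (firstResult is a hypothetical helper; note it does not cancel the losing tasks):
open System.Threading.Tasks

let firstResult (work: (unit -> 'T) list) : 'T =
    let tasks = work |> List.map (fun f -> Task.Run<'T>(fun () -> f ())) |> List.toArray
    let winner = Task.WhenAny(tasks).Result    // blocks until the first task completes
    winner.Result

// e.g. firstResult [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ] returns "First!"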
If you are OK with using Reactive Extensions (Rx) in your project, the joinFirst method can be implemented as:
let joinFirst (f : (unit -> 'a) list) =
    let c = new CancellationTokenSource()
    let o =
        f
        |> List.map (fun i ->
            let j = fun () -> Async.RunSynchronously (async { return i() }, -1, c.Token)
            Observable.Defer(fun () -> Observable.Start(j)))
        |> Observable.Amb
    let r = o.First()
    c.Cancel()
    r
Example usage:
[20..30] |> List.map (fun i -> fun() -> Thread.Sleep(i*100); printfn "%d" i; i)
|> joinFirst |> printfn "Done %A"
Console.Read() |> ignore
Update: using a MailboxProcessor:
type WorkMessage<'a> =
    | Done of 'a
    | GetFirstDone of AsyncReplyChannel<'a>

let joinFirst (f : (unit -> 'a) list) =
    let c = new CancellationTokenSource()
    let m =
        MailboxProcessor<WorkMessage<'a>>.Start(fun mbox -> async {
            let afterDone a m =
                match m with
                | GetFirstDone rc ->
                    rc.Reply(a)
                    Some(async { return () })
                | _ -> None
            let getDone m =
                match m with
                | Done a ->
                    c.Cancel()
                    Some(async { do! mbox.Scan(afterDone a) })
                | _ -> None
            do! mbox.Scan(getDone)
            return () })
    f
    |> List.iter (fun t ->
        try
            Async.RunSynchronously(
                async {
                    let out = t()
                    m.Post(Done out)
                    return () }, -1, c.Token)
        with
        | _ -> ())
    m.PostAndReply(fun rc -> GetFirstDone rc)
Unfortunately, there is no built-in operation for this provided by Async, but I'd still use F# asyncs, because they directly support cancellation. When you start a workflow using Async.Start, you can pass it a cancellation token and the workflow will automatically stop if the token is cancelled.
This means that you have to start the workflows explicitly (instead of using Async.Parallel), so the synchronization must be written by hand. Here is a simple version of an Async.Choice method that does that (at the moment, it doesn't handle exceptions):
open System.Threading

type Microsoft.FSharp.Control.Async with
    /// Takes several asynchronous workflows and returns
    /// the result of the first workflow that successfully completes
    static member Choice(workflows) =
        Async.FromContinuations(fun (cont, _, _) ->
            let cts = new CancellationTokenSource()
            let completed = ref false
            let lockObj = new obj()
            let synchronized f = lock lockObj f

            /// Called when a result is available - the function uses locks
            /// to make sure that it calls the continuation only once
            let completeOnce res =
                let run =
                    synchronized (fun () ->
                        if completed.Value then false
                        else completed := true; true)
                if run then cont res

            /// Workflow that will be started for each argument - run the
            /// operation, cancel pending workflows and then return result
            let runWorkflow workflow = async {
                let! res = workflow
                cts.Cancel()
                completeOnce res }

            // Start all workflows using cancellation token
            for work in workflows do
                Async.Start(runWorkflow work, cts.Token))
Once we write this operation (which is a bit complex, but has to be written only once), solving the problem is quite easy. You can write your operations as async workflows and they'll be cancelled automatically when the first one completes:
let delayReturn n s = async {
    do! Async.Sleep(n)
    printfn "returning %s" s
    return s }
Async.Choice [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
|> Async.RunSynchronously
When you run this, it will print only "returning First!" because the second workflow will be cancelled.

Compiled console command-line program doesn't wait for all the threads finishing

Some of the threads are terminated before they finish if the code is compiled to a console program or run as fsi --use:Program.fs --exec --quiet. Is there any way to wait for all the threads to finish?
This issue can be described as "program exit problem when multiple MailboxProcessers exist".
Output example
(Note the last line is truncated and the last output function (printfn "[Main] after crawl") is never executed.)
[Main] before crawl
[Crawl] before return result
http://news.google.com crawled by agent 1.
[supervisor] reached limit
Agent 5 is done.
http://www.gstatic.com/news/img/favicon.ico crawled by agent 1.
[supervisor] reached limit
Agent 1 is done.
http://www.google.com/imghp?hl=en&tab=ni crawled by agent 4.
[supervisor] reached limit
Agent 4 is done.
http://www.google.com/webhp?hl=en&tab=nw crawled by agent 2.
[supervisor] reached limit
Agent 2 is done.
http://news.google.
Code
Edit: added several System.Threading.Thread.CurrentThread.IsBackground <- false.
open System
open System.Collections.Concurrent
open System.Collections.Generic
open System.IO
open System.Net
open System.Text.RegularExpressions
module Helpers =

    type Message =
        | Done
        | Mailbox of MailboxProcessor<Message>
        | Stop
        | Url of string option
        | Start of AsyncReplyChannel<unit>

    // Gates the number of crawling agents.
    [<Literal>]
    let Gate = 5

    // Extracts links from HTML.
    let extractLinks html =
        let pattern1 = "(?i)href\\s*=\\s*(\"|\')/?((?!#.*|/\B|" +
                       "mailto:|location\.|javascript:)[^\"\']+)(\"|\')"
        let pattern2 = "(?i)^https?"
        let links =
            [
                for x in Regex(pattern1).Matches(html) do
                    yield x.Groups.[2].Value
            ]
            |> List.filter (fun x -> Regex(pattern2).IsMatch(x))
        links

    // Fetches a Web page.
    let fetch (url : string) =
        try
            let req = WebRequest.Create(url) :?> HttpWebRequest
            req.UserAgent <- "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
            req.Timeout <- 5000
            use resp = req.GetResponse()
            let content = resp.ContentType
            let isHtml = Regex("html").IsMatch(content)
            match isHtml with
            | true -> use stream = resp.GetResponseStream()
                      use reader = new StreamReader(stream)
                      let html = reader.ReadToEnd()
                      Some html
            | false -> None
        with
        | _ -> None

    let collectLinks url =
        let html = fetch url
        match html with
        | Some x -> extractLinks x
        | None -> []
open Helpers
// Creates a mailbox that synchronizes printing to the console (so
// that two calls to 'printfn' do not interleave when printing)
let printer =
    MailboxProcessor.Start(fun x -> async {
        while true do
            let! str = x.Receive()
            System.Threading.Thread.CurrentThread.IsBackground <- false
            printfn "%s" str })
// Hides the standard 'printfn' function (formats the string using
// 'kprintf' and then posts the result to the printer agent).
let printfn fmt =
    Printf.kprintf printer.Post fmt
let crawl url limit =
    // Concurrent queue for saving collected urls.
    let q = ConcurrentQueue<string>()
    // Holds crawled URLs.
    let set = HashSet<string>()

    let supervisor =
        MailboxProcessor.Start(fun x -> async {
            System.Threading.Thread.CurrentThread.IsBackground <- false
            // The agent expects to receive 'Start' message first - the message
            // carries a reply channel that is used to notify the caller
            // when the agent completes crawling.
            let! start = x.Receive()
            let repl =
                match start with
                | Start repl -> repl
                | _ -> failwith "Expected Start message!"
            let rec loop run =
                async {
                    let! msg = x.Receive()
                    match msg with
                    | Mailbox(mailbox) ->
                        let count = set.Count
                        if count < limit - 1 && run then
                            let url = q.TryDequeue()
                            match url with
                            | true, str ->
                                if not (set.Contains str) then
                                    let set' = set.Add str
                                    mailbox.Post <| Url(Some str)
                                    return! loop run
                                else
                                    mailbox.Post <| Url None
                                    return! loop run
                            | _ ->
                                mailbox.Post <| Url None
                                return! loop run
                        else
                            printfn "[supervisor] reached limit"
                            // Wait for finishing
                            mailbox.Post Stop
                            return! loop run
                    | Stop -> printfn "[Supervisor] stop"; return! loop false
                    | Start _ -> failwith "Unexpected start message!"
                    | Url _ -> failwith "Unexpected URL message!"
                    | Done ->
                        printfn "[Supervisor] Supervisor is done."
                        (x :> IDisposable).Dispose()
                        // Notify the caller that the agent has completed
                        repl.Reply(())
                }
            do! loop true })

    let urlCollector =
        MailboxProcessor.Start(fun y ->
            let rec loop count =
                async {
                    System.Threading.Thread.CurrentThread.IsBackground <- false
                    let! msg = y.TryReceive(6000)
                    match msg with
                    | Some message ->
                        match message with
                        | Url u ->
                            match u with
                            | Some url ->
                                q.Enqueue url
                                return! loop count
                            | None -> return! loop count
                        | _ ->
                            match count with
                            | Gate ->
                                (y :> IDisposable).Dispose()
                                printfn "[urlCollector] URL collector is done."
                                supervisor.Post Done
                            | _ -> return! loop (count + 1)
                    | None ->
                        supervisor.Post Stop
                        return! loop count
                }
            loop 1)

    /// Initializes a crawling agent.
    let crawler id =
        MailboxProcessor.Start(fun inbox ->
            let rec loop() =
                async {
                    System.Threading.Thread.CurrentThread.IsBackground <- false
                    let! msg = inbox.Receive()
                    match msg with
                    | Url x ->
                        match x with
                        | Some url ->
                            let links = collectLinks url
                            printfn "%s crawled by agent %d." url id
                            for link in links do
                                urlCollector.Post <| Url (Some link)
                            supervisor.Post(Mailbox(inbox))
                            return! loop()
                        | None ->
                            supervisor.Post(Mailbox(inbox))
                            return! loop()
                    | _ ->
                        printfn "Agent %d is done." id
                        urlCollector.Post Done
                        (inbox :> IDisposable).Dispose()
                }
            loop())

    // Send 'Start' message to the main agent. The result
    // is asynchronous workflow that will complete when the
    // agent crawling completes
    let result = supervisor.PostAndAsyncReply(Start)

    // Spawn the crawlers.
    let crawlers =
        [
            for i in 1 .. Gate do
                yield crawler i
        ]

    // Post the first messages.
    crawlers.Head.Post <| Url (Some url)
    crawlers.Tail |> List.iter (fun ag -> ag.Post <| Url None)
    printfn "[Crawl] before return result"
    result
// Example:
printfn "[Main] before crawl"
crawl "http://news.google.com" 5
|> Async.RunSynchronously
printfn "[Main] after crawl"
If I recognize the code correctly, it is based on your previous question (and my answer).
The program waits until the supervisor agent completes (by sending the Start message and then waiting for the reply using RunSynchronously). This should guarantee that the main agent as well as all crawlers complete before the application exits.
The problem is that it doesn't wait until the printer agent completes! So, the last call to the (redefined) printfn function sends a message to the agent and then the application completes without waiting until the printing agent finishes.
As far as I know, there is no "standard pattern" for waiting until an agent completes processing all messages currently in its queue. Some ideas that you could try are:
You could check the CurrentQueueLength property (wait until it is 0), but that still doesn't mean that the agent completed processing all messages.
You could make the agent more complex by adding a new type of message and waiting until the agent replies to that message (just like you're currently waiting for a reply to the Start message).
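A minimal sketch of the second idea for the printer agent (the message names here are illustrative, not from the original code): give the agent an explicit shutdown message carrying a reply channel and block on the reply before the program exits.
type PrinterMessage =
    | Print of string
    | Shutdown of AsyncReplyChannel<unit>

let printer =
    MailboxProcessor.Start(fun inbox ->
        let rec loop () = async {
            let! msg = inbox.Receive()
            match msg with
            | Print s ->
                System.Console.WriteLine s
                return! loop ()
            | Shutdown reply ->
                // Every Print posted before Shutdown has already been dequeued,
                // so it is safe to let the process exit now.
                reply.Reply () }
        loop ())

// At the very end of the program:
// printer.PostAndReply Shutdown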
Caveat that I know zero F#, but typically you wait for all threads of interest using Thread.Join. It looks to me like in your case, you need to wait for anything of interest that's kicked off via a call to .Start.
You could also consider Task Parallel Library which gives you a higher level (simpler) abstraction onto raw managed threads. Example for waiting for tasks to complete here.
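A minimal sketch of that TPL suggestion (doWork is a hypothetical placeholder for the real work):
open System.Threading.Tasks

let doWork (n: int) = System.Console.WriteLine(sprintf "work %d" n)    // hypothetical placeholder
let tasks = [| Task.Run(fun () -> doWork 1); Task.Run(fun () -> doWork 2) |]
Task.WaitAll tasks    // blocks the calling thread until every task has finished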
.NET threads have a property Thread.IsBackground: when it is set to true, the thread will not prevent the process from exiting; when it is set to false, it will prevent the process from exiting. See: http://msdn.microsoft.com/en-us/library/system.threading.thread.isbackground.aspx
The threads that run agents come from the thread pool and therefore have Thread.IsBackground set to true by default.
You might try setting the thread's IsBackground to false each time you read a message. You could add a function to do this for you to make the approach cleaner. It's perhaps not the best solution to the problem, because each time you use a let! you could switch threads, so it would need to be carefully implemented to work properly. I just thought I'd mention it to answer the specific question
Any way to wait for all the threads ending?
and help people understand why certain threads stop the program exiting and other’s didn’t.
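A sketch of the helper alluded to above (receiveForeground is a hypothetical name), which bundles the receive together with the IsBackground tweak so it is applied on whichever pool thread resumes the agent:
let receiveForeground (inbox: MailboxProcessor<'Msg>) = async {
    let! msg = inbox.Receive()
    // Mark the pool thread that resumed this agent as a foreground thread,
    // so it keeps the process alive while the message is handled.
    System.Threading.Thread.CurrentThread.IsBackground <- false
    return msg }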
I think I've sort of solved the problem: adding System.Threading.Thread.CurrentThread.IsBackground <- false after the let! in the printer agent.
However, I tried to modify the original code (the first version before Tomas' AsyncChannel fix) by adding System.Threading.Thread.CurrentThread.IsBackground <- false after all the let! and it still doesn't work. No idea.
Thanks everyone for your help. I can finally start my first F# application for a batch process. I think MailboxProcessor should really set IsBackground to false by default; is there any way to ask Microsoft to change it?
[Update] Just found out that the compiled assembly works well. But fsi --use:Program.fs --exec --quiet is still the same. Is this perhaps a bug in fsi?

Resources