How to deal with rollbacks when using the Either monad ("railway-oriented programming") - haskell

I am using F# and Chessie to compose a sequence of tasks (with side effects) that can succeed or fail.
If anything fails, I want to stop executing the remaining tasks and rollback those that have already succeeded.
Unfortunately once I hit the 'failure' path there is no longer a way to retrieve the results of the successful tasks so I can roll them back.
Is there a functional programming "pattern" that deals with this scenario?
Example:
let refuel =
async {
printfn "1 executed"
// Fill missile with fuel
return Result<string,string>.Succeed "1"
} |> AR
let enterLaunchCodes =
async {
printfn "2 executed"
//
return Result<string,string>.FailWith "2"
} |> AR
let fireMissile =
async {
printfn "3 executed"
return Result<string,string>.Succeed "3"
} |> AR
let launchSequence =
asyncTrial {
let! a = refuel
let! b = enterLaunchCodes
let! c = fireMissile
return a,b,c
}
let result = launchSequence
|> Chessie.ErrorHandling.AsyncExtensions.Async.ofAsyncResult
|> Async.RunSynchronously
// Result is a failure... how do I know the results of the successful operations here so I can roll them back?
printfn "Result: %A" result

As people have pointed out in the comments, there are a couple of options that can be used to solve this.
One way is to use compensating transactions.
In this approach, the Success case contains a list of "undo" functions. Every step that can be undone adds a function to this list.
When any step fails, each undo function in the list is executed (in reverse order).
There are more sophisticated ways to do this, of course (e.g. storing the undo functions persistently in case of crashes).
Here's some code that demonstrates this approach:
/// ROP design with compensating transactions
module RopWithUndo =
    type Undo = unit -> unit
    type Result<'success> =
        | Success of 'success * Undo list
        | Failure of string
    let bind f x =
        match x with
        | Failure e -> Failure e
        | Success (s1,undoList1) ->
            match f s1 with
            | Failure e ->
                // undo everything in reverse order
                undoList1 |> List.rev |> List.iter (fun undo -> undo())
                // return the error
                Failure e
            | Success (s2,undoList2) ->
                // concatenate the undo lists
                Success (s2, undoList1 @ undoList2)
/// Example
module LaunchWithUndo =
open RopWithUndo
let undo_refuel() =
printfn "undoing refuel"
let refuel ok =
if ok then
printfn "doing refuel"
Success ("refuel", [undo_refuel])
else
Failure "refuel failed"
let undo_enterLaunchCodes() =
printfn "undoing enterLaunchCodes"
let enterLaunchCodes ok refuelInfo =
if ok then
printfn "doing enterLaunchCodes"
Success ("enterLaunchCodes", [undo_enterLaunchCodes])
else
Failure "enterLaunchCodes failed"
let fireMissile ok launchCodesInfo =
if ok then
printfn "doing fireMissile "
Success ("fireMissile ", [])
else
Failure "fireMissile failed"
// test with failure at refuel
refuel false
|> bind (enterLaunchCodes true)
|> bind (fireMissile true)
(*
val it : Result<string> = Failure "refuel failed"
*)
// test with failure at enterLaunchCodes
refuel true
|> bind (enterLaunchCodes false)
|> bind (fireMissile true)
(*
doing refuel
undoing refuel
val it : Result<string> = Failure "enterLaunchCodes failed"
*)
// test with failure at fireMissile
refuel true
|> bind (enterLaunchCodes true)
|> bind (fireMissile false)
(*
doing refuel
doing enterLaunchCodes
undoing enterLaunchCodes
undoing refuel
val it : Result<string> = Failure "fireMissile failed"
*)
// test with no failure
refuel true
|> bind (enterLaunchCodes true)
|> bind (fireMissile true)
(*
doing refuel
doing enterLaunchCodes
doing fireMissile
val it : Result<string> =
Success ("fireMissile ",[..functions..])
*)
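The same bind can also be wrapped in a computation expression, which reads closer to the asyncTrial block in the question. This is only a sketch: UndoBuilder, undoable, step and the log list are names made up for the illustration, and the Result type and bind are repeated from RopWithUndo so the snippet stands on its own.

```fsharp
type Undo = unit -> unit

type Result<'success> =
    | Success of 'success * Undo list
    | Failure of string

// Same bind as in RopWithUndo
let bind f x =
    match x with
    | Failure e -> Failure e
    | Success (s1, undoList1) ->
        match f s1 with
        | Failure e ->
            undoList1 |> List.rev |> List.iter (fun undo -> undo ())
            Failure e
        | Success (s2, undoList2) ->
            Success (s2, undoList1 @ undoList2)

// Hypothetical builder that routes let! through bind
type UndoBuilder() =
    member _.Bind(x, f) = bind f x
    member _.Return(v) = Success (v, [])
    member _.ReturnFrom(x) = x

let undoable = UndoBuilder()

// Records what happened, standing in for real side effects
let log = System.Collections.Generic.List<string>()

// Hypothetical step: succeeds (and registers an undo) only when ok is true
let step (name: string) ok =
    if ok then
        log.Add("do " + name)
        Success (name, [ fun () -> log.Add("undo " + name) ])
    else
        Failure (name + " failed")

let launch fireOk =
    undoable {
        let! a = step "refuel" true
        let! b = step "enterLaunchCodes" true
        let! c = step "fireMissile" fireOk
        return a, b, c
    }

match launch false with
| Success _ -> printfn "launched"
| Failure e -> printfn "%s, log: %A" e (List.ofSeq log)
```

Inside the builder the binds nest to the right, so each level only holds its own undo function and runs it as the failure travels outward. The undo functions still run last-to-first, as in the piped version.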
If the results of each step cannot be undone, a second option is not to do irreversible things in each step at all,
but to delay the irreversible bits until all steps are OK.
In this approach, the Success case contains a list of "execute" functions. Every step that succeeds adds a function to this list.
At the very end, the entire list of functions is executed.
The downside is that once committed, all the functions are run (although you could also chain those monadically too!)
This is basically a very crude version of the interpreter pattern.
Here's some code that demonstrates this approach:
/// ROP design with delayed executions
module RopWithExec =
    type Execute = unit -> unit
    type Result<'success> =
        | Success of 'success * Execute list
        | Failure of string
    let bind f x =
        match x with
        | Failure e -> Failure e
        | Success (s1,execList1) ->
            match f s1 with
            | Failure e ->
                // return the error
                Failure e
            | Success (s2,execList2) ->
                // concatenate the exec lists
                Success (s2, execList1 @ execList2)
    let execute x =
        match x with
        | Failure e ->
            Failure e
        | Success (s,execList) ->
            execList |> List.iter (fun exec -> exec())
            Success (s,[])
/// Example
module LaunchWithExec =
open RopWithExec
let exec_refuel() =
printfn "refuel"
let refuel ok =
if ok then
printfn "checking if refuelling can be done"
Success ("refuel", [exec_refuel])
else
Failure "refuel failed"
let exec_enterLaunchCodes() =
printfn "entering launch codes"
let enterLaunchCodes ok refuelInfo =
if ok then
printfn "checking if launch codes can be entered"
Success ("enterLaunchCodes", [exec_enterLaunchCodes])
else
Failure "enterLaunchCodes failed"
let exec_fireMissile() =
printfn "firing missile"
let fireMissile ok launchCodesInfo =
if ok then
printfn "checking if missile can be fired"
Success ("fireMissile ", [exec_fireMissile])
else
Failure "fireMissile failed"
// test with failure at refuel
refuel false
|> bind (enterLaunchCodes true)
|> bind (fireMissile true)
|> execute
(*
val it : Result<string> = Failure "refuel failed"
*)
// test with failure at enterLaunchCodes
refuel true
|> bind (enterLaunchCodes false)
|> bind (fireMissile true)
|> execute
(*
checking if refuelling can be done
val it : Result<string> = Failure "enterLaunchCodes failed"
*)
// test with failure at fireMissile
refuel true
|> bind (enterLaunchCodes true)
|> bind (fireMissile false)
|> execute
(*
checking if refuelling can be done
checking if launch codes can be entered
val it : Result<string> = Failure "fireMissile failed"
*)
// test with no failure
refuel true
|> bind (enterLaunchCodes true)
|> bind (fireMissile true)
|> execute
(*
checking if refuelling can be done
checking if launch codes can be entered
checking if missile can be fired
refuel
entering launch codes
firing missile
val it : Result<string> = Success ("fireMissile ",[])
*)
You get the idea, I hope. I'm sure there are other approaches as well -- these are two that are obvious and simple. :)

Related

How find integer in text

Help me figure out how to work with text.
I have a string like "word1 number: word2", for example "result 0: Good" or "result 299: Bad".
I need to print Undefined, Low or High:
When the string is null, print Undefined
When the number is 0-15, print Low
When the number is >15, print High
type GetResponse =
{
MyData: string voption
ErrorMessage: string voption }
val result: Result<GetResponse, MyError>
and then I try:
MyData =
match result with
| Ok value ->
if (value.Messages = null) then
ValueSome "result: Undefined"
else
let result =
value.Messages.FirstOrDefault(
(fun x -> x.ToUpperInvariant().Contains("result")),
"Undefined"
)
if (result <> "Undefined") then
ValueSome result
else
errors.Add("We don't have any result")
ValueNone
| Error err ->
errors.Add(err.ToErrorString)
ValueNone
ErrorMessage =
if errors.Any() then
(errors |> String.concat ", " |> ValueSome)
else
ValueNone
But I don't know how to check the number in the string. Is there some way to print this without a billion ifs?
Parsing gets complex very quickly. I recommend using FParsec to simplify the logic and avoid errors. A basic parser that seems to meet your needs:
open System
open FParsec
let parseWord =
    manySatisfy Char.IsLetter
let parseValue =
    parseWord             // parse any word (e.g. "result")
    >>. spaces1           // skip whitespace
    >>. puint32           // parse an unsigned integer value
    .>> skipChar ':'      // skip colon character
    .>> spaces            // skip whitespace
    .>> parseWord         // parse any word (e.g. "Good")
You can then use it like this:
type ParserResult = Undefined | Low | High
let parse str =
    if isNull str then Result.Ok Undefined
    else
        match run parseValue str with
        | Success (num, _ , _) ->
            if num <= 15u then Result.Ok Low
            else Result.Ok High
        | Failure (errorMsg, _, _) ->
            Result.Error errorMsg
parse null |> printfn "%A" // Ok Undefined
parse "result 0: Good" |> printfn "%A" // Ok Low
parse "result 299: Bad" |> printfn "%A" // Ok High
parse "invalid input" |> printfn "%A" // Error "Error in Ln: 1 Col: 9 ... Expecting: integer number"
There's definitely a learning curve with FParsec, but I think it's worth adding to your toolbelt.
I agree with Brian that parsing can become quite tricky very quickly. However if you have some well established format of the input and you're not very much into writing complex parsers, good old regular expressions can be of service ;)
Here is my take on the problem - please note that it has plenty of room to improve, this is just a proof of concept:
open System.Text.RegularExpressions
let test1 = "result 0: Good"
let test2 = "result 299: Bad"
let test3 = "some other text"
type ParserResult =
| Undefined
| Low of int
| High of int
let (|ValidNumber|_|) s =
//https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex?view=net-6.0
let rx = new Regex("(\w\s+)(\d+)\:(\s+\w)")
let matches = rx.Matches(s)
if matches.Count > 0 then
let groups = matches.[0].Groups |> Seq.toList
match groups with
| [_; _; a; _] -> Some (int a.Value)
| _ -> None
else
None
let parseMyString str =
match str with
| ValidNumber n when n < 16 -> Low n
| ValidNumber n -> High n
| _ -> Undefined
//let r = parseMyString test1
printfn "%A" (parseMyString test1)
printfn "%A" (parseMyString test2)
printfn "%A" (parseMyString test3)
The active pattern ValidNumber returns Some number if the input string matches, otherwise it returns None. The parseMyString function uses the pattern and guards to build the final ParserResult value.
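One thing to watch with the regex route: Regex.Matches throws on a null input, and int throws on a number too large for an Int32, while the question wants null to map to Undefined. Here is a minimal sketch that covers both cases. The names Level, numberPattern and classify are made up for the example, and treating unparseable text as Undefined is an assumption carried over from the answer above.

```fsharp
open System
open System.Text.RegularExpressions

type Level =
    | Undefined
    | Low
    | High

// word, whitespace, digits, colon: the "result 299:" part of "result 299: Bad"
let numberPattern = Regex(@"\w+\s+(?<n>\d+):")

let classify (text: string) =
    if isNull text then
        Undefined
    else
        let m = numberPattern.Match text
        // TryParse returns false for an empty group or an out-of-range number
        match m.Success, Int32.TryParse(m.Groups.["n"].Value) with
        | true, (true, n) when n <= 15 -> Low
        | true, (true, _) -> High
        | _ -> Undefined

[ null; "result 0: Good"; "result 299: Bad"; "some other text" ]
|> List.iter (fun s -> printfn "%A" (classify s))
```

The named group keeps the match logic independent of how many other groups the pattern has, which is where the positional list pattern in ValidNumber is fragile.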

F# - Error Loading Table With Empty Last Row Using HtmlProvider

When loading a Wikipedia table using HtmlProvider, I get an error message because the last row in the table is empty!
module SOQN =
open System
open FSharp.Data
let [<Literal>] wikiUrl = @"https://en.wikipedia.org/wiki/COVID-19_testing#Virus_testing_statistics_by_country"
type Covid = HtmlProvider<wikiUrl>
let main() =
printfn ""
printfn "SOQN: Error Loading Table With Empty Last Row Using HtmlProvider?"
printfn ""
let feed = Covid.Load(wikiUrl)
feed.Tables.``Virus testing statistics by country``.Rows
|> Seq.map (fun r -> r.Date)
|> printf "%A "
printfn ""
0
[<EntryPoint>]
main() |> ignore
printfn "Fini!"
printfn ""
// Actual Output:
// "Date is missing"
//
// Expected Output:
// seq [ "Albania"; "19 Apr"; "5542"; "562"; "10.1"; "1,936"; "196"; "[121]" ]
// ...
//
What am I missing?
For instance, can I preset the column types to 'string' similar to using 'schema' with CsvProvider?

F# MailboxProcessor memory leak in try/catch block

Updated after obvious error pointed out by John Palmer in the comments.
The following code results in OutOfMemoryException:
let agent = MailboxProcessor<string>.Start(fun agent ->
let maxLength = 1000
let rec loop (state: string list) i = async {
let! msg = agent.Receive()
try
printfn "received message: %s, iteration: %i, length: %i" msg i state.Length
let newState = state |> Seq.truncate maxLength |> Seq.toList
return! loop (msg::newState) (i+1)
with
| ex ->
printfn "%A" ex
return! loop state (i+1)
}
loop [] 0
)
let greeting = "hello"
while true do
agent.Post greeting
System.Threading.Thread.Sleep(1) // avoid piling up greetings before they are output
The error is gone if I don't use try/catch block.
Increasing the sleep time only postpones the error.
Update 2: I guess the issue here is that the function stops being tail recursive as the recursive call is no longer the last one to execute. Would be nice for somebody with more F# experience to desugar it as I'm sure this is a common memory-leak situation in F# agents as the code is very simple and generic.
Solution:
It turned out to be part of a bigger problem: the function can't be tail-recursive if the recursive call is made within a try/catch block, because it has to be able to unwind the stack if an exception is thrown and thus has to save call stack information.
More details here:
Tail recursion and exceptions in F#
Properly rewritten code (separate try/catch and return):
let agent = MailboxProcessor<string>.Start(fun agent ->
    let maxLength = 1000
    let rec loop (state: string list) i = async {
        let! msg = agent.Receive()
        let newState =
            try
                printfn "received message: %s, iteration: %i, length: %i" msg i state.Length
                let truncatedState = state |> Seq.truncate maxLength |> Seq.toList
                msg::truncatedState
            with
            | ex ->
                printfn "%A" ex
                state
        return! loop newState (i+1)
    }
    loop [] 0
)
I suspect the issue is actually here:
while true do
agent.Post "hello"
All the "hello"s that you post have to be stored in memory somewhere and will be pushed much faster than the output can happen with printf
See my old post here http://vaskir.blogspot.ru/2013/02/recursion-and-trywithfinally-blocks.html
Basically anything that is done after the return (like a try/with/finally/dispose) will prevent tail calls.
See https://blogs.msdn.microsoft.com/fsharpteam/2011/07/08/tail-calls-in-f/
There is also work underway to have the compiler warn about lack of tail recursion: https://github.com/fsharp/fslang-design/issues/82
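To check that the rewritten shape really keeps the state bounded, one option is to give the agent a query message. This is a sketch under my own assumptions: the Msg type, the Count case and the small maxLength are invented for the example and are not part of the original code.

```fsharp
type Msg =
    | Add of string
    | Count of AsyncReplyChannel<int>

let maxLength = 10

let agent = MailboxProcessor<Msg>.Start(fun inbox ->
    let rec loop (state: string list) = async {
        let! msg = inbox.Receive()
        match msg with
        | Add text ->
            // try/with only wraps the computation of the new state ...
            let newState =
                try
                    text :: List.truncate (maxLength - 1) state
                with ex ->
                    eprintfn "%A" ex
                    state
            // ... so return! sits outside it, in tail position
            return! loop newState
        | Count reply ->
            reply.Reply(state.Length)
            return! loop state }
    loop [])

for i in 1 .. 10000 do
    agent.Post(Add (string i))

// Messages are handled in order, so this sees the state after all the Adds
let finalLength = agent.PostAndReply(Count)
printfn "state length: %d" finalLength
```

Because the reply comes back only after every earlier message has been handled, PostAndReply also gives the producer a natural point to wait at instead of posting without limit.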

Joining on the first finished thread?

I'm writing up a series of graph-searching algorithms in F# and thought it would be nice to take advantage of parallelization. I wanted to execute several threads in parallel and take the result of the first one to finish. I've got an implementation, but it's not pretty.
Two questions: is there a standard name for this sort of function? Not a Join or a JoinAll, but a JoinFirst? Second, is there a more idiomatic way to do this?
//implementation
let makeAsync (locker:obj) (shared:'a option ref) (f:unit->'a) =
async {
let result = f()
Monitor.Enter locker
shared := Some result
Monitor.Pulse locker
Monitor.Exit locker
}
let firstFinished test work =
let result = ref Option.None
let locker = new obj()
let cancel = new CancellationTokenSource()
work |> List.map (makeAsync locker result) |> List.map (fun a-> Async.StartAsTask(a, TaskCreationOptions.None, cancel.Token)) |> ignore
Monitor.Enter locker
while (result.Value.IsNone || (not <| test result.Value.Value)) do
Monitor.Wait locker |> ignore
Monitor.Exit locker
cancel.Cancel()
match result.Value with
| Some x-> x
| None -> failwith "Don't pass in an empty list"
//end implementation
//testing
let delayReturn (ms:int) value =
fun ()->
Thread.Sleep ms
value
let test () =
let work = [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
let result = firstFinished (fun _->true) work
printfn "%s" result
Would it work to pass the CancellationTokenSource and test to each async and have the first that computes a valid result cancel the others?
let makeAsync (cancel:CancellationTokenSource) test f =
    let rec loop() =
        async {
            if cancel.IsCancellationRequested then
                return None
            else
                let result = f()
                if test result then
                    cancel.Cancel()
                    return Some result
                else return! loop()
        }
    loop()
let firstFinished test work =
    match work with
    | [] -> invalidArg "work" "Don't pass in an empty list"
    | _ ->
        let cancel = new CancellationTokenSource()
        work
        |> Seq.map (makeAsync cancel test)
        |> Seq.toArray
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Array.pick id
This approach makes several improvements: 1) it uses only async (it's not mixed with Task, which is an alternative for doing the same thing--async is more idiomatic in F#); 2) there's no shared state, other than CancellationTokenSource, which was designed for that purpose; 3) the clean function-chaining approach makes it easy to add additional logic/transformations to the pipeline, including trivially enabling/disabling parallelism.
With the Task Parallel Library in .NET 4, this is called WaitAny. For example, the following snippet creates 10 tasks and waits for any of them to complete:
open System.Threading
Array.init 10 (fun _ ->
Tasks.Task.Factory.StartNew(fun () ->
Thread.Sleep 1000))
|> Tasks.Task.WaitAny
In case you are ok to use "Reactive extensions (Rx)" in your project, the joinFirst method can be implemented as:
let joinFirst (f : (unit->'a) list) =
let c = new CancellationTokenSource()
let o = f |> List.map (fun i ->
let j = fun() -> Async.RunSynchronously (async {return i() },-1,c.Token)
Observable.Defer(fun() -> Observable.Start(j))
)
|> Observable.Amb
let r = o.First()
c.Cancel()
r
Example usage:
[20..30] |> List.map (fun i -> fun() -> Thread.Sleep(i*100); printfn "%d" i; i)
|> joinFirst |> printfn "Done %A"
Console.Read() |> ignore
Update:
Using MailboxProcessor:
type WorkMessage<'a> =
Done of 'a
| GetFirstDone of AsyncReplyChannel<'a>
let joinFirst (f : (unit->'a) list) =
let c = new CancellationTokenSource()
let m = MailboxProcessor<WorkMessage<'a>>.Start(
fun mbox -> async {
let afterDone a m =
match m with
| GetFirstDone rc ->
rc.Reply(a);
Some(async {return ()})
| _ -> None
let getDone m =
match m with
|Done a ->
c.Cancel()
Some (async {
do! mbox.Scan(afterDone a)
})
|_ -> None
do! mbox.Scan(getDone)
return ()
} )
f
|> List.iter(fun t -> try
Async.RunSynchronously (async {let out = t()
m.Post(Done out)
return ()},-1,c.Token)
with
_ -> ())
m.PostAndReply(fun rc -> GetFirstDone rc)
Unfortunately, there is no built-in operation for this provided by Async, but I'd still use F# asyncs, because they directly support cancellation. When you start a workflow using Async.Start, you can pass it a cancellation token and the workflow will automatically stop if the token is cancelled.
This means that you have to start workflows explicitly (instead of using Async.Parallel), so the synchronization must be written by hand. Here is a simple version of an Async.Choice method that does that (at the moment, it doesn't handle exceptions):
open System.Threading
type Microsoft.FSharp.Control.Async with
    /// Takes several asynchronous workflows and returns
    /// the result of the first workflow that successfully completes
    static member Choice(workflows) =
        Async.FromContinuations(fun (cont, _, _) ->
            let cts = new CancellationTokenSource()
            let completed = ref false
            let lockObj = new obj()
            let synchronized f = lock lockObj f
            /// Called when a result is available - the function uses locks
            /// to make sure that it calls the continuation only once
            let completeOnce res =
                let run =
                    synchronized(fun () ->
                        if completed.Value then false
                        else completed := true; true)
                if run then cont res
            /// Workflow that will be started for each argument - run the
            /// operation, cancel pending workflows and then return result
            let runWorkflow workflow = async {
                let! res = workflow
                cts.Cancel()
                completeOnce res }
            // Start all workflows using cancellation token
            for work in workflows do
                Async.Start(runWorkflow work, cts.Token) )
Once we write this operation (which is a bit complex, but has to be written only once), solving the problem is quite easy. You can write your operations as async workflows and they'll be cancelled automatically when the first one completes:
let delayReturn n s = async {
    do! Async.Sleep(n)
    printfn "returning %s" s
    return s }
Async.Choice [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
|> Async.RunSynchronously
When you run this, it will print only "returning First!" because the second workflow will be cancelled.
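For readers on a newer FSharp.Core: as far as I know, versions from 4.0 onward ship a built-in Async.Choice, with a slightly different shape from the hand-written one above. It takes workflows that return an option and yields the first Some result, cancelling the rest. A minimal sketch, assuming the hand-written extension is not also in scope; the delays are arbitrary:

```fsharp
// Each workflow returns an option; None would mean "no usable answer"
let delayReturn (ms: int) (s: string) = async {
    do! Async.Sleep ms
    return Some s }

let first =
    [ delayReturn 100 "First!"; delayReturn 3000 "Second!" ]
    |> Async.Choice
    |> Async.RunSynchronously

printfn "%A" first   // Some "First!"
```

The option wrapper plays the role of the test function in the question: a workflow that finishes with a result that should not count can return None and the others keep running.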

Compiled console command-line program doesn't wait for all the threads finishing

Some of the threads will be terminated before finished if the code is compiled to a console program or run as fsi --use:Program.fs --exec --quiet. Any way to wait for all the threads ending?
This issue can be described as "program exit problem when multiple MailboxProcessors exist".
Output example
(Note the last line is truncated and the last output function (printfn "[Main] after crawl") is never executed.)
[Main] before crawl
[Crawl] before return result
http://news.google.com crawled by agent 1.
[supervisor] reached limit
Agent 5 is done.
http://www.gstatic.com/news/img/favicon.ico crawled by agent 1.
[supervisor] reached limit
Agent 1 is done.
http://www.google.com/imghp?hl=en&tab=ni crawled by agent 4.
[supervisor] reached limit
Agent 4 is done.
http://www.google.com/webhp?hl=en&tab=nw crawled by agent 2.
[supervisor] reached limit
Agent 2 is done.
http://news.google.
Code
Edit: added several System.Threading.Thread.CurrentThread.IsBackground <- false.
open System
open System.Collections.Concurrent
open System.Collections.Generic
open System.IO
open System.Net
open System.Text.RegularExpressions
module Helpers =
type Message =
| Done
| Mailbox of MailboxProcessor<Message>
| Stop
| Url of string option
| Start of AsyncReplyChannel<unit>
// Gates the number of crawling agents.
[<Literal>]
let Gate = 5
// Extracts links from HTML.
let extractLinks html =
let pattern1 = "(?i)href\\s*=\\s*(\"|\')/?((?!#.*|/\B|" +
"mailto:|location\.|javascript:)[^\"\']+)(\"|\')"
let pattern2 = "(?i)^https?"
let links =
[
for x in Regex(pattern1).Matches(html) do
yield x.Groups.[2].Value
]
|> List.filter (fun x -> Regex(pattern2).IsMatch(x))
links
// Fetches a Web page.
let fetch (url : string) =
try
let req = WebRequest.Create(url) :?> HttpWebRequest
req.UserAgent <- "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
req.Timeout <- 5000
use resp = req.GetResponse()
let content = resp.ContentType
let isHtml = Regex("html").IsMatch(content)
match isHtml with
| true -> use stream = resp.GetResponseStream()
use reader = new StreamReader(stream)
let html = reader.ReadToEnd()
Some html
| false -> None
with
| _ -> None
let collectLinks url =
let html = fetch url
match html with
| Some x -> extractLinks x
| None -> []
open Helpers
// Creates a mailbox that synchronizes printing to the console (so
// that two calls to 'printfn' do not interleave when printing)
let printer =
MailboxProcessor.Start(fun x -> async {
while true do
let! str = x.Receive()
System.Threading.Thread.CurrentThread.IsBackground <- false
printfn "%s" str })
// Hides standard 'printfn' function (formats the string using
// 'kprintf' and then posts the result to the printer agent.
let printfn fmt =
Printf.kprintf printer.Post fmt
let crawl url limit =
// Concurrent queue for saving collected urls.
let q = ConcurrentQueue<string>()
// Holds crawled URLs.
let set = HashSet<string>()
let supervisor =
MailboxProcessor.Start(fun x -> async {
System.Threading.Thread.CurrentThread.IsBackground <- false
// The agent expects to receive 'Start' message first - the message
// carries a reply channel that is used to notify the caller
// when the agent completes crawling.
let! start = x.Receive()
let repl =
match start with
| Start repl -> repl
| _ -> failwith "Expected Start message!"
let rec loop run =
async {
let! msg = x.Receive()
match msg with
| Mailbox(mailbox) ->
let count = set.Count
if count < limit - 1 && run then
let url = q.TryDequeue()
match url with
| true, str -> if not (set.Contains str) then
let set'= set.Add str
mailbox.Post <| Url(Some str)
return! loop run
else
mailbox.Post <| Url None
return! loop run
| _ -> mailbox.Post <| Url None
return! loop run
else
printfn "[supervisor] reached limit"
// Wait for finishing
mailbox.Post Stop
return! loop run
| Stop -> printfn "[Supervisor] stop"; return! loop false
| Start _ -> failwith "Unexpected start message!"
| Url _ -> failwith "Unexpected URL message!"
| Done -> printfn "[Supervisor] Supervisor is done."
(x :> IDisposable).Dispose()
// Notify the caller that the agent has completed
repl.Reply(())
}
do! loop true })
let urlCollector =
MailboxProcessor.Start(fun y ->
let rec loop count =
async {
System.Threading.Thread.CurrentThread.IsBackground <- false
let! msg = y.TryReceive(6000)
match msg with
| Some message ->
match message with
| Url u ->
match u with
| Some url -> q.Enqueue url
return! loop count
| None -> return! loop count
| _ ->
match count with
| Gate -> (y :> IDisposable).Dispose()
printfn "[urlCollector] URL collector is done."
supervisor.Post Done
| _ -> return! loop (count + 1)
| None -> supervisor.Post Stop
return! loop count
}
loop 1)
/// Initializes a crawling agent.
let crawler id =
MailboxProcessor.Start(fun inbox ->
let rec loop() =
async {
System.Threading.Thread.CurrentThread.IsBackground <- false
let! msg = inbox.Receive()
match msg with
| Url x ->
match x with
| Some url ->
let links = collectLinks url
printfn "%s crawled by agent %d." url id
for link in links do
urlCollector.Post <| Url (Some link)
supervisor.Post(Mailbox(inbox))
return! loop()
| None -> supervisor.Post(Mailbox(inbox))
return! loop()
| _ -> printfn "Agent %d is done." id
urlCollector.Post Done
(inbox :> IDisposable).Dispose()
}
loop())
// Send 'Start' message to the main agent. The result
// is asynchronous workflow that will complete when the
// agent crawling completes
let result = supervisor.PostAndAsyncReply(Start)
// Spawn the crawlers.
let crawlers =
[
for i in 1 .. Gate do
yield crawler i
]
// Post the first messages.
crawlers.Head.Post <| Url (Some url)
crawlers.Tail |> List.iter (fun ag -> ag.Post <| Url None)
printfn "[Crawl] before return result"
result
// Example:
printfn "[Main] before crawl"
crawl "http://news.google.com" 5
|> Async.RunSynchronously
printfn "[Main] after crawl"
If I recognize the code correctly, it is based on your previous question (and my answer).
The program waits until the supervisor agent completes (by sending the Start message and then waiting for the reply using RunSynchronously). This should guarantee that the main agent as well as all crawlers complete before the application exits.
The problem is that it doesn't wait until the printer agent completes! So, the last call to the (redefined) printfn function sends a message to the agent and then the application completes without waiting until the printing agent finishes.
As far as I know, there is no "standard pattern" for waiting until agent completes processing all messages currently in the queue. Some ideas that you could try are:
You could check the CurrentQueueLength property (wait until it is 0), but that still doesn't mean that the agent completed processing all messages.
You could make the agent more complex by adding a new type of message and waiting until the agent replies to that message (just like you're currently waiting for a reply to the Start message).
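The second idea could look roughly like this. It is a sketch rather than a drop-in patch for the crawler: the PrinterMessage type and its Flush case are names I made up, and replying with the number of printed lines is just a convenient way to show that the queue was drained.

```fsharp
type PrinterMessage =
    | Print of string
    | Flush of AsyncReplyChannel<int>

let printer =
    MailboxProcessor<PrinterMessage>.Start(fun inbox ->
        let rec loop printedCount = async {
            let! msg = inbox.Receive()
            match msg with
            | Print line ->
                stdout.WriteLine(line)
                return! loop (printedCount + 1)
            | Flush reply ->
                // Messages are handled in order, so everything posted
                // before Flush has already been written at this point
                reply.Reply(printedCount)
                return! loop printedCount }
        loop 0)

// Hides the standard 'printfn', as in the question
let printfn fmt =
    Printf.kprintf (fun s -> printer.Post(Print s)) fmt

printfn "[Main] before crawl"
printfn "[Main] after crawl"

// Block until the printer has caught up; only then let the program exit
let linesPrinted = printer.PostAndReply(Flush)
```

The final PostAndReply is the only blocking call, and it belongs at the very end of the main program, after the crawl has reported completion.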
Caveat that I know zero F#, but typically you wait for all threads of interest using Thread.Join. It looks to me like in your case, you need to wait for anything of interest that's kicked off via a call to .Start.
You could also consider Task Parallel Library which gives you a higher level (simpler) abstraction onto raw managed threads. Example for waiting for tasks to complete here.
.NET threads have a property, Thread.IsBackground. When it is set to true, the thread will not prevent the process from exiting; when it is set to false, it will. See: http://msdn.microsoft.com/en-us/library/system.threading.thread.isbackground.aspx
The threads that run agents come from the thread pool and therefore have Thread.IsBackground set to true by default.
You might try setting the thread's IsBackground to false each time you read a message. You could add a function that does this for you to make the approach cleaner. It's perhaps not the best solution to the problem, as each let! can resume on a different thread, so it would need to be implemented carefully to work properly. I just thought I'd mention it to answer the specific question
Any way to wait for all the threads ending?
and to help people understand why certain threads stop the program from exiting and others don't.
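A quick way to see the difference for yourself, as a sketch: the value names are mine, and Async.SwitchToThreadPool is used only to make sure the check runs on a pool thread.

```fsharp
open System.Threading

// Ask a thread-pool thread about itself
let poolThreadIsBackground =
    async {
        do! Async.SwitchToThreadPool()
        let t = Thread.CurrentThread
        return t.IsThreadPoolThread && t.IsBackground }
    |> Async.RunSynchronously

// A thread created by hand is a foreground thread unless told otherwise
let manualThreadIsBackground =
    let t = Thread(ThreadStart(fun () -> ()))
    t.IsBackground

printfn "pool thread is background: %b" poolThreadIsBackground
printfn "manual thread is background: %b" manualThreadIsBackground
```

So an agent's work, which runs on pool threads, does not keep the process alive once the main thread returns.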
I think I've sort of solved the problem: adding System.Threading.Thread.CurrentThread.IsBackground <- false after the let! in the printer agent.
However, I tried to modify the original code (the first version before Tomas' AsyncChannel fix) by adding System.Threading.Thread.CurrentThread.IsBackground <- false after all the let! and it still doesn't work. No idea.
Thanks everyone for your help. I can finally start my first F# application for a batch process. I think MailboxProcessor should really set IsBackground to false by default. Is there any way to ask Microsoft to change it?
[Update] Just found out that the compiled assembly works well. But fsi --use:Program.fs --exec --quiet still behaves the same. Is it a bug in fsi?

Resources