Joining on the first finished thread? - multithreading

I'm writing up a series of graph-searching algorithms in F# and thought it would be nice to take advantage of parallelization. I wanted to execute several threads in parallel and take the result of the first one to finish. I've got an implementation, but it's not pretty.
Two questions: is there a standard name for this sort of function? Not a Join or a JoinAll, but a JoinFirst? Second, is there a more idiomatic way to do this?
//implementation
let makeAsync (locker:obj) (shared:'a option ref) (f:unit->'a) =
async {
let result = f()
Monitor.Enter locker
shared := Some result
Monitor.Pulse locker
Monitor.Exit locker
}
let firstFinished test work =
let result = ref Option.None
let locker = new obj()
let cancel = new CancellationTokenSource()
work |> List.map (makeAsync locker result) |> List.map (fun a-> Async.StartAsTask(a, TaskCreationOptions.None, cancel.Token)) |> ignore
Monitor.Enter locker
while (result.Value.IsNone || (not <| test result.Value.Value)) do
Monitor.Wait locker |> ignore
Monitor.Exit locker
cancel.Cancel()
match result.Value with
| Some x-> x
| None -> failwith "Don't pass in an empty list"
//end implentation
//testing
let delayReturn (ms:int) value =
fun ()->
Thread.Sleep ms
value
let test () =
let work = [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
let result = firstFinished (fun _->true) work
printfn "%s" result

Would it work to pass the CancellationTokenSource and test to each async and have the first that computes a valid result cancel the others?
let makeAsync (cancel:CancellationTokenSource) test f =
let rec loop() =
async {
if cancel.IsCancellationRequested then
return None
else
let result = f()
if test result then
cancel.Cancel()
return Some result
else return! loop()
}
loop()
let firstFinished test work =
match work with
| [] -> invalidArg "work" "Don't pass in an empty list"
| _ ->
let cancel = new CancellationTokenSource()
work
|> Seq.map (makeAsync cancel test)
|> Seq.toArray
|> Async.Parallel
|> Async.RunSynchronously
|> Array.pick id
This approach makes several improvements: 1) it uses only async (it's not mixed with Task, which is an alternative for doing the same thing--async is more idiomatic in F#); 2) there's no shared state, other than CancellationTokenSource, which was designed for that purpose; 3) the clean function-chaining approach makes it easy to add additional logic/transformations to the pipeline, including trivially enabling/disabling parallelism.

With the Task Parallel Library in .NET 4, this is called WaitAny. For example, the following snippet creates 10 tasks and waits for any of them to complete:
open System.Threading
Array.init 10 (fun _ ->
Tasks.Task.Factory.StartNew(fun () ->
Thread.Sleep 1000))
|> Tasks.Task.WaitAny

In case you are ok to use "Reactive extensions (Rx)" in your project, the joinFirst method can be implemented as:
let joinFirst (f : (unit->'a) list) =
let c = new CancellationTokenSource()
let o = f |> List.map (fun i ->
let j = fun() -> Async.RunSynchronously (async {return i() },-1,c.Token)
Observable.Defer(fun() -> Observable.Start(j))
)
|> Observable.Amb
let r = o.First()
c.Cancel()
r
Example usage:
[20..30] |> List.map (fun i -> fun() -> Thread.Sleep(i*100); printfn "%d" i; i)
|> joinFirst |> printfn "Done %A"
Console.Read() |> ignore
Update:
Using Mailbox processor :
type WorkMessage<'a> =
Done of 'a
| GetFirstDone of AsyncReplyChannel<'a>
let joinFirst (f : (unit->'a) list) =
let c = new CancellationTokenSource()
let m = MailboxProcessor<WorkMessage<'a>>.Start(
fun mbox -> async {
let afterDone a m =
match m with
| GetFirstDone rc ->
rc.Reply(a);
Some(async {return ()})
| _ -> None
let getDone m =
match m with
|Done a ->
c.Cancel()
Some (async {
do! mbox.Scan(afterDone a)
})
|_ -> None
do! mbox.Scan(getDone)
return ()
} )
f
|> List.iter(fun t -> try
Async.RunSynchronously (async {let out = t()
m.Post(Done out)
return ()},-1,c.Token)
with
_ -> ())
m.PostAndReply(fun rc -> GetFirstDone rc)

Unfortunately, there is no built-in operation for this provided by Async, but I'd still use F# asyncs, because they directly support cancellation. When you start a workflow using Async.Start, you can pass it a cancellation token and the workflow will automatically stop if the token is cancelled.
This means that you have to start workflows explicitly (instead of using Async.Parallel), so the synchronizataion must be written by hand. Here is a simple version of Async.Choice method that does that (at the moment, it doesn't handle exceptions):
open System.Threading
type Microsoft.FSharp.Control.Async with
/// Takes several asynchronous workflows and returns
/// the result of the first workflow that successfuly completes
static member Choice(workflows) =
Async.FromContinuations(fun (cont, _, _) ->
let cts = new CancellationTokenSource()
let completed = ref false
let lockObj = new obj()
let synchronized f = lock lockObj f
/// Called when a result is available - the function uses locks
/// to make sure that it calls the continuation only once
let completeOnce res =
let run =
synchronized(fun () ->
if completed.Value then false
else completed := true; true)
if run then cont res
/// Workflow that will be started for each argument - run the
/// operation, cancel pending workflows and then return result
let runWorkflow workflow = async {
let! res = workflow
cts.Cancel()
completeOnce res }
// Start all workflows using cancellation token
for work in workflows do
Async.Start(runWorkflow work, cts.Token) )
Once we write this operation (which is a bit complex, but has to be written only once), solving the problem is quite easy. You can write your operations as async workflows and they'll be cancelled automatically when the first one completes:
let delayReturn n s = async {
do! Async.Sleep(n)
printfn "returning %s" s
return s }
Async.Choice [ delayReturn 1000 "First!"; delayReturn 5000 "Second!" ]
|> Async.RunSynchronously
When you run this, it will print only "returning First!" because the second workflow will be cancelled.

Related

How to update the WPF GUI from non-blocking async methods in F#

System.InvalidOperationException: 'The calling thread cannot access this object because a different thread owns it.'
I have a WPF GUI with a button that when clicked does:
starts a control animation (on the GUI), and
starts a background process to obtain the local printer queues.
I do not want to block the main thread (GUI). However, the code I have gives the above error when I try to update the main thread with the results of the background process.
How do I have a background async process update the main thread without a context violation and not blocking the main thread?
open System.Printing
let GetPrinters =
let LocalPrintServer = new PrintServer()
let printQueues = LocalPrintServer.GetPrintQueues [|EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections|]
let printerList =
printQueues
|> Seq.cast<PrintQueue>
|> Seq.toList
printerList
let GetPrintersAsync() =
async {
let! token = Async.StartChild(GetPrinters)
let! p = token
return p }
This is the update procedure I'm using:
let asyncUpper =
async {
let! printerQues = GetPrintersAsync ()
return printerQues
}
// This is where the error is being displayed.
let getprinters (printers:PrintQueue list) =
printers
|> List.map (fun pq -> {fullname = pq.FullName; comment = pq.Comment; defaultPrintTicket= Some pq.DefaultPrintTicket;
description= pq.Description; isInError=pq.IsInError; isOffline=pq.IsOffline; Id= Guid.NewGuid() } )
{ m with Printers = getprinters; IsRefreshing = false }
Edit #1: The above is a short version of the complete listing. Please see
https://github.com/awaynemd/AsyncAndElmish for the complete source code using Elmish.wpf. Thank you.
I've had a chance to look at your source on GitHub now, and even run it.
The problem is that the print queues are retrieved in an async function, which means another thread than the GUI thread. Then the list of queues are returned to the GUI thread, and accessed from there. That's why you get the error message. When the queues are returned to the GUI thread, then they are mapped to the Printer type. This is too late. If you move that mapping into the async instead, then it won't be too late. The data returned to the GUI thread will be Printer list, which is perhaps fine. At least there's no crash. I am not one hundred percent sure if it's ok, because there's a field of type PrintTicket in there, and the question is whether it's safe to pull this across to another thread. If you need data from that object, maybe this too should be mapped to a record in the async before being returned to the GUI thread.
While trying to get it running without the error, this is the code I ended up with. I am not that knowledgeable about async either, and I'm not sure whether there's any point using async for this case. But maybe you're just trying out stuff.
| GetPrintersMsg ->
let getPrinters () = async {
use ps = new PrintServer()
return
ps.GetPrintQueues [| EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections |]
|> Seq.cast<PrintQueue>
|> Seq.toList
|> List.map (fun pq ->
{
Id = Guid.NewGuid()
fullname = pq.FullName
comment = pq.Comment
defaultPrintTicket = Some pq.DefaultPrintTicket
description = pq.Description
isInError = pq.IsInError
isOffline = pq.IsOffline
})
}
m, Cmd.OfAsync.either getPrinters () OnPrintersResult OnPrintersError
| OnPrintersResult printers ->
{ m with Printers = printers; IsRefreshing = false }, Cmd.none
I haven't looked at your code, but I think the basic answer to your question for WPF is the Dispatcher class. You can also use F#'s Async.SwitchToContext. See this SO question, for example.
#BentTranberg actually answered the hard part of this question. I post this as the completed answer since editing the question seems redundant. The below code is Bent's answer with a few modifications. The printers are now being read on a separate thread as seen with the printfn statements:
| GetPrintersMsg ->
let getPrinters () = async {
printfn "1: %i" Thread.CurrentThread.ManagedThreadId
let getprinters = async {
printfn "11: %i" Thread.CurrentThread.ManagedThreadId
use ps = new PrintServer()
return
ps.GetPrintQueues [| EnumeratedPrintQueueTypes.Local; EnumeratedPrintQueueTypes.Connections |]
|> Seq.cast<PrintQueue>
|> Seq.toList
|> List.map (fun pq ->
{
Id = Guid.NewGuid()
fullname = pq.FullName
comment = pq.Comment
defaultPrintTicket = Some pq.DefaultPrintTicket
description = pq.Description
isInError = pq.IsInError
isOffline = pq.IsOffline
}) }
let! d = getprinters |> Async.StartChild
return! d
}
m, Cmd.OfAsync.either getPrinters () OnPrintersResult OnPrintersError

Use a MailboxProcessor with reply-channel to create limited agents that return values in order

Basically, I want to change the following into a limited threading solution, because in my situation the list of calculations is too large, spawning too many threads, and I'd like to experiment and measure performance with less threads.
// the trivial approach (and largely my current situation)
let doWork() =
[1 .. 10]
|> List.map (fun i -> async {
do! Async.Sleep (100 * i) // longest thread will run 1 sec
return i * i // some complex calculation returning a certain type
})
|> Async.Parallel
|> Async.RunSynchronously // works, total wall time 1s
My new approach, this code is borrowed/inspired by this online snippet from Tomas Petricek (which I tested, it works, but I need it to return a value, not unit).
type LimitAgentMessage =
| Start of Async<int> * AsyncReplyChannel<int>
| Finished
let threadingLimitAgent limit = MailboxProcessor.Start(fun inbox -> async {
let queue = System.Collections.Generic.Queue<_>()
let count = ref 0
while true do
let! msg = inbox.Receive()
match msg with
| Start (work, reply) -> queue.Enqueue((work, reply))
| Finished -> decr count
if count.Value < limit && queue.Count > 0 then
incr count
let work, reply = queue.Dequeue()
// Start it in a thread pool (on background)
Async.Start(async {
let! x = work
do! async {reply.Reply x }
inbox.Post(Finished)
})
})
// given a synchronous list of tasks, run each task asynchronously,
// return calculated values in original order
let worker lst =
// this doesn't work as expected, it waits for each reply
let agent = threadingLimitAgent 10
lst
|> List.map(fun x ->
agent.PostAndReply(
fun replyChannel -> Start(x, replyChannel)))
Now, with this in place, the original code would become:
let doWork() =
[1 .. 10]
|> List.map (fun i -> async {
do! Async.Sleep (100 * i) // longest thread will run 1 sec
return i * i // some complex calculation returning a certain type
})
|> worker // worker is not working (correct output, runs 5.5s)
All in all, the output is correct (it does calculate and propagate back the replies), but it does not do so in the (limited set) of threads.
I've been playing around a bit, but think I'm missing the obvious (and besides, who knows, someone may like the idea of a limited-threads mailbox processor that returns its calculations in order).
The problem is the call to agent.PostAndReply. PostAndReply will block until the work has finished. Calling this inside List.map will cause the work to be executed sequentially. One solution is to use PostAndAsyncReply which does not block and also returns you an async handle for getting the result back.
let worker lst =
let agent = threadingLimitAgent 10
lst
|> List.map(fun x ->
agent.PostAndAsyncReply(
fun replyChannel -> Start(x, replyChannel)))
|> Async.Parallel
let doWork() =
[1 .. 10]
|> List.map (fun i -> async {
do! Async.Sleep (100 * i)
return i * i
})
|> worker
|> Async.RunSynchronously
That's of course only one possible solution (getting all async handles back and awaiting them in parallel).

What are the differences between Lwt.async and Lwt_main.run on OCaml/Node.JS?

I am experimenting with js_of_ocaml and node.js. As you know, node.js makes extensive use of callbacks to implement asynchronous requests without introducing explicit threads.
In OCaml we have a very nice threading library, Lwt, coming with a very useful syntax extension. I wrote a prototype with a binding to some node library (a AWS S3 client) and added a lwt-ish layer to hide the callback.
open Lwt.Infix
open Printf
open Js
let require_module s =
Js.Unsafe.fun_call
(Js.Unsafe.js_expr "require")
[|Js.Unsafe.inject (Js.string s)|]
let _js_aws = require_module "aws-sdk"
let array_to_list a =
let ax = ref [] in
begin
for i = 0 to a##.length - 1 do
Optdef.iter (array_get a i) (fun x -> ax := x :: !ax)
done;
!ax
end
class type error = object
end
class type bucket = object
method _Name : js_string t readonly_prop
method _CreationDate : date t readonly_prop
end
class type listBucketsData = object
method _Buckets : (bucket t) js_array t readonly_prop
end
class type s3 = object
method listBuckets :
(error -> listBucketsData t -> unit) callback -> unit meth
end
let createClient : unit -> s3 t = fun () ->
let constr_s3 = _js_aws##.S3 in
new%js constr_s3 ()
module S3 : sig
type t
val create : unit -> t
val list_buckets : t -> (string * string) list Lwt.t
end = struct
type t = s3 Js.t
let create () =
createClient ()
let list_buckets client =
let cell_of_bucket_data data =
((to_string data##._Name),
(to_string data##._CreationDate##toString))
in
let mvar = Lwt_mvar.create_empty () in
let callback error buckets =
let p () =
if true then
Lwt_mvar.put mvar
(`Ok(List.map cell_of_bucket_data ## array_to_list buckets##._Buckets))
else
Lwt_mvar.put mvar (`Error("Ups"))
in
Lwt.async p
in
begin
client##listBuckets (wrap_callback callback);
Lwt.bind
(Lwt_mvar.take mvar)
(function
| `Ok(whatever) -> Lwt.return whatever
| `Error(mesg) -> Lwt.fail_with mesg)
end
end
let () =
let s3 = S3.create() in
let dump lst =
Lwt_list.iter_s
(fun (name, creation_date) ->
printf "%32s\t%s\n" name creation_date;
Lwt.return_unit)
lst
in
let t () =
S3.list_buckets s3
>>= dump
in
begin
Lwt.async t
end
Since there is no binding to Lwt_main for node.js, I had to run my code with Lwt.async. What are the differences between running the code with Lwt.async rather than with Lwt_main.run – the latter not existing in node.js? Is it guaranteed that the program will wait until the asynchronous threads are completed before exiting, or is this rather a lucky but random behaviour of my code?
The Lwt_main.run function recursively polls the thread whose execution it supervises. At each iteration, if this thread is still running, the scheduler uses one engine (from Lwt_engine) to execute threads waiting for I/Os, either by selecting or synchronising on events.
The natural way to translate this in Node.JS is to use the process.nextTick method, which relies on Node.JS own scheduler. Implementing the Lwt_main.run function in this case can be as simple as:
let next_tick (callback : unit -> unit) =
Js.Unsafe.(fun_call
(js_expr "process.nextTick")
[| inject (Js.wrap_callback callback) |])
let rec run t =
Lwt.wakeup_paused ();
match Lwt.poll t with
| Some x -> x
| None -> next_tick (fun () -> run t)
This function only run threads of type unit Lwt.t but this is the main case for a program. It is possible to compute arbitrary values using a Lwt_mvar.t to communicate.
It is also possible to extend this example to support all sort of hooks, as in the original Lwt_main.run implementation.

Compiled console command-line program doesn't wait for all the threads finishing

Some of the threads will be terminated before finished if the code is compiled to a console program or run as fsi --use:Program.fs --exec --quiet. Any way to wait for all the threads ending?
This issue can be described as "program exit problem when multiple MailboxProcessers exist".
Output example
(Note the last line is truncated and the last output function (printfn "[Main] after crawl") is never executed.)
[Main] before crawl
[Crawl] before return result
http://news.google.com crawled by agent 1.
[supervisor] reached limit
Agent 5 is done.
http://www.gstatic.com/news/img/favicon.ico crawled by agent 1.
[supervisor] reached limit
Agent 1 is done.
http://www.google.com/imghp?hl=en&tab=ni crawled by agent 4.
[supervisor] reached limit
Agent 4 is done.
http://www.google.com/webhp?hl=en&tab=nw crawled by agent 2.
[supervisor] reached limit
Agent 2 is done.
http://news.google.
Code
Edit: added several System.Threading.Thread.CurrentThread.IsBackground <- false.
open System
open System.Collections.Concurrent
open System.Collections.Generic
open System.IO
open System.Net
open System.Text.RegularExpressions
module Helpers =
type Message =
| Done
| Mailbox of MailboxProcessor<Message>
| Stop
| Url of string option
| Start of AsyncReplyChannel<unit>
// Gates the number of crawling agents.
[<Literal>]
let Gate = 5
// Extracts links from HTML.
let extractLinks html =
let pattern1 = "(?i)href\\s*=\\s*(\"|\')/?((?!#.*|/\B|" +
"mailto:|location\.|javascript:)[^\"\']+)(\"|\')"
let pattern2 = "(?i)^https?"
let links =
[
for x in Regex(pattern1).Matches(html) do
yield x.Groups.[2].Value
]
|> List.filter (fun x -> Regex(pattern2).IsMatch(x))
links
// Fetches a Web page.
let fetch (url : string) =
try
let req = WebRequest.Create(url) :?> HttpWebRequest
req.UserAgent <- "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
req.Timeout <- 5000
use resp = req.GetResponse()
let content = resp.ContentType
let isHtml = Regex("html").IsMatch(content)
match isHtml with
| true -> use stream = resp.GetResponseStream()
use reader = new StreamReader(stream)
let html = reader.ReadToEnd()
Some html
| false -> None
with
| _ -> None
let collectLinks url =
let html = fetch url
match html with
| Some x -> extractLinks x
| None -> []
open Helpers
// Creates a mailbox that synchronizes printing to the console (so
// that two calls to 'printfn' do not interleave when printing)
let printer =
MailboxProcessor.Start(fun x -> async {
while true do
let! str = x.Receive()
System.Threading.Thread.CurrentThread.IsBackground <- false
printfn "%s" str })
// Hides standard 'printfn' function (formats the string using
// 'kprintf' and then posts the result to the printer agent.
let printfn fmt =
Printf.kprintf printer.Post fmt
let crawl url limit =
// Concurrent queue for saving collected urls.
let q = ConcurrentQueue<string>()
// Holds crawled URLs.
let set = HashSet<string>()
let supervisor =
MailboxProcessor.Start(fun x -> async {
System.Threading.Thread.CurrentThread.IsBackground <- false
// The agent expects to receive 'Start' message first - the message
// carries a reply channel that is used to notify the caller
// when the agent completes crawling.
let! start = x.Receive()
let repl =
match start with
| Start repl -> repl
| _ -> failwith "Expected Start message!"
let rec loop run =
async {
let! msg = x.Receive()
match msg with
| Mailbox(mailbox) ->
let count = set.Count
if count < limit - 1 && run then
let url = q.TryDequeue()
match url with
| true, str -> if not (set.Contains str) then
let set'= set.Add str
mailbox.Post <| Url(Some str)
return! loop run
else
mailbox.Post <| Url None
return! loop run
| _ -> mailbox.Post <| Url None
return! loop run
else
printfn "[supervisor] reached limit"
// Wait for finishing
mailbox.Post Stop
return! loop run
| Stop -> printfn "[Supervisor] stop"; return! loop false
| Start _ -> failwith "Unexpected start message!"
| Url _ -> failwith "Unexpected URL message!"
| Done -> printfn "[Supervisor] Supervisor is done."
(x :> IDisposable).Dispose()
// Notify the caller that the agent has completed
repl.Reply(())
}
do! loop true })
let urlCollector =
MailboxProcessor.Start(fun y ->
let rec loop count =
async {
System.Threading.Thread.CurrentThread.IsBackground <- false
let! msg = y.TryReceive(6000)
match msg with
| Some message ->
match message with
| Url u ->
match u with
| Some url -> q.Enqueue url
return! loop count
| None -> return! loop count
| _ ->
match count with
| Gate -> (y :> IDisposable).Dispose()
printfn "[urlCollector] URL collector is done."
supervisor.Post Done
| _ -> return! loop (count + 1)
| None -> supervisor.Post Stop
return! loop count
}
loop 1)
/// Initializes a crawling agent.
let crawler id =
MailboxProcessor.Start(fun inbox ->
let rec loop() =
async {
System.Threading.Thread.CurrentThread.IsBackground <- false
let! msg = inbox.Receive()
match msg with
| Url x ->
match x with
| Some url ->
let links = collectLinks url
printfn "%s crawled by agent %d." url id
for link in links do
urlCollector.Post <| Url (Some link)
supervisor.Post(Mailbox(inbox))
return! loop()
| None -> supervisor.Post(Mailbox(inbox))
return! loop()
| _ -> printfn "Agent %d is done." id
urlCollector.Post Done
(inbox :> IDisposable).Dispose()
}
loop())
// Send 'Start' message to the main agent. The result
// is asynchronous workflow that will complete when the
// agent crawling completes
let result = supervisor.PostAndAsyncReply(Start)
// Spawn the crawlers.
let crawlers =
[
for i in 1 .. Gate do
yield crawler i
]
// Post the first messages.
crawlers.Head.Post <| Url (Some url)
crawlers.Tail |> List.iter (fun ag -> ag.Post <| Url None)
printfn "[Crawl] before return result"
result
// Example:
printfn "[Main] before crawl"
crawl "http://news.google.com" 5
|> Async.RunSynchronously
printfn "[Main] after crawl"
If I recognize the code correctly, it is based on your previous question (and my answer).
The program waits until the supervisor agent completes (by sending the Start message and then waiting for the reply using RunSynchronously). This should guarantee that the main agent as well as all crawlers complete before the application exits.
The problem is that it doesn't wait until the printer agent completes! So, the last call to the (redefined) printfn function sends a message to the agent and then the application completes without waiting until the printing agent finishes.
As far as I know, there is no "standard pattern" for waiting until agent completes processing all messages currently in the queue. Some ideas that you could try are:
You could check the CurrentQueueLength property (wait until it is 0), but that still doesn't mean that the agent completed processing all messages.
You could make the agent more complex by adding a new type of message and waiting until the agent replies to that message (just like you're currently waiting for a reply to the Start message).
Caveat that I know zero F#, but typically you wait for all threads of interest using Thread.Join. It looks to me like in your case, you need to wait for anything of interest that's kicked off via a call to .Start.
You could also consider Task Parallel Library which gives you a higher level (simpler) abstraction onto raw managed threads. Example for waiting for tasks to complete here.
.NET threads have property Thread.IsBackground when this is set to true a thread will not prevent the process from exiting. When set to false it will prevent the process from exiting. See: http://msdn.microsoft.com/en-us/library/system.threading.thread.isbackground.aspx
The threads that run agents come from the thread pool and therefore have Thread.IsBackground set to false by default.
You might try setting the thread's IsBackground to false each time you read a message. You could add a function to do this for you to make the approach cleaner. It's perhaps not the best solution to the problem as each time you use a let! you could change threads so it would need to be carefully implemented to work properly. I just thought mention it to answer the specific question
Any way to wait for all the threads ending?
and help people understand why certain threads stop the program exiting and other’s didn’t.
I think I've sort of solved the problem: adding System.Threading.Thread.CurrentThread.IsBackground <- false after the let! in the printer agent.
However, I tried to modify the original code (the first version before Tomas' AsyncChannel fix) by adding System.Threading.Thread.CurrentThread.IsBackground <- false after all the let! and it still doesn't work. No idea.
Thanks everyone for your help. I finally can start my first F# application for a batch process. I think MailboxProcessor should really set IsBackground to false by default. Anyway to ask Microsoft to change it.
[Update] Just found out that the compiled assembly works well. But fsi --user:Program --exec --quiet is still the same. It seems a bug of fsi?

F#: I cannot return unit in a do clause and still have side effects

I'm writing a simple ini file parser and I'm having a little problem with the initialization of the object in the "do" clause. It wants me to return a unit but i can't get the blankity function to do the side effects if I try to pipe into an "ignore" or if i return "()" directly.
This code works as a separate function because I can ignore the results.
#light
module Utilities.Config
open System
open System.IO
open System.Text.RegularExpressions
open System.Collections.Generic
type Config(?fileName : string) =
let fileName = defaultArg fileName #"C:\path\myConfigs.ini"
static let defaultSettings =
dict[ "Setting1", "1";
"Setting2", "2";
"Debug", "0";
"State", "Disarray";]
let settingRegex = new Regex(#"\s*(?<key>([^;#=]*[^;#= ]))\s*=\s*(?<value>([^;#]*[^;# ]))")
let fileSettings = new Dictionary<string, string>()
let addFileSetting (groups : GroupCollection) =
fileSettings.Add(groups.Item("key").Value, groups.Item("value").Value)
do File.ReadAllLines(fileName)
|> Seq.map(fun line -> settingRegex.Match(line))
|> Seq.filter(fun mtch -> mtch.Success)
|> Seq.map(fun mtch -> addFileSetting(mtch.Groups) // Does not have the correct return type
//|> ignore //#1 Does not init the dictionary
//() //#2 Does not init the dictionary
//The extra step will work
member c.ReadFile =
File.ReadAllLines(fileName)
|> Seq.map(fun line -> settingRegex.Match(line))
|> Seq.filter(fun mtch -> mtch.Success)
|> Seq.map(fun mtch -> addFileSetting(mtch.Groups))
Use Seq.iter (executing an action for each element - returning unit) instead of Seq.map (transforming elements).
The code doesn't work with ignore because Seq's are evaluated lazily and when you ignore the result, there is no need to run any code at all. Read this article

Resources