Why is this Haskell program actually concurrent?

I am learning about simple uses of forkIO and MVar from Marlow's book, Parallel and Concurrent Programming in Haskell. In the section "MVar as a Container for Shared State", the following code is presented:
-- Creates a new Map within an MVar
new :: IO PhoneBookState
new = do
  m <- newMVar Map.empty
  return (PhoneBookState m)
-- Inserts a Key, Value in the Map, locking the MVar
insert :: PhoneBookState -> Name -> PhoneNumber -> IO ()
insert (PhoneBookState m) name number = do
  book <- takeMVar m
  putMVar m (Map.insert name number book)
-- Retrieves the Map, locking the MVar briefly
lookup :: PhoneBookState -> Name -> IO (Maybe PhoneNumber)
lookup (PhoneBookState m) name = do
  book <- takeMVar m
  putMVar m book
  return (Map.lookup name book)
Here PhoneBookState holds an MVar (Map Name PhoneNumber). My problem is with the main function:
main = do
  s <- new
  sequence_ [ insert s ("name" ++ show n) (show n) | n <- [1..10000] ]
  lookup s "name999" >>= print
  lookup s "unknown" >>= print
The idea is that both lookups happen concurrently, but in that case, shouldn't it be something like this?
main = do
  s <- new
  sequence_ [ insert s ("name" ++ show n) (show n) | n <- [1..10000] ]
  forkIO (lookup s "name999" >>= print)
  lookup s "unknown" >>= print
What am I missing?
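For reference, here is a minimal sketch of what a forked variant would need in order to run reliably. It assumes the PhoneBookState definitions quoted above (repeated so the block compiles on its own); the lookupBoth helper and the done MVar are hypothetical names introduced for illustration, not taken from the book. The point it illustrates is that a thread started with forkIO is killed when main returns, so main has to wait for the forked lookup explicitly.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import qualified Data.Map as Map
import Prelude hiding (lookup)

type Name        = String
type PhoneNumber = String

newtype PhoneBookState = PhoneBookState (MVar (Map.Map Name PhoneNumber))

new :: IO PhoneBookState
new = PhoneBookState <$> newMVar Map.empty

insert :: PhoneBookState -> Name -> PhoneNumber -> IO ()
insert (PhoneBookState m) name number = do
  book <- takeMVar m
  putMVar m (Map.insert name number book)

lookup :: PhoneBookState -> Name -> IO (Maybe PhoneNumber)
lookup (PhoneBookState m) name = do
  book <- takeMVar m
  putMVar m book
  return (Map.lookup name book)

-- Hypothetical helper (not from the book): run one lookup in a forked
-- thread and one in the calling thread, then wait for the forked result.
lookupBoth :: PhoneBookState -> Name -> Name
           -> IO (Maybe PhoneNumber, Maybe PhoneNumber)
lookupBoth s a b = do
  done <- newEmptyMVar
  _  <- forkIO (lookup s a >>= putMVar done)
  rb <- lookup s b
  ra <- takeMVar done   -- blocks until the forked thread has delivered its result
  return (ra, rb)

main :: IO ()
main = do
  s <- new
  mapM_ (\n -> insert s ("name" ++ show n) (show n)) [1 .. 10000 :: Int]
  lookupBoth s "name999" "unknown" >>= print
```

Without the takeMVar on done, the program could exit before the forked lookup prints anything.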

Related

How can I use REPL with CPS function?

I've just encountered withSession :: (Session -> IO a) -> IO a of wreq package. I want to evaluate the continuation line by line, but I can't find any way for this.
import Network.Wreq.Session as S
withSession $ \sess -> do
  res <- S.getWith opts sess "http://stackoverflow.com/questions"
  -- print res
  -- .. other things
In the above snippet, how can I evaluate print res in ghci? In other words, can I get a value of type Session in ghci?

Wonderful question.
I am not aware of any way to re-enter the GHCi REPL from inside a CPS function. Perhaps others can suggest one.
However, I can suggest a hack. Basically, one can exploit concurrency to turn CPS inside out, provided it is based on the IO monad, as in this case.
Here's the hack: use this in a GHCi session
> sess <- newEmptyMVar :: IO (MVar Session)
> stop <- newEmptyMVar :: IO (MVar ())
> forkIO $ withSession $ \s -> putMVar sess s >> takeMVar stop
> s <- takeMVar sess
> -- use s here as if you were inside withSession
> let s = () -- recommended
> putMVar stop ()
> -- we are now "outside" withSession, don't try to access s here!
A small library to automate the hack:
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
data CPSControl b = CPSControl (MVar ()) (MVar b)
startDebugCps :: ((a -> IO ()) -> IO b) -> IO (a, CPSControl b)
startDebugCps cps = do
  cpsVal <- newEmptyMVar
  retVal <- newEmptyMVar
  stop <- newEmptyMVar
  _ <- forkIO $ do
    x <- cps $ \c -> putMVar cpsVal c >> takeMVar stop
    putMVar retVal x
  s <- takeMVar cpsVal
  return (s, CPSControl stop retVal)
stopDebugCps :: CPSControl b -> IO b
stopDebugCps (CPSControl stop retVal) = do
  putMVar stop ()
  takeMVar retVal
testCps :: (String -> IO ()) -> IO String
testCps act = do
  putStrLn "testCps: begin"
  act "here's some string!"
  putStrLn "testCps: end"
  return "some return value"
A quick test:
> (x, ctrl) <- startDebugCps testCps
testCps: begin
> x
"here's some string!"
> stopDebugCps ctrl
testCps: end
"some return value"

Use two monads without a transformer

In order to understand how to use monad transformers, I wrote the following code without one. It reads standard input line by line and displays each line reversed until an empty line is encountered. It also counts the lines using State and in the end displays the total number.
import Control.Monad.State
main = print =<< fmap (`evalState` 0) go where
  go :: IO (State Int Int)
  go = do
    l <- getLine
    if null l
      then return get
      else do
        putStrLn (reverse l)
        -- another possibility: fmap (modify (+1) >>) go
        rest <- go
        return $ do
          modify (+1)
          rest
I wanted to add the current line number before each line. I was able to do it with StateT:
import Control.Monad.State
main = print =<< evalStateT go 0 where
  go :: StateT Int IO Int
  go = do
    l <- lift getLine
    if null l
      then get
      else do
        n <- get
        lift (putStrLn (show n ++ ' ' : reverse l))
        modify (+1)
        go
My question is: how to do the same in the version without monad transformers?

The problem you're having is that the hand-unrolling of StateT s IO a is s -> IO (a, s), not IO (s -> (a, s))! Once you have this insight, it's pretty easy to see how to do it:
go :: Int -> IO (Int, Int)
go s = do
  l <- getLine
  if null l
    then return (s, s)
    else do
      putStrLn (show s ++ ' ' : reverse l)
      go (s+1)

You'd just need to run the accumulated state computation on every line. This is O(n²) time, but since your first program is already using O(n) space, that's not too terrible. Of course, the StateT approach is superior in pretty much every way! If you really want to do it "by hand" and not pay an efficiency price, just manage the state by hand instead of building a state transformer at all. You're really not getting any benefit by using State instead of Int in the first program.

Maybe this is what you are looking for?
main = print =<< fmap (`evalState` 0) (go get) where
  go :: State Int Int -> IO (State Int Int)
  go st = do
    l <- getLine
    if null l
      then return (st >>= \_ -> get)
      else do
        let ln = evalState st 0
        putStrLn (show ln ++ ' ' : reverse l)
        go (st >>= \_ -> modify (+1) >>= \_ -> get)
The idea here is to make go tail recursive, building up your state computation, which you can then evaluate at each step.
EDIT
This version will bound the size of the state computation to a constant size, although under lazy evaluation, when the previous state computation is forced, we should be able to reuse it without re-evaluating it, so I'm guessing that these are essentially the same...
main = print =<< fmap (`evalState` 0) (go get) where
  go :: State Int Int -> IO (State Int Int)
  go st = do
    l <- getLine
    if null l
      then return st
      else do
        let ln = evalState st 0
        putStrLn (show ln ++ ' ' : reverse l)
        go (modify (\s -> s+ln+1) >>= \_ -> get)

Technique for reading in multiple lines for Haskell IO

Basically I would like to find a way so that a user can enter the number of test cases and then input their test cases. The program can then run those test cases and print out the results in the order that the test cases appear.
So basically I have main which reads in the number of test cases and inputs it into a function that will read from IO that many times. It looks like this:
main = getLine >>= \tst -> w (read tst :: Int) [[]]
This is the method signature of w: w :: Int -> [[Int]]-> IO ()
So my plan is to read in the number of test cases and have w run a function which takes in each test case and store the result into the [[]] variable. So each list in the list will be an output. w will just run recursively until it reaches 0 and print out each list on a separate line. I'd like to know if there is a better way of doing this since I have to pass in an empty list into w, which seems extraneous.

As @bheklilr mentioned, you can't update a value like [[]]. The standard functional approach is to pass an accumulator through a series of recursive calls. In the following example, the acc parameter of the loop function is this accumulator: it holds all of the output collected so far. At the end of the loop we return it.
myTest :: Int -> [String]
myTest n = [ "output line " ++ show k ++ " for n = " ++ show n | k <- [1..n] ]
main = do
  putStr "Enter number of test cases: "
  ntests <- fmap read getLine :: IO Int
  let loop k acc | k > ntests = return $ reverse acc
      loop k acc = do
        -- we're on the kth-iteration
        putStr $ "Enter parameter for test case " ++ show k ++ ": "
        a <- fmap read getLine :: IO Int
        let output = myTest a -- run the test
        loop (k+1) (output:acc)
  allOutput <- loop 1 []
  print allOutput
As you get more comfortable with this kind of pattern you'll recognize it as a fold (indeed a monadic fold since we're doing IO) and you can implement it with foldM.
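A minimal sketch of the same loop written with foldM, assuming the myTest function from above. The names runTests and promptParam are hypothetical; runTests takes the "read one parameter" action as an argument so the fold itself can be exercised without standard input.

```haskell
import Control.Monad (foldM)

myTest :: Int -> [String]
myTest n = [ "output line " ++ show k ++ " for n = " ++ show n | k <- [1..n] ]

-- Hypothetical helper: fold over the test-case indices 1..ntests, asking
-- getParam for each parameter and prepending that test's output to the
-- accumulator. The accumulator is reversed once at the end.
runTests :: Monad m => (Int -> m Int) -> Int -> m [[String]]
runTests getParam ntests = reverse <$> foldM step [] [1 .. ntests]
  where
    step acc k = do
      a <- getParam k
      return (myTest a : acc)

-- Hypothetical prompt used by main; reads one Int per test case.
promptParam :: Int -> IO Int
promptParam k = do
  putStrLn ("Enter parameter for test case " ++ show k ++ ":")
  readLn

main :: IO ()
main = do
  putStrLn "Enter number of test cases:"
  ntests <- readLn
  allOutput <- runTests promptParam ntests
  print allOutput
```

The step function plays the role of one iteration of loop, and the list [1 .. ntests] replaces the explicit counter k.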
Update: To help explain how fmap works, here are equivalent expressions written without using fmap:
With fmap:                                   Without fmap:
n <- fmap read getLine :: IO Int             line <- getLine
                                             let n = read line :: Int
vals <- fmap (map read . words) getLine      line <- getLine
          :: IO [Int]                        let vals = (map read . words) line :: [Int]
Using fmap allows us to eliminate the intermediate variable line which we never reference again anyway. We still need to provide a type signature so read knows what to do.

The idiomatic way is to use replicateM:
import Control.Monad (replicateM)
runAllTests :: [[Int]] -> IO ()
runAllTests = {- ... -}
main = do
  numTests <- readLn
  tests <- replicateM numTests readLn
  runAllTests tests
-- or:
-- main = readLn >>= flip replicateM readLn >>= runAllTests

Why is putStrLn not atomic?

To practice concurrent programming, I wrote the following (suboptimal) program, which repeatedly calculates the first prime bigger than whatever the user inputs:
import Control.Concurrent
import Control.Concurrent.Chan
import Control.Monad (forever)
primeAtLeast n = -- Some pure code that looks up the first prime at least as big as n
outputPrimeAtLeast n = putStrLn $ show $ (n, primeAtLeast n)
main = do
  chan <- newChan
  worker <- forkIO $ forever $ readChan chan >>= outputPrimeAtLeast
  forever $ (readLn :: (IO Int)) >>= (writeChan chan)
  killThread worker
I want to have a worker thread in the background that does the actual calculation and outputs (n, primeAtLeast n) as soon as it's finished.
What it's doing now: as soon as I enter a number n, it immediately outputs "(n,", returns control to the main thread, calculates primeAtLeast n in the background, and outputs the second half, "primeAtLeast n)", as soon as it's finished.
So is putStrLn not atomic? Or where is the problem?

Try this:
outputPrimeAtLeast n = let p = primeAtLeast n in p `seq` putStrLn $ show (n, p)
The above forces the computation of the prime before the putStrLn is run.
Further, you may use print instead of putStrLn . show:
outputPrimeAtLeast n = let p = primeAtLeast n in p `seq` print (n, p)
Alternatively, you may use a putStrLn function which forces every single character before starting printing anything.
strictPutStrLn :: Show a => a -> IO ()
strictPutStrLn x = let str = show x in str `listSeq` putStrLn str
listSeq :: [a] -> b -> b
listSeq [] w = w
listSeq (x:xs) w = x `seq` listSeq xs w
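Another way to express the same idea is Control.Exception.evaluate, which forces a value as an explicit IO step before the print happens. The sketch below is an illustration only: the body of primeAtLeast was not shown in the question, so a naive trial-division placeholder stands in for it, and renderLine is a hypothetical helper name.

```haskell
import Control.Exception (evaluate)

-- Placeholder for the question's pure function (its body was not shown);
-- naive trial division, for illustration only.
primeAtLeast :: Int -> Int
primeAtLeast n = head [ p | p <- [max 2 n ..], isPrime p ]
  where
    isPrime p = all (\d -> p `mod` d /= 0) (takeWhile (\d -> d * d <= p) [2 ..])

-- Hypothetical helper: build the output line and force it before returning.
-- Forcing the spine of the string via length requires the digits of the
-- prime, so the expensive computation finishes here, not inside putStrLn.
renderLine :: Int -> IO String
renderLine n = do
  let str = show (n, primeAtLeast n)
  _ <- evaluate (length str)
  return str

outputPrimeAtLeast :: Int -> IO ()
outputPrimeAtLeast n = renderLine n >>= putStrLn

main :: IO ()
main = mapM_ outputPrimeAtLeast [10, 100, 1000]
```

This addresses the laziness that splits the line in two; it does not by itself make output from several threads mutually exclusive.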

MVars are blocking indefinitely; but only in certain scenarios.

First, because this is about a specific case, I haven't reduced the code at all, so it will be quite long, and in 2 parts (Helper module, and the main).
spawnThreads in ConcurHelper takes a list of actions, forks them, and gets an MVar containing the result of each action. It then combines the results and returns the resulting list. It works fine in certain cases, but blocks indefinitely in others.
If I give it a list of putStrLn actions, it executes them fine, then returns the resulting ()s (yes, I know running print commands on different threads at the same time is bad in most cases).
If I try running multiTest in Scanner though (which takes either scanPorts or scanAddresses, the scan range, and the number of threads to use; then splits the scan range over the threads, and passes the list of actions to SpawnThreads), it will block indefinitely. The odd thing is, according to the debug prompts scattered around ConcurHelper, on each thread, ForkIO is returning before the MVar is filled. This would make sense if it wasn't in a do block, but shouldn't the actions be performed sequentially? (I don't know if this is related to the problem or not; it's just something I noticed while attempting to debug it).
I've thought it out step by step, and if it's executing in the order laid out in spawnThreads, the following should happen:
1. An empty MVar should be created inside forkIOReturnMVar, and passed to mVarWrapAct.
2. mVarWrapAct should execute the action, and put the result in the MVar (this is where the problem seems to lie: "MVar filled" is never shown, suggesting the MVar is never put into).
3. getResults should then take from the resulting list of MVars, and return the results.
If point #2 isn't the issue, I can't see where the problem would be (and if it is the issue, I can't see why putMVar never executes). Inside the Scanner module, the only real function of interest for this question is multiTest; I only included the rest so it could be run.
To do a simple test, you can run the following:
spawnThreads [putStrLn "Hello", putStrLn "World"] (should return [(),()])
multiTest (scanPorts "127.0.0.1") 1 (0,5) (Creates the MVar, hangs for a sec, then crashes with the aforementioned error)
Any help in understanding what's going on here would be appreciated. I can't see what the difference between the two use cases is.
Thank you
(And I'm using this atrocious exception handling system because IO errors don't give codes for specific network exceptions, so I've been left with parsing messages to find out what happened)
Main:
module Scanner where
import Network
import Network.Socket
import System.IO
import Control.Exception
import Control.Concurrent
import ConcurHelper
import Data.Maybe
import Data.Char
import NetHelp
data NetException = NetNoException | NetTimeOut | NetRefused | NetHostUnreach
                  | NetANotAvail | NetAccessDenied | NetAddrInUse
                  deriving (Show, Eq)
diffExcept :: Either SomeException Handle -> Either NetException Handle
diffExcept (Right h) = Right h
diffExcept (Left (SomeException m))
  | err == "WSAETIMEDOUT" = Left NetTimeOut
  | err == "WSAECONNREFUSED" = Left NetRefused
  | err == "WSAEHOSTUNREACH" = Left NetHostUnreach
  | err == "WSAEADDRNOTAVAIL" = Left NetANotAvail
  | err == "WSAEACCESS" = Left NetAccessDenied
  | err == "WSAEADDRINUSE" = Left NetAddrInUse
  | otherwise = error $ show m
  where
    err = reverse . dropWhile (== ')') . reverse . dropWhile (/='W') $ show m
extJust :: Maybe a -> a
extJust (Just a) = a
selectJusts :: IO [Maybe a] -> IO [a]
selectJusts mayActs = do
  mays <- mayActs; return . map extJust $ filter isJust mays
scanAddresses :: Int -> Int -> Int -> IO [String]
scanAddresses port minAddr maxAddr =
  selectJusts $ mapM (\addr -> do
    let sAddr = "192.168.1." ++ show addr
    print $ "Trying " ++ sAddr ++ " " ++ show port
    connection <- testConn sAddr port
    if isJust connection
      then do hClose $ extJust connection; return $ Just sAddr
      else return Nothing) [minAddr..maxAddr]
scanPorts :: String -> Int -> Int -> IO [Int]
scanPorts addr minPort maxPort =
  selectJusts $ mapM (\port -> do
    --print $ "Trying " ++ addr ++ " " ++ show port
    connection <- testConn addr port
    if isJust connection
      then do hClose $ extJust connection; return $ Just port
      else return Nothing) [minPort..maxPort]
main :: IO ()
main = do
  withSocketsDo $ do
    putStrLn "Scan Addresses or Ports? (a/p)"
    choice <- getLine
    if (toLower $ head choice) == 'a'
      then do
        putStrLn "On what port?"
        sPort <- getLine
        addrs <- scanAddresses (read sPort :: Int) 0 255
        print addrs
      else do
        putStrLn "At what address?"
        address <- getLine
        ports <- scanPorts address 0 9999
        print ports
  main
testConn :: HostName -> Int -> IO (Maybe Handle)
testConn host port = do
  result <- try $ timedConnect 1 host port
  let result' = diffExcept result
  case result' of
    Left e -> do putStrLn $ "\t" ++ show e; return Nothing
    Right h -> return $ Just h
setPort :: AddrInfo -> Int -> AddrInfo
setPort addInf nPort = case addrAddress addInf of
  (SockAddrInet _ host) -> addInf { addrAddress = (SockAddrInet (fromIntegral nPort) host)}
getHostAddress :: HostName -> Int -> IO SockAddr
getHostAddress host port = do
  addrs <- getAddrInfo Nothing (Just host) Nothing
  let adInfo = head addrs
      newAdInfo = setPort adInfo port
  return $ addrAddress newAdInfo
timedConnect :: Int -> HostName -> Int -> IO Handle
timedConnect time host port = do
  s <- socket AF_INET Stream defaultProtocol
  setSocketOption s RecvTimeOut time; setSocketOption s SendTimeOut time
  addr <- getHostAddress host port
  connect s addr
  socketToHandle s ReadWriteMode
multiTest :: (Int -> Int -> IO a) -> Int -> (Int, Int) -> IO [a]
multiTest partAction threads (mi,ma) =
  spawnThreads $ recDiv [mi,perThread..ma]
  where
    perThread = ((ma - mi) `div` threads) + 1
    recDiv [] = []
    recDiv (curN:restN) =
      partAction (curN + 1) (head restN) : recDiv restN
Helper:
module ConcurHelper where
import Control.Concurrent
import System.IO
spawnThreads :: [IO a] -> IO [a]
spawnThreads actions = do
  ms <- mapM (\act -> do m <- forkIOReturnMVar act; return m) actions
  results <- getResults ms
  return results
forkIOReturnMVar :: IO a -> IO (MVar a)
forkIOReturnMVar act = do
  m <- newEmptyMVar
  putStrLn "Created MVar"
  forkIO $ mVarWrapAct act m
  putStrLn "Fork returned"
  return m
mVarWrapAct :: IO a -> MVar a -> IO ()
mVarWrapAct act m = do a <- act; putMVar m a; putStrLn "MVar filled"
getResults :: [MVar a] -> IO [a]
getResults mvars = do
  unpacked <- mapM (\m -> do r <- takeMVar m; return r) mvars
  putStrLn "MVar taken from"
  return unpacked

Your forkIOReturnMVar isn't exception safe: whenever act throws, the MVar isn't going to be filled.
Minimal example
import ConcurHelper
main = spawnThreads [badOperation]
  where badOperation = do
          error "You're never going to put something in the MVar"
          return True
As you can see, badOperation throws, and therefore the MVar won't get filled in mVarWrapAct.
Fix
Fill the MVar with an appropriate value if you encounter an exception. Since you cannot provide a default value for all possible types a, it's better to use MVar (Maybe a) or MVar (Either b a) as you already do in your network code.
In order to catch the exceptions, use one of the operations provided in Control.Exception. For example, you could use onException:
mVarWrapAct :: IO a -> MVar (Maybe a) -> IO ()
mVarWrapAct act m = do
  onException (act >>= putMVar m . Just) (putMVar m Nothing)
  putStrLn "MVar filled"
However, you might want to preserve the actual exception for more information. In this case you could simply use catch together with Either SomeException a:
mVarWrapAct :: IO a -> MVar (Either SomeException a) -> IO ()
mVarWrapAct act m = do
  catch (act >>= putMVar m . Right) (putMVar m . Left)
  putStrLn "MVar filled"
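As a further sketch of the Either SomeException a approach, Control.Concurrent also provides forkFinally, which runs a handler with the thread's outcome whether the action returned or threw. The function name spawnThreadsSafe below is hypothetical and is not part of the question's ConcurHelper module; the block is a self-contained illustration, not a drop-in replacement.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar
import Control.Exception (ErrorCall (..), SomeException, throwIO)

-- Hypothetical variant of spawnThreads: every thread writes its outcome
-- (Left exception or Right result) into its MVar, so an action that throws
-- no longer leaves takeMVar waiting on an MVar that is never filled.
spawnThreadsSafe :: [IO a] -> IO [Either SomeException a]
spawnThreadsSafe actions = do
  ms <- mapM fork actions
  mapM takeMVar ms
  where
    fork act = do
      m <- newEmptyMVar
      _ <- forkFinally act (putMVar m)
      return m

main :: IO ()
main = do
  results <- spawnThreadsSafe [return True, throwIO (ErrorCall "boom")]
  mapM_ (putStrLn . either (\e -> "failed: " ++ show e)
                           (\r -> "ok: " ++ show r)) results
```

The caller then decides what to do with each Left, instead of the failure disappearing inside the forked thread.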
