In the example below, I'd like to be able to call the 'ls' function directly (see the last commented out line of the example) but I have not been able to figure out the correct syntax.
Thanks in advance.
module Main (main) where

import System.Directory

ls :: FilePath -> IO [FilePath]
ls dir = do
    fileList <- getDirectoryContents dir
    return fileList

main = do
    fileList <- ls "."
    mapM putStrLn fileList
    -- How can I just use the ls call directly like in the following (which doesn't compile)?
    -- mapM putStrLn (ls".")
You can't just use
mapM putStrLn (ls ".")
because ls "." has type IO [FilePath], while mapM putStrLn expects a plain [FilePath]. You need bind (>>= in Haskell) to feed the result of one action into the next. Your line would then be
main = ls "." >>= mapM_ putStrLn
Notice the mapM_ function, not just mapM. mapM putStrLn would give you IO [()], a list of unit results nobody needs; mapM_ discards the results and gives IO (), which is the conventional type for main.
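As a minimal sketch of the point above, here are both directions of bind side by side. It assumes only the directory package the question already imports; ls is the question's function written without the redundant do/return, and the type annotation on main is added for illustration.

```haskell
module Main (main) where

import System.Directory (getDirectoryContents)

-- Same behaviour as the ls in the question, without the redundant do/return.
ls :: FilePath -> IO [FilePath]
ls = getDirectoryContents

main :: IO ()
main = do
  -- Left to right: run ls ".", then pass its result to mapM_ putStrLn.
  ls "." >>= mapM_ putStrLn
  -- The flipped bind reads closer to the line the question was aiming for.
  mapM_ putStrLn =<< ls "."
```

Both lines print the same listing; which one to use is a matter of reading order.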
Related
From the ghci> prompt, I would like to readFile "filename.txt" and pass the resulting string as an argument to the words function to convert sentences to word lists.
Thanks
You can execute your pure function (words) "inside" the IO monad returned by readFile.
readFile :: FilePath -> IO String
and
words :: String -> [String]
so you can simply do
fmap words $ readFile "filename.txt"
which has the type IO [String]. If you do this in ghci (which is itself "inside" of an IO monad) you will get the word list displayed.
EDIT:
If you want to apply multiple transformations, you may want to cleanly separate the pure part (based on @Davislor's solution from the comments):
readFile "filename.txt" >>= (return . sort . words) >>= mapM_ putStrLn
The return here just lifts the result into IO. You could replace return with mapM_ putStrLn and drop the final bind (shorter, but a less clean distinction).
Another option is applicative style:
sort <$> words <$> readFile "filename.txt" >>= mapM_ putStrLn
or using do notation (imperative style):
do ; f <- readFile "filename.txt"; let out = sort (words f) in mapM_ putStrLn out
(which is ugly because I used ; instead of newline) or simply (less imperatively :) :
do ; f <- readFile "filename.txt"; mapM_ putStrLn $ sort $ words f
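Outside ghci, the same separation can be written as a small program. This is only a sketch: sortedWords, printSortedWords and the file name sample.txt are names made up for illustration, and the file is created by the program itself so that it runs as-is.

```haskell
import Data.List (sort)

-- Pure part: no IO involved, so it can be tried on plain strings.
sortedWords :: String -> [String]
sortedWords = sort . words

-- Effectful part: read the file, apply the pure function with fmap,
-- then print each word on its own line.
printSortedWords :: FilePath -> IO ()
printSortedWords path = fmap sortedWords (readFile path) >>= mapM_ putStrLn

main :: IO ()
main = do
  -- "sample.txt" is a hypothetical file, written here only for the demo.
  writeFile "sample.txt" "pear apple\nfig"
  printSortedWords "sample.txt"
```

Keeping sortedWords pure means it can be checked directly in ghci, for example with sortedWords "b a".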
I want to read a file, process it, and write the results to another file; the input file name is to be supplied through a console argument, and the output file name is generated from the input file name.
The catch is I want it to transparently “fail over” to stdin/stdout if no arguments are supplied; essentially, in case a file name is supplied, I redirect stdin/stdout to the respective file names so I can transparently use interact whether the file name was supplied or not.
Here's the code hacked together with dummy output in a superfluous else. What will be the proper, idiomatic form of doing it?
It probably could have something to do with Control.Monad's when or guard, as was pointed out in a similar question, but maybe somebody wrote this already.
import System.IO
import Data.Char(toUpper)
import System.Environment
import GHC.IO.Handle

main :: IO ()
main = do
    args <- getArgs
    if(not $ null args) then
        do
            print $ "working with "++ (head args)
            finHandle <- openFile (head args) ReadMode --open the supplied input file
            hDuplicateTo finHandle stdin --bind stdin to finName's handle
            foutHandle <- openFile ((head args) ++ ".out") WriteMode --open the output file for writing
            hDuplicateTo foutHandle stdout --bind stdout to the outgoing file
        else print "working through stdin/redirect" --get to know
    interact ((++) "Here you go---\n" . map toUpper)
There's nothing very special about interact - here is its definition:
interact :: (String -> String) -> IO ()
interact f = do s <- getContents
                putStr (f s)
How about something like this:
import System.Environment
import Data.Char

main = do
    args <- getArgs
    let (reader, writer) =
            case args of
                [] -> (getContents, putStr)
                (path : _) -> let outpath = path ++ ".output"
                              in (readFile path, writeFile outpath)
    contents <- reader
    writer (process contents)

process :: String -> String
process = (++) "Here you go---\n" . map toUpper
Based on the command line arguments we set reader and writer to the IO-actions which will read the input and write the output.
This seems fairly idiomatic to me already. The one note I have is to avoid head, as it is an unsafe function (it can throw a runtime error). In this case it is fairly easy to do so by using case to pattern match.
main :: IO ()
main = do
    args <- getArgs
    case args of
        fname:_ -> do
            print $ "working with " ++ fname
            finHandle <- openFile fname ReadMode
            hDuplicateTo finHandle stdin
            foutHandle <- openFile (fname ++ ".out") WriteMode
            hDuplicateTo foutHandle stdout
        [] -> do
            print "working through stdin/redirect"
    interact ((++) "Here you go---\n" . map toUpper)
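A variation that avoids rebinding stdin/stdout altogether is to pass the handles explicitly. This is a sketch under assumptions, not what either answer above proposes: interactWith is a hypothetical helper modelled on the definition of interact, and withFile is used so that both files are closed when the work is done.

```haskell
import Data.Char (toUpper)
import System.Environment (getArgs)
import System.IO

-- Pure transformation, same as in the question.
process :: String -> String
process = ("Here you go---\n" ++) . map toUpper

-- Like interact, but reading from and writing to explicit handles.
-- (Hypothetical helper, not part of base.)
interactWith :: Handle -> Handle -> (String -> String) -> IO ()
interactWith hIn hOut f = hGetContents hIn >>= hPutStr hOut . f

main :: IO ()
main = do
  args <- getArgs
  case args of
    -- A file name was given: read it and write to <name>.out.
    fname : _ ->
      withFile fname ReadMode $ \hIn ->
        withFile (fname ++ ".out") WriteMode $ \hOut ->
          interactWith hIn hOut process
    -- No arguments: fall back to stdin/stdout.
    [] -> interactWith stdin stdout process
```

The trade-off is that code deeper in the program has to take the handles as arguments instead of using interact or putStr directly.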
Sorry, this is probably really dumb, but can someone explain to me why this program doesn't compile? I get Couldn't match expected type 'a1 -> String' with actual type 'IO String'.
import System.Environment

main = do
    [first, last] <- getArgs
    firstnames <- lines . readFile "firstnames_male"
    lastnames <- lines . readFile "lastnames"
    print firstnames
You can't do lines . readFile "lastnames".
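The readFile function returns an IO String, not a String, and readFile "lastnames" is an IO action rather than a function, so it cannot be the right-hand operand of the composition operator (.). That is what the error message is saying.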
You can, however, use the fmap function (or the <$> operator) to achieve this:
main = do
    [first, last] <- getArgs
    firstnames <- lines `fmap` readFile "firstnames_male"
    ...
This works because IO is a functor.
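To make the functor point concrete, here is a small sketch of three equivalent spellings. The function names and the file names.txt are invented for the example, and the file is written by the program itself so that it runs without the questioner's data.

```haskell
-- Three equivalent ways to apply the pure function lines to the result
-- of the IO action readFile path.
viaFmap, viaOperator, viaDo :: FilePath -> IO [String]
viaFmap path = fmap lines (readFile path)
viaOperator path = lines <$> readFile path
viaDo path = do
  contents <- readFile path
  return (lines contents)

main :: IO ()
main = do
  -- "names.txt" is a hypothetical file, written here only for the demo.
  writeFile "names.txt" "Ada\nGrace\n"
  a <- viaFmap "names.txt"
  b <- viaOperator "names.txt"
  c <- viaDo "names.txt"
  -- All three produce the same list of lines.
  print (a == b && b == c)
  print a
```

The do version shows what fmap does here: it runs the action and applies the pure function to its result.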
I'm writing a program that creates a shell script containing one command for each image file in a directory. There are 667,944 images in the directory, so I need to handle the strictness/laziness issue properly.
Here's a simple example that gives me Stack space overflow. It does work if I give it more space using +RTS -Ksize -RTS, but it should be able to run with little memory, producing output immediately. So I've been reading the stuff about strictness in the Haskell wiki and the wikibook on Haskell, trying to figure out how to fix the problem, and I think it's one of the mapM commands that is giving me grief, but I still don't understand enough about strictness to sort out the problem.
I've found some other questions on SO that seem relevant (Is mapM in Haskell strict? Why does this program get a stack overflow? and Is Haskell's mapM not lazy?), but enlightenment still eludes me.
import System.Environment (getArgs)
import System.Directory (getDirectoryContents)

genCommand :: FilePath -> FilePath -> FilePath -> IO String
genCommand indir outdir file = do
    let infile = indir ++ '/':file
    let angle = 0 -- have to actually read the file to calculate this for real
    let outfile = outdir ++ '/':file
    return $! "convert " ++ infile ++ " -rotate " ++ show angle ++
              " -crop 143x143+140+140 " ++ outfile

main :: IO ()
main = do
    putStrLn "#!/bin/sh"
    (indir:outdir:_) <- getArgs
    files <- getDirectoryContents indir
    let imageFiles = filter (`notElem` [".", ".."]) files
    commands <- mapM (genCommand indir outdir) imageFiles
    mapM_ putStrLn commands
EDIT: TEST #1
Here's the newest version of the example.
import System.Environment (getArgs)
import System.Directory (getDirectoryContents)
import Control.Monad ((>=>))

genCommand :: FilePath -> FilePath -> FilePath -> IO String
genCommand indir outdir file = do
    let infile = indir ++ '/':file
    let angle = 0 -- have to actually read the file to calculate this for real
    let outfile = outdir ++ '/':file
    return $! "convert " ++ infile ++ " -rotate " ++ show angle ++
              " -crop 143x143+140+140 " ++ outfile

main :: IO ()
main = do
    putStrLn "TEST 1"
    (indir:outdir:_) <- getArgs
    files <- getDirectoryContents indir
    putStrLn $ show (length files)
    let imageFiles = filter (`notElem` [".", ".."]) files
    -- mapM_ (genCommand indir outdir >=> putStrLn) imageFiles
    mapM_ (\filename -> genCommand indir outdir filename >>= putStrLn) imageFiles
I compile it with the command ghc --make -O2 amy2.hs -rtsopts. If I run it with the command ./amy2 ~/nosync/GalaxyZoo/table2/images/ wombat, I get
TEST 1
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
If I instead run it with the command ./amy2 ~/nosync/GalaxyZoo/table2/images/ wombat +RTS -K20M, I get the correct output...eventually:
TEST 1
667946
convert /home/amy/nosync/GalaxyZoo/table2/images//587736546846572812.jpeg -rotate 0 -crop 143x143+140+140 wombat/587736546846572812.jpeg
convert /home/amy/nosync/GalaxyZoo/table2/images//587736542558617814.jpeg -rotate 0 -crop 143x143+140+140 wombat/587736542558617814.jpeg
...and so on.
This isn't really a strictness issue(*), but an order of evaluation issue. Unlike lazily evaluated pure values, monadic effects must happen in deterministic order. mapM executes every action in the given list and gathers the results, but it cannot return until the whole list of actions is executed, so you don't get the same streaming behavior as with pure list functions.
The easy fix in this case is to run both genCommand and putStrLn inside the same mapM_. Note that mapM_ doesn't suffer from the same issue since it is not building an intermediate list.
mapM_ (genCommand indir outdir >=> putStrLn) imageFiles
The above uses the Kleisli composition operator >=> from Control.Monad, which is like the function composition operator . except that it composes monadic functions. You can also use the normal bind and a lambda.
mapM_ (\filename -> genCommand indir outdir filename >>= putStrLn) imageFiles
For more complex I/O applications where you want better composability between small, monadic stream processors, you should use a library such as conduit or pipes.
Also, make sure you are compiling with either -O or -O2.
(*) To be exact, it is also a strictness issue, because in addition to building a large, intermediate list in memory, laziness causes mapM to build unnecessary thunks and use up stack.
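The difference in ordering can be observed with a small experiment. This sketch replaces the real genCommand and putStrLn with stand-ins (gen and emit, both invented here) that record each step in an IORef, so the order of effects becomes visible as a list.

```haskell
import Control.Monad ((>=>))
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Record an event in the log (newest first).
logEvent :: IORef [String] -> String -> IO ()
logEvent ref e = modifyIORef' ref (e :)

-- Stand-in for genCommand: logs that it ran and returns a "command".
gen :: IORef [String] -> Int -> IO String
gen ref n = do
  logEvent ref ("gen " ++ show n)
  return ("command " ++ show n)

-- Stand-in for putStrLn: logs what it would have printed.
emit :: IORef [String] -> String -> IO ()
emit ref s = logEvent ref ("emit " ++ s)

-- mapM first, then mapM_: every gen runs before the first emit.
batched :: [Int] -> IO [String]
batched xs = do
  ref <- newIORef []
  cmds <- mapM (gen ref) xs
  mapM_ (emit ref) cmds
  reverse <$> readIORef ref

-- One mapM_ over the composed action: gen and emit alternate per element.
streamed :: [Int] -> IO [String]
streamed xs = do
  ref <- newIORef []
  mapM_ (gen ref >=> emit ref) xs
  reverse <$> readIORef ref

main :: IO ()
main = do
  batched [1, 2] >>= print
  -- ["gen 1","gen 2","emit command 1","emit command 2"]
  streamed [1, 2] >>= print
  -- ["gen 1","emit command 1","gen 2","emit command 2"]
```

With 667,944 elements, the batched shape is the one that has to hold every result before the first line of output appears.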
EDIT: So it seems the main culprit might be getDirectoryContents. Looking at the function's source code, it essentially does the same kind of list accumulation internally as mapM.
In order to do streaming directory listing, we need to use System.Posix.Directory, which unfortunately makes the program incompatible with non-POSIX systems (like Windows). You can stream the directory contents by, e.g., using continuation-passing style:
import System.Environment (getArgs)
import Control.Monad ((>=>))
import System.Posix.Directory (openDirStream, readDirStream, closeDirStream)
import Control.Exception (bracket)

genCommand :: FilePath -> FilePath -> FilePath -> IO String
genCommand indir outdir file = do
    let infile = indir ++ '/':file
    let angle = 0 -- have to actually read the file to calculate this for real
    let outfile = outdir ++ '/':file
    return $! "convert " ++ infile ++ " -rotate " ++ show angle ++
              " -crop 143x143+140+140 " ++ outfile

streamingDirContents :: FilePath -> (FilePath -> IO ()) -> IO ()
streamingDirContents root cont = do
    let loop stream = do
            fp <- readDirStream stream
            case fp of
                [] -> return () -- an empty name marks the end of the stream
                _ | fp `notElem` [".", ".."] -> cont fp >> loop stream
                  | otherwise -> loop stream
    -- bracket takes acquire, release, use (in that order)
    bracket (openDirStream root) closeDirStream loop

main :: IO ()
main = do
    putStrLn "TEST 1"
    (indir:outdir:_) <- getArgs
    streamingDirContents indir (genCommand indir outdir >=> putStrLn)
Here's how you could do the same thing using conduit:
import System.Environment (getArgs)
import System.Posix.Directory (openDirStream, readDirStream, closeDirStream)
import Data.Conduit
import qualified Data.Conduit.List as L
import Control.Monad.IO.Class (liftIO, MonadIO)

genCommand :: FilePath -> FilePath -> FilePath -> IO String
genCommand indir outdir file = do
    let infile = indir ++ '/':file
    let angle = 0 -- have to actually read the file to calculate this for real
    let outfile = outdir ++ '/':file
    return $! "convert " ++ infile ++ " -rotate " ++ show angle ++
              " -crop 143x143+140+140 " ++ outfile

dirSource :: (MonadResource m, MonadIO m) => FilePath -> Source m FilePath
dirSource root = do
    bracketP (openDirStream root) closeDirStream $ \stream -> do
        let loop = do
                fp <- liftIO $ readDirStream stream
                case fp of
                    [] -> return ()
                    _ -> yield fp >> loop
        loop

main :: IO ()
main = do
    putStrLn "TEST 1"
    (indir:outdir:_) <- getArgs
    let files = dirSource indir $= L.filter (`notElem` [".", ".."])
        commands = files $= L.mapM (liftIO . genCommand indir outdir)
    runResourceT $ commands $$ L.mapM_ (liftIO . putStrLn)
The nice thing about conduit is that you regain the ability to compose pieces of functionality with things like conduit versions of filter and mapM. The $= operator streams stuff forward in the chain and $$ connects the stream to a consumer.
The not-so-nice thing is that real world is complicated and writing efficient and robust code requires us to jump through some hoops with resource management. That's why all the operations work in the ResourceT monad transformer which keeps track of e.g. open file handles and cleans them up promptly and deterministically when they are no longer needed or e.g. if the computation gets aborted by an exception (this is in contrast to using lazy I/O and relying on the garbage collector to eventually release any scarce resources).
However, this means that we a) need to run the final resulting conduit operation with runResourceT and b) we need to explicitly lift I/O operations to the transformed monad using liftIO instead of being able to directly write e.g. L.mapM_ putStrLn.
I currently have this code which will perform the main' function on each of the filenames in the list files.
Ideally I have been trying to combine main and main' but I haven't made much progress. Is there a better way to simplify this or will I need to keep them separate?
{- Start here -}
main :: IO [()]
main = do
    files <- getArgs
    mapM main' files

{- Main's helper function -}
main' :: FilePath -> IO ()
main' file = do
    contents <- readFile file
    case (runParser parser 0 file $ lexer contents) of
        Left err -> print err
        Right xs -> putStr xs
Thanks!
Edit: As most of you are suggesting, I was trying a lambda abstraction for this but wasn't getting it right. I should have specified this above. With the examples I see this better.
The Control.Monad library defines the function forM, which is mapM with its arguments reversed. That makes it easier to use in your situation, i.e.
main :: IO ()
main = do
    files <- getArgs
    forM_ files $ \file -> do
        contents <- readFile file
        case (runParser parser 0 file $ lexer contents) of
            Left err -> print err
            Right xs -> putStr xs
The version with the underscore at the end of the name is used when you are not interested in the resulting list (like in this case), so main can simply have the type IO (). (mapM has a similar variant called mapM_).
You can use forM, which equals flip mapM, i.e. mapM with its arguments flipped, like this:
forM_ files $ \file -> do
    contents <- readFile file
    ...
Also notice that I used forM_ instead of forM. This is more efficient when you are not interested in the result of the computation.
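For a version that compiles without the question's parser and lexer, here is a minimal sketch of forM and forM_ side by side. The helper names lengths and announce and the sample file names are invented for illustration.

```haskell
import Control.Monad (forM, forM_)

-- forM keeps the results: one length per input string.
lengths :: [String] -> IO [Int]
lengths names = forM names $ \name -> return (length name)

-- forM_ discards the results, so the whole action is just IO ().
announce :: [String] -> IO ()
announce names = forM_ names $ \name -> putStrLn ("processing " ++ name)

main :: IO ()
main = do
  announce ["a.txt", "bb.txt"]
  lengths ["a.txt", "bb.txt"] >>= print
```

Putting the list first and the lambda last is what lets the loop body be written inline as a do block.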