Haskell: Read a data file as array for computations - haskell

Set-up
I have a data file in the following format: 2D coordinates separated by a space on each line,
1.23 4.0
23.7 23.1
60.4 4.2
To parse this, I wrote the following function (feedback welcome): it takes the file's path as a String and parses the columns of each line as Floats,
fileData :: String -> IO [[Float]]
fileData str = readFile str >>=
  \file -> return $ map ((map $ \x -> read x::Float) . words $) $ lines file
The output given the example file above is an IO [[Float]]:
[[1.23, 4.0],
[23.7, 23.1],
[60.4, 4.2]]
Questions
What is the best way to handle this array, the output of fileData? For example, how should one go about doing computations with values in this array? Eventually, I will want to manipulate these values using hMatrix.
To obtain the zeroth element of the array [1.23, 4.0], I tried running
main :: IO ()
main = fileData "file.txt" >>=
  \file -> print $ (flip (!!) 0) <$> file
but it returns the zeroth element of each sub-array, [1.23, 23.7, 60.4], which matches the value of print $ map (flip (!!) 0) xs, where xs is the array as if it were a [[Float]].
Update
I thought that using fmap, as in (flip (!!) 0) <$> file in main, was necessary because of the type of file there, but it turns out that file !! 0 works as intended, so the revised main is:
main :: IO ()
main = fileData "file.txt" >>=
  \file -> print $ file !! 0
However, now I am confused as to what the type of file is in fileData "file.txt" >>= \file -> print $ file !! 0. I thought it was of type IO [[Float]], so applying (!!) would not work directly because of the IO monad. Is there a better way to think about this?
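One way to think about the types here: fileData "file.txt" has type IO [[Float]], but (>>=) for IO has type IO a -> (a -> IO b) -> IO b, so the lambda's argument file is the plain [[Float]] and file !! 0 type-checks. In the first attempt, <$> was therefore the list fmap, that is map, which is why it returned the first column. The sketch below keeps the parsing pure and leaves only the file read in IO; the names parsePoints, loadPoints and firstRow are hypothetical, not from the question.

```haskell
-- | Pure parser: one row per line, one Float per whitespace-separated field.
-- Note: 'read' still throws on a malformed number.
parsePoints :: String -> [[Float]]
parsePoints = map (map read . words) . lines

-- | Only this function touches IO; it maps the pure parser over the contents.
loadPoints :: FilePath -> IO [[Float]]
loadPoints path = parsePoints <$> readFile path

-- | First row of the table, or Nothing when the table is empty
-- (a total alternative to @rows !! 0@).
firstRow :: [[Float]] -> Maybe [Float]
firstRow (r : _) = Just r
firstRow []      = Nothing
```

With these, main could be written as loadPoints "file.txt" >>= \rows -> print (firstRow rows), where rows again has type [[Float]] inside the lambda.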

Related

IO woes when seeking sizes of directory contents?

I'm learning Haskell, and my goal today is to write a function sizeOf :: FilePath -> IO Integer (calculate the size of a file or folder), with the logic
If path is a file, System.Directory.getFileSize path
If path is a directory, get a list of its contents, recursively run this function on them, and sum the results
If it's something other than a file or directory, return 0
Here's how I'd implement it in Ruby, to illustrate (Ruby notes: map's argument is the equivalent of \d -> size_of d, reduce :+ is foldl (+) 0, any function ending ? returns a bool, returns are implicit):
def size_of path
  if File.file? path
    File.size path
  elsif File.directory? path
    Dir.glob(path + '/*').map { |d| size_of d }.reduce :+
  end
end
Here's my crack at it in Haskell:
sizeOf :: FilePath -> IO Integer
sizeOf path =
  do
    isFile <- doesFileExist path
    if isFile then
      getFileSize path
    else do
      isDir <- doesDirectoryExist path
      if isDir then
        sum $ map sizeOf $ listDirectory path
      else
        return 0
I know where my problem is. sum $ map sizeOf $ listDirectory path, where listDirectory path returns an IO [FilePath] and not a FilePath. But... I can't really imagine any solution solving this. <$> instead of $ was the first thing that came to mind, since <$> I understood to be something that let a function of a -> b become a Context a -> Context b. But... I guess IO isn't like that?
I spent about two hours puzzling over the logic there. I tried it on other examples. Here's a relevant discovery that threw me: if double = (*) 2, then map double [1,2,3] == [2,4,6], but map double <$> [return 1, return 2, return 3] == [[2],[4],[6]]... it wraps them in a list. I think that's what's happening to me, but I'm way out of my depth.
You'd need
sum <$> (listDirectory path >>= mapM sizeOf)
Explanation:
The idea to use sum over an IO [Integer] is ok, so we need to get such a thing.
listDirectory path gives us IO [FilePath], so we need to pass each file path to sizeOf. This is what >>= together with mapM does.
Note that map alone would give us [IO Integer], a list of actions rather than one action producing a list; that is why we need mapM.
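How about this (using Control.Monad.Extra from the extra package):
import Control.Monad.Extra (ifM)
import System.Directory (doesDirectoryExist, doesFileExist, getFileSize, listDirectory)
import System.FilePath (addTrailingPathSeparator)

du :: FilePath -> IO Integer
du path = ifM (doesFileExist path)
  (getFileSize path) $
  ifM (doesDirectoryExist path)
    (sum <$> (listDirectory path >>= mapM (du .
      (addTrailingPathSeparator path ++))))
    (return 0)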
I believe you need to add the path to the output of listDirectory for successful recursive descent as listDirectory only returns filenames without the path, which is required for the subsequent calls to du.
The type of ifM is probably obvious, but for reference it is ifM :: Monad m => m Bool -> m a -> m a -> m a
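Putting the two answers together without the extra package, a minimal sketch might look like the following. It assumes the directory and filepath packages that ship with GHC, and it joins paths with (</>) instead of addTrailingPathSeparator, which is a substitution of mine rather than something either answer used. It does not guard against symbolic-link cycles.

```haskell
import System.Directory (doesDirectoryExist, doesFileExist, getFileSize, listDirectory)
import System.FilePath ((</>))

-- | Size in bytes of a file, or the summed size of everything under a directory.
-- Anything that is neither a file nor a directory counts as 0.
sizeOf :: FilePath -> IO Integer
sizeOf path = do
  isFile <- doesFileExist path
  if isFile
    then getFileSize path
    else do
      isDir <- doesDirectoryExist path
      if isDir
        then do
          names <- listDirectory path                -- bare names, no directory prefix
          sizes <- mapM (sizeOf . (path </>)) names  -- IO [Integer], not [IO Integer]
          return (sum sizes)
        else return 0
```

The two binds inside the directory branch are the do-notation spelling of sum <$> (listDirectory path >>= mapM ...).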

Haskell - Trying to apply a function to lines of multiple numbers

I am new to Haskell and I am trying to apply a function (gcd) to input on standard in, which is line separated: the first line is a count, and each following line contains exactly two numbers. Here is an example of my input:
3
10 4
1 100
288 240
I am currently breaking up each line into a tuple of both numbers, but I am having trouble figuring out how to separate these tuples and apply a function to them. Here is what I have so far:
import Data.List
main :: IO ()
main = do
  n <- readLn :: IO Int
  content <- getContents
  let
    points = map (\[x, y] -> (x, y)). map (map (read::String->Int)). map words. lines $ content
    ans = gcd (fst points :: Int) (snd points :: Int)
  print ans
Any information as to a good place to start looking for this answer would be much appreciated. I have read through the Learning Haskell tutorial and have not found any information on this particular problem.
You are pretty close. There is no reason to convert to a tuple or list of tuples before calling gcd.
main = do
  contents <- getContents
  print $ map ((\[x,y] -> gcd (read x) (read y)) . words) . drop 1 . lines $ contents
All the interesting stuff is between print and contents. lines will split the contents into lines, and drop 1 skips the first line, which holds only the count. map (...) applies the function to each remaining line. words splits the line into words. \[x,y] -> gcd (read x) (read y) will match on a list of two strings (and throw an error otherwise, which is not good practice in general but fine for a simple program like this), read those strings as Integers and compute their GCD.
If you want to make use of lazy IO, so that each result is printed as soon as you enter its line, you can change it as follows (drop 1 again skips the count line).
main = do
  contents <- getContents
  mapM_ (print . (\[x,y] -> gcd (read x) (read y)) . words) . drop 1 . lines $ contents
Or, you can do it in a more imperative style:
import Control.Monad
main = do
  n <- readLn
  replicateM_ n $ do
    [x, y] <- (map read . words) `liftM` getLine
    print $ gcd x y
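The versions above match on \[x,y] and call read, both of which throw on malformed input. As a hedged alternative, the sketch below uses readMaybe from Text.Read so that a bad line yields Nothing instead of an exception; gcdLine and gcdAll are hypothetical names, and the leading count line is assumed to be present and is simply skipped.

```haskell
import Text.Read (readMaybe)

-- | GCD of a line holding exactly two integers; Nothing for any other line.
gcdLine :: String -> Maybe Int
gcdLine line = case mapM readMaybe (words line) of
  Just [x, y] -> Just (gcd x y)
  _           -> Nothing

-- | Skip the leading count line, then process every remaining line.
gcdAll :: String -> [Maybe Int]
gcdAll = map gcdLine . drop 1 . lines
```

A main built on this could be getContents >>= mapM_ print . gcdAll, which prints one Maybe Int per input line.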

Iteratively printing every integer in a List

Say I have a List of integers l = [1,2]
Which I want to print to stdout.
Doing print l produces [1,2]
Say I want to print the list without the brackets
map print l produces
No instance for (Show (IO ())) arising from a use of `print'
Possible fix: add an instance declaration for (Show (IO ()))
In a stmt of an interactive GHCi command: print it
> :t print
print :: Show a => a -> IO ()
So while I thought this would work I went ahead and tried:
map putStr $ map show l
Since I suspected a type mismatch from Integer to String was to blame. This produced the same error message as above.
I realize that I could do something like concatenating the list into a string, but I would like to avoid that if possible.
What's going on? How can I do this without constructing a string from the elements of the List?
The problem is that
map :: (a -> b) -> [a] -> [b]
So we end up with [IO ()]. This is a pure value, a list of IO actions. It won't actually print anything. Instead we want
mapM_ :: (a -> IO ()) -> [a] -> IO ()
The naming convention *M means that it operates over monads and *_ means we throw away the value. This is like map except it sequences each action with >> to return an IO action.
As an example mapM_ print [1..10] will print each element on a new line.
Suppose you're given a list xs :: [a] and a function f :: Monad m => a -> m b. You want to apply f to each element of xs, yielding a list of actions, then sequence these actions. Here is how I would go about constructing a function, call it mapM, that does this. In the base case, xs = [] is the empty list, and we simply return []. In the recursive case, the list has the form x : xs. First, we want to apply f to x, giving the action f x :: m b. Next, we want to recursively call mapM on xs. The result of performing the first step is a value, say y; the result of performing the second step is a list of values, say ys. So we collect y and ys into a list, then return them in the monad:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM f [] = return []
mapM f (x : xs) = f x >>= \y -> mapM f xs >>= \ys -> return (y : ys)
Now we can map a function like print, which returns an action in the IO monad, over a list of values to print: mapM print [1..10] does precisely this for the list of integers from one through ten. There is a problem, however: we aren't particularly concerned about collecting the results of printing operations; we're primarily concerned about their side effects. Instead of returning y : ys, we simply return ().
mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
mapM_ f [] = return ()
mapM_ f (x : xs) = f x >> mapM_ f xs
Note that mapM and mapM_ can be defined without explicit recursion using the sequence and sequence_ functions from the standard library, which do precisely what their names imply. Older versions of base implemented mapM and mapM_ in Control.Monad that way.
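The remark about sequence and sequence_ can be sketched as follows. The primed names are only there to avoid clashing with the Prelude's own mapM and mapM_; this is an illustration of the idea, not necessarily how current versions of base define them.

```haskell
-- | Build the list of actions with map, then run them in order and
-- collect the results.
mapM' :: Monad m => (a -> m b) -> [a] -> m [b]
mapM' f = sequence . map f

-- | Same, but discard the results and keep only the effects.
mapM_' :: Monad m => (a -> m b) -> [a] -> m ()
mapM_' f = sequence_ . map f
```

For example, mapM_' print [1, 2, 3] prints each number on its own line, just as mapM_ print does.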
Everything in Haskell is very strongly typed, including code to perform IO!
When you write print [1, 2], this is just a convenience wrapper for putStrLn (show [1, 2]), where show is a function that turns a (Show'able) object into a string. print itself doesn't do anything (in the side effect sense of do), but it outputs an IO() action, which is sort of like a mini unrun "program" (if you excuse the sloppy language), which isn't "run" at its creation time, but which can be passed around for later execution. You can verify the type in ghci
> :t print [1, 2]
print [1, 2] :: IO ()
This is just an object of type IO ().... You could throw this away right now and nothing would ever happen. More likely, if you use this object in main, the IO code will run, side effects and all.
When you map multiple putStrLn (or print) functions onto a list, you still get an object whose type you can view in ghci
> :t map print [1, 2]
map print [1, 2] :: [IO ()]
Like before, this is just an object that you can pass around, and by itself it will not do anything. But unlike before, the type is incorrect for usage in main, which expects an IO() object. In order to use it, you need to convert it to this type.
There are many ways to do this conversion.... One way that I like is the sequence function.
sequence $ map print [1, 2]
which takes a list of IO actions (i.e. mini "programs" with side effects, if you will forgive the sloppy language), and sequences them together as one IO action. This code alone will now do what you want.
As jozefg pointed out, although sequence works, sequence_ is a better choice here....
sequence not only chains the actions together into one IO action, but also puts their return values in a list.... Since print returns (), the new return value becomes a useless list of ()'s (in IO). :)
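To make the difference between sequence and sequence_ concrete, here is a small sketch; printCollect and printDiscard are hypothetical names chosen for illustration.

```haskell
-- | Runs every print and collects the results: a list of units.
printCollect :: Show a => [a] -> IO [()]
printCollect = sequence . map print

-- | Runs every print and discards the results, which is the type main wants.
printDiscard :: Show a => [a] -> IO ()
printDiscard = sequence_ . map print
```

Both print the same lines; only the result type differs, so printDiscard [1, 2] can be used directly as the body of main :: IO ().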
Using the lens library:
[1,2,3] ^! each . act print
You might write your own function, too:
Prelude> let l = [1,2]
Prelude> let f [] = return (); f (x:xs) = do print x; f xs
Prelude> f l
1
2

Using lookup with an IO list?

I am getting the contents of a file and transforming it into a list of form:
[("abc", 123), ("def", 456)]
with readFile, lines, and words.
Right now, I can manage to transform the resulting list into type IO [(String, Int)].
My problem is, when I try to make a function like this:
check x = lookup x theMap
I get this error, which I'm not too sure how to resolve:
Couldn't match expected type `[(a0, b0)]'
with actual type `IO [(String, Int)]'
In the second argument of `lookup', namely `theMap'
theMap is essentially this:
getLines :: String -> IO [String]
getLines = liftM lines . readFile
tuplify [x,y] = (x, read y :: Int)
theMap = do
  list <- getLines "./test.txt"
  let l = map tuplify (map words list)
  return l
And the file contents are:
abc 123
def 456
Can anyone explain what I'm doing wrong and/or show me a better solution? I just started toying around with monads a few hours ago and am running into a few bumps along the way.
Thanks
You will have to "unwrap" theMap from IO. Notice how you're already doing this to getLines by:
do
  list <- getLines "./test.txt"
  [...]
  return (some computation on list)
So you could have:
check x = do
  m <- theMap
  return . lookup x $ m
This is, in fact, an antipattern (albeit an illustrative one), and you would be better off using the Functor instance, i.e. check x = fmap (lookup x) theMap
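A related way to structure this is to keep the parsing pure and confine IO to the file read, so that lookup is only ever applied to a plain list. The sketch below is one possible arrangement under that assumption; parseMap and loadMap are hypothetical names, and check takes the file path as an extra argument, unlike the version in the question.

```haskell
-- | Parse lines of the form "key value" into an association list.
-- Lines without exactly two fields are skipped by the pattern match;
-- 'read' still throws if the second field is not an integer.
parseMap :: String -> [(String, Int)]
parseMap contents = [ (k, read v) | [k, v] <- map words (lines contents) ]

-- | The only IO step: read the file and map the pure parser over it.
loadMap :: FilePath -> IO [(String, Int)]
loadMap path = parseMap <$> readFile path

-- | fmap lifts the pure lookup over the IO result.
check :: FilePath -> String -> IO (Maybe Int)
check path key = lookup key <$> loadMap path
```

The result of check is still in IO, so a caller would bind it, for example r <- check "./test.txt" "abc", before inspecting the Maybe Int.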

readIO parse error haskell, this close

So I've been enjoying this challenging language, I am currently working on an assignment for school.
This is what it says: I need to prompt the user for a list of numbers, then display the average of the list. I am so close to figuring it out, but I get this weird parse error:
"Exception: user error (Prelude.readIO: no parse)"
Here is my code:
module Main (listM', diginums', getList, main) where
import System.IO
import Data.List
diginums' = []
listM' = [1, 2, 3]
average' = (sum diginums') / (fromIntegral (length diginums'))
getList :: IO [Double]
getList = readLn
main = do
  putStrLn "Please enter a few numbers"
  diginums' <- getList
  putStrLn $ show average'
Terminal Prompts : Enter a few #'s
I Enter : 123
ERROR : Exception: user error (Prelude.readIO: no parse)
I know my functions are working correctly to calculate the average. Now I think my problem is that when I take in the list of numbers from the user, I don't correctly parse them to type Double for my average function.
Your type signature says that
getList :: IO [Double]
getList = readLn
reads a list of Doubles, that means it expects input of the form
[123, 456.789, 1011.12e13]
but you gave it what could be read as a single number, "123". Thus the read fails: the input couldn't be parsed as a [Double].
If you want to parse input in a different form, not as syntactically correct Haskell values, for example as a space-separated list of numbers, you can't use readLn, but have to write a parser for the desired format yourself. For the mentioned space-separated list of numbers, that is very easy, e.g.
getList :: IO [Double]
getList = do
  input <- getLine
  let nums = words input
  return $ map read nums
If you want to get the list in the form of numbers each on its own line, ended by an empty line, you'd use a recursion with an accumulator,
getList :: IO [Double]
getList = completeList []
completeList :: [Double] -> IO [Double]
completeList acc = do
  line <- getLine
  if null line
    then return (reverse acc)
    else completeList (read line : acc)
Parsers for more complicated or less rigid formats would be harder to write.
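The parsers above use read, which throws the same kind of "no parse" exception on bad input. A hedged variant of the space-separated case, using readMaybe from Text.Read so that failure becomes an ordinary value, might look like this; parseNumbers is a hypothetical helper name, and the retry message is only an example.

```haskell
import Text.Read (readMaybe)

-- | Parse a space-separated line of numbers; Nothing if any field is not a number.
parseNumbers :: String -> Maybe [Double]
parseNumbers = traverse readMaybe . words

-- | Keep asking until the whole line parses.
getList :: IO [Double]
getList = do
  input <- getLine
  case parseNumbers input of
    Just nums -> return nums
    Nothing   -> putStrLn "Could not parse that, please try again." >> getList
```

With this version, entering 123 yields the one-element list [123.0] instead of an exception.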
When the parsing is fixed, you run into the problem that you haven't yet got used to the fact that values are immutable.
diginums' = []
listM' = [1, 2, 3]
average' = (sum diginums') / (fromIntegral (length diginums'))
defines three values, the Double value average' is defined in terms of the empty list diginums', hence its value is
sum diginums' / fromIntegral (length diginums') = 0 / 0
which is a NaN.
What you need is a function that computes the average of a list of Doubles, and apply that to the entered list in main.
average :: [Double] -> Double
average xs = sum xs / fromIntegral (length xs)
main = do
  putStrLn "Please enter a few numbers"
  numbers <- getList
  print $ average numbers
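Since the average of an empty list is 0 / 0, which is NaN as noted above, one optional refinement is to make the empty case explicit in the type. This is a sketch under that design choice, with safeAverage as a hypothetical name:

```haskell
-- | Average of a list of Doubles, or Nothing for the empty list
-- (instead of the NaN that 0 / 0 would produce).
safeAverage :: [Double] -> Maybe Double
safeAverage [] = Nothing
safeAverage xs = Just (sum xs / fromIntegral (length xs))
```

In main, the caller can then pattern match on the result and print a message when no numbers were entered.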
There are no mutable variables in Haskell, but it looks like you are trying to initialise diginums' as an empty list and then populate it with getList.
Instead, maybe you want to pass the list of numbers to average' as an argument, something like:
module Main (getList, main) where
import System.IO
import Data.List
average' ds = (sum ds) / (fromIntegral (length ds))
getList :: IO [Double]
getList = readLn
main = do
  putStrLn "Please enter a few numbers"
  diginums' <- getList
  putStrLn $ show $ average' diginums'
Also, as Daniel said, you need to input using Haskell literal List syntax, given the way you've coded it.

Resources