I'm using the lines functionality to take an input and split up many variables before sending it off to a function. Please look at the run function and tell me why I get the following error. It seems like it should just assign the first string in ln to seq, but I get an error.
ERROR:dishonest.hs:33:11:
Couldn't match expected type `[t]' against inferred type `Char'
In a 'do' expression: seq <- ln !! 0
In the expression:
do ln <- lines s
seq <- ln !! 0
states <- ln !! 1
l1 <- listDouble (ln !! 2)
....
In the definition of `run':
run s = do ln <- lines s
seq <- ln !! 0
states <- ln !! 1
....
code follows...
import Char
maximumInd :: (Double, Double) -> Int
maximumInd (d1,d2) | maximum [d1,d2] == d1 = 1
                   | maximum [d1,d2] == d2 = 2
scoreFunction :: String -> Int -> [Double] -> [Double] -> Double -> Double -> (Double,Double)
scoreFunction string (-1) l1 l2 t1 t2 = (0.5, 0.5)
scoreFunction string index l1 l2 t1 t2 = ((fst (scoreFunction string (index-1) l1 l2 t1 t2)) * (l1!!num) * (tr (maximumInd (scoreFunction string (index-1) l1 l2 t1 t2))!!1), (snd (scoreFunction string (index-1) l1 l2 t1 t2)) * (l2!!num) * (tr (maximumInd (scoreFunction string (index-1) l1 l2 t1 t2))!!2))
    where
        num = digitToInt (string!!index)
        tr n | n == 1 = l1
             | n == 2 = l2
--split is stolen from teh webs http://julipedia.blogspot.com/2006/08/split-function-in-haskell.html
split :: String -> Char -> [String]
split [] delim = [""]
split (c:cs) delim
    | c == delim = "" : rest
    | otherwise = (c : head rest) : tail rest
    where
        rest = split cs delim
readDouble :: String -> Double
readDouble s = read s :: Double
listDouble :: String -> [Double]
listDouble s = map readDouble $ split s ' '
run :: String -> String
run s = do
    ln <- lines s
    seq <- ln!!0
    states <- ln!!1
    l1 <- listDouble (ln!!2)
    l2 <- listDouble (ln!!3)
    tr1 <- readDouble (ln!!4)
    tr2 <- readDouble (ln!!5)
    show maximumInd (scoreFunction seq (length seq) l1 l2 tr1 tr2)
main = do
    putStrLn "Please compose a test job for Viterbi."
    putStrLn "First line: A sequence with language [1,9]."
    putStrLn "Second line: The number of states."
    putStrLn "For the next 2 lines: space delimited emission probabilities."
    putStrLn "For the 2 lines after that, transmission probabilities."
    putStrLn "Then do ./casino < filename "
    interact run
First, let's look at how the compiler is interpreting it:
run :: String -> String
String is in fact [Char].
run s = do
    ln <- lines s
    ...
Simplifying things a lot, a do block must "run" in a Monad. This means that it "returns" a value of type (Monad t) => t a. Since this function is returning [Char], the do block will return [Char], meaning the Monad is [] (if you read [a] as [] a, it will be more clear).
Copying from another answer of mine,
Simplifying things a lot, on a do block on the IO monad, every line is either:
Something which returns a value of the "IO a" type; the value of the "a" type within it is discarded (so the "a" is often "()")
A <- expression, which does the same thing but instead of discarding the value of the "a" type gives it the name to the left of the <-
A let, which does nothing more than give a name to a value
Here we are not on the IO Monad, but on the [] Monad. So the expression to the right of the <- must be a [a].
So, in the first line of the do block:
ln <- lines s
Here the type is [[Char]], and so the type of ln is [Char].
On the next line:
seq <- ln!!0
Here ln!!0 has type Char, but since you are in the [] Monad, it is expecting a list of some sort. This is what causes the compiler's error message.
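To see the list monad's behaviour in isolation, here is a minimal sketch. The name firstChars is hypothetical and not part of the question's code. Each <- draws elements out of a list, so binding from a String yields one Char at a time:

```haskell
-- Hypothetical helper: collect the first character of every line.
firstChars :: String -> String
firstChars s = do
  ln <- lines s     -- lines s :: [String], so ln :: String
  c  <- take 1 ln   -- take 1 ln :: String, so c :: Char
  return c          -- return in the list monad builds a one-element list
```

With this definition, firstChars "ab\ncd" evaluates to "ac". The right-hand side of every <- is a list, which is exactly what ln!!0 (a Char) fails to be.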
The solution is to, instead of using the do notation, use a plain let block:
run :: String -> String
run s = let
        ln = lines s
        seq = ln!!0
        states = ln!!1
        l1 = listDouble (ln!!2)
        l2 = listDouble (ln!!3)
        tr1 = readDouble (ln!!4)
        tr2 = readDouble (ln!!5)
    in show (maximumInd (scoreFunction seq (length seq) l1 l2 tr1 tr2))
I did not compile this block, but even if there is something else wrong with it, it should be enough to get you going again.
I'm not sure if this is right, but the issue might lie in the fact that <- isn't an assignment operator, as you seem to be using it; it essentially unpacks a value from a monad. But I'm not really sure if that's the cause of your issue or not.
Yeah, I think mipadi is right. The do notation translates into >>= and return calls in the list monad.
run s = do
    ln <- lines s
    seq <- ln!!0
    states <- ln!!1
This takes the list returned by lines s, and for the seq and states lines, ln is bound to one string of that list at a time. So ln!!0 actually gives you the first character of that string, whereas a list is required on the right-hand side of the <- there. That's about all I remember; it has been quite a while since I did this kind of thing with Haskell :)
Remember that lists are monads in haskell, with the definition:
instance Monad [] where
    m >>= f = concatMap f m
    return x = [x]
    fail s = []
So if you take your code, which goes something like:
do {ln <- lines "hello, world"; ln!!0}
That is equivalent to the following using bind notation:
lines "hello, world" >>= (\ln -> ln!!0)
or more concisely:
lines "hello, world" >>= (!!0)
We can now use the definition of the list monad to re-write that as the following:
concatMap (!!0) (lines "hello, world")
Which is equivalent to:
concat $ map (!!0) (lines "hello, world")
lines "hello, world" will return ["hello, world"], so mapping (!!0) over it will produce the string "h". That has type [Char], but concat requires a type [[t]]. Char does not match [t], hence the error.
Try using a let or something rather than do notation.
Edit:
So I think this is what you want, using let rather than do.
run :: String -> String
run s = let ln = lines s
            seq = ln!!0
            states = ln!!1
            l1 = listDouble (ln!!2)
            l2 = listDouble (ln!!3)
            tr1 = readDouble (ln!!4)
            tr2 = readDouble (ln!!5)
        in show $ maximumInd (scoreFunction seq (length seq) l1 l2 tr1 tr2)
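As a hedged variation on the let version, the input lines can be pattern matched instead of indexed with !!, so input with too few lines yields Nothing rather than a runtime error. The name parseJob and the tuple layout are assumptions made for this sketch, and read still fails on malformed numbers:

```haskell
-- Hypothetical sketch: destructure the six expected input lines at once.
-- Returns Nothing when fewer than six lines are supplied.
parseJob :: String -> Maybe (String, [Double], [Double], Double, Double)
parseJob s = case lines s of
  (sq : _states : e1 : e2 : t1 : t2 : _) ->
    Just (sq, doubles e1, doubles e2, read t1, read t2)
  _ -> Nothing
  where
    -- space-delimited numbers on one line
    doubles = map read . words
```

The result can then be passed to scoreFunction in the same way as the let-bound names above.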
run is of type String -> String, so you probably don't want the do notation[1]. I'd advise you to do this:
1. Comment out everything below the listDouble function, load that, and be sure that it compiles.
2. Add a test value that's formatted like the file you expect. Something like:
t = "[1,9]\n3\n1.0 1.0 1.0\n1.0 1.0 1.0\n1.0\n1.0"
3. Add test values at the top level for the values you're defining in run:
ln = lines t
seq = ln!!0
states = ln!!1
l1 = listDouble (ln!!2)
l2 = listDouble (ln!!3)
tr1 = readDouble (ln!!4)
tr2 = readDouble (ln!!5)
4. Use the type signature of scoreFunction to guide you in building the arguments to that function, then the rest of run, and finally main.
Learn to use an interpreter, such as Hugs or ghci. Learn the :r and :t commands. For example (I'm using currying to give some but not all of a function's arguments):
:t scoreFunction
:t scoreFunction ""
:t scoreFunction 3445
You can use this to have the system help you determine if you're on the right track.
Doing this at the top level will introduce a conflict with the Prelude.seq function: either rename your seq, or reference yours as Main.seq.
Haskell is notorious for error messages that are inscrutable to beginners, so I'd recommend periodically rolling back to a version that compiles, either by commenting out your current experiments (which is what I had you do in step 1 above), or using your editor's undo function.
[1]I say probably because Strings, being lists of Characters, are instances of the Monad class, but that's fairly advanced
I'm trying to cut chunks from a list, with a given predicate. I would have preferred to use a double character, e.g. ~/, but have resolved to just using $. What I essentially want to do is this...
A: "Hello, my $name is$ Danny and I $like$ Haskell"
What I want to turn this into is this:
B: "Hello, my Danny and I Haskell"
So I want to strip everything in between the given symbol, $, or my first preference was ~/, if I can figure it out. What I tried was this:
s1 :: String -> String
s1 xs = takeWhile (/= '$') xs
s2 :: String -> String
s2 xs = dropWhile (/= '$') xs
s3 :: String -> String
s3 xs = s3 $ s2 $ s1 xs
This solution seems to just bug my IDE out (possibly infinite looping).
Solution:
s3 :: String -> String
s3 xs
  | '$' `notElem` xs = xs
  | otherwise = takeWhile (/= '$') xs ++ (s3 $ s1 xs)
s1 :: String -> String
s1 xs = drop 1 $ dropWhile (/= '$') $ tail $ snd $ break ('$'==) xs
This seems like a nice application for parsers. A solution using trifecta:
import Control.Applicative
import Data.Foldable
import Data.Functor
import Text.Trifecta
input :: String
input = "Hello, my $name is$ Danny and I $like$ Haskell"
cutChunk :: CharParsing f => f String
cutChunk = "" <$ (char '$' *> many (notChar '$') <* char '$')
cutChunk matches $, followed by 0 or more (many) non-$ characters, then another $. Then we use ("" <$) to make this parser's value always be the empty string, thus discarding all the characters that this parser matches.
includeChunk :: CharParsing f => f String
includeChunk = some (notChar '$')
includeChunk matches the text that we want to include in the result, which is anything that's not the $ character. It's important that we use some (matching one or more characters) and not many (matching zero or more characters) because we're going to include this parser within another many expression next; if this parser matched on the empty string, then that could loop infinitely.
chunks :: CharParsing f => f String
chunks = fold <$> many (cutChunk <|> includeChunk)
chunks is the parser for everything. Read <|> as "or", as in "parse either a cutChunk or an includeChunk". many (cutChunk <|> includeChunk) is a parser that produces a list of chunks e.g. Success ["Hello, my ",""," Danny and I ",""," Haskell"], so we fold the output to concatenate those chunks together into a single string.
result :: Result String
result = parseString chunks mempty input
The result:
Success "Hello, my Danny and I Haskell"
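If installing trifecta is not an option, the same three-parser structure can be sketched with ReadP, which ships with base. This is a swapped-in library rather than the answer's own code: munch and munch1 stand in for many (notChar '$') and some (notChar '$'), and stripChunks is a hypothetical wrapper name.

```haskell
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, many, munch, munch1, readP_to_S, (+++))

-- A '$'-delimited region, replaced by the empty string.
cutChunk :: ReadP String
cutChunk = "" <$ (char '$' *> munch (/= '$') <* char '$')

-- One or more characters outside any '$' region.
includeChunk :: ReadP String
includeChunk = munch1 (/= '$')

-- Any sequence of chunks, concatenated; eof demands that all input is used.
chunks :: ReadP String
chunks = concat <$> many (cutChunk +++ includeChunk) <* eof

-- Hypothetical wrapper: Nothing when a '$' is left unclosed.
stripChunks :: String -> Maybe String
stripChunks s = case readP_to_S chunks s of
  [(r, "")] -> Just r
  _         -> Nothing
```

As with the trifecta version, includeChunk must consume at least one character so that many cannot loop on an empty match.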
Your infinite loop comes from calling s3 recursively with no base case:
s3 :: String -> String
s3 xs = s3 $ s2 $ s1 xs
Adding a base case corrects the infinite loop:
s3 xs
| '$' `notElem` xs = xs
| otherwise = ...
This is not the whole answer. Think about what s1 actually does and where you use its return value:
s1 "hello $my name is$ ThreeFx" == "hello "
For further reference, see the break function:
break :: (a -> Bool) -> [a] -> ([a], [a])
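One way to put the base case and break together is sketched below. The name stripDollar is hypothetical, and the choice to drop everything after an unclosed $ is an assumption of this sketch, not something the question specifies:

```haskell
-- Hypothetical sketch: keep the text before each '$', skip to the closing
-- '$', and recurse on whatever follows it.
stripDollar :: String -> String
stripDollar xs = case break (== '$') xs of
  (kept, [])       -> kept                       -- base case: no '$' left
  (kept, _ : rest) ->
    kept ++ stripDollar (drop 1 (dropWhile (/= '$') rest))
```

Every recursive call receives a strictly shorter string, so the recursion terminates.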
I think your logic is wrong; it is perhaps easier to write it in an elementary way:
Prelude> let pr xs = go xs True
Prelude|       where go [] _ = []
Prelude|             go (x:xs) f | x=='$' = go xs (not f)
Prelude|                         | f = x : go xs f
Prelude|                         | otherwise = go xs f
Prelude|
Prelude> pr "Hello, my $name is$ Danny and I $like$ Haskell"
"Hello, my Danny and I Haskell"
Explanation: the flag f keeps track of the state (pass mode or skip mode). If the current character is the delimiter, skip it and switch state.
I just started learning Haskell and got my first project working today. It's a small program that uses Network.HTTP.Conduit and Graphics.Rendering.Chart (haskell-chart) to plot the number of Google search results for a specific question with a changing number in it.
My problem is that simpleHttp from Network.HTTP.Conduit returns a monadic value (I hope I understood the concept of monads right...), but I only want to use the ByteString inside of it, which contains the HTML code of the website. So until now I use download = unsafePerformIO $ simpleHttp url to use it later without caring about the monad. I guess that's not the best way to do that.
So: Is there any better solution so that I don't have to carry the monad with me the whole evaluation? Or would it be better to leave it the way the result is returned (with the monad)?
Here's the full program - the mentioned line is in getResultCounter. If things are coded not-so-well and could be done way better, please remark that too:
import System.IO.Unsafe
import Network.HTTP.Conduit (simpleHttp)
import qualified Data.ByteString.Lazy.Char8 as L
import Graphics.Rendering.Chart.Easy
import Graphics.Rendering.Chart.Backend.Cairo
numchars :: [Char]
numchars = "1234567890"
isNum :: Char -> Bool
isNum = (\x -> x `elem` numchars)
main = do
    putStrLn "Please input your Search (The first 'X' is going to be replaced): "
    search <- getLine
    putStrLn "X ranges from: "
    from <- getLine
    putStrLn "To: "
    to <- getLine
    putStrLn "In steps of (Only whole numbers are accepted):"
    step <- getLine
    putStrLn "Please have some patience..."
    let range = [read from,(read from + read step)..read to] :: [Int]
    let searches = map (replaceX search) range
    let res = map getResultCounter searches
    plotList search ([(zip range res)] :: [[(Int,Integer)]])
    putStrLn "Done."
-- Creates a plot from the given data
plotList name dat = toFile def (name++".png") $ do
    layout_title .= name
    plot (line "Results" dat)
-- Calls the Google-site and returns the number of results
getResultCounter :: String -> Integer
getResultCounter search = read $ filter isNum $ L.unpack parse :: Integer
    where url = "http://www.google.de/search?q=" ++ search
          download = unsafePerformIO $ simpleHttp url -- Not good
          parse = takeByteStringUntil "<"
                $ dropByteStringUntil "id=\"resultStats\">" download
-- Drops a ByteString until the desired String is found
dropByteStringUntil :: String -> L.ByteString -> L.ByteString
dropByteStringUntil str cont = helper str cont 0
    where helper s bs n | (bs == L.empty) = L.empty
                        | (n >= length s) = bs
                        | ((s !! n) == L.head bs) = helper s (L.tail bs) (n+1)
                        | ((s !! n) /= L.head bs) = helper s (L.tail bs) 0
-- Takes a ByteString until the desired String is found
takeByteStringUntil :: String -> L.ByteString -> L.ByteString
takeByteStringUntil str cont = helper str cont 0
    where helper s bs n | bs == L.empty = bs
                        | n >= length s = L.empty
                        | s !! n == L.head bs = L.head bs `L.cons`
                              helper s (L.tail bs) (n + 1)
                        | s !! n /= L.head bs = L.head bs `L.cons`
                              helper s (L.tail bs) 0
-- Replaces the first 'X' in a string with the show value of the given value
replaceX :: (Show a) => String -> a -> String
replaceX str x | str == "" = ""
               | head str == 'X' = show x ++ tail str
               | otherwise = head str : replaceX (tail str) x
This is a lie:
getResultCounter :: String -> Integer
The type signature above is promising that the resulting integer only depends on the input string, when this is not the case: Google can add/remove results from one call to the other, affecting the output.
Making the type more honest, we get
getResultCounter :: String -> IO Integer
This honestly admits it's going to interact with the external world. The code then is easily adapted to:
getResultCounter search = do
    let url = "http://www.google.de/search?q=" ++ search
    download <- simpleHttp url -- perform IO here
    let parse = takeByteStringUntil "<"
              $ dropByteStringUntil "id=\"resultStats\">" download
    return (read $ filter isNum $ L.unpack parse :: Integer)
Above, I tried to preserve the original structure of the code.
Now, in main we can no longer do
let res = map getResultCounter searches
but we can do
res <- mapM getResultCounter searches
mapM is exported by the Prelude, so no extra import is needed.
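A further step in the same direction, sketched under assumptions: keep the digit extraction pure and confine IO to a thin wrapper, so the parsing can be tested without a network. Here fetchSnippet is a hypothetical stand-in for the download (a real version would call simpleHttp), and countFromSnippet and resultCounts are illustrative names.

```haskell
import Data.Char (isDigit)

-- Pure part: pull the digits out of a snippet such as "About 1,230 results".
-- Returns 0 when the snippet contains no digits at all.
countFromSnippet :: String -> Integer
countFromSnippet s = case filter isDigit s of
  "" -> 0
  ds -> read ds

-- Hypothetical stand-in for the download; it performs no real network access.
fetchSnippet :: String -> IO String
fetchSnippet query = return ("About " ++ show (length query) ++ " results")

-- IO part: run the action for every query and collect the results in order.
resultCounts :: [String] -> IO [Integer]
resultCounts = mapM (fmap countFromSnippet . fetchSnippet)
```

Only resultCounts and fetchSnippet carry IO in their types; countFromSnippet can be checked directly in ghci.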
Basically I would like to find a way so that a user can enter the number of test cases and then input their test cases. The program can then run those test cases and print out the results in the order that the test cases appear.
So basically I have main which reads in the number of test cases and inputs it into a function that will read from IO that many times. It looks like this:
main = getLine >>= \tst -> w (read :: String -> Int) tst [[]]
This is the method signature of w: w :: Int -> [[Int]]-> IO ()
So my plan is to read in the number of test cases and have w run a function which takes in each test case and store the result into the [[]] variable. So each list in the list will be an output. w will just run recursively until it reaches 0 and print out each list on a separate line. I'd like to know if there is a better way of doing this since I have to pass in an empty list into w, which seems extraneous.
As @bheklilr mentioned, you can't update a value like [[]]. The standard functional approach is to pass an accumulator through a set of recursive calls. In the following example the acc parameter to the loop function is this accumulator: it consists of all of the output collected so far. At the end of the loop we return it.
myTest :: Int -> [String]
myTest n = [ "output line " ++ show k ++ " for n = " ++ show n | k <- [1..n] ]
main = do
  putStr "Enter number of test cases: "
  ntests <- fmap read getLine :: IO Int
  let loop k acc | k > ntests = return $ reverse acc
      loop k acc = do
        -- we're on the kth-iteration
        putStr $ "Enter parameter for test case " ++ show k ++ ": "
        a <- fmap read getLine :: IO Int
        let output = myTest a -- run the test
        loop (k+1) (output:acc)
  allOutput <- loop 1 []
  print allOutput
As you get more comfortable with this kind of pattern you'll recognize it as a fold (indeed a monadic fold since we're doing IO) and you can implement it with foldM.
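A minimal sketch of that foldM formulation follows. The name collectOutputs is hypothetical, and the step function is left abstract so the same code works in IO or in any other monad:

```haskell
import Control.Monad (foldM)

-- Run a monadic step for k = 1..n, threading the accumulator through foldM.
-- Results are consed onto the front, so the list is reversed at the end.
collectOutputs :: Monad m => (Int -> m a) -> Int -> m [a]
collectOutputs step n = fmap reverse (foldM go [] [1 .. n])
  where
    go acc k = do
      out <- step k
      return (out : acc)
```

In the program above, step would be the action that prompts for the kth parameter and runs myTest on it.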
Update: To help explain how fmap works, here are equivalent expressions written without using fmap:
With fmap:                                  Without fmap:
n <- fmap read getLine :: IO Int            line <- getLine
                                            let n = read line :: Int
vals <- fmap (map read . words) getLine     line <- getLine
          :: IO [Int]                       let vals = (map read . words) line :: [Int]
Using fmap allows us to eliminate the intermediate variable line which we never reference again anyway. We still need to provide a type signature so read knows what to do.
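Because fmap works for any Functor and not only IO, the same transformation can be tried on a Maybe value in ghci without typing any input. The names parseInts and parseIntsIn are hypothetical and exist only for this sketch:

```haskell
-- Pure function: split a line on whitespace and read each word as an Int.
parseInts :: String -> [Int]
parseInts = map read . words

-- The same function applied inside an arbitrary functor (IO, Maybe, ...).
parseIntsIn :: Functor f => f String -> f [Int]
parseIntsIn = fmap parseInts
```

Applied to getLine, parseIntsIn has type IO [Int]; applied to Just "1 2 3" it gives Just [1,2,3].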
The idiomatic way is to use replicateM:
import Control.Monad (replicateM)
runAllTests :: [[Int]] -> IO ()
runAllTests = {- ... -}
main = do
    numTests <- readLn
    tests <- replicateM numTests readLn
    runAllTests tests
-- or:
-- main = readLn >>= flip replicateM readLn >>= runAllTests
I use System.Random and System.Random.Shuffle to shuffle the order of characters in a string, I shuffle it using:
shuffle' string (length string) g
g being a getStdGen.
Now the problem is that the shuffle can result in an order that's identical to the original order, resulting in a string that isn't really shuffled. When this happens I want to shuffle it recursively until it hits a shuffled string that's not the original string (which should usually happen on the first or second try), but this means I need to create a new random number generator on each recursion so it won't just shuffle it exactly the same way every time.
But how do I do that? Defining a
newg = newStdGen
in "where", and using it results in:
Jumble.hs:20:14:
Could not deduce (RandomGen (IO StdGen))
arising from a use of shuffle'
from the context (Eq a)
bound by the inferred type of
shuffleString :: Eq a => IO StdGen -> [a] -> [a]
at Jumble.hs:(15,1)-(22,18)
Possible fix:
add an instance declaration for (RandomGen (IO StdGen))
In the expression: shuffle' string (length string) g
In an equation for `shuffled':
shuffled = shuffle' string (length string) g
In an equation for `shuffleString':
shuffleString g string
= if shuffled == original then
shuffleString newg shuffled
else
shuffled
where
shuffled = shuffle' string (length string) g
original = string
newg = newStdGen
Jumble.hs:38:30:
Couldn't match expected type `IO StdGen' with actual type `StdGen'
In the first argument of `jumble', namely `g'
In the first argument of `map', namely `(jumble g)'
In the expression: (map (jumble g) word_list)
I'm very new to Haskell and functional programming in general and have only learned the basics, one thing that might be relevant which I don't know yet is the difference between "x = value", "x <- value", and "let x = value".
Complete code:
import System.Random
import System.Random.Shuffle
middle :: [Char] -> [Char]
middle word
    | length word >= 4 = (init (tail word))
    | otherwise = word
shuffleString g string =
    if shuffled == original
        then shuffleString g shuffled
        else shuffled
    where
        shuffled = shuffle' string (length string) g
        original = string
jumble g word
    | length word >= 4 = h ++ m ++ l
    | otherwise = word
    where
        h = [(head word)]
        m = (shuffleString g (middle word))
        l = [(last word)]
main = do
    g <- getStdGen
    putStrLn "Hello, what would you like to jumble?"
    text <- getLine
    -- let text = "Example text"
    let word_list = words text
    let jumbled = (map (jumble g) word_list)
    let output = unwords jumbled
    putStrLn output
This is pretty simple. You know that g has type StdGen, which is an instance of the RandomGen typeclass. The RandomGen typeclass has the functions next :: g -> (Int, g), genRange :: g -> (Int, Int), and split :: g -> (g, g). Two of these functions return a new random generator, namely next and split. For your purposes, you can use either quite easily to get a new generator, but I would just recommend using next for simplicity. You could rewrite your shuffleString function to something like
shuffleString :: RandomGen g => g -> String -> String
shuffleString g string =
    if shuffled == original
        then shuffleString (snd $ next g) shuffled
        else shuffled
    where
        shuffled = shuffle' string (length string) g
        original = string
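One caveat with any retry loop of this kind: a string whose characters are all equal (for example "aa") has no different ordering, so the recursion never ends. The sketch below guards against that and abstracts the shuffle into a step function that returns the next generator. Both names are hypothetical, and rotateStep is only a deterministic stand-in so the example needs no packages. With System.Random one could plausibly build the step from split and shuffle', but that wiring is an assumption and not shown.

```haskell
-- Retry a shuffling step until the result differs from the input,
-- threading the generator state g through each attempt.
shuffleUntilDifferent :: Eq a => (g -> [a] -> ([a], g)) -> g -> [a] -> [a]
shuffleUntilDifferent step g0 xs
  | allSame xs = xs            -- no different ordering exists; stop here
  | otherwise  = go g0
  where
    go g =
      let (ys, g') = step g xs
      in if ys == xs then go g' else ys
    allSame []       = True
    allSame (y : ys) = all (== y) ys

-- Stand-in "shuffle": rotate by the counter, then increment the counter.
rotateStep :: Int -> [a] -> ([a], Int)
rotateStep n xs = (drop k xs ++ take k xs, n + 1)
  where k = if null xs then 0 else n `mod` length xs
```

The guard only covers inputs with no alternative ordering; a step function that never produces a different result would still loop.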
End of answer to this question
One thing that might be relevant which I don't know yet is the difference between "x = value", "x <- value", and "let x = value".
These three different forms of assignment are used in different contexts. At the top level of your code, you can define functions and values using the simple x = value syntax. These statements are not being "executed" inside any context other than the current module, and most people would find it pedantic to have to write
module Main where
    let main :: IO ()
        main = do
            putStrLn "Hello, World"
            putStrLn "Exiting now"
since there isn't any ambiguity at this level. It also helps to delimit this context since it is only at the top level that you can declare data types, type aliases, and type classes, these can not be declared inside functions.
The second form, let x = value, actually comes in two variants, the let x = value in <expr> inside pure functions, and simply let x = value inside monadic functions (do notation). For example:
myFunc :: Int -> Int
myFunc x =
    let y = x + 2
        z = y * y
    in z * z
Lets you store intermediate results, so you get a faster execution than
myFuncBad :: Int -> Int
myFuncBad x = (x + 2) * (x + 2) * (x + 2) * (x + 2)
But the former is also equivalent to
myFunc :: Int -> Int
myFunc x = z * z
    where
        y = x + 2
        z = y * y
There are subtle differences between let ... in ... and where ..., but you don't need to worry about them at this point, other than that the following is only possible using let ... in ..., not where ...:
myFunc x = (\y -> let z = y * y in z * z) (x + 2)
The let ... syntax (without the in ...) is used only in monadic do notation to perform much the same purpose, but usually using values bound inside it:
something :: IO Int
something = do
    putStr "Enter an int: "
    x <- getLine
    let y = myFunc (read x)
    return (y * y)
This simply makes y available to all subsequent statements in the function, and the in ... part is not needed because it's not ambiguous at this point.
The final form of x <- value is used especially in monadic do notation, and is specifically for extracting a value out of its monadic context. That may sound complicated, so here's a simple example. Take the function getLine. It has the type IO String, meaning it performs an IO action that returns a String. The types IO String and String are not the same, you can't call length getLine, because length doesn't work for IO String, but it does for String. However, we frequently want that String value inside the IO context, without having to worry about it being wrapped in the IO monad. This is what the <- is for. In this function
main = do
    line <- getLine
    print (length line)
getLine still has the type IO String, but line now has the type String, and can be fed into functions that expect a String. Whenever you see x <- something, the something is a monadic context, and x is the value being extracted from that context.
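The three forms can be seen side by side in a small sketch that uses Maybe instead of IO, so it can be evaluated in ghci without any input. All three names are hypothetical:

```haskell
-- Top-level definition with plain "=".
double :: Int -> Int
double x = x * 2

-- A computation that may fail: halving only succeeds for even numbers.
safeHalf :: Int -> Maybe Int
safeHalf n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

halveDoubleInc :: Int -> Maybe Int
halveDoubleInc n = do
  h <- safeHalf n     -- "<-" extracts the Int from a Maybe Int
  let d = double h    -- "let" names a pure value; nothing is extracted
  return (d + 1)
```

Here halveDoubleInc 4 is Just 5, while halveDoubleInc 3 is Nothing because the <- line has nothing to extract.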
So why does Haskell have so many different ways of defining values? It all comes down to its type system, which tries really hard to ensure that you can't accidentally launch the missiles, or corrupt a file system, or do something you didn't really intend to do. It also helps to visually separate what is an action, and what is a computation in source code, so that at a glance you can tell if an action is being performed or not. It does take a while to get used to, and there are probably valid arguments that it could be simplified, but changing anything would also break backwards compatibility.
And that concludes today's episode of Way Too Much Information(tm)
(Note: To other readers, if I've said something incorrect or potentially misleading, please feel free to edit or leave a comment pointing out the mistake. I don't pretend to be perfect in my descriptions of Haskell syntax.)
So I've read the theory, now trying to parse a file in Haskell - but am not getting anywhere. This is just so weird...
Here is how my input file looks:
m n
k1, k2...
a11, ..., a1n
a21, ..., a2n
...
am1... amn
Where m,n are just integers, K = [k1, k2...] is a list of integers, and a11..amn is a "matrix" (a list of lists): A=[[a11,...a1n], ... [am1... amn]]
Here is my quick python version:
def parse(filename):
    """
    Input of the form:
    m n
    k1, k2...
    a11, ..., a1n
    a21, ..., a2n
    ...
    am1... amn
    """
    f = open(filename)
    (m,n) = f.readline().split()
    m = int(m)
    n = int(n)
    K = [int(k) for k in f.readline().split()]
    # Matrix - list of lists
    A = []
    for i in range(m):
        row = [float(el) for el in f.readline().split()]
        A.append(row)
    return (m, n, K, A)
And here is how (not very) far I got in Haskell:
import System.Environment
import Data.List
main = do
    (fname:_) <- getArgs
    putStrLn fname --since putStrLn goes to IO ()monad we can't just apply it
    parsed <- parse fname
    putStrLn parsed
parse fname = do
    contents <- readFile fname
    -- ,,,missing stuff... ??? how can I get first "element" and match on it?
    return contents
I am getting confused by monads (and the context that they trap me in!), and the do statement. I really want to write something like this, but I know it's wrong:
firstLine <- contents.head
(m,n) <- map read (words firstLine)
because contents is not a list - but a monad.
Any help on the next step would be great.
So I've just discovered that you can do:
liftM lines . readFile
to get a list of lines from a file. However, the example still only transforms the ENTIRE file, and doesn't use just the first or the second lines...
The very simple version could be:
import Control.Monad (liftM)
-- this operates purely on list of strings
-- and also will fail horribly when passed something that doesn't
-- match the pattern
parse_lines :: [String] -> (Int, Int, [Int], [[Int]])
parse_lines (mn_line : ks_line : matrix_lines) = (m, n, ks, matrix)
    where [m, n] = read_ints mn_line
          ks = read_ints ks_line
          matrix = parse_matrix matrix_lines
-- this here is to loop through remaining lines to form a matrix
parse_matrix :: [String] -> [[Int]]
parse_matrix lines = parse_matrix' lines []
    where parse_matrix' [] acc = reverse acc
          parse_matrix' (l : ls) acc = parse_matrix' ls $ (read_ints l) : acc
-- this here is to give proper signature for read
read_ints :: String -> [Int]
read_ints = map read . words
-- this reads the file contents and lifts the result into IO
parse_file :: FilePath -> IO (Int, Int, [Int], [[Int]])
parse_file filename = do
    file_lines <- (liftM lines . readFile) filename
    return $ parse_lines file_lines
You might want to look into Parsec for fancier parsing, with better error handling.
*Main Control.Monad> parse_file "test.txt"
(3,3,[1,2,3],[[1,2,3],[4,5,6],[7,8,9]])
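Short of a full parsing library, the "fail horribly" cases can be turned into Nothing using readMaybe from Text.Read. This is a sketch under assumptions: parseLinesSafe and readInts are hypothetical names, and rejecting rows whose length differs from n is a choice made here, not part of the original format description.

```haskell
import Text.Read (readMaybe)

-- Read every whitespace-separated word as an Int, or fail as a whole.
readInts :: String -> Maybe [Int]
readInts = traverse readMaybe . words

-- Total variant of parse_lines: any malformed or missing line gives Nothing.
parseLinesSafe :: [String] -> Maybe (Int, Int, [Int], [[Int]])
parseLinesSafe (mnLine : ksLine : matrixLines) = do
  [m, n] <- readInts mnLine              -- a failed pattern match is Nothing
  ks     <- readInts ksLine
  rows   <- traverse readInts (take m matrixLines)
  if length rows == m && all ((== n) . length) rows
    then Just (m, n, ks, rows)
    else Nothing
parseLinesSafe _ = Nothing
```

Because the function stays pure, parse_file could call it in place of parse_lines and decide in IO how to report a Nothing.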
An easy to write solution
import Control.Monad (replicateM)
-- Read space separated words on a line from stdin
readMany :: Read a => IO [a]
readMany = fmap (map read . words) getLine
parse :: IO (Int, Int, [Int], [[Int]])
parse = do
    [m, n] <- readMany
    ks <- readMany
    xss <- replicateM m readMany
    return (m, n, ks, xss)
Let's try it:
*Main> parse
2 2
123 321
1 2
3 4
(2,2,[123,321],[[1,2],[3,4]])
While the code I presented is quite expressive (that is, you get work done quickly with little code), it has some bad properties. Still, if you are learning Haskell and haven't started with parser libraries yet, I think this is the way to go.
Two bad properties of my solution:
All code is in IO, nothing is testable in isolation
The error handling is very bad, as you see the pattern matching is very aggressive in [m, n]. What happens if we have 3 elements on the first line of the input file?
liftM is not magic! You would think it does some arcane thing to lift a function f into a monad but it is actually just defined as:
liftM f x = do
    y <- x
    return (f y)
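To check that this definition really is all there is to it, here is the same function under the hypothetical name liftM' (renamed only to avoid clashing with Control.Monad), tried in two monads other than IO:

```haskell
-- Same definition as above, with an explicit type signature.
liftM' :: Monad m => (a -> b) -> m a -> m b
liftM' f x = do
  y <- x
  return (f y)
```

In ghci, liftM' (+1) (Just 1) gives Just 2 and liftM' length ["ab", "c"] gives [2,1], which is the same behaviour as fmap.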
We could actually use liftM to do what you wanted to, that is:
[m,n] <- liftM (map read . words . head . lines) (readFile fname)
but what you are looking for are let statements:
parseLine = map read . words
parse fname = do
    (x:y:xs) <- liftM lines (readFile fname)
    let [m,n] = parseLine x
    let ks = parseLine y
    let matrix = map parseLine xs
    return (m,n,ks,matrix)
As you can see, we can use let to mean variable assignment rather than monadic computation. In fact, let statements are just let expressions when we desugar the do notation:
parse fname =
    liftM lines (readFile fname) >>= (\(x:y:xs) ->
        let [m,n] = parseLine x
            ks = parseLine y
            matrix = map parseLine xs
        in return (m,n,ks,matrix) )
A Solution Using a Parsing Library
Since you'll probably have a number of people responding with code that parses strings of Ints into [[Int]] (map (map read . words) . lines $ contents), I'll skip that and introduce one of the parsing libraries. If you were to do this task for real work you'd probably use such a library that parses ByteString (instead of String, which means your IO reads everything into a linked list of individual characters).
import System.Environment
import Control.Monad
import Data.Attoparsec.ByteString.Char8
import qualified Data.ByteString as B
First, I imported the Attoparsec and bytestring libraries. You can see these libraries and their documentation on hackage and install them using the cabal tool.
main = do
    (fname:_) <- getArgs
    putStrLn fname
    parsed <- parseX fname
    print parsed
main is basically unchanged.
parseX :: FilePath -> IO (Int, Int, [Int], [[Int]])
parseX fname = do
    bs <- B.readFile fname
    let res = parseOnly parseDrozzy bs
    -- We spew the error messages right here
    either (error . show) return res
parseX (renamed from parse to avoid name collision) uses the bytestring library's readFile, which reads the file into packed, contiguous bytes instead of into cells of a linked list. After parsing I use a little shorthand to return the result if the parser returned Right result, or raise an error if the parser returned a value of Left someErrorMessage.
-- Helper functions, more basic than you might think, but lets ignore it
sint = skipSpace >> int
int = liftM floor number
parseDrozzy :: Parser (Int, Int, [Int], [[Int]])
parseDrozzy = do
    m <- sint
    n <- sint
    skipSpace
    ks <- manyTill sint endOfLine
    arr <- count m (count n sint)
    return (m,n,ks,arr)
The real work then happens in parseDrozzy. We get our m and n Int values using the above helper. In most Haskell parsing libraries we must explicitly handle whitespace - so I skip the newline after n to get to our ks. ks is just all the int values before the next newline. Now we can actually use the previously specified number of rows and columns to get our array.
Technically speaking, that final bit arr <- count m (count n sint) doesn't follow your format. It will grab n ints even if it means going to the next line. We could copy Python's behavior (not verifying the number of values in a row) using count m (manyTill sint endOfLine) or we could check for each end of line more explicitly and return an error if we are short on elements.
From Lists to a Matrix
Lists of lists are not 2 dimensional arrays - the space and performance characteristics are completely different. Let's pack our list into a real matrix using Data.Array.Repa (import Data.Array.Repa). This will allow us to access the elements of the array efficiently as well as perform operations on the entire matrix, optionally spreading the work among all the available CPUs.
Repa defines the dimensions of your array using a slightly odd syntax. If your row and column lengths are in variables m and n then Z :. m :. n is much like the C declaration int arr[m][n]. For the one dimensional example, ks, we have:
fromList (Z :. (length ks)) ks
Which changes our type from [Int] to Array DIM1 Int.
For the two dimensional array we have:
let matrix = fromList (Z :. m :. n) (concat arr)
And change our type from [[Int]] to Array DIM2 Int.
So there you have it. A parsing of your file format into an efficient Haskell data structure using production-oriented libraries.
What about something simple like this?
parse :: String -> (Int, Int, [Int], [[Int]])
parse stuff = (m, n, ks, xss)
    where (line1:line2:rest) = lines stuff
          readMany = map read . words
          (m:n:_) = readMany line1
          ks = readMany line2
          xss = take m $ map (take n . readMany) rest
main :: IO ()
main = do
    stuff <- getContents
    let (m, n, ks, xss) = parse stuff
    print m
    print n
    print ks
    print xss