How can I randomly generate a string 5 symbols long in Haskell?

I'm a newbie so I'm having problems doing this:P
Currently I have tried this:
xs1 :: RandomGen g => Int -> g -> [[Char]]
xs1 n = sequence $ replicate n $ randomRs ('!', '~' ::Char)
but I can't give this string to my function:
fun1 :: (Eq a, Num a) => [a] -> [Char] -> [Char]
fun1 a xs1=[if (a!!n==1) then (xs1!!n) else b | n <- [0, 1,2,3,4]]
any help would be greatly appreciated

IMO the easiest way for a beginner to use random stuff is randomIO or randomRIO - here is an example printing 5 random characters (in the range '!' to '~'):
import Control.Monad (replicateM)
import System.Random (Random(randomRIO))
randomString :: Int -> IO String
randomString len = replicateM len $ randomRIO ('!', '~')
select :: (Num f, Eq f) => [f] -> String -> String
select =
  zipWith (\f c -> if f == 1 then c else ' ')
main :: IO ()
main = do
  -- there will be a warning without the `:: Int`
  s <- select [1 :: Int,0,1,0,1] <$> randomString 5
  putStrLn s
I'm using randomRIO here because plain randomIO would give you random characters from the whole Char range, most of them likely unprintable ;)
I don't know what you are trying to do with fun1, but I'll gladly edit if you make it clear.

How can I randomly generate a string 5 symbols long ?
Due to the use of randomRs, your xs1 function returns a list of infinitely long strings.
Also, as xs1 is a function not a list, you cannot really index it thru the !! operator.
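To make the first point concrete, here is a minimal sketch (not the asker's code) of getting a finite string out of randomRs by cutting its infinite output with take. The names randomString5 and sample are made up for illustration, and the seed 42 is arbitrary.

```haskell
import System.Random (StdGen, mkStdGen, randomRs)

-- Hypothetical helper: randomRs yields an infinite list of characters,
-- so we keep only the first five of them.
randomString5 :: StdGen -> String
randomString5 = take 5 . randomRs ('!', '~')

-- Example value built from a fixed (arbitrary) seed.
sample :: String
sample = randomString5 (mkStdGen 42)
```

The result is an ordinary String, so it can be passed to any function expecting [Char]. The price is that the updated generator is lost.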
Please please: paste the full text of the error message in your question, rather than just writing:
“I can't give this string to my function:”.
In order to generate random strings, you need two things:
a generator, typically returned by mkStdGen or mkTFGen
a monadic action object with the appropriate return type
Let's try to do that under the ghci Haskell interpreter:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> import System.Random
λ> import Control.Monad.Random
λ>
So let's get a generator first:
λ>
λ> seed = 42
λ> gen1 = mkStdGen seed
λ>
Then let's write the action object:
λ>
λ> act5 = sequence $ replicate 5 (getRandomR ('!', '~' ::Char))
λ>
λ> :type act5
act5 :: MonadRandom m => m [Char]
λ>
The action object is essentially a wrapper around a transition function that takes a generator and returns some (output value, new generator) pair.
Finally, we combine these two things thru the runRand :: Rand g a -> g -> (a, g) library function, which returns both the value you want and an updated generator:
λ>
λ> (str1, gen2) = runRand act5 gen1
λ>
λ> str1
"D6bK/"
λ>
And we can even get a second pseudo-random string, courtesy of the updated generator:
λ>
λ> (str2, gen3) = runRand act5 gen2
λ>
λ> str2
"|}#5h"
λ>
And so on ... As you can see, it is not really necessary to use the IO monad for this.
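The generator threading that runRand performs can also be written out by hand. The sketch below assumes only randomR from System.Random. The function name randomChars is hypothetical and not part of any library.

```haskell
import System.Random (StdGen, mkStdGen, randomR)

-- Hypothetical helper: draw n characters from the '!'..'~' range,
-- returning the string together with the updated generator.
randomChars :: Int -> StdGen -> (String, StdGen)
randomChars n g
  | n <= 0    = ([], g)
  | otherwise =
      let (c, g')   = randomR ('!', '~') g       -- one character, new generator
          (cs, g'') = randomChars (n - 1) g'     -- the rest, threading g' along
      in  (c : cs, g'')

-- Two successive strings from one (arbitrary) seed.
twoStrings :: (String, String)
twoStrings =
  let (s1, g1) = randomChars 5 (mkStdGen 42)
      (s2, _)  = randomChars 5 g1
  in  (s1, s2)
```

This is the same (output value, new generator) pattern described above, only without the Rand wrapper hiding the plumbing.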
A slight generalization:
Now, say you need 6 such random strings:
λ>
λ> act5rep = sequence (replicate 6 act5)
λ>
λ> :type act5rep
act5rep :: MonadRandom m => m [[Char]]
λ>
λ> (strs, newGen) = runRand act5rep gen1
λ>
λ> strs
["D6bK/","|}#5h","0|qSQ","h4:.1","3+e-}","}eu,I"]
λ>

Related

How can I create a random 2d array in Haskell?

I am fairly new to Haskell and I'm always confused when it comes to dealing with random values. This time I'm trying to create a 5x5 2d-array in which every cell contains a random value between 4 options.
data Content = Bomb | One | Two | Three deriving (Eq, Show)
data Board = Array (Int, Int) Cell
data Cell = Cell {
content :: Content,
state :: CellState,
notes :: Notes
}
type Notes = [String]
type CellState = Bool
then in the main function I typed
main :: IO()
main = do
let cellMatrix = createMatrix
print cellMatrix
How can I create a function that creates a Board, and where do I need to do all the g<-newStdGen stuff?
Any advice would be greatly appreciated!
Focusing solely on the randomness aspect of the problem here, we see that the System.Random module in its version 1.2 provides a uniformR function that can produce one pseudo-random integer value from a specified range. Here, we need 5*5=25 values from the (0,3) range.
Side note: beware that the changes between random v1.1 and v1.2 are massive.
Alternatively, the same module provides a randomRs function, but that one produces an infinite list of integer values, and hence cannot return an updated generator. That might not be what you require.
We can use as our workhorse a function that returns a finite count of random integer values, together with an updated generator:
import System.Random
getManyInts :: RandomGen g => g -> Int -> (Int,Int) -> ([Int],g)
getManyInts g0 count range =
    if (count <= 0) then ([], g0)
    else let
            (v,g1) = uniformR range g0
            (vs,gf) = getManyInts g1 (count-1) range
         in
            (v:vs, gf)
In order to navigate from integer to Enum values, we will use a slightly improved Content type:
data Content = Bomb | One | Two | Three
deriving (Eq, Show, Enum, Bounded)
At that point, we can write a small test program that provides 5*5=25 random values of Content type:
main :: IO ()
main = do
    g0 <- newStdGen
    let count = 5*5
        (loC, hiC) = (minBound :: Content, maxBound :: Content)
        intRange = (fromEnum loC, fromEnum hiC)
        (xs, g1) = getManyInts g0 count intRange
        cs = (map toEnum xs) :: [Content]
    putStrLn $ "contents: " ++ (show cs)
Test program output:
contents: [Two,Three,One,One,One,Two,One,Two,One,Bomb,Two,Two,One,Three,Bomb,Two,Two,Bomb,Three,One,One,Bomb,One,One,Bomb]
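Since the question asks for a 5x5 board rather than a flat list, one possible way to arrange such a list is sketched below. It assumes Data.Array from the array package is acceptable. The helper name toGrid is hypothetical.

```haskell
import Data.Array (Array, bounds, listArray, (!))

-- Hypothetical helper: lay out the first n*n elements of a list as an
-- n-by-n array indexed from (0,0) to (n-1,n-1), filled in row-major order.
toGrid :: Int -> [a] -> Array (Int, Int) a
toGrid n = listArray ((0, 0), (n - 1, n - 1))
```

With the cs list from the test program, toGrid 5 cs would give an Array (Int, Int) Content. Wrapping each value in a Cell record is then a matter of mapping over the list first.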
Addendum:
A better, polymorphic function:
Next, we can provide a function that returns a list of random values for any similar type:
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE ExplicitForAll #-}
getManyEnums :: forall g e. (RandomGen g, Bounded e, Enum e) => g -> Int -> ([e],g)
getManyEnums g0 count = (map toEnum xs, g1)
    where
        intRange = (fromEnum (minBound :: e), fromEnum (maxBound :: e))
        (xs, g1) = getManyInts g0 count intRange
This function can be specialized as required. For example, testing under ghci:
$ ghci
GHCi, version 8.10.5: https://www.haskell.org/ghc/ :? for help
...
λ>
λ> :load q73268595.hs
[1 of 1] Compiling Main ( q73268595.hs, interpreted )
Ok, one module loaded.
λ>
λ> getManyContents = getManyEnums :: StdGen -> Int -> ([Content],StdGen)
λ>
λ> g0 <- newStdGen
λ>
λ> (cs,gf) = getManyContents g0 10
λ>
λ> cs
[One,One,Two,One,One,Three,Three,Three,Bomb,One]
λ>
λ> :q
Leaving GHCi.
$

Tuple initialization from IO data in Haskell

I would like to know what is the best way to get a tuple from data read from the input in Haskell. I often encounter this problem in competitive programming when the input is made up of several lines that contain space-separated integers. Here is an example:
1 3 10
2 5 8
10 11 0
0 0 0
To read lines of integers, I use the following function:
readInts :: IO [Int]
readInts = fmap (map read . words) getLine
Then, I transform these lists into tuples of the appropriate size:
readInts :: IO (Int, Int, Int, Int)
readInts = fmap ((\l -> (l !! 0, l !! 1, l !! 2, l !! 3)) . map read . words) getLine
This approach does not seem very idiomatic to me.
The following syntax is more readable but it only works for 2-tuples:
readInts :: IO (Int, Int)
readInts = fmap ((\[x, y] -> (x, y)) . map read . words) getLine
(EDIT: as noted in the comments, the solution above works for n-tuples in general).
Is there an idiomatic way to initialize tuples from lists of integers without having to use !! in Haskell? Alternatively, is there a different approach to processing this type of input?
How about this:
readInts :: IO (<any tuple you like>)
readInts = read . ("(" ++) . (++ ")") . intercalate "," . words <$> getLine
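To see what this one-liner does, here is a pure variant that works on a String instead of reading from stdin. The name readTuple is made up for this sketch. It needs intercalate from Data.List, and like read itself it fails at run time on malformed input.

```haskell
import Data.List (intercalate)

-- Hypothetical pure variant: turn "1 3 10" into "(1,3,10)" and let the
-- Read instance of the requested tuple type do the parsing.
readTuple :: Read t => String -> t
readTuple = read . ("(" ++) . (++ ")") . intercalate "," . words

-- The tuple size is chosen by the type annotation at the use site.
example :: (Int, Int, Int)
example = readTuple "1 3 10"
```

The same function reads pairs, 4-tuples and so on, as long as the number of words on the line matches the tuple type.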
Given that the context is 'competitive programming' (something I'm only dimly aware of as a concept), I'm not sure that the following offers a particularly competitive alternative, but IMHO I'd consider it idiomatic to use one of several available parser combinators.
The base package comes with a module called Text.ParserCombinators.ReadP. Here's how you could use it to parse the input file from the linked article:
module Q57693986 where
import Text.ParserCombinators.ReadP
parseNumber :: ReadP Integer
parseNumber = read <$> munch1 (`elem` ['0'..'9'])
parseTriple :: ReadP (Integer, Integer, Integer)
parseTriple =
(,,) <$> parseNumber <*> (char ' ' *> parseNumber) <*> (char ' ' *> parseNumber)
parseLine :: ReadS (Integer, Integer, Integer)
parseLine = readP_to_S (parseTriple <* eof)
parseInput :: String -> [(Integer, Integer, Integer)]
parseInput = concatMap (fmap fst . filter (null . snd)) . fmap parseLine . lines
You can use the parseInput against this input file:
1 3 10
2 5 8
10 11 0
0 0 0
Here's a GHCi session that parses that file:
*Q57693986> parseInput <$> readFile "57693986.txt"
[(1,3,10),(2,5,8),(10,11,0),(0,0,0)]
Each parseLine function produces a list of tuples that match the parser; e.g.:
*Q57693986> parseLine "11 32 923"
[((11,32,923),"")]
The second element of the tuple is any remaining String still waiting to be parsed. In the above example, parseLine has completely consumed the line, which is what I'd expect for well-formed input, so the remaining String is empty.
The parser returns a list of alternatives if there's more than one way the input could be consumed by the parser, but again, in the above example, there's only one suggested alternative, as the line has been fully consumed.
The parseInput function throws away any tuple that hasn't been fully consumed, and then picks only the first element of any remaining tuples.
This approach has often served me with puzzles such as Advent of Code, where the input files tend to be well-formed.
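The remark about a list of alternatives can be illustrated with a small sketch using only Text.ParserCombinators.ReadP from base. The names greedyDigits and allPrefixes are hypothetical. munch1 consumes greedily and yields a single result, while many1 (satisfy isDigit) reports every way of stopping early.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (many1, munch1, readP_to_S, satisfy)

-- Greedy: takes as many digits as possible, so there is one alternative.
greedyDigits :: String -> [(String, String)]
greedyDigits = readP_to_S (munch1 isDigit)

-- Non-greedy: every non-empty prefix of digits is a valid parse, so each
-- one shows up as a separate (result, remaining input) alternative.
allPrefixes :: String -> [(String, String)]
allPrefixes = readP_to_S (many1 (satisfy isDigit))
```

On the input "123", greedyDigits returns one pair with an empty remainder, whereas allPrefixes returns three pairs. This is why parseInput filters on an empty remainder.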
This is a way to generate a parser that works generically for any tuple (of reasonable size). It requires the library generics-sop.
{-# LANGUAGE DeriveGeneric, DeriveAnyClass,
FlexibleContexts, TypeFamilies, TypeApplications #-}
import GHC.Generics
import Generics.SOP
import Generics.SOP (hsequence, hcpure,Proxy,to,SOP(SOP),NS(Z),IsProductType,All)
import Data.Char
import Text.ParserCombinators.ReadP
import Text.ParserCombinators.ReadPrec
import Text.Read
componentP :: Read a => ReadP a
componentP = munch isSpace *> readPrec_to_P readPrec 1
productP :: (IsProductType a xs, All Read xs) => ReadP a
productP =
    let parserOutside = hsequence (hcpure (Proxy @Read) componentP)
     in Generics.SOP.to . SOP . Z <$> parserOutside
For example:
*Main> productP @(Int,Int,Int) `readP_to_S` " 1 2 3 "
[((1,2,3)," ")]
It allows components of different types, as long as they all have a Read instance.
It also parses records that have a Generics.SOP.Generic instance:
data Stuff = Stuff { x :: Int, y :: Bool }
deriving (Show,GHC.Generics.Generic,Generics.SOP.Generic)
For example:
*Main> productP @Stuff `readP_to_S` " 1 True"
[(Stuff {x = 1, y = True},"")]

Arbitrary String generator in Haskell (Test.QuickCheck.Gen)

I am struggling with the QuickCheck generator implementation for an algebraic data type in Real World Haskell, Chapter 11.
Following the book implementation (which was published in 2008), I came up with the following:
-- file: ch11/Prettify2.hs
module Prettify2(
Doc(..)
) where
data Doc = Empty
| Char Char
| Text String
| Line
| Concat Doc Doc
| Union Doc Doc
deriving (Show, Eq)
And my Arbitrary implementation:
-- file: ch11/Arbitrary.hs
import System.Random
import Test.QuickCheck.Gen
import qualified Test.QuickCheck.Arbitrary
class Arbitrary a where
arbitrary :: Gen a
-- elements' :: [a] => Gen a {- Expected a constraint, but ‘[a]’ has kind ‘*’ -}
-- choose' :: Random a => (a, a) -> Gen a
-- oneof' :: [Gen a] -> a
data Ternary = Yes
| No
| Unknown
deriving(Eq, Show)
instance Arbitrary Ternary where
arbitrary = do
n <- choose (0, 2) :: Gen Int
return $ case n of
0 -> Yes
1 -> No
_ -> Unknown
instance (Arbitrary a, Arbitrary b) => Arbitrary (a, b) where
arbitrary = do
x <- arbitrary
y <- arbitrary
return (x, y)
instance Arbitrary Char where
    arbitrary = elements (['A'..'Z'] ++ ['a' .. 'z'] ++ " ~!@#$%^&*()")
I tried the following two implementations with no success:
import Prettify2
import Control.Monad( liftM, liftM2 )
instance Arbitrary Doc where
arbitrary = do
n <- choose (1,6) :: Gen Int
case n of
1 -> return Empty
2 -> do x <- arbitrary
return (Char x)
3 -> do x <- arbitrary
return (Text x)
4 -> return Line
5 -> do x <- arbitrary
y <- arbitrary
return (Concat x y)
6 -> do x <- arbitrary
y <- arbitrary
return (Union x y)
instance Arbitrary Doc where
arbitrary =
oneof [ return Empty
, liftM Char arbitrary
, liftM Text arbitrary
, return Line
, liftM2 Concat arbitrary arbitrary
, liftM2 Union arbitrary arbitrary ]
But neither compiles, because there is no instance for (Arbitrary String).
I then tried to implement the instance for Arbitrary String in the following ways:
import qualified Test.QuickCheck.Arbitrary, but that does not provide an Arbitrary String instance either
installing Test.RandomStrings from Hackage:
instance Arbitrary String where
arbitrary = do
n <- choose (8, 16) :: Gen Int
return $ randomWord randomASCII n :: Gen String
This fails with the following error:
$ ghci
GHCi, version 7.10.3: http://www.haskell.org/ghc/ :? for help
Prelude> :l Arbitrary.hs
[1 of 2] Compiling Prettify2 ( Prettify2.hs, interpreted )
[2 of 2] Compiling Main ( Arbitrary.hs, interpreted )
Arbitrary.hs:76:9:
The last statement in a 'do' block must be an expression
return <- randomWord randomASCII n :: Gen String
Failed, modules loaded: Prettify2
Would you have any good suggestion about how to implement this particular generator and - more in general - how to proceed in these cases?
Thank you in advance
Don't define a new Arbitrary type class; import Test.QuickCheck instead. It defines most of these instances for you. Also be careful about the QuickCheck version: RWH assumes version 1.
The resulting full implementation will be:
-- file: ch11/Arbitrary.hs
import Test.QuickCheck
import Prettify2
import Control.Monad( liftM, liftM2 )
data Ternary = Yes
             | No
             | Unknown
             deriving(Eq, Show)
instance Arbitrary Ternary where
    arbitrary = do
        n <- choose (0, 2) :: Gen Int
        return $ case n of
            0 -> Yes
            1 -> No
            _ -> Unknown
instance Arbitrary Doc where
    arbitrary =
        oneof [ return Empty
              , liftM Char arbitrary
              , liftM Text arbitrary
              , return Line
              , liftM2 Concat arbitrary arbitrary
              , liftM2 Union arbitrary arbitrary ]

Why does this not run in constant memory?

I am trying to write a very large amount of data to a file in constant memory.
import qualified Data.ByteString.Lazy as B
{- Creates and writes num grids of dimensions aa x aa -}
writeGrids :: Int -> Int -> IO ()
writeGrids num aa = do
rng <- newPureMT
let (grids,shuffleds) = createGrids rng aa
createDirectoryIfMissing True "data/grids/"
B.writeFile (gridFileName num aa)
(encode (take num grids))
B.writeFile (shuffledFileName num aa)
(encode (take num shuffleds))
However this consumes memory proportional to the size of num. I know createGrids is a sufficiently lazy function because I have tested it by appending error "not lazy enough" (as suggested by the Haskell wiki here) to the end of the lists it returns and no errors are raised. take is a lazy function that is defined in Data.List. encode is also a lazy function defined in Data.Binary. B.writeFile is defined in Data.ByteString.Lazy.
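The laziness check mentioned above can be sketched in miniature as follows. This is an illustration with made-up names (booby, firstThree), not the actual createGrids test: a list whose tail is an error is consumed only as far as take needs.

```haskell
-- Hypothetical stand-in for a lazily produced list: a finite prefix
-- followed by a tail that raises an error if anything ever forces it.
booby :: [Int]
booby = [1, 2, 3] ++ error "not lazy enough"

-- take 3 stops before touching the tail, so no error is raised here.
firstThree :: [Int]
firstThree = take 3 booby
```

If a consumer were stricter than necessary, for example by asking for the length of booby, the error would fire. That is the signal the check relies on.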
Here is the complete code so you can execute it:
import Control.Arrow (first)
import Data.Binary
import GHC.Float (double2Float)
import System.Random (next)
import System.Random.Mersenne.Pure64 (PureMT, newPureMT, randomDouble)
import System.Random.Shuffle (shuffle')
import qualified Data.ByteString.Lazy as B
main :: IO ()
main = writeGrids 1000 64
{- Creates and writes num grids of dimensions aa x aa -}
writeGrids :: Int -> Int -> IO ()
writeGrids num aa = do
rng <- newPureMT
let (grids,shuffleds) = createGrids rng aa
B.writeFile "grids.bin" (encode (take num grids))
B.writeFile "shuffleds.bin" (encode (take num shuffleds))
{- a random number generator, dimension of grids to make
returns a pair of lists, the first is a list of grids of dimensions
aa x aa, the second is a list of the shuffled grids corresponding to the first list -}
createGrids :: PureMT -> Int -> ([[(Float,Float)]],[[(Float,Float)]])
createGrids rng aa = (grids,shuffleds) where
rs = randomFloats rng
grids = map (getGridR aa) (chunksOf (2 * aa * aa) rs)
shuffleds = shuffler (aa * aa) rng grids
{- length of each grid, a random number generator, a list of grids
returns the list with each grid shuffled -}
shuffler :: Int -> PureMT -> [[(Float,Float)]] -> [[(Float,Float)]]
shuffler n rng (xs:xss) = shuffle' xs n rng : shuffler n (snd (next rng)) xss
shuffler _ _ [] = []
{- divides list into chunks of size n -}
chunksOf :: Int -> [a] -> [[a]]
chunksOf n = go
where go xs = case splitAt n xs of
(ys,zs) | null ys -> []
| otherwise -> ys : go zs
{- dimension of grid, list of random floats [0,1]
returns a list of (x,y) points of length n^2 such that all
points are in the range [0,1] and the points are a randomly
perturbed regular grid -}
getGridR :: Int -> [Float] -> [(Float,Float)]
getGridR n rs = pts where
nn = n * n
(irs,jrs) = splitAt nn rs
n' = fromIntegral n
grid = [ (p,q) | p <- [0..n'-1], q <- [0..n'-1] ]
pts = zipWith (\(p,q) (ir,jr) -> ((p+ir)/n',(q+jr)/n')) grid (zip irs jrs)
{- an infinite list of random floats in range [0,1] -}
randomFloats :: PureMT -> [Float]
randomFloats rng = let (d,rng') = first double2Float (randomDouble rng)
in d : randomFloats rng'
The required packages are:
, bytestring
, binary
, random
, mersenne-random-pure64
, random-shuffle
Two reasons for the memory usage:
First, Data.Binary.encode doesn't seem to run in constant space. The following program uses 910 MB memory:
import Data.Binary
import qualified Data.ByteString.Lazy as B
len = 10000000 :: Int
main = B.writeFile "grids.bin" $ encode [0..len]
If we drop one 0 from len (so that it is 1000000), we get 97 MB memory usage.
In contrast, the following program uses 1 MB:
import qualified Data.ByteString.Lazy.Char8 as B
main = B.writeFile "grids.bin" $ B.pack $ show [0..(1000000::Int)]
Second, in your program shuffleds contains references to the contents of grids, which prevents garbage collection of grids. So when we write grids, we also evaluate it, and it then has to sit in memory until we finish writing shuffleds. The following version of your program still consumes lots of memory, but it uses constant space if we comment out one of the two lines with B.writeFile.
import qualified Data.ByteString.Lazy.Char8 as B
writeGrids :: Int -> Int -> IO ()
writeGrids num aa = do
rng <- newPureMT
let (grids,shuffleds) = createGrids rng aa
B.writeFile "grids.bin" (B.pack $ show (take num grids))
B.writeFile "shuffleds.bin" (B.pack $ show (take num shuffleds))
For what it's worth, here is a full solution combining the ideas of everyone here. Memory consumption is constant at ~6MB (compiled with -O2).
import Control.Arrow (first)
import Control.Monad.State (state, evalState)
import Data.Binary
import GHC.Float (double2Float)
import System.Random (next)
import System.Random.Mersenne.Pure64 (PureMT, newPureMT, randomDouble)
import System.Random.Shuffle (shuffle')
import qualified Data.ByteString as B (hPut)
import qualified Pipes.Binary as P (encode)
import qualified Pipes.Prelude as P (zip, mapM, drain)
import Pipes (runEffect, (>->))
import System.IO (withFile, IOMode(AppendMode))
main :: IO ()
main = writeGrids 1000 64
{- Creates and writes num grids of dimensions aa x aa -}
writeGrids :: Int -> Int -> IO ()
writeGrids num aa = do
rng <- newPureMT
let (grids, shuffleds) = createGrids rng aa
gridFile = "grids.bin"
shuffledFile = "shuffleds.bin"
encoder = P.encode . SerList . take num
writeFile gridFile ""
writeFile shuffledFile ""
withFile gridFile AppendMode $ \hGr ->
withFile shuffledFile AppendMode $ \hSh ->
runEffect
$ P.zip (encoder grids) (encoder shuffleds)
>-> P.mapM (\(ch1, ch2) -> B.hPut hGr ch1 >> B.hPut hSh ch2)
>-> P.drain -- discards the stream of () results.
{- a random number generator, dimension of grids to make
returns a pair of lists, the first is a list of grids of dimensions
aa x aa, the second is a list of the shuffled grids corresponding to the first list -}
createGrids :: PureMT -> Int -> ( [[(Float,Float)]], [[(Float,Float)]] )
createGrids rng aa = unzip gridsAndShuffleds where
rs = randomFloats rng
grids = map (getGridR aa) (chunksOf (2 * aa * aa) rs)
gridsAndShuffleds = shuffler (aa * aa) rng grids
{- length of each grid, a random number generator, a list of grids
returns the list with each grid shuffled -}
shuffler :: Int -> PureMT -> [[(Float,Float)]] -> [( [(Float,Float)], [(Float,Float)] )]
shuffler n rng xss = evalState (traverse oneShuffle xss) rng
where
oneShuffle xs = state $ \r -> ((xs, shuffle' xs n r), snd (next r))
newtype SerList a = SerList { runSerList :: [a] }
    deriving (Show)
instance Binary a => Binary (SerList a) where
    put (SerList (x:xs)) = put False >> put x >> put (SerList xs)
    put _ = put True
    get = do
        stop <- get :: Get Bool
        if stop
            then return (SerList [])
            else do
                x <- get
                SerList xs <- get
                return (SerList (x : xs))
{- divides list into chunks of size n -}
chunksOf :: Int -> [a] -> [[a]]
chunksOf n = go
where go xs = case splitAt n xs of
(ys,zs) | null ys -> []
| otherwise -> ys : go zs
{- dimension of grid, list of random floats [0,1]
returns a list of (x,y) points of length n^2 such that all
points are in the range [0,1] and the points are a randomly
perturbed regular grid -}
getGridR :: Int -> [Float] -> [(Float,Float)]
getGridR n rs = pts where
nn = n * n
(irs,jrs) = splitAt nn rs
n' = fromIntegral n
grid = [ (p,q) | p <- [0..n'-1], q <- [0..n'-1] ]
pts = zipWith (\(p,q) (ir,jr) -> ((p+ir)/n',(q+jr)/n')) grid (zip irs jrs)
{- an infinite list of random floats in range [0,1] -}
randomFloats :: PureMT -> [Float]
randomFloats rng = let (d,rng') = first double2Float (randomDouble rng)
in d : randomFloats rng'
Comments on the changes:
shuffler is now a traversal with the State functor. It produces, in a single pass through the input list, a list of pairs, in which each grid is paired with its shuffled version. createGrids then (lazily) unzips this list.
The files are written to using pipes machinery, in a way loosely inspired by this answer (I originally wrote this using P.foldM). Note that the hPut I used is the strict bytestring one, for it acts on strict chunks supplied by the producer made with P.zip (which, in spirit, is a pair of lazy bytestrings that supplies chunks in pairs).
SerList is there to hold the custom Binary instance Thomas M. DuBuisson alludes to. Note that I haven't thought too much about laziness and strictness in the get method of the instance. If that causes you trouble, this question looks useful.

Haskell GHCi - Using EOF character on stdin with getContents

I like to parse strings ad hoc in Python by just pasting into the interpreter.
>>> s = """Adams, John
... Washington,George
... Lincoln,Abraham
... Jefferson, Thomas
... """
>>> print "\n".join(x.split(",")[1].replace(" ", "")
for x in s.strip().split("\n"))
John
George
Abraham
Thomas
This works great using the Python interpreter, but I'd like to do this with Haskell/GHCi. Problem is, I can't paste multi-line strings. I can use getContents with an EOF character, but I can only do it once since the EOF character closes stdin.
Prelude> s <- getContents
Prelude> s
"Adams, John
Adams, John\nWashington,George
Washington,George\nLincoln,Abraham
Lincoln,Abraham\nJefferson, Thomas
Jefferson, Thomas\n^Z
"
Prelude> :{
Prelude| putStr $ unlines $ map ((filter (`notElem` ", "))
Prelude| . snd . (break (==','))) $ lines s
Prelude| :}
John
George
Abraham
Thomas
Prelude> x <- getContents
*** Exception: <stdin>: hGetContents: illegal operation (handle is closed)
Is there a better way to go about doing this with GHCi? Note - my understanding of getContents (and Haskell IO in general) is probably severely broken.
UPDATED
I will be playing with the answers I have received. Here are some helper functions I made (plagiarized) that simulate Python's """ quoting (by ending with """, not starting) from ephemient's answer.
getLinesWhile :: (String -> Bool) -> IO String
getLinesWhile p = liftM unlines $ takeWhileM p (repeat getLine)
getLines :: IO String
getLines = getLinesWhile (/="\"\"\"")
To use AndrewC's answer in GHCi -
C:\...\code\haskell> ghci HereDoc.hs -XQuasiQuotes
ghci> :{
*HereDoc| let s = [heredoc|
*HereDoc| Adams, John
*HereDoc| Washington,George
*HereDoc| Lincoln,Abraham
*HereDoc| Jefferson, Thomas
*HereDoc| |]
*HereDoc| :}
ghci> putStrLn s
Adams, John
Washington,George
Lincoln,Abraham
Jefferson, Thomas
ghci> :{
*HereDoc| putStr $ unlines $ map ((filter (`notElem` ", "))
*HereDoc| . snd . (break (==','))) $ lines s
*HereDoc| :}
John
George
Abraham
Thomas
getContents == hGetContents stdin. Unfortunately, hGetContents marks its handle as (semi-)closed, which means anything attempting to read from stdin ever again will fail.
Does it suffice to simply read up to an empty line or some other marker, never closing stdin?
takeWhileM :: Monad m => (a -> Bool) -> [m a] -> m [a]
takeWhileM p (ma : mas) = do
    a <- ma
    if p a
      then liftM (a :) $ takeWhileM p mas
      else return []
takeWhileM _ _ = return []
ghci> liftM unlines $ takeWhileM (not . null) (repeat getLine)
Adams, John
Washington, George
Lincoln, Abraham
Jefferson, Thomas
"Adams, John\nWashington, George\nLincoln, Abraham\nJefferson, Thomas\n"
ghci>
If you do this a lot, and you're writing helper functions in some module anyway, why not go the whole hog and use your editor for the raw data too:
{-# LANGUAGE TemplateHaskell, QuasiQuotes #-}
module ParseAdHoc where
import HereDoc
import Data.Char (isSpace)
import Data.List (intercalate,intersperse) -- other handy helpers
-- ------------------------------------------------------
-- edit this bit every time you do your ad-hoc parsing
adhoc :: String -> String
adhoc = head . splitOn ',' . rmspace
input = [heredoc|
Adams, John
Washington,George
Lincoln,Abraham
Jefferson, Thomas
|]
-- ------------------------------------------------------
-- add other helpers you'll reuse here
main = mapM_ putStrLn.map adhoc.lines $ input
rmspace = filter (not.isSpace)
splitWith :: (a -> Bool) -> [a] -> [[a]] -- splits using a function that tells you when
splitWith isSplitter list = case dropWhile isSplitter list of
[] -> []
thisbit -> firstchunk : splitWith isSplitter therest
where (firstchunk, therest) = break isSplitter thisbit
splitOn :: Eq a => a -> [a] -> [[a]] -- splits on the given item
splitOn c = splitWith (== c)
splitsOn :: Eq a => [a] -> [a] -> [[a]] -- splits on any of the given items
splitsOn chars = splitWith (`elem` chars)
It would be easier to use takeWhile (/=',') instead of head . splitOn ',', but I thought that splitOn will be more useful to you in the future.
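For comparison, the takeWhile variant mentioned here could look like the sketch below. The name surname is invented for this example. It keeps everything before the first comma after stripping whitespace.

```haskell
import Data.Char (isSpace)

-- Hypothetical alternative to `head . splitOn ',' . rmspace`:
-- drop all whitespace, then keep the characters before the first comma.
surname :: String -> String
surname = takeWhile (/= ',') . filter (not . isSpace)
```

Unlike head, takeWhile is total: on a line without a comma it simply returns the whole (whitespace-stripped) line, and on an empty line it returns the empty string.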
This uses a helper module, HereDoc, that lets you paste multiline string literals into your code (like perl's <<"EOF" or python's """). I can't remember how I found out how to do this, but I've tweaked it to drop whitespace-only first and last lines, so I can start and end my data with a newline.
module HereDoc where
import Language.Haskell.TH
import Language.Haskell.TH.Quote
import Data.Char (isSpace)
{-
example1 = [heredoc|Hi.
This is a multi-line string.
It should appear as an ordinary string literal.
Remember you can only use a QuasiQuoter
in a different module, so import this HereDoc module
into something else and don't forget the
{-# LANGUAGE TemplateHaskell, QuasiQuotes #-}|]
example2 = [heredoc|
This heredoc has no newline characters in it because empty or whitespace-only first and last lines are ignored
|]
-}
heredoc = QuasiQuoter {quoteExp = stringE.topAndTail,
quotePat = litP . stringL,
quoteType = undefined,
quoteDec = undefined}
topAndTail = myunlines.tidyend.tidyfront.lines
tidyfront :: [String] -> [String]
tidyfront [] = []
tidyfront (xs:xss) | all isSpace xs = xss
| otherwise = xs:xss
tidyend :: [String] -> [String]
tidyend [] = []
tidyend [xs] | all isSpace xs = []
| otherwise = [xs]
tidyend (xs:xss) = xs:tidyend xss
myunlines :: [String] -> String
myunlines [] = ""
myunlines (l:ls) = l ++ concatMap ('\n':) ls
You might find Data.Text a good source of (inspiration for) helper functions:
http://hackage.haskell.org/packages/archive/text/latest/doc/html/Data-Text.html
