Haskell GHCi - Using EOF character on stdin with getContents

I like to parse strings ad hoc in Python by just pasting into the interpreter.
>>> s = """Adams, John
... Washington,George
... Lincoln,Abraham
... Jefferson, Thomas
... """
>>> print "\n".join(x.split(",")[1].replace(" ", "")
...                 for x in s.strip().split("\n"))
John
George
Abraham
Thomas
This works great using the Python interpreter, but I'd like to do this with Haskell/GHCi. Problem is, I can't paste multi-line strings. I can use getContents with an EOF character, but I can only do it once since the EOF character closes stdin.
Prelude> s <- getContents
Prelude> s
"Adams, John
Adams, John\nWashington,George
Washington,George\nLincoln,Abraham
Lincoln,Abraham\nJefferson, Thomas
Jefferson, Thomas\n^Z
"
Prelude> :{
Prelude| putStr $ unlines $ map ((filter (`notElem` ", "))
Prelude| . snd . (break (==','))) $ lines s
Prelude| :}
John
George
Abraham
Thomas
Prelude> x <- getContents
*** Exception: <stdin>: hGetContents: illegal operation (handle is closed)
Is there a better way to go about doing this with GHCi? Note - my understanding of getContents (and Haskell IO in general) is probably severely broken.
UPDATED
I will be playing with the answers I have received. Here are some helper functions I made (plagiarized) that simulate Python's """ quoting (by ending with """, not starting) from ephemient's answer.
getLinesWhile :: (String -> Bool) -> IO String
getLinesWhile p = liftM unlines $ takeWhileM p (repeat getLine)
getLines :: IO String
getLines = getLinesWhile (/="\"\"\"")
To use AndrewC's answer in GHCi -
C:\...\code\haskell> ghci HereDoc.hs -XQuasiQuotes
ghci> :{
*HereDoc| let s = [heredoc|
*HereDoc| Adams, John
*HereDoc| Washington,George
*HereDoc| Lincoln,Abraham
*HereDoc| Jefferson, Thomas
*HereDoc| |]
*HereDoc| :}
ghci> putStrLn s
Adams, John
Washington,George
Lincoln,Abraham
Jefferson, Thomas
ghci> :{
*HereDoc| putStr $ unlines $ map ((filter (`notElem` ", "))
*HereDoc| . snd . (break (==','))) $ lines s
*HereDoc| :}
John
George
Abraham
Thomas

getContents == hGetContents stdin. Unfortunately, hGetContents marks its handle as (semi-)closed, which means anything attempting to read from stdin ever again will fail.
Does it suffice to simply read up to an empty line or some other marker, never closing stdin?
-- needs: import Control.Monad (liftM)
takeWhileM :: Monad m => (a -> Bool) -> [m a] -> m [a]
takeWhileM p (ma : mas) = do
    a <- ma
    if p a
      then liftM (a :) $ takeWhileM p mas
      else return []
takeWhileM _ _ = return []
ghci> liftM unlines $ takeWhileM (not . null) (repeat getLine)
Adams, John
Washington, George
Lincoln, Abraham
Jefferson, Thomas
"Adams, John\nWashington, George\nLincoln, Abraham\nJefferson, Thomas\n"
ghci>
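To see why stdin stays usable with this approach, it may help to notice that takeWhileM stops running actions as soon as the predicate fails. Here is a minimal sketch in the Maybe monad (the name stopsEarly is mine, purely for illustration): the trailing Nothing would poison the whole result if it were ever run.

```haskell
import Control.Monad (liftM)

-- Same definition as in the answer above.
takeWhileM :: Monad m => (a -> Bool) -> [m a] -> m [a]
takeWhileM p (ma : mas) = do
    a <- ma
    if p a
      then liftM (a :) $ takeWhileM p mas
      else return []
takeWhileM _ _ = return []

-- Illustration only: 0 plays the role of the end marker.
-- The Nothing placed after the marker is never executed.
stopsEarly :: Maybe [Int]
stopsEarly = takeWhileM (/= 0) [Just 1, Just 2, Just 0, Nothing]
```

stopsEarly evaluates to Just [1,2]. In IO the same property means that no getLine runs past the marker line, so the handle is never read to EOF.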

If you do this a lot, and you're writing helper functions in some module anyway, why not go the whole hog and use your editor for the raw data too:
{-# LANGUAGE TemplateHaskell, QuasiQuotes #-}
module ParseAdHoc where
import HereDoc
import Data.Char (isSpace)
import Data.List (intercalate,intersperse) -- other handy helpers
-- ------------------------------------------------------
-- edit this bit every time you do your ad-hoc parsing
adhoc :: String -> String
adhoc = head . splitOn ',' . rmspace
input = [heredoc|
Adams, John
Washington,George
Lincoln,Abraham
Jefferson, Thomas
|]
-- ------------------------------------------------------
-- add other helpers you'll reuse here
main = mapM_ putStrLn.map adhoc.lines $ input
rmspace = filter (not.isSpace)
splitWith :: (a -> Bool) -> [a] -> [[a]] -- splits using a function that tells you when to split
splitWith isSplitter list = case dropWhile isSplitter list of
    [] -> []
    thisbit -> firstchunk : splitWith isSplitter therest
      where (firstchunk, therest) = break isSplitter thisbit
splitOn :: Eq a => a -> [a] -> [[a]] -- splits on the given item
splitOn c = splitWith (== c)
splitsOn :: Eq a => [a] -> [a] -> [[a]] -- splits on any of the given items
splitsOn chars = splitWith (`elem` chars)
It would be easier to use takeWhile (/=',') instead of head . splitOn ',', but I thought that splitOn will be more useful to you in the future.
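As a rough comparison of the two options, here is a small sketch. splitWith and splitOn are copied from the module above; viaSplit and viaTakeWhile are hypothetical names I made up for the comparison. They agree on ordinary lines, but differ when the line starts with a comma, because splitWith drops leading separators.

```haskell
-- Copied from the module above.
splitWith :: (a -> Bool) -> [a] -> [[a]]
splitWith isSplitter list = case dropWhile isSplitter list of
    []      -> []
    thisbit -> firstchunk : splitWith isSplitter therest
      where (firstchunk, therest) = break isSplitter thisbit

splitOn :: Eq a => a -> [a] -> [[a]]
splitOn c = splitWith (== c)

-- Hypothetical names, for comparison only.
-- viaSplit is partial: head fails on input with no fields (e.g. "").
viaSplit, viaTakeWhile :: String -> String
viaSplit     = head . splitOn ','
viaTakeWhile = takeWhile (/= ',')
```

On "Adams,John" both return "Adams". On ",John" viaSplit returns "John" while viaTakeWhile returns "".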
This uses a helper module, HereDoc, that lets you paste multiline string literals into your code (like Perl's <<"EOF" or Python's """). I can't remember where I found out how to do this, but I've tweaked it to drop whitespace-only first and last lines, so I can start and end my data with a newline.
module HereDoc where
import Language.Haskell.TH
import Language.Haskell.TH.Quote
import Data.Char (isSpace)
{-
example1 = [heredoc|Hi.
This is a multi-line string.
It should appear as an ordinary string literal.
Remember you can only use a QuasiQuoter
in a different module, so import this HereDoc module
into something else and don't forget the
{-# LANGUAGE TemplateHaskell, QuasiQuotes #-}|]
example2 = [heredoc|
This heredoc has no newline characters in it because empty or whitespace-only first and last lines are ignored
|]
-}
heredoc = QuasiQuoter {quoteExp = stringE.topAndTail,
                       quotePat = litP . stringL,
                       quoteType = undefined,
                       quoteDec = undefined}
topAndTail = myunlines.tidyend.tidyfront.lines
tidyfront :: [String] -> [String]
tidyfront [] = []
tidyfront (xs:xss) | all isSpace xs = xss
                   | otherwise = xs:xss
tidyend :: [String] -> [String]
tidyend [] = []
tidyend [xs] | all isSpace xs = []
             | otherwise = [xs]
tidyend (xs:xss) = xs:tidyend xss
myunlines :: [String] -> String
myunlines [] = ""
myunlines (l:ls) = l ++ concatMap ('\n':) ls
You might find Data.Text a good source of (inspiration for) helper functions:
http://hackage.haskell.org/packages/archive/text/latest/doc/html/Data-Text.html

Related

Tuple initialization from IO data in Haskell

I would like to know what is the best way to get a tuple from data read from the input in Haskell. I often encounter this problem in competitive programming when the input is made up of several lines that contain space-separated integers. Here is an example:
1 3 10
2 5 8
10 11 0
0 0 0
To read lines of integers, I use the following function:
readInts :: IO [Int]
readInts = fmap (map read . words) getLine
Then, I transform these lists into tuples of the appropriate size:
readInts :: IO (Int, Int, Int, Int)
readInts = fmap ((\l -> (l !! 0, l !! 1, l !! 2, l !! 3)) . map read . words) getLine
This approach does not seem very idiomatic to me.
The following syntax is more readable but it only works for 2-tuples:
readInts :: IO (Int, Int)
readInts = fmap ((\[x, y] -> (x, y)) . map read . words) getLine
(EDIT: as noted in the comments, the solution above works for n-tuples in general).
Is there an idiomatic way to initialize tuples from lists of integers without having to use !! in Haskell? Alternatively, is there a different approach to processing this type of input?
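For what it's worth, the list-pattern idea from the EDIT note extends to any fixed size. Below is a minimal sketch for a 4-tuple; parseQuad is a hypothetical name, and using readMaybe so that malformed lines give Nothing instead of a runtime error is my own choice, not something from the question.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse exactly four space-separated Ints.
-- Returns Nothing if a token is not an Int or the count is not four.
parseQuad :: String -> Maybe (Int, Int, Int, Int)
parseQuad line = case mapM readMaybe (words line) of
    Just [a, b, c, d] -> Just (a, b, c, d)
    _                 -> Nothing
```

In a program you could then write something like fmap parseQuad getLine and handle the Nothing case explicitly.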
How about this:
readInts :: IO (<any tuple you like>)
readInts = read . ("(" ++) . (++ ")") . intercalate "," . words <$> getLine
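To try the idea without touching stdin, the pure part can be pulled out on its own. This is only a sketch; tupleFromLine is a name I picked. It rewrites "1 3 10" into "(1,3,10)" and lets the Read instance of the requested tuple type do the work, so the caller has to fix the type with an annotation, and read will throw on malformed input.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: turn "1 3 10" into "(1,3,10)" and read it
-- at whatever tuple type the caller asks for.
tupleFromLine :: Read a => String -> a
tupleFromLine = read . ("(" ++) . (++ ")") . intercalate "," . words
```

For example, tupleFromLine "1 3 10" :: (Int, Int, Int) gives (1,3,10).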
Given that the context is 'competitive programming' (something I'm only dimly aware of as a concept), I'm not sure that the following offers a particularly competitive alternative, but IMHO I'd consider it idiomatic to use one of several available parser combinators.
The base package comes with a module called Text.ParserCombinators.ReadP. Here's how you could use it to parse the input file from the linked article:
module Q57693986 where
import Text.ParserCombinators.ReadP
parseNumber :: ReadP Integer
parseNumber = read <$> munch1 (`elem` ['0'..'9'])
parseTriple :: ReadP (Integer, Integer, Integer)
parseTriple =
  (,,) <$> parseNumber <*> (char ' ' *> parseNumber) <*> (char ' ' *> parseNumber)
parseLine :: ReadS (Integer, Integer, Integer)
parseLine = readP_to_S (parseTriple <* eof)
parseInput :: String -> [(Integer, Integer, Integer)]
parseInput = concatMap (fmap fst . filter (null . snd)) . fmap parseLine . lines
You can use the parseInput against this input file:
1 3 10
2 5 8
10 11 0
0 0 0
Here's a GHCi session that parses that file:
*Q57693986> parseInput <$> readFile "57693986.txt"
[(1,3,10),(2,5,8),(10,11,0),(0,0,0)]
Each call to parseLine produces a list of tuples that match the parser; e.g.:
*Q57693986> parseLine "11 32 923"
[((11,32,923),"")]
The second element of the tuple is any remaining String still waiting to be parsed. In the above example, parseLine has completely consumed the line, which is what I'd expect for well-formed input, so the remaining String is empty.
The parser returns a list of alternatives if there's more than one way the input could be consumed by the parser, but again, in the above example, there's only one suggested alternative, as the line has been fully consumed.
The parseInput function throws away any tuple that hasn't been fully consumed, and then picks only the first element of any remaining tuples.
This approach has often served me with puzzles such as Advent of Code, where the input files tend to be well-formed.
This is a way to generate a parser that works generically for any tuple (of reasonable size). It requires the library generics-sop.
{-# LANGUAGE DeriveGeneric, DeriveAnyClass,
FlexibleContexts, TypeFamilies, TypeApplications #-}
import GHC.Generics
import Generics.SOP
import Generics.SOP (hsequence, hcpure,Proxy,to,SOP(SOP),NS(Z),IsProductType,All)
import Data.Char
import Text.ParserCombinators.ReadP
import Text.ParserCombinators.ReadPrec
import Text.Read
componentP :: Read a => ReadP a
componentP = munch isSpace *> readPrec_to_P readPrec 1
productP :: (IsProductType a xs, All Read xs) => ReadP a
productP =
    let parserOutside = hsequence (hcpure (Proxy @Read) componentP)
     in Generics.SOP.to . SOP . Z <$> parserOutside
For example:
*Main> productP @(Int,Int,Int) `readP_to_S` " 1 2 3 "
[((1,2,3)," ")]
It allows components of different types, as long as they all have a Read instance.
It also parses records that have a Generics.SOP.Generic instance:
data Stuff = Stuff { x :: Int, y :: Bool }
    deriving (Show,GHC.Generics.Generic,Generics.SOP.Generic)
For example:
*Main> productP @Stuff `readP_to_S` " 1 True"
[(Stuff {x = 1, y = True},"")]

How to convert list to string?

I wrote a function which reads a file and removes, in every line, all the words that were already encountered earlier in the same line.
{-# OPTIONS_GHC -Wall #-}
module Main where
import System.Environment
import System.IO()
main :: IO ()
main = do args <- getArgs
          if (length args > 0) then do
              f <- get args
              putStrLn (seqWord $ head f)
          else do
              f <- getContents
              putStrLn (seqWord f)
get :: [String] -> IO [String]
get [] = return []
get (file:xs) = do
    contents <- readFile file
    fs <- get xs
    return (contents:fs)
seqWord :: String -> String
seqWord s = show (map (filterWord . words) (lines s))
filterWord :: [String] -> [String]
filterWord [] = []
filterWord (x:xs) = x : filterWord (filter(/=x) xs)
As output I get a list of lists, like this:
[["1","12","5","8","13","145","85"],["546","822","1","12","58","8","9"]]
Please, help me fix this problem. Thank you
Use the unwords function to undo the effect of words. You may also want to replace show with unlines.
seqWord s = unlines (map (unwords . filterWord . words) (lines s))
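Putting the suggestion together with the question's filterWord, a self-contained sketch looks like this (nothing new here apart from swapping show for unlines and adding unwords):

```haskell
-- Remove words that already appeared earlier in the same line.
filterWord :: [String] -> [String]
filterWord [] = []
filterWord (x:xs) = x : filterWord (filter (/= x) xs)

-- words/unwords and lines/unlines are used as inverse pairs,
-- so the result is plain text rather than a shown list of lists.
seqWord :: String -> String
seqWord s = unlines (map (unwords . filterWord . words) (lines s))
```

For the input "1 12 1 5\n546 546 8\n" this produces "1 12 5\n546 8\n". Note that unlines ends the result with a newline, so putStr may suit better than putStrLn when printing it.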

Cutting specific chunks from a Haskell String

I'm trying to cut chunks from a list, with a given predicate. I would have preferred to use a double character, e.g. ~/, but have resolved to just using $. What I essentially want to do is this...
A: "Hello, my $name is$ Danny and I $like$ Haskell"
What I want to turn this into is this:
B: "Hello, my Danny and I Haskell"
So I want to strip everything in between the given symbol, $, or my first preference was ~/, if I can figure it out. What I tried was this:
s1 :: String -> String
s1 xs = takeWhile (/= '$') xs
s2 :: String -> String
s2 xs = dropWhile (/= '$') xs
s3 :: String -> String
s3 xs = s3 $ s2 $ s1 xs
This solution seems to just bug my IDE out (possibly infinite looping).
Solution:
s3 :: String -> String
s3 xs
  | '$' `notElem` xs = xs
  | otherwise = takeWhile (/= '$') xs ++ (s3 $ s1 xs)
s1 :: String -> String
s1 xs = drop 1 $ dropWhile (/= '$') $ tail $ snd $ break ('$'==) xs
This seems like a nice application for parsers. A solution using trifecta:
import Control.Applicative
import Data.Foldable
import Data.Functor
import Text.Trifecta
input :: String
input = "Hello, my $name is$ Danny and I $like$ Haskell"
cutChunk :: CharParsing f => f String
cutChunk = "" <$ (char '$' *> many (notChar '$') <* char '$')
cutChunk matches $, followed by 0 or more (many) non-$ characters, then another $. Then we use ("" <$) to make this parser's value always be the empty string, thus discarding all the characters that this parser matches.
includeChunk :: CharParsing f => f String
includeChunk = some (notChar '$')
includeChunk matches the text that we want to include in the result, which is anything that's not the $ character. It's important that we use some (matching one or more characters) and not many (matching zero or more characters) because we're going to include this parser within another many expression next; if this parser matched on the empty string, then that could loop infinitely.
chunks :: CharParsing f => f String
chunks = fold <$> many (cutChunk <|> includeChunk)
chunks is the parser for everything. Read <|> as "or", as in "parse either a cutChunk or an includeChunk". many (cutChunk <|> includeChunk) is a parser that produces a list of chunks e.g. Success ["Hello, my ",""," Danny and I ",""," Haskell"], so we fold the output to concatenate those chunks together into a single string.
result :: Result String
result = parseString chunks mempty input
The result:
Success "Hello, my Danny and I Haskell"
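If you would rather not pull in trifecta, roughly the same parser can be sketched with Text.ParserCombinators.ReadP from base. To be clear, this is a swapped-in library and not the trifecta code above; stripChunks is a hypothetical wrapper name, and returning Nothing for an unmatched $ is my assumption about what you would want.

```haskell
import Text.ParserCombinators.ReadP

-- A $...$ section: matched, then replaced by the empty string.
cutChunk :: ReadP String
cutChunk = "" <$ (char '$' *> munch (/= '$') <* char '$')

-- One or more characters that are not '$'.
includeChunk :: ReadP String
includeChunk = munch1 (/= '$')

-- The whole input: any mix of the two, concatenated, up to end of input.
chunks :: ReadP String
chunks = concat <$> many (cutChunk +++ includeChunk) <* eof

-- Hypothetical wrapper: Nothing when the input has an unmatched '$'.
stripChunks :: String -> Maybe String
stripChunks s = case readP_to_S chunks s of
    [(result, "")] -> Just result
    _              -> Nothing
```

For example, stripChunks "a$b$c" gives Just "ac", and stripChunks "a$b" gives Nothing.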
Your infinite loop comes from calling s3 recursively with no base case:
s3 :: String -> String
s3 xs = s3 $ s2 $ s1 xs
Adding a base case corrects the infinite loop:
s3 xs
  | '$' `notElem` xs = xs
  | otherwise = ...
This is not the whole answer. Think about what s1 actually does and where you use its return value:
s1 "hello $my name is$ ThreeFx" == "hello "
For further reference, see the break function:
break :: (a -> Bool) -> [a] -> ([a], [a])
I think your logic is wrong; it is perhaps easier to write it in an elementary way:
Prelude> let pr xs = go xs True
Prelude|       where go [] _ = []
Prelude|             go (x:xs) f | x=='$' = go xs (not f)
Prelude|                         | f = x : go xs f
Prelude|                         | otherwise = go xs f
Prelude|
Prelude> pr "Hello, my $name is$ Danny and I $like$ Haskell"
"Hello, my Danny and I Haskell"
Explanation: the flag f keeps track of the state (pass-through mode or not). If the current character is the delimiter, skip it and flip the state.
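The same flag-based idea as a standalone function, in case you want it outside GHCi. This is just a sketch; stripDelimited is a name I chose and the delimiter is passed as a parameter instead of being fixed to $.

```haskell
-- Drop every section enclosed between two occurrences of the delimiter.
-- The Bool is True while we are outside a delimited section.
stripDelimited :: Char -> String -> String
stripDelimited delim = go True
  where
    go _ [] = []
    go keep (c : cs)
      | c == delim = go (not keep) cs   -- skip the delimiter, flip the state
      | keep       = c : go keep cs     -- outside: keep the character
      | otherwise  = go keep cs         -- inside: drop the character
```

Note that the spaces on both sides of a removed section are kept, so stripDelimited '$' "my $name is$ Danny" yields "my  Danny" with two spaces.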

Reading multiline user's input

I want to lazily read user input and do something with it line by line. But if the user ends a line with , (comma) followed by any number of spaces (including zero), I want to give them the opportunity to finish the input on the next line.
And here is what I've got:
import System.IO
import Data.Char
chop :: String -> [String]
chop = f . map (++ "\n") . lines
  where f [] = []
        f [x] = [x]
        f (x : y : xs) = if (p . tr) x
                           then f ((x ++ y) : xs)
                           else x : f (y : xs)
        p x = (not . null) x && ((== ',') . last) x
        tr xs | all isSpace xs = ""
        tr (x : xs) = x : tr xs
main :: IO ()
main =
  do putStrLn "Welcome to hell, version 0.1.3!"
     putPrompt
     mapM_ process . takeWhile (/= "quit\n") . chop =<< getContents
  where process str = putStr str >> putPrompt
        putPrompt = putStr ">>> " >> hFlush stdout
Sorry, it doesn't work at all. Bloody mess.
P.S. I want to preserve \n characters on end of every chunk. Currently I add them manually with map (++ "\n") after lines.
How about changing the type of chop a little:
readMultiLine :: IO [String]
readMultiLine = do
    ln <- getLine
    if (endswith (rstrip ln) ",") then
        liftM (ln:) readMultiLine
    else
        return [ln]
Now you know that if the last list is not empty, then the user didn't finish typing (the last input ended with ',').
Of course, either import Data.String.Utils, or write your own. Could be as simple as:
endswith xs ys = (length xs >= length ys)
    && (and $ zipWith (==) (reverse xs) (reverse ys))
rstrip = reverse . dropWhile isSpace . reverse
But I missed the point at first. Here's the actual thing.
unfoldM :: (Monad m) => (a -> Maybe (m b, m a)) -> a -> m [b]
unfoldM f z = case f z of
    Nothing -> return []
    Just (x, y) -> liftM2 (:) x $ y >>= unfoldM f
main = unfoldM (\x -> if (x == ["quit"]) then Nothing
                      else Just (print x, readMultiLine)) =<< readMultiLine
The reason is that you need to be able to insert the "action" to be done on the input between reading one multi-line input and the next. Here print x is the action inserted between two readMultiLine calls.
Since you have questions about getContents, let me add a note. Even though getContents provides a lazy String, its effectful changes to the world are ordered with the subsequent effects of processing the list. But the processing of the list attempts to insert effects between the effects of reading particular list items. To do that, you need a function that exposes the chain of effects, so you can insert your own effects between them.
You can do this using pipes, preserving the laziness of the user's input
import Data.Char (isSpace)
import Pipes
import qualified Pipes.Prelude as Pipes
endsWithComma :: String -> Bool
endsWithComma str =
    case (dropWhile isSpace $ reverse str) of
        ',':_ -> True
        _     -> False
finish :: Monad m => Pipe String String m ()
finish = do
    str <- await
    yield str
    if endsWithComma str
        then do
            str' <- await
            yield str'
        else finish
user :: Producer String IO ()
user = Pipes.stdinLn >-> finish
You can then hook up the user Producer to any downstream Consumer. For example, to echo the stream back out you can write:
main = runEffect (user >-> Pipes.stdoutLn)
To learn more about pipes you can read the tutorial.
Sorry, I wrote something wrong in a comment and I thought that now that I understood what you were trying to do, I'd give an answer with a little more substance. The core idea is that you're going to need a state buffer while you loop through the string, as far as I can tell. You have f :: [String] -> [String] but you'll need an extra string of buffer before you can solve this puzzle.
So let me assume an answer which looks like:
chop = joinCommas "" . map (++ "\n") . lines
Then the structure of joinCommas is going to look like:
import Data.List (isSuffixOf)
-- override with however you want to handle the ",\n" between lines.
joinLines = (++)
incomplete = isSuffixOf ",\n"
joinCommas :: String -> [String] -> [String]
joinCommas prefix (line : rest)
    | incomplete prefix = joinCommas (joinLines prefix line) rest
    | otherwise         = prefix : joinCommas line rest
joinCommas prefix []
    | incomplete prefix = error "Incomplete input"
    | otherwise         = [prefix]
The prefix stores up lines until it doesn't end with ",\n" at which point it emits the prefix and continues with the rest of the lines. On EOF we process the last line unless that line is incomplete.
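Here is one way the pieces might fit together as a pure, testable function. It is a sketch with a few deviations from the outline above, all of them my own assumptions: the buffer is seeded with the first line rather than "" (so no empty chunk is emitted up front), incomplete tolerates trailing spaces after the comma as the question asks, and a dangling incomplete chunk at EOF is emitted as-is instead of raising an error.

```haskell
import Data.Char (isSpace)

-- A chunk is incomplete when its last non-space character is a comma.
incomplete :: String -> Bool
incomplete s = case dropWhile isSpace (reverse s) of
    ',' : _ -> True
    _       -> False

-- Accumulate lines in the prefix until it no longer ends with a comma.
joinCommas :: String -> [String] -> [String]
joinCommas prefix (line : rest)
    | incomplete prefix = joinCommas (prefix ++ line) rest
    | otherwise         = prefix : joinCommas line rest
joinCommas prefix [] = [prefix]  -- assumption: emit even if incomplete

-- Keep the "\n" on every line, then seed the buffer with the first line.
chop :: String -> [String]
chop input = case map (++ "\n") (lines input) of
    []           -> []
    first : rest -> joinCommas first rest
```

For example, chop "a,\nb\nc\n" gives ["a,\nb\n","c\n"].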

Getting an integer from the console

Is there a way to read an integer from the console in Haskell? I'm asking for something pretty much like C++'s cin or Java's Scanner.nextInt().
And by that I mean that given this input:
1 2 3
2 3
4 25 12 7
1
I should be able to read them all, not at the same time (maybe reading 4 of them, doing some calculations and then read the rest) ignoring the fact that they are in separate lines.
The easiest solution is probably
getAll :: Read a => IO [a]
getAll = fmap (fmap read . words) getContents
getInts :: IO [Int]
getInts = getAll
which will read all input into a single list.
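Since the result is an ordinary list, consuming "four now, the rest later" is just list processing. A small sketch of the pure part (parseInts and takeFour are names I made up; in a real program the String would come from getContents):

```haskell
-- Pure core of getAll, specialised to Int: line breaks are just whitespace.
parseInts :: String -> [Int]
parseInts = map read . words

-- Hypothetical helper: the first four numbers and everything after them.
takeFour :: String -> ([Int], [Int])
takeFour = splitAt 4 . parseInts
```

With the sample input "1 2 3\n2 3\n4 25 12 7\n1", takeFour returns ([1,2,3,2],[3,4,25,12,7,1]).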
When in doubt, use Parsec! (not always, and not really, but who cares)
import Text.ParserCombinators.Parsec
import Text.Parsec.Numbers
value = do
    spaces
    num <- parseFloat
    return num
line = many value
Then "rinse and repeat" with getLine until you hit EOF.
Note: you can do it without Parsec using read and friends, but this way is more extendable and preferred for more complicated grammars.
Using Parsec:
import Text.ParserCombinators.Parsec
import Text.Parsec.Numbers
import Control.Applicative ((*>), (<*))
line = spaces *> many1 (parseFloat <* spaces)
main = putStrLn "Enter numbers:" >> fmap (parse line "") getLine >>= print
Running it:
$ ghc parsenums.hs
$ ./parsenums
Enter numbers:
345 23 654 234
[345.0,23.0,654.0,234.0]
A more "manual" way to do it would be something like:
import Data.Char (isDigit, isSpace)
getInts :: String -> [Int]
getInts s = case span isDigit (dropWhile isSpace s) of
    ("", "") -> []
    ("", s) -> error $ "Invalid input: " ++ s
    (digits, rest) -> (read digits :: Int) : getInts rest
This may make it clearer how it works. In fact, here's one that's completely from the ground up:
getInts :: String -> [Int]
getInts s = case span isDigit (dropWhile isSpace s) of
    ("", "") -> []
    ("", s) -> error $ "Invalid input: " ++ s
    (digits, rest) -> strToInt digits : getInts rest
isDigit :: Char -> Bool
isDigit c = '0' <= c && c <= '9'
isSpace :: Char -> Bool
isSpace c = c `elem` " \t\n\r"
charToInt :: Char -> Int
charToInt c = fromEnum c - 48
strToInt :: String -> Int
strToInt s = go 0 s where
    go n [] = n
    go n (c:rest) = go (n * 10 + charToInt c) rest

Resources