Getting IO based on the contents of an IO stream - haskell

I have a situation where I am trying to concatenate the contents of two text files, A and B. The complication is that the location of B is specified in the contents of A. I've created a function (minimal example below), which reads A, opens B, and then just tries to stick them both together, but frankly this method seems too easy to be correct and I have a feeling that it may not be the best approach. It compiles but I'm unable to test it since it can't find the second file (presumably something to do with paths, but I've not yet figured out what). Any advice appreciated.
getIOFromIO :: IO String -> IO String
getIOFromIO orig = do
    origContents <- orig
    moreIO <- readFile origContents
    return (origContents ++ " " ++ moreIO)

The function getIOFromIO should work fine, provided you pass it an IO action that reads the first file, like:
getIOFromIO (readFile "foo.tmp")
and provided the entire contents of foo.tmp, including any preceding or trailing whitespace (like a trailing newline) are part of the desired filename.
The following self-contained example demonstrates its use:
setup :: String -> String -> IO ()
setup file1 file2 = do
    writeFile file1 file2 -- put name of file2 in file1
    writeFile file2 $ "body\n"

-- unmodified from your question
getIOFromIO :: IO String -> IO String
getIOFromIO orig = do
    origContents <- orig
    moreIO <- readFile origContents
    return (origContents ++ " " ++ moreIO)

main :: IO ()
main = do
    setup "foo.tmp" "bar.tmp"
    txt <- getIOFromIO (readFile "foo.tmp")
    print txt
It should generate the output:
"bar.tmp body\n"
 ^^^^^^^ ^^^^^^
 |       `- contents of second file (bar.tmp)
 |
 `- contents of first file (foo.tmp)
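If the first file ends with a newline (as most editors leave it), the file name passed to readFile will include that newline and the second file will not be found. A minimal sketch of one way around this, assuming surrounding whitespace is never part of the intended file name; the names trim and getIOFromIO' are made up for this illustration:

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Hypothetical helper: strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Like getIOFromIO, but trims the contents before using them as a path.
-- Assumption: whitespace around the name is never part of the file name.
getIOFromIO' :: IO String -> IO String
getIOFromIO' orig = do
    origContents <- orig
    moreIO <- readFile (trim origContents)
    return (origContents ++ " " ++ moreIO)

main :: IO ()
main = do
    writeFile "foo.tmp" "bar.tmp\n" -- note the trailing newline
    writeFile "bar.tmp" "body\n"
    txt <- getIOFromIO' (readFile "foo.tmp")
    print txt -- prints "bar.tmp\n body\n"
```

Only the path is trimmed here; the returned string still contains the first file's contents unchanged.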

Related

Haskells stack or interact are adding characters

I'm doing my first steps using Haskell. I created a project using stack and changed the Main.hs into
module Main where
my_fkt :: String -> String
my_fkt input = show (length input)
main :: IO ()
main = interact my_fkt
I build the project via stack build, run it via stack exec firststeps-exe, enter "abcd" and finish input via <CTRL>-D. In the console I now see
abcd4%
The % is inverted. If I use a text file containing the "abcd" (without line break) and execute more sample.txt | stack exec firststeps-exe I see
abcd5%
Why do I get one additional character in the second case and what is the inverted percentage sign?
That is because the definition of interact uses putStr instead of putStrLn.
You can take a look at the source code here.
interact :: (String -> String) -> IO ()
interact f = do s <- getContents
                putStr (f s)
To remedy your issue I would go on and create a similar function
interact' :: (String -> String) -> IO ()
interact' f = do s <- getContents
                 putStrLn (f s)
or if you like to mix it up and write a bit terser code
interact' f = putStrLn =<< (f <$> getContents)
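I don't know for certain what the % is. It is not produced by your program: a likely explanation is that it comes from your shell, since zsh prints an inverted % to mark output that did not end with a newline, which is exactly what putStr produces here.
With regard to your second question about the additional "non-existing" character, I am also not sure, but my guess would be a trailing newline in the file (many editors add one when saving). EOF itself is not a character and is not counted.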
By the way, you can always check with more testinput | wc -c; it should yield the same result as your Haskell program.
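One way to check whether the extra character is a newline is to report both counts from the program itself. This is only a sketch; the function name report is invented for the example:

```haskell
module Main where

-- | Hypothetical helper: total character count, and the count with
-- newline characters left out.
report :: String -> String
report input =
    "total: " ++ show (length input)
        ++ ", without newlines: "
        ++ show (length (filter (/= '\n') input))
        ++ "\n"

main :: IO ()
main = interact report
```

If the two numbers differ by one for the file input, the file ends with a newline.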

Turtle: dealing with non-utf8 input

On my path to learning Pipes, I've run into problems when dealing with non-UTF-8 files. That is why I've taken a detour into the Turtle library to try to understand how to solve the problem there, at a higher level of abstraction.
The exercise I want to do is quite simple: find the total number of lines in all regular files reachable from a given directory. This is readily implemented by the following shell command:
find $FPATH -type f -print | xargs cat | wc -l
I've come up with the following solution:
import qualified Control.Foldl as F
import qualified Turtle as T

-- | Returns true iff the file path is not a symlink.
noSymLink :: T.FilePath -> IO Bool
noSymLink fPath = (not . T.isSymbolicLink) <$> T.stat fPath

-- | Shell that outputs the regular files in the given directory.
regularFilesIn :: T.FilePath -> T.Shell T.FilePath
regularFilesIn fPath = do
    fInFPath <- T.lsif noSymLink fPath
    st <- T.stat fInFPath
    if T.isRegularFile st
        then return fInFPath
        else T.empty

-- | Read lines of `Text` from all the regular files under the given directory
-- path.
inputDir :: T.FilePath -> T.Shell T.Line
inputDir fPath = do
    file <- regularFilesIn fPath
    T.input file

-- | Print the number of lines in all the files in a directory.
printLinesCountIn :: T.FilePath -> IO ()
printLinesCountIn fPath = do
    count <- T.fold (inputDir fPath) F.length
    print count
This solution gives the correct result, as long as there are no non-utf8 files in the directory. If this is not the case, the program will raise an exception like the following one:
*** Exception: test/resources/php_ext_syslog.h: hGetLine: invalid argument (invalid byte sequence)
Which is to be expected since:
$ file -I test/resources/php_ext_syslog.h
test/resources/php_ext_syslog.h: text/x-c; charset=iso-8859-1
I was wondering how to solve the problem of reading different encodings into Text, so that the program can deal with this. For the problem at hand I guess I could avoid the conversion to Text, but I'd rather know how to do this, since you could imagine a situation in which, for instance, I would like to make a set with all the words under a certain directory.
EDIT
For what it's worth, the only solution I could come up with so far is the following:
-- Imports assumed from the qualified names used below.
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import qualified Data.List.NonEmpty as NE
import Data.Text.Encoding (Decoding (..), streamDecodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)
import qualified Turtle.Bytes as TB

mDecodeByteString :: T.Shell ByteString -> T.Shell T.Text
mDecodeByteString = gMDecodeByteString (streamDecodeUtf8With lenientDecode)
  where
    gMDecodeByteString :: (ByteString -> Decoding)
                       -> T.Shell ByteString
                       -> T.Shell T.Text
    gMDecodeByteString f bss = do
        bs <- bss
        let Some res bs' g = f bs
        if BS.null bs'
            then return res
            else gMDecodeByteString g bss

inputDir' :: T.FilePath -> T.Shell T.Line
inputDir' fPath = do
    file <- regularFilesIn fPath
    text <- mDecodeByteString (TB.input file)
    T.select (NE.toList $ T.textToLines text)

-- | Print the number of lines in all the files in a directory. Using a more
-- robust version of `inputDir`.
printLinesCountIn' :: T.FilePath -> IO ()
printLinesCountIn' fPath = do
    count <- T.fold (inputDir' fPath) T.countLines
    print count
The problem is that this counts one extra line per file, but at least it can decode non-UTF-8 ByteStrings.
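For comparison, here is a sketch that steps outside Turtle and uses only the bytestring and text packages: it reads each file strictly as bytes and decodes leniently, so invalid sequences become U+FFFD instead of raising an exception. The function names are invented for this example, and treating the input as "UTF-8 with replacement" is an assumption; for a file known to be ISO-8859-1, decodeLatin1 from Data.Text.Encoding would preserve the characters instead.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- | Hypothetical helper: decode bytes leniently and count the lines.
-- Invalid UTF-8 sequences are replaced with U+FFFD rather than throwing.
countLinesBytes :: BS.ByteString -> Int
countLinesBytes = length . T.lines . decodeUtf8With lenientDecode

-- | Hypothetical helper: count the lines of one file, whatever its encoding.
countLinesLenient :: FilePath -> IO Int
countLinesLenient path = countLinesBytes <$> BS.readFile path
```

Because T.lines does not produce a trailing empty line for input that ends in a newline, this avoids the extra line per file. The trade-off is that each file is read into memory as a whole.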

How do I read from a file and add the numbers in the text file in Haskell

I'm new to Haskell and IO is still a bit confusing. I have a txt file that I want to read, add the numbers in the text file, and then write the result to a text file. The file looks like the following:
2
3
The numbers are separated by a newline character. I know how to read a file's contents and then write them to another file, but I don't know how I can manipulate them, or whether I have to convert the information to an Int.
module Main where

import System.Environment

-- | This function reads the first line of a file and writes it to the
-- destination file, e.g. src "src.txt", dst "des.txt".
copyFirstLine :: FilePath -- ^ path to input file
              -> FilePath -- ^ path to output file
              -> IO ()
copyFirstLine src dst = do
    content <- readFile src
    let (fst :rest) = (lines content)
    writeFile dst fst

main = do
    [src,dst] <- getArgs
    copyFirstLine src dst
Thanks in advance.
I'm not sure what you mean by 'manipulate', but I will assume you need integer arithmetic. Manipulating the data as strings would not be difficult either.
If you search Hoogle for the signature String -> Int, you will find read.
-- | This function reads every line of a file, adds 1 to each number, and
-- writes the results to the destination file,
-- e.g. src "src.txt", dst "des.txt".
eachPlusOne :: FilePath -- ^ path to input file
            -> FilePath -- ^ path to output file
            -> IO ()
eachPlusOne src dst = do
    content <- readFile src
    let lns = lines content :: [String]
        ints = map ((1+) . read) lns :: [Int]
        outs = unlines . map show $ ints :: String
    writeFile dst outs
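If you are using a sufficiently recent version of GHC, you can use readMaybe (from Text.Read), which is preferable because it returns Nothing on malformed input instead of throwing an exception.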
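Since the question asks for the sum of the numbers, here is a small sketch along the same lines using readMaybe. The names sumLines and sumFile are invented for this example, and silently skipping lines that do not parse is an assumption about the desired behaviour:

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- | Hypothetical helper: add up every line that parses as an Int.
-- Lines that do not parse (blank lines, stray text) are skipped.
sumLines :: String -> Int
sumLines = sum . mapMaybe readMaybe . lines

-- | Hypothetical helper: read src, write the sum of its numbers to dst.
-- Assumes src and dst are different files.
sumFile :: FilePath -> FilePath -> IO ()
sumFile src dst = do
    content <- readFile src
    writeFile dst (show (sumLines content) ++ "\n")
```

For the two-line file from the question, sumLines yields 5.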

Read in multiple lines from standard input with arguments in Haskell

I'm trying to read in multiple lines from standard input in Haskell, plus one argument, then do something with the current line and write something to the standard output.
In my case I am trying to normalize lambda expressions. The program may receive 1 or more lambda expressions to normalize and then it has to write the result (normalized form or error) to the standard output. And the program may receive an argument (the max number of reductions). Here is the main function:
main :: IO ()
main = do
    params <- getArgs
    fullLambda <- getLine
    let lambda = convertInput fullLambda
    let redNum | (length params) == 1 = read (head params)
               | otherwise = 100
    case (parsing lambda) of
        Left errorExp -> putStrLn ("ERROR: " ++ lambda)
        Right lambdaExp -> do
            let normalizedLambdaExp = reduction lambdaExp redNum
            if (isNormalForm normalizedLambdaExp) && (isClosed lambdaExp)
                then putStrLn ("OK: " ++ show normalizedLambdaExp)
                else putStrLn ("ERROR: " ++ lambda)
  where
    convertInput :: String -> String
    convertInput ('\"':xs) = take ((length xs) - 2) xs
    convertInput input = input
So this code handles one line and completes the reductions and then writes something to the standard output. How can I change this to handle multiple lines? I've read about replicateM but I can't seem to grasp it. My mind is very OO so I was thinking maybe some looping somehow, but that is surely not the preferred way.
Also, this program has to be able to run like this:
echo "(\x.x) (\x.x)" | Main 25
And will produce:
OK: (\x.x)
And if there are multiple lines, it has to produce the same kind of output for each line, in new lines.
But also has to work without the argument, and has to handle multiple lines. I spent time on google and here, but I'm not sure how the argument reading will happen. I need to read in the argument once and the line(s) once or many times. Does someone know a not too lengthy solution to this problem?
I've tried it like this, too (imperatively):
main :: IO ()
main = do
    params <- getArgs
    mainHelper params
    main

mainHelper :: [String] -> IO ()
mainHelper params = do
    fullLambda <- getLine
And so on, but then it puts this to the standard output as well:
Main: <stdin>: hGetLine: end of file
Thank you in advance!
It appears you want to:
1. Parse a command line option which may or may not exist.
2. For each line of input, process it with some function.
Here is an approach using lazy IO:
import System.Environment
import Control.Monad

main = do args <- getArgs
          let option = case args of
                         [] -> ... the default value...
                         (a:_) -> read a
          contents <- getContents
          forM_ (lines contents) $ \aline -> do
            process option aline
I am assuming your processing function has type process :: Int -> String -> IO (). For instance, it could look like:
process :: Int -> String -> IO ()
process option str = do
    if length str < option
        then putStrLn $ "OK: " ++ str
        else putStrLn $ "NOT OK: line too long"
Here's how it works:
- contents <- getContents reads all of standard input into the variable contents
- lines contents breaks up the input into lines
- forM_ ... iterates over each line, passing the line to the process function
The trick is that getContents reads standard input lazily so that you'll get some output after each line is read.
You should be aware that there are issues with lazy IO which you may run into when your program becomes more complex. However, for this simple use case lazy IO is perfectly fine and works well.
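If you would rather avoid lazy IO, the explicit loop from the question can be made to work by checking for end of input before each getLine, which avoids the "hGetLine: end of file" error. A minimal sketch, in which processLine is a made-up stand-in for the real per-line work and 100 is the default taken from the question:

```haskell
import Control.Monad (unless)
import Data.Maybe (fromMaybe)
import System.Environment (getArgs)
import System.IO (isEOF)
import Text.Read (readMaybe)

-- | Read the optional first argument; fall back to 100 (the default in the
-- question) when it is missing or not a number.
parseLimit :: [String] -> Int
parseLimit (a:_) = fromMaybe 100 (readMaybe a)
parseLimit []    = 100

-- | Hypothetical stand-in for the real per-line processing.
processLine :: Int -> String -> String
processLine limit line
    | length line < limit = "OK: " ++ line
    | otherwise           = "NOT OK: line too long"

-- | Handle one line at a time until standard input is exhausted.
loop :: Int -> IO ()
loop limit = do
    eof <- isEOF
    unless eof $ do
        line <- getLine
        putStrLn (processLine limit line)
        loop limit

main :: IO ()
main = getArgs >>= loop . parseLimit
```

The arguments are read once in main, and only the line handling is repeated.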

Read array-string into variable

I have a text file containing data like this:
13.
13.
[(1,2),(2,3),(4,5)].
And I want to read this into 3 variables in Haskell. The standard functions read this as strings. Assuming I get rid of the dot at the end myself, is there any built-in parser function that will make an Integer out of "13" and a [(Integer,Integer)] list out of [(1,2),(2,3),(4,5)]?
Yes, it's called read:
let i = read "13" :: Integer
let ts = read "[(1,2),(2,3),(4,5)]" :: [(Integer, Integer)]
The example text file you gave has trailing spaces as well as the full stop, so merely cutting the last character doesn't work. Let's take just the digits, using:
import Data.Char (isDigit)
Why not have a data type to store the stuff from the file:
data MyStuff = MyStuff {firstNum :: Int,
                        secondNum :: Int,
                        intPairList :: [(Integer, Integer)]}
    deriving (Show,Read)
Now we need to read the file, and then turn it into individual lines:
getMyStuff :: FilePath -> IO MyStuff
getMyStuff filename = do
    rawdata <- readFile filename
    let [i1,i2,list] = lines rawdata
    return $ MyStuff (read $ takeWhile isDigit i1) (read $ takeWhile isDigit i2) (read $ init list)
The read function works with any data type that has a Read instance, and automatically produces data of the right type.
> getMyStuff "data.txt" >>= print
MyStuff {firstNum = 13, secondNum = 13, intPairList = [(1,2),(2,3),(4,5)]}
A better way
I'd be inclined to save myself a fair bit of work, and just write that data directly, so
writeMyStuff :: FilePath -> MyStuff -> IO ()
writeMyStuff filename somedata = writeFile filename (show somedata)
readMyStuff :: FilePath -> IO MyStuff
readMyStuff filename = fmap read (readFile filename)
(The fmap just applies the pure function read to the output of the readFile.)
> writeMyStuff "test.txt" MyStuff {firstNum=12,secondNum=42, intPairList=[(1,2),(3,4)]}
> readMyStuff "test.txt" >>= print
MyStuff {firstNum = 12, secondNum = 42, intPairList = [(1,2),(3,4)]}
You're far less likely to make little parsing or printing errors if you let the compiler sort it all out for you. It's also less code, and simpler.
Haskell's strong types require you to know what you're getting. So if we forgo all error checking and optimization and assume that the file is always in the right format, you can do something like this:
data Entry = Number Integer
           | List [(Integer, Integer)]

parseYourFile :: FilePath -> IO [Entry]
parseYourFile p = do
    content <- readFile p
    return $ parseYourFormat content

-- `data` is a reserved word in Haskell, so the argument is named `input`.
parseYourFormat :: String -> [Entry]
parseYourFormat input = map parseEntry $ lines input

parseEntry :: String -> Entry
parseEntry line = if head line == '['
                      then List $ read core
                      else Number $ read core
  where core = init line
Or you could write a proper parser for it using one of the many combinator frameworks.
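As a rough illustration of the parser route, here is a sketch using Text.ParserCombinators.ReadP, which ships with base. It assumes non-negative integers, no spaces inside the brackets, and one entry per line terminated by a dot; the parser names are invented for this example:

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

data Entry = Number Integer
           | List [(Integer, Integer)]
    deriving (Show, Eq)

-- | A run of digits. Assumption: no sign and no spaces.
integer :: ReadP Integer
integer = read <$> munch1 isDigit

-- | A pair such as (1,2).
pair :: ReadP (Integer, Integer)
pair = between (char '(') (char ')') $ do
    a <- integer
    _ <- char ','
    b <- integer
    return (a, b)

-- | One line of the file: a number or a list, then a dot, then optional
-- trailing whitespace.
entry :: ReadP Entry
entry = do
    e <- (List <$> between (char '[') (char ']') (pair `sepBy` char ','))
         +++ (Number <$> integer)
    _ <- char '.'
    skipSpaces
    eof
    return e

-- | Nothing if the line is not in the expected format.
parseEntryLine :: String -> Maybe Entry
parseEntryLine s = case readP_to_S entry s of
    [(e, "")] -> Just e
    _         -> Nothing
```

Unlike the read-based versions, a malformed line yields Nothing rather than a runtime error, and trailing spaces after the dot are accepted.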

Resources