Haskell Read from file and parse string into words

Haskell Read from file and parse string into words - haskell

When in ghci mode I can type this line :
map read $ words "1 2 3 4 5" :: [Int]
and get
[1,2,3,4,5]
When I make a file named splitscan.hs containing this line:
map read $ words scan :: [Float]
I get this error:
[1 of 1] Compiling Main ( splitscan.hs, splitscan.o )
splitscan.hs:1:1: error:
Invalid type signature: map read $ words str :: ...
Should be of form <variable> :: <type>
|
1 | map read $ words str :: [Float]
| ^^^^^^^^^^^^^^^^^^^^
When I do this:
import System.IO
main = do
scan <- readFile "g924310_b1_copy.txt"
map read $ words scan :: [Float]
putStr scan
I get :
readscan.hs:5:5: error:
• Couldn't match type ‘[]’ with ‘IO’
Expected type: IO Float
Actual type: [Float]
• In a stmt of a 'do' block: map read $ words scan :: [Float]
In the expression:
do scan <- readFile "g924310_b1_copy.txt"
map read $ words scan :: [Float]
putStr scan
In an equation for ‘main’:
main
= do scan <- readFile "g924310_b1_copy.txt"
map read $ words scan :: [Float]
putStr scan
The question is, how do implement the ghci line such that I can get all the words from the scan and make a list of them that I can later fit regressions, add constants to etc.

In Haskell, variables are immutable. So map read $ words scan doesn't change the variable scan; it returns a new value. You need to use this new value if you want to do something with it.
import System.IO
main = do
scan <- readFile "g924310_b1_copy.txt"
let scan' = map read $ words scan :: [Float]
print scan'

I would do the following:
floats :: String -> [Float]
floats = fmap read . words
main :: IO ()
main = print =<< fmap floats readFile "g924310_b1_copy.txt"

Related

Reading numbers inline

Imagine I read an input block via stdin that looks like this:
3
12
16
19
The first number is the number of following rows. I have to process these numbers via a function and report the results separated by a space.
So I wrote this main function:
main = do
num <- readLn
putStrLn $ intercalate " " [ show $ myFunc $ read getLine | c <- [1..num]]
Of course that function doesn't compile because of the read getLine.
But what is the correct (read: the Haskell way) way to do this properly? Is it even possible to write this function as a one-liner?

Is it even possible to write this function as a one-liner?
Well, it is, and it's kind of concise, but see for yourself:
main = interact $ unwords . map (show . myFunc . read) . drop 1 . lines
So, how does this work?
interact :: (String -> String) -> IO () takes all contents from STDIN, passes it through the given function, and prints the output.
We use unwords . map (show . myFunc . read) . drop 1 . lines :: String -> String:
lines :: String -> [String] breaks a string at line ends.
drop 1 removes the first line, as we don't actually need the number of lines.
map (show . myFunc . read) converts each String to the correct type, uses myFunc, and then converts it back to a `String.
unwords is basically the same as intercalate " ".
However, keep in mind that interact isn't very GHCi friendly.

You can build a list of monadic actions with <$> (or fmap) and execute them all with sequence.
λ intercalate " " <$> sequence [show . (2*) . read <$> getLine | _ <- [1..4]]
1
2
3
4
"2 4 6 8"

Is it even possible to write this function as a one-liner?
Sure, but there is a problem with the last line of your main function. Because you're trying to apply intercalate " " to
[ show $ myFunc $ read getLine | c <- [1..num]]
I'm guessing you expect the latter to have type [String], but it is in fact not a well-typed expression. How can that be fixed? Let's first define
getOneInt :: IO Int
getOneInt = read <$> getLine
for convenience (we'll be using it multiple times in our code). Now, what you meant is probably something like
[ show . myFunc <$> getOneInt | c <- [1..num]]
which, if the type of myFunc aligns with the rest, has type [IO String]. You can then pass that to sequence in order to get a value of type IO [String] instead. Finally, you can "pass" that (using =<<) to
putStrLn . intercalate " "
in order to get the desired one-liner:
import Control.Monad ( replicateM )
import Data.List ( intercalate )
main :: IO ()
main = do
num <- getOneInt
putStrLn . intercalate " " =<< sequence [ show . myFunc <$> getOneInt | c <- [1..num]]
where
myFunc = (* 3) -- for example
getOneInt :: IO Int
getOneInt = read <$> getLine
In GHCi:
λ> main
3
45
23
1
135 69 3
Is the code idiomatic and readable, though? Not so much, in my opinion...
[...] what is the correct (read: the Haskell way) way to do this properly?
There is no "correct" way of doing it, but the following just feels more natural and readable to me:
import Control.Monad ( replicateM )
import Data.List ( intercalate )
main :: IO ()
main = do
n <- getOneInt
ns <- replicateM n getOneInt
putStrLn $ intercalate " " $ map (show . myFunc) ns
where
myFunc = (* 3) -- replace by your own function
getOneInt :: IO Int
getOneInt = read <$> getLine
Alternatively, if you want to eschew the do notation:
main =
getOneInt >>=
flip replicateM getOneInt >>=
putStrLn . intercalate " " . map (show . myFunc)
where
myFunc = (* 3) -- replace by your own function

Couldn't match type `[]' with `IO' -- Haskell

I'm beginner in Haskell. In this task i'm performing the split operation but i'm facing problem because of type mis match. I'm reading data from text file and the data is in table format. Ex. 1|2|Rahul|13.25. In this format. Here | is delimiter so i want to split the data from the delimiter | and want to print 2nd column and 4th column data but i'm getting the error like this
"Couldn't match type `[]' with `IO'
Expected type: IO [Char]
Actual type: [[Char]]
In the return type of a call of `splitOn'"
Here is my code..
module Main where
import Data.List.Split
main = do
list <- readFile("src/table.txt")
putStrLn list
splitOn "|" list
Any help regarding this will appreciate.. Thanks

The problem is that you're trying to return a list from the main function, which has a type of IO ().
What you probably want to do is print the result.
main = do
list <- readFile("src/table.txt")
putStrLn list
print $ splitOn "|" list

Not Haskell, but it looks like a typical awk task.
cat src/table.txt | awk -F'|' '{print $2, $4}'
Back to Haskell the best I could find is :
module Main where
import Data.List.Split(splitOn)
import Data.List (intercalate)
project :: [Int] -> [String] -> [String]
project indices l = foldl (\acc i -> acc ++ [l !! i]) [] indices
fromString :: String -> [[String]]
fromString = map (splitOn "|") . lines
toString :: [[String]] -> String
toString = unlines . map (intercalate "|")
main :: IO ()
main = do
putStrLn =<<
return . toString . map (project [1, 3]) . fromString =<<
readFile("table.txt")
If not reading from a file, but from stdin, the interact function could be useful.

How to do something with data from stdin, line by line, a maximum number of times and printing the number of line in Haskell

This code reads the number of lines to process from the first line of stdin, then it loops number_of_lines_to_process times doing some calculations and prints the result.
I want it to print the line number in "Line #" after "#" but I don't know how to obtain it
import IO
import Control.Monad (replicateM)
main :: IO ()
main = do
hSetBuffering stdin LineBuffering
s <- getLine
let number_of_lines_to_process = read s :: Integer
lines <- replicateM (fromIntegral(number_of_lines_to_process)) $ do
line <- getLine
let number = read line :: Integer
result = number*2 --example
putStrLn ("Line #"++": "++(show result)) --I want to print the number of the iteration and the result
return ()
I guess that the solution to this problem is really easy, but I'm not familiar with Haskell (coding in it for the first time) and I didn't find any way of doing this. Can anyone help?

You could use forM_ instead of replicateM:
import IO
import Control.Monad
main :: IO ()
main = do
hSetBuffering stdin LineBuffering
s <- getLine
let number_of_lines_to_process = read s :: Integer
forM_ [1..number_of_lines_to_process] (\i -> do
line <- getLine
let number = read line :: Integer
result = number * 2
putStrLn $ "Line #" ++ show i ++ ": " ++ show result)
Note that because you use forM_ (which discards the results of each iteration) you don't need the additional return () at the end - the do block returns the value of the last statement, which in this case is the () which is returned by forM_.

The trick is to first create a list of all the line numbers you want to print, and to then loop through that list, printing each number in turn. So, like this:
import Control.Monad
import System.IO
main :: IO ()
main = do
hSetBuffering stdin LineBuffering
s <- getLine
let lineCount = read s :: Int
-- Create a list of the line numbers
lineNumbers = [1..lineCount]
-- `forM_` is like a "for-loop"; it takes each element in a list and performs
-- an action function that takes the element as a parameter
forM_ lineNumbers $ \ lineNumber -> do
line <- getLine
let number = read line :: Integer
result = number*2 --example
putStrLn $ "Line #" ++ show lineNumber ++ ": " ++ show result
return ()
Read the definition of forM_.
By the way, I wouldn't recommend using the old Haskell98 IO library. Use System.IO instead.

You could calculate the results, enumerate them, and then print them:
import IO
import Control.Monad (replicateM)
-- I'm assuming you start counting from zero
enumerate xs = zip [0..] xs
main :: IO ()
main = do
hSetBuffering stdin LineBuffering
s <- getLine
let number_of_lines_to_process = read s :: Integer
lines <- replicateM (fromIntegral(number_of_lines_to_process)) $ do
line <- getLine
let number = read line :: Integer
result = number*2 --example
return result
mapM_ putStrLn [ "Line "++show i++": "++show l | (i,l) <- enumerate lines ]

I'm still new at Haskell, so there could be problems with the program below (it does work). This program is a tail recursive implementation. The doLine helper function carries around the line number. The processing step is factored into process, which you can change according to the problem you are presented.
import System.IO
import Text.Printf
main = do
hSetBuffering stdin LineBuffering
s <- getLine
let number_of_lines_to_process = read s :: Integer
processLines number_of_lines_to_process
return ()
-- This reads "max" lines from stdin, processing each line and
-- printing the result.
processLines :: Integer -> IO ()
processLines max = doLine 0
where doLine i
| i == max = return ()
| otherwise =
do
line <- getLine
let result = process line
Text.Printf.printf "Line #%d: %d\n" (i + 1) result
doLine (i + 1)
-- Just an example. (This doubles the input.)
process :: [Char] -> Integer
process line = let number = read line :: Integer
in
number * 2
I'm a haskell rookie, so any critiques of the above are welcome.

Just as an alternative, I thought that you might enjoy an answer with minimal monad mucking and no do notation. We zip a lazy list of the user's data with an infinite list of the line number using the enumerate function to give us our desired output.
import System.IO
import Control.Monad (liftM)
--Here's the function that does what you really want with the data
example = (* 2)
--Enumerate takes a function, a line number, and a line of input and returns
--an ennumerated line number of the function performed on the data
enumerate :: (Show a, Show b, Read a) => (a->b) -> Integer -> String -> String
enumerate f i x = "Line #" ++
show i ++
": " ++
(show . f . read $ x) -- show . f . read handles our string conversion
-- Runover takes a list of lines and runs
-- an enumerated version of the sample over those lines.
-- The first line is the number of lines to process.
runOver :: [String] -> [String]
runOver (line:lines) = take (read line) $ --We only want to process the number of lines given in the first line
zipWith (enumerate example) [1..] lines -- run the enumerated example
-- over the list of numbers and the list of lines
-- In our main, we'll use liftM to lift our functions into the IO Monad
main = liftM (runOver . lines) getContents

Read n lines into a [String]

I'm trying to read n lines of content into a List of Strings. I've tried several variations of the code below, but nothing worked.
main = do
input <- getLine
inputs <- mapM getLine [1..read input]
print $ length input
This throws the following error:
Couldn't match expected type `a0 -> IO b0'
with actual type `IO String'
In the first argument of `mapM', namely `getLine'
In a stmt of a 'do' block: inputs <- mapM getLine [1 .. read input]
In the expression:
do { input <- getLine;
inputs <- mapM getLine [1 .. read input];
print $ length input }
And
main = do
input <- getLine
let inputs = map getLine [1..read input]
print $ length input
throws
Couldn't match expected type `a0 -> b0'
with actual type `IO String'
In the first argument of `map', namely `getLine'
In the expression: map getLine [1 .. read input]
In an equation for `inputs': inputs = map getLine [1 .. read input]
How can I do this?

Use replicateM from Control.Monad:
main = do
input <- getLine
inputs <- replicateM (read input) getLine
print $ length inputs
In the spirit of give a man a fish / teach a man to fish: You could have found this yourself by searching Hoogle.
You have:
an action to perform of type IO String
a number of times to perform that action (type Int)
You want:
an action of type IO [String]
So you could search Hoogle for (IO String) -> Int -> (IO [String]). replicateM is the first hit.

One other way you can do this action is by using the pure replicate and the sequence :: (Traversable t, Monad m) => t (m a) -> m (t a)
tool. Lets say first we will ask for a count and then ask for that many integers to print their sum on the terminal.
sumCountManyData :: IO ()
sumCountManyData = putStr "How many integers to sum..? "
>> getLine
>>= sequence . flip replicate getLine . read
>>= print . sum . map read

How do I parse a matrix of integers in Haskell?

So I've read the theory, now trying to parse a file in Haskell - but am not getting anywhere. This is just so weird...
Here is how my input file looks:
m n
k1, k2...
a11, ...., an
a21,.... a22
...
am1... amn
Where m,n are just intergers, K = [k1, k2...] is a list of integers, and a11..amn is a "matrix" (a list of lists): A=[[a11,...a1n], ... [am1... amn]]
Here is my quick python version:
def parse(filename):
"""
Input of the form:
m n
k1, k2...
a11, ...., an
a21,.... a22
...
am1... amn
"""
f = open(filename)
(m,n) = f.readline().split()
m = int(m)
n = int(n)
K = [int(k) for k in f.readline().split()]
# Matrix - list of lists
A = []
for i in range(m):
row = [float(el) for el in f.readline().split()]
A.append(row)
return (m, n, K, A)
And here is how (not very) far I got in Haskell:
import System.Environment
import Data.List
main = do
(fname:_) <- getArgs
putStrLn fname --since putStrLn goes to IO ()monad we can't just apply it
parsed <- parse fname
putStrLn parsed
parse fname = do
contents <- readFile fname
-- ,,,missing stuff... ??? how can I get first "element" and match on it?
return contents
I am getting confused by monads (and the context that the trap me into!), and the do statement. I really want to write something like this, but I know it's wrong:
firstLine <- contents.head
(m,n) <- map read (words firstLine)
because contents is not a list - but a monad.
Any help on the next step would be great.
So I've just discovered that you can do:
liftM lines . readFile
to get a list of lines from a file. However, still the example only only transforms the ENTIRE file, and doesn't use just the first, or the second lines...

The very simple version could be:
import Control.Monad (liftM)
-- this operates purely on list of strings
-- and also will fail horribly when passed something that doesn't
-- match the pattern
parse_lines :: [String] -> (Int, Int, [Int], [[Int]])
parse_lines (mn_line : ks_line : matrix_lines) = (m, n, ks, matrix)
where [m, n] = read_ints mn_line
ks = read_ints ks_line
matrix = parse_matrix matrix_lines
-- this here is to loop through remaining lines to form a matrix
parse_matrix :: [String] -> [[Int]]
parse_matrix lines = parse_matrix' lines []
where parse_matrix' [] acc = reverse acc
parse_matrix' (l : ls) acc = parse_matrix' ls $ (read_ints l) : acc
-- this here is to give proper signature for read
read_ints :: String -> [Int]
read_ints = map read . words
-- this reads the file contents and lifts the result into IO
parse_file :: FilePath -> IO (Int, Int, [Int], [[Int]])
parse_file filename = do
file_lines <- (liftM lines . readFile) filename
return $ parse_lines file_lines
You might want to look into Parsec for fancier parsing, with better error handling.
*Main Control.Monad> parse_file "test.txt"
(3,3,[1,2,3],[[1,2,3],[4,5,6],[7,8,9]])

An easy to write solution
import Control.Monad (replicateM)
-- Read space seperated words on a line from stdin
readMany :: Read a => IO [a]
readMany = fmap (map read . words) getLine
parse :: IO (Int, Int, [Int], [[Int]])
parse = do
[m, n] <- readMany
ks <- readMany
xss <- replicateM m readMany
return (m, n, ks, xss)
Let's try it:
*Main> parse
2 2
123 321
1 2
3 4
(2,2,[123,321],[[1,2],[3,4]])
While the code I presented is quite expressive. That is, you get work done quickly with little code, it has some bad properties. Though I think if you are still learning haskell and haven't started with parser libraries. This is the way to go.
Two bad properties of my solution:
All code is in IO, nothing is testable in isolation
The error handling is very bad, as you see the pattern matching is very aggressive in [m, n]. What happens if we have 3 elements on the first line of the input file?

liftM is not magic! You would think it does some arcane thing to lift a function f into a monad but it is actually just defined as:
liftM f x = do
y <- x
return (f y)
We could actually use liftM to do what you wanted to, that is:
[m,n] <- liftM (map read . words . head . lines) (readFile fname)
but what you are looking for are let statements:
parseLine = map read . words
parse fname = do
(x:y:xs) <- liftM lines (readFile fname)
let [m,n] = parseLine x
let ks = parseLine y
let matrix = map parseLine xs
return (m,n,ks,matrix)
As you can see we can use let to mean variable assignment rather then monadic computation. In fact let statements are you just let expressions when we desugar the do notation:
parse fname =
liftM lines (readFile fname) >>= (\(x:y:xs) ->
let [m,n] = parseLine x
ks = parseLine y
matrix = map parseLine xs
in return matrix )

A Solution Using a Parsing Library
Since you'll probably have a number of people responding with code that parses strings of Ints into [[Int]] (map (map read . words) . lines $ contents), I'll skip that and introduce one of the parsing libraries. If you were to do this task for real work you'd probably use such a library that parses ByteString (instead of String, which means your IO reads everything into a linked list of individual characters).
import System.Environment
import Control.Monad
import Data.Attoparsec.ByteString.Char8
import qualified Data.ByteString as B
First, I imported the Attoparsec and bytestring libraries. You can see these libraries and their documentation on hackage and install them using the cabal tool.
main = do
(fname:_) <- getArgs
putStrLn fname
parsed <- parseX fname
print parsed
main is basically unchanged.
parseX :: FilePath -> IO (Int, Int, [Int], [[Int]])
parseX fname = do
bs <- B.readFile fname
let res = parseOnly parseDrozzy bs
-- We spew the error messages right here
either (error . show) return res
parseX (renamed from parse to avoid name collision) uses the bytestring library's readfile, which reads in the file packed, in contiguous bytes, instead of into cells of a linked list. After parsing I use a little shorthand to return the result if the parser returned Right result or print an error if the parser returned a value of Left someErrorMessage.
-- Helper functions, more basic than you might think, but lets ignore it
sint = skipSpace >> int
int = liftM floor number
parseDrozzy :: Parser (Int, Int, [Int], [[Int]])
parseDrozzy = do
m <- sint
n <- sint
skipSpace
ks <- manyTill sint endOfLine
arr <- count m (count n sint)
return (m,n,ks,arr)
The real work then happens in parseDrozzy. We get our m and n Int values using the above helper. In most Haskell parsing libraries we must explicitly handle whitespace - so I skip the newline after n to get to our ks. ks is just all the int values before the next newline. Now we can actually use the previously specified number of rows and columns to get our array.
Technically speaking, that final bit arr <- count m (count n sint) doesn't follow your format. It will grab n ints even if it means going to the next line. We could copy Python's behavior (not verifying the number of values in a row) using count m (manyTill sint endOfLine) or we could check for each end of line more explicitly and return an error if we are short on elements.
From Lists to a Matrix
Lists of lists are not 2 dimensional arrays - the space and performance characteristics are completely different. Let's pack our list into a real matrix using Data.Array.Repa (import Data.Array.Repa). This will allow us to access the elements of the array efficiently as well as perform operations on the entire matrix, optionally spreading the work among all the available CPUs.
Repa defines the dimensions of your array using a slightly odd syntax. If your row and column lengths are in variables m and n then Z :. n :. m is much like the C declaration int arr[m][n]. For the one dimensional example, ks, we have:
fromList (Z :. (length ks)) ks
Which changes our type from [Int] to Array DIM1 Int.
For the two dimensional array we have:
let matrix = fromList (Z :. m :. n) (concat arr)
And change our type from [[Int]] to Array DIM2 Int.
So there you have it. A parsing of your file format into an efficient Haskell data structure using production-oriented libraries.

What about something simple like this?
parse :: String -> (Int, Int, [Int], [[Int]])
parse stuff = (m, n, ks, xss)
where (line1:line2:rest) = lines stuff
readMany = map read . words
(m:n:_) = readMany line1
ks = readMany line2
xss = take m $ map (take n . readMany) rest
main :: IO ()
main = do
stuff <- getContents
let (m, n, ks, xss) = parse stuff
print m
print n
print ks
print xss

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Haskell Read from file and parse string into words - haskell

I would do the following: floats :: String -> [Float] floats = fmap read . words main :: IO () main = print =<< fmap floats readFile "g924310_b1_copy.txt"

Related

Reading numbers inline

Couldn't match type `[]' with `IO' -- Haskell

How to do something with data from stdin, line by line, a maximum number of times and printing the number of line in Haskell

Read n lines into a [String]

How do I parse a matrix of integers in Haskell?

Categories

Resources