As the title states, I am trying to find a given string within a given path. Here is what I have come up with so far:
import Control.Monad (forM)
import qualified Data.List as L
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.FilePath (takeExtension, (</>))

getRecursiveContents :: FilePath -> IO [FilePath]
getRecursiveContents topdir = do
    names <- getDirectoryContents topdir
    let properNames = filter (`notElem` [".", ".."]) names
    paths <- forM properNames $ \name -> do
        let path = topdir </> name
        isDirectory <- doesDirectoryExist path
        if isDirectory
            then getRecursiveContents path
            else return [path]
    return (concat paths)

findInFile :: String -> FilePath -> IO Bool
findInFile needle filePath = do
    content <- readFile filePath
    return (needle `L.isInfixOf` content)

findInFolder :: (String -> Bool) -> FilePath -> String -> IO [IO Bool]
findInFolder p path needle = do
    files <- getRecursiveContents path
    return (map (findInFile needle) (filter p files))

find = findInFolder (\p -> takeExtension p `elem` [".py", ".xml", ".html"])
I can search a single file:
*Main> findInFile "search_string" "./path/to/a/file"
True
Which is perfect but I cannot do the same search for a folder:
*Main> find "./path/to/a/folder" "search_string"
*Main>
In my file system ./path/to/a/file is located under ./path/to/a/folder. Thus I was expecting the same result.
What am I doing wrong?
Note: getRecursiveContents is from real world haskell.
It does indeed work. The only issue is with how things are printed. When you type an expression into GHCi, it calls print on the result. If the value has type IO x, GHCi executes the IO action and prints the result only if x has a Show instance; otherwise it prints nothing further.
find "./path/to/a/folder" "search_string" is an IO action whose result is a list of IO actions, and IO actions have no Show instance, so nothing is printed. You can bind the result of find, which is that list of IO actions, and then execute them:
> x <- find "./path/to/a/folder" "search_string"
> sequence x
[True, False ...
Likely you wanted to do this originally in your function. Simply make the following changes:
findInFolder :: (String -> Bool) -> FilePath -> String -> IO [Bool]
findInFolder p path needle = do
    files <- getRecursiveContents path
    mapM (findInFile needle) (filter p files)
Now findInFolder will work as you expect.
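The difference between the two versions can be seen without touching the file system. The following is only a minimal sketch: containsNeedle is a hypothetical stand-in for findInFile that searches an in-memory string, and pendingChecks and runChecks are illustrative names that do not appear in the question.

```haskell
import Data.List (isInfixOf)

-- Hypothetical stand-in for findInFile: it checks an in-memory string
-- instead of reading a file, so the sketch needs no file system.
containsNeedle :: String -> String -> IO Bool
containsNeedle needle haystack = return (needle `isInfixOf` haystack)

-- map only builds the actions; none of them has been run yet, and a
-- value of type [IO Bool] has no Show instance.
pendingChecks :: String -> [String] -> [IO Bool]
pendingChecks needle = map (containsNeedle needle)

-- mapM builds the actions and runs them in order, collecting the results.
runChecks :: String -> [String] -> IO [Bool]
runChecks needle = mapM (containsNeedle needle)
```

Running sequence over the list returned by pendingChecks should give the same list of Booleans as runChecks, which is why the sequence x workaround and the mapM version agree.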
I am trying to fix and run every example in the Real World Haskell book and learn something in the process, and I got stuck at chapter 9. By reading the comments, I got the following code to compile:
FoldDir.hs:
import ControlledVisit
import Data.Char (toLower)
import Data.Time.Clock (UTCTime(..))
import System.Directory (Permissions(..))
import System.FilePath ((</>), takeExtension, takeFileName)
data Iterate seed
    = Done { unwrap :: seed }
    | Skip { unwrap :: seed }
    | Continue { unwrap :: seed }
    deriving (Show)

type Iterator seed = seed -> Info -> Iterate seed

foldTree :: Iterator a -> a -> FilePath -> IO a
foldTree iter initSeed path = do
    endSeed <- fold initSeed path
    return (unwrap endSeed)
  where
    fold seed subpath = getUsefulContents subpath >>= walk seed

    walk seed (name : names) = do
        let path' = path </> name
        info <- getInfo path'
        case iter seed info of
            done@(Done _) -> return done
            Skip seed' -> walk seed' names
            Continue seed'
                | isDirectory info -> do
                    next <- fold seed' path'
                    case next of
                        done@(Done _) -> return done
                        seed'' -> walk (unwrap seed'') names
                | otherwise -> walk seed' names
    walk seed _ = return (Continue seed)

atMostThreePictures :: Iterator [FilePath]
atMostThreePictures paths info
    | length paths == 3
    = Done paths
    | isDirectory info && takeFileName path == ".svn"
    = Skip paths
    | extension `elem` [".jpg", ".png"]
    = Continue (path : paths)
    | otherwise
    = Continue paths
  where
    extension = map toLower (takeExtension path)
    path = infoPath info

countDirectories count info =
    Continue (if isDirectory info then count + 1 else count)
ControlledVisit.hs:
module ControlledVisit where
import Control.Monad (forM, liftM)
import Data.Time.Clock (UTCTime(..))
import System.FilePath ((</>))
import System.Directory
    ( Permissions(..)
    , getModificationTime
    , getPermissions
    , getDirectoryContents
    )
import Control.Exception
    ( bracket
    , handle
    , SomeException(..)
    )
import System.IO
    ( IOMode(..)
    , hClose
    , hFileSize
    , openFile
    )

data Info = Info
    { infoPath :: FilePath
    , infoPerms :: Maybe Permissions
    , infoSize :: Maybe Integer
    , infoModTime :: Maybe UTCTime
    } deriving (Eq, Ord, Show)

getInfo :: FilePath -> IO Info
getInfo path = do
    perms <- maybeIO (getPermissions path)
    size <- maybeIO (bracket (openFile path ReadMode) hClose hFileSize)
    modified <- maybeIO (getModificationTime path)
    return (Info path perms size modified)

traverseDirs :: ([Info] -> [Info]) -> FilePath -> IO [Info]
traverseDirs order path = do
    names <- getUsefulContents path
    contents <- mapM getInfo (path : map (path </>) names)
    liftM concat $ forM (order contents) $ \ info -> do
        if isDirectory info && infoPath info /= path
            then traverseDirs order (infoPath info)
            else return [info]

getUsefulContents :: FilePath -> IO [String]
getUsefulContents path = do
    names <- getDirectoryContents path
    return (filter (`notElem` [".", ".."]) names)

isDirectory :: Info -> Bool
isDirectory = maybe False searchable . infoPerms

maybeIO :: IO a -> IO (Maybe a)
maybeIO act = handle (\ (SomeException _) -> return Nothing) (Just `liftM` act)

traverseVerbose order path = do
    names <- getDirectoryContents path
    let usefulNames = filter (`notElem` [".", ".."]) names
    contents <- mapM getEntryName ("" : usefulNames)
    recursiveContents <- mapM recurse (order contents)
    return (concat recursiveContents)
  where
    getEntryName name = getInfo (path </> name)
    isDirectory info = case infoPerms info of
        Nothing -> False
        Just perms -> searchable perms
    recurse info = do
        if isDirectory info && infoPath info /= path
            then traverseVerbose order (infoPath info)
            else return [info]
But when I try to run it in GHCi as explained in the book it fails with a weird error that as far as I understand is about GHCi itself:
Prelude> :l FoldDir.hs
[1 of 2] Compiling ControlledVisit ( ControlledVisit.hs, interpreted )
[2 of 2] Compiling Main ( FoldDir.hs, interpreted )
Ok, two modules loaded.
*Main> foldTree atMostThreePictures []
<interactive>:2:1: error:
• No instance for (Show (FilePath -> IO [FilePath]))
arising from a use of ‘print’
(maybe you haven't applied a function to enough arguments?)
• In a stmt of an interactive GHCi command: print it
I think I understand the No instance for (Show (FilePath -> IO [FilePath])) part but I am clueless about the print it. I know it is a special variable in GHCi that stores the evaluation result of the last expression and I guess the code is trying to print a function or a monad, but I don't get where it is happening.
Put as simply as possible: the signature of your function foldTree is:
foldTree :: Iterator a -> a -> FilePath -> IO a
You are supplying it with two arguments: one of type Iterator [FilePath] and a second of type [FilePath] (the initial seed). Because of partial application, such a call returns a function with the signature:
FilePath -> IO [FilePath]
GHCi wants to display the result of your call but cannot, as this type has no instance of the Show type class. So it gives you an error saying exactly that. Supplying the missing FilePath argument yields an IO [FilePath], which GHCi can run and print.
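The same effect can be reproduced with a much smaller function. This is a sketch under assumptions: foldItems is a hypothetical function that merely has the same argument shape as foldTree (step function, seed, input), and it folds over a list of Ints rather than a directory tree.

```haskell
-- Hypothetical three-argument function with the same shape as foldTree:
-- a step function, an initial seed, and the input to traverse.
foldItems :: (a -> Int -> a) -> a -> [Int] -> IO a
foldItems step seed items = return (foldl step seed items)

-- Two arguments supplied: the result is still a function, and functions
-- have no Show instance, so GHCi cannot print it.
waitingForInput :: [Int] -> IO [Int]
waitingForInput = foldItems (\acc x -> x : acc) []

-- All three arguments supplied: the result is an IO action whose result
-- type ([Int]) has a Show instance, so GHCi can run it and print it.
fullyApplied :: IO [Int]
fullyApplied = waitingForInput [1, 2, 3]
```

Typing waitingForInput at the GHCi prompt should produce the same kind of "No instance for Show" error as in the question, while fullyApplied is an action GHCi can execute.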
So I'm writing a program that checks, for every line of a .txt file, whether it is a palindrome or not:
import System.IO

main :: IO()
main = do {
    content <- readFile "palindrom.txt";
    print content;
    print (lines content);
    singleWord (head (lines content));
    return ();
}

palindrom :: [Char] -> Bool
palindrom a = a == reverse a

singleWord :: [Char] -> IO()
singleWord a = do {
    print (length a);
    print (show (palindrom a));
}
But instead of singleWord (head (lines content)) I need to run the singleWord through the entire list.
The problem is that with map or normal list comprehension I always get a ton of varying errors all to do with lines content (which should be an array of Strings or IO Strings) apparently always being the type I don't want (I've tried messing around with type declarations on that forever, but it keeps being the wrong type, or the right one but in an extra array-layer or whatever).
My last attempt is to walk through the array with recursion, with this little extra code:
walkthrough [] = []
walkthrough x = do { singleWord head x; walkthrough (tail x) }
which I can't get to typecheck no matter what.
It's supposed to replace the singleWord (head (lines content)) in main, and if I try adding a type signature, like
walkthrough :: [[Char]] -> [[Char]]
walkthrough [] = ["Hi"]
walkthrough x = do { singleWord head x; walkthrough (tail x) }
I get
Couldn't match type `IO' with `[]'
Expected type: [()]
Actual type: IO ()
or some other stuff that won't fit together.
You're looking for a function called mapM_.
main :: IO ()
main = do {
    content <- readFile "palindrom.txt";
    mapM_ singleWord (lines content);
};

palindrome :: [Char] -> Bool
palindrome a = (a == reverse a)

singleWord :: [Char] -> IO()
singleWord a = do {
    let {
        adverb = (if palindrome a then " " else " not ");
    };
    putStrLn (a ++ " is" ++ adverb ++ "a palindrome.");
};
That should've been
walkthrough [] = return () -- this is the final action
walkthrough x = do { singleWord (head x) -- here you missed the parens
; walkthrough (tail x) }
or better yet,
walkthrough [] = return ()
walkthrough (x:xs) = do { singleWord x -- can't make that mistake now!
; walkthrough xs}
and call it as walkthrough (lines content) in your main do block.
As others have pointed out, walkthrough is the same as mapM_ singleWord.
You could also write it with a list comprehension,
walkthrough xs = sequence_ [ singleWord x | x <- xs]
sequence_ :: Monad m => [m a] -> m () turns a list of actions into a sequence of actions discarding their results and producing the () in the end: sequence_ = foldr (>>) (return ()). And sequence_ (map f xs) === mapM_ f xs, so it all ties up in the end.
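The claimed equivalences can be checked by recording the effects instead of printing them. This is a rough sketch, not part of the original answers: record is a hypothetical stand-in for singleWord that appends to an IORef log, and walk, viaFold and logged are illustrative names.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical action standing in for singleWord: it records its
-- argument in a log instead of printing, so effects can be compared.
record :: IORef [String] -> String -> IO ()
record logRef word = modifyIORef logRef (word :)

-- Explicit recursion, as in walkthrough.
walk :: (a -> IO ()) -> [a] -> IO ()
walk _ []       = return ()
walk f (x : xs) = f x >> walk f xs

-- sequence_ (map f xs), with sequence_ written out as a fold over (>>).
viaFold :: (a -> IO ()) -> [a] -> IO ()
viaFold f xs = foldr (>>) (return ()) (map f xs)

-- Runs a traversal against a fresh log and returns the recorded words
-- in the order in which the actions were executed.
logged :: ((String -> IO ()) -> [String] -> IO ()) -> [String] -> IO [String]
logged traversal ws = do
    logRef <- newIORef []
    traversal (record logRef) ws
    reverse <$> readIORef logRef
```

Passing mapM_, walk and viaFold to logged with the same input should yield the same log each time, which is the sense in which the three formulations are interchangeable.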
Use mapM_ singleWord (lines content). For the sake of simplicity, think of mapM_ as having this type:
mapM_ :: (a -> IO ()) -> [a] -> IO ()
Right, I have two functions. Both take exactly the same file input. run2D works perfectly, but oneR gives me the error Prelude.read: no parse. This confuses me as it's my understanding that the no parse error usually means there's a problem with the input file, which there obviously isn't.
run2D :: [String] -> IO()
run2D [file,r] = do
    thefile <- readFile file
    let [v,e,f] = lines thefile
    print(pullChi(eg2D (verticesMake (read v)) (read e) (read f) (read r)) (read r))

oneR :: [String] -> IO()
oneR [file] = do
    thefile <- readFile file
    let [v,e,f] = lines thefile
    print(oneRobot (read v) (read e) (read f))
Here's the contents of my input file
7
[[0,1],[1,2],[0,2],[1,3],[2,3],[1,4],[2,4],[0,6],[1,6],[1,5],[5,6],[4,5]]
[[0,1,2],[1,2,3],[1,2,4],[0,1,6],[1,5,6],[1,4,5]]
and my oneRobot function
oneRobot :: Integer -> [Integer] -> [Integer] -> Integer -- Takes #vertices, list of edges and robots and returns the euler characteristic where number of robots = 1
oneRobot v e f = v - genericLength(e) + genericLength(f)
The problem is: in your file, you have a representation of [[Integer]] at the second and the third line.
Change oneRobot function signature and implementation to reflect this:
oneRobot :: Integer -> [[Integer]] -> [[Integer]] -> Integer
or flatten your list of integer lists with concat if it fits your task:
print(oneRobot (read v) (concat $ read e) (concat $ read f))
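The type mismatch can be made visible with readMaybe from Text.Read, which returns Nothing where read would throw "Prelude.read: no parse". The sketch below assumes a shortened version of the edge line from the input file; edgesLine, asFlatList and asNestedList are illustrative names, and oneRobot uses the adjusted signature suggested above.

```haskell
import Data.List (genericLength)
import Text.Read (readMaybe)

-- A shortened line in the same format as the second line of the input file.
edgesLine :: String
edgesLine = "[[0,1],[1,2],[0,2]]"

-- Parsing at the wrong type fails; readMaybe reports this as Nothing,
-- where read would throw "Prelude.read: no parse".
asFlatList :: Maybe [Integer]
asFlatList = readMaybe edgesLine

-- Parsing at the type the text actually describes succeeds.
asNestedList :: Maybe [[Integer]]
asNestedList = readMaybe edgesLine

-- Euler characteristic with edges and faces given as lists of vertex lists.
oneRobot :: Integer -> [[Integer]] -> [[Integer]] -> Integer
oneRobot v e f = v - genericLength e + genericLength f
```

The input text is the same in both cases; only the type at which it is parsed differs, which is why run2D can succeed on a file that makes oneR fail.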
I have the following code:
import System.Environment
import System.Directory
import System.IO
import Data.List

dispatch :: [(String, [String] -> IO ())]
dispatch = [ ("add", add)
           , ("view", view)
           , ("remove", remove)
           , ("bump", bump)
           ]

main = do
    (command:args) <- getArgs
    let result = lookup command dispatch
    if result == Nothing then
        errorExit
    else do
        let (Just action) = result
        action args

errorExit :: IO ()
errorExit = do
    putStrLn "Incorrect command"

add :: [String] -> IO ()
add [fileName, todoItem] = appendFile fileName (todoItem ++ "\n")

view :: [String] -> IO ()
view [fileName] = do
    contents <- readFile fileName
    let todoTasks = lines contents
        numberedTasks = zipWith (\n line -> show n ++ " - " ++ line) [0..] todoTasks
    putStr $ unlines numberedTasks

remove :: [String] -> IO ()
remove [fileName, numberString] = do
    handle <- openFile fileName ReadMode
    (tempName, tempHandle) <- openTempFile "." "temp"
    contents <- hGetContents handle
    let number = read numberString
        todoTasks = lines contents
        newTodoItems = delete (todoTasks !! number) todoTasks
    hPutStr tempHandle $ unlines newTodoItems
    hClose handle
    hClose tempHandle
    removeFile fileName
    renameFile tempName fileName

bump :: [String] -> IO ()
bump [fileName, numberString] = do
    handle <- openFile fileName ReadMode
    (tempName, tempHandle) <- openTempFile "." "temp"
    contents <- hGetContents handle
    let number = read numberString
        todoTasks = lines contents
        bumpedItem = todoTasks !! number
        newTodoItems = [bumpedItem] ++ delete bumpedItem todoTasks
    hPutStr tempHandle $ unlines newTodoItems
    hClose handle
    hClose tempHandle
    removeFile fileName
    renameFile tempName fileName
Trying to compile it gives me the following error:
$ ghc --make todo
[1 of 1] Compiling Main ( todo.hs, todo.o )
todo.hs:16:15:
No instance for (Eq ([[Char]] -> IO ()))
arising from a use of `=='
Possible fix:
add an instance declaration for (Eq ([[Char]] -> IO ()))
In the expression: result == Nothing
In a stmt of a 'do' block:
if result == Nothing then
errorExit
else
do { let (Just action) = ...;
action args }
In the expression:
do { (command : args) <- getArgs;
let result = lookup command dispatch;
if result == Nothing then
errorExit
else
do { let ...;
.... } }
I don't get why that is, since lookup returns Maybe a, which I should surely be able to compare to Nothing.
The type of the (==) operator is Eq a => a -> a -> Bool. What this means is that you can only compare objects for equality if they're of a type which is an instance of Eq. And functions aren't comparable for equality: how would you write (==) :: (a -> b) -> (a -> b) -> Bool? There's no way to do it. [1] And while clearly Nothing == Nothing and Just x /= Nothing, it's the case that Just x == Just y if and only if x == y; thus, there's no way to write (==) for Maybe a unless you can write (==) for a.
The best solution here is to use pattern matching. In general, I don't find myself using that many if statements in my Haskell code. You can instead write:
main = do (command:args) <- getArgs
          case lookup command dispatch of
            Just action -> action args
            Nothing     -> errorExit
This is better code for a couple of reasons. First, it's shorter, which is always nice. Second, while you simply can't use (==) here, suppose that dispatch instead held lists. The case statement remains just as efficient (constant time), but comparing Just x and Just y becomes very expensive. Third, you don't have to rebind result with let (Just action) = result; this makes the code shorter and doesn't introduce a potential pattern-match failure (which is bad, although you do know it can't fail here).
[1] In fact, it's impossible to write (==) while preserving referential transparency. In Haskell, f = (\x -> x + x) :: Integer -> Integer and g = (* 2) :: Integer -> Integer ought to be considered equal because f x = g x for all x :: Integer; however, proving that two functions are equal in this way is in general undecidable (since it requires enumerating an infinite number of inputs). And you can't just say that \x -> x + x only equals syntactically identical functions, because then you could distinguish f and g even though they do the same thing.
The Maybe a type has an Eq instance only if a has one - that's why you get No instance for (Eq ([[Char]] -> IO ())) (a function can't be compared to another function).
Maybe the maybe function is what you're looking for. I can't test this at the moment, but it should be something like this:
maybe errorExit (\action -> action args) result
That is, if result is Nothing, the whole expression is errorExit, but if result is Just action, the lambda function is applied to action.
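Here is a small sketch of how maybe handles both cases without any Eq constraint. It is not the original todo program: dispatchTable is a hypothetical table whose handlers return a String instead of IO (), so that the behaviour is easy to check, and isUnknown shows that Data.Maybe's isNothing also works on a Maybe that holds a function.

```haskell
import Data.Maybe (isNothing)

-- Hypothetical dispatch table; the handlers return a String rather than
-- IO () so that the sketch is easy to check.
dispatchTable :: [(String, [String] -> String)]
dispatchTable =
    [ ("count", \args -> show (length args))
    , ("join", unwords)
    ]

-- maybe takes a default, a function for the Just case, and the Maybe value.
runCommand :: String -> [String] -> String
runCommand command args =
    maybe "Incorrect command" (\action -> action args) (lookup command dispatchTable)

-- isNothing inspects only the constructor, so it needs no Eq instance
-- for the function stored inside the Maybe.
isUnknown :: String -> Bool
isUnknown command = isNothing (lookup command dispatchTable)
```

With these definitions, an unknown command falls through to the default, and a known one has its handler applied to the arguments, with no comparison of functions involved.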
I've written a simple XML parser in Haskell.
The function convertXML receives the contents of an XML file and returns a list of extracted values that are further processed.
One attribute of an XML tag also contains a URL of a product image, and I would like to extend the function to download it as well if the tag is found.
convertXML :: (Text.XML.Light.Lexer.XmlSource s) => s -> [String]
convertXML xml = productToCSV products
  where
    productToCSV [] = []
    productToCSV (x:xs) = (getFields x) ++ (productToCSV
                          (elChildren x)) ++ (productToCSV xs)
    getFields elm = case (qName . elName) elm of
        "product" -> [attrField "uid", attrField "code"]
        "name" -> [trim $ strContent elm]
        "annotation" -> [trim $ strContent elm]
        "text" -> [trim $ strContent elm]
        "category" -> [attrField "uid", attrField "name"]
        "manufacturer" -> [attrField "uid",
                           attrField "name"]
        "file" -> [getImgName]
        _ -> []
      where
        attrField fldName = trim . fromJust $
            findAttr (unqual fldName) elm
        getImgName = if (map toUpper $ attrField "type") == "FULL"
            then
                -- here I need some IO code
                -- to download an image
                -- fetchFile :: String -> IO String
                attrField "file"
            else []
    products = findElements (unqual "product") productsTree
    productsTree = fromJust $ findElement (unqual "products") xmlTree
    xmlTree = fromJust $ parseXMLDoc xml
Any idea how to insert IO code into the getImgName function, or do I have to completely rewrite convertXML as an impure function?
UPDATE II
Final version of the convertXML function: a hybrid pure/impure but clean approach, as suggested by Carl. The second component of the returned pair is an IO action that downloads the images, saves them to disk, and returns the list of local paths where the images are stored.
convertXML :: (Text.XML.Light.Lexer.XmlSource s) => s -> ([String], IO [String])
convertXML xml = productToCSV products (return [])
  where
    productToCSV :: [Element] -> IO String -> ([String], IO [String])
    productToCSV [] _ = ([], return [])
    productToCSV (x:xs) (ys) = storeFields (getFields x)
        ( storeFields (productToCSV (elChildren x) (return []))
                      (productToCSV xs ys) )
    getFields elm = case (qName . elName) elm of
        "product" -> ([attrField "uid", attrField "code"], return [])
        "name" -> ([trim $ strContent elm], return [])
        "annotation" -> ([trim $ strContent elm], return [])
        "text" -> ([trim $ strContent elm], return [])
        "category" -> ([attrField "uid", attrField "name"], return [])
        "manufacturer" -> ([attrField "uid",
                            attrField "name"], return [])
        "file" -> getImg
        _ -> ([], return [])
      where
        attrField fldName = trim . fromJust $
            findAttr (unqual fldName) elm
        getImg = if (map toUpper $ attrField "type") == "FULL"
            then
                ( [attrField "file"], fetchFile url >>=
                                      saveFile localPath >>
                                      return [localPath] )
            else ([], return [])
          where
            fName = attrField "file"
            localPath = imagesDir ++ "/" ++ fName
            url = attrField "folderUrl" ++ "/" ++ fName
    storeFields (x1s, y1s) (x2s, y2s) = (x1s ++ x2s, liftM2 (++) y1s y2s)
    products = findElements (unqual "product") productsTree
    productsTree = fromJust $ findElement (unqual "products") xmlTree
    xmlTree = fromJust $ parseXMLDoc xml
The better approach would be to have the function return the list of files to download as part of the result:
convertXML :: (Text.XML.Light.Lexer.XmlSource s) => s -> ([String], [URL])
and download them in a separate function.
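A minimal sketch of that split, under stated assumptions: FileTag is a hypothetical simplified record standing in for a parsed file element (the real code works on Text.XML.Light elements), collect and downloadAll are illustrative names, and the fetch action is passed in as a parameter because the real download code is not shown in the question. The case normalization with toUpper from the question is left out for brevity.

```haskell
-- Hypothetical simplified record standing in for a parsed <file> element.
data FileTag = FileTag
    { tagType      :: String
    , tagFile      :: String
    , tagFolderUrl :: String
    }

type URL = String

-- Pure step: collect the CSV fields and the URLs that should be fetched.
collect :: [FileTag] -> ([String], [URL])
collect tags = (map tagFile full, map url full)
  where
    full  = filter ((== "FULL") . tagType) tags
    url t = tagFolderUrl t ++ "/" ++ tagFile t

-- Impure step, kept separate: the fetch action is a parameter because
-- the real download code is an assumption of this sketch.
downloadAll :: (URL -> IO a) -> [URL] -> IO [a]
downloadAll fetch = mapM fetch
```

The pure part can be tested on its own, and the caller decides whether and when to run downloadAll on the second component.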
The entire point of the type system in Haskell is that you can't do IO except with IO actions - values of type IO a. There are ways to violate this, but they run the risk of behaving entirely unlike what you'd expect, due to interactions with optimizations and lazy evaluation. So until you understand why IO works the way it does, don't try to make it work differently.
But a very important consequence of this design is that IO actions are first class. With a bit of cleverness, you could write your function as this:
convertXML :: (Text.XML.Light.Lexer.XmlSource s) => s -> ([String], IO [Image])
The second item in the pair would be an IO action that, when executed, would give a list of the images present. That would avoid the need to have image loading code outside of convertXML, and it would allow you to do IO only if you actually needed the images.
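The point that building an IO action performs no IO can be illustrated with a counter. This is only a sketch: fakeFetch is a hypothetical stand-in for an image download that increments an IORef and returns a made-up local path, and convert is an illustrative name with the same result shape as the suggested convertXML signature.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for an image download: it only counts how often
-- it has been run and returns a made-up local path.
fakeFetch :: IORef Int -> String -> IO String
fakeFetch counter name = do
    modifyIORef counter (+ 1)
    return ("images/" ++ name)

-- Pure function: it returns the fields together with an action that
-- would fetch the images. Building the pair performs no IO.
convert :: IORef Int -> [String] -> ([String], IO [String])
convert counter names = (names, mapM (fakeFetch counter) names)
```

The counter should stay at zero after convert has been called and only change once the second component of the pair is actually executed, so a caller that needs just the fields never triggers a download.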
I basically see two approaches:
1. Let the function also return a list of the images it found, and process them with an impure function afterwards. Laziness will do the rest.
2. Make the whole beast impure.
I generally like the first approach more.