Haskell Shake with Twitch? - haskell

I'm switching (or trying to) from the brilliant tup to Haskell's Shake as my build system.
The only problem is that I can't figure out how to get Shake to rebuild files when they change.
I could of course use inotify, a wrapper like filewatcher, or even watchman.
Since I'm using Shake, though, I was wondering how to integrate it with twitch, which shares the do syntax but otherwise doesn't provide much in the way of documentation.
The ultimate goal is to use pandoc for multi-format documents.
The only reason tup was inadequate is that it doesn't support targets.

First of all, you should write your own Shake build rules. Then, whenever a source file changes, run those build rules to produce your targets.
Like this:
main = defaultMain $ do
  "src/*.md" |> const build
build = shakeArgs shakeOptions{shakeFiles="out"} $ do
  want ["out/foo.html", "out/foo.pdf"]
  "out/*.html" %> \out -> do
    let src = "src" </> dropDirectory1 out -<.> "md"
    need [src] -- record the dependency so Shake reruns pandoc when the source changes
    cmd_ "pandoc -o" [out] src
  "out/*.pdf" %> \out -> do
    let src = "src" </> dropDirectory1 out -<.> "md"
    need [src]
    cmd_ "pandoc -o" [out] src
When a Markdown file in the src directory changes, out/foo.html and out/foo.pdf are rebuilt.
To keep Shake from doing more work than necessary, you can ask it only for the targets that depend on the changed file:
main = defaultMain $ do
  "src/*.md" |> build . dependentTargets
build targets = shakeArgs shakeOptions{shakeFiles="out"} $ do
  want targets
  ...
dependentTargets src
  | "*.md" ?== src = ["out/foo.html", "out/foo.pdf"]
  | otherwise = []
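The dependentTargets function above always returns the same two foo targets. Here is a minimal sketch of deriving the targets from whichever Markdown file changed, using only System.FilePath. The name targetsFor is hypothetical, and the out/NAME.html and out/NAME.pdf layout is an assumption carried over from the rules above. takeBaseName is used so that it does not matter whether the watcher reports a relative or an absolute path.

```haskell
import System.FilePath (takeBaseName, takeExtension, (<.>), (</>))

-- Hypothetical helper: map a changed source file to the outputs built from it.
-- Assumes every src/NAME.md produces out/NAME.html and out/NAME.pdf.
targetsFor :: FilePath -> [FilePath]
targetsFor src
  | takeExtension src == ".md" =
      ["out" </> takeBaseName src <.> ext | ext <- ["html", "pdf"]]
  | otherwise = []
```

It would slot into the watcher in place of dependentTargets, as in "src/*.md" |> build . targetsFor.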
The twitch package recommends enabling the OverloadedStrings extension so that code like this compiles:
"src/*.md" |> ...
But the extension makes string literals ambiguous in other parts of the program. To fix that, you can convert the String to a Dep explicitly:
import Data.String
fromString "src/*.md" |> ...
You can tidy this up by redefining the (|>) operator:
import Data.String
import Twitch hiding ((|>))
pattern |> callback = addModify callback $ fromString pattern
"src/*.md" |> ...

I use Shake to build a web site and have wrapped it in twitch to rerun the Shake build when files change. The main call for the watching functions is bracketed: it uses forkIO to start watchers on two directories, each of which can run Shake, and it also starts the web server.
mainWatch :: SiteLayout -> Port -> Path Abs Dir -> IO ()
mainWatch layout bakedPort bakedPath = bracketIO
    (do -- first
        shake layout
        watchDough <- forkIO (mainWatchDough layout) -- calls shake
        watchTemplates <- forkIO (mainWatchThemes layout) -- calls shake
        scotty bakedPort (site bakedPath)
        return (watchDough,watchTemplates) )
    (\(watchDough,watchTemplates) -> do -- last
        putIOwords ["main2 end"]
        killThread watchDough
        killThread watchTemplates
        return ()
    )
    (\watch -> do -- during
        return ()
    )
Hope this can be adapted to your case!
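The bracketIO helper used above is the answerer's own, so its exact behaviour is not shown here. As a rough sketch of the same idea with only base, Control.Exception.bracket takes its arguments in the order acquire, release, during. The name withWatchers is hypothetical. The watchers are forked with forkIOWithUnmask because bracket runs the acquire step with asynchronous exceptions masked, and a thread forked there would otherwise inherit that masking and could be slow to react to killThread.

```haskell
import Control.Concurrent (forkIOWithUnmask, killThread)
import Control.Exception (bracket)

-- Hypothetical helper: fork each watcher, run the main action, then kill the
-- watchers on the way out, even if the main action throws.
withWatchers :: [IO ()] -> IO a -> IO a
withWatchers watchers body =
  bracket
    (mapM (\w -> forkIOWithUnmask (\unmask -> unmask w)) watchers) -- acquire
    (mapM_ killThread)                                             -- release
    (const body)                                                   -- during
```

In the setup described above, the two directory watchers would be the list elements and the blocking web server call would be the body.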

Related

Conduit Exception

I couldn't figure out how to make sourceDirectory and catchC work.
src = (sourceDirectory "/does/not/exist/input.txt" $$ C.print) `catchC` \e ->
    yield (pack $ "Could not read input file: " ++ show (e :: IOException))
The idea is that I use sourceDirectory to walk a directory tree and in case of failure I want the program to continue and not stop.
The catchC function works on individual components of a pipeline, like sourceDirectory "somedir" (in other words, things of type ConduitM). You've applied it to a completely run pipeline, which is just a normal action, and therefore catchC won't work. Your choices are:
Apply catchC to the individual component, e.g. (sourceDirectory "foo" `catchC` handler) $$ printC
Use a non-conduit-specific catch function (such as from safe-exceptions), e.g. (sourceDirectory "foo" $$ printC) `catch` handler.
Also, a recommendation for the future: it's a good idea to include the compiler error when some code won't build.
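To illustrate the second choice without pulling in conduit, here is a small sketch that needs only base and directory. listDirectory stands in for the fully run pipeline (that substitution is an assumption made for the sake of a dependency-free example), and the name listOrWarn is hypothetical. The point is that an ordinary catch, with the exception type pinned in the handler, lets the program carry on after the failure.

```haskell
import Control.Exception (IOException, catch)
import System.Directory (listDirectory)

-- Stand-in for a fully run pipeline: list a directory, and on an IOException
-- return a single warning line instead of aborting the program.
listOrWarn :: FilePath -> IO [String]
listOrWarn dir =
  listDirectory dir `catch` \e ->
    pure ["Could not read input directory: " ++ show (e :: IOException)]
```

The type annotation on e is what selects which exceptions are caught; without it the handler's exception type is ambiguous and the code does not compile.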

How to reload updated file in Threepenny-gui 0.6?

The Threepenny-gui changelog (https://hackage.haskell.org/package/threepenny-gui-0.6.0.1/changelog) reads: "The functions loadFile and loadDirectory have been removed, as I felt that the jsStatic option is sufficient for most use cases."
My question is: how can we reload an image that is updated during execution without loadFile?
With Threepenny-gui 0.5 I used the following code:
redraw :: UI.Element -> IORef CompTree -> (Maybe Vertex) -> UI ()
redraw img treeRef mcv
  = do tree <- UI.liftIO $ readIORef treeRef
       UI.liftIO $ writeFile ".Hoed/debugTree.dot" (shw $ summarize tree mcv)
       UI.liftIO $ system $ "dot -Tpng -Gsize=9,5 -Gdpi=100 .Hoed/debugTree.dot "
                         ++ "> .Hoed/wwwroot/debugTree.png"
       url <- UI.loadFile "image/png" ".Hoed/wwwroot/debugTree.png"
       UI.element img # UI.set UI.src url
When, with Threepenny-gui 0.6, I set jsStatic to Just "./.Hoed/wwwroot", the following code (obviously) results in my GUI only showing the initial image that already was there when my application started:
redraw :: UI.Element -> IORef CompTree -> (Maybe Vertex) -> UI ()
redraw img treeRef mcv
  = do tree <- UI.liftIO $ readIORef treeRef
       UI.liftIO $ writeFile ".Hoed/debugTree.dot" (shw $ summarize tree mcv)
       UI.liftIO $ system $ "dot -Tpng -Gsize=9,5 -Gdpi=100 .Hoed/debugTree.dot "
                         ++ "> .Hoed/wwwroot/debugTree.png"
       UI.element img # UI.set UI.src "static/debugTree.png"
       return ()
My full code for Threepenny-gui 0.5 is here: https://github.com/MaartenFaddegon/Hoed/blob/master/Debug/Hoed/DemoGUI.hs
(Author here.) Apparently, I didn't consider your use case when removing these functions. :-) I can add them back in if you like; could you open an issue on GitHub?
There are various methods on the JavaScript side to reload a file at a certain URL. See for instance the question "Refresh image with a new one at the same url".
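One commonly used approach of that kind is to change the URL's query string on every redraw, so the browser treats it as a new resource. A minimal sketch follows. The name cacheBusted and the counter argument are hypothetical, and it is an assumption that the static file handler ignores the query string when locating the file.

```haskell
-- Hypothetical helper: make each redraw request a distinct URL so the browser
-- does not reuse its cached copy of the image.
cacheBusted :: String -> Int -> String
cacheBusted url version = url ++ "?v=" ++ show version
```

In redraw, this would replace the fixed string, for example UI.set UI.src (cacheBusted "static/debugTree.png" n), with n incremented on each call (kept in an IORef, say).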

How to delete unbuildable goal?

I would like to remove files that no longer have source but without cleaning.
Is there support for partially cleaning an incremental build? In this case, I guess I could compare against set of source files that were consumed in previous builds and define how to clean those that are gone.
main = shakeArgs shakeOptions { shakeVerbosity = Diagnostic } $ do
  want [".build"]
  phony ".build" $ do
    files <- getDirectoryFiles "." ["//*.txt"]
    let goals = map (-<.> "") files
    need goals
  "*" %> \out -> do
    Stdout o <- cmd $ "sort " ++ (out ++ ".txt")
    writeFile' out o
Using shakeArgsPrune you can define a function that gets passed the live files afterwards. You can then write something like:
import Development.Shake
import Development.Shake.FilePath
import Development.Shake.Util
import System.Directory.Extra
import Data.List
import System.IO
pruner :: [FilePath] -> IO ()
pruner live = do
  present <- listFilesRecursive "output"
  mapM_ removeFile $ map toStandard present \\ map toStandard live
main :: IO ()
main = shakeArgsPrune shakeOptions pruner $ do
  ... rules go here ...
This deletes all files in output that are not generated and up-to-date according to the build system as it stands. For a complete example see
http://neilmitchell.blogspot.co.uk/2015/04/cleaning-stale-files-with-shake.html.
The shakeArgsPrune function is only available in shake-0.15.1 and above, but is based on the shakeLiveFiles feature which has been available for longer and can be used directly if you so desire.
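The heart of the pruner is a list difference between what is on disk and what Shake reports as live. A minimal sketch of that step as a pure function, so it can be checked without touching the file system, is below. The name staleFiles is hypothetical, and normalise from System.FilePath is used as a stand-in for Shake's toStandard so that the sketch needs no Shake import.

```haskell
import Data.List ((\\))
import System.FilePath (normalise)

-- Hypothetical pure core of a pruner: files present on disk that the build
-- system does not report as live. Paths are normalised on both sides so that
-- differently spelled but equal paths still cancel out.
staleFiles :: [FilePath] -> [FilePath] -> [FilePath]
staleFiles present live = map normalise present \\ map normalise live
```

A pruner would then pass the result to mapM_ removeFile, as in the code above.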

Recover the source file name in a Shake rule

I am writing a build system for a static website which works like this:
for every file src/123-some-title.txt produce a file out/123.html
My problem is that when writing the rule for out/*.html I have no direct way to recover the source file name (src/123-some-title.txt) from the target file name (out/123.html).
Of course I could read the src/ directory again and search for a file that starts with 123, but is there a nicer way to do this with Shake?
The first thing to mention is that if you call getDirectoryFiles multiple times with the same arguments it will only calculate once, in the same way that if you call need multiple times on the same file it will only build once. One approach would be:
"out/*.fwd" %> \out -> do
  res <- getDirectoryFiles "src" ["*.txt"]
  let match = [x | x <- res, (takeBaseName out ++ "-") `isPrefixOf` takeBaseName x]
  when (length match /= 1) $ error "fail, because wrong number of matches"
  writeFileChanged out $ head match
"out/*.html" %> \out -> do
  src <- readFile' (out -<.> "fwd")
  txt <- readFile' ("src" </> src)
  ...
Here the idea is that the file out/123.fwd contains the text 123-some-title.txt. By using writeFileChanged we only change the .fwd file when the relevant part of the directory changes.
If you want to avoid the .fwd files, you can use the Oracle mechanism. If you want to avoid a linear scan of the getDirectoryFiles result you can use the newCache function. In practice, neither is likely to be problematic, and going with the files is probably simplest.
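The lookup step itself can be separated from Shake and tested on its own. A minimal sketch, assuming the src/NNN-title.txt naming from the question, is below. The name findSource and the error messages are hypothetical.

```haskell
import Data.List (isPrefixOf)
import System.FilePath (takeBaseName)

-- Hypothetical helper: given a target such as out/123.html and the file names
-- found in src/, return the unique source whose name starts with "123-".
findSource :: FilePath -> [FilePath] -> Either String FilePath
findSource out candidates =
  case [c | c <- candidates, (takeBaseName out ++ "-") `isPrefixOf` takeBaseName c] of
    [src] -> Right src
    []    -> Left ("no source found for " ++ out)
    _     -> Left ("several sources match " ++ out)
```

Including the trailing "-" in the prefix matters: it keeps out/12.html from matching 123-some-title.txt.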

How to override function in Codec.Archive.Tar

Haskell noob here. I have a question specifically regarding how to use an existing library that may lead to some more fundamental aspects of the proper use of Haskell.
I'm learning Haskell and have a small project in mind to work on while I learn. The script will need to find all the tarballs in a given directory and unpack them in parallel. At this point, I'm working on the basic functionality of unpacking. So, using the Codec.Archive.Tar package, how can I override its behavior regarding tarballs with fully qualified paths?
Here's some example code:
module Main where
import qualified Codec.Archive.Tar as Tar
import qualified Codec.Compression.GZip as GZip
import Control.Monad (liftM, unless)
import qualified Data.ByteString.Lazy as BS
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.Exit (exitWith, ExitCode(..))
import System.FilePath.Posix (takeExtension)
searchPath = "/home/someuser/tarball/dir"
exit = exitWith ExitSuccess
die = exitWith (ExitFailure 1)
processFile :: String -> IO ()
processFile file = do
    putStrLn $ "Unpacking " ++ file ++ " to " ++ searchPath
    Tar.unpack searchPath . Tar.read . GZip.decompress =<< BS.readFile filePath
  where filePath = searchPath ++ "/" ++ file
main = do
    dirExists <- doesDirectoryExist searchPath
    unless dirExists $ (putStrLn $ "Error: Search path not found: " ++ searchPath) >> die
    files <- targetFiles `liftM` getDirectoryContents searchPath
    mapM_ processFile files
    exit
  where targetFiles = filter (\f -> f /= "." && f /= ".." && takeExtension f == ".tgz")
When I run this in a directory with tarballs that were packed with:
tar czvPf myfile.tgz /tarball_testing/myfile
I get the following output:
Unpacking myfile.tgz to /tarball_testing
unpacker.hs: Absolute file name in tar archive: "/tarball_testing/myfile"
The second line is the issue. Reading the docs for Codec.Archive.Tar I don't see a way to disable this functionality (not interested in discussions of why I want to use full paths in tarballs, or the relative security implications of doing so).
The first thing that comes to mind is that I somehow need to override the function but that doesn't "feel" like the way a pro Haskeller would do it. Can I get a pointer in the right direction?
You cannot monkey patch or otherwise override a function from a Haskell module, and therefore no workaround will let you avoid the safety measures of the library. What you can do, however, is use the functionality in Codec.Archive.Tar to modify the tar entry paths before unpacking so that they won't be absolute any more. Specifically, there is a mapEntriesNoFail function with type
mapEntriesNoFail :: (Entry -> Entry) -> Entries e -> Entries e
Entries is the type of the argument to Tar.unpack, while Entry is the type of an individual entry. Thanks to mapEntriesNoFail, our problem becomes writing an Entry -> Entry function to adjust the paths. For that, first we will need some extra imports:
import qualified Codec.Archive.Tar.Entry as Tar
import System.FilePath.Posix (takeExtension, dropDrive, hasTrailingPathSeparator)
import Data.Either (either)
The function can look like this:
dropDriveFromEntry :: Tar.Entry -> Tar.Entry
dropDriveFromEntry entry =
    either (error "Resulting tar path is somehow too long")
        (\tp -> entry { Tar.entryTarPath = tp })
        drivelessTarPath
  where
    tarPath = Tar.entryTarPath entry
    path = Tar.fromTarPath tarPath
    toTarPath' p = Tar.toTarPath (hasTrailingPathSeparator p) p
    drivelessTarPath = toTarPath' $ dropDrive path
This may seem a little long-winded; however, the hoops we jump through are there to ensure the resulting tar paths are sane. You can read about the gory details of tar handling in the Codec.Archive.Tar.Entry documentation. The key function in this definition is dropDrive, which makes an absolute path relative (on Linux, it strips the leading slash of an absolute path).
It is worth spending a few words on the use of either. toTarPath produces a value of type Either String TarPath to account for the possibility of failure. Specifically, the conversion to a tar path fails if the provided path is too long. In our case, however, the path cannot be too long, as it is a path which already was in a tar file, perhaps with a removed leading slash. That being so, it is good enough to eliminate the Either wrapping with either, passing an error instead of the function to handle the (impossible) Left case.
With dropDriveFromEntry in hand, we just have to map it over the entries before unpacking. The relevant line of your program would become:
Tar.unpack searchPath . Tar.mapEntriesNoFail dropDriveFromEntry
    . Tar.read . GZip.decompress =<< BS.readFile filePath
Note that if there were relevant errors to be accounted for in dropDriveFromEntry, we would make it return Either String Entry, and then use mapEntries instead of mapEntriesNoFail.
With these changes, the entry in your tar file will be extracted to /home/someuser/tarball/dir/tarball_testing/myfile. If that is not what you intended, you can modify dropDriveFromEntry so that it performs whatever extra path processing you need.
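The path arithmetic behind that result can be checked in isolation with System.FilePath.Posix. This is a small sketch; the name extractedPath is hypothetical, and it only models where a file would land, not what Tar.unpack does internally.

```haskell
import System.FilePath.Posix (dropDrive, (</>))

-- Hypothetical helper: where an entry would end up under the destination
-- directory once dropDrive has removed its leading slash.
extractedPath :: FilePath -> FilePath -> FilePath
extractedPath destDir entryPath = destDir </> dropDrive entryPath
```

Dropping the drive first is what makes the join work: with </>, an absolute right-hand side replaces the left-hand side entirely rather than being appended to it.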
P.S.: Regarding the alternate title of your question, and considering the sensible little program you have shown us, I do not think you should be worried :)
