I want to compile a Haskell module to GHC Core, with optimisations applied, and use the resulting core output. However, when I use compileToCoreSimplified it doesn't seem to run all the normal optimisations. Taking the program:
{-# OPTIONS_GHC -O2 #-}
module LensOpt(pick) where
import Control.Lens
data Record = Record {_field :: String}
field = lens _field $ \r x -> r{_field=x}
pick = Record "test" ^. field
When run through ghc -ddump-simple I get the optimal output:
LensOpt.pick1 :: GHC.Prim.Addr#
LensOpt.pick1 = "test"#
pick :: String
pick = GHC.CString.unpackCString# LensOpt.pick1
However, when compiled using compileToCoreSimplified I get the output:
s1 :: Addr#
s1 = "test"#
s2 :: [Char]
s2 = unpackCString# s1
s3 :: Record
s3 = Record s2
pick :: String
pick = case s3 of { Record ds -> ds }
I am invoking compileToCoreSimplified with:
import GHC
import GhcPlugins
import GHC.Paths
main = runGhc (Just libdir) $ do
setTargets []
dflags <- getSessionDynFlags
setSessionDynFlags dflags{hscTarget = HscNothing}
res <- compileToCoreSimplified "LensOpt.hs"
liftIO $ writeFile "lens_api.txt" $ showSDoc dflags $ ppr res
How do I drive the GHC API to apply the full optimisations?
You need to turn on the optimizations you want to run in the DynFlags.
You can do that either by specifying an optimization level (0..2) and setting optimizations based on that (updOptLevel), or by turning on individual optimizations such as Opt_Specialise:
main = runGhc (Just libdir) $ do
setTargets []
dflags <- getSessionDynFlags
setSessionDynFlags $ updOptLevel 2 $ dflags{hscTarget = HscNothing}
Related
I've re-posted this question to focus more tightly on the specific error, and to better enumerate what I've already tried.
I'm trying to parse some Haskell code during the runtime of a Haskell program using the hint package.
The outer program compiles, but when I run it the inner compilation step fails. I'm getting a description of what I assume is a syntax problem, and a location in the "interactive" code, but I have no idea how to view the code in question.
Here's Main.hs
module Main where
import Data.List (intercalate)
import Polysemy (runM)
import qualified Language.Haskell.Interpreter as H
import qualified Effects as E
handleFailures :: Either H.InterpreterError a -> IO a
handleFailures (Left l) = ioError $ userError $ message l
where
message (H.WontCompile es) = intercalate "\n" (header : map unbox es)
message e = show e
header = "ERROR: Won't compile:"
unbox (H.GhcError e) = e
handleFailures (Right a) = return a
interpretation :: String -> H.Interpreter E.MyEffect
interpretation s = do
H.loadModules ["Effects"]
H.setImportsQ [("Prelude", Nothing), ("Effects", Nothing)]
effect <- H.interpret s (H.as :: E.MyEffect)
return effect
extractProgram :: String -> IO E.MyEffect
extractProgram s = do
p <- H.runInterpreter $ interpretation s
success <- handleFailures p
return success
main :: IO ()
main = do
userProvided <- readFile "UserProvided.hs"
userProgram <- extractProgram userProvided
runM . E.teletypeToIO . E.teletypePlusToIO $ userProgram
Effects.hs defines and provides helpers for a Polysemey Sem monad called MyEffect.
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE LambdaCase, BlockArguments #-}
{-# LANGUAGE GADTs, FlexibleContexts, TypeOperators, DataKinds, PolyKinds, ScopedTypeVariables #-}
module Effects where
import Polysemy
data Teletype m a where
ReadTTY :: Teletype m String
WriteTTY :: String -> Teletype m ()
makeSem ''Teletype
teletypeToIO :: Member (Embed IO) r => Sem (Teletype ': r) a -> Sem r a
teletypeToIO = interpret $ \case
ReadTTY -> embed getLine
WriteTTY msg -> embed $ putStrLn msg
data TeletypePlus m a where
ReadPlus :: TeletypePlus m String
WritePlus :: String -> TeletypePlus m ()
makeSem ''TeletypePlus
teletypePlusToIO :: Member (Embed IO) r => Sem (TeletypePlus ': r) a -> Sem r a
teletypePlusToIO = interpret $ \case
ReadPlus -> embed $ ("+" <>) <$> getLine
WritePlus msg -> embed $ putStrLn $ msg <> "+"
type MyEffect = Sem [TeletypePlus, Teletype, Embed IO] ()
UserProvided.hs contains a simple do expression in MyEffect.
do
i <- readTTY
j <- readPlus
let k = i <> j
writeTTY k
writePlus k
In order to get the polysemy package available at runtime, I have to enter run it from inside a cabal sandbox.
$ cabal build
Build profile: -w ghc-8.8.1 -O1
In order, the following will be built (use -v for more details):
- Hello-Polysemy-0.1.0.0 (exe:Hello-Polysemy) (file Main.hs changed)
Preprocessing executable 'Hello-Polysemy' for Hello-Polysemy-0.1.0.0..
Building executable 'Hello-Polysemy' for Hello-Polysemy-0.1.0.0..
[2 of 2] Compiling Main ( Main.hs, /home/mako/Git/Hello-Polysemy/dist-newstyle/buil/x86_64-linux/ghc-8.8.1/Hello-Polysemy-0.1.0.0/x/Hello-Polysemy/build/Hello-Polysemy/Hello-Polysemy-tmp/Main.o )
Linking /home/mako/Git/Hello-Polysemy/dist-newstyle/build/x86_64-linux/ghc-8.8.1/Hello-Polysemy-0.1.0.0/x/Hello-Polysemy/build/Hello-Polysemy/Hello-Polysemy ...
$ cabal exec bash
... but then ...
$ cabal run
Up to date
Hello-Polysemy: user error (ERROR: Won't compile:
<interactive>:10:135: error:
Operator applied to too few arguments: :)
So far as I can tell the only place I'm using the : operator is in Effects.hs, where (a) I'm actually using the type-operator ':, and (b) compilation succeeds just fine when Effects is imported into Main.hs.
Any suggestions for what the problem might be, or how I could learn more?
I already tried using Language.Haskell.Interpreter.Unsafe.unsafeRunInterpreterWithArgs ["-v4"]. That clarifies that it's talking about ghc-prim:GHC.Types.:{(w) d 66}), but I don't know what to do with that information.
Update:
I've tried various permutations of in-lining the "userProvided" code.
declaring the exact same Effect value inline in Main works fine. Replacing the string read from the file with an inline string of an even simpler value "writePlus \"asdf\"" doesn't change the error message.
I need to have something like
-- Main.hs
module Main where
main :: IO ()
main = do
<import Plugin>
print Plugin.computation
With a Plugin like
-- Plugin.hs
module Plugin where
computation :: Int
computation = 4
However, I need the plugin to be compiled alongside the main application. They need to be deployed together. Only the import (not the compilation) of the module should happen dynamically.
I found Dynamically loading compiled Haskell module - GHC 7.6 along the way and it works just fine with GHC 8.0.2 except for the fact that it requires the source file of the plugin to be in the current working directory when executing the application.
Edit (07.12.2017)
Is it possible to load a module from a String instead of a file using the GHC API? http://hackage.haskell.org/package/ghc-8.2.1/docs/GHC.html#t:Target suggests that it's possible, but the documentation has many holes and I can't find a way to actually do this. If this can be accomplished, I can use file-embed to include the plugin source file into the compiled binary.
Example:
module Main where
-- Dynamic loading of modules
import GHC
import GHC.Paths ( libdir )
import DynFlags
import Unsafe.Coerce
import Data.Time.Clock (getCurrentTime)
import StringBuffer
pluginModuleNameStr :: String
pluginModuleNameStr = "MyPlugin"
pluginSourceStr :: String
pluginSourceStr = unlines
[ "module MyPlugin where"
, "computation :: Int"
, "computation = 4"
]
pluginModuleName :: ModuleName
pluginModuleName = mkModuleName pluginModuleNameStr
pluginSource :: StringBuffer
pluginSource = stringToStringBuffer pluginSourceStr
main :: IO ()
main = do
currentTime <- getCurrentTime
defaultErrorHandler defaultFatalMessager defaultFlushOut $ do
result <- runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags dflags
let target = Target { targetId = TargetModule $ pluginModuleName
, targetAllowObjCode = True
, targetContents = Just ( pluginSource
, currentTime
)
}
setTargets [target]
r <- load LoadAllTargets
case r of
Failed -> error "Compilation failed"
Succeeded -> do
setContext [IIDecl $ simpleImportDecl pluginModuleName]
result <- compileExpr ("MyPlugin.computation")
let result' = unsafeCoerce result :: Int
return result'
print result
This, however, results in
<command-line>: panic! (the 'impossible' happened)
(GHC version 8.0.2 for x86_64-apple-darwin):
module ‘MyPlugin’ is a package module
Edit (08.12.2017)
I can compile the "plugin" directly into the final binary by writing the source to a temp file and then loading it like in the linked post (Dynamically loading compiled Haskell module - GHC 7.6). However, this does not play well if the plugin imports packages from Hackage:
module Main where
import Control.Monad.IO.Class (liftIO)
import DynFlags
import GHC
import GHC.Paths (libdir)
import System.Directory (getTemporaryDirectory, removePathForcibly)
import Unsafe.Coerce (unsafeCoerce)
pluginModuleNameStr :: String
pluginModuleNameStr = "MyPlugin"
pluginSourceStr :: String
pluginSourceStr = unlines
[ "module MyPlugin where"
, "import Data.Aeson"
, "computation :: Int"
, "computation = 4"
]
writeTempFile :: IO FilePath
writeTempFile = do
dir <- getTemporaryDirectory
let file = dir ++ "/" ++ pluginModuleNameStr ++ ".hs"
writeFile file pluginSourceStr
return file
main :: IO ()
main = do
moduleFile <- writeTempFile
defaultErrorHandler defaultFatalMessager defaultFlushOut $ do
result <- runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags dflags
target <- guessTarget moduleFile Nothing
setTargets [target]
r <- load LoadAllTargets
liftIO $ removePathForcibly moduleFile
case r of
Failed -> error "Compilation failed"
Succeeded -> do
setContext [IIDecl $ simpleImportDecl $ mkModuleName pluginModuleNameStr]
result <- compileExpr "MyPlugin.computation"
let result' = unsafeCoerce result :: Int
return result'
print result
Is there a way to load packages when, for instance, MyPlugin contains the statement import Data.Aeson? If I add it to the plugin string, it fails with
/var/folders/t2/hp9y8x6s6rs7zg21hdzvhbf40000gn/T/MyPlugin.hs:2:1: error:
Failed to load interface for ‘Data.Aeson’
Perhaps you meant Data.Version (from base-4.9.1.0)
Use -v to see a list of the files searched for.
haskell-loader-exe: panic! (the 'impossible' happened)
(GHC version 8.0.2 for x86_64-apple-darwin):
Compilation failed
CallStack (from HasCallStack):
error, called at app/Main.hs:40:19 in main:Main
The reason for my request is database support: We use Persistent to access a database and the dynamic import is needed to support multiple databases (MySQL, PostgreSQL and SQLite) while still allowing the end user to only install one of the three database servers (with other words: not requiring the user to install all of them if they only use, for instance, PostgreSQL). The module that is database-specific should only be loaded when the user actually configures the main application to use that module.
If I don't import Database.Persist.MySQL, then the application does not require MySQL to be installed. Otherwise, the application fails with, for instance,
dyld: Library not loaded:
/usr/local/opt/mysql/lib/libmysqlclient.20.dylib
on macOS.
A file with a matching module name must exist by the looks of it - it doesn't seem to matter what the file's content is.
On Linux I can even make it be a symlink to /dev/null and things work but a symlink to itself doesn't.
I am trying to extract the STG representation of a Haskell source as a String via Outputable, but it looks like coreToStgArgs is panicing with the following dump:
user#machine ~/Desktop/hue $ runhaskell test.hs
[foo :: forall a. Num a => a -> a
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType] =
\r srt:SRT:[] [$dNum a1] + $dNum a1 a1;,
bar :: Int -> Int
[GblId,test.hs: test.hs: panic! (the 'impossible' happened)
(GHC version 7.10.3 for x86_64-unknown-linux):
coreToStgArgs I# 3
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
Here is the file FooBar.hs that I want to extract:
module FooBar where
foo a = a + a
bar :: Int -> Int
bar b = b + 3
Here is the source of test.hs that I used:
import CoreToStg
import GHC
import GHC.Paths
import Outputable
import StgSyn
mkDynFlags :: IO DynFlags
mkDynFlags = runGhc (Just libdir) getSessionDynFlags
mkSTG :: FilePath -> FilePath -> IO [StgBinding]
mkSTG proj src = do
dflags <- mkDynFlags
ghc_core <- runGhc (Just libdir) $ do
setSessionDynFlags (dflags {importPaths = [proj]})
compileToCoreSimplified src
-- compileToCoreModule src
coreToStg dflags (cm_module ghc_core) (cm_binds ghc_core)
mkIOStr :: (Outputable a) => a -> IO String
mkIOStr obj = do
dflags <- mkDynFlags
let ppr_str = showPpr dflags obj
return ppr_str
main :: IO ()
main = do
let proj = "/home/user/Desktop/hue"
let src = proj ++ "/FooBar.hs"
res <- mkIOStr =<< mkSTG proj src
putStrLn res
It looks like someone several years before me has run into a similar problem:
https://ghc.haskell.org/trac/ghc/ticket/7159
However, I have no idea what has happened since. I am also not sure if this is the correct way to go about extracting the STG of an arbitrary Haskell source, so if there are better alternatives that work, I would like to hear about them.
EDIT:
STG translation appears successful for the following program where bar b = b + 3 is changed to bar b = 3:
module FooBar where
foo a = a + a
bar :: Int -> Int
bar b = 3
In fact, at first glance, things appear to work if the induced Core Haskell does not force primitive operations to be performed. For instance bar b = 3 + 9 fails.
Many thanks to melpomene for pointing out something I missed in the documentation.
Here is the modified source of the test.hs that works:
import CorePrep
import CoreToStg
import GHC
import GHC.Paths
import GhcMonad
import HscTypes
import Outputable
import StgSyn
import System.IO
mkSTG :: FilePath -> FilePath -> IO [StgBinding]
mkSTG proj src = runGhc (Just libdir) $ do
env <- getSession
dflags <- getSessionDynFlags
setSessionDynFlags (dflags {importPaths = [proj]})
target <- guessTarget src Nothing
setTargets [target]
load LoadAllTargets
mod_graph <- getModuleGraph
let mod_sum = head mod_graph -- This is bad practice
pmod <- parseModule mod_sum
tmod <- typecheckModule pmod
dmod <- desugarModule tmod
let guts = coreModule dmod
let loc = ms_location mod_sum
let binds = mg_binds guts
let tcs = mg_tcs guts
prep <- liftIO $ corePrepPgm env loc binds tcs
liftIO $ coreToStg dflags (mg_module guts) prep
mkIOStr :: (Outputable a) => a -> IO String
mkIOStr obj = do
dflags <- runGhc (Just libdir) getSessionDynFlags
let ppr_str = showPpr dflags obj
return ppr_str
main :: IO ()
main = do
let proj = "/home/celery/Desktop/hue"
let src = proj ++ "/FooBar.hs"
res <- mkIOStr =<< mkSTG proj src
putStrLn res
I am not sure what the best way to recover a ModSummary (and hence the ModuleName) from a Target is, but I do vaguely remember it being the first element of the ModuleGraph, which is defined as type ModuleGraph = [ModSummary].
The type signature for corePrepPgm is also different between GHC 7 and 8:
https://downloads.haskell.org/~ghc/7.10.1/docs/html/libraries/ghc-7.10.1/CorePrep.html
https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-8.0.1/CorePrep.html
Suggestions for improvement are welcome :)
EDIT: I have found instances of counter examples to this -- the head of a ModuleGraph is not always the target. My current workaround is to see if any ModSummary within the ModuleGraph contains a location which matches that of the initial source file location.
I've written a function getSamplesFromFile that takes a file and returns its contents as a Vector of Floats. The functions reads the contents of the file into a Data.ByteString using Data.ByteString.hGet, it then converts this Data.ByteString to a Vector of Floats using:
import qualified Data.Vector.Unboxed as V
import qualified Data.ByteString as BS
import Data.Word
import System.Environment
import GHC.Int
toVector :: BS.ByteString -> V.Vector Float
toVector bs = vgenerate (fromIntegral (BS.length bs `div` 3)) $ \i ->
myToFloat [BS.index bs (3*i), BS.index bs (3*i+1), BS.index bs (3*i+2)]
where
myToFloat :: [Word8] -> Float
myToFloat = sum . map fromIntegral
vgenerate n f = V.generate n (f . fromIntegral)
I was testing how lazy this program was via a small test program:
main = do
[file] <- getArgs
samples <- getSamplesFromFile file
let slice = V.slice 0 50000 samples
print slice
If I run this on a 13MB file, it seems as if every sample is loaded into memory, even though I only need 50000 samples to be printed.
If I make a small modification to this problem and first map or filter over it, the result is different:
main = do
[file] <- getArgs
samples <- getSamplesFromFile file
let slice = V.slice 0 50000 samples
let mapped = V.map id slice
print mapped
This way, it seems that not every sample was loaded into memory, only the slice:
To make sure this was the case, I ran the program again with a slice of half the size (25000 samples):
Now, the memory usage seems to be proportional to the size of the slice. Just because I map over the slice with id.
The result is the same when filtering over the samples. How can applying a higher-order function suddenly make the behavior lazy?
EDIT
The problem seems to have to do something with cabal. As you can see from the pictures, I was testing my code inside a cabal project called laziness. I can't reproduce this weird behavior if use a separate Main.hs file outside of a cabal project. This is the Main.hs I'm using:
module Main where
import qualified Data.ByteString as BS
import qualified Data.Vector.Unboxed as V
import Data.Word
import GHC.Int
import System.Environment
main = do
[file] <- getArgs
samples <- getSamplesFromFile file
let slice = V.slice 0 50000 samples
--let filtered = V.filter (>0) slice
let mapped = V.map id slice
print slice
getSamplesFromFile = fmap toVector . BS.readFile
toVector :: BS.ByteString -> V.Vector Float
toVector bs = vgenerate (fromIntegral (BS.length bs `div` 3)) $ \i ->
myToFloat [BS.index bs (3*i), BS.index bs (3*i+1), BS.index bs (3*i+2)]
where
myToFloat :: [Word8] -> Float
myToFloat = sum . map fromIntegral
vgenerate n f = V.generate n (f . fromIntegral)
I don't experience the weird behavior if I do the following:
Create a new directory somewhere via mkdir
Add the above Main.hs to the directory.
Compile using ghc Main.hs -O2 -rtsopts -prof.
Run via ./Main myfile.wav +RTS -hy.
Create the pdf using hp2ps and ps2pdf.
I do experience the weird behavior if I do the following:
Create a new directory, laziness, via mkdir laziness.
Initiate a cabal project via cabal init.
Add the above Main.hs to /src.
Add ghc-options: -O2 -rtsopts -prof to laziness.cabal.
Compile using cabal install
Run via laziness myfile.wav +RTS -hy.
Create the pdf using hp2ps and ps2pdf.
I even experience the weird behavior if I:
cd laziness/src
Compile using ghc Main.hs -O2 -rtsopts -prof.
Run via ./Main myfile.wav +RTS -hy.
Create the pdf using hp2ps and ps2pdf.
So it seems that this behavior only occurs when the code is inside a cabal project. This seems weird to me. Could this have something to do with the setup of my cabal project?.
Is there a way in the Glasgow Haskell Compiler to introspect the names of all functions in a module?
I am trying to create an automatic database migration system that, given the names of migration modules, introspects the names of the functions inside and calls them one at a time.
Something like
doMigrations("Migrations.M_2015")
doMigrations("Migrations.M_2016")
-- ...
where Migration.M_2015 contains
module Migration.M_2015
where
migration_2015_01_02 :: DbConnection -> Status
migration_2015_01_02 connection =
-- ...
Each doMigration will reflect the names of the migration functions in its module and only call those that have not been run before (names saved in a DB table). This will only be called at application start-up, so performance is not a big issue. The reflection can occur at either compile-time or run-time.
In order to do this, you need to use the GHC API -- which is included in the ghc package (which is hidden) -- and is poorly documented.
I attach here a simple program which will print out the list of top level items exported in a module. This should serve as a starting point. This is a little command line utility which takes two arguments -- a module name and the word "class", "data", "function". So, for example:
test Prelude function
will print a list of functions exported by the module (those that are not constructors or defined in a class).
In order to compile this (assuming it is in test.hs) you will need to do:
ghc -package ghc test
in order to make the GHC API packages available.
Here's the code:
import Data.List ( (\\) )
import Data.Maybe (fromJust, catMaybes)
import System.Environment (getArgs)
-- the GHC API stuff
import GHC
import GHC.Paths (libdir)
import ConLike ( ConLike(..) )
import Outputable (showPpr, showSDocUnqual)
import Var (tyVarName)
showU dfs = showSDocUnqual dfs . pprParenSymName
main = do
(mn : ty : _) <- getArgs
a <- runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags dflags
mm <- lookupModule (mkModuleName mn) Nothing
mi <- fmap fromJust $ getModuleInfo mm
res <- fmap catMaybes $ mapM lookupName (modInfoExports mi)
return $ case ty of
"class" -> [showU dflags c' | c#(ATyCon c') <- res, isClassTyCon c']
"data" -> [showU dflags c' | c#(ATyCon c') <- res, (not . isClassTyCon) c']
"function" -> let cf = map getName $ concat [(classMethods . fromJust . tyConClass_maybe) c' | c#(ATyCon c') <- res, isClassTyCon c']
df = map getName $ concat [ tyConDataCons c' | c#(ATyCon c') <- res, (not . isClassTyCon) c']
ff = [ getName c | c#(AnId{}) <- res] \\ cf
fd = [ getName c | c#(AConLike (RealDataCon{})) <- res] \\ df
in [showU dflags x | x <- ff ++ fd]
_ -> ["need to specify: class, data, or function"]
print a
The list of classes and defined data are pretty straightforward. The list of defined functions includes functions defined in classes and constructors. The above code, for functions, excludes these ( with \\ cf and \\ df ).
a is the generated list of function (or class or data) names.
The code which would invoke these functions would be the subject of a different question (and answer).
lookupModule is the function which loads the module for analysis.
The combination of getModuleInfo and modInfoExports get the list of "stuff" which includes the list of functions exported from the module.
The rest of the code is about getting those names in a usable form.