Introspecting function names in a GHC module - haskell

Is there a way in the Glasgow Haskell Compiler to introspect the names of all functions in a module?
I am trying to create an automatic database migration system that, given the names of migration modules, introspects the names of the functions inside and calls them one at a time.
Something like
doMigrations("Migrations.M_2015")
doMigrations("Migrations.M_2016")
-- ...
where Migration.M_2015 contains
module Migration.M_2015
where
migration_2015_01_02 :: DbConnection -> Status
migration_2015_01_02 connection =
-- ...
Each doMigration will reflect the names of the migration functions in its module and only call those that have not been run before (names saved in a DB table). This will only be called at application start-up, so performance is not a big issue. The reflection can occur at either compile-time or run-time.

In order to do this, you need to use the GHC API -- which is included in the ghc package (which is hidden) -- and is poorly documented.
I attach here a simple program which will print out the list of top level items exported in a module. This should serve as a starting point. This is a little command line utility which takes two arguments -- a module name and the word "class", "data", "function". So, for example:
test Prelude function
will print a list of functions exported by the module (those that are not constructors or defined in a class).
In order to compile this (assuming it is in test.hs) you will need to do:
ghc -package ghc test
in order to make the GHC API packages available.
Here's the code:
import Data.List ( (\\) )
import Data.Maybe (fromJust, catMaybes)
import System.Environment (getArgs)
-- the GHC API stuff
import GHC
import GHC.Paths (libdir)
import ConLike ( ConLike(..) )
import Outputable (showPpr, showSDocUnqual)
import Var (tyVarName)
showU dfs = showSDocUnqual dfs . pprParenSymName
main = do
(mn : ty : _) <- getArgs
a <- runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags dflags
mm <- lookupModule (mkModuleName mn) Nothing
mi <- fmap fromJust $ getModuleInfo mm
res <- fmap catMaybes $ mapM lookupName (modInfoExports mi)
return $ case ty of
"class" -> [showU dflags c' | c#(ATyCon c') <- res, isClassTyCon c']
"data" -> [showU dflags c' | c#(ATyCon c') <- res, (not . isClassTyCon) c']
"function" -> let cf = map getName $ concat [(classMethods . fromJust . tyConClass_maybe) c' | c#(ATyCon c') <- res, isClassTyCon c']
df = map getName $ concat [ tyConDataCons c' | c#(ATyCon c') <- res, (not . isClassTyCon) c']
ff = [ getName c | c#(AnId{}) <- res] \\ cf
fd = [ getName c | c#(AConLike (RealDataCon{})) <- res] \\ df
in [showU dflags x | x <- ff ++ fd]
_ -> ["need to specify: class, data, or function"]
print a
The list of classes and defined data are pretty straightforward. The list of defined functions includes functions defined in classes and constructors. The above code, for functions, excludes these ( with \\ cf and \\ df ).
a is the generated list of function (or class or data) names.
The code which would invoke these functions would be the subject of a different question (and answer).
lookupModule is the function which loads the module for analysis.
The combination of getModuleInfo and modInfoExports get the list of "stuff" which includes the list of functions exported from the module.
The rest of the code is about getting those names in a usable form.

Related

Using GHC API to compile a program with optimisation

I want to compile a Haskell module to GHC Core, with optimisations applied, and use the resulting core output. However, when I use compileToCoreSimplified it doesn't seem to run all the normal optimisations. Taking the program:
{-# OPTIONS_GHC -O2 #-}
module LensOpt(pick) where
import Control.Lens
data Record = Record {_field :: String}
field = lens _field $ \r x -> r{_field=x}
pick = Record "test" ^. field
When run through ghc -ddump-simple I get the optimal output:
LensOpt.pick1 :: GHC.Prim.Addr#
LensOpt.pick1 = "test"#
pick :: String
pick = GHC.CString.unpackCString# LensOpt.pick1
However, when compiled using compileToCoreSimplified I get the output:
s1 :: Addr#
s1 = "test"#
s2 :: [Char]
s2 = unpackCString# s1
s3 :: Record
s3 = Record s2
pick :: String
pick = case s3 of { Record ds -> ds }
I am invoking compileToCoreSimplified with:
import GHC
import GhcPlugins
import GHC.Paths
main = runGhc (Just libdir) $ do
setTargets []
dflags <- getSessionDynFlags
setSessionDynFlags dflags{hscTarget = HscNothing}
res <- compileToCoreSimplified "LensOpt.hs"
liftIO $ writeFile "lens_api.txt" $ showSDoc dflags $ ppr res
How do I drive the GHC API to apply the full optimisations?
You need to turn on the optimizations you want to run in the DynFlags.
You can do that either by specifying an optimization level (0..2) and setting optimizations based on that (updOptLevel), or by turning on individual optimizations such as Opt_Specialise:
main = runGhc (Just libdir) $ do
setTargets []
dflags <- getSessionDynFlags
setSessionDynFlags $ updOptLevel 2 $ dflags{hscTarget = HscNothing}

Using Control.Lens.Plated with python AST

I'm trying to learn to use plated to transform and search through the python AST generated by language-python (https://hackage.haskell.org/package/language-python-0.5.4/docs/Language-Python-Common-AST.html)
To briefly summarize the types:
Modules is a list of statements
A statement can contain further statements or expressions
An expression can contain further expressions, identities or operators
I created a hello world to get started with these two libraries. Here's my code that defines plate for a subset of the python AST types:
{-# LANGUAGE FlexibleInstances#-}
module Lib
( someFunc
) where
import Language.Python.Version3.Parser
import Language.Python.Common.Token
import Language.Python.Common.AST
import Language.Python.Common.SrcLocation
import Control.Lens.Plated
import Data.Data.Lens
import Language.Python.Common.Pretty
import Language.Python.Common.PrettyAST
instance Plated (Statement SrcSpan) where
plate = uniplate
instance Plated (Expr SrcSpan) where
plate = uniplate
instance Plated (Ident SrcSpan) where
plate = uniplate
instance Plated (Op SrcSpan) where
plate = uniplate
extract (Right (x, _)) = x
someFunc :: IO ()
someFunc = do
putStrLn $ show $ concatMap (map prettyText) $ universe $ extract $ parseStmt "2*(1+x)" "file.py"
putStrLn $ show $ map prettyText $ universe $ extract $ parseExpr "2*(1+x)" "file.py"
The output of this program is
["2 * (1 + x)"]
["2 * (1 + x)","2","(1 + x)","1 + x","1","x"]
The problem I'm seeing is that plated only sees the type consistent with the root node. If I start with a statement as my root node it doesn't look at expressions which are children of the statement. In the second case where I look at expressions it seems to descend and find child expressions. But, it's not showing me the operators or identity types that are children of expressions.
How do get plate to descend into a data structure with mixed types? Or am I using the wrong tool for this?

Running Q Exp in a GhcMonad [duplicate]

Is it possible to generate and run TemplateHaskell generated code at runtime?
Using C, at runtime, I can:
create the source code of a function,
call out to gcc to compile it to a .so (linux) (or use llvm, etc.),
load the .so and
call the function.
Is a similar thing possible with Template Haskell?
Yes, it's possible. The GHC API will compile Template Haskell. A proof-of-concept is available at https://github.com/JohnLato/meta-th, which, although not very sophisticated, shows one general technique that even provides a modicum of type safety. Template Haskell expressions are build using the Meta type, which can then be compiled and loaded into a usable function.
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TemplateHaskell #-}
{-# OPTIONS_GHC -Wall #-}
module Data.Meta.Meta (
-- * Meta type
Meta (..)
-- * Functions
, metaCompile
) where
import Language.Haskell.TH
import Data.Typeable as Typ
import Control.Exception (bracket)
import System.Plugins -- from plugins
import System.IO
import System.Directory
newtype Meta a = Meta { unMeta :: ExpQ }
-- | Super-dodgy for the moment, the Meta type should register the
-- imports it needs.
metaCompile :: forall a. Typeable a => Meta a -> IO (Either String a)
metaCompile (Meta expr) = do
expr' <- runQ expr
-- pretty-print the TH expression as source code to be compiled at
-- run-time
let interpStr = pprint expr'
typeTypeRep = Typ.typeOf (undefined :: a)
let opener = do
(tfile, h) <- openTempFile "." "fooTmpFile.hs"
hPutStr h (unlines
[ "module TempMod where"
, "import Prelude"
, "import Language.Haskell.TH"
, "import GHC.Num"
, "import GHC.Base"
, ""
, "myFunc :: " ++ show typeTypeRep
, "myFunc = " ++ interpStr] )
hFlush h
hClose h
return tfile
bracket opener removeFile $ \tfile -> do
res <- make tfile ["-O2", "-ddump-simpl"]
let ofile = case res of
MakeSuccess _ fp -> fp
MakeFailure errs -> error $ show errs
print $ "loading from: " ++ show ofile
r2 <- load (ofile) [] [] "myFunc"
print "loaded"
case r2 of
LoadFailure er -> return (Left (show er))
LoadSuccess _ (fn :: a) -> return $ Right fn
This function takes an ExpQ, and first runs it in IO to create a plain Exp. The Exp is then pretty-printed into source code, which is compiled and loaded at run-time. In practice, I've found that one of the more difficult obstacles is specifying the correct imports in the generated TH code.
From what I understand you want to create and run a code at runtime which I think you can do using GHC API but I am not very sure of the scope of what you can achieve. If you want something like hot code swapping you can look at the package hotswap.

Dynamic module name

I want to do something like this in Haskell, but the compiler is not letting me.
Is there any way to accomplish this task?
-- both modules export function named "hello"
-- and I want to run it in every module
import qualified MyMod as M1
import qualified MyAnotherMod as M2
runmodules = map (\m -> m.hello) [M1, M2]
Modules in Haskell are not even remotely first-class entities in the ways this would require, I'm afraid.
However, as bzn commented, Template Haskell can be used for problems like this. The result can be a bit clumsy, but if you really need some quick metaprogramming hacks it's not a bad choice. I'm not really an expert with TH, but what you want is pretty simple, with one catch: Neither "ambiguous identifiers" nor "module names" can be captured or quoted in any way, as far as I know, so you'll have to put them in strings given as arguments to the TH function.
Here's a quick and dirty, minimal example:
{-# LANGUAGE TemplateHaskell #-}
module MapModuleTH where
import Language.Haskell.TH
mapQual :: [String] -> String -> ExpQ
mapQual ms n = listE $ map (\m -> varE . mkName $ m ++ "." ++ n) ms
mapMQual :: [String] -> String -> ExpQ
mapMQual ms n = appE (varE 'sequence) $ listE $ map (\m -> varE . mkName $ m ++ "." ++ n) ms
You phrased things as "running the function" which sounds more like doing a bunch of IO actions, not just collecting a list of stuff, so I added a variant that also sequences the result.
Note that, despite the use of strings here, this is still statically typed--if the qualified names don't exist, or the types don't match up, you'll get the expected compile-time error just as if you'd written everything out by hand.
Here's a quick example of using it. Given the following:
{-# LANGUAGE TemplateHaskell #-}
module MapModule where
import MapModuleTH
import qualified Test1 as T1
import qualified Test2 as T2
tests = do xs <- $(mapMQual ["T1", "T2"] "test")
putStrLn $ "Count: " ++ show (length xs)
Assuming the other modules are there and define test, then in GHCi we can see:
> tests
Test 1
Test 2
Count: 2
I don't think you can quote a qualified name prefix like that in template haskell, and the hello identifier isn't in scope, so you might have to fall back to programming with strings.
module ModuleParamsTH where
import Language.Haskell.TH
getAll :: String -> [String] -> ExpQ
getAll valueName moduleNames =
listE $ map (varE . mkName . (++ suffix)) moduleNames
where suffix = "." ++ valueName
which can then be used like so,
{-# LANGUAGE TemplateHaskell #-}
import ModuleParamsTH
import qualified ModuleParamsM1 as M1
import qualified ModuleParamsM2 as M2
runmodules = $(getAll "hello" ["M1", "M2"])
However, I would not do all this. You could just write [M1.hello, M2.hello] or use a type class to abstract over implementations.
Modules aren't values in Haskell. Therefore that isn't possible. What do you want to achieve?

Grandfather Paradox in Haskell

I'm trying to write a renamer for a compiler that I'm writing in Haskell.
The renamer scans an AST looking for symbol DEFs, which it enters into a symbol table, and symbol USEs, which it resolves by looking in the symbol table.
In this language, uses can come before or after defs, so it would seem that a 2 pass strategy is required; one pass to find all the defs and build the symbol table, and a second to resolve all the uses.
However, since Haskell is lazy (like me), I figure I can tie-the-knot and pass the renamer the final symbol table before it is actually built. This is fine as long as I promise to actually build it. In an imperative programming language, this would be like sending a message back in time. This does work in Haskell, but care must be taken to not introduce a temporal paradox.
Here's a terse example:
module Main where
import Control.Monad.Error
import Control.Monad.RWS
import Data.Maybe ( catMaybes )
import qualified Data.Map as Map
import Data.Map ( Map )
type Symtab = Map String Int
type RenameM = ErrorT String (RWS Symtab String Symtab)
data Cmd = Def String Int
| Use String
renameM :: [Cmd] -> RenameM [(String, Int)]
renameM = liftM catMaybes . mapM rename1M
rename1M :: Cmd -> RenameM (Maybe (String, Int))
rename1M (Def name value) = do
modify $ \symtab -> Map.insert name value symtab
return Nothing
rename1M (Use name) = return . liftM ((,) name) . Map.lookup name =<< ask
--rename1M (Use name) =
-- maybe (return Nothing) (return . Just . (,) name) . Map.lookup name =<< ask
--rename1M (Use name) =
-- maybe (throwError $ "Cannot locate " ++ name) (return . Just . (,) name) . Map.lookup name =<< ask
rename :: [Cmd] -> IO ()
rename cmds = do
let (result, symtab, log) = runRWS (runErrorT $ renameM cmds) symtab Map.empty
print result
main :: IO ()
main = do
rename [ Use "foo"
, Def "bar" 2
, Use "bar"
, Def "foo" 1
]
This is the line where the knot is tied:
let (result, symtab, log) = runRWS (runErrorT $ renameM cmds) symtab Map.empty
The running symbol table is stored in the MonadState of the RWS, and the final symbol table is stored in the MonadReader.
In the above example, I have 3 versions of rename1M for Uses (2 are commented out). In this first form, it works fine.
If you comment out the first rename1M Use, and uncomment the second, the program does not terminate. However, it is, in spirit, no different than the first form. The difference is that it has two returns instead of one, so the Maybe returned from Map.lookup must be evaluated to see which path to take.
The third form is the one that I really want. I want to throw an error if I can't find a symbol. But this version also does not terminate. Here, the temporal paradox is obvious; the decision about whether the the symbol will be in the table can affect whether it will be in the table...
So, my question is, is there an elegant way to do what the third version does (throw an error) without running into the paradox? Send the errors on the MonadWriter without allowing the lookup to change the path? Two passes?
Do you really have to interrupt execution when an error occurs? An alternative approach would be to log errors. After tying the knot, you can check whether the list of errors is empty. I've taken this approach in the past.
-- I've wrapped a writer in a writer transformer. You'll probably want to implement it differently to avoid ambiguity
-- related to writer methods.
type RenameM = WriterT [RenameError] (RWS Symtab String Symtab)
rename1M (Use name) = do
symtab_entry <- asks (Map.lookup name)
-- Write a list of zero or more errors. Evaluation of the list is not forced until all processing is done.
tell $ if isJust symtab_entry then [] else missingSymbol name
return $ Just (name, fromMaybe (error "lookup failed") symtab_entry)
rename cmds = do
let ((result, errors), symtab, log) = runRWS (runWriterT $ renameM cmds) symtab Map.empty
-- After tying the knot, check for errors
if null errors then print result else print errors
This does not produce laziness-related nontermination problems because the contents of the symbol table are not affected by whether or not a lookup succeeded.
I don't have a well thought out answer, but one quick thought. Your single pass over the AST takes all the Def and produces a (Map Symbol _), and I wonder if the same AST pass can take all the Use and produce a (Set Symbol) as well as the lazy lookup.
Afterwards you can quite safely compare the Symbols in the keys of the Map with the Symbols in the Set. If the Set has anything not in the Map then you can report all of those Symbols are errors. If any Def'd Symbols are not in in the Set then you can warn about unused Symbols.

Resources