How do I proc out with tilde expansion AND $PATH searching in Haskell? - haskell

I'm trying to run the elm-reactor project, which is written in Haskell. It fails because it's trying to proc out to the elm command like this:
createProcess (proc "elm" $ args fileName)
My elm executable is sitting in ~/.cabal/bin, which is in my PATH.
The System.Process.proc command searches the $PATH for its command argument, but it doesn't do tilde (~) expansion, so it doesn't find elm.
System.Process.shell has the opposite problem. It does tilde expansion, but it doesn't search the $PATH, apparently.
From the source of the System.Process command, it looks like most everything rests on a foreign ccall to "runInteractiveProcess", which I assume is doing whatever $PATH searching is being done. I don't know where the source for runInteractiveProcess would be, and my C is about 15 years worth of rusty.
I can work around this issue by
a) adding the fully-expanded cabal/bin path to my PATH or
b) symlinking an elm from the working directory to its location in cabal/bin.
However, I'd like to offer a suggested fix to the elm project, to save future adopters the trouble I've gone through. Is there a System.Process call that they should be making here that I haven't tried? Or is there a different method they should be using? I suppose at worst they could getEnv for the PATH and HOME, and implement their own file search using that before calling proc - but that breaks cross-platform compatibility. Any other suggestions?

Try using shell instead of proc, i.e.:
createProcess (shell "elm")
This should invoke elm via a shell, which hopefully will interpret tildes in $PATH as desired.
Update: Here is the experiment I performed to test what shell does...
Compile the following program (I called it run-foofoo):
import System.Process

main :: IO ()
main = do
  (_, _, _, h) <- createProcess $ shell "foofoo"
  ec <- waitForProcess h
  print ec
Create a new directory ~/new-bin and place the following perl script there as the file foofoo:
#!/usr/bin/perl
print "Got here and PATH is $ENV{PATH}\n";
Run: chmod a+rx ~/new-bin/foofoo
Test with:
PATH="/bin:/usr/bin:/sbin" ./run-foofoo # should fail
PATH="$HOME/new-bin:/bin:/usr/bin:/sbin" ./run-foofoo # should succeed
PATH="~/new-bin:/bin:/usr/bin:/sbin" ./run-foofoo # ???
On my OSX system, the third test reports:
Got here and PATH is ~/new-bin:/bin:/usr/bin:/sbin
ExitSuccess
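If the fix should not depend on how /bin/sh treats a literal ~ in PATH, another option is to expand the tildes yourself and then hand proc an absolute path. The following is only a minimal sketch, not code from elm-reactor: expandTilde and findOnExpandedPath are hypothetical helper names, it assumes the directory and filepath packages are available, and it handles only a leading ~ or ~/ (not ~user).

```haskell
import Data.Maybe (listToMaybe)
import System.Directory (findExecutablesInDirectories, getHomeDirectory)
import System.FilePath (getSearchPath, (</>))
import System.Process (createProcess, proc, waitForProcess)

-- | Expand a leading "~" or "~/" in a single PATH entry.
-- Hypothetical helper; "~user" forms are left untouched.
expandTilde :: FilePath -> FilePath -> FilePath
expandTilde home "~" = home
expandTilde home ('~' : '/' : rest) = home </> rest
expandTilde _ dir = dir

-- | Search a tilde-expanded copy of the search path for an executable.
findOnExpandedPath :: String -> IO (Maybe FilePath)
findOnExpandedPath name = do
  home <- getHomeDirectory
  dirs <- map (expandTilde home) <$> getSearchPath
  listToMaybe <$> findExecutablesInDirectories dirs name

main :: IO ()
main = do
  found <- findOnExpandedPath "elm"
  case found of
    Nothing -> putStrLn "elm not found on PATH"
    Just exe -> do
      -- proc receives an absolute path, so no further PATH lookup is needed.
      (_, _, _, h) <- createProcess (proc exe [])
      waitForProcess h >>= print
```

Because getSearchPath and findExecutablesInDirectories come from the portable filepath and directory packages, this avoids hand-parsing PATH and HOME, which was the cross-platform concern raised in the question.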

Related

Running shell commands from Haskell in NixOS

I'm fairly new to NixOS, and am trying to invoke emacs from a Haskell program using the following function:
ediff :: String -> String -> String -> IO ()
ediff testName a b = do
  a' <- writeSystemTempFile (testName ++ ".expected") a
  b' <- writeSystemTempFile (testName ++ ".received") b
  let quote s = "\"" ++ s ++ "\""
  callCommand $ "emacs --eval \'(ediff-files " ++ quote a' ++ quote b' ++ ")\'"
When I run the program that invokes this command using stack test, I get the following result (interspersed with unit test results):
/bin/sh: emacs: command not found
Exception: callCommand: emacs --eval '(ediff-files "/run/user/1000/ast1780695788709393584.expected" "/run/user/1000/ast4917054031918502651.received")'
When I run the command that failed to run above from my shell, it works flawlessly. How can I run processes from Haskell in NixOS, as though I had invoked them directly, so that they can access the same commands and configurations as my user?
Both your shell and callCommand use the PATH environment variable, so it seems like stack is changing that. It turns out that stack uses a pure nix shell by default, but you also want to access your user environment, which is 'impure'.
To quote the stack documentation:
By default, stack will run the build in a pure Nix build environment (or shell), which means the build should fail if you haven't specified all the dependencies in the packages: section of the stack.yaml file, even if these dependencies are installed elsewhere on your system. This behaviour enforces a complete description of the build environment to facilitate reproducibility. To override this behaviour, add pure: false to your stack.yaml or pass the --no-nix-pure option to the command line.
Another solution is to add Emacs to nix.dependencies in stack.yaml (thanks @chepner). It has the benefit that some version of Emacs will always be available when a developer runs the tests, but that Emacs may not be the Emacs they want to use. You may be able to work around that using something like ~/.config/nixpkgs/config.nix, unless they have configured their Emacs elsewhere, such as in the system configuration or perhaps a home manager. I'd prefer the simple but impure $PATH solution.
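To confirm that the environment really is the cause, it can help to print what the test process itself sees. This is a small diagnostic sketch rather than part of either fix; describeLookup is a hypothetical helper name, and findExecutable comes from the directory package.

```haskell
import Data.Maybe (fromMaybe)
import System.Directory (findExecutable)
import System.Environment (lookupEnv)

-- | Render the result of a PATH lookup (hypothetical helper).
describeLookup :: String -> Maybe FilePath -> String
describeLookup name Nothing = name ++ " is not on this PATH"
describeLookup name (Just p) = name ++ " found at " ++ p

main :: IO ()
main = do
  -- Inside a pure Nix shell this differs from the PATH of your login shell.
  path <- lookupEnv "PATH"
  putStrLn ("PATH seen by this process: " ++ fromMaybe "<unset>" path)
  findExecutable "emacs" >>= putStrLn . describeLookup "emacs"
```

Running this once under stack test and once directly from your shell should show whether emacs disappears only in the pure environment.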

How to replace paths to executables in source code with Nix that are not in PATH

I wish to write some Haskell that calls an executable as part of its work; and install this on a nixOS host. I don't want the executable to be in my PATH (and to rely on that would disrupt the beautiful dependency model of nix).
If this were, say, a Perl script, I would have a simple builder that looked for strings of a certain format, and replaced them with the executable names, based upon dependencies declared in the .nix file. But that seems somewhat harder with the cabal-based building common to haskell.
Is there a standard idiom for encoding the paths to executables at build time (including during development, as well as at install time) within Haskell code on nix?
For the sake of a concrete example, here is a trivial "script":
import System.Process ( readProcess )

main = do
  stdout <- readProcess "hostname" [] ""
  putStrLn $ "Hostname: " ++ stdout
I would like to be able to compile and run this (in principle) without relying on hostname being in the PATH, but rather replacing hostname with the full /nix/store/-inetutils-/bin/hostname path, and thus also gaining the benefits of dependency management under nix.
This could possibly be managed by using a shell (or similar) script, built using a replacement scheme as defined above, that sets up the environment the Haskell executable expects. That would still need some bootstrapping via cabal.mkDerivation, and since I'm a lover of OptParse-Applicative's bash completion, I'm loath to slow that down with another script to fire up every time I hit the tab key. But if that's what's needed, fair enough.
I did look through cabal.mkDerivation for some sort of pre-build step, but if it's there I'm not seeing it.
Thanks,
Assuming you're building the Haskell app in Nix, you can patch a configuration file via your Nix expression. For an example of how to do this, have a look at this small project.
The crux is that you can define a postConfigure hook like this:
pkgs.haskell.lib.overrideCabal yourProject (old: {
  postConfigure = ''
    substituteInPlace src/Configuration.hs --replace 'helloPrefix = Nothing' 'helloPrefix = Just "${pkgs.hello}"'
  '';
})
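On the Haskell side, the patched value can be turned into an executable path with a fallback for unpatched development builds. The sketch below is an assumption about how such a module might look, not the linked project's code: only the text helloPrefix = Nothing is implied by the hook above, helloExe is a hypothetical name, the bin/hello layout is assumed, and everything is kept in one file here rather than in src/Configuration.hs.

```haskell
module Main where

import System.FilePath ((</>))

-- The literal text "helloPrefix = Nothing" is what the postConfigure hook
-- rewrites to "helloPrefix = Just <store path>" during a Nix build.
helloPrefix :: Maybe FilePath
helloPrefix = Nothing

-- | Resolve the executable to call (hypothetical helper).
-- Unpatched builds fall back to a bare name that is looked up on PATH.
helloExe :: Maybe FilePath -> FilePath
helloExe Nothing = "hello"
helloExe (Just prefix) = prefix </> "bin" </> "hello"

main :: IO ()
main = putStrLn ("would run: " ++ helloExe helloPrefix)
```

The result of helloExe helloPrefix is what you would pass to readProcess or proc in place of the hard-coded command name.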
What I do with my xmonad build in nix[1] is refer to executable paths as things like ##compton##/bin/compton. Then I use a script like this to generate my default.nix file:
#!/usr/bin/env bash
set -eu

# Collect every ##package-name## tag used in the Haskell source.
packages=($(grep '##[^#]*##' src/Main.hs | sed -e 's/.*##\(.*\)##.*/\1/' | sort -u))

extra_args=()
for p in "${packages[@]}"; do
  extra_args+=(--extra-arguments "$p")
done

# Drop the closing brace so a patchPhase can be appended.
cabal2nix . "${extra_args[@]}" \
  | head -n-1

echo " patchPhase = ''";
echo " substituteInPlace src/Main.hs \\"
for p in "${packages[@]}"; do
  echo " --replace '##$p##' '\${$p}' \\"
done
echo " '';"
echo "}"
What it does is grep through src/Main.hs (this could easily be changed to find all Haskell files, or some specific configuration module) and pick out all the tags surrounded by ##, like ##some-package-name##. It then does two things with them:
1. It passes them to cabal2nix as extra arguments for the nix expression it generates.
2. It post-processes the nix expression output from cabal2nix to add a patch phase, which replaces the ##some-package-name## tag in the Haskell source file with the actual path to the derivation.[2]
This generates a nix-expression like this:
{ mkDerivation, base, compton, networkmanagerapplet, notify-osd
, powerline, setxkbmap, stdenv, synapse, system-config-printer
, taffybar, udiskie, unix, X11, xmonad, xmonad-contrib
}:
mkDerivation {
  pname = "xmonad-custom";
  version = "0.0.0.0";
  src = ./.;
  isLibrary = false;
  isExecutable = true;
  executableHaskellDepends = [
    base taffybar unix X11 xmonad xmonad-contrib
  ];
  description = "My XMonad build";
  license = stdenv.lib.licenses.bsd3;
  patchPhase = ''
    substituteInPlace src/Main.hs \
      --replace '##compton##' '${compton}' \
      --replace '##networkmanagerapplet##' '${networkmanagerapplet}' \
      --replace '##notify-osd##' '${notify-osd}' \
      --replace '##powerline##' '${powerline}' \
      --replace '##setxkbmap##' '${setxkbmap}' \
      --replace '##synapse##' '${synapse}' \
      --replace '##system-config-printer##' '${system-config-printer}' \
      --replace '##udiskie##' '${udiskie}' \
  '';
}
The net result is I can just write Haskell code and a cabal package file; I don't have to worry much about maintaining the nix package file as well, only re-running my generate-nix script if my dependencies change.
In my Haskell code I just write paths to executables as if ##the-nix-package-name## was an absolute path to a folder where that package is installed, and everything magically works.
The installed xmonad binary ends up containing hardcoded references to the absolute paths to the executables I call, which is how nix likes to work (this means it automatically knows about the dependency during garbage collection, for example). And I don't have to worry about keeping the things I called in my interactive environment's PATH, or maintaining a wrapper that sets up PATH just for this executable.
[1] I have it set up as a cabal project that gets built and installed into the nix store, rather than having it dynamically recompile itself from ~/.xmonad/xmonad.hs.
[2] Step 2 is a little meta, since I'm using a bash script to generate nix code with an embedded bash script in it.
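For illustration, the Haskell source under this tagging scheme might contain definitions like the following. This is a sketch of the idea only: comptonExe uses the tagged path mentioned above, while isPatched is a hypothetical guard that is not part of the described setup.

```haskell
import Data.List (isPrefixOf)

-- Tagged path: the generated patchPhase replaces ##compton## with the
-- store path of the compton package before compilation.
comptonExe :: FilePath
comptonExe = "##compton##/bin/compton"

-- | Hypothetical guard: True once the leading tag has been substituted.
isPatched :: FilePath -> Bool
isPatched = not . ("##" `isPrefixOf`)

main :: IO ()
main = putStrLn (comptonExe ++ status)
  where
    status = if isPatched comptonExe then " (patched)" else " (still tagged)"
```

A guard like this is only useful when the same source is also built outside Nix, where the tags would otherwise reach the running program unchanged.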
This is not intended to be an answer, but if I posted it in the comment section it would be badly formatted.
Also, I am not sure if this hack is the right way to do the job.
I notice that if I use nix-shell I can get the full path to the nix store.
Assuming the hash is always the same (AFAIK it is), you can hard-code it in the build recipe.
$ which bash
/run/current-system/sw/bin/bash
[wizzup@ ~]
$ nix-shell -p bash
[nix-shell:~]$ which bash
/nix/store/wb34dgkpmnssjkq7yj4qbjqxpnapq0lw-bash-4.4-p12/bin/bash
Lastly, I doubt you have to do any of this if you use buildInputs; it should be the same path.

Loading Shell Script Files to Access Their Functions In Haskell

I have a shell script file named /path/to/shell_script.sh which contains a function definition shell_function() { ...}.
I'd like to be able to do something in Haskell like readProcessWithExitCode shell_function [] "" to eventually get a hold of an IO (String).
How to do this?
If you have a shell script like
#!/bin/bash
foo() {
echo "foo"
}
you can use the readCreateProcess function from the process package to source the script and execute the function in one go, like this:
module Main where

import System.Process

main :: IO ()
main = do
  -- I had to put the full path of the script for it to work
  result <- readCreateProcess (shell ". /tmp/foo.sh && foo") ""
  print result
This solution assumes that the script only does things like defining functions and setting environment variables, without running undesired "effectful" code each time it is sourced.
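If you also want the exit code, as in the readProcessWithExitCode call from the question, a variant along these lines may work. It is a sketch under assumptions: bash is assumed to be on PATH, bashArgs and callShellFunction are hypothetical helper names, and the script path and function name are passed as positional parameters so they need no manual quoting.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Arguments for: bash -c '. "$1" && "$2"' bash <script> <function>
-- "$1" is the script to source and "$2" the function to call.
bashArgs :: FilePath -> String -> [String]
bashArgs script fn = ["-c", ". \"$1\" && \"$2\"", "bash", script, fn]

-- | Source the script, call one function, and collect exit code,
-- stdout and stderr.
callShellFunction :: FilePath -> String -> IO (ExitCode, String, String)
callShellFunction script fn =
  readProcessWithExitCode "bash" (bashArgs script fn) ""

main :: IO ()
main = do
  (code, out, err) <- callShellFunction "/tmp/foo.sh" "foo"
  case code of
    ExitSuccess -> putStr out
    ExitFailure n -> putStrLn ("failed with " ++ show n ++ ": " ++ err)
```

Calling bash explicitly matters when the script uses bash-only syntax, since shell runs its command through /bin/sh, which is not necessarily bash.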

multiline contents of a IO handle in haskell display nothing

I have been experimenting with Haskell. I am trying to write a web crawler, and I need to use the external curl binary. (Because of some proxy settings, curl needs special arguments that seem to be impossible or hard to set inside the Haskell code, so I would rather pass them as command-line options. But that is another story...)
In the code at the bottom, if I change the marked line to use curl instead of curl --help, the output renders properly and gives:
"curl: try 'curl --help' or 'curl --manual' for more information
"
Otherwise the string is empty, presumably because the 'curl --help' response is multiline.
I suspect that in Haskell the buffer is cleared with every new line. (The same goes for other simple shell commands, like ls versus ls -l.)
How do I fix it?
The code:
import System.Process
import System.IO

main = do
  let sp = (proc "curl --help" []) {std_out = CreatePipe} -- *** THIS LINE ***
  (_, Just out_h, _, _) <- createProcess sp
  out <- hGetContents out_h
  print out
proc takes as its first argument the name of the executable, not a shell command. That is, when you use proc "foo bar" you are not referring to a foo executable, but to an executable named exactly foo bar, with the space in its file name.
This is a useful feature in practice, because sometimes you do have spaces in there (e.g. on Windows you might have c:\Program Files\Foo\Foo.exe). Using a shell command you would have to escape spaces in your command string. Worse, a few other characters need to be escaped as well, and it's cumbersome to check what exactly those are. proc sidesteps the issue by not using the shell at all but passing the string as it is to the OS.
For the executable arguments, proc takes a separate argument list. E.g.
proc "c:\\Program Files\\Foo\\Foo.exe" ["hello world!%$"]
Note that the arguments need no escaping either.
If you want to pass arguments to curl, you have to pass them in the list:
sp = (proc "/usr/bin/curl" ["--help"]) {std_out=CreatePipe}
Then you will get the complete output in the resulting string.
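Put together, a complete version of the corrected program could look like the sketch below. captureStdout is a hypothetical helper name, and the program assumes curl can be found on PATH (the answer above uses /usr/bin/curl instead).

```haskell
import Control.Exception (evaluate)
import System.IO (hGetContents)
import System.Process

-- | Run an executable with an argument list and return all of its stdout.
captureStdout :: FilePath -> [String] -> IO String
captureStdout exe args = do
  let spec = (proc exe args) { std_out = CreatePipe }
  (_, Just outH, _, ph) <- createProcess spec
  out <- hGetContents outH
  _ <- evaluate (length out) -- force the lazy read before waiting
  _ <- waitForProcess ph
  pure out

main :: IO ()
main = captureStdout "curl" ["--help"] >>= putStr
```

Forcing the string before waitForProcess avoids waiting on a child that is blocked writing to a pipe nobody has read yet. For simple cases, readProcess "curl" ["--help"] "" from the same module does a similar job in one call.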

Bash script execution with and without shebang in Linux and BSD

Who or what determines what gets executed when a Bash-like script is run as a binary without a shebang?
I guess that running a normal script with a shebang is handled by the Linux binfmt_script module, which checks the shebang, parses the command line, and runs the designated script interpreter.
But what happens when someone runs a script without a shebang? I've tested the direct execv approach and found that there's no kernel magic there. Take a file like this:
$ cat target-script
echo Hello
echo "bash: $BASH_VERSION"
echo "zsh: $ZSH_VERSION"
Running a compiled C program that does just an execv call yields:
$ cat test-runner.c
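#include <stdio.h>
#include <unistd.h>
int main(void) {
    char *const argv[] = { "./target-script", NULL };
    if (execv("./target-script", argv) == -1)
        perror("./target-script");
    return 1;
}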
$ ./test-runner
./target-script: Exec format error
However, if I do the same thing from another shell script, it runs the target script using the same shell interpreter as the original one:
$ cat test-runner.bash
#!/bin/bash
./target-script
$ ./test-runner.bash
Hello
bash: 4.1.0(1)-release
zsh:
If I do the same trick with other shells (for example, Debian's default sh - /bin/dash), it also works:
$ cat test-runner.dash
#!/bin/dash
./target-script
$ ./test-runner.dash
Hello
bash:
zsh:
Mysteriously, it doesn't quite work as expected with zsh and doesn't follow the general scheme. Looks like zsh executed /bin/sh on such files after all:
greycat@burrow-debian ~/z/test-runner $ cat test-runner.zsh
#!/bin/zsh
echo ZSH_VERSION=$ZSH_VERSION
./target-script
greycat@burrow-debian ~/z/test-runner $ ./test-runner.zsh
ZSH_VERSION=4.3.10
Hello
bash:
zsh:
Note that ZSH_VERSION in parent script worked, while ZSH_VERSION in child didn't!
How does a shell (Bash, dash) determine what gets executed when there's no shebang? I've tried to find that place in the Bash and dash sources, but, alas, I got lost in there. Can anyone shed some light on the magic that determines whether a target file without a shebang should be executed as a script or as a binary in Bash/dash? Or maybe there is some sort of interaction with the kernel or libc, in which case I'd welcome explanations of how it works in the Linux and FreeBSD kernels and libcs.
Since this happens in dash and dash is simpler, I looked there first.
It seems that exec.c is the place to look, and the relevant function is tryexec, which is called from shellexec, which in turn is called whenever the shell thinks a command needs to be executed. A simplified version of the tryexec function is as follows:
STATIC void
tryexec(char *cmd, char **argv, char **envp)
{
    char *const path_bshell = _PATH_BSHELL;

repeat:
    execve(cmd, argv, envp);
    if (cmd != path_bshell && errno == ENOEXEC) {
        *argv-- = cmd;
        *argv = cmd = path_bshell;
        goto repeat;
    }
}
So, it simply always replaces the command to execute with the path to itself (_PATH_BSHELL defaults to "/bin/sh") if ENOEXEC occurs. There's really no magic here.
I find that FreeBSD exhibits identical behavior in bash and in its own sh.
The way bash handles this is similar but much more complicated. If you want to look into it further, I recommend reading bash's execute_command.c, looking specifically at execute_shell_script and then shell_execve. The comments are quite descriptive.
(Looks like Sorpigal has covered it but I've already typed this up and it may be of interest.)
According to Section 3.16 of the Unix FAQ, the shell first looks at the magic number (first two bytes of the file). Some numbers indicate a binary executable; #! indicates that the rest of the line should be interpreted as a shebang. Otherwise, the shell tries to run it as a shell script.
Additionally, it seems that csh looks at the first byte, and if it's #, it'll try to run it as a csh script.
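The magic-number check described in the FAQ can be sketched as a small classifier over the first bytes of a file. This illustrates the FAQ's description only; it is not how dash's tryexec works (that relies on execve returning ENOEXEC, as shown earlier). The Kind type and classify function are hypothetical names, and the four-byte ELF signature is an assumption that applies to Linux and FreeBSD binaries.

```haskell
import qualified Data.ByteString.Char8 as B

-- | What the leading bytes of a file suggest (hypothetical type).
data Kind = Shebang | ElfBinary | AssumedShellScript
  deriving (Eq, Show)

-- | Classify file contents by their leading magic bytes.
classify :: B.ByteString -> Kind
classify bytes
  | B.pack "#!" `B.isPrefixOf` bytes = Shebang
  | B.pack ("\DEL" ++ "ELF") `B.isPrefixOf` bytes = ElfBinary
  | otherwise = AssumedShellScript

main :: IO ()
main = mapM_ (print . classify . B.pack)
  [ "#!/bin/bash\n./target-script\n" -- has a shebang
  , "\DEL" ++ "ELF"                  -- ELF signature: 0x7f 'E' 'L' 'F'
  , "echo Hello\n"                   -- no magic: treated as a script
  ]
```

To classify a real file, the contents can be read with B.readFile and passed to classify.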
