Simple library to do UTF-8 in Haskell (since Streams no longer compiles)

I just want to read (and maybe write) UTF-8 data. haskell.org still advertises System.Streams, which does not compile with a recent GHC:
% runhaskell Setup.lhs configure
Configuring Streams-0.2.1...
% runhaskell Setup.lhs build
Preprocessing library Streams-0.2.1...
Building Streams-0.2.1...
[10 of 45] Compiling System.FD ( System/FD.hs, dist/build/System/FD.o )
System/FD.hs:138:22:
Couldn't match expected type `GHC.IOBase.FD'
against inferred type `FD'
In the first argument of `fdType', namely `fd'
In a 'do' expression: fd_type <- fdType fd
In the expression:
let
oflags1 = case mode of
ReadMode -> ...
WriteMode -> ...
ReadWriteMode -> ...
AppendMode -> ...
binary_flags | binary = o_BINARY
| otherwise = 0
oflags = oflags1 .|. binary_flags
in
do fd <- fdOpen filepath oflags 438
fd_type <- fdType fd
when (mode == WriteMode && fd_type == RegularFile)
$ do fdSetFileSize fd 0
....
Similar problem with Streams 0.1. I cannot get more recent versions since the official site is down:
% wget http://files.pupeno.com/software/streams/Streams-0.1.7.tar.bz2
--2009-07-30 15:36:14-- http://files.pupeno.com/software/streams/Streams-0.1.7.tar.bz2
Resolving files.pupeno.com... failed: Name or service not known.
wget: unable to resolve host address `files.pupeno.com'
Is there a better solution, or a darcs repository with the source code?

Use the utf8-string or the more recent text package.
View the list of packages on Hackage.
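For reference, a minimal sketch of what reading and writing UTF-8 with the text package can look like. The helper names readUtf8 and writeUtf8 and the file name sample-utf8.txt are made up for this illustration; the decoding and encoding calls come from Data.Text.Encoding. Note that decodeUtf8 throws an exception on invalid input, so this assumes the file really is UTF-8.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Read a file as raw bytes and decode them as UTF-8.
-- (readUtf8 is a hypothetical helper name, not part of the text package.)
readUtf8 :: FilePath -> IO T.Text
readUtf8 path = TE.decodeUtf8 <$> BS.readFile path

-- Encode the text as UTF-8 and write the resulting bytes.
writeUtf8 :: FilePath -> T.Text -> IO ()
writeUtf8 path = BS.writeFile path . TE.encodeUtf8

main :: IO ()
main = do
  writeUtf8 "sample-utf8.txt" (T.pack "ćlččć\n")
  contents <- readUtf8 "sample-utf8.txt"
  -- T.length counts characters, not bytes.
  print (T.length contents)
```

Going through ByteString keeps the result independent of the locale encoding that the System.IO functions would otherwise apply.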

Edit:
L. Kolmodin is right: utf8-string or text is the right answer. I'll leave my original answer below for reference. Google seems to have steered me wrong in choosing IConv. (The equivalent of my IConv wrapper function is already in utf8-string as Codec.Binary.UTF8.String.encodeString.)
Here is what I've been using--I may not remember the complete solution, so let me know if you still run into problems:
From Hackage, install IConv. Unfortunately, Codec.Text.IConv.convert operates on bytestrings, not strings. I guess you could read files directly as bytestrings, but I wrote a converter since HaXml uses normal strings:
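import Codec.Text.IConv (convert)
import qualified Data.ByteString.Lazy.Char8 as B
-- Each Char of the input is one Latin-1 byte; each Char of the result is one UTF-8 byte.
utf8FromLatin1 = B.unpack . convert "LATIN1" "UTF-8" . B.pack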
Now, on Mac OS, you have to compile with
$ ghc -O2 --make -L/usr/lib -L/opt/local/lib Whatever.hs
Because there was some library conflict, I think with MacPorts, I have to point explicitly to the built-in iconv libraries. There is probably a way to always pass those -L flags to ghc, but I haven't looked it up yet.
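If linking against the system iconv is a problem, the same Latin-1 to UTF-8 conversion can probably be done in pure Haskell with the text package. This is only a sketch of that alternative, not the IConv approach described above; latin1ToUtf8 is a name chosen for this example.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text.Encoding as TE

-- Decode Latin-1 bytes to Text, then re-encode the Text as UTF-8.
-- (latin1ToUtf8 is a hypothetical helper name.)
latin1ToUtf8 :: BS.ByteString -> BS.ByteString
latin1ToUtf8 = TE.encodeUtf8 . TE.decodeLatin1

main :: IO ()
main =
  -- 0xE9 is "é" in Latin-1; in UTF-8 it becomes the two bytes 0xC3 0xA9.
  print (BS.unpack (latin1ToUtf8 (BS.pack [0xE9])))  -- prints [195,169]
```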

UTF-8 strings are just byte sequences, so it should be possible to read and write them as is. The first 128 code points (0 to 127), including whitespace, are encoded exactly as in ASCII. Of course, you would need your own functions for manipulating the strings, since every other character is a multi-byte sequence.
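To illustrate why byte-level functions are not enough, here is a small sketch that counts characters in raw UTF-8 bytes by skipping continuation bytes (those of the form 10xxxxxx). The names isContinuation and countCodePoints are invented for the example, and the input is assumed to be valid UTF-8.

```haskell
import Data.Bits ((.&.))
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- A UTF-8 continuation byte has the bit pattern 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- Every code point contributes exactly one non-continuation byte.
countCodePoints :: BS.ByteString -> Int
countCodePoints = BS.length . BS.filter (not . isContinuation)

main :: IO ()
main = do
  let bytes = BS.pack [0x63, 0xC4, 0x87]  -- "c" followed by "ć" (U+0107)
  print (BS.length bytes)        -- prints 3 (bytes)
  print (countCodePoints bytes)  -- prints 2 (characters)
```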

Related

How do I use a module and a script in the terminal command ocaml?

I'm trying to run a .ml script, test.ml, using the command ocaml and use a module, template.ml, that I setup.
Currently, I know that I can run ocaml using the module by doing ocaml -init template.ml and that I can run a script using ocaml test.ml.
I'm trying to run the script, test.ml, and use the module, template.ml.
I have tried using ocaml test.ml with the first line being open Template ;; after compiling template with ocamlopt -c template.ml. Template is undefined in that case.
I have also tried using ocaml -init template.ml test.ml with and without open Template ;; as the first line of code. They don't work or error respectively.
First, the open command is only for controlling the namespace. I.e., it controls the set of visible names. It doesn't have the effect (as is often assumed) of locating and making a module accessible. (In general you should avoid over-using open. It's never necessary; you can always use the full Module.name syntax.)
The ocaml command line takes any number of compiled ocaml modules followed by one ocaml (.ml) file.
So you can do what you want by compiling the template.ml file before you start:
$ ocamlc -c template.ml
$ ocaml template.cmo test.ml
Here is a fully worked example with minimal contents of the files:
$ cat template.ml
let f x = x + 5
$ cat test.ml
let main () = Printf.printf "%d\n" (Template.f 14)
let () = main ()
$ ocamlc -c template.ml
$ ocaml template.cmo test.ml
19
For what it's worth I think of OCaml as a compiled language rather than a scripting language. So I usually compile all the files and then run them. Using the same files as above, it looks like this:
$ ocamlc -o test template.ml test.ml
$ ./test
19
I only use the ocaml command when I want to interact with an interpreter (which OCaml folks have traditionally called the "toplevel").
$ ocaml
OCaml version 4.10.0
# let f x = x + 5;;
val f : int -> int = <fun>
# f 14;;
- : int = 19
#

How can I run a GHCi statement in cabal v2-repl directly from command line?

How can I replicate ghci -e "print 123" in cabal v2-repl?
I've searched for "expression" or "statement" in cabal v2-repl --help with no luck.
The simplest way is to use shell piping capabilities. See:
% cabal v2-repl <<< ':type zip'
...
λ zip :: [a] -> [b] -> [(a, b)]
λ Leaving GHCi.
The <<< notation (a "here-string", supported by bash, zsh and ksh but not by plain POSIX sh) sends the quoted string to the standard input of the command, followed by a newline and then end of file.
There are other ways. For example, if you wish to supply more lines, you can use the so-called "here-doc":
% cabal repl <<EOF
:type zip
:type fst
EOF
The <<< notation is a shorthand for a one-line "here-doc".
In general, a program can tell whether its standard input is a terminal (presumably with a live user) or a file (which a here-doc pretends to be), and behave differently. Ordinarily it works either way, so if you can send things to a program's standard input, you can automate it.
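As a small illustration of that last point, this is roughly how a Haskell program can check what its standard input is. hIsTerminalDevice is from System.IO; describeInput is just a name made up for the example.

```haskell
import System.IO (hIsTerminalDevice, stdin)

-- Turn the result of the terminal check into a message.
-- (describeInput is a hypothetical helper name.)
describeInput :: Bool -> String
describeInput True  = "stdin is a terminal"
describeInput False = "stdin is a file or pipe"

main :: IO ()
main = hIsTerminalDevice stdin >>= putStrLn . describeInput
```

Run directly from a terminal it should report a terminal; fed through a here-doc or a pipe it should report a file or pipe.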

GHCI 7.8.3 does not support utf8 characters

I've read in the utf8-string package that ghc should support utf8 by default. I've even seen it written somewhere that my default code page is now used.
Despite all that, a simple piece of code fails.
writeFile "asd.txt" "ćlččć"
returns
*** Exception: filenames.txt: commitBuffer: invalid argument (invalid character)
How do I get this code to execute?
Perhaps you should set the encoding of the handle you're actually writing to. I don't know for sure, since I can't reproduce your problem, but something like this may do:
import System.IO
withFile "asd.txt" WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h "ćlččć"
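Another option that may help, if the problem is the default code page, is to change the default encoding for the whole process instead of per handle. The sketch below assumes setLocaleEncoding from GHC.IO.Encoding is available in your GHC; roundTripFile is a name invented for the example.

```haskell
import GHC.IO.Encoding (setLocaleEncoding)
import System.IO (utf8)

-- Write a string and read it back with UTF-8 as the process-wide default.
-- (roundTripFile is a hypothetical helper name.)
roundTripFile :: FilePath -> String -> IO String
roundTripFile path contents = do
  setLocaleEncoding utf8  -- applies to handles opened after this call
  writeFile path contents
  readBack <- readFile path
  -- Force the lazy read so the file is fully consumed before returning.
  length readBack `seq` pure readBack

main :: IO ()
main = do
  s <- roundTripFile "asd.txt" "ćlččć"
  print (s == "ćlččć")
```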

Get absolute path of current source file in Haskell

Is it possible to get the absolute path of the current source file in Haskell?
I could only find one relevant function: getCurrentDirectory from System.Directory, but it "returns an absolute path to the current directory of the calling process.", not the path of the current file.
(I need it to read sample inputs which are located in the same folder as the source file; If there's any better way to do it, that will be helpful too!)
You can use CPP. If you compile this file
{-# LANGUAGE CPP #-}
main = print __FILE__
it will print the path to the source as you passed it to ghc – which may or may not be the full path, though:
/tmp $ ghc --make mypath.hs
[1 of 1] Compiling Main ( mypath.hs, mypath.o )
Linking mypath ...
/tmp $ ./mypath
"mypath.hs"
/tmp $ ghc --make /tmp/mypath.hs
Linking /tmp/mypath ...
/tmp $ ./mypath
"/tmp/mypath.hs"
As an alternative, the file-embed package can be used here. It uses Template Haskell to embed files/directories.
This can be very useful to embed resources or configs in the executable. It may not be advisable to read the sample input this way though. data-files in cabal might be a better alternative, as already pointed out earlier in this thread.
The PseudoMacros package might be useful. According to the description, it provides C-like strings for the current file name etc.
UPDATE
The file name returned by PseudoMacros equals the path passed to ghc (same behaviour as @JoachimBreitner mentioned in his answer), so
{-# LANGUAGE TemplateHaskell #-}
import PseudoMacros

main :: IO ()
main = putStrLn ("Hello from " ++ $__FILE__ ++ ", line " ++ show $__LINE__ ++ "!")
will print
Hello from tmp.hs, line 5!
or
Hello from /tmp/tmp.hs, line 5!
depending on whether you provided a relative or absolute filename to ghc.
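For the original use case (sample inputs next to the source file), the CPP __FILE__ approach can be combined with System.FilePath. This is only a sketch: sourceDir, besideSource and sample-input.txt are names made up here, and the caveat from the answers still applies, so if ghc was given a relative path the result is relative too and only resolves correctly when the program runs from the directory it was compiled in.

```haskell
{-# LANGUAGE CPP #-}
import System.FilePath (takeDirectory, (</>))

-- Directory part of the path that was passed to ghc for this file.
sourceDir :: FilePath
sourceDir = takeDirectory __FILE__

-- Path of a file expected to sit next to this source file.
-- (besideSource is a hypothetical helper name.)
besideSource :: FilePath -> FilePath
besideSource name = sourceDir </> name

main :: IO ()
main = putStrLn (besideSource "sample-input.txt")
```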

Calling Haskell script on mac?

I've installed the Haskell platform on my mac (OSX lion), and ghci is running great.
Now I've created a Haskell file, stored on my desktop. How can I call it from this directory?
Example:
Prelude> :load datei.hs
[1 of 1] Compiling Main ( datei.hs, interpreted )
datei.hs:1:7: parse error on input `\'
Failed, modules loaded: none.
datei.hs:
let fac n = if n == 0 then 1 else n * fac (n-1)
Why do I get this?
Use the OSX terminal to reach your desktop and invoke yourfile.hs using ghci:
cd ~/Desktop
ghci yourfile.hs
edit:
As stated in the comments, the error message you're seeing above is warning you that the character \ exists at an unexpected location in the source code.
Since that character does not exist in the line of code you posted, there must be more to datei.hs. We need to see the rest of your source code before we can help.
If you saved your program with TextEdit, it's very possible that you're seeing a '\' character because you're saving it as an RTF file (TextEdit's default). Hit Cmd-Shift-T (Format > Make Plain Text) to convert it into a plain text file.
If you're already in ghci you can use ':cd /path/to/file' as well.
Here is a good thread discussing let.
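On the let point: a let binding like the one in datei.hs works at the GHCi prompt or inside a do block, but not as a top-level declaration in a source file. Assuming the file is meant to contain only the factorial function, a version that :load accepts could look like this (the type signature and main are additions for the example):

```haskell
-- Top-level definitions in a source file are written without "let".
fac :: Integer -> Integer
fac n = if n == 0 then 1 else n * fac (n - 1)

main :: IO ()
main = print (fac 5)  -- prints 120
```

After :load datei.hs you can then call fac 5 directly at the prompt.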
