Converting literal Chars to Word8 - haskell

The documentation for ByteString gives the following code example:
breakByte :: Word8 -> ByteString -> (ByteString, ByteString)
breakByte 'c' "abcd"
However when I write the same I get the following error (ideone):
Couldn't match expected type `GHC.Word.Word8'
with actual type `Char'
Of course 'c' is a Char, not Word8. Presumably they're using some extension which allows a fromInteger style function to work automatically on Char literals, but I'm not sure what. {-# LANGUAGE OverloadedStrings #-} doesn't seem to make any difference.

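Just import the Char8 versions of the modules (for example Data.ByteString.Char8). Their functions take Char instead of Word8 and do the byte conversion for you. Note that this only works for 8-bit characters, so don't try putting Unicode data into it: anything above code point 255 is truncated.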
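As a rough sketch of that advice, the Char-based break from Data.ByteString.Char8 accepts a Char predicate directly. The names splitAtC and example are made up for illustration and do not come from the bytestring documentation.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: split at the first 'c' using the Char-based
-- break from the Char8 module.
splitAtC :: B.ByteString -> (B.ByteString, B.ByteString)
splitAtC = BC.break (== 'c')

-- Both modules operate on the same ByteString type, so a value built
-- with BC.pack can also be passed to the Word8-based functions.
example :: (B.ByteString, B.ByteString)
example = splitAtC (BC.pack "abcd")
```

Loaded in GHCi, example evaluates to ("ab","cd").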

Related

How can I save a variable as a bytestring?

I know this is a dumb question, but if I have this:
a :: B.ByteString
a = "a"
I get an error that says "Couldn't match type B.ByteString with type [Char]". I know what the problem is, but I don't know how to fix it. Could you help? Thanks.
Character string literals in Haskell are, by default, always treated as String, which is equivalent to [Char]. Most string-like data structures define a function called pack to convert from String, and the bytestring package is no exception. (Note that this is pack from Data.ByteString.Char8; the one in Data.ByteString converts from [Word8].)
import qualified Data.ByteString as B
import Data.ByteString.Char8 (pack)
a :: B.ByteString
a = pack "a"
However, GHC also supports an extension called OverloadedStrings. If you're willing to enable this, ByteString implements a typeclass called IsString. With this extension enabled, the type of a string literal like "a" is no longer [Char] and is instead forall a. IsString a => a (similar to how the type of numerical literals like 3 is forall a. Num a => a). This will happily specialize to ByteString if the type is in scope.
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
a :: B.ByteString
a = "a"
If you go this route, make sure you understand the proviso listed in the docs for this instance. For ASCII characters, it won't pose a problem, but if your string has Unicode characters outside the ASCII range, you need to be aware of it.
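To make that proviso concrete, here is a small sketch of what happens to a non-ASCII literal. The names truncated and utf8 are hypothetical, and going through Data.Text with encodeUtf8 is one possible workaround, not something the answer above prescribes.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)

-- 'λ' is U+03BB. The IsString instance for ByteString keeps only the
-- low 8 bits of each Char, so this literal becomes the single byte 0xBB.
truncated :: B.ByteString
truncated = "λ"

-- Encoding via Text keeps the whole code point as two UTF-8 bytes.
utf8 :: B.ByteString
utf8 = encodeUtf8 (T.pack "λ")
```

Here B.unpack truncated gives [187], while B.unpack utf8 gives [206,187].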

Haskell implicit conversions

Hello, I was looking into using Data.Text.intercalate. On Hackage the function has the following signature:
intercalate :: Text -> [Text] -> Text
Why, then, does this work?
T.intercalate "NI!" ["We", "seek", "the", "Holy", "Grail"]
"WeNI!seekNI!theNI!HolyNI!Grail"
Shouldn't you have to apply Data.Text.pack to each element of the list?
Source : http://hackage.haskell.org/package/text-1.2.3.1/docs/Data-Text.html
In my case I want to pack the following input, where mytext :: Text:
"{", mytext, "}"
I am doing it with:
Prelude.intercalate (Data.Text.pack ",") [pack "{", mytext, pack "}"]
or
(pack "{") ++ mytext ++ (pack "}")
Can someone please explain to me why Data.Text exposes the same functions as Data.List (in our case intercalate), and how it makes implicit conversions between Char and Text?
You likely enabled -XOverloadedStrings (or enabled it with the {-# LANGUAGE OverloadedStrings #-} at the top of the file).
As a result, string literals (not string variables, only the literals) can be interpreted as any IsString type.
Text is an IsString type. So that means that implicitly you use pack around the string literals (again literals, not ordinary variables).
A similar thing happens with number literals: a number literal can be any Num type. Based on what functions you call on the number literal, Haskell can derive the exact type, and thus "interprets" the literal accordingly. For example if you write atan2 1 2, then 1 and 2 should be interpreted as RealFloat types, whereas for quot 1 2, the 1 and 2 are interpreted as Integral types.
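A short sketch of the point about literals versus variables: the same literal takes whatever IsString type its annotation asks for, while a String variable still needs an explicit pack. All binding names here are illustrative only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import qualified Data.Text as T

-- The same literal, resolved to three different IsString instances.
asString :: String
asString = "NI!"

asText :: T.Text
asText = "NI!"

asBytes :: B.ByteString
asBytes = "NI!"

-- asString is a variable, not a literal, so it must be packed by hand.
fromVariable :: T.Text
fromVariable = T.pack asString

-- Every literal below is typed as Text because intercalate demands it.
joined :: T.Text
joined = T.intercalate "NI!" ["We", "seek", "the", "Holy", "Grail"]
```

Writing fromVariable = asString instead would be rejected by the type checker, since no conversion is inserted for variables.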

Why isn't ByteString converted automatically to FilePath?

I'm passing a (strict) ByteString to something expecting a System.IO.FilePath, which is declared as type FilePath = String. I'm also using {-# LANGUAGE OverloadedStrings #-}. I've had conversions in some places happen automatically, but here it does not. What have I got wrong?
Main.hs:33:40: error:
• Couldn't match type ‘ByteString’ with ‘[Char]’
Expected type: FilePath
Actual type: ByteString
The {-# LANGUAGE OverloadedStrings #-} pragma only works for string literals, like "a string". In that case, Haskell implicitly places a fromString before every string literal, so it rewrites a literal such as "a string" to fromString "a string". This only happens for literals, never for variables or other expressions.
In Haskell, as far as I know, there are no implicit conversions. Conversions between for instance Int and Float are all explicit.
Furthermore note that the IsString typeclass only has a function fromString :: String -> a. So that means it works only from a string to that instance (here ByteString), not the other way around.
You can use unpack :: ByteString -> String from Data.ByteString.Char8 to convert the ByteString to a String (the unpack in Data.ByteString returns [Word8] instead).
IIRC, the OverloadedStrings extension doesn't enable magical conversion between different types of data. What it does is that when you write a string literal like "foo", the compiler can treat that literal as not only a String, but also as a ByteString.
You probably need something like unpack to convert ByteString to String.
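A minimal sketch of the explicit conversion both answers suggest, assuming the path bytes are plain ASCII. The helper name toFilePath is hypothetical.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: FilePath is just String, so unpack the bytes
-- into Chars one by one. Each byte becomes the Char with the same
-- code point, which is only faithful for ASCII (or Latin-1) paths.
toFilePath :: B.ByteString -> FilePath
toFilePath = BC.unpack
```

With this in place, a call such as readFile (toFilePath path) type-checks where readFile path does not.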

Parsec returns [Char] instead of Text

I am trying to create a parser for a custom file format. In the format I am working with, some fields have a closing tag like so:
<SOL>
<DATE>0517
<YEAR>86
</SOL>
I am trying to grab the value between the </ and > and use it as part of the bigger parser.
I have come up with the code below. The trouble is, the parser returns [Char] instead of Text. I can pack each Char by doing fmap pack $ return r to get a text value out, but I was hoping type inference would save me from having to do this. Could someone give hints as to why I am getting back [Char] instead of Text, and how I can get back Text without having to manually pack the value?
{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
import Text.Parsec
import Text.Parsec.Text
-- |A closing tag is on its own line and is a "</" followed by some uppercase characters
-- followed by some '>'
closingTag = do
  _ <- char '\n'
  r <- between (string "</") (char '>') (many upper)
  return r
string has the type
string :: Stream s m Char => String -> ParsecT s u m String
(See the Text.Parsec.Char documentation.)
So getting a String back is exactly what's supposed to happen.
Type inference doesn't change types, it only infers them. String is a concrete type, so there's no way to infer Text for it.
What you could do, if you need this in a couple of places, is to write a function
text :: Stream s m Char => String -> ParsecT s u m Text
text = fmap pack . string
or even
string' :: (IsString a, Stream s m Char) => String -> ParsecT s u m a
string' = fmap fromString . string
Also, it doesn't matter in this example, but you'd probably want to import Data.Text qualified, since names like pack are used in a number of different modules.
As Ørjan Johansen correctly pointed out, string isn't actually the problem here, many upper is. The same principle applies though.
The reason you get [Char] here is that upper parses a Char and many turns that into a [Char]. I would write my own combinator along the lines of:
manyPacked = fmap pack . many
You could probably use type-level programming with type classes etc. to automatically choose between many and manyPacked depending on the expected return type, but I don't think that's worth it. (It would probably look a bit like Scala's CanBuildFrom.)
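Putting the manyPacked idea together with the parser from the question might look roughly like this. The type signatures and the parseClosingTag wrapper are assumptions added for illustration, and Data.Text is imported qualified as suggested above.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.Text (Parser)

-- Run a Char parser zero or more times and pack the result into Text.
manyPacked :: Parser Char -> Parser Text
manyPacked = fmap T.pack . many

-- A closing tag on its own line: "</", some uppercase letters, then '>'.
closingTag :: Parser Text
closingTag = do
  _ <- char '\n'
  between (string "</") (char '>') (manyPacked upper)

-- Hypothetical wrapper for trying the parser on a Text input.
parseClosingTag :: Text -> Either ParseError Text
parseClosingTag = parse closingTag "(input)"
```

For the input "\n</SOL>" this yields Right "SOL", now as Text rather than [Char].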

Haskell How to Create a Word8?

I want to write a simple function which splits a ByteString into [ByteString] using '\n' as the delimiter. My attempt:
import Data.ByteString
listize :: ByteString -> [ByteString]
listize xs = Data.ByteString.splitWith (=='\n') xs
This throws an error because '\n' is a Char rather than a Word8, which is what Data.ByteString.splitWith is expecting.
How do I turn this simple character into a Word8 that ByteString will play with?
You could just use the numeric literal 10, but if you want to convert the character literal you can use fromIntegral (ord '\n') (the fromIntegral is required to convert the Int that ord returns into a Word8). You'll have to import Data.Char for ord.
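As a sketch of that conversion applied to the function from the question: charToWord8 is a hypothetical helper name, and B.split is used here in place of splitWith because the delimiter is a single byte.

```haskell
import qualified Data.ByteString as B
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper. Only meaningful for characters in the range
-- 0..255; larger code points wrap around when narrowed to Word8.
charToWord8 :: Char -> Word8
charToWord8 = fromIntegral . ord

-- Split a ByteString on the newline byte (10).
listize :: B.ByteString -> [B.ByteString]
listize = B.split (charToWord8 '\n')
```

Here charToWord8 '\n' is 10, so listize behaves the same as B.split 10.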
You could also import Data.ByteString.Char8, which offers functions for using Char instead of Word8 on the same ByteString data type. (Indeed, it has a lines function that does exactly what you want.) However, this is generally not recommended, as ByteStrings don't store Unicode codepoints (which is what Char represents) but instead raw octets (i.e. Word8s).
If you're processing textual data, you should consider using Text instead of ByteString.
