How can I save a variable as a bytestring? - haskell

I know this is a dumb question, but if I have this:
a :: B.ByteString
a = "a"
I get an error that says "Couldn't match type B.ByteString with type [Char]". I know what the problem is, but I don't know how to fix it. Could you help? Thanks.

String literals in Haskell are, by default, always treated as String, which is a synonym for [Char]. Most string-like data structures define a function called pack to convert from String, and the bytestring package is no exception. (Note that this is pack from Data.ByteString.Char8; the one in Data.ByteString converts from [Word8].)
import qualified Data.ByteString as B
import Data.ByteString.Char8 (pack)
a :: B.ByteString
a = pack "a"
However, GHC also supports an extension called OverloadedStrings, and ByteString is an instance of the IsString typeclass. With the extension enabled, the type of a string literal like "a" is no longer [Char] but forall a. IsString a => a (similar to how the type of a numeric literal like 3 is forall a. Num a => a). The literal will happily specialize to ByteString wherever that type is expected.
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
a :: B.ByteString
a = "a"
If you go this route, make sure you understand the proviso listed in the docs for this instance: fromString truncates each character to 8 bits rather than encoding it (for example as UTF-8). For ASCII characters this poses no problem, but if your string has Unicode characters outside the ASCII range, you need to be aware of it.
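To make the proviso concrete, here is a minimal sketch that compares a ByteString literal with an explicit UTF-8 encoding. The names truncated and encoded are made up for this illustration, and going through Data.Text.Encoding (from the text package) is just one possible way to encode; it is not something the answer above prescribes.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Literal via the IsString instance: each Char is cut down to its low 8 bits.
-- "\955" is the Greek letter lambda (U+03BB).
truncated :: B.ByteString
truncated = "\955"

-- Explicit UTF-8 encoding through Text keeps the whole code point.
encoded :: B.ByteString
encoded = TE.encodeUtf8 (T.pack "\955")

main :: IO ()
main = do
  print (B.unpack truncated)  -- [187], i.e. 0xBB only
  print (B.unpack encoded)    -- [206,187], i.e. 0xCE 0xBB
```

If the bytes are later read as UTF-8, only the second value round-trips to the original character.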

Related

How does OverloadedStrings language extension work?

I am trying to understand the language extension OverloadedStrings from the page https://ocharles.org.uk/posts/2014-12-17-overloaded-strings.html.
When the OverloadedStrings is enabled, then String becomes a type Data.String.IsString a => a:
Prelude Data.String> :t fromString "Foo"
fromString "Foo" :: IsString a => a
In the description, the author has mentioned the following:
By enabling this extension, string literals are now a call to the
fromString function, which belongs to the IsString type class.
What does "string literals are now a call to the fromString function" mean?
and also the author has mentioned:
This polymorphism is extremely powerful, and it allows us to write
embedded domain specific languages in Haskell source code, without
having to introduce new constructs for otherwise normal values.
What does "without having to introduce new constructs for otherwise normal values" mean?
When the OverloadedStrings is enabled, then String becomes a type Data.String.IsString a => a
No, that is incorrect. A String remains a String. The extension only affects string literals, not variables of type String, and a literal can still be a String where one is expected.
What does "string literals are now a call to the fromString function" mean?
It means that if you write a string literal, like "foo", Haskell implicitly writes fromString "foo", so the literal can be used as a value of any type that is an instance of IsString.
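A small sketch of that implicit rewriting, using Text from the text package as the target type. The names viaLiteral, viaFromString and plain are invented for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (fromString)
import qualified Data.Text as T

-- With the extension on, this literal stands for fromString "foo".
viaLiteral :: T.Text
viaLiteral = "foo"

-- The same value with the call written out by hand.
viaFromString :: T.Text
viaFromString = fromString "foo"

-- A literal can still be an ordinary String when that is the expected type.
plain :: String
plain = "foo"

main :: IO ()
main = do
  print (viaLiteral == viaFromString)  -- True
  print (T.unpack viaLiteral == plain) -- True
```

Note that only the literals are affected: passing the variable plain where a Text is expected would still be a type error.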
What does "without having to introduce new constructs for otherwise normal values" mean?
It means that we can make our own types for which we can write some sort of "mini-parser", and thus write these objects as string literals in our code. For example if we make a datatype like:
newtype BoolList = BoolList [Bool] deriving Show
then we can write our own parser
instance IsString BoolList where
    fromString = BoolList . map toBool
        where toBool '1' = True
              toBool _ = False
Now we can for example define a list of Bools as:
myboollist :: BoolList
myboollist = "10110010001"
So then we get:
Prelude Data.String> myboollist
BoolList [True,False,True,True,False,False,True,False,False,False,True]
Here we wrote the string literal "10110010001", which means that implicitly we wrote fromString "10110010001". Since the type of myboollist is BoolList, it is clear which type the string literal is parsed into.
This can be useful if some data types are complex, or would take a lot of code to construct.
The fromString call is, however, evaluated at runtime, and frequently not every possible string maps to a value of the type. (Here every string does, although it is debatable whether it is good to just fill in False for everything other than '1'.) An instance can therefore raise an error at runtime when the string turns out to be "unparsable".
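To illustrate the runtime-error point, here is a sketch of a stricter variant of the instance above. StrictBoolList is a hypothetical type made up for this example; it rejects anything other than '0' and '1' by calling error.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical stricter cousin of BoolList: only '0' and '1' are accepted.
newtype StrictBoolList = StrictBoolList [Bool] deriving (Show, Eq)

instance IsString StrictBoolList where
  fromString = StrictBoolList . map toBool
    where
      toBool '1' = True
      toBool '0' = False
      toBool c   = error ("StrictBoolList: unexpected character " ++ show c)

ok :: StrictBoolList
ok = "101"

-- This compiles without complaint; the error only surfaces
-- when the offending element is evaluated.
bad :: StrictBoolList
bad = "10x"

main :: IO ()
main = print ok  -- StrictBoolList [True,False,True]
```

The compiler accepts bad because it only checks that the literal has an IsString type, not that the contents parse.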
What does "without having to introduce new constructs for otherwise normal values" mean?
The next sentence says
So why should string literals be any different?
so this one refers primarily to number literals. Consider e.g. a type defining polynomials. Because + and * can only be applied to arguments of the same type, if we want
2*x^3 + 3*x :: Poly Int
to be legal, 2 and 3 have to be of type Poly Int; otherwise you'd need either
a separate operator to multiply a polynomial by a number: 2.*x^3 + 3.*x, or
a constructor for a constant polynomial: (C 2)*x^3 + (C 3)*x
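As a rough sketch of how overloaded numeric literals make the first form legal, here is one possible Poly type. The coefficient-list representation, the helper names addCoeffs and mulCoeffs, and the decision to leave abs and signum undefined are all assumptions for this illustration, not something taken from the article.

```haskell
-- Coefficients in ascending order of power:
-- Poly [c0, c1, c2] stands for c0 + c1*x + c2*x^2.
newtype Poly a = Poly [a] deriving (Show, Eq)

-- The indeterminate x.
x :: Num a => Poly a
x = Poly [0, 1]

addCoeffs :: Num a => [a] -> [a] -> [a]
addCoeffs (p:ps) (q:qs) = p + q : addCoeffs ps qs
addCoeffs ps     []     = ps
addCoeffs []     qs     = qs

mulCoeffs :: Num a => [a] -> [a] -> [a]
mulCoeffs []     _  = []
mulCoeffs (p:ps) qs = addCoeffs (map (p *) qs) (0 : mulCoeffs ps qs)

instance Num a => Num (Poly a) where
  Poly ps + Poly qs = Poly (addCoeffs ps qs)
  Poly ps * Poly qs = Poly (mulCoeffs ps qs)
  negate (Poly ps)  = Poly (map negate ps)
  -- This is what lets the literals 2 and 3 be read as constant polynomials.
  fromInteger n     = Poly [fromInteger n]
  abs    = error "abs: not defined for Poly"
  signum = error "signum: not defined for Poly"

p :: Poly Int
p = 2*x^3 + 3*x

main :: IO ()
main = print p  -- Poly [0,3,0,2]
```

Because fromInteger is part of Num, no separate scaling operator or constant constructor is needed at the use site.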
An example for string literals is given at the end:
However, SQL queries are notorious for injection attacks when we concatenate strings. Interestingly, postgresql-simple provides a Query type that only has a IsString instance. This means that it’s very lightweight to write a literal query, but the moment we want to start concatenating strings for our query, we have to be very explicit.
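The idea in that quote can be sketched with a stand-in type. The Query newtype below is hypothetical and is not the actual definition from postgresql-simple; it only shows how a type whose sole entry point is IsString makes literals cheap and concatenation with arbitrary strings a type error.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical stand-in for a query type; not the real postgresql-simple one.
newtype Query = Query String deriving (Show, Eq)

instance IsString Query where
  fromString = Query

-- Writing a literal query is lightweight.
selectUsers :: Query
selectUsers = "SELECT name FROM users WHERE id = ?"

-- This would be rejected by the type checker, because (++) produces a
-- String and a String variable is not a literal:
--
-- unsafeQuery :: String -> Query
-- unsafeQuery userInput = "SELECT name FROM users WHERE name = " ++ userInput

main :: IO ()
main = print selectUsers
```

Anyone who really wants to build a query from a String has to call fromString (or the constructor) explicitly, which makes the risky spot visible in the code.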

Why isn't ByteString converted automatically to FilePath?

I'm passing a (strict) ByteString to something expecting a System.IO.FilePath, which is declared as type FilePath = String. I'm also using {-# LANGUAGE OverloadedStrings #-}. I've had conversions in some places happen automatically, but here it does not. What have I got wrong?
Main.hs:33:40: error:
• Couldn't match type ‘ByteString’ with ‘[Char]’
Expected type: FilePath
Actual type: ByteString
The {-# LANGUAGE OverloadedStrings #-} pragma only affects string literals, like "a string". Haskell implicitly places a fromString before every string literal, so a literal such as "a string" is rewritten to fromString "a string". This only happens for literals.
In Haskell, as far as I know, there are no implicit conversions. Conversions between, for instance, Int and Float are all explicit.
Furthermore, note that the IsString typeclass only has a function fromString :: String -> a. So it only works from a String to the instance type (here ByteString), not the other way around.
You can use unpack :: ByteString -> String from Data.ByteString.Char8 to convert the ByteString to a String.
IIRC, the OverloadedStrings extension doesn't enable magical conversion between different types of data. What it does is that when you write a string literal like "foo", the compiler can treat that literal as not only a String, but also as a ByteString.
You probably need something like unpack (from Data.ByteString.Char8) to convert the ByteString to a String.
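A minimal sketch of the explicit conversion, assuming the path bytes are plain ASCII (or Latin-1). The helper name toFilePath and the example path are made up for illustration.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: turn a strict ByteString into a FilePath (a String).
-- Each byte becomes one Char, so this is only faithful for ASCII/Latin-1 paths.
toFilePath :: BC.ByteString -> FilePath
toFilePath = BC.unpack

main :: IO ()
main = putStrLn (toFilePath (BC.pack "data/input.txt"))
```

If the ByteString holds UTF-8 encoded text, decoding it first (for example with Data.Text.Encoding) and then unpacking the Text would be the safer route.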

Converting literal Chars to Word8

The documentation for ByteString gives the following code example:
breakByte :: Word8 -> ByteString -> (ByteString, ByteString)
breakByte 'c' "abcd"
However when I write the same I get the following error (ideone):
Couldn't match expected type `GHC.Word.Word8'
with actual type `Char'
Of course 'c' is a Char, not Word8. Presumably they're using some extension which allows a fromInteger style function to work automatically on Char literals, but I'm not sure what. {-# LANGUAGE OverloadedStrings #-} doesn't seem to make any difference.
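Just import the Char8 versions of the modules (for example Data.ByteString.Char8). They work with Char instead of Word8 and do the byte conversions for you. Note that they only handle 8-bit characters, so don't try putting Unicode data outside that range into them.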
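As a sketch of what that looks like in practice, the Char8 module offers break, which takes a predicate on Char. This is a swapped-in function rather than breakByte from the question; the name splitAtC is invented for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as BC

-- Split at the first 'c', using a Char predicate instead of a Word8.
splitAtC :: (BC.ByteString, BC.ByteString)
splitAtC = BC.break (== 'c') "abcd"

main :: IO ()
main = print splitAtC  -- ("ab","cd")
```

OverloadedStrings is only needed here for the "abcd" literal; the Char argument works because of the Char8 API, not because of any extension.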

Parsec returns [Char] instead of Text

I am trying to create a parser for a custom file format. In the format I am working with, some fields have a closing tag like so:
<SOL>
<DATE>0517
<YEAR>86
</SOL>
I am trying to grab the value between the </ and > and use it as part of the bigger parser.
I have come up with the code below. The trouble is, the parser returns [Char] instead of Text. I can pack each Char by doing fmap pack $ return r to get a text value out, but I was hoping type inference would save me from having to do this. Could someone give hints as to why I am getting back [Char] instead of Text, and how I can get back Text without having to manually pack the value?
{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
import Text.Parsec
import Text.Parsec.Text
-- |A closing tag is on its own line and is a "</" followed by some uppercase characters
-- followed by some '>'
closingTag = do
    _ <- char '\n'
    r <- between (string "</") (char '>') (many upper)
    return r
string has the type
string :: Stream s m Char => String -> ParsecT s u m String
(See the Text.Parsec.Char documentation.)
So getting a String back is exactly what's supposed to happen.
Type inference doesn't change types, it only infers them. String is a concrete type, so there's no way to infer Text for it.
What you could do, if you need this in a couple of places, is to write a function
text :: Stream s m Char => String -> ParsecT s u m Text
text = fmap pack . string
or even
string' :: (IsString a, Stream s m Char) => String -> ParsecT s u m a
string' = fmap fromString . string
Also, it doesn't matter in this example, but you'd probably want to import Data.Text qualified, since names like pack are used in a number of different modules.
As Ørjan Johansen correctly pointed out, string isn't actually the problem here, many upper is. The same principle applies though.
The reason you get [Char] here is that upper parses a Char and many turns that into a [Char]. I would write my own combinator along the lines of:
manyPacked = fmap pack . many
You could probably use type-level programming with type classes etc. to automatically choose between many and manyPacked depending on the expected return type, but I don't think that's worth it. (It would probably look a bit like Scala's CanBuildFrom.)
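Putting the combinator together with the parser from the question gives roughly the following sketch. The type signatures and the qualified import of Data.Text are additions for this example, and the sample input is made up.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.Text (Parser)

-- Run a Char parser zero or more times and pack the result into a Text.
manyPacked :: Parser Char -> Parser Text
manyPacked = fmap T.pack . many

-- A closing tag is on its own line: "</", some uppercase letters, then '>'.
closingTag :: Parser Text
closingTag = do
  _ <- char '\n'
  between (string "</") (char '>') (manyPacked upper)

main :: IO ()
main = print (parse closingTag "" (T.pack "\n</SOL>"))  -- Right "SOL"
```

The packing still happens explicitly, but only in one place, so the rest of the grammar can work with Text directly.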

Why does the Data.String.IsString typeclass only define one conversion?

Why does the Haskell base package only define the IsString class to have a conversion from String to 'like-string' value, and not define the inverse transformation, from 'like-string' value to String?
The class should be defined as:
class IsString a where
    fromString :: String -> a
    toString :: a -> String
ref: http://hackage.haskell.org/packages/archive/base/4.4.0.0/doc/html/Data-String.html
The reason is, IMHO, that IsString's primary purpose is to support string literals in Haskell source code (or (E)DSLs; see also Paradise: A two-stage DSL embedded in Haskell) via the OverloadedStrings language extension, analogous to how other polymorphic literals work (e.g. via fromRational for floating-point literals or fromInteger for integer literals).
The term IsString might be a bit misleading, as it suggests that the type-class represents string-like structures, whereas it's really just to denote types which have a quoted-string-representation in Haskell source code.
If you want toString :: a -> String, I think you're simply forgetting about show, whose type is show :: Show a => a -> String.
If you want to operate on a type that supports both a -> String and String -> a, you can simply put both type-class constraints on the function:
doubleConstraintedFunction :: (Show a, IsString a) => a -> .. -> .. -> a
Note that we deliberately avoid defining a type class whose functions could just as well be split across two classes. Therefore we don't put toString in IsString.
Finally, I should also mention Read, which provides read :: Read a => String -> a. You use read and show for very simple serialization. fromString from IsString has a different purpose: combined with the OverloadedStrings language pragma, it lets you very conveniently write code like "This is not a string" :: Text. (Text is an efficient data structure for strings.)
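A small sketch of a function that carries both constraints, as suggested above. The names orPlaceholder and describe, and the extra Eq constraint, are made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString)
import qualified Data.Text as T

-- IsString lets the literals below take whatever string-like type a is.
orPlaceholder :: (Eq a, IsString a) => a -> a
orPlaceholder s = if s == "" then "<empty>" else s

-- Show supplies the a -> String direction.
describe :: (Show a, IsString a, Eq a) => a -> String
describe = show . orPlaceholder

main :: IO ()
main = do
  putStrLn (describe ("" :: T.Text))       -- prints "<empty>" including the quotes
  putStrLn (describe ("hello" :: String))  -- prints "hello" including the quotes
  print (read "42" :: Int)                 -- 42
```

Note that show on a string-like value adds quotes and escapes, so it is not an exact inverse of fromString.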
