Haskell ByteString / Data.Binary.Get question

Haskell ByteString / Data.Binary.Get question - haskell

Attempting to use Data.Binary.Get and ByteString and not understanding what's happening. My code is below:
getSegmentParams :: Get (Int, L.ByteString)
getSegmentParams = do
seglen <- liftM fromIntegral getWord16be
params <- getByteString (seglen - 2)
return (seglen, params)
I get the following error against the third item of the return tuple, ie payload:
Couldn't match expected type `L.ByteString'
against inferred type `bytestring-0.9.1.4:Data.ByteString.Internal.ByteString'
Someone please explain to me the interaction between Data.Binary.Get and ByteStrings and how I can do what I'm intending. Thanks.

It says you expect the second element of the tuple to be a L.ByteString (I assume that L is from Data.ByteString.Lazy) but the getByteString routine returns a strict ByteString from Data.ByteString. You probably want to use getLazyByteString.

There are two ByteString data types: one is in Data.ByteString.Lazy and one is in Data.ByteString.
Given the L qualifying your ByteString, I presume you want the lazy variety, but getByteString is giving you a strict ByteString.
Lazy ByteStrings are internally represented by a list of strict ByteStrings.
Fortunately Data.ByteString.Lazy gives you a mechanism for turning a list of strict ByteStrings into a lazy ByteString.
If you define
import qualified Data.ByteString as S
strictToLazy :: S.ByteString -> L.ByteString
strictToLazy = L.fromChunks . return
you can change your code fragment to
getSegmentParams :: Get (Int, L.ByteString)
getSegmentParams = do
seglen <- liftM fromIntegral getWord16be
params <- getByteString (seglen - 2)
return (seglen, strictToLazy params)
and all should be right with the world.

Related

Haskell type error occurs when compiling two code snippets together that separately raise no error

I am having a type issue with Haskell, the program below throws the compile time error:
Couldn't match expected type ‘bytestring-0.10.8.2:Data.ByteString.Lazy.Internal.ByteString’ with actual type ‘Text’
Program is:
{-# LANGUAGE OverloadedStrings #-}
module Main where
...
import Control.Concurrent (MVar, newMVar, modifyMVar_, modifyMVar, readMVar)
import qualified Data.Text as T
import qualified Data.Text.IO as T
import qualified Network.WebSockets as WS
import Data.Map (Map)
import Data.Aeson (decode)
...
application :: MVar ServerState -> WS.ServerApp
application state pending = do
conn <- WS.acceptRequest pending
msg <- WS.receiveData conn
-- EITHER this line can be included
T.putStrLn msg
-- OR these two lines, but not both
decodedObject <- return (decode msg :: Maybe (Map String Int))
print decodedObject
...
It seems to me that the basic issue is that putStrLn expects Text whereas decode expects Bytetring.
What I don't get is why I can run this section of the code:
T.putStrLn msg
Or I can run this section of the code:
decodedObject <- return (decode msg :: Maybe (Map String Int))
print decodedObject
But not both together.
What is the proper way to resolve this issue in the program?
I guess this is something like Type Coercion, or Type Inference, or what would be Casting in other languages. The problem is I don't know how to phrase the problem clearly enough to look it up.
It's as if msg can be one of a number of Types, but as soon as it is forced to be one Type, it can't then be another...
I'm also not sure if this overlaps with Overloaded strings. I have the pragma and am compiling with -XOverloadedStrings
I'm quite a newbie, so hope this is a reasonable question.
Any advice gratefully received! Thanks

This is because WS.receiveData is polymorphic on its return type:
receiveData :: WebSocketsData a => Connection -> IO a
it only needs to be WebSocketsData a instance, which both Text and ByteString are. So the compiler just infers the type.
I suggest you just assume it's a ByteString, and convert in Text upon the putStrLn usage.

Thanks to everyone for their advice. My final understanding is that any value in Haskell can be polymorphic until you force it to settle on a type, at which point it can't be any other type (stupid, but I hadn't seen a clear example of that before).
In my example, WS.receiveData returns polymorphic IO a where a is an instance of class WebsocketData which itself is parameterised by a type which can be either Text or Bytestring.
Aeson decode expects a (lazy) Bytestring. Assuming that we settle on (lazy) Bytestring for our a, this means the first line that I mentioned before needs to become:
T.putStrLn $ toStrict $ decodeUtf8 msg
to convert the lazy ByteString to a strict Text. I can do this so long as I know the incoming websocket message is UTF8 encoded.
I may have got some wording wrong there, but think that's basically it.

How can I convert IO ByteString to IO String

I know how to convert a ByteString to a String with unpack but I'm struggling to figure out how to convert an IO ByteString (which is what I get back from the fetchHeader function in HaskellNet) to an IO String. I'm basically trying to do something like this
getAllHeadersForMessageUID :: IMapConnection -> UID -> IO String
getAllHeadersForMessageUID connection uid = do
headers <- fetchHeader connection uid
return (headers >>= BS.unpack)
The error message doesn't make sense to me
Couldn't match expected type ‘[BS.ByteString]’
with actual type ‘BS.ByteString’
In the first argument of ‘(>>=)’, namely ‘headers’
In the first argument of ‘return’, namely ‘(headers >>= BS.unpack)
I don't know why a list of ByteString is expected.

Try using return $ BS.unpack headers instead of return (headers >>= BS.unpack).
Or try return $ map BS.unpack headers if headers is a list of ByteStrings.
Besides the fact that it happens to type check (and I'm assuming BS.unpack headers works), here's a way to think about things:
headers is a pure value
BS.unpack is a pure function
headers >>= ... doesn't make sense because the LHS of >>= needs to be a monadic computation
... >>= BS.unpack doesn't make sense because the RHS of >>= needs to be a function which produces a monadic computation
BS.unpack headers is the string we want to return, but it's a pure value
we therefore use return to promote the pure value to a monadic computation
Update:
The following code shows that if fetchHeader has type IO [BS.ByteString], then your code will type check:
import Data.ByteString.Char8 as BS
fetchHeader :: IO [BS.ByteString] -- this works
-- fetchHeader :: IO BS.ByteString -- this doesn't
fetchHeader = undefined
foo :: IO String
foo = do
headers <- fetchHeader
return $ headers >>= BS.unpack
On the other hand, if you change its type to IO BS.ByteString you get the error you encountered.
Update 2:
Interestingly enough, when headers is a list of ByteStrings, the expression headers >>= BS.unpack does make sense and is equivalent to:
concat $ map BS.unpack headers

User5402's answer assumes the ByteString is pure ASCII (which is OK if you are the only person using your code, but there are several reasons why it's a bad idea if you intend to share it)
If the ByteString is encoded with UTF-8: you can convert it to String like this:
import qualified Codec.Binary.UTF8.String as UTF8
foo b = do
bs <- b
return $ UTF8.decode $ unpack bs
I'm not sure how to deal with other encodings such as windows codepages (other than by setting the encoding of a handle, which isn't applicable here).

differences between Lazy.ByteString and Lazy.Char8.ByteString

I am bit confused over the codes in real world haskell
import qualified Data.ByteString.Lazy.Char8 as L8
import qualified Data.ByteString.Lazy as L
matchHeader :: L.ByteString -> L.ByteString -> Maybe L.ByteString
matchHeader prefix str
| prefix `L8.isPrefixOf` str
= Just (L8.dropWhile isSpace (L.drop (L.length prefix) str))
| otherwise
= Nothing
It seems L and L8 can be used interchangeably somewhere in this function, compiles fine if I replace L with L8 especially for the type L.ByteString and L8.ByteString, I saw in hackage, they're linked to the same source, does that mean Data.ByteString.Lazy.Char8.ByteString is the same as Data.ByteString.Lazy.ByteString ? Why L8.isPrefixOf is used here but not L.isPrefixOf?

That's funny, I've used all the ByteStrings but never noticed (until you mentioned) it that the Char8 and Word8 versions are internally the same data type.
Once mentioned though, I had to go and look at the code.... The following line in Data/ByteString/Lazy/Char8.hs shows that not only are the data types the same, but many of the functions are reexported identically....
-- Functions transparently exported
import Data.ByteString.Lazy
(fromChunks, toChunks, fromStrict, toStrict
,empty,null,length,tail,init,append,reverse,transpose,cycle
,concat,take,drop,splitAt,intercalate,isPrefixOf,group,inits,tails,copy
,hGetContents, hGet, hPut, getContents
,hGetNonBlocking, hPutNonBlocking
,putStr, hPutStr, interact)
So it would seem that most of Data.ByteString.(Lazy.)?Char8 are just a convenience wrapper around Data.ByteString(.Lazy)?. This also explains to me why show has always created stringy output for Word8 ByteStrings.
Of course some stuff does differ, as you can see when you try to create a ByteString-
B.pack "abcd" -- This fails
B.pack [65, 66, 67, 68] -- output is "ABCD"
B8.pack "abcd" -- This works

According to the documentation, both Lazy.ByteString and Lazy.Char.ByteString are a space-efficient representation of a Word8 vector, supporting many efficient operations. So, internally they seem to be same and you can use them interchangeably.
But Lazy.Char.ByteString has additionally these characteristics:
All Chars will be truncated to 8 bits (So be careful!)
The Char8 interface to bytestrings provides an instance of IsString for the ByteString type, enabling you to use string literals, and have them implicitly packed to ByteStrings. (you should enable OverloadedStrings extension for this)

Reading a file into an array of strings

I'm pretty new to Haskell, and am trying to simply read a file into a list of strings. I'd like one line of the file per element of the list. But I'm running into a type issue that I don't understand. Here's what I've written for my function:
readAllTheLines hdl = (hGetLine hdl):(readAllTheLines hdl)
That compiles fine. I had thought that the file handle needed to be the same one returned from openFile. I attempted to simply show the list from the above function by doing the following:
displayFile path = show (readAllTheLines (openFile path ReadMode))
But when I try to compile it, I get the following error:
filefun.hs:5:43:
Couldn't match expected type 'Handle' with actual type 'IO Handle'
In the return type of a call of 'openFile'
In the first argument of 'readAllTheLines', namely
'(openFile path ReadMode)'
In the first argument of 'show', namely
'(readAllTheLines (openFile path ReadMode))'
So it seems like openFile returns an IO Handle, but hGetLine needs a plain old Handle. Am I misunderstanding the use of these 2 functions? Are they not intended to be used together? Or is there just a piece I'm missing?

Use readFile and lines for a better alternative.
readLines :: FilePath -> IO [String]
readLines = fmap lines . readFile
Coming back to your solution openFile returns IO Handle so you have to run the action to get the Handle. You also have to check if the Handle is at eof before reading something from that. It is much simpler to just use the above solution.
import System.IO
readAllTheLines :: Handle -> IO [String]
readAllTheLines hndl = do
eof <- hIsEOF hndl
notEnded eof
where notEnded False = do
line <- hGetLine hndl
rest <- readAllTheLines hndl
return (line:rest)
notEnded True = return []
displayFile :: FilePath -> IO [String]
displayFile path = do
hndl <- openFile path ReadMode
readAllTheLines hndl

To add on to Satvik's answer, the example below shows how you can utilize a function to populate an instance of Haskell's STArray typeclass in case you need to perform computations on a truly random access data type.
Code Example
Let's say we have the following problem. We have lines in a text file "test.txt", and we need to load it into an array and then display the line found in the center of that file. This kind of computation is exactly the sort situation where one would want to use a random access array over a sequentially structured list. Granted, in this example, there may not be a huge difference between using a list and an array, but, generally speaking, list accesses will cost O(n) in time whereas array accesses will give you constant time performance.
First, let's create our sample text file:
test.txt
This
is
definitely
a
test.
Given the file above, we can use the following Haskell program (located in the same directory as test.txt) to print out the middle line of text, i.e. the word "definitely."
Main.hs
{-# LANGUAGE BlockArguments #-} -- See footnote 1
import Control.Monad.ST (runST, ST)
import Data.Array.MArray (newArray, readArray, writeArray)
import Data.Array.ST (STArray)
import Data.Foldable (for_)
import Data.Ix (Ix) -- See footnote 2
populateArray :: (Integral i, Ix i) => STArray s i e -> [e] -> ST s () -- See footnote 3
populateArray stArray es = for_ (zip [0..] es) (uncurry (writeArray stArray))
middleWord' :: (Integral i, Ix i) => i -> STArray s i String -> ST s String
middleWord' arrayLength = flip readArray (arrayLength `div` 2)
middleWord :: [String] -> String
middleWord ws = runST do
let len = length ws
array <- newArray (0, len - 1) "" :: ST s (STArray s Int String)
populateArray array ws
middleWord' len array
main :: IO ()
main = do
ws <- words <$> readFile "test.txt"
putStrLn $ middleWord ws
Explanation
Starting with the top of Main.hs, the ST s monad and its associated function runST allow us to extract pure values from imperative-style computations with in-place updates in a referentially transparent manner. The module Data.Array.MArray exports the MArray typeclass as an interface for instantiating mutable array data types and provides helper functions for creating, reading, and writing MArrays. These functions can be used in conjunction with STArrays since there is an instance of MArray defined for STArray.
The populateArray function is the crux of our example. It uses for_ to "applicatively" loop over a list of tuples of indices and list elements to fill the given STArray with those list elements, producing a value of type () in the ST s monad.
The middleWord' helper function uses readArray to produce a String (wrapped in the ST s monad) that corresponds to the middle element of a given STArray of Strings.
The middleWord function instantiates a new STArray, uses populateArray to fill the array with values from a provided list of strings, and calls middleWord' to obtain the middle string in the array. runST is applied to this whole ST s monadic computation to extract the pure String result.
We finally use our middleWord function in main to find the middle word in the text file "test.txt".
Further Reading
Haskell's STArray is not the only way to work with arrays in Haskell. There are in fact Arrays, IOArrays, DiffArrays and even "unboxed" versions of all of these array types that avoid using the indirection of pointers to simply store "raw" values. There is a page on the Haskell Wikibook on this topic that may be worth some study. Before that, however, looking at the Wikibook page on mutable objects may give you some insight as to why the ST s monad allows us to safely compute pure values from functions that use imperative/destructive operations.
Footnotes
1 The BlockArguments language extension is what allows us to pass a do block directly to a function without any parentheses or use of the function application operator $.
2 As suggested by the Hackage documentation, Ix is a typeclass mainly meant to be used to specify types for indexing arrays.
3 The use of the Integral and Ix type constraints may be a bit of overkill, but it's used to make our type signatures as general as possible.

Many types of String (ByteString)

I wish to compress my application's network traffic.
According to the (latest?) "Haskell Popularity Rankings", zlib seems to be a pretty popular solution. zlib's interface uses ByteStrings:
compress :: ByteString -> ByteString
decompress :: ByteString -> ByteString
I am using regular Strings, which are also the data types used by read, show, and Network.Socket:
sendTo :: Socket -> String -> SockAddr -> IO Int
recvFrom :: Socket -> Int -> IO (String, Int, SockAddr)
So to compress my strings, I need some way to convert a String to a ByteString and vice-versa.
With hoogle's help, I found:
Data.ByteString.Char8 pack :: String -> ByteString
Trying to use it:
Prelude Codec.Compression.Zlib Data.ByteString.Char8> compress (pack "boo")
<interactive>:1:10:
Couldn't match expected type `Data.ByteString.Lazy.Internal.ByteString'
against inferred type `ByteString'
In the first argument of `compress', namely `(pack "boo")'
In the expression: compress (pack "boo")
In the definition of `it': it = compress (pack "boo")
Fails, because (?) there are different types of ByteString ?
So basically:
Are there several types of ByteString? What types, and why?
What's "the" way to convert Strings to ByteStrings?
Btw, I found that it does work with Data.ByteString.Lazy.Char8's ByteString, but I'm still intrigued.

There are two kinds of bytestrings: strict (defined in Data.Bytestring.Internal) and lazy (defined in Data.Bytestring.Lazy.Internal). zlib uses lazy bytestrings, as you've discovered.

The function you're looking for is:
import Data.ByteString as BS
import Data.ByteString.Lazy as LBS
lazyToStrictBS :: LBS.ByteString -> BS.ByteString
lazyToStrictBS x = BS.concat $ LBS.toChunks x
I expect it can be written more concisely without the x. (i.e. point-free, but I'm new to Haskell.)

A more efficient mechanism might be to switch to a full bytestring-based layer:
network.bytestring for bytestring sockets
lazy bytestrings for compressoin
binary of bytestring-show to replace Show/Read

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Haskell ByteString / Data.Binary.Get question - haskell

It says you expect the second element of the tuple to be a L.ByteString (I assume that L is from Data.ByteString.Lazy) but the getByteString routine returns a strict ByteString from Data.ByteString. You probably want to use getLazyByteString.

Related

Haskell type error occurs when compiling two code snippets together that separately raise no error

How can I convert IO ByteString to IO String

differences between Lazy.ByteString and Lazy.Char8.ByteString

Reading a file into an array of strings

Many types of String (ByteString)

Categories

Resources