When I run this code, I get a decode error from Data.Text. What am I doing wrong?
import Data.Text (Text, pack, unpack)
import Data.Text.Encoding (decodeUtf8)
import Data.ByteString (ByteString)
import System.Entropy
randBS :: IO ByteString
randBS = do
    randBytes <- getEntropy 2048
    return randBytes
main :: IO ()
main = do
    r <- randBS
    putStrLn $ unpack $ decodeUtf8 r
Runtime Error:
Cannot decode byte '\xc4': Data.Text.Internal.Encoding.Fusion.streamUtf8:
Invalid UTF-8 stream
I would like to generate some random bytes that will be used as an auth token.
I am on Mac OS X (Yosemite) and GHC Version 7.10.1
randBS returns random bytes, not UTF-8 encoded data!
What you have is not a representation of Text, so whichever decoding function you use, you will run into a decoding error. You'll have to use something like decodeUtf8With with an error handler that replaces each invalid byte with the character that has the same code point.
Something like:
import Data.Text (Text, pack, unpack)
import Data.Text.Encoding (decodeUtf8With)
import Data.ByteString (ByteString)
import Data.Char (chr)
import Control.Applicative ((<$>))
import System.Entropy
-- Replace each invalid byte with the Char that has the same code point.
handler _ x = chr . fromIntegral <$> x
randBS :: IO ByteString
randBS = do
    randBytes <- getEntropy 2048
    return randBytes
main :: IO ()
main = do
    r <- randBS
    putStrLn $ unpack $ decodeUtf8With handler r
Not tested; I don't have GHC installed at the moment :s
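To see the failure without random input, here is a minimal sketch that uses only the text and bytestring packages. The sample bytes and the names badBytes, strictResult and lenientResult are made up for illustration; decodeUtf8' and lenientDecode are shown as an alternative to the custom handler above, not as the only way to do it.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Either (isLeft)
import Data.Text.Encoding (decodeUtf8', decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical input: 'h', then 0xFF (a byte that never occurs in UTF-8), then 'i'.
badBytes :: BS.ByteString
badBytes = BS.pack [0x68, 0xFF, 0x69]

-- decodeUtf8' reports invalid input as a Left instead of throwing.
strictResult :: Bool
strictResult = isLeft (decodeUtf8' badBytes)

-- lenientDecode substitutes U+FFFD (the replacement character) for the bad byte.
lenientResult :: T.Text
lenientResult = decodeUtf8With lenientDecode badBytes
```

Either way the decoded text is no longer a faithful copy of the random bytes, which is why it is a poor fit for an auth token.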
Probably even better is to simply use hexadecimal encoding instead of UTF-8 plus an error handler. You can do so with the base16-bytestring library: first use encode :: ByteString -> ByteString to obtain a representation that contains only ASCII characters:
import Data.Text (Text, pack, unpack)
import Data.ByteString (ByteString)
import Data.Text.Encoding (decodeUtf8)
import Data.ByteString.Base16 (encode)
import System.Entropy
--- ... randBS as before
main = do
    r <- randBS
    putStrLn $ unpack $ decodeUtf8 $ encode r
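If you would rather not add a dependency, the same idea can be sketched with the Builder that ships with bytestring. This is an alternative to base16-bytestring, not what the answer above uses; hexToken and sampleToken are illustrative names, and the fixed bytes stand in for the output of getEntropy.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as LBS
import Data.ByteString.Builder (byteStringHex, toLazyByteString)

-- Render every byte as two lower-case hex digits, so the result is pure ASCII.
hexToken :: BS.ByteString -> BS.ByteString
hexToken = LBS.toStrict . toLazyByteString . byteStringHex

-- Hypothetical fixed input standing in for random bytes.
sampleToken :: String
sampleToken = BC.unpack (hexToken (BS.pack [0x00, 0xC4, 0xFF]))
```

Because the output contains only the characters 0-9 and a-f, unpacking or decoding it can never hit an invalid byte.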
How do I read one byte from STDIN?
I tried searching for IO Word8 on Hoogle but there is nothing useful.
The closest I can find is System.IO.getChar, but it reads a Char not a Word8.
I think the easiest option is to use hGet from Data.ByteString like this:
import qualified Data.ByteString as B
import System.IO (stdin)
import Data.Word (Word8)
getByte :: IO Word8
getByte = B.head <$> B.hGet stdin 1
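One caveat with B.head: hGet returns an empty ByteString at end of input, and B.head throws on an empty ByteString. A minimal sketch of an EOF-aware variant follows; the names firstByte, hGetByte and getByteMaybe are invented for this example.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO (Handle, stdin)

-- First byte of a chunk, or Nothing when the chunk is empty (end of input).
firstByte :: B.ByteString -> Maybe Word8
firstByte = fmap fst . B.uncons

-- Read one byte from a handle without crashing at end of input.
hGetByte :: Handle -> IO (Maybe Word8)
hGetByte h = firstByte <$> B.hGet h 1

-- The stdin version of the helper above.
getByteMaybe :: IO (Maybe Word8)
getByteMaybe = hGetByte stdin
```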
You can alternatively use hSetBinaryMode like this:
import System.IO
import Data.Word (Word8)
getByte :: IO Word8
getByte = do
    hSetBinaryMode stdin True
    c <- getChar
    hSetBinaryMode stdin False
    pure $! fromIntegral (fromEnum c)
I'm trying to use RSA from the cryptonite package to handle encryption, and I'm trying to create an X509 certificate (Data.X509) from a PEM-encoded, base64 certificate string so that I can access the public and private keys. However, I keep getting a "StreamConstructionWrongSize" error. I think I'm missing something when reading the base64-encoded string in. I've included a snippet of the code. Perhaps someone can spot the mistake. I'd really appreciate any help.
import Data.String.Conversions (cs)
import Crypto.Hash (hash, SHA256(..), Digest)
import Data.ByteString (ByteString)
import Data.Text.Encoding (encodeUtf8)
import qualified Data.Text.IO as TIO
import System.IO (hFlush, stdout)
import Data.ByteArray.Encoding (convertToBase, Base (Base64))
import Data.X509
main :: IO ()
main = do
let certBS = encodeUtf8 "MIIFXjCCBEagAwIBAgIIKPOmOJqfnTIwDQYJKoZIhvcNAQELBQAwgbQxCzAJBgNVBAYTAlVTMRAwDgYDVQQIEwdBcml6b25hMRMwEQYDVQQHEwpTY290dHNkYWxlMRowGAYDVQQKExFHb0RhZGR5LmNvbSwgSW5jLjEtMCsGA1UECxMkaHR0cDovL2NlcnRzLmdvZGFkZHkuY29tL3JlcG9zaXRvcnkvMTMwMQYDVQQDEypHbyBEYWRkeSBTZWN1cmUgQ2VydGlmaWNhdGUgQXV0aG9yaXR5IC0gRzIwHhcNMTYwMjA5MjE1NzM5WhcNMTkwMzA3MjIzNTUxWjBrMQswCQYDVQQGEwJVUzENMAsGA1UECBMEVXRhaDEVMBMGA1UEBxMMU291dGggSm9yZGFuMRcwFQYDVQQKEw5BbGxlZ2lhbmNlIEluYzEdMBsGA1UEAwwUKi5hbGxlZ2lhbmNldGVjaC5jb20wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDHWaH6ziZnWy3Uy/wmM7zbHqrg/AmdsWk54x6wpe7+B2USbTbZg56rGjEL/I1Pe/PtcbM8uiPq1TshMxwePqWFCqKfJrj6/RS9/gKpWOwfh+Z2Qk78L3DQH97njDLJAYBf68MX0eKhM0PGBo+1BqKfAb9C3hEltROuABPEBO7GMFecZaDtZ6lDkYBidRs7nOqQnU91X8ly376U0dAEnTghVmDSh5v8yKvMm0AhOd8S7EulfXEg+cj6Y996pM56bDbv2H7NZNv7QD38FY2IQWTWT46z0riF/D93ilVF62cvNXtUwj2kx581Z/e4eB7+Q3cLFFG1KGLgnvDKIguuxK5LAgMBAAGjggG6MIIBtjAMBgNVHRMBAf8EAjAAMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjAOBgNVHQ8BAf8EBAMCBaAwNQYDVR0fBC4wLDAqoCigJoYkaHR0cDovL2NybC5nb2RhZGR5LmNvbS9nZGlnMnMyLTIuY3JsMFMGA1UdIARMMEowSAYLYIZIAYb9bQEHFwIwOTA3BggrBgEFBQcCARYraHR0cDovL2NlcnRpZmljYXRlcy5nb2RhZGR5LmNvbS9yZXBvc2l0b3J5LzB2BggrBgEFBQcBAQRqMGgwJAYIKwYBBQUHMAGGGGh0dHA6Ly9vY3NwLmdvZGFkZHkuY29tLzBABggrBgEFBQcwAoY0aHR0cDovL2NlcnRpZmljYXRlcy5nb2RhZGR5LmNvbS9yZXBvc2l0b3J5L2dkaWcyLmNydDAfBgNVHSMEGDAWgBRAwr0njsw0gzCiM9f7bLPwtCyAzjAzBgNVHREELDAqghQqLmFsbGVnaWFuY2V0ZWNoLmNvbYISYWxsZWdpYW5jZXRlY2guY29tMB0GA1UdDgQWBBSFAVaoKHe4qj41Axk07GZrLVx38zANBgkqhkiG9w0BAQsFAAOCAQEApsEOT1kL51QIQbpQjkamjQ1y+HoEMXd7npM1YMfaLxL+v4b6qShMamtNUWac+aJ5ul8849UBhPj3nNfLz6PJtdFmztKOHs58Th8lz6ksSDt1z/GRi3/EcWR7beBuDD2AKFGnckcSkCls+lOBSh/BxAIKPdlD3wzUEpqLuD1scucRfcjwt6X0K3otthMJ9PZ9tJQMSOLnUFKLvG+ZFlnkY+K3pmKWsk2ZMrua2m9wvqVXq4ZgEKQ6xr91HYkPCjfLt14ExnL3vNdXS0DR7LtjJDyXJeFkT0Ev81W9c5dk7gzbwg+EM4UYxW6qz+7WyHvV+uLNnHNmwm1b17tVfYvu3w=="
cert = decodeSignedCertificate certBS
case cert of
Left st -> putStrLn $ "Error reading cert: " ++ st
Right scrt -> do
let crt = getCertificate scrt
putStrLn $ show crt
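Update: I found the solution to my question. The base64 body has to be wrapped in the PEM BEGIN/END lines and parsed with pemParseBS; pemContent then yields the decoded bytes that decodeSignedObject expects. I hope this helps others who have the same problem.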
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Data.String.Conversions (cs)
import Crypto.Hash (hash, SHA256(..), Digest)
import qualified Data.ByteString as BS (ByteString, readFile)
import Data.Text.Encoding (encodeUtf8)
import qualified Data.Text.IO as TIO
import System.IO (hFlush, stdout)
import Data.ByteArray.Encoding (convertToBase, Base (Base64))
import Data.List (head)
import Data.Text (Text(..))
----- Read file import
import Data.Monoid ((<>))
import Control.Applicative
import Data.ASN1.Types
import Data.ASN1.BinaryEncoding
import Data.ASN1.Encoding
import Data.Maybe
import qualified Data.X509 as X509
import Data.X509.Memory (pemToKey)
import Data.PEM (pemParseLBS, pemParseBS, pemWriteBS, pemContent, pemName, PEM)
import qualified Data.ByteString.Lazy as L
-------------------------
main :: IO ()
main = do
--pems <- readPEMs "./certs/test.pem" -- read from file
let cert1 = "-----BEGIN CERTIFICATE-----\n" ++ "MIIFXjCCBEagAwIBAgIIKPOmOJqfnTIwDQYJKoZIhvcNAQELBQAwgbQxCzAJBgNVBAYTAlVTMRAwDgYDVQQIEwdBcml6b25hMRMwEQYDVQQHEwpTY290dHNkYWxlMRowGAYDVQQKExFHb0RhZGR5LmNvbSwgSW5jLjEtMCsGA1UECxMkaHR0cDovL2NlcnRzLmdvZGFkZHkuY29tL3JlcG9zaXRvcnkvMTMwMQYDVQQDEypHbyBEYWRkeSBTZWN1cmUgQ2VydGlmaWNhdGUgQXV0aG9yaXR5IC0gRzIwHhcNMTYwMjA5MjE1NzM5WhcNMTkwMzA3MjIzNTUxWjBrMQswCQYDVQQGEwJVUzENMAsGA1UECBMEVXRhaDEVMBMGA1UEBxMMU291dGggSm9yZGFuMRcwFQYDVQQKEw5BbGxlZ2lhbmNlIEluYzEdMBsGA1UEAwwUKi5hbGxlZ2lhbmNldGVjaC5jb20wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDHWaH6ziZnWy3Uy/wmM7zbHqrg/AmdsWk54x6wpe7+B2USbTbZg56rGjEL/I1Pe/PtcbM8uiPq1TshMxwePqWFCqKfJrj6/RS9/gKpWOwfh+Z2Qk78L3DQH97njDLJAYBf68MX0eKhM0PGBo+1BqKfAb9C3hEltROuABPEBO7GMFecZaDtZ6lDkYBidRs7nOqQnU91X8ly376U0dAEnTghVmDSh5v8yKvMm0AhOd8S7EulfXEg+cj6Y996pM56bDbv2H7NZNv7QD38FY2IQWTWT46z0riF/D93ilVF62cvNXtUwj2kx581Z/e4eB7+Q3cLFFG1KGLgnvDKIguuxK5LAgMBAAGjggG6MIIBtjAMBgNVHRMBAf8EAjAAMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjAOBgNVHQ8BAf8EBAMCBaAwNQYDVR0fBC4wLDAqoCigJoYkaHR0cDovL2NybC5nb2RhZGR5LmNvbS9nZGlnMnMyLTIuY3JsMFMGA1UdIARMMEowSAYLYIZIAYb9bQEHFwIwOTA3BggrBgEFBQcCARYraHR0cDovL2NlcnRpZmljYXRlcy5nb2RhZGR5LmNvbS9yZXBvc2l0b3J5LzB2BggrBgEFBQcBAQRqMGgwJAYIKwYBBQUHMAGGGGh0dHA6Ly9vY3NwLmdvZGFkZHkuY29tLzBABggrBgEFBQcwAoY0aHR0cDovL2NlcnRpZmljYXRlcy5nb2RhZGR5LmNvbS9yZXBvc2l0b3J5L2dkaWcyLmNydDAfBgNVHSMEGDAWgBRAwr0njsw0gzCiM9f7bLPwtCyAzjAzBgNVHREELDAqghQqLmFsbGVnaWFuY2V0ZWNoLmNvbYISYWxsZWdpYW5jZXRlY2guY29tMB0GA1UdDgQWBBSFAVaoKHe4qj41Axk07GZrLVx38zANBgkqhkiG9w0BAQsFAAOCAQEApsEOT1kL51QIQbpQjkamjQ1y+HoEMXd7npM1YMfaLxL+v4b6qShMamtNUWac+aJ5ul8849UBhPj3nNfLz6PJtdFmztKOHs58Th8lz6ksSDt1z/GRi3/EcWR7beBuDD2AKFGnckcSkCls+lOBSh/BxAIKPdlD3wzUEpqLuD1scucRfcjwt6X0K3otthMJ9PZ9tJQMSOLnUFKLvG+ZFlnkY+K3pmKWsk2ZMrua2m9wvqVXq4ZgEKQ6xr91HYkPCjfLt14ExnL3vNdXS0DR7LtjJDyXJeFkT0Ev81W9c5dk7gzbwg+EM4UYxW6qz+7WyHvV+uLNnHNmwm1b17tVfYvu3w==" ++ "\n-----END CERTIFICATE-----"
pems <- readPEM cert1
let certx = X509.decodeSignedObject $ pemContent $ head pems
case certx of
Left st1 -> putStrLn $ "Error reading cert: " ++ st1
Right scrt1 -> do
let crt1 = X509.getCertificate scrt1
print crt1
readPEM :: String -> IO [PEM]
readPEM content =
    return $ either error id $ pemParseBS $ cs content
readPEMs :: FilePath -> IO [PEM]
readPEMs filepath = do
    content <- BS.readFile filepath
    return $ either error id $ pemParseBS content
This is a follow-up to this earlier question. I have a conduit source (from Network.HTTP.Conduit) that yields strict ByteStrings. I would like to recombine them into larger chunks (to send over the network to another client, after another encoding step and conversion to a lazy ByteString). I wrote the chunksOfAtLeast conduit below, derived from the answer to that earlier question, and it seems to work pretty well. I am wondering whether there is any further scope for improving it performance-wise.
import Data.Conduit as C
import Control.Monad.IO.Class
import Control.Monad
import Data.Conduit.Combinators as CC
import Data.Conduit.List as CL
import Data.ByteString.Lazy as LBS hiding (putStrLn)
import Data.ByteString as BS hiding (putStrLn)
chunksOfAtLeast :: Monad m => Int -> Conduit BS.ByteString m BS.ByteString
chunksOfAtLeast chunkSize =
    loop
  where
    loop = do
        bs <- takeE chunkSize =$= ((BS.concat . ($ [])) <$> CL.fold (\front next -> front . (next:)) id)
        unless (BS.null bs) $ do
            yield bs
            loop
main = do
    yieldMany ["hello", "there", "world!"] $$ chunksOfAtLeast 8 =$ CL.mapM_ Prelude.print
Getting optimal performance is always a case of trying something and benchmarking it, so I can't tell you with certainty that I'm offering you something more efficient. That said, combining smaller chunks of data into larger chunks is a primary goal of builders, so leveraging them may be more efficient. Here's an example:
{-# LANGUAGE OverloadedStrings #-}
import Conduit
import Data.ByteString (ByteString)
import Data.ByteString.Builder (byteString)
import Data.Conduit.ByteString.Builder
bufferChunks :: Conduit ByteString IO ByteString
bufferChunks = mapC byteString =$= builderToByteString
main :: IO ()
main = yieldMany ["hello", "there", "world!"] $$ bufferChunks =$ mapM_C print
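The builder idea can also be tried outside conduit. The following is a small sketch that uses only the bytestring package; combine and combined are illustrative names, and the three input chunks are the same ones used above.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as LBS
import Data.ByteString.Builder (byteString, toLazyByteString)

-- Feed every small chunk into one Builder; when the builder is run it
-- copies the pieces into larger buffers.
combine :: [ByteString] -> LBS.ByteString
combine = toLazyByteString . foldMap byteString

-- The three sample chunks, recombined and flattened for inspection.
combined :: ByteString
combined = LBS.toStrict (combine (map BC.pack ["hello", "there", "world!"]))
```

LBS.toChunks (combine ...) lets you inspect how the pieces were regrouped; whether this beats chunksOfAtLeast for your workload is something only a benchmark can tell.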
In my Haskell Program I need to work with Strings and ByteStrings:
import Data.ByteString.Lazy as BS (ByteString)
import Data.ByteString.Char8 as C8 (pack)
import Data.Char (chr)
stringToBS :: String -> ByteString
stringToBS str = C8.pack str
bsToString :: BS.ByteString -> String
bsToString bs = map (chr . fromEnum) . BS.unpack $ bs
bsToString works fine, but stringToBS fails to compile with the following error:
Couldn't match expected type ‘ByteString’
with actual type ‘Data.ByteString.Internal.ByteString’
NB: ‘ByteString’ is defined in ‘Data.ByteString.Lazy.Internal’
‘Data.ByteString.Internal.ByteString’
is defined in ‘Data.ByteString.Internal’
In the expression: pack str
In an equation for ‘stringToBS’: stringToBS str = pack str
But I need it to be the ByteString from Data.ByteString.Lazy as BS (ByteString), because the other functions in my code work with that type.
Any idea how to solve my problem?
You are working with both strict ByteStrings and lazy ByteStrings, which are two different types.
This import:
import Data.ByteString.Lazy as BS (ByteString)
makes ByteString refer to the lazy ByteString type, so the type signature of your stringToBS doesn't match its definition:
stringToBS :: String -> ByteString -- refers to lazy ByteStrings
stringToBS str = C8.pack str -- refers to strict ByteStrings
I think it would be a better idea to use import qualified like this:
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Char8 as BS
and use BS.ByteString and LBS.ByteString to refer to strict / lazy ByteStrings.
You can convert between the lazy and strict versions using fromStrict and toStrict (both functions are in Data.ByteString.Lazy).
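Putting the qualified imports and the conversion functions together, a sketch of the two helpers from the question could look like this. It assumes, as the question does, that all characters fit into one byte (the Char8 functions truncate anything above code point 255); stringToBS' is an extra name added only to show the fromStrict route.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Char8 as LC8

-- Build the lazy ByteString directly with the lazy Char8 module ...
stringToBS :: String -> LBS.ByteString
stringToBS = LC8.pack

-- ... or pack a strict ByteString first and convert it afterwards.
stringToBS' :: String -> LBS.ByteString
stringToBS' = LBS.fromStrict . BS.pack

-- Inverse direction, again one Char per byte.
bsToString :: LBS.ByteString -> String
bsToString = LC8.unpack
```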
Good day. I have a MongoDB database filled with some data, and I made sure the data is stored in the correct charset. To fetch the data I use the following snippet:
{-# LANGUAGE OverloadedStrings #-}
import Network.Wai
import Network.Wai.Handler.Warp (run)
import Data.Enumerator (Iteratee (..))
import Data.Either (either)
import Control.Monad (join)
import Data.Maybe (fromMaybe)
import Network.HTTP.Types (statusOK, status404)
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Char8 (unpack)
import Data.ByteString.Lazy.Char8 (pack)
import qualified Data.Text.Lazy as T
import Data.Text (Text(..))
import Control.Monad.IO.Class (liftIO, MonadIO)
import Data.Aeson (encode)
import qualified Data.Map as Map
import qualified Database.MongoDB as DB
application dbpipe req = do
    case unpack $ rawPathInfo req of
        "/items" -> itemsJSON dbpipe req
        _ -> return $ responseLBS status404 [("Content-Type", "text/plain")] "404"
indexPage :: Iteratee B.ByteString IO Response
indexPage = do
    page <- liftIO $ processTemplate "templates/index.html" []
    return $ responseLBS statusOK [("Content-Type", "text/html; charset=utf-8")] page
processTemplate f attrs = do
    page <- L.readFile f
    return page
itemsJSON :: DB.Pipe -> Request -> Iteratee B.ByteString IO Response
itemsJSON dbpipe req = do
    dbresult <- liftIO $ rundb dbpipe $ DB.find (DB.select [] $ tu "table") >>= DB.rest
    let docs = either (const []) id dbresult
    -- liftIO $ L.putStrLn $ encode $ show $ map docToMap docs
    return $ responseLBS statusOK [("Content-Type", "text/plain; charset=utf-8")]
        (encode $ map docToMap docs)
docToMap doc = Map.fromList $ map (\f -> (T.dropAround (== '"') $ T.pack $ show $ DB.label f, T.dropAround (== '"') $ T.pack $ show $ DB.value f)) doc
main = do
    pipe <- DB.runIOE $ DB.connect $ DB.host "127.0.0.1"
    run 3000 $ application pipe
rundb pipe act = DB.access pipe DB.master database act
tu :: B.ByteString -> UString
-- unpack is the unqualified import from Data.ByteString.Char8 above
tu = DB.u . unpack
The result is surprising: DB.label works well, but DB.value gives me the non-ASCII characters as escape codes, so the result looks like this:
curl http://localhost:3000/items gives:
[{"Марка": "\1058\1080\1087 \1087\1086\1076",
"Model": "BD-W LG BP06LU10 Slim \1058\1080\1087 \1087\1086\1076\1082\1083\1102\1095\1077\1085\1080\1103"},
...
]
This happens both when I print the data and when I return it encoded as JSON.
Any idea how to correctly extract values from the MongoDB driver?
The following line confirms that aeson's encoding works properly (using the utf8-string library to read UTF-8 data off the lazy bytestring back into a Haskell string):
> putStrLn $ Data.ByteString.Lazy.UTF8.toString $ encode $ ("\1058\1080\1087 \1087\1086\1076",12)
["Тип под",12]
Looking at your code more closely, I see the real problem. You're calling T.pack $ show $ DB.value. This renders the text as escaped code points and then packs those escapes into a Text object. The fix is to switch from show to something smarter. Look at this (untested):
smartShow :: DB.Value -> Text
smartShow (String s) = Data.Text.Encoding.decodeUtf8 $ Data.CompactString.UTF8.toByteString s
smartShow x = T.pack $ show x
Obviously to handle the recursive cases, etc. you need to be smarter than that, but that's the general notion...
In fact, the "best" thing to do is to write a function of BSON -> JSON directly, rather than go through any intermediate structures at all.
Everything is working as expected -- only your expectations are wrong. =)
What you're seeing there are not raw Strings; they are Strings that have been escaped into the printable ASCII range by the show function, which print calls:
print = putStrLn . show
Never fear: in memory, the string that prints as "\1058" is in fact a single Unicode codepoint long. You can observe this by printing the length of one of the Strings you're interested in and comparing that to the number of Unicode codepoints you expect.
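The length check described above can be sketched in a few lines using only the Prelude. The names sample and escaped are illustrative; the string is the first three characters of the value from the question's output.

```haskell
-- Three Cyrillic characters, written with the same escapes that show prints.
sample :: String
sample = "\1058\1080\1087"

-- show wraps the string in quotes and escapes every non-ASCII character.
escaped :: String
escaped = show sample

-- In memory the string has three characters; only its rendering is long.
sampleLength :: Int
sampleLength = length sample
```

On a terminal with a UTF-8 locale, putStrLn sample prints the characters themselves, whereas print sample prints the escaped form held in escaped.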