connecting http-conduit to xml-conduit - haskell

I'm struggling to convert a Response from http-conduit into an XML document via xml-conduit.
The doPost function takes an XML Document and posts it to the server. The server responds with an XML Document.
doPost queryDoc = do
    runResourceT $ do
        manager <- liftIO $ newManager def
        req <- liftIO $ parseUrl hostname
        let req2 = req
                { method = H.methodPost
                , requestHeaders = [(CI.mk $ fromString "Content-Type", fromString "text/xml" :: Ascii) :: Header]
                , redirectCount = 0
                , checkStatus = \_ _ -> Nothing
                , requestBody = RequestBodyLBS $ (renderLBS def queryDoc)
                }
        res <- http req2 manager
        return $ res
The following works and returns '200':
let pingdoc = Document (Prologue [] Nothing []) (Element "SYSTEM" [] []) []
Response status headers body <- doPost pingdoc
return (H.statusCode status)
However, when I try and parse the Response body using xml-conduit, I run into problems:
Response status headers body <- doPost xmldoc
let xmlRes' = parseLBS def body
The resulting compilation error is:
Couldn't match expected type `L.ByteString'
with actual type `Source m0 ByteString'
In the second argument of `parseLBS', namely `body'
In the expression: parseLBS def body
In an equation for `xmlRes'': xmlRes' = parseLBS def body
I've tried connecting the Source from http-conduit to the xml-conduit using $= and $$, but I'm not having any success.
Does anyone have any hints to point me in the right direction? Thanks in advance.
Neil

You could use httpLbs rather than http, so that it returns a lazy ByteString rather than a Source. parseLBS is named for what it takes: a lazy ByteString. However, it's probably best to use the conduit interface that both libraries are built on directly, as you mentioned. To do this, remove the runResourceT line from doPost and use the following to get an XML document:
xmlRes' <- runResourceT $ do
    Response status headers body <- doPost xmldoc
    body $$ sinkDoc def
This uses xml-conduit's sinkDoc function, connecting the Source from http-conduit to the Sink from xml-conduit.
Once they're connected, the complete pipeline has to be run using runResourceT, which ensures all allocated resources are released in a timely fashion. The problem with your original code is that it runs the ResourceT too early, from inside doPost; you should generally use runResourceT right at the point that you want an actual result out, because a pipeline has to run entirely within the scope of a single ResourceT.
By the way, res <- http req2 manager; return $ res can be simplified to just http req2 manager.
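For readers on a current http-conduit, the same idea (feeding the response body directly into xml-conduit's sinkDoc) can be sketched with Network.HTTP.Simple. This is a minimal sketch under assumptions, not the code from the question: it assumes a recent http-conduit and xml-conduit, the names xmlPost and postXml and the localhost URL are hypothetical, and the redirectCount and checkStatus settings from the question are left out.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map as Map
import Network.HTTP.Simple  -- Request, parseRequest, httpSink, setRequest* helpers
import Text.XML
  ( Document (..), Element (..), Prologue (..), def, renderLBS, sinkDoc )

-- Turn a base request into an XML POST carrying the rendered document.
-- (xmlPost is a hypothetical helper name.)
xmlPost :: Document -> Request -> Request
xmlPost doc =
    setRequestMethod "POST"
  . setRequestHeader "Content-Type" ["text/xml"]
  . setRequestBodyLBS (renderLBS def doc)

-- POST the document and stream the response body straight into the XML
-- parser; httpSink runs the whole pipeline and cleans up the connection.
postXml :: String -> Document -> IO Document
postXml url doc = do
  req <- parseRequest url
  httpSink (xmlPost doc req) (\_response -> sinkDoc def)

-- The ping document from the question; attributes are a Map in current xml-conduit.
pingDoc :: Document
pingDoc = Document (Prologue [] Nothing []) (Element "SYSTEM" Map.empty []) []

main :: IO ()
main = do
  reply <- postXml "http://localhost:8080/" pingDoc  -- hypothetical endpoint
  print (documentRoot reply)
```

Because httpSink takes the sink as an argument, the body is consumed while the connection is still open, which is the same constraint the answer describes for runResourceT.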

Related

Trouble working with `requestBody` (or `getRequestBodyChunk`)

Dependency versions
Wai: 3.2.2.1
Synopsis
The working of getRequestBodyChunk (https://hackage.haskell.org/package/wai-3.2.2.1/docs/Network-Wai.html#v:getRequestBodyChunk) is quite unclear to me. I understand that it reads the next chunk of bytes available in a request; and finally yields a Data.ByteString.empty indicating no more bytes are available.
Thus, given a req :: Request; I'm trying to create a Conduit from the request body as:
streamBody :: Wai.Request -> ConduitM () ByteString IO ()
streamBody req = repeatWhileMC (Wai.getRequestBodyChunk req) (Data.ByteString.empty /=)
However, this seems to terminate immediately, as I've seen tracing the outputs:
repeatWhileMC (traceShowId <$> Wai.getRequestBodyChunk req) (traceShowId . (Data.ByteString.empty /=))
The output is
""
False
And hence the stream terminates.
But I can verify that the request body is not empty by other means.
So I'm a little stumped about this and I have a few questions:
Does this mean that the request body has been already consumed?
Or does this mean that the request body wasn't chunked?
Or does it mean that for smaller requests; the chunked bytes are always empty?
Is Scotty overriding this especially for smaller sized requests (I cannot seem to find this anywhere in its code/docs.)
If I do something like this:
streamBody req =
  let req' = req { requestBody = pure "foobar" }
  in repeatWhileMC (Wai.getRequestBodyChunk req') (Data.ByteString.empty /=)
I do get a non-terminating stream of bytes, which leads me to suspect that either something earlier in the code consumes the request body, or the function returns an empty body to begin with.
A point to note is that this seems to happen with smaller requests only.
Another point to note is that I get the Wai.Request via Web.Scotty.Trans.request, which also has the body-related streaming helpers.
Can this also be documented? I believe the documentation for getRequestBodyChunk etc. can be improved with this information.
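The "read until an empty chunk" contract described above can be illustrated without wai at all. The sketch below is only a model: mkChunkSource and drain are hypothetical names, and the IORef-backed action merely stands in for a request body. It shows that once one consumer has drained the body, a second consumer sees the empty chunk immediately, which is the behaviour the question suspects.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import Data.IORef (atomicModifyIORef', newIORef)

-- Hypothetical stand-in for a request body: each call pops one chunk,
-- and every call after the last chunk returns the empty ByteString.
mkChunkSource :: [B.ByteString] -> IO (IO B.ByteString)
mkChunkSource chunks = do
  ref <- newIORef chunks
  pure $ atomicModifyIORef' ref $ \cs -> case cs of
    []         -> ([], B.empty)
    (c : rest) -> (rest, c)

-- Keep calling the action until it returns an empty chunk.
drain :: IO B.ByteString -> IO [B.ByteString]
drain next = do
  chunk <- next
  if B.null chunk then pure [] else (chunk :) <$> drain next

main :: IO ()
main = do
  next   <- mkChunkSource ["foo", "bar"]
  first  <- drain next
  second <- drain next
  print first   -- ["foo","bar"]
  print second  -- []
```

By contrast, an action like pure "foobar" never returns an empty chunk, so a loop of this shape would not terminate, matching the non-terminating stream seen in the experiment above.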

Haskell Proxy Post request

I have following code working with proxy for a GET Request:
import Control.Applicative ((<$>))
import Data.Maybe (fromJust)
import Network.Browser
import Network.HTTP
import Network.HTTP.Proxy (parseProxy)
main = do
    rsp <- browse $ do
        setProxy . fromJust $ parseProxy "127.0.0.1:8118"
        request $ getRequest "http://www.google.com"
    print $ rspBody <$> rsp
And this one for Post, but without proxy:
main = do
    r <- post "http://www.geocodeip.com" ["IP" := Data.ByteString.Lazy.Char8.pack "79.212.82.103"]
    let html = r ^. responseBody
    print html
But how do I make a POST request through the proxy? I don't get it. Please help me!
It's pretty simple if you keep track of what you're doing.
We need to use request but feed it a POST request rather than a GET request. To make these we use postRequestWithBody which Hackage tells us has the parameters
postRequestWithBody :: String          -- ^ URL to POST to
                    -> String          -- ^ Content-Type of body
                    -> String          -- ^ The body of the request
                    -> Request_String  -- ^ The constructed request
So replace request $ getRequest "http://www.google.com" with:
request $ postRequestWithBody "http://www.geocodeip.com/" "application/x-www-form-urlencoded" "IP=79.212.82.103"
...and you'll be good.
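Put together, the proxy setup from the question and the POST request from this answer might look like the sketch below. It assumes the HTTP package (Network.HTTP and Network.Browser); formBody is a hypothetical helper that builds the form-encoded body with urlEncodeVars instead of writing it by hand, and the proxy address and URL are the ones from the question.

```haskell
import Network.Browser (browse, request, setProxy)
import Network.HTTP (postRequestWithBody, rspBody, urlEncodeVars)
import Network.HTTP.Proxy (parseProxy)

-- Form-encoded request body built from key/value pairs (hypothetical helper).
formBody :: String
formBody = urlEncodeVars [("IP", "79.212.82.103")]

main :: IO ()
main =
  case parseProxy "127.0.0.1:8118" of
    Nothing    -> putStrLn "could not parse proxy address"
    Just proxy -> do
      -- request returns the final URI together with the response.
      (_uri, rsp) <- browse $ do
        setProxy proxy
        request $ postRequestWithBody
          "http://www.geocodeip.com/"
          "application/x-www-form-urlencoded"
          formBody
      putStrLn (rspBody rsp)
```

Matching on the result of parseProxy avoids the fromJust in the original snippet, so a malformed proxy string produces a message rather than a crash.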

http-conduit, snap and lazy IO

I have two HTTP servers talking to each other through a JSON API, both built with the Snap framework.
My first prototype contains a handler similar to this example handler:
import Data.ByteString (ByteString)
import Data.ByteString.Char8 as B (unwords, putStrLn)
import Data.ByteString.Lazy.Char8 as L (putStrLn)
import Control.Monad.IO.Class (liftIO)
import Data.Monoid ((<>))
import Snap.Core (getParam, modifyResponse, setHeader, writeLBS)
import Network.HTTP.Conduit
import Network.HTTP.Client (defaultManagerSettings)
exampleHandler :: AppHandler ()
exampleHandler = do
    resp <- liftIO $ do
        L.putStrLn "Begin request ..."
        initReq <- parseUrl "http://localhost:8001/api"
        manager <- newManager defaultManagerSettings
        let req = initReq { method = "GET"
                          , proxy = Nothing}
        r <- httpLbs req manager
        L.putStrLn "... finished request."
        return $ responseBody r
    liftIO . L.putStrLn $ "resp: " <> resp
    modifyResponse $ setHeader "Content-Type" "application/json"
    writeLBS $ "{ \"data\": \""<> resp <>"\" }"
If I issue an AJAX request, the upstream response is sent and received (I can see this because the server writes resp: testdata to the console), but the response written to the browser with writeLBS never arrives. If I change the last line to
writeLBS $ "{ \"data\": \""<> "something fixed" <>"\" }"
everything works like a charm. I think I am meeting one of the pitfalls of lazy IO, but I don't know how to remedy this.
I also tried a few variations without a single liftIO block, putting liftIO only where necessary.
EDIT
Based on the comment by @MichaelSnoyman I did some research regarding writeLBS and tried
modifyResponse $ setBufferingMode False
               . setHeader "Content-Type" "application/json"
writeLBS resp
because I thought buffering might be the problem. It is not.
I also tried setting the response body explicitly with setResponseBody:
let bb = enumBuilder . fromLazyByteString $ "{ \"data\": \""<> resp <>"\" }"
modifyResponse $ setBufferingMode False
               . setHeader "Content-Type" "application/json"
               . setResponseBody bb
This did not work either.
I have solved this issue. It was actually a problem with the JavaScript receiving the handwritten JSON (note to self: never do that again). There was a non-breaking space at the end of the input data that was not encoded correctly, and as I am a newbie at JS I didn't get that from the error message.
The intermediate solution is to add urlEncode and make a strict ByteString
let respB = urlEncode . L.toStrict $ C.responseBody resp
modifyResponse $ setBufferingMode False
               . setHeader "Content-Type" "application/json"
writeBS $ "{ \"data\": \"" <> respB <> "\" }"
Of course, you have to change the imports accordingly.
The long term solution is: write a proper from/toJSON instance and let the library deal with this.
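As a rough illustration of that long-term solution, the sketch below lets aeson build and escape the JSON instead of concatenating strings. It assumes the aeson package; wrapData is a hypothetical helper name, and the payload is taken as Text, so a lazy ByteString from httpLbs would first have to be decoded.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (Value, decode, encode, object, (.=))
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Text (Text)

-- Wrap upstream text in a JSON object; aeson takes care of quoting and escaping.
-- (wrapData is a hypothetical helper, not part of Snap or aeson.)
wrapData :: Text -> L.ByteString
wrapData payload = encode (object ["data" .= payload])

main :: IO ()
main = do
  -- A payload with an embedded quote and a trailing non-breaking space.
  let body = wrapData "quote \" and trailing nbsp\160"
  L.putStrLn body
  -- The encoded bytes parse back as JSON, which a hand-built string might not.
  print (decode body :: Maybe Value)
```

In the handler, the result could then be passed straight to writeLBS in place of the hand-assembled string.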

Accessing Mt Gox API via http-conduit-0.1.9.3: query string causes timeout

I'm trying to access the Mt Gox REST API using http-conduit. Queries that just have a path (e.g. https://data.mtgox.com/api/2/BTCUSD/money/ticker) work fine, but when I add a queryString to the request it times out.
So this works:
mtGoxRequest :: String -> QueryText -> Request m
mtGoxRequest p qt = def {
    secure = True,
    host = "data.mtgox.com",
    port = 443,
    method = "GET",
    path = fromString $ "api/2/" ++ p,
    queryString = renderQuery False $ queryTextToQuery qt,
    responseTimeout = Just 10000000
    }
currencyTicker :: Request m
currencyTicker = mtGoxRequest "BTCUSD/money/ticker" []
But this times out:
tradeStream :: UTCTime -> Request m
tradeStream t = mtGoxRequest
    "BTCUSD/money/trades/fetch"
    [("since", Just $ T.pack $ utcToGoxTime t)]
The difference seems to be in the use of a queryString: when I added the bogus query "foo=bar" to the currencyTicker that timed out too.
However all this works fine in the web browser: going to https://data.mtgox.com/api/2/BTCUSD/money/ticker?foo=bar instantly returns the correct error message instead of timing out. The trade fetch URL works as well, although I won't include a link because the "since" argument says how far back to go. Conversely, if I remove the queryString from the trades list request it correctly returns the entire available trade history.
So something about the http-conduit query string is obviously different. Anyone know what it might be?
Here is the Haskell Request object being sent (as printed by "Show"):
Request {
host = "data.mtgox.com"
port = 443
secure = True
clientCertificates = []
requestHeaders = []
path = "api/2/BTCUSD/money/trades/fetch"
queryString = "since=1367142721624293"
requestBody = RequestBodyLBS Empty
method = "GET"
proxy = Nothing
rawBody = False
redirectCount = 10
responseTimeout = Just 10000000
}
According to its returned headers Mt Gox is using cloudflare-nginx and PHP 5.
Edit: Forgot to mention that when I use http-conduit to send a request with a queryString to http://scooterlabs.com/echo I get the correct response as well, so it seems to be some interaction between the Mt Gox webserver and http-conduit.
Got it figured out. You need to add a User-Agent string. So
requestHeaders = [(CI.mk "User-Agent", "Test/0.0.1")],
in the middle of the request function makes it work.
$ time curl https://data.mtgox.com/api/2/BTCUSD/money/trades/fetch?since=1367142721624293
...
real 0m20.993s
Looks to me like everything is working correctly: the API call takes a while to return, so http-conduit throws a timeout exception since 20s is longer than 10s.
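Both suggestions above (sending a User-Agent header, and allowing more than 10 seconds) can be combined in one request. The sketch below is written against a current http-client (0.5 or later), which is an assumption: the question uses a much older http-conduit whose Request type and responseTimeout field differ. tradesRequest is a hypothetical name, the 30-second timeout is an arbitrary illustrative value, and the code only builds and prints the request without sending it.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as B
import Network.HTTP.Client
  ( Request, parseRequest_, requestHeaders, responseTimeout
  , responseTimeoutMicro, setQueryString )

-- Build the trades request with an explicit User-Agent and a longer timeout.
-- (tradesRequest is a hypothetical helper.)
tradesRequest :: B.ByteString -> Request
tradesRequest since =
  setQueryString [("since", Just since)] $
    (parseRequest_ "https://data.mtgox.com/api/2/BTCUSD/money/trades/fetch")
      { requestHeaders  = [("User-Agent", "Test/0.0.1")]
      , responseTimeout = responseTimeoutMicro 30000000  -- 30 s, in microseconds
      }

main :: IO ()
main = print (tradesRequest "1367142721624293")
```

Printing the request is a cheap way to compare what is actually being sent against what a browser or curl sends.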

Where is the breaking change?

I wrote a CRUD application to interface with JIRA. I ended up upgrading my Haskell environment, because cabal-dev doesn't solve everything. As a result, I've got some breakage: I get this error any time I run code that talks to JIRA.
Spike: HandshakeFailed (Error_Misc "user error (unexpected type received. expecting
handshake and got: Alert [(AlertLevel_Warning,UnrecognizedName)])")
After a little googling, I think this either has to do with tls or http-conduit which uses tls.
I'm currently using tls-1.1.2 and http-conduit-1.8.7.1
Previously I was using
tls-0.9.11 and http-conduit >= 1.5 && < 1.7 (not sure which exactly; the old install is gone).
This is where I believe the break is happening
manSettings :: ManagerSettings
manSettings = def { managerCheckCerts = \ _ _ _-> return CertificateUsageAccept }
This is what it used to look like:
manSettings :: ManagerSettings
manSettings = def { managerCheckCerts = \ _ _ -> return CertificateUsageAccept }
Here's the code that uses it
initialRequest :: forall (m :: * -> *). URI -> IO (Request m,Manager)
initialRequest uri = do
    initReq <- parseUrl uri           -- let the server tell you what the request header
                                      -- should look like
    manager <- newManager manSettings -- a Manager manages http connections
                                      -- we mod the settings to handle
                                      -- the SSL cert. See manSettings below.
    return (modReq initReq,manager)
  where modReq initReq = applyBasicAuth username password initReq
Let me know if I've left something out. I'm not sure at this point what broke between then and now.
It's a good guess about the error source, but very unlikely: managerCheckCerts simply uses the certificate package to inspect certificates for validity. The error message you're seeing seems to be coming from tls itself and indicates a failure in the data transport. It's probably a good idea to file a bug report with tls, preferably first by narrowing down the issue to a single HTTPS call that fails (or even better, using tls alone and demonstrating the same failure).

Resources