Trouble working with `requestBody` (or `getRequestBodyChunk`) - haskell

Dependency versions
Wai: 3.2.2.1
Synopsis
How getRequestBodyChunk (https://hackage.haskell.org/package/wai-3.2.2.1/docs/Network-Wai.html#v:getRequestBodyChunk) works is quite unclear to me. I understand that it reads the next chunk of bytes available in a request, and finally yields a Data.ByteString.empty to indicate that no more bytes are available.
Thus, given a req :: Request, I'm trying to create a Conduit from the request body as:
streamBody :: Wai.Request -> ConduitM () ByteString IO ()
streamBody req = repeatWhileMC (Wai.getRequestBodyChunk req) (Data.ByteString.empty /=)
However, this seems to terminate immediately, as I've seen tracing the outputs:
repeatWhileMC (traceShowId <$> Wai.getRequestBodyChunk req) (traceShowId . (Data.ByteString.empty /=))
The output is
""
False
And hence the stream terminates.
But I can verify that the request body is not empty by other means.
So I'm a little stumped about this and I have a few questions:
Does this mean that the request body has been already consumed?
Or does this mean that the request body wasn't chunked?
Or does it mean that for smaller requests, the chunked bytes are always empty?
Is Scotty overriding this, especially for smaller requests? (I cannot seem to find this anywhere in its code/docs.)
If I do something like this:
streamBody req =
  let req' = req { requestBody = pure "foobar" }
  in repeatWhileMC (Wai.getRequestBodyChunk req') (Data.ByteString.empty /=)
I do get a non-terminating stream of bytes, which leads me to suspect that either something before this part of the code consumes the request body, or that function returns an empty body to begin with.
A point to note is that this seems to happen with smaller requests only.
Another point to note is that I get the Wai.Request via Web.Scotty.Trans.request, which also has the body-related streaming helpers.
Can this also be documented? I believe the documentation for getRequestBodyChunk etc. can be improved with this information.
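For what it's worth, the contract described above (chunks until the body is exhausted, then the empty string) can be modelled without WAI at all. The following is only a sketch using base and bytestring: mkBody and drain are hypothetical names, and mkBody is a stand-in for a request body, not the real getRequestBodyChunk.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.IORef (atomicModifyIORef', newIORef)

-- | Hypothetical stand-in for a request body: an action that returns the
-- next chunk, and the empty string once everything has been read.
mkBody :: [ByteString] -> IO (IO ByteString)
mkBody chunks = do
  ref <- newIORef chunks
  pure $ atomicModifyIORef' ref $ \cs -> case cs of
    []         -> ([], BS.empty)
    (c : rest) -> (rest, c)

-- | Read chunks until the first empty one, as the repeatWhileMC loop does.
drain :: IO ByteString -> IO [ByteString]
drain next = do
  chunk <- next
  if BS.null chunk
    then pure []
    else (chunk :) <$> drain next

main :: IO ()
main = do
  body   <- mkBody ["foo", "bar"]
  first  <- drain body -- reads both chunks
  second <- drain body -- body already consumed: first read is empty
  print (first, second)
```

In this model the second drain stops immediately, which is the same symptom as the traced `""` / `False` output if something upstream has already read the body. Likewise, drain (pure "foobar") would never return, matching the non-terminating stream seen with requestBody = pure "foobar".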

Related

Akka: How to ensure that message has been received?

I have an actor Dispenser. What it does is it
dispenses some objects by request
listens to arriving new ones
Code follows
class Dispenser extends Actor {
  override def receive: Receive = {
    case Get =>
      context.sender ! getObj()
    case x: SomeType =>
      addObj(x)
  }
}
In real processing it doesn't matter whether 1 ms or even a few seconds pass between a new object being sent and the dispenser starting to dispense it, so there's no code tracking it.
But now I'm writing a test for the dispenser, and I want to be sure that it first receives the new object and only then receives a Get request.
Here's the test code I came up with:
val dispenser = system.actorOf(Props.create(classOf[Dispenser]))
dispenser ! obj
Thread.sleep(100)
val task = dispenser ? Get()
val result = Await.result(task, timeout)
check(result)
It satisfies one important requirement: it doesn't change the original code. But it is
At least 100 ms slow, even on very high-performance boxes
Unstable, failing sometimes because 100 ms (or any other constant) doesn't provide any guarantees.
And the question is how to write a test that satisfies the requirement and doesn't have the cons above (nor any other obvious cons).
You can take out the Thread.sleep(..) and your test will be fine. Akka guarantees the ordering you need.
With the code
dispenser ! obj
val task = dispenser ? Get()
dispenser will process obj before Get deterministically because
The same thread puts obj then Get in the actor's mailbox, so they're in the correct order in the actor's mailbox
Actors process messages sequentially and one-at-a-time, so the two messages will be received by the actor and processed in the order they're queued in the mailbox.
(..if there's nothing else going on that's not in your sample code - routers, async processing in getObj or addObj, stashing, ..)
The Akka FSM module is really handy for testing the underlying state and behavior of an actor, and it does not require changing the implementation specifically for tests.
By using TestFSMRef one can get the actor's current state and data with:
val testActor = TestFSMRef(<actors constructor or Props>)
testActor.stateName shouldBe <state name>
testActor.stateData shouldBe <state data>
http://doc.akka.io/docs/akka/2.4.1/scala/fsm.html

Firebase... Add/Update Firebase Using node.js Script

I have arbitrary JSON that is sensibly laid out like this:
[
  {
    "id":100,
    "name":"Buckeye, AZ",
    "status":"OPEN",
    "address":{
      "street":"416 S Watson RD",
      "city":"Buckeye"
      ...
    }
  }
]
I've written a node.js script like this for proof of concept (why I'm using node is that the JS API seems better supported than REST or Ruby for this. I could be wrong):
http = require('http')
Firebase = require('firebase')
all_sites_url = "http://supercharge.info/service/supercharge/allSites"
firebase_url = "https://tesla-supercharger.firebaseio.com/"
http.get(all_sites_url, (res) ->
  body = ""
  res.on "data", (chunk) ->
    body += chunk
    return
  res.on "end", ->
    response = JSON.parse(body)
    all_sites = response
    send_to_firebase(response)
    return
  return
).on "error", (e) ->
  console.log "Got error: ", e
  return
send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  for charger in response
    console.log charger
    new_child = firebase_ref.push()
    new_child.set {id: charger.id, data: charger}, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        console.log "Data saved successfully"
The result is a unique id generated by Firebase, which has two children: data and id. The data child has the expected information like name, status, etc.
What I'd prefer is to generate a key-value pair. E.g., for an id of 100:
- 100
  - name
  - address
    street
    city
etc. So my first question is how to accomplish this or if it is even sensible.
After the first time around, this data (call it the data from an external server) will be there and a mobile app will have added some fields. These are not present in the data already there. Next time I fetch data from the external server, I want to update things that have changed that the server would know about, like status. I don't want to tamper with things that only the mobile devices would know about like remote_observations.
I know I'm seeming a bit dense here, but I'm trying to put together a sensible data model that will be updatable from that server using a CRON job and incrementally updatable from a bunch of mobile devices.
Any help is much appreciated.
UPDATE: I have found that this works for getting the structure I want:
send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  for charger in response
    firebase_ref.child(charger.id).update charger, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        responses_pending += 1
        console.log "Data saved successfully : #{responses_pending} pending"
  firebase_ref.on 'value', ->
    console.log "value received rp=#{responses_pending}"
    process.exit() if (responses_pending -= 1) < 1
So the code I settled on is this:
http = require('http')
Firebase = require('firebase')
firebase_url = '/path/to/your/firebase'
# code to get JSON of the form:
{
"id":100,
"name":"Buckeye, AZ",
"status":"OPEN",
"address":{"street":"416 S Watson RD",
"city":"Buckeye",
"state":"AZ",
"zip":"85326",
"country":"USA"},
... etc.
}
# Asynchronous get of JSON hash from some server or other.
get_my_fine_JSON().on 'complete', (response) ->
  send_to_firebase(response)
send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  length = response.length
  for charger in response
    firebase_ref.child(charger.id).update charger, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        console.log "Data saved successfully"
      # Parentheses matter: without them this parses as length -= (1 is 0).
      process.exit() if (length -= 1) is 0
Discussion:
The idea was to have a Firebase structure like this:
- 100
  - address
    street: "123 Main Street"
    etc.
That's reason 1 why id is pulled up to be the primary key. Reason 2 is so that I can uniquely identify an object pulled off the external server as the "same" one in my Firebase and apply any updates necessary.
Epiphany 1: Update is more like upsert. If the key is there, whatever hash you supply replaces matching values. If it's not there, then Firebase happily adds it. Which is way cool because it covers both the push and patch cases.
Epiphany 2: This process will hang waiting for events if nothing tells it to stop. That's why the countdown index, length, is decremented until the code has upserted (for lack of a better term) each item.
Observation 1: Doing this in node.js is super fast compared with REST using Python or Ruby. And this upsert stuff is wicked cool if I'm understanding it right.
Observation 2: There isn't a ton of wisdom out there as of this writing regarding writing node shell scripts to do this kind of stuff. Maybe it's a good idea, maybe a bad one. I don't know.
Observation 3: Because of the asynchronous nature of node and the Firebase Javascript API (both GOOD THINGs), terminating a process before the last bit is done can be tricky because your process has to hang on just long enough to complete its last request/response with Firebase. This is, as mentioned before, done in the completion handler of the update. Otherwise we wouldn't necessarily be complete when the process exited.
Caveat 1: Related to observation 2, this could be a bad idea, but I haven't been able to find resources that speak to the problem.
Caveat 2: This could be a horrid abuse or misunderstanding of the Firebase update API. I am reporting observed behavior in the limited case of my specific data. YMMV.
Caveat 3: I'm hoping the process lifetime is as I suggest it is in observation 3.
A note to the decaffeinated: The Javascript for this is so trivially different that it shouldn't be too tough to translate. Or go to js2coffee and paste the Coffeescript into the right pane to get real Javascript in the left pane that you can tune.

Accessing Mt Gox API via http-conduit-0.1.9.3: query string causes timeout

I'm trying to access the Mt Gox REST API using http-conduit. Queries that just have a path (e.g. https://data.mtgox.com/api/2/BTCUSD/money/ticker) work fine, but when I add a queryString to the request it times out.
So this works:
mtGoxRequest :: String -> QueryText -> Request m
mtGoxRequest p qt = def {
    secure = True,
    host = "data.mtgox.com",
    port = 443,
    method = "GET",
    path = fromString $ "api/2/" ++ p,
    queryString = renderQuery False $ queryTextToQuery qt,
    responseTimeout = Just 10000000
  }
currencyTicker :: Request m
currencyTicker = mtGoxRequest "BTCUSD/money/ticker" []
But this times out:
tradeStream :: Currency -> UTCTime -> Request m
tradeStream t = mtGoxRequest
    "BTCUSD/money/trades/fetch"
    [("since", Just $ T.pack $ utcToGoxTime t)]
The difference seems to be in the use of a queryString: when I added the bogus query "foo=bar" to the currencyTicker that timed out too.
However all this works fine in the web browser: going to https://data.mtgox.com/api/2/BTCUSD/money/ticker?foo=bar instantly returns the correct error message instead of timing out. The trade fetch URL works as well, although I won't include a link because the "since" argument says how far back to go. Conversely, if I remove the queryString from the trades list request it correctly returns the entire available trade history.
So something about the http-conduit query string is obviously different. Anyone know what it might be?
Here is the Haskell Request object being sent (as printed by "Show"):
Request {
  host = "data.mtgox.com"
  port = 443
  secure = True
  clientCertificates = []
  requestHeaders = []
  path = "api/2/BTCUSD/money/trades/fetch"
  queryString = "since=1367142721624293"
  requestBody = RequestBodyLBS Empty
  method = "GET"
  proxy = Nothing
  rawBody = False
  redirectCount = 10
  responseTimeout = Just 10000000
}
According to its returned headers Mt Gox is using cloudflare-nginx and PHP 5.
Edit: Forgot to mention that when I use http-conduit to send a request with a queryString to http://scooterlabs.com/echo I get the correct response as well, so it seems to be some interaction between the Mt Gox webserver and http-conduit.
Got it figured out. You need to add a User-Agent string. So
requestHeaders = [(CI.mk "User-Agent", "Test/0.0.1")],
in the middle of the request function makes it work.
$ time curl https://data.mtgox.com/api/2/BTCUSD/money/trades/fetch?since=1367142721624293
...
real 0m20.993s
Looks to me like everything is working correctly: the API call takes a while to return, so http-conduit throws a timeout exception since 20s is longer than 10s.
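To make the arithmetic concrete: responseTimeout = Just 10000000 is in microseconds, i.e. 10 s, against a call that takes about 21 s. The sketch below reproduces that situation at a smaller scale using System.Timeout.timeout from base, which is an assumption for illustration and not http-conduit's own timeout mechanism; slowCall is a hypothetical stand-in for the slow API request.

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- | Hypothetical slow request: answers after the given number of microseconds.
slowCall :: Int -> IO String
slowCall micros = threadDelay micros >> pure "response"

main :: IO ()
main = do
  -- Limit of 0.1 s against a call that needs 0.3 s: gives up with Nothing.
  tooSlow <- timeout 100000 (slowCall 300000)
  -- Limit of 0.5 s against the same call: the result arrives in time.
  inTime <- timeout 500000 (slowCall 300000)
  print (tooSlow, inTime)
```

This prints (Nothing,Just "response"). By the same reasoning, raising the limit above the roughly 21 s the endpoint needs should avoid the timeout, independent of the query string.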

Where is the breaking change?

I wrote a CRUD application to interface with JIRA. I ended up upgrading my Haskell environment, because cabal-dev doesn't solve everything. As a result, I've got some breakage, with this error any time I try to use any code that interfaces with JIRA.
Spike: HandshakeFailed (Error_Misc "user error (unexpected type received. expecting
handshake and got: Alert [(AlertLevel_Warning,UnrecognizedName)])")
After a little googling, I think this either has to do with tls or http-conduit which uses tls.
I'm currently using tls-1.1.2 and http-conduit-1.8.7.1
Previously I was using
tls-0.9.11 and http-conduit >= 1.5 && < 1.7 (not sure which exactly; the old install is gone).
This is where I believe the break is happening
manSettings :: ManagerSettings
manSettings = def { managerCheckCerts = \ _ _ _-> return CertificateUsageAccept }
this is what it used to look like
manSettings :: ManagerSettings
manSettings = def { managerCheckCerts = \ _ _ -> return CertificateUsageAccept }
Here's the code that uses it
initialRequest :: forall (m :: * -> *). URI -> IO (Request m,Manager)
initialRequest uri = do
  initReq <- parseUrl uri           -- let the server tell you what the request header
                                    -- should look like
  manager <- newManager manSettings -- a Manager manages http connections
                                    -- we mod the settings to handle
                                    -- the SSL cert. See manSettings below.
  return (modReq initReq,manager)
  where modReq initReq = applyBasicAuth username password initReq
Let me know if I've left something out. I'm not sure at this point what broke between then and now.
It's a good guess about the error source, but very unlikely: managerCheckCerts simply uses the certificate package to inspect certificates for validity. The error message you're seeing seems to be coming from tls itself and indicates a failure in the data transport. It's probably a good idea to file a bug report with tls, preferably first by narrowing down the issue to a single HTTPS call that fails (or even better, using tls alone and demonstrating the same failure).

connecting http-conduit to xml-conduit

I'm struggling converting a Response from http-conduit to an XML document via xml-conduit.
The doPost function takes an XML Document and posts it to the server. The server responds with an XML Document.
doPost queryDoc = do
    runResourceT $ do
        manager <- liftIO $ newManager def
        req <- liftIO $ parseUrl hostname
        let req2 = req
                { method = H.methodPost
                , requestHeaders = [(CI.mk $ fromString "Content-Type", fromString "text/xml" :: Ascii) :: Header]
                , redirectCount = 0
                , checkStatus = \_ _ -> Nothing
                , requestBody = RequestBodyLBS $ (renderLBS def queryDoc)
                }
        res <- http req2 manager
        return $ res
The following works and returns '200':
let pingdoc = Document (Prologue [] Nothing []) (Element "SYSTEM" [] []) []
Response status headers body <- doPost pingdoc
return (H.statusCode status)
However, when I try and parse the Response body using xml-conduit, I run into problems:
Response status headers body <- doPost xmldoc
let xmlRes' = parseLBS def body
The resulting compilation error is:
Couldn't match expected type `L.ByteString'
with actual type `Source m0 ByteString'
In the second argument of `parseLBS', namely `body'
In the expression: parseLBS def body
In an equation for `xmlRes'': xmlRes' = parseLBS def body
I've tried connecting the Source from http-conduit to the xml-conduit using $= and $$, but I'm not having any success.
Does anyone have any hints to point me in the right direction? Thanks in advance.
Neil
You could use httpLbs rather than http, so that it returns a lazy ByteString rather than a Source; parseLBS is so named because that's what it takes: a Lazy ByteString. However, it's probably best to use the conduit interface that the two are based on directly, as you mentioned. To do this, you should remove the runResourceT line from doPost, and use the following to get an XML document:
xmlRes' <- runResourceT $ do
    Response status headers body <- doPost xmldoc
    body $$ sinkDoc def
This uses xml-conduit's sinkDoc function, connecting the Source from http-conduit to the Sink from xml-conduit.
Once they're connected, the complete pipeline has to be run using runResourceT, which ensures all allocated resources are released in a timely fashion. The problem with your original code is that it runs the ResourceT too early, from inside doPost; you should generally use runResourceT right at the point that you want an actual result out, because a pipeline has to run entirely within the scope of a single ResourceT.
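The scoping problem can be illustrated without any conduit machinery. The following is a rough model only, using bracket from base in place of ResourceT; withConn and readBody are hypothetical names, with an IORef flag standing in for the open HTTP connection.

```haskell
import Control.Exception (bracket)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- | Hypothetical resource scope: the "connection" is open only inside it.
withConn :: (IORef Bool -> IO a) -> IO a
withConn = bracket (newIORef True) (\conn -> writeIORef conn False)

-- | Hypothetical body source: yields data only while the connection is open.
readBody :: IORef Bool -> IO (Maybe String)
readBody conn = do
  open <- readIORef conn
  pure (if open then Just "<SYSTEM/>" else Nothing)

main :: IO ()
main = do
  -- Too early: the scope ends and the unread source escapes from it.
  escaped <- withConn (\conn -> pure (readBody conn))
  early <- escaped
  -- Correct: the source is consumed inside the scope.
  inScope <- withConn readBody
  print (early, inScope)
```

This prints (Nothing,Just "<SYSTEM/>"): a source that leaves its resource scope unread has nothing left to read from, which is why the sink has to be connected inside the same runResourceT block.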
By the way, res <- http req2 manager; return $ res can be simplified to just http req2 manager.

Resources