Snap: Download files stored on a database - haskell

I need to download files stored on a database.
I believe Snap has file utilities that help with file upload and download, but they only deal with files resident on a filesystem.
I was advised on the Snap IRC channel to use the writeBS function to push the data to the browser.
Also, I was told to modify the HTTP headers so that the browser treats the data as a file and brings up a save/open dialog. I got to play with it today and have more questions.
I have this so far:
getFirst :: AppHandler ()
getFirst = do
    modifyResponse $ setContentType "application/octet-stream"  -- modify the HTTP header
    result <- eitherWithDB $ fetch (select [] "files")          -- get the file from the db
    let doc = either (const []) id result       -- get the result out of the Either
        fileName = at "name" doc                -- get the name of the file
        Binary fileData = at "blob" doc         -- get the file data
    writeBS fileData
Can you please tell me if this is the correct way of doing it?
It works, but a few things are missing:
How do I pass the file name and file type to the browser?
How do I set the Content-Disposition?
So I need to be able to set something like this:
Content-Disposition: attachment; filename=document.pdf
Content-Type: application/pdf
How can I do this?

You can set an arbitrary header of the response using modifyResponse in combination with setHeader (both from Snap.Core). Like this:
modifyResponse $ setHeader "Content-disposition" "attachment; filename=document.pdf"
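Since the filename in the question comes from the database document, the Content-Disposition value has to be built at runtime rather than hard-coded. A minimal sketch of just that piece (the dispositionFor helper is hypothetical, not part of Snap; in the handler it would be combined with modifyResponse and setHeader as above):

```haskell
import qualified Data.ByteString.Char8 as BS

-- Hypothetical helper: build the Content-Disposition value for a
-- file name fetched from the database.
dispositionFor :: BS.ByteString -> BS.ByteString
dispositionFor name = BS.pack "attachment; filename=" `BS.append` name

-- In the handler from the question, this would be used as (sketch):
--   modifyResponse $ setContentType "application/pdf"
--   modifyResponse $ setHeader "Content-Disposition" (dispositionFor fileName)
main :: IO ()
main = BS.putStrLn (dispositionFor (BS.pack "document.pdf"))
-- prints: attachment; filename=document.pdf
```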

Download file from URL with Haskell

I want to download a file or zip archive (for example, the new Ubuntu ISO).
I came across the following answer from daydaynatation in this question:
downloadFile :: String -> IO ()
downloadFile url = do
    request <- parseRequest url
    runResourceT $ httpSink request $ \_ -> sinkFile "tmpfile"
but sadly this only downloads the site's source code, not the file that gets downloaded when you navigate to the respective URL in the browser of your choice...
So is this possible with this approach, or what do I have to try instead?

Trouble working with `requestBody` (or `getRequestBodyChunk`)

Dependency versions
Wai: 3.2.2.1
Synopsis
The working of getRequestBodyChunk (https://hackage.haskell.org/package/wai-3.2.2.1/docs/Network-Wai.html#v:getRequestBodyChunk) is quite unclear to me. I understand that it reads the next chunk of bytes available in a request; and finally yields a Data.ByteString.empty indicating no more bytes are available.
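That contract can be illustrated without Wai at all: an action that yields successive chunks and then the empty chunk forever, plus a "read until empty" loop. Note what happens on a second drain of the same source: it immediately yields nothing, which is exactly the symptom one sees if something earlier already consumed the body. (mkChunkSource and drain are hypothetical stand-ins, not Wai functions.)

```haskell
import Data.IORef

-- Stand-in for getRequestBodyChunk's contract: each call yields the
-- next chunk, and the empty chunk forever once the body is exhausted.
mkChunkSource :: [String] -> IO (IO String)
mkChunkSource chunks = do
  ref <- newIORef chunks
  pure $ do
    cs <- readIORef ref
    case cs of
      []       -> pure ""                        -- exhausted: always empty
      (c:rest) -> writeIORef ref rest >> pure c

-- Read until the empty chunk, the way repeatWhileMC does.
drain :: IO String -> IO [String]
drain next = do
  c <- next
  if null c then pure [] else (c :) <$> drain next

main :: IO ()
main = do
  next   <- mkChunkSource ["foo", "bar"]
  first  <- drain next   -- ["foo","bar"]
  second <- drain next   -- []: once consumed, the body is gone
  print (first, second)
-- prints: (["foo","bar"],[])
```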
Thus, given a req :: Request, I'm trying to create a Conduit from the request body:
streamBody :: Wai.Request -> ConduitM () ByteString IO ()
streamBody req = repeatWhileMC (Wai.getRequestBodyChunk req) (Data.ByteString.empty /=)
However, this seems to terminate immediately, as I've seen tracing the outputs:
repeatWhileMC (traceShowId <$> Wai.getRequestBodyChunk req) (traceShowId . (Data.ByteString.empty /=))
The output is
""
False
And hence the stream terminates.
But I can verify that the request body is not empty by other means.
So I'm a little stumped about this and I have a few questions:
Does this mean that the request body has been already consumed?
Or does this mean that the request body wasn't chunked?
Or does it mean that for smaller requests; the chunked bytes are always empty?
Is Scotty overriding this, especially for smaller-sized requests? (I cannot seem to find anything about this in its code/docs.)
If I do something like this:
streamBody req =
    let req' = req { requestBody = pure "foobar" }
    in  repeatWhileMC (Wai.getRequestBodyChunk req') (Data.ByteString.empty /=)
I do get a non-terminating stream of bytes, which leads me to suspect that something before this part of the code consumes the request body, or that the function returns an empty body to begin with.
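The non-termination with the overridden body follows directly from what pure does: the action returns the same chunk on every call and can never yield the empty ByteString, so a "read until empty" loop never stops. A self-contained sketch of that behaviour (fakeBody and takeChunks are hypothetical, not Wai functions; takeChunks caps the reads so the demo itself terminates):

```haskell
-- A body built with `pure` yields the same chunk on every call,
-- so a consumer that reads until "" never sees it and never stops.
fakeBody :: IO String
fakeBody = pure "foobar"

-- Take at most n chunks, stopping early on the empty chunk.
takeChunks :: Int -> IO String -> IO [String]
takeChunks 0 _    = pure []
takeChunks n next = do
  c <- next
  if null c then pure [] else (c :) <$> takeChunks (n - 1) next

main :: IO ()
main = takeChunks 5 fakeBody >>= print
-- prints: ["foobar","foobar","foobar","foobar","foobar"]
```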
A point to note is that this seems to happen with smaller requests only.
Another point to note is that I get the Wai.Request via Web.Scotty.Trans.request, which also has the body-related streaming helpers.
Can this also be documented? I believe the documentation for getRequestBodyChunk etc. can be improved with this information.

Cabal package difference between readPackageDescription and parsePackageDescription

The Haskell package Cabal-1.24.2 has the module Distribution.PackageDescription.Parse.
The module has two functions: readPackageDescription and parsePackageDescription.
When I run in ghci:
let d = readPackageDescription normal "C:\\somefile.cabal"
I get a parsed GenericPackageDescription.
But when I run in ghci:
content <- readFile "C:\\somefile.cabal"
let d = parsePackageDescription content
I get a parse error:
ParseFailed (FromString "Plain fields are not allowed in between stanzas: F 2 \"version\" \"0.1.0.0\"" (Just 2))
The example file is one generated using cabal init.
parsePackageDescription expects the file contents themselves to be passed to it, not the file path they are stored at. You'll want to readFile first... though beware of file encoding issues: http://www.snoyman.com/blog/2016/12/beware-of-readfile

Accessing Mt Gox API via http-conduit-0.1.9.3: query string causes timeout

I'm trying to access the Mt Gox REST API using http-conduit. Queries that just have a path (e.g. https://data.mtgox.com/api/2/BTCUSD/money/ticker) work fine, but when I add a queryString to the request it times out.
So this works:
mtGoxRequest :: String -> QueryText -> Request m
mtGoxRequest p qt = def {
    secure = True,
    host = "data.mtgox.com",
    port = 443,
    method = "GET",
    path = fromString $ "api/2/" ++ p,
    queryString = renderQuery False $ queryTextToQuery qt,
    responseTimeout = Just 10000000
    }
currencyTicker :: Request m
currencyTicker = mtGoxRequest "BTCUSD/money/ticker" []
But this times out:
tradeStream :: Currency -> UTCTime -> Request m
tradeStream t = mtGoxRequest
    "BTCUSD/money/trades/fetch"
    [("since", Just $ T.pack $ utcToGoxTime t)]
The difference seems to be in the use of a queryString: when I added the bogus query "foo=bar" to the currencyTicker, that timed out too.
However all this works fine in the web browser: going to https://data.mtgox.com/api/2/BTCUSD/money/ticker?foo=bar instantly returns the correct error message instead of timing out. The trade fetch URL works as well, although I won't include a link because the "since" argument says how far back to go. Conversely, if I remove the queryString from the trades list request it correctly returns the entire available trade history.
So something about the http-conduit query string is obviously different. Anyone know what it might be?
Here is the Haskell Request object being sent (as printed by "Show"):
Request {
  host = "data.mtgox.com"
  port = 443
  secure = True
  clientCertificates = []
  requestHeaders = []
  path = "api/2/BTCUSD/money/trades/fetch"
  queryString = "since=1367142721624293"
  requestBody = RequestBodyLBS Empty
  method = "GET"
  proxy = Nothing
  rawBody = False
  redirectCount = 10
  responseTimeout = Just 10000000
}
According to its returned headers Mt Gox is using cloudflare-nginx and PHP 5.
Edit: Forgot to mention that when I use http-conduit to send a request with a queryString to http://scooterlabs.com/echo I get the correct response as well, so it seems to be some interaction between the Mt Gox webserver and http-conduit.
Got it figured out. You need to add a User-Agent string. So
requestHeaders = [(CI.mk "User-Agent", "Test/0.0.1")],
in the middle of the request function makes it work.
$ time curl https://data.mtgox.com/api/2/BTCUSD/money/trades/fetch?since=1367142721624293
...
real 0m20.993s
Looks to me like everything is working correctly: the API call takes a while to return, so http-conduit throws a timeout exception since 20s is longer than 10s.

connecting http-conduit to xml-conduit

I'm struggling converting a Response from http-conduit to an XML document via xml-conduit.
The doPost function takes an XML Document and posts it to the server. The server responds with an XML Document.
doPost queryDoc = do
    runResourceT $ do
        manager <- liftIO $ newManager def
        req <- liftIO $ parseUrl hostname
        let req2 = req
                { method = H.methodPost
                , requestHeaders = [(CI.mk $ fromString "Content-Type", fromString "text/xml" :: Ascii) :: Header]
                , redirectCount = 0
                , checkStatus = \_ _ -> Nothing
                , requestBody = RequestBodyLBS $ renderLBS def queryDoc
                }
        res <- http req2 manager
        return res
The following works and returns '200':
let pingdoc = Document (Prologue [] Nothing []) (Element "SYSTEM" [] []) []
Response status headers body <- doPost pingdoc
return (H.statusCode status)
However, when I try and parse the Response body using xml-conduit, I run into problems:
Response status headers body <- doPost xmldoc
let xmlRes' = parseLBS def body
The resulting compilation error is:
Couldn't match expected type `L.ByteString'
with actual type `Source m0 ByteString'
In the second argument of `parseLBS', namely `body'
In the expression: parseLBS def body
In an equation for `xmlRes'': xmlRes' = parseLBS def body
I've tried connecting the Source from http-conduit to the xml-conduit using $= and $$, but I'm not having any success.
Does anyone have any hints to point me in the right direction? Thanks in advance.
Neil
You could use httpLbs rather than http, so that it returns a lazy ByteString rather than a Source; the parseLBS function is so named because that's what it takes: a lazy ByteString. However, it's probably best to directly use the conduit interface the two are based on, as you mentioned. To do this, remove the runResourceT line from doPost, and use the following to get an XML document:
xmlRes' <- runResourceT $ do
    Response status headers body <- doPost xmldoc
    body $$ sinkDoc def
This uses xml-conduit's sinkDoc function, connecting the Source from http-conduit to the Sink from xml-conduit.
Once they're connected, the complete pipeline has to be run using runResourceT, which ensures all allocated resources are released in a timely fashion. The problem with your original code is that it runs the ResourceT too early, from inside doPost; you should generally use runResourceT right at the point that you want an actual result out, because a pipeline has to run entirely within the scope of a single ResourceT.
By the way, res <- http req2 manager; return $ res can be simplified to just http req2 manager.
