Wreq: Stop 404s throwing exceptions - haskell

I am trying to test broken links but, when I use Wreq's get method and run into a 404, I get an exception (see bottom) rather than a statusCode to handle. Only 200s seem to be returned.
I tried to follow the error-handling code in the tutorial but I can't find a way to return the same type as get u. Moreover, this seems to be more complexity than I need in this instance.
How can I simply prevent the exception and just return the responseStatus as is
verifySeatme :: Maybe URL -> IO UrlStatus
verifySeatme url = do
case url of
Nothing -> return None
Just "" -> return None
Just u -> do
seatmeResp <- get u --`E.catch` handler
-- r ^? responseBody . key "url"
-- could also check for redirect to errorPage.aspx
if seatmeResp ^. W.responseStatus . statusCode == 200
then return (Working u)
else return Broken
where
handler e#(StatusCodeException s respHeaders _) =
do
return respHeaders
Here is the exception thrown, and you can see it has the stateCode i want
*Main> re <- get "https://www.seatme.nl/restaurant/1371/Londen.htm"
*** Exception: StatusCodeException (Status {statusCode = 404, statusMessage = "Not Found"}) [("Cache-Control","private"),....
Yuras suggested using options, but I have been unable to work from the example using params to one using checkStatus :: Lens' Options (Maybe StatusChecker):
getData :: IO Restos
getData = do
let opts = defaults & customStatusHandler
jdata <- asJSON =<< getWith opts "http://localhost/restos-short.json" :: IO Resp
let
restos = jdata ^. W.responseBody
verified <- mapM processEntry restos
return verified
-- type StatusChecker = Status -> ResponseHeaders -> CookieJar -> Maybe SomeException
customStatusHandler :: W.StatusChecker
customStatusHandler st res _ =
Just res

NOTE: the answer is outdated, see other answers.
I never used Wreq, but it looks like you should use getWith to pass custom options and checkStatus to configure status handling.
An example:
getWith (set checkStatus (Just $ \_ _ _ -> Nothing) defaults)
"http://google.com/hello/world"
The \_ _ _ -> Nothing is a function to check status code, see StatusChecker. It returns nothing indicating that any status code is OK.

To expand on the answer by Evelyn Schneider, I got this to work with
r <- getWith opts url
where
opts = set Network.Wreq.checkResponse (Just $ \_ _ -> return ()) defaults

For posterity: newer versions of wreq (starting with 0.5) have replaced checkStatus with checkResponse, which takes different arguments. An equivalent to Yuras' answer would now be:
getWith opts url
where opts = set checkResponse (\_ _ -> return ()) defaults

Here's the checkStatus function I ended up with after researching a bit. I couldn't figure out how to convert a HttpException to a SomeException, but then I found Control.Monad.Catch.SomeException. This will ignore 404s and re-throw all other exceptions.
import Network.HTTP.Client.Types
import Network.HTTP.Types.Status
import Network.HTTP.Types.Header
import qualified Control.Exception as E
import Control.Monad.Catch (SomeException(..))
notFoundMeansNothing :: Status -> ResponseHeaders -> CookieJar -> Maybe E.SomeException
notFoundMeansNothing s h c
| s == status404 = Nothing
| otherwise =
if statusIsClientError s || statusIsServerError s then
Just . SomeException $ StatusCodeException s h c
else
Nothing

Related

`wreq` Get / Post with exception handling

I am making some http calls using wreq and would like to catch any exception and return an Either type. I tried something like this but could not figure out how to manipulate the calls so it will type check.
-- exhaustive pattern match omitted here
safeGetUrl :: URL -> Maybe Login -> Maybe Password -> IO (Either String (Response LBS.ByteString))
safeGetUrl url (Just login) (Just pass) = do
let def = defaults
opts = def & auth ?~ basicAuth (BS.pack login) (BS.pack pass)
r <- getWith opts url `E.catch` handler
return $ Right r
where
handler :: HttpException -> Either String (Response LBS.ByteString)
handler (StatusCodeException s _ _) = do
return $ Left $ LBS.unpack (s ^. statusMessage)
I am pasting the type error below but I know the above code will not compile. The issue is r <- getWith opts url E.catch handler. The first part returns IO (Res... but the exception handler returns Either... I tried adding lifting the getWith.. into Either but that did not type check either.
Couldn't match type ‘Either String (Response LBS.ByteString)’
with ‘IO (Response LBS.ByteString)’
Expected type: HttpException -> IO (Response LBS.ByteString)
Actual type: HttpException
-> Either String (Response LBS.ByteString)
In the second argument of ‘catch’, namely ‘handler’
In a stmt of a 'do' block: r <- getWith opts url `catch` handler
Is there a way to catch this exception and return an IO Either type?
Since #jozefg answer, the API has changed a little bit and the answer doesn't compile anymore.
Here is an updated version that compiles:
import qualified Control.Exception as E
import Control.Lens
import qualified Data.ByteString.Char8 as BSC
import qualified Data.ByteString.Lazy as LBS
import Network.HTTP.Client
import Network.Wreq as NW
type URL = String
type Login = String
type Password = String
safeGetUrl ::
URL
-> Maybe Login
-> Maybe Password
-> IO (Either String (Response LBS.ByteString))
safeGetUrl url (Just login) (Just pass) = do
let def = defaults
opts = def & auth ?~ basicAuth (BSC.pack login) (BSC.pack pass)
(Right <$> getWith opts url) `E.catch` handler
where
handler :: HttpException -> IO (Either String (Response LBS.ByteString))
handler (HttpExceptionRequest _ (StatusCodeException r _)) =
return $ Left $ BSC.unpack (r ^. NW.responseStatus . statusMessage)
Your issue is that one side of the handle returns an unwrapped response (no Either) and the other side returns an Either-wrapped exception. You then attempt to wrap the response in an Either, which you do need to do, but it's just at the wrong place. You can fix this merely by switching where you do the wrapping
safeGetUrl :: URL -> Maybe Login -> Maybe Password -> IO (Either String (Response LBS.ByteString))
safeGetUrl url (Just login) (Just pass) = do
let def = defaults
opts = def & auth ?~ basicAuth (BS.pack login) (BS.pack pass)
(Right <$> getWith opts url) `E.catch` handler
where
handler :: HttpException -> IO (Either String (Response LBS.ByteString))
handler (StatusCodeException s _ _) = do
return $ Left $ LBS.unpack (s ^. statusMessage)
However there are some other problems with your functions, remember that unpack gives back Word8s not Char. You may want to import Data.ByteString.Char as the version of unpack defined in there should work better than LBS.unpack. Without your imports though I cannot confirm this definitively. The final (working) code for me is
import Control.Lens
import Network.Wreq
import Network.HTTP.Client
import qualified Control.Exception as E
import qualified Data.ByteString.Char8 as BSC
import qualified Data.ByteString.Lazy as LBS
type URL = String
type Login = String
type Password = String
safeGetUrl :: URL
-> Maybe Login
-> Maybe Password
-> IO (Either String (Response LBS.ByteString))
safeGetUrl url (Just login) (Just pass) = do
let def = defaults
opts = def & auth ?~ basicAuth (BSC.pack login) (BSC.pack pass)
(Right <$> getWith opts url) `E.catch` handler
where
handler :: HttpException -> IO (Either String (Response LBS.ByteString))
handler (StatusCodeException s _ _) = do
return $ Left $ BSC.unpack (s ^. statusMessage)

How can I fix a error when execute a sample code of http-conduit version 2.0.0.4

I am new to Haskell and am trying to a sample code of http-conduit version 2.0.0.4, but it dose not work
Here is the sample code
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Conduit
import Network
import Data.Time.Clock
import Data.Time.Calendar
import qualified Control.Exception as E
past :: UTCTime
past = UTCTime (ModifiedJulianDay 56200) (secondsToDiffTime 0)
future :: UTCTime
future = UTCTime (ModifiedJulianDay 562000) (secondsToDiffTime 0)
cookie :: Cookie
cookie = Cookie { cookie_name = "password_hash"
, cookie_value = "abf472c35f8297fbcabf2911230001234fd2"
, cookie_expiry_time = future
, cookie_domain = "example.com"
, cookie_path = "/"
, cookie_creation_time = past
, cookie_last_access_time = past
, cookie_persistent = False
, cookie_host_only = False
, cookie_secure_only = False
, cookie_http_only = False
}
main = withSocketsDo $ do
request' <- parseUrl "http://example.com/secret-page"
let request = request' { cookieJar = Just $ createCookieJar [cookie] }
E.catch (withManager $ httpLbs request)
(\(StatusCodeException statusCode _ _) ->
if statusCode==403 then putStrLn "login failed" else return ())
ref: http://hackage.haskell.org/package/http-conduit-2.0.0.4/docs/Network-HTTP-Conduit.html
the error message when I load it
samplecode.hs:33:39:
Couldn't match type `()'
with `Response Data.ByteString.Lazy.Internal.ByteString'
Expected type: IO
(Response Data.ByteString.Lazy.Internal.ByteString)
Actual type: IO ()
In the return type of a call of `putStrLn'
In the expression: putStrLn "login failed"
In the expression:
if statusCode == 403 then putStrLn "login failed" else return ()
samplecode.hs:33:75:
Couldn't match expected type `Response
Data.ByteString.Lazy.Internal.ByteString'
with actual type `()'
In the first argument of `return', namely `()'
In the expression: return ()
In the expression:
if statusCode == 403 then putStrLn "login failed" else return ()
Failed, modules loaded: none.
How can I fix it?
Many Thanks
Update
Following Abrahamson's advice, I have changed my code little bit to the following and now have proper StatusCodeException handling.
main = withSocketsDo $ do
request' <- parseUrl "http://example.com/secret-page"
let request = request' { cookieJar = Just $ createCookieJar [cookie] }
eitherResp <- E.try (withManager $ httpLbs request)
case eitherResp of
Left (StatusCodeException s _ _)
| statusCode s == 403 -> putStrLn "login failed"
| otherwise -> return ()
Right resp -> print (L.length (responseBody resp))
You're not using E.catch as it was intended. If you take a look at the type:
E.catch :: Exception e => IO a -> (e -> IO a) -> IO a
it's clear that the return type of the first and second arguments must match. In your case you have
withManager $ httpLbs request :: IO (Response ByteString)
in the first branch and either
putStrLn "login failed" -- or
return ()
in the second. These types do not match and thus you're getting the error you see.
In higher level terms, the problem is that you're not handling the success case. For instance, we could rewrite this using E.try to make that more clear
eitherResp <- E.try (withManager $ httpLbs request)
case eitherResp of
Left (StatusCodeException statusCode _ _)
| statusCode == 403 -> putStrLn "login failed"
| otherwise -> return ()
Right resp -> print (ByteString.length (responseBody resp))
Here since I explicitly pattern match on the Either StatusCodeException (Response ByteString) it's clear that I needed to provide both the failing and the succeeding branche and give them the same return types. To do so I introduced an action to perform on the successful case.
Generally, I find E.try easier to use. E.catch is primarily useful when you want to provide a default under failure.

simple rss downloader in haskell

Yesterday i tried to write a simple rss downloader in Haskell wtih hte help of the Network.HTTP and Feed libraries. I want to download the link from the rss item and name the downloaded file after the title of the item.
Here is my short code:
import Control.Monad
import Control.Applicative
import Network.HTTP
import Text.Feed.Import
import Text.Feed.Query
import Text.Feed.Types
import Data.Maybe
import qualified Data.ByteString as B
import Network.URI (parseURI, uriToString)
getTitleAndUrl :: Item -> (Maybe String, Maybe String)
getTitleAndUrl item = (getItemTitle item, getItemLink item)
downloadUri :: (String,String) -> IO ()
downloadUri (title,link) = do
file <- get link
B.writeFile title file
where
get url = let uri = case parseURI url of
Nothing -> error $ "invalid uri" ++ url
Just u -> u in
simpleHTTP (defaultGETRequest_ uri) >>= getResponseBody
getTuples :: IO (Maybe [(Maybe String, Maybe String)])
getTuples = fmap (map getTitleAndUrl) <$> fmap (feedItems) <$> parseFeedString <$> (simpleHTTP (getRequest "http://index.hu/24ora/rss/") >>= getResponseBody)
I reached a state where i got a list which contains tuples, which contains name and the corresponding link. And i have a downloadUri function which properly downloads the given link to a file which has the name of the rss item title.
I already tried to modify downloadUri to work on (Maybe String,Maybe String) with fmap- ing on get and writeFile but failed with it horribly.
How can i apply my downloadUri function to the result of the getTuples function. I want to implement the following main function
main :: IO ()
main = some magic incantation donwloadUri more incantation getTuples
The character encoding of the result of getItemTitle broken, it puts code points in the places of the accented characters. The feed is utf8 encoded, and i thought that all haskell string manipulation functions are defaulted to utf8. How can i fix this?
Edit:
Thanks for you help, i implemented successfully my main and helper functions. Here comes the code:
downloadUri :: (Maybe String,Maybe String) -> IO ()
downloadUri (Just title,Just link) = do
item <- get link
B.writeFile title item
where
get url = let uri = case parseURI url of
Nothing -> error $ "invalid uri" ++ url
Just u -> u in
simpleHTTP (defaultGETRequest_ uri) >>= getResponseBody
downloadUri _ = print "Somewhere something went Nothing"
getTuples :: IO (Maybe [(Maybe String, Maybe String)])
getTuples = fmap (map getTitleAndUrl) <$> fmap (feedItems) <$> parseFeedString <$> decodeString <$> (simpleHTTP (getRequest "http://index.hu/24ora/rss/") >>= getResponseBody)
downloadAllItems :: Maybe [(Maybe String, Maybe String)] -> IO ()
downloadAllItems (Just feedlist) = mapM_ downloadUri $ feedlist
downloadAllItems _ = error "feed does not get parsed"
main = getTuples >>= downloadAllItems
The character encoding issue has been partially solved, i put decodeString before the feed parsing, so the files get named properly. But if i want to print it out, the issue still happens. Minimal working example:
main = getTuples
It sounds like it's the Maybes that are giving you trouble. There are many ways to deal with Maybe values, and some useful library functions like fromMaybe and fromJust. However, the simplest way is to do pattern matching on the Maybe value. We can tweak your downloadUri function to work with the Maybe values. Here's an example:
downloadUri :: (Maybe String, Maybe String) -> IO ()
downloadUri (Just title, Just link) = do
file <- get link
B.writeFile title file
where
get url = let uri = case parseURI url of
Nothing -> error $ "invalid uri" ++ url
Just u -> u in
simpleHTTP (defaultGETRequest_ uri) >>= getResponseBody
downloadUri _ = error "One of my parameters was Nothing".
Or maybe you can let the title default to blank, in which case you could insert this just before the last line in the previous example:
downloadUri (Nothing, Just link) = downloadUri (Just "", Just link)
Now the only Maybe you need to work with is the outer one, applied to the array of tuples. Again, we can pattern match. It might be clearest to write a helper function like this:
downloadAllItems (Just ts) = ??? -- hint: try a `mapM`
downloadAllItems Nothing = ??? -- don't do anything, or report an error, or...
As for your encoding issue, my guesses are:
You're reading the information from a file that isn't UTF-8 encoded, or your system doesn't realise that it's UTF-8 encoded.
You are reading the information correctly, but it gets messed up when you output it.
In order to help you with this problem, I need to see a full code example, which shows how you're reading the information and how you output it.
Your main could be something like the shown below. There may be some more concise way to compose these two operations though:
main :: IO ()
main = getTuples >>= process
where
process (Just lst) = foldl (\s v -> do {t <- s; download v}) (return ()) lst
process Nothing = return ()
download (Just t, Just l) = downloadUri (t,l)
download _ = return ()

How can I catch a 404 status exception thrown by simpleHttp of Http.Conduit

I'm trying to download all png files contained in an html file.
I have trouble catching 404 status exceptions though, instead my program just crashes.
Here is some sample to demonstrate:
import Network.HTTP.Conduit
import qualified Data.ByteString.Lazy as L
main = do
let badUrl = "http://www.google.com/intl/en_com/images/srpr/WRONG.png"
imgData <- (simpleHttp badUrl) `catch` statusExceptionHandler
L.writeFile "my.png" imgData
statusExceptionHandler :: t -> IO L.ByteString
statusExceptionHandler e = (putStrLn "oops") >> (return L.empty)
My "oops" message never prints, instead app crashes with:
StatusCodeException (Status {statusCode = 404, statusMessage = "Not Found"}) [("Content-Type","text/html; charset=UTF-8"),("X-Content-Type-Options","nosniff"),("Date","Fri, 27 Jan 2012 03:10:34 GMT"),("Server","sffe"),("Content-Length","964"),("X-XSS-Protection","1; mode=block")]
What am I doing wrong?
Update:
Following Thoma's advice, I changed my code to the following snippet and now have proper exception handling in place.
main = do
let badUrl = "http://www.google.com/intl/en_com/images/srpr/WRONG.png"
imgData <- (simpleHttp badUrl) `X.catch` statusExceptionHandler
case imgData of x | x == L.empty -> return ()
| otherwise -> L.writeFile "my.png" imgData
statusExceptionHandler :: HttpException -> IO L.ByteString
statusExceptionHandler (StatusCodeException status headers) =
putStr "An error occured during download: "
>> (putStrLn $ show status)
>> (return L.empty)
In addition to Thomas's answer, you could tell http-conduit not to throw an exception by overriding the checkStatus record of your Request type.
You should probably read the Marlow paper on extensible exceptions. The original catch, exported by Prelude and used in your code snipt, only works for IOError's. The http-conduit code is throwing exceptions of a different type, HttpException to be exact. (there is some dynamic typing going on via the Typeable class, see the paper).
The solution? Use catch from Control.Exception and only catch the error types you want to handle (or SomeException for all of them).
import Network.HTTP.Conduit
import qualified Data.ByteString.Lazy as L
import Control.Exception as X
main = do
let badUrl = "http://www.google.com/intl/en_com/images/srpr/WRONG.png"
imgData <- (simpleHttp badUrl) `X.catch` statusExceptionHandler
L.writeFile "my.png" imgData
statusExceptionHandler :: SomeException -> IO L.ByteString
statusExceptionHandler e = (putStrLn "oops") >> (return L.empty)

How to "escape early" in a web monad

Something that happens to me a lot while web programming: I want to run an operation that has a chance of failure. On a failure, I want to send the client a 500. Normally though, I just want to continue executing a series of steps.
doSomeWebStuff :: SomeWebMonad ()
doSomeWebStuff = do
res <- databaseCall
case res of
Left err -> status 500
Right val -> do
res2 <- anotherDatabaseCall (someprop val)
case res2 of
Left err -> status 500
Right val2 -> text $ show val2
since the errs are exceptions, I don't like that I need all that case stuff just to catch them. I want to do the same thing whenever anything is a left. Is there a way to express that on one line with something like guard, but control what it returns on an exit?
In another language I could do this:
function doSomeWebStuff() {
var res = databaseCall()
if (res == Error) return status 500
var res2 = anotherDatabaseCall(res.someprop)
if (res2 == Error) return status 500
return text(res2)
}
So, I'm ok writing some boilerplate, but I don't want the errors to mess with my nesting, when it's far more common to just want to continue forward with the found case.
What's the cleanest way to do this? I know in theory I can use a monad to exit early on a failure, but I've only seen examples with Maybe and it would return Nothing at the end, rather than letting me specify what it returns.
Here's how I would do it with ErrorT. Disclaimer: I have never actually used ErrorT before.
webStuffOr500 :: ErrorT String SomeWebMonad () -> SomeWebMonad ()
webStuffOr500 action = do
res <- runErrorT action
case res of
Left err -> do
logError err -- you probably want to know what went wrong
status 500
Right () -> return ()
doSomeWebStuff :: SomeWebMonad ()
doSomeWebStuff = webStuffOr500 doSomeWebStuff'
doSomeWebStuff' :: ErrorT String SomeWebMonad ()
doSomeWebStuff' = do
val <- ErrorT databaseCall
val2 <- ErrorT $ anotherDatabaseCall (someprop val)
lift $ text $ show val2
Here are the imports and type declarations I used to make sure it all typechecks correctly:
import Control.Monad.Identity
import Control.Monad.Error
import Control.Monad.Trans (lift)
import Control.Monad
type SomeWebMonad = Identity
data Foo = Foo
data Bar = Bar
data Baz = Baz deriving (Show)
someprop :: Foo -> Bar
someprop = undefined
databaseCall :: SomeWebMonad (Either String Foo)
databaseCall = undefined
anotherDatabaseCall :: Bar -> SomeWebMonad (Either String Baz)
anotherDatabaseCall = undefined
logError :: String -> SomeWebMonad ()
logError = undefined
text :: String -> SomeWebMonad ()
text = undefined
status :: Int -> SomeWebMonad ()
status = undefined
If I'm doing this all wrong then please, somebody shout out. It may be wise, if you take this approach, to modify the type signature of databaseCall and anotherDatabaseCall to also use ErrorT, that way a <- ErrorT b can be reduced to a <- b in doSomeWebStuff'.
Since I'm a complete noob at ErrorT, I can't really do any hand-holding besides "here's some code, go have some fun".
Not a direct answer to your question, but have you considered using Snap? In snap, we have short-circuiting behavior built-in with an idiomatic:
getResponse >>= finishWith
where
finishWith :: MonadSnap m => Response -> m a
So given a response object, it will terminate early (and match whatever type comes after that). Haskell laziness will ensure computations within Snap monad after finishWith won't be executed.
I sometimes make a little helper:
finishEarly code str = do
modifyResponse $ setResponseStatus code str
modifyResponse $ addHeader "Content-Type" "text/plain"
writeBS str
getResponse >>= finishWith
which I can then use anywhere in my handlers.
myHandler = do
x <- doSomething
when (x == blah) $ finishEarly 400 "That doesn't work!!"
doOtherStuff

Resources