How to call multiple APIs in Haskell using Wreq?

I'm trying to get used to wreq in Haskell with an exercise.
Basically, I'm calling a first API that returns a list of IDs, and then I would like to call the API that returns the object for each ID.
My code looks like this:
getAllIds :: IO ()
getAllIds = do
    r <-
        asJSON =<< get "https://dummyUrl/objectA.json" :: IO A
    let body = r ^. responseBody
    print body
getOneItemFromId :: IO ()
getOneItemFromId = do
    r <-
        asJSON =<< get "https://dummyUrl/objectB/idOfObject.json" :: IO B
    let body = r ^. responseBody
    print body
How can I pass the result of getAllIds into a call to getOneItemFromId recursively, so that I can get a list of all the items based on the list of IDs?

This is a working example that prints a list coming from the outside world one element at a time.
getAllIds :: IO [Int]
getAllIds = do
    xs <- readLn :: IO [Int]
    putStrLn "This is the list of Ids"
    print xs
    return xs
printOneId :: Int -> IO ()
printOneId i = do
    putStrLn ("this is Id: " <> show i)
main = do
    xs <- getAllIds
    mapM_ printOneId xs
When you run the program above, it asks for an input list and then prints the elements one by one:
> echo [1,2,3] | ./main
This is the list of Ids -- This is the output of the program.
[1,2,3]
this is Id: 1
this is Id: 2
this is Id: 3
From here, you can modify your code to get it working. Below is an untested solution (I can't install the dependencies right now).
getAllIds :: IO [Id]
getAllIds = do
    r <-
        asJSON =<< get "https://dummyUrl/objectA.json" :: IO A
    let body = r ^. responseBody
    print body
    return body
getOneItemFromId :: Id -> IO ()
getOneItemFromId objectId = do
    r <-
        asJSON =<< get ("https://dummyUrl/objectB/" <> show objectId <> ".json") :: IO B
    let body = r ^. responseBody
    print body
main = do
    xs <- getAllIds
    mapM_ getOneItemFromId xs
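The solution above prints each item as it arrives. To end up with a list of all the items, as the question asks, mapM (rather than mapM_) keeps the results. What follows is only a minimal sketch: the Item type and fetchItem are hypothetical stand-ins for the real JSON type and the wreq request, so that the example runs without network access.

```haskell
-- Hypothetical types standing in for the question's Id and object B.
type Id = Int

data Item = Item { itemId :: Id, itemName :: String }
    deriving (Show, Eq)

-- Stand-in for the wreq call; a real version would perform the
-- HTTP request and decode the response body instead.
fetchItem :: Id -> IO Item
fetchItem i = return (Item i ("item-" ++ show i))

-- mapM runs the action for every id and collects the results in order.
fetchAllItems :: [Id] -> IO [Item]
fetchAllItems = mapM fetchItem
```

In main this would be used as `items <- fetchAllItems =<< getAllIds`, assuming getAllIds returns the list of IDs as in the solution above.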

Related

Concatenating scrapeURL results from multiples scrapings into one list

I am scraping https://books.toscrape.com using Haskell's Scalpel library. Here's my code so far:
import Text.HTML.Scalpel
import Data.List.Split (splitOn)
import Data.List (sortBy)
import Control.Monad (liftM2)
data Entry = Entry {entName :: String
                   , entPrice :: Float
                   , entRate :: Int
                   } deriving Eq
instance Show Entry where
    show (Entry n p r) = "Name: " ++ n ++ "\nPrice: " ++ show p ++ "\nRating: " ++ show r ++ "/5\n"
entries :: Maybe [Entry]
entries = Just []
scrapePage :: Int -> IO ()
scrapePage num = do
    items <- scrapeURL ("https://books.toscrape.com/catalogue/page-" ++ show num ++ ".html") allItems
    let sortedItems = items >>= Just . sortBy (\(Entry _ a _) (Entry _ b _) -> compare a b)
                            >>= Just . filter (\(Entry _ _ r) -> r == 5)
    maybe (return ()) (mapM_ print) sortedItems
allItems :: Scraper String [Entry]
allItems = chroots ("article" @: [hasClass "product_pod"]) $ do
    p <- text $ "p" @: [hasClass "price_color"]
    t <- attr "href" $ "a"
    star <- attr "class" $ "p" @: [hasClass "star-rating"]
    let fp = read $ flip (!!) 1 $ splitOn "£" p
    let fStar = drop 12 star
    return $ Entry t fp $ r fStar
    where
        r f = case f of
            "One" -> 1
            "Two" -> 2
            "Three" -> 3
            "Four" -> 4
            "Five" -> 5
main :: IO ()
main = mapM_ scrapePage [1..10]
Basically, allItems scrapes for each book's title, price and rating, does some formatting for price to get a float, and returns it as a type Entry. scrapePage takes a number corresponding to the result page number, scrapes that page to get IO (Maybe [Entry]), formats it - in this case, to filter for 5-star books and order by price - and prints each Entry. main performs scrapePage over pages 1 to 10.
The problem I've run into is that my code scrapes, filters and sorts each page, whereas I want to scrape all the pages then filter and sort.
What worked for two pages (in GHCi) was:
i <- scrapeURL ("https://books.toscrape.com/catalogue/page-1.html") allItems
j <- scrapeURL ("https://books.toscrape.com/catalogue/page-2.html") allItems
liftM2 (++) i j
This returns a list composed of page 1 and 2's results that I could then print, but I don't know how to implement this for all 50 result pages. Help would be appreciated.
Just return the entry list without any processing (you could also filter at this stage):
-- no error handling; needs: import Data.Maybe (maybeToList)
scrapePage :: Int -> IO [Entry]
scrapePage num =
    concat . maybeToList <$> scrapeURL ("https://books.toscrape.com/catalogue/page-" ++ show num ++ ".html") allItems
Then you can process them all together later:
-- needs: import Data.List (sortOn)
process = filter (\e -> entRate e == 5) . sortOn entPrice
main = do
    entries <- concat <$> mapM scrapePage [1 .. 10]
    print $ process entries
Moreover, you can easily make your code concurrent with mapConcurrently from the async package:
-- needs: import Control.Concurrent.Async (mapConcurrently)
main = do
    entries <- concat <$> mapConcurrently scrapePage [1 .. 20]
    print $ process entries
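The combine-then-process step can be checked without any scraping. Here is a minimal sketch, assuming the per-page results are already available as a list of Maybe [Entry] values; combinePages is a hypothetical helper name, and Entry is redeclared only to keep the example self-contained.

```haskell
import Data.List (sortOn)
import Data.Maybe (fromMaybe)

data Entry = Entry { entName :: String
                   , entPrice :: Float
                   , entRate :: Int
                   } deriving (Show, Eq)

-- Treat a page that failed to scrape (Nothing) as an empty page,
-- then flatten all pages into one list.
combinePages :: [Maybe [Entry]] -> [Entry]
combinePages = concatMap (fromMaybe [])

-- Keep only 5-star books and order them by price, over all pages at once.
process :: [Entry] -> [Entry]
process = sortOn entPrice . filter ((== 5) . entRate)
```

Filtering before sorting gives the same result as the other order here and sorts a shorter list.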

Can we access the output from a replicateM defined in a do-block

Assume I have something like this:
main = do
    input_line <- getLine
    let n = read input_line :: Int
    replicateM n $ do
        input_line <- getLine
        let x = read input_line :: Int
        return ()
    putStrLn $ show -- Can I access the result of my replicateM here?
    return ()
Can I access the result of my replicateM as if it were a returned value and, for example, print it out? Or do I have to work with the replicateM inside the actual do-block?
Specialized to IO, replicateM has the type
replicateM :: Int -> IO a -> IO [a]
which means that it returns a list. So in your example you could do:
results <- replicateM n $ do
    input_line <- getLine
    let x = read input_line :: Int
    return x -- <- we have to return it if we want to access it
print results
replicateM n a returns a list of the values returned by a. In your case that'd just be a list of units because you have the return () at the end, but if you replace that with return x, you'll get a list of the read integers. You can then just use <- to get it out of the IO.
You can also simplify your code by using readLn instead of getLine and read. Similarly, putStrLn . show can be replaced with print.
main = do
    n <- readLn
    ints <- replicateM n readLn :: IO [Int]
    print ints
Of course. Its type is replicateM :: Monad m => Int -> m a -> m [a]. It means it can appear to the right of <- in a do block:
do
    ....
    xs <- replicateM n $ do { ... }
    ....
xs will be of type [a], as usual for binding the results from Monad m => m [a].
With your code though, where you show return () in that nested do, you'll get ()s replicated n times in your xs. Presumably in the real code you will return something useful there.
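Since the general type only asks for a Monad, the same function can be tried outside IO, which makes its behaviour easy to inspect. A small sketch using Maybe; the names threeSevens and noResult are made up for the illustration.

```haskell
import Control.Monad (replicateM)

-- Running a successful Maybe action three times collects three results.
threeSevens :: Maybe [Int]
threeSevens = replicateM 3 (Just 7)

-- If the repeated action fails, the whole result fails.
noResult :: Maybe [Int]
noResult = replicateM 3 Nothing
```

In IO the shape is the same: the list holds one value per repetition, in the order the actions ran.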

Simple Haskell program not behaving correct

I'm new to Haskell and trying to write a simple program to find the maximal element and its index from input. I receive the values to compare one by one. I hold the maximal element in the maxi variable and its index in maxIdx. Here's my program:
loop = do
    let maxi = 0
    let maxIdx = 0
    let idx = 0
    let idxN = 0
    replicateM 5 $ do
        input_line <- getLine
        let element = read input_line :: Int
        if maxi < element
            then do
                let maxi = element
                let maxIdx = idx
                hPutStrLn stderr "INNER CHECK"
            else
                hPutStrLn stderr "OUTER CHECK"
        let idx = idxN + 1
        let idxN = idx
        print maxIdx
    loop
Even though I know the incoming elements go from largest to smallest (5, 4, 3, 2, 1), the program enters INNER CHECK every time (it should happen only for the first element!) and maxIdx is always 0.
What am I doing wrong?
Thanks in advance.
Anyway, let's have fun.
loop = do
    let maxi = 0
    let maxIdx = 0
    let idx = 0
    let idxN = 0
    replicateM 5 $ do
        input_line <- getLine
        let element = read input_line :: Int
        if maxi < element
            then do
                let maxi = element
                let maxIdx = idx
                hPutStrLn stderr "INNER CHECK"
            else
                hPutStrLn stderr "OUTER CHECK"
        let idx = idxN + 1
        let idxN = idx
        print maxIdx
    loop
is not a particularly Haskelly code (and as you know is not particularly correct).
Let's make it Haskellier.
What do we do here? We've an infinite loop, which is reading a line 5 times, does something to it, and then calls itself again for no particular reason.
Let's split it:
import Control.Monad
readFiveLines :: IO [Int]
readFiveLines = replicateM 5 readLn
addIndex :: [Int] -> [(Int, Int)]
addIndex xs = zip xs [0..]
findMaxIndex :: [Int] -> Int
findMaxIndex xs = snd (maximum (addIndex xs))
loop :: ()
loop = loop
main :: IO ()
main = do xs <- readFiveLines
          putStrLn (show (findMaxIndex xs))
snd returns the second element of a tuple; readLn is essentially fmap read getLine; zip takes two lists and returns a list of pairs; maximum finds the maximum value.
I left loop intact in its original beauty.
You can be even Haskellier if you remember that something (huge expression) can be replaced with something $ huge expression ($ simply applies its left operand to its right operand), and the functions can be combined with .: f (g x) is the same as (f . g) x, or f . g $ x (see? it's working for the left side as well!). Additionally, zip x y can be rewritten as x `zip` y
import Control.Monad
readFiveLines :: IO [Int]
readFiveLines = replicateM 5 readLn
addIndex :: [Int] -> [(Int, Int)]
addIndex = (`zip` [0..])
findMaxIndex :: [Int] -> Int
findMaxIndex = snd . maximum . addIndex
main :: IO ()
main = do xs <- readFiveLines
          putStrLn . show . findMaxIndex $ xs
As for debug printing, there's a module called Debug.Trace with a function traceShow, which prints its first argument (formatted with show, hence the name) to stderr and returns its second argument:
findMaxIndex :: [Int] -> Int
findMaxIndex = snd . (\xs -> traceShow xs (maximum xs)) . addIndex
That allows you to tap onto any expression and see what's coming in (and what are the values around — you can show tuples, lists, etc.)
I think alf's answer is very good, but for what it's worth, here's how I would interpret your intention.
{-# LANGUAGE FlexibleContexts #-}
module Main where
import System.IO
import Control.Monad.State
data S = S { maximum :: Int
           , maximumIndex :: Int
           , currentIndex :: Int }
update :: Int -> Int -> S -> S
update m mi (S _ _ ci) = S m mi ci
increment :: S -> S
increment (S m mi ci) = S m mi (ci+1)
next :: (MonadIO m, MonadState S m) => m ()
next = do
    S maxi maxIdx currIdx <- get
    input <- liftIO $ getLine
    let element = read input :: Int
    if maxi < element
        then do
            modify (update element currIdx)
            liftIO $ hPutStrLn stderr "INNER CHECK"
        else
            liftIO $ hPutStrLn stderr "OUTER CHECK"
    modify increment
run :: Int -> IO S
run n = execStateT (replicateM_ n next) (S 0 0 0)
main :: IO ()
main = do
    S maxi maxIdx _ <- run 5
    putStrLn $ "maxi: " ++ (show maxi) ++ " | maxIdx: " ++ (show maxIdx)
This uses a monad transformer to combine a stateful computation with IO. The get function retrieves the current state, and the modify function lets you change the state.
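The reason the original loop never updates anything is that let inside a do block introduces a new binding that shadows the outer one; it does not assign to it. The comparison maxi < element therefore always sees the outer maxi = 0. The same "carry a state along" idea can also be written as a pure fold. This is only a sketch with hypothetical names (step, findMax), and it keeps the original starting values of 0 for both the maximum and its index.

```haskell
import Data.List (foldl')

-- State carried through the fold: (current maximum, index of that maximum).
-- Each input is paired with its position: (index, element).
step :: (Int, Int) -> (Int, Int) -> (Int, Int)
step (maxi, maxIdx) (idx, element)
    | maxi < element = (element, idx)   -- found a larger element: new state
    | otherwise      = (maxi, maxIdx)   -- keep the previous state

-- Starts from (0, 0), mirroring the initial values in the question.
findMax :: [Int] -> (Int, Int)
findMax = foldl' step (0, 0) . zip [0 ..]
```

With the five values read by replicateM 5 readLn, findMax would return the largest element together with the index at which it first appeared.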

MVars are blocking indefinitely; but only in certain scenarios.

First, because this is about a specific case, I haven't reduced the code at all, so it will be quite long, and in 2 parts (Helper module, and the main).
spawnThreads in ConcurHelper takes a list of actions, forks them, and gets an MVar containing the result of each action. It then combines the results and returns the resulting list. It works fine in certain cases, but blocks indefinitely in others.
If I give it a list of putStrLn actions, it executes them fine, then returns the resulting ()s (yes, I know running print commands on different threads at the same time is bad in most cases).
If I try running multiTest in Scanner though (which takes either scanPorts or scanAddresses, the scan range, and the number of threads to use; then splits the scan range over the threads, and passes the list of actions to spawnThreads), it will block indefinitely. The odd thing is, according to the debug prompts scattered around ConcurHelper, on each thread, forkIO is returning before the MVar is filled. This would make sense if it wasn't in a do block, but shouldn't the actions be performed sequentially? (I don't know if this is related to the problem or not; it's just something I noticed while attempting to debug it.)
I've thought it out step by step, and if it's executing in the order laid out in spawnThreads, the following should happen:
1. An empty MVar should be created inside forkIOReturnMVar, and passed to mVarWrapAct.
2. mVarWrapAct should execute the action, and put the result in the MVar (this is where the problem seems to lie: "MVar filled" is never shown, suggesting the MVar is never put into).
3. getResults should then take from the resulting list of MVars, and return the results.
If point #2 isn't the issue, I can't see where the problem would be (and if it is the issue, I can't see why putMVar never executes). Inside the Scanner module, the only real function of interest for this question is multiTest; I only included the rest so it could be run.
To do a simple test, you can run the following:
spawnThreads [putStrLn "Hello", putStrLn "World"] (should return [(),()])
multiTest (scanPorts "127.0.0.1") 1 (0,5) (Creates the MVar, hangs for a sec, then crashes with the aforementioned error)
Any help in understanding what's going on here would be appreciated. I can't see what the difference between the 2 use cases is.
Thank you
(And I'm using this atrocious exception handling system because IO errors don't give codes for specific network exceptions, so I've been left with parsing messages to find out what happened)
Main:
module Scanner where
import Network
import Network.Socket
import System.IO
import Control.Exception
import Control.Concurrent
import ConcurHelper
import Data.Maybe
import Data.Char
import NetHelp
data NetException = NetNoException | NetTimeOut | NetRefused | NetHostUnreach
                  | NetANotAvail | NetAccessDenied | NetAddrInUse
                  deriving (Show, Eq)
diffExcept :: Either SomeException Handle -> Either NetException Handle
diffExcept (Right h) = Right h
diffExcept (Left (SomeException m))
    | err == "WSAETIMEDOUT" = Left NetTimeOut
    | err == "WSAECONNREFUSED" = Left NetRefused
    | err == "WSAEHOSTUNREACH" = Left NetHostUnreach
    | err == "WSAEADDRNOTAVAIL" = Left NetANotAvail
    | err == "WSAEACCESS" = Left NetAccessDenied
    | err == "WSAEADDRINUSE" = Left NetAddrInUse
    | otherwise = error $ show m
    where
        err = reverse . dropWhile (== ')') . reverse . dropWhile (/='W') $ show m
extJust :: Maybe a -> a
extJust (Just a) = a
selectJusts :: IO [Maybe a] -> IO [a]
selectJusts mayActs = do
    mays <- mayActs; return . map extJust $ filter isJust mays
scanAddresses :: Int -> Int -> Int -> IO [String]
scanAddresses port minAddr maxAddr =
    selectJusts $ mapM (\addr -> do
        let sAddr = "192.168.1." ++ show addr
        print $ "Trying " ++ sAddr ++ " " ++ show port
        connection <- testConn sAddr port
        if isJust connection
            then do hClose $ extJust connection; return $ Just sAddr
            else return Nothing) [minAddr..maxAddr]
scanPorts :: String -> Int -> Int -> IO [Int]
scanPorts addr minPort maxPort =
    selectJusts $ mapM (\port -> do
        --print $ "Trying " ++ addr ++ " " ++ show port
        connection <- testConn addr port
        if isJust connection
            then do hClose $ extJust connection; return $ Just port
            else return Nothing) [minPort..maxPort]
main :: IO ()
main = do
    withSocketsDo $ do
        putStrLn "Scan Addresses or Ports? (a/p)"
        choice <- getLine
        if (toLower $ head choice) == 'a'
            then do
                putStrLn "On what port?"
                sPort <- getLine
                addrs <- scanAddresses (read sPort :: Int) 0 255
                print addrs
            else do
                putStrLn "At what address?"
                address <- getLine
                ports <- scanPorts address 0 9999
                print ports
        main
testConn :: HostName -> Int -> IO (Maybe Handle)
testConn host port = do
    result <- try $ timedConnect 1 host port
    let result' = diffExcept result
    case result' of
        Left e -> do putStrLn $ "\t" ++ show e; return Nothing
        Right h -> return $ Just h
setPort :: AddrInfo -> Int -> AddrInfo
setPort addInf nPort = case addrAddress addInf of
    (SockAddrInet _ host) -> addInf { addrAddress = (SockAddrInet (fromIntegral nPort) host)}
getHostAddress :: HostName -> Int -> IO SockAddr
getHostAddress host port = do
    addrs <- getAddrInfo Nothing (Just host) Nothing
    let adInfo = head addrs
        newAdInfo = setPort adInfo port
    return $ addrAddress newAdInfo
timedConnect :: Int -> HostName -> Int -> IO Handle
timedConnect time host port = do
    s <- socket AF_INET Stream defaultProtocol
    setSocketOption s RecvTimeOut time; setSocketOption s SendTimeOut time
    addr <- getHostAddress host port
    connect s addr
    socketToHandle s ReadWriteMode
multiTest :: (Int -> Int -> IO a) -> Int -> (Int, Int) -> IO [a]
multiTest partAction threads (mi,ma) =
    spawnThreads $ recDiv [mi,perThread..ma]
    where
        perThread = ((ma - mi) `div` threads) + 1
        recDiv [] = []
        recDiv (curN:restN) =
            partAction (curN + 1) (head restN) : recDiv restN
Helper:
module ConcurHelper where
import Control.Concurrent
import System.IO
spawnThreads :: [IO a] -> IO [a]
spawnThreads actions = do
    ms <- mapM (\act -> do m <- forkIOReturnMVar act; return m) actions
    results <- getResults ms
    return results
forkIOReturnMVar :: IO a -> IO (MVar a)
forkIOReturnMVar act = do
    m <- newEmptyMVar
    putStrLn "Created MVar"
    forkIO $ mVarWrapAct act m
    putStrLn "Fork returned"
    return m
mVarWrapAct :: IO a -> MVar a -> IO ()
mVarWrapAct act m = do a <- act; putMVar m a; putStrLn "MVar filled"
getResults :: [MVar a] -> IO [a]
getResults mvars = do
    unpacked <- mapM (\m -> do r <- takeMVar m; return r) mvars
    putStrLn "MVar taken from"
    return unpacked
Your forkIOReturnMVar isn't exception safe: whenever act throws, the MVar isn't going to be filled.
Minimal example
import ConcurHelper
main = spawnThreads [badOperation]
  where badOperation = do
          error "You're never going to put something in the MVar"
          return True
As you can see, badOperation throws, and therefore the MVar won't get filled in mVarWrapAct.
Fix
Fill the MVar with an appropriate value if you encounter an exception. Since you cannot provide a default value for all possible types a, it's better to use MVar (Maybe a) or MVar (Either b a) as you already do in your network code.
In order to catch the exceptions, use one of the operations provided in Control.Exception. For example, you could use onException:
mVarWrapAct :: IO a -> MVar (Maybe a) -> IO ()
mVarWrapAct act m = do
    onException (act >>= putMVar m . Just) (putMVar m Nothing)
    putStrLn "MVar filled"
However, you might want to preserve the actual exception for more information. In this case you could simply use catch together with Either SomeException a:
mVarWrapAct :: IO a -> MVar (Either SomeException a) -> IO ()
mVarWrapAct act m = do
    catch (act >>= putMVar m . Right) (putMVar m . Left)
    putStrLn "MVar filled"
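Putting the fix together, here is a minimal self-contained sketch of an exception-safe variant. The names forkWithResult and spawnThreadsSafe are hypothetical; the sketch uses forkFinally from Control.Concurrent, which hands the thread's outcome to a handler as Either SomeException a, so the MVar is filled whether the action returns or throws. In the question's multiTest, one place such an exception appears to come from is recDiv calling head restN on the empty tail of the last element, which only fails once the action is evaluated inside the forked thread.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException)

-- Fork the action; the MVar always receives either the result (Right)
-- or the exception that ended the thread (Left).
forkWithResult :: IO a -> IO (MVar (Either SomeException a))
forkWithResult act = do
    m <- newEmptyMVar
    _ <- forkFinally act (putMVar m)
    return m

-- Fork every action, then wait for all outcomes in order.
spawnThreadsSafe :: [IO a] -> IO [Either SomeException a]
spawnThreadsSafe actions = mapM forkWithResult actions >>= mapM takeMVar
```

The caller can then decide per result whether a Left should be reported, retried, or ignored, instead of blocking forever on an MVar that is never filled.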

How to access the response code in happstack?

I'm trying to store a counter of all 200 response codes in my happstack application.
module Main where
import Happstack.Server
import Control.Concurrent
import Control.Monad.IO.Class ( liftIO )
import Control.Monad
main :: IO ()
main = do
    counter <- (newMVar 0) :: IO (MVar Integer)
    simpleHTTP nullConf $ countResponses counter (app counter)
countResponses :: MVar Integer -> ServerPart Response -> ServerPart Response
countResponses counter r = do
    resp <- r
    liftIO $ putStrLn $ show resp
    -- TODO: Does not work, response code always 200
    if rsCode resp == 200
        then liftIO $ (putMVar counter . (+) 1) =<< takeMVar counter
        else liftIO $ putStrLn $ "Unknown code: " ++ (show $ rsCode resp)
    return resp
app counter = do
    c <- liftIO $ readMVar counter
    msum
        [ dir "error" $ notFound $ toResponse $ "NOT HERE"
        , ok $ toResponse $ "Hello, World! " ++ (show c)
        ]
The problem, as far as I can tell, is that notFound adds a filter that sets the code, which hasn't been run at the time I am inspecting the response.
I can't hook in with my own filter, since it has type Response -> Response and I need to be in the IO monad to access the mvar. I found mapServerPartT which looks like it could be possible to hook in my own code, but I'm not quite sure whether that's overkill in this scenario.
I did find simpleHTTP'', which seems to directly call runWebT, which then runs appFilterToResp outside of any code I can hook. Perhaps I have to build my own version of simpleHTTP''?
UPDATE: This works, is it the best way?
-- Use this instead of simpleHTTP
withMetrics :: (ToMessage a) => MVar Integer -> Conf -> ServerPartT IO a -> IO ()
withMetrics counter conf hs =
    Listen.listen conf (\req -> (simpleHTTP'' (mapServerPartT id hs) req) >>=
                                runValidator (fromMaybe return (validator conf)) >>=
                                countResponses counter)
A possibly related question: I also want to be able to time requests, which means I would have to hook in at probably the same spot at the end of the request cycle.
UPDATE 2: I was able to get timings for requests:
logMessage x = logM "Happstack.Server.AccessLog.Combined" INFO x
withMetrics :: (ToMessage a) => Conf -> ServerPartT IO a -> IO ()
withMetrics conf hs =
    Listen.listen conf $ \req -> do
        startTime <- liftIO $ getCurrentTime
        resp <- simpleHTTP'' (mapServerPartT id hs) req
        validatedResp <- runValidator (fromMaybe return (validator conf)) resp
        endTime <- liftIO $ getCurrentTime
        logMessage $ rqUri req ++ " " ++ show (diffUTCTime endTime startTime)
        return validatedResp
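On the counting side, the increment in countResponses takes the MVar and puts it back in two separate steps. A small sketch of the same update using modifyMVar_ from Control.Concurrent.MVar, which performs the take and put as one exception-safe operation. countIfOk is a hypothetical helper, and the plain Int status code stands in for rsCode resp so that the example does not depend on Happstack.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)

-- Increment the shared counter only for a 200 status code.
-- The Int argument is a stand-in for the response code of a real Response.
countIfOk :: MVar Integer -> Int -> IO ()
countIfOk counter code
    | code == 200 = modifyMVar_ counter (return . (+ 1))
    | otherwise   = return ()
```

Inside a ServerPart this would be called through liftIO, at whichever point the final response code is actually available.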
