Skipping certain items when parsing an array of json objects with Aeson - haskell

I am trying to write a FromJSON implementation which would parse a list of objects while at the same time skipping some of them - those which contain a certain json property.
I have the code like this, but without proper handling of mzero it returns an error once it encounters a value with "exclude: true".
newtype Response = Response [Foo]
newtype Foo = Foo Text
instance FromJSON Response where
parseJSON = withArray "Foos" $ \arr -> do
-- can I filter out here the ones which return `mzero`?
foos <- mapM parseJSON arr
pure $ Response (toList foos)
instance FromJSON Foo where
parseJSON = withObject "Foo" $ \foo -> do
isExcluded <- foo .: "exclude"
if isExcluded
then mzero
else do
pure $ Foo "bar"
I've found a few questions which hint at using parseMaybe, but I can't figure out how I can use it from within the FromJSON definition, it seems to be more suited to running the parser from "outside". Is it possible to do skipping "inside"? Or am I going the wrong route here?

What you're looking for is the optional function. It can be a bit tricky to find because it's very general-purpose and not simply a helper function in aeson. For your purposes, it will have the type Parser a -> Parser (Maybe a) and used in conjunction with catMaybes should do what you want.

Related

Using file name in json parsing (Haskell Aeson)

I have no experience in Haskell. I'm trying to parse many .json files to a data structure in Haskell using aeson. However, by reasons beyond my control, I need to store the name of the file from where the data was parsed as one of the fields in my data. A simple example of what I have so far is:
data Observation = Observation { id :: Integer
, value :: Integer
, filename :: String}
instance FromJSON Observation where
parseJson (Object v) =
Observation <$> (read <$> v .: "id")
<*> v .: "value"
<*> ????
My question is: what is a smart way to be able to serialize my data when parsing a json file having access to the name of the file?
What comes in my mind is to define another data like NotNamedObservation, initialize it and then having a function that converts NotNamedObservation -> String -> Observation (where String is the filename) but that sounds like a very poor approach.
Thanks.
Just make your instance a function from file path to Observation:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import qualified Data.ByteString.Lazy as LBS
import System.Environment
data Observation = Observation { ident :: Integer
, value :: Integer
, filename :: FilePath
} deriving (Show)
instance FromJSON (FilePath -> Observation) where
parseJSON (Object v) =
do i <- read <$> v .: "id"
l <- v .: "value"
pure $ Observation i l
main :: IO ()
main = do
files <- getArgs
fileContents <- traverse LBS.readFile files
print fileContents
let fs = map (maybe (error "Invalid json") id . decode) fileContents
jsons :: [Observation]
jsons = zipWith ($) fs files
print jsons
When you don't control the data definition and you have strict requirements about the format to parse, it's better to write the (de)serializer explicitly.
If external information is required to fully construct values, avoid the FromJSON/ToJSON type classes, and just write standalone parsers.
aeson's deriving mechanism is more suited to applications that talk to themselves (and thus only care about round-tripping between parseJSON and toJSON), or where there is flexibility either in defining the JSON format or the Haskell types.
If you still have to use these classes for some reason, one option is of course to just put undefined in those missing fields. To rely on the type system more, you can also parameterize types by a "phase" (that assumes again you can tweak the data type), which is a type constructor that wraps some fields.
Functor functors
One way to keep that style compatible with regular records is to use this HKD/defunctionalization design pattern.
data Observation' p = Observation
{ id :: Integer
, value :: Integer
, filename :: p String }
-- This is isomorphic to the original Observation data type
type Observation = Observation Identity
-- When we don't have the filename available, we keep the field empty with Proxy
instance FromJSON (Observation' Proxy) where
...
mkObservation :: FileName -> Observation' Proxy -> Observation

Read unicode from JSON to String field using aeson

I receive a JSON data using httpLbs and read it
import qualified Data.ByteString.Lazy.UTF8 as LB
sendSimpleRequest :: Credentials -> IO LB.ByteString
sendSimpleRequest creds = do
<...>
let request = applyBasicAuth user pass $ fromJust $ parseUrl url
manager <- newManager tlsManagerSettings
response <- httpLbs request manager
return $ responseBody response
After that, I can print the result with putStr . LB.toString and get "summary":"Обсуждение рабочих вопросов".
However, when I try to use aeson's decode to put this value into data and print it
data Fields = Fields
{ fi_summary :: String
} deriving (Show, Generic)
instance FromJSON Fields where parseJSON = genericParseJSON parseOptions
instance ToJSON Fields where toJSON = genericToJSON parseOptions
parseOptions :: Options
parseOptions = defaultOptions { fieldLabelModifier = drop 3 }
parseAndShow = putStr . show . fromJust . decode
I get escaped characters: Fields {fi_summary = "\1054\1073\1089\1091\1078\1076\1077\1085\1080\1077 \1088\1072\1073\1086\1095\1080\1093 \1074\1086\1087\1088\1086\1089\1086\1074"}
Seems like I need to configure aeson to correctly put ByteString into String, but I don't want to implement the FromJSON instance myself because I have a dozen more structures like data Fields. Changing the fi_summary type is also a possibility, but I had no luck with any so far.
If you're seeing escaped characters, then the data is in the string just fine. The default string Show instance prints all non-ASCII characters like this. So you're got the data, it's just a matter of trying to output it again appropriately.
You might try using putStrLn to print the string, or maybe write it to a text file. (I know sometimes putStrLn does strange things if the locale is set wrong...)

Flatten MonadPlus inside an Aeson Parser

I'm not sure if I'm barking up the wrong tree here, but I have an Aeson FromJSON definition that looks rather bulky and I was wondering if it could be turned into something more concise. I want to short-circuit the parsing of the entire object if the nested parsing of the URI fails.
data Link = Link { link :: URI
, tags :: [String]
} deriving (Show, Typeable, Eq)
instance FromJSON Link where
parseJSON :: Value -> Parser Link
parseJSON (Object o) = do
linkStr <- o .: "link"
tags' <- o .: "tags"
case parseURI linkStr of
Just l -> return $ Link l tags'
Nothing -> mzero
parseJSON _ = mzero
The type of parseURI is parseURI :: String -> Maybe URI and both Maybe and Parser have MonadPlus instances. Is there a way to compose the two directly and remove the ugly case statement at the end?
Applicative parsers are usually more concise and you can compose the result of parseURI using maybe mzero return which converts a Nothing into an mzero.
instance FromJSON Link where
parseJSON :: Value -> Parser Link
parseJSON (Object o) = Link
<$> (maybe mzero return . parseURI =<< o .: "link")
<*> o .: "tags"
parseJSON _ = mzero
Pattern matching works, but this only works inside do notation not explicit >>= due to the extra desugaring that goes on:
instance FromJSON Link where
parseJSON (Object o) = do
Just link' <- o .: "link"
tags' <- o .: "tags"
return $ Link link' tags'
parseJSON _ = mzero
> -- Note that I used link :: String for my testing instead
> decode "{\"link\": \"test\", \"tags\": []}" :: Maybe Link
Just (Link {link = "test", tags=[]})
> decode "{\"tags\": []}" :: Maybe Link
Nothing
What's going on here is that a failed pattern match on the left hand side of a <- is calling fail. Looking at the source for Parser tells me that fail is calling out to failDesc, which is also used by the implementation of mzero, so in this case you're safe. In general it just calls fail, which can do any number of things depending on the monad, but for Parser I'd say it makes sense.
However, #shang's answer is definitely better since it doesn't rely on implicit behavior.

Getting Aeson to deal with a mixed-type list

This works:
λ decode "[\"one\", \"two\"]" :: Maybe [Text]
Just ["one","two"]
This works:
λ decode "[1, 2]" :: Maybe [Int]
Just [1,2]
This is perfectly-valid JSON but I can't make it work:
λ decode "[\"one\", 2]" :: Maybe [Text]
Nothing
Or even:
λ decode "[2]" :: Maybe [Text]
Nothing
I would like to convince the last to give me:
Just ["one","2"]
Just ["2"]
But no way I can see to twist Aeson's arm into seeing something it wants to see as a number as a string instead.
Update:
λ decode "[1, \"2\"]" :: Maybe Array
Just (fromList [Number 1.0,String "2"])
I guess that's a little better. I still wish I could get Aeson to coerce everything to strings but I guess I can work with this.
The standard FromJSON instance for Text will not do the kind of coercion you're looking for. Fortunately, aeson is flexible enough to let you define your own types with their own rules. Here's an example, complete on FP Haskell Center. The main part of it is:
newtype LaxText = LaxText Text
deriving Show
instance FromJSON LaxText where
parseJSON (String t) = return $ LaxText t
parseJSON (Number n) = return $ LaxText $ toStrict $ toLazyText $ scientificBuilder n
parseJSON _ = fail "Invalid LaxText"

Replace nested pattern matching with "do" for Monad

I want to simplify this code
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Network.HTTP.Types
import Data.Text
getJSON :: String -> IO (Either String Value)
getJSON url = eitherDecode <$> simpleHttp url
--------------------------------------------------------------------
maybeJson <- getJSON "abc.com"
case maybeJson of
Right jsonValue -> case jsonValue of
(Object jsonObject) ->
case (HashMap.lookup "key123" jsonObject) of
(Just (String val)) -> Data.Text.IO.putStrLn val
_ -> error "Couldn't get the key"
_ -> error "Unexpected JSON"
Left errorMsg -> error $ "Error in parsing: " ++ errorMsg
by using do syntax for Monad
maybeJson <- getJSON "abc.com/123"
let toPrint = do
Right jsonValue <- maybeJson
Object jsonObject <- jsonValue
Just (String val) <- HashMap.lookup "key123" jsonObject
return val
case toPrint of
Just a -> Data.Text.IO.putStrLn a
_ -> error "Unexpected JSON"
And it gave me 3 errors:
src/Main.hs:86:19:
Couldn't match expected type `Value'
with actual type `Either t0 (Either String Value)'
In the pattern: Right jsonValue
In a stmt of a 'do' block: Right jsonValue <- maybeJson
src/Main.hs:88:19:
Couldn't match expected type `Value' with actual type `Maybe Value'
In the pattern: Just (String val)
In a stmt of a 'do' block:
Just (String val) <- HashMap.lookup "key123" jsonObject
src/Main.hs:88:40:
Couldn't match type `Maybe' with `Either String'
Expected type: Either String Value
Actual type: Maybe Value
Even when I replace
Just (String val) <- HashMap.lookup "key123" jsonObject
with
String val <- HashMap.lookup "key123" jsonObject
I'm getting another similar error about Either:
Couldn't match type `Maybe' with `Either String'
Expected type: Either String Value
Actual type: Maybe Value
In the return type of a call of `HashMap.lookup'
In a stmt of a 'do' block:
String val <- HashMap.lookup "key123" jsonObject
How do I fix those errors?
You can't easily simplify that into a single block of do-notation, because each case is matching over a different type. The first is unpacking an either, the second a Value and the third a Maybe. Do notation works by threading everything together through a single type, so it's not directly applicable here.
You could convert all the cases to use the same monad and then write it all out in a do-block. For example, you could have helper functions that do the second and third pattern match and produce an appropriate Either. However, this wouldn't be very different to what you have now!
In fact, if I was going for this approach, I'd just be content to extract the two inner matches into their own where variables and leave it at that. Trying to put the whole thing together into one monad just confuses the issue; it's just not the right abstraction here.
Instead, you can reach for a different sort of abstraction. In particular, consider using the lens library which has prisms for working with nested pattern matches like this. It even supports aeson nateively! Your desired function would look something like this:
decode :: String -> Maybe Value
decode json = json ^? key "key123"
You could also combine this with more specific prisms, like if you're expecting a string value:
decode :: String -> Maybe String
decode json = json ^? key "key123" . _String
This takes care of parsing the json, making sure that it's an object and getting whatever's at the specified key. The only problem is that it doesn't give you a useful error message about why it failed; unfortunately, I'm not good enough with lens to understand how to fix that (if it's possible at all).
So every line in a do expression for a Monad must return a value in that Monadic type. Monad is a typeclass here, not a type by itself. So putting everything in a do Monad is not really a sensible statement.
You can try your code with everything wrapped in a Maybe monad.
Assuming you've fetched your JSON value:
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Network.HTTP
import qualified Data.Map as M
import Control.Applicative
import qualified Data.HashMap.Strict as HM
--------------------------------------------------------------------
main = do
maybeJson <- return $ toJSON (M.fromList [("key123","value")] :: M.Map String String)
ioVal <- return $ do -- The Maybe monad do expression starts here
maybeJson <- Just maybeJson
jsonObject <- case maybeJson of
Object x -> Just x
_ -> Nothing
val <- HM.lookup "key123" jsonObject
return val
putStrLn $ show ioVal
Once we start working in the Maybe monad, every expression must return a Maybe Something value. The way the Maybe monad works is that anything that is a Just something comes out as a pure something that you can work with, but if you get a Nothing, the rest of the code will be skipped and you'll get a Nothing.
This property of falling through is unique to the Maybe monad. Different monads behave differently.
You should read up more about Monads and the IO monad here: http://www.haskell.org/haskellwiki/Introduction_to_IO
You should read more about monads and what they really help you do:
http://learnyouahaskell.com/a-fistful-of-monads
(You should work through the previous chapters and then get to this chapter. Once you do, you'll have a pretty solid understanding of what is happening).
I also think your HTTP request is screwed up. Here's an example of a POST request that you can use.
import qualified Network.HTTP as H
main = do
postData <- return $ H.urlEncodeVars [("someVariable","someValue")]
request <- return $ H.postRequestWithBody "http://www.google.com/recaptcha/api/verify" "application/x-www-form-urlencoded" postData
putStrLn $ show request
-- Make the request
(Right response) <- H.simpleHTTP request
-- Print status code
putStrLn $ show $ H.rspCode response
-- Print response
putSrLn $ show $ H.rspBody response
UPDATED:
Use the following to help you get a JSON value:
import qualified Data.ByteString.Lazy as LC
import qualified Data.ByteString.Char8 as C
import qualified Data.Aeson as DA
responseBody <- return $ H.rspBody response
responseJSON <- return (DA.decode (LC.fromChunks [C.pack responseBody]) :: Maybe DA.Value)
You'll have to make a request object to make a request. There are quite a few helpers. I meant the post request as the most generic case:
http://hackage.haskell.org/package/HTTP-4000.0.5/docs/Network-HTTP.html
Since you are in the IO monad, all the <- are going to assume that you are dealing with IO operations. When you write
do
Right jsonValue <- maybeJson
Object jsonObject <- jsonValue
you are saying that jsonValue must be an IO action just like maybeJson. But this is not the case! jsonValue is but a regular Either value. The silution here would ge to use a do-block let instead of a <-:
do
Right jsonValue <- maybeJson
let Object jsonObject = jsonValue
However, its important to note that in both versions of your code you are using an irrecoverable error to abort your program if the JSON parsing fails. If you want to be able to collect errors, the basic idea would be to convert your values to Either (and then use the monad instance for Either to avoid having lots of nested case expressions)

Resources