Remove single quotes/quotation marks in Prolog - string

I have an extern API sending info to my Prolog application and I found a problem creating my facts.
When the information received is extensive, Prolog automatically adds ' (single quotes) to that info.
Example: with the data received, the fact I create is:
object(ObjectID,ObjectName,'[(1,09:00,12:00),(2,10:00,12:00)]',anotherID)
The fact I would like to create is
object(ObjectID,ObjectName,[(1,09:00,12:00),(2,10:00,12:00)] ,anotherID)
without the ' before the list.
Does anyone know how to solve this problem? With a predicate that receives '[(1,09:00,12:00),(2,10:00,12:00)]' and returns [(1,09:00,12:00),(2,10:00,12:00)]?

What you see is an atom, and you want to convert it to a term I think.
If you use swi-prolog, you can use the builtin term_to_atom/2:
True if Atom describes a term that unifies with Term. When Atom is instantiated, Atom is parsed and the result unified with Term.
Example:
?- term_to_atom(X,'[(1,09:00,12:00),(2,10:00,12:00)]').
X = [ (1, 9:0, 12:0), (2, 10:0, 12:0)].
So at the right hand side, you enter the atom, at the left side the "equivalent" term. Mind however that for instance 00 is interpreted as a number and thus is equal to 0, this can be unintended behavior.
You can thus translate the predicate as:
translate(object(A,B,C,D),object(A,B,CT,D)) :-
term_to_atom(CT,C).
Since you do not fully specify how you get this data, it is unknown to me how you will convert it. But the above way will probably be of some help.

Related

Finding the start of an expression when the end of the previous one is difficult to express

I've got a file format that looks a little like this:
blockA {
uniqueName42 -> uniqueName aWord1 anotherWord "Some text"
anotherUniqueName -> uniqueName23 aWord2
blockB {
thing -> anotherThing
}
}
Lots more blocks with arbitrary nesting levels.
The lines with the arrow in them define relationships between two things. Each relationship has some optional metadata (multi-word quoted or single word unquoted).
The challenge I'm having is that because the there can be an arbitrary number of metadata items in a relationship my parser is treating anotherUniqueName as a metadata item from the first relationship rather than the start of the second relationship.
You can see this in the image below. The parser is only recognising one relationshipDeclaration when a second should start with StringLiteral: anotherUniqueName
The parser looks a bit like this:
block
: BLOCK LBRACE relationshipDeclaration* RBRACE
;
relationshipDeclaration
: StringLiteral? ARROW StringLiteral StringLiteral*
;
I'm hoping to avoid lexical modes because the fact that these relationships can appear almost anywhere in the file will leave me up to my eyes in NL+ :-(
Would appreciate any ideas on what options I have. Is there a way to look ahead, spot the '->', for example?
Thanks a million.
Your example certainly looks like the NL is what signals the end of a relationshipDeclaration.
If that's the case, then you'll need NLs to be tokens available to your parse rules so the parser can know recognize the end.
As you've alluded to, you could potentially use -> to trigger a different Lexer Mode and generate different tokens for content between the -> and the NL and then use those tokens in your parse rule for relationshipDeclaration.
If it's as simple as your snippet indicates, then just capturing RD_StringLiteral tokens in that lexical mode, would probably be easier to deal with than handling all the places you might need to allow for NL. This would be pretty simple as Lexer modes go.
(BTW you can use x+ to get the same effect as x x*)
relationshipDeclaration
: StringLiteral? ARROW RD_StringLiteral+
;
I don't think there's a third option for dealing with this.

Way to find a number at the end of a string in Smalltalk

I have different commands my program is reading in (i.e., print, count, min, max, etc.). These words can also include a number at the end of them (i.e., print3, count1, min2, max6, etc.). I'm trying to figure out a way to extract the command and the number so that I can use both in my code.
I'm struggling to figure out a way to find the last element in the string in order to extract it, in Smalltalk.
You didn't told which incarnation of Smalltalk you use, so I will explain what I would do in Pharo, that is the one I'm familiar with.
As someone that is playing with Pharo a few months at most, I can tell you the sheer amount of classes and methods available can feel overpowering at first, but the environment actually makes easy to find things. For example, when you know the exact input and output you want, but doesn't know if a method already exists somewhere, or its name, the Finder actually allow you to search by giving a example. You can open it in the world menu, as shown bellow:
By default it seeks selectors (method names) matching your input terms:
But this default is not what we need right now, so you must change the option in the upper right box to "Examples", and type in the search field a example of the input, followed by the output you want, both separated by a ".". The input example I used was the string 'max6', followed by the desired result, the number 6. Pharo then gives me a list of methods that match that:
To get what would return us the text part, you can make a new search, changing the example output from number 6 to the string 'max':
Fortunately there is several built-in methods matching the description of your problem.
There are more elegant ways, I suppose, but you can make use of the fact that String>>#asNumber only parses the part it can recognize. So you can do
'print31' reversed asNumber asString reversed asNumber
to give you 31. That only works if there actually is a number at the end.
This is one of those cases where we can presume the input data has a specific form, ie, the only numbers appear at the end of the string, and you want all those numbers. In that case it's not too hard to do, really, just:
numText := 'Kalahari78' select: [ :each | each isDigit ].
num := numText asInteger. "78"
To get the rest of the string without the digits, you can just use this:
'Kalahari78' withoutTrailingDigits. "Kalahari"6
As some of the Pharo "OGs" pointed out, you can take a look at the String class (just type CMD-Return, type in String, hit Return) and you will find an amazing number of methods for all kinds of things. Usually you can get some ideas from those. But then there are times when you really just need an answer!

How to represent a missing xsd:dateTime in RDF?

I have a dataset with data collected from a form that contains various date and value fields. Not all fields are mandatory so blanks are possible and
in many cases expected, like a DeathDate field for a patient who is still alive.
How do I best represent these blanks in the data?
I represent DeathDate using xsd:dateTime. Blanks or empty spaces are not allowed. All of these are flagged as invalid when validating using Jena RIOT:
foo:DeathDate_1
a foo:Deathdate ;
time:inXSDDatetime " "^^xsd:dateTime .
foo:DeathDate_2
a foo:Deathdate ;
time:inXSDDatetime ""^^xsd:dateTime .
foo:DeathDate_3
a foo:Deathdate ;
time:inXSDDatetime "--"^^xsd:dateTime .
I prefer to not omit the triple because I need to know if it was blank on the source versus a conversion error during construction of my RDF.
What is the best way to code these missing values?
You should represent this by just omitting the triple. That's the meaning of a triple that's "not present": it's information that is (currently) unknown.
Alternatively, you can choose to give it the value "unknown"^^xsd:string when there's no death date. The solution in this case is to not datatype it as an xsd:dateTime, but just as a simple string. It doesn't have to be a string of course, you could use any kind of "special" value for this, e.g. a boolean false - just as long as it's a valid literal value that you can distinguish from actual death dates. This will solve the parsing problem, but IMHO if you do this, you are setting yourself up for headaches in processing the data further down the line (because you will need to ask queries over this data, and they will have to take two different types of values into account, plus the possibility that the field is missing).
I prefer to not omit the triple because I need to know if it was blank
on the source versus a conversion error during construction of my RDF.
This sounds like an XY problem. If there are conversion errors, your application should signal that in another way, e.g. by logging an error. You shouldn't try to solve this by "corrupting" your data.

How to know if an atom is starting with a pattern?

For example, if I got the following predicates :
father('jim', 'Boby')
father('rob', 'bob')
and I would like to know who got father with is name starting with 'bo' ?
Simply use atom_concat/3, a ISO Prolog standard built-in predicate.
Another ISO option is sub_atom/5:
sub_atom(Atom, Before, Length, After, Sub_atom)
?- sub_atom(bob, 0, _, _, bo).
true.
Compared to atom_concat/3, this avoids the generation of the unneeded atom to represent the suffix.

FastCGI Haskell script to make use of Pandoc text conversion

1. Motivation
I'm writing my own mini-wiki. I want to be able to easily convert from markdown to LATEX/HTML and vice versa. After some searching I discovered Pandoc, which is written in Haskell and that I could use the FastCGI module to run a Haskell program on my Apache server.
2. Problem/ Question
I'm not sure how to what protocol I should use to send my FastCGI script the input/output variables (POST/GET?) and how this is done exactly. Any ideas, suggestions, solutions?
3. Steps taken
3.1 Attempt
Here is what I've done so far (based on example code). Note, I have no experience in Haskell and at the moment I don't have too much time to learn the language. I'd just love to be able to use the pandoc text format conversion tool.
module Main ( main ) where
import Control.Concurrent
import Network.FastCGI
import Text.Pandoc
--initialize Variables/ functions
fastcgiResult :: CGI CGIResult
markdownToHTML:: String -> String
--implement conversion function
markdownToHTML s = writeLaTeX defaultWriterOptions {writerReferenceLinks = True} (readMarkdown defaultParserState s)
--main action
fastcgiResult = do
setHeader "Content-type" "text/plain"
n <- queryString
output $ (markdownToHTML n)
main :: IO ()
main = runFastCGIConcurrent' forkIO 10 fastcgiResult
This code reads the string after the question mark in the request url. But this is not a good solution as certain characters are omitted (e.g. '#' ) and spaces are replaced by "/20%".
Thanks in advance.
3.2 Network.CGI
Documentation found here. Under the heading "Input" there are a number of methods to get input. Which one is right for me?
Is it :
Get the value of an input variable, for example from a form. If the variable has multiple values, the first one is returned. Example:
query <- getInput "query"
So lets say I have a HTML POST form with name='Joe' can I grab this using getInput? And if so how do I handle the Maybe String type?
The fastCGI package is actually a extension of the cgi package, which includes the protocol types for receiving request data and returning result pages. I'd suggest using CGI to start with, and then move to fastCGI once you know what you are doing.
You might also want to look at this tutorial.
Edit to answer questions about the tutorial:
"Maybe a" is a type that can either contain "Just a" or "Nothing". Most languages use a null pointer to indicate that there is no data, but Haskell doesn't have null pointers. So we have an explicit "Maybe" type instead for cases when the data might be null. The two constructors ("Just" and "Nothing") along with the type force you to explicitly allow for the null case when it might happen, but also let you ignore it when it can't happen.
The "maybe" function is the universal extractor for Maybe types. The signature is:
maybe :: b -> (a -> b) -> Maybe a -> b
Taking the arguments from front to back, the "Maybe a" third argument is the value you are trying to work with. The second argument is a function called if the third argument is "Just v", in which case the result is "f v". The first argument is the default, returned if the third is "Nothing".
In this case, the trick is that the "cgiMain" function is called twice. If it finds an input field "name" then the "mn" variable will be set to (Just "Joe Bloggs"), otherwise it will be set to (Nothing). (I'm using brackets to delimit values now because quotes are being used for strings).
So the "maybe" call returns the page to render. The first time through no name is provided, so "mn" is (Nothing) and the default "inputForm" page is returned for rendering. When the user clicks Submit the same URL is requested, but this time with the "name" field set, so now you get the "greet" function called with the name as an argument, so it says "Hello Joe Bloggs".

Resources