In RDF, a statement is represented with a subject, predicate, and object (S, P, O); in OWL, owl:ObjectProperty represents the predicate.
(S) (P) (O)
I like dog
<owl:Class rdf:about="Person" />
<owl:Class rdf:about="Pet" />

<owl:NamedIndividual rdf:about="I">
  <rdf:type rdf:resource="Person"/>
  <like rdf:resource="Dog"/>
</owl:NamedIndividual>

<owl:NamedIndividual rdf:about="Dog">
  <rdf:type rdf:resource="Pet"/>
</owl:NamedIndividual>

<owl:ObjectProperty rdf:about="like">
  <rdfs:domain rdf:resource="Person"/>
  <rdfs:range rdf:resource="Pet"/>
</owl:ObjectProperty>
But how can I describe "the degree" to which I like dogs?
How can I give a property or value to a predicate?
One solution I came up with is to expand a single (S, P, O) statement into three statements.
For example,
(S) (P) (O)
Person isSrcOf LikeRelation
Pet isTargetOf LikeRelation
LikeRelation hasValue [0~100]
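In Turtle, that expansion might look like this (a sketch; the names are the hypothetical ones above, and 80 is just an example degree):
:Person :isSrcOf    :LikeRelation .
:Pet    :isTargetOf :LikeRelation .
:LikeRelation :hasValue 80 .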
It should work, but obviously it makes the ontology three times bigger :(
I appreciate any suggestion!
I wouldn't use RDF reification - not in this case, and hardly in any case. RDF reification always makes things more complicated. As you noted, it will inflate your ontology; beyond that, it also makes your ontology very difficult to apply OWL reasoning to.
I've dealt with the same scenario you've presented, and most of the time I've ended up with the following design.
(S) (P) [ (P) (O) (P) (O)]
I like [ 'what I like' Dog , 'how much I like it' 'a lot']
Class: LikeLevel //represents the class of things a person likes, with a degree factor

ObjectProperty: likeObject
  Domain: LikeLevel
  Range: Pet //(or Thing)

DataProperty: likeScale
  Domain: LikeLevel
  Range: xsd:decimal //(or an enumeration class, e.g.: 'nothing', 'a bit', 'very much', ...)

ObjectProperty: like
  Domain: Person
  Range: LikeLevel
If you want to represent some instance data with this model (in RDF/Turtle syntax):
:I :like [ a :LikeLevel ;
           :likeObject :dogs ;
           :likeScale 5.7 ] .
In this case I'm creating a blank node for the LikeLevel object, but you could create a ground (named) object as well; sometimes you might want or need to avoid bnodes. In that case:
:I :like :a0001 .
:a0001 a :LikeLevel ;
    :likeObject :dogs ;
    :likeScale 5.7 .
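Once the data is in this shape, querying by degree is straightforward. A minimal SPARQL sketch (assuming the property names above and an example prefix IRI):
PREFIX : <http://example.org/>
SELECT ?what ?degree
WHERE {
  :I :like ?lvl .
  ?lvl :likeObject ?what ;
       :likeScale  ?degree .
  FILTER (?degree > 5)    # e.g. everything I like more than 5
}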
This design can be considered a light form of reification; the main difference from RDF reification is that it keeps the design within the user's own model.
Your suggestion is a valid one; it is called reification, and it is the standard way of representing properties inherent to a relationship between two items in an ontology or RDF graph. Since statements are made in a pairwise manner between items, it is a limitation of the data model itself that sometimes makes reification necessary.
If you're worried that reification will inflate your ontology, you could try one of the following alternatives instead, but they are generally less desirable and come with their own problems:
Create specific properties, such as somewhatLikes, doesntLike, or loves; this may be suitable if you have a limited set of alternatives and don't mind creating the extra properties. It becomes tedious and cumbersome (and, I'd go so far as to suggest, incorrect) if you intend to encode the degree of liking with an integer (or any wide range of alternatives) - following this approach, you'd need properties like likes0, likes1, ..., likes99, likes100. This method would also preclude querying, for example, for all dogs that a person likes within a range of degree values, which is possible in SPARQL with the reification approach you've specified (see the sketch after this list), but not with this approach.
Attach the likesDogs property to the Person instance, if the assertion can be made against the person for all types/instances of Dog rather than for individual instances. This will, of course, depend on what you're trying to capture here; if it's the latter, then this also won't be appropriate.
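For reference, that range query over the reified form might look like this in SPARQL (a sketch; :like, :Pet, and :hasValue are hypothetical names, while rdf:subject/rdf:predicate/rdf:object is the standard reification vocabulary):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX :    <http://example.org/>
SELECT ?dog ?degree
WHERE {
  ?stmt rdf:subject   :I ;
        rdf:predicate :like ;
        rdf:object    ?dog ;
        :hasValue     ?degree .
  ?dog a :Pet .
  FILTER (?degree >= 40 && ?degree <= 80)   # an example degree range
}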
Good luck!
I think msalvadores gets it wrong.
Let's forget about the dogs and likes. What we are really doing here is:
a x b
axb y c
axb z d
where axb is the identifier of the a x b statement; a, b, c, and d are subjects or objects, and x, y, and z are predicates. What we need is to bind the a, x, and b resources to the axb statement somehow.
This is how reification does it:
axb subject a
axb predicate x
axb object b
which I think is very easy to understand.
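In Turtle, using the standard reification vocabulary, the abstract example above would be written as (a sketch, with an example prefix):
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix :    <http://example.org/> .

:axb a rdf:Statement ;
     rdf:subject   :a ;
     rdf:predicate :x ;
     rdf:object    :b ;
     :y :c ;
     :z :d .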
Let's check what msalvadores does:
:I :like [ a :LikeLevel ;
           :likeObject :dogs ;
           :likeScale 5.7 ] .
We can easily translate this to axb terms:
a x w
w type AxbSpecificObjectWrapper
w object b
w y c
which is just mimicking reification with lower-quality tools and more effort (you need a wrapper class and have to define an object property). The a x w statement makes no sense to me: I like a like-level, whose objects are dogs???
But how can I describe "the degree" to which I like dogs?
As far as I can tell with my very limited RDF knowledge, there are two ways to do this.
1.) use reification
stmt_1
    a LikeStatement
    subject I
    predicate like
    object dogs
    how_much "very much"
2.) instantiate a predicate class
I like_1 dogs
like_1
    a Like
    how_much "very much"
Which one you choose depends on your taste and on your actual vocabulary.
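For concreteness, option 2 might be written like this in Turtle (a sketch; :Like and :how_much are the hypothetical names above):
:I :like_1 :dogs .
:like_1 a :Like ;
        :how_much "very much" .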
How can I give a property or value to a predicate?
I don't think you understand the difference between a predicate and a statement. A great example is available here: Simple example of reification in RDF
Tolkien wrote Lord of the rings
Wikipedia said that
The statement here:
that: [Tolkien, wrote, LotR]
If we are making statements about the statement, we write something like this:
[Wikipedia, said, that]
If we are making statements about the predicate then we write something like this:
[Wikipedia, said, wrote]
I think there is a big difference. Reification is about making statements about statements not about predicates...
A sentence from Jena's documentation just caught my eye.
...OWL Full allows ... state the following .... construction:
<owl:Class rdf:ID="DigitalCamera">
  <rdf:type owl:ObjectProperty />
</owl:Class>
Does OWL Full really allow an ObjectProperty to be a Class as well?
If an ObjectProperty could also be a Class, and could have individuals, then I could describe a statement with
S_individual P_individual O_individual
and I could attach further properties to P_individual. Is that right, or am I missing something?
Since the following RDF is valid, a corresponding OWL representation should be achievable:
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://somewhere/">
  <rdf:Description rdf:about="http://somewhere/Dog_my_dog">
    <j.0:name>Lucky</j.0:name>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/like_dog">
    <j.0:degree>80</j.0:degree>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/Cat_my_cat">
    <j.0:name>Catty</j.0:name>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/like_cat">
    <j.0:degree>86</j.0:degree>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/Person_I">
    <j.0:name>Bob</j.0:name>
    <j.0:like_dog rdf:resource="http://somewhere/Dog_my_dog"/>
    <j.0:like_cat rdf:resource="http://somewhere/Cat_my_cat"/>
  </rdf:Description>
</rdf:RDF>
Is there something like if-then-else available in Ruta? I'd like to do something like:
if there's at least one term from catA, then label the document with "one"
else if there's at least one term from catB, then label the document with "two"
else label the document with "three".
All the best
Philipp
There is no language structure for if-then-else in UIMA Ruta (2.7.0).
You need to duplicate some parts of the rule in order to model the else part, e.g., something like the following:
Document{CONTAINS(CatA) -> One};
Document{-CONTAINS(CatA), CONTAINS(CatB) -> Two};
Document{-CONTAINS(CatA), -CONTAINS(CatB) -> Three};
You could also check if the previous rule has matched and depend on that.
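For example, that check might look like this (a sketch, reusing the One and Two annotations produced by the rules above):
Document{CONTAINS(CatA) -> One};
Document{-CONTAINS(One), CONTAINS(CatB) -> Two};
Document{-CONTAINS(One), -CONTAINS(Two) -> Three};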
How the rule should actually look depends mainly on the type system and on how you want to model the information (features?).
DISCLAIMER: I am a developer of UIMA Ruta
I think you are asking about if-else-if in Ruta. This is possible using "ONLYFIRST":
PACKAGE uima.ruta.example;
DECLARE CatA,CatB,CatC;
"CatA"->CatA;
"CatB"->CatB;
"CatC"->CatC;
DECLARE one,two,three;
ONLYFIRST Document{}{
    Document{CONTAINS(CatA) -> one};
    Document{CONTAINS(CatB) -> two};
    Document{CONTAINS(CatC) -> three};
}
I am new to NLP and need guidance on how to solve this problem.
I am working on a filtering technique where I need to flag data in a database as either correct or incorrect. I am given a structured data set, with columns and rows.
However, the filtering conditions are given to me in a text file.
An example filtering text file could be the following:
Values in the column ID which are bigger than 99
Values in the column Cash which are smaller than 10000
Values in the column EndDate that are smaller than values in StartDate
Values in the column Name that contain numeric characters
Any value that matches those conditions should be flagged as bad.
I want to extract those conditions and append them to the program that I've built so far.
For instance, for the conditions above, I would like to produce:
`if ID>99`
`if Cash<10000`
`if EndDate < StartDate`
`if Name LIKE %[1-9]%`
How can I achieve the above result using Stanford NLP (or any other NLP library)?
This doesn't look like a machine learning problem; it's a simple parser. You have a simple syntax, from which you can easily extract the salient features:
column name
relationship
target value or target column
The resulting "action rule" is simply removing the "syntactic sugar" words and converting the relationship -- and possibly the target value -- to its symbolic form.
Enumerate all of your critical words for each position in a lexicon. Then use basic string-manipulation operations in your chosen implementation language to find the three needed fields.
EXAMPLE
Given the data above, your lexicons might look like this:
column_trigger = "Values in the column"
relation_dict = {
    "are bigger than": ">",
    "are smaller than": "<",
    "contain": "LIKE",
    ...
}
value_desc = {
    "numeric characters": "%[1-9]%",
    ...
}
From here, use these items in standard parsing. If you're not familiar with that, look up the basics of a simple sentence grammar in your favourite programming language, with rules such as
SENTENCE => SUBJ VERB OBJ
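To make the idea concrete, here is a minimal Python sketch of that lexicon-driven parse; parse_condition() and its output format are illustrative, not a real API:

# Lexicons mirroring the hypothetical ones above.
relation_dict = {
    "are bigger than": ">",
    "are smaller than": "<",
    "contain": "LIKE",
}
value_desc = {
    "numeric characters": "%[1-9]%",
}

def parse_condition(sentence):
    prefix = "Values in the column "
    if not sentence.startswith(prefix):
        return None
    rest = sentence[len(prefix):]          # e.g. "ID which are bigger than 99"
    column, rest = rest.split(" ", 1)      # column name, then the relation clause
    for phrase, symbol in relation_dict.items():
        if phrase in rest:
            target = rest.split(phrase, 1)[1].strip()
            # the target can be a literal, a described value, or another column
            target = value_desc.get(target, target)
            if target.startswith("values in "):
                target = target[len("values in "):]
            return "if {} {} {}".format(column, symbol, target)
    return None

for line in [
    "Values in the column ID which are bigger than 99",
    "Values in the column EndDate that are smaller than values in StartDate",
    "Values in the column Name that contain numeric characters",
]:
    print(parse_condition(line))
# if ID > 99
# if EndDate < StartDate
# if Name LIKE %[1-9]%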
Does that get you going?
I'm trying to use get() to access a list element in R, but am getting an error.
example.list <- list()
example.list$attribute <- c("test")
get("example.list") # Works just fine
get("example.list$attribute") # breaks
## Error in get("example.list$attribute") :
## object 'example.list$attribute' not found
Any tips? I am looping over a vector of strings which identify the list names, and this would be really useful.
Here's the incantation that you are probably looking for:
get("attribute", example.list)
# [1] "test"
Or perhaps, for your situation, this:
get("attribute", eval(as.symbol("example.list")))
# [1] "test"
# Applied to your situation, as I understand it...
example.list2 <- example.list
listNames <- c("example.list", "example.list2")
sapply(listNames, function(X) get("attribute", eval(as.symbol(X))))
# example.list example.list2
# "test" "test"
Why not simply:
example.list <- list(attribute="test")
listName <- "example.list"
get(listName)$attribute
# or, if both the list name and the element name are given as arguments:
elementName <- "attribute"
get(listName)[[elementName]]
If your strings contain more than just object names - e.g., operators, as here - you can evaluate them as expressions as follows:
> string <- "example.list$attribute"
> eval(parse(text = string))
[1] "test"
If your strings are all of the type "object$attribute", you could also parse them into object/attribute, so you can still get the object, then extract the attribute with [[:
> parsed <- unlist(strsplit(string, "\\$"))
> get(parsed[1])[[parsed[2]]]
[1] "test"
flodel's answer worked for my application, so I'm going to post what I built on it, even though this is pretty uninspired. You can access each list element with a for loop, like so:
#========== List with five elements of non-uniform length ==========#
example.list <-
  list(letters[1:5], letters[6:10], letters[11:15], letters[16:20], letters[21:26])

#===== for loop that names and concatenates each consecutive element =====#
derp <- c()
for (i in 1:length(example.list)) {
  derp <- append(derp, eval(parse(text = example.list[i])))
}
derp  # Not a particularly useful application here, but it proves the point.
I'm using code like this for a function that pulls certain sets of columns from a data frame by column name. The user enters a list whose elements each represent a different set of column names (each set is a group of items belonging to one measure), along with the big data frame containing all those columns. The for loop applies each consecutive list element as the set of column names for an internal function* applied only to the currently named set of columns of the big data frame. Each iteration of the loop populates one column of an output matrix with the results for the subset of the big data frame named by that iteration's list element. After the loop, the function returns that matrix.
Not sure if you're looking to do something similar with your list elements, but I'm happy I picked up this trick. Thanks to everyone for the ideas!
"Second example" / tangential info regarding application in graded response model factor scoring:
Here's the function I described above, just in case anyone wants to calculate graded response model factor scores* in large batches. Each column of the output matrix corresponds to an element of the list (i.e., a latent trait with ordinal indicator items specified by column name in the list element), and the rows correspond to the rows of the data frame used as input. Each row should presumably contain mutually dependent observations, as from a given individual, to whom the factor scores in the same row of the output matrix belong. I should also add that if all the items in a given list element use the exact same Likert-scale rating options, the graded response model may be less appropriate for factor scoring than a rating scale model (cf. http://www.rasch.org/rmt/rmt143k.htm).
grmscores <- function(ColumnNameList, DataFrame) {
  require(ltm)  # (Rizopoulos, 2006)
  x <- matrix(NA, nrow = nrow(DataFrame), ncol = length(ColumnNameList))
  for (i in 1:length(ColumnNameList)) {  # flodel's magic featured below!
    cols <- eval(parse(text = ColumnNameList[i]))
    x[, i] <- factor.scores(grm(DataFrame[, cols]),
                            resp.patterns = DataFrame[, cols])$score.dat$z1
  }
  x
}
Reference
*Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses, Journal of Statistical Software, 17(5), 1-25. URL: http://www.jstatsoft.org/v17/i05/
I have CSV data (inherited - no choice here) which I need to use to create data type instances in Haskell. Parsing the CSV is no problem - tutorials and APIs abound.
Here's what 'show' generates for my simplified trimmed-down test-case:
JField {fname = "cardNo", ftype = "str"} (string representation)
I am able to do a read to convert this string into a JField data record. My CSV data is just the values of the fields, so the CSV row corresponding to JField above is:
cardNo, str
and I am reading these in as a list of strings: ["cardNo", "str"].
So - it's easy enough to brute-force the exact format of the "string representation" (but writing Java- or Python-style string formatting in Haskell isn't my goal here).
I thought of doing something like this (the first list is static, and the second list would be read from the CSV file):
let stp1 = zip ["fname = ", "ftype ="] ["cardNo", "str"]
resulting in
[("fname = ","cardNo"),("ftype =","str")]
and then concatenating the tuples - either explicitly with ++ or in some cleverer way yet to be determined.
This is my first simple piece of code outside of tutorials, so I'd like to know if this seems a reasonably Haskellian way of doing this, or what clearly better ways there are to build just this piece:
fname = "cardNo", ftype = "str"
I'm not expecting solutions (this is not homework; it's a learning exercise), but rather critique or guidelines for better ways to do this. Brute-forcing it would be easy but would defeat my objective, which is to learn.
I might be way off, but wouldn't a map be better here? I'm assuming that you read the file in with each row as a [String], i.e.
field11, field12
field21, field22
etc.
You could write
map (\[x, y] -> JField {fname = x, ftype = y}) rows
where rows is your input (note that data is a reserved word in Haskell, so it can't be used as a variable name). I think that would do it.
If you already have the value of the fname field (say, in the variable fn) and the value of the ftype field (in ft), just do JField {fname=fn, ftype=ft}. For non-String fields, just insert a read where appropriate.
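A minimal sketch of that, assuming the JField record from the question and rows already split into lists of strings:

data JField = JField { fname :: String, ftype :: String }
  deriving (Show, Read)

-- Build a JField from one CSV row, e.g. ["cardNo", "str"].
fromRow :: [String] -> JField
fromRow [fn, ft] = JField { fname = fn, ftype = ft }
fromRow other    = error ("unexpected row: " ++ show other)

-- ghci> fromRow ["cardNo", "str"]
-- JField {fname = "cardNo", ftype = "str"}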
I want to find all possible combinations of a given word. For example, if the given word is "the", then I need "t", "h", "e", "teh", and so on. I have to do this in Groovy; is there any method for it? If not, please give me an outline of the algorithm.
If you need subsets as well, you could do something like this:
("word" as List).subsequences()*.permutations().inject( [] ) { list, set ->
list.addAll( set )
list
}*.join().sort { it.length() }
which gives you the following output:
[o, d, r, w, dw, wd, do, od, dr, rd,
wr, rw, ow, wo, ro, or, owd, wod, wdo,
odw, dwo, dow, orw, owr, wor, wro,
rwo, row, dor, ord, odr, rdo, rod,
dro, wdr, rwd, drw, rdw, wrd, dwr,
wrdo, orwd, wrod, wodr, ordw, wdor,
rwod, wdro, word, owdr, rdow, drow,
drwo, rdwo, odwr, dorw, odrw, dowr,
dwro, rodw, dwor, owrd, rowd, rwdo]
Edit: changed the set.each to a list.addAll, as it should be faster (and reads a lot easier).
("word" as List).permutations()*.join() will generate all permutations, not including subsets. Permutations of every possible subset could use this.
Update: After reading Tim's answer, I could come up with this:
("word" as List).subsequences()*.permutations().collect{ it*.join() }.flatten().sort{ it.length() } (could go without .sort{...})