What is the difference between abstract and comment in DBpedia

What is the difference between the abstract and the comment of a DBpedia resource?
Is one of them just a shorter description than the other?

DBpedia abstracts usually* include the first paragraph of a Wikipedia page as simple text. Comments are substrings of abstracts limited to two sentences.
(*) "usually" because sometimes text extraction fails, and no abstract is produced.
(disclaimer: I am a DBpedia developer)
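To compare the two fields for a given resource, a query along the following lines could be used. This is a minimal sketch: the answer above does not name the properties, so `dbo:abstract` and `rdfs:comment` are the assumption here, as are the public endpoint URL and the helper names `build_query` and `fetch`.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed: public DBpedia SPARQL endpoint


def build_query(resource: str, lang: str = "en") -> str:
    """Return a SPARQL query selecting both descriptions of one resource."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?abstract ?comment WHERE {{
  <http://dbpedia.org/resource/{resource}> dbo:abstract ?abstract ;
                                           rdfs:comment ?comment .
  FILTER (lang(?abstract) = "{lang}" && lang(?comment) = "{lang}")
}}
""".strip()


def fetch(resource: str, lang: str = "en") -> dict:
    """Run the query over HTTP and return the abstract and comment, if any."""
    params = urllib.parse.urlencode(
        {"query": build_query(resource, lang),
         "format": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(f"{ENDPOINT}?{params}") as response:
        bindings = json.load(response)["results"]["bindings"]
    if not bindings:  # e.g. text extraction failed and no abstract was produced
        return {}
    row = bindings[0]
    return {name: row[name]["value"] for name in ("abstract", "comment")}
```

Calling `fetch("Berlin")` needs network access; comparing the two returned strings should show the comment as the shorter leading part of the abstract.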

Related

How can I determine the exact meaning of a word in a sentence? [duplicate]

I am using WordNet to find synonyms of ontology concepts. How can I choose the appropriate sense for an ontology concept? For example, the ontology concept "conference" has the following synsets in WordNet:
The noun conference has 3 senses (first 3 from tagged texts)
(12) conference -- (a prearranged meeting for consultation or exchange of information or discussion (especially one with a formal agenda))
(2) league, conference -- (an association of sports teams that organizes matches for its members)
(2) conference, group discussion -- (a discussion among participants who have an agreed (serious) topic)
Now the 1st and 3rd synsets have the appropriate sense for my ontology concept. How can I choose only these two from WordNet?
The technology you're looking for is in the direction of semantic disambiguation / representation.
The most "traditional approach" is Word Sense Disambiguation (WSD), take a look at
https://en.wikipedia.org/wiki/Word-sense_disambiguation
https://stackoverflow.com/questions/tagged/word-sense-disambiguation
Anyone know of some good Word Sense Disambiguation software?
Then comes the next generation of Word Sense induction / Topic modelling / Knowledge representation:
https://en.wikipedia.org/wiki/Word-sense_induction
https://en.wikipedia.org/wiki/Topic_model
https://en.wikipedia.org/wiki/Knowledge_representation_and_reasoning
Then comes the most recent hype:
Word embeddings, vector space models, neural nets
Sometimes people skip the semantic representation and go directly to text similarity, comparing pairs of sentences for their differences/similarities before getting to the ultimate aim of the text processing.
Take a look at "Normalize ranking score with weights" for a list of STS (semantic textual similarity) related work.
In the other direction, there's:
ontology creation (Cyc, Yago, Freebase, etc.)
semantic web (https://en.wikipedia.org/wiki/Semantic_Web)
semantic lexical resources (WordNet, Open Multilingual WordNet, etc.)
Knowledge base population (http://www.nist.gov/tac/2014/KBP/)
There's also a recent task on ontology induction / expansion:
http://alt.qcri.org/semeval2015/task17/
http://alt.qcri.org/semeval2016/task13/
http://alt.qcri.org/semeval2016/task14/
Depending on the ultimate task, any of the above technologies might help.
You can also try Babelfy, which provides Word Sense Disambiguation and Named Entity Disambiguation.
Demo:
http://babelfy.org/
API:
http://babelfy.org/guide
Take a look at this list: 100 Best GitHub: Word-sense Disambiguation
and search for WordNet - there are several appropriate libraries.
I haven't used any of them, but this one seems promising, because it is based on a classic yet effective idea (namely, the Lesk algorithm) upgraded with modern word-embedding methods. Actually, before finding it, I was going to suggest trying almost the same ideas.
Note also that all methods try to find the meaning (a WordNet synset, in your case) that is most similar to the context of the current word/collocation, so it is crucial to have context for the words you're trying to disambiguate. For example, the words can come from some text, and most libraries rely on that.
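The Lesk idea mentioned above can be sketched in a few lines: score each sense by how many content words its gloss shares with the context, then keep the senses that overlap at all. This is only a simplified, plain-Python illustration, not the library referred to above. The glosses are copied from the question; the stopword list and the `CONTEXT` string (standing in for a description of the ontology concept) are assumptions made for the example.

```python
import re

# Small hand-picked stopword list (assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "of", "for", "or", "and", "with", "on", "in",
             "who", "have", "its", "that", "one", "among", "where", "to"}

# Glosses copied from the WordNet listing in the question, keyed by sense number.
GLOSSES = {
    1: "a prearranged meeting for consultation or exchange of information "
       "or discussion (especially one with a formal agenda)",
    2: "an association of sports teams that organizes matches for its members",
    3: "a discussion among participants who have an agreed (serious) topic",
}


def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def rank_senses(context, glosses):
    """Return (sense, overlap) pairs, highest gloss/context overlap first."""
    ctx = content_words(context)
    scores = {sense: len(ctx & content_words(gloss))
              for sense, gloss in glosses.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical description of the ontology concept, used as the context.
CONTEXT = ("meeting where researchers present papers, with discussion "
           "among participants on an agreed topic")

ranking = rank_senses(CONTEXT, GLOSSES)
selected = sorted(sense for sense, overlap in ranking if overlap > 0)
print(ranking)   # [(3, 4), (1, 2), (2, 0)]
print(selected)  # [1, 3]
```

With this context the sports sense gets no overlap and is dropped, which is the selection the question asks for. Without any context for the concept, the overlaps are all zero and nothing can be chosen, which is why context matters so much.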

concept extraction using Wordnet

I want to know how I can use WordNet to extract concepts from a text document. Earlier I used a bag-of-words approach to measure similarity between text documents; however, I want to use the semantic information in the text, and therefore want to extract concepts from the documents. I understand that WordNet offers synsets, which contain synonyms for a given word.
However, what I am trying to work out is how I can use this information to define a concept in the textual data. I wonder whether I need to define the list of concepts separately and manually before using synsets, and then compare those concepts with the synsets.
Any suggestion or link is appreciated.
I think you'll find that there are too many concepts out there for enumerating all of them yourself to be practical. Instead, you should consider using a pre-existing source of knowledge such as Wikidata, Wikipedia, Freebase, the content of Tweets, the web at large, or some other source as the basis for constructing your concepts. You may find clustering algorithms useful for defining these. In terms of synonyms... words related to a concept may not necessarily be synonymous (e.g. both love and hate may be connected to the same concept regarding an intensity of emotion towards someone else) and some words could belong to multiple concepts (e.g. wedding could be in both the love and in the marriage concept), so I'd suggest having some linkage from synset to concept that isn't strictly 1:1.
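The "not strictly 1:1" linkage suggested above could look roughly like this. It is a sketch only: the mapping is hand-made from the love/hate/wedding examples in this answer, the concept names are hypothetical, and plain words are used as keys where a real pipeline would use synset identifiers obtained after disambiguation.

```python
import re
from collections import Counter

# Hypothetical many-to-many linkage: one word (or synset) may feed several
# concepts, and one concept may be fed by non-synonymous words.
WORD_TO_CONCEPTS = {
    "love": {"emotion_intensity", "love"},
    "hate": {"emotion_intensity"},
    "wedding": {"love", "marriage"},
    "marriage": {"marriage"},
}


def concept_counts(text):
    """Count how often each concept is evoked by the words of a text."""
    counts = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        for concept in WORD_TO_CONCEPTS.get(token, ()):
            counts[concept] += 1
    return counts


print(concept_counts("They love the wedding, but hate the marriage paperwork"))
```

The resulting concept counts can replace (or complement) the bag-of-words vector when measuring document similarity.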

Performing semantic analysis in text

I want to perform semantic analysis on some text, similar to YAGO[1]. But the text has no structure I can use to identify entities and relationships. One way is to use POS tagging and then identify subjects and predicates in the sentences[2]. But I still cannot establish what relationships exist between them.
How should I go about this?
For example:
Albert Einstein was born in 1879.
Should result in:
AlbertEinstein BORNIN 1879
subject relation predicate
My aim is to look for better approaches to find subjects, predicates and relationships in raw text.
What you are trying to do is essentially Natural Language Understanding, a subfield of Natural Language Processing, which in turn is a subfield of Computational Linguistics (NLP is often thought of as its engineering arm).
You could do semantic parsing or relation extraction. Either is fine for this task. I read through Suchanek et al. (2007), and you will realise that it is ontology based: the relations are extracted into a predefined ontological template where axioms are predefined (e.g. BORNIN). I personally think this is far too restrictive for general intelligence, but it works great for weak AI problems [narrow domains]. Much more interesting work has been happening over the years, such as ontology-driven information extraction, where the algorithms are trained on the ontology rather than on a corpus annotated with an ontology. One PhD study that comes to mind is the McDowell thesis, along with the Yildiz & Miksch (2007) paper.
Regardless, and without going off topic, there is a really interesting open source Python project called iepy, GUI driven and based on Django, currently being developed by a firm called Machinalis. It allows for rule-based and machine-learning-based information extraction. I highly recommend you check it out; I have tried and tested it myself. Also, I'm not affiliated with this company.
https://github.com/machinalis/iepy
According to the documentation:
IEPY is an open source tool for Information Extraction focused on
Relation Extraction.
To give an example of Relation Extraction, if we are trying to find a
birth date in:
"John von Neumann (December 28, 1903 – February 8, 1957) was a
Hungarian and American pure and applied mathematician, physicist,
inventor and polymath." then IEPY's task is to identify "John von
Neumann" and "December 28, 1903" as the subject and object entities of
the "was born in" relation.
It's aimed at: users needing to perform Information Extraction on a
large dataset. scientists wanting to experiment with new IE
algorithms.
The task you are attempting to solve is called relation extraction, while semantic analysis has a much broader meaning (honestly, I can't say for sure what it means now).
Relation extraction is an open research problem, so I suggest reviewing the field: for example, start from chapter 2.3 of the Mining Text Data book or the A Review of Relation Extraction paper (which is a little older, from 2007). Then continue the research by following citing or cited-by links; finally, try to implement the approach that looks most promising. For example, if you know that your data is rather formal (all sentences are short and share a similar strict structure), then try something like pattern-based approaches, and so on.
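For formal, rigidly structured sentences, a pattern-based approach can be as simple as one regular expression per relation. The following is a minimal sketch for the example in the question; the single `BORNIN` pattern, the year-only object and the function name `extract_triples` are illustrative assumptions, and real data would need many more patterns.

```python
import re

# One hand-written surface pattern per relation (hypothetical, BORNIN only):
# a run of capitalised words, the literal phrase "was born in", a 4-digit year.
PATTERNS = {
    "BORNIN": re.compile(
        r"(?P<subject>[A-Z][a-z]+(?: [A-Z][a-z]+)*) was born in (?P<object>\d{4})"
    ),
}


def extract_triples(text):
    """Return (subject, relation, object) triples found by the patterns."""
    triples = []
    for relation, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            subject = match.group("subject").replace(" ", "")
            triples.append((subject, relation, match.group("object")))
    return triples


print(extract_triples("Albert Einstein was born in 1879."))
# [('AlbertEinstein', 'BORNIN', '1879')]
```

Such patterns are precise but brittle: any rewording ("born 1879 in Ulm") falls through, which is where the learning-based methods from the literature come in.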
Stanford parser can do it :) You need to look at the dependency parser though. Have a look at the bottom of this page: http://nlp.stanford.edu/software/lex-parser.shtml:
subject: nsubj(snapped, rain),
or direct object: dobj(shut, hub)
...
Or have a look at this page (Stanford Dependencies): http://nlp.stanford.edu/software/stanford-dependencies.shtml
And to understand the annotations have a look at this: http://nlp.stanford.edu/software/dependencies_manual.pdf
And for your particular example, use the Stanford "collapsed" dependency parser, which for a given sentence will produce predicates like born_in(Einstein,1879), which is very similar to what you want.
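How typed dependencies turn into such a predicate can be sketched as follows. The input lines are an assumption about what collapsed output for the example sentence looks like (a passive subject relation plus a collapsed preposition such as `prep_in`), written by hand in the `relation(governor, dependent)` notation shown above rather than produced by running the parser; the helper names are hypothetical.

```python
import re

# Matches the textual notation relation(governor, dependent).
DEPENDENCY = re.compile(r"(\w+)\((\w+),\s*(\w+)\)")


def parse_dependencies(lines):
    """Turn 'rel(gov, dep)' strings into (rel, gov, dep) tuples."""
    deps = []
    for line in lines:
        match = DEPENDENCY.fullmatch(line.strip())
        if match:
            deps.append(match.groups())
    return deps


def to_triples(deps):
    """Join each verb's subject with its collapsed prepositional objects."""
    subjects = {gov: dep for rel, gov, dep in deps
                if rel in ("nsubj", "nsubjpass")}
    triples = []
    for rel, gov, dep in deps:
        if rel.startswith("prep_") and gov in subjects:
            # e.g. verb "born" + "prep_in" gives the predicate "born_in"
            triples.append((subjects[gov], f"{gov}_{rel[len('prep_'):]}", dep))
    return triples


# Assumed (hand-written) collapsed dependencies for the example sentence.
ASSUMED_OUTPUT = [
    "nn(Einstein, Albert)",
    "nsubjpass(born, Einstein)",
    "auxpass(born, was)",
    "prep_in(born, 1879)",
]

print(to_triples(parse_dependencies(ASSUMED_OUTPUT)))
# [('Einstein', 'born_in', '1879')]
```

Relation labels differ between parser versions and annotation schemes, so the label names checked in `to_triples` would need adjusting to whatever the parser actually emits.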

How to represent repository pattern in UML?

How can the Repository pattern be represented in UML?
Is there any stereotype that can be used to describe the repository pattern? I am using Enterprise Architect to create diagrams. I am specifically looking for a class diagram representation.
According to Martin Fowler, P of EAA, p. 322:
(However, you must have already found this since it's the first hit on Google.)
Based on this example (and the text from P of EAA), this roughly translates to the following DCD:
jensgram has already provided an answer on how to represent the pattern as classes.
When it comes to using patterns in EA, you can quite easily create them yourself using Save UML Pattern under the Diagram - Advanced menu. This saves an XML representation of the pattern.
You import the pattern for use in your project either using the Resources window or by creating an MDG Technology (more complex, but a much better alternative for medium and large-scale deployments).
Unfortunately, the one UML diagram type where EA does not support pattern creation is the sequence diagram.
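For the class-diagram side of the question, it may help to see the classes such a diagram usually contains written out as code. This is a rough sketch of a typical Repository structure, not a reproduction of the diagram from P of EAA; `Person`, `PersonRepository`, `InMemoryPersonRepository` and the `Criteria` alias are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Person:
    """Hypothetical domain object held by the repository."""
    id: int
    name: str


# A criteria object, reduced here to a predicate over the domain object.
Criteria = Callable[[Person], bool]


class PersonRepository(ABC):
    """Collection-like interface the domain layer depends on."""

    @abstractmethod
    def add(self, person: Person) -> None: ...

    @abstractmethod
    def matching(self, criteria: Criteria) -> List[Person]: ...


class InMemoryPersonRepository(PersonRepository):
    """One concrete strategy; a database-backed one would sit alongside it."""

    def __init__(self) -> None:
        self._items: Dict[int, Person] = {}

    def add(self, person: Person) -> None:
        self._items[person.id] = person

    def matching(self, criteria: Criteria) -> List[Person]:
        return [p for p in self._items.values() if criteria(p)]


repo = InMemoryPersonRepository()
repo.add(Person(1, "Ada"))
repo.add(Person(2, "Grace"))
print(repo.matching(lambda p: p.name.startswith("A")))
```

In a class diagram this maps to an interface (`PersonRepository`), a realization (`InMemoryPersonRepository`), and a dependency from both on the domain class and the criteria type.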
