I have just created a class hierarchy for a smartphone ontology using Protégé. What do I have to do next? I need to use that ontology to retrieve tweets, so please help me understand how to develop a simple, complete ontology. If possible, please mention your mail ID for further contact. I need to develop this ontology for my final-year project.
That's hard to answer, you might use any of the text analysis frameworks like GATE to find an association between your ontology and the tweet text, or parts of it, like any tags. But that's very open ended - do you have an example of a tweet, and of a query for which you'd expect the tweet to be returned? That would help speculate on a solution.
In Enterprise Architect I'm trying to model my Business Process through the Eriksson-Penker Business Modelling Profile which looks like this:
Everything goes well except for the Output element on the bottom right.
For some reason it doesn't exist in the Toolbox:
How can I get this Output Element here? I'm searching and searching in different toolboxes but I can't find it. Some help would be much appreciated!
This is a simple Object. Choose Other/UML/Object/Object from the toolbox and name it Output. It will appear underlined as in your diagram.
P.S. I see that the EP toolbox has an Object already. Use that and you're done.
I just found out that if you choose 'New Model from Pattern' in the project browser, browse to 'Business', and then select the Eriksson-Penker Diagram, it creates the entire diagram for you and you only have to change the descriptions. So case closed!
Does someone have a recommendation for a tagging tool for NER types in raw text?
The input for the tool should be a library of text files (simple .txt format), there should be a convenient UI for selecting words and setting the tag/annotation for the selection, and the output should be a structured representation of the tags (e.g. start index, last index, and tag, in JSON format).
Founder of LightTag here.
We provide a super convenient interface to do span annotations such as named entity recognition, classifications and relationships.
You can work as one labeler or bring in a team, and LightTag will distribute work between everyone automatically (no more selecting files and remembering what you labeled already).
You can upload your own suggestions and let labelers use those, or use LightTag's built-in model.
Of course, you can annotate at the character level and highlight subwords or multi-word phrases.
You can try https://github.com/lasigeBioTM/MER (bash)
see the demo at http://labs.fc.ul.pt/mer/
Online tools:
I guess Dataturks' POS tool should work fine for your use case, you can just upload your data and specify the labels. The UI seems convenient enough.
Here is the link:
https://dataturks.com
It's an online tool, so you can work with multiple people to get the tagging done.
The exact output format you are looking for is not supported, but the format can easily be converted to what you are looking for, the output is like: word___LABEL word2___LABEL , so a simple 2-line script can convert it to start and end index.
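A minimal sketch of such a conversion script, assuming the export really is whitespace-separated `word___LABEL` tokens as described above and that untagged words carry no separator (the function name `tagged_to_spans` is made up for illustration, and the real export may differ):

```python
import json


def tagged_to_spans(tagged_line, sep="___"):
    """Convert 'word___LABEL' tokens into plain text plus character spans."""
    words, spans, offset = [], [], 0
    for token in tagged_line.split():
        if sep in token:
            word, label = token.rsplit(sep, 1)
        else:
            word, label = token, None  # assumption: untagged words have no separator
        if label:
            # "end" is exclusive, so text[start:end] gives back the word
            spans.append({"start": offset, "end": offset + len(word), "tag": label})
        words.append(word)
        offset += len(word) + 1  # +1 for the single space used when rejoining
    return " ".join(words), spans


text, spans = tagged_to_spans("Tim___PERSON works at Apple___ORG")
print(json.dumps({"text": text, "spans": spans}, indent=2))
```

Note that the offsets refer to the text rejoined with single spaces, so they only match the original file if it used single spaces between words.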
Offline:
Another tool you can check out is Prodigy. It's downloadable software and does similar things; the catch is that you have to pay for it upfront.
https://prodi.gy
TLDR: I wanna build multi-language search on my website ala Pinterest, how do I do that?
I am starting a website, where people can publish content that gets metadata typed by the user. People can then interact with the content by looking at it, liking it, commenting on it, sharing it to social media. Also content discovery is mostly done through search.
I do not wish to create geographic boundaries on my website. I would like people who speak any language to find content that is relevant to them in any language. This requirement makes sense because the content is highly visual, ala Pinterest. So even if I don't understand that the word "car" is written in French in the description, it's fine because I'll mostly be interested in seeing the car.
Pinterest is really really good with search across language. For example, on uk.pinterest.com I typed "coupe carrée" which is the French for "bob haircut" and all the results are visually relevant. Even if the pin metadata is in English and the original web site is all in English.
How is that possible? How was Pinterest able to match my French search query to content whose text is all in English? Is there translation at some step: coupe carrée > bob haircut > content containing "bob haircut"?
I looked at their engineering blog and all I found is tech to detect the original country and language of a website. Nothing about managing language in search.
Please let me know if this is the wrong place to ask this how-it-works question.
Thanks in advance for any help/pointers you will be able to share!
The general strategy in this case is to index your content with every language translation you wish to search.
This would require a language translation API at index time, along with a language identification model. Here's a Solr example.
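Independent of Solr, the index-time idea can be sketched in plain Python. This is only a toy illustration: `translate` is a hypothetical stand-in for a real translation API backed by a one-entry lookup table, each document is assumed to already carry the language a language identification model would have detected, and matching is whole-phrase only.

```python
from collections import defaultdict

# Hypothetical stand-in for a real translation API: a tiny lookup table.
TOY_TRANSLATIONS = {("en", "fr"): {"bob haircut": "coupe carrée"}}


def translate(text, source, target):
    """Return the translated text, or None if the toy table has no entry."""
    if source == target:
        return text
    return TOY_TRANSLATIONS.get((source, target), {}).get(text.lower())


def build_index(docs, target_languages):
    """Index every document under its text in each target language."""
    index = defaultdict(set)
    for doc_id, (lang, text) in docs.items():
        for target in target_languages:
            translated = translate(text, lang, target)
            if translated:
                index[translated.lower()].add(doc_id)
    return index


def search(index, query):
    """Exact phrase lookup; a real engine would tokenize and rank."""
    return index.get(query.lower(), set())


# The language code is assumed to come from a language identification step.
docs = {1: ("en", "Bob haircut")}
index = build_index(docs, ["en", "fr"])
print(search(index, "coupe carrée"))
```

Because the translations are stored at index time, the French query matches the English document without any translation at query time.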
My question is not about parsing.
I have been looking through the Wikipedia API. I need to search for companies and get a one-sentence summary. It's working well; the only problem I have is when I need to disambiguate. It's hard for my code to know whether "Dropbox (service)" or "Dropbox (band)" is the Dropbox company my user is looking for.
I tried to put the word "company" in the query, expecting it to work like a Google search, but unfortunately it didn't.
So my question is: is there an easy way to disambiguate the results I get by telling Wikipedia that it is a "company" that I want?
If you're looking for companies only, then consider using their full names instead of short forms. In the case of Dropbox, the name of the company is Dropbox, Inc. If you search for Dropbox, Inc in Wikipedia, you will be redirected to the page Dropbox (service), which I believe is the page you're looking for.
If you don't have the resources to get the company name in the exact format, then consider using Category:Companies to refine your results further.
When you get to the page, you can mine for the extract of the company by using the Mediawiki API as follows
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Dropbox%20(service)
Note: The extract is called section0 in MediaWiki
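For reference, a small Python sketch of the same request using only the standard library. The query parameters are the ones from the URL above; the helper names and the User-Agent string are placeholders chosen for illustration, and the parsing assumes the usual `query.pages.<pageid>.extract` layout of the JSON response.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title):
    """Build the same query as the URL above for an arbitrary page title."""
    params = {
        "format": "json",
        "action": "query",
        "prop": "extracts",
        "exintro": "",
        "explaintext": "",
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


def parse_extract(response):
    """Pull the plain-text intro out of a decoded JSON response."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return None  # page missing or no extract returned


def fetch_extract(title):
    """Perform the HTTP request (needs network access)."""
    # Placeholder User-Agent; replace with something identifying your app.
    request = Request(build_extract_url(title),
                      headers={"User-Agent": "extract-example/0.1"})
    with urlopen(request) as resp:
        return parse_extract(json.load(resp))


# Usage (requires network): print(fetch_extract("Dropbox (service)"))
```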
I recommend trying Wikidata. Wikidata is a multilingual factual database of everything, and it has a query interface at query.wikidata.org. The query language the interface uses is called SPARQL. For instance, if you're interested in a list of well-known cats, https://w.wiki/W4W is your query. More details can be found at https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service.
import wikipedia
print(wikipedia.summary("COMPANY_NAME"))
Try to filter out the companies by categories - there is a list provided in the end of the page:
# Load the page, then inspect its title and the categories it belongs to
page = wikipedia.page("Dropbox")
print(page.title)
print(page.categories)
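One possible way to act on the categories, sketched under the assumption that company pages carry at least one category whose name contains "compan" (e.g. "Companies based in ..."). The helper names and the sample category lists below are made up for illustration; in practice the lists would come from `page.categories`.

```python
def looks_like_company(categories, keywords=("compan",)):
    """True if any category name contains one of the keywords (case-insensitive)."""
    return any(k in c.lower() for c in categories for k in keywords)


def pick_company(candidates):
    """Return the first candidate title whose categories look like a company's.

    `candidates` maps a page title to its list of category names.
    """
    for title, categories in candidates.items():
        if looks_like_company(categories):
            return title
    return None


# Made-up category lists, standing in for wikipedia.page(title).categories
candidates = {
    "Dropbox (band)": ["American rock music groups"],
    "Dropbox (service)": ["Companies based in San Francisco", "File hosting"],
}
print(pick_company(candidates))
```

A substring check like this is crude, so it is worth reviewing which categories your real pages return before relying on it.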
I need to fetch the OOB approval workflow from the current web to associate it with my custom list.
I tried to fetch it through BaseId, but then I found out that the BaseId is not the same for different languages.
Is there any other way to fetch the OOB approval workflow's template for sites created in different languages?
I have also tried the GetTemplateByName method, but it returns null for any language other than English:
web.WorkflowTemplates.GetTemplateByName("Approval - SharePoint 2010", CultureInfo.CurrentCulture)
Any help would be appreciated.
Thanks
Sanjay
I solved this problem myself.
As I mentioned in my question, the BaseId varies from language to language. After trying different methods and googling, I didn't find any solution to the problem, so I started investigating it myself.
During my investigation, I fetched the BaseId of the approval workflow for different languages, and soon a pattern emerged. It turned out that the BaseId (a Guid) is almost the same for every language except for the last three digits, and those three digits are the hexadecimal representation of the LCID!
So, this is how the template's base ID can be formed:
string baseId = "8ad4d8f0-93a7-4941-9657-cf3706f00" + web.Language.ToString("X");
Guid workflowBaseId = new Guid(baseId);
So, once we have the base ID, we can use the same method to retrieve the template:
web.WorkflowTemplates.GetTemplateByBaseID(workflowBaseId);
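To make the pattern concrete, here is the same string construction sketched in Python, outside of SharePoint. The prefix is the one from the snippet above; the function name is made up, and the length check is an added assumption, since an LCID whose hex form is not exactly three digits would not produce a valid GUID with this prefix.

```python
import uuid

# Common prefix observed in the answer; only the last three hex digits vary.
BASE_ID_PREFIX = "8ad4d8f0-93a7-4941-9657-cf3706f00"


def approval_workflow_base_id(lcid):
    """Append the LCID in hexadecimal to the shared prefix and parse it as a GUID."""
    suffix = format(lcid, "X")
    if len(suffix) != 3:
        # Assumption: the observed pattern only covers three-hex-digit LCIDs.
        raise ValueError(f"LCID {lcid} does not fit the three-digit pattern")
    return uuid.UUID(BASE_ID_PREFIX + suffix)


# English (United States) has LCID 1033, which is 0x409.
print(approval_workflow_base_id(1033))  # 8ad4d8f0-93a7-4941-9657-cf3706f00409
```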
I also wrote a blog post about this, in case you would like to see the entire piece of code for the workflow association: http://sanjaybhagia.wordpress.com/2013/10/20/associating-oob-approval-workflow-with-custom-list-for-different-locale-lcids/
Hope it helps.
Regards,
Sanjay