Can NLTK be used to analyse the sentiment a certain word has within a sentence?

I have a quick question and could not find the answer anywhere on the internet:
Can NLTK be used to analyze the sentiment a certain word has within a sentence?
For example, sentiment for "iPhone": "Even though it is terrible weather outside, my iPhone makes me feel good again." = Sentiment: positive

Have you thought of breaking down the text into clauses ("it is terrible weather outside", "my iPhone makes me feel good again"), and evaluating them separately? You can use NLTK's parsers for that. This will reduce the amount of text each judgment is based on, though, so it might end up doing more harm than good.
This won't help you in cases like "Microsoft Surface is no iPad, it's terrible" (where your target is "iPad"), since the sentiment is negative but the iPad wins the comparison. So perhaps you'll also want to check the syntactic analysis, and only examine sentences where your target word is the subject or object. Whether either of these will give you better performance is anybody's guess, I think.
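A minimal sketch of the clause idea, assuming a naive regex split instead of a real parser and a tiny made-up word list instead of a real sentiment lexicon. The names `TOY_LEXICON`, `split_clauses` and `target_sentiment` are hypothetical and not part of NLTK:

```python
import re

# Hypothetical toy lexicon for illustration only; a real system would use
# a proper sentiment lexicon or a trained classifier.
TOY_LEXICON = {"good": 1, "great": 1, "love": 1,
               "terrible": -1, "bad": -1, "awful": -1}

def split_clauses(sentence):
    """Naively split on commas, semicolons and a few conjunctions.

    This is a stand-in for a real parser and will mis-split many sentences.
    """
    parts = re.split(r"[,;]|\b(?:but|although|though|while)\b",
                     sentence, flags=re.IGNORECASE)
    return [p.strip() for p in parts if p.strip()]

def target_sentiment(sentence, target):
    """Sum lexicon scores over the clauses that mention the target word."""
    score = 0
    for clause in split_clauses(sentence):
        words = re.findall(r"[a-z']+", clause.lower())
        if target.lower() in words:
            score += sum(TOY_LEXICON.get(w, 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

example = ("Even though it is terrible weather outside, "
           "my iPhone makes me feel good again.")
print(target_sentiment(example, "iPhone"))
```

On the "Microsoft Surface is no iPad, it's terrible" sentence this sketch sees no sentiment word in the clause containing "iPad", so it cannot tell that the iPad wins the comparison, which is the limitation described above.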

I do not have much experience with NLTK, but I have done some concept-level sentiment analysis using NLP libraries in Java. Here is how I did it. The same approach should work for you if you are able to identify dependencies in NLTK. This approach works fine for simple rules but may not work well for complicated sentences.

Related

How to detect sentence stress by python NLP packages (spaCy or NLTK)?

Can we detect the sentence stress (the stress on some words or pauses between words in a sentence) using common NLP packages such as spaCy or NLTK?
How can we tell content words from structure words using spaCy or NLTK?
Since all NLP programs detect the dependencies, there should be a possibility to identify which words are stressed in natural speech.
I don't think that NLTK or spaCy support this directly. You can find content words with either tool, sure, but that's only part of the picture. You want to look for software related to prosody or intonation, which you might find as a component of a text-to-speech system.
Here's a very recently published research paper with code that might be a good place to start: https://github.com/Helsinki-NLP/prosody/ . The annotated data and the references could be useful even if the code might not be exactly the kind of approach you're looking for.
I assume you do not have a special training data set in which the words to stress are labeled. So I guess the simplest way would be to assume that stressed words all belong to the same parts of speech. I guess nouns and verbs would be a good start, excluding modal verbs for example.
NLTK comes with PoS taggers.
But as natural speech depends a lot on context, it's probably difficult for humans as well to agree on a single solution for what to stress in a sentence.
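The part-of-speech heuristic from this answer could be sketched roughly as follows. It assumes the tokens are already tagged with Penn Treebank tags (the kind of `(word, tag)` pairs that `nltk.pos_tag` returns once its tagger models are downloaded). The tag prefixes, the auxiliary list and the name `likely_stressed` are illustrative assumptions, not a linguistic standard:

```python
# Assumption: nouns (NN*) and verbs (VB*) count as content words.
# Modal verbs are tagged MD in the Penn Treebank tag set, so the
# prefix check already leaves them out.
CONTENT_PREFIXES = ("NN", "VB")

# Assumption: common auxiliaries are tagged VB* but are rarely stressed,
# so they are filtered by surface form. The list is illustrative only.
AUXILIARIES = {"be", "am", "is", "are", "was", "were", "been", "being",
               "have", "has", "had", "do", "does", "did"}

def likely_stressed(tagged_tokens):
    """Return the words whose tag suggests a content word."""
    return [word for word, tag in tagged_tokens
            if tag.startswith(CONTENT_PREFIXES)
            and word.lower() not in AUXILIARIES]

# Hand-tagged example, so the sketch runs without any NLTK data.
tagged = [("The", "DT"), ("cat", "NN"), ("can", "MD"),
          ("catch", "VB"), ("the", "DT"), ("mice", "NNS")]
print(likely_stressed(tagged))
```

This only approximates content words. It says nothing about context-dependent stress, which is the harder part of the problem.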

Elementary Sentence Construction

I am working on an NLP project and I am looking for a parser, written in C#, that constructs simple sentences from a complex one, since sentences may have a complex grammatical structure with multiple embedded clauses.
Any help?
Text summarisation and sentence simplification are very much an open research area. Wikipedia has articles about both; you can start from there. Beware: this is a hard problem, and chances are the state-of-the-art system is far worse than you might expect. There isn't an off-the-shelf piece of software that you can just grab to solve all your problems. You will have some success with the more basic sentences, but performance will degrade as your sentences get more complex. Have a look at the articles referenced on Wikipedia or google around to get an idea of what is possible. My impression is that most readily available software packages are for academic purposes and might take a bit of work to get running.

Evaluate the content of a paragraph

We are building a database of scientific papers and performing analysis on the abstracts. The goal is to be able to say "Interest in this topic has gone up 20% from last year". I've already tried keyword analysis and haven't really liked the results. So now I am trying to move on to phrases and the proximity of words to each other, and I realize I am in over my head. Can anyone point me to a better solution to this, or at the very least give me a good term to google to learn more?
The language used is python but I don't think that really affects your answer. Thanks in advance for the help.
It is a big subject, but a good introduction to NLP like this can be found with NLTK. It is intended for teaching and works with Python, i.e. it is good for dabbling and experimenting. There is also a very good open source book (also in paper form from O'Reilly) on the NLTK website.
This is just a guess; not sure if this approach will work. If you're looking at phrases and proximity of words, perhaps you can build up a Markov Chain? That way you can get an idea of the frequency of certain phrases/words in relation to others (based on the order of your Markov Chain).
So you build a Markov Chain and frequency distribution for the year 2009. Then you build another one at the end of 2010 and compare the frequencies (of certain phrases and words). You might have to normalize the text though.
Other than that, something that comes to mind is Natural-Language-Processing techniques (there is a lot of literature surrounding the topic!).
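As a rough illustration of the year-over-year comparison suggested in this answer, here is a sketch that counts word bigrams (the raw material for a first-order Markov chain) per year and compares their relative frequencies. The function names and the two tiny abstract lists are made up for the example, and the "normalization" is just lowercasing and keeping alphabetic tokens:

```python
import re
from collections import Counter

def bigram_frequencies(abstracts):
    """Map each word bigram to its share of all bigrams in the corpus."""
    counts = Counter()
    for text in abstracts:
        # Crude normalization: lowercase, alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(zip(tokens, tokens[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}

def relative_change(old, new, bigram):
    """Fractional change in a bigram's share, or None if it is new."""
    before = old.get(bigram, 0.0)
    after = new.get(bigram, 0.0)
    if before == 0.0:
        return None  # no baseline to compare against
    return (after - before) / before

# Hypothetical data standing in for one year of abstracts each.
abstracts_2009 = ["Deep learning is new", "We study graph theory"]
abstracts_2010 = ["Deep learning works", "Deep learning scales"]

freq_2009 = bigram_frequencies(abstracts_2009)
freq_2010 = bigram_frequencies(abstracts_2010)
print(relative_change(freq_2009, freq_2010, ("deep", "learning")))
```

Comparing shares rather than raw counts keeps the result from being dominated by one year simply having more papers than the other.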

Natural Language Processing Algorithm for mood of an email

One simple question (but I haven't quite found an obvious answer in the NLP material I've been reading, which I'm very new to):
I want to classify emails with a probability along certain dimensions of mood. Is there an NLP package out there specifically dealing with this? Is there an obvious starting point in the literature where I should start reading?
For example, if I got a short email something like "Hi, I'm not very impressed with your last email - you said the order amount would only be $15.95! Regards, Tom" then it might get 8/10 for Frustration and 0/10 for Happiness.
The actual list of moods isn't so important, but a short list of generally positive vs generally negative moods would be useful.
Thanks in advance!
--Trindaz on Fedang #NLP
You can do this with a number of different NLP tools, but to my knowledge nothing comes with it ready out of the box. Perhaps the easiest place to start would be with LingPipe (Java), and you can use their very good sentiment analysis tutorial. You could also use NLTK if Python is more your bent. There are some good blog posts over at Streamhacker that describe how you would use Naive Bayes to implement that.
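To give a feel for the Naive Bayes approach mentioned here, below is a small pure-Python sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not the LingPipe or Streamhacker code. The training emails, the mood labels and all function names are invented for illustration, and in practice a library implementation such as NLTK's `NaiveBayesClassifier` would be the more sensible choice:

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """examples: list of (text, label) pairs. Returns the counts needed."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict_proba(model, text):
    """Return a dict of label -> probability for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # class prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocab)))
        log_scores[label] = score
    # Convert log scores to probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {l: v / norm for l, v in exp_scores.items()}

# Hypothetical labeled emails; real use needs far more data.
training = [
    ("I'm not very impressed with your last email", "frustrated"),
    ("This is unacceptable and I am annoyed", "frustrated"),
    ("Thanks so much, great service", "happy"),
    ("I am delighted with the order", "happy"),
]
model = train(training)
print(predict_proba(model, "Not impressed, this is unacceptable"))
```

The probabilities from Naive Bayes tend to be pushed towards 0 or 1, so they are better read as a ranking of moods than as a calibrated score such as "8/10 for Frustration".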
Check out AlchemyAPI for sentiment analysis tools and scikit-learn or any other open machine learning library for the classifier.
If you have not yet decided to code the implementation yourself, you can also have the data classified by some other tool. Google Prediction API may be an alternative.
Either way, you will need some labeled data and will have to do the preprocessing. Using an existing tool may help you get better accuracy more easily, though.

NLP: Qualitatively "positive" vs "negative" sentence

I need your help in determining the best approach for analyzing industry-specific sentences (i.e. movie reviews) for "positive" vs "negative". I've seen libraries such as OpenNLP before, but it's too low-level - it just gives me the basic sentence composition; what I need is a higher-level structure:
- hopefully with wordlists
- hopefully trainable on my set of data
Thanks!
What you are looking for is commonly dubbed Sentiment Analysis. Typically, sentiment analysis is not able to handle delicate subtleties, like sarcasm or irony, but it fares pretty well if you throw a large set of data at it.
Sentiment analysis usually needs quite a bit of pre-processing: at least tokenization, sentence boundary detection and part-of-speech tagging. Sometimes syntactic parsing can be important. Doing it properly is an entire branch of research in computational linguistics, and I wouldn't advise coming up with your own solution unless you take your time to study the field first.
OpenNLP has some tools to aid sentiment analysis, but if you want something more serious, you should look into the LingPipe toolkit. It has some built-in SA-functionality and a nice tutorial. And you can train it on your own set of data, but don't think that it is entirely trivial :-).
Googling for the term will probably also give you some resources to work with. If you have any more specific question, just ask, I'm watching the nlp-tag closely ;-)
Some approaches to sentiment analysis use strategies popular in other text classification tasks. The most common is to transform your film review into a word vector and feed it into a classifier algorithm as training data. Most popular data mining packages can help you here. You could have a look at this tutorial on sentiment classification illustrating how to do an experiment using the open source RapidMiner toolkit.
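The "word vector" step can be sketched as a plain bag-of-words count vector. This is a minimal illustration with made-up reviews and hypothetical function names, not the RapidMiner workflow from the tutorial:

```python
import re

def build_vocabulary(documents):
    """Assign each distinct word a fixed column index (alphabetical)."""
    words = sorted({w for doc in documents
                    for w in re.findall(r"[a-z']+", doc.lower())})
    return {word: index for index, word in enumerate(words)}

def vectorize(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    vector = [0] * len(vocabulary)
    for word in re.findall(r"[a-z']+", document.lower()):
        index = vocabulary.get(word)
        if index is not None:  # words outside the vocabulary are dropped
            vector[index] += 1
    return vector

# Hypothetical training reviews.
reviews = ["A great film", "A dull film"]
vocabulary = build_vocabulary(reviews)
print(vocabulary)
print(vectorize("Great, great film!", vocabulary))
```

Each review becomes a fixed-length list of counts, which is the form most classifier implementations expect as input, paired with a "positive" or "negative" label per review.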
Incidentally, there is a good data set made available for research purposes related to detecting opinion on film reviews. It is based on IMDB user reviews, and you can check many related research work on the area and how they use the data set.
It's worth bearing in mind that the effectiveness of these methods can only be judged from a statistical viewpoint, so you can pretty much assume there will be misclassifications and cases where opinion is hard to detect. As already noted in this thread, detecting things like irony and sarcasm can be very difficult indeed.
