I am looking to measure the level of confusion in a series of short texts. Does anyone know whether a "confusion" dictionary exists, similar to the sentiment dictionaries that are readily available? Basically, I'm looking for a collection of words or phrases that signal confusion. Any help would be greatly appreciated. Thanks.
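If no ready-made lexicon turns up, one low-tech starting point is to count matches against a hand-built cue list. This is only a minimal sketch: the cue phrases and the `confusion_score` function are illustrative placeholders, not a validated dictionary, and the per-word normalisation is just one possible choice.

```python
import re

# Hypothetical cue list: illustrative placeholders, not a validated lexicon.
CONFUSION_CUES = [
    "confused", "confusing", "unclear", "don't understand",
    "not sure", "makes no sense", "what do you mean", "lost",
]

def confusion_score(text, cues=CONFUSION_CUES):
    """Return the number of cue matches per word of text (0.0 for empty text)."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    # Count whole-word (or whole-phrase) occurrences of every cue.
    hits = sum(
        len(re.findall(r"\b" + re.escape(cue) + r"\b", lowered))
        for cue in cues
    )
    return hits / len(words)

print(confusion_score("I'm not sure, this makes no sense"))
```

A list like this would need to be checked against hand-labelled examples before the scores mean much.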
I'm using GPT-3 for some experiments where I prompt the language model with tests from cognitive science. The tests have the form of short text snippets. Now I'd like to check whether GPT-3 has already encountered these text snippets during training. Hence my question: Is there any way to sift through GPT-3's training text corpora? Can one find out whether a certain string is part of these text corpora?
Thanks for your help!
I don't think that's possible, unfortunately. GPT-3's training corpora are private.
But if it were possible, it would be great for detecting plagiarism. Maybe ask it whether it knows where a certain line of text came from?
Can someone tell me whether the library pysimilar measures lexical or semantic similarity between input texts?
I want to find the lexical similarity between two sentences. Any better suggestions or libraries would be greatly appreciated.
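For purely lexical similarity, the standard library may already be enough. Here is a minimal sketch of two common measures: Jaccard overlap on word sets, and a character-level ratio from `difflib`. The function names are my own, and neither measure captures meaning, only surface overlap.

```python
import re
from difflib import SequenceMatcher

def tokens(sentence):
    """Lower-case the sentence and return its set of word tokens."""
    return set(re.findall(r"\w+", sentence.lower()))

def jaccard_similarity(a, b):
    """Shared words divided by all distinct words across both sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)

def char_similarity(a, b):
    """Character-level similarity ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

print(jaccard_similarity("the cat sat", "the cat ran"))
print(char_similarity("the cat sat", "the cat ran"))
```

Jaccard ignores word order and repetition, while the `difflib` ratio is sensitive to both, so which one fits depends on what "lexical" should mean for your data.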
I am trying to analyze text and sentiment data, but I don't want an extensive analysis. I just want a basic distribution of good, neutral and bad, and the percentage of each category.
Can anyone give me some advice or suggestions?
Thank you all!
WinkNLP can measure sentiment on a scale of -1 to +1 for the entire document or its sentences. Here is an Observable notebook with a live example.
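Once each text has a score on a -1 to +1 scale, the good/neutral/bad distribution is a matter of thresholding and counting. A minimal sketch in Python, assuming the scores have already been computed by some tool; the 0.05 cut-off and the function names are arbitrary assumptions, not something any particular library prescribes.

```python
from collections import Counter

def label(score, threshold=0.05):
    """Map a score in [-1, 1] to 'good', 'neutral' or 'bad'."""
    if score > threshold:
        return "good"
    if score < -threshold:
        return "bad"
    return "neutral"

def sentiment_distribution(scores, threshold=0.05):
    """Return the percentage of scores that fall into each category."""
    counts = Counter(label(s, threshold) for s in scores)
    total = len(scores)
    return {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in ("good", "neutral", "bad")
    }

# Made-up scores, for illustration only.
print(sentiment_distribution([0.8, -0.6, 0.0, 0.3]))
```

Widening the threshold puts more texts into the neutral bucket, so it is worth checking a handful of borderline texts by hand.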
I have a project where I need to analyze a text to determine whether the user who posted it needs help with something. I tried sentiment analysis, but it didn't work as expected. My idea was to take the negative posts, extract the main words from each one, and suggest articles on that subject to the user. If there is another approach that could help, please post it below. Thanks.
As for the dataset, I used one built for sentiment analysis, but I found that it doesn't work for this, so I need a dataset suited to this task.
Apply standard NLP preprocessing before running the sentiment analysis. Use TF-IDF or Word2Vec to create vectors from the given dataset, and then try the sentiment analysis. You may also need GloVe vectors for the analysis.
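To make the TF-IDF step concrete, here is a minimal pure-Python sketch using the plain tf * log(N / df) weighting. The function names are my own, and libraries often use smoothed or normalised variants, so the numbers will not necessarily match theirs.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Made-up posts, for illustration only.
for vec in tfidf_vectors(["help me please", "help is here"]):
    print(vec)
```

Terms that occur in every document get weight zero, which is why the highest-weighted terms of a post can serve as its "main words".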
On this topic, I found that this area of machine learning is called "Natural Language Questions". It is a field where models are trained to detect questions in text and suggest answers to them based on the dataset you are working with. Check this article for more detail.
Let's say I have a text in English and there is a word missing somewhere in it.
I have a list of candidate words from a dictionary with no other information. These candidate words are selected by some other, rather inaccurate, algorithm. I would like to use WordNet and the context around the missing word to assign probabilities to candidate words.
There is an obvious ad-hoc way that came to my mind on how to solve this. One way would be to extract "interesting" words surrounding the missing word, calculate semantic similarity with every candidate word according to some metric and assign probabilities to candidate words based on the average score.
However I was unable to find any useful research papers on this problem.
So, what I'm asking is: are you aware of any research (papers) on this problem, what do you think of my proposal, and do you have a better idea?
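The ad-hoc proposal above (average each candidate's similarity to the context words, then turn the averages into probabilities) can be sketched as follows. The similarity function is passed in as a parameter; a WordNet-based measure could be plugged in there, but the table below is a toy invented for illustration, and the sketch assumes similarities are non-negative.

```python
def candidate_probabilities(candidates, context_words, similarity):
    """Average each candidate's similarity to the context, then normalise to sum to 1."""
    scores = {}
    for cand in candidates:
        sims = [similarity(cand, w) for w in context_words]
        scores[cand] = sum(sims) / len(sims) if sims else 0.0
    total = sum(scores.values())
    if total == 0:
        # No evidence either way: fall back to a uniform distribution.
        return {cand: 1.0 / len(candidates) for cand in candidates}
    return {cand: score / total for cand, score in scores.items()}

# Toy similarity values, invented for illustration only.
TOY_SIM = {
    ("bank", "money"): 0.8, ("bank", "loan"): 0.6,
    ("river", "money"): 0.1, ("river", "loan"): 0.1,
}

def toy_similarity(a, b):
    """Hypothetical stand-in for a real semantic similarity metric."""
    return TOY_SIM.get((a, b), 0.0)

print(candidate_probabilities(["bank", "river"], ["money", "loan"], toy_similarity))
```

Plain normalisation keeps the ratios between averages; a softmax over the averages would be an alternative if sharper or flatter distributions are wanted.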
You can start from Experiments: Enriching indirect answers. A good article is Semantic web access prediction using WordNet.