Use FreeLing with sentences without a full stop at the end? - python-3.x

I have just started using FreeLing. I am using it to obtain the lemma form (get_lemma()) of the words in some Spanish reviews I get from the Google Maps API, and I save the result in a string. FreeLing works well with sentences that have a full stop at the end (for example, "Buen lugar, comodo y agradable."), but it does not when the review lacks a final full stop (for example, "Buen lugar. Trato amigable"). In that case, FreeLing does not return the lemma form of any of the words in the sentence, so the string remains empty.
Is there any way to make FreeLing return the lemma forms of sentences that don't end with a full stop, other than adding one to the sentence manually?
I'm writing the code in Python, based on the example in sample.py.
Thanks in advance.

You can pass the option "flush=true" to the splitter; with flush enabled it returns whatever it has buffered even if no sentence-end marker has been seen.
Please check the user manual; this is described there.
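For the Python API, a minimal sketch is below. It follows the pipeline of FreeLing 4.x's sample.py; the module name (pyfreeling vs. freeling), the data path and the exact argument order can differ between FreeLing versions, so treat it as an outline rather than a drop-in solution.

    import pyfreeling

    pyfreeling.util_init_locale("default")

    DATA = "/usr/local/share/freeling/"   # adjust to your installation
    LANG = "es"

    tk = pyfreeling.tokenizer(DATA + LANG + "/tokenizer.dat")
    sp = pyfreeling.splitter(DATA + LANG + "/splitter.dat")
    sid = sp.open_session()

    op = pyfreeling.maco_options(LANG)
    op.set_data_files("",                                # user map (unused)
                      DATA + "common/punct.dat",
                      DATA + LANG + "/dicc.src",
                      DATA + LANG + "/afixos.dat",
                      "",                                # compounds (unused)
                      DATA + LANG + "/locucions.dat",
                      DATA + LANG + "/np.dat",
                      DATA + LANG + "/quantities.dat",
                      DATA + LANG + "/probabilitats.dat")
    mf = pyfreeling.maco(op)
    tg = pyfreeling.hmm_tagger(DATA + LANG + "/tagger.dat", True, 2)

    text = "Buen lugar. Trato amigable"   # no trailing full stop

    tokens = tk.tokenize(text)
    # The third argument is the flush flag: with True the splitter closes and
    # returns the buffered sentence even if no end-of-sentence mark was seen.
    sentences = sp.split(sid, tokens, True)
    sentences = mf.analyze(sentences)
    sentences = tg.analyze(sentences)

    for s in sentences:
        print(" ".join(w.get_lemma() for w in s))

    sp.close_session(sid)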
If you ask your questions in the FreeLing forum, answers may come faster...

Related

Is there any option available in Azure Search for spell check?

I am using Azure Search in my bot application.
If we give input with a spelling mistake, for short words like trvel => travel, we get a proper response.
But if I enter "travelexpense" I do not get any result.
Currently I am passing the input through a fuzzy search.
I suggested using the Bing Spell Check API, but it was not approved because of concerns that our input might be stored externally.
Is there any option available in Azure Search to correct words like "travelexpense"?
Is there any option available for this scenario?
The closest thing I can suggest is a phonetic analyzer.
https://learn.microsoft.com/en-us/azure/search/index-add-custom-analyzers
There are a couple of other things you can try:
Enable Auto Complete and Suggestions (https://learn.microsoft.com/en-us/azure/search/search-autocomplete-tutorial)
Create synonyms (https://learn.microsoft.com/en-us/azure/search/search-synonyms)
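If the synonym route fits (it is the one that directly handles a concatenation like "travelexpense"), a rough sketch using the REST API is below. The service name, admin key, map name and api-version are placeholders/assumptions; you also need to attach the map to the searchable field(s) of your index.

    import requests

    SERVICE = "https://<your-service>.search.windows.net"   # placeholder
    ADMIN_KEY = "<admin-api-key>"                            # placeholder
    HEADERS = {"Content-Type": "application/json", "api-key": ADMIN_KEY}

    # Create (or update) a synonym map in Solr rule format: the concatenated
    # form is rewritten to the two separate words at query time.
    synonym_map = {
        "name": "expense-synonyms",                          # placeholder name
        "format": "solr",
        "synonyms": "travelexpense => travel expense",
    }
    resp = requests.put(
        SERVICE + "/synonymmaps/expense-synonyms?api-version=2020-06-30",
        headers=HEADERS,
        json=synonym_map,
    )
    resp.raise_for_status()

    # The map still has to be referenced from the index definition, e.g. by
    # adding "synonymMaps": ["expense-synonyms"] to the relevant field.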

How to perform text search in a Google-docs document using Python3

I want to search for key words in a book I am writing with Google Docs. I want the key words to be presented in a list by page number, so I can see where and how many times a certain word occurs.
I haven't tried anything yet. This is my first time using Stack Overflow and my first attempt at problem solving with Python.
Is this something I can use for Google docs?
How to search for a string in text files?
I found the function is built into Docs already (cmd-shift-H).
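If you would rather do it programmatically than use the built-in find-and-replace, a rough sketch with the Google Docs API is below. Note that the Docs API does not expose page numbers, so this only counts occurrences per document; the credentials and DOCUMENT_ID are assumed to exist already (google-api-python-client plus an OAuth flow or service account).

    from collections import Counter
    from googleapiclient.discovery import build

    def extract_text(document):
        """Concatenate the plain text of every paragraph in a Docs API document."""
        chunks = []
        for element in document.get("body", {}).get("content", []):
            for part in element.get("paragraph", {}).get("elements", []):
                chunks.append(part.get("textRun", {}).get("content", ""))
        return "".join(chunks)

    def count_keywords(creds, document_id, keywords):
        """Return {keyword: number of occurrences} for one document."""
        service = build("docs", "v1", credentials=creds)
        document = service.documents().get(documentId=document_id).execute()
        counts = Counter(extract_text(document).lower().split())
        return {kw: counts[kw.lower()] for kw in keywords}

    # Example (creds obtained beforehand via google-auth):
    # print(count_keywords(creds, "DOCUMENT_ID", ["harbour", "lighthouse"]))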

Wit.ai bot understands wit/number as wit/location

What's wrong with Wit.ai? My bot understands some numbers as locations and it breaks my stories. You can see the picture below:
What can I do about that? Thank you.
If you have previously validated some GPS coordinates in your Understanding console, this kind of misprediction is possible. To avoid it, validate some plain numbers with the wit/number entity; GPS coordinates should be validated with wit/location.
You may also have accidentally validated some numbers with the wit/location entity, so feed some numbers to the wit/number entity. Wit.ai knows nothing about numbers, locations, etc. until you validate them first. Try typing "Amsterdam" in your Understanding tab: you'll see that Wit.ai cannot assign this text to any intent or location entity because you have not trained its model yet :) Validate it with wit/location, and after that it will know.
You can also train (validate or feed) your Wit.ai NLP app without the Understanding tab, using a simple curl command and a loop.
Check this out:
https://wit.ai/docs/http/20160526
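As a rough illustration of that approach, using Python's requests instead of curl (the /samples endpoint and payload shape follow the 20160526 docs linked above and are an assumption on my part; newer API versions replaced it with /utterances, and built-in entities may need the wit$ prefix instead of wit/):

    import requests

    WIT_TOKEN = "<server-access-token>"   # placeholder
    API_VERSION = "20160526"              # matches the docs linked above

    # (text, entity, value) triples you want to validate in bulk.
    samples = [
        ("I live in Amsterdam", "wit$location", "Amsterdam"),
        ("Book a table for 4",  "wit$number",   "4"),
    ]

    payload = []
    for text, entity, value in samples:
        start = text.find(value)
        payload.append({
            "text": text,
            "entities": [{
                "entity": entity,
                "value": value,
                "start": start,
                "end": start + len(value),
            }],
        })

    resp = requests.post(
        "https://api.wit.ai/samples?v=" + API_VERSION,
        headers={"Authorization": "Bearer " + WIT_TOKEN,
                 "Content-Type": "application/json"},
        json=payload,
    )
    print(resp.status_code, resp.text)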
Have a nice day :)

Recognizing new words with Freeling

I'm using FreeLing to analyse text in Spanish, but I have a question about customizing the dictionary it uses. The specific example is that the word
morelos
is a singular masculine noun but is being split into two words and classified as follows:
more morar VMM03S0 1 -
los lo PP3MPA0 1 -
I've tried a wide variety of things, including adding the word to the dictionary with the following entry:
morelos morelos NPMSS00
I've also tried not using multiwords, but that is unsuccessful as well.
Can anyone recommend what to do?
(Is there a comprehensive tutorial anywhere on how to use FreeLing?)
This is because the affixation module considers this a clitic pronoun construction (morar + los).
You can deactivate affixation, or try to fine-tune the affixation rules.
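If you go the deactivation route from the Python API, a sketch is below. The boolean flag order follows FreeLing 4.x's sample.py and may differ in other releases, so check your version's sample before relying on the positions; the data path is a placeholder.

    import pyfreeling

    pyfreeling.util_init_locale("default")

    DATA = "/usr/local/share/freeling/"   # adjust to your installation
    LANG = "es"

    op = pyfreeling.maco_options(LANG)
    op.set_data_files("",                                # user map (unused)
                      DATA + "common/punct.dat",
                      DATA + LANG + "/dicc.src",
                      DATA + LANG + "/afixos.dat",
                      "",                                # compounds (unused)
                      DATA + LANG + "/locucions.dat",
                      DATA + LANG + "/np.dat",
                      DATA + LANG + "/quantities.dat",
                      DATA + LANG + "/probabilitats.dat")
    mf = pyfreeling.maco(op)

    # Select which maco submodules are active. Turning AffixAnalysis off stops
    # "morelos" from being decomposed as morar + clitic pronoun "los".
    mf.set_active_options(False,   # UserMap
                          True,    # NumbersDetection
                          True,    # PunctuationDetection
                          True,    # DatesDetection
                          True,    # DictionarySearch
                          False,   # AffixAnalysis (deactivated)
                          False,   # CompoundAnalysis
                          True,    # RetokContractions
                          True,    # MultiwordsDetection
                          True,    # NERecognition
                          False,   # QuantitiesDetection
                          True)    # ProbabilityAssignment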
There is comprehensive information about FreeLing in its user manual and in its user forums. Check the FreeLing webpage.

How to get a description of a URL

I have a list of URLs and am trying to collect their "descriptions." By description I mean what comes up if you Google the link. For example, Googling http://stackoverflow.com shows the description as
A language-independent collaboratively
edited question and answer site for
programmers. Questions and answers
displayed by user votes and tags.
This is the data I'm trying to accumulate for the URLs I have.
I tried parsing the URLs' meta descriptions; however, most of them lack a meta description (yet Google and other search engines manage to get a description somehow).
Any ideas? Should I just "google" each link and scrape the data? I have a feeling Google wouldn't like this...
Thanks guys.
Different search engines have different algorithms for getting the description out of the page if/when it lacks the description meta tag. Some ignore the tag even if it's there.
If you want the description Google has, the most accurate way to get it would be to scrape it. Otherwise, you could write your own or look around on the web for code that does it.
These are called snippets.
Google use proprietary (and possibly patented) methods to garner this information, so there is no simple answer.
As you suggest, they will use meta-description information if it is there. (How to set the meta-information to help Google.)
They will also honour requests from the page authors to NOT include snippets. (How to prevent Google from displaying snippets) You should probably respect this too (as well as robots.txt, of course.)
You may have some luck with existing auto-summary packages, such as OTS.
You may want to check AboutUs.org (i.e. http://www.aboutus.org/StackOverflow.com).
But, there's little chance that the site will have an aboutus page and not have a meta description.
Some info that might explain how Google does this:
Webmasters/Site owners Help
Adding a URL to google
I am not familiar with Google APIs, but perhaps there is an official way to get such information.
Interesting. Some sources are better than others.
For "audiotuts.com", Google has a worse description than AboutUs.com.
Google:
Nov 18th in General by Joel Falconer ·
1. Recently, an AUDIOTUTS reader asked me about creative process. While this
is a topic that can’t be made into a
...
AboutUs.com:
AUDIOTUTS is a blog/tutorial site for
musicians, producers and audio
junkies! It is the sister site of the
popular PSDTUTS, VECTORTUTS and
NETTUTS.
I hate problems like these... they should be trivial but they aren't!
If you can assume English content, you can first look for Meta Description, and if that doesn't work, you can look for the first two or three sentence-like word sequences.
A product I worked on looked for the first P or DIV that contained more than one sequence of > n "words" delimited by periods. It would use the two or three sentence-like sequences, up to x total words, as a summary paragraph. It wasn't 100% accurate, but good enough for the average case. The number of words was adjusted a few times to eliminate things like navigation elements.
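As a rough sketch of that meta-description-with-fallback idea (requests plus BeautifulSoup; the word-count thresholds are arbitrary assumptions, not the values from that product):

    import re
    import requests
    from bs4 import BeautifulSoup

    def describe(url, min_words=5, max_words=60):
        """Return the meta description, or a crude summary built from the first
        paragraph-like block containing sentence-like word sequences."""
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")

        # 1. Prefer the meta description if the page provides one.
        meta = soup.find("meta", attrs={"name": "description"})
        if meta and meta.get("content", "").strip():
            return meta["content"].strip()

        # 2. Fall back to the first <p> or <div> whose text splits into at least
        #    two sentence-like chunks (enough words between periods to rule out
        #    navigation elements).
        for block in soup.find_all(["p", "div"]):
            text = " ".join(block.get_text(" ", strip=True).split())
            sentences = [s.strip() for s in re.split(r"[.!?]", text)
                         if len(s.split()) >= min_words]
            if len(sentences) >= 2:
                summary = ". ".join(sentences[:3])
                return " ".join(summary.split()[:max_words])

        return ""  # nothing usable found

    # print(describe("http://stackoverflow.com"))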
