Train Text Module in Another Language - tensorflow-hub

I was thinking about how to train the Universal Sentence Encoder in Portuguese. Can you share some tips? For example, what kind of dataset do I need, and does transfer learning make sense?
Thank you!

You can use the TensorFlow-Hub module in a TensorFlow model that trains on Portuguese tasks. Which tasks and data to use is the main question, and that's outside the realm of tensorflow-hub; you could ask that question separately and tag it with "machine-learning" and "nlp".
You might also want to take a look at links such as
http://www.nltk.org/howto/portuguese_en.html
https://stackoverflow.com/search?q=Portuguese+corpus
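As a concrete starting point for the transfer-learning route, here is a minimal sketch: wrap a TF-Hub sentence encoder as a trainable Keras layer and fine-tune it on a labelled Portuguese task. The module URL shown is the multilingual Universal Sentence Encoder, which covers Portuguese; the dense head, class count, and variable names are placeholders for whatever task and data you choose.

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers the SentencePiece ops the module needs

# The multilingual USE covers 16 languages, Portuguese included.
use = hub.KerasLayer(
    "https://tfhub.dev/google/universal-sentence-encoder-multilingual/3",
    trainable=True,               # fine-tune the encoder weights on your task
    input_shape=[], dtype=tf.string)

model = tf.keras.Sequential([
    use,                                              # sentence -> 512-dim embedding
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),   # e.g. a binary Portuguese task
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(portuguese_sentences, labels, epochs=3)   # hypothetical labelled data
```

Whether to freeze the encoder (trainable=False) or fine-tune it depends on how much labelled Portuguese data you have; with a small dataset, freezing usually generalizes better.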

Related

Is this how nn.Transformer works?

If I want to transform one image into another, then
transformer_model = nn.Transformer(img_size, n_heads)
transformer_model(source_image, target_image)
is this the correct way to use nn.Transformer?
No, this is not what the Transformer module does. The Transformer is primarily used for pre-training general-purpose NLP models on large bodies of text. If you're curious to learn more, I strongly recommend you read the paper that introduced the architecture, "Attention Is All You Need". If you've heard of models like BERT or GPT-2, these are examples of Transformers.
It's not entirely clear what you are trying to accomplish when you ask how to "transform an image into another image." I'm thinking maybe you are looking for something like this? https://junyanz.github.io/CycleGAN/
In any event, to re-answer your question: no, that's not how you use nn.Transformer. You should try to clarify what you are trying to accomplish with "transforming one picture into another," and post that description as a separate question.
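To make the expected usage concrete, here is a small runnable sketch (the sizes are arbitrary examples): nn.Transformer consumes sequences of feature vectors, not images, with src shaped (source_len, batch, d_model) and tgt shaped (target_len, batch, d_model).

```python
import torch
import torch.nn as nn

# nn.Transformer works on token/feature sequences, not images.
model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(10, 8, 32)  # 10 source positions, batch of 8, 32 features each
tgt = torch.rand(7, 8, 32)   # 7 target positions
out = model(src, tgt)        # shape (7, 8, 32): one output vector per target position
```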

how to see a full image of deep neural network

How is it possible to see the graphical structure of a deep neural network in Keras or TensorFlow? I built the model and looked at the output of "plot_model", but I want a graphical view similar to this image.
This is a common problem within the community. There are many 'visualization' APIs and libraries out there for NNs and CNNs, but many of them are flat and not what you're looking for. A couple of months ago, I bookmarked this Github project: https://github.com/gwding/draw_convnet . It looks like exactly what you want, or at least very close. I've never personally used it, but I plan to at some point. I hope this helps!
For TensorFlow at least, you can use TensorBoard.
Tutorial and Explanation
It also features graph visualization, which is what you are looking for.
Still, it's not identical to your sample picture, but good enough, I think.
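To show the TensorBoard route concretely, here is a minimal sketch (layer sizes are arbitrary placeholders): build a small Keras model and attach the TensorBoard callback so the Graphs tab can render the network.

```python
import tensorflow as tf

# A small example model; any Keras model works the same way.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# write_graph=True makes the model graph available in TensorBoard's Graphs tab.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs", write_graph=True)
# model.fit(x_train, y_train, callbacks=[tb])  # then run: tensorboard --logdir logs
```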

Devanagari text processing (NLP): where to start

I am new to Devanagari NLP. Are there any groups or resources that would help me get started with NLP in a Devanagari-script language (mostly Nepali, or a similar language like Hindi)? I also want to develop fonts for Devanagari and build some font-processing applications. If anyone working in this field could give me some advice, it would be highly appreciated.
Thanks in advance
I am new to Devanagari NLP. Is there any group or resources that would help me get started with NLP in a Devanagari-script language (mostly Nepali, or a similar language like Hindi)?
You can use the embeddings provided by fastText [https://fasttext.cc/docs/en/pretrained-vectors.html#content] together with deep learning RNN models like LSTMs for text classification and sentiment analysis.
You can find some datasets for named entity recognition here [http://ltrc.iiit.ac.in/ner-ssea-08/index.cgi?topic=5]
For processing Indian languages, you can refer here [https://github.com/anoopkunchukuttan/indic_nlp_library]
NLTK supports Indian languages; for POS tagging and related NLP tasks you can refer here [http://www.nltk.org/_modules/nltk/corpus/reader/indian.html]
Is there any group or resources that would help me get started with NLP in Devnagaric language?
The Bhasa Sanchar project under Madan Puraskar Pustakalaya has developed a Nepali corpus. You may request the Nepali corpus for non-commercial purposes from the contact provided in the link above.
Python's NLTK has the Hindi Language corpus. You may import it using
from nltk.corpus import indian
For gaining insight into Devanagari-based NLP, I suggest you go through research papers. Nepali is an under-resourced language; much work is yet to be done, and it might be difficult to find material for it.
You should probably look into language detection, text classification, and sentiment analysis, among others (preferably based on a POS-tagging library built on the corpus), to grasp the basics.
For the second part of the question
I am pretty sure font development doesn't come under the domain of Natural Language Processing. Did you mean something else?

Simple toolkits for emotion (sentiment) analysis (not using machine learning)

I am looking for a tool that can analyze the emotion of short texts. I searched for a week and I couldn't find a good one that is publicly available. The ideal tool is one that takes a short text as input and guesses the emotion. It is preferably a standalone application or library.
I don't need tools that are trained on texts, and although similar questions have been asked before, no satisfactory answers were given.
I searched the Internet and read some papers, but I couldn't find a good tool. So far I have found SentiStrength, but its accuracy is not good. I am using emotion dictionaries right now. I feel that some syntactic parsing may be necessary, but it's too complex for me to build myself. Furthermore, this has been researched by others, and I don't want to reinvent the wheel. Does anyone know of such publicly available research software? I need a tool that doesn't require training before use.
Thanks in advance.
I think that you will not find a more accurate program than SentiStrength (or SoCal) for this task - other than machine learning methods in a specific narrow domain. If you have a lot (>1000) of hand-coded data for a specific domain then you might like to try a generic machine learning approach based on your data. If not, then I would stop looking for anything better ;)
Identifying entities and extracting precise information from short texts, let alone sentiment, is a very challenging problem, especially because short texts lack context. However, there are a few unsupervised approaches to extracting sentiment from text, mainly proposed by Turney (2002). Have a look at that; maybe you can adapt its method of extracting sentiment from the adjectives in the short text for your use case. It is, however, important to note that this might require you to POS-tag your short text reliably first.
Maybe EmoLib could be of help.
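To illustrate the no-training, dictionary-based approach the question asks about, here is a minimal pure-Python sketch in the spirit of tools like SentiStrength. The lexicon and negation list are tiny illustrative stand-ins, not a real resource; a usable system would plug in a full emotion dictionary.

```python
# Hand-built illustrative lexicon: word -> sentiment strength.
LEXICON = {
    "love": 3, "great": 2, "good": 1,
    "bad": -1, "awful": -2, "hate": -3,
}
NEGATIONS = {"not", "no", "never"}

def score(text):
    """Sum lexicon scores over the words, flipping polarity after a negation."""
    words = text.lower().split()
    total = 0
    for i, w in enumerate(words):
        s = LEXICON.get(w.strip(".,!?"), 0)
        if i > 0 and words[i - 1] in NEGATIONS:  # e.g. "not good" -> negative
            s = -s
        total += s
    return total

print(score("I love it"))        # 3
print(score("not good at all"))  # -1
```

This is exactly the kind of method that benefits from the POS tagging mentioned above, since restricting the lookup to adjectives and adverbs cuts down false matches.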

Natural Language Processing Algorithm for mood of an email

One simple question (but I haven't quite found an obvious answer in the NLP stuff I've been reading, which I'm very new to):
I want to classify emails with a probability along certain dimensions of mood. Is there an NLP package out there specifically dealing with this? Is there an obvious starting point in the literature where I could begin reading?
For example, if I got a short email something like "Hi, I'm not very impressed with your last email - you said the order amount would only be $15.95! Regards, Tom" then it might get 8/10 for Frustration and 0/10 for Happiness.
The actual list of moods isn't so important, but a short list of generally positive vs generally negative moods would be useful.
Thanks in advance!
--Trindaz on Fedang #NLP
You can do this with a number of different NLP tools, but nothing to my knowledge comes with it ready out of the box. Perhaps the easiest place to start would be with LingPipe (java), and you can use their very good sentiment analysis tutorial. You could also use NLTK if python is more your bent. There are some good blog posts over at Streamhacker that describe how you would use Naive Bayes to implement that.
Check out AlchemyAPI for sentiment analysis tools and scikit-learn or any other open machine learning library for the classifier.
If you have not decided to code the implementation yourself, you can also have the data classified by some other tool; the Google Prediction API may be an alternative.
Either way, you will need some labeled data and will have to do the preprocessing, but using such a tool may help you reach better accuracy more easily.
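To make the Naive Bayes suggestion concrete, here is a minimal pure-Python sketch of the idea (in practice you would use NLTK's or scikit-learn's implementation). The training examples are invented placeholders, not real data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; a real system needs a labelled email corpus.
TRAIN = [
    ("thanks so much great work", "happy"),
    ("really pleased with the order", "happy"),
    ("not impressed you said it would be cheaper", "frustrated"),
    ("this is the wrong amount again", "frustrated"),
]

word_counts = defaultdict(Counter)  # per-label bag-of-words counts
label_counts = Counter()
for text, label in TRAIN:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def classify(text):
    """Pick the label maximizing log P(label) + sum of log P(word | label)."""
    best, best_lp = None, float("-inf")
    for label in label_counts:
        lp = math.log(label_counts[label] / sum(label_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            # Laplace smoothing so unseen words do not zero out the class.
            lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

print(classify("not impressed with the amount"))  # frustrated
```

The same structure extends to a per-mood probability score by normalizing the per-label log-probabilities instead of returning only the argmax.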
