RapidMiner NeuralNet polynomial - Excel

I'm using RapidMiner for the first time. I have a dataset (in .xlsx format) on which I want to run the neural network algorithm. I am getting this error:
The operator NeuralNet does not have sufficient capabilities for the given data set; polynomial attributes not supported
Any help about this please?
Thanks in advance!

Per the Neural Net operator's Help file...
...This operator cannot handle polynominal attributes.
Your input file has several binominal and polynominal attributes. Therefore, if you wish to use the out-of-the-box Neural Net operator, you need to convert your nominal data to numerical data. One way of doing this within RapidMiner is with the Nominal to Numerical operator.
Always be cognizant of the type of data/attribute you are manipulating: (1) text, (2) numeric, and (3) nominal.
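If it helps to see the idea outside RapidMiner, here is a minimal sketch in Python/pandas (an assumption on my part; not part of your Excel/RapidMiner workflow) of what a nominal-to-numerical conversion produces: each nominal value becomes its own numeric column, similar to the operator's dummy-coding option. The column names and values are made up.

import pandas as pd

# Toy frame with one nominal and one numeric attribute.
df = pd.DataFrame({"colour": ["red", "blue", "red"], "weight": [1.2, 0.4, 1.1]})

# Dummy coding: one 0/1 column per nominal value; numeric columns are untouched.
numeric_df = pd.get_dummies(df, columns=["colour"])
print(numeric_df)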

Related

Using GAN to Model Posteriors with PyTorch

I have a 5-dimensional dataset and I'm interested in using a neural network to model the posterior distributions from which the data was drawn. I decided to implement a GAN to do this, and have been familiarizing myself with PyTorch.
I'm wondering how one should go about restricting what values the generator can produce for the parameters. For one of the parameters, the values must be nonnegative real values. For another case, the values must be nonnegative integer values. For the other three cases, the parameters can take on any real value.
My first idea was to control this through the transfer function applied to the nodes in the output layer of my neural network. But all of the PyTorch examples I've seen so far apply the same transfer function to all of the output nodes, which is not what I want to do. Is there a way to apply a different transfer function to each output node? Or is there maybe a better way to approach this problem?
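One way to realise the idea in that last paragraph is to split the generator's raw output and apply a different activation to each slice. Below is a minimal sketch in PyTorch; the layer sizes and the choice of softplus for the nonnegative outputs are my assumptions, and the integer-valued parameter is only constrained to be nonnegative here (rounding to an integer would happen outside the differentiable path, e.g. after sampling).

import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    # Hypothetical generator for the 5 parameters described above.
    def __init__(self, latent_dim=16, hidden_dim=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 5),
        )

    def forward(self, z):
        raw = self.body(z)
        nonneg_real = F.softplus(raw[:, 0:1])    # parameter 1: nonnegative real
        nonneg_count = F.softplus(raw[:, 1:2])   # parameter 2: nonnegative; round to an integer after generation
        unbounded = raw[:, 2:5]                  # parameters 3-5: any real value
        return torch.cat([nonneg_real, nonneg_count, unbounded], dim=1)

g = Generator()
print(g(torch.randn(4, 16)).shape)  # torch.Size([4, 5])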

How Does the Hashing Trick in Machine Learning Work?

I have a large categorical dataset and a feedforward ANN that I am using for classification purposes. I programmed the machine learning model using Excel VBA (the only programming language I currently have access to).
I have 150 categories in my dataset that I need to process. I have tried using Binary Encoding and One-Hot Encoding, however because of the number of categories I need to process, these vectors are often too large for VBA to handle and I end up with a memory error.
I’d like to give the Hashing trick a go, and see if it works any better. I don't understand how to do this with Excel however.
I have reviewed the following links to try and understand it:
https://learn.microsoft.com/en-us/azure/machine-learning/studio-module-reference/feature-hashing
https://medium.com/value-stream-design/introducing-one-of-the-best-hacks-in-machine-learning-the-hashing-trick-bf6a9c8af18f
https://en.wikipedia.org/wiki/Vowpal_Wabbit
I still don't completely understand it. Here is what I have done so far. I used the following code example to create a hash sequence for my categorical data:
Generate short hash string based using VBA
Using the code above, I have been able to produce collision free numerical hash sequences. However, what do I do now? Does the hash sequence need to be converted to a binary vector now? This is where I get lost.
I provided a small example of my data thus far. Would somebody be able to show me step by step how the hashing trick works (preferably for Excel)?
CATEGORY     HASH SEQUENCE
STEEL        37152
PLASTIC      31081
ALUMINUM     2310
BRONZE       9364
What the hashing trick does is prevent out-of-vocabulary ("fake") words from taking up extra memory. In a regular bag-of-words (BOW) model, you have one dimension per word in the vocabulary. This means a misspelled word and the correctly spelled word can each take up a separate dimension, if the misspelled word is in the model at all. If it is not in the model, then (depending on your model) it might be ignored completely. This adds up over time. By "misspelled word" I just mean any word not in the vocabulary used to create the vectors you train your model with. Any model trained this way cannot adapt to new vocabulary without being retrained from scratch.
The hashing method allows you to incorporate out-of-vocabulary words, with some potential accuracy loss, and it also ensures that your memory use is bounded. Essentially, the hashing method starts by defining a hash function that takes some input (typically a word) and maps it to an output value within a predetermined range. You could, say, choose a hash function that outputs values between 0 and 2^16. You then know your output vectors are always capped at size 2^16 (an arbitrary value), so you can prevent memory issues. Further, hash functions have "collisions": hash(a) might equal hash(b). This is rare with an appropriate output range, but it is possible, and it is where you lose some accuracy. However, since the hash function can in principle take any input string, it can map out-of-vocabulary words to a vector of the same size as the original vectors used to train the model. Because a new data vector is the same size as the training vectors, you can use it to refine your model instead of being forced to train a new one.
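To make the "what do I do with the hash sequence" step concrete, here is a minimal sketch in Python (shown in Python only for brevity; the three steps - hash the category, take the remainder modulo a fixed vector size, set that index to 1 - port directly to VBA). The bucket count of 32 and the use of MD5 are arbitrary assumptions.

import hashlib

N_BUCKETS = 32  # fixed vector size; in practice often something like 2^16

def hash_bucket(category):
    # Stable hash of the category string, reduced modulo the vector size.
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % N_BUCKETS

def hashed_one_hot(category):
    # The binary vector: all zeros except a 1 at the hashed index.
    vec = [0] * N_BUCKETS
    vec[hash_bucket(category)] = 1
    return vec

for cat in ["STEEL", "PLASTIC", "ALUMINUM", "BRONZE"]:
    print(cat, hash_bucket(cat), sum(hashed_one_hot(cat)))

Because the vector length is fixed at N_BUCKETS no matter how many categories appear, memory stays bounded even with 150 (or 150,000) categories; two categories may occasionally land on the same index, which is the accuracy trade-off described above.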

Data conversion for string-based data for machine learning or deep learning

I have string data in my dataset of the type:
AGF.SL.CA.LOSANG.15764
ABC.EMP.GOO.__._ME$.ZR_ME$ATR$GENERAL
SEM.JP.YOO.����_������_�����.ZC_NA:US::SANDO$GENERAL
Every record has a category associated with it, and given one such string, I have to use a Machine Learning or Deep Learning approach to identify the corresponding category.
I am confused as to what approach to follow in order to do this. My primary question is, should I keep the strings as is and use string similarity functions, or should I break up the strings into different words, and then do count vectorization on it, and then proceed from there?
Given this kind of data, with just one string to predict the class, what would be the best approach? I have to put this into production, so I need to look at something which will scale well. I am new to ML, so any suggestions would be appreciated. Thanks.
It seems to me that you can tackle this problem using an LSTM. Long short-term memory (LSTM) units (or blocks) are a building unit for the layers of a recurrent neural network (RNN).
LSTMs help capture sequential information and are generally used when we want to learn the sequential patterns in the data.
You can approach this problem with a character-level LSTM: feed every character of the string into the LSTM cell, one time step at a time, and at the last time step predict the class, which is compared against the true label.
You can use a cross-entropy loss function.
https://machinelearningmastery.com/develop-character-based-neural-language-model-keras/
This tutorial will give you a complete idea.
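As a rough illustration of the character-level approach (the linked tutorial uses Keras, so Keras is assumed here; the maximum string length, character set, class count, and toy labels below are made-up placeholders):

import numpy as np
from tensorflow.keras import layers, models

MAX_LEN = 60                                                      # assumed maximum string length (pad/truncate)
VOCAB = sorted(set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._:$"))   # assumed character set
CHAR_TO_ID = {c: i + 1 for i, c in enumerate(VOCAB)}              # 0 is reserved for padding
N_CLASSES = 5                                                     # assumed number of categories

def encode(s):
    # Map each character to an integer id and pad to a fixed length.
    ids = [CHAR_TO_ID.get(c, 0) for c in s[:MAX_LEN]]
    return np.array(ids + [0] * (MAX_LEN - len(ids)))

model = models.Sequential([
    layers.Embedding(input_dim=len(VOCAB) + 1, output_dim=32, mask_zero=True),
    layers.LSTM(64),                               # last hidden state summarises the whole string
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

X = np.stack([encode("AGF.SL.CA.LOSANG.15764"), encode("SEM.JP.YOO.ZC_NA:US")])
y = np.array([0, 1])                               # toy labels, just to show the fit call
model.fit(X, y, epochs=1, verbose=0)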

Creating a probability matrix from a DocumentTermMatrix

I'm an economist and I'm now analysing some qualitative and text data. This is new to me.
I want to create a Markov model for text prediction based on my interview corpora. I have analyzed a corpus with the tm package and, after creating the DocumentTermMatrix and the TermDocumentMatrix (its equivalent) with bigrams (pairs of words), I want to compute the probability matrix for each pair of words in order to use it for further Markov chain prediction. So, I have tried this piece from http://www.salemmarafi.com/code/twitter-naive-bayes/
probabilityMatrix <- function(docMatrix) {
  # Sum up the term frequencies
  termSums <- cbind(colnames(as.matrix(docMatrix)), as.numeric(colSums(as.matrix(docMatrix))))
  # Add one to each count (add-one smoothing)
  termSums <- cbind(termSums, as.numeric(termSums[, 2]) + 1)
  # Calculate the probabilities
  termSums <- cbind(termSums, (as.numeric(termSums[, 3]) / sum(as.numeric(termSums[, 3]))))
  # Calculate the natural log of the probabilities
  termSums <- cbind(termSums, log(as.numeric(termSums[, 4])))
  # Add pretty names to the columns
  colnames(termSums) <- c("term", "count", "additive", "probability", "lnProbability")
  termSums
}
But I'm sure this is not a correct approach to my problem, because this code computes the frequency of each pair but does not consider the transition probability from one word to another. I have also seen that there are some implementations of text prediction algorithms in Python and in Java (see GitHub), but I'm not able to translate them to R. Does anyone have a piece of code to perform this kind of analysis in R, or know of a package that performs it directly?
Thanks in advance
Jose
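One way to get actual transition probabilities is to group the bigram counts by their first word and normalise within each group, rather than normalising over all bigrams at once. A minimal sketch of that idea (shown in Python with made-up counts; the same grouping and normalisation logic translates directly to R, e.g. by aggregating over the bigram DocumentTermMatrix):

from collections import defaultdict

# Toy bigram counts of the form "word1 word2" -> corpus frequency (placeholder values).
bigram_counts = {"i want": 3, "i like": 1, "want to": 4, "like to": 2}

# Group counts by the first word of each bigram.
transitions = defaultdict(dict)
for bigram, count in bigram_counts.items():
    w1, w2 = bigram.split()
    transitions[w1][w2] = count

# Normalise each row so that P(w2 | w1) sums to 1 over the observed successors of w1.
transition_probs = {
    w1: {w2: c / sum(nexts.values()) for w2, c in nexts.items()}
    for w1, nexts in transitions.items()
}

print(transition_probs["i"])  # {'want': 0.75, 'like': 0.25}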

SVM: Adding Clinical Features To Feature Vector Extracted From Image

I'm using SVM to classify clinical images of patients belonging to two different groups (patients vs. controls). I use PCA to extract a vector of features from each image, but I'd like to add other clinical information (for example, the output value of a clinical exam) in order to include it in the classification process.
Is there a way to do this?
I didn't find exhaustive suggestions in the literature.
Thanks in advance.
You could just append the new information at the end of each sample's feature vector. Another approach you could try is to have two additional classifiers: one trained on the additional clinical information, and a third classifier that takes the output of the other two classifiers as input to produce a final prediction.
The question is pretty old, but I'll post my answer anyway.
If you have to scale your values, make sure that the new values are scaled to a similar range as the values in your PCA vector.
If your PCA feature vectors have constant length, you just start enumerating your new features from length+1, e.g. for SVM input (libsvm):
1 1:<PCAval1> ... N:<PCAvalN> N+1:<Clinical exam value 1> ...
I ran a test adding such general features for cell recognition, and the accuracy improved.
This guide describes how to use enumerated features.
P.S.:
In my test I isolated and squeezed cells from a microscope image into a 16x16 matrix. Each pixel in this matrix was a feature, giving 256 features. Additionally I added some features such as the original size, moments, etc.
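A minimal sketch of the append-and-rescale idea (scikit-learn is assumed here, which the question did not specify; the array sizes and random data are placeholders only):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.normal(size=(40, 256))      # e.g. flattened 16x16 image patches
clinical = rng.normal(size=(40, 2))      # e.g. two clinical exam values per patient
labels = rng.integers(0, 2, size=40)     # patients vs. controls

# Image features from PCA.
pca_features = PCA(n_components=10).fit_transform(images)

# Append the clinical columns after the PCA columns, then scale jointly so that
# no single feature dominates the SVM kernel.
X = np.hstack([pca_features, clinical])
X = StandardScaler().fit_transform(X)

clf = SVC(kernel="rbf").fit(X, labels)
print(clf.score(X, labels))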
