Does this neural network model exist? - nlp

I'm looking for a neural network model with specific characteristics. This model may not exist...
I need a network which doesn't use "layers" as traditional artificial neural networks do. Instead, I want [what I believe to be] a more biological model.
This model will house a large cluster of interconnected neurons, like the image below. A few neurons (at the bottom of the diagram) will receive input signals, and a cascade effect will cause successive, connected neurons to possibly fire depending on signal strength and connection weight. This is nothing new, but there are no explicit layers... just more and more distant, indirect connections.
As you can see, I also have the network divided into sections (circles). Each circle represents a semantic domain (a linguistics concept) which is the core information surrounding a concept; essentially a semantic domain is a concept.
Connections between nodes within a section have higher weights than connections between nodes of different sections. So the nodes for "car" are more strongly connected to one another than the nodes connecting "English" to "car". Thus, when a neuron in a single section fires (is activated), it is likely that the entire section (or most of it) will also be activated.
All in all, I need output patterns to be used as input for further output, and so on. A cascade effect is what I am after.
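To make the idea concrete, here is a rough sketch of the cascade I'm after (all node names, weights, and the threshold are invented for illustration; this is not code from any existing library):

```python
# A layer-free graph of neurons: a node fires when an incoming signal
# from an already-fired neighbor exceeds a threshold, and newly fired
# nodes feed the next round of activation.
THRESHOLD = 0.5

# Weighted, directed connections: (source, target) -> weight.
# Intra-section links (the "car" nodes) are stronger than the
# cross-section link from "english" to "car".
weights = {
    ("car_wheel", "car_engine"): 0.9,    # same section: strong
    ("car_engine", "car_brand"): 0.8,    # same section: strong
    ("english_word", "car_wheel"): 0.6,  # cross-section: weaker
}

def cascade(initial_inputs, weights, threshold=THRESHOLD, max_steps=10):
    """Propagate activation until no new neuron fires."""
    fired = set(initial_inputs)
    for _ in range(max_steps):
        newly_fired = set()
        for (src, dst), w in weights.items():
            if src in fired and dst not in fired and w > threshold:
                newly_fired.add(dst)
        if not newly_fired:
            break
        fired |= newly_fired
    return fired

# english_word -> car_wheel -> car_engine -> car_brand, round by round
print(sorted(cascade({"english_word"}, weights)))
```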
I hope this makes sense. Please ask for clarification where needed.
Are there any suitable models in existence that model what I've described, already?

Your neural network resembles one created using evolutionary algorithms, for example a genetic algorithm.
See the following articles for details:
Han - Evolutionary neural networks for anomaly detection based on the behavior of a program
Whitley - Genetic Algorithms and Neural Networks
To summarize this type of neural network: neurons and their connections are created using evolutionary techniques, so there is no strict layer structure. Han uses the following technique:
"Genetic Operations:
The crossover operator produces a new descendant by exchanging partial sections between two neural networks. It selects two distinct neural networks randomly and chooses one hidden node as the pivot point. Then they exchange the connection links and the corresponding weights based on the selected pivot point.
The mutation operator changes a connection link and the corresponding weight of a randomly selected neural network. It performs one of two operations: addition of a new connection or deletion of an existing connection.
The mutation operator selects two nodes of a neural network randomly.
If there is no connection between them, it connects two nodes with random weights.
Otherwise, it removes the connection link and weight information.
"
A relevant figure can be found in Whitley's article.
@article{Han2005Evolutionary,
  author  = {Sang-Jun Han and Sung-Bae Cho},
  title   = {Evolutionary neural networks for anomaly detection based on the behavior of a program},
  journal = {IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics},
  year    = {2005},
  volume  = {36},
  number  = {3},
  pages   = {559--570},
  month   = {June},
}
@article{whitley1995genetic,
  title     = {Genetic algorithms and neural networks},
  author    = {Whitley, D.},
  journal   = {Genetic Algorithms in Engineering and Computer Science},
  pages     = {203--216},
  year      = {1995},
  publisher = {Citeseer}
}

All in all, I need output patterns to be used as input for further output, and so on. A cascade effect is what I am after.
That sounds like a feed-forward net with multiple hidden layers. Don't be scared of the word "layer" here; with multiple ones it would be just like what you have drawn there... something like a 5-5-7-6-7-6-6-5-6-5-structured net (5 inputs, 8 hidden layers with a varying number of nodes in each, and 5 outputs).
You can connect the nodes to each other any way you like from one layer to another. You can leave some unconnected by simply using a constant zero as the weight between them, or, if object-oriented programming is used, by leaving the unwanted connections out of the connection phase. Skipping layers might be harder with a standard NN model, but one way could be to use a dummy node in each layer a weight needs to cross. Just copying the original output*weight value from node to dummy would be the same as skipping a layer, and this would also keep the standard NN model intact.
If you want the net just to output some 1s and 0s, a simple step function can be used as the activation function in each node: 1 for values greater than 0.5, 0 otherwise.
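As a sketch of both points, with invented sizes and weights: a zero weight plays the role of a missing connection, and a step function turns each node's output into a 0 or a 1.

```python
import numpy as np

# Feed-forward pass where a zero entry in a weight matrix simply means
# "not connected", and a step function binarizes each node's output.
def step(x):
    return (x > 0.5).astype(float)   # 1 for values above 0.5, 0 otherwise

def forward(x, layer_weights):
    for W in layer_weights:          # one weight matrix per layer pair
        x = step(W @ x)              # zero entries in W = missing links
    return x

W1 = np.array([[0.9, 0.0],          # weight 0.0: this node ignores input 2
               [0.4, 0.4]])
W2 = np.array([[0.6, 0.6]])
print(forward(np.array([1.0, 1.0]), [W1, W2]))   # both inputs on -> output fires
```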
I'm not sure if this is what you want, but this way you should be able to build the net you described. However, I have no idea how you are planning to teach your net to produce semantic domains. Why not just let the net learn its own weights? This can be achieved with simple input-output examples and the backpropagation algorithm. If you use a standard model to build your net, the mathematics of the learning wouldn't be any different from any other feed-forward net. Last but not least, you can probably find a library suitable for this task with only minor changes, or no changes at all, to the code.
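A minimal sketch of that suggestion, learning weights from input-output examples with backpropagation (the target function, layer sizes, learning rate, and iteration count are all arbitrary choices for illustration; a constant-1 input column stands in for a bias term):

```python
import numpy as np

# Tiny 2-layer sigmoid net trained with backpropagation on logical OR.
rng = np.random.default_rng(0)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)  # logical OR

W1 = rng.normal(0.0, 1.0, (3, 4))   # 2 inputs + bias column -> 4 hidden nodes
W2 = rng.normal(0.0, 1.0, (4, 1))   # 4 hidden -> 1 output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):
    h = sigmoid(X @ W1)                  # forward pass
    out = sigmoid(h @ W2)
    delta = out - y                      # output error (cross-entropy gradient)
    grad_W2 = h.T @ delta                # backward pass
    grad_W1 = X.T @ ((delta @ W2.T) * h * (1 - h))
    W2 -= 0.5 * grad_W2
    W1 -= 0.5 * grad_W1

preds = sigmoid(sigmoid(X @ W1) @ W2)
print(np.round(preds.ravel(), 2))
```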

The answers involving genetic algorithms sound fine (especially the one citing Darrell Whitley's work).
Another alternative would be to simply connect nodes randomly. This is done, more or less, with recurrent neural networks.
You could also take a look at LeCun's highly successful convolutional neural networks for an example of an ANN with many layers, somewhat like what you've described here, that was designed for a specific purpose.

Your network also mimics this:
http://nn.cs.utexas.edu/?fullmer:evolving
but it doesn't really allow the network to learn, only to be replaced. That may be covered here:
http://www.alanturing.net/turing_archive/pages/reference%20articles/connectionism/Turing%27s%20neural%20networks.html

Related

How to prioritize few Neural Network inputs?

I have a Neural Network with five inputs for a classification task. Two inputs out of those five are very important and have a direct relationship to the classification task. Therefore, I need to prioritize those two inputs within the network and give less priority to the other three. Is there a way in the neural network to facilitate my requirement?
If training works well, the NN should automatically pick up what's most important for your classification. That's the entire point of a NN (or ML in general): you don't have to manually tell it what's more important and what's not. After learning, you can verify that the model did indeed learn the correct order of importance among the features.
You can use any model explanation technique for this. ELI5, SHAP or LIME are some examples. All of these will tell you whether your model did indeed learn that the features you know are important are actually important to the network.
You probably shouldn't try to manually incorporate such biases into the network (unless you have a very good reason for doing so, like incorporating spatial information of images via CNNs). Trust the learning xD
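As a library-free illustration of that verification step, here is a sketch using permutation importance, a simpler relative of ELI5/SHAP/LIME (the data and the "trained model" below are synthetic stand-ins):

```python
import numpy as np

# Permutation importance: shuffle one feature at a time and measure how
# much accuracy drops. A big drop means the model relies on that feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # only features 0 and 1 matter

def model(X):
    # Stand-in for a network that learned the true rule perfectly.
    return (X[:, 0] + X[:, 1] > 0).astype(int)

def accuracy(X, y):
    return (model(X) == y).mean()

base = accuracy(X, y)
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # destroy feature j's information
    print(f"feature {j}: importance = {base - accuracy(Xp, y):.3f}")
```

Features 0 and 1 should show a clear accuracy drop; the unused features should sit at (or near) zero.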

choose filter in convolution neural network

I have done the implementation part of a convolutional neural network, but I am still confused about how to select the filters used to obtain the convolved features. As I understand it, we detect features (like eyes, nose, mouth) to recognize a face from an image using a convolution layer with the help of filters. Is it true that a filter contains eyes, nose, and mouth in order to recognize a face?
There is no hard rule for this.
In many university courses, and even in models implemented in papers, researchers use 3x3 or 5x5 filters with strides of 1 or 2.
It is one of the hyperparameters you should tune for your model. In practice, the best approach is to look at the documentation of implemented models (via Google or otherwise) and find the best size with respect to your conv layers.
The last thing you should know is that the point of the filters is to reduce the number of parameters while keeping high-quality features.
Here is a link to all models implemented using TensorFlow for different tasks.
Good luck
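To illustrate the point above: a filter is just a small grid of numbers (here 3x3), not a picture of an eye or a nose. Early filters tend to respond to simple patterns such as edges. This naive loop is only a sketch of what a library's conv layer computes, and the hand-made edge detector stands in for values a real CNN would learn:

```python
import numpy as np

# Naive 2D convolution (no padding). Real libraries are far faster,
# but the arithmetic is the same.
def conv2d(image, kernel, stride=1):
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = (patch * kernel).sum()
    return out

vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]])  # a classic 3x3 Prewitt-style filter
image = np.zeros((5, 5))
image[:, 3:] = 1.0                      # left half dark, right half bright
print(conv2d(image, vertical_edge))     # responds only at the dark/bright edge
```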

How do you decide on the dimensions of the convolutional neural filter?

If I have an image which is WxHx3 (RGB), how do I decide how big to make the filter masks? Is it a function of the dimensions (W and H), or something else? How do the dimensions of the second, third, ... filters compare to the dimensions of the first filter? (Any concrete pointers would be appreciated.)
I have seen the following, but they don't answer the question.
Dimensions in convolutional neural network
Convolutional Neural Networks: How many pixels will be covered by each of the filters?
How do you decide the parameters of a Convolutional Neural Network for image classification?
It would be great if you added details about what you are trying to extract from the image, and about the dataset you are trying to use.
A general idea of the filter mask sizes worth considering can be drawn from AlexNet and ZFNet. There is no specific formula for which size suits a particular format, but the size is kept small when a deeper analysis is required, since many smaller details might be missed with larger filter sizes. The Inception networks describe how to utilize computing resources effectively. If resources are not an issue, the layer-by-layer visualizations in ZFNet show many finer details becoming visible. We can call it a CNN even if it has only one convolution layer and one pooling layer; the number of layers depends on how fine-grained the requirements are.
I am not an expert, but if your dataset is small (a few thousand samples), not much feature extraction is required, and you are unsure about the size, you can simply go with small sizes (a small, popular choice is 5x5, as in LeNet-5).
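Whatever size you pick, the output size follows the standard convolution-arithmetic relation, which is the usual starting point for these decisions (the example numbers below are illustrative):

```python
# output = (W - F + 2P) / S + 1
# W: input width, F: filter size, P: padding, S: stride
def conv_output_size(w, f, p=0, s=1):
    assert (w - f + 2 * p) % s == 0, "filter does not tile the input evenly"
    return (w - f + 2 * p) // s + 1

# A 224-wide input with 5x5 filters, padding 2, stride 1 keeps its size:
print(conv_output_size(224, 5, p=2, s=1))   # 224
# AlexNet's first layer: 227-wide input, 11x11 filters, stride 4:
print(conv_output_size(227, 11, p=0, s=4))  # 55
```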

Difference of filters in convolutional neural network

When creating a convolutional neural network (CNN) (e.g. as described in
https://cs231n.github.io/convolutional-networks/) the input layer is connected with one or several filters, each representing a feature map. Here, each neuron in a filter layer is connected with just a few neurons of the input layer.
In the most simple case each of my n filters has the same dimensionality and uses the same stride.
My (tight-knitted) questions are:
How is it ensured that the filters learn different features, although they are trained on the same patches?
Does the learned feature of a filter depend on the randomly assigned initial values (for weights and biases) when initializing the network?
I'm not an expert, but I can speak a bit to your questions. To be honest, it sounds like you already have the right idea: it's specifically the initial randomization of weights/biases in the filters that fosters their tendencies to learn different features (although I believe randomness in the error backpropagated from higher layers of the network can play a role as well).
As @user2717954 indicated, there is no guarantee that the filters will learn unique features. However, each time the error of a training sample or batch is backpropagated to a given convolutional layer, the weights and biases of each filter are slightly modified to improve the overall accuracy of the network. Since the initial weights and biases are all different in each filter, it's possible (and likely, given a suitable model) for most of the filters to eventually stabilize to values representing a robust set of unique features.
In addition to proper randomization of weights, this also demonstrates why it's crucial to use convolutional layers with an adequate number of filters. Without enough filters, the network is fundamentally limited such that there are important, useful patterns at the given layer of abstraction that simply can't be represented by the network.
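A toy demonstration of the symmetry-breaking point above, using plain weight vectors in a made-up linear model instead of real filters (all data is synthetic): two "filters" that start identical receive identical gradient updates, so they can never diverge; random initialization is what lets them end up learning different things.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=(100,))

def train(w1, w2, lr=0.1, steps=50):
    for _ in range(steps):
        err = X @ w1 + X @ w2 - y    # both units see the same input...
        grad = X.T @ err / len(y)    # ...and get the same gradient rule
        w1 = w1 - lr * grad
        w2 = w2 - lr * grad
    return w1, w2

a, b = train(np.ones(3), np.ones(3))
print(np.allclose(a, b))   # True: identical starts stay identical forever

a, b = train(rng.normal(size=3), rng.normal(size=3))
print(np.allclose(a, b))   # False: random starts remain distinct
```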

Multi-threaded AForge.NET training

I am using an AForge.NET ANN and training it on my training set. Because the training is single-threaded and the process can take ages, I wondered whether it's possible to run multi-threaded training.
Because it is a problem to use threads while training a Resilient Backpropagation network, I thought about splitting my training set between different networks and, once every N epochs, combining the weights of all the networks into one, then duplicating it to all threads (so the next epoch starts with the new weights).
I can't seem to find a method in AForge.NET that combines two (or more) networks. I'm looking for some help on how to get started with the implementation.
Combining the neural networks every N iterations won't work very well. It can be very tricky to just take the weights and combine them. In some ways this is how the crossover operation of a genetic algorithm works.
Really, the only way you are going to be able to do this is to modify AForge's training to support multiple threads. Basically, you need to map the gradient calculation across threads, do a reduce-sum on the gradients, and then use the reduced gradient to update the network.
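A sketch of that map/reduce-sum scheme, in Python rather than C# purely for illustration (the linear model, data, learning rate, and worker count are all invented; AForge.NET's actual API is not used here):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Each worker computes the gradient on its own slice of the data (map),
# the per-slice gradients are summed (reduce-sum), and one update is
# applied to the shared weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w

w = np.zeros(4)

def shard_gradient(shard):
    Xs, ys = shard
    return Xs.T @ (Xs @ w - ys)          # this slice's gradient contribution

shards = [(X[i::4], y[i::4]) for i in range(4)]   # 4 workers, 4 slices
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(200):
        total = sum(pool.map(shard_gradient, shards))  # reduce-sum
        w = w - 0.001 * total            # one update from the combined gradient

print(np.round(w, 3))                    # should recover true_w
```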
I've implemented this exact thing in the Encog Framework; it supports multi-threaded RPROP and has a C# version. http://www.heatonresearch.com/encog.
