I'm almost a newbie at PyTorch.
One of my output size from conv is [1, 25, 8, 32]
(25=channel, 8=height, 32=width)
I can use squeeze to make it [25, 8, 32].
But I'm confused by the 25 channels.
When I want to visualize the sum of the 25 channels as one grayscale or RGB image (1 or 3 x 8 x 32), how can I deal with this in code?
I can use matplotlib or tensorboardX for visualizing.
It is difficult to visualize images with more than 3 channels, and it is unclear what a feature vector in 25-dimensional space actually looks like.
The most straightforward approach would be to visualize the 8x32 feature maps you have as 25 separate grayscale images of size 8x32. Each image will show how "sensitive" a specific neuron/conv filter/channel (these are all equivalent) is to the input at a certain spatial location.
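For instance, a minimal matplotlib sketch (the tensor here is a random placeholder standing in for your conv output):

import torch
import matplotlib.pyplot as plt

out = torch.randn(1, 25, 8, 32)   # placeholder for your conv output
maps = out.squeeze(0)             # [25, 8, 32]

fig, axes = plt.subplots(5, 5, figsize=(12, 5))
for i, ax in enumerate(axes.flat):
    ax.imshow(maps[i].detach().cpu().numpy(), cmap='gray')
    ax.set_title('ch %d' % i, fontsize=6)
    ax.axis('off')
plt.tight_layout()
plt.show()

# To collapse everything into a single grayscale image instead,
# take maps.sum(dim=0) or maps.mean(dim=0), giving shape [8, 32].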
There are more intricate methods for feature visualization, you can find more details about them in this blog post.
Related
I understand that in order to create a color image, the three-channel information of the input data must be maintained inside the network. However, the data must be flattened to pass through a linear layer. If so, can a GAN consisting of only FC layers generate only black-and-white images?
Your fully connected network can generate whatever you want, even three-channel outputs. However, the question is: does it make sense to do so? Flattening your input inherently loses all the spatial and feature consistency that is naturally available when it is represented as an RGB map.
Remember that an RGB image can be thought of as 3-element features describing each spatial location of a 2D image. In other words, each of the three channels gives additional information about a given pixel, considering these channels as separate entities is a loss of information.
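As a hedged sketch (the sizes and names here are illustrative, not from any particular tutorial), a fully connected generator can still emit a three-channel image by reshaping its flat output:

import torch
import torch.nn as nn

class FCGenerator(nn.Module):
    def __init__(self, z_dim=100, h=32, w=32):
        super().__init__()
        self.h, self.w = h, w
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * h * w), nn.Tanh(),
        )

    def forward(self, z):
        flat = self.net(z)                        # [batch, 3*h*w]
        return flat.view(-1, 3, self.h, self.w)   # reshape back to an RGB map

g = FCGenerator()
img = g(torch.randn(4, 100))                      # [4, 3, 32, 32]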
I have a list of arrays of different shapes, i.e.:
signals = [np.array([1, 2, 3], dtype=np.int16),
           np.array([1, 2, 3, 4, 5], dtype=np.int16),
           np.array([1, 2], dtype=np.int16),
           np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=np.int16)]
I want to use these data as input to a CNN in which the first layer is conv1. How should I transform the data in order for this to work? Should I fill the arrays with zeros? The data is a signal coming from a heart device.
Every sample has to have the same shape to feed any Keras model, as you know, so you have to make all samples' shapes the same. To do so, you can leverage sklearn.impute.SimpleImputer to fill with dummy numbers and make the shapes match.
SimpleImputer has several options for the fill value; please refer to the site below.
https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html
If the smallest sample shape (e.g. [1, 2]) is still relatively big, you can truncate the other samples to fit the smallest shape. If the smallest one is too small, you should consider whether to omit it.
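For the zero-filling route, a minimal sketch in plain NumPy (array values mirror the question; names are illustrative):

import numpy as np

signals = [np.array([1, 2, 3], dtype=np.int16),
           np.array([1, 2, 3, 4, 5], dtype=np.int16),
           np.array([1, 2], dtype=np.int16),
           np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=np.int16)]

max_len = max(len(s) for s in signals)
padded = np.zeros((len(signals), max_len), dtype=np.int16)
for i, s in enumerate(signals):
    padded[i, :len(s)] = s          # copy each signal, leaving zeros at the tail

padded = padded[..., np.newaxis]    # Keras Conv1D expects [samples, steps, channels]
print(padded.shape)                 # (4, 9, 1)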
Since the data is heart-beat wave data, I'll try zero-filling first, as you mentioned.
I am a microbiology student new to computer vision, so any help will be extremely appreciated.
This question involves microscope images that I am trying to analyze. The goal I am trying to accomplish is to count bacteria in an image but I need to pre-process the image first to enhance any bacteria that are not fluorescing very brightly. I have thought about using several different techniques like enhancing the contrast or sharpening the image but it isn't exactly what I need.
I want to reduce the noise (the black spaces) to 0s on the RGB scale and enhance the green spaces. I originally wrote a for loop in OpenCV with threshold limits to change each pixel, but I know there is a better way.
Here is an example that I made in Photoshop of the original image vs. what I want:
Original image and enhanced image.
I need to learn to do this in a Python environment so that I can automate the process. As I said, I am new, but I am familiar with Python's OpenCV, mahotas, NumPy, etc., so I am not attached to a particular package. I am also very new to these techniques, so I am open to anything, even if you just point me in the right direction.
Thanks!
You can have a look at histogram equalization. This would emphasize the green and reduce the black range. There is an OpenCV tutorial here. Afterwards you can experiment with different thresholding mechanisms to see which best isolates the bacteria.
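A minimal OpenCV sketch along those lines (the filename is a placeholder, and Otsu thresholding stands in for whatever threshold works best for your images):

import cv2

img = cv2.imread('bacteria.png')      # OpenCV loads images as BGR
green = img[:, :, 1]                  # isolate the green channel
green_eq = cv2.equalizeHist(green)    # stretch the intensity range
_, mask = cv2.threshold(green_eq, 0, 255,
                        cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite('enhanced.png', cv2.bitwise_and(green_eq, green_eq, mask=mask))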
Use TensorFlow:
Create your own dataset with images of bacteria and their positions stored in accompanying text files (the bigger the dataset, the better).
Create a positive and a negative set of images.
Update the default TensorFlow example with your images.
Make sure you have a bunch of convolution layers.
Train and test.
TensorFlow is well suited to such tasks, and you don't need to worry about different intensity levels.
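A hedged sketch of such a network in Keras, assuming 64x64 RGB patches labelled bacterium vs. background (all sizes and the training variables are illustrative):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),   # bacterium vs. background
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=10, validation_split=0.1)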
I initially tried histogram equalization but did not get the desired results. So I used adaptive threshold using the mean filter:
# img must be a single-channel 8-bit image for adaptiveThreshold
th = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 3, 2)
Then I applied the median filter:
median = cv2.medianBlur(th, 5)  # suppress small salt-and-pepper specks
Finally I applied morphological closing with the ellipse kernel:
k1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
closed = cv2.morphologyEx(median, cv2.MORPH_CLOSE, k1, iterations=3)  # pass iterations by keyword; the 4th positional argument is dst
The OpenCV morphological transformations tutorial will help you modify this result however you want.
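From the closed mask above, a hedged sketch of the actual count via connected components (closed is the result of the previous step):

num_labels, labels = cv2.connectedComponents(closed)
print('approximate bacteria count:', num_labels - 1)   # label 0 is the background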
I understand (and please correct me if my understanding is wrong) that the primary purpose of a CNN is to reduce the number of parameters from what you would need in a fully connected NN, and that a CNN achieves this by extracting "features" of images.
A CNN can do this because, in a natural image, there are small features such as lines and elementary curves that may occur in an "invariant" fashion and constitute the image much like elementary building blocks.
My question is: suppose we create some feature maps, say 5 of them, by sliding a window of size, say, 5x5 over an image of, say, 100x100 pixels. These feature maps start out as randomly initialized weight matrices and progressively adjust their weights with gradient descent, right? But if we obtain these maps using exactly the same sized windows, sliding in exactly the same way (same starting point and same stride), over exactly the same image, how can the maps learn different features of the image? Won't they all come out the same, say, a line or a curve?
Is it due to the different initial values of the weight matrices? (i.e., some weight matrices are more receptive to learning a certain feature than others?)
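For concreteness, here is how I picture it in PyTorch: each of the 5 feature maps gets its own independently (randomly) initialized 5x5 kernel, which would be what breaks the symmetry:

import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=5, kernel_size=5)
print(conv.weight.shape)   # torch.Size([5, 1, 5, 5]): one kernel per feature map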
Thanks! I wrote my 4 questions/opinions and indexed them for ease of addressing them separately!
I've had an interest in neural networks for a while now and have just started following the deep learning tutorials. I have what I hope is a relatively straightforward question that I am hoping someone can answer.
In the multilayer perceptron tutorial, I am interested in seeing the state of the network at different layers (something similar to what is shown in this paper: http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/247 ). For instance, I am able to write out the weights of the hidden layer using:
W_open = open('mlp_w_pickle.pkl', 'wb')  # binary mode is needed for pickle protocol -1
cPickle.dump(classifier.hiddenLayer.W.get_value(borrow=True), W_open, -1)
When I plot this using the utils.py tile plotting, I get a pretty plot [edit: pretty plot removed as I don't have enough rep].
If I wanted to plot the weights at the logRegressionLayer, such that
cPickle.dump(classifier.logRegressionLayer.W.get_value(borrow=True), W_open, -1)
what would I actually have to do? The above doesn't seem to work: it returns a 2D array of shape (500, 10). I understand that the 500 relates to the number of hidden units. The paragraph on the Miscellaneous page:
Plotting the weights is a bit more tricky. We have n_hidden hidden units, each of them corresponding to a column of the weight matrix. A column has the same shape as the visible, where the weight corresponding to the connection with visible unit j is at position j. Therefore, if we reshape every such column, using numpy.reshape, we get a filter image that tells us how this hidden unit is influenced by the input image.
confuses me a little. I am unsure exactly how I would string it together.
Thanks to all - sorry if the question is confusing!
You could plot them just like the weights in the first layer, but they will not necessarily make much sense.
Consider the weights in the first layer of a neural network. If the inputs have size 784 (e.g. MNIST images) and there are 2000 hidden units in the first layer then the first layer weights are a matrix of size 784x2000 (or maybe the transpose depending on how it's implemented). Those weights can be plotted as either 784 patches of size 2000 or, more usually, 2000 patches of size 784. In this latter case each patch can be plotted as a 28x28 image which directly ties back to the original inputs and thus is interpretable.
For your higher-level regression layer, you could plot 10 tiles, each of size 500 (e.g. patches of size 22x23 with some padding to make them rectangular), or 500 patches of size 10. Either might illustrate some patterns that are being found, but it may be difficult to tie those patterns back to the original inputs.
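A hedged sketch of the usual first-layer tiling (the random weight matrix here stands in for classifier.hiddenLayer.W, assuming MNIST-sized 28x28 inputs and 500 hidden units):

import numpy as np
import matplotlib.pyplot as plt

W = np.random.randn(784, 500)    # placeholder, shaped (n_visible, n_hidden)
fig, axes = plt.subplots(5, 5, figsize=(6, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(W[:, i].reshape(28, 28), cmap='gray')   # one column per hidden unit
    ax.axis('off')
plt.show()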