modeling Knapsack with petri net - modeling

I am new to Petri nets. What are the steps for drawing a Petri net graph for the Knapsack problem?
Is a Petri net suitable for modeling this problem?

Related

What training scheme does keras.fit use

I am training various autoencoder structures in Keras + TensorFlow for my master's thesis on anomaly detection. During my literature research on how to train autoencoders, I found various training schemes, including layerwise pretraining and joint training. However, I have not been able to find any literature on how Keras.fit() handles the training process. Can anyone point me to any useful links?

Uni-directional Transformer VS Bi-directional BERT

I just finished reading the Transformer paper and the BERT paper, but I couldn't figure out why the Transformer is uni-directional and BERT is bi-directional, as stated in the BERT paper. Since neither uses recurrent networks, it's not straightforward to interpret the directions. Can anyone give me a clue? Thanks.
To clarify, the original Transformer model from Vaswani et al. is an encoder-decoder architecture. Therefore the statement "Transformer is uni-directional" is misleading.
In fact, the transformer encoder is bi-directional, which means that the self-attention can attend to tokens both on the left and right. In contrast, the decoder is uni-directional, since while generating text one token at a time, you cannot allow the decoder to attend to the right of the current token. The transformer decoder constrains the self-attention by masking the tokens to the right.
BERT uses the transformer encoder architecture and can therefore attend both to the left and right, resulting in "bi-directionality".
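The difference between the two attention patterns comes down to a mask applied before the softmax. Below is a minimal sketch in plain Python, not taken from either paper's code: the function names and the all-zero score matrix are made up for illustration, and a real model would compute the scores from queries and keys.

```python
import math

def softmax(row):
    """Numerically stable softmax; entries set to -inf receive weight 0."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(scores, causal):
    """Turn a square matrix of raw attention scores into attention weights.

    causal=False: encoder-style, every position attends to every position.
    causal=True:  decoder-style, position i may only attend to positions j <= i.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [
            scores[i][j] if (not causal or j <= i) else float("-inf")
            for j in range(n)
        ]
        weights.append(softmax(row))
    return weights

# Hypothetical scores for a 4-token sequence (all equal, so only the mask matters).
scores = [[0.0] * 4 for _ in range(4)]

encoder_w = attention_weights(scores, causal=False)  # bi-directional (BERT-style)
decoder_w = attention_weights(scores, causal=True)   # left-context only

for name, w in (("encoder", encoder_w), ("decoder", decoder_w)):
    print(name)
    for row in w:
        print("  ", [round(x, 2) for x in row])
```

In the encoder output every row spreads weight over all four tokens, while in the decoder output row i has zeros to the right of position i, which is the "left-context-only" behaviour.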
From the BERT paper itself:
We note that in the literature the bidirectional Transformer is often referred to as a “Transformer encoder” while the left-context-only version is referred to as a “Transformer decoder” since it can be used for text generation.
Recommended reading: this article.

When and why would you want to use a Probability Density Function?

I'm an aspiring data scientist trying to understand when and why a data scientist would use a Probability Density Function (PDF).
A scenario and a few pointers for learning about this and related functions such as the CDF and PMF would be really helpful. Do you know of any book that covers these functions from a practical standpoint?
Why?
Probability theory is very important for modern data-science and machine-learning applications, because (in a lot of cases) it allows one to "open up a black box" and shed some light on the model's inner workings, and with luck find the ingredients needed to transform a poor model into a great one. Without it, a data scientist is very much restricted in what they are able to do.
A PDF is a fundamental building block of probability theory, absolutely necessary for any sort of probabilistic reasoning, along with expectation, variance, prior and posterior, and so on.
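To make the three functions from the question concrete, here is a small sketch using only Python's standard library. The helper names and the example parameters (a normal distribution, a binomial with n = 10 and p = 0.3) are arbitrary choices for illustration, not something the answer prescribes.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x (continuous variable)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution (discrete variable)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

# A density is not a probability: a narrow normal has a PDF value above 1 at its mean.
peak = normal_pdf(0.0, sigma=0.1)

# Probabilities of intervals come from the CDF (the integral of the PDF).
within_one_sigma = normal_cdf(1.0) - normal_cdf(-1.0)

# A PMF assigns probabilities directly, and they sum to 1 over all outcomes.
pmf_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))

print(f"pdf at the mean (sigma=0.1): {peak:.3f}")
print(f"P(-1 <= X <= 1), standard normal: {within_one_sigma:.4f}")
print(f"sum of binomial pmf: {pmf_total:.6f}")
```

A typical practical use is scoring how plausible a new observation is under a fitted distribution, for example flagging points whose density is very low as potential outliers.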
Some examples here on StackOverflow, from my own experience, where a practical issue boils down to understanding data distribution:
Which loss-function is better than MSE in temperature prediction?
Binary Image Classification with CNN - best practices for choosing “negative” dataset?
How do neural networks account for outliers?
When?
The questions above provide some examples. Here are a few more if you're interested (the list is by no means complete):
What is the 'fundamental' idea of machine learning for estimating parameters?
Role of Bias in Neural Networks
How to find probability distribution and parameters for real data? (Python 3)
I personally try to find a probabilistic interpretation whenever possible (choice of loss function, parameters, regularization, architecture, etc.), because this way I can move from blind guessing to making reasonable decisions.
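One well-known instance of such an interpretation: minimizing mean squared error is equivalent to maximizing the likelihood of the data under a Gaussian noise model with fixed variance. The sketch below checks this numerically on made-up targets with a constant predictor; the data, the candidate grid, and the function names are assumptions for illustration only.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def gaussian_nll(y_true, y_pred, sigma=1.0):
    """Negative log-likelihood of y_true under N(y_pred, sigma^2), summed over points."""
    n = len(y_true)
    const = n * math.log(sigma * math.sqrt(2.0 * math.pi))
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return const + squared / (2.0 * sigma ** 2)

y = [1.0, 2.0, 4.0]                           # made-up targets
candidates = [c / 10 for c in range(0, 51)]   # constant predictions 0.0 .. 5.0

best_by_mse = min(candidates, key=lambda c: mse(y, [c] * len(y)))
best_by_nll = min(candidates, key=lambda c: gaussian_nll(y, [c] * len(y)))

print(best_by_mse, best_by_nll)  # both pick the grid point closest to the mean of y
```

The two objectives differ only by a constant and a positive scale factor, so they share the same minimizer. If the noise is clearly not Gaussian (heavy tails, outliers), the same reasoning suggests a different loss, such as absolute error for Laplace-distributed noise.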
Reading
This is very opinion-based, but at least a few books are really worth mentioning: The Elements of Statistical Learning, An Introduction to Statistical Learning: with Applications in R, and Pattern Recognition and Machine Learning (if your primary interest is machine learning). That's just a start; there are dozens of books on more specific topics, like computer vision, natural language processing and reinforcement learning.

What is the Stochastic Reward Net model (SRN model)?

I can't find this anywhere else. Please give me a description of the Stochastic Reward Net model and the difference between UML and the SRN model.
Stochastic Reward Networks (SRN) use Petri nets for what they are trying to achieve. UML's state diagrams are also based on Petri nets. The relationship between SRN and UML is (at best) like that between a rocket and a train: both are used to transport people.

Writing code for A Neural Probabilistic Language Model Bengio, 2003. Not able to understand the model

I'm trying to write code for A Neural Probabilistic Language Model by Yoshua Bengio, 2003, but I'm not able to understand the connections between the input layer and the projection matrix, and between the projection matrix and the hidden layer. I don't understand how exactly the learning of the word-vector representations takes place.
Have a look at this answer here.
It explains the difference between the hidden layer and the projection layer, referring to this thesis.
Also, do read this paper by Tomas Mikolov and go through this tutorial.
This will really improve your understanding.
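As a rough illustration of the connections asked about, here is a minimal forward pass in plain Python. It is a sketch under stated assumptions, not the paper's implementation: the sizes, the random initialisation, and the helper names are made up, and the optional direct connections from the projection layer to the output are left out. The matrix names C, H, U and biases d, b follow the paper's notation.

```python
import math
import random

random.seed(0)

# Hypothetical sizes: vocabulary, embedding dimension, context words, hidden units.
V, m, n_ctx, h = 5, 3, 2, 4

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

C = rand_matrix(V, m)           # projection matrix: row i is the vector for word i
H = rand_matrix(h, n_ctx * m)   # projection layer -> hidden layer weights
d = [0.0] * h                   # hidden bias
U = rand_matrix(V, h)           # hidden layer -> output scores
b = [0.0] * V                   # output bias

def matvec(M, v):
    return [sum(w * x for w, x in zip(row, v)) for row in M]

def forward(context_ids):
    # 1. Projection layer: look up each context word's row of C and concatenate.
    #    No nonlinearity here; this equals multiplying one-hot inputs by C.
    x = [value for word_id in context_ids for value in C[word_id]]
    # 2. Hidden layer: tanh(d + H x).
    hidden = [math.tanh(di + zi) for di, zi in zip(d, matvec(H, x))]
    # 3. Output layer: softmax(b + U hidden) gives a distribution over the next word.
    scores = [bi + zi for bi, zi in zip(b, matvec(U, hidden))]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return x, [e / total for e in exps]

x, probs = forward([1, 3])      # context = word 1 followed by word 3
print(len(x), [round(p, 3) for p in probs])
```

Because the lookup is just a linear map with shared weights, C is trained by backpropagation like any other weight matrix: the gradient of the loss with respect to x is split back into chunks, and each chunk updates the row of C for the corresponding context word. That is where the word vectors are learned.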
Hope this helps!
