I am working on a problem that I want to formulate as a reinforcement learning task and integrate with OpenAI's Gym.
The environment has 96 states, and in each state we observe 3 elements:
[home_Loard, SOC_, Price_].
How can I make my observation space compatible with Gym?
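A minimal sketch of how such an observation space might be declared, assuming the three readings are continuous values; the low/high bounds below are placeholders, not values from the question:

    import numpy as np
    from gym import spaces

    # One observation = [home_Loard, SOC_, Price_]; the bounds are
    # illustrative placeholders and should be replaced with the real ranges.
    observation_space = spaces.Box(
        low=np.array([0.0, 0.0, 0.0], dtype=np.float32),
        high=np.array([10.0, 1.0, 100.0], dtype=np.float32),
        dtype=np.float32,
    )

If the 96 states themselves also need to be exposed, a spaces.Dict combining a Discrete(96) with the Box above is another option.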
I am new to machine learning (I started learning it two months ago). I am working on a project that converts a smart contract into a vector by applying NLP transformers. I have applied word embeddings and positional embeddings to the text, but after that I am having trouble computing the Key, Query and Value weights for the multi-head attention layers. Could anyone also provide code for a similar context?
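A minimal NumPy sketch of how the Query, Key and Value projections and scaled dot-product attention are usually computed; the shapes and the random weight matrices here are illustrative assumptions standing in for trained parameters:

    import numpy as np

    def multi_head_attention(x, num_heads, rng=np.random.default_rng(0)):
        # x: (seq_len, d_model) token embeddings (word + positional already added)
        seq_len, d_model = x.shape
        d_head = d_model // num_heads

        # Learned projection matrices; random placeholders for trained weights.
        w_q = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        w_k = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        w_v = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

        q, k, v = x @ w_q, x @ w_k, x @ w_v

        # Split into heads: (num_heads, seq_len, d_head)
        def split(t):
            return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

        q, k, v = split(q), split(k), split(v)

        # Scaled dot-product attention per head, softmax over the key axis
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out = weights @ v  # (num_heads, seq_len, d_head)

        # Concatenate heads back to (seq_len, d_model)
        return out.transpose(1, 0, 2).reshape(seq_len, d_model)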
I am trying to create a custom gym environment for a Connect Four game. My plan with this environment is to train a Connect Four AI with DQN using stable-baselines3.
When creating this environment, I am using Discrete(7) as my action space:

    from gym.spaces import Discrete

    @property
    def action_space(self):
        return Discrete(7)
However, during training I realised the AI would keep placing a piece into the same column even though it is already full. (It is allowed to do so because the action space is always 0 to 6.)
Is there a way to let the model know which moves are available? For example, instead of using Discrete(7), something like Discrete([0,1,2,3,4,6]) when the second-to-last column is full?
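Gym's Discrete space cannot be restricted to an arbitrary subset of actions, so the usual workaround (a sketch under assumed names, not code from the question) is to keep Discrete(7) and mask out full columns: either penalise or reject illegal moves inside step(), or switch from DQN to sb3-contrib's MaskablePPO, which consumes an action mask. A hypothetical valid_action_mask helper, assuming board is a 6x7 array where row 0 is the top:

    import numpy as np

    def valid_action_mask(board):
        # board: hypothetical (6, 7) array, 0 meaning empty; row 0 is the top row.
        # A column is playable only while its top cell is still empty.
        return board[0, :] == 0  # boolean mask of length 7

    # Inside the environment's step(), one option is to penalise and/or end the
    # episode when the chosen column is already full (illustrative values):
    # if not valid_action_mask(self.board)[action]:
    #     return self._get_obs(), -10.0, True, {}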
I'm trying to make an OpenCV program in Python 3 to detect the faces of my friends. I've seen that one can train a Cascade Classifier with OpenCV to detect a certain type of object. However, it isn't clear whether that could create a classifier refined enough to pick only my friends' faces out of a large sample, or whether this is something I could achieve without training my own Cascade Classifier. Can anyone help?
Cascade classifiers are usually built for face detection. You are trying to solve a different problem: face recognition.
Deep learning is a common approach nowadays, but other models do exist. http://www.face-rec.org/algorithms/ does a very good job of presenting the main algorithms.
This presents an interesting implementation in OpenCV.
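A rough sketch of one recognition approach that ships with OpenCV's contrib modules (the LBPH recognizer); the file names and labels below are placeholders, and opencv-contrib-python is assumed to be installed:

    import cv2
    import numpy as np

    # faces: grayscale face crops, one or more per friend; labels: integer id per friend.
    faces = [cv2.imread(p, cv2.IMREAD_GRAYSCALE)
             for p in ["friend1_01.png", "friend2_01.png"]]  # placeholder paths
    labels = np.array([0, 1])

    recognizer = cv2.face.LBPHFaceRecognizer_create()
    recognizer.train(faces, labels)

    # Detection (e.g. a stock Haar cascade) would find the face region first;
    # recognition then identifies whose face it is.
    test = cv2.imread("unknown.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
    label, confidence = recognizer.predict(test)
    print(label, confidence)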
I'm trying to use 'multi-threading' to do both training and prediction (testing) at the same time, using the Python 'threading' module as shown in https://www.tensorflow.org/api_docs/python/tf/FIFOQueue
My questions are:
If I use the Python 'threading' module, will TensorFlow use more of the GPU or more of the CPU?
Do I have to build two graphs (neural nets with the same topology) in TensorFlow, one for prediction and the other for training? Or is it okay to build just one graph?
I'll be very grateful to anyone who can answer these questions. Thanks!
If you use the Python threading module, it will only make use of the CPU; Python threading also does not give you true run-time parallelism, so you should use multiprocessing instead.
If your model uses ops such as dropout or batch norm, whose behaviour differs between training and validation, it is a good idea to create separate graphs and have the validation/testing graph reuse all of the training variables.
Note: you can also use a single graph, with additional operations that switch behaviour based on training/validation.
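A rough TensorFlow 1.x-style sketch of the single-graph option, where an is_training placeholder switches dropout between training and prediction; the layer sizes are arbitrary assumptions:

    import tensorflow as tf

    # Single graph; the is_training placeholder flips dropout on/off so the same
    # network can serve both training and prediction.
    x = tf.placeholder(tf.float32, [None, 10])
    is_training = tf.placeholder(tf.bool, [])

    hidden = tf.layers.dense(x, 64, activation=tf.nn.relu, name="hidden")
    hidden = tf.layers.dropout(hidden, rate=0.5, training=is_training)
    logits = tf.layers.dense(hidden, 2, name="logits")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # A training step would feed is_training=True; prediction feeds False.
        preds = sess.run(logits, feed_dict={x: [[0.0] * 10], is_training: False})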
How do I find the variables that contribute the most to a particular prediction in the case of a decision tree? For example, if there are features A, B, C, D, E and we build a decision tree on the dataset, then for a sample x, say variables C and D contribute the most to prediction(x). How do I find the variables that contributed the most to prediction(x) in H2O? I know H2O gives the global importance of variables once the decision tree is built; my question is about using that particular tree to make a decision and finding the variables that contributed to that particular decision. Scikit-learn has functions to extract the rules that were used to predict a sample. Does H2O have any such functionality?
H2O has no support for doing this currently (as of Feb 2017, h2o 3.10.3.x); Erin opened a JIRA for it: https://0xdata.atlassian.net/browse/PUBDEV-4007
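For reference, a sketch of the scikit-learn functionality the question alludes to (not an H2O feature): DecisionTreeClassifier.decision_path can list the split features a single sample passes through.

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    clf = DecisionTreeClassifier(random_state=0).fit(X, y)

    sample = X[:1]
    node_indicator = clf.decision_path(sample)  # nodes visited by the sample
    node_ids = node_indicator.indices           # node ids along the path
    leaf_id = clf.apply(sample)[0]

    for node_id in node_ids:
        if node_id == leaf_id:
            continue  # the leaf has no split
        feat = clf.tree_.feature[node_id]
        thresh = clf.tree_.threshold[node_id]
        print(f"node {node_id}: feature {feat} <= {thresh:.3f} "
              f"(sample value {sample[0, feat]:.3f})")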