Using ray with custom environment created with gym.make() - openai-gym

I would like to run the following code, but with a custom environment instead of CartPole:
import ray
import ray.rllib.agents.dqn.apex as apex
from ray.tune.logger import pretty_print

def train_cartpole() -> None:
    ray.init()
    config = apex.APEX_DEFAULT_CONFIG.copy()
    config["num_gpus"] = 0
    config["num_workers"] = 3
    trainer = apex.ApexTrainer(config=config, env="CartPole-v0")
    for _ in range(1000):
        # Perform one iteration of training the policy with Apex-DQN
        result = trainer.train()
        print(pretty_print(result))

train_cartpole()
My environment is defined as a gym.Env subclass. I want to create it using gym.make, then apply my own wrapper and gym's FlattenObservation() to it. I have found ways of providing the environment as a class or as a string, but neither works for me because I do not know how to apply the wrappers afterwards.
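One approach that should fit this case (a sketch, not from the original question) is RLlib's environment registration hook: register_env takes a creator function, and the wrappers can be applied inside that function. "MyEnv-v0" and MyWrapper below are placeholders for your registered environment id and your own wrapper class.

import gym
from gym.wrappers import FlattenObservation
from ray.tune.registry import register_env

def env_creator(env_config):
    env = gym.make("MyEnv-v0")     # placeholder id for your gym.Env
    env = MyWrapper(env)           # placeholder for your own wrapper
    env = FlattenObservation(env)  # gym's observation-flattening wrapper
    return env

register_env("my_wrapped_env", env_creator)
# then pass the registered name as the env string:
# trainer = apex.ApexTrainer(config=config, env="my_wrapped_env")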

Related

Understanding TensorFlow "Recommending movies: retrieval" / usage of : in a Python class / usage of : in a Python function

I was reading and trying to work through the TensorFlow documentation below:
https://www.tensorflow.org/recommenders/examples/basic_retrieval?hl=sl
It implements a MovielensModel class. Let me provide a snippet of that code below:
class MovielensModel(tfrs.Model):

    def __init__(self, user_model, movie_model):
        super().__init__()
        self.movie_model: tf.keras.Model = movie_model
        self.user_model: tf.keras.Model = user_model
        self.task: tf.keras.layers.Layer = task

    def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
        # We pick out the user features and pass them into the user model.
        user_embeddings = self.user_model(features["user_id"])
        # And pick out the movie features and pass them into the movie model,
        # getting embeddings back.
        positive_movie_embeddings = self.movie_model(features["movie_title"])
        # The task computes the loss and the metrics.
        return self.task(user_embeddings, positive_movie_embeddings)
One usage is not clear to me, and I could not find much help in the online documentation:
The usage of self.movie_model: tf.keras.Model = movie_model. It looks like a first-class-object assignment of a function, but how does this work? When I tried d: c = 3 just to replicate it, it ran fine: d gets the value 3, but c is reported as undefined.
It's a type annotation; see https://docs.python.org/3/library/typing.html. Here self.movie_model is expected to be an instance of tf.keras.Model. Annotations are very useful because Python is a dynamically typed language: especially in function and method signatures, you can annotate the types of the input parameters and the type of the return value.
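To make the annotation behaviour concrete, here is a small self-contained sketch (the names are invented for illustration):

from typing import Dict

# An annotated assignment: the annotation only documents the expected
# type; Python does not enforce it at runtime.
threshold: float = 0.5

# In "d: c = 3" the part before "=" is "name: annotation", so d is
# assigned 3 while c is merely used as a (bogus) type expression; the
# annotation never creates or assigns a variable named c.

# Annotations in a signature: input parameter types and the return type.
def total_ratings(ratings: Dict[str, int]) -> int:
    return sum(ratings.values())

print(total_ratings({"Heat": 3, "Up": 5}))  # -> 8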

Restarting an optimisation with Pymoo

I'm trying to restart an optimisation in pymoo.
I have a problem defined as:
class myOptProb(Problem):
    """my body goes here"""

algorithm = NSGA2(pop_size=24)
problem = myOptProb(opt_obj=dp_ptr,
                    nvars=7,
                    nobj=4,
                    nconstr=0,
                    lb=0.3 * np.ones(7),
                    ub=0.7 * np.ones(7),
                    parallelization=('threads', cpu_count(),))
res = minimize(problem,
               algorithm,
               ('n_gen', 100),
               seed=1,
               verbose=True)
During the optimisation I write the design vectors and results to a .csv file. An example of design_vectors.csv is:
5.000000000000000000e+00, 4.079711567060104183e-01, 6.583544872784267143e-01, 4.712364759485179189e-01, 6.859360188593541796e-01, 5.653765991273791425e-01, 5.486782880836487131e-01, 5.275405748345924906e-01,
7.000000000000000000e+00, 5.211287914743063521e-01, 6.368123569438421949e-01, 3.496693260479644128e-01, 4.116734716044557763e-01, 5.343037085833151068e-01, 6.878382993278697732e-01, 5.244120877022839800e-01,
9.000000000000000000e+00, 5.425317846613321171e-01, 5.275405748345924906e-01, 4.269449637288642574e-01, 6.954464617649794844e-01, 5.318980876983187001e-01, 4.520564690494201510e-01, 5.203792876471586837e-01,
1.100000000000000000e+01, 4.579502451694219545e-01, 6.853050113762846340e-01, 3.695822666721857441e-01, 3.505318077758549089e-01, 3.540316632186925050e-01, 5.022648662707586142e-01, 3.086099221096791911e-01,
3.000000000000000000e+00, 4.121775968257620493e-01, 6.157117313805953174e-01, 3.412904026310568106e-01, 4.791574104703620329e-01, 6.634382012372381787e-01, 4.174456593494717538e-01, 4.151101354345394512e-01,
The results.csv is:
5.000000000000000000e+00, 1.000000000000000000e+05, 1.000000000000000000e+05, 1.000000000000000000e+05, 1.000000000000000000e+05,
7.000000000000000000e+00, 1.041682833582066703e+00, 3.481167125962069189e-03, -5.235115318709097909e-02, 4.634480813876099177e-03,
9.000000000000000000e+00, 1.067730307802263967e+00, 2.194702810002167534e-02, -3.195892023664552717e-01, 1.841232582360878426e-03,
1.100000000000000000e+01, 8.986880344052742275e-01, 2.969022150977750681e-03, -4.346692726475211849e-02, 4.995468429444801205e-03,
3.000000000000000000e+00, 9.638770499257821589e-01, 1.859596479928402393e-02, -2.723230073142696162e-01, 1.600910928983005632e-03,
The first column is the index of the design vector - because I thread asynchronously, I specify the indices.
I see that it should be possible to restart the optimisation via the sampling parameter of pymoo.algorithms.nsga2.NSGA2, but I couldn't find a working example. The documentation for both population and individuals is also not clear. So how can I restart the optimisation with the previous results?
Yes, you can initialize the algorithm object with a population instead of doing it randomly.
I have written a small tutorial for a biased initialization:
https://pymoo.org/customization/initialization.html
Because in your case the data already exists, in a CSV or in-memory file, you might want to create a dummy problem (I have called it Constant in my example) to set the attributes in the Population object. (In the population, X, F, G, CV and feasible need to be set.) Another way would be setting the attributes directly...
The biased initialization with a dummy problem is shown below. If you already use pymoo to store the csv files, you can also just np.save the Population object directly and load it; then none of the intermediate steps are necessary.
I am planning to improve the checkpoint implementation in the future, so if you have more feedback or use cases which are not possible yet, please let me know.
import numpy as np
from pymoo.algorithms.nsga2 import NSGA2
from pymoo.algorithms.so_genetic_algorithm import GA
from pymoo.factory import get_problem, G1, Problem
from pymoo.model.evaluator import Evaluator
from pymoo.model.population import Population
from pymoo.optimize import minimize

class YourProblem(Problem):

    def __init__(self, n_var=10):
        super().__init__(n_var=n_var, n_obj=1, n_constr=0, xl=-0, xu=1, type_var=np.double)

    def _evaluate(self, x, out, *args, **kwargs):
        out["F"] = np.sum(np.square(x - 0.5), axis=1)

problem = YourProblem()

# create initial data and set it on the population object - for you, this comes from your file
N = 300
X = np.random.random((N, problem.n_var))
F = np.random.random((N, problem.n_obj))
G = np.random.random((N, problem.n_constr))

class Constant(YourProblem):

    def _evaluate(self, x, out, *args, **kwargs):
        out["F"] = F
        out["G"] = G

pop = Population().new("X", X)
Evaluator().eval(Constant(), pop)

algorithm = GA(pop_size=100, sampling=pop)

minimize(problem,
         algorithm,
         ('n_gen', 10),
         seed=1,
         verbose=True)
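To connect this pattern to the CSV files from the question, the loading step could look roughly like this (the file names, the index column, and the trailing-comma handling are assumptions based on the snippets above):

import numpy as np

# Both files start with the design-vector index; sort on it so the rows
# of X and F line up, then drop the index column and the empty trailing
# column produced by the trailing comma in each row.
dv = np.genfromtxt("design_vectors.csv", delimiter=",")
res = np.genfromtxt("results.csv", delimiter=",")
dv = dv[np.argsort(dv[:, 0])]
res = res[np.argsort(res[:, 0])]

X = dv[:, 1:-1]   # the 7 design variables
F = res[:, 1:-1]  # the 4 objective values

# X and F can then replace the random arrays in the example above.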

__post_init__ of python 3.x dataclasses is not called when loaded from yaml

Please note that I have already referred to the StackOverflow question here. I post this question to investigate whether calling __post_init__ manually is safe or not. Please check the question till the end.
Check the code below. In step 3 we load dataclass A from a yaml string; note that this does not call the __post_init__ method.
import dataclasses
import yaml

@dataclasses.dataclass
class A:
    a: int = 55

    def __post_init__(self):
        print("__post_init__ got called", self)

print("\n>>>>>>>>>>>> 1: create dataclass object")
a = A(33)
print(a)  # print dataclass
print(dataclasses.fields(a))

print("\n>>>>>>>>>>>> 2: dump to yaml")
s = yaml.dump(a)
print(s)  # print yaml repr

print("\n>>>>>>>>>>>> 3: create class from str")
a_ = yaml.load(s)
print(a_)  # print dataclass loaded from yaml str
print(dataclasses.fields(a_))
The solution I see for now is calling __post_init__ on my own at the end, as in the snippet below:
a_.__post_init__()
I am not sure if this is a safe recreation of the yaml-serialized dataclass. Also, it will pose a problem when __post_init__ takes arguments, i.e. when dataclass fields are of dataclasses.InitVar type.
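To illustrate that InitVar concern, a minimal hypothetical sketch (class B is invented for illustration): the InitVar value is consumed by __init__ and never stored on the instance, so a manual __post_init__ call has to invent it again.

import dataclasses

@dataclasses.dataclass
class B:
    a: int = 0
    scale: dataclasses.InitVar[int] = 1  # consumed by __init__, not stored

    def __post_init__(self, scale):
        self.a *= scale

b = B(2, 3)          # __init__ forwards scale=3 to __post_init__
# b.__post_init__()  # TypeError: missing the 'scale' argument
b.__post_init__(1)   # a manual call must supply some value for scale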
This behavior is working as intended. You are dumping an existing object, so when you load it pyyaml intentionally avoids initializing the object again. The direct attributes of the dumped object will be saved even if they are created in __post_init__, because that function runs prior to the dump. When you want the side effects that come from __post_init__, like the print statement in your example, you will need to ensure that initialization occurs.
There are a few ways to accomplish this. You can use either the metaclass approach or the constructor/representer approach described in pyyaml's documentation. You could also manually alter the dumped string in your example to be '!!python/object/new:' instead of '!!python/object:'. If your eventual goal is to have the yaml file generated in a different manner, then this might be a solution.
See below for an update to your code that uses the metaclass approach and calls __post_init__ when loading from the dumped class object. The call to cls(**fields) in from_yaml ensures that the object is initialized. yaml.load uses cls.__new__ to create objects tagged with '!!python/object:' and then loads the saved attributes into the object manually.
import dataclasses
import yaml

@dataclasses.dataclass
class A(yaml.YAMLObject):
    a: int = 55

    def __post_init__(self):
        print("__post_init__ got called", self)

    yaml_tag = '!A'
    yaml_loader = yaml.SafeLoader

    @classmethod
    def from_yaml(cls, loader, node):
        fields = loader.construct_mapping(node, deep=True)
        return cls(**fields)

print("\n>>>>>>>>>>>> 1: create dataclass object")
a = A(33)
print(a)  # print dataclass
print(dataclasses.fields(a))

print("\n>>>>>>>>>>>> 2: dump to yaml")
s = yaml.dump(a)
print(s)  # print yaml repr

print("\n>>>>>>>>>>>> 3: create class from str")
a_ = yaml.load(s, Loader=A.yaml_loader)
print(a_)  # print dataclass loaded from yaml str
print(dataclasses.fields(a_))

How to run all modules in a folder?

/machine_learning
    dtree.py
    lr.py
    nb.py
    svm.py
/main.py
Each Python file contains one machine learning method class. In main.py I import machine_learning as ml, so I call each method like:
model = ml.py_name.model_name()
Is there a way to build a list containing all the model classes, like:
[ml.svm.svm_ml(), ml.nb.naivebayes(), ml.lr.logisticregression(), ml.dtree.decisiontree()]
I tried:
ml_list = [name for _, name, _ in pkgutil.iter_modules(['machine_learning'])]
print(ml_list)
# ["dtree", "lr", "nb", "svm"]
Import all the models you need -> from sklearn.neighbors import KNeighborsClassifier
Create a list: models = []
Add models to the list -> models.append(KNeighborsClassifier(n_neighbors=3))
Split your data into train and test sets
Use a for loop to fit your data to the models:
for model in models:
    model.fit(X, Y)
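Coming back to the original question of building the list dynamically, a sketch along the following lines might work; it assumes each module defines exactly one model class with a no-argument constructor, per the layout above:

import importlib
import inspect
import pkgutil

import machine_learning as ml

models = []
for _, mod_name, _ in pkgutil.iter_modules(ml.__path__):
    module = importlib.import_module(f"{ml.__name__}.{mod_name}")
    # Instantiate every class defined in this module itself, skipping
    # anything the module merely imported from elsewhere.
    for _, cls in inspect.getmembers(module, inspect.isclass):
        if cls.__module__ == module.__name__:
            models.append(cls())

print(models)  # e.g. instances of the dtree, lr, nb and svm model classes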

Using the globals argument of timeit.timeit

I am attempting to run timeit.timeit in the following class:
from contextlib import suppress
from pathlib import Path
import subprocess
from timeit import timeit

class BackupVolume():
    '''
    Backup a file system on a volume using tar
    '''
    targetFile = "bd.tar.gz"
    srcPath = Path("/BulkData")
    excludes = ["--exclude=VirtualBox VMs/*",  # Exclude all the VM stuff
                "--exclude=*.tar*"]            # Exclude this tar file

    @classmethod
    def backupData(cls, targetPath="~"):  # pylint: disable=invalid-name
        '''
        Runs tar to backup the data in /BulkData so we can reorganize that
        volume. Deletes any old copy of the backup repository.

        Parameters:
        :param str targetPath: Where the backup should be created.
        '''
        # pylint: disable=invalid-name
        tarFile = Path(Path(targetPath / cls.targetFile).resolve())
        with suppress(FileNotFoundError):
            tarFile.unlink()
        timeit('subprocess.run(["tar", "-cf", tarFile.as_posix(),'
               'cls.excludes[0], cls.excludes[1], cls.srcPath.as_posix()])',
               number=1, globals=something)
The problem I have is that inside timeit() it cannot interpret subprocess. I believe that the globals argument to timeit() should help but I have no idea how to specify the module namespace. Can someone show me how?
I think in your case globals=globals() in the timeit call would work.
Explanation
The globals argument specifies a namespace in which to execute the code. Because you import the subprocess module at module level (outside the function, even outside the class), you can use globals(). In doing so you get access to the dictionary of the current module; you can find more info in the documentation.
Super simple example
In this example I'll show 3 different scenarios:
1. Need to access globals
2. Need to access locals
3. A custom namespace
Code to follow the example:
import subprocess
from timeit import timeit
import math

class ExampleClass():

    def performance_glob(self):
        return timeit("subprocess.run('ls')", number=1, globals=globals())

    def performance_loc(self):
        a = 69
        b = 42
        return timeit("a * b", number=1, globals=locals())

    def performance_mix(self):
        a = 69
        return timeit("math.sqrt(a)", number=1, globals={'math': math, 'a': a})
In performance_glob you are timing something that needs a global import, the subprocess module. If you don't pass the globals namespace, you'll get an error message like this: NameError: name 'subprocess' is not defined.
Conversely, if you pass globals() to the function that depends on local values, performance_loc, the variables needed for the timeit execution, a and b, won't be in scope. That's why you use locals() there.
The last one is a general scenario where you need both the local vars in the function and the general imports. Keep in mind that the globals parameter can be specified as a plain dictionary, so you just need to provide the necessary keys to customize the namespace.
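A quick way to exercise all three scenarios, assuming the ExampleClass defined above:

ex = ExampleClass()
print(ex.performance_glob())  # subprocess resolved through globals()
print(ex.performance_loc())   # a and b resolved through locals()
print(ex.performance_mix())   # math and a supplied via a custom dict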
