What are the Discrete and Box datatypes used by OpenAi's Gym? - openai-gym

They both seem like matrix/arrays.
I'm not much of a python guy, are these generic datatypes used within python or specific to the gym?
I'm reading through API and still confused on what these actually are.
For example (from the docs)
print(env.action_space)
#> Discrete(2)
print(env.observation_space)
#> Box(4,)
Why does the box have a trailing comma? Does this represent something.
What's the difference between the Discrete data type and Box type?
From what I've gathered the numbers inside are the dimensions.
Is Discrete analogous to an array and Box analogous to a matrix?

Discrete is a collection of actions that the agent can take, where only one can be chose at each step. There is no variability to an action in this scenario. If, for example you have an agent traversing a grid-world, an action in a discrete space might tell the agent to move forward, but the distance they will move forward is a constant.
Box defines a space in which the agent can act, and allows for variable forward distances in the gridworld scenario.
MultiDiscrete allows for multiple actions to be taken at once, similar to Box, but like Discrete, either they are taken or they are not. There is no 0.1 forward step.
Review this question for more information on how Box is used.
There isn't great built documentation that I have found, but reviewing the comments in the source code can be helpful.

Related

Do I use foreach for 2 different inspection checks in activity diagram?

I am new in doing an activity and currently, I am trying to draw one based on given description.
I enter into doubt on a particular section as I am unsure if it should be 'split'.
Under the "Employee", the given description is as follows:
Employee enter in details about physical damage and cleanliness on the
machine. For the cleanliness, there must be a statement to indicate
that the problem is no longer an issue.
As such, I use a foreach as a means to describe that there should be 2 checks - physical and cleanliness (see diagram in the link), before it moves on to the next activity under the System - for the system to record the checks.
Thus, am I on the right track? Thank you in advance for any replies.
Your example is no valid UML. In order to make it proper you need to enclose the fork/join in a expansion region like so:
A fork/join does not accept any sematic labels. They just split the control flow into several parallel ones which join at the end.
However, this still seems odd since you would probably have some control for the different inspections being entered. So I'd guess there's a decision which loops through multiple inspection entries. Personally I use regions only for handling interrupts. ADs are nice to a certain level. But sometimes a tabular text (like suggested by Cockburn) is just easier to write and read. Graphical programming is not the ultimate answer (unlike 42).
First, the 'NO' branch of the decision node must lead somewhere (at the end?).
After, It differs if you want to show the process for ONE or MULTIPLE inspections. But the most logical way is to represent the diagram for an inspection, because you wrote inspection without S ! If you want represent more than one inspection, you can use decision and merge node to represent loop that stop when there is no more inspection.

Optimiser for excel spreadsheet

I'm a mechanical engineer, and I have developed a pretty cool spreadsheet that I use to size some steel members for lifting beams. The set back is that I need to do some trial and error in the selection of the member until I get one that gets as close to the allowable limits as possible.
What I'm hoping to improve on is to develop a function that based upon a length and weight variable that I enter, the program runs a loop and automatically selects the best member size(s) based upon a list of the members and their physical properties. Is this possible?
Yeah, depending on the complexity, either a simple search through parameters (less than, more than etc) might bring you the answer. You can do it quite easily via Pandas library. Just load up the excel as pandas DataFrame (pandas.read_excel()), which then will allow you to perform the searches on that DataFrame object.
If you want to run some optimization algo, you should look into SciPy's optimize to get what you're looking for based on the input data (it handles unconstrained and constrained functions).
Of course, the question you've stated is quite general, so I only pointed the direction. More info would be better.

Why is it usually easier to perform selection tests in object space?

I'm taking an introductory graphics course, and while I intuitively understand that converting a click or touch into object coordinates will make the math much cleaner, reduce the chances for human error, and potentially make debugging easier, none of these are actually a very good explanation, conceptually, of why object coordinate spaces are used in selection tests, as opposed to simply using world coordinates for the test - rather, they're just observations of what tends to happen when object coordinates are used. So I ask: why?
A selection test involves comparing the click coordinates, which you get in window coordinates, against lots and lots of object features, which are represented in object coordinates.
You need to transform them into the same coordinate system in order to do the checks, so you can EITHER transform the one simple click point OR you can transform all the various object features.
Transforming one point or line is just a lot easier that transforming a whole bunch of object features of various types.
There are cases where the location of a specific object or point may not be known within a world coordinate system, but is known relative to some other coordinate system.
To summarize an example from my course text, consider the idea of two different towns, one using a grid system for its layout, and the other using what I can only describe as the New England we-made-cow-trails-into-roads method. A government employee is tasked with creating a layout of the area which includes them, and in doing so has to convert the two coordinate systems into a third, which encompasses the other two.
Sometimes, using a world atlas just isn't practical to get across the street, and so something much more local (and relevant) is used instead, as it provides much more detail over a much smaller area.
The text also explains that it may be more than simply impractical to use a given coordinate system - it may yield results that are improbable or just plain wrong. This is evidenced in the evolution of the geocentric and heliocentric models of the universe - the distance of the stars from us was calculated with very different results using the two models.
Thinking of my own example, the best that comes to mind would be something like your own internal organs - from the outside, you don't know for sure exactly the shape, size, and structure of each of them, but your own body does. In order to be able to access that information, you need to look inside the body (ideally in a way that doesn't kill you). It's not something that is plainly observable from outside.

Identifying teeth area within a mouth region in an image

I am trying to do an image manipulation wherein the user would be prompted to enclose the mouth portion within an image. Once the user does that my application should identify the pixels that would identify the teeth (the color varying from white to yellow) and then I would like to brighten only those pixel. Could anyone give me a guidance on how to proceed?
Your question is quite honestly, very broad as an adequate answer will touch on a large number of areas.
Nevertheless, what you are trying to attempt is called Pattern Recognition. More specifically, your problem is geared towards image-analysis, dealing mainly in Template Matching:
Template matching is a technique in digital image processing for
finding small parts of an image which match a template image. It can
be used in manufacturing as a part of quality control, a way to
navigate a mobile robot, or as a way to detect edges in images.
The Template Matching page has a C-like language sample algorithm which demonstrates what you are attempting to do (identify a specific color within an image).
As for how to go about this, generally speaking you will have to load an image, store it into an array then try to manipulate it as the algorithm suggests:
One way to perform template matching on color images is to decompose
the pixels into their color components and measure the quality of
match between the color template and search image using the sum of the
absolute differences (SAD) computed for each color separately.
Of course, there are numerous projects in various languages that do that for you. My suggestion is to read up a bit more on the topic, pick a language, and attempt a solution using libraries as necessary.
One book that you might find to be very helpful is the classic Phillips: Image Processing in C even if you don't want to use C. Why? Because it pores over a lot of the algorithmic details in how they work, and how to implement them. And, its free too.

What is the best approach to compute efficiently the first intersection between a viewing ray and a set of objects?

For instance:
An approach to compute efficiently the first intersection between a viewing ray and a set of three objects: one sphere, one cone and one cylinder (other 3D primitives).
What you're looking for is a spatial partitioning scheme. There are a lot of options for dealing with this, and lots of research spent in this area as well. A good read would be Christer Ericsson's Real-Time Collision Detection.
One easy approach covered in that book would be to define a grid, assign all objects to all cells it intersects, and walk along the grid cells intersecting the line, front to back, intersecting with each object associated with that grid cell. Keep in mind that an object might be associated with more grid-cells, so the intersection point computed might actually not be in the current cell, but actually later on.
The next question would be how you define that grid. Unfortunately, there's no one good answer, and you need to consider what approach might fit your scenario best.
Other partitioning schemes of interest are different tree structures, such as kd-, Oct- and BSP-trees. You could even consider using trees combined with a grid.
EDIT
As pointed out, if your set is actually these three objects, you're definately better of just intersecting each one, and just pick the earliest one. If you're looking for ray-sphere, ray-cylinder, etc, intersection tests, these are not really hard and a quick google should supply all the math you might possibly need. :)
"computationally efficient" depends on how large the set is.
For a trivial set of three, just test each of them in turn, it's really not worth trying to optimise.
For larger sets, look at data structures which divide space (e.g. KD-Trees). Whole chapters (and indeed whole books) are dedicated to this problem. My favourite reference book is An Introduction to Ray Tracing (ed. Andrew. S. Glassner)
Alternatively, if I've misread your question and you're actually asking for algorithms for ray-object intersections for specific types of object, see the same book!
Well, it depends on what you're really trying to do. If you'd like to produce a solution that is correct for almost every pixel in a simple scene, an extremely quick method is to pre-calculate "what's in front" for each pixel by pre-rendering all of the objects with a unique identifying color into a background item buffer using scan conversion (aka the z-buffer). This is sometimes referred to as an item buffer.
Using that pre-computation, you then know what will be visible for almost all rays that you'll be shooting into the scene. As a result, your ray-environment intersection problem is greatly simplified: each ray hits one specific object.
When I was doing this many years ago, I was producing real-time raytraced images of admittedly simple scenes. I haven't revisited that code in quite a while but I suspect that with modern compilers and graphics hardware, performance would be orders of magnitude better than I was seeing then.
PS: I first read about the item buffer idea when I was doing my literature search in the early 90s. I originally found it mentioned in (I believe) an ACM paper from the late 70s. Sadly, I don't have the source reference available but, in short, it's a very old idea and one that works really well on scan conversion hardware.
I assume you have a ray d = (dx,dy,dz), starting at o = (ox,oy,oz) and you are finding the parameter t such that the point of intersection p = o+d*t. (Like this page, which describes ray-plane intersection using P2-P1 for d, P1 for o and u for t)
The first question I would ask is "Do these objects intersect"?
If not then you can cheat a little and check for ray collisions in order. Since you have three objects that may or may not move per frame it pays to pre-calculate their distance from the camera (e.g. from their centre points). Test against each object in turn, by distance from the camera, from smallest to largest. Although the empty space is the most expensive part of the render now, this is more effective than just testing against all three and taking a minimum value. If your image is high res then this is especially efficient since you amortise the cost across the number of pixels.
Otherwise, test against all three and take a minimum value...
In other situations you may want to make a hybrid of the two methods. If you can test two of the objects in order then do so (e.g. a sphere and a cube moving down a cylindrical tunnel), but test the third and take a minimum value to find the final object.

Resources