Vtables are ubiquitous in most OO implementations, but do they have alternatives? The wiki page for vtables has a short blurb, but not really to much info (and stubbed links).
Do you know of some language implementation which does not use vtables?
Are there are free online pages which discuss the alternatives?
Yes, there are many alternatives!
Vtables are only possible when two conditions hold.
All method calls can be determined statically. If you can call functions by string name, or if you have no type information about what objects you are calling methods on, you can't use vtables because you can't map each method to the index in some table. Similarly, if you can add functions to a class at runtime, you can't assign all methods an index in the vtable statically.
Inheritance can be determined statically. If you use prototypal inheritance, or another inheritance scheme where you can't tell statically what the inheritance structure looks like, you can't precompute the index of each method in the table or what particular class's method goes in a slot.
Commonly, inheritance is implemented by having a string-based table mapping names of functions to their implementations, along with pointers allowing each class to look up its base class. Method dispatch is then implemented by walking this structure looking for the lowest class at or above the class of the receiver object that implements the method. To speed up execution, techniques like inline caching are often used, where call sites store a guess of which method should be invoked based on the type of the object to avoid spending time traversing this whole structure. The Self programming language used this idea, which was then incorporates into the HotSpot JVM to handle interfaces (standard inheritance still uses vtables).
Another option is to use tracing, where the compiler emits code that guesses what the type of the object is and then hardcodes the method to call into the trace. Mozilla Firefox uses this in its JavaScript interpreter, since there isn't a way to build vtables for every object.
I just finished teaching a compilers course and one of my lectures was on implementations of objects in various programming languages and the associated tradeoffs. If you'd like, you can check out the slides here.
Hope this helps!
Related
I am learning about virtual function tables and their representation by analyzing a binary of a simple program written in Visual C++ (with some optimizations on).
A few days ago I asked this question while being stuck on virtual method table content with identical COMDAT folding on.
Now I'm stuck on something else: whenever I analyze a class, I need to find its Virtual Method Table. I can do this by finding either its RTTITypeDescriptor or _s_RTTIClassHierarchyDescriptor, finding a cross reference on it, which should lead me to the _RTTICompleteObjectLocator. When I find a cross reference to the Complete Object Locator, it is written just before the VMT (basically -1st entry of the VMT).
This approach works on some classes (their names start with C in my program). Then there are classes, that are named with I in the beginning and I am able to pair them with other classes starting with C -- for example there is class CClass and it inherits from IClass. These I-classes are probably serving as interfaces to the C-classes and thus they probably only contain abstract methods.
By searching a cross reference to Type Descriptor or Class Hierarchy Descriptor of any of the I-classes I cannot find anything -- there is no Complete Object Locator that would lead me to the VMT of the class (that should be full of references to pure_virtual call if I am correct about the all-abstract methods in the I-classes and if I understand correctly what VMT of abstract class looks like).
Why do the I-classes have no VMT? Did the compiler optimize it out because it would just be full of references to pure_virtual call and manages it in a different way?
These "interfaces" abstract classes probably have need no user written code in any their constructors and destructors (these either have an empty body and no ctor-init-list, or simply are never user defined); let's call these pure interface classes.
[Pure interface class: concept related but not identical to Java interfaces that are (were?) defined as having zero implementation code, in any member function. But be careful with analogies, as Java interfaces inheritance semantic isn't the same as C++ abstract classes inheritance semantic.]
It means that in practice no used object ever has pure interface class type: no expression ever refers to an object with pure interface type. Hence, no vtable is ever needed, so the vtable, which may have been generated during compilation, isn't included in linked code (the linker can see the symbol of the pure interface class vtable isn't used).
In the book GOOS. It is told not to mock values, which leaves me confused. Does it means that values don't have any behavior?
I dont' much knowledge about the value object but AFAIK the value objects are those which are immutable. Is there any heuristic on deciding when to create a value object?
Not all immutable objects are value objects. By the way, when designing, consider that the ideal object has only immutable fields and no-arg methods.
Regarding the heuristic, a valid approach can be considering how objects will be used: if you build an instance, invoke some methods and then are done with it (or store it in a field) likely it won't be a value object. On the contrary, if you keep objects in some data structure and compare them (with .equals()) likely you have a value object. This is especially true for objects that will be used to key Maps
Value objects should be automatic-tested themselves (and tests are usually a pleasure to read and write because are straightforward) but there's no point in mocking them: the main practical reasons for mocking interfaces is that implementation classes
are usually difficult to build (need lot of collaborators)
are expensive to run (access the network, the filesystem, ...).
Neither apply to value objects.
Quoting the linked blog post:
There are a couple of heuristics for when a class is not worth mocking. First, it has only accessors or simple methods that act on values it holds, it doesn't have any interesting behaviour. Second, you can't think of a meaningful name for the class other than VideoImpl or some such vague term.
The implication of the first point, in the context of a section entitled "Don't mock value objects", is that value objects don't have interesting behaviour.
I am attempting to understand how I should use the realization of interfaces and the implementation of abstract classes in UML. I came across the post at https://stackoverflow.com/a/13438187/700543 whereby the poster states that pure virtual methods are interfaces whilst those that are part pure virtual methods are abstract classes. Is anyone able to give me a real world scenario and not one based on code?
An Interface is only a "class skeleton" for library users to extend, and as you said, methods cannot be implemented. An Abstract class can have implemented methods. I will give you a real life example:
Imagine I provide an Interface for people to implement sorting functions and I also provide a Class for bench marking sorting functions. My bench marking class only needs to know what methods of the Interface it needs to call in order to perform the bench marking, it does not know how they are implemented. Therefore, inside the bench marking class you might only see something like sortInterfaceInstace.getNumberOfSwap(), whereas sortInterfaceInstance is only known to be of sortInterface type at compile time, and not of any specific user sort implementation.
If you need implemented methods, use abstract instead of interfaces.
An interface only describes how something can be used, it provides none of the underlying implementation of how it gets done, i.e. a class with only pure virtual functions. An English analogy for an interface may be an adjective.
One example of an interface is a Movable interface. This interface may provide one pure virtual function move which tells the object to move to a given location. However, how it moves there is not implemented.
An abstract class on the other hand differs from an interface in that it provides some of the implementation details, but not all of them. These are conceptually high-level items that can be manipulated in certain ways, but when you get down to it the high-level item doesn't really exists or make sense by itself.
For example, say we have an abstract Shape class. The shape can have a certain origin which can be tracked independent of what Shape it is. The functions to transform the shape can be declared and implemented in the Shape class, saving the hassle of having to provide the same implementation in each sub-class. However, when you try to get the area or perimeter of the shape it's difficult to answer this without knowing more about the shape.
For example, could I implement a rule that would change every string that followed the pattern '1..4' into the array [1,2,3,4]? In JavaScript:
//here you create a rule that changes every string that matches /$([0-9]+)_([0-9]+)*/
//ever created into range($1,$2) (imagine a b are the results of the regexp)
var a = '1..4';
console.log(a);
>> output: [1,2,3,4];
Of course, I'm pretty confident that would be impossible in most languages. My question is: is there any language in which that would be possible? Or have anyone ever proposed something like that? Does this thing have a 'name' for which I can google to read more about?
Modifying the language from whithin itself falls under the umbrell of reflection and metaprogramming. It is referred as behavioral reflection. It differs from structural reflection that opperates at the level of the application (e.g. classes, methods) and not the language level. Support for behavioral reflection varies greatly across languages.
We can broadly categorize language changes in two categories:
changes that modify the semantics (i.e. the rules) of the language itself (e.g. redefine the method lookup algorithm),
changes that modify the syntax (e.g. your syntax '1..4' to create arrays).
For case 1, certain languages expose the structure of the application (structural reflection) and the inner working of their implementation (behavioral reflection) to the application itself via special object, called meta-objects. Meta-objects are reifications of otherwise implicit aspects, that become then explicitely manipulable: the application can modify the meta-objects to redefine part of its structure, or part of the language. When it comes to langauge changes, the focus is usually on modifiying message sending / method invocation since it is the core mechanism of object-oriented language. But the same idea could be applied to expose other aspects of the language, e.g. field accesses, synchronization primitives, foreach enumeration, etc. depending on the language.
For case 2, the program must be representated in a suitable data structure to be modified. For languages of the lisp family, the program manipulates lists, and the program can be itself represented as lists. This is called homoiconicity and is handy for metaprogramming, hence the flexibility of lisp-like languages. For other languages, their representation is usually an AST. Transforming the representation of the program, or rewriting it, is possible with macro, preprocessors, or hooks during compilation or class loading.
The line between 1 and 2 is however blurry. Syntactical changes can appear to modify the semantics of the language. For instance, I can rewrite all fields accesses with proper getter and setter and perform additional logic there, say to implement transactional memory. Did I perform a semantical change of what a field access is, or merely a syntax change?
Also, there are other constructs the fall bewten the lines. For instance, proxies and #doesNotUnderstand trap are popular techniques to simulate the reification of message sends to some extent.
Lisp and Smalltalk have been very influencial in the field of metaprogramming, and I think the two following projects/platform are interesting to look at for a representative of each of these:
Racket, a lisp-like language focused on growing languages from within the langauge
Helvetia, a Smalltalk extension to embed new languages into the host language by leveraging the AST of the host environment.
I hope you enjoyed this even if I did not really address your question ;)
Your desired change require modifying the way literals are created. This is AFAIK not usually exposed to the application. The closed work that I can think of is Virtual Values for Language Extension, that tackled Javascript.
Yes. Common Lisp (and certain other lisps) have "reader macros" which allow the user to reprogram (incrementally) the mapping between the input stream and the actual language construct as parsed.
See http://dorophone.blogspot.com/2008/03/common-lisp-reader-macros-simple.html
If you want to operate on the level of objects, you will want to use a debugging/memory management framework that keeps track of all objects, and processes the rules on each evaluation step (nasty). This seems like the kind of thing you might be able to shoehorn into smalltalk.
CLOS (Common Lisp Object System) allows redefinition of live objects.
Ultimately you need two things to implement this:
Access to the running system's AST (Abstract Syntax Tree), and
Access to the running system's objects.
You'll want to study meta-object protocols and the languages that use them, then the implementations of both the MOPs and the environment within which these programs are executed.
Image-based systems will be the easiest to modify (e.g., Lisp, potentially Smalltalk).
(Image-based systems store a snapshot of a running system, allowing complete shutdown and restarts, redefinitions, etc. of a complete environment, including existing objects, and their definitions.)
Ruby allows you to extend classes. For instance, this example adds functionality to the String class. But you can do more than add methods to classes. You can also overwrite methods, but defining a method that's already been defined. You may want to preserve access to the original method using alias_method.
Putting all this together, you can overload a constructor in Ruby, but in your case, there's a catch: It sounds like you want the constructor to return a different type. Constructors by definition return instances of their class. If you just want it to return the string "[1,2,3,4]", that's simple enough:
class string
alias_method :initialize :old_constructor
def initialize
old_constructor
# code that applies your transformation
end
end
But there's no way to make it return an Array if that's what you want.
What is vftable in high programming languages?
I read something like it's the address of a virtual object structure, but this is a pretty messy information
Can someone please explain it?
It most likely stands for "Virtual Function Table", and is a mechanism used by some runtime implementations in order to allow virtual function dispatch.
Mainstream C++ implementations (GCC, Clang, MSVS) call it the vtable. C has no polymorphism. I could only speculate about other languages.
Here's what Wikipedia says on the topic:
An object's dispatch table will contain the addresses of the object's
dynamically bound methods. Method calls are performed by fetching the
method's address from the object's dispatch table. The dispatch table
is the same for all objects belonging to the same class, and is
therefore typically shared between them. Objects belonging to
type-compatible classes (for example siblings in an inheritance
hierarchy) will have dispatch tables with the same layout: the address
of a given method will appear at the same offset for all
type-compatible classes. Thus, fetching the method's address from a
given dispatch table offset will get the method corresponding to the
object's actual class.[1]
The C++ standards do not mandate exactly how dynamic dispatch must be
implemented, but compilers generally use minor variations on the same
basic model.
Typically, the compiler creates a separate vtable for each class. When
an object is created, a pointer to this vtable, called the virtual
table pointer, vpointer or VPTR, is added as a hidden member of this
object (becoming its first member unless it's made the last[2]). The
compiler also generates "hidden" code in the constructor of each class
to initialize the vpointers of its objects to the address of the
corresponding vtable. Note that the location of the vpointer in the
object instance is not standard among all compilers, and relying on
the position may result in unportable code. For example, g++
previously placed the vpointer at the end of the object.[3]
Ellis & Stroustrup 1990, pp. 227–232
Heading "Multiple Inheritance"
CodeSourcery C++ ABI
Vftable is not explicitly mentioned in the C++ standard, but most (if not all) implementations use it for virtual function implementation.
For each class with virtual functions the compiler creates an array of function poiners which are the pointers to the last overriden version of the virtual functions of that class. Then each object has a pointer to the vtable of its dynamic class.
See this question and its accepted answer for more illustrations
Virtual dispatch implementation details