Clojure closures and GC

It is my understanding that the default ClassLoader used in Java (and thus Clojure) holds on to pointers to any anonymous classes it creates, and thus to lambdas and closures. These are never garbage collected, and so represent a "memory leak". There is some investigation going on for Java 7 or 8 (https://blogs.oracle.com/jrose/entry/anonymous_classes_in_the_vm) into adding an anonymous ClassLoader that will not retain references to these functions. In the meantime, how are people dealing with writing long-running applications in languages like Clojure and Scala, which encourage the use of these constructs?
Is there any possibility that Clojure could provide its own anonymous ClassLoader, extending the system one, but not holding onto created classes?

Based on bendin's comment above and on The Joy of Clojure by Michael Fogus and Chris Houser, in the section "Compile-time vs. Run-time" (Chapter 7, Section 7.2): closures and anonymous functions are compiled to bytecode once, at compile time, and each call to the higher-order function that returns the closure simply returns a new instance of the closure's class, not a new class. These instances will, of course, be garbage collected. Since there is an obvious compile-time upper bound on the number of anonymous function and closure classes, memory will rarely, if ever, be an issue.
My worries were unfounded.
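A loose parallel from Rust, purely for illustration of the "one class, many instances" point: there too, a closure expression gets a single anonymous type generated at compile time, and each call of the enclosing function produces a new value of that type rather than a new type.

// Illustrative Rust sketch: one anonymous closure type, many instances.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n // the closure's type is generated once, at compile time
}

fn main() {
    let add1 = make_adder(1); // a new instance of that type
    let add2 = make_adder(2); // another instance of the same type
    assert_eq!(add1(10) + add2(10), 23);
}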

Related

Why is Box called that in Rust?

Box<> is explained like this in the Rust Book:
... allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data.
With a description like that, I would expect the described object to be called Heap<> or somethingHeapsomethingelse (DerefHeap, perhaps?). Instead, we use Box.
Why was the name Box chosen?
First, Heap is a very overloaded term; importantly, a heap is an abstract data structure often used to implement things like priority queues. Having a type called Heap which is not a heap would be extremely confusing, which is a good reason to avoid that name.
Second, "box" is related to the concept of "boxing" or "boxed" objects, in languages which strongly distinguish between value and reference types e.g. Java or Javascript: https://en.wikipedia.org/wiki/Object_type_(object-oriented_programming), in those a "boxed" type is the heap-allocated version of a value type e.g. int/Integer in java, or number/Number in Javascript.
Rust's Box performs an operation that is similar in spirit. Rust also originally had a built-in "lifting" operator called box (it remains an internal operation, and was originally planned to be stabilised for placement new), so "box"/"boxing" makes sense linguistically in a way "heap"/"heaping" really does not ("heaping" suggests piling a lot of things onto a heap).
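As a concrete illustration of the operation described above (a minimal sketch, not part of the original answer):

// Moving a value into a Box copies it to the heap; the stack keeps only the pointer.
fn main() {
    let on_stack: [u64; 4] = [1, 2, 3, 4];         // lives directly on the stack
    let boxed: Box<[u64; 4]> = Box::new(on_stack); // now heap-allocated; pointer stays on the stack
    // Deref lets the boxed value be used much like the unboxed one.
    assert_eq!(boxed[2], on_stack[2]);
}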

Stack allocation of `isbits` types in Julia

Summary of question and answers
Objects of a particular type, say
type Foo
    a::A
    b::B
end
can be stored in either of two ways:
Inlined (aka by value): in this case, the statement "variable foo::Foo is stored at location x" effectively means we have a variable foo.a::A at location x and a variable foo.b::B at location x + sizeof(A) (technically the addresses could be a bit more complicated, but that's irrelevant for our purposes).
Referenced (aka by reference): "foo::Foo is stored at location x" means the location x contains a pointer fooptr::Ptr{Foo} such that there is a variable foo.a::A at location fooptr and foo.b::B at location fooptr + sizeof(A).
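For readers used to languages where this choice is explicit, here is a rough Rust analogue of the two layouts (only an analogy; Julia does not spell it this way):

// "Inlined": the fields live directly in the variable's slot.
// "Referenced": the slot holds only a pointer to the fields.
struct Foo { a: u32, b: u64 }

fn main() {
    let inlined: Foo = Foo { a: 1, b: 2 };
    let referenced: Box<Foo> = Box::new(Foo { a: 1, b: 2 });
    // The referenced local is pointer-sized regardless of how big Foo is.
    assert_eq!(std::mem::size_of_val(&referenced), std::mem::size_of::<usize>());
    let _ = (inlined.b, referenced.a);
}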
Unlike other languages (I'm looking at you, C/C++), Julia decides by itself whether to store variables inlined or referenced, and it does so based on the properties of the type:
mutable types -> referenced,
immutable types -> referenced if at least one of its fields is referenced, inlined otherwise.
There are at least two reasons for this rule:
StefanKarpinski's answer: The garbage collector needs to be able to find all pointers to heap-allocated objects on the stack. Currently, Julia ensures this by storing all such pointers on a separate "shadow stack", but if we allowed composite types containing pointers to be placed on the stack, then such a neat separation would no longer be possible. Instead, the compiler would need to look for pointers among other variables, which poses technical difficulties.
yuyichao's answer: Julia requires the inline/reference decision to be made on a per-type rather than per-object basis, which means a hypothetical type
immutable A
    a::A
end
would have to be infinitely big if we insisted on inlining it. So we would either have to forbid such recursive immutable types, or we could at most allow non-recursive immutable types to be inlined.
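The same infinite-size constraint shows up in languages where layout is explicit; as a purely illustrative Rust analogue:

// struct A { a: A }            // rejected: a type containing itself by value has no finite size
struct A { a: Option<Box<A>> }  // breaking the cycle with a pointer gives A a finite size

fn main() {
    let _ = A { a: Some(Box::new(A { a: None })) };
}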
Original question
My understanding of memory management in Julia is:
mutable types -> heap-allocated,
immutable types and tuples -> stack-allocated unless one of their fields is heap-allocated (i.e. mutable).
I don't quite understand the rationale for this behaviour, however. I've read somewhere that the problem with stack-allocating immutables with pointers to mutables is that then the garbage collector might consider the mutables unreachable and destroy them prematurely. On the other hand, if we place the immutable on the heap then there will still be a pointer to the mutables, so it might seem like we avoided the problem, but actually we just shifted it to making sure that now the immutable itself will not be destroyed.
Can anyone explain this to someone like me, who has only a very superficial knowledge of how garbage collection works?
The problem with stack allocation of objects which reference other objects is knowing that they need to be traced during garbage collection. The simplest way to do this is what Julia does: heap-allocate the objects and "root" them using a "shadow stack" which is pushed and popped in sync with the actual stack. This introduces a fair bit of overhead and forces these objects to be heap allocated.
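To make "pushed and popped in sync with the actual stack" concrete, here is a deliberately toy sketch of the idea (not Julia's implementation; all names are made up):

use std::cell::RefCell;

thread_local! {
    // The shadow stack: identifiers of heap objects rooted by live stack frames.
    static SHADOW_STACK: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

// Root `obj` for the duration of `f`, mirroring a stack frame's lifetime.
fn with_rooted<R>(obj: usize, f: impl FnOnce() -> R) -> R {
    SHADOW_STACK.with(|s| s.borrow_mut().push(obj));
    let result = f();
    SHADOW_STACK.with(|s| s.borrow_mut().pop());
    result
}

// The collector only has to walk this vector to find every stack root.
fn gc_roots() -> Vec<usize> {
    SHADOW_STACK.with(|s| s.borrow().clone())
}

fn main() {
    with_rooted(42, || assert_eq!(gc_roots(), vec![42]));
    assert!(gc_roots().is_empty());
}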
A more sophisticated approach that avoids the overhead of a shadow stack and heap allocation is to stack-allocate these objects and then scan the stack while doing garbage collection, following references from objects on the stack to objects on the heap. However, this requires knowing which objects on the stack are pointers to objects on the heap – in general, non-heap-allocated objects are not guaranteed to be kept intact or contiguous in registers or on the stack. One approach to doing this is called "conservative stack scanning", which entails assuming during GC that any value on the stack which looks like it could be a pointer to an object on the heap actually is one. That approach has been used successfully in applications like Safari's JavaScript engine, but it's not without its challenges. We've contemplated using conservative stack scanning in Julia, and an initial effort to do so was started, but it was never completed.
References:
https://github.com/JuliaLang/julia/issues/11714
https://github.com/JuliaLang/julia/pull/8134
There are multiple issues/concepts that are frequently mixed together whenever this is brought up.
Mutable, or non-pointer-free immutable, doesn't necessarily mean heap allocation; we already have optimization passes to elide some of the allocations and are working on improving them further.
The object layout ABI is a user-visible behavior and not something an optimization pass can easily change (unless it can prove that the object it wants to optimize locally does not escape). The current ABI is that only isbits immutables are stored inline (and "stack allocated" when used as local variables). There's a fundamental limitation to lifting the pointer-free requirement for inlined objects, i.e. the necessity to handle recursive types. It is impossible to store all types in a reference cycle inline, so the cycle has to be broken somewhere if we want to inline some of them. I believe we do have a consistent and predictable model for doing this, though whether it is desirable is another issue.
This is somewhat related to performance, but not always. Storing inline means more copying, so it's hard to make sure there's no regression if we make the switch.
Edit: I should also mention that pointer-freeness is a sufficient condition for cycle-freeness and is easier to compute, which is partly why we currently use it to break inlining cycles.
GC support. This is basically the easiest part. It's very easy to make the GC recognize pointers on the stack; it just needs to be done if we decide to change the object layout ABI.
Edit: I should add that "GC support" is needed because we currently only support a limited/simple stack layout for object references (i.e. an array of pointers). It's this that needs to be improved.

Multiple specialization, iterator patterns in Rust

Learning Rust (yay!) and I'm trying to understand the intended idiomatic programming required for certain iterator patterns, while scoring top performance. Note: not Rust's Iterator trait, just a method I've written accepting a closure and applying it to some data I'm pulling off of disk / out of memory.
I was delighted to see that Rust (+LLVM?) took an iterator I had written for sparse matrix entries, and a closure for doing sparse matrix vector multiplication, written as
iterator.map_edges(|x, y| dst[y] += src[x]);
and inlined the closure's body in the generated code. It went quite fast. :D
If I create two of these iterators, or use the first one a second time (not a correctness issue), each instance slows down quite a lot (about 2x in this case), presumably because the optimizer no longer chooses to specialize given the multiple call sites, and you end up doing a function call for each element.
I'm trying to understand if there are idiomatic patterns that keep the pleasant experience above (I like it, at least) without sacrificing the performance. My options seem to be (none satisfying this constraint):
Accept dodgy performance (2x slower is not fatal, but no prizes either).
Ask the user to supply a batch-oriented closure, so acting on an iterator over a small batch of data. This exposes a bit much of the internals of the iterator (the data are compressed nicely, and the user needs to know how to unwrap them, or the iterator needs to stage an unwrapped batch in memory).
Make map_edges generic in a type implementing a hypothetical EdgeMapClosure trait, and ask the user to implement such a type for each closure they want to inline. Not tested, but I would guess this exposes distinct methods to LLVM, each of which get nicely inlined. Downside is that the user has to write their own closure (packing relevant state up, etc).
Horrible hacks, like making distinct methods map_edges0, map_edges1, ..., or adding a generic parameter the programmer can use to make the methods distinct but which is otherwise ignored.
Non-solutions include "just use for pair in iterator.iter() { /* */ }"; this is prep work for a data/task-parallel platform, and I would like to be able to capture/move these closures to work threads rather than capturing the main thread's execution. Maybe the pattern I should be using is to write the above, put it in a lambda/closure, and ship it around instead?
In a perfect world, it would be great to have a pattern which causes each occurrence of map_edges in the source file to result in different specialized methods in the binary, without forcing the entire project to be optimized at some scary level. I'm coming out of an unpleasant relationship with managed languages and JITs where generics would be the only way (I know of) to get this to happen, but Rust and LLVM seem magical enough that I thought there might be a good way. How do Rust's iterators handle this to inline their closure bodies? Or don't they (they should!)?
It seems that the problem is resolved by Rust's new approach to closures outlined at
http://smallcultfollowing.com/babysteps/blog/2014/11/26/purging-proc/
In short, Option 3 above (make functions generic with respect to a new closure type) is now implemented transparently when you make an implementation generic over the new closure traits. Rust produces the closure's type behind the scenes for you.
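A minimal sketch of what this looks like with the stabilised closure traits (the edge representation and names here are hypothetical):

struct EdgeIter {
    edges: Vec<(usize, usize)>, // stand-in for the compressed edge data
}

impl EdgeIter {
    // Generic over F, so each distinct closure type gets its own monomorphized,
    // inlinable copy of this loop.
    fn map_edges<F: FnMut(usize, usize)>(&self, mut logic: F) {
        for &(x, y) in &self.edges {
            logic(x, y); // statically dispatched; no per-element indirect call
        }
    }
}

fn main() {
    let iter = EdgeIter { edges: vec![(0, 1), (1, 2)] };
    let src = vec![1.0, 2.0, 3.0];
    let mut dst = vec![0.0f64; 3];
    // The closure's unique anonymous type instantiates map_edges just for this call site.
    iter.map_edges(|x, y| dst[y] += src[x]);
    assert_eq!(dst, vec![0.0, 1.0, 2.0]);
}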

Alternatives to vtable

Vtables are ubiquitous in most OO implementations, but do they have alternatives? The wiki page for vtables has a short blurb, but not really much info (and stubbed links).
Do you know of some language implementation which does not use vtables?
Are there any free online pages which discuss the alternatives?
Yes, there are many alternatives!
Vtables are only possible when two conditions hold.
All method calls can be determined statically. If you can call functions by string name, or if you have no type information about what objects you are calling methods on, you can't use vtables because you can't map each method to the index in some table. Similarly, if you can add functions to a class at runtime, you can't assign all methods an index in the vtable statically.
Inheritance can be determined statically. If you use prototypal inheritance, or another inheritance scheme where you can't tell statically what the inheritance structure looks like, you can't precompute the index of each method in the table or what particular class's method goes in a slot.
Commonly, inheritance is implemented by having a string-based table mapping names of functions to their implementations, along with pointers allowing each class to look up its base class. Method dispatch is then implemented by walking this structure, looking for the lowest class at or above the class of the receiver object that implements the method. To speed up execution, techniques like inline caching are often used, where call sites store a guess of which method should be invoked based on the type of the object, to avoid spending time traversing this whole structure. The Self programming language used this idea, which was then incorporated into the HotSpot JVM to handle interfaces (standard inheritance still uses vtables).
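A toy sketch of that kind of name-keyed lookup with a parent link (purely illustrative; not any particular VM's layout):

use std::collections::HashMap;

struct Class {
    methods: HashMap<&'static str, fn(&str)>, // method name -> implementation
    parent: Option<Box<Class>>,               // link to the base class, if any
}

impl Class {
    // Walk upward from the receiver's class until a class defining `name` is found.
    fn lookup(&self, name: &str) -> Option<fn(&str)> {
        self.methods
            .get(name)
            .copied()
            .or_else(|| self.parent.as_ref().and_then(|p| p.lookup(name)))
    }
}

fn greet(receiver: &str) {
    println!("{receiver} says hello");
}

fn main() {
    let base = Class { methods: HashMap::from([("greet", greet as fn(&str))]), parent: None };
    let derived = Class { methods: HashMap::new(), parent: Some(Box::new(base)) };
    // `derived` does not define "greet", so dispatch falls back to the base class.
    (derived.lookup("greet").expect("method not found"))("obj");
}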
Another option is to use tracing, where the compiler emits code that guesses what the type of the object is and then hardcodes the method to call into the trace. Mozilla Firefox uses this in its JavaScript interpreter, since there isn't a way to build vtables for every object.
I just finished teaching a compilers course and one of my lectures was on implementations of objects in various programming languages and the associated tradeoffs. If you'd like, you can check out the slides here.
Hope this helps!

Side-effects in closures, are they still purely functional?

Being relatively new to functional programming, I expend lots of energy wondering "is this the functional way to do things?" Recursion vs. iteration is pretty straightforward: recursion is clearly the functional way of doing things. But take closures, for instance.
I’ve learned about closures using Lisp and I understand that closures are a combination of a function and an environment (sounds a lot like state and behavior). For instance:
(let ((x 1))
  (defun doubleX ()
    (setf x (* x 2))))
Here we have a function doubleX that has been defined within the environment of the x variable. We could pass this function around to other functions and then invoke it, and it will still be able to reference the x variable. The function can continue to refer to that variable even if it is invoked outside of the environment where the variable was defined. Many of the examples I've seen of closures look like this, where setf is used to change the value of the lexical variable. This confuses me because:
1.) I thought setf was evil. Mostly because it causes side-effects and apparently they are also evil.
2.) Is this really “functional”? Seems like just a way of keeping global state and I thought functional languages were stateless.
Maybe I just don’t understand closures. Can someone help me out?
You're right, using closures to manipulate state is not purely functional. Lisp allows you to program in a functional style, but it doesn't force you to. I actually prefer this approach because it allows me to strike a pragmatic balance between purely functional code and the convenience of modifying state.
What you might try is to write something that seems functional from the outside but keeps an internal mutable state for efficiency. A great example of this is memoization where you keep a record of all the previous invocations to speed up functions like fibonacci, but since the function always returns the same output for the same input and doesn't modify any external state, it can be considered to be functional from the outside.
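A small sketch of that memoization idea (written in Rust here for concreteness; the shape is the same in Lisp):

use std::collections::HashMap;

// Externally this behaves like a pure function: same input, same output.
// Internally it mutates a cache so repeated subproblems are computed only once.
fn fib(n: u64, cache: &mut HashMap<u64, u64>) -> u64 {
    if n < 2 {
        return n;
    }
    if let Some(&v) = cache.get(&n) {
        return v; // reuse a previously computed result
    }
    let v = fib(n - 1, cache) + fib(n - 2, cache);
    cache.insert(n, v);
    v
}

fn main() {
    let mut cache = HashMap::new();
    assert_eq!(fib(40, &mut cache), 102_334_155);
}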
Closures are a poor man's objects (and vice versa), see
When to use closure?
and my answer therein. So if you intend to use side-effects to manage state in your non-OO application, closures-over-mutable-state are indeed an easy way to do this. Immutable alternatives are "less evil", but 99.9% of languages offer mutable state and they can't all be wrong. :) Mutable state is valuable when used judiciously, but it can be especially error-prone when used with closures & capture, as seen here
On lambdas, capture, and mutability
In any case, I think the reason you see "so many examples like this" is that one of the most common ways to explain the behavior of closures is to show a tiny example like this, where a closure captures a mutable variable and thus becomes a mini stateful object that encapsulates some mutable state. It's a great example to help ensure you understand the lifetime and side-effect implications of the construct, but it's not an endorsement to go and use this construct all over the place.
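A tiny illustration of that "mini stateful object" (sketched in Rust rather than Lisp, purely to show the shape):

// A closure that captures and mutates its environment acts like an object with one method.
fn make_counter() -> impl FnMut() -> u32 {
    let mut count = 0;
    move || {
        count += 1; // the captured `count` is the closure's private, mutable state
        count
    }
}

fn main() {
    let mut next = make_counter();
    assert_eq!(next(), 1);
    assert_eq!(next(), 2);
}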
Most of the time with closures you'll just close over values or immutable state and 'not notice' that you're doing it.
Common Lisp and Scheme are not purely functional. Clojure is mostly functional, but still not purely so. Haskell is the only purely functional language I know of; I can't even name another one.
The truth is that working in a purely functional environment is very hard (go learn Haskell and try to program something in it). So what all these functional programming languages really do is allow functional programming, but not enforce it. Functional programming is very powerful, so use it whenever you can, and when you can't, don't.
Something important to know, given where things are heading, is that anything functional is parallelizable, so it makes sense to avoid side effects, or to confine them to as small a subset of your program as possible.

Resources