What does "for" in "defimpl" in Elixir actually check for? - protocols

Does "for" always check the type of the first argument in each function defined in a protocol?
EDIT (rephrasing):
When a protocol method has only one argument, the implementation is found based on the type of that single argument (either directly or via Any). When a protocol method has multiple arguments, which one is used to find the corresponding implementation? Is it always the first one? Can it be changed?

The implementation is always determined based on the first argument.
When you define a protocol, a generic protocol module will be generated. All def clauses in that module will perform delegations to concrete functions, determining which function to call based on the type of the first argument.
The places in the Elixir source where this happens are here (where the first argument is explicitly referred to as t), and here (where t is passed to impl_for! to obtain the module to which the function call is forwarded).
A defimpl will generate concrete modules whose names adhere to the internal conventions used by defprotocol. This ensures that the function call is delegated to the proper concrete module.
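As a loose analogy only (this is Python, not Elixir, and it says nothing about how defprotocol is implemented), functools.singledispatch in the Python standard library follows the same rule: the implementation is chosen from the type of the first argument and the remaining arguments are ignored for dispatch. The function name describe and its parameters are hypothetical, made up for this sketch.

```python
from functools import singledispatch


@singledispatch
def describe(subject, other):
    # Fallback used when no implementation is registered for type(subject),
    # roughly comparable to falling back to Any.
    return "fallback implementation"


@describe.register(int)
def _(subject, other):
    # Chosen whenever the FIRST argument is an int.
    return "int implementation"


@describe.register(str)
def _(subject, other):
    # Chosen whenever the FIRST argument is a str.
    return "str implementation"


# Only the first argument decides which implementation runs.
int_first = describe(1, "text")      # dispatches on int
str_first = describe("text", 1)      # dispatches on str
no_match = describe(3.5, 1)          # float is not registered, so fallback
```

Swapping the two arguments changes which implementation is selected, while changing only the second argument never does.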

It's my understanding that for defines what type the protocol implementation is for. When a function specified in a protocol is invoked on a value, Elixir checks to see if there is an implementation of that function for that type. Of course there are some special cases like fallback to Any and built in protocols. But to answer your question, yes, as far as I know, the type is always checked.
More can be learned by inspecting the source code:
https://github.com/elixir-lang/elixir/blob/150a8a1dcd3610d5ff875e00a1c8779894456ca6/lib/elixir/lib/protocol.ex#L522
https://github.com/elixir-lang/elixir/blob/150a8a1dcd3610d5ff875e00a1c8779894456ca6/lib/elixir/lib/protocol.ex#L456
EDIT
It appears that the type of the first argument is the only thing that matters.
Starting on line 28 in protocol.ex, it seems that the first argument is the only one that is taken into account: https://github.com/elixir-lang/elixir/blob/150a8a1dcd3610d5ff875e00a1c8779894456ca6/lib/elixir/lib/protocol.ex#L28
From what I understand, only the type of the first argument is taken into account. The types of all the other arguments are ignored.

What is ... variadic argument syntactically?

What does a C/C++ compiler think ... is? To be clear, I don't think this is a duplicate question because other stdarg questions are about "what are variadic argument lists/how do they work?" That's not my question.
I have looked through MSVC's include files and found stdarg.h, vcruntime.h, etc., but haven't satisfied myself yet.
Does the compiler see ... as an operator? A linker symbol? A macro? It can't be an identifier, because that source character (.) isn't allowed in identifiers.
If I had to guess, I'd say it's something akin to using __attribute__ macros or inline or register compiler "hints" to inhibit warnings/errors upon invoking the function with multiple parameters.
From ISO9899:
6.5.2.2 Function calls
Constraints
6 The ellipsis notation in a function prototype declarator causes
argument type conversion to stop after the last declared parameter. The default argument
promotions are performed on trailing arguments.
I suppose not everything needs to be nailed down exactly, but I was curious if maybe there was more technical information out there.
A punctuator.
ISO 9899:
6.4.6 Punctuators
Semantics
2 A punctuator is a symbol that has independent syntactic and semantic significance. Depending on context, it may specify an operation to be performed (which in turn may yield a value or a function designator, produce a side effect, or some combination thereof) in which case it is known as an operator (other forms of operator also exist in some contexts). An operand is an entity on which an operator acts.

Is there a way to use the namelist I/O feature to read in a derived type with allocatable components?

Is there a way to use the namelist I/O feature to read in a derived type with allocatable components?
The only thing I've been able to find about it is https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269585 which ended on a fairly unhelpful note.
Edit:
I have user-defined derived types that need to get filled with information from an input file. So, I'm trying to find a convenient way of doing that. Namelist seems like a good route because it is so succinct (basically two lines). One to create the namelist and then a namelist read. Namelist also seems like a good choice because in the text file it forces you to very clearly show where each value goes which I find highly preferable to just having a list of values that the compiler knows the exact order of. This makes it much more work if I or anyone else needs to know which value corresponds to which variable, and much more work to keep clean when inevitably a new value is needed.
I'm trying to do something of the basic form:
!where myType_T is a type that has at least one allocatable array in it
type(myType_T) :: thing
namelist /nmlThing/ thing
open(1, file="input.txt")
read(1, nml=nmlThing)
I may be misunderstanding user-defined I/O procedures, but they don't seem to be a very generic solution. It seems like I would need to write a new one any time I need to do this action, and they don't seem to natively support the
&nmlThing
thing%name = "thing1"
thing%siblings(1) = "thing2"
thing%siblings(2) = "thing3"
thing%siblings(3) = "thing4"
!siblings is an allocatable array
/
syntax that I find desirable.
There are a few solutions I've found to this problem, but none seem to be very succinct or elegant. Currently, I have a dummy user-defined type whose arrays are fixed-size and far larger than needed instead of allocatable, and I write a function to copy the information from the dummy namelist-friendly type to the type that contains the allocatable components. It works just fine, but it is ugly, and I'm up to about 4 places where I need to do this same type of operation in the code.
Hence trying to find a good solution.
If you want to use allocatable components, then you need to have an accessible generic interface for a user defined derived type input/output procedure (typically by the type having a generic binding for such a procedure). You link to a thread with an example with such a procedure.
Once invoked, that user defined derived type input/output procedure is then responsible for reading and writing the data. That can include invoking namelist input/output on the components of the derived type.
Fortran 2003 also offers derived types with length parameters. These may offer a solution without the need for a user defined derived type input/output procedure. However, use of derived types with length parameters, in combination with namelist, will put you firmly in the "highly experimental" category with respect to the current compiler implementation.

Is there a compelling reason to call type.mro() rather than iterate over type.__mro__ directly?

Is there a compelling reason to call type.mro() rather than iterate over type.__mro__ directly? The attribute is literally ~13 times faster to access (36 ns vs 488 ns).
I stumbled upon it while looking to cache type.mro(). It seems legit, but it makes me wonder: can I rely on type.__mro__, or do I have to call type.mro()? and under what conditions can I get away with the former?
More importantly, what set of conditions would have to occur for type.__mro__ to be invalid?
For instance, when a new subclass is defined/created that alters an existing class's mro, is the existing class's .__mro__ immediately updated? Does this happen on every new class creation? Does that make it part of class type? Which part? Or is that what type.mro() is about?
Of course, all that is assuming that type.__mro__ is, in fact, a tuple of cached names pointing to the objects in a given type's mro. If that assumption is incorrect; then, what is it? (probably a descriptor or something..) and why can/can't I use it?
EDIT: If it is a descriptor, then I'd love to learn its magic, as both: type(type.__mro__) is tuple and type(type(type).__mro__) is tuple (ie: probably not a descriptor)
EDIT: Not sure how relevant this is, but type('whatever').mro() returns a list whereas type('whatever').__mro__ returns a tuple. (Un?)fortunately, appending to that list doesn't change the __mro__ or subsequent calls to .mro() of/on the type in question (in this case, str).
Thanks for the help!
According to the docs:
class.__mro__
This attribute is a tuple of classes that are considered when looking for base classes during method resolution.
class.mro()
This method can be overridden by a metaclass to customize the method resolution order for its instances. It is called at class instantiation, and its result is stored in __mro__.
So yes, your assumption about __mro__ being a cache is correct. If your metaclass' mro() always returns the same thing, or if you don't have any metaclasses, you can safely use __mro__.
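A minimal sketch of the behaviour described in the docs, assuming CPython 3. CountingMeta and the classes A, B, C and D are hypothetical names invented for this illustration; the metaclass only counts how often mro() is invoked and otherwise defers to the default algorithm.

```python
class CountingMeta(type):
    """Hypothetical metaclass that counts how often mro() is invoked."""

    mro_calls = 0

    def mro(cls):
        CountingMeta.mro_calls += 1
        return super().mro()  # default C3 linearisation, returned as a list


class A(metaclass=CountingMeta):
    pass


class B(A):
    pass


class C(A):
    pass


class D(B):
    pass


# mro() has already run while the classes were being created.
calls_after_creation = CountingMeta.mro_calls

cached = D.__mro__   # the stored tuple
fresh = D.mro()      # a newly built list on every call
same_order = tuple(fresh) == cached

# Mutating the returned list does not touch the stored tuple.
fresh.append(int)
unchanged = D.__mro__ == (D, B, A, object)

# Rebinding __bases__ makes the interpreter call mro() again and
# store the new result in __mro__.
calls_before_rebase = CountingMeta.mro_calls
D.__bases__ = (C,)
recomputed = CountingMeta.mro_calls > calls_before_rebase
rebased_mro = D.__mro__
```

In this sketch __mro__ stays in step with the class hierarchy because the interpreter itself re-runs mro() when __bases__ changes, so reading the attribute is enough unless a metaclass mro() is expected to return something different from one call to the next.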

Implementing pass-by-reference argument semantics in an interpreter

Pass-by-value semantics are easy to implement in an interpreter (for, say, your run-of-the-mill imperative language). For each scope, we maintain an environment that maps identifiers to their values. Processing a function call involves creating a new environment and populating it with copies of the arguments.
This won't work if we allow arguments that are passed by reference. How is this case typically handled?
First, your interpreter must check that the argument is something that can be passed by reference – that the argument is something that is legal in the left-hand side of an assignment statement. For example, if f has a single pass-by-reference parameter, f(x) is okay (since x := y makes sense) but f(1+1) is not (1+1 := y makes no sense). Typical qualifying arguments are variables and variable-like constructs like array indexing (if a is an array for which 5 is a legal index, f(a[5]) is okay, since a[5] = y makes sense).
If the argument passes that check, it will be possible for your interpreter to determine while processing the function call which precise memory location it refers to. When you construct the new environment, you put a reference to that memory location as the value of the pass-by-reference parameter. What that reference concretely looks like depends on the design of your interpreter, particularly on how you represent variables: you could simply use a pointer if your implementation language supports it, but it can be more complex if your design calls for it (the important thing is that the reference must make it possible for you to retrieve and modify the value contained in the memory location being referred to).
While your interpreter is interpreting the body of a function, it may have to treat pass-by-reference parameters specially, since the environment does not contain a proper value for them, just a reference. Your interpreter must recognize this and look up what the reference points to. For example, if x is a local variable and y is a pass-by-reference parameter, computing x+1 and y+1 may (depending on the details of your interpreter) work differently: in the former, you just look up the value of x and then add one to it; in the latter, you must look up the reference that y is bound to in the environment, fetch the value stored in the variable on the far side of the reference, and then add one to it. Similarly, x = 1 and y = 1 are likely to work differently: the former just modifies the value of x, while the latter must first see where the reference points and modify whatever variable or variable-like thing (such as an array element) it finds there.
You could simplify this by having all variables in the environment be bound to references instead of values; then looking up the value of a variable is the same process as looking up the value of a pass-by-reference parameter. However, this creates other issues, and it depends on your interpreter design and on the details of the language whether that's worth the hassle.
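The approach above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a complete interpreter: it takes the simplification from the last paragraph (every variable is bound to a reference), and the tuple-based AST as well as the names Cell, ElementRef, evaluate, locate and call are hypothetical.

```python
import copy


class Cell:
    """A storage location holding one value; every variable is bound to one."""

    def __init__(self, value):
        self.value = value


class ElementRef:
    """A location that refers to one slot of an array (a Python list here)."""

    def __init__(self, array, index):
        self.array = array
        self.index = index

    @property
    def value(self):
        return self.array[self.index]

    @value.setter
    def value(self, new_value):
        self.array[self.index] = new_value


def evaluate(expr, env):
    """Compute the value of an expression by reading through locations."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]].value
    if kind == "index":
        return env[expr[1]].value[evaluate(expr[2], env)]
    if kind == "add":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    raise ValueError(f"unknown expression kind: {kind}")


def locate(expr, env):
    """Return the location an expression denotes, or fail if it has none."""
    kind = expr[0]
    if kind == "var":
        return env[expr[1]]
    if kind == "index":
        return ElementRef(env[expr[1]].value, evaluate(expr[2], env))
    raise TypeError("expression is not assignable, cannot pass by reference")


def call(func, args, env):
    """Run func, where func = (params, body) and params = [(name, by_ref)]."""
    params, body = func
    local = {}
    for (name, by_ref), arg in zip(params, args):
        if by_ref:
            local[name] = locate(arg, env)  # share the caller's location
        else:
            local[name] = Cell(copy.deepcopy(evaluate(arg, env)))  # fresh copy
    for _, target, expr in body:  # every statement is ("assign", target, expr)
        locate(target, local).value = evaluate(expr, local)


# inc(ref r, val v): r = r + 1; v = v + 100
inc = (
    [("r", True), ("v", False)],
    [
        ("assign", ("var", "r"), ("add", ("var", "r"), ("num", 1))),
        ("assign", ("var", "v"), ("add", ("var", "v"), ("num", 100))),
    ],
)

caller_env = {"x": Cell(10), "y": Cell(20), "a": Cell([1, 2, 3])}
call(inc, [("var", "x"), ("var", "y")], caller_env)                # x changes, y does not
call(inc, [("index", "a", ("num", 2)), ("var", "y")], caller_env)  # a[2] changes
```

Passing something like ("add", ("num", 1), ("num", 1)) for the by-reference parameter raises TypeError in locate, which corresponds to rejecting f(1+1). Because both variables and parameters are looked up through .value, the function body needs no special case for by-reference parameters, at the cost of one extra indirection on every variable access.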

Shouldn't C# 4.0's new "named parameters" feature be called "named arguments"?

I suppose there could be historical reasons for this naming and that other languages have similar feature, but it also seems to me that parameters always had a name in C#. Arguments are the unnamed ones. Or is there a particular reason why this terminology was chosen?
Oh, you wanted arguments! Sorry, this is parameters - arguments are two doors down the hall on the left.
Yes, you're absolutely right (to my mind, anyway). Ironically, although I'm usually picky about these terms, I still use "parameter passing" when I should probably talk about "argument passing". I suppose one could argue that prior to C# 4.0, if you're calling a method you don't care about the parameter names, whereas the names become part of the significant metadata when you can specify them on the arguments as well.
I agree that it makes a difference, and that terminology is important.
"Optional parameters" is definitely okay though - that's adding metadata to the parameter when you couldn't do so before :) (Having said that, it's not going to be optional in terms of the generated IL...)
Would you like me to ask the team for their feedback?
I don't think so. The names are quite definitely the names of parameters, as they are defined and given a specific meaning in the method definition, where they are properly called the parameters to the method. At the call site, arguments can now be tagged with the name of the parameter that they supply a value for.
The new term refers to the perspective of the method caller - which is logical because that's where the feature applies. Previously, callers only had to think of parameters as being "positioned parameters". Now they can optionally treat them as "named parameters" - hence the name.
I don't know if it's worth adding this now, but MS calls it named arguments anyway. See named and optional arguments.
