Pensando en programar

Why does 'this' behave that way?

Question: Why does this behave the way it does? (not how, which is already explained everywhere, but why) […] What kind of scenarios/patterns/designs does it enable? (if there are actually any besides confusing Java and .Net devs? ;) – @gulnor

This is a complex question. Mainly because I don't think there is a single answer. What I'm sure of is that many would title this question as “this, why are you that way? What did I do to you?”. But anyway, let's try to answer seriously :)

Historical Context

If there's one thing I'm never going to do, it's write a “History of JavaScript” article. Nevertheless, I feel we need to review, at least partially, the context in which JavaScript was created.

Even if we don't take the “11 days” thing literally, we need to understand that JavaScript was created and developed very quickly. That is, even if we add some research time or whatever to those 11 days, the total is not much longer. I don't really mind exactly how long it took, but we should come away with a very clear impression of urgency.

Urgency

For anyone who does software development, it's fairly clear why this urgency is relevant. When confronted with such a deadline, and with enough freedom to define the requirements (apart from some general guidelines and restrictions, I understand Brendan Eich was largely free to decide on the design of the language), we tend to simplify.

There's a somewhat amusing detail in JavaScript: the number of interesting keywords that were reserved but not used. Already in 1997, in the first spec, we can find reserved words not only like class or super, but even export and import… or enum, which has been there since the first version but still (as of ES2018) hasn't seen any use. I guess this gives an idea of a project that wanted to leave a number of paths open for the future, but was clearly conscious that a lot had to be sacrificed for now in order to ship on time.

By none of this do I mean that he opted for shortcuts or kludges, but that he opted for simplifications: simpler options that could be implemented right away. As I understand it, Eich chose to build something he could actually produce in the time given.

Prototypes and the absence of Classes

Guided by those ideas of simplifying and also by inspiration from Scheme and Self, Eich opted for an inheritance/delegation system based on prototypes, with a complete absence of classes. This simplifies a lot of things.

Now, here, you have to be careful with certain assumptions. This is 1995.

Java and Ruby were born that same year. Python is from 1991, Lua from 1993, Perl from '87 (with relevant versions in '93 and '94), C++ (or “C with classes”) from '85. C# is from 2000. These facts are not particularly relevant to JavaScript. But as a whole, looking at all these languages (and more), we can observe one detail that is actually relevant: object orientation is not the same across all these languages. And so it is risky to assume that any one of them in particular represents “the correct definition” or “the appropriate way to implement” OO. Even today, when the Java/C# way has become mainstream and is probably the most widely used, this still applies. I mention this because we have to keep in mind that they are simply different ways, not necessarily better or worse. The same applies to this (maybe called self or something else in other languages) and what it means.

In any case… choosing this particular delegation/inheritance system (based on objects and prototypes, not on classes) is relevant to the meaning of this.

Me, Myself and I

In general, when a language allows the creation of objects with properties, methods and such, it needs to provide some way for an object to refer to itself. E.g., inside an object's method we need to be able to identify and refer to some other property of that same object. This reference to “this same object” we are operating on is what the language needs to provide.

But as I said before, given that there are variations in how we can implement “object orientation”, there may also be some differences in what “this same” or “myself” mean.

Objects, Properties and Functions in JavaScript

So far, we know that the decision to use prototypes and no classes is relevant. To this we add a second relevant decision: making functions first-class entities in the language.

Now, a fairly easy way to implement objects in a language, especially with the previous decisions in mind, is to make them just a collection of properties, understood as slots. They are basically a table (conceptually; it doesn't mean they are implemented as a table or in any particular way).

This has a number of advantages. You can add, remove and modify properties in an object. And the combination of dynamic typing and properties being just slots gives you a lot of flexibility.
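As a minimal sketch of that slot-like flexibility (the object o and its property names here are made up purely for illustration), something along these lines should work:

```javascript
// Hypothetical object, used only for illustration.
const o = {};

o.name = 'first';   // add a slot
o.name = 'second';  // modify it
o.count = 1;        // add another slot
o.count = 'one';    // same slot, now holding a value of a different type
delete o.name;      // remove a slot

const slots = Object.keys(o);
console.log(slots); // only the slot names that remain
```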

This flexibility, along with functions being first-class values, allows you to do stuff like:

function f() { /*...*/ }
let o = { };

o.prop = f; // the function is just a value stored in a slot

And this is very nice, because with a simple and flexible object system it's just that easy to implement the concept of an object having methods: They are just functions we put into a slot! And functions are values so there's no additional concern in doing that.

Where am I? Who am I? Is there no beer left?

OK, so this is where it all falls into place. Keep your eyes peeled!

In such a system, how do we meet the need to refer, from an object's method, to “myself”, “this same object”? Well, there must surely be a few ways. We could, say, have some mechanism that, when a function is called, finds out “what object this function is assigned to”. Ah, but… this is problematic.

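function f() { /*...*/ }
let o = { };
let p = { };

o.prop = f;
p.prap = f; // the same function now sits in slots of two different objects

o.prop(); // which object is "this same object" here?
p.prap(); // ...and here?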

So, no: the function is free (as in freedom), and knowing that some slot of some object references it doesn't help.

What actually happens is that, the design being what it is and property values being changeable at runtime, the very concept of “this same object” becomes distorted.

Or maybe, rather than distorted, “this same object” needs to mean something equally dynamic, something that must be established at the very moment we make the call. There's no alternative. At no other moment can we know what “this same object” means in a certain function call, except at the moment of making the call. Even more, the only thing that makes sense to be referred to as “this same object” is what the particular call says (in o.prop(), that is o).

What's more, this happens on each and every call; and so it makes sense that this should not be inherited or resolved from the enclosing lexical context as other references are, because it is tied not to the definition context but to the calling context.
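Here is a small sketch of that idea, reusing the shape of the earlier example. The name properties and the body of f are my own additions for illustration:

```javascript
function f() {
  'use strict';
  return this; // whatever the call designated as its subject
}

const o = { name: 'o' };
const p = { name: 'p' };

o.prop = f;
p.prap = f;

console.log(o.prop === p.prap); // true: one function, two slots
console.log(o.prop() === o);    // true: the call says the subject is o
console.log(p.prap() === p);    // true: the call says the subject is p
```

Nothing about f itself changes between the two calls; only the call expression differs.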
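To illustrate that a regular function does not pick up this from the code surrounding it, here is a hedged sketch with hypothetical names (outerObject, method, inner). Strict mode is assumed so that a call without a subject leaves this as undefined:

```javascript
const outerObject = {
  method() {
    'use strict';
    // A nested regular function gets its own `this`, decided by its own call,
    // not by the method it happens to be written inside.
    function inner() {
      return this;
    }
    return inner(); // a call without a subject
  },
};

const seen = outerObject.method();
console.log(seen === outerObject); // false
console.log(seen);                 // undefined
```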

It is true, though, that years later arrow functions were introduced, and what these do is precisely that: resolve the reference of this from the definition context. Personally I believe that this was not the best possible decision and that it adds complexity instead of reducing it. I would have preferred, say, that arrow functions couldn't reference this at all. This would've limited their usage a lot, of course. In any case, I'm aware that this new syntax (and semantics) for functions had additional motivations coming from the developer community. Well, such is life.

So, the fact that this refers to the subject of the call, and has to be so for every call, is not so much a design decision as the only available option given the rest of the design.
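A short sketch of that arrow function behaviour, with made-up names (o, p, makeArrow):

```javascript
const o = {
  name: 'o',
  makeArrow() {
    // The arrow has no `this` of its own; it uses the one
    // from the makeArrow call in which it was defined.
    return () => this;
  },
};
const p = { name: 'p' };

p.arrow = o.makeArrow();            // defined during a call whose subject is o
console.log(p.arrow() === o);       // true: the subject p is ignored
console.log(p.arrow.call(p) === o); // true: even call cannot change it
```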

Other values for 'this'

Personally I feel that most of the confusion around this in JavaScript doesn't come from the explanation above (which I sincerely think is perfectly reasonable), but from some design omissions elsewhere.

That is, in my opinion, there would be no confusion, or much less, if, e.g., there were some restriction that didn't allow referring to this in a call where no subject is available. Maybe a ReferenceError should be thrown. But instead, and here I think it is difficult to find any explanation other than urgency, a shortcut was taken 1): assigning the global object to this “by default” on such a call, just to have something assigned. (Later, in strict mode, this would be “fixed” by leaving it as undefined, which then throws when used, but the error thrown is the same as if we tried to dereference undefined in any other case.)

And I say that this is what adds confusion because it allows something we never want to happen as if it were perfectly normal. It lets us write code that is apparently valid but conceptually erroneous. The only case where we want to call a function that tries to reference “myself”, “this same object” is when there really is such a thing, not when making calls without a subject.
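Both behaviours can be observed with a sketch like the following. The names are hypothetical, and the non-strict case relies on the fact that functions built with the Function constructor are not strict unless their body says so:

```javascript
const o = {
  name: 'o',
  getName() {
    'use strict';
    return this.name;
  },
};

const detached = o.getName; // copy the function out of its slot
let error = null;
try {
  detached();               // no subject: `this` is undefined in strict mode
} catch (e) {
  error = e;
}
console.log(error instanceof TypeError); // true: same error as dereferencing undefined anywhere

// Non-strict function: a call without a subject gets the global object as `this`.
const sloppy = new Function('return this');
console.log(sloppy() === globalThis);    // true
```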

Then there's the option of explicitly manipulating the assignment of the subject (not just call/apply or bind, but in general a fair number of functions which receive a function as argument). These cases, though, don't really raise any doubt, as it is us asking for it in an explicit way.
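For completeness, a sketch of that explicit manipulation using call, apply, bind and the optional thisArg of Array.prototype.map. The function describe and the objects are invented for the example:

```javascript
function describe(prefix) {
  'use strict';
  return prefix + this.name;
}

const o = { name: 'o' };
const p = { name: 'p' };

console.log(describe.call(o, 'I am '));    // "I am o"
console.log(describe.apply(p, ['I am '])); // "I am p"

const bound = describe.bind(o);
console.log(bound.call(p, 'still '));      // "still o": bind wins over call

// Array.prototype.map takes an optional second argument used as `this`.
const labels = ['a', 'b'].map(function (x) {
  'use strict';
  return this.name + ':' + x;
}, p);
console.log(labels);                       // ["p:a", "p:b"]
```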

And finally there's the problem of new. But here I think we're talking about something else entirely. I mean, it is fairly logical and even reasonably intuitive that, if this is a value that depends on the call, and the call is made through the operator that creates new objects, then this should reference that new object. The real question about new is how, or why, one would try to fit this mechanism for object creation into a prototype system. But that, as I said, is something else entirely and bears no relation to the meaning of this.
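A minimal sketch of a call made through new, with a hypothetical Point constructor:

```javascript
function Point(x, y) {
  'use strict';
  this.x = x; // `this` is the object being created by `new`
  this.y = y;
}

const a = new Point(1, 2);
console.log(a.x, a.y);                                     // 1 2
console.log(Object.getPrototypeOf(a) === Point.prototype); // true
```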

1)
Or at least an apparent shortcut; if we go by the general policy of making the language very permissive and avoiding throwing errors as much as possible, then this decision is perfectly cromulent 2).
2)
cromulent – (ironic) Appearing legitimate but actually being spurious