typeof null: investigating a classic JavaScript bug

Mar 19, 2018 08:40 · 1386 words · 7 minutes read javascript

In my last post, I looked into some JavaScript casting and explored why 0 <= null evaluates as true.

For this post, I’d like to investigate another unexpected behavior in JavaScript: why typeof(null) evaluates as 'object'.

This is a well-known bug, and we’ll investigate first in the ECMAScript specification followed by a deep dive into an early implementation of JavaScript to see the bug in its natural habitat.

The main idea is that the code assigned each item some bits for use as flags for different types, but null was different. Objects had a flag of 000, so an object’s last 3 bits were 000. null was previously defined as 32 0 bits: 00000000000000000000000000000000. When the code tried to check null’s flag, the last three bits of null (000) matched the Object flag (000), so it was incorrectly determined to be an object.

Let’s take a deeper look!

As before, for a proper background and preparation for this post, please watch (or rewatch!) wat.


To start, let’s take a look at the ECMAScript spec for typeof:

The typeof Operator

Runtime Semantics: Evaluation

UnaryExpression : typeof UnaryExpression

1. Let val be the result of evaluating UnaryExpression.
2. If Type(val) is Reference, then
  a. If IsUnresolvableReference(val) is true, return "undefined".
3. Set val to ? GetValue(val).
4. Return a String according to Table 35.

So when we call typeof null, null is passed in as the UnaryExpression. It is evaluated in step 1, and val gets its value, null. Type(null) evaluates to the Null type, so Type(val) is not Reference and step 2 is false. Next, step 3 calls GetValue on null. We saw get value before, but let’s take another look to confirm nothing odd is going on:

GetValue ( V )

1. ReturnIfAbrupt(V).
2. If Type(V) is not Reference, return V.
3. Let base be GetBase(V).
4. If IsUnresolvableReference(V) is true, throw a ReferenceError exception.
5. If IsPropertyReference(V) is true, then
  a. If HasPrimitiveBase(V) is true, then
    i. Assert: In this case, base will never be undefined or null.
    ii. Set base to ! ToObject(base).
  b. Return ? base.[[Get]](GetReferencedName(V), GetThisValue(V)).
6. Else base must be an Environment Record,
  a. Return ? base.GetBindingValue(GetReferencedName(V), IsStrictReference(V)) (see 8.1.1).

Calling GetValue with null will only reach the second step, as Type(null) is the Null type, and GetValue will return null.

Finally we reach step 4: “Return a String according to Table 35.” Here’s Table 35:

Type of val Result
Undefined “undefined”
Null “object”
Boolean “boolean
Number “number”
String “string”
Symbol “symbol”
Object (ordinary and does not implement [[Call]]) “object”
Object (standard exotic and does not implement [[Call]]) “object”
Object (implements [[Call]]) “function
Object (non-standard exotic and does not implement [[Call]]) Implementation-defined. Must not be “undefined”, “boolean”, “function”, “number”, “symbol”, or “string”.

Not very illuminating. It says that typeof null should just return the string 'object'. But why? Let’s dive a little deeper.

I came across the 2ality post about the history of typeof null, and it answers the question very well with reference to an early implementation. As Brendan Eich wrote in the top comment, the code comes from the 1996 Spider Monkey implementation, not the very original 10-day Mocha, although the bug existed in the original as well. Let’s take a look!

Although the link from that post is deprecated, we can still see the classic version of the code on https://dxr.mozilla.org/classic/source/js/src/.

Two main C files will be of interest here at first: classic/js/src/jsapi.h and classic/js/src/jsapi.c. Let’s take a look at the relevant part of jsapi.c first:

JS_PUBLIC_API(JSType)
JS_TypeOfValue(JSContext *cx, jsval v)
{
    JSType type = JSTYPE_VOID;
    JSObject *obj;
    JSObjectOps *ops;
    JSClass *clasp;

    CHECK_REQUEST(cx);
    if (JSVAL_IS_VOID(v)) {
        type = JSTYPE_VOID;
    } else if (JSVAL_IS_OBJECT(v)) {
        obj = JSVAL_TO_OBJECT(v);
        if (obj &&
            (ops = obj->map->ops,
             ops == &js_ObjectOps
             ? (clasp = OBJ_GET_CLASS(cx, obj),
                clasp->call || clasp == &js_FunctionClass)
             : ops->call != 0)) {
            type = JSTYPE_FUNCTION;
        } else {
            type = JSTYPE_OBJECT;
        }
    } else if (JSVAL_IS_NUMBER(v)) {
        type = JSTYPE_NUMBER;
    } else if (JSVAL_IS_STRING(v)) {
        type = JSTYPE_STRING;
    } else if (JSVAL_IS_BOOLEAN(v)) {
        type = JSTYPE_BOOLEAN;
    }
    return type;
}

Let’s walk through what this does.

First, it checks JSVAL_IS_VOID(v). That is defined in jsapi.h as follows:

#define JSVAL_IS_VOID(v)        ((v) == JSVAL_VOID)

And JSVAL_VOID is defined as:

#define JSVAL_VOID              INT_TO_JSVAL(0 - JSVAL_INT_POW2(30))

So the void value, what we know in JavaScript as undefined, is defined as -(2^30), an integer outside of the integer range. The typeof call first checks if that’s what we’re dealing with. If so, the line type = JSVAL_VOID is called, and then that is returned. Good so far!

If we aren’t dealing with a void value, the code then evaluates JSVAL_IS_OBJECT(v). This is defined as follows in jsapi.h:

#define JSVAL_IS_OBJECT(v)      (JSVAL_TAG(v) == JSVAL_OBJECT)

So this checks a tag on the value and sees if it matches the object tag. Here are the tag definitions from jsapi.h:

/*
 * Type tags stored in the low bits of a jsval.
 */
#define JSVAL_OBJECT            0x0     /* untagged reference to object */
#define JSVAL_INT               0x1     /* tagged 31-bit integer value */
#define JSVAL_DOUBLE            0x2     /* tagged reference to double */
#define JSVAL_STRING            0x4     /* tagged reference to string */
#define JSVAL_BOOLEAN           0x6     /* tagged boolean value */

This means that objects will have 000 as the bits for its tag, integers will have 001, doubles will have 010, strings will have 100, and booleans will have 110.

Here’s JSVAL_TAG and its called JSVAL_TAGMASK:

#define JSVAL_TAGBITS           3
#define JSVAL_TAGMASK           JS_BITMASK(JSVAL_TAGBITS)
#define JSVAL_TAG(v)            ((v) & JSVAL_TAGMASK)

JS_BITMASK comes from a different file, jstypes.h, and it builds the correct value to bitmask against tag. This means that it grabs the last 3 bits from the value, and those three bits correspond to the tag.

For instance, if we were working with a very simplified version with integers of 5 bit length and the last 3 bits were the tag, we may have a value like 00001001. Here, the first 5 bits, 00001, would correspond to the value, and the last 3 bits, 001, wold be the tag (here, for an integer). To get just the tag, we would do a bitwise AND comparison between our value 00001001 and a mask with 1s for the last three bits, 111, to get just the tag, 001.

Value:           00001001
Bitmask:         00000111
Value & bitmask: 00000001

So far so good! Let’s see what we can find about the null value. A quick search reveals the definition for null from jsapi.h:

/*
 * Well-known JS values.  The extern'd variables are initialized when the
 * first JSContext is created by JS_NewContext (see below).
 */
#define JSVAL_VOID              INT_TO_JSVAL(0 - JSVAL_INT_POW2(30))
#define JSVAL_NULL              OBJECT_TO_JSVAL(0)
#define JSVAL_ZERO              INT_TO_JSVAL(0)
#define JSVAL_ONE               INT_TO_JSVAL(1)
#define JSVAL_FALSE             BOOLEAN_TO_JSVAL(JS_FALSE)
#define JSVAL_TRUE              BOOLEAN_TO_JSVAL(JS_TRUE)

Here, JSVAL_NULL is OBJECT_TO_JSVAL(0). Let’s follow this down:

jsapi.h:

#define OBJECT_TO_JSVAL(obj)    ((jsval)(obj))

jspubtd.h:

typedef jsword    jsval;

jscompat.h:

typedef JSWord jsword;

jstypes.h:

/*
** A JSWord is an integer that is the same size as a void*
*/
typedef long JSWord;
typedef unsigned long JSUword;

This means that null, OBJECT_TO_JSVAL(0), is actually just creating a long with just 0 as the value. This is just 00000000000000000000000000000000 (32 bits, all 0).

Now, let’s go back to our JSVAL_IS_OBJECT call

#define JSVAL_IS_OBJECT(v)      (JSVAL_TAG(v) == JSVAL_OBJECT)

Now, what happens if we call JS_TAG on our null value?

Value:           00000000000000000000000000000000
Bitmask:         00000000000000000000000000000111
Value & bitmask: 00000000000000000000000000000000

No surprises there: this means that the resulting calculated tag is just 0x0.

This is where we have the problem! We then compare null’s calculated 0x0 tag with JSVAL_OBJECT, 0x0. They match!

This means our JS_TypeOfValue call then checks if its a function, and when it isn’t, it assigns type to be JSTYPE_OBJECT and returns type. JSTYPE_OBJECT is the string 'object', and we get typeof null === 'object'.

As mentioned in the 2ality blog, interestingly enough there is a function (in this version of the code, at least) that checks for a null value:

#define JSVAL_IS_NULL(v)        ((v) == JSVAL_NULL)

Maybe this was added in after the fact? Maybe this was in there and overlooked? Regardless, this bug was in the original, and it remains to this day.

“In general, typeof seems like a mess that will be hard to reform sensibly. “

Brendan Eich 2006/03/31 15:13


Thanks to Brendan Eich for writing JavaScript, Mozilla for hosting the classic code, and Dr. Axel Rauschmayer for the very helpful blog post about this topic.