Type safety with native JavaScript v02

Why everybody wants type safety and how to get it.

Typecast example

This is an update from the original article. It is shorter, simpler, more logical, and more correct.

1. Overview

Many developer tools like IDEs, frameworks, libraries, and linters try to provide some level of type safety to JavaScript. This article explains what type safety is, why we want it, and how we can get it using native JavaScript.

2. What is type safety?

Type safety is the extent to which a programming language discourages or prevents type errors. A type error occurs when a value of an unintended or incompatible type is provided to a function or expression, usually as an argument or context. Dividing a number by an array is a perfect example. Usually this doesn’t make sense, yet JavaScript supports this very operation, often with surprising and unwanted results.

Let’s see how easy it is to create a type error in JavaScript using the terse semantic style popular in some fringe cults. This is awful code, so please don’t copy it. We’ll improve it as we go along.

function repeats(counts, run)
{
    while(counts < 0)
    {
        run(counts)
        counts+=1
    }
}

function reports(info)
{
    console.log(info)
}

repeats('-3', reports)

That didn’t take long! The first type error occurs when we provide the ‘-3’ argument to the repeats invocation. This value should be an integer, but we instead provide a string. Then all hell breaks loose.

When repeats is invoked it creates an ‘endless’ loop that steadily consumes the resources available within its execution environment. This typically ends in a force-kill of the NodeJS process, the browser tab, or the entire browser process, and in the worst case the host OS.

If we watch the progression of the value of counts in the while loop we see the following series: '-3', '-31', '-311', '-3111', .... The test condition counts < 0 does convert the counts string into a number, but by then it is too late: the value is always less than 0 and it only gets more negative with each iteration. That’s because the expression counts+=1 always converts the number 1 into a string '1' and appends it to the counts variable.
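The progression is easy to reproduce safely. The sketch below caps the loop at four iterations so that it terminates; the cap and the name seen_list are ours, added only for illustration.

```javascript
// Reproduce the runaway value of counts, capped so the loop terminates.
var counts = '-3';
var seen_list = [];

while (counts < 0 && seen_list.length < 4) {
  seen_list.push(counts);
  counts += 1; // string + number: 1 is coerced to '1' and appended
}

console.log(seen_list); // [ '-3', '-31', '-311', '-3111' ]
```

Without the cap, the comparison keeps coercing an ever-longer string to an ever-more-negative number, so the loop never exits on its own.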

3. Why do we want type safety?

We want type safety with JavaScript because type errors can be quite troublesome:

They are easy to create
They are challenging to resolve
They are often serious

Let’s look at each of these issues in turn.

3.1. JS type errors are easy to create

JavaScript is missing almost all type controls because it was not originally intended to run large-scale applications. There are no type declarations, return types are not reliable, there is no static type-checking, and one must roll-their-own dynamic checking.

3.1.1 No type declarations

JavaScript has no means to formally declare a variable type. A variable may contain any type which may be changed at any time during execution. There are no sigils to identify type, and a variable name is unconstrained. We probably all have seen a JavaScript application or two where a single symbol like watches is used as an object, a map, a list, a boolean flag, a string, an integer, and a floating point number all in different contexts. Yet in practice most variables used in JavaScript are not intended to change type within a context because doing so is needlessly confusing.

Other languages provide ‘wild card’ variables to reference any type of value. These are usually easy to identify because they have a unique syntax. Developers know that these ‘wild cards’ need to be handled with care. In JavaScript, all variables are ‘wild cards.’

3.1.2 Return types are not reliable

The return types of JavaScript expressions are often not reliable. Many JavaScript operators are polymorphic and will return values of different types dependent on the types of the variables used. For example, the JavaScript ‘+’ operator can concatenate strings, add numbers, or convert a string to a number. The return value can be surprising when variable types are mixed, as illustrated below.

// Confirmed using Google V8 version 6.0.286.52
console.log(process.versions.v8);

var x, y;
// expression   // |     returned value | Coercion       | Op
x = 3;          // |                 3  | -              |
x = 3 + 1;      // |                 4  | -              | +num
x = 3 + '1';    // |               '31' | 3      => '3'  | +str
x = 3 + [];     // |                '3' | []     => ''   | +str
x = 3 + [ 21 ]; // |              '321' | [ 21 ] => '21' | +str
x = 3 + {};     // | '3[object Object]' | {}     => str  | +str
x = '3' - 2;    // |                 1  | '3'    =>  3   | -num
x = '3' + 2;    // |               '32' | 2      => '2'  | +str
x = + '3';      // |                 3  | '3'    =>  3   | cast_num
x = 0 + '3';    // |               '03' | 0      => '0'  | +str
x = y + 3;      // |                NaN | -              | NaN
x = y + '3';    // |       'undefined3' | undef  => str  | +str

This behavior (confirmed using Google V8 version 6.0.286.52; see node -e 'console.log(process.versions.v8);') is probably consistent across the four primary JavaScript engines (V8, IonMonkey, Nitro, Chakra) and the dozens of others. However, less popular operators probably have some variance across the engines that can cause headaches.

Other languages have less complex behaviors because they usually have stricter type checking, stricter coercion rules, fewer polymorphic operators, and fewer vendors. Perl, for example, uses the dot (.) operator to join strings and the plus (+) operator to add numbers. Perl also uses sigils (prefixes) like $, @, and % to indicate types and a special syntax to identify ‘wild-card’ (type-glob) references.

3.1.3 No static checking

Many languages provide some level of static type checking. Java, for example, resolves most variable types during compilation. If JavaScript had a similar mechanism we wouldn’t be able to run our application until we resolved the compile errors. In this imaginary world, our JavaScript compile output might look like this:

ok                           | x = 3;          |
ok                           | x = 3 + 1;      |
compile_error: type_mismatch | x = 3 + '1';    |
compile_error: type_mismatch | x = 3 + [];     |
compile_error: type_mismatch | x = 3 + [ 21 ]; |
compile_error: type_mismatch | x = 3 + {};     |
compile_error: type_mismatch | x = '3' - 2;    |
compile_error: type_mismatch | x = '3' + 2;    |
compile_error: type_mismatch | x = + '3';      |
compile_error: type_mismatch | x = 0 + '3';    |
compile_error: type_mismatch | x = y + 3;      |
compile_error: type_mismatch | x = y + '3';    |

Perhaps the greatest advantage of static (compile-time) type checking is that it can improve performance: every type check that can be resolved once during a compile removes a type check that would need to be invoked on every call of a function or method. This can remove a large number of calls when the application is run and thus improves performance.

3.1.4 Roll-your-own dynamic checking

Static type checking does not work in all situations, especially when dealing with data from unknown or untrusted sources. In those cases we resort to dynamic type checking at run-time. Native JavaScript tools for this purpose are limited. For example, the typeof operator does not distinguish between an object and an array.
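A few checks illustrate the gap. The helper name getTypeStr below is hypothetical and is only a sketch of what a roll-your-own check might look like; typeof and Array.isArray are standard JavaScript.

```javascript
// typeof reports 'object' for arrays, plain objects, and null alike.
console.log(typeof []);   // object
console.log(typeof {});   // object
console.log(typeof null); // object

// Hypothetical helper: a hand-rolled check that separates these cases.
function getTypeStr(data) {
  if (data === null) { return 'null'; }
  if (Array.isArray(data)) { return 'array'; }
  return typeof data;
}

console.log(getTypeStr([]));   // array
console.log(getTypeStr({}));   // object
console.log(getTypeStr(null)); // null
console.log(getTypeStr(3));    // number
```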

3.2. Type errors are challenging to resolve

Type errors can be hard to identify and debug. When one routine fails to check for type, an incorrect result can propagate up the call stack, resulting in a cascade of errors. The originating flaw can be hard to spot if variables aren’t named to indicate their intended type, like so:

// Almost useless names
let total = watches / in_use;

However, if we name our variable by intended type the mismatches become obvious:

// Type errors in this assignment are pretty obvious
let total_str = watch_list / use_bool;

// Problems made obvious
// - The return from a divide operation should be a number, not a string
// - Dividing a list will result in NaN in most circumstances
// - The boolean will coerce to 0 or 1

// Here is the problem fixed by using proper types
let total_num = watch_list.length / use_count;

Yes, the intended variable type is that important. We’ve had to maintain plenty of third-party modules where variable names provide no hint of type or purpose, or worse, are patently misleading. We’d rather name our variables sensibly and use the time saved to focus on new challenges.

3.3. Type errors are often serious

As we have shown, type errors can result in severe application failures and security holes. Imagine some NodeJS code that doesn’t properly type-check its JSON API. One could implement a Denial of Service (DoS) attack and shut down an entire cluster by simply sending strings instead of numbers in API requests. This stuff happens.
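As a rough sketch of the kind of guard that prevents this, the snippet below validates a numeric field after parsing a request body. The field name count, the function name getCountInt, and the fallback value are assumptions made for illustration and do not come from any real API.

```javascript
// Hypothetical request-handler fragment: accept only integer counts.
function getCountInt(body_str, fallback_int) {
  var body_map;
  try {
    body_map = JSON.parse(body_str);
  } catch (error) {
    return fallback_int; // malformed JSON
  }
  if (body_map === null || typeof body_map !== 'object') {
    return fallback_int; // parsed, but not an object
  }
  return Number.isInteger(body_map.count) ? body_map.count : fallback_int;
}

console.log(getCountInt('{"count":-3}', 0));   // -3
console.log(getCountInt('{"count":"-3"}', 0)); // 0 (string rejected)
console.log(getCountInt('not json', 0));       // 0
```

Rejecting the string outright, rather than letting it flow into arithmetic, keeps a malformed request from ever reaching a loop like the one in section 2.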

4. How do we get type safety in Native JavaScript?

There are a few ways to improve JavaScript type safety. One approach involves using libraries or frameworks such as Flow or TypeScript that require transpiling or otherwise preprocessing the code. Here we propose a simpler solution that may be especially well-suited for smaller projects and requires only three steps.

Get typecast methods
Use typecasting
Name variables to indicate type

4.1. Get typecast methods

Typecasting, for the purposes of this article, is the process of converting a value into the desired data type using a very strict set of rules. Our typecasting functions either return the requested value type or a failure value which is undefined by default.

We can get typecast methods from the hi_score project which is easy to install (type npm install hi_score into a terminal). If you edit the example application you can use all the cast methods from xhi.util.js.

  npm install hi_score
  cd hi_score
  bin/xhi setup
  google-chrome ./index.html
  # Open the JavaScript console to access xhi._util_ functions

You don’t have to use the whole library; you can just crib the methods from xhi.util.js if you want. Go ahead, you won’t hurt anyone’s feelings. The typecast methods are as follows; all have been thoroughly tested and are quite well documented in the project.

  castBool, castFn,  castInt, castJQ,
  castList, castMap, castNum, castObj, castStr

All typecast methods take one or two arguments. Only numbers, strings, and integers are converted between types and only when the conversion is unambiguous. Examples are shown below.

// return_data = castInt( <value-to-cast> [, <failure-value>] );

return_data = castInt( 0      ); // 0
return_data = castInt( '0'    ); // 0
return_data = castInt( 'a'    ); // undefined
return_data = castInt( []     ); // undefined
return_data = castInt( 'a', 0 ); // 0
return_data = castInt( [],  0 ); // 0
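To make these rules concrete, here is a minimal sketch of how such a function could behave. It is our own illustration, named sketchCastInt to make clear that it is not the hi_score implementation, and it only aims to reproduce the results listed above.

```javascript
// Illustrative only: NOT the hi_score source. Returns an integer when the
// conversion is unambiguous, otherwise the failure value (default undefined).
function sketchCastInt(data, fail_data) {
  var num;
  if (typeof data === 'number') {
    num = data;
  } else if (typeof data === 'string' && /^-?\d+$/.test(data)) {
    num = Number(data);
  } else {
    return fail_data;
  }
  // Rejects NaN, Infinity, and non-integer numbers.
  return Number.isInteger(num) ? num : fail_data;
}

console.log(sketchCastInt(0));      // 0
console.log(sketchCastInt('0'));    // 0
console.log(sketchCastInt('a'));    // undefined
console.log(sketchCastInt([]));     // undefined
console.log(sketchCastInt('a', 0)); // 0
console.log(sketchCastInt([], 0));  // 0
```

The key design choice is that failure is reported through a return value the caller chooses, so a bad input never throws and never silently coerces.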

JavaScript tries hard to coerce types, and the results are often undesirable. The table below shows what native coercion produces for a range of values. Blank cells are conditions where a type error exception is usually thrown.

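Value        | Bool  | Fn           | Num       | Ary         | Obj                  | Str
''           | false |              | 0         |             |                      | ''
'0'          | true  |              | 0         |             |                      | '0'
'1'          | true  |              | 1         |             |                      | '1'
'20'         | true  |              | 20        |             |                      | '20'
'ten'        | true  |              | NaN       |             |                      | 'ten'
0            | false |              | 0         |             |                      | '0'
1            | true  |              | 1         |             |                      | '1'
NaN          | false |              | NaN       |             |                      | 'NaN'
[]           | true  |              | 0         | []          | {}                   | ''
['ten']      | true  |              | NaN       | ['ten']     | { 0: 'ten' }         | 'ten'
['ten','t']  | true  |              | NaN       | ['ten','t'] | { 0: 'ten', 1: 't' } | 'ten,t'
[10]         | true  |              | 10        | [10]        | { 0: 10 }            | '10'
[10,20]      | true  |              | NaN       | [10,20]     | { 0: 10, 1: 20 }     | '10,20'
false        | false |              | 0         |             |                      | 'false'
function(){} | true  | function(){} | NaN       | []          | {}                   | 'function(){}'
null         | false |              | 0         |             |                      | 'null'
true         | true  |              | 1         |             |                      | 'true'
undefined    | false |              | NaN       |             |                      | 'undefined'
{}           | true  |              | NaN       | []          | {}                   | '[object Object]'
-Infinity    | true  |              | -Infinity |             |                      | '-Infinity'
Infinity     | true  |              | Infinity  |             |                      | 'Infinity'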

The cast methods, by contrast, are predictable, explicit, and self-documenting. They only convert the most unambiguous values. Blank cells are conditions where the failure value (undefined by default) will be returned. No exceptions are thrown by these methods.

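Value        | Bool  | Fn           | Num       | Ary         | Obj | Str
''           |       |              |           |             |     | ''
'0'          |       |              | 0         |             |     | '0'
'1'          |       |              | 1         |             |     | '1'
'20'         |       |              | 20        |             |     | '20'
'ten'        |       |              |           |             |     | 'ten'
0            |       |              | 0         |             |     | '0'
1            |       |              | 1         |             |     | '1'
NaN          |       |              |           |             |     |
[]           |       |              |           | []          |     |
['ten']      |       |              |           | ['ten']     |     |
['ten','t']  |       |              |           | ['ten','t'] |     |
[10]         |       |              |           | [10]        |     |
[10,20]      |       |              |           | [10,20]     |     |
false        | false |              |           |             |     |
function(){} |       | function(){} |           |             |     |
null         |       |              |           |             |     |
true         | true  |              |           |             |     |
undefined    |       |              |           |             |     |
{}           |       |              |           |             | {}  |
-Infinity    |       |              | -Infinity |             |     |
Infinity     |       |              | Infinity  |             |     |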

4.2 Use Typecasting

Let’s adjust our example function to use castInt and castFn to ensure the provided arguments are the correct type. If they are not, the function will return without any further processing.

function repeats(arg_counts, arg_run)
{
    var counts = castInt(arg_counts)
    var run = castFn(arg_run)
    if (!(counts && run)){return}

    while (counts < 0)
    {
        run(counts)
        counts+=1
    }
}

function reports(info)
{
    console.log(info)
}

repeats('-3', reports)

The function is now impervious to most type errors.
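To try this without installing anything, the sketch below pairs the function with simplified stand-ins for castInt and castFn. The stand-ins and the record helper are our own assumptions for illustration; they are not the hi_score implementations.

```javascript
// Stand-ins for the cast methods (illustrative, not the hi_score source).
function castInt(data) {
  var num = typeof data === 'string' && /^-?\d+$/.test(data)
    ? Number(data) : data;
  return Number.isInteger(num) ? num : undefined;
}
function castFn(data) {
  return typeof data === 'function' ? data : undefined;
}

function repeats(arg_counts, arg_run) {
  var counts = castInt(arg_counts);
  var run = castFn(arg_run);
  if (!(counts && run)) { return; }
  while (counts < 0) {
    run(counts);
    counts += 1;
  }
}

var seen_list = [];
function record(info) { seen_list.push(info); }

repeats('-3', record);  // string is cast to the integer -3
repeats('abc', record); // rejected: not an integer
repeats([], record);    // rejected: not an integer
repeats(-2, 'oops');    // rejected: not a function
console.log(seen_list); // [ -3, -2, -1 ]
```

Note that the guard also returns early when counts is 0. That is harmless here because the loop body would not run for 0 anyway.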

4.3 Name variables to indicate type

Typecasting handles run-time type checking. Most benefits of static checking can be realized by naming our variables to indicate their intended type. Most static type errors become self-evident if we adhere to this convention. Let’s rewrite our code using the name convention from the JS Code standard quick reference.

function repeatFn ( arg_map ) {
  var
    map = castMap( arg_map,  {} ),
    int = castInt( map._int_, 0 ),
    fn  = castFn(  map._fn_     ),
    idx
    ;

  if ( ! fn ) { return; }

  for ( idx = int; idx < 0; idx++ ) {
    fn( idx );
  }
}

function printToConsole ( idx ) { console.log( idx ); }

repeatFn({ _int_ : '-3', _fn_ : printToConsole });

This code will now pass ESLint. Thanks to the naming convention we can tell immediately that fn should be a function and idx should be an integer, without the need to read other code, add mark-up comments, or employ any transpiling.

The full JS Code standard we use discusses why a simple naming convention can vastly reduce the need for comments. We think it’s an interesting and compelling read if you’re into that kind of thing.

Now that we have consistent named-by-type variables and better formatting, we can easily read the code to create in-line API documentation. Using the guide from the code standard we get the following:

// BEGIN utility method /repeatFn/
// Summary   : repeatFn({ _int_ : <integer>, _fn_ : <function> })
// Purpose   : Repeatedly call a function 'fn' as long as the
//             counter 'int' is < 0.  After each call, 'int' is
//             incremented by 1.  If the initial value of 'int'
//             is not < 0 the function 'fn' is not called.
// Example   : repeatFn({
//               _int_ : -3,
//               _fn_ : function ( idx ) { console.log( idx ) }
//             });
// Arguments  : ( named )
//   _fn_     : The function to execute. The current value of the
//              index (idx) is provided as its sole argument.
//   _int_    : The initial value of idx. Idx is incremented after
//              _fn_ is executed. Thus a value of '-1' will result in a
//              single execution of _fn_( idx );
// Returns    : undefined
// Throws     : none
//
function repeatFn ( arg_map ) {
  var
    map = castMap( arg_map,  {} ),
    int = castInt( map._int_, 0 ),
    fn  = castFn(  map._fn_     ),
    idx
    ;

  if ( ! fn ) { return; }
  for ( idx = int; idx < 0; idx++ ) {
    fn( idx );
  }
}
// END utility method /repeatFn/

function printToConsole ( idx ) { console.log( idx ); }
repeatFn({ _int_ : '-3', _fn_ : printToConsole });

Remember where we begged you not to copy our first code example? Copy this code instead. It is impervious to most type errors, readable, testable, maintainable, compressible, and well documented.

We can use the in-line API docs along with tools like Istanbul and nodeunit to create tests for our type-safe repeatFn function. Check out the test suite for hi_score to see how this can be accomplished.

5. What about frameworks and libraries?

How does this simple technique compare with Flow and TypeScript? We still intend to publish a sequel that will answer this question.

Remember to use typecasting only when processing external data and inputs of public methods. Rely on the name convention to communicate the intended type of variables in private methods.

We hope you found this useful! Please share your thoughts and experiences in the comments below.

Cheers, Mike

Written on August 12, 2017