Type safety with native JavaScript v02
Why everybody wants type safety and how to get it.
This is an update from the original article. It is shorter, simpler, more logical, and more correct.
1. Overview
Many developer tools like IDEs, frameworks, libraries, and linters try to provide some level of type safety to JavaScript. This article explains what type safety is, why we want it, and how we can get it using native JavaScript.
2. What is type safety?
Type safety is the extent a programming language discourages or prevents type errors. A type error occurs when an unintended or incompatible type value is provided to a function or expression, usually as an argument or context. Dividing a number by an array is a perfect example. Usually this doesn’t make sense, yet JavaScript support this very operation, often with surprising and unwanted results.
Let’s see how easy it is to create a type error in Javascript using the terse semantic style popular in some fringe cults. This is awful code, so please don’t copy it. We’ll improve it as we go along.
function repeats(counts, run)
{
while(counts < 0)
{
run(counts)
counts+=1
}
}
function reports(info)
{
console.log(info)
}
repeats('-3', reports)
That didn’t take long! The first type error occurs when we provide the ‘-3’
argument to the repeats
invocation. This value should be an integer, but we
instead provide a string. Then all hell breaks loose.
When repeats
is invoked it creates an ‘endless’ loop condition that will
rather quickly consume all available resources within its execution
environment. This results in a force-kill of a NodeJS process, or a browser
tab, or the entire browser process, or the host OS.
If we watch the progression of the value of counts
in the while
loop we
see the following series: '-3', '-31', '-311', '-3111', ...
. The test
condition counts < 0
does convert the counts
string into a number, but
by then it is too late: the value is always less than 0 and it only gets more
negative with each iteration. That’s because the expression counts+=1
always
converts the number 1
into a string '1'
and appends it to the counts
variable.
3. Why do we want type safety?
We want type safety with JavaScript because type errors can be quite troublesome:
- They are easy to create
- They are challenging to resolve
- They are often serious
Let’s look at each of these issues in turn.
3.1. JS type errors are easy to create
JavaScript is missing almost all type controls because it was not originally intended to run large-scale applications. There are no type declarations, return types are not reliable, there is no static type-checking, and one must roll-their-own dynamic checking.
3.1.1 No type declarations
JavaScript has no means to formally declare a variable type. A variable may
contain any type which may be changed at any time during execution. There are
no sigils to identify type, and a variable name is unconstrained. We
probably all have seen a JavaScript application or two where a single symbol
like watches
is used as an object, a map, a list, a boolean flag, a string,
an integer, and a floating point number all in different contexts. Yet in
practice most variables used in JavaScript are not intended to change type
within a context because doing so is needlessly confusing.
Other languages provide a ‘wild card’ variables to reference any type of value. These are usually easy to identify because they have a unique syntax. Developers know they these ‘wild cards’ need to be handled with care. In JavaScript, all variables are ‘wild cards.’
3.1.2 Return types are not reliable
The return types of JavaScript expressions are often not reliable. Many JavaScript operators are polymorphic and will return values of different types dependant on the types of variables used. For example, the JavaScript ‘+’ operator can concatenate strings or add numbers or converts a string to a number. The return value can be suprising when varilable types are mixed as illustrated below.
// Confirmed using Google V8 version 6.0.286.52
console.log(process.versions.v8);
var x, y;
// expression // | returned value | Coercion | Op
x = 3; // | 3 | - |
x = 3 + 1; // | 4 | - | +num
x = 3 + '1'; // | '31' | 3 => '3' | +str
x = 3 + []; // | '3' | [] => '' | +str
x = 3 + [ 21 ]; // | '321' | [ 21 ] => '21' | +str
x = 3 + {}; // | '3[object Object]' | {} => str | +str
x = '3' - 2; // | 1 | '3' => 3 | -num
x = '3' + 2; // | '32' | 2 => '2' | +str
x = + '3'; // | 3 | '3' => 3 | cast_num
x = 0 + '3'; // | '03' | 0 => '0' | +str
x = y + 3; // | NaN | - | NaN
x = y + '3'; // | 'undefined3' | undef => str | +str
This behavior (confirmed using Google V8 version 6.0.286.52; see node -e
'console.log(process.versions.v8);'
) is probably consistent across the across
the four primary JavaScript engines (V8, IonMonkey, Nitro, Chakra) and the
dozen of others. However less popular operators probably have some
variance across the engines that can cause headaches.
Other languages have less complex behaviors because they usually have stricter
type checking, stricter coercion rules, fewer polymorphic operators, and fewer
vendors. Perl, for example, uses the dot (.
) operator to join strings
and the plus (+
) operator to add numbers. Perl also uses sigals
(prefixes) like $
, @
, and %
to indicate types and a special syntax to
identify ‘wild-card’ (type-glob) references.
3.1.3 No static checking
Many languages provide some level of static type checking. Java, for example, resolves most variable types during compilation. If JavaScript had a similar mechanism we wouldn’t be able to run our application until we resolved the compile errors. In this imaginary world, our JavaScript compile output might look like this:
00: ok | x = 3; |
01: ok | x = 3 + 1; |
02: compile_error: type_mismatch | x = 3 + '1'; |
03: compile_error: type_mismatch | x = 3 + []; |
04: compile_error: type_mismatch | x = 3 + [ 21 ]; |
06: compile_error: type_mismatch | x = 3 + {}; |
07: compile_error: type_mismatch | x = '3' - 2; |
08: compile_error: type_mismatch | x = '3' + 2; |
09: compile_error: type_mismatch | x = + '3'; |
10: compile_error: type_mismatch | x = 0 + '3'; |
11: compile_error: type_mismatch | x = y + 3; |
12: compile_error: type_mismatch | x = y + '3'; |
Perhaps the greatest advantage of static (compile-time) type checking is that it can improve performance: every type check that can be resolved once during a compile removes a type check that would need to be invoked on every call of a function or method. This can remove a large number of calls when the application is run and thus improves performance.
3.1.4 Roll-your-own dynamic checking
Static type checking does not work in all situations, especially when dealing
with data from unknown or untrusted sources. In those cases we resort to
dynamic type checking at run-time. Native JavaScript tools for this purpose
are limited. For example, the typeof
method does not distinguish between an
object and an array.
3.2. Type errors are challenging to resolve
Type errors can be hard to identify and debug. When one routine fails to check for type an incorrect result can propagate up the call stack resulting in a cascade of errors. The originating flaw can be hard to spot if variable aren’t named to indicate their intended type, like so:
// Almost useless names
let total = watches / in_use;
However, if we name our variable by intended type the mismatches become obvious:
// Type errors in this assignment are pretty obvious
let total_str = watch_list / use_bool;
// Problems made obvious
// - The return from a divide operation should be a number, not a string
// - Dividing a list will result in NaN in most circumstances
// - The boolean will coerce to 0 or 1
// Here is the problem fixed by using proper types
let total_num = watch_list.length / use_count;
Yes, the intended variable type is that important. We’ve had to maintain plenty of third-party modules where variable names provide no hint of type or purpose, or worse, are patently misleading. We’d rather name our variables sensibly and use the time saved to focus on new challenges.
3.3. Type errors are often serious
As we have shown, type errors can result in severe application failures and security holes. Imagine some NodeJS code that doesn’t properly type-check its JSON API. One could implement a Denial Of Service (DOS) attack and shut down an entire cluster by simply sending strings instead of numbers in API requests. This stuff happens.
4. How do we get type safety in Native JavaScript?
There are a few ways to improve JavaScript type safety. One approach involves using libraries or frameworks such as Flow or TypeScript that require transpiling or otherwise prepocessing the code. Here we propose a simpler solution that may be especially well-suited for smaller projects and requires only three steps.
- Get typecast methods
- Use typecasting
- Name variables to indicate type
4.1. Get typecast methods
Typecasting, for the purposes of this article, is the process of converting a
value into the desired data type using a very strict set of rules. Our
typecasting functions either return the requested value type or a
failure value which is undefined
by default.
We can get typecast methods from the hi_score project which is easy to
install (type npm install hi_score
into a terminal). If you edit the example
application you can use all the cast
methods from xhi.util.js
.
npm install hi_score
cd hi_score
bin/xhi setup
google-chrome ./index.html
# Open the JavaScript console to access xhi._util_ functions
You don’t have to use the whole library; you can just crib the methods from
xhi.util.js
if you want. Go ahead, you won’t hurt anyone’s feelings. The
typecast methods are as follows and have all been thoroughly tested and quite
well docmented in the project.
castBool, castFn, castInt, castJQ,
castList, castMap, castNum, castObj, castStr
All typecast methods take one or two arguments. Only numbers, strings, and integers are converted between types and only when the conversion is unambiguous. Examples are shown below.
// return_data = castInt( <value-to-cast> [, <failure-value>] );
return_data = castInt( 0 ); // 0
return_data = castInt( '0' ); // 0
return_data = castInt( 'a' ); // undefined
return_data = castInt( [] ); // undefined
return_data = castInt( 'a', 0 ); // 0
return_data = castInt( [], 0 ); // 0
JavaScript tries hard to coerce types, and the results are often undesirable. Blank cells are conditions where a type error exception is usually thrown.
Value | Bool | Fn | Num | Ary | Obj | Str |
---|---|---|---|---|---|---|
’’ | FALSE | 0 | ’’ | |||
‘0’ | t | 0 | ‘0’ | |||
‘1’ | t | 1 | ‘1’ | |||
‘20’ | t | 20 | ‘20’ | |||
‘ten’ | t | NaN | ‘ten’ | |||
0 | FALSE | 0 | ‘0’ | |||
1 | t | 1 | ‘1’ | |||
NaN | FALSE | NaN | ‘NaN’ | |||
[] | t | 0 | [] | {} | ’’ | |
[‘ten’] | t | NaN | [‘ten’] | { 0: ‘ten’ } | ‘ten’ | |
[‘ten’,’t’] | t | NaN | [‘ten’,’t’] | { 0: ‘ten’, 1: ‘t’} | ‘ten,t’ | |
[10] | t | 10 | [10] | { 0 : 10 } | ‘10’ | |
[10,20] | t | NaN | [10,20] | { 0: 10, 1:20 } | ‘10,20’ | |
false | FALSE | 0 | ‘false’ | |||
function(){} | t | function(){} | NaN | [] | {} | ‘function(){}’ |
null | FALSE | 0 | ‘null’ | |||
true | t | 1 | ‘true’ | |||
undefined | FALSE | NaN | ‘undefined’ | |||
{} | t | NaN | [] | {} | ‘0’ | |
-Infinity | t | -Infinity | ‘-Infinity’ | |||
Infinity | t | Infinity | ‘Infinity’ |
The cast
methods are they are predictable, explicit, and self-documenting.
The only convert the most unambigous values. Blank cells are conditions where
the failure value (undefined
by default) will be returned. No exceptions
are thrown by these methods.
Value | Bool | Fn | Num | Ary | Obj | Str |
---|---|---|---|---|---|---|
’’ | ’’ | |||||
‘0’ | 0 | ‘0’ | ||||
‘1’ | 1 | ‘1’ | ||||
‘20’ | 20 | ‘20’ | ||||
‘ten’ | ‘ten’ | |||||
0 | 0 | ‘0’ | ||||
1 | 1 | ‘1’ | ||||
NaN | ||||||
[] | [] | |||||
[‘ten’] | [‘ten] | |||||
[‘ten’,’t’] | [‘ten’,’t’] | |||||
[10] | [10] | |||||
[10,20] | [10,20] | |||||
false | FALSE | |||||
function(){} | function(){} | |||||
null | ||||||
true | true | |||||
undefined | ||||||
{} | {} | |||||
-Infinity | -Infinity | |||||
Infinity | Infinity |
4.2 Use Typecasting
Let’s adjust our example function to use castInt
and castFn
to ensure the
provided arguments are the correct type. If they are not, the function will
return without any further processing.
function repeats(arg_counts, arg_run)
{
var counts = castInt(arg_counts)
var run = castFn(arg_run)
if (!(counts && run)){return}
while (counts < 0)
{
run(counts)
counts+=1
}
}
function reports(info)
{
console.log(info)
}
repeats('-3', reports)
The function is now impervious to most type errors.
4.3 Name variables to indicate type
Typecasting handles run-time type checking. Most benefits of static checking can be realized by naming our variables to indicate their intended type. Most static type errors become self-evident if we adhere to this convention. Let’s rewrite our code using the name convention from the JS Code standard quick reference.
function repeatFn ( arg_map ) {
var
map = castMap( arg_map, {} ),
int = castInt( map._int_, 0 ),
fn = castFn( map._fn_ ),
idx
;
if ( ! fn ) { return; }
for ( idx = int; idx < 0; idx++ ) {
fn( idx );
}
}
function printToConsole ( idx ) { console.log( idx ); }
repeatFn({ _int_ : '-3', _fn_ : printToConsole });
This code will now pass ESLint. Thanks to the naming convention we can tell
that tell that fn
should be a function and and idx
should be an integer
immediately without the need to read other code, add mark-up comments, or
employ any transpiling.
The full JS Code standard we use discusses why a simple naming convention can vastly reduce the need for comments. We think it’s an interesting and compelling read if you’re into that kind of thing.
Now that we have consistent named-by-type variables and better formatting, we can easily read the code to create in-line API documentation. Using the guide from the code standard we get the following:
// BEGIN utility method /repeatFn/
// Summary : repeatFn({ _int_ : <integer>, _fn_ : <function> )
// Purpose : Repeatedly call a function 'fn' as long as the
// counter 'int' is < 0. After each call, 'int' is
// incremented by 1. If the initial value of 'int'
// is not < 0 the function 'fn' is not called.
// Example : repeatFn({
// _int_ : -3,
// _fn_ : function (idx ) { console.log( idx ) }
// });
// Arguments : ( named )
// _fn_ : The function to execute. The current value of the
// index (idx) is provided as its sole argument.
// _int_ : The initial value of idx. Idx is incremented after
// _fn_ is executed. Thus a value of '-1' will result in a
// single execution of _fn_( idx );
// Returns : undefined
// Throws : none
//
function repeatFn ( arg_map ) {
var
map = castMap( arg_map, {} ),
int = castInt( map._int_, 0 ),
fn = castFn( map._fn_ ),
idx
;
if ( ! fn ) { return; }
for ( idx = int; idx < 0; idx++ ) {
fn( idx );
}
}
// END utility method /repeatFn/
function printToConsole ( idx ) { console.log( idx ); }
repeatFn({ _int_ : '-3', _fn_ : printToConsole });
Remember where we begged you not to copy our first code example? Copy this code instead. It is impervious to most type errors, readable, testable, maintainable, compressable, and well documented.
We can use the in-line API docs along with tools like Istanbul
and
nodeunit
to create tests for our type-safe repeatFn
function. Check out
the test suite for hi_score to see how this can be accomplished.
5. What about frameworks and libraries?
How does this simple technique compare with Flow and TypeScript? We still intend to publish a sequel that will answer this question.
Remember to use typcasting only when processing external data and inputs of public methods. Rely on the name convention to communicate intended type of variables in private methods and variables.
We hope you found this useful! Please share your thoughts and experiences in the comments below.
Cheers, Mike