[...] we’ve begun work on a native port of the TypeScript compiler and tools. The native implementation will drastically improve editor startup, reduce most build times by 10×, and substantially reduce memory usage.
This blog post looks at some of the details behind the news.
TypeScript 6 (JavaScript): The JavaScript code base will continue into the 6.x series, with 6.0 introducing some breaking changes to align with the native code base.
Original code name for TypeScript: Strada
TypeScript 7 (native): Once the native code base has reached sufficient parity with the JavaScript code base, it’ll be released as TypeScript 7.0.
Later: A tool was written that automatically ports TypeScript code to Go code.
Porting the code worked well (with some human intervention).
Porting the data structures could only be done by humans – because JavaScript’s objects (with flexible types) and Go’s structs (with highly configurable data layouts) are very different. Plus, they now have to work in a concurrent setting – e.g.: The JavaScript code base orders types by giving each one a serial number when it is created. That approach doesn’t work with the Go code base where the order in which types are created is not deterministic anymore because multiple threads are involved.
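The ordering problem can be illustrated with a small Go sketch. Everything in it is hypothetical: the `Type` struct, the helper functions and, in particular, ordering by name are only there to contrast a creation-order key with a key that does not depend on scheduling. The source does not say how the Go code base actually orders types.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"sync/atomic"
)

// Type is a hypothetical stand-in for a compiler type object.
type Type struct {
	Name string
	ID   int64 // serial number, assigned when the type is created
}

// createConcurrently creates one Type per name, each in its own goroutine.
// Every ID is unique, but which name receives which ID depends on how the
// goroutines happen to be scheduled.
func createConcurrently(names []string) []*Type {
	var counter int64
	types := make([]*Type, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			types[i] = &Type{Name: name, ID: atomic.AddInt64(&counter, 1)}
		}(i, name)
	}
	wg.Wait()
	return types
}

// namesOrderedBy returns the type names sorted with the given comparison.
func namesOrderedBy(types []*Type, less func(a, b *Type) bool) []string {
	sorted := append([]*Type(nil), types...)
	sort.Slice(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
	names := make([]string, len(sorted))
	for i, t := range sorted {
		names[i] = t.Name
	}
	return names
}

// byID orders by serial number (creation order).
func byID(a, b *Type) bool { return a.ID < b.ID }

// byName orders by a key that is independent of creation order
// (an illustrative assumption, not the compiler's actual strategy).
func byName(a, b *Type) bool { return a.Name < b.Name }

func main() {
	types := createConcurrently([]string{"string", "number", "boolean"})
	fmt.Println(namesOrderedBy(types, byID))   // may differ between runs
	fmt.Println(namesOrderedBy(types, byName)) // always [boolean number string]
}
```

In single-threaded code, ordering by ID is stable because types are always created in the same sequence. With goroutines, only an ordering key that is independent of creation order stays stable.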
The TypeScript team wanted to (mostly) port the JavaScript code base, not rewrite it in a different language – for two reasons:
The new code base must (mostly) be a plug-and-play replacement for the old code base. And that is difficult to do with a rewrite.
Rewriting is much more time-consuming.
If we look at the requirements for the programming language to be used for the new code base, then some of them stemmed from the decision to do a port:
Support for cyclic data structures – which the TypeScript code base uses a lot.
Garbage collection. That’s what the code base assumes.
The style of the JavaScript code base is more functional than OOP and doesn’t use classes often. That style is similar to how Go is written.
The remaining requirements were motivated by performance and ease of use (developer experience):
Good native code support on all major platforms.
The language should be simple and easy to approach.
The language should have good tooling.
Control over how data structures are laid out in memory. With Go, you can use structs and create an array of structs with a single allocation (vs. multiple allocations in JavaScript).
Good support for shared memory concurrency – which is an important element of making the code faster (more on that next).
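To make the point about data layout more concrete, here is a minimal Go sketch. The `Node` struct and its fields are made up for illustration and are not taken from the TypeScript code base. The sketch contrasts a slice of structs (one contiguous backing array) with a slice of pointers to individually allocated structs, which is closer to what an array of objects looks like in JavaScript.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Node is a hypothetical AST node with a fixed, compact layout.
type Node struct {
	Kind  uint16
	Flags uint16
	Pos   int32
	End   int32
}

// newNodesContiguous creates n nodes in one contiguous backing array:
// a single make call, with the nodes stored next to each other.
func newNodesContiguous(n int) []Node {
	return make([]Node, n)
}

// newNodesSeparate creates the slice plus one separate Node per element.
func newNodesSeparate(n int) []*Node {
	nodes := make([]*Node, n)
	for i := range nodes {
		nodes[i] = &Node{}
	}
	return nodes
}

// nodeSize reports how many bytes one Node occupies.
func nodeSize() uintptr {
	return unsafe.Sizeof(Node{})
}

// stride reports the distance in bytes between the first two elements
// of a slice of nodes (the slice must have at least two elements).
func stride(nodes []Node) uintptr {
	return uintptr(unsafe.Pointer(&nodes[1])) - uintptr(unsafe.Pointer(&nodes[0]))
}

func main() {
	nodes := newNodesContiguous(1000)
	fmt.Println(nodeSize(), stride(nodes)) // prints "12 12"
	fmt.Println(len(newNodesSeparate(1000)))
}
```

The stride equals the struct size, which shows that the nodes are packed directly one after another. With the pointer-based variant, each node lives wherever the allocator happened to place it.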
When asked “Why not C#?”, Anders Hejlsberg mentioned the following points:
Go is lower-level than C#.
Go has better support for producing native code (including specifying the layout of data structures).
Go is a better fit for the (rather functional) coding style used in the JavaScript code base.
Half of the speedup comes from shared memory concurrency and using multiple cores.
The other half comes from native code: There is no just-in-time compilation, data structures are more efficient, etc.
JavaScript does have concurrency via web workers, but memory can only be shared in a very limited manner (see SharedArrayBuffer).
TypeScript compilation has the following phases:
Parsing: producing an abstract syntax tree (AST)
Binding: creating “symbol tables” for the declarations, setting up the control flow graph, etc.
Type checking
Emitting: code generation
In the native code base, parsing and binding can be done independently (no memory sharing necessary). After that, the resulting data structures are immutable and can be shared between threads.
The speed of parsing, binding and emitting scales linearly with the number of cores that are used. Taken together, they make up about a third of the total compilation time.
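The following Go sketch shows the general pattern of processing files independently and sharing the results afterwards. It is an illustration under assumptions, not the compiler's actual code: `SourceFile`, `Input`, `parseAndBind` and `parseAll` are hypothetical names, and the "parser" merely counts whitespace-separated tokens.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// SourceFile is a hypothetical stand-in for a parsed and bound file.
// After creation, it is only read, never modified.
type SourceFile struct {
	Name       string
	TokenCount int
}

// Input is a file name plus its source text.
type Input struct {
	Name string
	Text string
}

// parseAndBind is a placeholder for real parsing and binding:
// it only counts whitespace-separated tokens.
func parseAndBind(in Input) *SourceFile {
	return &SourceFile{Name: in.Name, TokenCount: len(strings.Fields(in.Text))}
}

// parseAll processes every file in its own goroutine. Each goroutine
// writes only to its own slot of the result slice, so no locks are needed.
func parseAll(inputs []Input) []*SourceFile {
	files := make([]*SourceFile, len(inputs))
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in Input) {
			defer wg.Done()
			files[i] = parseAndBind(in)
		}(i, in)
	}
	wg.Wait() // from here on, the files are treated as read-only
	return files
}

func main() {
	files := parseAll([]Input{
		{Name: "a.ts", Text: "let x = 1"},
		{Name: "b.ts", Text: "export {}"},
	})
	for _, f := range files {
		fmt.Println(f.Name, f.TokenCount)
	}
}
```

Because no file depends on another during this phase, the work can be spread across cores without coordination. That is the property that makes linear scaling plausible.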
Type checking makes up the remaining two thirds and is not as parallelizable. Therefore, the following trick is used:
Type checking works for single files – it lazily pulls in more information as needed.
Technique: Run multiple type checkers and give each one a subset of the files.
There is not much need for thread safety: The checkers only share the immutable ASTs.
There is some duplication of work, but not much because a lot of the type information is local.
With 4 checkers (the number is currently hard-coded), 20% more memory is used (due to duplicated work), but checking is 2–3 times faster.
Note that the 20% is relative to single-checker Go – which uses half the memory of the JavaScript code base. So four checkers need roughly 0.5 × 1.2 = 0.6 times the memory of the JavaScript code base.
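The multi-checker technique can be sketched in Go as follows. All names (`File`, `Checker`, `checkAll`) are hypothetical, "checking" is reduced to recording which files a checker had to look at, and the round-robin assignment of files to checkers is an assumption made for this example. The source does not describe how the real compiler distributes files.

```go
package main

import (
	"fmt"
	"sync"
)

// File is a hypothetical, immutable result of parsing and binding.
type File struct {
	Name    string
	Imports []string
}

// Checker has a private cache, so checkers never write to shared state.
type Checker struct {
	program map[string]*File // shared between checkers, read-only
	cache   map[string]bool  // private: files this checker has resolved
}

// check resolves a file and lazily pulls in the files it imports.
func (c *Checker) check(name string) {
	file, ok := c.program[name]
	if !ok || c.cache[name] {
		return
	}
	c.cache[name] = true
	for _, imp := range file.Imports {
		c.check(imp)
	}
}

// checkAll distributes the files round-robin over numCheckers checkers,
// runs them concurrently and returns how many files each one resolved.
func checkAll(program map[string]*File, names []string, numCheckers int) []int {
	checkers := make([]*Checker, numCheckers)
	for i := range checkers {
		checkers[i] = &Checker{program: program, cache: map[string]bool{}}
	}
	var wg sync.WaitGroup
	for i, c := range checkers {
		wg.Add(1)
		go func(i int, c *Checker) {
			defer wg.Done()
			for j := i; j < len(names); j += numCheckers {
				c.check(names[j])
			}
		}(i, c)
	}
	wg.Wait()
	resolved := make([]int, numCheckers)
	for i, c := range checkers {
		resolved[i] = len(c.cache)
	}
	return resolved
}

func main() {
	program := map[string]*File{
		"a.ts":   {Name: "a.ts", Imports: []string{"lib.ts"}},
		"b.ts":   {Name: "b.ts", Imports: []string{"lib.ts"}},
		"lib.ts": {Name: "lib.ts"},
	}
	names := []string{"a.ts", "b.ts", "lib.ts"}
	fmt.Println(checkAll(program, names, 1)) // prints "[3]"
	fmt.Println(checkAll(program, names, 2)) // prints "[2 2]"
}
```

With two checkers, `lib.ts` is resolved twice (four resolutions for three files). That is the kind of duplicated work that costs extra memory, while the checkers themselves never need to synchronize.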
One interesting worry expressed by Anders Hejlsberg is that, with faster type checking, people may stop trying to write types that can be computed quickly – e.g., with template literal types it’s easy to create types that make type checking slow.
Maybe we’ll get tools for analyzing the performance of types. Type-level debugging also seems useful.