Background#
When compiling our SDK, we often configure path aliases so that long relative import paths can be written as short alias paths, which is more convenient for developers. For example:
{
  // ...
  "baseUrl": "src",
  "paths": {
    "@/package": ["./index"],
    "@/package/*": ["./*"]
  }
}
This keeps the alias always pointing at the src directory. A file nested deep inside src can reference top-level files directly, as shown below, which avoids long chains of relative paths, keeps the code tidy, and makes it easier to restructure directories later.
import Components from "@/package/ui/header"
This gives a friendly DX during development, but once the code is packaged into an SDK, the host environment that consumes it will usually not share our alias configuration. Any mismatch means imports cannot be resolved, so during the TS build we need to compile the aliased imports back into relative paths to stay compatible with every consumer. Doing that requires some understanding of the TS compilation process.
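Concretely, an aliased import written inside a nested source file has to come out of the build as a plain relative path; the file locations below are hypothetical:
// src/pages/home/index.ts (source, written with the alias)
import Components from "@/package/ui/header";
// dist/pages/home/index.js (emitted, alias rewritten to a relative path)
import Components from "../../ui/header";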
TS Compiler Related#
TS Compilation Process#
Source Code ~~ Scanner ~~> Token Stream
Token Stream ~~ Parser ~~> AST (Abstract Syntax Tree)
AST ~~ Binder ~~> Symbols
AST + Symbols ~~ Checker ~~> Type Validation
AST + Checker ~~ Emitter ~~> JavaScript Code
Scanner#
The scanner in TS (TypeScript) is the first stage of the compiler, also known as the lexical analyzer. It is responsible for converting the character stream in the source code file into a series of lexical units (tokens).
The working process is as follows:
- Read the character stream: The scanner reads characters from the source code file one by one.
- Identify lexical units: The scanner combines characters into recognized lexical units, such as identifiers, keywords, operators, constants, etc., based on a set of predefined syntax rules. It uses finite automata or regular expressions to match character sequences.
- Generate lexical units: Once a complete lexical unit is recognized, the scanner generates it as an object containing type and value information and passes it to the next stage of the compiler.
- Handle special cases: The scanner also handles special cases, such as comments, string literals, and parsing of escape characters.
For example, consider the following TypeScript code snippet:
let age: number = 25;
The scanner will read characters one by one and generate the following lexical units:
- let keyword
- age identifier
- : colon (operator)
- number keyword
- = equals (operator)
- 25 numeric constant
- ; semicolon (separator)
Lexical units are emitted in the order defined by the lexical rules, and the scanner repeats this process until every character in the source file has been consumed. This stage only extracts tokens; it performs no syntactic or semantic analysis.
import * as ts from "typescript";
// TypeScript has a singleton scanner
const scanner = ts.createScanner(ts.ScriptTarget.Latest, /*skipTrivia*/ true);
// That is initialized using a function `initializeState` similar to
function initializeState(text: string) {
scanner.setText(text);
scanner.setOnError((message: ts.DiagnosticMessage, length: number) => {
console.error(message);
});
scanner.setScriptTarget(ts.ScriptTarget.ES5);
scanner.setLanguageVariant(ts.LanguageVariant.Standard);
}
// Sample usage
initializeState(`
var foo = 123;
`.trim());
// Start the scanning
var token = scanner.scan();
while (token != ts.SyntaxKind.EndOfFileToken) {
console.log(ts.SyntaxKind[token]);
token = scanner.scan();
}
output
VarKeyword
Identifier
FirstAssignment
FirstLiteralToken
SemicolonToken
Parser#
The parser in TS (TypeScript) is a tool used to convert TypeScript code into an Abstract Syntax Tree (AST). The main role of the parser is to parse the source code into a syntax tree for subsequent static analysis, type checking, and compilation operations.
The parser constructs the syntax tree by analyzing the lexical and syntactic structure of the source code. The lexical analysis phase breaks the source code into tokens, such as keywords, identifiers, operators, and constants. The syntactic analysis phase organizes the tokens into a tree structure, ensuring the syntactic correctness of the code.
import * as ts from "typescript";
function printAllChildren(node: ts.Node, depth = 0) {
console.log(new Array(depth + 1).join('----'), ts.SyntaxKind[node.kind], node.pos, node.end);
depth++;
node.getChildren().forEach(c => printAllChildren(c, depth));
}
var sourceCode = `
var foo = 123;
`.trim();
var sourceFile = ts.createSourceFile('foo.ts', sourceCode, ts.ScriptTarget.ES5, true);
printAllChildren(sourceFile);
output
SourceFile 0 14
---- SyntaxList 0 14
-------- VariableStatement 0 14
------------ VariableDeclarationList 0 13
---------------- VarKeyword 0 3
---------------- SyntaxList 3 13
-------------------- VariableDeclaration 3 13
------------------------ Identifier 3 7
------------------------ FirstAssignment 7 9
------------------------ FirstLiteralToken 9 13
------------ SemicolonToken 13 14
---- EndOfFileToken 14 14
Binder#
The general flow of a plain JavaScript compiler is roughly
SourceCode ~~ Scanner ~~> Tokens ~~ Parser ~~> AST ~~ Emitter ~~> JavaScript
However, this flow is missing a critical step for TS: TypeScript's semantic system. To support the checker's type checking, the binder links the various parts of the source code into a coherent type system that the checker can consume. The binder's main responsibility is to create symbols (Symbols).
The pos and end recorded on each node can be used to determine the uniqueness of scope-related references.
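The binder itself is internal, but the symbols it creates can be inspected through the public API via the type checker. A minimal sketch (the file name symbols.ts and its contents are assumptions for illustration):
import * as ts from "typescript";
import path from "path";
// symbols.ts is a hypothetical file, e.g. containing: const foo = 123; foo;
const fileName = path.join(__dirname, "./symbols.ts");
const program = ts.createProgram([fileName], ts.getDefaultCompilerOptions());
const checker = program.getTypeChecker();
const sourceFile = program.getSourceFile(fileName)!;
// Walk the AST and ask the checker for the symbol the binder attached to each
// identifier; pos/end identify the exact source range the node covers.
function visit(node: ts.Node) {
  if (ts.isIdentifier(node)) {
    const symbol = checker.getSymbolAtLocation(node);
    if (symbol) {
      console.log(node.text, symbol.flags, node.pos, node.end);
    }
  }
  ts.forEachChild(node, visit);
}
visit(sourceFile);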
Checker#
In this stage, the symbols produced by the binder are used together with the AST for type inference, type checking, and so on.
Code example
import * as ts from "typescript";
import path from 'path'
// Create a TypeScript project
const program = ts.createProgram({
rootNames: [path.join(__dirname, './check.ts')], // Paths of all files to check in the project
options: {
...ts.getDefaultCompilerOptions(),
baseUrl: '.'
}, // Compilation options
});
// Get all semantic errors in the project
const diagnostics = ts.getPreEmitDiagnostics(program)
// Print error messages
diagnostics.forEach((diagnostic) => {
console.log(
`Error: ${ts.flattenDiagnosticMessageText(diagnostic.messageText, "\n")}`
);
});
check.ts
const a:string = 1
console.log(a)
const b = ({)
output
Error: Type 'number' is not assignable to type 'string'.
Error: Property assignment expected.
Error: '}' expected.
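Besides collecting diagnostics, the checker can also be queried for the types it infers. A minimal sketch, reusing the program and check.ts from above:
const checker = program.getTypeChecker();
const sourceFile = program.getSourceFile(path.join(__dirname, './check.ts'))!;
// Print the type the checker associates with each variable declaration in check.ts
sourceFile.forEachChild(function walk(node: ts.Node) {
  if (ts.isVariableDeclaration(node)) {
    console.log(
      node.name.getText(sourceFile),
      '->',
      checker.typeToString(checker.getTypeAtLocation(node.name))
    );
  }
  node.forEachChild(walk);
});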
Emitter#
- emitter.ts is the emitter for TS -> JavaScript
- declarationEmitter.ts is the emitter used to create declaration files (.d.ts) for TypeScript source files (.ts)
The emit phase calls the Printer to convert the AST back into text; the name fits, since it literally prints the AST out as text.
import * as ts from 'typescript';
const printer = ts.createPrinter();
// printNode needs a source file for context; an empty dummy file is enough for synthetic nodes
const dummyFile = ts.createSourceFile('dummy.ts', '', ts.ScriptTarget.Latest);
const result = printer.printNode(
  ts.EmitHint.Unspecified,
  makeNode(),
  dummyFile,
);
console.log(result); // const video: string = "conference";
function makeNode() {
  // Build `const video: string = "conference"` entirely from factory nodes
  return ts.factory.createVariableStatement(
    undefined,
    ts.factory.createVariableDeclarationList(
      [
        ts.factory.createVariableDeclaration(
          ts.factory.createIdentifier('video'),
          undefined,
          ts.factory.createKeywordTypeNode(ts.SyntaxKind.StringKeyword),
          ts.factory.createStringLiteral('conference'),
        ),
      ],
      ts.NodeFlags.Const,
    ),
  );
}
Transformers#
The sections above walked through the stages TS goes through when compiling code. TS also exposes lifecycle hooks that let us plug custom transformations into the compilation process (see the sketch after this list):
- before runs the transformer before TypeScript (the code has not been compiled yet)
- after runs the transformer after TypeScript (the code has been compiled)
- afterDeclarations runs the transformer after the declaration step (you can transform type definitions here)
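These three hooks correspond to the fields of the ts.CustomTransformers object passed to program.emit; a minimal sketch, assuming program is a ts.Program created as in the earlier examples and the transformer names are placeholders:
const customTransformers: ts.CustomTransformers = {
  before: [/* myBeforeTransformer */],                  // TS AST, before the built-in transforms run
  after: [/* myAfterTransformer */],                    // JS AST, after the built-in transforms run
  afterDeclarations: [/* myDeclarationTransformer */],  // applied while emitting .d.ts files
};
program.emit(undefined, undefined, undefined, undefined, customTransformers);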
API#
Visiting#
- ts.visitNode(node, visitor) is used to traverse the root node
- ts.visitEachChild(node, visitor, context) is used to traverse child nodes
- ts.isXyz(node) is used to determine the node type, for example, ts.isVariableDeclaration(node)
Nodes#
- ts.factory.createXyz creates a new node (and returns it), e.g. ts.factory.createIdentifier('world')
- ts.factory.updateXyz is used to update an existing node, e.g. ts.factory.updateVariableDeclaration()
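A small hypothetical sketch combining these APIs: a visitor that renames every variable declared as foo to bar (context is the ts.TransformationContext handed to a transformer, as shown in the next section):
const visitor = (node: ts.Node): ts.Node => {
  if (
    ts.isVariableDeclaration(node) &&       // ts.isXyz: narrow the node type
    ts.isIdentifier(node.name) &&
    node.name.text === 'foo'
  ) {
    return ts.factory.updateVariableDeclaration(
      node,
      ts.factory.createIdentifier('bar'),   // ts.factory.createXyz: build the new name node
      node.exclamationToken,
      node.type,
      node.initializer,
    );
  }
  return ts.visitEachChild(node, visitor, context); // recurse into children
};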
Writing a Transformer#
import * as ts from 'typescript';
import path from 'path';
const transformer =
(_program: ts.Program) => (context: ts.TransformationContext) => {
return (sourceFile: ts.Bundle | ts.SourceFile) => {
const visitor = (node: ts.Node) => {
if (ts.isIdentifier(node)) {
switch (node.escapedText) {
case 'babel':
return ts.factory.createStringLiteral('babel-transformer');
case 'typescript':
return ts.factory.createStringLiteral('typescript-transformer');
}
}
return ts.visitEachChild(node, visitor, context);
};
return ts.visitNode(sourceFile, visitor);
};
};
const program = ts.createProgram([path.join(__dirname, './02.ts')], {
baseUrl: '.',
target: ts.ScriptTarget.ESNext,
module: ts.ModuleKind.ESNext,
declaration: true,
declarationMap: true,
jsx: ts.JsxEmit.React,
moduleResolution: ts.ModuleResolutionKind.NodeJs,
skipLibCheck: true,
allowSyntheticDefaultImports: true,
outDir: path.join(__dirname, '../dist/transform'),
});
const res = program.emit(undefined, undefined, undefined, undefined, {
after: [transformer(program)],
});
console.log(res);
More code examples https://github.com/itsdouges/typescript-transformer-handbook/tree/master/example-transformers
Practical Applications#
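Putting the pieces together: the transformer below rewrites aliased import paths (as configured in paths) back into relative paths during emit, so the packaged SDK no longer depends on our alias configuration. replaceAlias normalizes the paths mapping and computes the relative specifier; the transformer factory then visits every import-like node (import/export declarations, require() and dynamic import() calls, import type nodes, and module declarations) and swaps in the rewritten specifier.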
import path from 'path';
import { chain, head, isEmpty } from 'lodash';
import ts from 'typescript';
export function replaceAlias(
fileName: string,
importPath: string,
paths?: Record<string, string[]>
) {
if (isEmpty(paths)) return importPath;
const normalizedPaths = chain(paths)
.mapKeys((_, key) => key.replace(/\*$/, ''))
.mapValues(head)
.omitBy(isEmpty)
.mapValues((resolve) => (resolve as string).replace(/\*$/, ''))
.value();
  // Check longer (more specific) aliases first, so "@/package/*" wins over "@/package"
  const sortedEntries = Object.entries(normalizedPaths).sort(
    ([a], [b]) => b.length - a.length
  );
  for (const [alias, resolveTo] of sortedEntries) {
if (importPath.startsWith(alias)) {
const resolvedPath = importPath.replace(alias, resolveTo);
const relativePath = path.relative(path.dirname(fileName), resolvedPath);
return relativePath.startsWith('.') ? relativePath : `./${relativePath}`;
}
}
return importPath;
}
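For example, with the paths mapping from the Background section, replaceAlias('pages/home/index.ts', '@/package/ui/header', paths) returns '../../ui/header' (this assumes fileName is given relative to baseUrl, i.e. src).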
export default function (_program?: ts.Program | null, _pluginOptions = {}) {
return ((ctx) => {
const { factory } = ctx;
const compilerOptions = ctx.getCompilerOptions();
return (sourceFile: ts.Bundle | ts.SourceFile) => {
const { fileName } = sourceFile.getSourceFile();
function traverseVisitor(node: ts.Node): ts.Node | null {
let importValue: string | null = null;
if (ts.isCallExpression(node)) {
const { expression } = node;
if (node.arguments.length === 0) return null;
const arg = node.arguments[0];
if (!ts.isStringLiteral(arg)) return null;
if (
// Can't call getText on after step
expression.getText(sourceFile as ts.SourceFile) !== 'require' &&
expression.kind !== ts.SyntaxKind.ImportKeyword
)
return null;
importValue = arg.text;
// import, export
} else if (
ts.isImportDeclaration(node) ||
ts.isExportDeclaration(node)
) {
if (
!node.moduleSpecifier ||
!ts.isStringLiteral(node.moduleSpecifier)
)
return null;
importValue = node.moduleSpecifier.text;
} else if (
ts.isImportTypeNode(node) &&
ts.isLiteralTypeNode(node.argument) &&
ts.isStringLiteral(node.argument.literal)
) {
importValue = node.argument.literal.text;
} else if (ts.isModuleDeclaration(node)) {
if (!ts.isStringLiteral(node.name)) return null;
importValue = node.name.text;
} else {
return null;
}
const newImport = replaceAlias(
fileName,
importValue,
compilerOptions.paths
);
if (!newImport || newImport === importValue) return null;
const newSpec = factory.createStringLiteral(newImport);
let newNode: ts.Node | null = null;
if (ts.isImportTypeNode(node))
newNode = factory.updateImportTypeNode(
node,
factory.createLiteralTypeNode(newSpec),
node.assertions,
node.qualifier,
node.typeArguments,
node.isTypeOf
);
if (ts.isImportDeclaration(node))
newNode = factory.updateImportDeclaration(
node,
node.modifiers,
node.importClause,
newSpec,
node.assertClause
);
if (ts.isExportDeclaration(node))
newNode = factory.updateExportDeclaration(
node,
node.modifiers,
node.isTypeOnly,
node.exportClause,
newSpec,
node.assertClause
);
if (ts.isCallExpression(node))
newNode = factory.updateCallExpression(
node,
node.expression,
node.typeArguments,
[newSpec]
);
if (ts.isModuleDeclaration(node))
newNode = factory.updateModuleDeclaration(
node,
node.modifiers,
newSpec,
node.body
);
return newNode;
}
function visitor(node: ts.Node): ts.Node {
return traverseVisitor(node) || ts.visitEachChild(node, visitor, ctx);
}
return ts.visitNode(sourceFile, visitor);
};
}) as ts.TransformerFactory<ts.Bundle | ts.SourceFile>;
}
References#
https://www.youtube.com/watch?v=BU0pzqyF0nw
https://github.com/basarat/typescript-book
https://github.com/itsdouges/typescript-transformer-handbook
https://github.com/LeDDGroup/typescript-transform-paths
https://github.com/nonara/ts-patch
https://github.com/LeDDGroup/typescript-transform-paths/blob/v1.0.0/src/index.ts
https://github.com/microsoft/TypeScript-Compiler-Notes