How VS Code works: text storage

I'm looking to make my own note-taking application, as many software developers try to do. Since I'm going for a web-based text editor, I wanted to investigate how this gets done. This article mostly digs into VS Code, and will be expanded on in future articles.

There seem to be two main techniques for making a browser-based text editor:

Use contenteditable.
Use a proxy textarea element.

Editors such as Quill and Atlassian's Confluence (based on TinyMCE) go the contenteditable route. This is an HTML attribute you can put on elements to make them editable, implemented by the browser itself.

Visual Studio Code, however, has no contenteditable. Instead, it recreates editing features from scratch, using a hidden textarea to capture input. This method seems more interesting and allows more flexibility, so let's explore how that works.
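To make the proxy idea concrete before diving into the real code, here is a minimal sketch. Everything in it is hypothetical (InputSource, TinyEditor, attachProxyInput and FakeTextArea are names I made up, not VS Code's): the input element only captures what was typed, the editor's own model holds the text, and the input is cleared after each event. A fake element stands in for the textarea so the sketch runs outside a browser.

```typescript
// Hypothetical minimal shape of the hidden input element (not VS Code's API).
interface InputSource {
  value: string;
  addEventListener(type: 'input', listener: () => void): void;
}

// Hypothetical editor model: the real text lives here, not in the textarea.
class TinyEditor {
  private text = '';
  type(chunk: string): void {
    this.text += chunk;
  }
  getText(): string {
    return this.text;
  }
}

// Forward whatever lands in the hidden input to the editor, then clear the
// input so the next event only contains newly typed text.
function attachProxyInput(source: InputSource, editor: TinyEditor): void {
  source.addEventListener('input', () => {
    editor.type(source.value);
    source.value = '';
  });
}

// Fake input element so the sketch runs without a DOM.
class FakeTextArea implements InputSource {
  value = '';
  private listeners: Array<() => void> = [];
  addEventListener(_type: 'input', listener: () => void): void {
    this.listeners.push(listener);
  }
  // Simulate the user typing: set the value, then fire the 'input' listeners.
  simulateTyping(chunk: string): void {
    this.value = chunk;
    for (const listener of this.listeners) {
      listener();
    }
  }
}

const proxyEditor = new TinyEditor();
const fakeArea = new FakeTextArea();
attachProxyInput(fakeArea, proxyEditor);
fakeArea.simulateTyping('h');
fakeArea.simulateTyping('i');
console.log(proxyEditor.getText()); // prints "hi"
```

The point of the design is that the browser element is only an input device. Rendering, cursors and selection are all the editor's job, which is where the flexibility comes from.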

The view

Within VS Code's codebase you can find the View class. This appears to be the heart of the text editor. There's a massive amount of class hierarchy going on, but it looks like View contains almost everything, including the actual text, the scrollbar, the cursors, and the text area handler, which is the hidden textarea.

Chasing the edit

We're now going to start from this view, and chase the code all the way to the actual editing of text somewhere in memory. It's a long journey and not much fun, but it's here anyway. Feel free to skip this whole section unless you're really interested.

The TextAreaHandler is the class that actually creates the textarea element, wrapping it in a TextAreaWrapper. This wrapper is where the various event handlers get hooked up:

public readonly onKeyDown = this._register(new DomEmitter(this._actual, 'keydown')).event;
public readonly onKeyPress = this._register(new DomEmitter(this._actual, 'keypress')).event;
public readonly onKeyUp = this._register(new DomEmitter(this._actual, 'keyup')).event;
// ...
public readonly onInput = <Event<InputEvent>>this._register(new DomEmitter(this._actual, 'input')).event;

These bubble up to call various methods on the view controller, like .type(), which in turn call methods on an ICommandDelegate which looks like this:

export interface ICommandDelegate {
	paste(text: string, pasteOnNewLine: boolean, multicursorText: string[] | null, mode: string | null): void;
	type(text: string): void;
	compositionType(text: string, replacePrevCharCnt: number, replaceNextCharCnt: number, positionDelta: number): void;
	startComposition(): void;
	endComposition(): void;
	cut(): void;
}

You can find an implementation of this in the CodeEditorWidget class. It calls its own _type method, which calls this._modelData.viewModel.type(). A likely candidate for the viewModel variable is this ViewModel. This is its type method:

public type(text: string, source?: string | null | undefined): void {
  this._executeCursorEdit(eventsCollector => this._cursor.type(eventsCollector, text, source));
}

The _cursor is a CursorsController, with the following type method:

public type(eventsCollector: ViewModelEventsCollector, text: string, source?: string | null | undefined): void {
  this._executeEdit(() => {
    if (source === 'keyboard') {
      // If this event is coming straight from the keyboard, look for electric characters and enter

      const len = text.length;
      let offset = 0;
      while (offset < len) {
        const charLength = strings.nextCharLength(text, offset);
        const chr = text.substr(offset, charLength);

        // Here we must interpret each typed character individually
        this._executeEditOperation(TypeOperations.typeWithInterceptors(!!this._compositionState, this._prevEditOperationType, this.context.cursorConfig, this._model, this.getSelections(), this.getAutoClosedCharacters(), chr));

        offset += charLength;
      }

    } else {
      this._executeEditOperation(TypeOperations.typeWithoutInterceptors(this._prevEditOperationType, this.context.cursorConfig, this._model, this.getSelections(), text));
    }
  }, eventsCollector, source);
}

A few things to notice here. The entire body of this function is a callback passed to this._executeEdit. Inside that callback we call this._executeEditOperation with the result of a typeWith*Interceptors call. That call returns an EditOperationResult object containing commands, which _executeEditOperation then executes using CommandExecutor.executeCommands().

After much chasing, this calls an ITextBuffer::applyEdits function. One implementation of this is PieceTreeTextBuffer. You can see the insertion of text finally here, which is just a call to PieceTreeBase::insert.
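The shape described above (decide what an edit should be, return it as data, then execute it separately) can be sketched in a few lines. This is a simplified illustration with hypothetical names (EditCommand, InsertAt, OperationResult, typeWithAutoClose, executeCommands), not VS Code's actual types. Auto-closing brackets stands in here for the kind of work an interceptor might do.

```typescript
// All names below are hypothetical simplifications, not VS Code's types.
interface EditCommand {
  apply(text: string): string;
}

// A command that inserts a string at a fixed offset.
class InsertAt implements EditCommand {
  constructor(private readonly offset: number, private readonly inserted: string) {}
  apply(text: string): string {
    return text.slice(0, this.offset) + this.inserted + text.slice(this.offset);
  }
}

// The result of deciding what a keystroke means: a list of commands to run.
interface OperationResult {
  commands: EditCommand[];
}

const CLOSERS = new Map<string, string>([
  ['(', ')'],
  ['[', ']'],
  ['{', '}'],
]);

// Step 1: interpret the typed character. Nothing is modified here; the
// function only describes the edit. Opening brackets get a closing partner.
function typeWithAutoClose(cursor: number, chr: string): OperationResult {
  const closer = CLOSERS.get(chr);
  const inserted = closer === undefined ? chr : chr + closer;
  return { commands: [new InsertAt(cursor, inserted)] };
}

// Step 2: a separate executor applies the commands in order.
function executeCommands(text: string, result: OperationResult): string {
  return result.commands.reduce((acc, cmd) => cmd.apply(acc), text);
}

const typedResult = executeCommands('foo', typeWithAutoClose(3, '('));
console.log(typedResult); // prints "foo()"
```

Separating the decision from the execution means the executor can treat every kind of edit the same way, whatever produced it.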

Not all trees are green

Chasing the edit, we found that the text you see is stored in a PieceTreeTextBuffer, which is an implementation of ITextBuffer. You can see all the code in the pieceTreeTextBuffer directory.

This is an implementation of a Red-Black tree, where the nodes of the tree are pieces of text. Red-Black trees are ordered, so an in-order traversal of the tree produces the text of the editor. This data structure allows efficient random edits, which is exactly the kind of operation you do when writing code.

The applyEdits method on the text buffer is also responsible for producing the reverse edits that would be used to undo. It also fires an onDidChangeContent event, which may be useful to know later.
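To get a feel for storing a document as ordered pieces, here is a deliberately simplified sketch. It keeps the pieces in a flat array rather than a Red-Black tree, so finding the insertion point is linear rather than logarithmic, and it assumes (my assumption, not something shown in the code above) that each piece points into one of two backing strings. The names PieceTable, Piece and BufferKind are hypothetical.

```typescript
// Which backing string a piece points into (an assumption of this sketch).
type BufferKind = 'original' | 'added';

interface Piece {
  buffer: BufferKind;
  start: number;  // offset into the backing string
  length: number; // number of UTF-16 code units
}

class PieceTable {
  private readonly original: string;
  private added = ''; // append-only: inserted text is never moved
  private pieces: Piece[];
  private size: number;

  constructor(text: string) {
    this.original = text;
    this.size = text.length;
    this.pieces = text.length > 0
      ? [{ buffer: 'original', start: 0, length: text.length }]
      : [];
  }

  insert(offset: number, text: string): void {
    if (offset < 0 || offset > this.size) {
      throw new RangeError('offset out of range');
    }
    if (text.length === 0) {
      return;
    }
    const newPiece: Piece = { buffer: 'added', start: this.added.length, length: text.length };
    this.added += text;

    const next: Piece[] = [];
    let pos = 0; // document offset where the current piece begins
    let placed = false;
    for (const p of this.pieces) {
      if (!placed && offset <= pos + p.length) {
        // Split the piece around the insertion point, dropping empty halves.
        const left = offset - pos;
        if (left > 0) {
          next.push({ buffer: p.buffer, start: p.start, length: left });
        }
        next.push(newPiece);
        if (left < p.length) {
          next.push({ buffer: p.buffer, start: p.start + left, length: p.length - left });
        }
        placed = true;
      } else {
        next.push(p);
      }
      pos += p.length;
    }
    if (!placed) {
      next.push(newPiece); // only reached when the document was empty
    }
    this.pieces = next;
    this.size += text.length;
  }

  // Walking the pieces in order reproduces the document.
  getText(): string {
    return this.pieces
      .map(p => (p.buffer === 'original' ? this.original : this.added).slice(p.start, p.start + p.length))
      .join('');
  }
}

const pieceTable = new PieceTable('hello world');
pieceTable.insert(5, ',');
pieceTable.insert(12, '!');
pieceTable.insert(0, '> ');
console.log(pieceTable.getText()); // prints "> hello, world!"
```

Notice that an insert never copies the existing text. It only appends to the added string and adjusts a few small piece records, which is why edits in the middle of a large file stay cheap. A balanced tree keeps the lookup step fast as the number of pieces grows.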
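As a rough illustration of an edit producing its own reverse, here is a sketch that handles a single edit on a plain string. The real applyEdits deals with batches of edits on the piece tree, so treat this only as the idea in miniature. SingleEdit and UndoableBuffer are hypothetical names, and the listener signature is my own assumption. Only the event name onDidChangeContent comes from the source.

```typescript
// A single replace-style edit: delete deleteCount units at offset, insert text.
interface SingleEdit {
  offset: number;
  deleteCount: number;
  text: string;
}

type ContentListener = (newText: string) => void;

class UndoableBuffer {
  private text: string;
  private readonly listeners: ContentListener[] = [];

  constructor(initial: string) {
    this.text = initial;
  }

  // Hypothetical signature; only the event name is borrowed from VS Code.
  onDidChangeContent(listener: ContentListener): void {
    this.listeners.push(listener);
  }

  // Apply the edit and return the edit that would undo it.
  applyEdit(edit: SingleEdit): SingleEdit {
    const end = edit.offset + edit.deleteCount;
    const removed = this.text.slice(edit.offset, end);
    this.text = this.text.slice(0, edit.offset) + edit.text + this.text.slice(end);
    for (const listener of this.listeners) {
      listener(this.text);
    }
    // The reverse deletes what was just inserted and restores what was removed.
    return { offset: edit.offset, deleteCount: edit.text.length, text: removed };
  }

  getText(): string {
    return this.text;
  }
}

const textBuffer = new UndoableBuffer('hello world');
const seenTexts: string[] = [];
textBuffer.onDidChangeContent(t => { seenTexts.push(t); });

const reverseEdit = textBuffer.applyEdit({ offset: 6, deleteCount: 5, text: 'there' });
const afterEdit = textBuffer.getText();
textBuffer.applyEdit(reverseEdit); // undo
const afterUndo = textBuffer.getText();
console.log(afterEdit + ' | ' + afterUndo); // prints "hello there | hello world"
```

Computing the reverse at the moment of the edit is convenient because that is the only time the removed text is still at hand.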

Rendering

We've seen where VS Code stores its text now. But how does it render it? Starting at the lowest part, we can find the ILine interface. ILine is extended by IVisibleLine, which adds getting and setting an actual DOM node, as well as renderLine and layoutLine methods. This is implemented by ViewLine.

Here's the renderLine method signature:

public renderLine(
    lineNumber: number, 
    deltaTop: number, 
    viewportData: ViewportData, 
    sb: StringBuilder): boolean

You can see it is passed a StringBuilder, which is important later. StringBuilder's underlying storage is a Uint16Array, likely to match JavaScript's UTF-16 encoding for strings. It is probably stored this way, rather than as a string, for better control over allocation and mutation. During the renderLine method, a RenderLineInput is created, taking many arguments:

const renderLineInput = new RenderLineInput(
  // ...
  lineData.content,
  // ... 
);

...and we finally get to actual rendering of the line:

sb.appendString('<div style="top:');
sb.appendString(String(deltaTop));
sb.appendString('px;height:');
sb.appendString(String(this._options.lineHeight));
sb.appendString('px;" class="');
sb.appendString(ViewLine.CLASS_NAME);
sb.appendString('">');

const output = renderViewLine(renderLineInput, sb);

sb.appendString('</div>');

Literally writing out HTML as strings! I found this surprising, as I expected direct DOM manipulation. If you follow renderViewLine, there are two major functions doing a lot of work:

resolveRenderLineInput which breaks a line up into 'parts'.
_renderLine which writes HTML to the string builder.

I will likely dig into these in a future article.
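In the meantime, here is a small sketch of the string-building approach from the rendering section: a builder backed by a Uint16Array, plus a function that writes one line's HTML into it. This is my own simplified reconstruction, not VS Code's StringBuilder. TinyStringBuilder, renderPlainLine, escapeHtml and the "line" class name are all hypothetical, and the real renderer does far more than escape the content.

```typescript
// Hypothetical builder that stores UTF-16 code units in a growable Uint16Array.
class TinyStringBuilder {
  private buffer: Uint16Array;
  private length = 0;

  constructor(capacity: number) {
    this.buffer = new Uint16Array(Math.max(1, capacity));
  }

  appendString(str: string): void {
    this.ensureCapacity(this.length + str.length);
    for (let i = 0; i < str.length; i++) {
      this.buffer[this.length++] = str.charCodeAt(i);
    }
  }

  // Double the storage until the requested number of code units fits.
  private ensureCapacity(needed: number): void {
    if (needed <= this.buffer.length) {
      return;
    }
    let capacity = this.buffer.length;
    while (capacity < needed) {
      capacity *= 2;
    }
    const grown = new Uint16Array(capacity);
    grown.set(this.buffer.subarray(0, this.length));
    this.buffer = grown;
  }

  build(): string {
    const parts: string[] = [];
    for (let i = 0; i < this.length; i++) {
      parts.push(String.fromCharCode(this.buffer[i]));
    }
    return parts.join('');
  }
}

// Escape the characters that would otherwise be read as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Write one absolutely positioned line as an HTML string.
function renderPlainLine(sb: TinyStringBuilder, deltaTop: number, lineHeight: number, content: string): void {
  sb.appendString('<div style="top:');
  sb.appendString(String(deltaTop));
  sb.appendString('px;height:');
  sb.appendString(String(lineHeight));
  sb.appendString('px;" class="line">'); // placeholder class name
  sb.appendString(escapeHtml(content));
  sb.appendString('</div>');
}

const lineBuilder = new TinyStringBuilder(4); // tiny capacity to exercise growth
renderPlainLine(lineBuilder, 0, 18, 'a < b');
const lineHtml = lineBuilder.build();
console.log(lineHtml); // prints <div style="top:0px;height:18px;" class="line">a &lt; b</div>
```

One plausible benefit of building a single string for many lines is that it can be handed to the browser in one go, instead of creating and attaching nodes one at a time.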