JavaScript API

Powered by napi.rs, ast-grep's JavaScript API enables you to write JavaScript to programmatically inspect and change syntax trees.

ast-grep's JavaScript API design is pretty stable now. No major breaking changes are expected in the future.

To try out the JavaScript API, you can use the code sandbox here.

Installation

First, install ast-grep's napi package.

bash
npm install --save @ast-grep/napi
bash
pnpm add @ast-grep/napi

Now let's explore ast-grep's API!

Core Concepts

The core concepts in ast-grep's JavaScript API are:

  • SgRoot: a class representing the whole syntax tree
  • SgNode: a node in the syntax tree

Make AST like a DOM tree!

Using ast-grep's API is like using jQuery. You can use SgNode to traverse the syntax tree and collect information from the nodes.

Remember the old days of web programming?

A common workflow to use ast-grep's JavaScript API is:

  1. Get a syntax tree object SgRoot from a string by calling a language's parse method
  2. Get the root node of the syntax tree by calling ast.root()
  3. Find relevant nodes by using patterns or rules
  4. Collect information from the nodes

Example:

js
import { js } from '@ast-grep/napi';

let source = `console.log("hello world")`
const ast = js.parse(source)                // 1. parse the source
const root = ast.root()                     // 2. get the root
const node = root.find('console.log($A)')   // 3. find the node
node.getMatch('A').text()                   // 4. collect the info
// "hello world"

SgRoot

SgRoot represents the syntax tree of a source string.

We can import a language object from the @ast-grep/napi package and call its parse method to turn a source string into an SgRoot.

js
import { js } from '@ast-grep/napi';

const source = `console.log("hello world")`
const ast = js.parse(source)

The SgRoot object has a root method that returns the root SgNode of the AST.

js
const root = ast.root() // root is an instance of SgNode

SgNode

SgNode is the main interface to view and manipulate the syntax tree.

It has several jQuery-like methods for searching, filtering and inspecting the AST nodes we are interested in.

js
const log = root.find('console.log($A)') // search node
const arg = log.getMatch('A') // get matched variable
arg.text() // "hello world"

Let's see its details in the following sections!

Search

You can use find and findAll to search for nodes in the syntax tree.

  • find returns the first node that matches the pattern or rule.
  • findAll returns an array of nodes that match the pattern or rule.

ts
// search
class SgNode {
  find(matcher: string): SgNode | null
  find(matcher: number): SgNode | null
  find(matcher: NapiConfig): SgNode | null
  findAll(matcher: string): Array<SgNode>
  findAll(matcher: number): Array<SgNode>
  findAll(matcher: NapiConfig): Array<SgNode>
}

Both find and findAll are overloaded. They accept a string, a number or a config object. The argument is called a Matcher in ast-grep's JavaScript API.

Matcher

A Matcher can be one of the three types: string, number or object.

  • string is parsed as a pattern. e.g. 'console.log($A)'

  • number is interpreted as the node's kind. In tree-sitter, an AST node's type is represented by a number called kind id. Different syntax nodes have different kind ids. You can convert a kind name like function to its numeric representation by calling the kind function on the language object, e.g. js.kind('function').

  • object is interpreted as a NapiConfig, a config object that describes the rule to match, e.g. { rule: { kind: 'member_expression' } } as in the findInFiles example below.

ts
// basic find example
root.find('console.log($A)')   // returns SgNode of call_expression
const kind = js.kind('string') // convert kind name to kind id number
root.find(kind)                // returns SgNode of string
root.find('notExist')          // returns null if not found

// basic find all example
const nodes = root.findAll('function $A($$$) {$$$}')
Array.isArray(nodes)     // true, findAll returns an array of SgNode
nodes.map(n => n.text()) // string array of function source
const empty = root.findAll('notExist') // returns []
empty.length === 0 // true

Note, find returns null if no node is found. findAll returns an empty array if nothing matches.

Match

Once we find a node, we can use the following methods to get meta variables from the search.

The getMatch method returns the single node bound to a single meta variable such as $A.

The getMultipleMatches method returns an array of nodes bound to a multi meta variable such as $$$ARGS.

ts
// match
export class SgNode {
  getMatch(m: string): SgNode | null
  getMultipleMatches(m: string): Array<SgNode>
}

Example:

ts
const src = `
console.log('hello')
logger('hello', 'world', '!')
`
const root = js.parse(src).root()
const node = root.find('console.log($A)')
const arg = node.getMatch("A") // returns SgNode('hello')
arg !== null // true, node is found
arg.text() // returns 'hello'
// returns [] because $A and $$$A are different
node.getMultipleMatches('A')

const logs = root.find('logger($$$ARGS)')
// returns [SgNode('hello'), SgNode('world'), SgNode('!')]
logs.getMultipleMatches("ARGS")
logs.getMatch("A") // returns null
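
Since find and getMatch can both return null, a call chain such as node.getMatch('A').text() throws when nothing matches. The following is a minimal sketch of a null-safe helper. The names matchText and stubRoot are hypothetical and introduced only for illustration; the stub merely imitates the documented return types so the snippet runs without the native package.

```javascript
// Return the text of meta variable `name` for the first match of `pattern`,
// or null when either the pattern or the meta variable is not found.
function matchText(root, pattern, name) {
  return root.find(pattern)?.getMatch(name)?.text() ?? null
}

// Hypothetical stub mimicking the return types of find and getMatch.
// With @ast-grep/napi installed, pass js.parse(source).root() instead.
const stubRoot = {
  find: pattern =>
    pattern === 'console.log($A)'
      ? { getMatch: name => (name === 'A' ? { text: () => "'hello'" } : null) }
      : null,
}

console.log(matchText(stubRoot, 'console.log($A)', 'A')) // 'hello'
console.log(matchText(stubRoot, 'console.log($A)', 'B')) // null
console.log(matchText(stubRoot, 'notExist', 'A'))        // null
```

Optional chaining keeps the happy path on one line while still returning a plain null for both failure cases.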

Inspection

The following methods are used to inspect the node.

ts
// node inspection
export class SgNode {
  range(): Range
  isLeaf(): boolean
  kind(): string
  text(): string
}

Example:

ts
const ast = js.parse("console.log('hello world')")
const root = ast.root()
root.text() // will return "console.log('hello world')"

Another important method is range, which returns a Range object holding two Pos objects, start and end, that mark where the node begins and ends.

Each Pos contains the line, column, and index of that position. All of them are 0-indexed.

You can use the range information to locate the source and modify the source code.

ts
const rng = node.range()
const pos = rng.start // or rng.end, both are `Pos` objects
pos.line // 0, line starts with 0
pos.column // 0, column starts with 0
rng.end.index // 17, index starts with 0
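
As a sketch of how range information can drive a source modification, the snippet below splices replacement text into a string using start.index and end.index. The helper applyEdits and the hand-written argRange are hypothetical; argRange stands in for the value of node.range(). The sketch also assumes that index can be used directly as a JavaScript string offset, which is worth verifying for sources that contain non-ASCII characters.

```javascript
// Apply replacements to `source`, given edits shaped like
// { range: { start: { index }, end: { index } }, text }.
// Edits are applied from the end of the source backwards so that
// the indices of earlier edits stay valid after each splice.
function applyEdits(source, edits) {
  const sorted = [...edits].sort(
    (a, b) => b.range.start.index - a.range.start.index
  )
  let result = source
  for (const { range, text } of sorted) {
    result =
      result.slice(0, range.start.index) + text + result.slice(range.end.index)
  }
  return result
}

// Hand-written range standing in for `node.range()`.
// It covers the argument 'hello' (quotes included) in the source below.
const source = "console.log('hello')"
const argRange = {
  start: { line: 0, column: 12, index: 12 },
  end: { line: 0, column: 19, index: 19 },
}

const updated = applyEdits(source, [{ range: argRange, text: "'goodbye'" }])
console.log(updated) // console.log('goodbye')
```

Sorting the edits in descending order avoids recomputing offsets when several non-overlapping nodes are rewritten in one pass.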

Refinement

You can also filter nodes after matching by using the following methods.

This is called "refinement" in the documentation. Note that these refinement methods only support patterns at the moment.

ts
export class SgNode {
  matches(m: string): boolean
  inside(m: string): boolean
  has(m: string): boolean
  precedes(m: string): boolean
  follows(m: string): boolean
}

Example:

ts
const node = root.find('console.log($A)')
node.matches('console.$METHOD($B)') // true

Traversal

You can traverse the tree using the following methods, like using jQuery.

ts
export class SgNode {
  children(): Array<SgNode>
  field(name: string): SgNode | null
  parent(): SgNode | null
  child(nth: number): SgNode | null
  ancestors(): Array<SgNode>
  next(): SgNode | null
  nextAll(): Array<SgNode>
  prev(): SgNode | null
  prevAll(): Array<SgNode>
}
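
To illustrate traversal, here is a minimal sketch of a depth-first walk that relies only on kind() and children() from the signatures above. The helper collectKinds and the hand-built stub tree are hypothetical, and the stub is a simplified shape rather than real parser output; it exists so the snippet runs without the native package.

```javascript
// Recursively collect the kind of every node in a subtree, depth-first.
// Works with any object exposing kind() and children().
function collectKinds(node, out = []) {
  out.push(node.kind())
  for (const child of node.children()) {
    collectKinds(child, out)
  }
  return out
}

// Hypothetical hand-built stub node.
// With @ast-grep/napi installed, pass js.parse(source).root() instead.
const stub = (kind, children = []) => ({
  kind: () => kind,
  children: () => children,
})

const tree = stub('program', [
  stub('expression_statement', [
    stub('call_expression', [stub('member_expression'), stub('arguments')]),
  ]),
])

console.log(collectKinds(tree))
```

The same recursion can be adapted to collect text() or range() values, or to stop early at nodes where isLeaf() returns true.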

findInFiles

If you have a lot of files to parse and want to maximize your program's performance, ast-grep's language object provides a findInFiles function that parses multiple files and searches for relevant nodes in parallel Rust threads.

The APIs shown above all parse code in Rust and pass the SgRoot back to JavaScript. This incurs foreign function communication overhead and uses only the single main JavaScript thread. By avoiding Rust-JS communication overhead and using multiple cores, findInFiles is much faster than finding files in JavaScript and then passing their contents to Rust as strings.

The function signature of findInFiles is as follows:

ts
export function findInFiles(
  /** specify the file path and matcher */
  config: FindConfig,
  /** callback function for found nodes in a file */
  callback: (err: null | Error, result: SgNode[]) => void
): Promise<number>

findInFiles accepts a FindConfig object and a callback function.

FindConfig specifies both which file paths to parse and which nodes to search for.

findInFiles parses all files matching paths and, for each file, calls the callback with the nodes in that file that match the matcher.

FindConfig

The FindConfig object specifies which paths to search and which rule to match nodes against.

The FindConfig object has the following type:

ts
export interface FindConfig {
  paths: Array<string>
  matcher: NapiConfig
}

The paths field is an array of strings. You can specify multiple paths to search. Every path in the array can be a file path or a directory path. For a directory path, ast-grep will recursively find all files matching the language.

The matcher field is a NapiConfig, the same config object described in the Matcher section above.

Callback Function and Termination

The callback function is called for every file that has nodes matching the rule. It is a standard Node-style callback: the first argument is an Error (or null) and the second argument is an array of SgNode objects that match the rule.

The return value of findInFiles is a Promise object. The promise resolves to the number of files that have nodes that match the rule.

DANGER

findInFiles can return before all file callbacks are called due to NodeJS limitation. See https://github.com/ast-grep/ast-grep/issues/206.

If you have a lot of files and findInFiles returns prematurely, you can use the file count returned by findInFiles as a checkpoint. Maintain a counter outside of findInFiles and increment it in the callback. When the counter equals the total, all files have been processed. The following code is an example.

ts
type Callback = (t: any, cb: any) => Promise<number>
function countedPromise<F extends Callback>(func: F) {
  type P = Parameters<F>
  return async (t: P[0], cb: P[1]) => {
    let i = 0
    let fileCount: number | undefined = undefined
    // resolve will be called after all files are processed
    let resolve = () => {}
    function wrapped(...args: any[]) {
      let ret = cb(...args)
      if (++i === fileCount) resolve()
      return ret
    }
    fileCount = await func(t, wrapped as P[1])
    // not all files are processed, await `resolve` to be called
    if (fileCount > i) {
      await new Promise<void>(r => resolve = r)
    }
    return fileCount
  }
}
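
The wrapper's behavior can be checked without touching the file system. The sketch below is a plain JavaScript transcription of countedPromise, exercised against fakeFindInFiles, a hypothetical stand-in that imitates the early return by resolving its promise before any callback fires. It is not the real findInFiles.

```javascript
// JavaScript transcription of the countedPromise wrapper.
function countedPromise(func) {
  return async (config, cb) => {
    let seen = 0
    let fileCount
    // resolve is replaced below if callbacks are still pending
    let resolve = () => {}
    const wrapped = (...args) => {
      const ret = cb(...args)
      if (++seen === fileCount) resolve()
      return ret
    }
    fileCount = await func(config, wrapped)
    // not all files are processed yet, wait for the last callback
    if (fileCount > seen) {
      await new Promise(r => { resolve = r })
    }
    return fileCount
  }
}

// Hypothetical stand-in for findInFiles: resolves with the file count first,
// then fires one callback per path on later timer ticks.
function fakeFindInFiles(config, callback) {
  config.paths.forEach((path, i) => {
    setTimeout(() => callback(null, [path]), 10 * (i + 1))
  })
  return Promise.resolve(config.paths.length)
}

async function main() {
  const processed = []
  const find = countedPromise(fakeFindInFiles)
  const total = await find({ paths: ['a.js', 'b.js', 'c.js'] }, (err, nodes) => {
    processed.push(...nodes)
  })
  return { total, processed }
}

main().then(({ total, processed }) => {
  console.log(total, processed.length) // 3 3
})
```

With this stand-in, awaiting fakeFindInFiles directly would leave processed empty at the moment the promise resolves. The wrapped version returns only after all three callbacks have run.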

Example

Example of using findInFiles

ts
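// `t` is assumed to be a test context, e.g. from a runner such as ava;
// replace the assertions with your own logic
let fileCount = await js.findInFiles({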
  paths: ['relative/path/to/code'],
  matcher: {
    rule: {kind: 'member_expression'}
  },
}, (err, n) => {
  t.is(err, null)
  t.assert(n.length > 0)
  t.assert(n[0].text().includes('.'))
})
