
Add New Language to ast-grep

Thank you for your interest in adding a new language to ast-grep! We appreciate your contribution to this project. Adding new languages will make the tool more useful and accessible to a wider range of users.

However, there are some requirements and constraints that you need to consider before you start. This guide will help you understand the process and the standards of adding a new language to ast-grep.

Requirements and Constraints

To keep ast-grep lightweight and fast, we have to weigh several factors when adding a new language. As a rule of thumb, we want to keep the binary size of ast-grep under 10MB after zip compression.

  • Popularity of the language. While the popularity of a language does not necessarily reflect its merits, our limited size budget only allows us to support languages that are widely used and have a large user base. Online sources like the TIOBE index or GitHub Octoverse can help you gauge a language's popularity.
  • Quality of the Tree-sitter grammar. ast-grep relies on Tree-sitter, a parser generator tool and a parsing library, to support different languages. The Tree-sitter grammar for the new language should be well-written, up-to-date, and regularly maintained. You can search for Tree-sitter grammars on GitHub or on crates.io.

  • Size of the grammar. The new language's grammar should not be too complicated. Otherwise it may take up too much of the size budget shared with other languages. You can check the current size of ast-grep on the releases page.

  • Availability of the grammar on crates.io. To ease the maintenance burden, we prefer to use grammars that are published on crates.io, Rust's package registry. If your grammar is not on crates.io, you need to publish it yourself or ask the author to do so.


Don't worry if your language is not supported by ast-grep. You can try ast-grep's custom language support and register your own Tree-sitter parser!

If your language satisfies the requirements above, congratulations! Let's see how to add it to ast-grep.

Add to ast-grep Core

ast-grep has several distinct use cases: a CLI tool, an n-api library and a web playground.

Adding a language takes two steps. The first step is to add the language to ast-grep core. The core repository is a multi-crate workspace hosted on GitHub. The relevant crate is language, which defines the supported languages and their tree-sitter grammars.

We will use Ruby as an example to show how to add a new language to ast-grep core. You can see the commit as a reference.

Add Dependencies

  1. Add the tree-sitter-[lang] crate as a dependency in the Cargo.toml of the language crate.
toml
# Cargo.toml
[dependencies]
...
tree-sitter-ruby = {version = "0.20.0", optional = true } 
...

Note the optional attribute is required here.

  2. Add the tree-sitter-[lang] dependency to the builtin-parser feature list.
toml
# Cargo.toml
[features]
builtin-parser = [
  ...
  "tree-sitter-ruby",
  ...
]

The builtin-parser feature is used by the command line tool. The web playground does not use the builtin parser, so the dependency must be optional.

Implement Parser

  1. Add the parser function in parsers.rs, where tree-sitter grammars are imported.
rust
#[cfg(feature = "builtin-parser")]
mod parser_implementation  {
  ...
  pub fn language_ruby() -> TSLanguage { 
    tree_sitter_ruby::language().into()  
  }                                      
  ...
}

#[cfg(not(feature = "builtin-parser"))]
mod parser_implementation  {
  impl_parsers!(
    ...
    language_ruby, 
    ...
  );
}

Note that the function must be added in two places: once under #[cfg(feature = "builtin-parser")] and once under #[cfg(not(feature = "builtin-parser"))].
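To illustrate why both branches need an entry, here is a minimal, self-contained sketch of the pattern. It is not the code from parsers.rs: DemoLanguage is a hypothetical stand-in for TSLanguage, the real grammar call is replaced by a placeholder, and the assumption that the fallback generates panicking stubs is ours, not something this guide states.

```rust
/// Hypothetical stand-in for the real `TSLanguage` type.
#[derive(Debug)]
struct DemoLanguage(&'static str);

// Feature on: the function would wrap the real grammar crate.
// A placeholder value is returned here to keep the sketch self-contained.
#[cfg(feature = "builtin-parser")]
fn language_ruby() -> DemoLanguage {
    DemoLanguage("ruby")
}

// Feature off: generate a stub with the same name and signature, so code
// that refers to `language_ruby` still compiles without the grammar crate.
// Assumption: the stub panics when called.
#[cfg(not(feature = "builtin-parser"))]
macro_rules! impl_parsers {
    ($($name:ident),* $(,)?) => {
        $(fn $name() -> DemoLanguage {
            unimplemented!("parser is unavailable without the builtin-parser feature")
        })*
    };
}

#[cfg(not(feature = "builtin-parser"))]
impl_parsers!(language_ruby);
```

Either way, exactly one language_ruby exists at compile time. If a language is listed in only one branch, the build fails in the other configuration.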

  2. Implement the Language trait by using a macro in lib.rs.
rust
// lib.rs
impl_lang_expando!(Ruby, language_ruby, 'µ'); 

Two macros, impl_lang_expando and impl_lang, generate the methods required by ast-grep's Language trait.

You need to choose one of them for the new language. If the language does not allow $ as a valid identifier character, you need to customize the expando_char, so use impl_lang_expando. Otherwise, use impl_lang.

You can reference the comment here for more information.
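The idea behind the expando character can be sketched as follows. This is an illustrative approximation, not ast-grep's implementation: replace_meta_var_prefix is a hypothetical helper, and the rule for what counts as a meta variable prefix is a simplifying assumption. The point is that a pattern such as $A is rewritten with a character the target grammar accepts inside identifiers before it is handed to the parser.

```rust
/// Replace each `$` that introduces a meta variable with `expando`, so the
/// pattern becomes parseable source text for the target grammar.
/// Hypothetical helper for illustration; not ast-grep's actual function.
fn replace_meta_var_prefix(pattern: &str, expando: char) -> String {
    let mut out = String::with_capacity(pattern.len());
    let mut chars = pattern.chars().peekable();
    while let Some(c) = chars.next() {
        // Assumption: `$` is a meta variable prefix only when followed by an
        // uppercase ASCII letter, an underscore, or another `$` (as in `$$$ARGS`).
        // Any other `$`, including a trailing one, is left untouched.
        let starts_meta_var = c == '$'
            && matches!(
                chars.peek(),
                Some(n) if n.is_ascii_uppercase() || *n == '_' || *n == '$'
            );
        out.push(if starts_meta_var { expando } else { c });
    }
    out
}
```

With the 'µ' used for Ruby above, replace_meta_var_prefix("$A.each { $$$BODY }", 'µ') yields "µA.each { µµµBODY }", while a $ followed by a digit or lowercase letter is kept as written.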

Register the New Language

  1. Add the new language to the SupportLang enum.
rust
// lib.rs
pub enum SupportLang {
  ...
  Ruby, 
  ...
}
  2. Add the new language to execute_lang_method.
rust
// lib.rs
macro_rules! execute_lang_method {
  ($me: path, $method: ident, $($pname:tt),*) => {
    use SupportLang as S;
    match $me {
      ...
      S::Ruby => Ruby.$method($($pname,)*), 
    }
  }
}
  3. Add the new language to all_langs, alias, extension and file_types.

See this commit for the detailed code change.
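The registration steps above can be sketched in a self-contained form. Everything here is hypothetical: DemoLang stands in for SupportLang, the method names only loosely mirror all_langs, alias and extension, and the alias and extension values are illustrative. The linked commit remains the authoritative reference.

```rust
use std::path::Path;

/// Hypothetical stand-in for ast-grep's `SupportLang`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DemoLang {
    Ruby,
    Rust,
}

impl DemoLang {
    /// Every supported language; a new variant must be listed here as well.
    fn all_langs() -> &'static [DemoLang] {
        &[DemoLang::Ruby, DemoLang::Rust]
    }

    /// Names a user may type to select the language (illustrative values).
    fn aliases(self) -> &'static [&'static str] {
        match self {
            DemoLang::Ruby => &["rb", "ruby"],
            DemoLang::Rust => &["rs", "rust"],
        }
    }

    /// File extensions, without the dot, that map to the language
    /// (illustrative values).
    fn extensions(self) -> &'static [&'static str] {
        match self {
            DemoLang::Ruby => &["rb", "gemspec"],
            DemoLang::Rust => &["rs"],
        }
    }

    /// Resolve a user-supplied name, ignoring ASCII case.
    fn from_alias(name: &str) -> Option<DemoLang> {
        Self::all_langs()
            .iter()
            .copied()
            .find(|lang| lang.aliases().iter().any(|a| a.eq_ignore_ascii_case(name)))
    }

    /// Infer the language from a file path's extension.
    fn from_path(path: &Path) -> Option<DemoLang> {
        let ext = path.extension()?.to_str()?;
        Self::all_langs()
            .iter()
            .copied()
            .find(|lang| lang.extensions().iter().any(|e| *e == ext))
    }
}
```

Because both match expressions are exhaustive, the compiler flags every lookup that still needs an arm when a variant is added. The all_langs list is the one place it cannot check, which is why it is easy to forget.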

Find existing languages as reference

The rule of thumb for adding a new language is to find a reference language that is already included in the language crate. Then search for every place the reference language appears and add your new language alongside it.

Add to ast-grep Playground

Adding a new language to the web playground is a little more involved.

The playground has a standalone repository and we need to change code there.

Prepare WASM

  1. Set up Tree-sitter

First, we need to set up the Tree-sitter development tools. You can refer to the Tree-sitter setup section in this link.

  2. Build WASM file

Then, in your parser repository, use this command to build a WASM file.

bash
tree-sitter generate # if grammar is not generated before
tree-sitter build-wasm

Note that you may need to install Docker to build WASM files.

  3. Move the WASM file to the website's public folder.

You can also see other languages' WASM files in the public directory. The file name is in the format of tree-sitter-[lang].wasm. The name will be used later in parserPaths.

Add language in Rust

You need to add the language in wasm_lang.rs. More specifically, you need to add a new enum variant to WasmLang, handle the new variant in execute_lang_method, and extend the FromStr implementation.

rust
// new variant
pub enum WasmLang {
  // ...
  Swift, 
}

// handle variant in macro
macro_rules! execute_lang_method {
  ($me: path, $method: ident, $($pname:tt),*) => {
    use WasmLang as W;
    match $me {
      W::Swift => L::Swift.$method($($pname,)*), 
    }
  }
}

// impl FromStr
impl FromStr for WasmLang {
  // ...
  fn from_str(s: &str) -> Result<Self, Self::Err> {
    Ok(match s {
      "swift" => Swift, 
    })
  }
}
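The three changes fit together as in the self-contained sketch below. It uses hypothetical names throughout (PlaygroundLang, Describe, dispatch!, UnknownLang) rather than the playground's real types, and the explicit error for unknown names is an assumption made for illustration.

```rust
use std::str::FromStr;

/// Hypothetical per-language unit structs, standing in for the language
/// types that the real macro forwards to.
struct Ruby;
struct Swift;

/// Hypothetical trait with one method to dispatch on.
trait Describe {
    fn display_name(&self) -> &'static str;
}
impl Describe for Ruby {
    fn display_name(&self) -> &'static str { "Ruby" }
}
impl Describe for Swift {
    fn display_name(&self) -> &'static str { "Swift" }
}

/// Hypothetical stand-in for `WasmLang`.
#[derive(Debug, PartialEq)]
enum PlaygroundLang {
    Ruby,
    Swift,
}

// Forward a method call to the unit struct matching the variant.
// A new variant needs a new arm here, or the match stops compiling.
macro_rules! dispatch {
    ($me:expr, $method:ident $(, $arg:expr)*) => {
        match $me {
            PlaygroundLang::Ruby => Ruby.$method($($arg),*),
            PlaygroundLang::Swift => Swift.$method($($arg),*),
        }
    };
}

impl PlaygroundLang {
    fn display_name(&self) -> &'static str {
        dispatch!(self, display_name)
    }
}

/// Error returned for names that match no variant (an assumption here).
#[derive(Debug, PartialEq)]
struct UnknownLang(String);

impl FromStr for PlaygroundLang {
    type Err = UnknownLang;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "ruby" => Ok(PlaygroundLang::Ruby),
            "swift" => Ok(PlaygroundLang::Swift),
            other => Err(UnknownLang(other.to_string())),
        }
    }
}
```

The string accepted by from_str is what the front end sends, so it should match the key used on the TypeScript side. Unlike the dispatch match, the string match has a catch-all arm, so a missing entry only shows up at runtime.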

Add language in TypeScript

Finally, you need to add the language in TypeScript to make it available in the playground. The file is lang.ts, and two changes are needed.

typescript
// Add language parserPaths
const parserPaths = {
  // ...
  swift: 'tree-sitter-swift.wasm', 
}

// Add language display name
export const languageDisplayNames: Record<SupportedLang, string> = {
  // ...
  swift: 'Swift',
}

You can use the commit that added Swift support as a reference.
