Pattern Syntax
In this guide we will walk through ast-grep's pattern syntax. The examples are written in JavaScript, but the basic principles apply to other languages as well.
Pattern Matching
ast-grep parses the pattern code into an AST and matches it against the target code. The pattern is searched for throughout the full syntax tree, so a pattern can also match nested expressions. For example, the pattern a + 1 can match all of the following code.
const b = a + 1
funcCall(a + 1)
deeplyNested({
  target: a + 1
})
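The idea of matching a pattern at any depth can be sketched with a toy model. This is only an illustration: plain objects stand in for syntax tree nodes, and the helper names aPlusOne, sameTree and findAll are hypothetical, not part of ast-grep, which works on tree-sitter trees.

```javascript
// Toy model only: plain objects stand in for syntax tree nodes.
// ast-grep works on tree-sitter trees; nothing below is its API.
const aPlusOne = () => ({
  type: 'binary', op: '+',
  left: { type: 'id', name: 'a' },
  right: { type: 'num', value: 1 },
});

// Structural equality between two plain values.
function sameTree(x, y) {
  const isObj = (v) => v !== null && typeof v === 'object';
  if (!isObj(x) || !isObj(y)) return x === y;
  const keys = Object.keys(x);
  return keys.length === Object.keys(y).length &&
    keys.every((k) => k in y && sameTree(x[k], y[k]));
}

// Collect every subtree equal to the pattern, at any depth.
function findAll(node, pattern, found = []) {
  if (node === null || typeof node !== 'object') return found;
  if (sameTree(node, pattern)) found.push(node);
  for (const child of Object.values(node)) findAll(child, pattern, found);
  return found;
}

// Hand-built stand-ins for the three snippets above.
const program = [
  { type: 'const', name: 'b', init: aPlusOne() },
  { type: 'call', callee: 'funcCall', args: [aPlusOne()] },
  { type: 'call', callee: 'deeplyNested',
    args: [{ type: 'object', props: [{ key: 'target', value: aPlusOne() }] }] },
];

console.log(findAll(program, aPlusOne()).length); // 3
```

The search visits every node rather than only the top-level statements, which is why the occurrence inside the object literal is found as well.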
WARNING
Pattern code must be valid code that tree-sitter can parse.
The ast-grep playground is a useful tool to confirm that a pattern is parsed correctly.
If ast-grep fails to parse code as expected, you can try giving it more context by using an object-style pattern.
Meta Variable
It is usually desirable to write a pattern that matches dynamic content.
We can use meta variables to match sub-expressions in a pattern.
Meta variables start with the $ sign, followed by a name composed of upper case letters A-Z, underscore _ or digits 1-9. $META_VARIABLE is a wildcard expression that can match any single AST node.
Think of it as the regex dot ., except that it is not textual.
Valid meta variables: $META, $META_VAR, $META_VAR1, $_, $_123
Invalid meta variables: $invalid, $Svalue, $123, $KEBAB-CASE, $
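The naming rule can be approximated with a regular expression. This is a minimal sketch, not ast-grep's own check: isValidMetaVar is a hypothetical helper, the requirement that a name cannot start with a digit is inferred from $123 being listed as invalid, and allowing the digit 0 is an assumption beyond the "1-9" stated above.

```javascript
// Hypothetical helper, not part of ast-grep.
// Assumptions: the first character after `$` is A-Z or `_` (inferred from
// `$123` being invalid), and any digit 0-9 may appear after that.
const META_VAR = /^\$[A-Z_][A-Z0-9_]*$/;

function isValidMetaVar(text) {
  return META_VAR.test(text);
}

const valid = ['$META', '$META_VAR', '$META_VAR1', '$_', '$_123'];
const invalid = ['$invalid', '$Svalue', '$123', '$KEBAB-CASE', '$'];

console.log(valid.every(isValidMetaVar));  // true
console.log(invalid.some(isValidMetaVar)); // false
```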
The pattern console.log($GREETING) will match all of the following.
function tryAstGrep() {
  console.log('Hello World')
}
const multiLineExpression =
  console
    .log('Also matched!')
But it will not match these.
// console.log(123) in comment is not matched
'console.log(123) in string' // is not matched as well
console.log() // mismatch argument
console.log(a, b) // too many arguments
Note that one meta variable such as $MATCH matches exactly one AST node, so the last two console.log calls do not match the pattern. Let's see how we can match multiple AST nodes.
Multi Meta Variable
We can use $$$ to match zero or more AST nodes, including function arguments, parameters or statements. These variables can also be named, for example: console.log($$$ARGS).
Function Arguments
For example, console.log($$$) can match the following.
console.log() // matches zero AST nodes
console.log('hello world') // matches one node
console.log('debug: ', key, value) // matches multiple nodes
console.log(...args) // it also matches spread
Function Parameters
The pattern function $FUNC($$$ARGS) { $$$ } will match the following.
function foo(bar) {
  return bar
}
function noop() {}
function add(a, b, c) {
  return a + b + c
}
ARGS will be populated with a list of AST nodes.
| Code | Match |
|---|---|
| function foo(bar) { ... } | [bar] |
| function noop() {} | [] |
| function add(a, b, c) { ... } | [a, b, c] |
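The difference between a single meta variable and a multi meta variable can be sketched with a toy matcher over argument lists. This is an illustration under simplifying assumptions, not ast-grep's implementation: matchArgs is a hypothetical helper, and each argument is represented by its source text instead of an AST node.

```javascript
// Toy sketch, not ast-grep's implementation: arguments are plain strings.
// `$NAME` consumes exactly one argument; `$$$NAME` consumes zero or more.
function matchArgs(pattern, args) {
  if (pattern.length === 0) return args.length === 0 ? {} : null;
  const [head, ...rest] = pattern;
  if (head.startsWith('$$$')) {
    const name = head.slice(3);
    // Try the shortest run first, then grow it until the rest matches.
    for (let n = 0; n <= args.length; n++) {
      const env = matchArgs(rest, args.slice(n));
      if (env) return name ? { ...env, [name]: args.slice(0, n) } : env;
    }
    return null;
  }
  if (args.length === 0) return null; // a single slot needs one argument
  const env = matchArgs(rest, args.slice(1));
  if (!env) return null;
  if (head.startsWith('$')) return { ...env, [head.slice(1)]: args[0] };
  return head === args[0] ? env : null; // literal argument must be identical
}

console.log(JSON.stringify(matchArgs(['$$$ARGS'], ['bar'])));         // {"ARGS":["bar"]}
console.log(JSON.stringify(matchArgs(['$$$ARGS'], [])));              // {"ARGS":[]}
console.log(JSON.stringify(matchArgs(['$$$ARGS'], ['a', 'b', 'c']))); // {"ARGS":["a","b","c"]}
console.log(JSON.stringify(matchArgs(['$GREETING'], ['a', 'b'])));    // null
```

The first three calls mirror the rows of the table above, while the last one shows why console.log(a, b) is rejected by a pattern with a single meta variable.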
Meta Variable Capturing
A meta variable is also similar to a capture group in a regular expression. You can reuse the same meta variable name to match AST nodes identical to the one captured earlier.
For example, the pattern $A == $A will have the following results.
// will match these patterns
a == a
1 + 1 == 1 + 1
// but will not match these
a == b
1 + 1 == 2
Non Capturing Match
You can also suppress meta variable capturing. Meta variables whose names start with an underscore _ will not be captured.
// Given this pattern
$_FUNC($_FUNC)
// it will match any function call with one argument, including a spread call
test(a)
testFunc(1 + 1)
testFunc(...args)
Note that in the example above, even though the two meta variables have the same name $_FUNC, each occurrence of $_FUNC can match different content because they are not captured.
Why use non-capturing match?
This is a useful trick to micro-optimize pattern matching speed, since we don't need to create a HashMap for bookkeeping.
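The contrast between capturing and non-capturing meta variables can be sketched as follows. This is a toy model, not ast-grep's implementation: matchSlots is a hypothetical helper, each node is represented by its source text, and two nodes count as equal when their text is equal. A Map plays the role of the bookkeeping that a non-capturing name skips.

```javascript
// Toy sketch, not ast-grep's implementation: each node is represented by
// its source text, and two nodes count as equal when their text is equal.
function matchSlots(slots, nodes) {
  if (slots.length !== nodes.length) return null;
  const captured = new Map(); // bookkeeping for capturing meta variables
  for (let i = 0; i < slots.length; i++) {
    const slot = slots[i];
    if (!slot.startsWith('$')) {
      if (slot !== nodes[i]) return null; // literal text must be identical
      continue;
    }
    const name = slot.slice(1);
    if (name.startsWith('_')) continue; // non-capturing: nothing is recorded
    // A reused name must see the same content as its first occurrence.
    if (captured.has(name) && captured.get(name) !== nodes[i]) return null;
    captured.set(name, nodes[i]);
  }
  return captured;
}

// $A == $A
console.log(matchSlots(['$A', '$A'], ['a', 'a']) !== null);            // true
console.log(matchSlots(['$A', '$A'], ['1 + 1', '2']) !== null);        // false
// $FUNC($FUNC) versus $_FUNC($_FUNC), both against test(a)
console.log(matchSlots(['$FUNC', '$FUNC'], ['test', 'a']) !== null);   // false
console.log(matchSlots(['$_FUNC', '$_FUNC'], ['test', 'a']) !== null); // true
```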
Capture Unnamed Nodes
A meta variable pattern $META will capture named nodes by default. To capture unnamed nodes, you can use the double dollar sign: $$VAR.
Namedness is an advanced topic in Tree-sitter. You can read this in-depth guide for more background.
More Powerful Rule
A pattern is a fast and easy way to match code, but it is not as powerful as a rule, which can match code with a more precise selector or more context.
We will cover using rules in the next chapter.