Reusing Rule as Utility
ast-grep uses YAML for rule representation. While this decision makes writing rules easier, it does impose some limitations on rule authoring. One of them is that rule objects cannot be reused. Let's see an example.
Suppose we want to match all literal values in JavaScript. We will need to match these kinds:
any:
- kind: 'false'
- kind: undefined
- kind: 'null'
- kind: 'true'
- kind: regex
- kind: number
- kind: string
If we want to match literal values in different places using only a plain YAML file, we have to copy and paste the rule above several times. Say we want to match either literal values or an array of literal values:
rule:
  any:
    - kind: 'false'
    - kind: undefined
    # more literal kinds omitted
    # ...
    - kind: array
      has:
        any:
          - kind: 'false'
          - kind: undefined
          # more literal kinds omitted
          # ...
ast-grep provides a mechanism to reuse common rules: utils. A utility rule is a rule defined in the utils section of the config file, or in a separate global rule file. It can be referenced in other rules using the composite rule matches. So, the above example can be rewritten as:
# define util rules using utils field
utils:
  # it accepts a string-keyed dictionary of rule object
  is-literal:               # rule-id
    any:                    # actual rule object
      - kind: 'false'
      - kind: undefined
      - kind: 'null'
      - kind: 'true'
      - kind: regex
      - kind: number
      - kind: string
rule:
  any:
    - matches: is-literal # reference the util!
    - kind: array
      has:
        matches: is-literal # reference it again!
There are two ways to define utility rules in ast-grep: local utility rules and global utility rules. Both are referenced by their ids in the matches composite rule.
Local Utility Rules
Local utility rules are defined in the utils field of the config file. utils is a string-keyed dictionary.
The keys of the dictionary are the utility rules' identifiers, which are later used in matches. Note that two local utility rules in the same file cannot share an identifier. However, a local utility rule can have the same name as a global utility rule, in which case it overrides the global one.
The value of the dictionary is the rule object. You can define a local utility rule using the same syntax as the rule field.
Local utility rules are only available in the config file where they are defined.
For example, the following config file defines a local utility rule is-literal:
utils:
  is-literal:
    any:
      - kind: 'false'
      - kind: undefined
      - kind: 'null'
      - kind: 'true'
      - kind: regex
      - kind: number
      - kind: string
rule:
  matches: is-literal
The matches in rule will run the matcher rule is-literal against AST nodes.
Local utility rules always use the language of the rule configuration file in which they are defined. They also cannot have their own separate constraints.
Global Utility Rules
Global utility rules are defined in separate files, but they are available across all rule configurations in the project.
To create global utility rules, you first need a proper ast-grep project setup like below.
my-awesome-project # project root
|- rules # rule directory
| |- my-rule.yml
|- utils # utils directory
| |- is-literal.yml
|- sgconfig.yml # project configuration
Note the utils directory, where all global utility rules are stored. We also need to tell ast-grep which directory contains the utility rules so that it can pick them up.
In sgconfig.yml:
ruleDirs:
- rules
utilDirs:
- utils
Now we can define our global utility rule in the is-literal.yml file. A global utility rule looks like a regular rule file, but it can only have a limited set of fields: id, language, rule, constraints, and its own local utility rules in utils.
# is-literal.yml
id: is-literal
language: TypeScript
rule:
  any:
    - kind: 'false'
    - kind: undefined
    - kind: 'null'
    - kind: 'true'
    - kind: regex
    - kind: number
    - kind: string
Unlike a local utility rule, a global utility rule must define id and language. The id is a field of the file, not a dictionary key.
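As a rough illustration of how a regular rule might consume this global utility, here is a sketch of rules/my-rule.yml from the project layout above. The rule id and the array-matching logic are assumptions made for the example; only the is-literal utility comes from the text.

```yaml
# rules/my-rule.yml (hypothetical contents)
id: array-of-literal        # hypothetical rule id
language: TypeScript        # same language as the global utility
rule:
  kind: array
  has:
    # no local util named is-literal is defined in this file,
    # so this refers to the global rule in utils/is-literal.yml
    matches: is-literal
```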
Global utility rules can have their own local utility rules, and these local rules can only be accessed within the global rule file that defines them. Similarly, global utility rules can have their own constraints.
Finally, a rule file, whether it is a utility rule or not, can have local utility rules with the same name as a global utility rule. In that case the local rule supersedes the global rule of the same name.
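A minimal sketch of such an override, assuming the global is-literal.yml above is present in the utils directory. The rule id and the narrowed list of kinds are hypothetical and chosen only to make the difference visible.

```yaml
# rules/my-rule.yml (hypothetical contents)
id: string-or-number        # hypothetical rule id
language: TypeScript
utils:
  # local rule with the same id as the global utils/is-literal.yml
  is-literal:
    any:
      - kind: number
      - kind: string
rule:
  # within this file, is-literal is the local definition above,
  # so the global one is not used here
  matches: is-literal
```

Other rule files in the project are unaffected and still see the global definition.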
Recursive Rule Trick
You can use a utility rule inside another utility rule. Besides rule reuse, this also opens up the possibility of recursive rules.
For example, suppose we want to match all expressions that evaluate to a number literal in JavaScript. We can use kind: number to match 123 or 1.23. But how do we match parenthesized expressions like (((123)))?
Using matches with a utility rule solves this.
utils:
  is-number:
    any:
      - kind: number
      - kind: parenthesized_expression
        has:
          matches: is-number
rule:
  matches: is-number
If we match (123) with this rule, we first match kind: parenthesized_expression, which requires a direct child that also matches the is-number rule. This makes us match 123 against is-number, which succeeds because kind: number matches the number literal.
Using matches with recursive utility rules can unlock a lot of sophisticated rule usage. But there is one thing you need to bear in mind:
Dependency Cycle is not allowed
A rule cannot have a cyclic dependency when using matches. That is, a rule cannot transitively reference itself in its composite components.
A dependency cycle in a rule will cause infinite recursion and make ast-grep get stuck on one AST node without making any progress.
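To make the restriction concrete, here is a sketch of a utils section that contains such a cycle. The ids is-a and is-b are hypothetical; the point is only that each rule reaches the other through composite rules (matches and any) alone.

```yaml
# NOT allowed: is-a -> is-b -> is-a, connected only by composite rules
utils:
  is-a:
    matches: is-b
  is-b:
    any:
      - kind: number
      - matches: is-a   # transitively references is-a again
rule:
  matches: is-a
```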
However, you can use a self-referencing rule in relational components like inside or has. A curious reader can try to answer why this is okay.