Rust
This page curates a list of example ast-grep rules to check and to rewrite Rust applications.
Avoid Duplicated Exports
Description
Generally, we don't encourage the use of re-exports.
However, sometimes, to keep the interface exposed by a lib crate tidy, we use re-exports to shorten the path to specific items. When doing so, a pitfall is to export a single item under two different names.
Consider:
pub mod foo;
pub use foo::Foo;
The issue with this code is that Foo is now exposed under two different paths: Foo and foo::Foo.
This unnecessarily increases the surface of your API. It can also cause issues on the client side. For example, it makes the usage of auto-complete in the IDE more involved.
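Instead, ensure each item is exported only once with pub: either keep the module private and re-export the item with pub use, or keep the module public and drop the re-export.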
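As a minimal sketch of the first option, assuming the module is inlined into a single file for illustration (the `name` method is hypothetical and exists only to give the item some behavior), the module stays private and the item is re-exported once:

```rust
// Hypothetical crate layout, inlined into one file for illustration.
// The module is private, so clients cannot reach the item through `foo::Foo`.
mod foo {
    pub struct Foo;

    impl Foo {
        // Hypothetical helper, not part of the original example.
        pub fn name(&self) -> &'static str {
            "Foo"
        }
    }
}

// The only public path to the item is the crate-root `Foo`.
pub use foo::Foo;
```

With this layout the ast-grep rule has nothing to report, because there is no `pub mod foo;` declaration alongside the `pub use`.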
YAML
id: avoid-duplicate-export
language: rust
rule:
  all:
    - pattern: pub use $B::$C;
    - inside:
        kind: source_file
        has:
          pattern: pub mod $A;
    - has:
        pattern: $A
        stopBy: end
Example
pub mod foo;
pub use foo::Foo;
pub use foo::A::B;
pub use aaa::A;
pub use woo::Woo;
Contributed by
Julius Lungys(voidpumpkin)
Beware of char offset when iterating over a string
Has Fix
Description
It's a common pitfall in Rust that counting character offsets is not the same as counting byte offsets when iterating through a string. A Rust string is stored as a UTF-8 byte array, and UTF-8 is a variable-length encoding, so a single character can occupy between one and four bytes.
chars().enumerate() will yield the character offset, while char_indices() will yield the byte offset.
let yes = "y̆es";
let mut char_indices = yes.char_indices();
assert_eq!(Some((0, 'y')), char_indices.next()); // not (0, 'y̆')
assert_eq!(Some((1, '\u{0306}')), char_indices.next());
// note the 3 here - the last character took up two bytes
assert_eq!(Some((3, 'e')), char_indices.next());
assert_eq!(Some((4, 's')), char_indices.next());
Depending on your use case, you may want to use char_indices() instead of chars().enumerate().
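The difference matters as soon as the offset is used to slice the string. The sketch below is only an illustration: the helper names `char_offsets` and `byte_offsets` are hypothetical, and the input is assumed to contain the precomposed character U+00E9, which takes two bytes in UTF-8.

```rust
/// Pairs each character with its position counted in characters.
fn char_offsets(s: &str) -> Vec<(usize, char)> {
    s.chars().enumerate().collect()
}

/// Pairs each character with the byte index at which it starts.
fn byte_offsets(s: &str) -> Vec<(usize, char)> {
    s.char_indices().collect()
}

/// Returns the suffix of `s` starting at its first 'l', if any.
/// String slicing takes byte indices, so `char_indices` is the right source.
fn suffix_from_first_l(s: &str) -> Option<&str> {
    s.char_indices()
        .find(|&(_, c)| c == 'l')
        .map(|(byte_idx, _)| &s[byte_idx..])
}
```

For the input `"h\u{e9}llo"`, the first `l` is the third character (character offset 2) but starts at byte 3. Slicing with the character offset would panic here, because byte 2 falls in the middle of the two-byte character.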
Pattern
sg -p '$A.chars().enumerate()' \
-r '$A.char_indices()' \
-l rs
Example
for (i, char) in source.chars().enumerate() {
    println!("Boshen is angry :)");
}
Diff
- for (i, char) in source.chars().enumerate() {
+ for (i, char) in source.char_indices() {
      println!("Boshen is angry :)");
  }
Contributed by
Inspired by Boshen's Tweet
Get number of digits in a usize
Has Fix
Description
Getting the number of digits in a usize number can be useful for various purposes, such as counting the column width of line numbers in a text editor or formatting the output of a number with commas or spaces.
A common but inefficient way of getting the number of digits in a usize number is to use num.to_string().chars().count(). This method converts the number to a string, iterates over its characters, and counts them. However, this method involves allocating a new string, which can be costly in terms of memory and time.
A better alternative is to use checked_ilog10.
num.checked_ilog10().unwrap_or(0) + 1
The snippet above computes the integer logarithm base 10 of the number and adds one. It does not allocate any memory and is faster than the string conversion approach. The checked_ilog10 method returns an Option<u32> that is Some(log) if the number is positive and None if the number is zero. The unwrap_or(0) call returns the value inside the option, or 0 if the option is None. Note that the result is a u32 rather than a usize, so the rewritten expression may need a cast where a usize is expected.
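To check that the two approaches agree, here is a small sketch with both wrapped in functions. The function names are hypothetical, and the `as usize` cast is an addition that keeps the return type the same as the string-based version, since `checked_ilog10` yields a `u32`.

```rust
/// Counts decimal digits by formatting the number (allocates a String).
fn digits_via_string(num: usize) -> usize {
    num.to_string().chars().count()
}

/// Counts decimal digits with the integer base-10 logarithm (no allocation).
/// `checked_ilog10` returns `None` for 0, which is treated as one digit.
fn digits_via_ilog10(num: usize) -> usize {
    num.checked_ilog10().unwrap_or(0) as usize + 1
}
```

Both functions return 1 for 0 and 9, 2 for 10, and 4 for 1000. `checked_ilog10` was stabilized in Rust 1.67, so older toolchains would need a different approach.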
Pattern
sg -p '$NUM.to_string().chars().count()' \
-r '$NUM.checked_ilog10().unwrap_or(0) + 1' \
-l rs
Example
let width = (lines + num).to_string().chars().count();
Diff
- let width = (lines + num).to_string().chars().count();
+ let width = (lines + num).checked_ilog10().unwrap_or(0) + 1;
Contributed by
Herrington Darkholme, inspired by dogfooding ast-grep