Rust

This page curates a list of example ast-grep rules to check and to rewrite Rust applications.

Avoid Duplicated Exports

Description

Generally, we don't encourage the use of re-exports.

However, to keep the interface exposed by a lib crate tidy, we sometimes use re-exports to shorten the path to specific items. When doing so, a common pitfall is exporting a single item under two different paths.

Consider:

rs
pub mod foo;
pub use foo::Foo;

The issue with this code is that Foo is now exposed under two different paths: Foo and foo::Foo.

This unnecessarily increases the surface of your API. It can also cause issues on the client side. For example, it makes the usage of auto-complete in the IDE more involved.

Instead, ensure you export only once with pub.
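One possible way to follow this advice is to keep the module private and re-export only the item. The sketch below is an assumption about what "export only once" can look like in practice. It uses an inline module so that it fits in one file, and the `name` method is a hypothetical helper added for illustration.

```rust
// Hypothetical crate layout, written as an inline module so it fits in one file.
// `foo` is private, so `foo::Foo` is not a public path.
mod foo {
    pub struct Foo;

    impl Foo {
        // Hypothetical helper, only here so the example has something to call.
        pub fn name(&self) -> &'static str {
            "Foo"
        }
    }
}

// The single public path to the type.
pub use foo::Foo;

fn main() {
    let item = Foo;
    println!("{}", item.name());
}
```

With this layout the rule above should no longer report the re-export, because the file contains no `pub mod foo;` item.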

YAML

yaml
id: avoid-duplicate-export
language: rust
rule:
  all:
     - pattern: pub use $B::$C;
     - inside:
        kind: source_file
        has:
          pattern: pub mod $A;
     - has:
        pattern: $A
        stopBy: end

Example

rs
pub mod foo;
pub use foo::Foo;
pub use foo::A::B;


pub use aaa::A;
pub use woo::Woo;

Contributed by

Julius Lungys(voidpumpkin)

Beware of char offset when iterating over a string Has Fix

Description

It's a common pitfall in Rust that counting character offsets is not the same as counting byte offsets when iterating through a string. A Rust string is stored as a UTF-8 byte array, and UTF-8 is a variable-length encoding: a single character can take up between one and four bytes.

chars().enumerate() will yield the character offset, while char_indices() will yield the byte offset.

rs
let yes = "y̆es";
let mut char_indices = yes.char_indices();
assert_eq!(Some((0, 'y')), char_indices.next()); // not (0, 'y̆')
assert_eq!(Some((1, '\u{0306}')), char_indices.next());
// note the 3 here - the previous character took up two bytes
assert_eq!(Some((3, 'e')), char_indices.next());
assert_eq!(Some((4, 's')), char_indices.next());

Depending on your use case, you may want to use char_indices() instead of chars().enumerate().
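A typical case where the difference matters is slicing, because string slices are indexed by byte offset. The following is a minimal sketch under that assumption. The helper names `suffix_from` and `char_position` are hypothetical and not part of any library.

```rust
/// Returns the text starting at the first occurrence of `target`,
/// using the byte offset reported by `char_indices()`.
fn suffix_from(source: &str, target: char) -> Option<&str> {
    source
        .char_indices()
        .find(|&(_, c)| c == target)
        .map(|(byte_offset, _)| &source[byte_offset..])
}

/// Returns the character position (not the byte offset) of `target`.
fn char_position(source: &str, target: char) -> Option<usize> {
    source
        .chars()
        .enumerate()
        .find(|&(_, c)| c == target)
        .map(|(position, _)| position)
}

fn main() {
    // 'y' followed by the combining breve U+0306, then "es".
    let source = "y\u{0306}es";

    // The byte offset of 'e' is 3, so the slice starts on a character boundary.
    assert_eq!(suffix_from(source, 'e'), Some("es"));

    // The character position of 'e' is 2, which is not a valid slice index here:
    // byte 2 falls in the middle of U+0306, so `&source[2..]` would panic.
    assert_eq!(char_position(source, 'e'), Some(2));
    assert!(!source.is_char_boundary(2));
}
```

If the index is only used for counting or display, the character position from chars().enumerate() may still be the right choice.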

Pattern

shell
sg -p '$A.chars().enumerate()' \
   -r '$A.char_indices()' \
   -l rs

Example

rs
for (i, char) in source.chars().enumerate() {
    println!("Boshen is angry :)");
}

Diff

rs
- for (i, char) in source.chars().enumerate() {
+ for (i, char) in source.char_indices() {
    println!("Boshen is angry :)");
}

Contributed by

Inspired by Boshen's Tweet


Get number of digits in a usize Has Fix

Description

Getting the number of digits in a usize number can be useful for various purposes, such as counting the column width of line numbers in a text editor or formatting the output of a number with commas or spaces.

A common but inefficient way of getting the number of digits in a usize number is to use num.to_string().chars().count(). This method converts the number to a string, iterates over its characters, and counts them. However, this method involves allocating a new string, which can be costly in terms of memory and time.

A better alternative is to use checked_ilog10.

rs
num.checked_ilog10().unwrap_or(0) + 1

The snippet above computes the integer logarithm base 10 of the number and adds one. It does not allocate any memory and is faster than the string conversion approach. checked_ilog10 returns an Option<u32> that is Some(log) if the number is positive and None if the number is zero. unwrap_or(0) returns the value inside the option, or 0 if the option is None. Note that the result is therefore a u32, whereas chars().count() returns a usize, so a conversion may be needed where a usize is expected.
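To check that the two approaches agree, here is a small sketch comparing them on a few values, including zero and usize::MAX. The function names are hypothetical. Since checked_ilog10 on a usize yields an Option<u32>, the sketch casts the result back to usize.

```rust
/// Digit count via string conversion (allocates a `String`).
fn digits_via_string(num: usize) -> usize {
    num.to_string().chars().count()
}

/// Digit count via integer logarithm (no allocation).
/// `checked_ilog10` returns `Option<u32>`, so the result is cast to `usize`.
fn digits_via_ilog10(num: usize) -> usize {
    (num.checked_ilog10().unwrap_or(0) + 1) as usize
}

fn main() {
    // Both functions should agree on every input, including the zero edge case.
    for num in [0, 1, 9, 10, 99, 100, 12_345, usize::MAX] {
        assert_eq!(digits_via_string(num), digits_via_ilog10(num));
    }
    println!("{}", digits_via_ilog10(12_345)); // prints 5
}
```

checked_ilog10 is available on integer types in recent stable Rust releases. On older toolchains the string-based version remains a valid fallback.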

Pattern

shell
sg -p '$NUM.to_string().chars().count()' \
   -r '$NUM.checked_ilog10().unwrap_or(0) + 1' \
   -l rs

Example

rs
let width = (lines + num).to_string().chars().count();

Diff

rs
- let width = (lines + num).to_string().chars().count();
+ let width = (lines + num).checked_ilog10().unwrap_or(0) + 1;

Contributed by

Herrington Darkholme, inspired by dogfooding ast-grep
