c2rust-refactor 0.15.0

C2Rust refactoring tool implementation
```refactor-options hidden
revert
diff-style = full
no-show-filename
no-collapse-diff
irrelevant-start-regex = '.*let y: i32'
irrelevant-end-regex = '.*let y: i32'
```


`c2rust refactor` provides a general-purpose rewriting command, `rewrite_expr`,
for transforming expressions.
In its most basic form, `rewrite_expr` replaces one expression with another,
everywhere in the crate:

```rust refactor-target hidden
fn main() {
    println!("{}", 1 + 1);
    println!("{}", 1 + /*comment*/ 1);
    println!("{}", 1+11);
}
```

```refactor
rewrite_expr '1+1' '2'
```

Here, all instances of the expression `1+1` (the "pattern") are replaced with
`2` (the "replacement").

`rewrite_expr` parses both the pattern and the replacement as Rust expressions,
and compares the structure of the expression instead of its raw text when
looking for occurrences of the pattern.  This lets it recognize that `1 + 1`
and `1 + /* comment */ 1` both match the pattern `1+1` (despite being textually
distinct), while `1+11` does not (despite being textually similar).
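
The difference between structural and textual comparison can be sketched with a toy expression type. The `Expr` enum and the helper functions below are hypothetical stand-ins for illustration only; they are not the AST that `rewrite_expr` actually operates on.

```rust
/// Toy expression tree (hypothetical; not c2rust's real AST).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Builds the tree for `lhs + rhs`.
fn add(lhs: Expr, rhs: Expr) -> Expr {
    Expr::Add(Box::new(lhs), Box::new(rhs))
}

/// The tree a parser would plausibly produce for `1+1`, `1 + 1`, and
/// `1 + /*comment*/ 1`: whitespace and comments never reach the tree.
fn one_plus_one() -> Expr {
    add(Expr::Lit(1), Expr::Lit(1))
}

/// The tree for `1+11`: textually close to `1+1`, structurally different.
fn one_plus_eleven() -> Expr {
    add(Expr::Lit(1), Expr::Lit(11))
}
```

Comparing `one_plus_one()` with `one_plus_eleven()` using the derived `PartialEq` reports a mismatch, because the right-hand literals differ, no matter how similar the source text looked.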


# Metavariables

In `rewrite_expr`'s expression pattern, any name beginning with double
underscores is a *metavariable*.  Just as a variable in an ordinary Rust
`match` expression will match any value (and bind it for later use), a
metavariable in an expression pattern will match any Rust code.  For example,
the expression pattern `__x + 1` will match any expression that adds 1 to
something:

```rust refactor-target hidden
fn f() -> i32 { 123 }

fn main() {
    println!("a = {}", 1 + 1);
    println!("b = {}", 2 * 3 + 1);
    println!("c = {}", 4 + 5 + 1);
    println!("d = {}", f() + 1);
}
```

```refactor
rewrite_expr '__x + 1' '11'
```

In these examples, the `__x` metavariable matches the expressions `1`, `2 * 3`,
`4 + 5`, and `f()`.
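
As a rough mental model, metavariable matching can be written as a recursive comparison in which a metavariable node accepts anything and records what it saw. The sketch below uses a hypothetical toy `Expr` type and a hypothetical `match_expr` function; it is not how `c2rust refactor` is implemented, and it does not yet handle a metavariable that appears more than once.

```rust
use std::collections::HashMap;

/// Toy expression tree (hypothetical; not c2rust's real AST).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Call(String),               // zero-argument call such as `f()`
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Meta(String),               // metavariable such as `__x`; patterns only
}

/// Try to match `pat` against `target`. On success, return the code that
/// each metavariable matched; return `None` when the shapes differ.
fn match_expr(pat: &Expr, target: &Expr) -> Option<HashMap<String, Expr>> {
    let mut bindings = HashMap::new();
    if go(pat, target, &mut bindings) {
        Some(bindings)
    } else {
        None
    }
}

fn go(pat: &Expr, target: &Expr, bindings: &mut HashMap<String, Expr>) -> bool {
    match (pat, target) {
        // A metavariable matches any expression and binds it.
        (Expr::Meta(name), _) => {
            bindings.insert(name.clone(), target.clone());
            true
        }
        (Expr::Lit(a), Expr::Lit(b)) => a == b,
        (Expr::Call(a), Expr::Call(b)) => a == b,
        // Binary operators match when both operands match.
        (Expr::Add(pl, pr), Expr::Add(tl, tr))
        | (Expr::Mul(pl, pr), Expr::Mul(tl, tr)) => {
            go(pl, tl, bindings) && go(pr, tr, bindings)
        }
        _ => false,
    }
}
```

Matching the pattern `__x + 1` against the tree for `2 * 3 + 1` succeeds in this model and binds `__x` to the whole `2 * 3` subtree, while matching it against a bare `1` fails because the target is not an addition.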

## Using bindings

When a metavariable matches against some piece of code, the code it matches is
bound to the variable for later use.  Specifically, `rewrite_expr`'s
replacement argument can refer back to those metavariables to substitute in the
matched code:

```refactor
rewrite_expr '__x + 1' '11 * __x'
```

In each case, the expression bound to the `__x` metavariable is substituted
into the right-hand side of the multiplication in the replacement.
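
Substitution can be pictured as copying the replacement tree and splicing in the bound subtrees. The following is a minimal sketch under the assumption of a toy `Expr` type; `substitute` is a hypothetical name, not a function exposed by `c2rust refactor`.

```rust
use std::collections::HashMap;

/// Toy expression tree (hypothetical; not c2rust's real AST).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Meta(String), // metavariable such as `__x`
}

/// Copy `template`, replacing each bound metavariable with the expression it
/// captured. Unbound metavariables are left untouched in this sketch.
fn substitute(template: &Expr, bindings: &HashMap<String, Expr>) -> Expr {
    match template {
        Expr::Meta(name) => bindings
            .get(name)
            .cloned()
            .unwrap_or_else(|| template.clone()),
        Expr::Lit(n) => Expr::Lit(*n),
        Expr::Add(l, r) => Expr::Add(
            Box::new(substitute(l, bindings)),
            Box::new(substitute(r, bindings)),
        ),
        Expr::Mul(l, r) => Expr::Mul(
            Box::new(substitute(l, bindings)),
            Box::new(substitute(r, bindings)),
        ),
    }
}
```

Because the splice happens on trees rather than text, a bound expression such as `2 * 3` stays a single operand of the outer multiplication in this model.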

## Multiple occurrences

Finally, the same metavariable can appear multiple times in the pattern.  In
that case, the pattern matches only if each occurrence of the metavariable
matches the same expression.  For example:

```rust refactor-target hidden
fn f() -> i32 { 123 }

fn main() {
    let a = 2;
    println!("{}", 1 + 1);
    println!("{}", a + a);
    println!("{}", f() + f());
    println!("{}", f() + 1);
}
```

```refactor
rewrite_expr '__x + __x' '2 * __x'
```

Here `1 + 1`, `a + a`, and `f() + f()` are all replaced, but `f() + 1` is not
because `__x` cannot match both `f()` and `1` at the same time.
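
The consistency rule amounts to a check at binding time: the first occurrence records what it matched, and every later occurrence must agree. Here is a small sketch of that check, assuming matched code is represented as normalized text; `bind` and `matches_x_plus_x` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Record that metavariable `name` matched `code`, where `code` is a
/// normalized rendering of the matched expression (an assumption of this
/// sketch). Returns `false` if `name` is already bound to different code.
fn bind(bindings: &mut HashMap<String, String>, name: &str, code: &str) -> bool {
    if let Some(previous) = bindings.get(name) {
        return previous.as_str() == code;
    }
    bindings.insert(name.to_string(), code.to_string());
    true
}

/// Match the pattern `__x + __x` against the expression `lhs + rhs`.
fn matches_x_plus_x(lhs: &str, rhs: &str) -> bool {
    let mut bindings = HashMap::new();
    bind(&mut bindings, "__x", lhs) && bind(&mut bindings, "__x", rhs)
}
```

With this check, `matches_x_plus_x("f()", "f()")` succeeds, while `matches_x_plus_x("f()", "1")` fails on the second `bind` call.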


## Example: adding a function argument

Suppose we wish to add an argument to an existing function.  All current
callers of the function should pass a default value of `0` for this new
argument.  We can update the existing calls like this:

```rust refactor-target hidden
fn my_func(x: i32, y: i32) {
    /* ... */
}

fn main() {
    my_func(1, 2);
    let x = 123;
    my_func(x, x);
    my_func(0, { let y = x; y + y });
}
```

```refactor
rewrite_expr 'my_func(__x, __y)' 'my_func(__x, __y, 0)'
```

Every call to `my_func` now passes a third argument, and we can update the
definition of `my_func` to match.
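
For illustration, the code after both steps might look like the following. The parameter name `z`, the return type, and the function body are assumptions made so the sketch is runnable; the original `my_func` body is elided in the example above.

```rust
/// `my_func` with a hypothetical third parameter `z` added by hand.
/// The body is a placeholder that simply sums its arguments.
fn my_func(x: i32, y: i32, z: i32) -> i32 {
    x + y + z
}

/// The three call sites from the example, each now passing `0` as the
/// new third argument.
fn callers() -> [i32; 3] {
    let x = 123;
    [
        my_func(1, 2, 0),
        my_func(x, x, 0),
        my_func(0, { let y = x; y + y }, 0),
    ]
}
```

Note that the pattern `my_func(__x, __y)` matches even when an argument is a block expression, since each metavariable binds a whole argument expression.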


# Special matching forms

`rewrite_expr` supports several *special matching forms* that can appear in
patterns to add extra restrictions to matching.


## `def!`

A pattern such as `def!(::foo::f)` matches any ident or path expression that
resolves to the function whose absolute path is `::foo::f`.  For example, to
replace all expressions referencing the function `foo::f` with ones referencing
`foo::g`:

```rust refactor-target hidden
mod foo {
    pub fn f() { /* ... */ }
    pub fn g() { /* ... */ }
}

fn main() {
    use self::foo::f; 
    // All these calls get rewritten
    f();
    foo::f();
    ::foo::f();
}

mod bar {
    fn f() {}

    fn f_caller() {
        // This call does not...
        f();
        // But this one still does
        super::foo::f();
    }
}
```

```refactor
rewrite_expr 'def!(::foo::f)' '::foo::g'
```

This works for all direct references to `f`, whether by relative path
(`foo::f`), absolute path (`::foo::f`), or imported identifier (just `f`, with
`use foo::f` in scope).  It can even handle imports under a different name
(`f2` with `use foo::f as f2` in scope), since it checks only the path of the
referenced definition, not the syntax used to reference it.

### Under the hood

When `rewrite_expr` attempts to match `def!(path)` against some expression `e`,
it actually completely ignores the content of `e` itself.  Instead, it performs
these steps:

 1. Check `rustc`'s name resolution results to find the definition `d` that `e`
    resolves to.  (If `e` doesn't resolve to a definition, then the matching
    fails.)
 2. Construct an absolute path `dpath` referring to `d`.  For definitions in
    the current crate, this path looks like `::mod1::def1`.  For definitions in
    other crates, it looks like `::crate1::mod1::def1`.
 3. Match `dpath` against the `path` pattern provided as the argument
    of `def!`.  Then `e` matches `def!(path)` if `dpath` matches `path`, and
    fails to match otherwise.
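
These steps can be modeled with a lookup table standing in for `rustc`'s name resolution. Everything in the sketch below (`Resolutions`, `main_scope`, `bar_scope`, `def_matches`) is hypothetical and exists only to illustrate why the text of `e` is irrelevant to the match; it also compares paths for exact equality, ignoring metavariables.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for rustc's name resolution results at one use
/// site: maps an expression as written to the absolute path of its definition.
type Resolutions = HashMap<&'static str, &'static str>;

/// Names as seen inside `main`, where `use self::foo::f` is in scope.
fn main_scope() -> Resolutions {
    HashMap::from([("f", "::foo::f"), ("foo::f", "::foo::f")])
}

/// Names as seen inside `mod bar`, which defines its own `f`.
fn bar_scope() -> Resolutions {
    HashMap::from([("f", "::bar::f"), ("super::foo::f", "::foo::f")])
}

/// Does `expr` match `def!(path)` in the given scope?
fn def_matches(scope: &Resolutions, expr: &str, path: &str) -> bool {
    match scope.get(expr) {
        // Steps 1 and 2 gave us `dpath`; step 3 compares it with the pattern.
        Some(dpath) => *dpath == path,
        // Expressions that resolve to no definition never match.
        None => false,
    }
}
```

In this model the same text `f` matches `def!(::foo::f)` inside `main` but not inside `mod bar`, while `super::foo::f` inside `mod bar` does match, mirroring the example above.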

### Debugging match failures

Matching with `def!` can sometimes fail in surprising ways, since the
user-provided `path` is matched against a generated path that may not appear
explicitly anywhere in the source code.  For example, this attempt to match
`HashMap::new` does not succeed:

```rust refactor-target hidden
use std::collections::hash_map::HashMap;

fn main() {
    let m: HashMap<i32, i32> = HashMap::new();
}
```

```refactor
rewrite_expr
    'def!(::std::collections::hash_map::HashMap::new)()'
    '::std::collections::hash_map::HashMap::with_capacity(10)'
```

The `debug_match_expr` command exists to diagnose such problems.  It takes only
a pattern, and prints information about attempts to match it at various points
in the crate:

```refactor hide-diff
debug_match_expr 'def!(::std::collections::hash_map::HashMap::new)()'
```

Here, its output includes this line:

```
def!(): trying to match pattern path(::std::collections::hash_map::HashMap::new) against AST path(::std::collections::HashMap::new)
```

This reveals the problem: the absolute path `def!` generates for
`HashMap::new` uses the reexport at `std::collections::HashMap`, not the
canonical definition at `std::collections::hash_map::HashMap`.  Updating the
previous `rewrite_expr` command allows it to succeed:

```refactor
rewrite_expr
    'def!(::std::collections::HashMap::new)()'
    '::std::collections::HashMap::with_capacity(10)'
```
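
Part of what makes this failure surprising is that, in ordinary Rust code, the two paths are interchangeable. The short sketch below (with a hypothetical function name, `same_type`) compiles only because both paths name the same type:

```rust
use std::collections::hash_map;
use std::collections::HashMap;

/// The value is built through the reexport `std::collections::HashMap` and
/// returned as `std::collections::hash_map::HashMap`; no conversion is needed.
fn same_type() -> hash_map::HashMap<i32, i32> {
    let m: HashMap<i32, i32> = HashMap::with_capacity(10);
    m
}
```

`def!` is stricter than the compiler here: it compares the pattern against one specific generated path, so only the spelling that `debug_match_expr` reports will match.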


### Metavariables

The argument to `def!` is a path pattern, which can contain metavariables just
like the overall expression pattern.  For instance, we can rewrite all calls to
functions from the `foo` module:

```rust refactor-target hidden
mod foo {
    pub fn f() { /* ... */ }
    pub fn g() { /* ... */ }
}

mod bar {
    pub fn f() { /* ... */ }
    pub fn g() { /* ... */ }
}

fn main() {
    foo::f();
    foo::g();
}
```

```refactor
rewrite_expr 'def!(::foo::__name)()' '123'
```

Since every definition in the `foo` module has an absolute path of the form
`::foo::(something)`, they all match the expression pattern
`def!(::foo::__name)`.

Like any other metavariable, the ones in a `def!` path pattern can be used in
the replacement expression to substitute in the captured name.  For example, we
can replace all references to items in the `foo` module with references to the
same-named items in the `bar` module:

```refactor
rewrite_expr 'def!(::foo::__name)' '::bar::__name'
```

Note, however, that each metavariable in a path pattern can match only a single
ident.  This means `foo::__name` will not match the path to an item in a
submodule, such as `foo::one::two`.  Handling these would require an additional
rewrite step, such as `rewrite_expr 'def!(::foo::__name1::__name2)'
'::bar::__name1::__name2'`.
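
The one-ident-per-metavariable rule can be illustrated with a segment-by-segment comparison. The `match_path` function below is a hypothetical sketch operating on path strings, not the matcher used by `def!`, and it does not check repeated metavariables for consistency.

```rust
use std::collections::HashMap;

/// Match an absolute path against a path pattern, one segment at a time.
/// A pattern segment starting with `__` is a metavariable and matches
/// exactly one path segment.
fn match_path(pat: &str, path: &str) -> Option<HashMap<String, String>> {
    let pat_segs: Vec<&str> = pat.split("::").collect();
    let path_segs: Vec<&str> = path.split("::").collect();
    // A metavariable cannot absorb extra segments, so lengths must agree.
    if pat_segs.len() != path_segs.len() {
        return None;
    }
    let mut bindings = HashMap::new();
    for (p, s) in pat_segs.iter().zip(&path_segs) {
        if p.starts_with("__") {
            bindings.insert(p.to_string(), s.to_string());
        } else if p != s {
            return None;
        }
    }
    Some(bindings)
}
```

Under this model, `match_path("::foo::__name", "::foo::one::two")` returns `None` because the segment counts differ, whereas the two-metavariable pattern `::foo::__name1::__name2` binds `one` and `two`.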


## `typed!`

A pattern of the form `typed!(e, ty)` matches any expression that matches the
pattern `e`, but only if the type of that expression matches the pattern `ty`.
For example, we can perform a rewrite that only affects `i32`s:

```rust refactor-target hidden
fn main() {
    let x = 100_i32;
    let y: i32 = 100;
    let z = x + y;

    let a = "hello";
    let b = format!("{}, {}", a, "world");
}
```

```refactor
rewrite_expr 'typed!(__e, i32)' '0'
```

Every expression matches the metavariable `__e`, but only the `i32`s (whether
literals or variables of type `i32`) are affected by the rewrite.


### Under the hood

Internally, `typed!` works much like `def!`.  To match an expression `e`
against `typed!(e_pat, ty_pat)`, `rewrite_expr` follows these steps:

 1. Consult `rustc`'s typechecking results to get the type of `e`.  Call
    that type `rustc_ty`.
 2. `rustc_ty` is an internal, abstract representation of the type, which is
    not suitable for matching.  Construct a concrete representation of
    `rustc_ty`, and call it `ty`.
 3. Match `e` against `e_pat` and `ty` against `ty_pat`.  Then `e` matches
    `typed!(e_pat, ty_pat)` if both matches succeed, and fails to match
    otherwise.


### Debugging match failures

When matching fails unexpectedly, `debug_match_expr` is once again useful for
understanding the problem.  For example, this rewriting command has no effect:

```rust refactor-target hidden
fn main() {
    let a = "hello";
    let b = format!("{}, {}", a, "world");
}
```

```refactor
rewrite_expr "typed!(__e, &'static str)" '"hello"'
```

Passing the same pattern to `debug_match_expr` produces output that includes
the following:

```refactor hidden
debug_match_expr "typed!(__e, &'static str)"
```

```
typed!(): trying to match pattern type(&'static str) against AST type(&str)
```

Now the problem is clear: the concrete type representation constructed for
matching omits lifetimes.  Replacing `&'static str` with `&str` in the pattern
causes the rewrite to succeed:

```refactor
rewrite_expr 'typed!(__e, &str)' '"hello"'
```
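
Steps 2 and 3 from "Under the hood", together with the lifetime behavior observed here, can be modeled roughly as follows. The `Ty` enum and both functions are hypothetical stand-ins; the real representation inside `rustc` and `c2rust refactor` is far richer, and this sketch compares types as plain strings without metavariables.

```rust
/// Hypothetical stand-in for rustc's internal type representation.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    I32,
    Str,
    Ref { lifetime: Option<String>, inner: Box<Ty> },
}

/// Step 2: build a concrete, printable form of the type. The lifetime is
/// deliberately dropped, mirroring the `debug_match_expr` output above.
fn concrete(ty: &Ty) -> String {
    match ty {
        Ty::I32 => "i32".to_string(),
        Ty::Str => "str".to_string(),
        Ty::Ref { inner, .. } => format!("&{}", concrete(inner)),
    }
}

/// Step 3, type half only: compare the concrete form with the pattern.
fn type_matches(ty: &Ty, ty_pat: &str) -> bool {
    concrete(ty) == ty_pat
}
```

In this model a `&'static str` value is rendered as `&str`, so the pattern `&'static str` can never match it, while `&str` does.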


### Metavariables

The expression pattern and type pattern arguments of `typed!(e, ty)` are
handled using the normal `rewrite_expr` matching engine, which means they can
contain metavariables and other special matching forms.  For example,
metavariables can capture both parts of the expression and parts of its type
for use in the replacement:

```rust refactor-target hidden
fn main() {
    let v: Vec<&'static str> = Vec::with_capacity(20);

    let v: Vec<_> = Vec::with_capacity(10);
    // Allow `v`'s element type to be inferred
    let x: i32 = v[0];
}
```

```refactor
rewrite_expr
    'typed!(Vec::with_capacity(__n), ::std::vec::Vec<__ty>)'
    '::std::iter::repeat(<__ty>::default())
        .take(__n)
        .collect::<Vec<__ty>>()'
```

Notice that the rewritten code has the correct element type in the call to
`default`, even in cases where the type is not written explicitly in the
original expression!  The matching of `typed!` obtains the inferred type
information from `rustc`, and those inferred types are captured by
metavariables in the type pattern.
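
To make the shape of the result concrete, here is roughly what the replacement expands to for the two captured element types, written out by hand. The function names are hypothetical, and `__ty` is assumed to be captured as `i32` and `&str` respectively.

```rust
/// The original form: an empty vector with room for 10 elements.
fn original() -> Vec<i32> {
    Vec::with_capacity(10)
}

/// The replacement with `__n = 10` and `__ty = i32`.
fn rewritten_i32() -> Vec<i32> {
    ::std::iter::repeat(<i32>::default())
        .take(10)
        .collect::<Vec<i32>>()
}

/// The replacement with `__n = 20` and `__ty = &str`.
fn rewritten_str() -> Vec<&'static str> {
    ::std::iter::repeat(<&str>::default())
        .take(20)
        .collect::<Vec<&str>>()
}
```

The two forms are not equivalent at run time: `Vec::with_capacity(n)` yields a vector of length 0, while the replacement yields `n` default-valued elements. The rewrite is used here to demonstrate type capture, so any similar rewrite on real code should be checked for this kind of behavioral change.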


## Example: `transmute` to `<*const T>::as_ref`

This example demonstrates usage of `def!` and `typed!`.

Suppose we have some unsafe code that uses `transmute` to convert a raw
pointer that may be null (`*const T`) into an optional reference
(`Option<&T>`).  This conversion is better expressed using the `as_ref` method
of `*const T`, and we'd like to apply this transformation automatically.

### Initial attempt

Here is a basic first attempt:

```rust refactor-target hidden
use std::mem;

unsafe fn foo(ptr: *const u32) {
    let r: &u32 = mem::transmute::<*const u32, Option<&u32>>(ptr).unwrap();

    let opt_r2: Option<&u32> = mem::transmute(ptr);
    let r2 = opt_r2.unwrap();
    let ptr2: *const u32 = mem::transmute(r2);

    {
        use std::mem::transmute;
        let opt_r3: Option<&u32> = transmute(ptr);
        let r3 = opt_r3.unwrap();
    }

    /* ... */
}
```

```refactor
rewrite_expr 'transmute(__e)' '__e.as_ref()'
```

This has two major shortcomings, which we will address in order:

 1. It works only on code that calls exactly `transmute(foo)`.  The instances that
    import `std::mem` and call `mem::transmute(foo)` do not get rewritten.
 2. It rewrites transmutes between any types, not just `*const T` to
    `Option<&T>`.  Only transmutes between those types should be replaced with
    `as_ref`.

### Identifying `transmute` calls with `def!`

We want to rewrite calls to `std::mem::transmute`, regardless of how those
calls are written.  This is a perfect use case for `def!`:

```refactor
rewrite_expr 'def!(::std::intrinsics::transmute)(__e)' '__e.as_ref()'
```

Now our rewrite catches all uses of `transmute`, whether they're written as
`transmute(foo)`, `mem::transmute(foo)`, or even `::std::mem::transmute(foo)`.

Notice that we refer to `transmute` as `std::intrinsics::transmute`: this is
the location of its original definition, which is re-exported in `std::mem`.
See the ["`def!`: debugging match failures" section](#debugging-match-failures)
for an explanation of how we discovered this.

### Filtering `transmute` calls by type

We now have a command for rewriting all `transmute` calls, but we'd like it to
rewrite only transmutes from `*const T` to `Option<&T>`.  We can achieve this
by filtering the input and output types with `typed!`:

```refactor
rewrite_expr '
    typed!(
        def!(::std::intrinsics::transmute)(
            typed!(__e, *const __ty)
        ),
        ::std::option::Option<&__ty>
    )
' '__e.as_ref()'
```

Now only those transmutes that turn `*const T` into `Option<&T>` are affected
by the rewrite.  And because `typed!` has access to the results of type
inference, this works even on `transmute` calls that are not fully annotated
(`transmute(foo)`, not just `transmute::<*const T, Option<&T>>(foo)`).
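
For reference, this is what the rewritten code relies on: the standard `as_ref` method on `*const T` returns `None` for a null pointer and `Some(&T)` otherwise. The helper `read_or_zero` below is a hypothetical example, not part of the refactored program.

```rust
/// Reads through a possibly-null pointer using `as_ref`.
///
/// # Safety
/// `ptr` must be null or point to a valid, aligned `u32` that stays alive
/// for the duration of the call.
unsafe fn read_or_zero(ptr: *const u32) -> u32 {
    // Same result type as the original `transmute`, with the intent spelled out.
    let opt_r: Option<&u32> = unsafe { ptr.as_ref() };
    match opt_r {
        Some(r) => *r,
        None => 0,
    }
}
```

Unlike `transmute`, `as_ref` states the source and target types in its signature, so it cannot be silently applied to the wrong pair of types.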


## `marked!`

The `marked!` form is simple: `marked!(e, label)` matches an expression only if
that expression matches the pattern `e` and is marked with the given `label`.
See the [documentation on marks and `select`](select.md) for more
information.



# Other commands

Several other refactoring commands use the same pattern-matching engine as
`rewrite_expr`:

 * `rewrite_ty PAT REPL` ([docs](commands.md#rewrite_ty)) works like `rewrite_expr`,
   except it matches and replaces type annotations instead of expressions.
 * `abstract SIG PAT` ([docs](commands.md#abstract)) replaces expressions matching a
   pattern with calls to a newly-created function.
 * `type_fix_rules` ([docs](commands.md#type_fix_rules)) uses type patterns to find
   the appropriate rule to fix each type error.
 * `select`'s `match_expr` ([docs](select.md#match_)) and similar filters
   use syntax patterns to identify nodes to mark.