Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix false-positive in expr_or_init and in the invalid_from_utf8 lint #138360

Merged
merged 1 commit into from
Mar 12, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 14 additions & 2 deletions compiler/rustc_lint/src/context.rs
Original file line number Diff line number Diff line change
@@ -6,6 +6,7 @@
use std::cell::Cell;
use std::slice;

use rustc_ast::BindingMode;
use rustc_data_structures::fx::FxIndexMap;
use rustc_data_structures::sync;
use rustc_data_structures::unord::UnordMap;
@@ -14,6 +15,7 @@ use rustc_feature::Features;
use rustc_hir::def::Res;
use rustc_hir::def_id::{CrateNum, DefId};
use rustc_hir::definitions::{DefPathData, DisambiguatedDefPathData};
use rustc_hir::{Pat, PatKind};
use rustc_middle::bug;
use rustc_middle::middle::privacy::EffectiveVisibilities;
use rustc_middle::ty::layout::{LayoutError, LayoutOfHelpers, TyAndLayout};
@@ -890,7 +892,12 @@ impl<'tcx> LateContext<'tcx> {
}
&& let Some(init) = match parent_node {
hir::Node::Expr(expr) => Some(expr),
hir::Node::LetStmt(hir::LetStmt { init, .. }) => *init,
hir::Node::LetStmt(hir::LetStmt {
init,
// Binding is immutable, init cannot be re-assigned
pat: Pat { kind: PatKind::Binding(BindingMode::NONE, ..), .. },
..
}) => *init,
_ => None,
}
{
@@ -935,7 +942,12 @@ impl<'tcx> LateContext<'tcx> {
}
&& let Some(init) = match parent_node {
hir::Node::Expr(expr) => Some(expr),
hir::Node::LetStmt(hir::LetStmt { init, .. }) => *init,
hir::Node::LetStmt(hir::LetStmt {
init,
// Binding is immutable, init cannot be re-assigned
pat: Pat { kind: PatKind::Binding(BindingMode::NONE, ..), .. },
..
}) => *init,
hir::Node::Item(item) => match item.kind {
hir::ItemKind::Const(.., body_id) | hir::ItemKind::Static(.., body_id) => {
Some(self.tcx.hir_body(body_id).value)
41 changes: 29 additions & 12 deletions tests/ui/lint/invalid_from_utf8.rs
Original file line number Diff line number Diff line change
@@ -128,18 +128,21 @@ pub fn from_utf8() {
}

pub fn from_utf8_with_indirections() {
let mut a = [99, 108, 130, 105, 112, 112, 121];
std::str::from_utf8_mut(&mut a);
//~^ WARN calls to `std::str::from_utf8_mut`
str::from_utf8_mut(&mut a);
//~^ WARN calls to `str::from_utf8_mut`
let mut b = &mut a;
let mut c = b;
std::str::from_utf8_mut(c);
//~^ WARN calls to `std::str::from_utf8_mut`
str::from_utf8_mut(c);
//~^ WARN calls to `str::from_utf8_mut`
let mut c = &[99, 108, 130, 105, 112, 112, 121];
// NOTE: We used to lint on the patterns below, but due to the
// binding being mutable it could be changed between the
// declaration and the call and that would have created a
// false-positive, so until we can reliably avoid those false
// postive we don't lint on them. Example of FP below.
Comment on lines +134 to +135
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously, this is out of scope for this PR, but I'm curious why this PR is implemented as a HIR analysis instead of a MIR analysis? We have a dataflow framework for MIR that can answer these kinds of questions ("do we definitely know the value of this variable at this point in the program?") and it seems like it would be much simpler to write this as a MIR pass.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expr_or_init method came from clippy, same for invalid_from_utf8 which also came from clippy and both of them were using the HIR, so there is a bit a historical precedent.

As for implementing the lint as a MIR lint, it could probably be done, but I also remember MIR lints being more tricky to do, in part because MIR is much less like surface Rust than MIR is but also because the cargo check/build difference of MIR lints makes MIR lints less attractive; I don't know if it would apply here. I will have to check it.

//
// let mut a = [99, 108, 130, 105, 112, 112, 121];
// std::str::from_utf8_mut(&mut a);
// str::from_utf8_mut(&mut a);
// let mut b = &mut a;
// let mut c = b;
// std::str::from_utf8_mut(c);
// str::from_utf8_mut(c);

let c = &[99, 108, 130, 105, 112, 112, 121];
std::str::from_utf8(c);
//~^ WARN calls to `std::str::from_utf8`
str::from_utf8(c);
@@ -164,6 +167,20 @@ pub fn from_utf8_with_indirections() {
//~^ WARN calls to `std::str::from_utf8`
str::from_utf8(INVALID_4);
//~^ WARN calls to `str::from_utf8`

let mut a = [99, 108, 130, 105, 112, 112, 121]; // invalid
loop {
a = [99, 108, 130, 105, 112, 112, 121]; // still invalid, but too complex
break;
}
std::str::from_utf8_mut(&mut a);

let mut a = [99, 108, 130, 105, 112, 112]; // invalid
loop {
a = *b"clippy"; // valid
break;
}
std::str::from_utf8_mut(&mut a);
}

fn main() {}
65 changes: 15 additions & 50 deletions tests/ui/lint/invalid_from_utf8.stderr
Original file line number Diff line number Diff line change
@@ -202,68 +202,33 @@ LL | str::from_utf8(concat_bytes!(b"cl", b"\x82ippy"));
| |
| the literal was valid UTF-8 up to the 2 bytes

warning: calls to `std::str::from_utf8_mut` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:132:5
|
LL | let mut a = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | std::str::from_utf8_mut(&mut a);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8_mut` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:134:5
|
LL | let mut a = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
...
LL | str::from_utf8_mut(&mut a);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `std::str::from_utf8_mut` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:138:5
|
LL | let mut a = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
...
LL | std::str::from_utf8_mut(c);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8_mut` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:140:5
|
LL | let mut a = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
...
LL | str::from_utf8_mut(c);
| ^^^^^^^^^^^^^^^^^^^^^

warning: calls to `std::str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:143:5
--> $DIR/invalid_from_utf8.rs:146:5
|
LL | let mut c = &[99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | let c = &[99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | std::str::from_utf8(c);
| ^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:145:5
--> $DIR/invalid_from_utf8.rs:148:5
|
LL | let mut c = &[99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | let c = &[99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
...
LL | str::from_utf8(c);
| ^^^^^^^^^^^^^^^^^

warning: calls to `std::str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:148:5
--> $DIR/invalid_from_utf8.rs:151:5
|
LL | const INVALID_1: [u8; 7] = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | std::str::from_utf8(&INVALID_1);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:150:5
--> $DIR/invalid_from_utf8.rs:153:5
|
LL | const INVALID_1: [u8; 7] = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
@@ -272,15 +237,15 @@ LL | str::from_utf8(&INVALID_1);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `std::str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:153:5
--> $DIR/invalid_from_utf8.rs:156:5
|
LL | static INVALID_2: [u8; 7] = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | std::str::from_utf8(&INVALID_2);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:155:5
--> $DIR/invalid_from_utf8.rs:158:5
|
LL | static INVALID_2: [u8; 7] = [99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
@@ -289,15 +254,15 @@ LL | str::from_utf8(&INVALID_2);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `std::str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:158:5
--> $DIR/invalid_from_utf8.rs:161:5
|
LL | const INVALID_3: &'static [u8; 7] = &[99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | std::str::from_utf8(INVALID_3);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:160:5
--> $DIR/invalid_from_utf8.rs:163:5
|
LL | const INVALID_3: &'static [u8; 7] = &[99, 108, 130, 105, 112, 112, 121];
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
@@ -306,21 +271,21 @@ LL | str::from_utf8(INVALID_3);
| ^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `std::str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:163:5
--> $DIR/invalid_from_utf8.rs:166:5
|
LL | const INVALID_4: &'static [u8; 7] = { &[99, 108, 130, 105, 112, 112, 121] };
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
LL | std::str::from_utf8(INVALID_4);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: calls to `str::from_utf8` with an invalid literal always return an error
--> $DIR/invalid_from_utf8.rs:165:5
--> $DIR/invalid_from_utf8.rs:168:5
|
LL | const INVALID_4: &'static [u8; 7] = { &[99, 108, 130, 105, 112, 112, 121] };
| ---------------------------------- the literal was valid UTF-8 up to the 2 bytes
...
LL | str::from_utf8(INVALID_4);
| ^^^^^^^^^^^^^^^^^^^^^^^^^

warning: 38 warnings emitted
warning: 34 warnings emitted

Loading