Add as_ascii_unchecked() methods to char, u8, and str #137432

Open
wants to merge 3 commits into
base: master

14 changes: 14 additions & 0 deletions library/core/src/char/methods.rs
@@ -1202,6 +1202,20 @@ impl char {
        }
    }

    /// Converts this char into an [ASCII character](`ascii::Char`), without
    /// checking whether it is valid.
    ///
    /// # Safety
    ///
    /// This char must be within the ASCII range, or else this is UB.
    #[must_use]
    #[unstable(feature = "ascii_char", issue = "110998")]
    #[inline]
    pub const unsafe fn as_ascii_unchecked(&self) -> ascii::Char {
        // SAFETY: the caller promised that this char is ASCII.
        unsafe { ascii::Char::from_u8_unchecked(*self as u8) }
Member


Hmm, the `as` here will mean that, as far as Miri and the backend can tell, `'\u{1234}'.as_ascii_unchecked()` is completely legal and fine, which seems wrong.

Probably should do something to avoid that and let Miri catch mistakes.
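
To illustrate the concern, here is a minimal sketch in stable Rust. The function names `truncating_cast` and `checked_cast` are hypothetical and not part of this PR. It shows that a `char as u8` cast keeps only the low 8 bits, so a non-ASCII `char` can silently turn into a byte that looks like valid ASCII before any unchecked constructor sees it.

```rust
/// Mirrors the cast in the diff: `char as u8` keeps only the low 8 bits
/// of the code point. (Hypothetical helper, for illustration only.)
fn truncating_cast(c: char) -> u8 {
    c as u8
}

/// The kind of range check a checked conversion performs before casting.
/// (Hypothetical helper, for illustration only.)
fn checked_cast(c: char) -> Option<u8> {
    if c.is_ascii() {
        Some(c as u8)
    } else {
        None
    }
}
```

With these helpers, `truncating_cast('\u{1234}')` yields `0x34` (the byte for `'4'`), which passes an ASCII check even though the original `char` was out of range, while `checked_cast('\u{1234}')` returns `None`.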

Author


Unfortunately, I am not particularly familiar with Miri, so I'm unsure which kinds of UB it can and cannot catch. Would you be able to give me some pointers on how to implement this feature in a way that Miri can diagnose?

Contributor


Add some `assert_unsafe_precondition!`s using `check_library_ub`. You want to ensure the input char/byte/string is within range (i.e., run the same validation that the checked counterparts do). I linked the `assert_unsafe_precondition` docs above, and you can also search around std for examples of it being used.
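
The shape of that suggestion can be sketched outside the standard library. This is only a rough analogue under stated assumptions: `assert_unsafe_precondition!` is internal to `core` and `ascii::Char` is unstable, so the sketch substitutes `debug_assert!` and a hypothetical `AsciiByte` newtype. The point it illustrates is the ordering: validate the original `char` before the truncating cast, using the same test the checked conversion uses.

```rust
/// Hypothetical stand-in for the unstable `core::ascii::Char`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AsciiByte(u8);

impl AsciiByte {
    /// Checked conversion: returns `None` for non-ASCII input.
    fn from_char(c: char) -> Option<Self> {
        if c.is_ascii() {
            Some(AsciiByte(c as u8))
        } else {
            None
        }
    }

    /// Unchecked conversion.
    ///
    /// # Safety
    ///
    /// `c` must be within the ASCII range.
    unsafe fn from_char_unchecked(c: char) -> Self {
        // Stand-in for `assert_unsafe_precondition!`: check the *original*
        // char, not the truncated byte, so an out-of-range input is caught
        // in builds with debug assertions enabled.
        debug_assert!(c.is_ascii(), "from_char_unchecked: char is not ASCII");
        AsciiByte(c as u8)
    }
}
```

For valid input the two paths agree, for example `unsafe { AsciiByte::from_char_unchecked('a') }` equals `AsciiByte::from_char('a').unwrap()`. For invalid input such as `'\u{1234}'`, the unchecked path panics when debug assertions are on instead of quietly producing `0x34`.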

    }

    /// Makes a copy of the value in its ASCII upper case equivalent.
    ///
    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
14 changes: 14 additions & 0 deletions library/core/src/num/mod.rs
@@ -482,6 +482,20 @@ impl u8 {
        ascii::Char::from_u8(*self)
    }

    /// Converts this byte to an [ASCII character](ascii::Char), without
    /// checking whether or not it's valid.
    ///
    /// # Safety
    ///
    /// This byte must be valid ASCII, or else this is UB.
    #[must_use]
    #[unstable(feature = "ascii_char", issue = "110998")]
    #[inline]
    pub const unsafe fn as_ascii_unchecked(&self) -> ascii::Char {
        // SAFETY: the caller promised that this byte is ASCII.
        unsafe { ascii::Char::from_u8_unchecked(*self) }
    }

    /// Makes a copy of the value in its ASCII upper case equivalent.
    ///
    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
15 changes: 15 additions & 0 deletions library/core/src/str/mod.rs
@@ -2633,6 +2633,21 @@ impl str {
        self.as_bytes().as_ascii()
    }

    /// Converts this string slice into a slice of [ASCII characters](ascii::Char),
    /// without checking whether they are valid.
    ///
    /// # Safety
    ///
    /// Every character in this string must be ASCII, or else this is UB.
    #[unstable(feature = "ascii_char", issue = "110998")]
    #[must_use]
    #[inline]
    pub const unsafe fn as_ascii_unchecked(&self) -> &[ascii::Char] {
        // SAFETY: the caller promised that every byte of this string slice
        // is ASCII.
        unsafe { self.as_bytes().as_ascii_unchecked() }
    }

    /// Checks that two strings are an ASCII case-insensitive match.
    ///
    /// Same as `to_ascii_lowercase(a) == to_ascii_lowercase(b)`,