Portable SIMD subtree update #138687

Merged
4 changes: 2 additions & 2 deletions library/portable-simd/beginners-guide.md
@@ -80,12 +80,12 @@ Most of the portable SIMD API is designed to allow the user to gloss over the de

Fortunately, most SIMD types have a fairly predictable size. `i32x4` is bit-equivalent to `[i32; 4]` and so can be bitcast to it, e.g. using [`mem::transmute`], though the API usually offers a safe cast you can use instead.

-However, this is not the same as alignment. Computer architectures generally prefer aligned accesses, especially when moving data between memory and vector registers, and while some support specialized operations that can bend the rules to help with this, unaligned access is still typically slow, or even undefined behavior. In addition, different architectures can require different alignments when interacting with their native SIMD types. For this reason, any `#[repr(simd)]` type has a non-portable alignment. If it is necessary to directly interact with the alignment of these types, it should be via [`mem::align_of`].
+However, this is not the same as alignment. Computer architectures generally prefer aligned accesses, especially when moving data between memory and vector registers, and while some support specialized operations that can bend the rules to help with this, unaligned access is still typically slow, or even undefined behavior. In addition, different architectures can require different alignments when interacting with their native SIMD types. For this reason, any `#[repr(simd)]` type has a non-portable alignment. If it is necessary to directly interact with the alignment of these types, it should be via [`align_of`].

When working with slices, data correctly aligned for SIMD can be acquired using the [`as_simd`] and [`as_simd_mut`] methods of the slice primitive.

[`mem::transmute`]: https://doc.rust-lang.org/core/mem/fn.transmute.html
-[`mem::align_of`]: https://doc.rust-lang.org/core/mem/fn.align_of.html
+[`align_of`]: https://doc.rust-lang.org/core/mem/fn.align_of.html
[`as_simd`]: https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.as_simd
[`as_simd_mut`]: https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.as_simd_mut
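
The size-versus-alignment distinction in this guide hunk can be checked on stable Rust. The sketch below is a stand-in, not the `core::simd` API: `Chunk` is a hypothetical wrapper around `[i32; 4]` with an assumed 16-byte alignment, and the stable `slice::align_to` plays the role that `as_simd` plays for real vectors.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a SIMD vector: same size as `[i32; 4]`,
/// but with a stricter (assumed) 16-byte alignment.
#[repr(C, align(16))]
struct Chunk([i32; 4]);

/// Sums a slice by splitting it into an unaligned head, an aligned body
/// of `Chunk`s, and a tail, then adding up all three parts.
fn sum_via_chunks(data: &[i32]) -> i32 {
    // SAFETY: every bit pattern of four `i32`s is a valid `Chunk`, and
    // `align_to` only hands out correctly aligned `Chunk` references.
    let (head, body, tail) = unsafe { data.align_to::<Chunk>() };
    let head_sum: i32 = head.iter().sum();
    let body_sum: i32 = body.iter().map(|c| c.0.iter().sum::<i32>()).sum();
    let tail_sum: i32 = tail.iter().sum();
    head_sum + body_sum + tail_sum
}

fn main() {
    // Same size, different alignment requirement.
    assert_eq!(size_of::<Chunk>(), size_of::<[i32; 4]>());
    assert!(align_of::<Chunk>() > align_of::<[i32; 4]>());

    let data: Vec<i32> = (0..19).collect();
    println!("sum = {}", sum_via_chunks(&data));
}
```

How many elements end up in the head and tail depends on where the allocation happens to start, which is why the result is computed from all three parts rather than from the body alone.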

2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "core_simd"
version = "0.1.0"
-edition = "2021"
+edition = "2024"
homepage = "https://github.com/rust-lang/portable-simd"
repository = "https://github.com/rust-lang/portable-simd"
keywords = ["core", "simd", "intrinsics"]
6 changes: 5 additions & 1 deletion library/portable-simd/crates/core_simd/src/lib.rs
@@ -35,7 +35,11 @@
feature(stdarch_x86_avx512)
)]
#![warn(missing_docs, clippy::missing_inline_in_public_items)] // basically all items, really
-#![deny(unsafe_op_in_unsafe_fn, clippy::undocumented_unsafe_blocks)]
+#![deny(
+unsafe_op_in_unsafe_fn,
+unreachable_pub,
+clippy::undocumented_unsafe_blocks
+)]
#![doc(test(attr(deny(warnings))))]
#![allow(internal_features)]
#![unstable(feature = "portable_simd", issue = "86656")]
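
The new `unreachable_pub` lint is what motivates the `pub` to `pub(crate)` changes in the mask modules further down. A minimal sketch of the pattern follows, with the lint applied to one module instead of the whole crate; `masks`, `Mask`, and `any_set` here are hypothetical names for illustration, not the crate's real items.

```rust
// The PR enables the lint crate-wide in lib.rs; this sketch scopes it to
// a single private module so the example stays self-contained.
#[deny(unreachable_pub)]
mod masks {
    /// Hypothetical internal mask type, never exported from the crate.
    pub(crate) struct Mask(u8);

    impl Mask {
        // `pub(crate)` states the real reach of the item. A bare `pub`
        // on an item that is not exported is what the lint reports.
        pub(crate) fn splat(value: bool) -> Self {
            Mask(if value { u8::MAX } else { 0 })
        }

        pub(crate) fn any(&self) -> bool {
            self.0 != 0
        }
    }
}

/// Crate-internal caller of the restricted items.
fn any_set(value: bool) -> bool {
    masks::Mask::splat(value).any()
}

fn main() {
    println!("{} {}", any_set(true), any_set(false));
}
```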
22 changes: 11 additions & 11 deletions library/portable-simd/crates/core_simd/src/masks/bitmask.rs
@@ -5,7 +5,7 @@ use core::marker::PhantomData;

/// A mask where each lane is represented by a single bit.
#[repr(transparent)]
-pub struct Mask<T, const N: usize>(
+pub(crate) struct Mask<T, const N: usize>(
<LaneCount<N> as SupportedLaneCount>::BitMask,
PhantomData<T>,
)
@@ -78,7 +78,7 @@ where
{
#[inline]
#[must_use = "method returns a new mask and does not mutate the original value"]
-pub fn splat(value: bool) -> Self {
+pub(crate) fn splat(value: bool) -> Self {
let mut mask = <LaneCount<N> as SupportedLaneCount>::BitMask::default();
if value {
mask.as_mut().fill(u8::MAX)
@@ -93,20 +93,20 @@ where

#[inline]
#[must_use = "method returns a new bool and does not mutate the original value"]
-pub unsafe fn test_unchecked(&self, lane: usize) -> bool {
+pub(crate) unsafe fn test_unchecked(&self, lane: usize) -> bool {
(self.0.as_ref()[lane / 8] >> (lane % 8)) & 0x1 > 0
}

#[inline]
-pub unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
+pub(crate) unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
unsafe {
self.0.as_mut()[lane / 8] ^= ((value ^ self.test_unchecked(lane)) as u8) << (lane % 8)
}
}

#[inline]
#[must_use = "method returns a new vector and does not mutate the original value"]
-pub fn to_int(self) -> Simd<T, N> {
+pub(crate) fn to_int(self) -> Simd<T, N> {
unsafe {
core::intrinsics::simd::simd_select_bitmask(
self.0,
@@ -118,19 +118,19 @@ where

#[inline]
#[must_use = "method returns a new mask and does not mutate the original value"]
-pub unsafe fn from_int_unchecked(value: Simd<T, N>) -> Self {
+pub(crate) unsafe fn from_int_unchecked(value: Simd<T, N>) -> Self {
unsafe { Self(core::intrinsics::simd::simd_bitmask(value), PhantomData) }
}

#[inline]
-pub fn to_bitmask_integer(self) -> u64 {
+pub(crate) fn to_bitmask_integer(self) -> u64 {
let mut bitmask = [0u8; 8];
bitmask[..self.0.as_ref().len()].copy_from_slice(self.0.as_ref());
u64::from_ne_bytes(bitmask)
}

#[inline]
-pub fn from_bitmask_integer(bitmask: u64) -> Self {
+pub(crate) fn from_bitmask_integer(bitmask: u64) -> Self {
let mut bytes = <LaneCount<N> as SupportedLaneCount>::BitMask::default();
let len = bytes.as_mut().len();
bytes
@@ -141,7 +141,7 @@ where

#[inline]
#[must_use = "method returns a new mask and does not mutate the original value"]
-pub fn convert<U>(self) -> Mask<U, N>
+pub(crate) fn convert<U>(self) -> Mask<U, N>
where
U: MaskElement,
{
@@ -151,13 +151,13 @@ where

#[inline]
#[must_use = "method returns a new bool and does not mutate the original value"]
-pub fn any(self) -> bool {
+pub(crate) fn any(self) -> bool {
self != Self::splat(false)
}

#[inline]
#[must_use = "method returns a new bool and does not mutate the original value"]
-pub fn all(self) -> bool {
+pub(crate) fn all(self) -> bool {
self == Self::splat(true)
}
}
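
The bit arithmetic in `test_unchecked`, `set_unchecked`, and `to_bitmask_integer` above can be modelled on a plain byte array. This is a sketch under stated assumptions: `BitMask` is a hypothetical type backed by `[u8; BYTES]` rather than the crate's `SupportedLaneCount::BitMask`, the methods are bounds-checked instead of `unsafe`, and `from_le_bytes` replaces the `from_ne_bytes` call in the diff so that the packed value is the same on every platform.

```rust
/// Hypothetical byte-backed mask: one bit per lane, lane 0 in bit 0 of byte 0.
#[derive(Clone, Copy)]
struct BitMask<const BYTES: usize>([u8; BYTES]);

impl<const BYTES: usize> BitMask<BYTES> {
    /// Sets every bit to `value`.
    fn splat(value: bool) -> Self {
        BitMask([if value { u8::MAX } else { 0 }; BYTES])
    }

    /// Reads the bit for `lane`; panics if `lane / 8 >= BYTES`.
    fn test(&self, lane: usize) -> bool {
        (self.0[lane / 8] >> (lane % 8)) & 0x1 > 0
    }

    /// Writes the bit for `lane`. The XOR flips the stored bit only when
    /// `value` differs from it, so no branch is needed.
    fn set(&mut self, lane: usize, value: bool) {
        self.0[lane / 8] ^= ((value ^ self.test(lane)) as u8) << (lane % 8);
    }

    /// Packs the bytes into the low end of a `u64`; requires `BYTES <= 8`.
    /// Assumption: little-endian packing, unlike the native-endian original.
    fn to_bitmask_integer(self) -> u64 {
        let mut bitmask = [0u8; 8];
        bitmask[..BYTES].copy_from_slice(&self.0);
        u64::from_le_bytes(bitmask)
    }
}

fn main() {
    let mut m = BitMask::<2>::splat(false);
    m.set(1, true);
    m.set(9, true);
    println!("{} {:#06x}", m.test(9), m.to_bitmask_integer());
}
```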
20 changes: 10 additions & 10 deletions library/portable-simd/crates/core_simd/src/masks/full_masks.rs
@@ -3,7 +3,7 @@
use crate::simd::{LaneCount, MaskElement, Simd, SupportedLaneCount};

#[repr(transparent)]
-pub struct Mask<T, const N: usize>(Simd<T, N>)
+pub(crate) struct Mask<T, const N: usize>(Simd<T, N>)
where
T: MaskElement,
LaneCount<N>: SupportedLaneCount;
@@ -80,7 +80,7 @@ macro_rules! impl_reverse_bits {
#[inline(always)]
fn reverse_bits(self, n: usize) -> Self {
let rev = <$int>::reverse_bits(self);
-let bitsize = core::mem::size_of::<$int>() * 8;
+let bitsize = size_of::<$int>() * 8;
if n < bitsize {
// Shift things back to the right
rev >> (bitsize - n)
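
The visible part of `reverse_bits` reverses the whole integer and then shifts the result back down so that only the low `n` bits are mirrored. A scalar sketch for `u8` follows; the function name is hypothetical, and the `else` branch is an assumption because the hunk ends before showing it.

```rust
use std::mem::size_of;

/// Mirrors the low `n` bits of `x` (assumes `1 <= n`).
fn reverse_low_bits(x: u8, n: usize) -> u8 {
    let rev = x.reverse_bits();
    let bitsize = size_of::<u8>() * 8;
    if n < bitsize {
        // Shift things back to the right, as in the diff.
        rev >> (bitsize - n)
    } else {
        // Assumption: with n covering the full width, no shift is needed.
        rev
    }
}

fn main() {
    println!("{:#06b}", reverse_low_bits(0b0011, 4));
}
```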
@@ -102,36 +102,36 @@ where
{
#[inline]
#[must_use = "method returns a new mask and does not mutate the original value"]
-pub fn splat(value: bool) -> Self {
+pub(crate) fn splat(value: bool) -> Self {
Self(Simd::splat(if value { T::TRUE } else { T::FALSE }))
}

#[inline]
#[must_use = "method returns a new bool and does not mutate the original value"]
-pub unsafe fn test_unchecked(&self, lane: usize) -> bool {
+pub(crate) unsafe fn test_unchecked(&self, lane: usize) -> bool {
T::eq(self.0[lane], T::TRUE)
}

#[inline]
-pub unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
+pub(crate) unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
self.0[lane] = if value { T::TRUE } else { T::FALSE }
}

#[inline]
#[must_use = "method returns a new vector and does not mutate the original value"]
-pub fn to_int(self) -> Simd<T, N> {
+pub(crate) fn to_int(self) -> Simd<T, N> {
self.0
}

#[inline]
#[must_use = "method returns a new mask and does not mutate the original value"]
-pub unsafe fn from_int_unchecked(value: Simd<T, N>) -> Self {
+pub(crate) unsafe fn from_int_unchecked(value: Simd<T, N>) -> Self {
Self(value)
}

#[inline]
#[must_use = "method returns a new mask and does not mutate the original value"]
-pub fn convert<U>(self) -> Mask<U, N>
+pub(crate) fn convert<U>(self) -> Mask<U, N>
where
U: MaskElement,
{
@@ -220,14 +220,14 @@ where

#[inline]
#[must_use = "method returns a new bool and does not mutate the original value"]
-pub fn any(self) -> bool {
+pub(crate) fn any(self) -> bool {
// Safety: use `self` as an integer vector
unsafe { core::intrinsics::simd::simd_reduce_any(self.to_int()) }
}

#[inline]
#[must_use = "method returns a new bool and does not mutate the original value"]
-pub fn all(self) -> bool {
+pub(crate) fn all(self) -> bool {
// Safety: use `self` as an integer vector
unsafe { core::intrinsics::simd::simd_reduce_all(self.to_int()) }
}
2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/src/ops.rs
@@ -1,4 +1,4 @@
-use crate::simd::{cmp::SimdPartialEq, LaneCount, Simd, SimdElement, SupportedLaneCount};
+use crate::simd::{LaneCount, Simd, SimdElement, SupportedLaneCount, cmp::SimdPartialEq};
use core::ops::{Add, Mul};
use core::ops::{BitAnd, BitOr, BitXor};
use core::ops::{Div, Rem, Sub};
2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/src/simd/cmp/eq.rs
@@ -1,6 +1,6 @@
use crate::simd::{
-ptr::{SimdConstPtr, SimdMutPtr},
LaneCount, Mask, Simd, SimdElement, SupportedLaneCount,
+ptr::{SimdConstPtr, SimdMutPtr},
};

/// Parallel `PartialEq`.
2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/src/simd/cmp/ord.rs
@@ -1,7 +1,7 @@
use crate::simd::{
+LaneCount, Mask, Simd, SupportedLaneCount,
cmp::SimdPartialEq,
ptr::{SimdConstPtr, SimdMutPtr},
-LaneCount, Mask, Simd, SupportedLaneCount,
};

/// Parallel `PartialOrd`.
9 changes: 5 additions & 4 deletions library/portable-simd/crates/core_simd/src/simd/num/float.rs
@@ -1,7 +1,7 @@
use super::sealed::Sealed;
use crate::simd::{
-cmp::{SimdPartialEq, SimdPartialOrd},
LaneCount, Mask, Simd, SimdCast, SimdElement, SupportedLaneCount,
+cmp::{SimdPartialEq, SimdPartialOrd},
};

/// Operations on SIMD vectors of floats.
@@ -263,7 +263,8 @@ macro_rules! impl_trait {
unsafe { core::intrinsics::simd::simd_as(self) }
}

-// https://github.com/llvm/llvm-project/issues/94694
+// workaround for https://github.com/llvm/llvm-project/issues/94694 (fixed in LLVM 20)
+// tracked in: https://github.com/rust-lang/rust/issues/135982
#[cfg(target_arch = "aarch64")]
#[inline]
fn cast<T: SimdCast>(self) -> Self::Cast<T>
@@ -302,14 +303,14 @@ macro_rules! impl_trait {

#[inline]
fn to_bits(self) -> Simd<$bits_ty, N> {
-assert_eq!(core::mem::size_of::<Self>(), core::mem::size_of::<Self::Bits>());
+assert_eq!(size_of::<Self>(), size_of::<Self::Bits>());
// Safety: transmuting between vector types is safe
unsafe { core::mem::transmute_copy(&self) }
}

#[inline]
fn from_bits(bits: Simd<$bits_ty, N>) -> Self {
-assert_eq!(core::mem::size_of::<Self>(), core::mem::size_of::<Self::Bits>());
+assert_eq!(size_of::<Self>(), size_of::<Self::Bits>());
// Safety: transmuting between vector types is safe
unsafe { core::mem::transmute_copy(&bits) }
}
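
The `to_bits` / `from_bits` pair above is a size check followed by `transmute_copy`. The same pattern can be tried on stable Rust with plain arrays, which is what the sketch below does; `[f32; 4]` and `[u32; 4]` are assumed stand-ins for `Simd<f32, 4>` and `Simd<u32, 4>`, and the free functions are hypothetical.

```rust
use std::mem::{size_of, transmute_copy};

/// Reinterprets four floats as their raw IEEE 754 bit patterns.
fn to_bits(v: [f32; 4]) -> [u32; 4] {
    assert_eq!(size_of::<[f32; 4]>(), size_of::<[u32; 4]>());
    // SAFETY: both types have the same size, and every bit pattern is a
    // valid `u32`.
    unsafe { transmute_copy(&v) }
}

/// Inverse of `to_bits`.
fn from_bits(bits: [u32; 4]) -> [f32; 4] {
    assert_eq!(size_of::<[f32; 4]>(), size_of::<[u32; 4]>());
    // SAFETY: both types have the same size, and every bit pattern is a
    // valid `f32` (possibly a NaN).
    unsafe { transmute_copy(&bits) }
}

fn main() {
    let v = [1.0_f32, -0.0, 0.0, 2.0];
    println!("{:x?}", to_bits(v));
    println!("{:?}", from_bits(to_bits(v)));
}
```

The explicit `assert_eq!` matters because `transmute_copy`, unlike `transmute`, does not check at compile time that the two sizes match.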
4 changes: 2 additions & 2 deletions library/portable-simd/crates/core_simd/src/simd/num/int.rs
@@ -1,7 +1,7 @@
use super::sealed::Sealed;
use crate::simd::{
-cmp::SimdOrd, cmp::SimdPartialOrd, num::SimdUint, LaneCount, Mask, Simd, SimdCast, SimdElement,
-SupportedLaneCount,
+LaneCount, Mask, Simd, SimdCast, SimdElement, SupportedLaneCount, cmp::SimdOrd,
+cmp::SimdPartialOrd, num::SimdUint,
};

/// Operations on SIMD vectors of signed integers.
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
use super::sealed::Sealed;
-use crate::simd::{cmp::SimdOrd, LaneCount, Simd, SimdCast, SimdElement, SupportedLaneCount};
+use crate::simd::{LaneCount, Simd, SimdCast, SimdElement, SupportedLaneCount, cmp::SimdOrd};

/// Operations on SIMD vectors of unsigned integers.
pub trait SimdUint: Copy + Sealed {
3 changes: 2 additions & 1 deletion library/portable-simd/crates/core_simd/src/simd/prelude.rs
@@ -7,10 +7,11 @@

#[doc(no_inline)]
pub use super::{
+Mask, Simd,
cmp::{SimdOrd, SimdPartialEq, SimdPartialOrd},
num::{SimdFloat, SimdInt, SimdUint},
ptr::{SimdConstPtr, SimdMutPtr},
-simd_swizzle, Mask, Simd,
+simd_swizzle,
};

#[rustfmt::skip]
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
use super::sealed::Sealed;
-use crate::simd::{cmp::SimdPartialEq, num::SimdUint, LaneCount, Mask, Simd, SupportedLaneCount};
+use crate::simd::{LaneCount, Mask, Simd, SupportedLaneCount, cmp::SimdPartialEq, num::SimdUint};

/// Operations on SIMD vectors of constant pointers.
pub trait SimdConstPtr: Copy + Sealed {
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
use super::sealed::Sealed;
-use crate::simd::{cmp::SimdPartialEq, num::SimdUint, LaneCount, Mask, Simd, SupportedLaneCount};
+use crate::simd::{LaneCount, Mask, Simd, SupportedLaneCount, cmp::SimdPartialEq, num::SimdUint};

/// Operations on SIMD vectors of mutable pointers.
pub trait SimdMutPtr: Copy + Sealed {
44 changes: 44 additions & 0 deletions library/portable-simd/crates/core_simd/src/swizzle.rs
@@ -214,6 +214,17 @@ where
/// Rotates the vector such that the first `OFFSET` elements of the slice move to the end
/// while the last `self.len() - OFFSET` elements move to the front. After calling `rotate_elements_left`,
/// the element previously at index `OFFSET` will become the first element in the slice.
+/// ```
+/// # #![feature(portable_simd)]
+/// # #[cfg(feature = "as_crate")] use core_simd::simd::Simd;
+/// # #[cfg(not(feature = "as_crate"))] use core::simd::Simd;
+/// let a = Simd::from_array([0, 1, 2, 3]);
+/// let x = a.rotate_elements_left::<3>();
+/// assert_eq!(x.to_array(), [3, 0, 1, 2]);
+///
+/// let y = a.rotate_elements_left::<7>();
+/// assert_eq!(y.to_array(), [3, 0, 1, 2]);
+/// ```
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn rotate_elements_left<const OFFSET: usize>(self) -> Self {
@@ -238,6 +249,17 @@ where
/// Rotates the vector such that the first `self.len() - OFFSET` elements of the vector move to
/// the end while the last `OFFSET` elements move to the front. After calling `rotate_elements_right`,
/// the element previously at index `self.len() - OFFSET` will become the first element in the slice.
+/// ```
+/// # #![feature(portable_simd)]
+/// # #[cfg(feature = "as_crate")] use core_simd::simd::Simd;
+/// # #[cfg(not(feature = "as_crate"))] use core::simd::Simd;
+/// let a = Simd::from_array([0, 1, 2, 3]);
+/// let x = a.rotate_elements_right::<3>();
+/// assert_eq!(x.to_array(), [1, 2, 3, 0]);
+///
+/// let y = a.rotate_elements_right::<7>();
+/// assert_eq!(y.to_array(), [1, 2, 3, 0]);
+/// ```
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn rotate_elements_right<const OFFSET: usize>(self) -> Self {
@@ -261,6 +283,17 @@ where

/// Shifts the vector elements to the left by `OFFSET`, filling in with
/// `padding` from the right.
+/// ```
+/// # #![feature(portable_simd)]
+/// # #[cfg(feature = "as_crate")] use core_simd::simd::Simd;
+/// # #[cfg(not(feature = "as_crate"))] use core::simd::Simd;
+/// let a = Simd::from_array([0, 1, 2, 3]);
+/// let x = a.shift_elements_left::<3>(255);
+/// assert_eq!(x.to_array(), [3, 255, 255, 255]);
+///
+/// let y = a.shift_elements_left::<7>(255);
+/// assert_eq!(y.to_array(), [255, 255, 255, 255]);
+/// ```
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn shift_elements_left<const OFFSET: usize>(self, padding: T) -> Self {
@@ -283,6 +316,17 @@ where

/// Shifts the vector elements to the right by `OFFSET`, filling in with
/// `padding` from the left.
+/// ```
+/// # #![feature(portable_simd)]
+/// # #[cfg(feature = "as_crate")] use core_simd::simd::Simd;
+/// # #[cfg(not(feature = "as_crate"))] use core::simd::Simd;
+/// let a = Simd::from_array([0, 1, 2, 3]);
+/// let x = a.shift_elements_right::<3>(255);
+/// assert_eq!(x.to_array(), [255, 255, 255, 0]);
+///
+/// let y = a.shift_elements_right::<7>(255);
+/// assert_eq!(y.to_array(), [255, 255, 255, 255]);
+/// ```
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn shift_elements_right<const OFFSET: usize>(self, padding: T) -> Self {
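
The four doctests added in this file pin down two behaviours: rotation offsets wrap modulo the lane count, while shift offsets at or beyond the lane count leave only padding. A scalar reference model on plain arrays makes that explicit. It is a sketch, not the crate's implementation: the free functions are hypothetical, take a runtime `offset` instead of the `OFFSET` const generic, and assume `N > 0`.

```rust
use std::array;

/// Lane `i` of the result comes from lane `i + offset`, wrapping modulo `N`.
fn rotate_left<const N: usize>(a: [i32; N], offset: usize) -> [i32; N] {
    array::from_fn(|i| a[(i + offset) % N])
}

/// Lane `i` of the result comes from lane `i - offset`, wrapping modulo `N`.
fn rotate_right<const N: usize>(a: [i32; N], offset: usize) -> [i32; N] {
    array::from_fn(|i| a[(i + N - offset % N) % N])
}

/// Like `rotate_left`, but lanes that would wrap are filled with `padding`.
fn shift_left<const N: usize>(a: [i32; N], offset: usize, padding: i32) -> [i32; N] {
    array::from_fn(|i| if i + offset < N { a[i + offset] } else { padding })
}

/// Like `rotate_right`, but lanes that would wrap are filled with `padding`.
fn shift_right<const N: usize>(a: [i32; N], offset: usize, padding: i32) -> [i32; N] {
    array::from_fn(|i| if i >= offset { a[i - offset] } else { padding })
}

fn main() {
    let a = [0, 1, 2, 3];
    println!("{:?}", rotate_left(a, 7));
    println!("{:?}", rotate_right(a, 7));
    println!("{:?}", shift_left(a, 3, 255));
    println!("{:?}", shift_right(a, 3, 255));
}
```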
2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/src/to_bytes.rs
@@ -1,6 +1,6 @@
use crate::simd::{
-num::{SimdFloat, SimdInt, SimdUint},
LaneCount, Simd, SimdElement, SupportedLaneCount,
+num::{SimdFloat, SimdInt, SimdUint},
};

mod sealed {
4 changes: 2 additions & 2 deletions library/portable-simd/crates/core_simd/src/vector.rs
@@ -1,8 +1,8 @@
use crate::simd::{
+LaneCount, Mask, MaskElement, SupportedLaneCount, Swizzle,
cmp::SimdPartialOrd,
num::SimdUint,
ptr::{SimdConstPtr, SimdMutPtr},
-LaneCount, Mask, MaskElement, SupportedLaneCount, Swizzle,
};

/// A SIMD vector with the shape of `[T; N]` but the operations of `T`.
@@ -83,7 +83,7 @@ use crate::simd::{
/// converting `[T]` to `[Simd<T, N>]`, and allows soundly operating on an aligned SIMD body,
/// but it may cost more time when handling the scalar head and tail.
/// If these are not enough, it is most ideal to design data structures to be already aligned
-/// to `mem::align_of::<Simd<T, N>>()` before using `unsafe` Rust to read or write.
+/// to `align_of::<Simd<T, N>>()` before using `unsafe` Rust to read or write.
/// Other ways to compensate for these facts, like materializing `Simd` to or from an array first,
/// are handled by safe methods like [`Simd::from_array`] and [`Simd::from_slice`].
///
4 changes: 2 additions & 2 deletions library/portable-simd/crates/core_simd/tests/layout.rs
@@ -7,8 +7,8 @@ macro_rules! layout_tests {
test_helpers::test_lanes! {
fn no_padding<const LANES: usize>() {
assert_eq!(
-core::mem::size_of::<core_simd::simd::Simd::<$ty, LANES>>(),
-core::mem::size_of::<[$ty; LANES]>(),
+size_of::<core_simd::simd::Simd::<$ty, LANES>>(),
+size_of::<[$ty; LANES]>(),
);
}
}
2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/tests/pointers.rs
@@ -1,8 +1,8 @@
#![feature(portable_simd)]

use core_simd::simd::{
-ptr::{SimdConstPtr, SimdMutPtr},
Simd,
+ptr::{SimdConstPtr, SimdMutPtr},
};

macro_rules! common_tests {
2 changes: 1 addition & 1 deletion library/portable-simd/crates/core_simd/tests/round.rs
@@ -58,7 +58,7 @@ macro_rules! float_rounding_test {
// all of the mantissa digits set to 1, pushed up to the MSB.
const ALL_MANTISSA_BITS: IntScalar = ((1 << <Scalar>::MANTISSA_DIGITS) - 1);
const MAX_REPRESENTABLE_VALUE: Scalar =
-(ALL_MANTISSA_BITS << (core::mem::size_of::<Scalar>() * 8 - <Scalar>::MANTISSA_DIGITS as usize - 1)) as Scalar;
+(ALL_MANTISSA_BITS << (size_of::<Scalar>() * 8 - <Scalar>::MANTISSA_DIGITS as usize - 1)) as Scalar;

let mut runner = test_helpers::make_runner();
runner.run(
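
To see what the `MAX_REPRESENTABLE_VALUE` expression evaluates to, the two constants can be instantiated by hand. The sketch assumes `Scalar = f32` and `IntScalar = i32`, a pairing the test macro is presumed to use but that this hunk does not show.

```rust
use std::mem::size_of;

// All 24 significand bits of an f32 set to 1: 2^24 - 1.
const ALL_MANTISSA_BITS: i32 = (1 << f32::MANTISSA_DIGITS) - 1;

// Push those bits up against the sign bit: shift by 32 - 24 - 1 = 7.
const MAX_REPRESENTABLE_VALUE: f32 =
    (ALL_MANTISSA_BITS << (size_of::<f32>() * 8 - f32::MANTISSA_DIGITS as usize - 1)) as f32;

fn main() {
    // (2^24 - 1) * 2^7 = 2^31 - 128, which is below i32::MAX.
    println!("{} {}", ALL_MANTISSA_BITS, MAX_REPRESENTABLE_VALUE);
}
```

Under these assumptions the constant is the largest `i32` value that converts to `f32` and back without rounding, which is why it makes a useful upper bound for the float-to-int rounding tests.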
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ macro_rules! impl_float {
$(
impl FlushSubnormals for $ty {
fn flush(self) -> Self {
-let is_f32 = core::mem::size_of::<Self>() == 4;
+let is_f32 = size_of::<Self>() == 4;
let ppc_flush = is_f32 && cfg!(all(
any(target_arch = "powerpc", all(target_arch = "powerpc64", target_endian = "big")),
target_feature = "altivec",