[X86] mulhu + srl pattern not recognized #132166

Open
nikic opened this issue Mar 20, 2025 · 2 comments · May be fixed by #132548

Comments

@nikic
Contributor

nikic commented Mar 20, 2025

https://llvm.godbolt.org/z/9Khd9Gdhh

define <8 x i16> @mul_and_shift16(<8 x i16> %a, <8 x i16> %b) {
  %a.ext = zext <8 x i16> %a to <8 x i32>
  %b.ext = zext <8 x i16> %b to <8 x i32>
  %mul = mul <8 x i32> %a.ext, %b.ext
  %shift = lshr <8 x i32> %mul, splat(i32 16)
  %trunc = trunc <8 x i32> %shift to <8 x i16>
  ret <8 x i16> %trunc
}

define <8 x i16> @mul_and_shift17(<8 x i16> %a, <8 x i16> %b) {
  %a.ext = zext <8 x i16> %a to <8 x i32>
  %b.ext = zext <8 x i16> %b to <8 x i32>
  %mul = mul <8 x i32> %a.ext, %b.ext
  %shift = lshr <8 x i32> %mul, splat(i32 17)
  %trunc = trunc <8 x i32> %shift to <8 x i16>
  ret <8 x i16> %trunc
}

Results in:

mul_and_shift16:                        # @mul_and_shift16
        pmulhuw xmm0, xmm1
        ret
mul_and_shift17:                        # @mul_and_shift17
        pmulhuw xmm0, xmm1
        punpcklwd       xmm1, xmm0              # xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1],xmm1[2],xmm0[2],xmm1[3],xmm0[3]
        punpckhwd       xmm0, xmm0              # xmm0 = xmm0[4,4,5,5,6,6,7,7]
        psrld   xmm0, 17
        psrld   xmm1, 17
        packssdw        xmm1, xmm0
        movdqa  xmm0, xmm1
        ret

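The second one could be just pmulhuw + psrlw by 1: shifting the 32-bit product right by 17 and truncating gives the same lanes as taking the high 16 bits and shifting them right by 1.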
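To make the claimed equivalence concrete, here is a minimal scalar sketch that models one 16-bit lane of each form and compares them on edge values and random samples. The function names are hypothetical and exist only for illustration; this is a model of the arithmetic, not of the backend's actual lowering.

```python
import random

MASK16 = 0xFFFF


def mul_and_shift17(a: int, b: int) -> int:
    """One lane of the IR above: zext to i32, mul, lshr 17, trunc to i16."""
    return ((a * b) >> 17) & MASK16


def pmulhuw_then_psrlw(a: int, b: int) -> int:
    """One lane of the proposed sequence: pmulhuw, then psrlw by 1."""
    high = ((a * b) >> 16) & MASK16  # pmulhuw keeps the high 16 bits of the unsigned product
    return high >> 1                 # psrlw 1 is a logical shift within the 16-bit lane


def lanes_agree(samples) -> bool:
    """True if both models produce the same lane value for every (a, b) pair."""
    return all(mul_and_shift17(a, b) == pmulhuw_then_psrlw(a, b) for a, b in samples)


# Edge values plus a fixed-seed random sample (not an exhaustive proof).
edge = [0, 1, 2, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF]
rng = random.Random(0)
samples = [(a, b) for a in edge for b in edge]
samples += [(rng.randrange(1 << 16), rng.randrange(1 << 16)) for _ in range(10_000)]

print(lanes_agree(samples))
```

The two forms agree because the product of two zero-extended 16-bit values fits in 32 bits, so the bits that survive the shift by 17 and the truncation all come from the high half that pmulhuw already returns.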

@llvmbot
Member

llvmbot commented Mar 20, 2025

@llvm/issue-subscribers-backend-x86

Author: Nikita Popov (nikic)


@RKSimon
Collaborator

RKSimon commented Mar 21, 2025

@abhishek-kaushik22 if you're looking at this - combineShiftToPMULH just needs adjusting to look for ShiftAmt >= 16 (and maybe some suitable known/signbits checks).
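As a rough illustration of the ShiftAmt >= 16 idea for the unsigned case, the sketch below checks, per 16-bit lane, that shifting the 32-bit product by any amount from 16 to 31 and truncating matches the high-half multiply followed by a shift of ShiftAmt - 16. All names here are hypothetical, the check is a sampled scalar model rather than the code in combineShiftToPMULH, and it does not model the known-bits or sign-bits checks mentioned above.

```python
import random

MASK16 = 0xFFFF


def wide_form(a: int, b: int, shift_amt: int) -> int:
    """zext to i32, mul, lshr by shift_amt, trunc to i16 (one lane)."""
    return ((a * b) >> shift_amt) & MASK16


def narrow_form(a: int, b: int, shift_amt: int) -> int:
    """High 16 bits of the unsigned product, then a 16-bit logical shift."""
    high = ((a * b) >> 16) & MASK16
    return high >> (shift_amt - 16)


def first_mismatch(trials: int = 2000, seed: int = 0):
    """Return the first (a, b, shift_amt) where the forms differ, or None."""
    rng = random.Random(seed)
    # Only 16..31 is modelled: lshr on i32 by 32 or more is poison in LLVM IR.
    for shift_amt in range(16, 32):
        for _ in range(trials):
            a = rng.randrange(1 << 16)
            b = rng.randrange(1 << 16)
            if wide_form(a, b, shift_amt) != narrow_form(a, b, shift_amt):
                return (a, b, shift_amt)
    return None


print(first_mismatch())
```

A shift amount of exactly 16 reduces to the existing pmulhuw-only case, since the trailing shift is then by zero.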

abhishek-kaushik22 added a commit to abhishek-kaushik22/llvm-project that referenced this issue Mar 22, 2025