Use generic dot kernels for WASM128_GENERIC #5689
Merged
martin-frbg merged 3 commits into OpenMathLib:develop on Mar 19, 2026
Collaborator
Great, thanks
Follow-up to #5685, #2867, and #4023.
This switches SDOTKERNEL and DDOTKERNEL for WASM128_GENERIC from the trivial riscv64 fallback to kernel/generic/dot.c, and adds a WASM SIMD widening path there for DSDOT.

In local direct WASM benchmarking with Emscripten/Node, contiguous inputs improved over the current baseline by about:
- sdot: 8.96x at n=1048576, 2.00x at n=2097152, and 1.72x at n=4194304
- ddot: 1.33x at n=1048576, 1.47x at n=2097152, and 1.14x at n=4194304
- dsdot: 1.14x at n=1048576, 1.83x at n=2097152, and 1.14x at n=4194304
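For readers unfamiliar with what a "widening path" means for DSDOT (single-precision inputs, double-precision accumulation), here is a minimal sketch. It is not the code added to kernel/generic/dot.c: the function name `dsdot_contiguous_sketch` is hypothetical, it handles only unit-stride inputs, and the SIMD branch assumes a build with Emscripten/clang and `-msimd128`. Without that flag it falls back to the scalar loop, so it also compiles natively.

```c
#include <stddef.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

/* Hypothetical helper for illustration only (not the OpenBLAS kernel):
 * dot product of two contiguous float arrays, accumulated in double. */
double dsdot_contiguous_sketch(size_t n, const float *x, const float *y)
{
    size_t i = 0;
    double dot = 0.0;

#if defined(__wasm_simd128__)
    /* Two double-precision accumulators, two lanes each. */
    v128_t acc_lo = wasm_f64x2_splat(0.0);
    v128_t acc_hi = wasm_f64x2_splat(0.0);

    for (; i + 4 <= n; i += 4) {
        v128_t vx = wasm_v128_load(x + i);
        v128_t vy = wasm_v128_load(y + i);

        /* Widen lanes 0-1 directly from float to double. */
        v128_t x_lo = wasm_f64x2_promote_low_f32x4(vx);
        v128_t y_lo = wasm_f64x2_promote_low_f32x4(vy);

        /* Move lanes 2-3 into the low half, then widen them too. */
        v128_t x_hi = wasm_f64x2_promote_low_f32x4(
            wasm_i32x4_shuffle(vx, vx, 2, 3, 2, 3));
        v128_t y_hi = wasm_f64x2_promote_low_f32x4(
            wasm_i32x4_shuffle(vy, vy, 2, 3, 2, 3));

        acc_lo = wasm_f64x2_add(acc_lo, wasm_f64x2_mul(x_lo, y_lo));
        acc_hi = wasm_f64x2_add(acc_hi, wasm_f64x2_mul(x_hi, y_hi));
    }

    /* Horizontal sum of the four partial results. */
    v128_t acc = wasm_f64x2_add(acc_lo, acc_hi);
    dot = wasm_f64x2_extract_lane(acc, 0) + wasm_f64x2_extract_lane(acc, 1);
#endif

    /* Scalar tail; this is the whole computation when SIMD is unavailable. */
    for (; i < n; i++)
        dot += (double)x[i] * (double)y[i];

    return dot;
}
```

Each 128-bit load brings in four floats, but a 128-bit register holds only two doubles, so every loaded vector is split into a low and a high pair before multiplying. The sketch uses two accumulators for that reason.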
I also tried a dedicated WASM-specific DSDOT prototype locally, but it did not show a clear overall benefit over the generic implementation, so this keeps the simpler generic path.