Trivial build fix for SIMD_HAS_SUBSCRIPT_GATHER #6
Closed
prlw1 wants to merge 315 commits into GSI-HPC:main
Conversation
... to work without conversion to private base ChangeLog: * simd.h (basic_simd::basic_simd): Use _M_as_base() to convert to base class. (basic_simd::_M_as_base): New. * simd_mask.h (basic_simd_mask::basic_simd_mask): Call _Impl::_S_convert directly.
ChangeLog: * simd.h: Disable construction from a contiguous range.
ChangeLog: * simd.h: Use to_address(it) instead of addressof(*it)
ChangeLog: * simd.h (simd::operator<<): New. (simd::operator>>): New.
ChangeLog: * simd_reductions.h (reduce): New masked overloads. Constrain _BinaryOperation to avoid ambiguity with masked overload.
ChangeLog: * simd.h (simd::copy_to): Implement masked store without use of the std::experimental base types.
ChangeLog: * detail.h (__detail::__arithmetic): Moved to simd_abi.h. (__detail::__vectorizable): Likewise. * simd.h (simd::operator[]): Implement directly (without calling into _Base). Add special case for _AbiArray. * simd_abi.h (__detail::_AbiArray): New. (__detail::_SimdImplArray): New. (__detail::_MaskImplArray): New (stub). (__detail::_DeduceAbi): Prefer _AbiArray over _AbiCombine.
ChangeLog: * permute.h (simd_permute): Mark helper lambda as always_inline.
ChangeLog: * constexpr_tests.c++: Test __static_range_size. * detail.h (__detail::__static_range_size): Remove constraint, fix SFINAE, and handle C-arrays.
New shorthand __pv2 for std::experimental::parallelism_v2 and std::experimental::parallelism_v2::__proposed. ChangeLog: * Makefile: Fix check-skylake-avx512 target name. * constexpr_tests.c++: Add new tests. Replace __detail with __pv2 scope. * constexpr_wrapper.h: New file. Copied from vir-simd. Add literals. * detail.h: Move __arithmetic and __vectorizable from simd_abi.h. Add __pv2 qualification. * fwddecl.h: Define __pv2 namespace. Declare basic_simd and basic_simd_mask. Declare mask and simd reductions, simd_split, and simd_cat. * mask_reductions.h: Use __pv2 instead of __detail. * simd_split.h: Likewise. * simd_reductions.h: Likewise. Remove defaults that are now in fwddecl.h. * simd.h: Use __pv2 instead of __detail. Don't inherit stdx::simd anymore. (operator[]): Complete range check/assume. Add new case for array _M_data. (_M_is_constprop): Add case for _Impl::_S_is_constprop. * simd_abi.h: Use __pv2 instead of __detail. More complete _AbiArray and implementation. Constrain vectorizable template parameters. Pass arrays by const-ref. (_SimdImplArray::_S_masked_assign): New. (_SimdImplArray::_S_is_constprop): New. (_MaskImplArray): New. (_SimdTupleMeta): New. (_SimdTupleData): New. (_SimdTuple): New. (_SimdImplAbiCombine): New. (_MaskImplAbiCombine::_S_generator): New. * simd_mask.h: Use __pv2 instead of __detail. (operator[]): Complete range check/assume. Add new case for array _M_data.
ChangeLog: * Makefile: Add -fconcepts-diagnostics-depth=3. * constexpr_tests.c++: Test for random_access_range and not output_range. * simd.h (simd::begin, simd::end): New. * simd_iterator.h: New file. * simd_mask.h (simd_mask::begin, simd_mask::end): New.
ChangeLog: * constexpr_wrapper.h:
ChangeLog: * detail.h:
ChangeLog: * detail.h:
ChangeLog: * simd.h:
ChangeLog: * loads_and_stores/ce.cpp:
Returns an unspecified value if none_of(mask) is true. ChangeLog: * detail.h: * mask_reductions.h: * mask_reductions/ce.cpp: New file.
ChangeLog: * fwddecl.h: * mask_reductions.h: * mask_reductions/ce.cpp:
ChangeLog: * mask_reductions.h: * simd_abi.h:
ChangeLog: * simd.h:
Update constexpr_wrapper from vir-simd repo. ChangeLog: * constexpr_wrapper.h:
ChangeLog: * fwddecl.h (simd_alignment, simd_alignment_v): New. * simd.h (simd_alignment): Partial specializations for basic_simd and basic_simd_mask.
ChangeLog: * detail.h:
Copied and modified large parts of the vec-builtin and x86 implementation in <experimental/simd>. Branching on CPU features now uses conditions derived from template parameters, not globals. This should make it easier to adopt multi-arch / multi-veclen compilation at some point (once the compiler supports it). Reduced the number of template and lambda instantiations, to reduce compile time and space requirements. Notably, iterations on index spaces now use a single lambda instead of a function template invoking one lambda specialization per index. Also, removing the use of std::invoke has a *huge* impact. ChangeLog: * Makefile: Add more tests and compile them in many different configurations. * arm_detail.h: New file. * constexpr_tests.c++: Add another test for size 7. * constexpr_wrapper.h: Update copyright. * detail.h: Copy and adjust several functions from experimental/simd. Add _FloatingPointFlags, _MachineFlags, and _BuildFlags / __build_flags. Explicitly list the vectorizable types. Preliminary support for std::float(16|32|64)_t. * detail_bitmask.h: New file. * flags.h: Add TS-like load/store flags. * fwddecl.h: Remove TS usage. Add _VecAbi, _Avx512Abi, and _ScalarAbi. * interleave.h: Update copyright. * iota.h: Update copyright. Reduce lambda instantiations. * mask_reductions.h: Move x86 specific code into simd_x86.h. Call into Abi implementation types when available. * permute.h: Update copyright. Replace always_inline macro. * power_detail.h: New file. * simd.h: Add missing unary operators +, -, ~, ++, --. Constrain all operators on using the same operator on the value-type. * simd_abi.h: Implement ABI tag deduction. * simd_builtin.h: New file. * simd_config.h: New file. * simd_converter.h: New file. * simd_iterator.h: Update copyright. * simd_mask.h: Remove base class and implement ABI-specific / implementation-defined conversions. * simd_reductions.h: Move x86 optimization into simd_x86.h. * simd_scalar.h: New file.
* simd_split.h: Remove <experimental/simd> dependency. * simd_x86.h: New file. * tests/misc.cpp: New file. * tests/shift_left.cpp: New file. * tests/shift_right.cpp: New file. * unittest.h: New file. * x86_detail.h: Add _MachineFlags, __x86_builtin_int, __to_x86_intrin, and __movmsk.
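One way to read the "single lambda" remark in the commit above is sketched here. This is an assumption about the technique, not the library's actual code, and `doubled` is a hypothetical function. A C++20 lambda with an explicit template parameter list receives the whole index pack at once, so its call operator is instantiated once per size rather than once per index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical example: apply an operation to every element of an array.
template <std::size_t N>
constexpr std::array<int, N> doubled(const std::array<int, N>& in)
{
    // A single lambda whose operator() is instantiated once for this N;
    // the index pack Is... is expanded inside the lambda body instead of
    // calling a generic lambda once per integral_constant index.
    return [&]<std::size_t... Is>(std::index_sequence<Is...>) {
        return std::array<int, N>{(in[Is] * 2)...};
    }(std::make_index_sequence<N>{});
}
```

The per-index alternative would call `f(std::integral_constant<std::size_t, I>{})` for every `I`, which instantiates a separate specialization of the lambda's call operator for each index.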
ChangeLog: * Makefile: Generate all check targets and help targets without using a shell using the foreach and eval make functions. Call a submake in the check recipes with a randomized set of concrete checks to run.
Change _VecAbi and _Avx512Abi to use number of elements as template parameter instead of number of bytes. This simplifies _S_size, _S_full_size, and _S_is_partial, which don't need to be templates anymore. More importantly, it removes the need for passing the value-type as a template parameter to some of the Impl functions. Remove unused _SimdBase and _MaskBase. Have _SimdTraits depend on build flags, adding a new _SimdMaskTraits to work out the right ABI for AVX w/o AVX2. ChangeLog: * arm_detail.h: Include detail.h. * constexpr_tests.c++: Add sanity checks relating to AVX w/o AVX2. * detail.h (__make_dependent): New. (_SimdTraits): Add __build_flags template argument. (_SimdMaskTraits): New. * fwddecl.h (_VecAbi, _Avx512Abi): Change template parameter name. (__native_abi_impl_recursive): Adjust for change from bytes to width. (_DeduceAbi): Move default definition to simd_abi.h. * power_detail.h: Include detail.h. * simd.h (simd::mask_type): Test for void not vectorizable to document intent. (basic_simd(basic_simd_mask) deduction guide): Defer ABI tag deduction to __simd_abi_for_mask trait. * simd_abi.h: Adjust for ABI tag template parameter change from bytes to width. (_AbiCombine): Move from std::__detail to std namespace. (_SimdImplArray::_S_masked_assign): Handle some AVX w/o AVX2 cases. * simd_builtin.h: Adjust for ABI tag template parameter change from bytes to width. * simd_mask.h (__simd_abi_for_mask): New. * simd_scalar.h: Remove template heads from _S_size, _S_full_size, and _S_is_partial. Remove _SimdBase, _MaskBase. * simd_x86.h: Adjust for ABI tag template parameter change. (_SimdMaskTraits): Specialize for the AVX w/o AVX2 case. (_ImplBuiltin::_S_load): Overload for AVX512 bitmasks. (_ImplBuiltin::_S_select_bitmask): Swap argument names. Enable constexpr eval. (_ImplBuiltin::_S_select): Enable constexpr eval. Disambiguate overloads when _MaskMember<_TV> is a bitmask type. 
(_ImplBuiltin::_S_masked_assign): Broadcast scalar argument to a vector when calling _S_select_bitmask. (_ImplBuiltin::_S_bit_and, _S_bit_or, _S_bit_xor, _S_to_bits): Overload for bitmasks. (_ImplBuiltin::_S_bit_shift_right, _S_bit_shift_left): Replace several reinterpret_cast with __vec_bitcast_trunc on return. Adjust ABI/Impl type needed after template parameter change. Fix conditions for sizeof<16 inputs. (_ImplBuiltin::_S_ldexp): Use __make_dependent to instantiate _Rebind only on use of _S_ldexp. Start of supporting sizeof<16 inputs. * unittest.h (instantiate_tests_for_value_type): Sanity check that if simd<T, N> is usable, then the corresponding mask is also usable. * x86_detail.h (__x86_builtin_fp): New. (__to_x86_intrin): Normalize floating-point types using __x86_builtin_fp.
60k check targets: - 0.6s for 'make help', listing all 60k targets - 0.5s for 'make debug', parsing the whole Makefile and a bit of output All check targets are shuffled differently on every make invocation without significant overhead. The 'check' target works without sub-make, whereas all the other check-% targets recurse once (which might become a problem with too long command lines). Per -march, a header is generated, a PCH is built from it, and the header is automatically included into the builds. ChangeLog: * Makefile: Rewrite.
ChangeLog: * Makefile: Remove stale help/% target. Accommodate new unittest.h location. Remove -I. flag. Fix required check target. * tests/misc.cpp: Adjust unittest.h include. * tests/shift_left.cpp: Likewise. * tests/shift_right.cpp: Likewise. * unittest.h: Renamed to unittest.h.
ChangeLog: * Makefile: Require 0 failed tests or fail with non-zero exit status.
Use icerun for running tests to enable -j<large number>. ChangeLog: * Makefile: Run tests in icerun. Compile and link in one step if DIRECT is non-empty. Document DIRECT in help target. Unconditionally set DIRECT=1 without icecream. Build without icecream wrapper but with icerun when DIRECT is non-empty.
ChangeLog: * Makefile.common:
ChangeLog: * .github/workflows/CI.yml: Renamed to Clang.yml. * .github/workflows/GCC.yml: New file. * README.md:
ChangeLog: * Makefile.common: Remove std::byte from testtypes. * README.md: Document SIMD_STD_BYTE. * bits/fwddecl.h (__is_vectorizable): Enable specialization for std::byte. * bits/simd.h (basic_simd): The conversion constructor needs to check for a function-style cast rather than constructible_from for enums. (operator[]): Use static_cast rather than relying on implicit conversion (needed for std::byte). * bits/simd_config.h: Add SIMD_STD_BYTE. * bits/simd_meta.h (__loadstore_convertible_to): For std::byte this needs to check for explicit rather than implicit conversion. * bits/vec_detail.h (__canonical_vec_type): Add partial specialization for all enums, mapping to their underlying_type. * tests/arithmetic.cpp: Constrain tests. * tests/misc.cpp: Likewise. * tests/simd_alg.cpp: These tests need to work for std::byte. But to set up the test values, no arithmetic operators can be used anymore. * tests/unittest.h: Add std::byte. * tests/unittest_pch.h (test_iota_max): Reduce numeric_limits::max if Max NTTP is negative. Add partial specialization for enums. (test_iota): Pass negative Max NTTP on to test_iota_max.
ChangeLog: * bits/simd_x86.h (_SimdConverter): constrain to different _From and _To types.
... which was used for debugging the test. ChangeLog: * tests/shift_right.cpp (constant_p): Remove.
ChangeLog: * README.md:
ChangeLog: * README.md: Document iota, permute, and simd_mask <-> bitset/int paper implementation status. * bits/fwddecl.h (zero_element, uninit_element): New. * bits/permute.h (permute_zero): Remove. (permute): Refactor to call _S_permute. * bits/simd_builtin.h (_S_permute): New. * bits/simd_meta.h (__index_permutation_function_size): Change size argument type to int. * bits/simd_scalar.h (_S_permute): New.
ChangeLog: * bits/fwddecl.h: Move math exposition-only concepts and traits here from simd_math.h for declarations of math functions. (isfinite, isunordered): Declare. * bits/simd_math.h (isfinite, isunordered): New. * bits/mask_reductions.h (all_of, any_of, none_of): Call ABI implementation functions with _M_data member instead of basic_simd_mask object. * bits/simd_abi.h (_S_any_of, _S_all_of, _S_none_of): Change interface from basic_simd_mask to _MaskMember. * bits/simd_builtin.h (_S_any_of, _S_all_of, _S_none_of): Likewise. * bits/simd_scalar.h (_S_any_of, _S_all_of, _S_none_of): Likewise. * bits/simd_x86.h (_S_any_of, _S_all_of, _S_none_of): Likewise. (_S_divides): Use masked divp[hsd] with AVX10/AVX512 on partial vectors. (_S_isnan): Implement consteval/constprop variant for _Avx512Abi.
ChangeLog: * bits/fwddecl.h (__deduce_t): Use __canonical_vec_type to reduce the number of possible _DeduceAbi specializations. Add _PrefAbi argument. * bits/vec_detail.h (__canonical_vec_type): Moved to fwddecl.h. * bits/simd.h (rebind, resize): Determine new ABI tag using _Rebind on ingoing ABI tag. * bits/simd_abi.h (_AbiList::_S_A0_is_valid): Take _S_defer_to_scalar_abi into consideration. (_Rebind): New. (_AbiMaxSize): New. (__is_valid_preferred_abi): New. (_DeduceAbi): Add _PrefAbi argument. Change existing partial specializations to trigger on _NoAbiPreference. Add partial specialization for given _PrefAbi. * bits/simd_builtin.h (_S_defer_to_scalar_abi): New. (_Rebind): Simplify to __deduce_t with _VecAbi. * bits/simd_scalar.h (_S_defer_to_scalar_abi, _Rebind): New. * bits/simd_x86.h (_S_defer_to_scalar_abi): New. (_Rebind): Simplify to __deduce_t with _Avx512Abi. * constexpr_tests.c++: Add _AbiMaxSize and ABI deduction tests.
ChangeLog: * bits/simd.h: Add a reason to deleted functions. * bits/simd_mask.h: Likewise.
ChangeLog: * bits/simd_abi.h (_SimdTuple): Change integral_constants into plain constexpr ints.
Recently __has_single_bit, __bit_ceil, __bit_floor, etc. have started to require unsigned integers, rejecting int. Therefore, cast to unsigned where it was called with a signed int. Fixes GSI-HPCgh-1 ChangeLog: * bits/detail.h: * bits/mask_reductions.h: * bits/simd_abi.h: * bits/simd_builtin.h: * bits/simd_converter.h: * bits/simd_reductions.h: * bits/simd_x86.h: * bits/vec_detail.h (__glibcxx_simd_erroneous_unless): New. (__is_power2_minus_1): Moved from detail.h and rewritten to not use bit functions. (__signed_has_single_bit, __signed_bit_ceil): New. (__signed_bit_floor): New. * constexpr_tests.c++:
ChangeLog: * Makefile.common: Pass -p to mkdir.
ChangeLog: * tests/unittest_pch.h: Avoid wrap-around to negative from numeric_limits<long long>::max().
ChangeLog: * bits/simd.h: Use _GLIBCXX_DELETE_MSG instead of delete. * bits/simd_mask.h: Likewise.
This should make the warning go away for users that have warnings in system headers enabled. ChangeLog: * Makefile.common: Remove -Wno-psabi from CXXFLAGS. * bits/simd_x86.h: Ignore -Wpsabi diagnostic. * bits/vec_detail.h: Likewise.
ChangeLog: * .github/workflows/Clang.yml: Run on cplusplus-ci:latest. * .github/workflows/GCC.yml: Likewise. * .github/workflows/build-ci-docker.yml: New file. * Dockerfile: New file.
ChangeLog: * .github/workflows/Clang.yml: Adjust container image URL. * .github/workflows/GCC.yml: Likewise. * .github/workflows/build-ci-docker.yml: Removed. * Dockerfile: Removed.
ChangeLog: * .github/workflows/GCC.yml:
ChangeLog: * Makefile: * README.md:
ChangeLog: * Makefile.common: Determine obj directory from SIMD_OBJ_SUBST if defined.
ChangeLog: * Makefile:
ChangeLog: * bits/simd_x86.h (_S_divides): Add initial branch for consteval and constprop => never call __builtin_ia32_... in those cases.
ChangeLog: * bits/simd_builtin.h (_S_permute): Rewrite using integral_constants instead of plain int.
ChangeLog: * Makefile.common: Replace obj with $(objdir) * Makefile.more: Ditto. * Makefile: Ditto.
ChangeLog: * Makefile.common: * Makefile:
Collaborator
Thank you for that fix. At this point the main branch (and its derived branches) is unlikely to survive. I'm focusing on a different architecture (see 'rewrite' branch).
No description provided.