-
Notifications
You must be signed in to change notification settings - Fork 260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new Algorithms using explicit batch type #496
base: master
Are you sure you want to change the base?
Conversation
a2cc837
to
2be89a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! The implementation is really neat. I have some questions regarding the names of the functions as you can see below.
Regarding the failure in the tests, I think you have to include xsimd_fallback.hpp
in the cpp file, so that the compiler can find the default implementation when a type is not supported by the current instructions set.
include/xsimd/stl/algorithms.hpp
Outdated
template <class I1, class I2, class O1, class UF> | ||
void transform(I1 first, I2 last, O1 out_first, UF&& f) | ||
template <class I1, class I2, class O1, class UF, class UFB> | ||
void transform_batch(I1 first, I2 last, O1 out_first, UF&& f, UFB&& fb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keeping transform
as the function name?
5459d0a
to
983a7c4
Compare
test/test_algorithms.cpp
Outdated
@@ -216,6 +275,102 @@ TEST_F(xsimd_reduce, using_custom_binary_function) | |||
} | |||
} | |||
|
|||
TEST(algorithms, reduce_batch) | |||
{ | |||
const double nan = std::numeric_limits<double>::quiet_NaN(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ARM, vectorization for double
is available only on 64bits arch. Therefore, this test should be guarded with something like
#if XSIMD_ARM_INSTR_SET >= XSIMD_ARM8_64_NEON_VERSION || XSIMD_X86_INSTR_SET >= XSIMD_X86_SSE2_VERSION
include/xsimd/stl/algorithms.hpp
Outdated
using enable_if_increment = typename std::enable_if<has_increment<T>::value>::type; | ||
|
||
template <class T> | ||
using enable_if_not_increment = typename std::enable_if<!has_increment<T>::value>::type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to move these metafunctions in some detail
namespace, they're not supposed to be part of the API.
include/xsimd/stl/algorithms.hpp
Outdated
typename = enable_if_increment<I2>, | ||
typename = enable_if_increment<O1>, | ||
typename = enable_if_not_increment<UF>, | ||
typename = enable_if_not_increment<UFB>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more readable to gather these conditions so that you can use a single enable_if
condition. That could be something like:
template <class... Args>
struct have_increment : all_true<has_increment<Args>::value...> {};
template <class... Args>
struct not_have_increment : all_true<!has_increment<Args>::value...> {};
template <class I1, class I2, class I3, class UF, class UFB>
using enable_binary_algorithm_t = typename std::enable_if<have_increment<I1, I2, O1>::value && not_have_increment<UF, UFB>::value, int>::type;
Besides, default template parameters are not considered by the compiler for overload selection, so it's better to use the enable_if
as the template parameter evaluating to an int when it's valid:
template <class I1, class I2, class O1, class UF, class UFB,
enable_binary_algorithm_t<I1, I2, O1, UF, UFB> = 0>
...
983a7c4
to
0f40c17
Compare
0f40c17
to
8a81e7c
Compare
6c6dc1f
to
52984ef
Compare
New algorithms are added:
Tests are updated.
README.md updated.
Using the added algorithms I've create a
nanmean_fast
function where the benchmark are the faster respect other implementation:BM_nanmean
has a classic C style whileBM_nanmean_with_xtensor
is essentiallyxt::mean(xt::filter(e, !xt::isnan(e)))
wheree
is a tensorBM_nanmean_xt
is thext::nanmean
BM_nanmean_fast
use the addedreduce_batch
andcount_if
algorithms