Bryce Allen
b61b9c8f6c
gt: link gtl on cray amd systems, more iters
Option to disable managed tests, TEST_MANAGED=OFF cmake var
3 years ago
Bryce Allen
04652c0059
gt: add MPI_Allreduce test
3 years ago
Bryce Allen
b00d52af2a
gt: fix dim1 nobuff
3 years ago
Bryce Allen
a885062c83
gt: mpi sum err and time, enable all
3 years ago
Bryce Allen
84deab6ced
gt: don't use circular exchange for y deriv
3 years ago
Bryce Allen
7967991bd9
cmake: updates for gt stencil2d
3 years ago
Bryce Allen
02ab50ab60
gt: WIP y (outer) dim test
3 years ago
Bryce Allen
5bcf1382ba
gt: parameterize space
3 years ago
Bryce Allen
4f82b8359b
sycl: drop cl namespace, debug dev ids
3 years ago
Bryce Allen
cecb98415e
cmake: use main gtensor
necessary pr has been merged now
3 years ago
Bryce Allen
3e13fee811
fix sycl oo version, better debug
3 years ago
Bryce Allen
d960429a83
add new sycl version using span2d class
3 years ago
Bryce Allen
8e43edc21e
use milliseconds for timing
3 years ago
Bryce Allen
44d72ad2bb
switch exchange/stencil dim on sycl example
3 years ago
Bryce Allen
1d779fd510
update comment
3 years ago
Bryce Allen
b2ed53adc1
switch boundary exchange / stencil direction
Contiguous staging vectors are required for multi-d exchange
when a dimension other than the outermost is exchanged. The previous
version exchanged y, the outermost dimension, where the
data was already contiguous.
3 years ago
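The staging requirement described in commit b2ed53adc1 can be illustrated with a minimal host-only sketch. It assumes a row-major array of shape (ny, nx) in which y is the outermost dimension and x the innermost; the helper names `pack_x_boundary` and `unpack_x_boundary` are hypothetical and do not come from the repository. Boundary columns in x are strided by nx in memory, so they are copied into a contiguous vector that could then be handed to a send call as a single buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): gather n_bnd boundary
// columns, starting at column x_start, of a row-major (ny x nx) array
// into a contiguous staging vector. Consecutive rows of the boundary are
// nx elements apart in `a`, so the source data is not contiguous.
std::vector<double> pack_x_boundary(const std::vector<double>& a,
                                    std::size_t ny, std::size_t nx,
                                    std::size_t x_start, std::size_t n_bnd)
{
  std::vector<double> staging(ny * n_bnd);
  for (std::size_t j = 0; j < ny; ++j) {
    for (std::size_t b = 0; b < n_bnd; ++b) {
      staging[j * n_bnd + b] = a[j * nx + x_start + b];
    }
  }
  return staging;
}

// Hypothetical inverse: scatter a contiguous staging vector (for example,
// one filled by a receive) into n_bnd ghost columns starting at x_start.
void unpack_x_boundary(std::vector<double>& a,
                       const std::vector<double>& staging,
                       std::size_t ny, std::size_t nx,
                       std::size_t x_start, std::size_t n_bnd)
{
  for (std::size_t j = 0; j < ny; ++j) {
    for (std::size_t b = 0; b < n_bnd; ++b) {
      a[j * nx + x_start + b] = staging[j * n_bnd + b];
    }
  }
}
```

When the exchange is along y instead, each boundary is a block of whole rows that is already contiguous, so under the same layout assumption no pack or unpack step would be needed.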
Bryce Allen
936f0851c8
fixes for sycl port
3 years ago
Bryce Allen
55bb0d26d1
WIP add sycl port of stencil2d
3 years ago
Bryce Allen
e5e3ca178a
more precision when printing timings
3 years ago
Bryce Allen
124654b576
update gt daxpy example for new gt-blas handle api
3 years ago
Bryce Allen
9fb70b5169
print n_iter and n_warmup
3 years ago
Bryce Allen
37a97f24dd
add iteration loop
3 years ago
Bryce Allen
2309afb2ab
print stage_host
3 years ago
Bryce Allen
7c332265d9
optional stage via host
3 years ago
Bryce Allen
c9a375df4a
check that nmpi divides n_global
3 years ago
Bryce Allen
88e2d23c7f
fix physical boundary for rank 0, comments
3 years ago
Bryce Allen
4dc1ad4603
add clang-format conf from gtensor
3 years ago
Bryce Allen
35860709f3
fix send/recv size
3 years ago
Bryce Allen
6e98c0c5a4
add ex 2d array with noncontiguous 1d stencil
3 years ago
Bryce Allen
baff75c6b1
add timer for exchange
3 years ago
Bryce Allen
849e894109
remove unneeded syncs
3 years ago
Bryce Allen
4143c5f06f
add n_global arg, print sizes in rank 0
3 years ago
Bryce Allen
74bfc20d50
remove fmt dependency, public oneapi won't build it
3 years ago
Bryce Allen
23d882d089
add 1d stencil example
3 years ago
Bryce Allen
349837e9c7
fix mpi init/set device order
4 years ago
Bryce Allen
df5f830a26
gt and cmake fixes
4 years ago
Bryce Allen
d791b81cb6
add gt port of mpi_daxpy
4 years ago
Bryce Allen
2434b39b53
add mpigatherinplace example for reproducing pmpi wrapper bug
5 years ago
Bryce Allen
7a1d10349e
Use MPI_IN_PLACE in one of the allgathers
Try to reproduce the nsys segfault seen when running GENE, whose
backtrace (BT) shows an in-place allgather.
5 years ago
Bryce Allen
cff437eace
barrier off by default
5 years ago
Bryce Allen
cd6e6f7eb5
add jlse runners, more flexible node counter
5 years ago
Bryce Allen
12d76b4a42
update ignores
5 years ago
Bryce Allen
909f8880de
add mpi barrier before allgather
5 years ago
Bryce Allen
924b721ad7
fix summit job script run script arg order
5 years ago
Bryce Allen
02b31f0427
hacky multi-node support
assumes 6 procs per node
5 years ago
Bryce Allen
c32b86422f
distribute total across ranks
useful for tests with < 6 ranks per node
5 years ago
Bryce Allen
538c22a22f
add avg script for parsing timings in *.txt
5 years ago
Bryce Allen
37ad5e87ce
use define to switch between managed/unmanaged
5 years ago
Bryce Allen
6940ce7ceb
add mem free print, fit in 8GB gpu
5 years ago
Bryce Allen
3ebd09725e
add mpi wtime counters, fix make clean
5 years ago