-
b61b9c8f6c
gt: link gtl on cray amd systems, more iters
main
Bryce Allen
2023-04-28 14:22:52 -0400
-
04652c0059
gt: add MPI_Allreduce test
Bryce Allen
2023-03-25 17:36:48 -0700
-
b00d52af2a
gt: fix dim1 nobuff
Bryce Allen
2023-03-25 13:48:40 -0700
-
a885062c83
gt: mpi sum err and time, enable all
Bryce Allen
2023-03-25 15:12:53 -0500
-
84deab6ced
gt: don't use circular exchange for y deriv
Bryce Allen
2023-03-25 12:58:11 -0700
-
7967991bd9
cmake: updates for gt stencil2d
Bryce Allen
2023-03-25 10:53:50 -0700
-
02ab50ab60
gt: WIP y (outer) dim test
Bryce Allen
2023-03-25 10:53:12 -0700
-
5bcf1382ba
gt: parameterize space
Bryce Allen
2023-03-24 14:58:14 -0700
-
4f82b8359b
sycl: drop cl namespace, debug dev ids
Bryce Allen
2023-01-23 16:56:24 +0000
-
cecb98415e
cmake: use main gtensor
Bryce Allen
2023-01-23 16:55:22 +0000
-
3e13fee811
fix sycl oo version, better debug
Bryce Allen
2022-11-01 08:27:26 -0500
-
d960429a83
add new sycl version using span2d class
Bryce Allen
2022-10-31 15:39:45 -0500
-
8e43edc21e
use milliseconds for timing
Bryce Allen
2022-10-29 13:47:07 +0000
-
44d72ad2bb
switch exchange/stencil dim on sycl example
Bryce Allen
2022-10-29 13:44:45 +0000
-
1d779fd510
update comment
Bryce Allen
2022-10-29 13:44:38 +0000
-
b2ed53adc1
switch boundary exchange / stencil direction
Bryce Allen
2022-10-29 12:41:12 +0000
-
936f0851c8
fixes for sycl port
Bryce Allen
2022-10-27 18:29:08 +0000
-
55bb0d26d1
WIP add sycl port of stencil2d
Bryce Allen
2022-10-25 22:14:57 +0000
-
e5e3ca178a
more precision when printing timings
Bryce Allen
2022-10-24 18:07:38 -0400
-
124654b576
update gt daxpy example for new gt-blas handle api
Bryce Allen
2022-10-24 18:01:22 -0400
-
9fb70b5169
print n_iter and n_warmup
Bryce Allen
2022-10-24 18:55:02 +0000
-
37a97f24dd
add iteration loop
Bryce Allen
2022-10-24 18:43:12 +0000
-
2309afb2ab
print stage_host
Bryce Allen
2022-10-24 18:12:09 +0000
-
7c332265d9
optional stage via host
Bryce Allen
2022-10-24 12:15:47 -0400
-
c9a375df4a
check that nmpi divides n_global
Bryce Allen
2022-10-24 11:32:49 -0400
-
88e2d23c7f
fix physical boundary for rank 0, comments
Bryce Allen
2022-10-24 08:36:41 -0500
-
4dc1ad4603
add clang-format conf from gtensor
Bryce Allen
2022-10-24 08:26:00 -0500
-
35860709f3
fix send/recv size
Bryce Allen
2022-10-23 17:24:29 -0700
-
6e98c0c5a4
add ex 2d array with noncontiguous 1d stencil
Bryce Allen
2022-10-23 23:50:05 +0000
-
baff75c6b1
add timer for exchange
Bryce Allen
2022-10-23 22:22:01 +0000
-
849e894109
remove unneeded syncs
Bryce Allen
2022-10-23 20:16:18 +0000
-
4143c5f06f
add n_global arg, print sizes in rank 0
Bryce Allen
2022-10-23 18:51:17 +0000
-
74bfc20d50
remove fmt dependency, public oneapi won't build it
Bryce Allen
2022-10-23 13:27:16 -0500
-
23d882d089
add 1d stencil example
Bryce Allen
2022-10-23 01:32:50 +0000
-
349837e9c7
fix mpi init/set device order
Bryce Allen
2021-07-17 14:23:50 +0000
-
df5f830a26
gt and cmake fixes
Bryce Allen
2021-07-16 22:07:00 -0400
-
d791b81cb6
add gt port of mpi_daxpy
Bryce Allen
2021-07-16 21:36:50 -0400
-
2434b39b53
add mpigatherinplace example for reproducing pmpi wrapper bug
Bryce Allen
2020-09-02 18:42:48 -0400
-
7a1d10349e
Use MPI_IN_PLACE in one of the allgathers
Bryce Allen
2020-09-02 16:34:37 -0400
-
cff437eace
barrier off by default
Bryce Allen
2020-08-11 15:35:17 +0000
-
cd6e6f7eb5
add jlse runners, more flexible node counter
Bryce Allen
2020-08-11 15:34:07 +0000
-
12d76b4a42
update ignores
Bryce Allen
2020-08-11 10:23:33 -0400
-
909f8880de
add mpi barrier before allgather
Bryce Allen
2020-08-10 11:33:59 -0400
-
924b721ad7
fix summit job script run script arg order
Bryce Allen
2020-08-10 11:33:00 -0400
-
02b31f0427
hacky multi-node support
Bryce Allen
2020-08-07 18:50:11 -0400
-
c32b86422f
distribute total across ranks
Bryce Allen
2020-08-07 17:53:16 -0400
-
538c22a22f
add avg script for parsing timings in *.txt
Bryce Allen
2020-08-07 18:07:56 -0400
-
37ad5e87ce
use define to switch between managed/unmanaged
Bryce Allen
2020-08-07 14:14:56 -0400
-
6940ce7ceb
add mem free print, fit in 8GB gpu
Bryce Allen
2020-08-07 13:21:22 -0400
-
3ebd09725e
add mpi wtime counters, fix make clean
Bryce Allen
2020-08-07 13:05:34 -0400
-
3dd6045f2e
move finalize to outside profiler area
Bryce Allen
2020-08-07 13:02:07 -0400
-
55af9daa9b
make: fix summit build
Bryce Allen
2020-08-06 11:13:32 -0400
-
063e592dcf
fix all* cuda malloc size
Bryce Allen
2020-08-06 11:13:18 -0400
-
134c933e86
use managed mem for allgather, cleanup
Bryce Allen
2020-08-06 10:11:58 -0400
-
714a96d1ea
update cuda errors for 11
Bryce Allen
2020-08-06 07:42:59 -0400
-
3e99cf443b
fix allgather recv size
Bryce Allen
2020-08-06 07:42:46 -0400
-
4d504dd5b1
add versions with nvtx
Bryce Allen
2020-08-05 16:45:42 -0400
-
df9a3a79a8
add env var debugging
Bryce Allen
2020-03-31 14:31:11 -0400
-
74b23dff0b
initial version
Bryce Allen
2020-02-24 17:20:21 -0500