Commit Graph

53 Commits

Author SHA1 Message Date
Bryce Allen
02ab50ab60 gt: WIP y (outer) dim test 2023-03-25 10:53:12 -07:00
Bryce Allen
5bcf1382ba gt: parameterize space 2023-03-24 14:58:14 -07:00
Bryce Allen
4f82b8359b sycl: drop cl namespace, debug dev ids 2023-01-23 16:56:24 +00:00
Bryce Allen
cecb98415e cmake: use main gtensor
necessary pr has been merged now
2023-01-23 16:55:22 +00:00
Bryce Allen
3e13fee811 fix sycl oo version, better debug 2022-11-01 08:27:26 -05:00
Bryce Allen
d960429a83 add new sycl version using span2d class 2022-10-31 15:39:45 -05:00
Bryce Allen
8e43edc21e use milliseconds for timing 2022-10-29 13:47:07 +00:00
Bryce Allen
44d72ad2bb switch exchange/stencil dim on sycl example 2022-10-29 13:44:45 +00:00
Bryce Allen
1d779fd510 update comment 2022-10-29 13:44:38 +00:00
Bryce Allen
b2ed53adc1 switch boundary exchange / stencil direction
Contiguous staging vectors are required for multi-d exchange
when a dimension other than the outermost is exchanged. The previous
version was exchanging y, the outermost dimension, so the
data was already contiguous.
2022-10-29 12:41:12 +00:00
Bryce Allen
936f0851c8 fixes for sycl port 2022-10-28 17:49:01 +00:00
Bryce Allen
55bb0d26d1 WIP add sycl port of stencil2d 2022-10-25 22:41:45 +00:00
Bryce Allen
e5e3ca178a more precision when printing timings 2022-10-24 18:07:38 -04:00
Bryce Allen
124654b576 update gt daxpy example for new gt-blas handle api 2022-10-24 18:01:22 -04:00
Bryce Allen
9fb70b5169 print n_iter and n_warmup 2022-10-24 18:55:02 +00:00
Bryce Allen
37a97f24dd add iteration loop 2022-10-24 18:43:12 +00:00
Bryce Allen
2309afb2ab print stage_host 2022-10-24 18:12:09 +00:00
Bryce Allen
7c332265d9 optional stage via host 2022-10-24 12:15:47 -04:00
Bryce Allen
c9a375df4a check that nmpi divides n_global 2022-10-24 11:32:49 -04:00
Bryce Allen
88e2d23c7f fix physical boundary for rank 0, comments 2022-10-24 08:36:41 -05:00
Bryce Allen
4dc1ad4603 add clang-format conf from gtensor 2022-10-24 08:26:00 -05:00
Bryce Allen
35860709f3 fix send/recv size 2022-10-23 17:24:29 -07:00
Bryce Allen
6e98c0c5a4 add ex 2d array with noncontiguous 1d stencil 2022-10-23 23:55:42 +00:00
Bryce Allen
baff75c6b1 add timer for exchange 2022-10-23 22:22:01 +00:00
Bryce Allen
849e894109 remove unneeded syncs 2022-10-23 20:16:18 +00:00
Bryce Allen
4143c5f06f add n_global arg, print sizes in rank 0 2022-10-23 18:51:17 +00:00
Bryce Allen
74bfc20d50 remove fmt dependency, public oneapi won't build it 2022-10-23 13:27:16 -05:00
Bryce Allen
23d882d089 add 1d stencil example 2022-10-23 13:07:29 -05:00
Bryce Allen
349837e9c7 fix mpi init/set device order 2021-07-17 14:23:50 +00:00
Bryce Allen
df5f830a26 gt and cmake fixes 2021-07-16 22:07:00 -04:00
Bryce Allen
d791b81cb6 add gt port of mpi_daxpy 2021-07-16 21:36:50 -04:00
Bryce Allen
2434b39b53 add mpigatherinplace example for reproducing pmpi wrapper bug 2020-09-02 18:42:48 -04:00
Bryce Allen
7a1d10349e Use MPI_IN_PLACE in one of the allgathers
Try to reproduce the nsys segfault seen when running GENE, which
has an in-place allgather in the backtrace (BT) of the segfault.
2020-09-02 16:34:37 -04:00
Bryce Allen
cff437eace barrier off by default 2020-08-11 15:35:17 +00:00
Bryce Allen
cd6e6f7eb5 add jlse runners, more flexible node counter 2020-08-11 15:34:46 +00:00
Bryce Allen
12d76b4a42 update ignores 2020-08-11 10:23:33 -04:00
Bryce Allen
909f8880de add mpi barrier before allgather 2020-08-10 11:37:45 -04:00
Bryce Allen
924b721ad7 fix summit job script run script arg order 2020-08-10 11:33:00 -04:00
Bryce Allen
02b31f0427 hacky multi-node support
assumes 6 procs per node
2020-08-07 18:50:39 -04:00
Bryce Allen
c32b86422f distribute total across ranks
useful for testing with < 6 ranks per node
2020-08-07 18:50:39 -04:00
Bryce Allen
538c22a22f add avg script for parsing timings in *.txt 2020-08-07 18:07:56 -04:00
Bryce Allen
37ad5e87ce use define to switch between managed/unmanaged 2020-08-07 14:14:56 -04:00
Bryce Allen
6940ce7ceb add mem free print, fit in 8GB gpu 2020-08-07 13:21:22 -04:00
Bryce Allen
3ebd09725e add mpi wtime counters, fix make clean 2020-08-07 13:05:34 -04:00
Bryce Allen
3dd6045f2e move finalize to outside profiler area 2020-08-07 13:02:07 -04:00
Bryce Allen
55af9daa9b make: fix summit build 2020-08-06 11:13:32 -04:00
Bryce Allen
063e592dcf fix all* cuda malloc size 2020-08-06 11:13:18 -04:00
Bryce Allen
134c933e86 use managed mem for allgather, cleanup 2020-08-06 10:11:58 -04:00
Bryce Allen
714a96d1ea update cuda errors for 11
deprecated API was removed
2020-08-06 07:42:59 -04:00
Bryce Allen
3e99cf443b fix allgather recv size 2020-08-06 07:42:46 -04:00
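
Note on commit b2ed53adc1: its message says contiguous staging vectors are needed when the exchanged dimension is not the outermost one. The sketch below illustrates that point only and is not code from the repository. It assumes a row-major 2D field stored in a std::vector, with y as the outermost dimension and x as the innermost; the function names y_boundary_ptr and pack_x_boundary are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout (not taken from the repository): row-major 2D field where
// element (iy, ix) lives at field[iy * nx + ix]. y is the outermost
// dimension, x is the innermost.

// A boundary in y consists of whole rows, which are already contiguous in
// memory, so a pointer to the first boundary row can be used as a send
// buffer without any packing.
const double* y_boundary_ptr(const std::vector<double>& field,
                             std::size_t nx, std::size_t iy_start) {
  return field.data() + iy_start * nx;
}

// A boundary in x consists of n_bnd columns. Consecutive rows of the
// boundary are separated by a stride of nx elements, so the values are
// copied into a contiguous staging vector before being sent.
std::vector<double> pack_x_boundary(const std::vector<double>& field,
                                    std::size_t nx, std::size_t ny,
                                    std::size_t ix_start, std::size_t n_bnd) {
  assert(field.size() == nx * ny);
  assert(ix_start + n_bnd <= nx);
  std::vector<double> staging(ny * n_bnd);
  for (std::size_t iy = 0; iy < ny; ++iy) {
    for (std::size_t ib = 0; ib < n_bnd; ++ib) {
      staging[iy * n_bnd + ib] = field[iy * nx + ix_start + ib];
    }
  }
  return staging;
}
```

The staging vector returned by pack_x_boundary could then be passed to a send call as ny * n_bnd contiguous doubles; the receiving side would need the matching unpack step, which is omitted here.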