Commit Graph

  • b61b9c8f6c gt: link gtl on cray amd systems, more iters main Bryce Allen 2023-04-28 14:22:52 -0400
  • 04652c0059 gt: add MPI_Allreduce test Bryce Allen 2023-03-25 17:36:48 -0700
  • b00d52af2a gt: fix dim1 nobuff Bryce Allen 2023-03-25 13:48:40 -0700
  • a885062c83 gt: mpi sum err and time, enable all Bryce Allen 2023-03-25 15:12:53 -0500
  • 84deab6ced gt: don't use circular exchange for y deriv Bryce Allen 2023-03-25 12:58:11 -0700
  • 7967991bd9 cmake: updates for gt stencil2d Bryce Allen 2023-03-25 10:53:50 -0700
  • 02ab50ab60 gt: WIP y (outer) dim test Bryce Allen 2023-03-25 10:53:12 -0700
  • 5bcf1382ba gt: parameterize space Bryce Allen 2023-03-24 14:58:14 -0700
  • 4f82b8359b sycl: drop cl namespace, debug dev ids Bryce Allen 2023-01-23 16:56:24 +0000
  • cecb98415e cmake: use main gtensor Bryce Allen 2023-01-23 16:55:22 +0000
  • 3e13fee811 fix sycl oo version, better debug Bryce Allen 2022-11-01 08:27:26 -0500
  • d960429a83 add new sycl version using span2d class Bryce Allen 2022-10-31 15:39:45 -0500
  • 8e43edc21e use milliseconds for timing Bryce Allen 2022-10-29 13:47:07 +0000
  • 44d72ad2bb switch exchange/stencil dim on sycl example Bryce Allen 2022-10-29 13:44:45 +0000
  • 1d779fd510 update comment Bryce Allen 2022-10-29 13:44:38 +0000
  • b2ed53adc1 switch boundary exchange / stencil direction Bryce Allen 2022-10-29 12:41:12 +0000
  • 936f0851c8 fixes for sycl port Bryce Allen 2022-10-27 18:29:08 +0000
  • 55bb0d26d1 WIP add sycl port of stencil2d Bryce Allen 2022-10-25 22:14:57 +0000
  • e5e3ca178a more precision when printing timings Bryce Allen 2022-10-24 18:07:38 -0400
  • 124654b576 update gt daxpy example for new gt-blas handle api Bryce Allen 2022-10-24 18:01:22 -0400
  • 9fb70b5169 print n_iter and n_warmup Bryce Allen 2022-10-24 18:55:02 +0000
  • 37a97f24dd add iteration loop Bryce Allen 2022-10-24 18:43:12 +0000
  • 2309afb2ab print stage_host Bryce Allen 2022-10-24 18:12:09 +0000
  • 7c332265d9 optional stage via host Bryce Allen 2022-10-24 12:15:47 -0400
  • c9a375df4a check that nmpi divides n_global Bryce Allen 2022-10-24 11:32:49 -0400
  • 88e2d23c7f fix physical boundary for rank 0, comments Bryce Allen 2022-10-24 08:36:41 -0500
  • 4dc1ad4603 add clang-format conf from gtensor Bryce Allen 2022-10-24 08:26:00 -0500
  • 35860709f3 fix send/recv size Bryce Allen 2022-10-23 17:24:29 -0700
  • 6e98c0c5a4 add ex 2d array with noncontiguous 1d stencil Bryce Allen 2022-10-23 23:50:05 +0000
  • baff75c6b1 add timer for exchange Bryce Allen 2022-10-23 22:22:01 +0000
  • 849e894109 remove unneeded syncs Bryce Allen 2022-10-23 20:16:18 +0000
  • 4143c5f06f add n_global arg, print sizes in rank 0 Bryce Allen 2022-10-23 18:51:17 +0000
  • 74bfc20d50 remove fmt dependency, public oneapi won't build it Bryce Allen 2022-10-23 13:27:16 -0500
  • 23d882d089 add 1d stencil example Bryce Allen 2022-10-23 01:32:50 +0000
  • 349837e9c7 fix mpi init/set device order Bryce Allen 2021-07-17 14:23:50 +0000
  • df5f830a26 gt and cmake fixes Bryce Allen 2021-07-16 22:07:00 -0400
  • d791b81cb6 add gt port of mpi_daxpy Bryce Allen 2021-07-16 21:36:50 -0400
  • 2434b39b53 add mpigatherinplace example for reproducing pmpi wrapper bug Bryce Allen 2020-09-02 18:42:48 -0400
  • 7a1d10349e Use MPI_IN_PLACE in one of the allgathers Bryce Allen 2020-09-02 16:34:37 -0400
  • cff437eace barrier off by default Bryce Allen 2020-08-11 15:35:17 +0000
  • cd6e6f7eb5 add jlse runners, more flexible node counter Bryce Allen 2020-08-11 15:34:07 +0000
  • 12d76b4a42 update ignores Bryce Allen 2020-08-11 10:23:33 -0400
  • 909f8880de add mpi barrier before allgather Bryce Allen 2020-08-10 11:33:59 -0400
  • 924b721ad7 fix summit job script run script arg order Bryce Allen 2020-08-10 11:33:00 -0400
  • 02b31f0427 hacky multi-node support Bryce Allen 2020-08-07 18:50:11 -0400
  • c32b86422f distribute total across ranks Bryce Allen 2020-08-07 17:53:16 -0400
  • 538c22a22f add avg script for parsing timings in *.txt Bryce Allen 2020-08-07 18:07:56 -0400
  • 37ad5e87ce use define to switch between managed/unmanaged Bryce Allen 2020-08-07 14:14:56 -0400
  • 6940ce7ceb add mem free print, fit in 8GB gpu Bryce Allen 2020-08-07 13:21:22 -0400
  • 3ebd09725e add mpi wtime counters, fix make clean Bryce Allen 2020-08-07 13:05:34 -0400
  • 3dd6045f2e move finalize to outside profiler area Bryce Allen 2020-08-07 13:02:07 -0400
  • 55af9daa9b make: fix summit build Bryce Allen 2020-08-06 11:13:32 -0400
  • 063e592dcf fix all* cuda malloc size Bryce Allen 2020-08-06 11:13:18 -0400
  • 134c933e86 use managed mem for allgather, cleanup Bryce Allen 2020-08-06 10:11:58 -0400
  • 714a96d1ea update cuda errors for 11 Bryce Allen 2020-08-06 07:42:59 -0400
  • 3e99cf443b fix allgather recv size Bryce Allen 2020-08-06 07:42:46 -0400
  • 4d504dd5b1 add versions with nvtx Bryce Allen 2020-08-05 16:45:42 -0400
  • df9a3a79a8 add env var debugging Bryce Allen 2020-03-31 14:31:11 -0400
  • 74b23dff0b initial version Bryce Allen 2020-02-24 17:20:21 -0500