14 Commits (b61b9c8f6ca975026812341ca7306d69bf549042)

Author       SHA1        Message                                         Date
Bryce Allen  7a1d10349e  Use MPI_IN_PLACE in one of the allgathers       5 years ago
Bryce Allen  cff437eace  barrier off by default                          5 years ago
Bryce Allen  cd6e6f7eb5  add jlse runners, more flexible node counter    5 years ago
Bryce Allen  909f8880de  add mpi barrier before allgather                5 years ago
Bryce Allen  02b31f0427  hacky multi-node support                        5 years ago
Bryce Allen  c32b86422f  distribute total across ranks                   5 years ago
Bryce Allen  37ad5e87ce  use define to switch between managed/unmanaged  5 years ago
Bryce Allen  6940ce7ceb  add mem free print, fit in 8GB gpu              5 years ago
Bryce Allen  3ebd09725e  add mpi wtime counters, fix make clean          5 years ago
Bryce Allen  3dd6045f2e  move finialize to outside profiler area         5 years ago
Bryce Allen  063e592dcf  fix all* cuda malloc size                       5 years ago
Bryce Allen  134c933e86  use managed mem for allgather, cleanup          5 years ago
Bryce Allen  3e99cf443b  fix allgather recv size                         5 years ago
Bryce Allen  4d504dd5b1  add versions with nvtx                          5 years ago