Bryce Allen | 7a1d10349e | Use MPI_IN_PLACE in one of the allgathers: Try to reproduce nsys segfault seen when running GENE, which has an in place allgather as the BT for the segfault. | 5 years ago |
Bryce Allen | cff437eace | barrier off by default | 5 years ago |
Bryce Allen | cd6e6f7eb5 | add jlse runners, more flexible node counter | 5 years ago |
Bryce Allen | 909f8880de | add mpi barrier before allgather | 5 years ago |
Bryce Allen | 02b31f0427 | hacky multi-node support: assumes 6 procs per node | 5 years ago |
Bryce Allen | c32b86422f | distribute total across ranks: useful for test < 6 ranks per node | 5 years ago |
Bryce Allen | 37ad5e87ce | use define to switch between managed/unmanaged | 5 years ago |
Bryce Allen | 6940ce7ceb | add mem free print, fit in 8GB gpu | 5 years ago |
Bryce Allen | 3ebd09725e | add mpi wtime counters, fix make clean | 5 years ago |
Bryce Allen | 3dd6045f2e | move finalize to outside profiler area | 5 years ago |
Bryce Allen | 063e592dcf | fix all* cuda malloc size | 5 years ago |
Bryce Allen | 134c933e86 | use managed mem for allgather, cleanup | 5 years ago |
Bryce Allen | 3e99cf443b | fix allgather recv size | 5 years ago |
Bryce Allen | 4d504dd5b1 | add versions with nvtx | 5 years ago |