With a compute time of 0.00038 (GB200) and 0.00056 (MI300X), DLRM requires a per-accelerator throughput of 15 GB/s or more. This issue tracks the need to double-check the computation time used for the benchmark.
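
As a starting point for the double check, here is a rough sketch of the arithmetic linking compute time and required throughput. It assumes the compute times are in seconds per step (the issue does not state units) and that an accelerator must read one step's worth of data within one compute step, so throughput = bytes per step / compute time. The function names are hypothetical and are not taken from the benchmark code.

```python
# Sketch only: assumes compute time is in seconds per step and that
# required throughput = bytes read per step / compute time per step.

def required_throughput_gbps(bytes_per_step: float, compute_time_s: float) -> float:
    """Throughput (GB/s, 1 GB = 1e9 bytes) needed to feed one step in time."""
    if compute_time_s <= 0:
        raise ValueError("compute_time_s must be positive")
    return bytes_per_step / compute_time_s / 1e9


def implied_bytes_per_step(throughput_gbps: float, compute_time_s: float) -> float:
    """Bytes per step implied by a given throughput and compute time."""
    if compute_time_s <= 0:
        raise ValueError("compute_time_s must be positive")
    return throughput_gbps * 1e9 * compute_time_s


# Values quoted in this issue (units assumed to be seconds).
COMPUTE_TIME_S = {"GB200": 0.00038, "MI300X": 0.00056}
TARGET_GBPS = 15.0  # the "15 GB/s+" figure from this issue

if __name__ == "__main__":
    for name, t in COMPUTE_TIME_S.items():
        mb = implied_bytes_per_step(TARGET_GBPS, t) / 1e6
        print(f"{name}: {TARGET_GBPS} GB/s at {t} s/step implies {mb:.2f} MB per step")
```

If the per-step data volume implied by 15 GB/s does not match what the DLRM workload actually reads per step, that would suggest the compute time (or its units) needs correcting.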