Research: Reproducible benchmarks for batch-invariant LLM inference across models & GPUs (A10, A100, H100)
Topics: research, cuda, pytorch, triton, benchmarks, gpu-kernels, vllm, llm-inference, batch-invariance, deterministic-inference
- Updated Sep 28, 2025
- Python