
English | δΈ­ζ–‡

RM-Gallery: A One-Stop Reward Model Platform


πŸ—‚οΈ Table of Contents


πŸ“’ News

  • [2025-07-09] We have released RM-Gallery v0.1.0, which is also available on PyPI!

🌟 Why RM-Gallery?

RM-Gallery is a one-stop platform for training, building and applying reward models. It provides a comprehensive solution for implementing reward models at both task-level and atomic-level, with high-throughput and fault-tolerant capabilities.

Figure: RM-Gallery framework overview

πŸ‹οΈβ€β™‚οΈ Training RM

  • Integrated RM Training Pipeline: Provides an RL-based pipeline for training reasoning reward models, compatible with popular training frameworks (e.g., verl), and offers examples for integrating RM-Gallery into them.

Figure: RM training accuracy curve — the RM training pipeline improves accuracy on RM-Bench

The figure demonstrates the effectiveness of the RM training pipeline: on RM-Bench, after more than 80 training steps, accuracy improved from around 55.8% with the baseline model (Qwen2.5-14B) to approximately 62.5%.

πŸ—οΈ Building RM

  • Unified Reward Model Architecture: Flexible implementation of reward models through standardized interfaces, supporting various architectures (model-based / model-free), reward formats (scalar / critique), and scoring patterns (pointwise / listwise / pairwise).

  • Comprehensive RM Gallery: Provides a rich collection of ready-to-use Reward Model instances for diverse tasks (e.g., math, coding, preference alignment) at both the task level (RMComposition) and the component level (RewardModel). Users can apply an RMComposition or RewardModel directly for a specific task, or assemble a custom RMComposition from component-level RewardModels.

  • Rubric-Critic-Score Paradigm: Adopts the Rubric+Critic+Score-based reasoning Reward Model paradigm, offering best practices to help users generate rubrics with limited preference data.

Benchmark figures in the repository show that after applying the Rubric+Critic+Score paradigm and adding 1–3 rubrics to the base model (Qwen3-32B), there are significant improvements on both RewardBench2 and RMB-pairwise.
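For orientation, the control flow of the Rubric+Critic+Score paradigm can be written as a short, library-agnostic sketch. The call_llm callable, the prompt wording, and the 0–10 scale below are illustrative assumptions, not RM-Gallery APIs:

from typing import Callable

# Illustrative sketch of the Rubric+Critic+Score flow (not the RM-Gallery API).
# `call_llm` is any function that sends a prompt to a judge LLM and returns its text reply.
def rubric_critic_score(
    question: str,
    answer: str,
    rubrics: list[str],
    call_llm: Callable[[str], str],
) -> float:
    rubric_text = "\n".join(f"- {r}" for r in rubrics)
    # 1. Critic: critique the answer against the rubrics
    critique = call_llm(
        f"Question: {question}\nAnswer: {answer}\n"
        f"Evaluate the answer against these rubrics:\n{rubric_text}\n"
        "Write a short critique."
    )
    # 2. Score: turn the critique into a scalar reward
    score_text = call_llm(
        f"Critique: {critique}\n"
        "Based on this critique, rate the answer from 0 to 10. Reply with a single number."
    )
    return float(score_text.strip())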

πŸ› οΈ Applying RM

  • Multiple Usage Scenarios: Covers multiple Reward Model (RM) usage scenarios with detailed best practices, including Training with Rewards (e.g., post-training) and Inference with Rewards (e.g., Best-of-N, data correction); a Best-of-N sketch follows this list.

  • High-Performance RM Serving: Leverages the New API platform to deliver high-throughput, fault-tolerant reward model serving, enhancing feedback efficiency.
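As an illustration of the Inference-with-Rewards scenario above, Best-of-N selection only needs a per-response score. The score_fn argument below is a placeholder for whatever reward model call you use, not a specific RM-Gallery interface:

from typing import Callable

# Minimal Best-of-N sketch: keep the candidate response with the highest reward score.
def best_of_n(prompt: str, candidates: list[str], score_fn: Callable[[str, str], float]) -> str:
    scored = [(score_fn(prompt, c), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])[1]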

πŸ“₯ Installation

RM-Gallery requires Python >= 3.10 and < 3.13.

πŸ“¦ Install from source

# Pull the source code from GitHub
git clone https://github.com/modelscope/RM-Gallery.git

# Install the package
pip install .

πŸ“¦ Install from PyPI

pip install rm-gallery

πŸš€ Quick Start

Your First Reward Model

from rm_gallery.core.reward.registry import RewardRegistry

# 1. Choose a pre-built reward model
rm = RewardRegistry.get("safety_listwise_reward")

# 2. Prepare your data
from rm_gallery.core.data.schema import DataSample
sample = DataSample(...)  # See docs for details

# 3. Evaluate
result = rm.evaluate(sample)
print(result)

That's it! πŸŽ‰

πŸ‘‰ 5-Minute Quickstart Guide - Get started in minutes

πŸ‘‰ Interactive Notebooks - Try it hands-on

πŸ“– Key Features

πŸ—οΈ Building Reward Models

Choose from 35+ pre-built reward models or create your own:

# Use pre-built models (RewardRegistry import shown in the Quick Start above)
rm = RewardRegistry.get("math_correctness_reward")
rm = RewardRegistry.get("code_quality_reward")
rm = RewardRegistry.get("helpfulness_listwise_reward")

# Or build custom models by subclassing a base reward class
# (import BasePointWiseReward and RewardResult from rm_gallery; see the builder docs for exact paths)
class CustomReward(BasePointWiseReward):
    def _evaluate(self, sample, **kwargs):
        # Your custom logic here
        return RewardResult(...)
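The custom logic inside _evaluate can be any computation over the sample. As a standalone illustration (plain Python, independent of the BasePointWiseReward/RewardResult types), a simple length-penalty heuristic might look like this:

# Hypothetical scoring heuristic of the kind that could back a custom pointwise reward.
def length_penalty_score(answer: str, target_words: int = 200) -> float:
    """Return a score in [0, 1]: 1.0 up to target_words, decaying linearly beyond it."""
    n_words = len(answer.split())
    if n_words <= target_words:
        return 1.0
    return max(0.0, 1.0 - (n_words - target_words) / target_words)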

πŸ“š See all available reward models β†’

πŸ‹οΈβ€β™‚οΈ Training Reward Models

Train your own reward models with the verl framework:

# Prepare data and launch training
cd examples/train/pointwise
./run_pointwise.sh

πŸ“š Training guide β†’

πŸ§ͺ Evaluating on Benchmarks

Test your models on standard benchmarks:

  • RewardBench2 - Latest reward model benchmark
  • RM-Bench - Comprehensive evaluation
  • Conflict Detector - Detect evaluation conflicts
  • JudgeBench - Judge capability evaluation

πŸ“š Evaluation guide β†’

πŸ› οΈ Real-World Applications

  • Best-of-N Selection - Choose the best from multiple responses
  • Data Refinement - Improve data quality with reward feedback (see the sketch after this list)
  • Post Training (RLHF) - Integrate with reinforcement learning
  • High-Performance Serving - Deploy as scalable service
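For example, reward-driven data refinement can be as simple as filtering a dataset by score. The sketch below reuses the RewardRegistry/evaluate usage shown in the Quick Start; the extract_score helper is a hypothetical placeholder, since the exact structure of the evaluation result is described in the docs:

from typing import Callable

# Sketch: keep only samples whose reward score clears a threshold.
# `rm.evaluate(sample)` follows the Quick Start usage; `extract_score` is a placeholder
# for however you pull a scalar score out of the evaluation result.
def filter_by_reward(samples, rm, extract_score: Callable, threshold: float = 0.5):
    kept = []
    for sample in samples:
        result = rm.evaluate(sample)
        if extract_score(result) >= threshold:
            kept.append(sample)
    return kept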

πŸ“š Application guides β†’

πŸ“š Documentation

πŸ“– Complete Documentation - Full documentation site

Quick Links

🀝 Contribute

Contributions are always encouraged!

We highly recommend installing the pre-commit hooks in this repo before submitting pull requests. These hooks are small housekeeping scripts executed every time you make a git commit; they take care of formatting and linting automatically.

pip install -e .
pre-commit install

Please refer to our Contribution Guide for more details.

πŸ“ Citation

If you use RM-Gallery in a paper, please cite:

@software{rm_gallery,
  title  = {RM-Gallery: A One-Stop Reward Model Platform},
  author = {The RM-Gallery Team},
  url    = {https://github.com/modelscope/RM-Gallery},
  month  = {07},
  year   = {2025}
}
