Issue search results

32 results in PRIME-RL/PRIME (52 ms)

PRIME-RL/PRIME
Code for training the SFT_MODEL=Eurus-2-7B-SFT

Hello, thanks for the great work. Could you release the code for training the SFT starting-point model Eurus-2-7B-SFT? If that were provided, it would be easier to reproduce the results of your work since settings ...

RiverTre · Opened 7 days ago

PRIME-RL/PRIME
Why does the `prompt_length` increase during training?

Hello, I attempted to replicate your method by conducting training on my local machine. While reviewing the metrics on wandb, I noticed that the prompt_length is increasing during training. I was under ...

hanbyul-kim · Opened 9 days ago

PRIME-RL/PRIME
Error when running the examples/run_prime_main.sh

Hi, when I tried to replicate the training, I ran `bash examples/run_prime_main.sh` and got an error: File /mnt/workspace/PRIME/training/verl/workers/actor/dp_actor.py, line 57, in _forward_micro_batch ...

Urheen · Opened 11 days ago

PRIME-RL/PRIME
what is the reason for "fix precision"?

In the newest code, why is the dtype for loading models changed to float32 rather than bfloat16?

Nipers · Opened 15 days ago

PRIME-RL/PRIME
old_log_probs VS ref_log_probs

According to the pseudocode in the blog, the reference model is always the SFT model. However, I found this code in DPPrime.py: if self.reference_module is not None: ref_log_prob = torch.cat([self._forward_micro_batch(module=self.reference_module, ...

Nipers · Opened 22 days ago

PRIME-RL/PRIME
The CE Loss in the blog

I noticed something that confuses me: is the CE loss in the blog missing a negative sign?

OOOHS · Opened 22 days ago

PRIME-RL/PRIME
some operations that may improve the efficiency

Thank you again for providing such an elegant online RL example. I have reimplemented your experiments and got positive results. I am using a single A800 node with 8 GPUs to run the experiment. Here I list several ...

Nipers · Opened 24 days ago

PRIME-RL/PRIME
Coding

Does this also beat 4o's coding performance, or at least match it?

samyon7 · Opened 26 days ago

PRIME-RL/PRIME
save prm checkpoint issue

rawsh · Opened 26 days ago

PRIME-RL/PRIME
Initial sft model with Qwen2.5-Math-7B-Instruct

Hello, have you compared against using Qwen2.5-Math-7B-Instruct as the initial SFT model, rather than your own?

merlinarer · Opened 27 days ago

issues Search Results · repo:PRIME-RL/PRIME language:Python