issues Search Results · repo:PRIME-RL/PRIME language:Python

32 results (52 ms)

Hello, thanks for the great work. Could you release the code for training the SFT starting-point model, Eurus-2-7B-SFT? If that were provided, it would be easier to reproduce the results of your work since settings ...
  • RiverTre
  • 2
  • Opened 7 days ago
  • #52

Hello, I attempted to replicate your method by conducting training on my local machine. While reviewing the metrics on wandb, I noticed that the prompt_length is increasing during training. I was under ...
  • hanbyul-kim
  • 1
  • Opened 9 days ago
  • #49

Hi, when I tried to replicate the training, I ran bash examples/run_prime_main.sh and got an error: File /mnt/workspace/PRIME/training/verl/workers/actor/dp_actor.py, line 57, in _forward_micro_batch ...
  • Urheen
  • 7
  • Opened 11 days ago
  • #48

In the newest code, why was the dtype for loading models changed to float32 rather than bfloat16?
  • Nipers
  • 3
  • Opened 15 days ago
  • #47

According to the pseudocode in the blog, the reference model is always the SFT model. I found this code in DPPrime.py: if self.reference_module is not None: ref_log_prob = torch.cat([self._forward_micro_batch(module=self.reference_module, ...
  • Nipers
  • 2
  • Opened 22 days ago
  • #45

I noticed something that confuses me: is the CE loss in the blog missing a negative sign?
  • OOOHS
  • 2
  • Opened 22 days ago
  • #44

Thank you again for providing such an elegant online RL example. I have reimplemented your experiments and got positive results. I am using a single A800 node with 8 GPUs to run the experiment. Here I list several ...
  • Nipers
  • 4
  • Opened 24 days ago
  • #43

Does this also beat 4o's coding performance, or at least match it?
  • samyon7
  • 3
  • Opened 26 days ago
  • #42

.
  • rawsh
  • Opened 26 days ago
  • #41

Hello, have you compared against an initial SFT model based on Qwen2.5-Math-7B-Instruct, rather than your own?
  • merlinarer
  • 1
  • Opened 27 days ago
  • #39