Right, with respect to the number of RLHF pairs, you can compare it to the prior result of ~0%. But what I don't understand, especially given the hype claims the paper makes about "autonomous super-human reasoning", is why they can't just keep running it and get much higher than 50%. It seems like something else is preventing higher scores, which makes me wonder if these architectures are really just plateauing. Don't get me wrong, it's good work; it's just that the language of the paper has some ridiculous hype.

Replies (1)

Ah. It's true that everyone wants to claim the world. It isn't my work; I'm just plotting points and drawing lines.