Using a small draft model to generate tokens that a large model verifies in one forward pass
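
The draft-then-verify loop named in the title can be sketched in a few lines. This is a minimal sketch of the greedy variant only, not a definitive implementation. The names `speculative_step`, `generate`, `toy_draft`, and `toy_target` are hypothetical, and each "model" is assumed to be a plain function that returns its greedy next token. A real system would score all drafted positions in one batched forward pass of the large model; the list comprehension below only simulates that pass.

```python
from typing import Callable, List

# Assumption: a "model" is a function from a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: Model, target: Model, k: int = 4) -> List[int]:
    """Run one draft-then-verify round and return the newly accepted tokens."""
    # 1. The small draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The large model gives its own next token at each of the k + 1 positions.
    #    (Simulated here; in practice this is a single forward pass.)
    target_next = [target(prefix + proposed[:i]) for i in range(k + 1)]

    # 3. Keep the longest run of drafted tokens that the large model agrees with.
    accepted: List[int] = []
    for i, tok in enumerate(proposed):
        if tok != target_next[i]:
            break
        accepted.append(tok)

    # 4. Add one token from the large model: the correction at the first
    #    mismatch, or a bonus token if every drafted token was accepted.
    accepted.append(target_next[len(accepted)])
    return accepted


def generate(prompt: List[int], draft: Model, target: Model,
             max_new_tokens: int, k: int = 4) -> List[int]:
    """Repeat speculative rounds until max_new_tokens tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prompt) + max_new_tokens]


# Toy stand-ins: the target counts upward mod 10; the draft agrees except
# that it wrongly emits 0 whenever the correct next token is 5.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    nxt = (prefix[-1] + 1) % 10
    return 0 if nxt == 5 else nxt


if __name__ == "__main__":
    print(speculative_step([3], toy_draft, toy_target, k=4))
    print(generate([0], toy_draft, toy_target, max_new_tokens=8))
```

With greedy verification, the output matches what the large model would produce on its own, because every kept token is one the large model would have chosen; the draft model only affects how many tokens each round yields. Sampling-based variants replace the equality check in step 3 with a probabilistic accept/reject rule, which this sketch does not cover.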