4 Comments
Remixa

Thanks Benjamin.

But I've been wondering: since their experiments seem to be mostly based on completion tasks, could these context length extension methods (such as LongRecipe, YaRN, and others) in theory also apply to instruction fine-tuning rather than just completion tasks?

Remixa

In other words, how do current chat (instruct) models implement a large context length?

Benjamin Marie

Yes, because it is the same thing.

Instruction fine-tuning *is* a completion task. An LLM only does completion. The larger context can be a long chat prompt or a dialogue history.
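To make this concrete, here is a minimal sketch (assuming the Hugging Face transformers chat-template API; the model name is only an example) of how a chat history becomes one long text sequence that the model simply completes:

```python
from transformers import AutoTokenizer

# Any instruct model that ships a chat template would work here.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# A (possibly very long) dialogue history: each turn is just more tokens.
messages = [
    {"role": "user", "content": "Summarize this long report: ..."},
    {"role": "assistant", "content": "Here is a summary: ..."},
    {"role": "user", "content": "Now list the key risks."},
]

# The chat template flattens the conversation into a single prompt string;
# generation is then ordinary next-token completion over that string.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # one long sequence: this is where the extended context is consumed
```

So a context-extended base model, once instruction fine-tuned, handles long chats the same way it handles long documents: everything is one token sequence to complete.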

Remixa

Got it! Thank you for clarifying this matter for me!
