Thanks, Benjamin.
But I've been wondering: since their experiments seem to be based mostly on completion tasks, could these context-length extension methods (such as LongRecipe, YARN, and others) in theory also apply to instruction fine-tuning rather than just completion tasks?
In other words, how do current chat (instruct) models support large context lengths?
Yes, because it is the same thing: instruction fine-tuning *is* a completion task. An LLM only ever does completion. The larger context here is simply a long chat prompt or a long dialogue history.
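To make that concrete, here is a minimal sketch of how a chat-style instruction example is flattened into a single token stream that the model completes. The ChatML-like tags below are only an illustrative template; real instruct models each define their own format.

```python
# Illustrative sketch: instruction data is just a completion task.
# A chat history is rendered into one string, and the model is trained
# to complete the assistant turn via ordinary next-token prediction.
# The <|im_start|>/<|im_end|> tags are a hypothetical ChatML-like template.

def render_chat(messages):
    """Flatten a list of {role, content} dicts into a single prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # The assistant's reply is what the model learns to *complete*.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

history = [
    {"role": "user", "content": "Summarize this very long report ..."},
]
prompt = render_chat(history)
print(prompt)
```

From the model's point of view this is plain completion; a long dialogue history or a long document in a user turn simply makes the rendered prompt long, which is why context-extension methods carry over to instruct models.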
Got it! Thank you for clarifying this matter for me!