Thanks, Benjamin.
But I've been wondering: since their experiments seem to be based mostly on completion tasks, could these context-length extension methods (such as LongRecipe, YARN, and others) in theory also apply to instruction fine-tuning rather than just completion tasks?
In other words, how do current chat (instruct) models support large context lengths?
Yes, because it is the same thing: instruction fine-tuning *is* a completion task. An LLM only ever does completion. The larger context here is simply a long chat prompt or a long dialogue history.
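To make that concrete, here is a minimal sketch of how a chat-style instruction example is flattened into a single token stream that the model completes. The ChatML-like tags below are only an illustrative template; real instruct models each define their own format.

```python
# Illustrative sketch: instruction data is just a completion task.
# A chat history is rendered into one string, and the model is trained
# to complete the assistant turn via ordinary next-token prediction.
# The <|im_start|>/<|im_end|> tags are a hypothetical ChatML-like template.

def render_chat(messages):
    """Flatten a list of {role, content} dicts into a single prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # The assistant's reply is what the model learns to *complete*.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

history = [
    {"role": "user", "content": "Summarize this very long report ..."},
]
prompt = render_chat(history)
print(prompt)
```

From the model's point of view this is plain completion; a long dialogue history or a long document in a user turn simply makes the rendered prompt long, which is why context-extension methods carry over to instruct models.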
Got it! Thank you for clarifying this matter for me!