Microsoft has introduced a new method for training language models called 'Instruction Pre-Training.' This approach uses an instruction synthesizer to generate instruction-response pairs, which are then mixed with raw corpora for pretraining. The method is detailed in a recent paper titled 'Instruction Pre-Training: Language Models are Supervised Multitask Learners' by researchers D. Cheng, Y. Gu, S. Huang, J. Bi, M. Huang, and F. Wei of Microsoft Research. A fine-tuned Mistral 7B model serves as the instruction synthesizer. The paper was featured as CompSci Paper of the Day, Issue 44.
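To make the pipeline concrete, here is a minimal Python sketch of the idea described above: a fine-tuned synthesizer model turns raw corpus passages into instruction-response pairs, and a fraction of the pretraining stream is replaced with these instruction-augmented passages. The checkpoint name, prompt handling, and mixing ratio are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumptions labeled below), not the paper's released implementation.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint name; the paper fine-tunes Mistral 7B for this role.
SYNTHESIZER = "instruction-synthesizer-mistral-7b"

tokenizer = AutoTokenizer.from_pretrained(SYNTHESIZER)
model = AutoModelForCausalLM.from_pretrained(SYNTHESIZER, device_map="auto")


def synthesize_pairs(passage: str, max_new_tokens: int = 512) -> str:
    """Prompt the synthesizer with a raw passage; it generates
    instruction-response pairs grounded in that passage."""
    inputs = tokenizer(passage, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens (the synthesized pairs).
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)


def build_pretraining_mix(raw_passages, instruction_fraction=0.2):
    """Mix instruction-augmented passages with plain passages, mirroring the
    'mix with regular corpora' step; the 20% ratio is an assumption here."""
    mixed = []
    step = int(1 / instruction_fraction)
    for i, passage in enumerate(raw_passages):
        if i % step == 0:
            pairs = synthesize_pairs(passage)
            mixed.append(passage + "\n\n" + pairs)  # instruction-augmented text
        else:
            mixed.append(passage)                   # raw text kept as-is
    return mixed
```

The resulting mixed corpus would then feed a standard next-token-prediction pretraining loop; only the data preparation changes, not the training objective.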
I read about a fascinating hack to generate a high-quality dataset for LLM instruction finetuning this weekend. It's a fully automated method that doesn't require any seed questions and even runs locally. How does it work? https://t.co/EKt38OyZ5w
[CL] Instruction Pre-Training: Language Models are Supervised Multitask Learners D Cheng, Y Gu, S Huang, J Bi, M Huang, F Wei [Microsoft Research] (2024) https://t.co/eQWRVlHD5w - The paper proposes Instruction Pre-Training, which augments raw corpora with instruction-response… https://t.co/0EexedclML
CompSci Paper of the Day, Issue 44: Instruction Pre-Training: Language Models are Supervised Multitask Learners 1/4 🧵 https://t.co/Fi8Vq2iIVb
Microsoft releases an interesting new way of training language models called "Instruction Pre-Training." > Fine-tuned Mistral 7B as an "instruction synthesizer" to create a bunch of synthetic instruction-response pairs and mix them in with regular corpora. > They tried it out on general… https://t.co/dSQEnzYu7e
Instruction pre-training is a new approach that enhances LLM pretraining by augmenting raw data with instruction-response pairs generated by an instruction synthesizer. Explore this method in this @gradio Space: https://t.co/s69FwiiAvz https://t.co/dd8GC7MFvc