List: 2024/09 (2)

SJ_Koding

[LLM] Selective Reflection-Tuning: Summary and Notes (feat. the Reflection Llama-3.1 70B controversy)

Selective Reflection Tuning. Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning (2024.06). To improve LLM fine-tuning performance, there have been many attempts to raise data quality, along with a variety of methodologies for data generation. The paper's central argument, however, is that none of these considered compatibility with the student model (hereafter Student; typically sLM-class models such as Llama-3.1 8B or Solar 10.8B). In other words, this is taken to mean that because of the Student's limited capability, even when fine-tuning is carried out on high-quality prompts produced by GPT-4o and similar models, the Student cannot imitate them...

LLM 2024. 9. 10. 16:27