chanmuzi

<PEFT> S-LoRA: Serving Thousands of Concurrent LoRA Adapters (2023.12)

2023.12.16· Paper Review

I read an NLP paper that caught my interest and wrote up a brief summary. If anything is missing or incorrect, please leave a comment 🙇‍♂️ - S-LoRA, a system designed to serve many LoRA adapters scalably - Introduces Unified Paging and custom CUDA kernels 1. Introduction As the "pretrain-then-finetune" paradigm has become prevalent, numerous model variants have been created. Parameter-efficient fine-tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) have advanced. With the original LoRA, the adapter's parameters and the base model's parameters are me..

<RAG> [Context Tuning] Context Tuning for Retrieval Augmented Generation (2023.12)

2023.12.15· Paper Review

I read an NLP paper that caught my interest and wrote up a brief summary. If anything is missing or incorrect, please leave a comment 🙇‍♂️ [Apple] - Proposes Context Tuning for RAG, which helps improve retrieval performance when the query lacks explicit information - A lightweight model that applies Reciprocal Rank Fusion (RRF) to LambdaMART outperforms GPT-4-based retrieval 1. Introduction Thanks to the ability of LLMs to handle a wide range of tasks well, research on using LLMs as planning agents is very active. Not reflecting the latest information and other inherent l..
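
As a rough illustration of the Reciprocal Rank Fusion (RRF) idea mentioned above, here is a minimal sketch. The excerpt does not show how the paper combines RRF with LambdaMART, so this only covers the generic fusion step: each document scores the sum of 1 / (k + rank) over the input rankings. The function name `reciprocal_rank_fusion`, the sample document ids, and the constant `k=60` (a commonly used default) are assumptions, not details from the post.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered from best to worst.
    k: smoothing constant (60 is a common default; an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top item contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from three different retrievers.
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["b", "a", "c"],
])
print(fused)
```

Because only ranks are used, the fusion needs no score calibration across retrievers, which is one reason it suits a lightweight setup.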

<LLM, XAI> Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations (2023.12)

2023.12.14· Paper Review

I read an NLP paper that caught my interest and wrote up a brief summary. If anything is missing or incorrect, please leave a comment 🙇‍♂️ [Gen AI at Meta] - Releases Llama Guard, a Llama2-7b model instruction-tuned on an internally collected dataset - Also performs strongly on existing benchmarks such as OpenAI Moderation Evaluation and ToxicChat - Llama Guard is a language model that performs multi-class classification and generates binary decision scores 1. Introduction As LLMs have grown rapidly, interest in conversational AI agents has also enormously ..

<LMM> Gemini: A Family of Highly Capable Multimodal Models (2023.12)

2023.12.13· Paper Review

I read an NLP paper that caught my interest and wrote up a brief summary. If anything is missing or incorrect, please leave a comment 🙇‍♂️ [Gemini Team, Google] - Gemini, a family of multimodal models that shows overwhelming capability in image, audio, video, and text understanding - The first case to achieve performance at or above the human-expert level on MMLU 1. Introduction The team trained the Gemini models, which span multiple modalities while also having strong understanding & reasoning capabilities in each domain. The models come in three sizes. Ultra: for hi..

<SSM> Mamba: Linear-Time Sequence Modeling with Selective State Spaces (2023.12)

2023.12.12· Paper Review

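I read an NLP paper that caught my interest and summarized it using ChatGPT. (Everything except the summary was written by ChatGPT 😁) If anything is missing or incorrect, please leave a comment 🙇‍♂️ [Carnegie Mellon University, Princeton University] - Integrates selective SSMs into a simplified end-to-end neural network architecture - Contains no attention blocks, and not even MLP blocks 1. Introduction Overview of foundation models (FMs): FMs are large models that are pretrained on large-scale data and then applied to downstream tasks, and they have emerged as an effective paradigm in modern machine learning...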

<CoX> Chain of Code: Reasoning with a Language Model-Augmented Code Emulator (2023.12)

2023.12.11· Paper Review

I read an NLP paper that caught my interest and wrote up a brief summary. If anything is missing or incorrect, please leave a comment 🙇‍♂️ [Google DeepMind, Stanford University, University of California, Berkeley] - Releases Chain of Code (CoC), a simple yet effective extension that improves the code-driven reasoning ability of LMs - Introduces the LMulator: code that can be executed is run with an interpreter, and code that cannot is emulated with the LM 1. Introduction The Chain of Thought (CoT) approach, which breaks a complex problem into several sub-tasks..
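
The "run it with an interpreter, otherwise emulate it with the LM" idea can be sketched roughly as follows. This is not the paper's implementation, which the excerpt does not show. It assumes line-by-line execution, and `run_with_lmulator`, `lm_emulate`, and `fake_lm` are hypothetical names, with `fake_lm` standing in for a real language model call.

```python
def run_with_lmulator(lines, lm_emulate):
    """Execute code line by line, falling back to an LM on failure.

    lines: list of single-line Python statements.
    lm_emulate: hypothetical callable (line, state) -> dict of variable
        updates that the language model predicts for that line.
    """
    state = {}
    for line in lines:
        try:
            # Executable lines are run by the real Python interpreter.
            exec(line, {}, state)
        except Exception:
            # Lines the interpreter cannot run are "emulated" instead.
            state.update(lm_emulate(line, dict(state)))
    return state


def fake_lm(line, state):
    # Stand-in for a language model: pretends to know apples are fruit.
    return {"y": True}


result = run_with_lmulator(
    [
        "x = 2 + 3",                # runs in the interpreter
        "y = is_fruit('apple')",    # undefined function, so emulated
        "z = x + (1 if y else 0)",  # runs again, using the emulated y
    ],
    fake_lm,
)
print(result["z"])
```

The point of the split is that exact computation stays with the interpreter, while only semantic judgments that no function implements are delegated to the model.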
