chanmuzi
<LLM> Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models (2024.02)