chanmuzi
<CoT, Agent> ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent (2023.12)