chanmuzi
<Multi-modal> [CLIP] Learning Transferable Visual Models From Natural Language Supervision (2021.02)