tl;dr: I wrote Mooncake Transfer Engine.

My research focuses on in-memory distributed storage systems. I currently work on the Mooncake team. I received my Ph.D. from Tsinghua University in 2023 (supervised by Prof. Zuoning Chen and Prof. Yongwei Wu) and my bachelor's degree from Xidian University in 2017.

🔥 News

  • 2025.06:  🎉🎉 Mooncake Transfer Engine supports NVLink on Blackwell. SGLang achieved 7,583 tokens per second per GPU for decoding on the GB200 NVL72.
  • 2025.05:  🎉🎉 NVIDIA Dynamo NIXL officially supports Mooncake Transfer Engine as a backend.
  • 2025.02:  🎉🎉 The Mooncake paper received the FAST 25 Best Paper award.
  • 2024.11:  🎉🎉 We officially released Mooncake to the open-source community.

📝 Publications

FAST 2025
Mooncake: Trading More Storage for Less Computation — A KVCache-centric Architecture for Serving LLM Chatbot

Ruoyu Qin, Zheming Li, Weiran He, Jialei Cui, Feng Ren, Mingxing Zhang, Yongwei Wu, Weimin Zheng, Xinran Xu

Project

  • FAST 25 Best Paper!
  • With Transfer Engine, Mooncake’s architecture enables Kimi to handle over 100% more requests.
  • Close cooperation with SGLang/LMCache/Dynamo/Alibaba/Ant Group/Huawei/…

🎖 Honors and Awards

  • 2017.07 Outstanding Undergraduate, Xidian University
  • 2016.10 CCF Outstanding Undergraduate Award
  • 2016.10 National Scholarship (Undergraduate) (Top 1%)
  • 2014.10 National Scholarship (Undergraduate) (Top 1%)

📖 Education

  • 2017.08 - 2023.06, Ph.D. student, MadSys Research Group, Department of Computer Science and Technology, Tsinghua University, Beijing
  • 2013.08 - 2017.07, Undergraduate student, Computer School, Xidian University, Xi’an, Shaanxi

💬 Invited Talks

  • 2025.05, Mooncake, Zhejiang Lab internal talk

💻 Services

  • Member of Storage Benchmarking Workgroup, CCF Technical Committee of Information Storage Technology (CCF TCIST)