Zengzhi Wang

Shanghai Jiao Tong University (zengzhi.wang [at] sjtu dot edu dot cn).


We should dream big.

Hi there! I am Zengzhi Wang (王增志), a first-year PhD student at the GAIR Lab, Shanghai Jiao Tong University, advised by Prof. Pengfei Liu. Before that, I received my master’s degree in Computer Science from Nanjing University of Science & Technology, advised by Prof. Rui Xia and Assoc. Prof. Jianfei Yu, and my bachelor’s degree in Software Engineering from Wuhan Institute of Technology.

I curated data and trained models; in turn, the data, models, and results also trained me. Recently, I focus on

news

May 01, 2025 One paper (ProX) accepted by ICML’25.
Apr 09, 2025 PhDing @ SJTU (just started).
Sep 27, 2024 MathPile and OlympicArena were accepted by the NeurIPS 2024 D&B Track.
May 17, 2024 A paper (ChatGPT-Sentiment Evaluation) accepted by COLM 2024.
May 17, 2024 A paper accepted by the ACL 2024 Main Conference. Congrats to Qiming on her first ACL paper during her PhD.

selected publications

  1. NeurIPS D&B 2024
    MathPile: A Billion-Token-Scale Pretraining Corpus for Math
    Zengzhi Wang, Xuefeng Li, Rui Xia, and Pengfei Liu
    In the Thirty-Eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024
  2. ICML 2025
    Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
    Fan Zhou*, Zengzhi Wang*, Qian Liu, Junlong Li, and Pengfei Liu
    In International Conference on Machine Learning, 2025
  3. Preprint 2025
    MegaMath: Pushing the Limits of Open Math Corpora
    Fan Zhou*, Zengzhi Wang*, Nikhil Ranjan, Zhoujun Cheng, Liping Tang, Guowei He, Zhengzhong Liu, and Eric P. Xing
    Preprint, 2025
  4. Preprint 2025
    OctoThinker: Revisiting Mid-Training In the Era of RL Scaling
    Zengzhi Wang*, Fan Zhou*, Xuefeng Li*, and Pengfei Liu
    Preprint, 2025
    Notion Blog