My research focuses on multi-modal action composition — building generalizable robot behaviors by composing specialized models across sensory modalities and time scales.
Multisensory Learning: Integrating vision, touch, audio, and language for fine-grained, contact-rich manipulation.
Action Composition: Composing specialized models — skills, action abstractions, and physics-inspired world models — across diverse data and time scales to generalize to complex tasks.
I'm always excited to explore new collaborations in robotics and machine learning! If you're interested, please drop me an email. I'd love to chat!
haonan_chen [at] seas (dot) harvard (dot) edu /
haonan [at] cs (dot) stanford (dot) edu
News
To junior PhD, master's, and undergraduate students: If you'd like to chat about life, career plans, or research ideas in ML/robotics, please feel free to email me to schedule a meeting. I set aside 30 minutes each week for these conversations, and I especially encourage students from underrepresented groups, or anyone in need, to reach out.
CoStream: Composing Simple Behaviors for Generalizable Complex Manipulation
Haonan Chen*,
Yuxiang Ma*,
Stephen Tian,
Xiaoshen Han,
Wenlong Huang,
Feiyang Wu,
Yunzhu Li,
Jiajun Wu,
Edward H. Adelson,
Yilun Du
* Equal contribution
Preprint 2026, [Project][Paper][BibTeX]
@misc{costream2026,
title={CoStream: Composing Simple Behaviors for Generalizable Complex Manipulation},
author={Chen, Haonan and Ma, Yuxiang and Tian, Stephen and Han, Xiaoshen and Huang, Wenlong and Wu, Feiyang and Li, Yunzhu and Wu, Jiajun and Adelson, Edward H. and Du, Yilun},
year={2026}
}
OAT: Ordered Action Tokenization
Chaoqi Liu,
Xiaoshen Han,
Jiawei Gao,
Yue Zhao,
Haonan Chen,
Yilun Du
RSS 2026, [Project][Paper][Code][Blog][BibTeX]
Media Coverage: [MarkTechPost]
@misc{liu2026orderedactiontokenization,
title={OAT: Ordered Action Tokenization},
author={Chaoqi Liu and Xiaoshen Han and Jiawei Gao and Yue Zhao and Haonan Chen and Yilun Du},
year={2026},
eprint={2602.04215},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2602.04215},
}
IMPACT: Learning Internal-Model Predictive Control for Forceful Robotic Manipulation
Jiawei Gao,
Chaoqi Liu,
Peilin Wu,
Haonan Chen,
Yilun Du
Preprint 2026, [Project][Paper][Code][Video][BibTeX]
@article{gao2024impact,
title={Learning Internal-Model Predictive Control for Forceful Robotic Manipulation},
author={Gao, Jiawei and Liu, Chaoqi and Wu, Peilin and Chen, Haonan and Du, Yilun},
journal={arXiv preprint arXiv:2406.14558},
year={2024}
}
SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
Haowen Liu*,
Shaoxiong Yao*,
Haonan Chen,
Jiawei Gao,
Jiayuan Mao,
Jia-Bin Huang,
Yilun Du
* Equal contribution
CVPR 2026, [Project][Paper][BibTeX]
@article{simpact2025,
title={SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models},
author={Liu, Haowen and Yao, Shaoxiong and Chen, Haonan and Gao, Jiawei and Mao, Jiayuan and Huang, Jia-Bin and Du, Yilun},
journal={arXiv preprint arXiv:2512.05955},
year={2025}
}
Multi-Modal Manipulation via Multi-Modal Policy Consensus
Haonan Chen,
Jiaming Xu*,
Hongyu Chen*,
Kaiwen Hong,
Binghao Huang,
Chaoqi Liu,
Jiayuan Mao,
Yunzhu Li,
Yilun Du+, and
Katherine Driggs-Campbell+
* Equal contribution, + Equal advising
ICRA 2026, [Project][Paper][Code][Dataset][Video][Audio][Blog][Deepwiki][BibTeX]
@inproceedings{chen2026multimodal,
title={Multi-Modal Manipulation via Multi-Modal Policy Consensus},
author={Chen, Haonan and Xu, Jiaming and Chen, Hongyu and Hong, Kaiwen and Huang, Binghao and Liu, Chaoqi and Mao, Jiayuan and Li, Yunzhu and Du, Yilun and Driggs-Campbell, Katherine},
booktitle={2026 IEEE International Conference on Robotics and Automation (ICRA)},
year={2026}
}
Best Paper Award at CVPR 2026 Workshop on Multi-Sensory Modeling for Embodied Intelligence [Link]
Media Coverage: Featured in Video Friday on [IEEE Spectrum]