Yujia Bao
🔬 Machine Learning Researcher
🚀 Associate Director @ Accenture
🐾 Proud owner of Samoyeds
Hello! I am a machine learning researcher and a lifelong engineer who cannot live without Vim. My goal is to push the frontier of AI and make it useful and safe for humanity.
Currently, I manage a team of 80+ research scientists and engineers at Accenture, focusing on AI for the enterprise. I lead the development of AI Refinery, an agentic AI platform driving AI adoption for Fortune 500 companies.
My work spans building scalable agent architectures, optimizing LLM post-training, and advancing fundamental machine learning algorithms. I am driven by the excitement of “zero to one” innovation—translating cutting-edge research into stable, production-grade platforms that solve real-world problems.
I received my Ph.D. and S.M. in Computer Science from MIT CSAIL, advised by Regina Barzilay. Prior to MIT, I earned an M.A. in Mathematics from UW-Madison and a B.S. in Mathematics from Shanghai Jiao Tong University.
You can find my full resume here.
Recent Work
AI Refinery: Enterprise Agentic Platform
I lead the engineering and research for AI Refinery, enabling developers to build and govern complex agentic workflows.
- Agent Orchestration: Developed the Distiller framework, which decomposes complex user queries into tasks for specialized agents, and designed context management algorithms for efficient multi-agent memory sharing (ICLR 2025); a minimal orchestration sketch follows this list.
- Engineering Standards: Grew the team from 3 to 80, establishing core engineering standards, code maturity levels, and performance metrics to ensure production-grade platform availability.
- Ecosystem Growth: Scaled the ecosystem by organizing global workshops and tutorials, upskilling 10,000+ developers to accelerate GenAI delivery.
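To give a flavor of the orchestration pattern, here is a minimal sketch, not the actual Distiller implementation: the `Orchestrator`, `Task`, and placeholder `plan` method are hypothetical stand-ins, and a production planner would call an LLM to decompose the query and assign agents.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration; the real framework is far richer.

@dataclass
class Task:
    description: str
    agent_name: str

class Orchestrator:
    """Routes decomposed sub-tasks to specialized agents and merges their answers."""

    def __init__(self) -> None:
        self.agents: dict[str, Callable[[str], str]] = {}

    def register(self, name: str, agent: Callable[[str], str]) -> None:
        self.agents[name] = agent

    def plan(self, query: str) -> list[Task]:
        # Placeholder planner: fan the query out to every registered agent.
        # A real system would decompose the query into distinct sub-tasks.
        return [Task(description=query, agent_name=name) for name in self.agents]

    def run(self, query: str) -> str:
        results = [self.agents[t.agent_name](t.description) for t in self.plan(query)]
        return "\n".join(results)

if __name__ == "__main__":
    orch = Orchestrator()
    orch.register("search", lambda q: f"[search agent] results for: {q}")
    orch.register("summarize", lambda q: f"[summarizer] summary of: {q}")
    print(orch.run("Quarterly revenue drivers for our retail clients"))
```

Keeping the planner separate from agent execution is what makes this shape governable: the plan can be inspected, logged, and constrained before any agent runs.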
LLM Customization
To support domain-specific enterprise needs, I lead LLM customization efforts spanning multiple stages of training:
- Pre-training: Built automated pipelines to curate and filter large-scale open-source datasets (Wikipedia, arXiv), implementing quality taggers to ensure robust model foundations (a toy filtering sketch follows this list).
- Mid-training: Adapted models to domain-specific contexts (e.g., Fortune Analytics), processing proprietary multi-modal assets and eliminating biases from sensitive data.
- Post-training: Optimized performance through advanced techniques including KV-cache reuse (NeurIPS 2025), targeted unlearning (ICLR 2025), and SFT data selection (ICLR 2025).
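As an illustration of the pre-training curation step above, here is a toy quality-tagging filter. The heuristics and thresholds (`min_len`, `min_alpha`, `max_rep`) are hypothetical stand-ins; production taggers typically combine model-based quality classifiers, deduplication, and language identification.

```python
def quality_tags(doc: str) -> dict[str, float]:
    """Compute simple per-document quality signals."""
    words = doc.split()
    if not words:
        return {"length": 0.0, "alpha_ratio": 0.0, "repetition": 1.0}
    return {
        "length": float(len(words)),
        # Fraction of purely alphabetic tokens (punctuation-heavy text scores low).
        "alpha_ratio": sum(w.isalpha() for w in words) / len(words),
        # High values indicate boilerplate or spam-like repetition.
        "repetition": 1.0 - len(set(words)) / len(words),
    }

def keep(doc: str, min_len: int = 20, min_alpha: float = 0.7, max_rep: float = 0.5) -> bool:
    tags = quality_tags(doc)
    return (
        tags["length"] >= min_len
        and tags["alpha_ratio"] >= min_alpha
        and tags["repetition"] <= max_rep
    )

docs = [
    "buy now buy now buy now",
    "Large language models are pre-trained on curated corpora drawn from "
    "sources such as Wikipedia and arXiv, then adapted to downstream "
    "domains through mid-training and post-training stages.",
]
print([keep(d) for d in docs])  # -> [False, True]
```

Tagging and filtering are kept as separate steps so the same tags can be re-thresholded or audited later without re-scanning the corpus.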
Machine Learning Foundations
Prior to my current focus on GenAI, my research centered on transformer architectures, fairness, and human-machine interaction:
- Vision Transformers: Developed Channel ViT (ICLR 2024) and Contextual ViT to handle covariate shifts in biological imaging (a minimal sketch of the channel-wise embedding idea follows this list).
- Fairness & Robustness: Developed algorithms (ICML 2021, ICML 2022) for automatic bias discovery by learning challenging data splits.
- Human-Machine Interaction: Enhanced NLP system interpretability by deriving machine attention from human rationales (EMNLP 2018).
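The sketch below illustrates the channel-wise embedding idea behind Channel ViT: each input channel is patchified independently with a shared projection, and a learnable channel embedding marks which channel each token came from. Module names, sizes, and the exact forward pass are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ChannelWisePatchEmbed(nn.Module):
    """Minimal sketch of a Channel ViT-style patch embedding."""

    def __init__(self, num_channels: int, patch_size: int = 16, dim: int = 192):
        super().__init__()
        # Shared single-channel patch projection, applied to every channel.
        self.proj = nn.Conv2d(1, dim, kernel_size=patch_size, stride=patch_size)
        # One learnable embedding per input channel.
        self.channel_embed = nn.Parameter(torch.zeros(num_channels, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        tokens = []
        for i in range(c):
            t = self.proj(x[:, i : i + 1])            # (b, dim, h/p, w/p)
            t = t.flatten(2).transpose(1, 2)          # (b, n_patches, dim)
            tokens.append(t + self.channel_embed[i])  # add channel identity
        # Sequence length grows linearly with the number of channels, which
        # is what lets the model cope with varying channel subsets.
        return torch.cat(tokens, dim=1)

x = torch.randn(2, 5, 32, 32)  # e.g., a 5-channel microscopy image
emb = ChannelWisePatchEmbed(num_channels=5, patch_size=16)
print(emb(x).shape)            # torch.Size([2, 20, 192])
```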