MANSION

Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks

Lirong Che*,1,2  Shuo Wen*,3,§  Shan Huang1  Chuang Wang2

Yuzhe Yang2  Gregory Dudek3  Xueqian Wang†,1  Jian Su†,2

1Tsinghua University 2AgiBot 3McGill University, MILA – Quebec AI Institute

* Equal contribution. † Corresponding authors.
§ Work done during an internship at AgiBot.

Meet MANSION

Generate building-scale, multi-floor 3D interactive worlds from a single natural-language prompt, and explore MansionWorld with 1,000+ buildings. Bring your embodied agents and train at scale!

Building Map

Building Diagram
Floor 1 Preview
Long-Horizon Task
"Starting from the 3rd floor, go to the 1st floor to pick up my noodle delivery, then put it into the refrigerator in the 4th-floor dining area."

MansionWorld Dataset

MansionWorld contains 1,000+ interactive multi-floor buildings, ranging from 2 to 10 floors, with 10,000+ rooms in total. It includes non-residential environments such as offices, hospitals, schools, and supermarkets.

Sample Scenes Preview

A selection of diverse environments available in the dataset.

A three-story luxury villa equipped with entertainment and exercise facilities

A large-scale hospital

A high school building

A four-story office building

An entertainment complex

A compact apartment designed for two people

Citation

If you use MANSION in your research, please cite:

@misc{che2026mansionmultifloorlanguageto3dscene,
  title         = {MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks},
  author        = {Lirong Che and Shuo Wen and Shan Huang and Chuang Wang and Yuzhe Yang and Gregory Dudek and Xueqian Wang and Jian Su},
  year          = {2026},
  eprint        = {2603.11554},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2603.11554},
}