Going from Anywhere
Long "Wonderjourneys"
Going to Everywhere
Controlled WonderJourney
Abstract
We introduce WonderJourney, a modularized framework for perpetual scene generation. Unlike prior work on view generation that focuses on a single type of scene, we start at any user-provided location (given as a text description or an image) and generate a journey through a long sequence of diverse yet coherently connected 3D scenes. We leverage an LLM to generate textual descriptions of the scenes in this journey, a text-driven point cloud generation pipeline to produce a compelling and coherent sequence of 3D scenes, and a large VLM to verify the generated scenes. We show compelling, diverse visual results across various scene types and styles, forming imaginary "wonderjourneys".
"No, no! The adventures first, explanations take such a dreadful time." (Alice's Adventures in Wonderland)
Overview Video
Approach
Our modular design requires no training, so each component can be swapped for a stronger one as vision and language models advance.
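To make the modular structure concrete, the following is a minimal sketch of how the three components named in the abstract (an LLM that describes the next scene, a text-driven scene generator, and a VLM that verifies the result) could be wired together as interchangeable callables. It is not the authors' implementation: the names `Scene`, `generate_journey`, `describe_next`, `generate_scene`, and `verify_scene` are hypothetical, the retry limit is an assumption, and the sketch starts from a text description only.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Scene:
    """Hypothetical container for one generated scene."""
    description: str
    points: Any  # placeholder for point-cloud data; format is an assumption


def generate_journey(
    start_description: str,
    num_scenes: int,
    describe_next: Callable[[List[str]], str],                # stands in for the LLM
    generate_scene: Callable[[str, Optional[Scene]], Scene],  # stands in for the point cloud pipeline
    verify_scene: Callable[[Scene], bool],                    # stands in for the VLM check
    max_retries: int = 3,                                     # assumed; not stated in the source
) -> List[Scene]:
    """Chain scenes: describe, generate, verify, and retry on rejection."""
    journey: List[Scene] = []
    history: List[str] = [start_description]
    previous: Optional[Scene] = None

    for _ in range(num_scenes):
        description = describe_next(history)
        for _ in range(max_retries):
            candidate = generate_scene(description, previous)
            if verify_scene(candidate):
                break  # candidate accepted
        else:
            # No candidate passed verification; end the journey early.
            break
        journey.append(candidate)
        history.append(description)
        previous = candidate

    return journey
```

Because each stage is passed in as a plain function, replacing the language model, the scene generator, or the verifier only means passing a different callable; the loop itself does not change.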
BibTeX