TL;DR:
Visual Chronicles is the first use of multimodal LLMs (MLLMs) to analyze massive collections of images,
to answer open-ended queries such as “what are the trending changes in a city?”.
We build a system that breaks down the massive-scale analysis into two stages: local analysis and global aggregation.
We design effective and scalable solutions for each stage using MLLMs.
The results of the analysis are trends in text, along with the visual evidence of trending changes.
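As a rough illustration of the two-stage structure described above, here is a minimal Python sketch. It is not the system's implementation: the names `Change`, `local_analysis`, `global_aggregation`, and `describe_change` are hypothetical, `describe_change` stands in for an MLLM call that is not shown, and grouping changes by normalized text is a deliberately simple placeholder for whatever aggregation the actual system performs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Change:
    """One detected change at one location (hypothetical record type)."""
    location_id: str
    description: str  # free-text description of a before/after change


def local_analysis(
    image_pairs: Iterable[tuple[str, str, str]],
    describe_change: Callable[[str, str], list[str]],
) -> list[Change]:
    """Stage 1: look at each location on its own.

    `image_pairs` yields (location_id, before_image, after_image).
    `describe_change` is an assumed stand-in for a model call that
    returns zero or more text descriptions of what changed.
    """
    changes = []
    for location_id, before, after in image_pairs:
        for description in describe_change(before, after):
            changes.append(Change(location_id, description))
    return changes


def global_aggregation(
    changes: Iterable[Change], min_locations: int = 2
) -> list[tuple[str, int]]:
    """Stage 2: combine local changes into trends.

    Assumption: changes are grouped by exact normalized text, which is
    only a placeholder for a real semantic grouping step.
    Returns (description, number of distinct locations), most frequent first.
    """
    locations_by_description: dict[str, set[str]] = {}
    for change in changes:
        key = change.description.strip().lower()
        locations_by_description.setdefault(key, set()).add(change.location_id)
    trends = [
        (description, len(locations))
        for description, locations in locations_by_description.items()
        if len(locations) >= min_locations
    ]
    # Sort by support (descending), then alphabetically for determinism.
    return sorted(trends, key=lambda trend: (-trend[1], trend[0]))
```

Each trend returned by the sketch keeps a count of supporting locations; a fuller version would also keep the before/after image pairs so that every text trend comes with its visual evidence.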
Select different trends above to view. For each trend, click a plotted dark-colored icon to view a specific before/after image pair (click the image for a larger view). Light-colored icons show change locations only. View in full screen (top-right button) to spot small-scale changes. Changes are sub-sampled for clearer visualization.
We can search for trending changes that happened within a specific time window, e.g., 2020-2022.
We can also search for trending changes relevant to a specific semantic concept, e.g., retail stores.
We can connect discoveries (left) in Visual Chronicles to socioeconomic events or policy (right).
@misc{deng2025visualchronicles,
title={Visual Chronicles: Using Multimodal LLMs to Analyze Massive
Collections of Images},
author={Boyang Deng and Songyou Peng and Kyle Genova and Gordon Wetzstein
and Noah Snavely and Leonidas Guibas and Thomas Funkhouser},
year={2025},
eprint={2504.08727},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2504.08727},
}
Acknowledgements: Thanks to Jiahui Lei, Anh Thai, Jiapeng Tang, Linyi Jin, Luming Tang, Rundi Wu, Ian Huang, Colton Stearns, Francis Engelman, Manu Gopakumar, Suyeon Choi, Haley So, Richard Tucker, Abhijit Kundu, Jonathan Barron, Glenn Entis, and David Salesin for their comments and constructive discussions, and to Abhijit Kundu, William Freeman, and John Quintero for helping review our draft. G.W. was in part supported by Google, Samsung, and Stanford HAI. B.D. was in part supported by a Qualcomm Innovation Fellowship. This project page is adapted from the Streetscapes project page designed by Richard Tucker.
Disclaimers: Google Maps Street View images used with permission from Google.