About the Workshop

As computer vision systems increasingly transition into real-world applications, reliable and scalable localization across heterogeneous devices becomes critical. The CroCoDL workshop brings together researchers from computer vision, robotics, and augmented reality to address the unique challenges of cross-device, multi-agent localization in complex, real-world environments. With a focus on 3D vision, visual localization, egocentric and embodied AI, and AR/VR/MR, this workshop aims to foster dialogue around bridging the gap between academic benchmarks and real-world deployment.

This inaugural edition features leading experts from academia and industry and introduces CroCoDL, a new large-scale benchmark dataset capturing synchronized sensor data from smartphones, mixed-reality headsets, and legged robots across diverse environments. Through invited talks, a paper track, and an open competition, the workshop will highlight recent advances and open challenges in localization under domain shifts, sensor diversity, and dynamic scene conditions.

By uniting communities working on structure-from-motion, neural rendering, and embodied AI, CroCoDL offers a platform to drive innovation toward robust, scalable localization systems capable of operating across devices, agents, and perspectives.

Accepted papers

Full papers

PixCuboid: Room Layout Estimation from Multi-view Featuremetric Alignment
Best Student Paper Award
Gustav Hanning, Kalle Åström, Viktor Larsson

LightGlueStick: a Fast and Robust Glue for Joint Point-Line Matching
Aidyn Ubingazhibov, Rémi Pautrat, Iago Suárez, Shaohui Liu, Marc Pollefeys, Viktor Larsson

Triangulation of 3D target points from radar range and bearing data
Magnus Oskarsson

The Overlooked Value of Test-time Reference Sets in Visual Place Recognition
Mubariz Zaffar, Liangliang Nan, Sebastian Scherer, Julian F. P. Kooij

Extended abstracts

AsymLoc: Towards Asymmetric Feature Matching for Efficient Visual Localization
Mohammad Omama, Gabriele Berton, Yelin Kim

Resolution Where It Counts: Hash-based GPU-Accelerated 3D Reconstruction via Variance-Adaptive Voxel Grids
Lorenzo De Rebotti, Emanuele Giacomini, Giorgio Grisetti, Luca Di Giammarino

G-solver: Gaussian Belief Propagation and Gaussian Processes for Continuous-Time SLAM
Davide Ceriola, Simone Ferrari, Luca Di Giammarino, Leonardo Brizi, Giorgio Grisetti

Geo-NVS-w: Geometry-Aware Novel View Synthesis In-the-Wild with an SDF Renderer
Anastasios Tsalakopoulos, Angelos Kanlis, Evangelos Chatzis, Antonis Karakottas, Dimitrios Zarpalas

3D Reconstruction of Underwater Features in Lake Tahoe with an Autonomous Underwater Vehicle
Selena Sun, Rohan Tan Bhowmik, Elizabeth McElhinney

Renderer-Aware Cramér-Rao Bounds for Camera Pose on SE(3)
Arun Muthukkumar

Challenge Winners

🥇PICO, Bytedance Inc. Team
92.62% overall recall
Meixia Lin*, Mingkai Liu*, Shuxue Peng, DiKai Fan, Shengyu Gu, Xianliang Huang, Haoyang Ye, Xiao Liu

🥈Hoang/Khang/Gabriele Team
91.05% overall recall
Huy-Hoang Bui*, Danh-Khang Cao*, Gabriele Berton*

Schedule

1PM-5PM Honolulu Time
Monday, October 20th, 2025
Room 301B

Time            Activity
13.00 - 13.15   Opening
13.15 - 13.45   Keynote by Gabriela Csurka: Privacy Preserving Visual Localization
13.45 - 14.00   Introduction of CroCoDL challenge
14.00 - 14.15   Challenge Winners
14.15 - 14.45   Keynote by Ayoung Kim: Bridging heterogeneous sensors for robust and generalizable localization
14.45 - 15.00   Highlighted Paper Talk
15.00 - 16.00   Posters (248-257) and Coffee Break
16.00 - 16.30   Keynote by David Caruso: Current performance of visual-inertial SLAM on egocentric data
16.30 - 17.00   Keynote by Torsten Sattler: Vision Localization Across Modalities
17.00 - 17.05   Closing

Invited Speakers

Dr. Ayoung Kim

Associate Professor at Seoul National University

Ayoung Kim leads the Robust Perception for Mobile Robotics Lab at Seoul National University. Her research aims to develop robust and reliable perception systems that enhance mobile robot navigation in complex and dynamic environments. She has made significant contributions to the field, particularly in visual localization, sensor fusion, and learning-based perception techniques.

Dr. Torsten Sattler

Senior Researcher at Czech Technical University in Prague

Torsten Sattler is a senior researcher at the Czech Technical University in Prague (CTU), where he leads the Spatial Intelligence group. He works towards making 3D computer vision algorithms, such as 3D reconstruction and visual localization, more robust and reliable through machine learning models trained on scene understanding and 3D computer vision tasks.

Dr. Gabriela Csurka

Principal Research Scientist at Naver Labs Europe

Gabriela Csurka is a principal research scientist at Naver Labs Europe. Her work bridges fundamental research and real-world applications in robotics and augmented reality. Her contributions to the field span various topics, including feature learning, visual place recognition, and cross-domain adaptation.

Dr. David Caruso

Research Scientist at Meta Reality Labs Research

David Caruso is a Research Scientist at Meta Reality Labs with 10 years of industry experience in human localization technology, including VIO and visual SLAM, as well as non-visual approaches based on multi-sensor fusion. He obtained his PhD from Université Paris-Saclay, where he pioneered new visual-inertial-magnetic odometry technology. At Meta, he works on the core algorithms powering Project Aria's localization stack.

Challenge

The workshop challenge is centered around a newly accepted dataset at CVPR 2025 — CroCoDL: Cross-device Collaborative Dataset for Localization (pre-rebuttal version available at link). To advance research in visual co-localization, we introduce CroCoDL, a significantly larger and more diverse dataset and benchmark. CroCoDL is the first dataset to incorporate sensor recordings from both robots and mixed-reality headsets, covering a wider range of real-world environments than any existing cross-device visual localization dataset. It includes synchronized sensor streams from three primary devices: hand-held smartphones, the head-mounted HoloLens 2, and the legged robot Spot.

For this challenge, we have selected two large-scale locations — Hydrology and Succulent — where we will release mapping and query splits. These splits will be used to evaluate visual localization performance in a cross-device setup: the map is built from data captured by one device, and the goal is to localize images taken by a different device within this map. The primary evaluation metric for submissions is single-image localization recall at 50 cm and 5 degrees of pose error.
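To make the metric concrete, the following is a minimal sketch of how recall at a pose-error threshold is typically computed. The function names and pose conventions (3x3 world-to-camera rotation matrices and 3-vector translations) are illustrative assumptions, not the official evaluation code.

```python
import numpy as np

def pose_error(R_est, t_est, R_gt, t_gt):
    """Translation error (meters) and rotation error (degrees) between an
    estimated and a ground-truth camera pose (illustrative convention:
    3x3 rotation matrices, 3-vector translations in a shared frame)."""
    dt = np.linalg.norm(t_est - t_gt)
    # Angle of the relative rotation R_est^T @ R_gt, recovered from its trace.
    cos_theta = np.clip((np.trace(R_est.T @ R_gt) - 1.0) / 2.0, -1.0, 1.0)
    dr = np.degrees(np.arccos(cos_theta))
    return dt, dr

def recall_at(errors, t_thresh=0.5, r_thresh=5.0):
    """Fraction of queries localized within BOTH 50 cm and 5 degrees."""
    hits = [dt <= t_thresh and dr <= r_thresh for dt, dr in errors]
    return float(np.mean(hits))
```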

The competition will be split into two tracks:

  • T1. End-to-end methods. This track includes end-to-end methods such as scene coordinate regression, pose regression, neural radiance fields, Gaussian splatting, and feed-forward SfM.

  • T2. Other methods. This track includes any other approaches that do not fall into the first track. These approaches may incorporate learned components (e.g., feature extractors, sparse / dense matchers, or keypoint refinement techniques), but are not end-to-end trainable; see the sketch below for a typical pipeline of this kind.
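As an illustration of a typical T2-style pipeline (image retrieval, feature matching, then geometric pose estimation), here is a hedged sketch of the final pose-estimation step using OpenCV's PnP + RANSAC. The upstream retrieval and matching steps, the function name, and the array shapes are assumptions for illustration, not a prescribed baseline.

```python
import cv2
import numpy as np

def localize_query(pts3d, kpts2d, K):
    """Estimate the pose of a query image from 2D-3D correspondences obtained
    by matching query features against a pre-built map (correspondence
    generation is assumed to happen upstream).

    pts3d:  (N, 3) 3D map points
    kpts2d: (N, 2) matched keypoints in the query image
    K:      (3, 3) intrinsics of the query device's camera
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float32),
        kpts2d.astype(np.float32),
        K, distCoeffs=None,
        reprojectionError=4.0,   # inlier tolerance in pixels
        iterationsCount=2000)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # axis-angle -> rotation matrix (world-to-camera)
    return R, tvec, inliers
```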

To be eligible for the awards, the top-performing teams will be required to submit a detailed technical report outlining their method.

Timeline for the Challenge:

• Deadline: 6th of October, 23:59 (11:59 PM) AoE

Call for Papers

We invite 8-page full papers for inclusion in the proceedings, as well as 4-page extended abstracts. Extended abstracts may present either new or previously published work; however, they will not be included in the official proceedings.

Please note that 4-page extended abstracts generally do not conflict with the dual submission policies of other conferences. In contrast, 8-page full papers, if accepted, will appear in the proceedings and are therefore subject to the dual submission policy. This means they must not be under review or accepted at another conference at the same time.

All submissions must be anonymous and comply with the official ICCV 2025 guidelines.

Topics of Interest

• 3D reconstruction
• Visual localization & structure-from-motion
• Image retrieval
• Implicit scene representations
• Egocentric & embodied AI
• Domain shift & sensor diversity
• Real-world deployment at scale

Submission Timeline for 4-page abstracts:

• Submission Portal: OpenReview
• Abstract Submission Opens: 16th of May, 2025
• Abstract Submission Deadline: 26th of September, 2025
• Notification to Authors: 10th of October, 2025
• Camera-ready Submission: 17th of October, 2025

Submission Timeline for 8-page full papers:

• Submission Portal: OpenReview
• Paper Submission Opens: 16th of May, 2025
• Paper Submission Deadline: 30th of June, 2025
• Notification to Authors: 7th of July, 2025
• Camera-ready Submission: 15th of August, 2025

Organizers

Dr. Zuria Bauer

Postdoc
ETH Zurich

JProf. Hermann Blum

Junior Professor
University of Bonn

Dr. Mihai Dusmanu

Senior Scientist
Microsoft

Linfei Pan

PhD Student
ETH Zurich

Dr. Qunjie Zhou

Research Scientist
Nvidia

Petar Lukovic

Research Assistant
ETH Zurich

Prof. Marc Pollefeys

Professor, ETH Zurich
Director, Spatial AI Lab, Microsoft
