Multi-Object Navigation (MultiON) Challenge

Held in conjunction with the Embodied AI Workshop at CVPR 2023

An example episode from the MultiON challenge 2023.


The Multi-Object Navigation (MultiON) challenge is hosted at the Embodied AI Workshop, CVPR 2023. The challenge builds upon the task introduced in MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation (NeurIPS 2020). MultiON is a long-horizon generalization of the object-navigation task, in which the agent must navigate to multiple goals.


  • Challenge starts (dataset and starter code available; EvalAI opens for Minival Phase submissions): March 20, 2023
  • Leaderboard opens (Test Standard Phase and Test Challenge Phase submissions): March 20, 2023
  • Challenge submission deadline: June 3, 2023 (AoE)
  • Winner announcement at the Embodied AI Workshop, CVPR 2023: June 2023


In MultiON, an agent is tasked with navigating to a sequence of objects. These objects are flexibly placed on surfaces within a realistic 3D environment. The task is built on the Habitat platform and the Habitat-Matterport 3D Semantics (HM3D-Semantics) scenes.

Each episode contains 3 target objects randomly sampled from a set of 50 classes. For every class there are multiple objects available (e.g., for the "notebook" class there are 9 different objects). We did this to add instance-level variation within each category and test the perception capabilities of the agents. We used 3D objects from ShapeNet. Additionally:
  • The training set contains only 40 of the classes; the validation set contains 45 (including 5 "zero-shot" classes unseen in training); and the test-standard/test-challenge sets contain 45 (including 5 "zero-shot" classes, different from the validation ones).
  • Each episode contains 5 distractor (non-target) objects randomly scattered on surfaces around the environment, to increase the difficulty of the task.
In summary, in each episode the agent is initialized at a random starting position and orientation in an unseen environment and given a sequence of 3 target objects randomly sampled (without replacement) from the set of 50 object classes. The agent must navigate to each target object in the sequence (in the given order), ignoring distractor objects, and call the FOUND action to indicate discovery of each one. The agent has access to an RGB-D camera and a noiseless GPS+Compass sensor; the GPS+Compass sensor provides the agent's current location and orientation relative to the start of the episode.
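The interaction described above can be sketched as follows. This is a minimal illustrative mock, not the actual habitat-lab API: the environment class, action names, and observation keys here are hypothetical stand-ins, and the oracle "policy" simply teleports to each goal to exercise the FOUND logic.

```python
import math

# Illustrative action space; the real challenge uses habitat-lab's action names.
ACTIONS = ["MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT", "FOUND"]
SUCCESS_DIST = 1.5  # metres: FOUND is correct within this radius of the current target


class MockMultiONEnv:
    """Toy 2D stand-in for a MultiON episode with an ordered goal sequence."""

    def __init__(self, targets):
        self.targets = list(targets)   # ordered (x, y) goal positions
        self.agent_pos = (0.0, 0.0)
        self.next_goal = 0             # index of the current target in the sequence

    def observations(self):
        # Real agents receive RGB-D + GPS+Compass; here we expose position only.
        return {"gps": self.agent_pos, "goal_index": self.next_goal}

    def step(self, action):
        """Returns (episode_over, failed)."""
        if action == "FOUND":
            if math.dist(self.agent_pos, self.targets[self.next_goal]) <= SUCCESS_DIST:
                self.next_goal += 1                      # correct FOUND: advance
                return self.next_goal == len(self.targets), False
            return True, True                            # incorrect FOUND ends the episode
        if action == "MOVE_FORWARD":
            # Oracle shortcut for illustration: jump to the current goal.
            self.agent_pos = self.targets[self.next_goal]
        return False, False


env = MockMultiONEnv(targets=[(2.0, 3.0), (5.0, 1.0), (7.0, 7.0)])
done = failed = False
while not done:
    obs = env.observations()
    done, failed = env.step("MOVE_FORWARD")  # walk to the current goal...
    done, failed = env.step("FOUND")         # ...then declare discovery
print(env.next_goal)  # prints 3: all goals found in order
```

The episode ends either when all three goals have been found in order (success) or at the first incorrect FOUND call (failure), matching the termination rule described in the evaluation section.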

Evaluation Details

The episode terminates when the agent discovers all objects in the sequence of the current episode or when it calls an incorrect FOUND action. A FOUND action is incorrect if it is called when the agent is not within 1.5m of its current target object. Note that this does not require the agent to be viewing the object at the time of calling FOUND. After the episode terminates, the agent is evaluated using the Progress and PPL metrics defined below.
Progress: The proportion of objects correctly found in the episode.
PPL: Progress weighted by path length. PPL quantifies the efficiency of the agent's trajectory with respect to the optimal trajectory through the goals it found.
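The two metrics can be sketched as below, assuming the SPL-style weighting used in the original MultiON paper (Progress scaled by the ratio of the optimal path length to the agent's actual path length); the function names and signatures here are illustrative, not the official evaluation code.

```python
def progress(num_found: int, num_targets: int = 3) -> float:
    """Fraction of target objects correctly found in the episode."""
    return num_found / num_targets


def ppl(num_found: int, optimal_len: float, agent_len: float,
        num_targets: int = 3) -> float:
    """Progress weighted by path length (SPL-style).

    optimal_len: shortest-path length covering, in order, the goals the
                 agent actually found; agent_len: length of the path taken.
    """
    prog = progress(num_found, num_targets)
    if prog == 0.0:
        return 0.0  # no goals found: both Progress and PPL are zero
    # The max() guards against agent paths shorter than the reference path.
    return prog * optimal_len / max(agent_len, optimal_len)


# Example: 2 of 3 goals found, travelling 25 m where 20 m would have sufficed.
print(round(ppl(2, optimal_len=20.0, agent_len=25.0), 3))  # prints 0.533
```

As with SPL for single-goal navigation, an agent that finds every goal along the shortest possible path scores PPL = 1.0, and any detour or missed goal lowers the score.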

Submission Guidelines

Participants must make submissions through our EvalAI page. There are three phases in the challenge.

Phase 1: Minival Phase

This phase evaluates MultiON agents on the minival set of the MultiON dataset. It is meant for sanity-checking the results of remote evaluation against your local evaluations.

Phase 2: Test Standard Phase

The results of this phase will be used to prepare the public leaderboard for the challenge. We suggest using this phase for reporting results in papers and for comparing with other methods. Each team is allowed a maximum of 3 submissions per day for this phase.

Phase 3: Test Challenge Phase

Only submissions made in this phase are considered entries to the MultiON Challenge, since it will be used to decide the winners. Each team is allowed a total of 3 submissions to this phase until it ends. For detailed submission instructions, please refer to this.

Challenge Updates

Any updates related to the challenge will be posted here. Please join the Google Group email list to receive updates about the challenge: click here to join or send an email to

Terms and Conditions

Habitat-Sim is released under the MIT license. To use the HM3D-Semantics dataset, please refer here. If you use Habitat-Sim or the MultiON dataset in a paper, please consider citing the following publications:
@inproceedings{savva2019habitat,
    title     = {Habitat: {A} {P}latform for {E}mbodied {AI} {R}esearch},
    author    = {Manolis Savva and Abhishek Kadian and Oleksandr Maksymets and Yili Zhao and Erik Wijmans and Bhavana Jain and Julian Straub and Jia Liu and Vladlen Koltun and Jitendra Malik and Devi Parikh and Dhruv Batra},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2019}
}

@inproceedings{wani2020multion,
    title     = {Multi-ON: Benchmarking Semantic Map Memory using Multi-Object Navigation},
    author    = {Saim Wani and Shivansh Patel and Unnat Jain and Angel X. Chang and Manolis Savva},
    booktitle = {Neural Information Processing Systems (NeurIPS)},
    year      = {2020}
}


Tommaso Campari
SFU, U of Padova, FBK
Unnat Jain
Facebook AI Research, CMU