
Announcing the Obstacle Tower Challenge winners and open source release

The Computational Science research group, led by Gianni De Fabritiis, placed second in the Unity Obstacle Tower Challenge, in which an agent had to navigate a complex 3D environment and solve tasks.

08.08.2019


 

We share below a blog post by Arthur Juliani and Jeffrey Shih (Unity) about the Obstacle Tower Challenge. The Computational Science research group, led by Gianni De Fabritiis, placed second in this competition, in which an agent had to navigate a complex 3D environment and solve tasks.

 

After six months of competition (and a few last-minute submissions), we are happy to announce the conclusion and winners of the Obstacle Tower Challenge. We want to thank all of the participants in both rounds and congratulate Alex Nichols, the Compscience.org team, and Songbin Choi for placing in the challenge. We are also excited to share that we have open-sourced Obstacle Tower for the research community to extend for their own needs.

Challenge winners

We started this challenge in February as a way to help foster research in the AI community, by providing a challenging new benchmark of agent performance built in Unity, which we called Obstacle Tower. The Obstacle Tower was developed to be difficult for current machine learning algorithms to solve, and to push the boundaries of what was possible in the field by focusing on procedural generation. Key to that was only allowing participants access to one hundred instances of the Obstacle Tower, and evaluating their trained agents on a set of unique procedurally generated towers they had never seen before. In this way, agents had to be able not only to solve the versions of the environment they had seen before, but also to do well on unexpected variations, a key property of intelligence referred to as generalization.

Once we created Obstacle Tower we performed preliminary benchmarking ourselves using two of the state-of-the-art algorithms at the time. Our learned agents were able to solve an average of a little over 3 floors on the unseen instances of the tower used for evaluation. In contrast, humans without experience playing video games are able to solve an average of 15 floors, often getting as high as 20 floors into a tower.

Since the start of the contest we have received close to 3,000 submitted agents and been delighted to watch as participants continued to submit even more compelling agents for evaluation. The top six final agents submitted by participants were able to solve over 10 floors of unseen versions of the tower, with the top entry solving an average of nearly 20 floors, similar to the performance of experienced human players. We wanted to highlight all participants who solved at least ten floors during evaluation, as well as our top three winners.

 

Challenge Winners

| Place | Name | Username | Average floors | Average reward |
|-------|------|----------|----------------|----------------|
| 1st | Alex Nichols | unixpickle | 19.4 | 35.86 |
| 2nd | Compscience.org | giadefa | 16 | 28.7 |
| 3rd | Songbin Choi | sungbinchoi | 13.2 | 23.2 |

Honorable Mentions

| Place | Name | Username | Average floors | Average reward |
|-------|------|----------|----------------|----------------|
| 4th | Joe Booth | joe_booth | 10.8 | 18.06 |
| 5th | Doug Meng | dougm | 10 | 16.5 |
| 6th | UEFDL | Miffyli | 10 | 16.42 |

 

Open source release

We are happy to announce that all of the source code for Obstacle Tower is now available under the Apache 2.0 license. We waited to open source it until the contest was completed, to prevent anyone from reverse-engineering the task or evaluation process. Now that it is over, we hope researchers and users are able to take things apart to learn how to solve the task better, as well as modify the Obstacle Tower for their own needs. The Obstacle Tower was built to be highly modular, and relies heavily on procedural generation of multiple aspects of the environment, from the floor layout to the item and module placement in each room. We expect that this modularity will make it easy for researchers to define their own custom tasks using the pieces and tools we've built.

The focus of the Obstacle Tower Challenge is what we refer to in our paper as weak generalization (sometimes called within-distribution generalization). For the challenge, agents had access to one hundred towers, and were tested on an additional five towers. Importantly, all of these towers were generated using the same set of rules. As such, there were no big surprises for the agents.
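The evaluation protocol described above amounts to a simple split over tower seeds: train on a fixed pool, evaluate on seeds the agent has never seen. A minimal sketch of that split (the seed values below are illustrative, not the actual seeds used in the challenge):

```python
import random

# The 100 fixed towers available during training (illustrative seed values).
TRAIN_SEEDS = list(range(100))
# A handful of unseen towers held out for evaluation.
EVAL_SEEDS = [100, 101, 102, 103, 104]

def next_episode_seed(rng, evaluation=False):
    """Pick a tower seed: a training tower normally, an unseen one at eval time."""
    pool = EVAL_SEEDS if evaluation else TRAIN_SEEDS
    return rng.choice(pool)

rng = random.Random(7)
assert next_episode_seed(rng) in TRAIN_SEEDS
assert next_episode_seed(rng, evaluation=True) not in TRAIN_SEEDS
```

In practice the chosen seed would be passed to the Obstacle Tower instance before each reset; because all towers come from the same generator, this only tests within-distribution (weak) generalization.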

Also of interest is a different kind of generalization, what we refer to as the strong kind (sometimes called out-of-distribution generalization). In this scenario, the agent would be tested on a version of Obstacle Tower generated using a different set of rules from the training set. In our paper, we held out a separate visual theme for the evaluation phase, which used different textures, geometry, and lighting. Because our baseline agents performed catastrophically in these cases, we opted to only test for weak generalization in the challenge. That being said, we think that strong generalization benchmarks can be an even better measure of progress in artificial intelligence, as humans are easily able to strongly generalize, while agents typically fail at such tasks. We look forward to the community extending our work and proposing their own unique benchmarks using this open source release.
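The strong-generalization setup from the paper, training on some visual themes and evaluating on a held-out one, is a split over generation rules rather than seeds. A minimal sketch, with illustrative theme names:

```python
def split_by_theme(themes, held_out):
    """Train on all visual themes except one; evaluate only on the held-out theme,
    so evaluation towers come from a different generation distribution."""
    if held_out not in themes:
        raise ValueError(f"unknown theme: {held_out}")
    train = [t for t in themes if t != held_out]
    return train, [held_out]

# Illustrative theme names, not necessarily the environment's actual set.
THEMES = ["Ancient", "Moorish", "Industrial", "Modern", "Future"]
train_themes, eval_themes = split_by_theme(THEMES, "Future")
```

Unlike a seed split, the evaluation towers here differ in textures, geometry, and lighting, so the agent cannot simply memorize surface statistics of the training distribution.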

Thanks so much to everyone who participated, and to our partners at Google Cloud for providing GCP credits and AICrowd for hosting the challenge. When we started the competition we weren't sure if participants would be able to pass the ten-floor threshold, but the community has impressed us by getting as far as 19 floors into unseen versions of the tower. That being said, each instance of Obstacle Tower contains 100 floors. This means that there is still 80% of the tower left unsolved! Furthermore, there is a greater need for control and planning in the upper floors, as enemies, more dangerous rooms, and more complicated floor layouts are introduced. We think this means there is a lot of room for new methods to be developed in the field to make additional progress. We look forward to seeing what progress is made over the next months and years as researchers continue to tackle Obstacle Tower.

If you have any questions about the challenge please email us at [email protected]. If you’d like to work on this exciting intersection of Machine Learning and Games, we are hiring for several positions, please apply!

To learn more about the project, as well as how to extend it for your own uses, head over to the GitHub page for the project.

 

2nd Place - Compscience.org

At the Computational Science Laboratory (www.compscience.org) at Universitat Pompeu Fabra, Gianni and Miha work at the interface between computing and different application areas, developing computational models with intelligent behavior. Gianni is the head of the Computational Science Laboratory at Universitat Pompeu Fabra, an ICREA research professor, and a founder of Acellera. Miha is a PhD student in Gianni's biology group. The team felt that the Obstacle Tower Challenge was a good way to quickly learn and iterate on new ideas in a relevant 3D environment.

The team's final model was PPO with a reduced action set and a reshaped reward function. For the first floors the team also used KL terms to induce behaviors in the agent, similar to what Alex Nichols did, but dropped them on higher floors. The team also used a sampling algorithm at key floors to focus the actors on floors and seeds where the agent was neither reliably succeeding nor failing, switching to more standard sampling at higher floors. The team did not have enough time to assess the exact benefit of each method, which they plan to do in the future. They plan to release the source code once they better understand and generalise these aspects. Lastly, the team tried world models (creating a very compressed representation of the observation with an autoencoder and building a policy over this space with evolutionary algorithms). It did not work, but the team learned a lot.
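The sampling idea the team describes, concentrating training on floor/seed pairs where the agent is neither reliably succeeding nor failing, can be sketched as a weighted sampler whose weight peaks at a 50% success rate. This is an illustrative reconstruction, not the team's actual code:

```python
import random

class FocusSampler:
    """Prefer (floor, seed) pairs with success rates near 0.5,
    where each episode carries the most learning signal."""

    def __init__(self, pairs, rng=None):
        self.stats = {p: [0, 0] for p in pairs}  # pair -> [successes, attempts]
        self.rng = rng or random.Random(0)

    def weight(self, pair):
        wins, tries = self.stats[pair]
        if tries == 0:
            return 1.0  # unexplored pairs get full weight
        rate = wins / tries
        # 4 * p * (1 - p) peaks at 1.0 when rate == 0.5, falls to 0 at 0 or 1.
        return max(1e-3, 4.0 * rate * (1.0 - rate))

    def sample(self):
        pairs = list(self.stats)
        weights = [self.weight(p) for p in pairs]
        return self.rng.choices(pairs, weights=weights, k=1)[0]

    def record(self, pair, success):
        entry = self.stats[pair]
        entry[0] += int(success)
        entry[1] += 1
```

Pairs the agent always solves (or always fails) are sampled rarely, while borderline pairs dominate the actors' episodes; the small weight floor keeps every pair occasionally revisited.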

The team enjoyed the Obstacle Tower and believe that environments with more realistic physics will be important, so that agents given enough samples can do amazing things. The team used 10B steps to train their agent. You can find out more about the team on GitHub and the lab's website.

 
