Researchers at the University of Oxford present DynPoint: an AI algorithm designed to facilitate rapid synthesis of novel views from unconstrained monocular videos

The computer vision community places great emphasis on novel view synthesis (NVS) due to its potential to advance virtual and augmented reality and to enhance a machine's ability to understand the visual and geometric aspects of a scene. Recent techniques based on neural rendering have achieved realistic reconstructions of static scenes. However, methods that rely on epipolar geometric relationships are suited to static situations, and real-world scenarios with dynamic elements pose challenges for them.

Recent work has primarily focused on synthesizing views in dynamic settings by using one or more multilayer perceptrons (MLPs) to encode spatiotemporal scene information. One approach builds a comprehensive latent representation of the target video down to the frame level. Although this can produce visually accurate results, the limited memory capacity of MLPs and similar representations restricts the approach to shorter videos.
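To make that capacity bottleneck concrete, here is a minimal sketch, assuming a generic dynamic-NeRF-style design rather than any specific paper's architecture, of how such methods fold an entire video into a single MLP that maps a spatio-temporal coordinate to color and density (all layer sizes are assumptions):

```python
import torch
import torch.nn as nn

# Illustrative sketch only: one MLP encodes the whole video, mapping a
# spatio-temporal coordinate (x, y, z, t) to color and volume density.
class SpatioTemporalMLP(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),   # input: (x, y, z, t)
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # output: (r, g, b, sigma)
        )

    def forward(self, xyzt):
        out = self.net(xyzt)
        rgb = torch.sigmoid(out[..., :3])      # colors constrained to [0, 1]
        sigma = torch.relu(out[..., 3:])       # non-negative density
        return rgb, sigma

# Every frame must be squeezed into this one network's fixed set of weights,
# which is why capacity becomes the bottleneck on longer videos.
```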

To address this limitation, researchers from the University of Oxford introduced DynPoint, a method that does not rely on learning a latent representation and can efficiently generate views from longer monocular videos. Unlike traditional methods that encode this information implicitly, DynPoint explicitly estimates consistent depth and scene flow for surface points. Using these estimates, information from multiple reference frames is aggregated into the target frame; a hierarchical neural point cloud is then built from the aggregated data, and views of the target frame are synthesized from this point cloud.
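The sketch below illustrates the core idea at a high level, with assumed names and shapes throughout: per-pixel depth maps, scene-flow vectors, and camera intrinsics `K` are treated as given, and the hierarchical structure of the point cloud is omitted. It is not the paper's implementation.

```python
import numpy as np

def lift_and_warp(depth_ref, flow_ref_to_tgt, K, H, W):
    """Unproject reference-frame pixels with their estimated depth, then move
    each 3D point by the estimated scene flow into the target frame."""
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # (HW, 3)
    # Back-project: X = D(u, v) * K^-1 [u, v, 1]^T
    pts_ref = (pix @ np.linalg.inv(K).T) * depth_ref.reshape(-1, 1)
    # Scene flow carries each surface point from reference time to target time
    return pts_ref + flow_ref_to_tgt.reshape(-1, 3)

def aggregate_point_cloud(frames):
    """Union the warped points from several reference frames into a single
    target-frame point cloud from which novel views can be rendered."""
    return np.concatenate([lift_and_warp(*f) for f in frames], axis=0)
```

Because the geometry is stored explicitly as points rather than inside network weights, adding more frames grows the point cloud instead of overloading a fixed-capacity MLP.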

This aggregation is supported by learning correspondences between target and reference frames with the help of depth inference and scene flow. To enable rapid synthesis of the target frame within a single video, the researchers introduce a representation that carries information from the reference frames to the target frame. DynPoint's speed and view-synthesis accuracy were evaluated extensively on the Nerfies, Nvidia, HyperNeRF, iPhone, and Davis datasets, and the experimental results demonstrate superior performance in both accuracy and speed.
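Continuing the hypothetical sketch above, a dense correspondence between a reference pixel and the target frame can be read off by projecting its warped 3D point into the target camera (here assuming, for simplicity, that the points are already expressed in the target camera's coordinate frame):

```python
import numpy as np

# Hypothetical continuation of the earlier sketch; the paper's training
# details and loss terms are not reproduced here.
def project_to_target(pts_tgt, K_tgt):
    """Pinhole projection x' = K X / z, returning (u', v') per point."""
    proj = pts_tgt @ K_tgt.T           # (N, 3) homogeneous image coordinates
    return proj[:, :2] / proj[:, 2:3]  # divide by depth to get pixel coords
```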


Check out the paper. All credit for this research goes to the researchers of this project.



Dhanshree Shenwai is a computer science engineer with substantial experience at FinTech companies spanning finance, cards, payments, and banking, and a keen interest in AI applications. She is passionate about exploring new technologies and developments that make everyone's life easier in today's evolving world.

