InstanceFusion: Real-time Instance-level 3D Reconstruction Using a Single RGBD Camera

Lu, Feixiang; Peng, Haotian; Wu, Hongyu; Yang, Jun; Yang, Xinhang; Cao, Ruizhi; Zhang, Liangjun; Yang, Ruigang; Zhou, Bin

dc.contributor.author	Lu, Feixiang	en_US
dc.contributor.author	Peng, Haotian	en_US
dc.contributor.author	Wu, Hongyu	en_US
dc.contributor.author	Yang, Jun	en_US
dc.contributor.author	Yang, Xinhang	en_US
dc.contributor.author	Cao, Ruizhi	en_US
dc.contributor.author	Zhang, Liangjun	en_US
dc.contributor.author	Yang, Ruigang	en_US
dc.contributor.author	Zhou, Bin	en_US
dc.contributor.editor	Eisemann, Elmar and Jacobson, Alec and Zhang, Fang-Lue	en_US
dc.date.accessioned	2020-10-29T18:51:02Z
dc.date.available	2020-10-29T18:51:02Z
dc.date.issued	2020
dc.identifier.issn	1467-8659
dc.identifier.uri	https://doi.org/10.1111/cgf.14157
dc.identifier.uri	https://diglib.eg.org:443/handle/10.1111/cgf14157
dc.description.abstract	We present InstanceFusion, a robust real-time system to detect, segment, and reconstruct instance-level 3D objects of indoor scenes with a hand-held RGBD camera. It combines the strengths of deep learning and traditional SLAM techniques to produce visually compelling 3D semantic models. Its success rests on our novel segmentation scheme and efficient instance-level data fusion, both of which are implemented on the GPU. Specifically, for each incoming RGBD frame, we take advantage of the RGBD features, the 3D point cloud, and the reconstructed model to perform instance-level segmentation. The corresponding RGBD data, along with the instance ID, are then fused into the surfel-based models. To store and update these data efficiently, we design and implement a new data structure using the OpenGL Shading Language. Experimental results show that our method outperforms state-of-the-art (SOTA) methods in instance segmentation and data fusion by a large margin. In addition, our instance segmentation improves the precision of 3D reconstruction, especially in loop closure. The InstanceFusion system runs at 20.5 Hz on a consumer-level GPU, which supports a number of augmented reality (AR) applications (e.g., 3D model registration, virtual interaction, AR map) and robot applications (e.g., navigation, manipulation, grasping). To facilitate future research and make our system easier to reproduce, the source code, data, and the trained model are released on GitHub: https://github.com/Fancomi2017/InstanceFusion.	en_US
dc.publisher	The Eurographics Association and John Wiley & Sons Ltd.	en_US
dc.subject	Computing methodologies
dc.subject	Scene understanding
dc.subject	Vision for robotics
dc.subject	Perception
dc.title	InstanceFusion: Real-time Instance-level 3D Reconstruction Using a Single RGBD Camera	en_US
dc.description.seriesinformation	Computer Graphics Forum
dc.description.sectionheaders	Vision Meets Graphics
dc.description.volume	39
dc.description.number	7
dc.identifier.doi	10.1111/cgf.14157
dc.identifier.pages	433-445

Files in this item

Name:: v39i7pp433-445.pdf
Size:: 10.79Mb
Format:: PDF

Name:: supplementary_video-final.mp4
Size:: 53.07Mb
Format:: Unknown

This item appears in the following Collection(s)

39-Issue 7
Pacific Graphics 2020 - Symposium Proceedings
