Research

Incremental Class Discovery for Semantic Segmentation with RGBD Sensing

The proposed method incrementally discovers new classes (e.g., pictures) in the reconstructed 3D map.

This work addresses the task of open world semantic segmentation using RGBD sensing to discover new semantic classes over time. Although there are many types of objects in the real world, current semantic segmentation methods make a closed world assumption and are trained to segment only a limited number of object classes. Towards a more open world approach, we propose a novel method that incrementally learns new classes for image segmentation.

ICCV 2019

Yoshikatsu Nakajima, Byeongkeun Kang, Hideo Saito, Kris Kitani

[YouTube] [Paper]

DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM

Reconstructed SLAM object maps

We present DetectFusion, an RGB-D SLAM system that runs in real time and can robustly handle semantically known and unknown objects that can move dynamically in the scene. Our system detects, segments and assigns semantic class labels to known objects in the scene, while tracking and reconstructing them even when they move independently in front of the monocular camera.

BMVC 2019

Ryo Hachiuma, Christian Pirchheim, Dieter Schmalstieg, Hideo Saito

[YouTube] [Paper]

EventNet: Asynchronous Recursive Event Processing

Overview of the asynchronous event-based pipeline of EventNet (inference) in contrast to the conventional frame-based pipeline of a CNN.

Event cameras are bio-inspired vision sensors that mimic retinas: they asynchronously report per-pixel intensity changes rather than outputting an intensity image at regular intervals. This new sensing paradigm offers significant potential advantages, namely a sparse and non-redundant data representation. However, most existing artificial neural network architectures, such as CNNs, require dense synchronous input data and therefore cannot exploit this sparseness. We propose EventNet, a neural network designed for real-time processing of asynchronous event streams in a recursive and event-wise manner.

CVPR 2019

Yusuke Sekikawa, Kosuke Hara, Hideo Saito

[YouTube] [Paper]
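
To illustrate what "recursive and event-wise" processing means in the EventNet entry above, here is a minimal sketch. It is not EventNet's actual architecture: the feature map `event_feature`, its random weights `W`, and the `RecursiveMaxAggregator` class are hypothetical, chosen only to show that a running maximum updated one event at a time yields the same global feature as recomputing a max over all events seen so far.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # hypothetical per-event feature weights


def event_feature(event):
    """Map one event (x, y, t, polarity) to a feature vector (illustrative only)."""
    x, y, t, polarity = event
    return np.tanh(W @ np.array([x, y, t, polarity], dtype=float))


class RecursiveMaxAggregator:
    """Keeps a global feature that is updated once per incoming event."""

    def __init__(self, dim):
        self.state = np.full(dim, -np.inf)

    def update(self, event):
        # Constant work per event: no earlier event is revisited.
        self.state = np.maximum(self.state, event_feature(event))
        return self.state


def batch_aggregate(events):
    """Frame-style alternative: recompute the max over every stored event."""
    return np.max([event_feature(e) for e in events], axis=0)


events = [(3, 5, 0.001, 1), (4, 5, 0.002, -1), (10, 2, 0.004, 1)]
aggregator = RecursiveMaxAggregator(dim=8)
for e in events:
    recursive_result = aggregator.update(e)
```

The recursive update touches only the newest event, which is what makes per-event processing cheap when events arrive sparsely and asynchronously.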

Fast and Accurate Semantic Mapping through Geometric-based Incremental Segmentation

Flow of the proposed framework.

We propose an efficient and scalable method for incrementally building a dense, semantically annotated 3D map in real time. The proposed method assigns class probabilities to each region of the 3D map rather than to each element (e.g., surfel or voxel). The map is built up through a robust SLAM framework and incrementally segmented with a geometric-based segmentation method.

IROS 2018

Yoshikatsu Nakajima, Keisuke Tateno, Federico Tombari, Hideo Saito

[YouTube] [Paper] [Award]
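
The idea of storing class probabilities per region rather than per element, described in the semantic mapping entry above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `fuse_region_probs`, the array layouts, and the choice of a multiplicative Bayesian update over the mean per-pixel prediction are all hypothetical.

```python
import numpy as np


def fuse_region_probs(region_probs, labels, frame_probs):
    """Fuse one frame of per-pixel class predictions into per-region probabilities.

    region_probs: (R, C) current class distribution of each map region.
    labels:       (H, W) region id of each pixel, or -1 where no region is visible.
    frame_probs:  (H, W, C) per-pixel class probabilities for the current frame.
    """
    updated = region_probs.copy()
    for r in np.unique(labels):
        if r < 0:
            continue  # pixel not assigned to any region
        mask = labels == r
        observation = frame_probs[mask].mean(axis=0)  # one vector per region
        posterior = updated[r] * observation          # assumed Bayesian-style update
        total = posterior.sum()
        if total > 0:
            updated[r] = posterior / total            # renormalise to sum to 1
    return updated


prior = np.full((2, 3), 1.0 / 3.0)        # 2 regions, 3 classes, uniform prior
labels = np.array([[0, 0], [1, -1]])
frame = np.zeros((2, 2, 3))
frame[0, 0] = [0.8, 0.1, 0.1]
frame[0, 1] = [0.6, 0.3, 0.1]
frame[1, 0] = [0.1, 0.1, 0.8]
frame[1, 1] = [0.0, 1.0, 0.0]             # ignored: label is -1
fused = fuse_region_probs(prior, labels, frame)
```

With this layout the stored state grows with the number of regions rather than the number of surfels or voxels.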

Constant Velocity 3D Convolution

Decomposed spatiotemporal 3D convolution.

We propose a novel 3-D convolution method, cv3dconv, for extracting spatiotemporal features from videos. By assuming that features move at a constant velocity, it reduces the number of sum-of-products operations in 3-D convolution by a factor of thousands.

3DV 2018

Yusuke Sekikawa, Kohta Ishikawa, Kosuke Hara, Yuuichi Yoshida, Koichiro Suzuki, Ikuro Sato, Hideo Saito