Research
Incremental Class Discovery for Semantic Segmentation with RGBD Sensing

This work addresses the task of open world semantic segmentation using RGBD sensing to discover new semantic classes over time. Although there are many types of objects in the real world, current semantic segmentation methods make a closed world assumption and are trained only to segment a limited number of object classes. Towards a more open world approach, we propose a novel method that incrementally learns new classes for image segmentation.
ICCV 2019
Yoshikatsu Nakajima, Byeongkeun Kang, Hideo Saito, Kris Kitani
DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM

We present DetectFusion, an RGB-D SLAM system that runs in real time and can robustly handle semantically known and unknown objects that can move dynamically in the scene. Our system detects, segments and assigns semantic class labels to known objects in the scene, while tracking and reconstructing them even when they move independently in front of the monocular camera.
BMVC 2019
Ryo Hachiuma, Christian Pirchheim, Dieter Schmalstieg, Hideo Saito
EventNet: Asynchronous Recursive Event Processing

Event cameras are bio-inspired vision sensors that mimic retinas to asynchronously report per-pixel intensity changes rather than outputting an actual intensity image at regular intervals. This new paradigm of image sensor offers significant potential advantages, namely a sparse and non-redundant data representation. However, most existing artificial neural network architectures, such as CNNs, require dense synchronous input data and therefore cannot make use of this sparseness. We propose EventNet, a neural network designed for real-time processing of asynchronous event streams in a recursive and event-wise manner.
CVPR 2019
Yusuke Sekikawa, Kosuke Hara, Hideo Saito
Fast and Accurate Semantic Mapping through Geometric-based Incremental Segmentation

We propose an efficient and scalable method for incrementally building a dense, semantically annotated 3D map in real time. The proposed method assigns class probabilities to each region, rather than each element (e.g., surfel or voxel), of the 3D map, which is built up through a robust SLAM framework and incrementally segmented with a geometric-based segmentation method.
IROS 2018
Yoshikatsu Nakajima, Keisuke Tateno, Federico Tombari, Hideo Saito
Constant Velocity 3D Convolution

We propose a novel 3-D convolution method, cv3dconv, for extracting spatiotemporal features from videos. By assuming that features move at a constant velocity, it reduces the number of sum-of-products operations in 3-D convolution by a factor of thousands.
3DV 2018
Yusuke Sekikawa, Kohta Ishikawa, Kosuke Hara, Yuuichi Yoshida, Koichiro Suzuki, Ikuro Sato, Hideo Saito