Type: Dataset
Tags: Depth, RGB, images, Microsoft, Kinect, nyu

title= {NYU Depth Dataset V2},
keywords= {Depth, RGB, images, Microsoft, Kinect, nyu},
journal= {},
author= {Nathan Silberman and Pushmeet Kohli and Derek Hoiem and Rob Fergus},
year= {},
url= {},
license= {},
abstract= {The NYU-Depth V2 data set comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect.

1449 densely labeled pairs of aligned RGB and depth images
464 new scenes taken from 3 cities
407,024 new unlabeled frames
Each object is labeled with a class and an instance number (cup1, cup2, cup3, etc.)
The dataset has several components:

Labeled: A subset of the video data accompanied by dense multi-class labels. This data has also been preprocessed to fill in missing depth labels.
Raw: The raw RGB, depth and accelerometer data as provided by the Kinect.
Toolbox: Useful functions for manipulating the data and labels.

The raw dataset contains the raw image and accelerometer dumps from the Kinect. The RGB and depth camera sampling rate lies between 20 and 30 FPS (variable over time). While the frames are not synchronized, the timestamps for the RGB, depth and accelerometer files are encoded in each filename, so the streams can be synchronized to produce a continuous video using the get_synched_frames.m function in the Toolbox.
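The nearest-timestamp matching behind that synchronization step can be sketched outside the MATLAB toolbox as well. The following Python sketch is an assumption about the approach, not a port of get_synched_frames.m; it takes lists of (timestamp, filename) pairs already extracted from the filenames:

```python
import bisect

def sync_frames(rgb, depth):
    """Pair each RGB frame with the depth frame closest in time.

    rgb, depth: lists of (timestamp_seconds, filename) tuples,
    each sorted by timestamp.
    """
    depth_ts = [t for t, _ in depth]
    pairs = []
    for t, rgb_file in rgb:
        i = bisect.bisect_left(depth_ts, t)
        # Compare the neighbors on either side of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(depth)]
        best = min(candidates, key=lambda j: abs(depth_ts[j] - t))
        pairs.append((rgb_file, depth[best][1]))
    return pairs
```

Because the two streams run at slightly different, variable rates, matching by nearest timestamp rather than by frame index is what keeps long sequences aligned.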

The dataset is divided into different folders which correspond to each 'scene' being filmed, such as 'living_room_0012' or 'office_0014'. The file hierarchy is structured as follows:

Files that begin with the prefix a- are the accelerometer dumps. These dumps are written to disk in binary and can be read with the file get_accel_data.mex. Files that begin with the prefixes r- and d- are the frames from the RGB and depth cameras, respectively. Since no preprocessing has been performed, the raw depth images must be projected onto the RGB coordinate space in order to align the images.
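That projection is the standard pinhole-camera alignment between two sensors. A minimal NumPy sketch of the idea, not the toolbox's project_depth_map.m; the intrinsics K_d and K_rgb and the extrinsics R, t are placeholders for the real calibration values in camera_params.m:

```python
import numpy as np

def project_depth_to_rgb(depth, K_d, K_rgb, R, t):
    """Reproject a depth map into the RGB camera's pixel coordinates.

    depth : (H, W) array of metric depth values from the depth camera.
    K_d, K_rgb : 3x3 intrinsic matrices.
    R, t : rotation and translation from the depth frame to the RGB frame.
    Returns an (H, W, 2) array of RGB-image pixel coordinates.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project depth pixels to 3-D points in the depth camera frame.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    pts = np.linalg.inv(K_d) @ pix * depth.reshape(1, -1)
    # Move the points into the RGB camera frame, then project them
    # with the RGB camera's intrinsics.
    pts_rgb = R @ pts + t.reshape(3, 1)
    proj = K_rgb @ pts_rgb
    uv = (proj[:2] / proj[2]).T.reshape(h, w, 2)
    return uv
```

With identity intrinsics, identity rotation and zero translation, each pixel maps back to itself, which is a quick sanity check before plugging in real calibration.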

The MATLAB toolbox provides several useful functions for handling the data.

camera_params.m - Contains the camera parameters for the Kinect used to capture the data.
crop_image.m - Crops an image to the area covered by the projected depth signal.
fill_depth_colorization.m - Fills in the depth using the colorization method of Levin et al.
fill_depth_cross_bf.m - Fills in the depth using a cross-bilateral filter at multiple scales.
get_accel_data.m - Returns the accelerometer parameters at a specific moment in time.
get_instance_masks.m - Returns a set of binary masks, one for each object instance in an image.
get_rgb_depth_overlay.m - Returns a visualization of the RGB and depth alignment.
get_synched_frames.m - Returns a set of synchronized RGB and depth frames that can be used to produce RGBD videos of each scene.
get_timestamp_from_filename.m - Returns the timestamp from the raw dataset filenames. This is useful for sampling the raw video dumps at even intervals in time.
project_depth_map.m - Projects the depth map from the Kinect onto the RGB image plane.
},

tos= {},
superseded= {},
terms= {}