Dataset

Annotated Dataset

The second part of our dataset consists of annotated MATLAB Level 5 MAT-files. Each file contains one extracted, preprocessed, and annotated hyperspectral image. You can load a file in Python as follows.

from scipy.io import loadmat
data = loadmat('path_to_file.mat')

loadmat returns a Python dictionary. Besides SciPy's own metadata keys, which start with a double underscore (such as __header__), it contains the following entries.

HyKo1:

  • data: [numpy.ndarray] The preprocessed image
  • wavelengths: [numpy.ndarray] List of wavelengths, corresponding to the spectral bands of the preprocessed image
  • labels: [numpy.ndarray] A per-pixel annotation of the preprocessed image.
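
As a rough illustration of how data and labels fit together, the sketch below averages the spectra of all pixels that share a label value. It assumes that the hypercube is laid out as rows x columns x bands, that labels has the same rows x columns shape, and that label values are integers; none of this is guaranteed here, so check the shapes of your own files first. The helper name mean_spectrum_per_label and the tiny synthetic arrays are made up for this example and stand in for the output of loadmat.

```python
import numpy as np


def mean_spectrum_per_label(cube, labels):
    """Return {label value: mean spectrum} for a (rows, cols, bands) cube.

    Hypothetical helper; assumes `labels` has shape (rows, cols).
    """
    cube = np.asarray(cube, dtype=float)
    labels = np.asarray(labels)
    if cube.shape[:2] != labels.shape:
        raise ValueError("labels must match the first two axes of the cube")
    # Boolean indexing selects all pixels of one class as an (n, bands) array.
    return {int(v): cube[labels == v].mean(axis=0) for v in np.unique(labels)}


# Synthetic stand-in for the dictionary returned by loadmat (assumption).
demo = {
    'data': np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3),
    'labels': np.array([[0, 0], [1, 1]]),
}

spectra = mean_spectrum_per_label(demo['data'], demo['labels'])
for value, spectrum in spectra.items():
    print(value, spectrum)
```

With a real file, the same call would be mean_spectrum_per_label(data['data'], data['labels']), provided the layout assumptions above hold.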

HyKo2:

  • image: [numpy.ndarray] The raw image extracted from the bagfile before preprocessing.
  • data: [numpy.ndarray] The preprocessed image
  • wavelengths: [numpy.ndarray] List of wavelengths, corresponding to the spectral bands of the preprocessed image
  • label_*: [numpy.ndarray] One or more entries whose names start with label_. Each contains a per-pixel annotation of the preprocessed image.
  • id_*: [String] A unique ID for each type of annotation, so that no two annotations get confused. The suffix of each id_* entry matches that of the corresponding label_* entry. These entries help with bookkeeping but are not needed for practical usage.
  • stats: [numpy.ndarray] Six values describing the exposure of the image. They were used to select usable images and are not needed for practical usage.
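
Because the number and names of the label_* entries vary per file, it can help to collect them programmatically. The following is a minimal sketch that pairs every label_* entry with the id_* entry of the same suffix. The helper name collect_annotations, the suffix "example", and the synthetic dictionary are placeholders invented for this illustration; real files will use their own annotation names.

```python
import numpy as np


def collect_annotations(mat):
    """Map each annotation suffix to its (label array, id) pair.

    Hypothetical helper; `mat` is a dictionary as returned by loadmat.
    The id is None if no matching id_* entry exists.
    """
    annotations = {}
    for key, value in mat.items():
        if key.startswith('label_'):
            suffix = key[len('label_'):]
            annotations[suffix] = (value, mat.get('id_' + suffix))
    return annotations


# Synthetic stand-in for a loaded HyKo2 file (assumption, not real data).
demo = {
    '__header__': b'synthetic',
    'data': np.zeros((2, 2, 3)),
    'label_example': np.array([[0, 1], [1, 0]]),
    'id_example': np.array(['abc']),
}

found = collect_annotations(demo)
for suffix, (labels, annotation_id) in found.items():
    print(suffix, labels.shape, annotation_id)
```

Note that loadmat usually returns MATLAB strings wrapped in small NumPy arrays, so an id may need to be unpacked (for example with annotation_id[0]) before it can be compared as a plain string.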

The following script gives a starting point for usage.

#!/usr/bin/env python3

from scipy.io import loadmat
from argparse import ArgumentParser
import matplotlib.pyplot as plt

def show_img(im, title, **args):
    plt.imshow(im, **args)
    plt.axis('off')
    plt.title(title)
    plt.show()

def example(path_in, show):
    # conditional show_img
    img_show = show_img if show else lambda *x, **y: None

    print('Processing {}.'.format(path_in))

    # load the file
    data = loadmat(path_in)

    # what was in the file?
    content = [i for i in data.keys() if not i.startswith('_')]
    print("The mat file contains {} entries:\n{}".format(len(content), content))

    # image (raw image; only present in HyKo2 files, so check before access)
    if 'image' in data:
        title = "The 'image' entry is the raw image.\nIt was captured by the hyperspectral camera."
        print(title)
        img_show(data['image'], title, cmap='gray')

    # ['image', 'data', 'wavelengths', 'stats']

    # data
    title = "The 'data' entry contains the extracted hypercube.\n Its shape is {}x{}x{}".format(*data['data'].shape)
    print(title)
    img_show(data['data'].sum(axis=-1), title, cmap='gray')
    # access
    hyper = data['data']
    img_show(hyper[:, :, 0], "", cmap='gray')

    # wavelengths
    title = "The 'wavelengths' entry contains a list of the wavelengths of the hypercube."
    print(title)
    plt.plot(data['wavelengths'][0], 'ob')
    plt.title(title)
    plt.show()

def main():
    command_help = 'This script loads a mat file and visualizes its content. It\'s meant to be a starting point.'
    parser = ArgumentParser(description=command_help)
    # the input must be a .mat file; a directory default such as './' cannot be loaded
    parser.add_argument('-i', '--input', help='input .mat file', required=True)
    parser.add_argument('-s', '--show', help='show images [Y/n]', default='Y')
    args = vars(parser.parse_args())
    file_input = args['input']
    show = (args['show'] == 'Y')
    print('Input file:  ', file_input)
    print('Show images: ', show)
    example(file_input, show)

if __name__ == '__main__':
    main()