Devoted to open data and open source in science and education.

View All Tutorials

This tutorial is a part of a series!

Click below to view all lessons in the series!


R programming (52)
Hierarchical Data Formats (HDF5) (15)
Spatial Data & GIS (22)
LiDAR (10)
Raster Data (14)
Remote Sensing (24)
Data Visualization (4)
Hyperspectral Remote Sensing (17)
Time Series (15)
Phenology (7)
Vector Data (6)
Metadata (1)
Git & GitHub (7)
(1) (1) (13)

Tutorial by R Package

dplyr (7)
ggplot2 (16)
h5py (2)
lubridate (time series) (6)
maps (1)
maptools (1)
plyr (2)
raster (26)
rasterVis (raster time series) (3)
rgdal (GIS) (24)
rgeos (2)
rhdf5 (11)
sp (5)
scales (4)
gridExtra (4)
ggtheme (0)
grid (2)
reshape2 (3)
plotly (5)

View ALL Tutorial Series

Twitter Youtube Github


R Bloggers


In this tutorial, we will calculate the Normalized Difference Vegetation Index (NDVI).


After completing this tutorial, you will be able to:

  • Calculate NDVI.
  • Write Python script to create other vegetation and water indices.

Install Python Packages

  • numpy
  • pandas
  • gdal
  • matplotlib
  • h5py

Download Data

To complete this tutorial, you will need data available from the NEON 2017 Data Institute teaching data set available for download.

Caution: This data set includes all the data for the 2017 Data Institute, including hyperspectral and lidar data sets and is therefore a large file (12 GB). Ensure that you have sufficient space on your hard drive before you begin the download. If not, download to an external hard drive and make sure to correct for the change in file path when working through the tutorial.ß

Download NEON Teaching Data Subset: Data Institute 2017 Data Set

The LiDAR and imagery data used to create this raster teaching data subset were collected over the National Ecological Observatory Networks field sites and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Airborne Data Request Page on the NEON website.

Download the neon_aop_refl_hdf5_functions module

The Normalized Difference Vegetation Index (NDVI) is a standard band-ratio calculation frequently used to analyze ecological remote sensing data. NDVI indicates whether the remotely-sensed target contains live green vegetation. When sunlight strikes objects, certain wavelengths of this spectrum are absorbed and other wavelengths are reflected. The pigment chlorophyll in plant leaves strongly absorbs visible light (with wavelengths in the range of 400-700 nm) for use in photosynthesis. The cell structure of the leaves, however, strongly reflects near-infrared light (wavelengths ranging from 700 - 1100 nm). Plants reflect up to 60% more light in the near infrared portion of the spectrum than they do in the green portion of the spectrum. By comparing the ratio of Near Infrared (NIR) to Visible (VIS) bands in hyperspectral data, we can obtain a quick look at vegetation in the region of interest. NDVI is a normalized measure of the difference between reflectance at near infrared and visible bands of the electromagnetic spectrum.

The formula for NDVI is:


NDVI is calculated from the visible and near-infrared light reflected by vegetation. Healthy vegetation (left) absorbs most of the visible light that hits it, and reflects a large portion of near-infrared light. Unhealthy or sparse vegetation (right) reflects more visible light and less near-infrared light. Source: NASA

Set up Environment

Start by setting plot preferences and loading the neon_aop_refl_hdf5_functions module.

%matplotlib inline
import warnings

Read in SERC Flightline & Subset

#Define inputs
filename = '../data/SERC/hyperspectral/NEON_D02_SERC_DP1_20160807_160559_reflectance.h5'
sercRefl, sercRefl_md, wavelengths = h5refl2array(filename)

clipExtDict = {}
clipExtDict['xMin'] = 367400.
clipExtDict['xMax'] = 368100.
clipExtDict['yMin'] = 4305750.
clipExtDict['yMax'] = 4306350.

clipExtent = (clipExtDict['xMin'],clipExtDict['xMax'],clipExtDict['yMin'],clipExtDict['yMax'])
clipIndex = calc_clip_index(clipExtDict,sercRefl_md['ext_dict'])
sercReflSubset = subset_clean_refl(sercRefl,sercRefl_md,clipIndex)

Stack NIR and VIS bands

Now that we have uploaded all the required functions, we can calculate NDVI and plot it. We will compute NDVI using bands 58 and 90. These correspond to wavelength ranges of:

band 58: lambda_NIR = 666.6-671.6 nm

band 90: lambda_NIR = 826.8-831.9 nm

#Select bands to be used in the NDVI calculation
ndvi_bands = (58,90) #NIR and VIS (Red) bands

#Check the center wavelengths that these bands represent
band_width = wavelengths.value[1]-wavelengths.value[0]

print('band 58 wavelength range: ' + str(round(wavelengths.value[57]-band_width/2,2)) 
      + '-' + str(round(wavelengths.value[57]+band_width/2,2)) + ' nm')
print('band 90 wavelength range: ' + str(round(wavelengths.value[89]-band_width/2,2)) 
      + '-' + str(round(wavelengths.value[89]+band_width/2,2)) + ' nm')

#Use the stack_subset_bands function to create a stack of the subsetted red and NIR bands needed to calculate NDVI

ndvi_stack = stack_subset_bands(sercRefl,sercRefl_md,ndvi_bands,clipIndex)
band 58 wavelength range: 666.59-671.6 nm
band 90 wavelength range: 826.84-831.85 nm

Calculate NDVI & Plot

vis = ndvi_stack[:,:,0].astype(float)
nir = ndvi_stack[:,:,1].astype(float)

ndvi = np.divide((nir-vis),(nir+vis))

                title='SERC Subset NDVI \n (VIS = Band 58, NIR = Band 90)',cmap_title='NDVI',colormap='seismic')

Extract Spectra Using Masks

In the second part of this tutorial, we will learn how to extract the average spectra of pixels whose NDVI exceeds a specified threshold value. There are several ways to do this using numpy, including the mask functions, as well as numpy.where and finally using boolean indexing.

To start, lets copy the NDVI calculated above and use booleans to create an array only containing NDVI > 0.6.

import copy
ndvi_gtpt6 = copy.copy(ndvi)
ndvi_gtpt6[ndvi<0.6] = np.nan  #set all pixels with NDVI < 0.6 to nan, keeping only values > 0.6
print('Mean NDVI > 0.6:',round(np.nanmean(ndvi_gtpt6),2))

plot_band_array(ndvi_gtpt6,clipExtent,(0.6,1),title='SERC Subset NDVI > 0.6 \n (VIS = Band 58, NIR = Band 90)',
Mean NDVI > 0.6: 0.87

Function to calculate the mean spectra for reflectance values thresholed by

NDVI using

import as ma
def calculate_mean_masked_spectra(reflArray,ndvi,ndvi_threshold,ineq='>'):
    mean_masked_refl = np.zeros(reflArray.shape[2])
    for i in np.arange(reflArray.shape[2]):
        refl_band = reflArray[:,:,i]
        if ineq == '>':
            ndvi_mask = ma.masked_where((ndvi<=ndvi_threshold) | (np.isnan(ndvi)),ndvi)
        elif ineq == '<':
            ndvi_mask = ma.masked_where((ndvi>=ndvi_threshold) | (np.isnan(ndvi)),ndvi)
            print('ERROR: Invalid inequality. Enter < or >')
        masked_refl = ma.MaskedArray(refl_band,mask=ndvi_mask.mask)
        mean_masked_refl[i] = ma.mean(masked_refl)
    return mean_masked_refl

We can test out this function for various NDVI thresholds. We’ll test two together, and you can try out different values on your own. Let’s look at the average spectra for healthy vegetation (NDVI > 0.6), and for a lower threshold (NDVI < 0.3).

sercSpectra_ndvi_gtpt6 = calculate_mean_masked_spectra(sercReflSubset,ndvi,0.6)
sercSpectra_ndvi_ltpt3 = calculate_mean_masked_spectra(sercReflSubset,ndvi,0.3,ineq='<') 
import pandas

#Remove water vapor band windows & last 10 bands 
w = copy.copy(wavelengths.value)
w[((w >= 1340) & (w <= 1445)) | ((w >= 1790) & (w <= 1955))]=np.nan
w[-10:]=np.nan;  # the last 10 bands sometimes have noise - best to eliminate

nan_ind = np.argwhere(np.isnan(w))

sercSpectra_ndvi_gtpt6[nan_ind] = np.nan
sercSpectra_ndvi_ltpt3[nan_ind] = np.nan

#Create dataframe with masked NDVI mean spectra
sercSpectra_ndvi_df = pandas.DataFrame()
sercSpectra_ndvi_df['wavelength'] = w
sercSpectra_ndvi_df['mean_refl_ndvi_gtpt6'] = sercSpectra_ndvi_gtpt6
sercSpectra_ndvi_df['mean_refl_ndvi_ltpt3'] = sercSpectra_ndvi_ltpt3
ax = plt.gca();
                         edgecolor='none',kind='scatter',label='NDVI > 0.6',legend=True);
                         edgecolor='none',kind='scatter',label='NDVI < 0.3',legend=True);
ax.set_title('Mean Spectra of Reflectance Masked by NDVI')
ax.set_xlim([np.nanmin(w),np.nanmax(w)]); ax.set_ylim(0,0.45)
ax.set_xlabel("Wavelength, nm"); ax.set_ylabel("Reflectance")

Get Lesson Code

(some browsers may require you to right click.)