Date(s): Aug 10

Ecologists who work across scales and integrate disparate datasets face data management and analysis challenges that demand toolkits beyond the spreadsheet. This workshop will offer ecologists an overview of the data formats and types typically encountered when working with ‘Big Data’, and an introduction to the tools available in R for working with them. The first half of the workshop will introduce participants to these formats, including ASCII, NetCDF4, HDF5, and LAS. Participants will then learn to (1) access and visualize large datasets in these formats using R, and (2) use metadata to efficiently integrate datasets from multiple ecological data sources for analysis.

In the second half of the workshop, we will apply this knowledge to a practical example: integrating field-collected vegetation structure data with remotely sensed LiDAR data. For this example, we will use data collected by the National Ecological Observatory Network (NEON), a continental-scale, NSF-funded effort to collect and freely serve terabytes of data per year, stored in a diversity of formats, over the next 30 years to enable ecological research. Participants will leave the workshop with a basic understanding of the data that NEON and other large projects offer, and with some basic tools that support the use of Big Data to enhance their own research.

Workshop Instructors

  • Ted Hart
  • Leah A. Wasser
  • Sarah Elmendorf
  • Kate Thibault


Time    Topic
12:00   Welcome, Intro to NEON Data
12:35   HDF5 Formats
1:30    BREAK
1:50    Data Visualization
2:50    BREAK
3:10    Bringing It Together: Visualization & HDF5 metadata
5:00    End