NEON EDUCATION

This tutorial shows how to convert data downloaded from the NEON Data Portal as zipped month-by-site files into individual files that contain all data from the selected site(s) and months. Temperature data are used as an example.

Download the Data

To start, you must have your data of interest downloaded from the NEON Data Portal.

The stacking function only works on Comma Separated Value (.csv) files. It does not work on NEON data stored in other formats (HDF5, etc.).

Your data will download in a single zipped file.

The example data are all single-aspirated air temperature data available from 1 January 2015 to 31 December 2016.
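Before stacking, it can help to confirm what the downloaded zip actually contains. The sketch below uses base R's unzip() with list = TRUE, which lists the archive contents without extracting them. The helper name list_zip_contents is hypothetical (it is not part of any NEON package), and the path is the one used later in this tutorial, so adjust it to wherever you saved your download.

```r
# Hypothetical helper: list the files inside a zip without extracting them.
# Returns an empty character vector if the zip file is not found.
list_zip_contents <- function(zip_path) {
  if (!file.exists(zip_path)) {
    message("No file found at: ", zip_path)
    return(character(0))
  }
  # unzip(list = TRUE) returns a data frame with Name, Length and Date columns
  utils::unzip(zip_path, list = TRUE)$Name
}

# Path assumed from the stacking step in this tutorial; change as needed
zip_contents <- list_zip_contents("data/NEON_temp-air-single.zip")
print(zip_contents)
```

If the vector comes back empty, check the path and your working directory before running the stacking step.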

neonDataStackR package

This package was written to stack data downloaded in month-by-site files into a full table with all the data of interest from all sites in the downloaded date range.

For more information on the package, see the README in the associated GitHub repo, NEONScience/NEON-utilities.

First, install the package from the GitHub repo. You must have the devtools package installed to do this. Then load the package.

# devtools provides install_github()
library(devtools)
# install neonDataStackR from the NEON-utilities GitHub repo
install_github("NEONScience/NEON-utilities/neonDataStackR", dependencies=TRUE)

## Skipping install of 'neonDataStackR' from a github remote, the SHA1 (5d81df60) has not changed since last install.
##   Use `force = TRUE` to force installation

library(neonDataStackR)

This package has a single function to run: stackByTable(). Its output is grouped into new files by table name. For example, the single-aspirated air temperature data product contains 1 minute and 30 minute interval data, so the output from this function is one .csv with the 1 minute data and one .csv with the 30 minute data.

Depending on your file size, stackByTable() may run for a while. The 2015 and 2016 single-aspirated air temperature data from the two sites used for this example took about 25 minutes to complete.

stackByTable("data/NEON_temp-air-single.zip")


## Unpacked  2016-02-SERC-DP1.00002.001-basic-20160708035158.zip
## Unpacked  2016-03-SERC-DP1.00002.001-basic-20160708035642.zip
## Joining, by = c("domainID", "siteID", "horizontalPosition", "verticalPosition", "startDateTime", "endDateTime", "tempSingleMean", "tempSingleMinimum", "tempSingleMaximum", "tempSingleVariance", "tempSingleNumPts", "tempSingleExpUncert", "tempSingleStdErMean", "finalQF")
## Joining, by = c("domainID", "siteID", "horizontalPosition", "verticalPosition", "startDateTime", "endDateTime", "tempSingleMean", "tempSingleMinimum", "tempSingleMaximum", "tempSingleVariance", "tempSingleNumPts", "tempSingleExpUncert", "tempSingleStdErMean", "finalQF")
## Stacked  SAAT_1min
## Joining, by = c("domainID", "siteID", "horizontalPosition", "verticalPosition", "startDateTime", "endDateTime", "tempSingleMean", "tempSingleMinimum", "tempSingleMaximum", "tempSingleVariance", "tempSingleNumPts", "tempSingleExpUncert", "tempSingleStdErMean", "finalQF")
## Joining, by = c("domainID", "siteID", "horizontalPosition", "verticalPosition", "startDateTime", "endDateTime", "tempSingleMean", "tempSingleMinimum", "tempSingleMaximum", "tempSingleVariance", "tempSingleNumPts", "tempSingleExpUncert", "tempSingleStdErMean", "finalQF")
## Stacked  SAAT_30min
## Finished: All of the data are stacked into  2  tables!
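The log above shows each monthly file being unpacked and then combined per table. To make that idea concrete, here is a minimal base R sketch of stacking by table. It is not the neonDataStackR implementation: the function name stack_csvs_by_table, the demo file names, and the data values are all made up for illustration, and it simply row-binds files that share a table name.

```r
# Conceptual sketch only: NOT how neonDataStackR is implemented.
# Row-binds the CSV files that belong to the same table.
stack_csvs_by_table <- function(csv_paths, table_names) {
  # Read every file, keeping strings as character
  tables <- lapply(csv_paths, read.csv, stringsAsFactors = FALSE)
  # Group the file indices by table name, then row-bind each group
  groups <- split(seq_along(csv_paths), table_names)
  lapply(groups, function(i) do.call(rbind, tables[i]))
}

# Placeholder files standing in for month-by-site files.
# The values are invented, not real NEON measurements.
demo_dir <- tempfile("stack_demo_")
dir.create(demo_dir)
write_demo <- function(name, start, temp) {
  path <- file.path(demo_dir, name)
  write.csv(data.frame(siteID = "SERC", startDateTime = start,
                       tempSingleMean = temp),
            path, row.names = FALSE)
  path
}
paths <- c(
  write_demo("feb_30min.csv", "2016-02-01T00:00:00Z", 1.5),
  write_demo("mar_30min.csv", "2016-03-01T00:00:00Z", 2.5),
  write_demo("feb_1min.csv",  "2016-02-01T00:00:00Z", 1.4)
)

stacked <- stack_csvs_by_table(paths, c("SAAT_30min", "SAAT_30min", "SAAT_1min"))
print(sapply(stacked, nrow))
```

The result is a named list with one data frame per table, which mirrors the one-file-per-table output described in this tutorial.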

The single-aspirated air temperature data yield two final tables: SAAT_1min, with 1 minute intervals, and SAAT_30min, with 30 minute intervals.

These .csv files are now ready for use.
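As a starting point for working with a stacked file, the sketch below reads a .csv and converts the startDateTime column to a date-time class. The column names come from the join messages in the output above. The helper name read_stacked, the timestamp format (ISO 8601 in UTC), and the demo values are assumptions for illustration, so check them against your own file. The demo writes a small placeholder file to a temporary location so that it runs on its own.

```r
# Hypothetical helper: read a stacked CSV and parse its start timestamps.
# The default time format is an assumption; adjust it to match your file.
read_stacked <- function(csv_path, time_format = "%Y-%m-%dT%H:%M:%SZ") {
  dat <- read.csv(csv_path, stringsAsFactors = FALSE)
  dat$startDateTime <- as.POSIXct(dat$startDateTime,
                                  format = time_format, tz = "UTC")
  dat
}

# Placeholder file with invented values, standing in for a stacked table
demo_csv <- tempfile(fileext = ".csv")
write.csv(data.frame(
  siteID = c("SERC", "SERC"),
  startDateTime = c("2016-02-01T00:00:00Z", "2016-02-01T00:30:00Z"),
  tempSingleMean = c(1.5, 2.5)
), demo_csv, row.names = FALSE)

temps <- read_stacked(demo_csv)
# Mean temperature per site across all rows in the file
site_means <- tapply(temps$tempSingleMean, temps$siteID, mean)
print(site_means)
```

To use this on your own data, pass the path of one of your stacked .csv files to read_stacked() in place of demo_csv.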

