R for Accessing ERDDAP Data

Setting up the environment

To access data on ERDDAP, you can use the R package rerddap. As with the ckanr package, you must point your R session at the correct server. rerddap can connect to many ERDDAP servers and defaults to the NOAA GEO-IDE UAF ERDDAP server unless you specify another.

Example

library(rerddap) #loading package library

#Directing code to CanWIN's ERDDAP server
Sys.setenv(RERDDAP_DEFAULT_URL = "https://canwinerddap.ad.umanitoba.ca/erddap")
servers() #provides you with a list of ERDDAP servers you can access
eurl() #prints the URL of the server currently in use; other functions use it as their default url argument
ed_datasets() #lists the datasets on the server (table type by default)
tab <- ed_datasets(which = "tabledap") #list of table type datasets, stored for later use
grd <- ed_datasets(which = "griddap") #list of grid type datasets, stored for later use
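
If you would rather not change the default for the whole session, the same server can be passed to a single call. The sketch below assumes the url argument that rerddap functions accept (it appears in the ed_search call signature later in this guide). The name canwin_url is only illustrative, and the rerddap call is guarded so the snippet still runs when the package is not installed or you are offline.

```r
# Illustrative variable name; the URL is the CanWIN server used above
canwin_url <- "https://canwinerddap.ad.umanitoba.ca/erddap"

# Option 1: set the session-wide default and confirm it was stored
Sys.setenv(RERDDAP_DEFAULT_URL = canwin_url)
current_default <- Sys.getenv("RERDDAP_DEFAULT_URL")
print(current_default)

# Option 2: pass the server explicitly to one call
# (guarded: needs the rerddap package and a network connection)
if (requireNamespace("rerddap", quietly = TRUE)) {
  tab <- tryCatch(
    rerddap::ed_datasets(which = "tabledap", url = canwin_url),
    error = function(e) NULL  # e.g. no network access
  )
  if (!is.null(tab)) print(head(tab$Dataset.ID))
}
```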

Searching and Filtering Datasets

Once you have set up your environment, you can search and query the datasets on the ERDDAP server. For more advanced search options, see the rerddap documentation.

Example

#To search ERDDAP using specific parameters and variables use the ed_search function
ed_search(query = "<insert variable>", page = NULL, page_size = NULL, which = "<tabledap or griddap>", url = eurl())

#E.g., to filter for specific variables and dataset types:
ed_search(query = 'temperature', which = "tabledap")
ed_search(query = 'bathymetry', which = "tabledap")
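
A dataset listing that is already in memory can also be narrowed locally. The sketch below defines a hypothetical helper, find_dataset_ids(), that matches a keyword against dataset titles. It assumes the listing has Title and Dataset.ID columns (only Dataset.ID is confirmed in this guide), and the two-row data frame is made-up stand-in data so the example runs offline.

```r
# Hypothetical helper: return the IDs of datasets whose title matches a keyword
find_dataset_ids <- function(datasets, keyword) {
  hits <- grepl(keyword, datasets$Title, ignore.case = TRUE)
  datasets$Dataset.ID[hits]
}

# Made-up stand-in for the data frame returned by ed_datasets()
example_listing <- data.frame(
  Title = c("Example nutrient profiles", "Example water temperature"),
  Dataset.ID = c("example_nutrient_id", "example_temperature_id"),
  stringsAsFactors = FALSE
)

matched_ids <- find_dataset_ids(example_listing, "temperature")
print(matched_ids)
```

With a live connection, the same helper could be applied to the output of ed_datasets(which = "tabledap"), provided the column names match.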

Importing a Dataset File into RStudio

When you have found a dataset you would like to retrieve, you will need its dataset ID. The ID appears in the last column of the dataset listing on the ERDDAP site. The example below uses our GreenEdge nutrient dataset, which is a tabledap dataset.

Example

dat <- ed_datasets() #storing the dataset list in our environment so we can look up dataset IDs
IDs <- dat$Dataset.ID
print(IDs) #a dataset can be selected by its ID or by its position in this list
#or you can access the dataset IDs through a stored list for one dataset type
tab <- ed_datasets(which = "tabledap")
tab$Dataset.ID
green <- tab$Dataset.ID[5] #saving the dataset ID in the environment (check the printed list for the position you need)

#For more information on the dataset
info(green) #prints in console all variable information
browse(green) #opens a browser page for you to view the dataset on CanWIN ERDDAP site
tabledap(green) #prints the entire dataset in your console
griddap("<replace with datasetID>") #function for grid type datasets

#Saving the dataset to a variable in your environment lets you filter and modify the data before saving it to your hard drive
nutrient <- tabledap(green)

#Additionally, there is an interactive mode for viewing datasets
#Here is a full example showing how you can filter using the fields argument and range constraints
## Not run:
if (interactive()) { # runs only in an interactive session; browse() opens the page in your default browser, and information is printed in the console
  browse('GreenEdge_Nutrient_3e39_afd6_0a1e')# browse by dataset_id
  my_info <- info('GreenEdge_Nutrient_3e39_afd6_0a1e')
  browse(my_info)# browse info class
  my_tabledap <- tabledap('GreenEdge_Nutrient_3e39_afd6_0a1e', fields=c('latitude','longitude','itis_tsn'), 'time>=2011-10-25', 'time<=2011-10-31')
  browse(my_tabledap)# browse tabledap class
}
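
It can help to see what a tabledap() call with fields and range constraints asks the server for. ERDDAP tabledap requests follow the pattern <server>/tabledap/<datasetID>.<fileType>?<variables>&<constraints>. The sketch below builds such a URL by hand with a hypothetical helper, build_tabledap_url(). It illustrates the request format only and is not a description of how rerddap is implemented. The field names are assumed to exist in the dataset.

```r
# Hypothetical helper: assemble an ERDDAP tabledap request URL by hand
build_tabledap_url <- function(base_url, dataset_id, fields, constraints,
                               file_type = "csv") {
  # Variables are comma-separated; each constraint is joined with "&"
  query <- paste(c(paste(fields, collapse = ","), constraints), collapse = "&")
  # Percent-encode characters such as > and < that are not URL-safe
  query <- utils::URLencode(query)
  paste0(sub("/+$", "", base_url), "/tabledap/", dataset_id, ".", file_type,
         "?", query)
}

request_url <- build_tabledap_url(
  base_url    = "https://canwinerddap.ad.umanitoba.ca/erddap",
  dataset_id  = "GreenEdge_Nutrient_3e39_afd6_0a1e",
  fields      = c("latitude", "longitude", "time"),  # assumed field names
  constraints = c("time>=2011-10-25", "time<=2011-10-31")
)
print(request_url)
```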

Downloading a Dataset Directly to Hard drive

As in ckanr, there is a disk() function that lets you specify the path on your hard drive where the downloaded dataset will be saved. Alternatively, memory() keeps the downloaded data in your R session as a data frame instead of writing a file to disk.

Example

#Use the disk() function to save data to a specific location
disk(path = "D:/R/Ckan", overwrite = TRUE) # set overwrite = TRUE to replace files you saved previously, otherwise FALSE

memory() # keeps the data in your R environment instead of on disk; this will not work for netCDF files, which must be written to disk

#To save to your hard drive
tabledap( green,  fields = NULL,  distinct = FALSE,  orderby = NULL,  orderbymax = NULL,  orderbymin = NULL,  orderbyminmax = NULL,  units = NULL,
          url = eurl(),  store = disk(),  callopts = list()) # for tabledap type datasets

griddap("<replace with datasetID>", fields = "all", stride = 1, fmt = "nc", url = eurl(),
        store = disk(path = "D:/R/Ckan", overwrite = TRUE), read = TRUE, callopts = list()) # for grid type datasets
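
Once a dataset is held in a variable, such as nutrient above, it can be treated as a data frame, filtered, and written out with base R. The sketch below uses a small made-up data frame in place of a real download, with hypothetical column names, and writes to a temporary file. Substitute your own object and path.

```r
# Made-up stand-in for a downloaded dataset (column names are hypothetical)
nutrient_example <- data.frame(
  latitude  = c(67.5, 68.0, 69.5),
  longitude = c(-63.8, -61.2, -60.0),
  nitrate   = c(1.2, 0.4, 3.1)
)

# Keep only the rows north of 67.9 degrees latitude
northern <- nutrient_example[nutrient_example$latitude > 67.9, ]

# Write the filtered rows to a CSV file (a temporary file here)
out_file <- tempfile(fileext = ".csv")
write.csv(northern, out_file, row.names = FALSE)

# Read the file back to confirm what was saved
saved <- read.csv(out_file)
print(saved)
```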