Introduction

This document develops a ‘use case’ that combines LifeWatch data with a second dataset to produce a meaningful chart. The goal is a visualisation of the differences in monthly sea surface temperatures in the Belgian part of the North Sea between the 1970s and 2017. The 2017 data come from the LifeWatch Underway Dataset; the 1970s data were retrieved from the 4DEMON data portal.

Disclaimer: Please note that this use case was developed to illustrate the use of 4DEMON and LifeWatch data. Any interpretation or conclusion drawn from these results is entirely your own. When using this document to produce scientifically sound products, make sure to re-evaluate all assumptions, validation and cleaning steps made throughout this document.

Step 1: Retrieve data

First, we retrieve the sea water temperature data used for this example study. As mentioned above, the data consists of 4DEMON data from the seventies and recent LifeWatch Underway data from 2017. Check out LifeWatch and 4DEMON to learn more about these projects and the data.

4DEMON

All 4DEMON data can be downloaded using the 4DEMON Dataportal. You can either download the data as a CSV file or get a web service URL that downloads the data directly; this use case uses the latter.

#Read 4DEMON dataset from the web service link. The link used in this code is set in the hidden code chunk, which can be checked in the source

DEMON <- read.csv(DEMON_link)
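The hidden chunk is not shown here. Judging by the functions used later in this document (year() and mdy_hm(), filter() and bind_rows(), inpip(), plot_ly()), it presumably also loads the lubridate, dplyr, splancs and plotly packages. This package list is an assumption, not taken from the source. A minimal setup sketch could look like this:

```r
# Packages presumably loaded in the hidden setup chunk (assumption, not from the source)
pkgs <- c("lubridate", "dplyr", "splancs", "plotly")

# Check which packages are not installed yet
missing_pkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

if (length(missing_pkgs) > 0) {
  message("Install these packages first: ", paste(missing_pkgs, collapse = ", "))
} else {
  # Attach all packages so that their functions can be called without a prefix
  invisible(lapply(pkgs, library, character.only = TRUE))
}
```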

LifeWatch

The LifeWatch Underway data can be obtained using the LifeWatch Data Explorer. There, you can query the data by time frame and by sample period. Setting the sample period to smaller intervals (for instance 60 minutes) will result in more datapoints. Once queried, you can download the data as a .TAB file.

To load the data into R, convert the file to .CSV format (e.g. with Excel).

Once in the right format, save the .CSV file in the working directory and specify the path in the script.

# Don't forget to specify path or put <LWdownload_yyyy-mm-dd-##-##>.csv in working directory
path_to_LifeWatch = "LWdownload_2018-08-02-11-41.csv"

# Read in the LifeWatch data from filepath
LifeWatch <- read.csv(path_to_LifeWatch, header=TRUE)
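If the downloaded .TAB file is plain tab-separated text (an assumption that should be checked by opening the file), it might also be read directly with read.delim(), without a conversion step. The sketch below writes a small made-up file to a temporary location to show the idea; the file contents and the name LifeWatch_demo are hypothetical.

```r
# Hypothetical stand-in for a downloaded .TAB file (assumed to be tab-separated)
tab_file <- tempfile(fileext = ".TAB")
writeLines(c("Time\tLatitude\tTemperature",
             "2/20/2017 8:15\t51.25116\t5.3545",
             "2/20/2017 9:00\t51.23505\t5.4044"),
           tab_file)

# read.delim() uses a tab as separator and expects a header row by default
LifeWatch_demo <- read.delim(tab_file, stringsAsFactors = FALSE)
str(LifeWatch_demo)
```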

Inspect the datasets

  • 4DEMON dataset:

What does each column contain?

colnames(DEMON)
##  [1] "FID"           "latitude"      "longitude"     "depth"        
##  [5] "datetime"      "value"         "parametername" "dataprovider" 
##  [9] "datasettitle"  "stationname"   "unit"          "valuesign"

Now, we create a table of the sampled years. We will filter the seventies data later.

table(year(DEMON$datetime))
## 
## 1968 1970 1971 1974 1975 1976 1977 1978 1979 1994 1995 1996 1997 1998 1999 
##   15    4   30  110  503  627  580  474    1   46   27   79  180  107  127 
## 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2012 2013 2014 2015 2016 
##  123   78   79   57   52   61   49   17    3   12   10   27   40   42   21

  • LifeWatch dataset:

head(LifeWatch)
##              Time Latitude Longitude Temperature Salinity Conductivity
## 1  2/20/2017 8:15 51.25116  2.894092      5.3545   31.556        3.117
## 2  2/20/2017 9:00 51.23505  2.812398      5.4044   32.669        3.213
## 3 2/20/2017 10:00 51.26288  2.820402      5.3644   32.775        3.218
## 4 2/20/2017 11:00 51.24802  2.766703      5.6572   33.335        3.293
## 5 2/20/2017 12:00 51.23358  2.810230      5.4017   32.537        3.200
## 6 2/20/2017 13:00 51.31516  2.870728      4.9632   31.906        3.105
##     Depth ChlorophylA
## 1 -995.37        -999
## 2   12.27        -999
## 3   10.28        -999
## 4   10.20        -999
## 5   10.77        -999
## 6   11.86        -999

The LifeWatch timestamps are stored as text in the form “m/d/yyyy h:mm”. We convert them to dates (“yyyy-mm-dd”) so that they can be passed to the year() function, and then check which years are present in the data.

LifeWatch$Time <- ymd(date(mdy_hm(LifeWatch$Time)))
table(year(LifeWatch$Time)) # only data from 2017
## 
## 2017 
## 1422
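The conversion can be followed step by step on a few sample strings. This is a small sketch using lubridate; the vector times is made up to match the format shown in the head(LifeWatch) output.

```r
library(lubridate)

# Sample timestamps in the same "m/d/yyyy h:mm" format as the LifeWatch export
times <- c("2/20/2017 8:15", "2/20/2017 13:00")

parsed <- mdy_hm(times)    # parse month/day/year hour:minute into date-times (UTC)
dates  <- as_date(parsed)  # drop the time of day, keeping only the calendar date

format(dates)  # dates printed as yyyy-mm-dd
year(dates)    # year component
month(dates)   # month component as a number
```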

Step 2: Filter data

As we are only interested in the sea surface temperatures from the Belgian part of the North Sea, we have to filter the data before visualization. First, NA values and data from outside the desired time interval need to be removed. Secondly, we define “surface water” as the first 5m of the water column, so measurements at greater depth need to be removed too.

Lastly, only datapoints located inside the Belgian part of the North Sea should be considered. Therefore, a polygon of the Belgian part of the North Sea was constructed using the coordinates specified in this law article. To include coastal measurements, we added two extra coordinates that mark the coastal boundaries between Belgium and France (51.08897,2.54550) and between Belgium and the Netherlands (51.36887,3.36652).
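The point-in-polygon selection can be tried out on a toy example first. The sketch below uses inpip() from the splancs package on a made-up unit square; the objects square and pts are hypothetical. Note that the polygon vertices have to be listed in the order in which they follow each other along the boundary, and that the points must use the same coordinate order as the polygon.

```r
library(splancs)

# Unit square, with vertices listed in boundary order
square <- cbind(x = c(0, 0, 1, 1),
                y = c(0, 1, 1, 0))

# Three test points: the first and third lie inside the square, the second outside
pts <- cbind(x = c(0.5, 2.0, 0.9),
             y = c(0.5, 2.0, 0.1))

# inpip() returns the row indices of the points that fall inside the polygon
inside <- inpip(pts, square)
inside
pts[inside, ]
```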

4DEMON

Filter the data:

    1. Keep only records with non-missing latitude and depth.
DEMON_filtered <- filter(DEMON, !is.na(latitude), !is.na(depth))
    2. Only keep surface water temperatures (<=5m depth), measured in the seventies (1970-1979).
DEMON_filtered <- filter(DEMON_filtered, depth <= 5, year(datetime) >= 1970, year(datetime) <= 1979)
    3. Select samples that are within the Belgian part of the North Sea.
#Define Belgian part of the North Sea (BCP) as polygon, with x = latitude and y = longitude
#Vertices follow the boundary in order, so the Dutch coastal point comes before the French one
BCP <- data.frame("x" = c(51.2691,51.5577,51.6130,51.805,51.8761,51.5516, 51.36887, 51.08897),
                  "y"=c(2.3902,2.2383,2.2533,2.4816,2.5391,3.0813, 3.36652, 2.54550))
#Print the polygon coordinates as a check
write.table(BCP, row.names = FALSE, col.names = FALSE)
#Select points inside the BCP polygon (columns 2 and 3 are latitude and longitude)
indices_DEMON <- inpip(DEMON_filtered[,c(2,3)],BCP) 
DEMON_BCP <- DEMON_filtered[indices_DEMON,]
    4. Prepare the data for visualization by creating and keeping only the columns of interest (month, year and temperature).
## Prepare data for visualisation
DEMON_BCP_plot <- DEMON_BCP[,c(5,6)]
DEMON_BCP_plot[,3] <- year(DEMON_BCP$datetime)
DEMON_BCP_plot[,1] <- month(DEMON_BCP_plot$datetime)
colnames(DEMON_BCP_plot) <- c("Month","Temp","Year")

LifeWatch

For the LifeWatch data we apply the same filters as for the 4DEMON dataset, except for the time filter, which was already applied when downloading the data from the Data Explorer.

    1. Check for NA values in the coordinates.
sum(is.na(LifeWatch[,c(2,3)])) # is != 0 if latitude or longitude values are missing
    2. Only keep sea surface temperatures (<=5m depth).
LifeWatch_filtered = LifeWatch[LifeWatch$Depth <= 5,]
    3. Select samples that are within the Belgian part of the North Sea.
indices_LifeWatch <- inpip(LifeWatch_filtered[,c(2,3)],BCP) #Select points inside the BCP polygon
LifeWatch_plot = LifeWatch_filtered[indices_LifeWatch, c(1,4)] #also select the relevant columns for plotting
    4. Again prepare the data for visualization by creating and keeping only the columns of interest (month, year and temperature).
LifeWatch_plot[,3] <- year(LifeWatch_plot$Time)
LifeWatch_plot[,1] <- month(LifeWatch_plot$Time)
colnames(LifeWatch_plot) <- c("Month","Temp","Year")
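The head(LifeWatch) output shown earlier contains values such as -999 for ChlorophylA and -995.37 for Depth, which are not physically plausible. Assuming that such values flag missing measurements (this is not confirmed by the source), a filter of the form Depth <= 5 would keep them. The sketch below shows, on made-up rows, one way such records could be excluded before the depth filter; the data frame demo and the cut-off of 0 are illustrative choices only.

```r
# Made-up rows in the style of the LifeWatch export
demo <- data.frame(Depth       = c(-995.37, 12.27, 3.10),
                   Temperature = c(5.3545, 5.4044, 5.2000))

# Assumption: negative depths are placeholders for missing values
demo$Depth[demo$Depth < 0] <- NA

# Keep only rows with a known depth within the first 5 m
surface <- demo[!is.na(demo$Depth) & demo$Depth <= 5, ]
surface
```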

Merge datasets

The datasets are merged to facilitate plotting with plotly.

#merge datasets
dataset <- bind_rows(DEMON_BCP_plot,LifeWatch_plot)
colnames(dataset) <- c("Month","Temp","Year")

Reorder the levels of the ‘Year’ column so that the years are listed from most recent to oldest.

#reorder levels of the 'Year' column in order to scroll from most recent to oldest
dataset$Year = factor(dataset$Year, levels = sort(unique(dataset$Year), decreasing = TRUE))
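As a small illustration of this level ordering on toy values (the vector years is hypothetical), sorting the unique years in decreasing order puts the most recent year first:

```r
# Hypothetical year values, standing in for dataset$Year
years <- c(1975, 2017, 1979, 1975, 2017)

# Levels are the unique years, sorted from most recent to oldest
years_f <- factor(years, levels = sort(unique(years), decreasing = TRUE))
levels(years_f)
```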

To make the plot more readable, the month identifiers are mapped to 3-letter abbreviations, which we can retrieve from the month.abb R object.

#create descriptive 'month' column 
df.dict = data.frame(month.abb, 1:12)
colnames(df.dict) = c("month.abb", "Month")
dataset = merge(dataset, df.dict, by = "Month")
dataset$month.abb= factor(dataset$month.abb, levels = month.abb)
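The same mapping could also be done without a lookup table, because month.abb is a character vector of length 12 that can be indexed by month number. This is an alternative sketch, not the approach used in this document, and the data frame toy is made up.

```r
# Made-up data with month numbers
toy <- data.frame(Month = c(12, 1, 7),
                  Temp  = c(7.1, 5.4, 18.2))

# Index month.abb by month number; fix the level order to Jan..Dec for plotting
toy$month.abb <- factor(month.abb[toy$Month], levels = month.abb)

# Sorting by the factor now follows calendar order rather than alphabetical order
toy[order(toy$month.abb), ]
```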

Step 3: Plot combined dataset interactively with the plotly package

#Aggregate to the mean temperature per month and year
DD2 =aggregate(dataset$Temp, list(dataset$month.abb, dataset$Year), mean)
colnames(DD2) = c("Month","Year","Temp")
#Round for clearer plot hover
DD2$Temp = round(DD2$Temp, 1)
#Add axis layout
f <- list(
  family = "Courier New, monospace",
  size = 18,
  color = "black"
)
x <- list(
  title = "Month",
  titlefont = f
)
y <- list(
  title = "Sea surface temperature (°C)",
  titlefont = f
)

# Plot chart
p <- plot_ly(DD2, x = ~Month, y = ~Temp, color = ~as.factor(Year)) %>%
  add_lines() %>%
  layout(xaxis = x, yaxis = y, title = "Sea surface temperatures in degrees per month")
p
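To make the aggregation step in the code above concrete, here is a small sketch on made-up values, using the formula interface of aggregate(). The data frame toy and the result name monthly are hypothetical.

```r
# Made-up measurements: two in Jan 2017, one in Feb 2017, one in Jan 1975
toy <- data.frame(
  Month = factor(c("Jan", "Jan", "Feb", "Jan"), levels = month.abb),
  Year  = factor(c(2017, 2017, 2017, 1975)),
  Temp  = c(5.0, 6.0, 5.5, 4.0)
)

# Mean temperature for every month and year combination present in the data
monthly <- aggregate(Temp ~ Month + Year, data = toy, FUN = mean)
monthly
```

Each row of the result corresponds to one point on one line of the chart: one value per month, per year.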

Session info

# Run if interested in session info
# sessionInfo()