Environmental Modelling - Assignment 2
Water Quality Time Series - Data Mining/Analysis

This web page describes the 2nd assignment in the module "Information Management in hydroinformatics systems" (semester 3). The assignment is a data mining/analysis project on water quality time series using R or Python scripts.

General Objective

Assignment 1 deals with the handling and management of one single time series. Assignment 2 extends this task to several time series of biophysical water quality state variables and moves towards data mining/analysis. The objective of the 2nd assignment is to explore this data set using data mining/analysis methods with R and/or Python scripts.

Data Files

This assignment provides time series files from one measurement station at a river in the state of Brandenburg. The data files contain raw data with gaps and irregularities. Please see the file readme.txt for the biophysical state variables and their units.

Variable               Unit      1996-2001   2000-2020
chlorophyll-a total    [µg/l]    ce.txt      ce_2000_2020.txt
conductivity           [µS]      lf.txt      lf_2000_2020.txt
oxygen content         [mg/l]    o2.txt      o2_2000_2020.txt
oxygen saturation      [%]       ot.txt      -
pH-value               -         ph.txt      ph_2000_2020.txt
global radiation       [W/m²]    st.txt      -
air temperature        [°C]      tl.txt      tl_2000_2020.txt
turbidity              [%]       tr.txt      tr_2000_2020.txt
water temperature      [°C]      tw.txt      tw_2000_2020.txt
uv absorption 254nm    [%]       uv.txt      uv_2000_2020.txt


The time series data are 10 min values for the time period 1.1.1996 - 31.12.2001 (6 years) and hourly values for the time period 1.1.2000 - 31.12.2020 (21 years). The two periods overlap in the years 2000 and 2001.
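As a first plausibility check, the number of time steps each period should contain can be computed and compared with the number of valid records actually found in a file. The following Python sketch uses pandas; the function names are hypothetical, and it assumes that each series starts at 00:00 on the first day and ends with the last time step of the final day, which should be verified against readme.txt and the files themselves.

```python
import pandas as pd


def expected_steps(start: str, end: str, freq: str) -> int:
    """Number of time stamps on a complete, regular grid from start to end (inclusive)."""
    return len(pd.date_range(start=start, end=end, freq=freq))


def completeness(n_valid: int, n_expected: int) -> float:
    """Share of expected time steps that carry a valid value, in percent."""
    return 100.0 * n_valid / n_expected


# Assumed start/end time stamps (not stated in the assignment text).
n_10min = expected_steps("1996-01-01 00:00", "2001-12-31 23:50", "10min")
n_hourly = expected_steps("2000-01-01 00:00", "2020-12-31 23:00", "60min")

print("expected 10 min steps:", n_10min)
print("expected hourly steps:", n_hourly)
```

A file with far fewer valid records than the expected count points to gaps that need to be located before any further analysis.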

Assignment Targets - R/Python Script

The target of the 2nd assignment is to write R/Python scripts that analyse the given data sets in order to gain knowledge and understanding of the relationships between the different biophysical state variables, based on the data given for six years.

Please start with data pre-processing steps to harmonize the data for a deeper analysis. Examples are checking metadata values, identifying and filling gaps, and aggregating the time series to time steps larger than 10 min.
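The pre-processing steps named above could look roughly like the following Python/pandas sketch. It is one possible approach, not a prescribed solution: the function names, the limit of 3 steps for fillable gaps, and the choice of linear interpolation are assumptions for illustration, and the data are synthetic because the file layout is described only in readme.txt.

```python
import numpy as np
import pandas as pd


def regularize(series: pd.Series, freq: str = "10min") -> pd.Series:
    """Put the series on a complete regular grid; missing time stamps become NaN."""
    series = series[~series.index.duplicated(keep="first")].sort_index()
    grid = pd.date_range(series.index.min(), series.index.max(), freq=freq)
    return series.reindex(grid)


def find_gaps(regular: pd.Series) -> pd.DataFrame:
    """List every run of consecutive NaN values with start, end and length in steps."""
    is_nan = regular.isna()
    run_id = (is_nan != is_nan.shift()).cumsum()
    rows = [
        {"start": block.index[0], "end": block.index[-1], "steps": len(block)}
        for _, block in regular[is_nan].groupby(run_id[is_nan])
    ]
    return pd.DataFrame(rows, columns=["start", "end", "steps"])


def fill_short_gaps(regular: pd.Series, max_steps: int = 3) -> pd.Series:
    """Interpolate linearly in time, but only inside gaps of at most max_steps."""
    is_nan = regular.isna()
    run_id = (is_nan != is_nan.shift()).cumsum()
    run_len = is_nan.groupby(run_id).transform("sum")
    filled = regular.interpolate(method="time", limit_area="inside")
    # Keep original values and short-gap fills; longer gaps stay NaN.
    return filled.where(~is_nan | (run_len <= max_steps))


# Synthetic 10 min series (6 hours) with one short and one long gap.
idx = pd.date_range("1996-01-01 00:00", periods=36, freq="10min")
raw = pd.Series(np.arange(36, dtype=float), index=idx)
raw = raw.drop(idx[5:7])    # 2 missing steps
raw = raw.drop(idx[20:30])  # 10 missing steps

regular = regularize(raw)
gaps = find_gaps(regular)
filled = fill_short_gaps(regular, max_steps=3)
hourly = filled.resample("60min").mean()  # aggregation to hourly means

print(gaps)
print(hourly)
```

Leaving long gaps unfilled is a deliberate choice in this sketch: interpolating across many hours would invent values that the sensor never measured. Note that `mean()` skips NaN values, so an hour with only a few valid values still yields a mean; a minimum count per interval may be worth adding.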

Try to analyse the relationships between the different biophysical state variables on different time scales (daily, seasonal, annual, ... changes and trends). Explore the data with different analysis and mining methods; examples will be given in the lectures.
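A minimal Python/pandas sketch of such an analysis on several time scales is given below. Everything in it is illustrative: the two series are synthetic stand-ins (named tw and o2 after the file names, with an invented seasonal cycle and an invented negative dependence of oxygen on temperature), and Pearson correlation, a monthly climatology, and a linear fit through annual means are only a few of many possible methods.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for six years; NOT the assignment data.
rng = np.random.default_rng(0)
idx = pd.date_range("1996-01-01 00:00", "2001-12-31 23:00", freq="60min")
seasonal = np.sin(2 * np.pi * (idx.dayofyear.to_numpy() - 110) / 365.25)
tw = 12 + 10 * seasonal + rng.normal(0, 1, len(idx))
o2 = 14 - 0.3 * tw + rng.normal(0, 0.5, len(idx))
df = pd.DataFrame({"tw": tw, "o2": o2}, index=idx)

# Daily scale: correlation between daily means.
daily = df.resample("D").mean()
corr_daily = daily.corr(method="pearson")

# Seasonal scale: mean value per calendar month over all years.
monthly_clim = daily.groupby(daily.index.month).mean()

# Remove the mean annual cycle, then correlate the remaining anomalies.
anomalies = daily - daily.groupby(daily.index.dayofyear).transform("mean")
corr_anom = anomalies.corr(method="pearson")

# Annual scale: yearly means and slope of a linear fit (units per year).
annual = daily.groupby(daily.index.year).mean()
years = annual.index.to_numpy(dtype=float)
trend_tw = np.polyfit(years - years[0], annual["tw"].to_numpy(), 1)[0]

print(corr_daily.round(3))
print(corr_anom.round(3))
print(monthly_clim.round(2))
print("tw trend per year:", round(float(trend_tw), 3))
```

In this synthetic example the correlation of the raw daily means is much stronger than that of the anomalies, because the shared seasonal cycle dominates both series. Comparing the two is one way to separate a seasonal co-variation from a relationship on shorter time scales.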

Assignment Report

Please write a suitable assignment report about your analysis/mining activities, including both the successful and the unsuccessful steps, and conclude with your findings. The report should present the results as numbers/tables as well as suitable diagrams/plots. Add the R/Python scripts you wrote and applied, together with details of the results, in the appendix.