This web page describes the 1st assignment in the module "Information Management in hydroinformatics systems" (semester 3), which deals with time series analysis using R scripts.
This assignment description uses R as one typical
programming language and data analysis environment.
Students are free to use any alternative language or tool, such as
Python, Java, C++, Matlab, Octave, SPSS, ...,
that is suitable for solving the given tasks.
Students participating in the courses on modelling the river Rhine
are free to replace the given test data files with the
provided HYMOG data files for the measurements at Ruhrort and Wesel.
Modern sensor technology in combination with Internet/Web technology opens new
opportunities for measurements and related "big data" handling in water-related projects.
One important type of data is the time series of a scalar physical state variable such as
temperature, humidity, discharge, radiation, precipitation, moisture, ...
Different sensor and data analysis systems use different data formats to store
time series data. The most traditional formats are ASCII formats such as CSV-based structures
for spreadsheet applications. A typical task in hydroinformatics projects is the implementation
of tool(s) to read, process and analyse such time series data files.
The objective of the 1st assignment is to write R script(s) to read time series files with a specific format,
to pre-process and analyse the time series,
and to generate an analysis report including suitable diagrams/plots.
This assignment provides three time series files with different physical state variables but the same format. The data is exported from the DWD Weste-XL service in CSV format:
The time series data files contain hourly values for the year 2010 from a measurement station near Cottbus (the geo-location is specified in the files).
The target of the 1st assignment is to write R script(s) to handle the provided test data time series (all three data files).
The assignment work is structured in four parts:
Please write suitable R script(s) to read the time series data from the CSV data files into a suitable R data structure (e.g. arrays/vectors, data.frame, zoo). Please consider suitable data types for the time/date information and the scalar value information. The R data structure should be reduced to the time/date and the scalar value information (columns 3 and 4 in the CSV files).
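The reading step could look roughly like the following minimal sketch. The file layout used here is an assumption made only for illustration (semicolon-separated, one header line, time stamps written as `YYYY-MM-DD HH:MM`, hypothetical column names); the real export has to be inspected first, and separator, decimal mark and date format adapted accordingly.

```r
# Hypothetical sample in an ASSUMED layout: semicolon-separated, one header
# line, time stamp in column 3 and scalar value in column 4.
csv_lines <- c(
  "element;station;timestamp;value",
  "temperature;Cottbus;2010-01-01 00:00;-1.2",
  "temperature;Cottbus;2010-01-01 01:00;-1.5",
  "temperature;Cottbus;2010-01-01 02:00;-1.9"
)
csv_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_file)

# Read the raw table; everything is kept as plain text/numbers at first.
raw <- read.table(csv_file, header = TRUE, sep = ";", dec = ".",
                  stringsAsFactors = FALSE)

# Keep only columns 3 and 4 and convert them to suitable data types.
ts_data <- data.frame(
  time  = as.POSIXct(raw[[3]], format = "%Y-%m-%d %H:%M", tz = "UTC"),
  value = as.numeric(raw[[4]])
)
str(ts_data)
```

Fixing the time zone explicitly (here `"UTC"`, also an assumption) avoids artificial gaps or duplicates at daylight saving time switches.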
Please check all three time series with R script(s) for gaps or
other irregularities, assuming a regular time series with a 1 hour time step.
Gaps and irregularities should be reported on the console.
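One possible way to check for gaps, sketched on a small synthetic series: the observed time stamps are compared with the regular hourly grid that would be expected between the first and the last time stamp. The data frame `ts_data` and its columns `time` and `value` are hypothetical names, and which kinds of irregularities occur in the real files is not known here.

```r
# Synthetic hourly series with one missing hour (03:00), one duplicated
# time stamp (05:00) and one missing value (NA) for demonstration.
start   <- as.POSIXct("2010-01-01 00:00", tz = "UTC")
ts_data <- data.frame(
  time  = start + 3600 * c(0, 1, 2, 4, 5, 5, 6),
  value = c(1.0, 1.2, 1.1, 0.9, 0.8, 0.8, NA)
)

# Expected regular grid with a 1 hour time step.
expected <- seq(min(ts_data$time), max(ts_data$time), by = "1 hour")

# Compare on the underlying numeric seconds to avoid class-related surprises.
missing_steps   <- expected[!(as.numeric(expected) %in% as.numeric(ts_data$time))]
duplicate_steps <- unique(ts_data$time[duplicated(ts_data$time)])
na_steps        <- ts_data$time[is.na(ts_data$value)]

# Report the findings on the console.
cat("Missing time steps:   ", format(missing_steps), "\n")
cat("Duplicated time steps:", format(duplicate_steps), "\n")
cat("Time steps with NA:   ", format(na_steps), "\n")
```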
Pre-process the three time series by determining the
key information, i.e. the value range (min and max) and the mean value,
of the related scalar value information.
Please analyse the time series month-wise for 2010 by
calculating the min, max and mean value for each month in 2010.
Precipitation time series additionally require the total sum of precipitation for the related time window.
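The overall and month-wise key figures could be computed along the following lines. This is a sketch on randomly generated values, not on the provided data; the names `ts_data`, `summarise_values` and `monthly` are hypothetical. The sum is computed for every series here, although it is only meaningful for precipitation.

```r
# Synthetic hourly series for the whole year 2010 (random, hypothetical values).
time <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"),
            as.POSIXct("2010-12-31 23:00", tz = "UTC"), by = "1 hour")
set.seed(1)
ts_data <- data.frame(time = time,
                      value = round(rexp(length(time), rate = 10), 1))

# Key figures of a numeric vector; missing values are ignored.
summarise_values <- function(v) {
  c(min  = min(v, na.rm = TRUE),
    max  = max(v, na.rm = TRUE),
    mean = mean(v, na.rm = TRUE),
    sum  = sum(v, na.rm = TRUE))   # only relevant for precipitation
}

# Key figures for the complete time series.
overall <- summarise_values(ts_data$value)
print(overall)

# Month-wise key figures: split the values by month label, e.g. "2010-01".
ts_data$month <- format(ts_data$time, "%Y-%m")
monthly <- t(sapply(split(ts_data$value, ts_data$month), summarise_values))
print(monthly)
```

The result `monthly` is a matrix with one row per month and one column per key figure, which can be copied directly into a report table.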
Transform the time series from hourly time steps to daily and weekly time series.
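A minimal sketch of the transformation to daily and weekly time steps, using base R only. It assumes that the mean is the wanted aggregate; for precipitation the sum (`FUN = sum`) would be the natural choice. Weeks are assumed to start on Monday, which is the default of `cut()` for dates. The synthetic values and the names `ts_data`, `daily` and `weekly` are hypothetical.

```r
# Synthetic hourly series covering 14 days; the value is simply 1..24 per day.
time    <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"),
               by = "1 hour", length.out = 24 * 14)
ts_data <- data.frame(time = time, value = rep(1:24, times = 14))

# Grouping keys: calendar day and week (labelled by the Monday of the week).
ts_data$day  <- as.Date(ts_data$time, tz = "UTC")
ts_data$week <- cut(ts_data$day, breaks = "week")

# Aggregate the hourly values to daily and weekly mean values.
daily  <- aggregate(value ~ day,  data = ts_data, FUN = mean)
weekly <- aggregate(value ~ week, data = ts_data, FUN = mean)

print(head(daily))
print(weekly)
```

Note that the first and the last week of a series are usually incomplete, so their aggregated values are based on fewer hourly values than the other weeks.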
Please create suitable plots for the given time series:
Please report all results (numbers and plots) in a time series report. This can be done manually by copy & paste into an office package.
Examples for the implementation of the different working steps with R scripts will be presented in the lectures.
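As a sketch of how a plot could be written to a file that is then pasted or imported into the report, the following example draws a simple line plot of a synthetic hourly series with base R graphics. The file name, the axis labels and the data are placeholders chosen for illustration; which plot types are suitable depends on the state variable.

```r
# Synthetic hourly series for one week (hypothetical values with a daily cycle).
time  <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"),
             by = "1 hour", length.out = 24 * 7)
value <- 5 * sin(2 * pi * seq_along(time) / 24)

# Write the plot to a PDF file in the temporary directory.
plot_file <- file.path(tempdir(), "timeseries_plot.pdf")
pdf(plot_file, width = 8, height = 4)
plot(time, value, type = "l",
     xlab = "Time", ylab = "Value (unit as given in the data file)",
     main = "Hourly values (synthetic example)")
dev.off()

cat("Plot written to:", plot_file, "\n")
```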
The assignment report contains the implemented R script and the report of the performed working steps.