We fund the Earth system science community to create educational content through annual calls for educational pilots. The outcomes are made publicly available through the NFDI4Earth educational portal. The submitted proposals are evaluated by NFDI4Earth co-applicants according to the following criteria (read the full guideline here):
a. Relevance to NFDI4Earth
b. State-of-the-art content
c. Novelty (addressing the gaps in existing OERs in ESS)
d. Use of active teaching methods
e. Relevance to RDM in ESS
f. Potential for integration into NFDI4Earth curricula
Here's a peek at the educational pilots that have made the cut year after year, each adding something special to our understanding of Earth system sciences:
The "Artificial Intelligence – Basics and Geographical Applications" educational pilot, developed by Ruhr University Bochum's Institute of Geography, is designed for M.Sc. and PhD students in Earth System Science (ESS). It offers a comprehensive introduction to artificial intelligence (AI) with a focus on geospatial applications, specifically in the context of environmental monitoring, geosimulation, and Earth observation.
The course consists of Jupyter Notebooks and learning videos, providing practical, hands-on learning opportunities. Key topics covered include pattern and object recognition (e.g., traffic and ship detection), predictive modeling of natural and economic phenomena (e.g., water levels, housing markets, weather), and geosimulation for urban growth predictions. The course explores various AI techniques, from machine learning algorithms like Random Forest and Support Vector Machines to neural networks, including convolutional and recurrent neural networks.
Students are guided through theoretical concepts and practical applications of AI in geospatial contexts using Python and GIS tools. By the end of the course, learners will be able to distinguish between different AI concepts, use machine learning tools, and build and train their own simple neural networks for geospatial analysis.
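To give a flavour of that final learning goal, here is a minimal, hedged sketch (not taken from the course materials) of training a small neural network on synthetic geospatial-style features with scikit-learn; all data, labels, and feature names are illustrative assumptions:

```python
# A minimal sketch (not from the course): a small neural network that
# classifies synthetic "pixels" into land-cover classes. All data and
# feature names are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)

# Synthetic spectral features (e.g., red, near-infrared, elevation)
X = rng.random((1000, 3))
# Toy rule standing in for real land-cover labels: "vegetation" where NIR >> red
y = (X[:, 1] - X[:, 0] > 0.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer is enough for this toy problem
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.2f}")
```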
The course materials are presented in English, ensuring accessibility to a broad audience, and require basic knowledge of digital image processing and GIS. The course aims to fill a gap in the educational landscape by demonstrating how AI techniques can be applied to geographic data and by helping learners develop self-directed research skills in AI-driven geospatial applications.
The "Classification and Change Detection in Remote Sensing" course from TU Dresden provides modular, self-paced learning material on satellite data processing and analysis. Delivered through Jupyter notebooks in R, the course is aimed at BSc and MSc students and young researchers in Earth and Environmental Sciences.
The course introduces students to the structure and handling of raster data, including cleaning, visualization, and basic image processing. It then covers techniques for enhancing data and extracting important features, helping students prepare datasets for analysis. Both unsupervised and supervised classification methods are explored, teaching students how to classify land cover using machine learning techniques like Random Forest and Support Vector Machines. The course also emphasizes the importance of evaluating the accuracy of classification results.
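The course itself works in R; purely to illustrate the supervised classification step it teaches, here is a hedged Python sketch on synthetic raster bands (all names, sizes, and labels are invented):

```python
# Hedged illustration of supervised land-cover classification on raster
# bands, analogous to the R workflow taught in the course. Synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# A tiny "image": 4 spectral bands of 50 x 50 pixels
bands = rng.random((4, 50, 50))
# Flatten to (n_pixels, n_bands), as classifiers expect tabular input
X = bands.reshape(4, -1).T

# Pretend training labels (e.g., water / forest / urban) for every pixel
y = rng.integers(0, 3, size=X.shape[0])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Predict per pixel and reshape back into a classified map
land_cover_map = clf.predict(X).reshape(50, 50)
print(land_cover_map.shape)  # (50, 50)
```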
In addition, students learn multi-temporal analysis and change detection methods, enabling them to track changes between different datasets over time, which is crucial for environmental monitoring. The course also includes techniques for visualizing and quantifying changes, providing students with the tools to communicate their research findings effectively.
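Likewise, a hedged sketch of the post-classification change-detection idea, again with synthetic label maps and invented class codes:

```python
# Post-classification change detection on two synthetic land-cover maps.
import numpy as np

rng = np.random.default_rng(1)

# Classified maps for two dates (values are class indices:
# 0 = water, 1 = forest, 2 = urban; purely illustrative)
map_2010 = rng.integers(0, 3, size=(50, 50))
map_2020 = rng.integers(0, 3, size=(50, 50))

# Per-pixel change mask and a simple "from-to" change matrix
changed = map_2010 != map_2020
change_matrix = np.zeros((3, 3), dtype=int)
np.add.at(change_matrix, (map_2010.ravel(), map_2020.ravel()), 1)

print(f"Changed pixels: {changed.mean():.0%}")
print(change_matrix)  # rows: class in 2010, columns: class in 2020
```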
The course materials are designed to be interactive and flexible, allowing students to modify the code and use their own data. The content is delivered in English and is available as open educational resources, encouraging its use and adaptation across different learning environments.
The increasing availability of satellite data in recent years opens up new applications in many areas of environmental science. Processing large amounts of data, especially satellite data, is one of the most important pillars of environmental monitoring. However, it also requires extensive knowledge and appropriately trained personnel. Educational institutions such as universities must adapt to these requirements, and this adaptation must cover all steps of the complex process chain for processing satellite data. In this context, it is important not only to train technical skills but also to build the methodological competencies that enable students to critically evaluate their own work steps. To reduce the complexity, a modular structure is used, which also makes it possible to take existing skills in the data-processing chain into account.
Digital teaching and learning opportunities experienced an enormous boost, at the latest with the restrictions of the COVID-19 pandemic. Experience has shown that flexible content can be an essential element in motivating learners, and the growing importance of MOOCs impressively underlines this development. These developments and effects represent an opportunity to transform the necessary content of satellite data processing into teaching and self-learning materials.
The objective is to develop Jupyter notebooks as self-learning material that provides a processing chain for common classification tasks with remote sensing data.
The project comprises three main work stages, with the technical implementation of the modules as the central element:
a. methodological development of learning modules on the process flow of satellite data processing,
b. technical implementation of the modules in Jupyter notebooks with example datasets, and
c. testing and assessment of the modules with MSc students in Environmental Sciences at TU Dresden.
Generally, simulation and modelling of environmental processes are carried out at the grid level, where the investigation region is discretized into numerous grid points in the three dimensions of space plus time. Consequently, these simulations produce enormous datasets, and processing this data extends beyond the capacity of the average personal computer. However, few people have access to high-performance computing centers. At the same time, every PC offers the possibility of speeding up calculations and modelling through compiled programming languages such as Fortran. This solution speeds up computations and can drastically reduce the CO2 footprint.
R is one of the languages most widely used for data analysis, visualization, and presentation, and it has a large supporting community and thousands of packages. Fortran, on the other hand, is one of the fastest-performing languages (if not the fastest for number crunching) and one of the oldest; because of its age, interest in Fortran remains consistently low. Considering all of the above, educational material that links R and Fortran is essential.
This project aims to provide a single OER platform that serves as a one-stop resource for all R users looking for speed in general, and for users from Environmental Science disciplines in particular.
Many developers have made efforts to speed up R using C++; however, to the best of the authors' knowledge, a comparable package integrating Fortran and R does not yet exist. Filling this gap is important because Fortran is well suited for numerical and scientific computation due to its array-processing capabilities, performance, and efficiency. Computationally demanding models are commonly written in Fortran; integrating Fortran and R will therefore allow environmental modellers and researchers to minimize switching between different programming languages.
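The pilot itself targets R and Fortran; to illustrate the underlying argument about compiled kernels versus interpreted loops, here is a small Python timing sketch (NumPy dispatches the dot product to compiled BLAS routines, which historically descend from Fortran code):

```python
# Illustration of the speed gap between an interpreted loop and a
# compiled numerical kernel (the pilot's argument for pairing R with Fortran).
import time
import numpy as np

x = np.random.rand(5_000_000)

# Interpreted loop: every iteration pays interpreter overhead
t0 = time.perf_counter()
total = 0.0
for v in x:
    total += v * v
t_loop = time.perf_counter() - t0

# Compiled kernel: the same sum of squares via a BLAS dot product
t0 = time.perf_counter()
total_blas = float(x @ x)
t_blas = time.perf_counter() - t0

print(f"loop: {t_loop:.2f} s, compiled: {t_blas:.4f} s")
```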
The "Climatematch" EduPilot delivers educational content focused on teaching computational tools for climate science. The course materials are hosted online and consist of interactive Jupyter notebooks, coding tutorials, narrative explanations, and small embedded videos. These resources are designed for climate scientists, data engineers, and students with some programming knowledge who want to enhance their ability to work with large climate datasets.
The course is available in an open-access format and is designed to support flipped classrooms. Instructors or teaching assistants can use the materials to lead discussions and facilitate learning. The curriculum includes a wide range of climate science topics, including paleoclimate, modern climate data (atmospheric, oceanographic, and land), future climate projections, and the socio-economic and political dimensions of climate action. There is also a focus on using machine learning to address climate challenges.
Students have the opportunity to engage in research projects, using "project stubs" that allow them to work with big datasets in innovative ways.
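As a hedged illustration of the kind of workflow such notebooks teach, here is a sketch that computes an area-weighted global-mean temperature series with xarray; the dataset below is synthetic, whereas the real materials work with observational and model data:

```python
# Synthetic stand-in for a gridded climate dataset; the real course
# materials use observational and model data instead.
import numpy as np
import pandas as pd
import xarray as xr

lat = np.arange(-89.5, 90, 1.0)
lon = np.arange(0.5, 360, 1.0)
time = pd.date_range("2000-01-01", periods=24, freq="MS")

tas = xr.DataArray(
    15 + 10 * np.random.rand(len(time), len(lat), len(lon)),
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="tas",  # near-surface air temperature, illustrative values
)

# Area weighting by cos(latitude), as appropriate for regular lat-lon grids
weights = np.cos(np.deg2rad(tas.lat))
global_mean = tas.weighted(weights).mean(dim=("lat", "lon"))
print(global_mean.values[:3])
```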
Lead isotopes are a well-known geochronological tool. However, lead isotope signatures can also be used to link non-ferrous metal objects to ore deposits because they do not fractionate in metallurgical processes. Based on this link, lead isotopes are a powerful tool to reconstruct past economic networks. When combined with other methods, they also help to decipher past interactions between humankind and the environment, especially the impact of mining activities. For these reasons, lead isotopes are a particularly well-suited example for an interdisciplinary approach that combines Earth System Sciences, Humanities, and Data Sciences.
The Educational Pilot “Teaching lead isotope geochemistry and application in archaeometry (LIGA-A)” will create a collection of educational materials that highlights this interlinkage and the importance of modern data-scientific approaches to the topic. The educational materials will stand on their own but follow the path of lead isotope signatures from their generation in ore deposits, through the metallurgical process and their measurement in the lab, to the proper handling of such data, their visualization and interpretation, and finally their application in concert with data from, e.g., archaeological excavations, textual sources, and sediment cores.
To reach this aim, the educational resources will utilize a wide range of formats such as presentations, quizzes, animations, interactive visualizations, and coding exercises. At the same time, the Educational Pilot will focus on creating materials that are as inclusive as possible, from a technical point of view but also with regard to different impairments of the learners.
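To give a flavour of the coding exercises such a collection might contain, here is a hedged sketch that plots an artifact measurement against ore-deposit fields in a standard lead isotope ratio biplot; all values are synthetic and purely illustrative, not real measurements:

```python
# Synthetic lead isotope ratio biplot: comparing an artifact to ore fields.
# All numbers are invented for illustration only.
import matplotlib.pyplot as plt

# Illustrative ore-deposit samples as (207Pb/206Pb, 208Pb/206Pb) pairs
deposit_a = [(0.832, 2.064), (0.834, 2.068), (0.831, 2.061)]
deposit_b = [(0.851, 2.092), (0.853, 2.096), (0.850, 2.090)]
artifact = (0.833, 2.065)  # a measured object to provenance

for name, samples in [("Deposit A", deposit_a), ("Deposit B", deposit_b)]:
    x, y = zip(*samples)
    plt.scatter(x, y, label=name)
plt.scatter(*artifact, marker="*", s=200, label="Artifact")

plt.xlabel("207Pb/206Pb")
plt.ylabel("208Pb/206Pb")
plt.legend()
plt.title("Lead isotope ratio biplot (synthetic data)")
plt.show()
```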
For the efficient handling of large gridded datasets, the concept of a datacube has received much attention in recent years. A datacube stores datasets with common axes (like latitude, longitude, time) in a neatly organized and easily accessible format that, for example, allows fast data subsetting. Part of the convenience of a datacube originates from the data being stored in so-called chunks: standardized subsets of the data that fit into memory and allow efficient data access and parallel processing. However, accessing data on disk also creates an overhead in computation time from input/output operations. Thus, access to the datacube is only fast when the data is provided with a chunking aligned to the analysis in question. To illustrate: if data is chunked for time series access, it will be inefficient to access a map (one time point from each chunk); conversely, if data is chunked for spatial processing, it will be inefficient to access a time series separated across many chunks.
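A hedged sketch of this trade-off using dask (the array below lives in memory, so only the chunk counts are visible; with an on-disk format such as Zarr, every touched chunk translates into an input/output operation):

```python
# How chunk orientation decides the cost of map vs. time-series access.
import numpy as np
import dask.array as da

data = np.random.rand(365, 180, 360)  # (time, lat, lon)

# Chunked for maps: one chunk per time step
cube_maps = da.from_array(data, chunks=(1, 180, 360))
# Chunked for time series: full time axis, small spatial tiles
cube_series = da.from_array(data, chunks=(365, 10, 10))

# One map touches a single chunk here ...
print(cube_maps[0].npartitions)             # 1
# ... but 18 * 36 = 648 chunks here
print(cube_series[0].npartitions)           # 648

# One pixel's time series: 365 chunks vs. a single chunk
print(cube_maps[:, 90, 180].npartitions)    # 365
print(cube_series[:, 90, 180].npartitions)  # 1
```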
A proper chunking for efficient data reading and writing is especially important for two reasons. First, the datasets we have to handle in the Earth system sciences are getting so large that they can no longer be loaded into working memory in full; when data has to be accessed on disk, the number of input/output operations should be minimized so that they do not limit computation speed. Second, more and more data is available in the cloud and needs to be made cloud-compatible. Since data latency becomes even more important in the cloud, the data is compressed, and it is then very important to decompress only the data needed for the given analysis in order to optimize resources and computational speed. Both can be achieved by optimal chunking.
This course provides interactive notebooks and explorable explanations to give students an intuition for different chunking strategies and their influence on the performance of computations. The material will be provided as interactive Jupyter notebooks, so that learners can follow along, experiment, and modify the code at their own pace. The notebooks will be made available on Binder, allowing interactive online code execution, to lower the entry barrier. The material will be provided in English. The target group is expected to have some programming experience and some experience working with gridded data.
Coding exercises are an important component of teaching data analysis in ESS today. Manually correcting assignments often means a heavy workload for exercise instructors, and students often neither submit on time nor receive timely feedback. Automated code-checking systems are therefore promising for a wide range of teaching activities in ESS education. Several universities offer such a service, based on different software architectures and infrastructures, but most of them are accessible only to their own students. In addition, the same basic content is often designed repeatedly at different universities, or even in different departments of the same university.
Nbgrader is an existing tool that supports creating and grading assignments in Jupyter Notebooks. It can easily be deployed on a conventional server, where students can program Python code online in a Jupyter Notebook interface and exercise instructors can automatically grade their submissions. The Institute of Cartography and Geoinformatics at the University of Hannover has implemented such a system and has successfully deployed it for teaching activities using Python as the programming language since 2021, for courses such as GIS I - modeling and data structures, laser scanning data processing, and SLAM.
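For readers unfamiliar with the tool, a typical nbgrader assignment pairs a student-facing solution cell with an instructor test cell. A minimal sketch, with an invented exercise:

```python
# --- Student-facing cell: nbgrader strips the region between the
# --- solution markers before release, and the student fills it in.
import numpy as np

def mean_ndvi(red, nir):
    """Return the mean NDVI of two reflectance arrays."""
    ### BEGIN SOLUTION
    ndvi = (nir - red) / (nir + red)
    return float(ndvi.mean())
    ### END SOLUTION

# --- Autograder test cell: hidden asserts that score the submission.
red = np.array([0.1, 0.2])
nir = np.array([0.5, 0.6])
expected = (0.4 / 0.6 + 0.4 / 0.8) / 2
assert abs(mean_ndvi(red, nir) - expected) < 1e-9
```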
The reuse of existing teaching materials is also of great importance. Within the education-oriented project ICAML - Interdisciplinary Center for Applied Machine Learning (coordinated by co-applicant Martin Werner, BMBF-funded 2018-2020), numerous Jupyter Notebook tutorials on machine learning topics in geospatial data analysis were developed and introduced to the community. An interactive code-checking process is important to further develop these tutorials and to make their contents interactive and effortless to include in future e-teaching activities related to geospatial data analysis.
Changes in land use/cover are taking place worldwide at a variety of spatiotemporal scales and intensities. In this context, urbanization is a process that is affecting more and more areas of society and nature. Today, more than half of the world's population already lives in cities; in some European countries, the figure is up to 80%. Even though built-up areas account for only 2-3% of the land surface worldwide, their “ecological footprint” is enormous. Agricultural land, in particular, is being taken up for the expansion of settlement and transport areas. The analysis of such changes based on heterogeneous geospatial data sources is an important work step in estimating the future evolution of socio-ecological parameters such as migration, erosion, runoff patterns, and biodiversity.
Regional case studies from “hot spots” of urbanization will be used to perform the necessary work steps to capture and quantify urbanization in the context of sustainable development (Sustainable Development Goal 11). Modern methods for accessing open geodata will be presented, and the extraction of thematic information from volunteered geographic information (VGI), social media geographic information (SMGI), and Earth observation (EO) data with Python will be taught. Learners can reproduce all work steps independently on their own computers. Basic knowledge of digital image processing and Geographic Information Systems is required.
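A hedged sketch of the VGI step, assuming the osmnx package; the place name is an arbitrary example, not necessarily one of the course's case studies:

```python
# Retrieve OpenStreetMap building footprints (VGI) and estimate the
# built-up area; place name and workflow are illustrative assumptions.
import osmnx as ox

buildings = ox.features_from_place("Dresden, Germany", tags={"building": True})

# Project to a metric CRS so polygon areas come out in square metres
buildings = ox.projection.project_gdf(buildings)
area_km2 = buildings.geometry.area.sum() / 1e6
print(f"Total building footprint: {area_km2:.1f} km²")
```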