Cloud Data Mining Project

Motivation

The data volumes that the e-sciences face are already reaching petabyte scale and continue to grow exponentially. A scalable, distributed infrastructure for data management and analysis is essential in such an environment.

In this project, we focus on the analysis of structured data, such as trees or graphs, using cloud computing techniques.

Research Topics

Current Status

One current research focus is a high-level scripting interface that makes distributed tree processing easy and convenient. We put special emphasis on exploiting modern infrastructure such as multi-core CPUs. Moreover, we aim to integrate optimization techniques from relational database systems as well as adaptive reoptimization of workflows at runtime.
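
To give a rough sense of what concurrent tree processing can look like, the sketch below folds a small tree bottom-up and hands each top-level subtree to its own worker. It is only an illustration and not the project's scripting interface: the names Node, fold_tree, parallel_fold, and sum_values are hypothetical, and a thread pool stands in for whatever multi-core or distributed execution layer is actually used.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical tree node carrying a numeric value."""
    value: int
    children: List["Node"] = field(default_factory=list)


def fold_tree(node: Node, combine: Callable[[int, List[int]], int]) -> int:
    """Fold a subtree bottom-up, sequentially."""
    child_results = [fold_tree(child, combine) for child in node.children]
    return combine(node.value, child_results)


def parallel_fold(root: Node,
                  combine: Callable[[int, List[int]], int],
                  max_workers: int = 4) -> int:
    """Fold each top-level subtree in its own task, then combine at the root.

    Assumption: a thread pool is used here for portability; CPU-bound work
    in Python would typically need a process pool to use several cores.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(lambda child: fold_tree(child, combine),
                                 root.children))
    return combine(root.value, partials)


def sum_values(value: int, child_results: List[int]) -> int:
    """Example combine function: sum of all values in the subtree."""
    return value + sum(child_results)


# Small example tree with six nodes.
tree = Node(1, [Node(2, [Node(4), Node(5)]), Node(3, [Node(6)])])
total = parallel_fold(tree, sum_values)
```

Splitting only at the root keeps the sketch short; a real system would likely choose the split points according to subtree size and the available workers.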

Documents