Basic Research in Informatics for Creating the Knowledge Society

Project leader: Prof.dr. Martin Kersten (CWI)
Consortium: CWI
Industrial partners (non-exhaustive): RUG (Groningen), Kapteyn Instituut, JHU (USA), SDSS team
Total FTE: 3.4fte (heads: 2 faculty: 1 PhD, 1 PD)
Project IS4/5: Cracking a Scientific Database
Scientific data management challenges the capabilities offered by database systems, both in dealing with petabyte data volumes and in facilitating scalable and complex query interaction in a distributed setting. In this project we plan to study the architectural consequences of continuous meta-data reorganization decisions and multi-step, adaptive query processing. Reorganization will be an integral part of the query evaluation process. Every query is first analyzed for its contribution to "cracking" the database into multiple pieces, such that the required subset is easily retrieved and subsequent queries may benefit from the new partitioning structure. A similar argument applies to query processing over a scientific database. The project creates an experimental setting to develop and evaluate novel database techniques to aid scientific discoveries in astronomy.
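
The cracking idea described above can be illustrated with a minimal sketch. It assumes a single in-memory column of numbers and range queries of the form low <= v < high; the class name CrackedColumn and its methods are hypothetical and do not reflect the MonetDB implementation. Each query partitions only the piece of the column that contains its bounds, so later queries find their answer in a contiguous, already narrowed region.

```python
from bisect import bisect_left


class CrackedColumn:
    """Hypothetical sketch of a column that is reorganized by its queries."""

    def __init__(self, values):
        self.values = list(values)  # private copy that is physically reordered
        self.pivots = []            # sorted crack values seen so far
        self.positions = []         # positions[i]: first index with value >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return the split position."""
        i = bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # an earlier query already cracked here
        # Boundaries of the only piece that can contain `pivot`.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.values)
        piece = self.values[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.values[lo:hi] = smaller + larger
        pos = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.values[start:end]


col = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
first = col.range_query(5, 13)   # cracks the whole column at 5 and 13
second = col.range_query(3, 8)   # only touches the two small pieces involved
```

In this sketch the second query scans seven values instead of ten, because the pieces created by the first query bound the region that has to be inspected.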

Sloan Digital SkyServer
This subproject aims to create a mirror site for the SDSS. Only a single, closed-source solution is known, and it illustrates the sizeable challenges of supporting the requirements of this scientific database. The database definition alone is 200 pages of SQL, the database contains >200M objects, and over a million queries are handled every month. The modern open-source database management system MonetDB is considered the prime candidate to act as an experimentation platform. It enables experimentation at all levels of a DBMS architecture, including data structures, query optimization, and distributed storage management.

Streaming scientific workflow
This subproject concentrates on distributed (and hence scalable) techniques to manage both the update load and the scientific discovery algorithms. Large (radio) telescopes generate a myriad of data, which requires cleaning, calibration, and analysis in an e-Science grid setting. The outcome of this process is a multi-gigabyte daily stream of event records, to be archived for a long period. In passing, high-priority real-time analysis is required to track phenomena of interest. A strong focus is the development of streaming database technology.
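
As a rough illustration of archiving a stream while analyzing it in passing, the sketch below keeps every event record and flags those whose value jumps above the recent average. The function name process_stream, the record fields "t" and "flux", and the sliding-window mean test are all assumptions made for this example; the project text does not prescribe a particular detection method.

```python
from collections import deque


def process_stream(events, window_size, threshold):
    """Archive every event and flag those far above the sliding-window mean.

    `events` is any iterable of dicts with a numeric "flux" field
    (a hypothetical record layout chosen for this sketch).
    """
    window = deque(maxlen=window_size)  # most recent flux values
    archive, alerts = [], []
    for event in events:
        flux = event["flux"]
        # Only test once the window is full, so the mean is meaningful.
        if len(window) == window_size:
            mean = sum(window) / window_size
            if flux - mean > threshold:
                alerts.append(event)    # high-priority path
        window.append(flux)
        archive.append(event)           # long-term archiving path
    return archive, alerts


events = [{"t": i, "flux": f}
          for i, f in enumerate([1.0, 1.0, 1.0, 1.0, 9.0, 1.0])]
archive, alerts = process_stream(events, window_size=3, threshold=5.0)
```

The single pass over the data mirrors the requirement that real-time analysis happens while records are on their way to the archive, not in a later batch job.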

Industrial cooperation & LOFAR
We cooperate with astronomers of the LOFAR project (Kapteyn Instituut RUG; Anton Pannekoek Institute, UvA) on database support for detecting transients, and with database research labs (Microsoft Bay-area Research Centre) to understand the domain-specific issues and the experience gained with the sole working version of SDSS.

Highlights
As this project is part of the third phase of the BRICKS program (financed through the second open round July 2006), challenges rather than results are presented. Likewise, no BRICKS key publications are currently available.

Research challenges
Scientific databases have been recognized as one of the most challenging areas of database research. They require a fundamental assessment and renovation at all levels of a database system architecture, including the following:

  • Lightweight database compression techniques to reduce the massive storage requirements.
  • Optimization techniques geared at mathematical analysis of event streams.
  • Approximate query-processing algorithms against inherently incomplete and noisy data.
  • Distributed database techniques to scale to the size and the number of sites involved.
  • A sound architecture of a data stream engine to cope with the large volumes and real-time analysis required.
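
To make the first challenge concrete, here is a minimal sketch of one lightweight compression scheme, frame-of-reference encoding, chosen purely for illustration; the project text does not name a specific technique. The function names for_encode and for_decode are hypothetical. Values that cluster in a narrow range are stored as small offsets from a shared base, so each offset needs only a few bits.

```python
def for_encode(values):
    """Frame-of-reference encode a non-empty list of integers.

    Returns (base, bits, offsets), where `bits` is the width needed
    to store the largest offset.
    """
    base = min(values)
    offsets = [v - base for v in values]
    bits = max(offsets).bit_length()
    return base, bits, offsets


def for_decode(base, offsets):
    """Invert for_encode by adding the base back to every offset."""
    return [base + o for o in offsets]


ids = [1000003, 1000001, 1000007]
base, bits, offsets = for_encode(ids)
restored = for_decode(base, offsets)
```

Decoding is a single addition per value, which is what makes schemes of this kind "lightweight" compared with general-purpose compressors.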

Economic & societal impact
Many of the research results stem from intensive and fruitful collaborations with industrial parties, which use the results to enhance their competitive edge. In this way, the results have a significant impact on society.

The MonetDB platform is developed concurrently in several other settings: the Bsik program MultimediaN, where the emphasis is on multimedia search; the Philips research lab, to empower their ambient home environment; Regie voor Geo Informatiesystem, aimed at improved GIS applications; and the Dutch Forensics Institute, to simplify digital forensics.

The results of this project are made available through a real-life mirror of the SDSS. This way we provide a bridge between astronomy research worldwide and database research. It also provides the basis for quality assurance and empirical research in scientific database application scenarios.

Future work 2007-2009
This project started in late 2006. The functional prototype of MonetDB/Sky server has been created and is expected to go live in the 2nd quarter of 2007.

IS4/5 Researchers funded by BRICKS

  • M.Sc. Erietta Liarou (CWI)
  • Dr. Milena Ivanova (CWI)
  • Drs. Romulo Pereira Goncalves (CWI)

Other researchers involved

  • Prof.dr. M. Kersten (CWI)
  • Prof.dr. R. van Liere (CWI)
  • Dr. N. Nes (CWI)
  • Drs. E. Liarou (CWI)
  • Dr. R. Nijboer (ASTRON/LOFAR)
  • Dr. O. Smirnov (ASTRON/LOFAR)
  • G. van Diepen (ASTRON/LOFAR)

For more information, please refer to the publications and posters of this project.


© 2004-2009 BRICKS Consortium