Realizing the Hyperdatabase Vision
The Vision
The amount of stored information is exploding as a consequence of the immense progress in computer and communication technology during the last decades. However, tools for accessing relevant information and processing globally distributed information in a convenient manner are underdeveloped. To improve this situation, we envision the concept of a hyperdatabase that provides database functionality at a much higher level of abstraction, i.e., at the level of complete information components in an n-tier architecture. In analogy to traditional database systems that manage shared data and transactions, a hyperdatabase manages shared information components and transactional processes. It provides "higher-order data independence" by guaranteeing the immunity of applications not only against changes in the data storage and access structures but also against changes in the application components and services, e.g., with respect to the location, implementation, workload, and number of replicas of components. In our vision, a hyperdatabase will be the key infrastructure for developing and managing future information systems.
Figure 1: The Hyperdatabase Vision
At the interface, it will support component and service definition and deployment, specification of transactional processes encompassing multiple application service invocations, service publication and subscription (see Figure 1). Under the cover, it will perform metadata management, scheduling, optimal routing of service requests, monitoring, flexible failure treatment, availability, and scalability. As illustrated in Figure 2, we have established a number of broad research directions tackling various problems of hyperdatabases:
Transactional coordination in composite systems continues our tradition in transaction research. The two areas below, database clusters and multimedia information management, benefit from transaction research and are two examples of large-scale information systems in which we explore and realize the hyperdatabase vision through various prototype systems. On the vertical axis, we have established a new area, information dynamics and mobility, which complements the foundational work on transactions with asynchronous, decentralized coordination. In the following, we present a short description of the four areas.
Transactional Coordination in Composite Systems
We have studied the problem of ensuring correctness of concurrent executions in composite n-tier systems. Every coordinator in the composite system performs its own transaction management, ensuring (local) correctness and (local) recovery. The problem is how global correctness and global recovery are ensured. In the past, we have studied this problem extensively from a foundational point of view and performed several evaluations. In transactional processes, our more recent research activity, we go beyond transactions in that we not only specify conflicts between invocations but also know about compensation and retriability. We allow alternative executions to be specified and, based on these, generalize the "all-or-nothing" atomicity of transactions to a notion called guaranteed termination. It means that a single process will eventually terminate along a well-defined path, even in case of failure and under concurrency. We have started new investigations aiming at decentralized coordination that explore mobile agent technology and exploit cost information of service invocations to optimize scheduling without sacrificing correctness. Work on transactions and transactional processes is the foundation for the transaction implementations in the other main research areas described below.
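The interplay of compensation, retriability, and alternative executions can be illustrated with a small sketch. It is not the group's implementation: the names `Activity`, `execute_path`, and `execute_process`, the boolean success convention, and the retry bound are all assumptions made for illustration, and concurrency control is left out entirely. The sketch also assumes that every completed activity on an abandoned path can be compensated.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Activity:
    """One service invocation in a process (hypothetical model)."""
    name: str
    run: Callable[[], bool]                          # returns True on success
    compensate: Optional[Callable[[], None]] = None  # undoes a completed run
    retriable: bool = False                          # assumed to succeed eventually
    max_retries: int = 5                             # bound used only in this sketch


def execute_path(path: List[Activity], log: List[str]) -> bool:
    """Run one path; on failure, compensate completed activities in reverse order."""
    done: List[Activity] = []
    for act in path:
        attempts = act.max_retries if act.retriable else 1
        ok = False
        for _ in range(attempts):
            if act.run():
                ok = True
                break
            log.append(f"fail:{act.name}")
        if ok:
            log.append(f"ok:{act.name}")
            done.append(act)
            continue
        # The path cannot complete: undo what has been done so far.
        for prev in reversed(done):
            if prev.compensate is not None:
                prev.compensate()
                log.append(f"undo:{prev.name}")
        return False
    return True


def execute_process(alternatives: List[List[Activity]]) -> List[str]:
    """Try the alternative paths in order until one of them completes."""
    log: List[str] = []
    for index, path in enumerate(alternatives):
        if execute_path(path, log):
            log.append(f"terminated:path{index}")
            return log
    log.append("terminated:aborted")  # every path was compensated
    return log


# Usage: the primary path fails at payment, so the reservation is undone
# and the fallback path (with a retriable payment step) is executed.
seats: List[str] = []
attempts_left = {"invoice": 2}  # the invoice service fails twice, then succeeds


def reserve() -> bool:
    seats.append("seat")
    return True


def pay_invoice() -> bool:
    if attempts_left["invoice"] > 0:
        attempts_left["invoice"] -= 1
        return False
    return True


primary = [Activity("reserve", reserve, compensate=seats.clear),
           Activity("pay_card", lambda: False)]
fallback = [Activity("reserve", reserve, compensate=seats.clear),
            Activity("pay_invoice", pay_invoice, retriable=True)]
print(execute_process([primary, fallback]))
```

In this sketch the process always ends in a well-defined state: either one path has completed, or every partially executed path has been compensated. Placing a retriable activity at the end of the last alternative is what lets that path run to completion despite transient failures.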
Database Cluster - PowerDB
Figure 2 (left): Physical view on the database cluster of 128 DBMS nodes
Figure 3 (right): Research areas of the Database Group
In the PowerDB project we explore a hyperdatabase consisting of a set of component databases in a PC cluster, as shown in Figure 3. The objective is to overcome the scalability and availability limits of today's database technology. Every component holds a complete DBMS with its data. Clients access data via the coordinator, i.e., via the hyperdatabase. We explore protocols for high-level transaction management with special consideration of semantic conflicts and of data partitioning and replication. Replication of complete databases yields considerable speed-ups for read transactions. Thanks to the second-layer transaction management, we avoid the disadvantages of traditional commit protocols and of synchronous updates. In addition, query routing aims at detecting components that have sufficiently fresh data and that offer the shortest response time because of queries they have processed before. Replication can be full or restricted to certain parts of the database. We investigate methods that allow more components to be added to the cluster dynamically. We put special emphasis on "Online Analytical Processing in a Cluster of Databases" and on "XML Document Management with PowerDB".
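The query routing idea described above can be sketched as follows. This is only an illustration under stated assumptions, not PowerDB's routing algorithm: the `Node` class, the staleness measure in seconds, the `CACHE_HIT_FACTOR` discount for previously processed queries, and the function `route_query` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

CACHE_HIT_FACTOR = 0.25  # assumed speed-up when a node has seen the query before


@dataclass
class Node:
    """A component database in the cluster (hypothetical model)."""
    name: str
    staleness_s: float        # seconds since the last update was applied
    base_response_ms: float   # estimated response time for a cold query
    recent_queries: Set[str] = field(default_factory=set)

    def estimate_ms(self, query: str) -> float:
        """Estimate the response time, favouring queries processed before."""
        if query in self.recent_queries:
            return self.base_response_ms * CACHE_HIT_FACTOR
        return self.base_response_ms


def route_query(nodes: List[Node], query: str,
                max_staleness_s: float) -> Optional[Node]:
    """Pick the sufficiently fresh node with the lowest estimated response time."""
    fresh = [n for n in nodes if n.staleness_s <= max_staleness_s]
    if not fresh:
        return None  # no replica satisfies the freshness requirement
    best = min(fresh, key=lambda n: n.estimate_ms(query))
    best.recent_queries.add(query)  # remember the query for later routing
    return best


cluster = [
    Node("a", staleness_s=2.0, base_response_ms=100.0),
    Node("b", staleness_s=30.0, base_response_ms=20.0),
    Node("c", staleness_s=5.0, base_response_ms=120.0, recent_queries={"q1"}),
]
chosen = route_query(cluster, "q1", max_staleness_s=10.0)
print(chosen.name if chosen else "no fresh replica")
```

With these made-up numbers, node `b` is the fastest but too stale for a 10-second freshness bound, so the query goes to node `c`, which has processed `q1` before. Relaxing the freshness bound makes `b` eligible again, which illustrates the trade-off between freshness and response time.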
Multimedia Information Management
Figure 4: Multimedia components coordinated by a hyperdatabase in the ETHWorld application
Multimedia information systems consist of many specialized components such as databases, object repositories, special image servers, feature extractors, and indexing components. ISIS, our Interactive SImilarity Search engine, builds on top of OSIRIS, which provides a framework to implement, call, and combine services. In this context, ISIS consists of a number of core services to store, analyze, and index multimedia documents. These services run in a large cluster (with more than 100 nodes) that is maintained and observed by the underlying OSIRIS system. Simple transactional processes for insertion, similarity search, and bulk load can run in parallel, and the subtasks are "optimally" and reliably assigned to the components by the OSIRIS system, as shown in Figure 4. At any point in time, a new component can be added to the cluster in order to improve response times. Interactive similarity retrieval is based on the VA-File, a simple but efficient approximation of the inherently high-dimensional feature vectors. In order to improve retrieval effectiveness, we support complex similarity queries consisting of several reference images, several feature types, textual attributes, and predicates. In combination with relevance feedback, our similarity search system provides a convenient interface for effective queries, as exemplified in Figure 5. We further apply these techniques to organize, manage, and present the individual information spaces of users in a more natural and efficient way.
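To make the idea of searching over vector approximations more concrete, the following is a simplified sketch in the spirit of the VA-File, not the ISIS implementation. It assumes feature components normalized to [0, 1], a uniform grid with `BITS` bits per dimension, and Euclidean distance; the function names `approximate`, `lower_bound`, and `knn` are chosen for illustration only. A first phase scans only the compact approximations and computes a lower bound on each distance; a second phase fetches exact vectors in order of increasing lower bound and stops once no remaining vector can improve the result.

```python
import heapq
import math
from typing import List, Sequence, Tuple

BITS = 3            # bits per dimension (assumed value)
CELLS = 1 << BITS   # number of intervals per dimension


def approximate(vec: Sequence[float]) -> Tuple[int, ...]:
    """Quantize a vector with components in [0, 1] to per-dimension cell indices."""
    return tuple(min(int(x * CELLS), CELLS - 1) for x in vec)


def lower_bound(query: Sequence[float], cell: Sequence[int]) -> float:
    """Smallest possible Euclidean distance from the query to any point in the cell."""
    total = 0.0
    for q, c in zip(query, cell):
        lo, hi = c / CELLS, (c + 1) / CELLS
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return math.sqrt(total)


def knn(query: Sequence[float], vectors: Sequence[Sequence[float]],
        approximations: Sequence[Sequence[int]],
        k: int) -> Tuple[List[Tuple[float, int]], int]:
    """Return the k nearest (distance, index) pairs and the number of exact vectors read."""
    # Phase 1: scan the approximations only and order candidates by lower bound.
    order = sorted((lower_bound(query, a), i) for i, a in enumerate(approximations))
    # Phase 2: read exact vectors until the lower bound exceeds the k-th best distance.
    best: List[Tuple[float, int]] = []  # max-heap emulated with negated distances
    visited = 0
    for lb, i in order:
        if len(best) == k and lb > -best[0][0]:
            break
        visited += 1
        d = math.dist(query, vectors[i])
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), visited
```

Because the lower bound never exceeds the true distance, the early stop does not discard any true nearest neighbour, so the result matches an exhaustive scan while typically reading far fewer exact vectors.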
Figure 5: Image search. The five most similar images to the query image before (left) and after (right) consideration of relevance feedback. Results were found in a test collection of 350,000 images.
Information Dynamics and Mobility
The combination of wireless and wired connectivity along with increasingly small and powerful mobile devices, such as laptops, personal digital assistants, handheld PCs, and smart phones, enables a wide range of new applications that will radically change the way information is managed and processed today. Therefore, in this new research area we put strong emphasis on networked information systems in which nodes may become (partially) disconnected at any point in time. Nevertheless, processing should continue, with the objective of resolving potential conflicts afterwards, i.e., by performing some coordination once the nodes are reconnected. In our vision, information systems will be composed of self-describing and self-organizing mobile information components that are abstractions of both data and application logic.
!!! This document is stored in the
ETH Web archive and is no longer maintained !!!