Department of Computer Science,
Swiss Federal Institute of Technology,
CH-1015 Lausanne, Switzerland
IEM Department, University of Burgundy,
F-21000 Dijon, France
An obstacle to the establishment of a federated database system is the heterogeneity of the local databases. Data model heterogeneity occurs when the local systems use different languages to define and manipulate data. Another type of heterogeneity occurs when one single phenomenon in the domain of discourse is modelled differently in local systems. Conversely, identical schema items in local systems can in fact represent different phenomena. A study of different types of heterogeneity has been presented in Conflicts and Correspondence Assertions in Interoperable Databases (Spaccapietra-91).
To overcome the problems of heterogeneity, a solution in two phases is usually advocated: translation followed by integration (see Femus: A Federated Multilingual Database System (Andersson-93)). In the translation phase, schemas in the local database systems, expressed in different data models, are translated into schemas in one single conceptual model that serves as the intermediate model. In the integration phase, correspondence assertions are specified that relate items of the local schemas to each other. The integrator then builds a virtual federated schema according to the correspondence assertions. The federated schema describes data from all the local systems and can be translated back to the data models of the members of the federation. This allows local users to query the federated schema in their own language. Note that a federated schema does not have to be global. There may be several, possibly overlapping, federated schemas according to the requirements of the applications. A federated schema can also be constructed partly from other federated schemas.
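The integration phase can be illustrated with a minimal sketch. It assumes that every correspondence assertion is a plain equivalence between two schema items, which is only one of the kinds of assertion the cited work considers. The function name `integrate`, the schema names, and the item names are hypothetical and chosen for illustration only.

```python
def integrate(local_schemas, assertions):
    """Group schema items that are asserted to be equivalent.

    local_schemas: {schema_name: [item_name, ...]}
    assertions:    [((schema, item), (schema, item)), ...]  (assumed equivalences)
    Returns a sorted list of groups; each group is one item of the
    virtual federated schema, listed with its local counterparts.
    """
    # Every local item starts in its own group (union-find forest).
    parent = {(s, item): (s, item)
              for s, items in local_schemas.items()
              for item in items}

    def find(node):
        # Follow parent links to the group representative (path halving).
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Each assertion merges the groups of the two items it relates.
    for left, right in assertions:
        parent[find(left)] = find(right)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical local schemas and a single correspondence assertion.
schemas = {"S1": ["Employee", "Dept"], "S2": ["Staff"]}
federated = integrate(schemas, [(("S1", "Employee"), ("S2", "Staff"))])
```

Here `federated` holds two items: one that stands for `Dept` alone, and one that stands for both `Employee` in `S1` and `Staff` in `S2`.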
The following two sections give an outline of the translation and integration phases.
The translation between a set of data models is carried out in two phases. First, the inheritance hierarchy of meta-types is extended with the concepts of the data models that are not already defined as meta-types. Next, schema items are mapped through their meta-types: if the schema-types Sij and Skl, expressed in models Mi and Mk respectively, are instances of the same meta-type, then when translating the schema Si to a schema in the model Mk, all Sij can be mapped directly to Skl. The main advantage of using a meta-model for the translation is the possibility of reusing translation rules in several data model translations. Moreover, with this approach only one translation per model is required, regardless of the number of models in the federation. A more detailed description of the mechanism can be found in TIME - Traducteur Intelligent avec Metamodele Extensible (Nicolle-95).
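As a rough illustration of mapping through shared meta-types, the sketch below registers each data model once against a small set of meta-types and then translates between any pair of models. The meta-type names (`ObjectType`, `Property`), the two models (`er`, `oo`), and the function `translate` are assumptions made for this example; they are not taken from the TIME report.

```python
from dataclasses import dataclass

# Hypothetical registry: each model declares, once, which meta-type
# each of its concepts is an instance of.
META_TYPES = {
    "er": {"entity_type": "ObjectType", "attribute": "Property"},
    "oo": {"class": "ObjectType", "field": "Property"},
}


@dataclass(frozen=True)
class SchemaItem:
    model: str    # data model the item is expressed in
    concept: str  # concept of that model, e.g. "entity_type"
    name: str     # name of the schema item


def translate(item, target_model):
    """Map an item to the target-model concept that shares its meta-type."""
    meta_type = META_TYPES[item.model][item.concept]
    for concept, target_meta in META_TYPES[target_model].items():
        if target_meta == meta_type:
            return SchemaItem(target_model, concept, item.name)
    raise LookupError(f"{target_model} has no concept of meta-type {meta_type}")


person_class = translate(SchemaItem("er", "entity_type", "Person"), "oo")
```

Adding a third model to this sketch only requires one more entry in `META_TYPES`; no pairwise translation rules are written.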
There may be a semantic gap between the intermediate model and the models used by the members of the federation. A common case arises when relational databases are members of a federated database system. The information content of a relational schema is not sufficient to construct a conceptual schema. For example, an entity relationship schema specifies a set of entity types or object types, whereas a relational schema has no explicit representation of object types. The relation schemes that carry properties of the same object type must therefore be identified and translated to a single entity type.
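To make the identification step concrete, the sketch below uses one simple heuristic: relation schemes with the same primary key are assumed to describe the same object type. This heuristic, the function `group_into_entity_types`, and the sample relations are assumptions for illustration and should not be read as the method of the cited work.

```python
def group_into_entity_types(relations):
    """Merge relation schemes that share a primary key into one entity type.

    relations: {relation_name: {"key": (attr, ...), "attrs": [attr, ...]}}
    """
    # Assumed heuristic: identical primary key means same object type.
    by_key = {}
    for name, relation in relations.items():
        by_key.setdefault(frozenset(relation["key"]), []).append(name)

    entity_types = []
    for key, names in by_key.items():
        attributes = []
        for name in names:
            for attr in relations[name]["attrs"]:
                if attr not in attributes:  # keep first-seen order, no duplicates
                    attributes.append(attr)
        entity_types.append({
            "relations": sorted(names),
            "key": sorted(key),
            "attributes": attributes,
        })
    return entity_types


# Hypothetical relational schema.
relations = {
    "person": {"key": ("ssn",), "attrs": ["ssn", "name"]},
    "person_address": {"key": ("ssn",), "attrs": ["ssn", "city"]},
    "dept": {"key": ("dno",), "attrs": ["dno", "dname"]},
}
entity_types = group_into_entity_types(relations)
```

In this example `person` and `person_address` collapse into one entity type with attributes `ssn`, `name`, and `city`, while `dept` remains on its own.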
A method for the construction of a conceptual schema describing a relational database is presented in Extracting an Entity Relationship Schema from a Relational Database through Reverse Engineering (Andersson-94). The underlying assumption is that the data semantics is reflected in the way the data is used. The method is divided into an information acquisition phase, in which the application code is used to extract knowledge about the relational database system, followed by a schema generation phase. The method is supported by an interactive tool.
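One way to picture information acquisition from application code is to scan the SQL statements it contains for equi-join conditions, since columns that are joined on are candidates for links between relations. The sketch below is an assumption about what such a step might look like, not the procedure of the cited paper; the regular expression, the function `equi_joins`, and the sample query are hypothetical, and table aliases are not resolved.

```python
import re

# Matches conditions of the form table.column = table.column.
JOIN_CONDITION = re.compile(
    r"\b([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*=\s*([A-Za-z_]\w*)\.([A-Za-z_]\w*)"
)


def equi_joins(sql):
    """Return the set of ((table, column), (table, column)) pairs joined in sql."""
    return {
        ((t1.lower(), c1.lower()), (t2.lower(), c2.lower()))
        for t1, c1, t2, c2 in JOIN_CONDITION.findall(sql)
    }


# Hypothetical query as it might appear in application code.
query = (
    "SELECT emp.name FROM emp, dept "
    "WHERE emp.dno = dept.dno AND dept.city = 'Dijon'"
)
joins = equi_joins(query)
```

The comparison with the string literal is ignored, so `joins` contains only the pair relating `emp.dno` to `dept.dno`. A real tool would need a proper SQL parser rather than a regular expression.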
(Andersson-93) M. Andersson, Y. Dupont, S. Spaccapietra, K. Yetongnon, M. Tresch, H. Ye, ``Femus: A Federated Multilingual Database System'', in Advanced Database Systems, LNCS 759, N.R. Adam, B.K. Bhargava [Eds], Springer-Verlag, New York, 1993.
(Andersson-94) M. Andersson, ``Extracting an Entity Relationship Schema from a Relational Database through Reverse Engineering'', in Proceedings of the 13th Int. Conf. on the Entity Relationship Approach, Manchester, UK, Dec. 1994.
(Dennebouy-95) Y. Dennebouy, M. Andersson, A. Auddino, Y. Dupont, E. Fontana, M. Gentile, S. Spaccapietra, ``SUPER: Visual Interfaces for Object Relationship Data Models'', in Journal of Visual Languages and Computing, Special Issue on Visual Query Languages, 1995.
(Nicolle-95) C. Nicolle, ``TIME - Traducteur Intelligent avec Metamodele Extensible'', Technical Report No. 95/05, University of Burgundy, May 1995.
(Sheth-90) A. Sheth, J. Larson, ``Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases'', in ACM Computing Surveys, 22(3), September 1990.
(Spaccapietra-91) S. Spaccapietra, C. Parent, ``Conflicts and Correspondence Assertions in Interoperable Databases'', in SIGMOD Record, 20(4), December 1991.
(Spaccapietra-92) S. Spaccapietra, C. Parent, Y. Dupont, ``Model-Independent Assertions for Integration of Heterogeneous Schemas'', in Very Large Database Journal, 1(1), July 1992.