
Virtual electronic market places and virtual enterprises have become important applications for query processing [Jhi00]. Building a scalable virtual B2B market place with hundreds or thousands of participating suppliers requires highly flexible, distributed query processing capabilities. Architecting an electronic market place as a data warehouse by integrating all the data from all participating enterprises in one centralized data repository incurs severe problems:
We propose a reference architecture for building scalable and dynamic market places and a framework for evaluating so-called HyperQueries in such an environment. HyperQueries are essentially query evaluation sub-plans "sitting behind" hyperlinks. This way the electronic market place can be built as an intermediary between the client and the providers executing their sub-queries referenced via hyperlinks. The hyperlinks are embedded as attribute values within data objects of the intermediary's database. Retrieving such a virtual object automatically initiates the execution of the referenced HyperQuery in order to materialize the entire object. Thus, sensitive data can remain under the full control of the data providers. Instead of replicating the data at the intermediary, only the hyperlink is embedded.
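The materialization step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the QueryFlow implementation: the function `materialize`, the stand-in `fake_supplier`, the sample row, and the price value are all hypothetical, and the `hq://` prefix follows the example link given later in this text.

```python
from typing import Callable, Dict

HQ_PREFIX = "hq://"  # assumption: hyperlinks are recognizable by this prefix


def materialize(obj: Dict[str, object],
                run_hyperquery: Callable[[str], object]) -> Dict[str, object]:
    """Return a copy of obj in which each hyperlink attribute is replaced
    by the result of executing the HyperQuery it references."""
    complete: Dict[str, object] = {}
    for name, value in obj.items():
        if isinstance(value, str) and value.startswith(HQ_PREFIX):
            # Virtual attribute: the value is computed at the provider's site.
            complete[name] = run_hyperquery(value)
        else:
            # Ordinary attribute stored at the intermediary.
            complete[name] = value
    return complete


def fake_supplier(link: str) -> float:
    """Hypothetical stand-in for a remote provider; the price is made up."""
    prices = {"hq://supplier1.com/Price?ProdID=1255": 17.5}
    return prices[link]


# Hypothetical row of the intermediary's database: only the link is stored.
catalog_row = {
    "ProductDescription": "Battery, 12V",
    "Supplier": "supplier1.com",
    "Price": "hq://supplier1.com/Price?ProdID=1255",
}
complete_row = materialize(catalog_row, fake_supplier)
```

The intermediary's row is left unchanged, so the price data itself never has to be replicated there; it is fetched each time the object is retrieved.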
We give a short overview of the basic architecture of our implementation, the QueryFlow system [1]. The proposed architecture is intended as a reference architecture for building open, extensible, and scalable electronic B2B market places. Furthermore, we demonstrate the execution of queries in such an environment. Figure 1 sketches how hyperlinks of the market place refer to HyperQueries at the remote hosts. The remote hosts can implement their HyperQueries in different ways: first, a remote host can state an SQL query; second, a complex business application or even human input can be integrated through special wrappers; and finally, a remote host can delegate the query to further hosts.
| Figure 1: HyperQueries are referenced by Hyperlinks |
We propose a reference architecture for building scalable electronic market places. During the implementation of our prototype system, we paid special attention to relying on standardized (or proposed) protocols: XML [BPSMM00] and XML Schema [XML00] for all data being processed, SQL for querying data [2], X.509 certificates [HFPS99] and XML Signature [XML01] for authentication, and HTTP [FGM+99] and SOAP [BEK+00] for exchanging data between multiple hosts. Figure 2 depicts the basic components of the system, which can be described as follows:
| Figure 2: The Architecture of the QueryFlow System |
We demonstrate the HyperQuery technique with a scenario of the car manufacturing industry. We assume a hierarchical supply chain of suppliers and sub-contractors. A typical process of e-procurement to cover unscheduled demands of the production is to query a market place for these products and to select the incoming offers by price, terms of delivery, available quantity, etc. The price of the needed products can vary by customer/supplier-specific sales discounts, the quantity of materials to be provided, duties, plant utilization, etc.
In traditional distributed query processing systems, such a query can only be executed if a global schema exists or if all local databases are replicated at the market place. In an environment where hundreds of suppliers participate in a market place, one global query that integrates the sub-queries for all participants would be too complex and error-prone.
Following our approach, suppliers register their products at the market place they want to participate in and specify, by means of hyperlinks, the sub-plans that compute the price information at their sites. These hyperlinks to sub-plans are embedded as virtual attributes into the tables of the market place. For instance, hq://supplier1.com/Price?ProdID=1255 refers to supplier1.com and requests that the remote host perform the calculation for ProdID=1255 with the sub-plan named Price.
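The structure of such a hyperlink can be made explicit with a small parser. This is a minimal sketch that assumes the link follows generic URI syntax (host, path, query string), as the example suggests; the names `HyperQueryLink` and `parse_hyperquery_link` are hypothetical and not part of the QueryFlow system.

```python
from typing import Dict, NamedTuple
from urllib.parse import parse_qsl, urlsplit


class HyperQueryLink(NamedTuple):
    host: str                # remote host that evaluates the sub-plan
    subplan: str             # name of the sub-plan at that host
    params: Dict[str, str]   # parameters passed to the sub-plan


def parse_hyperquery_link(link: str) -> HyperQueryLink:
    """Split an hq:// link into host, sub-plan name, and parameters."""
    parts = urlsplit(link)
    if parts.scheme != "hq" or not parts.hostname:
        raise ValueError(f"not a HyperQuery link: {link!r}")
    subplan = parts.path.lstrip("/")
    if not subplan:
        raise ValueError(f"link names no sub-plan: {link!r}")
    return HyperQueryLink(parts.hostname, subplan, dict(parse_qsl(parts.query)))


link = parse_hyperquery_link("hq://supplier1.com/Price?ProdID=1255")
# link.host is "supplier1.com", link.subplan is "Price",
# and link.params is {"ProdID": "1255"}
```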
The SQL-like query

    select p.ProductDescription, c.Supplier, c.Price
    from NeededProducts p, Catalog@MarketPlace c
    where p.ProductDescription = c.ProductDescription
    order by p.ProductDescription, c.Price
    expires Friday, May 18, 2001 5:00:00 PM CET

returns the prices and suppliers of all needed products. Query execution stops no later than the time given in the expires clause; only the results gathered by then are considered.
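The effect of the expires clause can be illustrated with a small collector that keeps only the results arriving before a deadline. This is a sketch under assumptions, not the system's actual mechanism: `collect_until`, the injectable clock, and the sample offers are hypothetical, and a real system would also need timeouts on a blocked result stream.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def collect_until(results: Iterable[T],
                  expires_at: float,
                  now: Callable[[], float] = time.monotonic) -> List[T]:
    """Gather results until the deadline; later results are dropped."""
    gathered: List[T] = []
    for item in results:
        if now() >= expires_at:
            break  # deadline reached: only results gathered so far count
        gathered.append(item)
    return gathered


def offers():
    """Hypothetical stream of (supplier, price) results."""
    yield ("supplier1.com", 17.5)
    yield ("supplier2.com", 16.9)
    yield ("supplier3.com", 18.2)


# A fake clock makes the example deterministic: one tick per arriving result.
ticks = iter([0.0, 1.0, 2.0])
on_time = collect_until(offers(), expires_at=2.0, now=lambda: next(ticks))
# on_time holds the first two offers; the third arrives at the deadline.
```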
When evaluating these hyperlinks, our QueryFlow system distinguishes between two modes. In hierarchical mode (Figure 3(a)), the initiator of a HyperQuery is in charge of collecting the processed data. In broadcast mode (Figure 3(b)), data objects are routed directly to the query initiator. The two basic patterns for both modes are shown in Figure 4(a)/(b), where the smaller boxes represent HyperQueries and the Dispatch operator is responsible for routing objects to the HyperQuery given by the hyperlink. The decision as to which processing mode is used rests, with some restrictions, solely with the initiator of a HyperQuery. Thus, the initiator determines whether the results are sent directly to the client or whether the initiator itself collects the objects processed by the HyperQueries. During the execution of one query, both processing modes can be mixed and nested to obtain more complex, multi-level scenarios. HyperQueries may therefore be arbitrarily complex and may involve sub-contractors, too.
As our system is written in Java and provides secure extensibility, user-defined operators can be integrated into the query plan and are loaded on demand. Thus, special wrappers for accessing legacy systems, applications, JDBC databases, XML data sources, or even human input can be used within the HyperQueries. All our operators are tuned for processing mass data and are pipelined, and the communication operators work push-based, i.e., objects are sent to the next sub-plan as soon as local processing is finished. A detailed description of HyperQuery processing, including security, optimization issues, and implementation details, can be found in [KW01a,KW01b].
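The difference between the two modes can be sketched with a toy dispatcher. This is a simplified, single-process illustration under assumptions: `execute`, `target_of`, `priced`, the second supplier, and all prices are hypothetical, and routing by the link text before the `?` is only one conceivable way to identify the target HyperQuery.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
HyperQuery = Callable[[Row], Row]


def target_of(link: str) -> str:
    """Identify the HyperQuery a link refers to (assumed: text before '?')."""
    return link.split("?", 1)[0]


def execute(rows: List[Row], link_attr: str,
            hyperqueries: Dict[str, HyperQuery],
            mode: str, client: List[Row]) -> List[Row]:
    """Dispatch each row to its HyperQuery and return what the initiator holds."""
    if mode not in ("hierarchical", "broadcast"):
        raise ValueError(f"unknown mode: {mode!r}")
    collected: List[Row] = []
    for row in rows:
        # Dispatch: route the object to the HyperQuery named by its hyperlink.
        hyperquery = hyperqueries[target_of(str(row[link_attr]))]
        result = hyperquery(row)
        if mode == "broadcast":
            client.append(result)      # result bypasses the initiator
        else:
            collected.append(result)   # initiator collects the result
    return collected


def priced(amount: float) -> HyperQuery:
    """Hypothetical sub-plan that fills in a fixed, made-up price."""
    return lambda row: {**row, "Price": amount}


hyperqueries = {"hq://supplier1.com/Price": priced(17.5),
                "hq://supplier2.com/Price": priced(16.9)}
rows = [{"Supplier": "supplier1.com", "Price": "hq://supplier1.com/Price?ProdID=1255"},
        {"Supplier": "supplier2.com", "Price": "hq://supplier2.com/Price?ProdID=1255"}]

client_inbox: List[Row] = []
at_initiator = execute(rows, "Price", hyperqueries, "hierarchical", client_inbox)
direct_inbox: List[Row] = []
left_at_initiator = execute(rows, "Price", hyperqueries, "broadcast", direct_inbox)
```

In the hierarchical run the initiator holds both results and can post-process them before passing them on; in the broadcast run the same results land in the client's inbox without passing through the initiator.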
| Figure 3: Execution Traces: (a) Hierarchical Processing, (b) Broadcast Processing (The dashed (red) lines indicate the flow of control and intermediate results, the solid (black) lines indicate the flow of result objects) |
| Figure 4: Patterns for HyperQuery Execution: (a) Hierarchical Mode, (b) Broadcast Mode |
The execution of the query in our QueryFlow system is demonstrated here.
The following publications are available:
| BEK+00 | D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn, H. F. Nielsen, S. Thatte, and D. Winer. Simple Object Access Protocol (SOAP) 1.1. http://www.w3.org/TR/SOAP, May 2000. |
| BKK+01 | R. Braumandl, M. Keidl, A. Kemper, D. Kossmann, A. Kreutz, S. Seltzsam, and K. Stocker. ObjectGlobe: Ubiquitous query processing on the Internet. The VLDB Journal: Special Issue on E-Services, 10(3):48--71, August 2001. |
| BPSMM00 | T. Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler. Extensible Markup Language (XML) 1.0 (Second Edition). http://www.w3.org/XML/, October 2000. |
| CFR+01 | D. Chamberlin, D. Florescu, J. Robie, J. Simeon, and M. Stefanescu. XQuery: A Query Language for XML. http://www.w3.org/TR/xquery/, June 2001. |
| FGM+99 | R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1. ftp://ftp.isi.edu/in-notes/rfc2616.txt, June 1999. |
| HFPS99 | R. Housley, W. Ford, W. Polk, and D. Solo. Internet X.509 Public Key Infrastructure Certificate and CRL Profile. http://www.rfc-editor.org/rfc/rfc2459.txt, January 1999. |
| Jhi00 | A. Jhingran. Moving up the food chain: Supporting E-Commerce Applications on Databases. ACM SIGMOD Record, 29(4):50--54, December 2000. |
| KW01a | A. Kemper and C. Wiesner. HyperQueries: Dynamic Distributed Query Processing on the Internet. In Proc. of the Conf. on Very Large Data Bases (VLDB), pages 551--560, Rome, Italy, September 2001. |
| KW01b | A. Kemper and C. Wiesner. HyperQueries: Dynamic Distributed Query Processing on the Internet. Technical report, Universität Passau, Fakultät für Mathematik und Informatik, October 2001. |
| XML00 | XML Schema, April 2000. http://www.w3.org/xml/Schema. |
| XML01 | XML Signature, August 2001. http://www.w3.org/TR/2001/PR-xmldsig-core-20010820/. |
[2] We plan to support XQuery [CFR+01], too.