Order-Preserving Hash Joins: Sorting (Almost) For Free


download complete text (postscript, 180k)

Authors:
Jens Claußen
Alfons Kemper
Donald Kossmann
Abstract:
Database systems must be able to produce ordered query results; either to pass them to application programs or end users or to compute standard or modern database operations for decision support such as aggregation, moving sums, moving averages, top N or bottom N. In this paper, we present a new approach that significantly speeds up the ability of database systems to produce ordered query results for join queries. We call the key element used in this approach order-preserving hash joins (or OHJ for short). Just like the well-known (index) nested-loop join method, OHJ preserves the order of one of its input tables, and thus, OHJ makes it possible to use indices or early sort operations in order to carry out the ordering part of a query very cheaply. Other than nested-loop joins, however, OHJs show good performance even if the tables involved in the query are very large, and thus, OHJs are also able to carry out the joining part of a query efficiently. We discuss a series of examples for which order-preserving hash joins are applicable and present the results of performance analyses which demonstrate that order-preserving hash joins significantly speed up the execution of many important classes of decision support queries.

Last modified: Wed Nov 10 12:00:48 MET 1999