|
TU München - Fakultät für
Informatik
Lehrstuhl III: Datenbanksysteme
|
|
The BestmatchJoin Operator
Introduction
The Internet constitutes a huge distributed information source. New
expressive query operators are needed to generate "interesting"
data combining these data sources. A common problem is finding best
matching pairs of data objects given user-defined criteria.
Traditional techniques (e.g., nearest-beighbor search, closest-point
search, or methods based on rating functions) do not give satisfying
results, because a single "best" pair cannot be determined, since
diverse pairs, each being best in different aspects of the comparison,
are interesting. For example, a closest-point search only produces
those pairs, which have the minimum distance, but not pairs being best
in one dimension and being worse in other dimensions.
This drawback of traditional techniques is solved by our novel class
of BestmatchJoin-operators. The result of these operators are pairs of
objects, which are not dominated by other pairs. A pair of objects is
dominated by another pair of objects, if it is worse in all comparison
dimensions. Hence, all comparison dimensions are treated equally. The
domination of pairs is represented by a partial order, which consists
of user-defined comparison functions on the individual dimensions. For
instance, the similarity on a specific dimension can be calculated by
the distance, set inclusion, etc.

Figure 1: Result of the BestmatchJoin Operator
Figure 1 shows an example for a 2-dimensional BestmatchJoin of two
sets. The points of the two input datasets are marked by circles and
crosses. One pair dominates another, if the distance in both
dimensions is smaller. The grey shaded pairs depict the result of the
BestmatchJoin operator.
Application Scenarios
Some sample applications scenarios are for instance:
- Matching of job seekers with job opportunities based on
user-defined criteria like achieved and needed qualifications,
distance to travel, degrees, experience, etc. Using the
BestmatchJoin not only the best job seeker for a job opportunity
is found, but all incomparable (and therefore equal) job
seekers for a special job opportunity are computed.
- Assembling a composite part. The composite part shall be made
of subparts, each of which is provided by different vendors
with different tolerances, price, etc. The composite part shall for
instance be rated by the overall tolerance and the price. The
perfect composite part might not exist, but using the
BestmatchJoin the cheapest composite part with a high
tolerance, a more expensive one with a better tolerance, etc. can
be computed.
Research Topics
- Variants of the BestmatchJoin: Outer-BestmatchJoin-Variants,
BestmatchJoin with restrictions
- Algorithms for the class of BestmatchJoin-Operators
- Algorithms for computing BestmatchJoin-Operators on streams
- Optimization and Filtering techniques
People
Lehrstuhl für Datenbanksysteme
Letzte Änderung:
25.05.2005 um 14:37:48