I/O Cost and Buffer Requirements for Various Join Implementation

Author: Christoph F. Eick
Source (in part): He. Garcia-Molina,...:"Database System Implementation"

Assumptions: Cost to generate the output relation are not considered; that means, the minimum required buffer size is 2 blocks (one block for each of the two relations to be joined). The two relations to be joined are R and S, and B(S) and B(R) denote the number of blocks of S, and R, and T(S) and T(R) denote the number of tupels of R and S, respectively. Moreover, we assume B(S)<=B(R); moreover, we assume that R has a hashed index on the join attribute, and that each value of R's join attribute occurs with frequency f.

Join VariantBuffersizeApprox. Cost
Block Nested Loop JoinB(S)+1B(R)+B(S)
Block Nested Loop JoinM>=2B(R)*B(S)/(M-1)
Simple Sort Merge Joinsqrt(B(S))5*(B(R)+B(S))
Sort Merge Joinsqrt(B(S)+B(R))3*(B(R)+B(S))
Hash Joinsqrt(B(S))3*(B(R)+B(S))
Hybrid Hash Joinsqrt(B(S))(3 - 2M/B(S))*(B(R)+B(S))
Index Nested Loop JoinM>=3B(S) + T(S)*(1+f)

Database Systems and Joins they support

Database System Supported Joins
Sybase ASEindex nested loop, sort merge.
Sybase ASIQblock nested loop, index nested loop, hash, sort merge, join indexes.
Oracle 8block nested loop, sort merge, variant hybrid hash.
IBM DB2block nested loop, sort merge, hybrid hash.
Microsoft SQL Serverblock nested loop, index nested loop, hash, sort merge, hash teams.
Informixblock nested loop, index nested loop, hybrid hash.

last updated: March 23, 2000, 2:22p