Assumptions: Cost to generate the output relation are not considered; that means, the minimum required buffer size is 2 blocks (one block for each of the two relations to be joined). The two relations to be joined are R and S, and B(S) and B(R) denote the number of blocks of S, and R, and T(S) and T(R) denote the number of tupels of R and S, respectively. Moreover, we assume B(S)<=B(R); moreover, we assume that R has a hashed index on the join attribute, and that each value of R's join attribute occurs with frequency f.
Join Variant | Buffersize | Approx. Cost |
---|---|---|
Block Nested Loop Join | B(S)+1 | B(R)+B(S) |
Block Nested Loop Join | M>=2 | B(R)*B(S)/(M-1) |
Simple Sort Merge Join | sqrt(B(S)) | 5*(B(R)+B(S)) |
Sort Merge Join | sqrt(B(S)+B(R)) | 3*(B(R)+B(S)) |
Hash Join | sqrt(B(S)) | 3*(B(R)+B(S)) |
Hybrid Hash Join | sqrt(B(S)) | (3 - 2M/B(S))*(B(R)+B(S)) |
Hash Index-based Join | M>=3 | B(S) + T(S)*(1+f) |