In our previous installments we implemented two of the most well-known self-balancing binary search trees: AVL and Red-black trees.
We had a few classes on AVL trees in our basic data structures & algorithms class back in college, which made its implementation far less of a challenge than the Red-black tree. So besides the fundamental guidance of CLRS I had to do quite some googling to get it working. While googling I noticed there were quite a lot of questions about which (AVL or RB) tree was “better” in some sense, be it insertion, search time, deletion time, etc. Most textbooks and articles dismiss this question just by stating the factor differences in either trees’ worst case heights, as we briefly mentioned in the past installment. If you’re anything like me, however, you’ll want to see some comparisons where the trees are actually tested. So I decided to do some simple benchmarking to test those theoretical worst-cases. Here’s what I found out.
First off, we need at least two cases: worst and average case. As we know from the previous installments, the worst possible case for BST insertion is when you are inserting continuously increasing or decreasing values, e.g. 1, 2, 3, 4, … . In this case, a pure BST would behave exactly like a (doubly) linked list, while self-balancing trees should should spread out node distribution. The worst possible searches would be the top or bottom values, i.e. those close to the end of the “list”: a pure BST would have to traverse the entire list (n time), while self-balancing trees should enjoy a
time with some factor k.
What would an “average case” look like? Hard to say; depend on what is average for your application. It might just be the case that sequences are the average case. Since we can’t define a “universal” average case and for the sake of simplicity, we will define the average case as a sequence of random numbers drawn from C’s rand() function (one might argue that this is actually the “best” case since on the long run the BST will “naturally” become quite reasonably balanced, but let’s not get picky about terminology).
Figure 1
Continue reading “Trees, part IV – Benchmarking Red-black and AVL trees” →