## Trees, Part II: AVL Tree

Masters classes started a few weeks ago, taking their toll on my productivity here. Sorry about that!

So we (pardon the nosism, but I think it sounds less egocentric than writing “I” all the time) hinted at AVL trees back on our Trees, Part I post. Specifically, we learned that:

a binary search tree (BST), provides O(h) time search, insert and delete operations (h is the tree height.

Linear time (O(h)) doesn’t sound very good – if h is close to n, we’ll have the same performance as a linked list. What if there were a way to bound the tree height to some sub-linear factor? As it turns out, there are several ways to do so, and the general idea of somehow keeping the tree height limited to a certain factor of the number of elements it holds is called height balancing. Ergo we’ll want to look into (height) balanced/self-balancing binary search trees (BBST)

```
Burger

M
.   .
.       .
.           .
.               .
E .                 P .
.     .                   .
.         .                   .
.             .                   .
D .                 I                   Y
.
.
.
.
F
```

AVL tree

Since binary search trees have at most two children, the best tree height (i.e. smallest) we can achieve is log2 n (n being the number of elements in the tree). There are several self-balancing BSTs developed over the years. It seems that up there in the US college professors tend to prefer the red-black tree when studying BBSTs, whilst over here AVL is preferred. In any case, AVL tree was the first BBST ever devised, so we’ll adopt it as our BBST model.

AVL trees (named after its two Soviet inventors Adelson-Velsky and Landis) use a series of rotations to keep the tree balanced. To keep track of when a certain subtree rooted at some node needs to be rotated, we maintain (or calculate) a balance factor variable for each node, which is the difference between the node’s left and right children’s heights, i.e.:

balance_factor(n) = n.left_child.height – n.right_child.height

## Shortest path, part I – Dijkstra’s algorithm

Now that we have a way to represent graphs, we can discuss one of the most important problems in graph theory: the shortest path problem (SPP). More or less formally, we’ll define SPP as:

Given a weighted graph G(V,E), find the sequence P = {v0, v1, v2, …, v(n-1)}, vi ∈ V, from vertex V0 to vertex V(n-1), such that the list of edges EP = {(v0,v1), (v1,v2), … (v(n-2), v(n-1))} exists and the summation of costs of all elements e ∈ EP is the smallest possible.

In other words, find the less expensive (ergo “shortest”) path between two vertices.

The trivial solution is using BFS starting at vertex A and stopping when it reaches vertex B. However, BFS doesn’t look at the edge costs: it calculates the path with least edges, not the path with least total cost.

Although not necessarily the fastest, Dijkstra’s algorithm is probably the most popular way to solve the shortest path problem due to its simplicity and elegance. The algorithm relies heavily on priority queues, so make sure to take a look at that if you haven’t already.

Pseudocode

```dist[from] = 0
for v : G
if v != source
dist[v] = infinity
prev[v] = -1