NGenerics Overview - Binary Trees and Binary Search Trees

Previous instalments

General Data Structures

HashList

Trees

GeneralTree and the Visitor Pattern

If you haven’t read my previous post on General Trees, it might be a good idea to do so before reading further.

A Binary Tree is a refined version of the General Tree that limits the number of children each node can have - two to be exact. Child nodes are thus referred to as the Left and Right child nodes. The BinaryTree<T> class in NGenerics provides an implementation of a Binary Tree, which, when simplified, looks something like this :

public class BinaryTree
{
    BinaryTree Left { get; set; }
    BinaryTree Right { get; set; }

    T Data { get; set; }
}

We can represent a simple binary tree visually like this:

Binary tree

Binary Trees support pre-order and post-order traversal as discussed in my previous post, and also support in-order traversal. An in-order traversal visits the left node, the root node and lastly the node on the right. When supplied with an InOrderVisitor instance, the visitor will visit tree nodes in the order (7, 4, 8, 2, 5, 1, 3, 9, 6, 10) :

In order traversal

As with General Trees, the BinaryTree<T> class supports breadth first traversal and has ICollection<T> semantics.

Binary Search Trees

Search trees attempt to address the problem of reducing the number of comparisons in finding items. While not as efficient as Hash tables with their O(1) performance, they do cut down search times over larger data sets. Search trees are able to do this by adding constraints to tree data structures. A Binary Search Tree is the simplest of search trees with the following rules in place:

Each value in the tree is distinct from any other (unique).
Each left child node has a value smaller than the value of its parent.
Each right child node has a value larger than the value of its parent.

The BinarySearchTree<TKey, TValue> class in NGenerics implements the IDictionary<TKey, TValue> interface for simple lookups based on keys, and supports the usual traversal algorithms. Re-using our example tree and changing it to comply to the rules could result in the following tree:

Binary tree

It’s easy to see how a Binary Tree can cut down search times - when using a sequential data structure (array, list), or when manually searching an unordered tree like the plain BinaryTree, we potentially need to search through every item in the collection before we find it with a worst-case performance of O(n). With Binary Search Trees we can cut down that time by comparing node values with the search value and deciding on which branch to continue searching on - enough to make searching a Binary Trees on average an O(log n) operation. To illustrate, searching for the value “3” in our sample tree will result in 4 comparisons :

Binary search tree

Although Binary Trees are not nearly as efficient as hash tables with regards to searching (assuming that you’ve implemented a proper hash function), Binary Search Trees (and most other search trees) provide an enormous benefit over hash tables in that we can traverse items in sorted order without any preprocessing. We can use an in-order traversal to yield items in sorted order:

In order traversal

Binary Trees do suffer from a couple of worst-case performance scenarios where there performance could degrade to O(n) in searches. Adding items to the tree in ascending order will result in the following worst-case scenario:

Binary search tree worst case

Adding items in descending order will result in a tree that grows in the opposite direction, with the same devastating effect. Self-balancing trees address this problem but that’s a topic for a future post. If the data you’re pumping into the tree is random you’ll probably never hit this situation - but if you have data with some order in it a self-balancing tree might be a better fit.

Photo by Andrew Haimerl on Unsplash

ngenerics datastructures

See also