Trees (5 hrs)
A tree is a hierarchical data structure used in computer science to represent relationships among elements in a way that mimics natural hierarchies. Trees consist of nodes connected by edges, and they are widely used for searching, sorting, and representing hierarchical data, such as organizational structures, file systems, and more.
1. Introduction to Trees
In computer science, a tree is a collection of nodes where each node contains a value and has references (or pointers) to its child nodes. A tree structure is used for various applications such as representing hierarchical structures, expression parsing, database indexing, and more.
Basic Tree Terminology:
- Root: The top node of the tree, which does not have any parent.
- Node: A fundamental unit that stores data and links to child nodes.
- Edge: A connection between two nodes.
- Leaf: A node with no children.
- Subtree: A tree formed by a node and all of its descendants.
- Parent: A node that has child nodes.
- Child: A node directly connected to another node when moving away from the root.
- Sibling: Nodes that share the same parent.
Types of Trees:
- Binary Tree: A tree where each node has at most two children (left and right).
- Binary Search Tree (BST): A binary tree where the left child is smaller than the parent node, and the right child is larger.
- AVL Tree: A self-balancing binary search tree where the height of the two child subtrees of every node differs by no more than one.
- B-Tree: A self-balancing search tree designed for systems that read and write large blocks of data.
2. Basic Operations in Binary Tree
A binary tree is a tree where each node has at most two children, typically referred to as the left child and the right child. Basic operations performed on binary trees include:
Insertion:
- Inserting a node into a binary tree generally involves finding the appropriate place for the new node. In a binary search tree, for example, the node is inserted in the correct position based on its value.
class TreeNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def insert(root, value):
if root is None:
return TreeNode(value)
if value < root.value:
root.left = insert(root.left, value)
else:
root.right = insert(root.right, value)
return root
Search:
- Searching for a node in a binary search tree involves comparing the value to be searched with the root node. Based on the comparison, we either move left or right recursively.
def search(root, value):
if root is None or root.value == value:
return root
if value < root.value:
return search(root.left, value)
return search(root.right, value)
Deletion:
- Deletion in a binary tree can be complex. If the node to be deleted has:
- No children: Simply remove the node.
- One child: Replace the node with its child.
- Two children: Replace the node with its in-order successor (or predecessor).
3. Tree Search and Insertion/Deletion
Tree Search:
- Search is the process of finding a specific value in the tree.
- In a binary search tree, the search follows the property that all nodes on the left subtree are smaller and all nodes on the right subtree are larger than the root node. This allows for efficient search operations.
Insertion and Deletion:
- Insertion and Deletion in a binary search tree are recursive operations that maintain the tree’s structure. Insertion involves placing the new value in the correct position, while deletion involves re-arranging the tree to maintain its properties.
4. Binary Tree Traversals
Tree traversal is the process of visiting all the nodes in a tree and performing some operation on each. There are three common types of traversals in a binary tree:
1. Pre-order Traversal:
In pre-order traversal, the root node is processed first, followed by the left subtree, and then the right subtree.
def pre_order_traversal(root):
if root:
print(root.value, end=' ')
pre_order_traversal(root.left)
pre_order_traversal(root.right)
2. In-order Traversal:
In in-order traversal, the left subtree is processed first, followed by the root node, and then the right subtree. This traversal is particularly useful in binary search trees, as it processes the nodes in ascending order.
def in_order_traversal(root):
if root:
in_order_traversal(root.left)
print(root.value, end=' ')
in_order_traversal(root.right)
3. Post-order Traversal:
In post-order traversal, the left subtree is processed first, followed by the right subtree, and then the root node. This is useful for tasks like deletion, where you need to delete the children before the parent.
def post_order_traversal(root):
if root:
post_order_traversal(root.left)
post_order_traversal(root.right)
print(root.value, end=' ')
5. Tree Height, Level, and Depth
-
Height of a Tree: The height of a tree is the length of the longest path from the root to any leaf. The height of a tree with no nodes is considered to be -1, and the height of a tree with only one node is 0.
-
Level of a Node: The level of a node is the number of edges from the root to the node.
-
Depth of a Node: The depth of a node is the same as the level of the node. It represents the distance from the root to the node.
def height(root):
if root is None:
return -1
left_height = height(root.left)
right_height = height(root.right)
return max(left_height, right_height) + 1
6. Balanced Trees: AVL Trees
A balanced tree is a tree structure in which the height of the left and right subtrees of any node differ by at most one. This ensures that the tree remains balanced, leading to more efficient operations (search, insertion, and deletion).
AVL Trees:
An AVL tree (named after its inventors Adelson-Velsky and Landis) is a self-balancing binary search tree. It ensures that the height difference between the left and right subtrees of any node is no more than one. If this balance property is violated, rotations are used to restore balance.
Balancing Algorithm:
- Left Rotation: Used when the right subtree is taller than the left.
- Right Rotation: Used when the left subtree is taller than the right.
- Left-Right Rotation: A combination of left and right rotations used when there is a left-heavy subtree on the right side.
- Right-Left Rotation: A combination of right and left rotations used when there is a right-heavy subtree on the left side.
# Simple example of rotations in AVL trees (not a complete implementation)
def left_rotate(x):
y = x.right
x.right = y.left
y.left = x
return y
7. Huffman Algorithm
The Huffman coding algorithm is used for lossless data compression. It assigns variable-length codes to input characters, with shorter codes assigned to more frequent characters. The algorithm uses a binary tree, known as a Huffman tree, to assign these codes.
Steps:
- Build a frequency table for the characters.
- Construct a priority queue (min-heap) where the lowest frequencies are given the highest priority.
- Build the Huffman tree by repeatedly merging the two smallest trees in the queue.
- Assign binary codes based on the tree structure.
8. Game Trees
A game tree is a tree representation of possible moves in a game. The nodes represent game states, and the edges represent player moves. Game trees are widely used in algorithms for games like chess, checkers, and tic-tac-toe.
Minimax Algorithm:
The minimax algorithm is used to minimize the possible loss for a worst-case scenario. It is a recursive algorithm for choosing the optimal move in a two-player game. The algorithm simulates all possible moves and selects the best one.
9. B-Trees
A B-tree is a self-balancing tree data structure that maintains sorted data and allows for efficient insertion, deletion, and search operations. B-trees are used in databases and file systems due to their efficiency in handling large amounts of data.
B-tree Properties:
- Each node can have multiple children (more than two).
- Nodes store multiple keys, which help in narrowing down searches.
- All leaf nodes are at the same level, ensuring balanced growth.
Conclusion
Trees are a fundamental data structure that plays a crucial role in organizing and accessing data efficiently. Operations like search, insertion, and deletion are optimized in tree structures such as binary trees and B-trees. Advanced topics like AVL trees, Huffman coding, and game trees provide practical solutions for more complex computational problems, ensuring faster processing, better data management, and improved performance across a wide range of applications.