Tree Data Structure Java Library

Description

This Library contains different implementations of the tree data structures, such as K-ary, binary, expression trees etc.

Requirements

The Library requires Java SE Development Kit 7 or higher

Gradle dependency

dependencies {
	compile 'com.scalified:tree:0.2.5'
}

Theory

Definition

A tree data structure can be defined recursively (locally) as a collection of nodes (starting at a root node), where each node is a data structure consisting of a value, together with a list of references to nodes (the children), with the constraints that no reference is duplicated, and none points to the root

A tree is a (possibly non-linear) data structure made up of nodes or vertices and edges without having any cycle. The tree with no nodes is called the null or empty tree. A tree that is not empty consists of a root node and potentially many levels of additional nodes that form a hierarchy

Terminology

Node - a single point of a tree Edge - line, which connects two distinct nodes Root - top node of the tree, which has no parent Parent - a node, other than the root, which is connected to other successor nodes Child - a node, other than the root, which is connected to predecessor Leaf - a node without children Path - a sequence of nodes and edges connecting a node with a descendant Path Length - number of nodes in the path - 1 Ancestor - the top parent node of the path Descendant - the bottom child node of the path Siblings - nodes, which have the same parent Subtree - a node in a tree with all of its proper descendants, if any Node Height - the number of edges on the longest downward path between that node and a leaf Tree Height - the number of edges on the longest downward path between the root and a leaf (root height) Depth (Level) - the path length between the root and the current node Ordered Tree - tree in which nodes has the children ordered Labeled Tree - tree in which a label or value is associated with each node of the tree Expression Tree - tree which specifies the association of an expression's operands and its operators in a uniform way, regardless of whether the association is required by the placement of parentheses in the expression or by the precedence and associativity rules for the operators involved Branching Factor - maximum number of children a node can have Pre order - a form of tree traversal, where the action is called firstly on the current node, and then the pre order function is called again recursively on each of the subtree from left to right Post order - a form of tree traversal, where the post order function is called recursively on each subtree from left to right and then the action is called

Trees Representation

Array-of-Pointers

One of the simplest ways of representing a tree is to use for each node a structure, which contains an array of pointers to the children of that node The maximum number of children a node can have is known as branching factor (BF) The ith component of the array at a node contains a pointer to the ith child of that node. A missing child can be represented by a null pointer

Array-of-Pointers Representation

Array-of-Pointers representation makes it possible to quickly access the ith child of any node This representation, however, is very wasteful of space when only a few nodes in the tree have many children. In this case, most of the pointers in the arrays will be null

The Library, however, is optimized for using Array-of-Pointers representation likewise it is done in the java.util.ArrayList. The subtrees size grows automatically in case of reaching the limit. Additionally you may specify what is called a branching factor, that the initial number of subtrees before grow (like capacity in the java.util.ArrayList)

Leftmost-Child—Right-Sibling

In the Leftmost-Child—Right-Sibling representation each node contains a pointer to its leftmost child; a node does not have pointers to any of its other children. In order to locate the second and subsequent children of a node n, the children link list created, in which each child c points to the child of n immediately to the right of c. That node is called the right sibling of c

Leftmost-Child-Right-Sibling Representation

In the figure above n3 is the right sibling of n2, n4 is the right sibling of n3, and n4 has no right sibling. The children of n1 can be found by following its leftmost-child pointer to n2, then the right-sibling pointer to n3, and then the right-sibling pointer of n3 to n4. n4, in turn has no right-sibling pointer (it is null indeed), therefore n1 has no more children The figure above can be represented in such a way:

Leftmost-Child-Right-Sibling Another Representation

The downward arrows are the leftmost-child links; the sideways arrows are the right-sibling links

Recursions on Trees

Recursion on Trees

Traversal

Traversal is a process of visiting each node in a tree data structure, exactly once, in a systematic way

Pre order is a form of tree traversal, where the action is called firstly on the current node, and then the pre order function is called again recursively on each subtree from left to right Pre order listing on the figure is: + a − b c d

Post order is a form of tree traversal, where the post order function is called recursively on each subtree from left to right and then the action is called Post order listing on the figure is: a b c − d * +

Evaluating Expression Tree

There are three types of expressions: infix, prefix and postfix

Infix are expressions in the ordinary notation, where binary operators appear between their operands. For the figure above the infix expression is: a + (b − c) * d. The equivalent prefix and postfix expressions are: + a − b c d for prefix and a b c − d * + for postfix Having either the prefix or postfix expression the infix expression can be constructed in a simple way. For instance, in order to construct the infix expression from prefix one, firstly the operator that is followed by the required number of operands with no embedded operators must be found In prefix expression + a * − b c d the − b c is such a string, since the minus sign, like all operators in running example takes two operands. This subexpression is then replaced by a new symbol, say x = − b c. Then, this process of identifying an operator followed by its operands repeated again. Now the expression is: + a * x d. The next subexpression will be y = * x d. This reduces the target expression to + a y, which is lastly converted to just a + y The remainder of the infix expression a + y then is reconstructed by retracing the steps above. Firstly, the y is substituted to x * d, changing the target expression to: a + (x * d). And lastly the x is replaced by − b c or simply b − c. So the target expression now is: a + ((b − c) * d) or a + (b − c) * d The conversion from postfix to infix expression uses the same algorithm. The only difference is that firstly the operator, which is preceded by the requisite number of operands must be found in order to decompose the postfix expression

Structural Induction

Structural Induction is a proof method that is used to prove some statement S(T) is true for all trees T

Structural Induction

For the basis S(T) must be shown to be true when T consists of a single node. For the induction, T is supposed to be a tree with root r and children c1, c2, …, ck, for some k ≥ 1. If T1, T2, …, Tk are the subtrees of T whose roots are c1, c2, …, ck respectively, then the inductive step is to assume that S(T) is true for all trees T. Structural Induction does not make reference to the exact number of nodes in a tree, except to distinguish the basis (one node) from the inductive step (more than one node)

The general template for Structural Induction has the following outline:

Specify the statement S(T) to be proved, where T is a tree
Prove the basis, that S(T) is true whenever T is a tree with a single node
Set up the inductive step by letting T be a tree with root r and k ≥ 1 subtrees, T1, T2, …, Tk. State that the inductive hypothesis assumed: S(Ti) is true for each of the subtrees Ti, i = 1, 2, …, k
Prove that S(T) is true under the assumptions mentioned in step 3

Structural Induction is generally needed to prove the correctness of a recursive program that acts on trees

As an example of using the Structural Induction to prove the correctness of statement the evaluate function (listed

Tree

Install / Use

README

Tree Data Structure Java Library

Description

Requirements

Gradle dependency

Theory

Definition

Terminology

Trees Representation

Array-of-Pointers

Leftmost-Child—Right-Sibling

Recursions on Trees

Traversal

Evaluating Expression Tree

Structural Induction