Wednesday, February 9, 2011

kd-tree

In computer science, a kd-tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. kd-trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). kd-trees are a special case of BSP trees.
The kd-tree is a binary tree in which every node is a k-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting hyperplane that divides the space into two parts, known as subspaces. Points to the left of this hyperplane represent the left sub-tree of that node and points right of the hyperplane are represented by the right sub-tree. The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the k-dimensions, with the hyperplane perpendicular to that dimension's axis. So, for example, if for a particular split the "x" axis is chosen, all points in the subtree with a smaller "x" value than the node will appear in the left subtree and all points with larger "x" value will be in the right sub tree. In such a case, the hyperplane would be set by the x-value of the point, and its normal would be the unit x-axis.

This algorithm implemented in the Python programming language is as follows:

         class Node:pass
        def kdtree(pointList, depth=0):
            if not pointList:
                return
            # Select axis based on depth so that axis cycles through all valid values
            k = len(pointList[0]) # assumes all points have the same dimension
            axis = depth % k

            # Sort point list and choose median as pivot element
            pointList.sort(key=lambda point: point[axis])
            median = len(pointList) // 2 # choose median

            # Create node and construct subtrees
            node = Node()
            node.location = pointList[median]
            node.leftChild = kdtree(pointList[0:median], depth+1)
            node.rightChild = kdtree(pointList[median+1:], depth+1)
           return node

pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
tree = kdtree(pointList)
Another Example:
import numpy
NDIM = 3 # number of dimensions

# read points into array
a = numpy.fromfile('million_3D_points.txt', sep=' ')
a.shape = a.size / NDIM, NDIM

point = [ 69.06310224, 2.23409409, 50.41979143] # use the same point as above
print 'point:', point

from scipy.spatial import KDTree

# find 10 nearest points
tree = KDTree(a, leafsize=a.shape[0]+1)
distances, ndx = tree.query([point], k=10)

# print 10 nearest points to the chosen one
print a[ndx]


♥ ¸¸.•*¨*•♫♪♪♫•*¨*•.¸¸♥
http://en.wikipedia.org/wiki/K-d_tree
http://stackoverflow.com/questions/2486093/millions-of-3d-points-how-to-find-the-10-of-them-closest-to-a-given-point/2486341#2486341

No comments:

Post a Comment