CPPDescent 80d9539 (with uncommitted changes)
A C++ KNN-Graph creation library
Loading...
Searching...
No Matches
Functions
cppdescent Namespace Reference

Functions for the creation of a K-NN graph. More...

Functions

int compareGraphVertices (Pointer vertex1, Pointer vertex2)
 
int compareGraphVertexPairs (Pointer p1, Pointer p2)
 
VectorreadBinData (const char *fp, int dimensions)
 Reads the data from a binary file.
 
void writeBinGraph (const char *fp, Graph *graph, int K)
 Writes a computed graph to a binary file.
 
GraphreadBinGraph (const char *fp, int dimensions)
 Reads a graph from a binary file. To work correctly, the file needs to be first created from the 'writeBinGraph' function.
 
float recall (Graph *bfGraph, Graph *nnGraph, int N, int K)
 Returns the recall of the graph computed by NN-Descent, compared to the brute force graph.
 
float euclideanDistance (Pointer a, Pointer b)
 Returns the Euclidean distance between two points of arbitrary dimension.
 
float manhattanDistance (Pointer a, Pointer b)
 Returns the Manhattan distance between two points of arbitrary dimension.
 
int compareEdgesEuclidean (Pointer first, Pointer second)
 Compare edges using the euclideanDistance function.
 
int compareVertexPairsEuclidean (Pointer first, Pointer second)
 
int compareEdgesManhattan (Pointer first, Pointer second)
 Compare edges using the manhattanDistance function.
 
GraphKNNBruteForceGraph (Vector *data, int K, CompareFunc compare)
 Computes the K-NN graph using brute force.
 
GraphNNDescent_KNNGraph (Vector *data, int K, int D, int Trees, float delta, float rho, DistanceFunc distance)
 Computes the K-NN graph for the given dataset using the NN-Descent algorithm.
 
PQueueNNDescent_Query (Graph *graph, int K, CompareFunc compare, Vector *query)
 Computes the K Nearest Neighbors of the query point in the graph.
 
void RPTree (Graph *graph, Vector *vec, int K, int D, int dimensions)
 Creates a graph using random projection trees.
 

Detailed Description

Functions for the creation of a K-NN graph.

Function Documentation

◆ compareEdgesEuclidean()

int cppdescent::compareEdgesEuclidean ( Pointer  first,
Pointer  second 
)

Compare edges using the euclideanDistance function.

Parameters
firstA Pointer to the first element.
secondA Pointer to the second element.
Returns
int

◆ compareEdgesManhattan()

int cppdescent::compareEdgesManhattan ( Pointer  first,
Pointer  second 
)

Compare edges using the manhattanDistance function.

Parameters
firstA Pointer to the first element.
secondA Pointer to the second element.
Returns
int

◆ euclideanDistance()

float cppdescent::euclideanDistance ( Pointer  a,
Pointer  b 
)

Returns the Euclidean distance between two points of arbitrary dimension.

Parameters
firstA pointer to the first point.
secondA pointer to the second point.
Returns
long double The Euclidean distance.

◆ KNNBruteForceGraph()

Graph * cppdescent::KNNBruteForceGraph ( Vector data,
int  K,
CompareFunc  compare 
)

Computes the K-NN graph using brute force.

Useful for testing the recall and the performance of our NN-Descent algorithm.

The user is responsible deletion of the allocated memory of the graph.

Parameters
dataA pointer to the parent N-sized vector.
KThe number of Nearest Neigbors to find.
compareThe function to use to compare the distances.
Returns
Graph* A pointer to the optimal K-NN graph.

◆ manhattanDistance()

float cppdescent::manhattanDistance ( Pointer  a,
Pointer  b 
)

Returns the Manhattan distance between two points of arbitrary dimension.

Parameters
firstA pointer to the first point.
secondA pointer to the second point.
Returns
long double The Manhattan distance.

◆ NNDescent_KNNGraph()

Graph * cppdescent::NNDescent_KNNGraph ( Vector data,
int  K,
int  D,
int  Trees,
float  delta,
float  rho,
DistanceFunc  distance 
)

Computes the K-NN graph for the given dataset using the NN-Descent algorithm.

The user is responsible deletion of the allocated memory of the returned graph.

Parameters
dataA Vector of the vertices of the graph.
KThe number of nearest neighbors to compute.
KThe number of nearest neighbors to compute.
KThe number of nearest neighbors to compute.
deltaThe iterations will stop when the number of edges that were updated is less than delta*N*K.
rhoThe sampling rate
distanceThe function to be used to compute the distances between vertices.
Returns
Graph* The complete K-NN graph of the dataset.

◆ NNDescent_Query()

PQueue * cppdescent::NNDescent_Query ( Graph graph,
int  K,
CompareFunc  compare,
Vector query 
)

Computes the K Nearest Neighbors of the query point in the graph.

Parameters
graphThe K-NN Graph, in which we search for neighbors of the query.
K
compareThe function used to compare edges.
queryThe query point, given as a Vector. Must be of the same dimensions as the rest points of the graph.
Returns
PQueue* A priority queue of the K-NN of the query, with 'max' being the most distant point relative to the query. It contains GraphVertexPairs from the query point directed to the other nodes. If the query point is of wrong dimensions, then nullptr is returned.

◆ readBinData()

Vector * cppdescent::readBinData ( const char *  fp,
int  dimensions 
)

Reads the data from a binary file.

The data read are stored in a N-sized Vector, where N is the <uint32_t> at the start of the file. Each node of this vector is a pointer to another 100-sized vector, of which the values are pointers to floats.

All of the vectors (the parent N-sized and each of the 100-sized) must be freed by the user.

Parameters
fpA string (char *) of the filepath to the binary dataset.
dimensionsThe dimension of each point.
Returns
Vector* An N-sized vector with vectors for each element.

◆ readBinGraph()

Graph * cppdescent::readBinGraph ( const char *  fp,
int  dimensions 
)

Reads a graph from a binary file. To work correctly, the file needs to be first created from the 'writeBinGraph' function.

Parameters
fpThe filepath.
dimensionsThe dimensions of the datapoints.
Returns
Graph*

◆ recall()

float cppdescent::recall ( Graph bfGraph,
Graph nnGraph,
int  N,
int  K 
)

Returns the recall of the graph computed by NN-Descent, compared to the brute force graph.

Parameters
bfGraph
nnGraph
N
K
Returns
float

◆ RPTree()

void cppdescent::RPTree ( Graph graph,
Vector vec,
int  K,
int  D,
int  dimensions 
)

Creates a graph using random projection trees.

Starts by splitting the dataspace using a hyperplane, and calling itself recursively in each "slice".

The starting graph has no edges, only the vertices.

Parameters
graph
vec
KThe open upper bound of elements in a leaf.
DThe maximum number of elements in a leaf. Must be less than K to result in a connected graph.
dimensionsThe dimensions of each datapoint.

◆ writeBinGraph()

void cppdescent::writeBinGraph ( const char *  fp,
Graph graph,
int  K 
)

Writes a computed graph to a binary file.

The structure of the binary file that will be created is the following:

  1. <uint32_t> N: the number of the vertices
  2. N * 100 floats: each vertex of 100 dimensions
  3. N * K * 1 int: For each vertex sequentially, the positions of its K neighbors.
Parameters
fpThe filepath to the created file.
K
graph