Getting started

In this tutorial, we demonstrate some basic features of Torchhd. We show how the library makes it easy to represent and manipulate information in hyperspace through the fictitious example in the following table:

| Record | Fruit | Weight | Season |
|---|---|---|---|
| \(r_1\) | apple | 149.0 | fall |
| \(r_2\) | lemon | 70.5 | winter |
| \(r_3\) | mango | 173.2 | summer |

Basis-hypervectors

The first step in encoding these records is to define the basis-hypervectors for each information set. Since the nature of these information sets is different, so are the basis-hypervector sets used to encode them. We start by defining the number of dimensions of the hyperspace and then use the functions from the torchhd module to create the basis-hypervectors with the appropriate correlation profile:

import torchhd

d = 10000 # dimensions
fruits = torchhd.random(3, d)
weights = torchhd.level(10, d)
seasons = torchhd.circular(4, d)
var = torchhd.random(3, d)

which creates hypervectors for the 3 fruit types, 10 weight levels, 4 seasons and the 3 variables. The figure below illustrates the distance between the pairs of hypervectors in each set:

[Figure: distance between the pairs of hypervectors in each basis set (_images/basis-hvs.png)]
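To make the idea of a correlation profile concrete, here is a dependency-free Python sketch that builds random and level hypervectors by hand. It is an illustration under simplifying assumptions, not Torchhd's implementation: the helper names `random_hv`, `level_hvs` and `cosine` are hypothetical, and the level set is built by progressively flipping elements of a single random vector, which is only one possible construction.

```python
import random

def random_hv(d, rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(d)]

def cosine(a, b):
    """Cosine similarity; both norms equal sqrt(d) for bipolar vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def level_hvs(n, d, rng):
    """Simplified level set (assumed construction): flip a growing prefix
    of a fixed random ordering of positions, reaching d/2 flipped
    elements at the last level."""
    base = random_hv(d, rng)
    order = list(range(d))
    rng.shuffle(order)
    levels = []
    for i in range(n):
        flips = set(order[: (i * (d // 2)) // (n - 1)])
        levels.append([-x if j in flips else x for j, x in enumerate(base)])
    return levels

rng = random.Random(0)
d = 10000
a, b = random_hv(d, rng), random_hv(d, rng)
levels = level_hvs(10, d, rng)

sim_random = cosine(a, b)                     # near zero: quasi-orthogonal
sim_adjacent = cosine(levels[0], levels[1])   # high: neighboring levels
sim_extremes = cosine(levels[0], levels[-1])  # zero: half the elements differ
```

In this sketch, independent random hypervectors are nearly orthogonal, neighboring levels remain highly similar, and the similarity decays to zero between the first and last level.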

The classes in the torchhd.embeddings module provide similar behavior and add convenience methods for mapping values to hypervectors. For example, to map the interval \([0, 200]\) to the ten weight hypervectors, the functional version above requires an explicit mapping to an index:

import torch

weight = torch.tensor([149.0])
# explicit mapping of the fruit weight to an index
w_i = torchhd.value_to_index(weight, 0, 200, 10)
weights[w_i]  # select representation of 149

whereas the embeddings have this common behavior built-in:

from torchhd import embeddings

W_emb = embeddings.Level(10, d, low=0, high=200)
# select representation of 149
W_emb(weight)  # same result as weights[w_i]
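The index mapping used by the functional version can be sketched in plain Python. This is a minimal sketch that assumes linear scaling of the interval onto the index range followed by rounding to the nearest index, with out-of-range values clamped. The function name `value_to_index_sketch` is hypothetical, and the clamping and tie-breaking rules are assumptions, so check the library documentation for its exact behavior.

```python
def value_to_index_sketch(value, low, high, num_levels):
    """Map a value in [low, high] to an index in [0, num_levels - 1]."""
    # Assumption: values outside the interval are clamped to its ends.
    clamped = min(max(value, low), high)
    # Linearly rescale to the range [0, num_levels - 1].
    position = (clamped - low) / (high - low) * (num_levels - 1)
    # Round to the nearest index (ties round up in this sketch).
    return int(position + 0.5)

# Indices for the three fruit weights from the table, using 10 levels
# over the interval [0, 200].
indices = {w: value_to_index_sketch(w, 0, 200, 10) for w in (149.0, 70.5, 173.2)}
# indices == {149.0: 7, 70.5: 3, 173.2: 8}
```

With only ten levels, the apple (149.0) and the mango (173.2) land on neighboring indices, so their weight hypervectors are similar but not identical.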

Operations

Once the basis-hypervectors are defined, we can use the MAP (multiply, add, permute) operations from torchhd to encode more complex objects by combining basis-hypervectors. The hypervector for record \(r_1\) can be created as follows:

f = torchhd.bind(var[0], fruits[0])   # fruit = apple
w = torchhd.bind(var[1], weights[w_i]) # weight = 149
s = torchhd.bind(var[2], seasons[3])   # season = fall
r1 = torchhd.bundle(torchhd.bundle(f, w), s)

which is equivalent to using the following shortened syntax:

r1 = var[0] * fruits[0] + var[1] * weights[w_i] + var[2] * seasons[3]
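The shortened syntax works because, for bipolar hypervectors, binding is element-wise multiplication and bundling is element-wise addition. The following plain-Python sketch spells this out. The helper names and the stand-in vectors (`key_fruit`, `apple`, and so on) are hypothetical, and the code illustrates the arithmetic rather than reproducing Torchhd's implementation.

```python
import random

def random_hv(d, rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(d)]

def bind(a, b):
    """MAP binding: element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]

def bundle(a, b):
    """MAP bundling: element-wise addition."""
    return [x + y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

rng = random.Random(0)
d = 10000
key_fruit, key_weight, key_season = (random_hv(d, rng) for _ in range(3))
apple, weight_149, fall = (random_hv(d, rng) for _ in range(3))

f = bind(key_fruit, apple)
w = bind(key_weight, weight_149)
s = bind(key_season, fall)
record = bundle(bundle(f, w), s)

sim_to_pair = cosine(record, f)       # clearly above zero, roughly 1/sqrt(3)
sim_to_value = cosine(record, apple)  # near zero: binding hides the raw value
```

The record stays similar to each bound key-value pair it contains, while a bound pair is dissimilar to the value it was built from. A value is therefore only recoverable by unbinding with the matching key.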

Data Structures

Alternatively, we can use one of the commonly used encodings provided in the torchhd module. Using these, record \(r_1\) can be encoded as follows:

# combine values in one tensor of shape (3, d)
values = torch.stack([fruits[0], weights[w_i], seasons[3]])
r1 = torchhd.hash_table(var, values)

The torchhd.structures module offers the same encoding patterns, along with binary trees and finite state automata, as class-based data structures. Using the hash table class, record \(r_1\) can be represented as follows:

from torchhd import structures

r1 = structures.HashTable(d)  # r1 = 0
r1.add(var[0], fruits[0])     # r1 + var[0] * fruits[0]
r1.add(var[1], weights[w_i])   # r1 + var[1] * weights[w_i]
r1.add(var[2], seasons[3])     # r1 + var[2] * seasons[3]
# query the hash table by key (each result is a noisy copy of the stored value):
fruit = r1.get(var[0])   # r1 * var[0]
weight = r1.get(var[1])  # r1 * var[1]
season = r1.get(var[2])  # r1 * var[2]
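A query returns the stored value plus noise from the other entries, so the result is usually compared against the known basis-hypervectors to find the closest match. The plain-Python sketch below shows this lookup followed by a cleanup step. It assumes bipolar hypervectors, which are their own inverse under binding. All names are hypothetical, and random vectors stand in for the weight and season values.

```python
import random

def random_hv(d, rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(d)]

def bind(a, b):
    """MAP binding: element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]

def dot(a, b):
    """Dot-product similarity."""
    return sum(x * y for x, y in zip(a, b))

rng = random.Random(1)
d = 10000
fruit_names = ["apple", "lemon", "mango"]
fruits = {name: random_hv(d, rng) for name in fruit_names}
keys = [random_hv(d, rng) for _ in range(3)]
weight_hv, season_hv = random_hv(d, rng), random_hv(d, rng)  # stand-ins

# record = key0 * apple + key1 * weight + key2 * season
pairs = [
    bind(keys[0], fruits["apple"]),
    bind(keys[1], weight_hv),
    bind(keys[2], season_hv),
]
record = [sum(column) for column in zip(*pairs)]

# Query: unbinding with the fruit key yields apple plus noise.
noisy_fruit = bind(record, keys[0])

# Cleanup: pick the known fruit hypervector most similar to the result.
best = max(fruit_names, key=lambda name: dot(noisy_fruit, fruits[name]))
```

Here `best` is `"apple"`: the noise terms are nearly orthogonal to every fruit hypervector, so the stored fruit dominates the similarity scores.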