verskyt.layers

Neural network layers implementing Tversky similarity computations.

Module: projection

Tversky neural network layers.

Implements TverskySimilarityLayer (Equation 6) and TverskyProjectionLayer (Equation 7) from the paper.

class TverskySimilarityLayer(in_features: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, learnable_theta: bool = False, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', use_contrast_form: bool = False, feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform')

Bases: Module

Tversky Similarity Layer (Equation 6 from paper).

Computes the similarity between two objects using a learnable feature bank and the Tversky parameters (α, β, θ).

S_Ω,α,β,θ(a,b): ℝ^d × ℝ^d → ℝ
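
To make the mapping concrete, here is a minimal, dependency-free sketch of a ratio-form Tversky similarity between two vectors. It is not the library's implementation: the helper names (dot, tversky_ratio), the rule that a positive dot product with a feature vector means the object "has" that feature, and the placement of θ in the denominator are all assumptions made for illustration.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def tversky_ratio(a, b, feature_bank, alpha=0.5, beta=0.5, theta=1e-7):
    """Illustrative ratio-form Tversky similarity (hypothetical helper)."""
    # Salience of every feature in each object (assumption: > 0 means "present").
    a_sal = [dot(a, f) for f in feature_bank]
    b_sal = [dot(b, f) for f in feature_bank]
    pairs = list(zip(a_sal, b_sal))

    # Common features, combined with a product.
    common = sum(x * y for x, y in pairs if x > 0 and y > 0)
    # Features present in one object and absent from the other.
    a_only = sum(x for x, y in pairs if x > 0 and y <= 0)
    b_only = sum(y for x, y in pairs if y > 0 and x <= 0)

    # theta keeps the denominator away from zero (assumed placement).
    return common / (common + alpha * a_only + beta * b_only + theta)


feature_bank = [[1.0, 0.0], [0.0, 1.0]]  # toy bank with |Ω| = 2, d = 2
same = tversky_ratio([1.0, 0.0], [1.0, 0.0], feature_bank)
disjoint = tversky_ratio([1.0, 0.0], [0.0, 1.0], feature_bank)
print(same, disjoint)
```

Identical inputs score just under 1 (θ keeps the ratio slightly below it), while inputs that share no feature score 0.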

__init__(in_features: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, learnable_theta: bool = False, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', use_contrast_form: bool = False, feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform')

Initialize Tversky Similarity Layer.

Parameters:
  • in_features – Dimension of input vectors

  • num_features – Number of features in feature bank (|Ω|)

  • alpha – Initial value for α parameter (weight for a’s distinctive features)

  • beta – Initial value for β parameter (weight for b’s distinctive features)

  • learnable_ab – Whether α and β are learnable parameters

  • learnable_theta – Whether θ is a learnable parameter (only for contrast form)

  • theta – Initial value or constant for numerical stability

  • intersection_reduction – Method for computing feature intersections

  • difference_reduction – Method for computing feature differences

  • use_contrast_form – Use linear combination instead of ratio form

  • feature_init – Initialization method for feature bank
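
The use_contrast_form flag switches between two ways of combining the same three quantities. The sketch below assumes the textbook formulations: Tversky's contrast model, θ·f(A∩B) − α·f(A−B) − β·f(B−A), and the ratio (Tversky index) form with θ added to the denominator for stability. The function name combine and its arguments are hypothetical, and the library's exact use of θ in each form may differ.

```python
def combine(common, a_only, b_only, alpha, beta, theta, use_contrast_form=False):
    """Combine feature measures into one score (illustrative helper)."""
    if use_contrast_form:
        # Linear combination: theta weights the common features.
        return theta * common - alpha * a_only - beta * b_only
    # Ratio form: theta only guards against division by zero.
    return common / (common + alpha * a_only + beta * b_only + theta)


contrast = combine(2.0, 1.0, 1.0, alpha=0.5, beta=0.5, theta=1.0,
                   use_contrast_form=True)
ratio = combine(2.0, 1.0, 1.0, alpha=0.5, beta=0.5, theta=1e-7)
print(contrast)  # 1.0
print(ratio)     # close to 2/3
```

Under this reading, θ acts as a weight in the contrast form and as a small stability constant in the ratio form, which is one plausible reason learnable_theta applies only to the contrast form.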

reset_parameters()

Initialize parameters according to the specified method.

forward(a: Tensor, b: Tensor) → Tensor

Compute element-wise Tversky similarity between objects a and b.

Parameters:
  • a – First object tensor of shape [batch_size, in_features]

  • b – Second object tensor of shape [batch_size, in_features]

Returns:

Similarity scores of shape [batch_size]
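
"Element-wise" here means that row i of a is compared only with row i of b, so a batch of N pairs yields N scores rather than an N × N matrix. A small sketch of that pairing, using a stand-in similarity function (toy_similarity and pairwise_scores are hypothetical, and the stand-in is far simpler than a Tversky similarity):

```python
def toy_similarity(u, v):
    """Stand-in score: fraction of positions where both entries are positive."""
    both = sum(1 for x, y in zip(u, v) if x > 0 and y > 0)
    return both / len(u)


def pairwise_scores(a_batch, b_batch, sim=toy_similarity):
    """Score row i of a_batch against row i of b_batch only."""
    if len(a_batch) != len(b_batch):
        raise ValueError("a and b must have the same batch size")
    return [sim(u, v) for u, v in zip(a_batch, b_batch)]


a_batch = [[1, 1, 0], [0, 1, 1]]
b_batch = [[1, 0, 0], [0, 1, 1]]
scores = pairwise_scores(a_batch, b_batch)
print(scores)  # one score per pair
```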

class TverskyProjectionLayer(in_features: int, num_prototypes: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', normalize_features: bool = False, normalize_prototypes: bool = False, prototype_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', shared_feature_bank: Parameter | None = None, bias: bool = False)

Bases: Module

A projection layer based on Tversky similarity (Equation 7 from paper).

This layer replaces standard linear projections by computing Tversky similarity between inputs and learned prototype vectors. Unlike linear layers, it can model non-linear functions like XOR with a single layer, making it suitable for complex pattern recognition tasks.

The layer implements: P_Ω,α,β,θ,Π(a): ℝ^d → ℝ^p

Where:

  • Ω: Learnable feature bank of shape [num_features, in_features]

  • Π: Learnable prototype vectors of shape [num_prototypes, in_features]

  • α, β: Asymmetry parameters controlling feature distinctiveness weights

  • θ: Numerical stability constant

This layer can serve as a drop-in replacement for nn.Linear in many architectures, offering improved interpretability and non-linear modeling capabilities.
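
Conceptually, the projection scores one input against each of the p prototypes using a shared feature bank, producing a vector of length p. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration, not the layer's actual code: the helper names (salience, similarity, project) are hypothetical, feature presence is taken to be a positive dot product, and only product intersections with absent-feature differences are modelled.

```python
def salience(x, feature_bank):
    """Dot product of x with every feature vector in the bank."""
    return [sum(xi * fi for xi, fi in zip(x, f)) for f in feature_bank]


def similarity(x_sal, p_sal, alpha, beta, theta):
    """Ratio-form Tversky score from two salience vectors."""
    pairs = list(zip(x_sal, p_sal))
    common = sum(x * p for x, p in pairs if x > 0 and p > 0)
    x_only = sum(x for x, p in pairs if x > 0 and p <= 0)
    p_only = sum(p for x, p in pairs if p > 0 and x <= 0)
    return common / (common + alpha * x_only + beta * p_only + theta)


def project(x, prototypes, feature_bank, alpha=0.5, beta=0.5, theta=1e-7):
    """Map one input in R^d to p similarity scores, one per prototype."""
    x_sal = salience(x, feature_bank)
    return [
        similarity(x_sal, salience(p, feature_bank), alpha, beta, theta)
        for p in prototypes
    ]


feature_bank = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # |Ω| = 3, d = 2
prototypes = [[1.0, 0.0], [0.0, 1.0]]                # p = 2
scores = project([1.0, 0.0], prototypes, feature_bank)
print(scores)  # first prototype matches the input, so it scores highest
```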

prototypes

Learnable prototype vectors of shape [num_prototypes, in_features].

Type:

nn.Parameter

feature_bank

Learnable feature bank of shape [num_features, in_features].

Type:

nn.Parameter

alpha

Tversky weight for input-distinctive features.

Type:

nn.Parameter or torch.Tensor

beta

Tversky weight for prototype-distinctive features.

Type:

nn.Parameter or torch.Tensor

bias

Optional bias term of shape [num_prototypes].

Type:

nn.Parameter or None

__init__(in_features: int, num_prototypes: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', normalize_features: bool = False, normalize_prototypes: bool = False, prototype_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', shared_feature_bank: Parameter | None = None, bias: bool = False)

Initialize Tversky Projection Layer.

Parameters:
  • in_features (int) – Size of each input sample’s embedding dimension.

  • num_prototypes (int) – Number of prototype vectors to learn. This typically corresponds to the output dimension or number of classes.

  • num_features (int) – Size of the shared feature bank (|Ω|). This is a key hyperparameter controlling the expressiveness of the feature space.

  • alpha (float, optional) – Initial Tversky weight for input-distinctive features (x \ π). Higher values increase sensitivity to features present in input but not in prototypes. Defaults to 0.5.

  • beta (float, optional) – Initial Tversky weight for prototype-distinctive features (π \ x). Higher values increase sensitivity to features present in prototypes but not in input. Defaults to 0.5.

  • learnable_ab (bool, optional) – Whether α and β are learnable parameters. If False, they remain fixed at initial values. Defaults to True.

  • theta (float, optional) – Small constant for numerical stability in similarity computation. Defaults to 1e-7.

  • intersection_reduction (Union[IntersectionReduction, str], optional) – Method for aggregating feature intersections. Options: “product”, “min”, “max”, “mean”, “gmean”, “softmin”. Defaults to “product”.

  • difference_reduction (Union[DifferenceReduction, str], optional) – Method for computing feature differences. Options: “ignorematch”, “substractmatch”. Defaults to “substractmatch”.

  • normalize_features (bool, optional) – Whether to L2-normalize feature bank vectors during forward pass. Defaults to False.

  • normalize_prototypes (bool, optional) – Whether to L2-normalize input and prototype vectors during forward pass. Defaults to False.

  • prototype_init (Literal, optional) – Initialization method for prototype vectors. Options: “uniform”, “normal”, “xavier_uniform”, “xavier_normal”. Defaults to “xavier_uniform”.

  • feature_init (Literal, optional) – Initialization method for feature bank. Same options as prototype_init. Defaults to “xavier_uniform”.

  • shared_feature_bank (Optional[nn.Parameter], optional) – Pre-existing feature bank to share across layers. If provided, feature_init is ignored. Defaults to None.

  • bias (bool, optional) – Whether to include a learnable bias term of shape [num_prototypes]. Defaults to False.
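
The option names for intersection_reduction and difference_reduction suggest the following per-feature behaviour. This is an illustrative reading of the names, not a transcription of the library's code: the helper functions are hypothetical, both saliences are assumed positive for an intersection, and "softmin" is left out because its exact definition is not given here.

```python
import math

# Hypothetical per-feature intersection measures for two positive saliences.
INTERSECTION = {
    "product": lambda x, p: x * p,
    "min": min,
    "max": max,
    "mean": lambda x, p: (x + p) / 2,
    "gmean": lambda x, p: math.sqrt(x * p),
}


def difference(x, p, mode="substractmatch"):
    """Salience of a feature that is distinctive of x relative to p."""
    if x <= 0:
        return 0.0              # x does not have the feature
    if p <= 0:
        return x                # feature present in x, absent in p
    if mode == "substractmatch":
        return max(x - p, 0.0)  # shared feature: keep only the excess
    if mode == "ignorematch":
        return 0.0              # shared feature: contributes nothing
    raise ValueError(f"unknown difference reduction: {mode}")


results = {name: fn(4.0, 1.0) for name, fn in INTERSECTION.items()}
print(results)
print(difference(4.0, 1.0, "substractmatch"))  # 3.0
print(difference(4.0, 1.0, "ignorematch"))     # 0.0
```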

Example

>>> import torch
>>> from verskyt.layers import TverskyProjectionLayer
>>> # Create a projection layer as a drop-in replacement for nn.Linear
>>> layer = TverskyProjectionLayer(
...     in_features=128,
...     num_prototypes=10,  # like nn.Linear(128, 10)
...     num_features=64,    # internal feature space size
...     learnable_ab=True
... )
>>> x = torch.randn(32, 128)  # batch of 32 samples
>>> output = layer(x)         # shape: [32, 10]

reset_parameters()

Initialize parameters according to the specified methods.

forward(x: Tensor) → Tensor

Compute forward pass through the Tversky projection layer.

Projects the input to prototype similarity space by computing Tversky similarity between each input and all learned prototype vectors.

Parameters:

x (torch.Tensor) – Input tensor of shape [batch_size, in_features].

Returns:

Tversky similarity scores of shape [batch_size, num_prototypes]. Values are in the [0, 1] range for the standard Tversky index formulation, representing similarity to each prototype.

Return type:

torch.Tensor

Note

This layer can serve as a drop-in replacement for nn.Linear, but produces similarity-based rather than linear projections.
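
When num_prototypes equals the number of classes, each output column can be read as the similarity to one class prototype, and a prediction is the index of the most similar prototype. A minimal sketch with made-up scores (the predict helper and the numbers are hypothetical):

```python
def predict(scores):
    """Return the index of the most similar prototype for each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Pretend output of shape [batch_size=2, num_prototypes=3]
scores = [
    [0.10, 0.75, 0.30],
    [0.60, 0.20, 0.55],
]
labels = predict(scores)
print(labels)  # [1, 0]
```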

get_prototype(index: int) → Tensor

Get a specific prototype vector.

set_prototype(index: int, value: Tensor)

Set a specific prototype vector.

get_feature(index: int) → Tensor

Get a specific feature vector.

set_feature(index: int, value: Tensor)

Set a specific feature vector.

property weight

Compatibility property for drop-in replacement of nn.Linear; returns the prototypes.

extra_repr() → str

String representation with layer configuration.

Classes

TverskyProjectionLayer

The main layer for replacing nn.Linear with Tversky similarity-based projections.

Key Methods:

  • forward(x) - Compute similarity to all prototypes

  • get_prototype(index) - Access individual prototype vectors

  • set_prototype(index, value) - Modify prototype vectors for interventions

  • reset_parameters() - Reinitialize all parameters

Properties:

  • weight - Compatibility property returning prototypes (for nn.Linear replacement)

TverskySimilarityLayer

Layer for computing element-wise similarity between pairs of objects.

Key Methods:

  • forward(a, b) - Compute similarity between object pairs

  • reset_parameters() - Reinitialize parameters

Usage Examples

Basic Projection Layer

import torch
from verskyt.layers import TverskyProjectionLayer

# Create layer (replaces nn.Linear(128, 10))
layer = TverskyProjectionLayer(
    in_features=128,
    num_prototypes=10,
    num_features=256,
    learnable_ab=True
)

# Forward pass
x = torch.randn(32, 128)
similarities = layer(x)  # shape: [32, 10]

Pairwise Similarity Layer

import torch
from verskyt.layers import TverskySimilarityLayer

# Create similarity layer
sim_layer = TverskySimilarityLayer(
    in_features=64,
    num_features=128,
    learnable_ab=True
)

# Compute pairwise similarities
a = torch.randn(32, 64)
b = torch.randn(32, 64)
similarities = sim_layer(a, b)  # shape: [32]

Parameter Access and Modification

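import torch
from verskyt.layers import TverskyProjectionLayer

layer = TverskyProjectionLayer(in_features=10, num_prototypes=5, num_features=20)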

# Access learned representations
prototypes = layer.prototypes.detach()
features = layer.feature_bank.detach()

# Modify specific prototype (for intervention studies)
new_prototype = torch.zeros(10)
layer.set_prototype(0, new_prototype)

# Access Tversky parameters
print(f"Alpha: {layer.alpha.item()}")
print(f"Beta: {layer.beta.item()}")
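
The α and β values matter because unequal weights make the similarity directional. The sketch below uses the classic set-based Tversky index rather than the layer itself (the tversky_index helper and the example sets are hypothetical) to show that swapping the arguments changes the score when α ≠ β.

```python
def tversky_index(a, b, alpha, beta):
    """Set-based Tversky index: |A∩B| / (|A∩B| + α|A−B| + β|B−A|)."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


rich = {"stripes", "tail", "whiskers"}  # object with many features
sparse = {"tail"}                       # object with few features

# alpha weights the first argument's distinctive features, beta the second's.
forward_score = tversky_index(rich, sparse, alpha=1.0, beta=0.0)
reverse_score = tversky_index(sparse, rich, alpha=1.0, beta=0.0)
balanced = tversky_index(rich, sparse, alpha=0.5, beta=0.5)
print(forward_score, reverse_score, balanced)
```

With α = 1 and β = 0, the feature-rich object is penalized for its two unmatched features when it comes first, but not when it comes second. With α = β the score is the same in both directions.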