verskyt.layers
Neural network layers implementing Tversky similarity computations.
Module: projection
Tversky neural network layers.
Implements TverskySimilarityLayer (Equation 6) and TverskyProjectionLayer (Equation 7) from the paper.
- class TverskySimilarityLayer(in_features: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, learnable_theta: bool = False, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', use_contrast_form: bool = False, feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform')
Bases: Module
Tversky Similarity Layer (Equation 6 from paper).
Computes the similarity between two objects using a learnable feature bank and the Tversky parameters (α, β, θ).
S_Ω,α,β,θ(a,b): ℝ^d × ℝ^d → ℝ
- __init__(in_features: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, learnable_theta: bool = False, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', use_contrast_form: bool = False, feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform')
Initialize Tversky Similarity Layer.
- Parameters:
in_features – Dimension of input vectors
num_features – Number of features in feature bank (|Ω|)
alpha – Initial value for α parameter (weight for a’s distinctive features)
beta – Initial value for β parameter (weight for b’s distinctive features)
learnable_ab – Whether α and β are learnable parameters
learnable_theta – Whether θ is a learnable parameter (only for contrast form)
theta – Initial value or constant for numerical stability
intersection_reduction – Method for computing feature intersections
difference_reduction – Method for computing feature differences
use_contrast_form – Use linear combination instead of ratio form
feature_init – Initialization method for feature bank
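The roles of α and β, and the difference between the ratio form and the contrast form, are easiest to see on plain sets. The sketch below uses the classic set-based Tversky index rather than the layer's differentiable version; the function names and the example feature sets are hypothetical, and treating θ as the weight on common features in the contrast form is an assumption taken from Tversky's contrast model, not from this library.

```python
def tversky_index(a: set, b: set, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Ratio form: |A∩B| / (|A∩B| + alpha*|A-B| + beta*|B-A|)."""
    common = len(a & b)
    only_a = len(a - b)  # features distinctive to a
    only_b = len(b - a)  # features distinctive to b
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom > 0 else 0.0


def tversky_contrast(a: set, b: set, theta: float = 1.0,
                     alpha: float = 0.5, beta: float = 0.5) -> float:
    """Contrast form: a linear combination instead of a ratio (assumed weighting)."""
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)


# Hypothetical feature sets, for illustration only.
robin = {"wings", "feathers", "flies", "sings"}
penguin = {"wings", "feathers", "swims"}

symmetric = tversky_index(robin, penguin)               # alpha = beta = 0.5
forward = tversky_index(robin, penguin, alpha=1.0, beta=0.0)
backward = tversky_index(penguin, robin, alpha=1.0, beta=0.0)
contrast = tversky_contrast(robin, penguin)

print(round(symmetric, 4))  # 0.5714
print(forward, round(backward, 4))  # 0.5 0.6667
print(contrast)  # 0.5
```

With α ≠ β the measure is asymmetric: swapping the arguments changes the score, which is the behaviour the learnable α and β let the layer tune.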
- class TverskyProjectionLayer(in_features: int, num_prototypes: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', normalize_features: bool = False, normalize_prototypes: bool = False, prototype_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', shared_feature_bank: Parameter | None = None, bias: bool = False)
Bases: Module
A projection layer based on Tversky similarity (Equation 7 from paper).
This layer replaces standard linear projections by computing Tversky similarity between inputs and learned prototype vectors. Unlike linear layers, it can model non-linear functions like XOR with a single layer, making it suitable for complex pattern recognition tasks.
The layer implements: P_Ω,α,β,θ,Π(a): ℝ^d → ℝ^p
Where:
- Ω: Learnable feature bank of shape [num_features, in_features]
- Π: Learnable prototype vectors of shape [num_prototypes, in_features]
- α, β: Asymmetry parameters controlling feature distinctiveness weights
- θ: Numerical stability constant
This layer can serve as a drop-in replacement for nn.Linear in many architectures, offering improved interpretability and non-linear modeling capabilities.
- prototypes
Learnable prototype vectors of shape [num_prototypes, in_features].
- Type:
nn.Parameter
- feature_bank
Learnable feature bank of shape [num_features, in_features].
- Type:
nn.Parameter
- alpha
Tversky weight for input-distinctive features.
- Type:
nn.Parameter or torch.Tensor
- beta
Tversky weight for prototype-distinctive features.
- Type:
nn.Parameter or torch.Tensor
- bias
Optional bias term of shape [num_prototypes].
- Type:
nn.Parameter or None
- __init__(in_features: int, num_prototypes: int, num_features: int, alpha: float = 0.5, beta: float = 0.5, learnable_ab: bool = True, theta: float = 1e-07, intersection_reduction: IntersectionReduction | str = 'product', difference_reduction: DifferenceReduction | str = 'substractmatch', normalize_features: bool = False, normalize_prototypes: bool = False, prototype_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', feature_init: Literal['uniform', 'normal', 'xavier_uniform', 'xavier_normal'] = 'xavier_uniform', shared_feature_bank: Parameter | None = None, bias: bool = False)
Initialize Tversky Projection Layer.
- Parameters:
in_features (int) – Size of each input sample’s embedding dimension.
num_prototypes (int) – Number of prototype vectors to learn. This typically corresponds to the output dimension or number of classes.
num_features (int) – Size of the shared feature bank (|Ω|). This is a key hyperparameter controlling the expressiveness of the feature space.
alpha (float, optional) – Initial Tversky weight for input-distinctive features (x \ π). Higher values increase sensitivity to features present in the input but not in the prototypes. Defaults to 0.5.
beta (float, optional) – Initial Tversky weight for prototype-distinctive features (π \ x). Higher values increase sensitivity to features present in the prototypes but not in the input. Defaults to 0.5.
learnable_ab (bool, optional) – Whether α and β are learnable parameters. If False, they remain fixed at initial values. Defaults to True.
theta (float, optional) – Small constant for numerical stability in similarity computation. Defaults to 1e-7.
intersection_reduction (Union[IntersectionReduction, str], optional) – Method for aggregating feature intersections. Options: “product”, “min”, “max”, “mean”, “gmean”, “softmin”. Defaults to “product”.
difference_reduction (Union[DifferenceReduction, str], optional) – Method for computing feature differences. Options: “ignorematch”, “substractmatch”. Defaults to “substractmatch”.
normalize_features (bool, optional) – Whether to L2-normalize feature bank vectors during forward pass. Defaults to False.
normalize_prototypes (bool, optional) – Whether to L2-normalize input and prototype vectors during forward pass. Defaults to False.
prototype_init (Literal, optional) – Initialization method for prototype vectors. Options: “uniform”, “normal”, “xavier_uniform”, “xavier_normal”. Defaults to “xavier_uniform”.
feature_init (Literal, optional) – Initialization method for feature bank. Same options as prototype_init. Defaults to “xavier_uniform”.
shared_feature_bank (Optional[nn.Parameter], optional) – Pre-existing feature bank to share across layers. If provided, feature_init is ignored. Defaults to None.
bias (bool, optional) – Whether to include a learnable bias term of shape [num_prototypes]. Defaults to False.
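The reduction options above can be made concrete with a small sketch. It assumes that an object "has" feature k when its dot product with feature k (its salience) is positive, that the intersection reduction combines the two saliences of each shared feature, and that "substractmatch" adds the surplus salience of shared features on top of what "ignorematch" counts. These are one plausible reading, not the library's confirmed definitions; the helper names are hypothetical, and "softmin" is omitted because its exact form is not given here.

```python
import math

# Hypothetical pairwise combiners keyed by the documented option strings.
INTERSECTION_COMBINERS = {
    "product": lambda x, p: x * p,
    "min": min,
    "max": max,
    "mean": lambda x, p: (x + p) / 2,
    "gmean": lambda x, p: math.sqrt(x * p),
}


def intersection(x_sal, p_sal, how="product"):
    """Sum the combined saliences of features present in both objects."""
    combine = INTERSECTION_COMBINERS[how]
    return sum(combine(x, p) for x, p in zip(x_sal, p_sal) if x > 0 and p > 0)


def difference(x_sal, p_sal, how="substractmatch"):
    """Salience of features distinctive to x relative to p (assumed definitions)."""
    if how not in ("ignorematch", "substractmatch"):
        raise ValueError(f"unknown difference reduction: {how}")
    total = 0.0
    for x, p in zip(x_sal, p_sal):
        if x > 0 and p <= 0:
            total += x          # feature present in x only
        elif how == "substractmatch" and x > p > 0:
            total += x - p      # shared feature, stronger in x
    return total


# Saliences of an input and a prototype over four features (made-up numbers).
x_sal = [2.0, 1.0, -1.0, 3.0]
p_sal = [0.5, 4.0, 2.0, -1.0]

print(intersection(x_sal, p_sal, "product"))      # 5.0
print(intersection(x_sal, p_sal, "min"))          # 1.5
print(difference(x_sal, p_sal, "ignorematch"))    # 3.0
print(difference(x_sal, p_sal, "substractmatch")) # 4.5
```

Calling difference(p_sal, x_sal, ...) gives the prototype-distinctive term that β weights.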
Example
>>> import torch
>>> from verskyt.layers import TverskyProjectionLayer
>>> # Create a projection layer as drop-in replacement for nn.Linear
>>> layer = TverskyProjectionLayer(
...     in_features=128,
...     num_prototypes=10,  # like nn.Linear(128, 10)
...     num_features=64,    # internal feature space size
...     learnable_ab=True
... )
>>> x = torch.randn(32, 128)  # batch of 32 samples
>>> output = layer(x)         # shape: [32, 10]
- forward(x: Tensor) → Tensor
Compute forward pass through the Tversky projection layer.
Projects the input to prototype similarity space by computing Tversky similarity between each input and all learned prototype vectors.
- Parameters:
x (torch.Tensor) – Input tensor of shape [batch_size, in_features].
- Returns:
Tversky similarity scores of shape [batch_size, num_prototypes]. For the standard Tversky index formulation, values lie in the range [0, 1] and represent the similarity to each prototype.
- Return type:
torch.Tensor
Note
This layer can serve as a drop-in replacement for nn.Linear, but produces similarity-based rather than linear projections.
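To show why the output has one score per prototype, here is a minimal pure-Python sketch of a ratio-form projection for a single input. It is not the library's implementation: it assumes saliences are dot products with the feature bank, uses the "product" intersection and an "ignorematch"-style difference, and adds θ to the denominator for stability. The function names are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def tversky_projection(x, prototypes, feature_bank,
                       alpha=0.5, beta=0.5, theta=1e-7):
    """Return one ratio-form similarity score per prototype (assumed formulation)."""
    x_sal = [dot(x, f) for f in feature_bank]
    scores = []
    for proto in prototypes:
        p_sal = [dot(proto, f) for f in feature_bank]
        pairs = list(zip(x_sal, p_sal))
        # Shared features: product of the two saliences.
        common = sum(xs * ps for xs, ps in pairs if xs > 0 and ps > 0)
        # Distinctive features: present in one object, absent from the other.
        x_only = sum(xs for xs, ps in pairs if xs > 0 and ps <= 0)
        p_only = sum(ps for xs, ps in pairs if ps > 0 and xs <= 0)
        scores.append(common / (common + alpha * x_only + beta * p_only + theta))
    return scores


# Toy setup: in_features = 3, num_features = 3, num_prototypes = 2.
feature_bank = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
prototypes = [[1, 1, 0], [0, 0, 1]]
x = [1, 1, 0]

scores = tversky_projection(x, prototypes, feature_bank)
print(len(scores))  # 2, one score per prototype
print(scores)       # first score is just below 1.0, second is 0.0
```

Because every term in the ratio is non-negative, each score in this sketch stays within [0, 1]; a batched implementation would repeat this for every row of the input.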
- property weight
Compatibility property for drop-in replacement of nn.Linear.
Classes
TverskyProjectionLayer
The main layer for replacing nn.Linear with Tversky similarity-based projections.
Key Methods:
- forward(x): Compute similarity to all prototypes
- get_prototype(index): Access individual prototype vectors
- set_prototype(index, value): Modify prototype vectors for interventions
- reset_parameters(): Reinitialize all parameters
Properties:
- weight: Compatibility property returning prototypes (for nn.Linear replacement)
TverskySimilarityLayer
Layer for computing element-wise similarity between pairs of objects.
Key Methods:
- forward(a, b): Compute similarity between object pairs
- reset_parameters(): Reinitialize parameters
Usage Examples
Basic Projection Layer
import torch
from verskyt.layers import TverskyProjectionLayer
# Create layer (replaces nn.Linear(128, 10))
layer = TverskyProjectionLayer(
    in_features=128,
    num_prototypes=10,
    num_features=256,
    learnable_ab=True
)
# Forward pass
x = torch.randn(32, 128)
similarities = layer(x) # shape: [32, 10]
Pairwise Similarity Layer
import torch
from verskyt.layers import TverskySimilarityLayer
# Create similarity layer
sim_layer = TverskySimilarityLayer(
    in_features=64,
    num_features=128,
    learnable_ab=True
)
# Compute pairwise similarities
a = torch.randn(32, 64)
b = torch.randn(32, 64)
similarities = sim_layer(a, b) # shape: [32]
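Parameter Access and Modification
import torch
from verskyt.layers import TverskyProjectionLayer
# in_features=10, num_prototypes=5, num_features=20
layer = TverskyProjectionLayer(10, 5, 20)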
# Access learned representations
prototypes = layer.prototypes.detach()
features = layer.feature_bank.detach()
# Modify specific prototype (for intervention studies)
new_prototype = torch.zeros(10)
layer.set_prototype(0, new_prototype)
# Access Tversky parameters
print(f"Alpha: {layer.alpha.item()}")
print(f"Beta: {layer.beta.item()}")