Add integration tests for negative samplers #255

@tizianocitro

Is your feature request related to a problem? Please describe.

The samplers should be tested not only in isolation, but also through their integration with Dataset and HData.

Without integration tests, regressions may appear in the final sampled dataset even if the sampler itself seems correct. For example, negative hyperedges could overlap with positives, labels could become misaligned, duplicate negatives could be produced, or hyperedge attributes and weights could become inconsistent.

Describe the solution you would like

Add integration tests for the negative samplers under hyperbench/tests/data, for example in hyperbench/tests/data/negative_sampler_integration_test.py.

The tests should run every sampler against every dataset available in hyperbench.

The tests should verify the behavior when negative samples are added through Dataset.add_negative_samples or the corresponding HData path.

For RandomNegativeSampler, test that:

  • The output contains the original positive hyperedges plus the requested number of negatives.
  • Positive labels are 1.
  • Negative labels are 0.
  • No sampled negative hyperedge is equal to an existing positive hyperedge.
  • No duplicate negative hyperedges are produced.
  • return_0based_negatives=True returns a valid standalone negative HData.
  • return_0based_negatives=False works correctly when negatives are added back to the original hypergraph.
  • Hyperedge attributes and hyperedge weights are handled consistently when enrichers are provided.
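The label, overlap, and duplicate checks above can be written as invariants on the combined output. The following is a minimal sketch that uses plain Python frozensets as stand-ins for hyperedges. `find_violations` and its arguments are hypothetical helpers, not part of the hyperbench API; a real test would first extract the hyperedges and labels from the HData produced by Dataset.add_negative_samples.

```python
from collections import Counter
from typing import FrozenSet, List, Sequence

Hyperedge = FrozenSet[int]


def find_violations(
    positives: Sequence[Hyperedge],
    hyperedges: Sequence[Hyperedge],
    labels: Sequence[int],
    num_negatives: int,
) -> List[str]:
    """Return a message for every broken invariant (empty list means OK).

    Hypothetical helper: hyperedges are modelled as frozensets of node ids.
    """
    problems: List[str] = []
    if len(hyperedges) != len(labels):
        problems.append("hyperedges and labels differ in length")
    if len(hyperedges) != len(positives) + num_negatives:
        problems.append("output size is not positives + requested negatives")
    if any(y not in (0, 1) for y in labels):
        problems.append("labels other than 0 and 1 are present")

    labelled_pos = [e for e, y in zip(hyperedges, labels) if y == 1]
    labelled_neg = [e for e, y in zip(hyperedges, labels) if y == 0]

    # Label alignment: exactly the original positives carry label 1.
    if Counter(labelled_pos) != Counter(positives):
        problems.append("hyperedges labelled 1 do not match the original positives")
    # No negative may coincide with an existing positive.
    if set(labelled_neg) & set(positives):
        problems.append("a negative hyperedge equals a positive hyperedge")
    # Negatives must be pairwise distinct.
    if len(set(labelled_neg)) != len(labelled_neg):
        problems.append("duplicate negative hyperedges were produced")
    return problems


# Usage with hand-written data standing in for a sampled dataset.
positives = [frozenset({0, 1, 2}), frozenset({2, 3})]
sampled = positives + [frozenset({0, 3}), frozenset({1, 3, 4})]
assert find_violations(positives, sampled, [1, 1, 0, 0], num_negatives=2) == []
```

Returning a list of messages rather than asserting inside the helper lets a single test report every broken invariant at once.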

For CliqueNegativeSampler, test that:

  • Existing positive hyperedges are excluded.
  • Duplicate negatives are not produced.
  • max_candidates is respected.
  • A descriptive error is raised when there are too few valid clique-based candidates to satisfy the requested number of negatives.
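To illustrate the CliqueNegativeSampler cases above, here is a rough sketch built around a toy stand-in sampler. It is not hyperbench's implementation: `clique_candidates` and `toy_clique_negatives` are hypothetical, the candidate rule (swap one node of a positive hyperedge for a node adjacent to all remaining nodes in the clique expansion) is one common reading of clique negative sampling, and `max_candidates` is assumed to cap the number of candidates examined. The real tests would call CliqueNegativeSampler through the dataset-level API instead.

```python
from collections import defaultdict
from itertools import combinations, islice
from typing import Dict, FrozenSet, Iterator, List, Sequence, Set

Hyperedge = FrozenSet[int]


def clique_candidates(positives: Sequence[Hyperedge]) -> Iterator[Hyperedge]:
    """Yield candidates by swapping one node for a common clique neighbour."""
    # Clique expansion: two nodes are adjacent if they share a hyperedge.
    adjacency: Dict[int, Set[int]] = defaultdict(set)
    for edge in positives:
        for u, v in combinations(sorted(edge), 2):
            adjacency[u].add(v)
            adjacency[v].add(u)
    for edge in positives:
        for dropped in sorted(edge):
            rest = edge - {dropped}
            if not rest:
                continue
            # Nodes adjacent to every remaining node and not already in the edge.
            common = set.intersection(*(adjacency[u] for u in rest)) - edge
            for replacement in sorted(common):
                yield rest | {replacement}


def toy_clique_negatives(
    positives: Sequence[Hyperedge], num_negatives: int, max_candidates: int
) -> List[Hyperedge]:
    """Toy sampler (assumption): examine at most max_candidates candidates."""
    existing = set(positives)
    negatives: List[Hyperedge] = []
    for candidate in islice(clique_candidates(positives), max_candidates):
        if len(negatives) == num_negatives:
            break
        # Skip existing positives and already-collected negatives.
        if candidate in existing or candidate in negatives:
            continue
        negatives.append(candidate)
    if len(negatives) < num_negatives:
        raise ValueError(
            f"only {len(negatives)} valid clique-based negatives found, "
            f"{num_negatives} requested (max_candidates={max_candidates})"
        )
    return negatives


positives = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({0, 3})]
negatives = toy_clique_negatives(positives, num_negatives=2, max_candidates=10)
assert not set(negatives) & set(positives)    # positives are excluded
assert len(set(negatives)) == len(negatives)  # no duplicate negatives
```

The failure cases can be covered in the same way: requesting more negatives than the candidate pool or the `max_candidates` budget allows should raise, and the test should assert on the exception type and, ideally, on the message.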

Describe alternatives you've considered

Only testing the sampler output directly would be simpler, but it would not catch integration issues involving labels, rebasing, attributes, weights, and dataset-level APIs.

Additional context

Acceptance criteria:

  • Integration tests exist for RandomNegativeSampler.
  • Integration tests exist for CliqueNegativeSampler.
  • Tests verify positive/negative label alignment.
  • Tests verify that negatives do not overlap with positives.
  • Tests verify that duplicate negatives are not produced.
  • Tests verify deterministic behavior with a fixed seed.
  • Tests cover expected failure cases.
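The determinism and failure-case criteria could be tested along the following lines. This is only a sketch: `toy_random_negatives` is a hypothetical stand-in that draws from Python's `random.Random`, and its `seed`, `num_nodes`, and `max_attempts` parameters are assumptions rather than the actual RandomNegativeSampler signature.

```python
import random
from typing import FrozenSet, List, Sequence

Hyperedge = FrozenSet[int]


def toy_random_negatives(
    positives: Sequence[Hyperedge],
    num_nodes: int,
    num_negatives: int,
    seed: int,
    max_attempts: int = 1000,
) -> List[Hyperedge]:
    """Toy sampler (assumption): draw random node sets, reject positives and repeats."""
    rng = random.Random(seed)  # private generator, so results depend only on the seed
    existing = set(positives)
    sizes = [len(edge) for edge in positives]
    negatives: List[Hyperedge] = []
    for _ in range(max_attempts):
        if len(negatives) == num_negatives:
            break
        size = rng.choice(sizes)
        candidate = frozenset(rng.sample(range(num_nodes), size))
        if candidate in existing or candidate in negatives:
            continue
        negatives.append(candidate)
    if len(negatives) < num_negatives:
        raise ValueError(
            f"only {len(negatives)} negatives found after {max_attempts} attempts"
        )
    return negatives


positives = [frozenset({0, 1, 2}), frozenset({2, 3})]

# Deterministic behavior: the same seed must reproduce the same negatives.
first = toy_random_negatives(positives, num_nodes=6, num_negatives=3, seed=42)
second = toy_random_negatives(positives, num_nodes=6, num_negatives=3, seed=42)
assert first == second

# Expected failure: with two nodes the only size-2 hyperedge is already positive.
try:
    toy_random_negatives([frozenset({0, 1})], num_nodes=2, num_negatives=1, seed=0)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError when no valid negative exists")
```

The sketch deliberately does not assert that different seeds give different results, since two seeds can legitimately produce the same sample on a small hypergraph.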
