[exploratory] modules/language-models: semantic highlighting component #4230

Draft
cpoerschke wants to merge 1 commit into apache:main from cpoerschke:explore-semantic-highlighting

@cpoerschke
Contributor

done-ish list:

  • skeleton SemanticHighlightingComponent class in modules/language-models, so that core/HighlightingComponent.java does not need a modules/language-models dependency
  • created CustomModel.java and custom-model.json providing hard-coded mock embeddings for test use without an external model provider dependency
  • skeleton logic to use a language model to compute a score for a Passage
  • minimal SemanticHighlightingComponentTest class to illustrate usage
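
To illustrate the "compute a score for a Passage" idea outside of Solr, here is a minimal sketch. It is not the code in this PR: `EmbeddingModel`, `MockEmbeddingModel`, `ScoredPassage`, `score` and `rank` are all hypothetical names, passages are reduced to plain strings, and the mapping from Euclidean distance to a score (`1 / (1 + d²)`) is an assumption picked only so that closer vectors score higher.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Illustrative only: none of these types are Solr or Lucene classes. */
public class SemanticPassageScoringSketch {

  /** Hypothetical embedding model abstraction. */
  interface EmbeddingModel {
    float[] embed(String text);
  }

  /** Mock model returning hard-coded embeddings, so no external provider is needed. */
  static final class MockEmbeddingModel implements EmbeddingModel {
    private final Map<String, float[]> embeddings;

    MockEmbeddingModel(Map<String, float[]> embeddings) {
      this.embeddings = embeddings;
    }

    @Override
    public float[] embed(String text) {
      float[] vector = embeddings.get(text);
      if (vector == null) {
        throw new IllegalArgumentException("no mock embedding for: " + text);
      }
      return vector;
    }
  }

  /** Simplified stand-in for a highlighter passage: text plus a semantic score. */
  static final class ScoredPassage {
    final String text;
    final double score;

    ScoredPassage(String text, double score) {
      this.text = text;
      this.score = score;
    }
  }

  /** Assumed scoring: maps squared Euclidean distance into (0, 1]; identical vectors score 1. */
  static double score(float[] queryVector, float[] passageVector) {
    if (queryVector.length != passageVector.length) {
      throw new IllegalArgumentException("vector dimensions differ");
    }
    double squaredDistance = 0;
    for (int i = 0; i < queryVector.length; i++) {
      double diff = queryVector[i] - passageVector[i];
      squaredDistance += diff * diff;
    }
    return 1.0 / (1.0 + squaredDistance);
  }

  /** Embeds and scores each candidate passage, then returns them best-first. */
  static List<ScoredPassage> rank(EmbeddingModel model, float[] queryVector, List<String> passages) {
    List<ScoredPassage> scored = new ArrayList<>();
    for (String passage : passages) {
      scored.add(new ScoredPassage(passage, score(queryVector, model.embed(passage))));
    }
    scored.sort(Comparator.comparingDouble((ScoredPassage p) -> p.score).reversed());
    return scored;
  }

  public static void main(String[] args) {
    EmbeddingModel model = new MockEmbeddingModel(Map.of(
        "solr supports dense vector search", new float[] {1f, 0f},
        "the weather is nice", new float[] {0f, 1f}));
    float[] queryVector = {0.9f, 0.1f};
    for (ScoredPassage p : rank(model, queryVector,
        List.of("the weather is nice", "solr supports dense vector search"))) {
      System.out.println(p.score + " " + p.text);
    }
  }
}
```

In this sketch the ordering is applied after scoring, which corresponds to the Comparator<Passage> option; applying the model inside the scorer instead would follow the same shape.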

to-do list, non-exhaustive:

  • consideration of parameter details e.g. how to request semantic highlighting and the model(s) to use
  • how to obtain the vector against which passages are compared, e.g. directly from some new parameter, by extraction from the q or hl.q parameter if a knn parser is used, or in some other way?
  • consideration of Passage extraction, e.g. it is currently term-based, but what if the q is a vector query, i.e. has no terms?
  • should the PassageScorer and/or the Comparator<Passage> apply the language model to candidate passages?
  • how to properly and efficiently compute vector distances? currently using Euclidean distance for illustration only.
  • tests
  • documentation
  • ???
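
On the vector distance question, a small plain-Java sketch of why the choice matters: Euclidean distance and cosine similarity can rank the same candidates differently when embeddings are not normalized. The class and method names are hypothetical and this is not how the PR computes distances. Lucene already ships `VectorUtil` and `VectorSimilarityFunction`, which are probably the better basis for the real component than hand-rolled loops like these.

```java
/** Illustrative only: hand-rolled vector comparisons, not Solr or Lucene code. */
public class VectorDistanceSketch {

  private static void checkSameLength(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector dimensions differ");
    }
  }

  /** Euclidean (L2) distance: smaller means closer. */
  static double euclideanDistance(float[] a, float[] b) {
    checkSameLength(a, b);
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /**
   * Cosine similarity: larger means closer, independent of vector length.
   * Returning 0 for a zero vector is an assumption made for this sketch.
   */
  static double cosineSimilarity(float[] a, float[] b) {
    checkSameLength(a, b);
    double dot = 0;
    double normA = 0;
    double normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += (double) a[i] * b[i];
      normA += (double) a[i] * a[i];
      normB += (double) b[i] * b[i];
    }
    if (normA == 0 || normB == 0) {
      return 0;
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  public static void main(String[] args) {
    float[] query = {1f, 0f};
    float[] sameDirectionButLong = {10f, 0f};
    float[] nearbyButRotated = {1f, 1f};

    // Euclidean distance favours the nearby vector ...
    System.out.println(euclideanDistance(query, sameDirectionButLong));
    System.out.println(euclideanDistance(query, nearbyButRotated));

    // ... while cosine similarity favours the one pointing the same way.
    System.out.println(cosineSimilarity(query, sameDirectionButLong));
    System.out.println(cosineSimilarity(query, nearbyButRotated));
  }
}
```

If the model's embeddings are normalized to unit length the two measures produce the same ordering, so whether normalization can be assumed is part of the open question.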
