feat(search): add ngram support for Japanese text analysis in Elastic… #2914
hachimitsu-pichi wants to merge 5 commits into master from
Conversation
Pull request overview
Adds Japanese n-gram analysis support to the Elasticsearch message index to improve search behavior for Japanese text (closes #2826).
Changes:
- Adds a text.ngram multi-field to the ES mapping for message text.
- Introduces an n-gram tokenizer plus index/search analyzers intended for Japanese partial matching.
| "fields": m{ | ||
| "ngram": m{ | ||
| "type": "text", | ||
| "search_analyzer": "ja_ngram_search_analyzer", | ||
| "analyzer": "ja_ngram_index_analyzer", | ||
| }, | ||
| }, |
These mapping/analysis changes are only applied when the index is first created (the code creates the index only when it does not exist). Existing deployments won’t pick up the new text.ngram field/analyzers without deleting/recreating or reindexing into a new index. Consider adding an explicit migration/reindex path (e.g., versioned index name + alias) or documenting the required operational steps.
| "ja_ngram_index_analyzer": m{ | ||
| "type": "custom", | ||
| "char_filter": []string{ | ||
| "normalize", | ||
| }, | ||
| "tokenizer": "ja_ngram_tokenizer", |
Both n-gram analyzers reference a char_filter named normalize, but there is no analysis.char_filter definition in these settings and normalize is not a standard built-in char filter. This will cause index creation to fail with an unknown char filter error. Either define the intended char filter under settings.index.analysis.char_filter or switch to a valid built-in/plugin-provided char filter name, or remove the char_filter entry if it isn't required.
| "fields": m{ | ||
| "ngram": m{ | ||
| "type": "text", | ||
| "search_analyzer": "ja_ngram_search_analyzer", | ||
| "analyzer": "ja_ngram_index_analyzer", | ||
| }, | ||
| }, |
This adds the text.ngram multi-field, but the search query in this file currently searches only the text field (not text.ngram). As a result, the new n-gram analyzer won’t affect search behavior. Consider updating the query to include text.ngram (optionally with a lower boost than text) so Japanese partial matching actually uses the n-gram index.
| "fields": m{ | |
| "ngram": m{ | |
| "type": "text", | |
| "search_analyzer": "ja_ngram_search_analyzer", | |
| "analyzer": "ja_ngram_index_analyzer", | |
| }, | |
| }, |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
        },
    },
    "filter": m{
        "sudachi_split_filter": m{
            "type": "sudachi_split",
The analysis map is closed before filter and analyzer are defined, making filter/analyzer siblings of analysis under index. Elasticsearch requires char_filter/tokenizer/filter/analyzer to all be nested under index.analysis; with the current nesting the index settings JSON will be rejected.
        },
        "tokenizer": "ja_ngram_tokenizer",
        "filter": []string{
            "lowercase",
        }},
The ja_ngram_search_analyzer literal ends with }}, on the same line as the filter slice closing brace, which is not gofmt-formatted. Since gofmt is enforced via golangci config (.golangci.yml:23-26), please run gofmt so the closing braces/commas are formatted correctly.
    },
    "filter": m{
        "sudachi_split_filter": m{
            "type": "sudachi_split",
            "mode": "search",

    "ja_ngram_tokenizer": m{
        "type": "ngram",
        "min_gram": 2,
ja_ngram_tokenizer is declared as a sibling of analysis.tokenizer (right after the tokenizer map is closed). Elasticsearch expects custom tokenizers to be defined inside the analysis.tokenizer object; otherwise analyzers referencing ja_ngram_tokenizer will fail when creating the index.
Apparently Elasticsearch needs to be reindexed for this.
close #2826