Go package defining a common interface for generating text and image embeddings.
godoc is currently incomplete.
This is a simple abstraction library, written in Go, around a variety of services which produce vector embeddings. There are many such libraries and this one is ours. It tries to be the "simplest dumbest" thing for the most common operations and data needs. These ideas are encapsulated in the EmbeddingsRequest and EmbeddingsResponse types.
type EmbeddingsRequest struct {
    Id    string `json:"id,omitempty"`
    Model string `json:"model"`
    Body  []byte `json:"body"`
}
type EmbeddingsResponse[T Float] interface {
    Id() string
    Model() string
    Embeddings() []T
    Dimensions() int32
    Precision() string
    Created() int64
}
The default implementation of the EmbeddingsResponse interface is the CommonEmbeddingsResponse type:
type CommonEmbeddingsResponse[T Float] struct {
    EmbeddingsResponse[T] `json:",omitempty"`
    CommonId         string `json:"id,omitempty"`
    CommonEmbeddings []T    `json:"embeddings"`
    CommonModel      string `json:"model"`
    CommonCreated    int64  `json:"created"`
    CommonPrecision  string `json:"precision"`
}
While not specific to SFO Museum, this package is targeted at the kinds of things SFO Museum needs to do today, meaning it may lack features you need or want.
To account for the fact that most embeddings models still return float32 vector data, while an increasing number of models return float64 vectors, this package wraps both options in a Float interface:
type Float interface{ ~float32 | ~float64 }
That Float is then used as a generic value (for embeddings) in a common EmbeddingsResponse interface:
type EmbeddingsResponse[T Float] interface {
    Id() string
    Model() string
    Embeddings() []T
    Dimensions() int32
    Precision() string
    Created() int64
}
That interface is then used as the return value for an Embedder interface:
type Embedder[T Float] interface {
    TextEmbeddings(context.Context, *EmbeddingsRequest) (EmbeddingsResponse[T], error)
    ImageEmbeddings(context.Context, *EmbeddingsRequest) (EmbeddingsResponse[T], error)
}
This means that you need to specify the float type you want the embedder to return when you instantiate it. For example:

ctx := context.Background()

uri32 := "ollama://?model=embeddinggemma"
uri64 := "encoderfile://"

cl32, _ := embeddings.NewEmbedder[float32](ctx, uri32)
cl64, _ := embeddings.NewEmbedder[float64](ctx, uri64)
There are also handy NewEmbedder32 and NewEmbedder64 functions which are little more than syntactic sugar. For example:

ctx := context.Background()

uri32 := "ollama://?model=embeddinggemma"
uri64 := "encoderfile://"

cl32, _ := embeddings.NewEmbedder32(ctx, uri32)
cl64, _ := embeddings.NewEmbedder64(ctx, uri64)
The NewEmbedder, NewEmbedder32 and NewEmbedder64 functions all have the same signature: a context.Context instance and a URI string used to configure and instantiate the underlying embeddings provider implementation. Provider URIs are discussed in detail below.
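For reference, the constructor signatures implied by the description above look like this (a sketch, not copied from the source; consult the package godoc for the authoritative definitions):

// Sketch of the constructor signatures described above.
func NewEmbedder[T Float](ctx context.Context, uri string) (Embedder[T], error)
func NewEmbedder32(ctx context.Context, uri string) (Embedder[float32], error)
func NewEmbedder64(ctx context.Context, uri string) (Embedder[float64], error)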
Both the TextEmbeddings and ImageEmbeddings methods take the same input, an EmbeddingsRequest struct:
type EmbeddingsRequest struct {
    Id    string `json:"id,omitempty"`
    Model string `json:"model"`
    Body  []byte `json:"body"`
}
As mentioned both methods return an EmbeddingsResponse[T] instance. The default implementation of the EmbeddingsResponse[T] interface used by this package is the CommonEmbeddingsResponse type. See response.go for details.
For example (error handling omitted for the sake of brevity):
package main

import (
    "context"
    "encoding/json"
    "os"

    "github.com/sfomuseum/go-embeddings"
)

func main() {

    ctx := context.Background()

    emb, _ := embeddings.NewEmbedder32(ctx, "ollama://?model=embeddinggemma")

    req := &embeddings.EmbeddingsRequest{
        Body: []byte("Hello world"),
    }

    rsp, _ := emb.TextEmbeddings(ctx, req)

    enc := json.NewEncoder(os.Stdout)
    enc.Encode(rsp)
}
Which would emit something like the following:
{
    "embeddings": [
        -0.21400317549705505,
        0.02651195414364338,
        ... more embeddings
        -0.04678588733077049,
        -0.042774248868227005
    ],
    "model": "ollama/embeddinggemma",
    "created": 1771985811,
    "precision": "float32"
}
The convention for precision values is a string, for example "float32". Typically an embeddings service will return vector embeddings with a single precision, but the Embedder interface allows you to derive embeddings as either float32 or float64 values. In order to preserve the original precision information, if embeddings are requested in a precision other than the one generated by a service the requested precision will be appended to the original value.
For example, if you request float64 values from a service that returns float32 values those data will be recast and the precision string will be updated to read "float32#as-float64".
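Here is a minimal sketch of that behaviour, assuming an Ollama server on its default endpoint serving the embeddinggemma model (which, per the example output above, generates float32 vectors):

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/sfomuseum/go-embeddings"
)

func main() {

    ctx := context.Background()

    // embeddinggemma (via Ollama) generates float32 vectors, per the
    // example output above, but here we ask for float64 values.
    emb, err := embeddings.NewEmbedder64(ctx, "ollama://?model=embeddinggemma")

    if err != nil {
        log.Fatal(err)
    }

    req := &embeddings.EmbeddingsRequest{
        Body: []byte("Hello world"),
    }

    rsp, err := emb.TextEmbeddings(ctx, req)

    if err != nil {
        log.Fatal(err)
    }

    // Per the convention described above, this prints "float32#as-float64".
    fmt.Println(rsp.Precision())
}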
Derive vector embeddings from an instance of the Mozilla encoderfile application, running as an HTTP server.
encoderfile://?{PARAMETERS}
| Name | Value | Required | Notes |
|---|---|---|---|
| client-uri | string | no | The URI for the encoderfile HTTP server endpoint. Default is http://localhost:8080. The gRPC server endpoint provided by encoderfile is not supported yet. |
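For example, a minimal sketch instantiating the encoderfile:// provider (the float64 precision mirrors the earlier example and is an assumption about your model):

ctx := context.Background()

// Assumes an encoderfile HTTP server listening on the default
// http://localhost:8080 endpoint.
emb, _ := embeddings.NewEmbedder64(ctx, "encoderfile://")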
Derive vector embeddings from an instance of the Mozilla llamafile application. Note that newer versions of llamafile no longer expose an interface for deriving embeddings, so this implementation will only work with older builds. See the encoderfile:// implementation for an alternative.
llamafile://?{PARAMETERS}
| Name | Value | Required | Notes |
|---|---|---|---|
| client-uri | string | no | The URI for the llamafile HTTP server endpoint. Default is http://localhost:8080. |
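For example, a minimal sketch instantiating the llamafile:// provider (the non-default port and the float32 precision are assumptions, for illustration only):

ctx := context.Background()

// Override the default http://localhost:8080 endpoint with the
// client-uri parameter (hypothetical port shown).
emb, _ := embeddings.NewEmbedder32(ctx, "llamafile://?client-uri=http://localhost:8081")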
Derive vector embeddings from a Python script using the harperreed/mlx_clip library. This option requires a device with an Apple Silicon chip and involves a non-trivial manual setup process, discussed below.
The setup process for using mlx_clip is involved. The first step is to create a Python virtual environment:
$> python -m venv /usr/local/src/mlxclip
$> cd /usr/local/src/mlxclip
$> source ./bin/activate
Next create some sub-folders used to store dependencies and data:
$> mkdir src
$> mkdir -p data/openai/clip-vit-base-patch32
Then, install the mlx-data package:
$> cd /usr/local/src/mlxclip/src
$> git clone git@github.com:ml-explore/mlx-data.git
$> cd mlx-data
$> ../../bin/python setup.py install
Next, install the MLX clip package from the mlx-examples package:
$> cd /usr/local/src/mlxclip/src
$> git clone git@github.com:ml-explore/mlx-examples.git
$> cd mlx-examples/clip
$> ../../bin/pip install -r requirements.txt
$> ../../bin/python ./convert.py --mlx-path /usr/local/src/mlxclip/data/openai/clip-vit-base-patch32
Finally install the harperreed/mlx_clip package and copy it to the root of your virtual environment:
$> cd /usr/local/src/mlxclip/src
$> git clone git@github.com:harperreed/mlx_clip.git
$> cd mlx_clip
$> ../../bin/pip install -r requirements.txt
$> cp -r mlx_clip ../../
At this point you should be ready to use the command line tools. By the time you read this something may have changed or there may be additional steps to account for your environment. This is what has worked for me so far.
mlxclip://{PATH_TO_EMBEDDINGS_DOT_PY}?{PARAMETERS}
Valid query parameters are:
| Name | Value | Required | Notes |
|---|---|---|---|
| model | string | yes | The path to directory with MLX-compatible model data. |
| python | string | no | The path to the Python runtime to use. For example one created by a Python virtual environment. |
The mlxclip:// scheme will derive embeddings from a command line Python script (details below). For example:
./bin/embeddings \
-client-uri 'mlxclip:///usr/local/src/mlxclip/mlxclip_cli.py?model=/usr/local/src/mlxclip/data/openai/clip-vit-base-patch32&python=/usr/local/src/mlxclip/bin/python' \
image \
test20.jpg
{"embeddings":[0.0049408292,0.034288883,... and so on
Copy the contents of mlxclip_cli_py.txt to /usr/local/src/mlxclip/mlxclip_cli.py.
mlxclip-client://?{PARAMETERS}
Valid query parameters are:
| Name | Value | Required | Notes |
|---|---|---|---|
| server-uri | string | no | The URI of the mlx clip server producing embeddings. Default is http://localhost:5000. |
For example:
$> echo "Hello world" | ./bin/embeddings -client-uri 'mlxclip-client://' text -
{"embeddings":[0.008282159, ... and so on
In addition to the set up steps above you will also need to do the following to set up the server that the (mlx) client will connect to:
$> cd /usr/local/src/mlxclip
$> bin/pip install fastapi uvicorn asyncio
Now copy the contents of mlxclip_server_py.txt to /usr/local/src/mlxclip/mlxclip_server.py. To start the server you would do this (adjusting as necessary for your environment):
$> ./bin/python ./mlxclip_server.py -h
usage: mlxclip_server.py [-h] --model_dir MODEL_DIR [--host HOST] [--port PORT] [--max-workers MAX_WORKERS]
MLX-Clip FastAPI embedding service
options:
-h, --help show this help message and exit
--model_dir MODEL_DIR
Path to MLX CLIP model directory
--host HOST The host the service will listen on.
--port PORT The port the service will listen on.
--max-workers MAX_WORKERS
The maximum number of concurrent mlx processes.
$> ./bin/python ./mlxclip_server.py --model_dir=data/openai/clip-vit-base-patch32/
INFO:__main__:Loading MLX-CLIP model from data/openai/clip-vit-base-patch32/
INFO:mlx_clip:Loading CLIP model from directory: data/openai/clip-vit-base-patch32/
INFO: Started server process [22613]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:5000 (Press CTRL+C to quit)
INFO: 127.0.0.1:60982 - "POST /embeddings/image HTTP/1.1" 200 OK
INFO: 127.0.0.1:60983 - "POST /embeddings/image HTTP/1.1" 200 OK
INFO: 127.0.0.1:60984 - "POST /embeddings HTTP/1.1" 200 OK
Derive vector embeddings from the MobileCLIP models exposed via an instance of the sfomuseum/swift-mobileclip gRPC endpoint.
mobileclip://?{PARAMETERS}
| Name | Value | Required | Notes |
|---|---|---|---|
| client-uri | string | yes | The URI for the swift-mobileclip gRPC server endpoint. Default is grpc://localhost:8080. |
- https://github.com/apple/ml-mobileclip
- https://github.com/sfomuseum/swift-mobileclip
- https://github.com/sfomuseum/go-mobileclip
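A minimal sketch instantiating the mobileclip:// provider described above (the float32 precision is an assumption, for illustration only):

ctx := context.Background()

// The client-uri parameter is required; this is the documented
// default gRPC endpoint.
emb, _ := embeddings.NewEmbedder32(ctx, "mobileclip://?client-uri=grpc://localhost:8080")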
Derive null (empty) vector embeddings. This is a "placeholder" implementation that will always return a zero-length list of embeddings.
null://
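This can be useful for exercising code paths without calling an external service. For example:

ctx := context.Background()

emb, _ := embeddings.NewEmbedder32(ctx, "null://")

req := &embeddings.EmbeddingsRequest{
    Body: []byte("Hello world"),
}

// Per the description above, rsp.Embeddings() is always a
// zero-length list.
rsp, _ := emb.TextEmbeddings(ctx, req)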
Derive vector embeddings from an instance of the Ollama application.
ollama://?{PARAMETERS}
| Name | Value | Required | Notes |
|---|---|---|---|
| client-uri | string | no | Default is http://localhost:11434. |
| model | string | yes | The name of the model to use for generating embeddings. |
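For example, mirroring the example earlier in this document:

ctx := context.Background()

// The model parameter is required; client-uri defaults to
// http://localhost:11434.
emb, _ := embeddings.NewEmbedder32(ctx, "ollama://?model=embeddinggemma")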
Derive vector embeddings from a web service exposing the OpenCLIP model and library.
Create a new Python virtual environment and install the necessary dependencies:
$> cd /usr/local/src
$> python -m venv openclip
$> cd openclip/
$> source bin/activate
$> bin/pip install open_clip_torch Pillow
This option is not supported yet.
openclip-client://?{PARAMETERS}
| Name | Value | Required | Notes |
|---|---|---|---|
| server-uri | string | no | The URI of the HTTP endpoint exposing the OpenCLIP model functionality. Default is http://localhost:5000. |
Derive OpenCLIP embeddings from an HTTP service. For example:
$> echo "hi there" | ./bin/embeddings -client-uri 'openclip-client://' text -
{"embeddings":[-0.24023438,0.09472656,0.12695312, ... and so on
In addition to the set up steps above you will also need to do the following to set up the server that the (openclip) client will connect to:
$> cd /usr/local/src/openclip
$> bin/pip install fastapi uvicorn
Then, copy the included code in openclip_server_py.txt into a file called openclip_server.py and launch it as follows:
$> ./bin/python ./openclip_server.py
INFO: Started server process [67888]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:5000 (Press CTRL+C to quit)
INFO: 127.0.0.1:61064 - "POST /embeddings HTTP/1.1" 200 OK
Derive vector embeddings from a Python script using the Google SigLIP (2) models.
Set up is not yet automated so you'll need to do something like this:
$> cd /usr/local/src
$> python -m venv siglip
$> cd siglip/
$> source bin/activate
$> bin/pip install torch transformers pillow protobuf SentencePiece
siglip://{OPTIONAL_HOST}{PATH_TO_SIGLIP_CLI_PY}?{PARAMETERS}
Valid query parameters are:
| Name | Value | Required | Notes |
|---|---|---|---|
| model | string | yes | The HuggingFace checkpoint URI of the model to use. For example "google/siglip-so400m-patch14-384" |
| python | string | no | The path to the Python runtime to use. For example one created by a Python virtual environment. |
Derive embeddings from a local Python script operating on a siglip model (described below). For example:
$> echo "Hello world" | ./bin/embeddings -client-uri 'siglip://venv/usr/local/src/siglip/embeddings.py?model=google/siglip-base-patch16-224&python=/usr/local/src/siglip/bin/python' text -
{"embeddings":[0.010030805,-0.02573614,0.029724538,... and so on
Copy the siglip_cli_py.txt file to /usr/local/src/siglip/siglip_cli.py (or wherever suits your environment).
siglip-client://?{PARAMETERS}
Valid parameters are:
| Name | Value | Required | Notes |
|---|---|---|---|
| server-uri | string | no | The URI of the HTTP endpoint exposing the SigLIP model functionality. Default is http://localhost:5000. |
Derive siglip embeddings from an HTTP service. For example:
$> ./bin/embeddings -client-uri 'siglip-client://' image test.png
{"embeddings":[-0.017064538,0.00726526,-0.0042089703 ... and so on
In addition to the set up steps above you will also need to do the following to set up the server that the (siglip) client will connect to:
$> cd /usr/local/src/siglip
$> bin/pip install fastapi uvicorn
Copy the contents of siglip_server_py.txt to /usr/local/src/siglip/siglip_server.py. To start the server you would do this (adjusting as necessary for your environment):
$> ./bin/python ./siglip_server.py --model_name google/siglip2-so400m-patch16-naflex
INFO: Started server process [54813]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:5000 (Press CTRL+C to quit)
- https://github.com/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/README_siglip2.md
- https://huggingface.co/google/siglip-base-patch16-224
- https://huggingface.co/google/siglip-so400m-patch14-384
Because so many of the implementations above depend on the availability of external, third-party services, their tests require Go build tags to run. They are:
| Implementation | Build tag |
|---|---|
| encoderfile:// | encoderfile |
| llamafile:// | llamafile |
| mlxclip:// | mlxclip |
| mobileclip:// | mobileclip |
| ollama:// | ollama |
| openclip:// | openclip |
| siglip:// | siglip |
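For example, to run the tests for the ollama:// implementation (assuming a local Ollama server is available) you might do something like:

$> go test -v -tags ollama ./...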