1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -5,6 +5,7 @@ test.ipynb
bin/
obj/
.vs/
.vscode/

# build, distribute, and bins
build/
10 changes: 5 additions & 5 deletions samples/README.md
@@ -1,14 +1,14 @@
# Foundry Local Samples

Explore complete working examples that demonstrate how to use Foundry Local — an end-to-end local AI solution that runs entirely on-device. These samples cover chat completions, audio transcription, tool calling, LangChain integration, and more.
Explore complete working examples that demonstrate how to use Foundry Local — an end-to-end local AI solution that runs entirely on-device. These samples cover chat completions, embeddings, audio transcription, tool calling, LangChain integration, and more.

> **New to Foundry Local?** Check out the [main README](../README.md) for an overview and quickstart, or visit the [Foundry Local documentation](https://learn.microsoft.com/azure/foundry-local/) on Microsoft Learn.

## Samples by Language

| Language | Samples | Description |
|----------|---------|-------------|
| [**C#**](cs/) | 12 | .NET SDK samples including native chat, audio transcription, tool calling, model management, web server, and tutorials. Uses WinML on Windows for hardware acceleration. |
| [**JavaScript**](js/) | 12 | Node.js SDK samples including native chat, audio transcription, Electron desktop app, Copilot SDK integration, LangChain, tool calling, web server, and tutorials. |
| [**Python**](python/) | 9 | Python samples using the OpenAI-compatible API, including chat, audio transcription, LangChain integration, tool calling, web server, and tutorials. |
| [**Rust**](rust/) | 8 | Rust SDK samples including native chat, audio transcription, tool calling, web server, and tutorials. |
| [**C#**](cs/) | 13 | .NET SDK samples including native chat, embeddings, audio transcription, tool calling, model management, web server, and tutorials. Uses WinML on Windows for hardware acceleration. |
| [**JavaScript**](js/) | 13 | Node.js SDK samples including native chat, embeddings, audio transcription, Electron desktop app, Copilot SDK integration, LangChain, tool calling, web server, and tutorials. |
| [**Python**](python/) | 10 | Python samples using the OpenAI-compatible API, including chat, embeddings, audio transcription, LangChain integration, tool calling, web server, and tutorials. |
| [**Rust**](rust/) | 9 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, and tutorials. |
1 change: 1 addition & 0 deletions samples/cs/README.md
@@ -12,6 +12,7 @@ Both packages provide the same APIs, so the same source code works on all platforms.
| Sample | Description |
|---|---|
| [native-chat-completions](native-chat-completions/) | Initialize the SDK, download a model, and run chat completions. |
| [embeddings](embeddings/) | Generate single and batch text embeddings using the Foundry Local SDK. |
| [audio-transcription-example](audio-transcription-example/) | Transcribe audio files using the Foundry Local SDK. |
| [foundry-local-web-server](foundry-local-web-server/) | Set up a local OpenAI-compliant web server. |
| [tool-calling-foundry-local-sdk](tool-calling-foundry-local-sdk/) | Use tool calling with native chat completions. |
48 changes: 48 additions & 0 deletions samples/cs/embeddings/Embeddings.csproj
@@ -0,0 +1,48 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>

<!-- Windows: target Windows SDK for WinML hardware acceleration -->
<PropertyGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))">
<TargetFramework>net9.0-windows10.0.26100</TargetFramework>
<WindowsAppSDKSelfContained>false</WindowsAppSDKSelfContained>
<Platforms>ARM64;x64</Platforms>
<WindowsPackageType>None</WindowsPackageType>
<EnableCoreMrtTooling>false</EnableCoreMrtTooling>
</PropertyGroup>

<!-- Non-Windows: standard .NET -->
<PropertyGroup Condition="!$([MSBuild]::IsOSPlatform('Windows'))">
<TargetFramework>net9.0</TargetFramework>
</PropertyGroup>

<PropertyGroup Condition="'$(RuntimeIdentifier)'==''">
<RuntimeIdentifier>$(NETCoreSdkRuntimeIdentifier)</RuntimeIdentifier>
</PropertyGroup>

<!-- Windows: WinML for hardware acceleration -->
<ItemGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))">
<PackageReference Include="Microsoft.AI.Foundry.Local.WinML" />
</ItemGroup>

<!-- Non-Windows: standard SDK -->
<ItemGroup Condition="!$([MSBuild]::IsOSPlatform('Windows'))">
<PackageReference Include="Microsoft.AI.Foundry.Local" />
</ItemGroup>

<!-- Linux GPU support -->
<ItemGroup Condition="'$(RuntimeIdentifier)' == 'linux-x64'">
<PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" />
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" />
</ItemGroup>

<!-- Shared utilities -->
<ItemGroup>
<Compile Include="../Shared/*.cs" />
</ItemGroup>

</Project>
74 changes: 74 additions & 0 deletions samples/cs/embeddings/Program.cs
@@ -0,0 +1,74 @@
// <complete_code>
// <imports>
using Microsoft.AI.Foundry.Local;
// </imports>

// <init>
var config = new Configuration
{
AppName = "foundry_local_samples",
LogLevel = Microsoft.AI.Foundry.Local.LogLevel.Information
};

// Initialize the singleton instance.
await FoundryLocalManager.CreateAsync(config, Utils.GetAppLogger());
var mgr = FoundryLocalManager.Instance;
// </init>

// <model_setup>
// Get the model catalog
var catalog = await mgr.GetCatalogAsync();

// Get an embedding model
var model = await catalog.GetModelAsync("qwen3-0.6b-embedding") ?? throw new Exception("Embedding model not found");

// Download the model (the method skips download if already cached)
await model.DownloadAsync(progress =>
{
Console.Write($"\rDownloading model: {progress:F2}%");
if (progress >= 100f)
{
Console.WriteLine();
}
});

// Load the model
Console.Write($"Loading model {model.Id}...");
await model.LoadAsync();
Console.WriteLine("done.");
// </model_setup>

// <single_embedding>
// Get an embedding client
var embeddingClient = await model.GetEmbeddingClientAsync();

Check failure on line 43 in samples/cs/embeddings/Program.cs (GitHub Actions / cs-samples, windows and macos): 'IModel' does not contain a definition for 'GetEmbeddingClientAsync' and no accessible extension method 'GetEmbeddingClientAsync' accepting a first argument of type 'IModel' could be found (are you missing a using directive or an assembly reference?)

// Generate a single embedding
Console.WriteLine("\n--- Single Embedding ---");
var response = await embeddingClient.GenerateEmbeddingAsync("The quick brown fox jumps over the lazy dog");
var embedding = response.Data[0].Embedding;
Console.WriteLine($"Dimensions: {embedding.Count}");
Console.WriteLine($"First 5 values: [{string.Join(", ", embedding.Take(5).Select(v => v.ToString("F6")))}]");
// </single_embedding>

// <batch_embedding>
// Generate embeddings for multiple inputs
Console.WriteLine("\n--- Batch Embeddings ---");
var batchResponse = await embeddingClient.GenerateEmbeddingsAsync([
"Machine learning is a subset of artificial intelligence",
"The capital of France is Paris",
"Rust is a systems programming language"
]);

Console.WriteLine($"Number of embeddings: {batchResponse.Data.Count}");
for (var i = 0; i < batchResponse.Data.Count; i++)
{
Console.WriteLine($" [{i}] Dimensions: {batchResponse.Data[i].Embedding.Count}");
}
// </batch_embedding>

// <cleanup>
// Tidy up - unload the model
await model.UnloadAsync();
Console.WriteLine("\nModel unloaded.");
// </cleanup>
// </complete_code>
1 change: 1 addition & 0 deletions samples/js/README.md
@@ -11,6 +11,7 @@ These samples demonstrate how to use the Foundry Local JavaScript SDK (`foundry-local-sdk`)
| Sample | Description |
|--------|-------------|
| [native-chat-completions](native-chat-completions/) | Initialize the SDK, download a model, and run non-streaming and streaming chat completions. |
| [embeddings](embeddings/) | Generate single and batch text embeddings using the Foundry Local SDK. |
| [audio-transcription-example](audio-transcription-example/) | Transcribe audio files using the Whisper model with streaming output. |
| [chat-and-audio-foundry-local](chat-and-audio-foundry-local/) | Unified sample demonstrating both chat and audio transcription in one application. |
| [electron-chat-application](electron-chat-application/) | Full-featured Electron desktop chat app with voice transcription and model management. |
73 changes: 73 additions & 0 deletions samples/js/embeddings/app.js
@@ -0,0 +1,73 @@
// <complete_code>
// <imports>
import { FoundryLocalManager } from 'foundry-local-sdk';
// </imports>

// Initialize the Foundry Local SDK
console.log('Initializing Foundry Local SDK...');

// <init>
const manager = FoundryLocalManager.create({
appName: 'foundry_local_samples',
logLevel: 'info'
});
// </init>
console.log('✓ SDK initialized successfully');

// <model_setup>
// Get an embedding model
const modelAlias = 'qwen3-0.6b-embedding';
const model = await manager.catalog.getModel(modelAlias);

// Download the model
console.log(`\nDownloading model ${modelAlias}...`);
await model.download((progress) => {
process.stdout.write(`\rDownloading... ${progress.toFixed(2)}%`);
});
console.log('\n✓ Model downloaded');

// Load the model
console.log(`\nLoading model ${modelAlias}...`);
await model.load();
console.log('✓ Model loaded');
// </model_setup>

// <single_embedding>
// Create embedding client
console.log('\nCreating embedding client...');
const embeddingClient = model.createEmbeddingClient();
console.log('✓ Embedding client created');

// Generate a single embedding
console.log('\n--- Single Embedding ---');
const response = await embeddingClient.generateEmbedding(
'The quick brown fox jumps over the lazy dog'
);

const embedding = response.data[0].embedding;
console.log(`Dimensions: ${embedding.length}`);
console.log(`First 5 values: [${embedding.slice(0, 5).map(v => v.toFixed(6)).join(', ')}]`);
// </single_embedding>

// <batch_embedding>
// Generate embeddings for multiple inputs
console.log('\n--- Batch Embeddings ---');
const batchResponse = await embeddingClient.generateEmbeddings([
'Machine learning is a subset of artificial intelligence',
'The capital of France is Paris',
'Rust is a systems programming language'
]);

console.log(`Number of embeddings: ${batchResponse.data.length}`);
for (let i = 0; i < batchResponse.data.length; i++) {
console.log(` [${i}] Dimensions: ${batchResponse.data[i].embedding.length}`);
}
// </batch_embedding>

// <cleanup>
// Unload the model
console.log('\nUnloading model...');
await model.unload();
console.log('✓ Model unloaded');
// </cleanup>
// </complete_code>
15 changes: 15 additions & 0 deletions samples/js/embeddings/package.json
@@ -0,0 +1,15 @@
{
"name": "embeddings",
"version": "1.0.0",
"type": "module",
"main": "app.js",
"scripts": {
"start": "node app.js"
},
"dependencies": {
"foundry-local-sdk": "latest"
},
"optionalDependencies": {
"foundry-local-sdk-winml": "latest"
}
}
1 change: 1 addition & 0 deletions samples/python/README.md
@@ -11,6 +11,7 @@ These samples demonstrate how to use Foundry Local with Python.
| Sample | Description |
|--------|-------------|
| [native-chat-completions](native-chat-completions/) | Initialize the SDK, start the local service, and run streaming chat completions. |
| [embeddings](embeddings/) | Generate single and batch text embeddings using the Foundry Local SDK. |
| [audio-transcription](audio-transcription/) | Transcribe audio files using the Whisper model. |
| [web-server](web-server/) | Start a local OpenAI-compatible web server and call it with the OpenAI Python SDK. |
| [tool-calling](tool-calling/) | Tool calling with custom function definitions (get_weather, calculate). |
2 changes: 2 additions & 0 deletions samples/python/embeddings/requirements.txt
@@ -0,0 +1,2 @@
foundry-local-sdk; sys_platform != "win32"
foundry-local-sdk-winml; sys_platform == "win32"
61 changes: 61 additions & 0 deletions samples/python/embeddings/src/app.py
@@ -0,0 +1,61 @@
# <complete_code>
# <imports>
from foundry_local_sdk import Configuration, FoundryLocalManager
# </imports>


def main():
# <init>
# Initialize the Foundry Local SDK
config = Configuration(app_name="foundry_local_samples")
FoundryLocalManager.initialize(config)
manager = FoundryLocalManager.instance

# Select and load an embedding model from the catalog
model = manager.catalog.get_model("qwen3-0.6b-embedding")
model.download(
lambda progress: print(
f"\rDownloading model: {progress:.2f}%",
end="",
flush=True,
)
)
print()
model.load()
print("Model loaded and ready.")

# Get an embedding client
client = model.get_embedding_client()
# </init>

# <single_embedding>
# Generate a single embedding
print("\n--- Single Embedding ---")
response = client.generate_embedding("The quick brown fox jumps over the lazy dog")
embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
# </single_embedding>

# <batch_embedding>
# Generate embeddings for multiple inputs
print("\n--- Batch Embeddings ---")
batch_response = client.generate_embeddings([
"Machine learning is a subset of artificial intelligence",
"The capital of France is Paris",
"Rust is a systems programming language",
])

print(f"Number of embeddings: {len(batch_response.data)}")
for i, data in enumerate(batch_response.data):
print(f" [{i}] Dimensions: {len(data.embedding)}")
# </batch_embedding>

# Clean up
model.unload()
print("\nModel unloaded.")


if __name__ == "__main__":
main()
# </complete_code>
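The Python sample above stops at printing embedding dimensions. The usual next step with embeddings is semantic comparison via cosine similarity. A minimal, SDK-independent sketch in plain Python, using hypothetical toy vectors in place of real embedding outputs:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embedding outputs
v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.3]
v3 = [-0.3, 0.1, -0.2]

print(cosine_similarity(v1, v2))  # identical vectors -> 1.0 (up to float error)
print(cosine_similarity(v1, v3))  # dissimilar vectors -> lower score
```

In practice you would pass `response.data[i].embedding` values from the batch call above into `cosine_similarity` to rank inputs by semantic closeness.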
1 change: 1 addition & 0 deletions samples/rust/Cargo.toml
@@ -4,6 +4,7 @@ members = [
"tool-calling-foundry-local",
"native-chat-completions",
"audio-transcription-example",
"embeddings",
"tutorial-chat-assistant",
"tutorial-document-summarizer",
"tutorial-tool-calling",
1 change: 1 addition & 0 deletions samples/rust/README.md
@@ -11,6 +11,7 @@ These samples demonstrate how to use the Rust binding for Foundry Local.
| Sample | Description |
|--------|-------------|
| [native-chat-completions](native-chat-completions/) | Non-streaming and streaming chat completions using the native chat client. |
| [embeddings](embeddings/) | Generate single and batch text embeddings using the native embedding client. |
| [audio-transcription-example](audio-transcription-example/) | Audio transcription (non-streaming and streaming) using the Whisper model. |
| [foundry-local-webserver](foundry-local-webserver/) | Start a local OpenAI-compatible web server and call it with a standard HTTP client. |
| [tool-calling-foundry-local](tool-calling-foundry-local/) | Tool calling with streaming responses, multi-turn conversation, and local tool execution. |
12 changes: 12 additions & 0 deletions samples/rust/embeddings/Cargo.toml
@@ -0,0 +1,12 @@
[package]
name = "embeddings"
version = "0.1.0"
edition = "2021"
description = "Native SDK embeddings (single and batch) using the Foundry Local Rust SDK"

[dependencies]
foundry-local-sdk = { path = "../../../sdk/rust" }
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }

[target.'cfg(windows)'.dependencies]
foundry-local-sdk = { path = "../../../sdk/rust", features = ["winml"] }