Transcribe audio using AssemblyAI with Semantic Kernel plugins.
Add the AssemblyAI.SemanticKernel NuGet package to your project.
dotnet add package AssemblyAI.SemanticKernel
Next, register the AssemblyAI plugin into your kernel:
using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;
// Build your kernel
var kernelBuilder = Kernel.CreateBuilder();
// add services like LLMs etc.
// Get AssemblyAI API key from env variables, or much better, from .NET configuration
string apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
  ?? throw new Exception("ASSEMBLYAI_API_KEY env variable not configured.");
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
    {
        ApiKey = apiKey,
        PluginName = null,
        AllowFileSystemAccess = false
    });
var kernel = kernelBuilder.Build();
You can configure three options:
- ApiKey: Configure the AssemblyAI API key
- PluginName: Configure the name of the plugin inside of Semantic Kernel. Defaults to "AssemblyAIPlugin".
- AllowFileSystemAccess: Allow the plugin to read files from the file system to upload audio files for transcriptions. Defaults to false.
kernelBuilder.AddAssemblyAIPlugin also has overloads that configure the plugin from .NET configuration or through a lambda.
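The comment in the registration code suggests reading the API key from .NET configuration rather than straight from an environment variable. The following is a minimal sketch of one way to do that, not the plugin's own configuration overload. It assumes the Microsoft.Extensions.Configuration.Json and Microsoft.Extensions.Configuration.EnvironmentVariables packages are installed; the file name appsettings.json and the key name "AssemblyAI:ApiKey" are illustrative choices, not names the plugin requires.

```csharp
using AssemblyAI.SemanticKernel;
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;

// Layer configuration sources; sources added later override earlier ones.
IConfiguration config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true) // assumed file name
    .AddEnvironmentVariables()
    .Build();

// "AssemblyAI:ApiKey" is a hypothetical key chosen for this sketch.
// Fall back to the ASSEMBLYAI_API_KEY environment variable used earlier.
string apiKey = config["AssemblyAI:ApiKey"]
    ?? config["ASSEMBLYAI_API_KEY"]
    ?? throw new InvalidOperationException("AssemblyAI API key is not configured.");

// Register the plugin with the options object, as shown above.
var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
{
    ApiKey = apiKey
});
var kernel = kernelBuilder.Build();
```

With the environment variables provider, a variable named AssemblyAI__ApiKey (double underscore) maps to the "AssemblyAI:ApiKey" key, so the same lookup works for both the JSON file and the environment.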
Invoke the Transcribe function of the AssemblyAI plugin, passing the audio URL in the INPUT argument.
var result = await kernel.InvokeAsync<string>(
    nameof(AssemblyAIPlugin),
    AssemblyAIPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        ["INPUT"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a"
    }
);
Console.WriteLine(result);
You can also upload local audio and video files. To do this:
- Set AssemblyAIPluginOptions.AllowFileSystemAccess to true.
- Configure the INPUT variable with a local file path.
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
    {
        ApiKey = apiKey,
        AllowFileSystemAccess = true
    });
...
var result = await kernel.InvokeAsync<string>(
    nameof(AssemblyAIPlugin), 
    AssemblyAIPlugin.TranscribeFunctionName, 
    new KernelArguments
    {
        ["INPUT"] = "./espn.m4a"
    }
);
Console.WriteLine(result);
You can also invoke the function from within a semantic function like this:
const string prompt = """
                      Here is a transcript:
                      {{AssemblyAIPlugin.Transcribe "https://storage.googleapis.com/aai-docs-samples/espn.m4a"}}
                      ---
                      Summarize the transcript.
                      """;
var result = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine(result);
All the code above explicitly invokes the transcript plugin, but it can also be invoked as part of a plan. Check out the sample project, which uses a plan to transcribe an audio file in addition to explicit invocation.
- The AssemblyAI integration only supports Semantic Kernel with .NET at this moment. If there's demand, we will extend support to other platforms, so let us know!
- Feel free to file an issue in case of bugs or feature requests.
