AI
The definitive, open-source Swift framework for interfacing with generative AI.
[!IMPORTANT] This package is presently in its alpha stage of development (2026-03-04).
Supported Platforms
<p align="left"> <picture> <source media="(prefers-color-scheme: dark)" srcset="Images/macos.svg"> <source media="(prefers-color-scheme: light)" srcset="Images/macos-active.svg"> <img alt="macos" src="Images/macos-active.svg" height="24"> </picture> <picture> <source media="(prefers-color-scheme: dark)" srcset="Images/ios.svg"> <source media="(prefers-color-scheme: light)" srcset="Images/ios-active.svg"> <img alt="macos" src="Images/ios-active.svg" height="24"> </picture> <picture> <source media="(prefers-color-scheme: dark)" srcset="Images/ipados.svg"> <source media="(prefers-color-scheme: light)" srcset="Images/ipados-active.svg"> <img alt="macos" src="Images/ipados-active.svg" height="24"> </picture> <picture> <source media="(prefers-color-scheme: dark)" srcset="Images/tvos.svg"> <source media="(prefers-color-scheme: light)" srcset="Images/tvos-active.svg"> <img alt="macos" src="Images/tvos-active.svg" height="24"> </picture> <picture> <source media="(prefers-color-scheme: dark)" srcset="Images/watchos.svg"> <source media="(prefers-color-scheme: light)" srcset="Images/watchos-active.svg"> <img alt="macos" src="Images/watchos-active.svg" height="24"> </picture> </p>AI
- Installation
  - Swift Package Manager
- Usage
  - Import the framework
  - Initialize an AI Client
  - LLM Clients Abstraction
  - Supported Models
  - Completions
  - DALLE-3 Image Generation
  - Audio
  - Text Embeddings
- Roadmap
- Acknowledgements
- License
Installation
Swift Package Manager
- Open your Swift project in Xcode.
- Go to `File -> Add Package Dependency`.
- In the search bar, enter this URL: `https://github.com/PreternaturalAI/AI.git`.
- Choose the version you'd like to install.
- Click `Add Package`.
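Alternatively, if you manage dependencies through a `Package.swift` manifest, the entry might look roughly like the sketch below. The target name, platforms, and the choice to track `main` are illustrative; this also assumes the library product is exposed as `AI`.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    platforms: [.macOS(.v13), .iOS(.v16)],
    dependencies: [
        // The AI package from PreternaturalAI.
        .package(url: "https://github.com/PreternaturalAI/AI.git", branch: "main")
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                // Assumes the library product is named "AI".
                .product(name: "AI", package: "AI")
            ]
        )
    ]
)
```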
Usage
Import the framework
```swift
import AI
```
Initialize an AI Client
Initialize an instance of an AI API provider of your choice. Here are some examples:
```swift
import AI

// OpenAI / GPT
import OpenAI

let client: OpenAI.Client = OpenAI.Client(apiKey: "YOUR_API_KEY")

// Anthropic / Claude
import Anthropic

let client: Anthropic.Client = Anthropic.Client(apiKey: "YOUR_API_KEY")

// Mistral
import Mistral

let client: Mistral.Client = Mistral.Client(apiKey: "YOUR_API_KEY")

// Groq
import Groq

let client: Groq.Client = Groq.Client(apiKey: "YOUR_API_KEY")

// ElevenLabs
import ElevenLabs

let client: ElevenLabs.Client = ElevenLabs.Client(apiKey: "YOUR_API_KEY")
```
You can now use `client` as an interface to the chosen provider.
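Hard-coding API keys is shown above only for brevity. One option, sketched below with plain Foundation APIs (the environment variable name is just an example, not something the framework requires), is to read the key from the process environment:

```swift
import Foundation
import OpenAI

// Read the key from an environment variable instead of embedding it in source.
guard let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] else {
    fatalError("Set OPENAI_API_KEY before running")
}

let client = OpenAI.Client(apiKey: apiKey)
```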
LLM Clients Abstraction
If you need to abstract out the LLM client (for example, if you want to allow your user to choose between clients), simply initialize an instance of `LLMRequestHandling` with an LLM API provider of your choice. Here are some examples:
```swift
import AI
import OpenAI
import Anthropic
import Mistral
import Groq

// OpenAI / GPT
let client: any LLMRequestHandling = OpenAI.Client(apiKey: "YOUR_API_KEY")

// Anthropic / Claude
let client: any LLMRequestHandling = Anthropic.Client(apiKey: "YOUR_API_KEY")

// Mistral
let client: any LLMRequestHandling = Mistral.Client(apiKey: "YOUR_API_KEY")

// Groq
let client: any LLMRequestHandling = Groq.Client(apiKey: "YOUR_API_KEY")
```
You can now use `client` as an interface to an LLM, independent of the underlying provider.
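For instance, here is a minimal sketch of picking the provider at runtime. The `LLMProvider` enum and `makeClient` helper are hypothetical app-level names, not part of the framework:

```swift
import AI
import OpenAI
import Anthropic

// Hypothetical app-level enum; not part of the AI framework.
enum LLMProvider {
    case openAI
    case anthropic
}

// Returns a provider-agnostic client for the user's choice.
func makeClient(for provider: LLMProvider, apiKey: String) -> any LLMRequestHandling {
    switch provider {
    case .openAI:
        return OpenAI.Client(apiKey: apiKey)
    case .anthropic:
        return Anthropic.Client(apiKey: apiKey)
    }
}

let client: any LLMRequestHandling = makeClient(for: .anthropic, apiKey: "YOUR_API_KEY")
```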
Supported Models
Each AI Client supports multiple models. For example:
```swift
// OpenAI GPT Models
let gpt_4o_Model: OpenAI.Model = .gpt_4o
let gpt_4_Model: OpenAI.Model = .gpt_4
let gpt_3_5_Model: OpenAI.Model = .gpt_3_5
let otherGPTModels: OpenAI.Model = .chat(.gpt_OTHER_MODEL_OPTIONS)

// OpenAI Text Embedding Models
let smallTextEmbeddingsModel: OpenAI.Model = .embedding(.text_embedding_3_small)
let largeTextEmbeddingsModel: OpenAI.Model = .embedding(.text_embedding_3_large)
let adaTextEmbeddingsModel: OpenAI.Model = .embedding(.text_embedding_ada_002)

// Anthropic Models
let claudeHaikuModel: Anthropic.Model = .haiku
let claudeSonnetModel: Anthropic.Model = .sonnet
let claudeOpusModel: Anthropic.Model = .opus

// Mistral Models
let mistralTiny: Mistral.Model = .mistral_tiny
let mistralSmall: Mistral.Model = .mistral_small
let mistralMedium: Mistral.Model = .mistral_medium

// Groq Models
let gemma_7b: Groq.Model = .gemma_7b
let llama3_8b: Groq.Model = .llama3_8b
let llama3_70b: Groq.Model = .llama3_70b
let mixtral_8x7b: Groq.Model = .mixtral_8x7b

// ElevenLabs Models
let multilingualV2: ElevenLabs.Model = .MultilingualV2
let turboV2: ElevenLabs.Model = .TurboV2 // English
let multilingualV1: ElevenLabs.Model = .MultilingualV1
let englishV1: ElevenLabs.Model = .EnglishV1
```
Completions
Basic Completions
Modern Large Language Models (LLMs) operate by receiving a series of inputs, often in the form of messages or prompts, and completing them with the next most probable output, based on the vast amounts of data on which they were trained.
You can use the `LLMRequestHandling.complete(_:model:)` function to generate a chat completion for a specific model of your choice. For example:
```swift
import AI
import OpenAI

let client: any LLMRequestHandling = OpenAI.Client(apiKey: "YOUR_KEY")

// The system prompt is optional.
let systemPrompt: PromptLiteral = "You are an extremely intelligent assistant."
let userPrompt: PromptLiteral = "What is the meaning of life?"

let messages: [AbstractLLM.ChatMessage] = [
    .system(systemPrompt),
    .user(userPrompt)
]

// Each of these parameters is optional.
let parameters = AbstractLLM.ChatCompletionParameters(
    // The maximum token limit (.max) is the default.
    tokenLimit: .fixed(200),
    // Controls the randomness of the result.
    temperatureOrTopP: .temperature(1.2),
    // Stop sequences that tell the model when to stop generating further text.
    stops: ["END OF CHAPTER"],
    // See the function calling section below.
    functions: nil
)

let model: OpenAI.Model = .gpt_4o

do {
    let result: String = try await client.complete(
        messages,
        parameters: parameters,
        model: model,
        as: .string
    )

    return result
} catch {
    print(error)
}
```
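Since `return` only makes sense inside a function, in practice you would typically wrap the call in a small helper. The sketch below uses only the calls shown above; the `askAssistant` name is made up for illustration:

```swift
// A hypothetical helper that wraps the completion call shown above.
func askAssistant(
    _ question: PromptLiteral,
    using client: any LLMRequestHandling
) async throws -> String {
    let messages: [AbstractLLM.ChatMessage] = [
        .system("You are an extremely intelligent assistant."),
        .user(question)
    ]

    return try await client.complete(
        messages,
        model: OpenAI.Model.gpt_4o,
        as: .string
    )
}

// Usage:
// let answer = try await askAssistant("What is the meaning of life?", using: client)
```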
Vision: Image-to-Text
Large language models (LLMs) are rapidly evolving and expanding into multimodal capabilities. This shift signifies a major transformation in the field, as LLMs are no longer limited to understanding and generating text. With vision support, an LLM can take an image as input and provide information about its content.
```swift
import AI
import OpenAI

let client: any LLMRequestHandling = OpenAI.Client(apiKey: "YOUR_KEY")

let systemPrompt: PromptLiteral = "You are a VisionExpertGPT. You will receive an image. Your job is to list all the items in the image and write a one-sentence poem about each item. Make sure your poems are creative, capturing the essence of each item in an evocative and imaginative way."

let userPrompt: PromptLiteral = "List the items in this image and write a short one-sentence poem about each item. Only reply with the items and poems. NOTHING MORE."

// Image or NSImage is supported.
let imageLiteral = try PromptLiteral(image: imageInput)

let model = OpenAI.Model.gpt_4o

let messages: [AbstractLLM.ChatMessage] = [
    .system(systemPrompt),
    .user {
        .concatenate(separator: nil) {
            userPrompt
            imageLiteral
        }
    }
]

let result: String = try await client.complete(
    messages,
    model: model,
    as: .string
)

return result
```
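The snippet above assumes an `imageInput` value is already in scope. On macOS, for example, you might load one from disk with AppKit; the file path below is purely illustrative:

```swift
import AppKit

// Load an image from disk (hypothetical path) and turn it into a prompt literal.
guard let imageInput = NSImage(contentsOfFile: "/path/to/screenshot.png") else {
    fatalError("Could not load the image")
}

let imageLiteral = try PromptLiteral(image: imageInput)
```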
Function Calling
Adding function calling to your completion requests allows your app to receive a structured JSON response from an LLM, ensuring a consistent data format.

To demonstrate how powerful function calling can be, consider a screenshot-organizing app. The PhotoKit API already provides a way to identify photos that are screenshots, so simply fetching the user's screenshots and displaying them in an app is easy enough to accomplish.

But with the power of LLMs, we can also organize the screenshots by category, provide a summary for each one, and offer search across all screenshots by generating clear, detailed text descriptions. In the future, we could add further information, such as extracting any text or links included in a screenshot to make it easily actionable, or even extracting specific elements from it.
To make a function call, we must first imagine a function in our app that would save the screenshot analysis. What parameters does it need? These function parameters are what the LLM function-calling tool will return for us, so that we can call our function:
```swift
// Note that since LLMs are trained mainly on web APIs, mimicking web API function names gives better results.
func addScreenshotAnalysisToDB(
    with title: String,
    summary: String,
    description: String,
    category: String
) {
    // This function does not exist in our app; we pretend that it does so we can use
    // function calling to get a JSON response containing the function parameters.
}
```
```swift
import OpenAI
import CorePersistence
```