FoundationModelsOCR
iOS demo app using Apple’s FoundationModels to extract data from scanned invoices. Combines Vision for image processing with LLM-powered field extraction. Runs fully on-device. Ideal for expense tracking, finance apps, or smart document parsing.
Install / Use
/learn @AviTsadok/FoundationModelsOCR
🧾 Invoice Extraction Demo with Vision & Foundation Models
This lightweight iOS demo shows how to extract structured data from an invoice image using:
- 🔍 Apple's Vision framework for text recognition
- 🧠 Foundation Models for parsing structured data using on-device LLMs
- ✅ Safe, strongly typed output using @Generable and @Guide
📸 What It Does
This app demonstrates the end-to-end pipeline:
- You provide or capture an image of an invoice
- The app uses Vision to extract the printed text from the image
- That raw text is sent into Apple’s on-device Foundation Model
- The model returns structured data representing the invoice, using Swift types
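The text-recognition step above can be sketched roughly as follows. This is a minimal illustration, not the demo app's actual code: the helper name `recognizeText(in:)` is hypothetical, and it assumes the invoice is already available as a `CGImage`.

```swift
import CoreGraphics
import Vision

/// Recognizes printed text in an image and returns it as newline-separated lines.
/// `recognizeText(in:)` is a hypothetical helper name, not taken from the demo app.
func recognizeText(in image: CGImage) throws -> String {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate   // favor accuracy over speed
    request.usesLanguageCorrection = true

    // perform(_:) runs synchronously, so call this off the main thread.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Each observation is one detected text region; keep its top candidate string.
    let observations = request.results ?? []
    return observations
        .compactMap { $0.topCandidates(1).first?.string }
        .joined(separator: "\n")
}
```

The returned string is the raw text that would then be handed to the on-device model in the next step.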
📦 Output Model
The structured output is defined using the @Generable macro and @Guide descriptions to guide the LLM:
    import Foundation
    import FoundationModels

    @Generable
    struct InvoiceItem {
        var name: String
        var price: Decimal
        var quantity: Int
    }

    @Generable
    struct MyInvoice {
        @Guide(description: "The name of the vendor")
        var vendor: String
        @Guide(description: "List of the invoice items")
        var items: [InvoiceItem]
        @Guide(description: "The total invoice amount")
        var totalAmount: Decimal

        // Human-readable summary; the separator sits on its own line.
        var toString: String {
            "Vendor: \(vendor)\n" +
            "Items:\n" +
            items.map(\.name).joined(separator: "\n") +
            "\n------\n" +
            "Total: \(totalAmount)"
        }
    }
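As a rough sketch of how a type like this might be requested from the on-device model, the snippet below wraps a guided-generation call in a generic helper. The names `ExtractionError`, `makePrompt(from:)`, and `extract(_:from:)`, along with the prompt and instruction wording, are assumptions made for illustration and do not come from the demo app.

```swift
import FoundationModels

/// Hypothetical error type for this sketch.
enum ExtractionError: Error {
    case modelUnavailable
}

/// Builds the prompt sent to the model (the wording is an assumption).
func makePrompt(from recognizedText: String) -> String {
    "Extract the invoice details from the following text:\n\n" + recognizedText
}

/// Asks the on-device model to produce a value of any @Generable type
/// from OCR text. Hypothetical helper, not the demo app's implementation.
func extract<Output: Generable>(
    _ type: Output.Type,
    from recognizedText: String
) async throws -> Output {
    // The system model can be unavailable (unsupported device, disabled, still downloading).
    guard case .available = SystemLanguageModel.default.availability else {
        throw ExtractionError.modelUnavailable
    }

    let session = LanguageModelSession(
        instructions: "You extract structured data from OCR text of invoices."
    )

    // Guided generation: the response content is decoded into the requested type.
    let response = try await session.respond(
        to: makePrompt(from: recognizedText),
        generating: type
    )
    return response.content
}
```

With the model above, a call might look like `let invoice = try await extract(MyInvoice.self, from: text)`, after which `invoice.toString` gives a printable summary. Model output can still be wrong, so amounts are worth validating before use.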
