Kflite
A Kotlin Multiplatform library to run TensorFlow Lite models on iOS and Android targets.

Overview
kflite runs TensorFlow Lite (.tflite) models directly from shared Kotlin code. It abstracts platform differences and manages model loading, tensor creation, and inference through a unified API.
Key features:
- Works with Compose Multiplatform composeResources
- Supports model output normalization (YOLO, COCO, PascalVOC, TF formats)
- Enable/disable quantization (<a href="https://huggingface.co/docs/optimum/en/concept_guides/quantization">What's quantization?</a>)
- Image input models (support for NLP models is on the way)
- Selectable delegation (GPU and NNAPI on Android, Metal and CoreML on iOS)
- allowFp16PrecisionForFp32: whether to allow inference with float16 precision for FP32 models
- Inference preference: SUSTAINED_SPEED or FAST_SINGLE_ANSWER
Getting Started
For the fastest setup and a working example, see KfliteSample. It demonstrates a full pipeline.
Installation
Add dependencies
Include the dependency in your shared commonMain dependencies: </br>
implementation("io.github.shadmanadman:kflite:1.1.37")
Configure for iOS
Since KMP doesn’t automatically include CocoaPods dependencies in the consumer's iosApp, you need to add TensorFlow
Lite for iOS manually. </br>
Create a Podfile inside your iosApp with the following pods, then run pod install in the iosApp directory:
target 'iosApp' do
use_frameworks!
platform :ios, '16.0'
pod 'TensorFlowLiteObjC'
pod 'TensorFlowLiteObjC/Metal'
pod 'TensorFlowLiteObjC/CoreML'
end
Run a Model
Step 1 - Place model
Put your .tflite model file in composeResources/files.</br>
You can see an example of model placement in Kflite Sample.
kflite uses Compose Resources to manage assets in a platform-independent way. Your model becomes
available to all targets by converting it into a byte array.
Step 2 - Initialize the model
kflite loads and prepares your model for inference with optional performance parameters. </br>
Call Kflite.init() and load your model as a byte array. This creates an interpreter instance in
memory.
Kflite.init(
    model = Res.readBytes("files/efficientdet-lite2.tflite"),
    options = InterpreterOptions(
        numThreads = 4,
        delegateType = DelegateType.CPU,
        inferencePreferenceType = TFLiteInferencePreference.PLATFORM_DEFAULT,
        allowQuantizedModels = true,
        allowFp16PrecisionForFp32 = false
    )
)
- numThreads: number of threads allocated to the CPU. Default is 4.
- delegateType: selects the hardware acceleration backend (GPU and NNAPI on Android, Metal and CoreML on iOS).
- inferencePreferenceType: the preference for inference speed versus accuracy. The platform default is FAST_SINGLE_ANSWER on Android and WaitTypePassive on iOS.
- allowQuantizedModels: whether to allow inference with quantized models.
- allowFp16PrecisionForFp32: whether to allow float16 precision for FP32 models; speeds up inference if supported by hardware.
Step 3 - Prepare the input data
Kflite works with direct ByteBuffer as input, so you can feed preprocessed inputs or tensors
directly.</br>
Netron is a great tool for visualizing your model: you can inspect your model's
input/output details.</br>
You can see the input tensor shape of your model by calling getInputTensor. Check your model
metadata or use Netron to see the input dimensions and what each one represents.
- To see the number of input tensors your model has, call getInputTensorCount.
The following example shows how to prepare input data for a model that takes an image as input, where the input is a 4D tensor.
Our model's input shape has four dimensions: the batch size, image width, image height, and pixel (channel) size.
- If you already know each dimension, you can hard-code it as a constant, since the shape is fixed.
val batchSize = Kflite.getInputTensor(0).shape[0]        // 1
val inputImageWidth = Kflite.getInputTensor(0).shape[1]  // 448
val inputImageHeight = Kflite.getInputTensor(0).shape[2] // 448
val pixelSize = Kflite.getInputTensor(0).shape[3]        // 3 (RGB channels)
Calculate the input size. This will be used to allocate the byte buffer. A float32 model needs 4 bytes per value:
val floatTypeSize = Float.SIZE_BYTES // 4
val modelInputSize =
    floatTypeSize * inputImageWidth * inputImageHeight * pixelSize
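The buffer-size arithmetic can be sketched in plain Kotlin. This is a minimal JVM-side illustration (ByteBuffer lives in java.nio, so it applies on the Android/JVM target), assuming a float32 RGB model; the helper name is mine, not Kflite's:

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Bytes needed for a float32 image tensor: bytes-per-float
// times one value per channel per pixel.
fun modelInputSize(width: Int, height: Int, pixelSize: Int, bytesPerFloat: Int = 4): Int =
    bytesPerFloat * width * height * pixelSize

fun main() {
    val size = modelInputSize(448, 448, 3)
    println(size) // 2408448

    // TFLite expects a direct buffer in native byte order
    val buffer = ByteBuffer.allocateDirect(size).order(ByteOrder.nativeOrder())
    println(buffer.isDirect) // true
}
```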
Create a ByteBuffer from your input data. This is for image inputs (text inputs will be supported soon). The following example scales an image to match the model input size and converts it into a normalized float array.</br>
- When normalize is true, the code performs image normalization and data-type conversion on the pixel data before feeding it into the byte buffer. This changes the input's data type in the buffer from 8-bit integer to 32-bit floating point. Set it to true only for models that expect input data in the range [0.0, 1.0].
val inputImage = imageResource(Res.drawable.example_model_input)
.toScaledByteBuffer(
inputWidth = inputImageWidth,
inputHeight = inputImageHeight,
inputAllocateSize = modelInputSize,
normalize = false
)
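What normalize = true does can be illustrated in plain Kotlin. This is a hypothetical sketch of the per-pixel conversion, not Kflite's actual implementation: each 8-bit channel of an ARGB pixel is divided by 255 to land in [0.0, 1.0]:

```kotlin
// Hypothetical sketch of per-pixel normalization: unpack an ARGB int
// and map each 8-bit channel from [0, 255] to [0.0, 1.0].
fun pixelToNormalizedRgb(pixel: Int): FloatArray {
    val r = (pixel shr 16) and 0xFF
    val g = (pixel shr 8) and 0xFF
    val b = pixel and 0xFF
    return floatArrayOf(r / 255f, g / 255f, b / 255f)
}

fun main() {
    val white = 0xFFFFFFFF.toInt()
    println(pixelToNormalizedRgb(white).toList()) // [1.0, 1.0, 1.0]
}
```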
Step 4 - Prepare the output data
Create a container that matches the model’s output tensor shape. This gives you a correctly sized
structure to hold the results.
You can see the output tensor shape of your model by calling getOutputTensor. Check your model
metadata or use Netron.</br>
- To see the number of output tensors your model has, call getOutputTensorCount.
The following example shows how to prepare output data for an object detection model that outputs a 3D tensor. Our example model's output shape has three dimensions: the batch size (number of inputs), the number of results, and the bounding-box location.</br>
- If you already know each dimension, you can hard-code it as a constant, since the shape is fixed.
val batchSize = Kflite.getOutputTensor(0).shape[0]        // 1
val numberOfResults = Kflite.getOutputTensor(0).shape[1]  // 25
val detailsPerResult = Kflite.getOutputTensor(0).shape[2] // 4
We then create a matrix to hold the model output.
val modelOutputContainer = Array(batchSize) {
Array(numberOfResults) {
FloatArray(detailsPerResult)
}
}
Step 5 - Run the model
Call Kflite.run() and pass the input and output containers:
Kflite.run() performs inference on the model. You feed it the inputs and provide a container for
outputs, and it returns predictions, detections, or classifications depending on your model type.
- inputs: list of input tensors.
- outputs: a map linking output tensor indices to the containers you created earlier.
If your model supports multiple inputs/outputs, add them to the list.
Kflite.run(
inputs = listOf(inputImage), // The input made in step 3
outputs = mapOf(Pair(0,modelOutputContainer)) // the output prepared in step 4
)
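After run returns, the containers from step 4 hold the results. Here is a self-contained sketch of reading them back; the fake values and the row layout are illustrative only (check your model's metadata for the real layout):

```kotlin
// Illustrative helper: the first detection of the first batch.
fun firstDetection(output: Array<Array<FloatArray>>): FloatArray = output[0][0]

fun main() {
    // Container shaped like step 4: [batchSize][numberOfResults][detailsPerResult]
    val modelOutputContainer = Array(1) { Array(25) { FloatArray(4) } }

    // Pretend Kflite.run() has filled the first row; the meaning of the
    // four values depends on your model
    modelOutputContainer[0][0] = floatArrayOf(10f, 20f, 100f, 120f)

    println(firstDetection(modelOutputContainer).toList()) // [10.0, 20.0, 100.0, 120.0]
}
```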
Step 6 - Close the model
Once inference is complete, call close() to free up resources and safely release the
interpreter, avoiding memory leaks.
Kflite.close()
Once closed, all underlying TFLite interpreters are released.
Normalizing Model Output
Most detection models produce output in model-scaled coordinates.</br>
When you resize your image to match the model input, you need to map the output back to the original size.</br>
This normalization is only necessary when you want to relate the model output to the original image, for example to draw a bounding box on the original image or to modify the original data based on the model output.
You can use these normalization extensions to rescale object detection bounding boxes to the original image.
The normalizedBox will be a data class containing the new coordinates.
val normalizedBox = Normalization(
    originalImageHeight = 1080f, // original input height
    originalImageWidth = 2010f,  // original input width
    modelImagWidth = 448f,       // model input width
    modelImageHeight = 448f      // model input height
).YOLO(
    center_x = 20f, // centerX from the model output
    center_y = 20f, // centerY from the model output
    width = 100f,   // width from the model output
    height = 120f   // height from the model output
)
Other supported normalizations:
- Normalization.pascalVOC(x_min, y_min, x_max, y_max)
- Normalization.coco(x, y, width, height)
- Normalization.yolo(cx, cy, width, height)
- Normalization.tfObjectDetection(top, left, bottom, right)
- Normalization.tfRecordVariant(x_min, y_min, x_max, y_max)
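The rescaling these extensions perform comes down to simple proportional math. Here is a hand-rolled sketch of the YOLO case (the Box class, field names, and corner convention are my own, not Kflite's):

```kotlin
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// Convert a YOLO-style (center, size) box in model coordinates into
// corner coordinates scaled to the original image.
fun yoloToOriginal(
    cx: Float, cy: Float, w: Float, h: Float,
    modelW: Float, modelH: Float,
    originalW: Float, originalH: Float
): Box {
    val scaleX = originalW / modelW
    val scaleY = originalH / modelH
    return Box(
        left = (cx - w / 2f) * scaleX,
        top = (cy - h / 2f) * scaleY,
        right = (cx + w / 2f) * scaleX,
        bottom = (cy + h / 2f) * scaleY
    )
}

fun main() {
    // A box filling a 448x448 model space, rescaled to an 896x896 image
    println(yoloToOriginal(224f, 224f, 448f, 448f, 448f, 448f, 896f, 896f))
    // Box(left=0.0, top=0.0, right=896.0, bottom=896.0)
}
```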
What's next
- Support for NLP models
- Migrate to litert
- Support Kotlin/Native