
Iceframe

A DataFrame-like library and AI agent for working with Apache Iceberg in Python, built on PyIceberg plus natively implemented procedure extensions.

Install / Use

/learn @AlexMercedCoder/Iceframe

IceFrame (Alpha)

A DataFrame-like library for working with Apache Iceberg tables using REST catalogs with local execution.

IceFrame provides a simple, intuitive API for creating, reading, updating, and deleting Iceberg tables, as well as performing maintenance operations and exporting data.

Features

  • DataFrame API: Familiar interface for working with tables
  • Local Execution: Uses PyIceberg, PyArrow, and Polars for efficient local processing
  • Catalog Support: Works with REST catalogs (such as Dremio and Tabular) and supports credential vending
  • CRUD Operations: Create, Read, Update, Delete tables and data
  • Maintenance: Expire snapshots, remove orphan files, compact data files
  • Export: Export data to Parquet, CSV, and JSON
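
To make the "compact data files" item more concrete, here is a minimal sketch of how a bin-pack compaction plan can be computed: small files are grouped so that each group's total size stays at or below a target. This is an illustration of the general technique only, not IceFrame's implementation; `plan_bin_pack` and its parameters are hypothetical names, and the sizes are arbitrary units.

```python
from typing import List


def plan_bin_pack(file_sizes: List[int], target_size: int) -> List[List[int]]:
    """Group file sizes into bins using first-fit decreasing.

    Hypothetical helper for illustration; not part of IceFrame or PyIceberg.
    A file larger than target_size ends up alone in its own bin.
    """
    bins: List[List[int]] = []
    totals: List[int] = []
    # Placing the largest files first tends to leave fewer half-empty bins.
    for size in sorted(file_sizes, reverse=True):
        for i, total in enumerate(totals):
            if total + size <= target_size:
                bins[i].append(size)
                totals[i] += size
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([size])
            totals.append(size)
    return bins


groups = plan_bin_pack([40, 10, 70, 30, 50], target_size=100)
print(groups)  # [[70, 30], [50, 40, 10]]
```

Each inner list stands for a set of small files that would be rewritten into one larger file. A real compaction job would also have to respect partition boundaries and commit the rewrite atomically.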

Documentation

Getting Started

Data Ingestion

Querying & Processing

Table Management

Maintenance & Quality

Advanced Features

Recipes

Installation

pip install iceframe

For cloud storage support:

pip install "iceframe[aws]"   # AWS S3
pip install "iceframe[gcs]"   # Google Cloud Storage
pip install "iceframe[azure]" # Azure Data Lake Storage

Quick Start

  1. Create a .env file with your catalog credentials (see .env.example):
ICEBERG_CATALOG_URI=https://catalog.dremio.cloud/api/iceberg
ICEBERG_TOKEN=your_token
ICEBERG_WAREHOUSE=your_warehouse
ICEBERG_CATALOG_TYPE=rest
  2. Use IceFrame in your code:
from datetime import datetime

import polars as pl

from iceframe import IceFrame
from iceframe.utils import load_catalog_config_from_env

# Initialize
config = load_catalog_config_from_env()
ice = IceFrame(config)

# Create a table
schema = {
    "id": "long",
    "name": "string",
    "created_at": "timestamp"
}
ice.create_table("my_table", schema)

# Append data (use plain datetime values; pl.datetime() builds an expression, not a value)
data = pl.DataFrame({
    "id": [1, 2],
    "name": ["Alice", "Bob"],
    "created_at": [datetime(2024, 1, 1), datetime(2024, 1, 2)]
})
ice.append_to_table("my_table", data)

# Read data
df = ice.read_table("my_table")
print(df)

# Query Builder API
from iceframe.expressions import col
from iceframe.functions import sum

df = (ice.query("my_table")
      .select("name", sum(col("id")).alias("total_id"))
      .group_by("name")
      .execute())
print(df)
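
The variables in the .env file from step 1 map naturally onto catalog properties. The sketch below shows one way such a mapping could look, using only the standard library. It is not the implementation of `load_catalog_config_from_env`; the function name `read_catalog_settings`, the choice of which variables are required, and the output key names are all assumptions made for illustration.

```python
import os
from typing import Dict, Mapping, Optional

# Assumption: the URI and warehouse are treated as mandatory in this sketch.
REQUIRED_KEYS = ("ICEBERG_CATALOG_URI", "ICEBERG_WAREHOUSE")


def read_catalog_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Translate ICEBERG_* variables into a settings dict (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"missing catalog settings: {', '.join(missing)}")
    settings = {
        "uri": env["ICEBERG_CATALOG_URI"],
        "warehouse": env["ICEBERG_WAREHOUSE"],
        # Fall back to a REST catalog when no type is given.
        "type": env.get("ICEBERG_CATALOG_TYPE", "rest"),
    }
    token = env.get("ICEBERG_TOKEN")
    if token:
        settings["token"] = token
    return settings


example = read_catalog_settings({
    "ICEBERG_CATALOG_URI": "https://catalog.dremio.cloud/api/iceberg",
    "ICEBERG_TOKEN": "your_token",
    "ICEBERG_WAREHOUSE": "your_warehouse",
    "ICEBERG_CATALOG_TYPE": "rest",
})
print(example["type"])  # rest
```

Passing the mapping explicitly keeps the function easy to test. Called without arguments, it reads from the process environment, which a .env loader would need to populate first.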

Feature Comparison: IceFrame vs PyIceberg

IceFrame builds on top of PyIceberg, adding high-level abstractions and missing features.

| Feature | PyIceberg (Native) | IceFrame (Enhanced) |
| :--- | :--- | :--- |
| Table CRUD | Low-level API | Simplified create_table, drop_table |
| Data Writing | Arrow/Pandas integration | Polars integration, Auto-schema inference |
| Branching | Basic support (WIP) | create_branch, fast_forward, WAP Pattern |
| Compaction | rewrite_data_files (limited) | bin_pack, sort strategies (Polars-based) |
| Views | Catalog-dependent | Unified ViewManager abstraction |
| Maintenance | expire_snapshots | GarbageCollector, Native remove_orphan_files |
| SQL Support | None | Fluent Query Builder (select, filter, join) |
| Ingestion | add_files | add_files wrapper + Incremental Ingestion recipes |
| Rollback | manage_snapshots | rollback_to_snapshot, rollback_to_timestamp |
| Async | None | AsyncIceFrame for non-blocking I/O |
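
For the Rollback row, rolling back to a timestamp generally comes down to choosing the most recent snapshot committed at or before that time. The following is a minimal sketch of that selection step under that assumption. It does not show how `rollback_to_timestamp` is implemented; `snapshot_as_of` and the `(snapshot_id, committed_at_ms)` tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# (snapshot_id, committed_at_ms): a simplified stand-in for table history.
Snapshot = Tuple[int, int]


def snapshot_as_of(history: List[Snapshot], timestamp_ms: int) -> Optional[int]:
    """Return the id of the latest snapshot committed at or before timestamp_ms."""
    candidates = [snap for snap in history if snap[1] <= timestamp_ms]
    if not candidates:
        # The table had no snapshot yet at the requested time.
        return None
    return max(candidates, key=lambda snap: snap[1])[0]


history = [(101, 1_000), (102, 2_000), (103, 3_000)]
chosen = snapshot_as_of(history, 2_500)
print(chosen)  # 102
```

Returning `None` rather than the oldest snapshot makes the "nothing existed yet" case explicit, so a caller can decide whether to raise an error.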
