SkillAgentSearch skills...

Rmi

A learned index structure

Install / Use

/learn @BenJoyenConseil/Rmi
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

RMI

Go PkgGoDev codecov

A goland implementation of a RMI (Recursive Model Indexes), a Learned Index structure based on the research work by Kraska & al.

Fig 1 from the Case for Learned Index Structures

usage

Create an index and make lookups

// load the age column and parse values into float64 values
ageColumn := extractAgeColumn("data/people.csv")

// create an index over the age column
index := index.New(ageColumn)

// search an age and get back its line position inside the file people.csv
search, _ := strconv.ParseFloat(os.Args[1], 64)
lines, _ := index.Lookup(search)

the main.go file contains an example of a learned index overdata/people.csv age column.

It outputs :

$ cat data/people.csv
name,age,sex
jeanne,90,F
jean,23,M
Carlos,3,M
Carlotta,45,F
Miguel,1,M
Martine,1.5,F
Georgette,23,F

$ go run main.go 23
2020/11/15 20:29:56 People who are 23 years old are located at [8 3] inside data/people.csv 

This is the plot showing the approximation (the linear regression), the cumulative distribution function for each value, and the current age's value (the Keys of the index) :

Fig 2 the LearnedIndex over people.csv

features

  • [x] A simple linear regression model learning the CDF of a float64 array
  • [x] A learned index structure fitted on keys of a collection
  • [x] Finding rows id on a CSV file
  • [x] Return a list of lines matching the key
  • [x] Use max + min error bounding elements to search quickly
  • [x] Benchmarks InMemory LearnedIndex against InMem BinarySearch
  • [x] Store offset lines and a primary key index
  • [ ] Store the sortedTable
  • [x] CLI to create indexes over CSV
  • [ ] Benchmarks Learned against BinarySearchTree
  • [ ] A two layer recursive index
  • [ ] Learn on integer
  • [ ] Index is persistent and durable (on hard drive)
  • [ ] A sort algorythm using learned structure
  • [ ] Learning on string type ?

related works

  • Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The Case for Learned Index Structures. In Proceedings of the 2018 International Conference on Management of Data (SIGMOD '18). Association for Computing Machinery, New York, NY, USA, 489–504. DOI:https://doi.org/10.1145/3183713.3196909

  • Ryan Marcus RMI's reference implementation

Related Skills

View on GitHub
GitHub Stars54
CategoryData
Updated29d ago
Forks4

Languages

Jupyter Notebook

Security Score

100/100

Audited on Mar 7, 2026

No findings