# TreeLearn

Ensembles and Tree Learning Algorithms for Python

## Install / Use
TreeLearn started as a Python implementation of Breiman's Random Forest but is being slowly generalized into a tree ensemble library.
## Creating a Random Forest

A random forest is simply a bagging ensemble of randomized trees. To construct one with default parameters:

```python
forest = treelearn.ClassifierEnsemble(base_model = treelearn.RandomizedTree())
```
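To see what "bagging" means here, a minimal numpy sketch (illustrative only, not TreeLearn's implementation) of the bootstrap sample each tree in the forest would train on:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # 100 examples, 5 features

# Each tree trains on a bootstrap sample: n row indices drawn
# uniformly with replacement from the original n rows.
n = X.shape[0]
bootstrap_idx = rng.integers(0, n, size=n)
X_bag = X[bootstrap_idx]

# A bootstrap sample leaves out roughly 1/e (~37%) of the rows on average.
unique_fraction = len(np.unique(bootstrap_idx)) / n
print(X_bag.shape, round(unique_fraction, 2))
```

Because every tree sees a different resampling of the data (and splits on random feature subsets), averaging their votes reduces variance.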
## Training

Place your training data in an n-by-d numpy array, where n is the number of training examples and d is the dimensionality of your data. Place labels in an n-length numpy array. Then call:

```python
forest.fit(Xtrain, Y)
```
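For example, a toy two-class dataset in the expected shapes (the data here is made up for illustration):

```python
import numpy as np

# Hypothetical dataset: n = 6 examples, d = 2 features.
Xtrain = np.array([[0.0, 1.0],
                   [0.2, 0.9],
                   [0.1, 1.1],
                   [5.0, 4.0],
                   [5.2, 3.9],
                   [4.9, 4.1]])
Y = np.array([0, 0, 0, 1, 1, 1])  # one label per row of Xtrain

print(Xtrain.shape, Y.shape)  # (6, 2) (6,) -- n-by-d and n-length
```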
If you're lazy, there's a helper for simultaneously creating and training a random forest:

```python
forest = treelearn.train_random_forest(X, Y)
```
## Classification

```python
forest.predict(Xtest)
```
## ClassifierEnsemble options

- `base_model` = any classifier which obeys the fit/predict protocol
- `num_models` = size of the forest
- `bagging_percent` = what percentage of your data each classifier is trained on
- `bagging_replacement` = sample with or without replacement
- `stacking_model` = treat outputs of base classifiers as inputs to the given model
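The two bagging options control how each base model's training subset is drawn; a minimal numpy sketch of the two sampling modes (the parameter names mirror the list above, but the function itself is illustrative, not TreeLearn's code):

```python
import numpy as np

def bagging_indices(n, bagging_percent=0.65, bagging_replacement=True, rng=None):
    """Draw the row indices one base model trains on."""
    rng = rng or np.random.default_rng()
    size = int(n * bagging_percent)
    if bagging_replacement:
        return rng.integers(0, n, size=size)        # sample with replacement
    return rng.choice(n, size=size, replace=False)  # sample without replacement

idx = bagging_indices(100, bagging_percent=0.65, bagging_replacement=False,
                      rng=np.random.default_rng(0))
print(len(idx), len(set(idx)))  # 65 65: no repeated rows without replacement
```

With replacement, the same row can appear several times in one model's bag; without replacement, each model sees a plain random subset.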
## RandomizedTree options

- `num_features_per_node` = number of features each node of a tree should consider (default = log2 of total features)
- `min_leaf_size` = stop splitting if we get down to this number of data points
- `max_height` = stop splitting if we exceed this number of tree levels
- `max_thresholds` = how many feature value thresholds to consider (use None for all values)
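The log2 default for `num_features_per_node` is a common heuristic for randomized trees; a sketch of how a node might pick its random feature subset (illustrative, not the library's implementation):

```python
import numpy as np

def random_feature_subset(total_features, num_features_per_node=None, rng=None):
    """Pick the features one tree node is allowed to split on."""
    rng = rng or np.random.default_rng()
    if num_features_per_node is None:
        # log2 heuristic, as in the default above
        num_features_per_node = max(1, int(np.log2(total_features)))
    return rng.choice(total_features, size=num_features_per_node, replace=False)

subset = random_feature_subset(64, rng=np.random.default_rng(0))
print(sorted(subset))  # 6 distinct feature indices, since log2(64) = 6
```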
## ObliqueTree options

- `num_features_per_node` = size of the random feature subset at each node (default = sqrt of total features)
- `C` = tradeoff between error and the L2 regularizer of the linear SVM
- `max_depth` = when this depth is reached, train an SVM on all features and stop splitting the data
- `min_leaf_size` = stop splitting when any subset of the data gets smaller than this
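The difference from `RandomizedTree` is the shape of each split: an oblique node thresholds a linear combination of features rather than a single feature. A tiny numpy sketch, where a fixed hyperplane stands in for the linear SVM that `ObliqueTree` would actually fit:

```python
import numpy as np

# An axis-aligned node tests one feature:  x[j] < t.
# An oblique node tests a hyperplane:      w @ x + b < 0.
w = np.array([1.0, -1.0])  # stand-in for learned SVM weights (illustrative)
b = 0.0

X = np.array([[2.0, 1.0],   # w @ x + b =  1.0 -> right branch
              [1.0, 3.0]])  # w @ x + b = -2.0 -> left branch

goes_left = X @ w + b < 0
print(goes_left)  # [False  True]
```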