FeatureTransforms.jl
Transformations for performing feature engineering in machine learning applications
FeatureTransforms
FeatureTransforms.jl provides utilities for performing feature engineering in machine learning pipelines with support for AbstractArrays and Tables.
Getting Started
There are a few key parts to the FeatureTransforms.jl API. Refer to the documentation for each to learn more.
- Transforms are callable types that define certain operations to be performed on data, for example, normalizing or computing a linear combination. Refer to the Guide to Transforms to learn how they are defined and used on various types of input.
- The apply, apply! and apply_append methods are used to implement Transforms in various ways. Consult the Examples Section for a guide to some typical use cases. See also the example below.
- The Transform Interface is used when you want to encapsulate sequences of Transforms in an end-to-end feature engineering pipeline.
- For a full list of currently implemented Transforms, consult the API.
Installation
julia> using Pkg; Pkg.add("FeatureTransforms")
Quickstart
Load in the dependencies and construct some toy data.
julia> using DataFrames, FeatureTransforms
julia> df = DataFrame(:a=>[1, 2, 3, 4, 5], :b=>[5, 4, 3, 2, 1], :c=>[2, 1, 3, 1, 3])
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      5      2
   2 │     2      4      1
   3 │     3      3      3
   4 │     4      2      1
   5 │     5      1      3
Next, we construct the Transform that we want to perform and apply it to the data.
It can be applied in one of three ways:
- apply, which does not mutate the underlying data,
- apply!, which does mutate the underlying data,
- apply_append, which will apply the transform and then append the result to a copy of the input.
All Transforms support the non-mutating apply and apply_append methods, but any Transform that changes the type or dimension of the input does not support the mutating apply!.
In any case, the return type will be the same as the input, so if you provide an Array you get back an Array, and if you provide a Table you get back a Table.
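To illustrate the Array case, here is a minimal sketch. It assumes that apply and apply! accept a plain Matrix with their default keyword arguments (no dims or inds given), so the Transform acts element-wise on the whole array. The names A and B are illustrative only.

```julia
using FeatureTransforms

A = [1 2; 3 4]
p = Power(2)

# Non-mutating: A is left untouched and a new Array is returned.
B = FeatureTransforms.apply(A, p)

# Mutating: the elements of A are overwritten in place.
# Power keeps the element type and shape, so apply! is expected to be supported.
FeatureTransforms.apply!(A, p)

# Both results are Arrays, matching the input type.
println(typeof(B))
println(A == B)
```

The same calls with a Table as input would be expected to return a Table instead.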
Here we are working with a DataFrame, so the return will always be a DataFrame:
julia> p = Power(3);
julia> FeatureTransforms.apply(df, p; cols=[:a], header=[:a3])
5×1 DataFrame
 Row │ a3
     │ Int64
─────┼───────
   1 │     1
   2 │     8
   3 │    27
   4 │    64
   5 │   125
julia> FeatureTransforms.apply_append(df, p; cols=[:a], header=[:a3])
5×4 DataFrame
 Row │ a      b      c      a3
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      5      2      1
   2 │     2      4      1      8
   3 │     3      3      3     27
   4 │     4      2      1     64
   5 │     5      1      3    125
julia> FeatureTransforms.apply!(df, p; cols=[:a])
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      5      2
   2 │     8      4      1
   3 │    27      3      3
   4 │    64      2      1
   5 │   125      1      3
As an extra convenience, you can call a Transform instance directly, which emulates calling apply:
julia> ohe = OneHotEncoding(1:3);
julia> lc = LinearCombination([1, -10]);
julia> ohe_df = ohe(df; cols=[:c], header=[:cat1, :cat2, :cat3]);
julia> lc_df = lc(df; cols=[:a, :b], header=[:ab]);
julia> df = hcat(df, lc_df, ohe_df)
5×7 DataFrame
 Row │ a      b      c      ab     cat1   cat2   cat3
     │ Int64  Int64  Int64  Int64  Bool   Bool   Bool
─────┼─────────────────────────────────────────────────
   1 │     1      5      2    -49  false   true  false
   2 │     8      4      1    -32   true  false  false
   3 │    27      3      3     -3  false  false   true
   4 │    64      2      1     44   true  false  false
   5 │   125      1      3    115  false  false   true
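The ab and cat columns above can be checked by hand. The following sketch uses plain Julia broadcasting rather than FeatureTransforms itself. It assumes that LinearCombination([1, -10]) computes 1*a + (-10)*b row by row and that OneHotEncoding(1:3) produces one Boolean column per category. The variable names are illustrative.

```julia
# Column a as it stands after the earlier apply!(df, Power(3); cols=[:a]) call.
a = [1, 8, 27, 64, 125]
b = [5, 4, 3, 2, 1]
c = [2, 1, 3, 1, 3]

# Linear combination with coefficients [1, -10], computed element-wise.
coefficients = [1, -10]
ab = coefficients[1] .* a .+ coefficients[2] .* b

# One-hot encoding over the categories 1:3: one Boolean vector per category.
cat1, cat2, cat3 = (c .== k for k in 1:3)

println(ab)    # [-49, -32, -3, 44, 115]
println(cat1)  # true where c == 1
```

Note that ab is computed from the cubed values of a, because apply! mutated df earlier in the session.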
