Forestplot
A Python package to make publication-ready but customizable coefficient plots.
This package makes it easy to produce publication-ready forest plots out of the box. Users provide a dataframe (e.g. from a spreadsheet) in which each row corresponds to a variable or study, with columns for the estimates, variable labels, and lower and upper confidence interval limits.
Additional options let you add other columns of the dataframe to the plot as annotations.
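As a rough illustration of the expected input, the sketch below builds such a dataframe by hand with pandas. The column names (`label`, `r`, `ll`, `hl`) mirror the package's companion "sleep" example data, and the values are rounded from its first three rows. Other column names should work equally well, assuming they are passed to the matching `forestplot` arguments.

```python
import pandas as pd

# Three rows, rounded from the package's "sleep" example data.
# One row per variable; columns hold the label, estimate, and CI limits.
df = pd.DataFrame(
    {
        "label": ["in years", "=1 if black", "=1 if clerical worker"],
        "r": [0.09, -0.03, 0.05],    # point estimates
        "ll": [0.02, -0.10, -0.03],  # lower confidence limits
        "hl": [0.16, 0.05, 0.12],    # upper confidence limits
    }
)

# Sanity check before plotting: every interval should bracket its estimate.
brackets = (df["ll"] <= df["r"]) & (df["r"] <= df["hl"])
assert brackets.all(), "each estimate must lie within [ll, hl]"
```

A check like this is optional, but it catches swapped or mislabeled limit columns before they end up in a figure.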
<!------------------------- INSTALLATION ------------------------->
Installation
With pip:
pip install forestplot
With conda:
conda install forestplot
From source:
git clone https://github.com/LSYS/forestplot.git
cd forestplot
pip install .
Developer installation
git clone https://github.com/LSYS/forestplot.git
cd forestplot
pip install -r requirements_dev.txt
make lint
make test
<p align="right">(<a href="#top">back to top</a>)</p>
<!------------------------- QUICK START ------------------------->
Quick Start
import forestplot as fp
df = fp.load_data("sleep") # companion example data
df.head(3)
|    | var      |          r |   moerror | label                 | group         |    ll |   hl |   n |    power |     p-val |
|---:|:---------|-----------:|----------:|:----------------------|:--------------|------:|-----:|----:|---------:|----------:|
|  0 | age      |  0.0903729 | 0.0696271 | in years              | age           |  0.02 | 0.16 | 706 | 0.671578 | 0.0163089 |
|  1 | black    | -0.0270573 | 0.0770573 | =1 if black           | other factors | -0.1  | 0.05 | 706 | 0.110805 | 0.472889  |
|  2 | clerical |  0.0480811 | 0.0719189 | =1 if clerical worker | occupation    | -0.03 | 0.12 | 706 | 0.247768 | 0.201948  |
(* This is a toy example of how certain factors correlate with the amount of sleep one gets. See the notebook that generates the data.)
<details><summary><i>The example input dataframe above has 4 key columns</i></summary>

| Column | Description | Required |
|:----------|:------------------------------------------------|:----------|
| var | Variable name | ✓ |
| r | Correlation coefficients (estimates to plot) | ✓ |
| label | Variable labels | ✓ |
| group | Variable grouping labels | |
| ll | Conf. int. lower limits | |
| hl | Conf. int. higher limits | |
| n | Sample size | |
| power | Statistical power | |
| p-val | P-value | |
(See Gallery and API Options for more details on required and optional arguments.)
</details>

Make the forest plot
fp.forestplot(df,  # the dataframe with results data
              estimate="r",  # col containing estimated effect size
              ll="ll", hl="hl",  # columns containing conf. int. lower and higher limits
              varlabel="label",  # column containing variable label
              ylabel="Confidence interval",  # y-label title
              xlabel="Pearson correlation",  # x-label title
              )
<p align="left"><img width="75%" src="https://raw.githubusercontent.com/LSYS/forestplot/main/docs/images/vanilla.png"></p>
Save the plot
import matplotlib.pyplot as plt
plt.savefig("plot.png", bbox_inches="tight")
<p align="right">(<a href="#top">back to top</a>)</p>
<!------------------ EXAMPLES of CUSTOMIZATIONS ------------------>
Some Examples With Customizations
- Add variable groupings, add group order, and sort by estimate size.
fp.forestplot(df,  # the dataframe with results data
              estimate="r",  # col containing estimated effect size
              ll="ll", hl="hl",  # columns containing conf. int. lower and higher limits
              varlabel="label",  # column containing variable label
              capitalize="capitalize",  # Capitalize labels
              groupvar="group",  # Add variable groupings
              # group ordering
              group_order=["labor factors", "occupation", "age", "health factors",
                           "family factors", "area of residence", "other factors"],
              sort=True  # sort in ascending order (sorts within group if group is specified)
              )
<p align="left"><img width="75%" src="https://raw.githubusercontent.com/LSYS/forestplot/main/docs/images/group-grouporder-sort.png"></p>
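To make the row ordering concrete, here is a plain-pandas sketch of what "group order first, then ascending estimate within each group" amounts to. It is not the package's internal implementation, and the four rows and their values are made up for illustration. Only the `group` and `r` column names follow the example data.

```python
import pandas as pd

# Hypothetical rows: two groups, two variables each.
df = pd.DataFrame(
    {
        "label": ["a", "b", "c", "d"],
        "group": ["occupation", "age", "occupation", "age"],
        "r": [0.05, 0.09, -0.02, 0.01],
    }
)

# The desired group ordering, analogous to the group_order argument.
group_order = ["occupation", "age"]

# An ordered categorical makes sort_values follow group_order
# instead of alphabetical order.
df["group"] = pd.Categorical(df["group"], categories=group_order, ordered=True)

# Sort by group first, then by estimate (ascending) within each group.
ordered = df.sort_values(["group", "r"]).reset_index(drop=True)
print(ordered)
```

Here the two "occupation" rows come first (c, then a), followed by the two "age" rows (d, then b). Whether ascending order runs top-to-bottom or bottom-to-top in the rendered figure is not something this sketch asserts.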
- Add p-values on the right and color alternate rows gray
fp.forestplot(df,  # the dataframe with results data
              estimate="r",  # col containing estimated effect size
              ll="ll", hl="hl",  # columns containing conf. int. lower and higher limits
              varlabel="label",  # column containing variable label
              capitalize="capitalize",  # Capitalize labels
              groupvar="group",  # Add variable groupings
              # group ordering
              group_order=["labor factors", "occupation", "age", "health factors",
                           "family factors", "area of residence", "other factors"],
              sort=True,  # sort in ascending order (sorts within group if group is specified)
              pval="p-val",  # Column of p-value to be reported on right
