GBFS4MPPML
Official implementation of "Gradient Boosted and Statistical Feature Selection Workflow for Materials Property Predictions"
J. Chem. Phys. 159, 194106 (2023)
By S. G. Jung, G. Jung & J. M. Cole
Introduction
The scripts herein are used to generate the results presented in the aforementioned paper.
Jupyter Notebooks are provided along with the scripts. Their main purpose is to demonstrate the functionalities contained therein.
For each property there are two Jupyter Notebooks:
(i) Featurize
This notebook demonstrates the process of generating the features using various descriptors as mentioned in the corresponding manuscript. The descriptors we use are widely recognised and there are various ways one can generate these features. This step can be skipped if one already has a list of features they wish to use for their chemical data.
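As a rough illustration of what a featurization step produces, the sketch below turns a chemical formula into a few composition-based features. It is not the code from the notebooks: the function names (`parse_formula`, `featurize`), the feature names, and the small `ATOMIC_MASS` table are assumptions made for this example, and the descriptors used in the manuscript are considerably richer.

```python
import re
from typing import Dict

# Standard atomic masses (u) for a handful of elements.
# Illustrative subset only; a real featurizer would cover the periodic table.
ATOMIC_MASS = {
    "H": 1.008, "O": 15.999, "Na": 22.990,
    "Cl": 35.45, "Ti": 47.867, "Fe": 55.845,
}


def parse_formula(formula: str) -> Dict[str, float]:
    """Parse a simple formula such as 'Fe2O3' into {element: amount}.

    Hypothetical helper: brackets and hydrates are not supported.
    """
    counts: Dict[str, float] = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def featurize(formula: str) -> Dict[str, float]:
    """Return a small dictionary of composition-based features (hypothetical names)."""
    counts = parse_formula(formula)
    total = sum(counts.values())
    fractions = {el: n / total for el, n in counts.items()}
    masses = [ATOMIC_MASS[el] for el in counts]
    return {
        "n_elements": len(counts),
        # Atomic mass averaged over the atomic fractions of the composition
        "mean_atomic_mass": sum(fractions[el] * ATOMIC_MASS[el] for el in counts),
        "max_atomic_mass": max(masses),
        "min_atomic_mass": min(masses),
    }


if __name__ == "__main__":
    for formula in ("Fe2O3", "NaCl", "TiO2"):
        print(formula, featurize(formula))
```

Each formula becomes one row of numeric features, which is the form a feature-selection step would expect as input.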
(ii) GBFS
This notebook goes through the proposed workflow, as illustrated in the figure below. The approach we have taken is to use a pre-defined local path, where relevant data are stored and new data files are saved. See the provided Jupyter Notebooks for examples. Each function requires pre-defined parameters, such as the name of the target variable, a list of features, and the type of problem.
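The pattern of a pre-defined local path plus pre-defined parameters can be sketched as follows. This is a minimal sketch, not the repository's API: `RunConfig`, `save_table`, the field names, the two accepted problem types, and the `band_gap` target are all hypothetical and chosen only for illustration.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List


@dataclass
class RunConfig:
    """Hypothetical container for the parameters each step needs."""
    path_to_save: Path      # local directory where data files are read and written
    target: str             # name of the target variable
    features: List[str]     # list of feature names
    problem: str            # assumed here to be "regression" or "classification"

    def __post_init__(self) -> None:
        if self.problem not in ("regression", "classification"):
            raise ValueError(f"Unknown problem type: {self.problem!r}")
        # Make sure the local path exists before any step writes to it
        self.path_to_save = Path(self.path_to_save)
        self.path_to_save.mkdir(parents=True, exist_ok=True)


def save_table(config: RunConfig, rows: List[Dict[str, object]], filename: str) -> Path:
    """Write the configured feature and target columns to a CSV under the local path."""
    columns = config.features + [config.target]
    out = config.path_to_save / filename
    with out.open("w", newline="") as fh:
        # Columns not listed in the configuration are dropped silently
        writer = csv.DictWriter(fh, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return out


if __name__ == "__main__":
    config = RunConfig(
        path_to_save=Path("results"),   # hypothetical local path
        target="band_gap",              # hypothetical target name
        features=["n_elements", "mean_atomic_mass"],
        problem="regression",
    )
    rows = [{"n_elements": 2, "mean_atomic_mass": 31.9, "band_gap": 2.1}]
    print(save_table(config, rows, "features.csv"))
```

Keeping the path, target, feature list, and problem type in one object means every step reads from and writes to the same location with the same column definitions.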
Data
The data sets are available from: ![Table of Datasets]
Workflow
The overview of the project pipeline:

Acknowledgements
J.M.C. conceived the overarching project. S.G.J. and J.M.C. designed the study. S.G.J. developed the workflow, performed the data acquisition and featurization, the statistical analyses, the model pre-training and fine-tuning, and analysed the data under the Ph.D. supervision of J.M.C. G.J. assisted with the data gathering and the development of artificial neural networks for the material-property predictions. S.G.J. drafted the manuscript with assistance from J.M.C. All authors read and approved the final agreed manuscript.
J.M.C. is grateful for the BASF/Royal Academy of Engineering Research Chair in Data-Driven Molecular Engineering of Functional Materials, which is partly sponsored by the Science and Technology Facilities Council (STFC) via the ISIS Neutron and Muon Source; this chair is supported by a PhD studentship (for S.G.J.). STFC is also thanked for a PhD studentship that is sponsored by its Scientific Computing Department (for G.J.).
