Cathodedataextractor
A document-level information extraction pipeline for layered cathode materials for sodium-ion batteries.
CathodeDataExtractor
CathodeDataExtractor is a lightweight document-level information extraction pipeline. It automatically extracts
synthesis parameters and cycling and rate performance data for cathode materials from the literature on
layered cathode materials for sodium-ion batteries.
Installation
pip install cathodedataextractor
Features
- It is built on open-source libraries: pymatgen, text2chem, and ChemDataExtractor v2 with some modifications.
- A BatterySciBERT-uncased multi-label text classification model for filtering documents.
- An automated, comprehensive data extraction pipeline for cathode materials.
- Paragraph-level multi-class classification algorithms for HTML/XML documents from the RSC and Elsevier.
- A normalised entity handling process.
- An effective chemical abbreviation detection module.
- A heuristic multi-level relation extraction algorithm for electrochemical properties.
The pipeline can also extract from plain text strings.
Quick start
Extract from documents
from glob import iglob
from cathodedataextractor.information_extraction_pipe import Pipeline

pipeline = Pipeline()
# '*ml' matches file names ending in "ml" (e.g. .html and .xml) in the current directory
for document in iglob('*ml'):
    extraction_results = pipeline.extract(document)
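The loop above overwrites extraction_results on every iteration. To keep the results of every document, one option is to collect them in a dictionary keyed by file name. The sketch below is an illustration only: extract_all is a hypothetical helper, not part of cathodedataextractor, and it takes the extraction function as an argument so that it makes no assumption about what the pipeline returns.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def extract_all(directory: str, extract: Callable[[str], Any]) -> Dict[str, Any]:
    """Run `extract` on every file in `directory` whose name ends in 'ml'.

    Hypothetical helper for illustration; `extract` is any callable that
    accepts a file path, for example the `extract` method of a Pipeline.
    """
    results = {}
    # Same '*ml' pattern as the quick start: matches .html and .xml files.
    for path in sorted(Path(directory).glob('*ml')):
        results[path.name] = extract(str(path))
    return results
```

With the package installed, this could be called as extract_all('papers', Pipeline().extract), where 'papers' is a placeholder directory name.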
Extract from string
from cathodedataextractor.information_extraction_pipe import Pipeline
extraction_results = Pipeline.from_string(
'Apart from the conventional cationic redox of transition metals, '
'both Na-deficit and Na-excess materials have showcased the ability '
'to exploit oxygen redox activity as O2–/O2n– for a charge '
'compensation mechanism. To realize cathodes with enhanced energy '
'density, a technique like the incorporation of alkali metal ions '
'into transition metal layers has been adopted. Recent work by Boisse '
'(13) et al. displayed the impact of honeycomb cation ordering of '
'a highly stabilized intermediate phase for a Na2RuO3 cathode material '
'in instigating the anionic redox activity and providing a capacity '
'of 180 mAh g–1 at 0.2C with a capacity retention of 89% for over '
'50 cycles. More devoted efforts to realize the utmost potential '
'of anionic redox ought to be carried out in the future.')
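To give a feel for the kind of electrochemical values contained in a passage like the one above, here is a minimal regex sketch. It is not the library's relation extraction algorithm and does not use its API; the pattern names and the extract_properties function are assumptions made up for this illustration, and the patterns only cover the phrasing of this one sentence.

```python
import re

# The performance sentence from the example passage above.
TEXT = ('providing a capacity of 180 mAh g–1 at 0.2C with a capacity '
        'retention of 89% for over 50 cycles.')

# Hypothetical patterns, one per property; each captures a single number.
PATTERNS = {
    'capacity_mAh_per_g': r'(\d+(?:\.\d+)?)\s*mAh\s*g[–-]1',
    'c_rate': r'(\d+(?:\.\d+)?)\s*C\b',
    'retention_percent': r'retention of (\d+(?:\.\d+)?)\s*%',
    'cycles': r'(\d+)\s*cycles',
}


def extract_properties(text):
    """Return the first match of each pattern as a float."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = float(match.group(1))
    return found


print(extract_properties(TEXT))
```

A flat pattern search like this cannot tell which material a value belongs to, or which capacity goes with which rate when a paragraph reports several. Linking values to materials and test conditions is the job of the relation extraction step in the pipeline.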
Issues?
You can either report an issue on GitHub or contact me directly. Try gouyx@mail2.sysu.edu.cn.
Citing
If the source code turns out to be helpful to your research, please cite the following work:
Gou, Y., Zhang, Y., Zhu, J. et al. A document-level information extraction pipeline for layered cathode materials for sodium-ion batteries. Sci Data 11, 372 (2024).
