GeometryZero

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

<div align="center"> <a href="https://github.com/ekonwang/GeometryZero">💻 Code</a> | <a href="https://arxiv.org/abs/2506.07160">📃 Paper</a> | <a href="https://huggingface.co/papers/2506.07160">🤗 Hugging Face</a> </div>

Recent advances in large language models (LLMs) have demonstrated remarkable capabilities across diverse domains, particularly in mathematical reasoning. Geometry problem solving nevertheless remains a challenging area, one in which auxiliary construction plays an essential role. Existing approaches either achieve suboptimal performance or rely on very large LLMs (e.g., GPT-4o), incurring massive computational costs. We posit that reinforcement learning with verifiable reward (e.g., GRPO) offers a promising direction for training smaller models that effectively combine auxiliary construction with robust geometric reasoning. However, directly applying GRPO to geometric reasoning presents fundamental limitations due to its dependence on unconditional rewards, which leads to indiscriminate and counterproductive auxiliary constructions. To address these challenges, we propose Group Contrastive Policy Optimization (GCPO), a novel reinforcement learning framework featuring two key innovations: (1) Group Contrastive Masking, which adaptively provides positive or negative reward signals for auxiliary construction based on contextual utility, and (2) a length reward that promotes longer reasoning chains. Building on GCPO, we develop GeometryZero, a family of affordable-size geometric reasoning models that judiciously determine when to employ auxiliary construction. Our extensive empirical evaluation across popular geometric benchmarks (Geometry3K, MathVista) demonstrates that GeometryZero models consistently outperform baselines (e.g., GRPO), achieving an average improvement of 4.29% across all benchmarks.
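The contrast between unconditional and context-dependent rewards can be illustrated with a small sketch. This is not the repository's implementation: the function names, the `margin` and `aux_weight` parameters, and the exact masking rule are assumptions made for illustration, and the length reward is left out. The only part taken as given is the GRPO-style step of standardizing rewards within a sampled group of responses. See the paper for the actual formulation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def contrastive_mask(correct, uses_aux, margin=0.0):
    """Hypothetical mask: +1 if responses that use auxiliary construction
    are more accurate than those that do not, -1 if they are less accurate,
    and 0 if the gap is within the margin or one side is empty."""
    with_aux = [float(c) for c, a in zip(correct, uses_aux) if a]
    without_aux = [float(c) for c, a in zip(correct, uses_aux) if not a]
    if not with_aux or not without_aux:
        return 0
    gap = mean(with_aux) - mean(without_aux)
    if gap > margin:
        return 1
    if gap < -margin:
        return -1
    return 0


def shaped_rewards(correct, uses_aux, aux_weight=0.5):
    """Accuracy reward plus a masked bonus or penalty (assumed weighting)
    applied only to responses that used auxiliary construction."""
    mask = contrastive_mask(correct, uses_aux)
    return [
        float(c) + aux_weight * mask * float(a)
        for c, a in zip(correct, uses_aux)
    ]


# Four sampled responses to one problem; the two that drew auxiliary
# lines were correct, so auxiliary construction is rewarded here.
correct = [1, 1, 0, 0]
uses_aux = [True, True, False, False]
rewards = shaped_rewards(correct, uses_aux)
advantages = group_relative_advantages(rewards)
```

Under this reading, the same auxiliary construction earns a bonus on problems where it helps the group and a penalty on problems where it hurts, in contrast to an unconditional reward that would encourage it everywhere.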

Citation

Please consider citing our paper and starring this repo if you find them helpful. Thank you!

@misc{wang2025geometryzeroimprovinggeometrysolving,
      title={GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization}, 
      author={Yikun Wang and Yibin Wang and Dianyi Wang and Zimian Peng and Qipeng Guo and Dacheng Tao and Jiaqi Wang},
      year={2025},
      eprint={2506.07160},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.07160}, 
}
