Results for "contextual-evaluation"

34 skills found · Page 1 of 2

AIPHES / Emnlp19 Moverscore

213

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

zed

Updated 12d ago

nelson-liu / Contextual Repr Analysis

211

A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).

universal

Updated 6d ago

jxmorris12 / Cde

202

Code for training and evaluating Contextual Document Embedding models

universal

embeddings, retrieval

Updated 21d ago

gabrielsoltz / Metahub

178

MetaHub is an automated contextual security findings enrichment and impact evaluation tool for vulnerability management.

universal

asff, aws, security, +1

Updated 11d ago

Weixin-Liang / MetaShift

108

MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)

universal

Updated 1mo ago

Nth-iteration-labs / Contextual

Contextual Bandits in R - simulation and evaluation of Multi-Armed Bandit Policies

universal

bandit, bandit-experiments, bandit-learning, +17

Updated 6d ago

abietti / Cb Bakeoff

Scripts for evaluation of contextual bandit algorithms

universal

Updated 1y ago

jiazhen-code / PhD

[CVPR25 Highlight] A ChatGPT-Prompted Visual hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced evaluation modes. The dataset includes extensive contextual descriptions, counterintuitive images, and clear indicators of hallucination items.

universal

evaluation, hallucination, mllm

Updated 12d ago

google-research-datasets / Eth Py150 Open

A redistributable subset of the ETH Py150 corpus [https://www.sri.inf.ethz.ch/py150], introduced in the ICML 2020 paper 'Learning and Evaluating Contextual Embedding of Source Code' [https://proceedings.icml.cc/static/paper_files/icml/2020/5401-Paper.pdf].

universal

Updated 7mo ago

debiai / DebiAI

Bias detection and contextual evaluation tool for your AI projects

universal

ai, bias, contextual-evaluation, +9

Updated 4mo ago

NASA-IMPACT / PyQuARC

The pyQuARC tool reads and evaluates metadata records with a focus on the consistency and robustness of the metadata. pyQuARC flags opportunities to improve or add to contextual metadata information in order to help the user connect to relevant data products. pyQuARC also ensures that information common to both the data product and the file-level metadata is consistent and compatible. pyQuARC frees up human evaluators to make more sophisticated assessments, such as whether an abstract accurately describes the data and provides the correct contextual information. The base pyQuARC package assesses descriptive metadata used to catalog Earth observation data products and files. As open source software, pyQuARC can be adapted and customized by data providers to allow for quality checks that evolve with their needs, including checking metadata not covered in the base package.

zed

Updated 1mo ago

kixlab / CUPID

[COLM 2025] CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions

zed

Updated 3mo ago

mohamedehab00 / A Hybrid Arabic Text Summarization Approach Based On Transformers

In this paper, we propose a sequential hybrid model based on transformers to summarize Arabic articles. The model combines two summarization approaches. The first is extractive: it selects the most important sentences from the article to form the summary, using deep learning techniques, specifically transformers such as AraBert. The second is abstractive: like human summarization, it may use words that differ from the original text but carry the same meaning. We produce this kind of summary with the MT5 Arabic pre-trained transformer model. We apply the two approaches sequentially to build our A3SUT hybrid model, feeding the output of the extractive module into the abstractive module. This brings the summary's quality closer to that of a human summary. We tested our model on the ESAC dataset and evaluated the extractive summary using the Rouge score, obtaining a precision of 0.5348, a recall of 0.5515, and an f1 score of 0.4932; the abstractive model was evaluated by user satisfaction. To make the summary more understandable, we add features through metadata generation ("data about data") and classification. Metadata generation supports summary identification and organization, and provides essential contextual details, since not all summaries are self-describing. We also classify the original text so that the summary topic can be determined before reading. We achieve 97.5% accuracy using a Support Vector Machine (SVM) trained on the NADA corpus.

gsbDBI / Contextual Bandits Evaluation

Offline Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits

universal

Updated 3mo ago

devsecflow / Cloud Native Assurance Maturity Model

A comprehensive framework and assessment toolkit for measuring and improving Cloud Native security maturity across 8 critical business functions. Includes automated scoring, contextual recommendations, and evidence-based evaluation.

universal

Updated 8mo ago

hrouhizadeh / BioWiC

A Dataset for Evaluating Contextualized Representation of Biomedical Concepts in Language Models

zed

Updated 8mo ago

datapizza-labs / Contextual Retrieval Experiments

Evaluate RAG retrieval strategies with Contextual Retrieval 🍕

universal

Updated 2mo ago

renatocaliari / JTBD Mapper AI Agent

This AI-powered JTBD Mapper helps you explore and understand the potential success criteria people use to evaluate a job to be done. It also helps you discover potential contextual segments to define your market, uncover hidden opportunities, and create innovative products that meet people's needs.

universal

innovation, jobstobedone, problemspace, +2

Updated 1mo ago

jklafka / Context Probes

Using syntactic and semantic probing tasks to evaluate how contextual word embeddings encode language

universal

Updated 3y ago

bheinzerling / Subword Sequence Tagging

Code for the paper: Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation (ACL 2019)

universal

Updated 3y ago