Groove
Query analysis pipeline framework
groove is a library for constructing query analysis pipelines. A groove pipeline comprises a query source (which defines the format of the queries), a statistics source (a backend for computing information retrieval statistics), preprocessing steps, the measurements to make, and the output formats.
The groove library is primarily used in boogie, which is a front-end DSL for groove. If you are using groove as a Go library, refer to the example below, which loads Medline queries, analyses them using Elasticsearch, and outputs the results to a JSON file.
API Usage
In the example below, we use Elasticsearch to measure some query performance predictors on a set of Medline queries. For the experiment, we pre-process the queries so that each one contains only lowercase alphanumeric characters. Finally, we output the results of the measures to a JSON file.
// Construct the pipeline.
pipelineChannel := make(chan groove.Result)
p := pipeline.NewGroovePipeline(
	query.NewTransmuteQuerySource(query.MedlineTransmutePipeline),
	stats.NewElasticsearchStatisticsSource(
		stats.ElasticsearchHosts("http://localhost:9200"),
		stats.ElasticsearchIndex("medline"),
		stats.ElasticsearchField("abstract"),
		stats.ElasticsearchScroll(true),
		stats.ElasticsearchSearchOptions(stats.SearchOptions{
			Size:    10000,
			RunName: "qpp",
		})),
	pipeline.Measurement(preqpp.AvgICTF, preqpp.SumIDF, preqpp.AvgIDF, preqpp.MaxIDF, preqpp.StdDevIDF, postqpp.ClarityScore),
	pipeline.Evaluation(eval.PrecisionEvaluator, eval.RecallEvaluator),
	pipeline.MeasurementOutput(output.JsonMeasurementFormatter),
	pipeline.EvaluationOutput("medline.qrels", output.JsonEvaluationFormatter),
	pipeline.TrecOutput("medline_qpp.results"))

// Execute it on a directory of queries. A pipeline executes queries in parallel.
go p.Execute("./medline", pipelineChannel)
for {
	// Continue until completed.
	result := <-pipelineChannel
	if result.Type == groove.Done {
		break
	}
	switch result.Type {
	case groove.Measurement:
		// Process the measurement outputs.
		err := ioutil.WriteFile("medline_qpp.json", []byte(result.Measurements[0]), 0644)
		if err != nil {
			log.Fatal(err)
		}
	case groove.Evaluation:
		// Process the evaluation outputs.
		err := ioutil.WriteFile("medline_qpp_eval.json", []byte(result.Evaluations[0]), 0644)
		if err != nil {
			log.Fatal(err)
		}
	}
}
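The description above mentions reducing each query to lowercase alphanumeric characters, although the pipeline as constructed here does not show a preprocessing option. As a rough illustration of what such a step does to a query string, the sketch below uses only the Go standard library. `normalizeQuery` is a hypothetical helper and not part of groove's API, and keeping spaces so that terms stay separated is an assumption; groove's own preprocessing may behave differently.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// normalizeQuery is a hypothetical helper (NOT part of groove).
// It lowercases the query and drops every rune that is not a
// letter, a digit, or a space. Keeping spaces is an assumption
// made here so that query terms remain separated.
func normalizeQuery(q string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(q) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) || r == ' ' {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	// prints "heartattack mi 2018"
	fmt.Println(normalizeQuery("Heart-Attack (MI), 2018!"))
}
```

Note that removing punctuation joins hyphenated terms ("Heart-Attack" becomes "heartattack"), which can change term statistics such as IDF; replacing punctuation with a space instead is an equally reasonable choice depending on the experiment.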
Citing
If you use this work in a scientific publication, please cite:
@inproceedings{scells2018framework,
  author = {Scells, Harrisen and Locke, Daniel and Zuccon, Guido},
  title = {An Information Retrieval Experiment Framework for Domain Specific Applications},
  booktitle = {The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
  series = {SIGIR '18},
  year = {2018},
}
Logo
The Go gopher was created by Renee French and is licensed under the Creative Commons Attribution 3.0 license.
