281 skills found · Page 5 of 10
zhw12 / AlgMap: Code for the paper "Mining Algorithm Roadmap in Scientific Publications" - KDD 2019
SparseGridsForDynamicEcon / HDMR: Example codes for the SIAM Journal on Scientific Computing (SISC) paper "High-Dimensional Dynamic Stochastic Model Representation"
saloni-nd / Scientific Discovery: Files and code for my blog, Scientific Discovery
SatcherInstitute / Health Equity Tracker: Open-source data platform visualizing health disparities across the US, revealing inequities in HIV and other diseases alongside determinants of health. Launched with Google.org and developed by Morehouse School of Medicine. We welcome code contributions and scientific collaboration to advance health equity for all communities.
arpit3043 / Extractive Text Summerization: Summarization systems often have additional evidence they can use to identify the most important topics of a document. For example, when summarizing blogs, the discussions and comments that follow a post are good sources of information for determining which parts of the blog are critical and interesting. In scientific paper summarization, there is a considerable amount of information, such as cited papers and conference information, that can be leveraged to identify important sentences in the original paper.

How text summarization works: in general, there are two types of summarization, abstractive and extractive.

1. Abstractive Summarization: Abstractive methods select words based on semantic understanding, including words that did not appear in the source documents. The aim is to produce the important material in a new way. These methods interpret and examine the text using advanced natural language techniques in order to generate a new, shorter text that conveys the most critical information from the original. This can be compared to the way a human reads an article or blog post and then summarizes it in their own words. Input document → understand context → semantics → create own summary.

2. Extractive Summarization: Extractive methods summarize articles by selecting a subset of sentences that retain the most important points. This approach weights the important parts of sentences and uses those weights to form the summary. Different algorithms and techniques are used to assign weights to the sentences and then rank them by importance and by similarity to one another. Input document → sentence similarity → weight sentences → select sentences with higher rank.

Only limited work is available on abstractive summarization, as it requires a deeper understanding of the text than the extractive approach does. Purely extractive summaries often give better results than automatic abstractive summaries. This is because abstractive summarization methods must cope with problems such as semantic representation, inference, and natural language generation, which are harder than data-driven approaches such as sentence extraction.

There are many techniques available for generating extractive summaries. To keep it simple, I will use an unsupervised learning approach to find sentence similarity and rank the sentences. One benefit of this is that you don't need to train and build a model before using it in your project.

It's good to understand cosine similarity to make the best use of the code you are going to see. Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space: it measures the cosine of the angle between them. Since we will represent our sentences as vectors, we can use it to find the similarity among sentences; the angle is 0 (cosine is 1) when sentences are identical. All good till now..? Hope so :)

Next, below is our code flow to generate the summarized text: Input article → split into sentences → remove stop words → build a similarity matrix → generate ranks based on the matrix → pick the top N sentences for the summary.
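The flow above can be sketched in plain Python. This is a minimal, standard-library-only illustration, not the repository's actual code: the regex sentence splitter, the tiny stop-word list, and the ranking step (scoring each sentence by its total similarity to the others, a simplification of graph-based ranking) are all assumptions for the sake of a self-contained example.

```python
# Sketch of: input article -> split into sentences -> remove stop words
# -> build a similarity matrix -> rank -> pick top N sentences.
import math
import re
from collections import Counter

# Toy stop-word list for illustration; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "that"}

def sentence_vector(sentence):
    """Bag-of-words vector for one sentence, stop words removed."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def cosine_similarity(a, b):
    """Cosine of the angle between two bag-of-words vectors (1.0 = identical)."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def summarize(text, top_n=2):
    """Return the top_n highest-ranked sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    vectors = [sentence_vector(s) for s in sentences]
    # Similarity-matrix row sums serve as a simple rank for each sentence.
    scores = [
        sum(cosine_similarity(vectors[i], vectors[j])
            for j in range(len(vectors)) if j != i)
        for i in range(len(vectors))
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:top_n])  # restore reading order
    return " ".join(sentences[i] for i in chosen)
```

A sentence that shares vocabulary with many other sentences accumulates a high row sum in the similarity matrix, so it is treated as central to the article and selected for the summary.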
msadat3 / SciHTC: The dataset and code for the EMNLP 2022 paper "Hierarchical Multi-Label Classification of Scientific Documents" are released here.
matgarate / Blender Bvoxer: Codes and blendfiles to use voxel data for scientific visualization in Blender.
QishengLi / Virtual Chinrest: JavaScript plug-in, data, and analysis code for Scientific Reports submission: Controlling for Participants' Viewing Distance in Large-Scale, Psychophysical Online Experiments Using a Virtual Chinrest
CMarsRover / SciAgentGYM: Code for the paper: Benchmarking Multi-step Scientific Tool-use in LLM Agents
celsohlsj / Gee Brazil Sv: Code repository for the paper: Silva Junior et al. Benchmark maps of 33 years of secondary forest age for Brazil. Scientific Data (2020). https://doi.org/10.1038/s41597-020-00600-4
Cassie07 / SNOW Dataset: [Scientific Data] The official code of "A Large-scale Synthetic Pathological Dataset for Deep Learning-enabled Segmentation of Breast Cancer"
twjiang / MIMO CFE: Source code for the EMNLP 2019 paper "Multi-Input Multi-Output Sequence Labeling for Joint Extraction of Fact and Condition Tuples from Scientific Text" (given scientific text such as biomedical literature, jointly extracts its fact triples and condition triples, i.e., structures the information in the literature)
cogrhythms / Good Coding Practices: Good coding practices for scientific programming.
samanseifi / Scientific Codes Collection: Computational fluid dynamics codes
LEAF-BoiseState / SPEED: Python code for Scientific Programming for Earth and Ecological Discovery
rojassergio / Learning Scipy: This repository contains source code and some notes to complement the book about the scientific Python module SciPy entitled [Learning SciPy for Numerical and Scientific Computing - Second Edition (2015)](https://www.packtpub.com/big-data-and-business-intelligence/learning-scipy-numerical-and-scientific-computing-second-edition)
matsengrp / Plugins: Claude Code plugin with specialized agents for scientific writing, code review, and technical documentation
ioanabica / DiffVAE: Code for the Nature Scientific Reports 2020 paper "Unsupervised generative and graph neural methods for modelling cell differentiation" by Ioana Bica, Helena Andrés-Terré, Ana Cvejic, Pietro Liò
OSU-NLP-Group / AutoSDT: [EMNLP'25] AutoSDT is a fully automatic pipeline to collect data-driven scientific coding tasks to train co-scientist models.
Lab-Notebooks / Codescribe: AI Agent and CLI for Code Translation and Software Development in Scientific Computing