ubc-tea / FedSoup: The official PyTorch implementation of the paper "FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation", accepted at MICCAI 2023.
HuHaigen / Adaptively Customizing Activation Functions: Activation functions play a crucial role in enhancing the nonlinearity of neural networks and increasing their ability to map inputs to response variables, allowing them to model more complex relationships and patterns in the data. In this work, a novel methodology is proposed to adaptively customize activation functions by adding only a few parameters to traditional activation functions such as Sigmoid, Tanh, and ReLU. To verify the effectiveness of the proposed methodology, theoretical and experimental analyses of convergence acceleration and performance improvement are presented, and a series of experiments are conducted on various network models (such as AlexNet, VGGNet, GoogLeNet, ResNet, and DenseNet) and various datasets (such as CIFAR10, CIFAR100, miniImageNet, PASCAL VOC, and COCO). To further verify its validity and suitability across optimization strategies and usage scenarios, comparison experiments are also conducted among different optimization strategies (such as SGD, Momentum, AdaGrad, AdaDelta, and Adam) and different recognition tasks such as classification and detection. The results show that the proposed methodology is very simple yet delivers significant gains in convergence speed, precision, and generalization, surpassing popular activations such as ReLU and adaptive functions such as Swish in terms of overall performance in almost all experiments.
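The few-parameter customization described above can be sketched in PyTorch as a standard activation wrapped with trainable scalars. This is a minimal illustrative sketch: the module name, parameter names, and exact functional form below are assumptions, not the authors' actual formulation.

```python
import torch
import torch.nn as nn

class AdaptiveReLU(nn.Module):
    """Illustrative sketch (not the paper's exact method): a ReLU with
    two extra trainable scalars, showing how an activation can be
    adaptively customized by adding very few parameters."""

    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))  # trainable output scale
        self.b = nn.Parameter(torch.ones(1))  # trainable input scale

    def forward(self, x):
        # Scale the input, apply the base nonlinearity, scale the output.
        return self.a * torch.relu(self.b * x)
```

During training the two scalars are updated by backpropagation along with the rest of the network, letting each layer adapt the shape of its activation at negligible parameter cost.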
HuHaigen / COVID 19 Lung Infection Segmentation: Due to irregular shapes, varying sizes, and indistinguishable boundaries between normal and infected tissue, accurately segmenting COVID-19 lesions in CT images remains a challenging task. In this paper, a novel segmentation scheme is proposed for COVID-19 infections that enhances supervised information and fuses multi-scale feature maps from different levels of an encoder-decoder architecture. To this end, a deep collaborative supervision (co-supervision) scheme is proposed to guide the network in learning edge and semantic features. More specifically, an Edge Supervised Module (ESM) is first designed to highlight low-level boundary features by incorporating edge supervision into the initial stage of down-sampling. Meanwhile, an Auxiliary Semantic Supervised Module (ASSM) is proposed to strengthen high-level semantic information by integrating mask supervision into the later stages. An Attention Fusion Module (AFM) is then developed to fuse multi-scale feature maps from different levels via an attention mechanism, reducing the semantic gap between high-level and low-level feature maps. Finally, the effectiveness of the proposed scheme is demonstrated on four different COVID-19 CT datasets. The results show that all three proposed modules are promising: relative to the baseline (ResUnet), using ESM, ASSM, or AFM alone increases the Dice metric by 1.12%, 1.95%, and 1.63% respectively on our dataset, while integrating all three modules together raises it by 3.97%. Compared with existing approaches on various datasets, the proposed method obtains better segmentation performance on the main metrics and achieves the best generalization and overall performance.
mazurowski-lab / Intrinsic Properties: [ICLR 2024] Easy tools for measuring the label sharpness and intrinsic dimension of datasets and learned representations, which relate to model generalization and robustness.
nitinvetcha / DeGAML LLM: DeGAML-LLM: Decoupling Generalization and Adaptation in Meta-Learning for Large Language Models
HKU-MedAI / FLEX: Knowledge-Guided Adaptation of Pathology Foundation Models Improves Cross-domain Generalization and Demographic Fairness
CR-Gjx / RIA: TensorFlow implementation of "A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning" (ICLR 2022).
Samyu0304 / LiSA: Code for "Mind the Label Shift of Augmentation-based Graph OOD Generalization" (LiSA), CVPR 2023. LiSA is a model-agnostic graph OOD framework.
ratschlab / Icarefm: Root repository for ICareFM: "A Foundation Model for Intensive Care Unlocking Generalization across Tasks and Domains at Scale"
IBM / Selective Dense State Space Model: Open-sourced code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on Regular Languages"
xxxiaol / Counterfactual Recipe Generation: Source code and data for "Counterfactual Recipe Generation: Exploring Models' Compositional Generalization Ability in a Realistic Scenario" (EMNLP 2022 main conference paper)
lorenzofamiglini / Irony Sarcasm Detection Task: The detection of irony and sarcasm is one of the most insidious challenges in Natural Language Processing. Over the years, several techniques have been studied to analyze these rhetorical figures, trying to identify the features that significantly discriminate what is sarcastic or ironic from what is not. This study analyzes several state-of-the-art models. On the Machine Learning side, the most discriminative features, such as part of speech, pragmatic particles, and sentiment, are studied. These models are then optimized, comparing Bayesian optimization against random search; once the best hyperparameters are identified, ensemble methods such as Bayesian Model Averaging (BMA) are exploited. On the Deep Learning side, two main models are analyzed: DeepMoji, developed by MIT, and a Transformer-based model that exploits the generalization power of the RoBERTa transformer. After comparing these models, the main goal is to build a new system better able to capture the two rhetorical figures. To this end, two attention-based models are proposed that exploit transfer learning, using the BERTweet and DeepMoji models as feature extractors. After identifying the various architectures, an ensemble method is applied to the set of proposed approaches in order to identify the combination of algorithms that achieves the best results. Frameworks used: PyTorch, TensorFlow 2.0, scikit-learn, scikit-optimize, Transformers.
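The final ensembling step described above amounts to a weighted average of each model's predicted class probabilities. A minimal sketch, assuming the weights come from some validation-based scoring in the spirit of Bayesian Model Averaging (the function and parameter names below are hypothetical, not taken from the repository):

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """Illustrative sketch: combine per-model class-probability
    predictions with normalized weights (e.g. validation-derived
    posterior weights, BMA-style).

    probs:   list of arrays, each of shape (n_samples, n_classes)
    weights: one non-negative weight per model
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize to sum to 1
    stacked = np.stack(probs)                # (n_models, n_samples, n_classes)
    return np.tensordot(w, stacked, axes=1)  # weighted average over models
```

With uniform weights this reduces to plain probability averaging; unequal weights let better-performing models dominate the combined prediction.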
keven980716 / Weak To Strong Deception: [ICLR 2025] Code and data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
eminorhan / Ood Benchmarks: Out-of-distribution generalization benchmarks for image recognition models
wellecks / Symbolic Generalization: Symbolic Brittleness in Sequence Models: On Systematic Generalization in Symbolic Mathematics (AAAI 2022)
huijieZH / Awesome Diffusion Models Memorization Generalization: No description available
oshapio / Necessary Compositionality: Official code for the paper "Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models"
LucaHermes / Graph Unet Traffic Prediction: Hybrid UNet model for traffic prediction from traffic movies. The hybrid graph operation mixes CNN and GNN operations to capture pixel topology and improve spatial generalization.
NaiyangGuan / Truncated Cauchy Non Negative Matrix Factorization: Non-negative matrix factorization (NMF) minimizes the Euclidean distance between the data matrix and its low-rank approximation, and it fails on corrupted data because this loss function is sensitive to outliers. In this paper, we propose a truncated Cauchy loss that handles outliers by truncating large errors, and develop Truncated CauchyNMF to robustly learn the subspace on noisy datasets contaminated by outliers. We theoretically analyze the robustness of Truncated CauchyNMF in comparison with competing models, and prove that Truncated CauchyNMF has a generalization bound that converges at a rate of order O(√(ln n / n)), where n is the sample size. We evaluate Truncated CauchyNMF by image clustering on both simulated and real datasets. The experimental results on datasets containing gross corruptions validate the effectiveness and robustness of Truncated CauchyNMF for learning robust subspaces.
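The truncation idea above can be sketched as a Cauchy loss on the factorization residual, with errors beyond a threshold clipped to a constant. This is an assumption-laden illustration (the function name, parameterization, and threshold handling below are guesses, not the paper's exact objective):

```python
import numpy as np

def truncated_cauchy_loss(E, gamma=1.0, eps=5.0):
    """Illustrative sketch: truncated Cauchy loss on a residual matrix
    E = X - W @ H. The Cauchy loss ln(1 + (e/gamma)^2) grows slowly for
    large errors; truncation caps the loss of any error beyond eps at a
    constant, so gross outliers cannot dominate the objective."""
    e = np.abs(E)
    loss = np.log1p((e / gamma) ** 2)        # elementwise Cauchy loss
    cap = np.log1p((eps / gamma) ** 2)       # constant loss past the threshold
    return np.minimum(loss, cap).sum()
```

Because the loss saturates at `cap`, an arbitrarily large corruption contributes no more than a bounded amount, which is the mechanism behind the robustness claim in the abstract.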
WenLi-o00o / Transfer Learning Based MR2CT: Code for the manuscript "Synthesizing CT Images from MR Images with Deep Learning: Model Generalization for Different Datasets through Transfer Learning".