ubc-tea / FedSoup: The official PyTorch implementation of the paper "FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation", accepted at MICCAI 2023.
HuHaigen / Adaptively Customizing Activation Functions: Activation functions play a crucial role in enhancing the nonlinearity of neural networks and increasing their ability to map inputs to response variables, allowing them to model more complex relationships and patterns in the data. In this work, a novel methodology is proposed to adaptively customize activation functions by adding only a few parameters to traditional activation functions such as Sigmoid, Tanh, and ReLU. To verify the effectiveness of the proposed methodology, theoretical and experimental analyses of convergence acceleration and performance improvement are presented, and a series of experiments are conducted on various network models (such as AlexNet, VGGNet, GoogLeNet, ResNet, and DenseNet) and various datasets (such as CIFAR10, CIFAR100, miniImageNet, PASCAL VOC, and COCO). To further verify its validity and suitability across optimization strategies and usage scenarios, comparison experiments are also conducted among different optimization strategies (such as SGD, Momentum, AdaGrad, AdaDelta, and Adam) and different recognition tasks such as classification and detection. The results show that the proposed methodology is very simple yet delivers significant gains in convergence speed, precision, and generalization, surpassing popular activations such as ReLU and adaptive functions such as Swish in terms of overall performance in almost all experiments.
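The few-parameter customization described above can be sketched in PyTorch as a standard activation wrapped with trainable scalars. This is a minimal illustrative sketch: the module name, parameter names, and exact functional form below are assumptions, not the authors' actual formulation.

```python
import torch
import torch.nn as nn

class AdaptiveReLU(nn.Module):
    """Illustrative sketch (not the paper's exact method): a ReLU with
    two extra trainable scalars, showing how an activation can be
    adaptively customized by adding very few parameters."""

    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))  # trainable output scale
        self.b = nn.Parameter(torch.ones(1))  # trainable input scale

    def forward(self, x):
        # Scale the input, apply the base nonlinearity, scale the output.
        return self.a * torch.relu(self.b * x)
```

During training the two scalars are updated by backpropagation along with the rest of the network, letting each layer adapt the shape of its activation at negligible parameter cost.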
HuHaigen / COVID 19 Lung Infection Segmentation: Due to irregular shapes, varying sizes, and indistinguishable boundaries between normal and infected tissue, accurately segmenting COVID-19 lesions in CT images remains a challenging task. In this paper, a novel segmentation scheme is proposed for COVID-19 infections that enhances supervised information and fuses multi-scale feature maps from different levels of an encoder-decoder architecture. To this end, a deep collaborative supervision (co-supervision) scheme is proposed to guide the network in learning edge and semantic features. More specifically, an Edge Supervised Module (ESM) is first designed to highlight low-level boundary features by incorporating edge supervision into the initial stage of down-sampling. Meanwhile, an Auxiliary Semantic Supervised Module (ASSM) is proposed to strengthen high-level semantic information by integrating mask supervision into the later stages. An Attention Fusion Module (AFM) is then developed to fuse multi-scale feature maps from different levels via an attention mechanism, reducing the semantic gap between high-level and low-level feature maps. Finally, the effectiveness of the proposed scheme is demonstrated on four different COVID-19 CT datasets. The results show that all three proposed modules are promising: relative to the baseline (ResUnet), using ESM, ASSM, or AFM alone increases the Dice metric by 1.12%, 1.95%, and 1.63% respectively on our dataset, while integrating all three modules together raises it by 3.97%. Compared with existing approaches on various datasets, the proposed method obtains better segmentation performance on the main metrics and achieves the best generalization and overall performance.
mazurowski-lab / Intrinsic Properties: [ICLR 2024] Easy tools for measuring the label sharpness and intrinsic dimension of datasets and learned representations, which relate to model generalization and robustness.
nitinvetcha / DeGAML LLM: DeGAML-LLM: Decoupling Generalization and Adaptation in Meta-Learning for Large Language Models
HKU-MedAI / FLEX: Knowledge-Guided Adaptation of Pathology Foundation Models Improves Cross-domain Generalization and Demographic Fairness
CR-Gjx / RIA: TensorFlow implementation of "A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning" (ICLR 2022).
Samyu0304 / LiSA: Code for "Mind the Label Shift of Augmentation-based Graph OOD Generalization" (LiSA), CVPR 2023. LiSA is a model-agnostic graph OOD framework.
ratschlab / Icarefm: Root repository for ICareFM: "A Foundation Model for Intensive Care Unlocking Generalization across Tasks and Domains at Scale"
IBM / Selective Dense State Space Model: Open-sourced code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on Regular Languages"
xxxiaol / Counterfactual Recipe Generation: Source code and data for "Counterfactual Recipe Generation: Exploring Models' Compositional Generalization Ability in a Realistic Scenario" (EMNLP 2022 main conference paper)
lorenzofamiglini / Irony Sarcasm Detection Task: The detection of irony and sarcasm is one of the most insidious challenges in Natural Language Processing. Over the years, several techniques have been studied to analyze these rhetorical figures, trying to identify the features that significantly discriminate what is sarcastic or ironic from what is not. This study analyzes several state-of-the-art models. On the Machine Learning side, the most discriminative features, such as part of speech, pragmatic particles, and sentiment, are studied. These models are then optimized, comparing Bayesian optimization against random search; once the best hyperparameters are identified, ensemble methods such as Bayesian Model Averaging (BMA) are exploited. On the Deep Learning side, two main models are analyzed: DeepMoji, developed by MIT, and a Transformer-based model that exploits the generalization power of the RoBERTa transformer. After comparing these models, the main goal is to build a new system better able to capture the two rhetorical figures. To this end, two attention-based models are proposed that exploit transfer learning, using the BERTweet and DeepMoji models as feature extractors. After identifying the various architectures, an ensemble method is applied to the set of proposed approaches in order to identify the combination of algorithms that achieves the best results. Frameworks used: PyTorch, TensorFlow 2.0, scikit-learn, scikit-optimize, Transformers.
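The final ensembling step described above amounts to a weighted average of each model's predicted class probabilities. A minimal sketch, assuming the weights come from some validation-based scoring in the spirit of Bayesian Model Averaging (the function and parameter names below are hypothetical, not taken from the repository):

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """Illustrative sketch: combine per-model class-probability
    predictions with normalized weights (e.g. validation-derived
    posterior weights, BMA-style).

    probs:   list of arrays, each of shape (n_samples, n_classes)
    weights: one non-negative weight per model
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize to sum to 1
    stacked = np.stack(probs)                # (n_models, n_samples, n_classes)
    return np.tensordot(w, stacked, axes=1)  # weighted average over models
```

With uniform weights this reduces to plain probability averaging; unequal weights let better-performing models dominate the combined prediction.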
keven980716 / Weak To Strong Deception: [ICLR 2025] Code and data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
eminorhan / Ood Benchmarks: Out-of-distribution generalization benchmarks for image recognition models
wellecks / Symbolic Generalization: Symbolic Brittleness in Sequence Models: On Systematic Generalization in Symbolic Mathematics (AAAI 2022)
huijieZH / Awesome Diffusion Models Memorization Generalization: No description available
oshapio / Necessary Compositionality: Official code for the paper "Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models"
LucaHermes / Graph Unet Traffic Prediction: Hybrid UNet model for traffic prediction from traffic movies. The hybrid graph operation mixes CNN and GNN operations to capture pixel topology and improve spatial generalization.
NaiyangGuan / Truncated Cauchy Non Negative Matrix Factorization: Non-negative matrix factorization (NMF) minimizes the Euclidean distance between the data matrix and its low-rank approximation, and it fails on corrupted data because this loss function is sensitive to outliers. In this paper, we propose a truncated Cauchy loss that handles outliers by truncating large errors, and develop Truncated CauchyNMF to robustly learn the subspace on noisy datasets contaminated by outliers. We theoretically analyze the robustness of Truncated CauchyNMF in comparison with competing models, and prove that Truncated CauchyNMF has a generalization bound that converges at a rate of order O(√(ln n / n)), where n is the sample size. We evaluate Truncated CauchyNMF by image clustering on both simulated and real datasets. The experimental results on datasets containing gross corruptions validate the effectiveness and robustness of Truncated CauchyNMF for learning robust subspaces.
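The truncation idea above can be sketched as a Cauchy loss on the factorization residual, with errors beyond a threshold clipped to a constant. This is an assumption-laden illustration (the function name, parameterization, and threshold handling below are guesses, not the paper's exact objective):

```python
import numpy as np

def truncated_cauchy_loss(E, gamma=1.0, eps=5.0):
    """Illustrative sketch: truncated Cauchy loss on a residual matrix
    E = X - W @ H. The Cauchy loss ln(1 + (e/gamma)^2) grows slowly for
    large errors; truncation caps the loss of any error beyond eps at a
    constant, so gross outliers cannot dominate the objective."""
    e = np.abs(E)
    loss = np.log1p((e / gamma) ** 2)        # elementwise Cauchy loss
    cap = np.log1p((eps / gamma) ** 2)       # constant loss past the threshold
    return np.minimum(loss, cap).sum()
```

Because the loss saturates at `cap`, an arbitrarily large corruption contributes no more than a bounded amount, which is the mechanism behind the robustness claim in the abstract.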
WenLi-o00o / Transfer Learning Based MR2CT: Code for the manuscript "Synthesizing CT Images from MR Images with Deep Learning: Model Generalization for Different Datasets through Transfer Learning".