Ceruleanacg / Personae: 📈 Personae is a repository of implementations and environments for Deep Reinforcement Learning & Supervised Learning in Quantitative Trading.
lnmangione / Halite III: In this paper, we apply machine learning to create bots for Halite III, @twosigma's annual A.I. competition. We develop one classifier using a Support Vector Machine with Supervised Learning, and one using a Deep Neural Network with Reinforcement Learning.
tongjingqi / AI Can Learn Scientific Taste: We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference modeling and alignment problem.
accel-brain / Accel Brain Code: The purpose of this repository is to build prototypes as case studies for the proofs of concept (PoC) and research and development (R&D) that I have written about on my website. The main research topics are auto-encoders in relation to representation learning, statistical machine learning for energy-based models, generative adversarial networks (GANs), Deep Reinforcement Learning such as Deep Q-Networks, semi-supervised learning, and neural network language models for natural language processing.
Jerry-XDL / AIDoctor: AIDoctor trains a medical GPT model with the ChatGPT training pipeline, an implementation of Pretraining, Supervised Finetuning, RLHF (Reward Modeling and Reinforcement Learning) and DPO (Direct Preferenc…
lebrice / Sequoia: The Research Tree - A playground for research at the intersection of Continual, Reinforcement, and Self-Supervised Learning.
yaqingwang / WeFEND AAAI20: Dataset for the paper "Weak Supervision for Fake News Detection via Reinforcement Learning", published in AAAI'2020.
rainarch / DSNER: Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning
InternLM / Spatial SSRL: [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"
synlp / ChiMed GPT: ChiMed-GPT is a Chinese medical large language model (LLM) built by continually training Ziya-v2 on Chinese medical data, with pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) all performed on it.
YanjieZe / Rl3d: [RA-L 2023 & IROS 2023] Visual Reinforcement Learning with Self-Supervised 3D Representations
jon--lee / Decision Pretrained Transformer: Implementation of the Decision-Pretrained Transformer (DPT) from the paper "Supervised Pretraining Can Learn In-Context Reinforcement Learning".
AI-MOO / IBM Machine Learning Professional Certificate: Machine Learning, Time Series & Survival Analysis. Develop working skills in the main areas of Machine Learning: Supervised Learning, Unsupervised Learning, Deep Learning, and Reinforcement Learning. Also gain practice in specialized topics such as Time Series Analysis and Survival Analysis.
NVlabs / NFT: Implementation of the Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasoning"
scottemmons / Rvs: Reinforcement Learning via Supervised Learning
Panda0406 / Reinforcement Learning Distant Supervision RE: Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning
michaelnny / InstructLLaMA: Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT but on a much smaller scale.
StateOfTheArt-quant / Sharpe: sharpe is a unified, interactive, general-purpose environment for backtesting or applying machine learning (supervised learning and reinforcement learning) in the context of quantitative trading.
kblomdahl / Dream Go: Artificial Go player based on reinforcement and supervised learning
StateOfTheArt-quant / Trading Gym: A unified environment for supervised learning and reinforcement learning in the context of quantitative trading