huggingface / Datasets: 🤗 The largest hub of ready-to-use datasets for AI models, with fast, easy-to-use, and efficient data manipulation tools
brmson / Dataset Sts: Semantic Text Similarity Dataset Hub
NVIDIA-NeMo / Nemotron: Developer Asset Hub for NVIDIA Nemotron, a one-stop resource for training recipes, usage cookbooks, datasets, and full end-to-end reference examples to build with Nemotron models
sooftware / Ksponspeech: Pre-processing KsponSpeech corpus (Korean Speech dataset) provided by AI Hub.
hmohebbi / SentimentAnalysis: (BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on TensorFlow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset
arpitg1304 / Forge: Convert between robotics dataset formats (RLDS, LeRobot v2/v3, Zarr, HDF5, Rosbag). Inspect, visualize, and analyze datasets. Works with HuggingFace Hub. Built for OpenVLA, Octo, LeRobot, and Diffusion Policy workflows.
airctic / Icedata: IceData, the Datasets Hub for the *IceVision* Framework
mahdi-usask / Wind Speed Forecasting For Wind Power Generation Plant. Neural Network ML Based Prediction Algo. For large-scale wind power penetration, wind speed prediction is a basic requirement of wind energy generation. Many approaches based on artificial neural networks (ANN), ARMA, and ARIMA have been proposed in the recent literature to tackle this problem. This paper uses the ANN approach to predict wind speed from historical wind speed data. The historical data were gathered from the NREL website on an hourly basis at 80 meter hub height. The measurement location is the NREL Flatirons Campus (M2). The readings are derived from instruments mounted on or near an 82 meter (270 foot) meteorological tower located at the western edge of the Flatirons Campus (formerly NWTC), about 11 km (7 miles) west of Broomfield and approximately 8 km (5 miles) south of Boulder, Colorado. The tower is located at 39° 54' 38.34" N and 105° 14' 5.28" W (datum WGS84), with its base at an elevation of 1855 meters (6085 feet) above mean sea level. Five years of data, from 2014 to 2018, are used here as a dataframe. The neural network is implemented with TensorFlow's Keras API. The model is a "Sequential" model. The optimized model uses four dense layers, with an LSTM (long short-term memory) architecture. Dropout is applied in the dense layers. The optimizer is Adam. A range of dropout rates was examined, and the best fit for this model was chosen. The paper also examined several optimization methods and used the best-fitting one. Model performance was evaluated using the mean squared error with the Adam optimizer. Various data analysis techniques were used for better visualization and an in-depth understanding of the dataset and its variables.
Since the data are mostly a time series, the analysis shows how the data change over time. From the predicted results, it can be stated that this wind speed prediction model works well for all kinds of wind speeds except overfitted/abnormal wind speeds, which are very rare.
IGNF / FLAIR HUB: UperFuse code for the FLAIR-HUB dataset
Prasad9 / TFHubSample: Demonstration of usage of different types of TensorFlow Hub modules integrated with Datasets, Iterators and Saved_Models.
harryabraham11 / Dataset HealthHub: An AI-driven data preprocessing platform that automates dataset analysis, identifies data quality issues, and performs one-click cleaning and preparation for machine learning workflows using Python, Pandas, and Gradio
UoA-eResearch / Openface Mass Compare: An openface script that runs a REST server. Posted images are compared against a large dataset, and the most likely match is returned. Works with https://hub.docker.com/r/uoacer/openface-mass-compare/
DOI-Finder-Browser-Extension / Doi Finder: A Chrome extension that finds Digital Object Identifier (DOI) links of journal articles, books, and datasets on a webpage and adds a direct link to open them on Sci-Hub.
Quant-Enthusiasts / Quant Data Explorer: Open-source hub for cleaned, annotated, and well-documented financial datasets. Contributors can add new data, notebooks, and visualizations to create a beginner-friendly quant data library.
mohres / LLM SLM Fine Tuning: Fine-tuning open-source large and small instruct/chat language models (LLMs & SLMs) from the Hugging Face Model Hub using public datasets.
lemon07r / VellumForge2: VellumForge2 is a Golang CLI for generating high-quality Direct Preference Optimization datasets via a hierarchical prompt pipeline with optional LLM-as-a-Judge scoring. This tool supports OpenAI-compatible APIs, smart rate limiting, concurrent workers, and one-command Hugging Face Hub uploads.
BiaPri / Awesome Energy Tools: Central hub offering an extensive collection of datasets and tools for energy-related applications.
cahlen / Conversation Dataset Generator: Craft conversational datasets (JSONL format with rich metadata) using LLMs. Specify parameters manually or use a creative brief for LLM-generated arguments with automatic topic/scenario variation. Optional web search improves persona grounding. Ideal for LoRA tuning, persona training, and creative writing. Includes Hugging Face Hub upload.
KimRass / Train Easyocr: Fine-tuning 'EasyOCR' on the '공공행정문서 OCR' (public administrative document OCR) dataset provided by 'AI-Hub'.
lod-cloud / Datahub2void: Generates a VoID description of all datasets in the lodcloud group on the Data Hub