205 skills found · Page 5 of 7
pyajs / Veronica: A big data processing and machine learning platform, used just like writing SQL.
wailbentafat / Go Bigdata Breaker: A lightweight and scalable circuit breaker implementation in Go, tailored for high-throughput, distributed big data processing pipelines.
Priyanshu-Baghel / BIG DATA ANALYTICS LAB IT 513: This repository contains experiments, datasets, and implementation code for the Big Data Analytics Laboratory course. The objective of this lab is to provide hands-on experience with large-scale data processing, distributed computing frameworks, and real-world data analysis using modern Big Data technologies.
PaulThomas20002 / S5 DATABASE MANAGEMENT SYSTEMS LAB: This course gives learners practical exposure to database creation, SQL query writing, transaction processing, and NoSQL/MongoDB-based operations. It enables students to create, manage, and administer databases, develop the tools needed for database design and development, and understand emerging technologies for handling Big Data.
mboukabous / Security Intelligence On Exchanged Multimedia Messages Based On Deep Learning: Deep learning (DL) approaches use multiple processing layers to learn hierarchical representations of data. Recently, many natural language processing (NLP) models and architectures have shown significant progress, especially in text mining and analysis. Well-known models for learning vector-space representations of text include Word2vec, GloVe, and fastText, and NLP took a big step forward when BERT and, more recently, GPT-3 came out. Deep learning algorithms cannot work with text in its raw, typically unstructured natural-language form; they require a special representation of the data as input. Natural-language text therefore has to be converted into an internal representation that DL algorithms can read, such as feature vectors, hence the need for representation-learning models. These models have made a big leap in recent years. They range from methods that embed words as distributed representations and adjust them as model parameters against a language-modeling objective (Word2vec, fastText, GloVe) to more recent transfer-learning models (ELMo, BERT, ULMFiT, XLNet, ALBERT, RoBERTa, GPT-2). The latter use larger corpora, more parameters, and more computing resources, and instead of assigning each word a fixed vector, they use multilayer neural networks to compute dynamic representations of words according to their context, which is especially useful for words with multiple meanings.
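The static-versus-contextual distinction above can be sketched with a toy co-occurrence model. The corpus, window size, and function names below are illustrative assumptions, not code from the repository; real systems like Word2vec and BERT learn these representations with neural networks rather than raw counts.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus with two senses of "bank".
corpus = [
    "the bank approved the loan".split(),
    "fish swim near the river bank".split(),
]

def context_counts(sentences, window=2):
    """Static view: merge the contexts of every occurrence of a word
    into a single count vector, the way a fixed per-word embedding does."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    counts[word][sent[j]] += 1
    return counts

def occurrence_context(sent, i, window=2):
    """Contextual view: describe ONE occurrence by its own neighbours,
    so different senses of the same word stay distinct."""
    lo, hi = max(0, i - window), min(len(sent), i + window + 1)
    return [w for j, w in enumerate(sent[lo:hi], lo) if j != i]

static = context_counts(corpus)
# A single "bank" vector mixes both senses:
print(sorted(static["bank"]))            # ['approved', 'river', 'the']
# Per-occurrence contexts keep the senses apart:
print(occurrence_context(corpus[0], 1))  # ['the', 'approved', 'the']
print(occurrence_context(corpus[1], 5))  # ['the', 'river']
```

This is why contextual models help with ambiguous words: the finance-sense and river-sense occurrences of "bank" get different representations instead of one averaged vector.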
giucris / Yasp: Yet Another Spark Framework
sammcilroy / Twitter Sentiment Streaming: Twitter sentiment analysis with big data stream processing.
Xiaoyuan-Liu / MapReduce Big Data Processing: Course materials for a MapReduce big data lab course.
paypal / Gators: Gators is a package for model building on big data with fast real-time preprocessing, even at high QPS, using only Python.
impresso / Impresso Text Acquisition: 🛠️ Python library to import OCR data in various formats into the canonical JSON format defined by the Impresso project.
chaudharysurya14 / Ambari Hadoop Installation: Deploy a scalable Hadoop cluster using Apache Ambari for efficient big data processing. Configure Hadoop ecosystem components, ensuring security and optimization. Utilize Ambari's monitoring and management capabilities for seamless cluster administration.
labex-labs / Hadoop Free Tutorials: Practice Hadoop Free Tutorials | This repo collects 81 free tutorials for Hadoop. Hadoop is a cornerstone of big data processing. This Skill Tree offers a systematic approach to learning the Hadoop ecosystem. Ideal for beginners, it provides a clear roadmap to understand distributed computing ...
kingmolnar / Big Data Processing With Hadoop Spark: Big Data Processing with Hadoop/Spark
yyqq188 / IStreamBigDataProcessingPlatform: A stream data processing platform based on Flink & ClickHouse.
atlarge-research / Granula: A fine-grained performance evaluation framework for Big Data Processing (BDP) systems.
DICL / VeloxMR: Data processing component of the Velox Big Data Framework (VBDF)
ankitkariryaa / Satellite Image Processing And Analysis In The Big Data Era: Tutorials and examples for the Satellite Image Processing and Analysis in the Big Data Era course
Ahmelie / Experimental Teaching System For Big Data Analysis And Processing Of Transportation: An experimental teaching system for analyzing and processing transportation and mobility big data.
MinhKhoi3104 / Aws End To End Data Lakehouse Analytics Platform: An end-to-end AWS data lakehouse leveraging batch data processing, big data technologies, and advanced business analytics.
zutherb / Building A Full Automated Fast Data Platform: Many people promise fast data as the next step after big data. The idea of creating a complete end-to-end data pipeline that combines Spark, Akka, Cassandra, Kafka, and Apache Mesos came up two years ago, sometimes called the SMACK stack. The SMACK stack is an ideal environment for handling all sorts of data-processing needs, whether nightly batch-processing tasks, real-time ingestion of sensor data, or business intelligence questions. The SMACK stack includes many components that have to be deployed somewhere. Let's see how we can create a distributed environment in the cloud with Terraform, provision the Mesos cluster with the Mesosphere Datacenter Operating System (DC/OS), and create a powerful fast data platform.