205 skills found · Page 5 of 7
pyajs / Veronica: A big data processing and machine learning platform, used just like writing SQL.
wailbentafat / Go Bigdata Breaker: A lightweight and scalable circuit breaker implementation in Go, tailored for high-throughput, distributed big data processing pipelines.
Priyanshu-Baghel / BIG DATA ANALYTICS LAB IT 513: This repository contains experiments, datasets, and implementation code for the Big Data Analytics Laboratory course. The objective of this lab is to provide hands-on experience with large-scale data processing, distributed computing frameworks, and real-world data analysis using modern Big Data technologies.
PaulThomas20002 / S5 DATABASE MANAGEMENT SYSTEMS LAB: This course gives learners practical exposure to database creation, SQL query writing, transaction processing, and NoSQL/MongoDB-based operations. It enables students to create, manage, and administer databases, develop the tools needed for database design and development, and understand emerging technologies for handling Big Data.
mboukabous / Security Intelligence On Exchanged Multimedia Messages Based On Deep Learning: Deep learning (DL) approaches use multiple processing layers to learn hierarchical representations of data. Recently, many natural language processing (NLP) models and architectures have shown significant progress, especially in text mining and analysis. Well-known models for learning vector-space representations of text include Word2vec, GloVe, and fastText, and NLP took a big step forward when BERT and, more recently, GPT-3 came out. Deep learning algorithms cannot work with text in its raw, typically unstructured natural-language form; they require a special representation of the data as input. Natural-language text therefore has to be converted into an internal representation that DL algorithms can read, such as feature vectors, hence the need for representation-learning models. These models have made a big leap in recent years. They range from methods that embed words as distributed representations and adjust them as model parameters against a language-modeling objective (Word2vec, fastText, GloVe) to more recent transfer-learning models (ELMo, BERT, ULMFiT, XLNet, ALBERT, RoBERTa, GPT-2). The latter use larger corpora, more parameters, and more computing resources, and instead of assigning each word a fixed vector, they use multilayer neural networks to compute dynamic representations of words according to their context, which is especially useful for words with multiple meanings.
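The static-versus-contextual distinction above can be sketched with a toy co-occurrence model. The corpus, window size, and function names below are illustrative assumptions, not code from the repository; real systems like Word2vec and BERT learn these representations with neural networks rather than raw counts.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus with two senses of "bank".
corpus = [
    "the bank approved the loan".split(),
    "fish swim near the river bank".split(),
]

def context_counts(sentences, window=2):
    """Static view: merge the contexts of every occurrence of a word
    into a single count vector, the way a fixed per-word embedding does."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    counts[word][sent[j]] += 1
    return counts

def occurrence_context(sent, i, window=2):
    """Contextual view: describe ONE occurrence by its own neighbours,
    so different senses of the same word stay distinct."""
    lo, hi = max(0, i - window), min(len(sent), i + window + 1)
    return [w for j, w in enumerate(sent[lo:hi], lo) if j != i]

static = context_counts(corpus)
# A single "bank" vector mixes both senses:
print(sorted(static["bank"]))            # ['approved', 'river', 'the']
# Per-occurrence contexts keep the senses apart:
print(occurrence_context(corpus[0], 1))  # ['the', 'approved', 'the']
print(occurrence_context(corpus[1], 5))  # ['the', 'river']
```

This is why contextual models help with ambiguous words: the finance-sense and river-sense occurrences of "bank" get different representations instead of one averaged vector.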
giucris / Yasp: Yet Another Spark Framework
sammcilroy / Twitter Sentiment Streaming: Twitter sentiment analysis with big data stream processing.
Xiaoyuan-Liu / MapReduce Big Data Processing: Course materials for a MapReduce big data lab course.
paypal / Gators: Gators is a package for model building on big data with fast real-time preprocessing, even at high QPS, using only Python.
impresso / Impresso Text Acquisition: 🛠️ Python library to import OCR data in various formats into the canonical JSON format defined by the Impresso project.
chaudharysurya14 / Ambari Hadoop Installation: Deploy a scalable Hadoop cluster using Apache Ambari for efficient big data processing. Configure Hadoop ecosystem components, ensuring security and optimization. Utilize Ambari's monitoring and management capabilities for seamless cluster administration.
labex-labs / Hadoop Free Tutorials: Practice Hadoop Free Tutorials | This repo collects 81 free tutorials for Hadoop. Hadoop is a cornerstone of big data processing. This Skill Tree offers a systematic approach to learning the Hadoop ecosystem. Ideal for beginners, it provides a clear roadmap to understand distributed computing ...
kingmolnar / Big Data Processing With Hadoop Spark: Big Data Processing with Hadoop/Spark
yyqq188 / IStreamBigDataProcessingPlatform: A stream data processing platform based on Flink & ClickHouse.
atlarge-research / Granula: A fine-grained performance evaluation framework for Big Data Processing (BDP) systems.
DICL / VeloxMR: Data processing component of the Velox Big Data Framework (VBDF)
ankitkariryaa / Satellite Image Processing And Analysis In The Big Data Era: Tutorials and examples for the Satellite Image Processing and Analysis in the Big Data Era course
Ahmelie / Experimental Teaching System For Big Data Analysis And Processing Of Transportation: An experimental teaching system for analyzing and processing transportation and mobility big data.
MinhKhoi3104 / Aws End To End Data Lakehouse Analytics Platform: An end-to-end AWS data lakehouse leveraging batch data processing, big data technologies, and advanced business analytics.
zutherb / Building A Full Automated Fast Data Platform: Many people promise fast data as the next step after big data. The idea of creating a complete end-to-end data pipeline that combines Spark, Akka, Cassandra, Kafka, and Apache Mesos came up two years ago, sometimes called the SMACK stack. The SMACK stack is an ideal environment for handling all sorts of data-processing needs, whether nightly batch-processing tasks, real-time ingestion of sensor data, or business intelligence questions. The SMACK stack includes many components that have to be deployed somewhere. Let's see how we can create a distributed environment in the cloud with Terraform, provision the Mesos cluster with the Mesosphere Datacenter Operating System (DC/OS), and create a powerful fast data platform.