BigData
💎🔥 Big data study notes
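BigData
HBase is a database; Hive is a data warehouse
Hadoop 2.2.0 pseudo-distributed setup
HDFS (distributed file system)
- Overview of Hadoop, a distributed data analysis system
- Hadoop explained simply
- HDFS fs commands
- HDFS architecture
- RPC (Remote Procedure Call) and the HDFS read and write process
- Error when running Hadoop or Spark programs on Windows: Could not locate executable null\bin\winutils.exe in the Hadoop binaries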
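
The `null` in that winutils error message typically appears when Hadoop cannot work out its home directory, that is, when neither the `hadoop.home.dir` Java system property nor the `HADOOP_HOME` environment variable is set. The sketch below is a small diagnostic under that assumption; it only inspects an environment mapping, and the helper name `winutils_path` is hypothetical, not part of Hadoop.

```python
from pathlib import PureWindowsPath


def winutils_path(env):
    """Return the winutils.exe path implied by HADOOP_HOME, or None.

    `env` is any mapping of environment variables (for example os.environ).
    This hypothetical helper does not check the hadoop.home.dir JVM property,
    which Hadoop also consults.
    """
    home = env.get("HADOOP_HOME")
    if not home:
        # With no home directory configured, Hadoop ends up looking for
        # "null\bin\winutils.exe", which is the path shown in the error.
        return None
    return str(PureWindowsPath(home) / "bin" / "winutils.exe")
```

Calling `winutils_path(os.environ)` on the affected machine shows where Hadoop would look for the binary, or `None` if `HADOOP_HOME` is missing.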
MapReduce
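- MapReduce principles
- MapReduce execution process
- Data types and formats
- The Writable interface and the serialization mechanism
- Partitioner programming
- Custom sort programming
- Combiner programming
- Common MapReduce algorithms
- Inverted index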
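
As a rough illustration of the inverted index topic, the following is a minimal single-process sketch of the map, shuffle, and reduce steps in plain Python. It is not the Hadoop Java API and not the notes' own implementation; the function names and the sample file names are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit one (word, doc_id) pair per word occurrence."""
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, doc_ids):
    """Reduce: count how often the word occurs in each document."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[doc_id] += 1
    return word, dict(sorted(counts.items()))


def inverted_index(docs):
    """Run the three phases over a {doc_id: text} mapping."""
    pairs = [pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, ids) for word, ids in sorted(grouped.items()))


# Hypothetical sample input.
docs = {"a.txt": "hello world hello", "b.txt": "hello hadoop"}
index = inverted_index(docs)
```

In a real job a combiner could pre-aggregate the per-document counts on the map side before the shuffle; the sketch omits that step for brevity.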
Zookeeper
Hadoop cluster setup
Sqoop
HBase
Hive
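Using Hive (table descriptions are stored in the TBLS table of the hive database, the tables' columns in the COLUMNS_V2 table, the column-descriptor IDs in the CDS table, and the HDFS storage paths in the SDS table)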
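
To make the relationship between those metadata tables concrete, here is a toy sketch that rebuilds a tiny subset of them in an in-memory SQLite database and joins them. A real metastore usually lives in MySQL or Derby and has many more columns; the column names used here (`TBL_ID`, `SD_ID`, `CD_ID`, `LOCATION`, `INTEGER_IDX`) follow the commonly seen metastore schema but should be treated as assumptions that may vary by Hive version, and the `people` table and its HDFS path are invented sample data.

```python
import sqlite3


def build_toy_metastore():
    """Create a minimal stand-in for the four metadata tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CDS (CD_ID INTEGER PRIMARY KEY);
        CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, CD_ID INTEGER, LOCATION TEXT);
        CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, SD_ID INTEGER, TBL_NAME TEXT);
        CREATE TABLE COLUMNS_V2 (CD_ID INTEGER, COLUMN_NAME TEXT,
                                 TYPE_NAME TEXT, INTEGER_IDX INTEGER);
        -- Invented sample rows describing one table named 'people'.
        INSERT INTO CDS VALUES (1);
        INSERT INTO SDS VALUES (1, 1, 'hdfs://namenode:9000/user/hive/warehouse/people');
        INSERT INTO TBLS VALUES (1, 1, 'people');
        INSERT INTO COLUMNS_V2 VALUES (1, 'id', 'int', 0);
        INSERT INTO COLUMNS_V2 VALUES (1, 'name', 'string', 1);
    """)
    return conn


def describe_table(conn, table_name):
    """Return (hdfs_location, [(column, type), ...]) for one table."""
    # TBLS -> SDS gives the storage location.
    location = conn.execute(
        "SELECT s.LOCATION FROM TBLS t JOIN SDS s ON t.SD_ID = s.SD_ID "
        "WHERE t.TBL_NAME = ?", (table_name,)).fetchone()[0]
    # SDS -> COLUMNS_V2 (via CD_ID) gives the columns in declaration order.
    columns = conn.execute(
        "SELECT c.COLUMN_NAME, c.TYPE_NAME FROM TBLS t "
        "JOIN SDS s ON t.SD_ID = s.SD_ID "
        "JOIN COLUMNS_V2 c ON c.CD_ID = s.CD_ID "
        "WHERE t.TBL_NAME = ? ORDER BY c.INTEGER_IDX", (table_name,)).fetchall()
    return location, columns


conn = build_toy_metastore()
location, columns = describe_table(conn, "people")
```

The same two joins, run against the actual metastore database, are one way to look up where a Hive table's data lives on HDFS without going through the Hive CLI.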
Flume (log collection system)
Scripts: scheduled tasks (timers)
Linux
- Input and output
- [Redirecting output in a script](https://github.com/sunnyandgood/BigData/blob/master/Linux/%E9%87%8D%E6%96%B0%E5%AE%9A%E5%90%91/%E5%9C%A8%E8%84%9A%E6%9C%AC%E4%B8%AD%
