TextProcessing

A Text Processing Portal for Humans http://textprocessing.org/

Books

A collection of text processing books

Speech and Language Processing, 2nd Edition
Foundations of Statistical Natural Language Processing, 1st Edition
Natural Language Processing with Python, 1st Edition
Python Text Processing with NLTK 2.0 Cookbook
Text Processing in Python, 1st Edition
Python 2.6 Text Processing Beginners Guide
Taming Text: How to Find, Organize, and Manipulate It, 1st Edition
Text Processing with Ruby
Speech and Language Processing (3rd ed. draft)

Courses

A collection of text processing courses

Stanford Natural Language Processing
Natural Language Processing by Columbia University
Text Mining and Analytics
Introduction to Natural Language Processing
Stanford Deep Learning for Natural Language Processing
UMass: Introduction to Natural Language Processing

Python

A collection of Python open source text processing projects

NLTK: Natural Language Toolkit
spaCy: Build Tomorrow’s Language Technologies
TextBlob: Simplified Text Processing
MBSP for Python
Pattern
Gensim: Topic Modelling for Humans
Jieba: Chinese text segmentation
langid.py: Stand-alone language identification system
Sumy: Automatic text summarizer
summarize: A Python library for simple text summarization
Reduction
RAKE: A Python implementation of the Rapid Automatic Keyword Extraction algorithm
tagger: A Python module for extracting relevant tags from text documents
topia.termextract: Content Term Extraction using POS Tagging
summarizer: A multidocument text summarizer
TextTeaser: Official version of TextTeaser
Pyteaser: Summarizes news articles given a URL
hmmlearn: Hidden Markov Models in Python
Python stemming library using snowball stemmers
IEPY: Information Extraction in Python
Python implementation of the TextRank algorithm
SpeechRecognition: Library for performing speech recognition
NLP-Caffe: Natural language processing with Caffe
Quepy: A Python framework to transform natural language questions to queries in a database query language
Cause of Why
semanticizest: Standalone Semanticizer
nlgserv: JSON HTTP wrapper for SimpleNLG
gensim-simserver: Document similarity server, using gensim
pocketsphinx-python: Python interface to CMU SphinxBase and PocketSphinx libraries
PyJulius: Python interface to Julius speech recognition engine
Theano
Pylearn2: A machine learning research library
Blocks: A Theano framework for building and training neural networks
TensorFlow: An open source software library for machine intelligence
Lasagne: Lightweight library to build and train neural networks in Theano
Keras: Deep Learning library for Theano and TensorFlow
Chainer: A Powerful, Flexible, and Intuitive Framework of Neural Networks
The Nengo Neural Simulator
CUDAMat: Python module for performing basic dense linear algebra computations on the GPU using CUDA
Gnumpy
ELEKTRONN: A highly configurable toolkit for training 3D/2D CNNs and general neural networks
vivekn sentiment: Sentiment analysis using machine learning techniques
textacy: Higher-level NLP built on spaCy
segtok: Sentence segmentation and word tokenization tools

Java

A collection of Java open source text processing projects

Stanford CoreNLP
Stanford Log-linear Part-Of-Speech Tagger
Stanford Named Entity Recognizer (NER)
The Stanford Parser: A statistical parser
Stanford Word Segmenter
Stanford Classifier
Stanford Tokenizer
Stanford Open Information Extraction
ClearNLP: Software and resources for natural language processing
Apache OpenNLP
GATE: A full-lifecycle open source solution for text processing
LingPipe
THUTag: A Package of Keyphrase Extraction and Social Tag Suggestion
[KEA:
