LiptoSpeech

Keras implementation of Lip Reading Sentences in the Wild.


Lip reading using an end-to-end sentence-level model

Problem Statement:

Lipreading is the task of decoding text from the movement of a speaker’s mouth. Traditional approaches separated the problem into two stages: designing or learning visual features, and prediction. This project instead uses an end-to-end, sentence-level model.

Input: A video file of a person speaking a word or phrase.
Output: The predicted word or phrase the person was speaking.

Dataset:

GRID-Corpus - http://spandh.dcs.shef.ac.uk/gridcorpus/
LRW - https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrw1.html

Technologies and frameworks:

- TensorFlow 1.2.1
- Keras
- OpenCV 3
- Python 3.6

Preprocess the dataset:

python Videoprocess.py id2_vcd_swwp2s.mpg

The dlib shape predictor model is used to locate the facial landmarks. It can be found in the predictor directory at predictor/shape_predictor_68_face_landmarks.dat.bz2.

The MouthExtract folder contains the preprocessed dataset.
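
To illustrate how a mouth region might be cropped once the 68 facial landmarks are known, here is a minimal sketch. It is not the code in Videoprocess.py; the function name `mouth_bounding_box`, the `pad` parameter, and the frame size in the usage lines are assumptions made for illustration. It relies only on the fact that, in the standard 68-point landmark layout, indices 48 to 67 outline the mouth.

```python
# Hypothetical helper, not taken from the repository.
# In the 68-point facial landmark layout, points 48..67 outline the mouth.
MOUTH_START, MOUTH_END = 48, 68


def mouth_bounding_box(landmarks, frame_width, frame_height, pad=10):
    """Return (left, top, right, bottom) around the mouth, clamped to the frame.

    landmarks: sequence of 68 (x, y) pixel coordinates, e.g. as produced by a
    68-point shape predictor.
    pad: extra margin in pixels added on every side (an assumed default).
    """
    if len(landmarks) != 68:
        raise ValueError("expected 68 (x, y) landmark points")

    mouth = landmarks[MOUTH_START:MOUTH_END]
    xs = [x for x, _ in mouth]
    ys = [y for _, y in mouth]

    # Expand the tight box by `pad` and keep it inside the frame.
    left = max(min(xs) - pad, 0)
    top = max(min(ys) - pad, 0)
    right = min(max(xs) + pad, frame_width)
    bottom = min(max(ys) + pad, frame_height)
    return left, top, right, bottom


# Usage with synthetic landmarks (real ones would come from the predictor).
fake_landmarks = [(0, 0)] * 48 + [(100 + i, 200 + i) for i in range(20)]
box = mouth_bounding_box(fake_landmarks, frame_width=360, frame_height=288)
print(box)
```

With the returned box, a frame stored as an array can be cropped as `frame[top:bottom, left:right]`.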

Prediction:

python predict.py <path to the video>
Example: python predict.py PredictVideo/patrick.m4v

Important:

Please note that the input video must be at 25 fps for the model to work.
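
The repository does not say how to convert a video that is not already at 25 fps. One possible approach, sketched below, is to re-encode it with ffmpeg before running predict.py. This assumes ffmpeg is installed and on the PATH; the function names and the output file name are hypothetical.

```python
import subprocess

TARGET_FPS = 25  # frame rate required by the model, per the note above


def build_resample_command(src, dst, fps=TARGET_FPS):
    """Build an ffmpeg command that re-encodes `src` to `dst` at `fps`."""
    # "-r" placed before the output file sets the output frame rate.
    return ["ffmpeg", "-i", src, "-r", str(fps), dst]


def resample_video(src, dst, fps=TARGET_FPS):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_resample_command(src, dst, fps), check=True)


# Show the command without running it.
cmd = build_resample_command("PredictVideo/patrick.m4v", "patrick_25fps.m4v")
print(" ".join(cmd))
```

Calling `resample_video(...)` with the same arguments would perform the conversion, after which the new file can be passed to predict.py.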
