FSMN (Feedforward Sequential Memory Networks)
PyTorch implementations of FSMN (Feedforward Sequential Memory Networks):
- sFSMNCell - scalar FSMN
- vFSMNCell - vectorized FSMN
- csFSMNCell - compact scalar FSMN
- cvFSMNCell - compact vectorized FSMN
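To illustrate the difference between the scalar and vectorized variants, here is a minimal sketch of a look-back memory block as formulated in the FSMN paper: each output frame is a weighted sum of the current and previous `N` hidden frames, with one scalar weight per tap (scalar) or one weight vector per tap (vectorized). The function name `memory_block` and its signature are hypothetical and are not taken from this repository. The sketch covers only the unidirectional case and leaves out the low-rank projection used by the compact variants.

```python
import torch
import torch.nn.functional as F


def memory_block(h: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    """Look-back FSMN memory (illustrative, not this repo's implementation).

    h: (batch, time, hidden) hidden activations.
    a: tap coefficients, a[0] weights the current frame.
       shape (N + 1,)        -> scalar FSMN (one weight per tap)
       shape (N + 1, hidden) -> vectorized FSMN (one weight vector per tap)
    Returns m with m[:, t] = sum_i a[i] * h[:, t - i], shape (batch, time, hidden).
    """
    n_taps = a.shape[0]
    time = h.shape[1]
    # Left-pad the time axis with zeros so frames before t = 0 contribute nothing.
    padded = F.pad(h, (0, 0, n_taps - 1, 0))
    out = torch.zeros_like(h)
    for i in range(n_taps):
        # padded[:, start + t] equals h[:, t - i] (or zero when t - i < 0).
        start = n_taps - 1 - i
        out = out + a[i] * padded[:, start:start + time]
    return out


h = torch.randn(2, 11, 10)                        # (batch, time, hidden)
scalar_out = memory_block(h, torch.randn(4))      # scalar taps, N = 3
vector_out = memory_block(h, torch.randn(4, 10))  # vectorized taps, N = 3
print(scalar_out.shape, vector_out.shape)
```

Both calls return a tensor with the same shape as `h`; only the shape of the tap coefficients differs between the two variants.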
See:
- Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency [arXiv]
- Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition [PDF]
- Feedforward Sequential Memory Networks based Encoder-Decoder Model for Machine Translation [PDF](http://www.apsipa.org/proceedings/2017/CONTENTS/papers2017/13DecWednesday/Poster%202/WP-P2.14.pdf)
- Deep-FSMN for Large Vocabulary Continuous Speech Recognition [arXiv]
- Deep Feed-Forward Sequential Memory Networks for Speech Synthesis [arXiv]
- A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition [arXiv]
Google Colab
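!git clone https://github.com/d5555/FSMN.git
# Import torch and F explicitly rather than relying on the star import to expose them
import torch
import torch.nn.functional as F
from FSMN.FSMN import *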
batch = 2
memory_size = 3
input_size = 5
hidden_size = 10
layer_output_size = 5
sequence_size = 11
n_layers = 3 # number of layers
ff_size = 20
bidirectional = True
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.manual_seed(20)
# Random input batch of shape (batch, sequence_size, input_size)
src = torch.randn((batch, sequence_size, input_size)).to(device)
# FSMN arguments: memory_size, input_size, hidden_size, layer_output_size, n_layers, fsmn_class, ff_size, drop=0.1, activation=F.relu, bidirectional=False, device=None, dtype=torch.float32
# fsmn_class is one of: sFSMNCell, csFSMNCell, vFSMNCell, cvFSMNCell
fsmn = FSMN(memory_size, input_size, hidden_size, layer_output_size, n_layers, cvFSMNCell, ff_size, drop=0.1, device=device, activation=F.relu, bidirectional=bidirectional).to(device)
# Padding mask: token id 13 marks padding here, so the mask is True at real positions and False at padded ones
src_pad_mask = (torch.tensor([[1,2,3,5,6,6,8,8,13,13,13], [1,2,3,5,6,6,13,13,13,13,13]]) != 13).to(device)
predict = fsmn(src, pad_mask=src_pad_mask)
print(predict.shape)
print(predict)
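The example builds `src_pad_mask` by comparing token ids against a pad id of 13. When only the sequence lengths are known, the same boolean mask can be built directly from them. The helper below is a hypothetical sketch, not part of this repository. It assumes the convention used in the example, where `True` marks real positions and `False` marks padding.

```python
import torch


def lengths_to_pad_mask(lengths, max_len):
    """Return a (batch, max_len) bool mask: True for real frames, False for padding.

    Hypothetical helper; the True-means-real convention is inferred from the
    README example, not from the library's documentation.
    """
    positions = torch.arange(max_len).unsqueeze(0)         # (1, max_len)
    lengths = torch.as_tensor(lengths).unsqueeze(1)        # (batch, 1)
    return positions < lengths                             # broadcasts to (batch, max_len)


# The two sequences in the example have 8 and 6 real tokens out of 11.
mask = lengths_to_pad_mask([8, 6], max_len=11)
print(mask)
```

The resulting mask can be moved to the target device with `.to(device)` and passed as `pad_mask` in the same way as in the example.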