RecNet - Recurrent Neural Network Framework


About

RecNet is an easy-to-use framework for recurrent neural networks. It implements deep uni-/bidirectional conventional/LSTM/GRU architectures in Python using the Theano library. The intention is an easy-to-handle, lightweight implementation that offers the opportunity to try out new ideas and to implement current research.

Currently implemented features:

  • Conventional recurrent layers (tanh/relu activation)
  • LSTM (with and without peepholes) and GRU [1,2]
  • Uni-/bidirectional training
  • Layer normalization [3]
  • Softmax output
  • SGD, Nesterov momentum, RMSprop, and AdaDelta optimization [4, 5]
  • Dropout training [6]
  • MSE, cross-entropy, and weighted cross-entropy loss
  • Normal and log-scale Connectionist Temporal Classification (CTC) [7]
  • Regularization (L1/L2)
  • Noisy inputs
  • Mini-batch training

Examples of use:

<table> <tr> <td align="center"><img src="examples/little_timer_task/little_timer_task.png" width="280" height="160" > <a href="https://github.com/joergfranke/recnet/tree/master/examples/little_timer_task">Little timer task</a></td> <td align="center"><img src="examples/numbers_recognition/numbers_recognition.png" width="280" height="160" > <a href="https://github.com/joergfranke/recnet/tree/master/examples/numbers_recognition">Numbers recognition using CTC</a></td> <td align="center"><img src="https://github.com/joergfranke/phoneme_recognition/blob/master/images/example.png" width="280" height="160" > <a href="https://github.com/joergfranke/phoneme_recognition">Phoneme recognition</a></td> </tr> </table>

How to install it

git clone https://github.com/joergfranke/recnet.git
cd recnet
python setup.py install

If an error occurs, try updating pip/setuptools.

How to use it

1. Provide your data in the form of two lists and store them in a klepto file. One list contains the feature sequences and the other the corresponding targets. Each element of a list should be a matrix of shape [sequence length, feature/target size].

    import klepto

    d = klepto.archives.file_archive("train_data_set.klepto")
    d['x'] = input_features  # example shapes: [ [123,26] , [254,26] , [180,26] , [340,26] , ... ]
    d['y'] = output_targets  # example shapes: [ [123,61] , [254,61] , [180,61] , [340,61] , ... ]
    d.dump()
    d.clear()
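For illustration, such lists can be built with NumPy before storing them. The following is a minimal sketch, where the sequence lengths, the feature size of 26, and the one-hot target encoding with 61 classes are assumptions matching the example shapes above:

    import numpy as np

    # Three dummy sequences with assumed lengths; 26 features per time step.
    input_features = [np.random.rand(length, 26).astype('float32')
                      for length in [123, 254, 180]]

    # Matching one-hot targets with 61 classes per time step (assumed encoding).
    output_targets = []
    for feat in input_features:
        target = np.zeros((feat.shape[0], 61), dtype='float32')
        target[np.arange(feat.shape[0]),
               np.random.randint(61, size=feat.shape[0])] = 1.0
        output_targets.append(target)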

2. Instantiate RecNet, define the parameters, and create the model.

from recnet import rnnModel  # import path assumed from the installed recnet package

rn = rnnModel()
rn.parameter["train_data_name"] = "train_data_set.klepto"
rn.parameter["net_size"      ] = [      2,     10,         2]
rn.parameter["net_unit_type" ] = ['input',  'GRU', 'softmax']
rn.parameter["net_arch"      ] = [    '-',    'bi',     'ff']
rn.parameter["optimization"  ] = "adadelta"
rn.parameter["loss_function" ] = "cross_entropy"
rn.create()
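The three net_* lists are read position by position, one entry per layer: the configuration above describes a 2-unit input layer, a bidirectional GRU layer with 10 units, and a feed-forward softmax output layer with 2 classes.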

Please find a full list of possible parameters below.

3. Use the provided functions to generate mini-batches and to train, validate, or apply the model.

mb_train_x, mb_train_y, mb_mask = rn.get_mini_batches("train")
for j in range(len(mb_train_x)):  # one training step per mini-batch
    net_out, train_error = rn.train_fn( mb_train_x[j], mb_train_y[j], mb_mask[j] )
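Validation follows the same pattern. A minimal sketch using valid_fn from the functionality table below, with the unpacking order assumed to mirror the training example above:

    mb_valid_x, mb_valid_y, mb_valid_mask = rn.get_mini_batches("valid")
    valid_errors = []
    for j in range(len(mb_valid_x)):
        # Compute the validation error without updating the weights.
        net_out, valid_error = rn.valid_fn(mb_valid_x[j], mb_valid_y[j], mb_valid_mask[j])
        valid_errors.append(valid_error)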

Please find complete training and usage scripts in the provided examples.

Documentation

Parameters

| Parameter | Description | Value |
| --------- | ----------- | ----- |
| train_data_name | Name of the training data set | String |
| valid_data_name | Name of the validation data set | String |
| data_location | Path/directory to the data set in klepto files | Path |
| batch_size | Size of the mini-batches | Integer >= 1 |
| output_location | Path/directory for saving the log/prm files | Path |
| output_type | Log during training to the console, a log file, or both | "console"/"file"/"both" |
| net_size | Input size, size of each hidden layer, output size | List of integers |
| net_unit_type | Unit type of each layer (input, GRU, LSTM, conv, GRU_ln, ...) | List of unit types |
| net_act_type | Activation function of each layer (tanh, relu, softplus) | List of activation functions |
| net_arch | Architecture of each layer (unidirectional, bidirectional, feed-forward) | List of architectures |
| epochs | Number of epochs to train | Integer >= 1 |
| learn_rate | Learning rate for the optimization algorithm | Float [0.0001...0.5] |
| optimization | Optimization algorithm | "sgd"/"rmsprop"/"nesterov_momentum"/"adadelta" |
| momentum | Momentum for some optimization algorithms | Float [0...1] |
| decay_rate | Decay rate for some optimization algorithms | Float [0...1] |
| use_dropout | Use dropout between layers (vertically) | False/True |
| dropout_level | Probability of dropout | Float [0...1] |
| regularization | Use of regularization (L1/L2) | False/"L1"/"L2" |
| reg_factor | Influence of the regularization | Float [0...1] |
| noisy_input | Add noise to the input | True/False |
| noise_level | Factor for the noise level | Float [0...1] |
| loss_function | Loss function (MSE, weighted/normal cross-entropy, CTC) | "MSE"/"w2_cross_entropy"/"cross_entropy"/"CTC"/"CTClog" |
| bound_weight | Weight for the weighted cross-entropy | Integer |
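To illustrate how these parameters combine, here is a more complete configuration sketch; the concrete values are illustrative assumptions, not recommended settings:

    rn = rnnModel()
    rn.parameter["train_data_name"] = "train_data_set.klepto"
    rn.parameter["valid_data_name"] = "valid_data_set.klepto"
    rn.parameter["batch_size"]      = 10
    rn.parameter["epochs"]          = 20
    rn.parameter["learn_rate"]      = 0.001
    rn.parameter["net_size"]        = [26, 100, 61]
    rn.parameter["net_unit_type"]   = ['input', 'LSTM', 'softmax']
    rn.parameter["net_arch"]        = ['-', 'bi', 'ff']
    rn.parameter["optimization"]    = "rmsprop"
    rn.parameter["decay_rate"]      = 0.9
    rn.parameter["use_dropout"]     = True
    rn.parameter["dropout_level"]   = 0.5
    rn.parameter["loss_function"]   = "cross_entropy"
    rn.create()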

Functionality

| Function | Description | Arguments | Return |
| -------- | ----------- | --------- | ------ |
| create | Create the model and compile the functions | List of functions to compile ['train','valid','forward'] | - |
| pub | Publish to the console or the log file | String of text | - |
| get_mini_batches | Create mini-batches from a data set | 'train'/'valid'/'test', opt: 'data_name' | mini-batch features, targets, masks |
| dump | Make a dump of the current model | - | - |
| train_fn | Train the model on a mini-batch | features, targets, mask | training error, network output |
| valid_fn | Determine the validation error without an update | features, targets, mask | validation error, network output |
| forward_fn | Determine the output for a mini-batch | features, mask | network output |
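For completeness, a short sketch of the remaining functions; the mini-batch variables are assumed to come from get_mini_batches as in step 3:

    rn.pub("training finished")          # write a line to the console/log file
    rn.dump()                            # dump the current model parameters
    net_out = rn.forward_fn(mb_train_x[0], mb_mask[0])  # forward pass without targets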

Credits

References

  1. Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
  2. Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555 (2014).
  3. Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016).
  4. Zeiler, Matthew D. "ADADELTA: an adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).
  5. Hinton, Geoffrey, N. Srivastava, and Kevin Swersky. "Lecture 6a: Overview of mini-batch gradient descent." Coursera lecture slides, https://class.coursera.org/neuralnets-2012-001/lecture (online).
  6. Zaremba, Wojciech, Ilya Sutskever, and Oriol Vinyals. "Recurrent neural network regularization." arXiv preprint arXiv:1409.2329 (2014).
  7. Graves, Alex, et al. "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks." Proceedings of the 23rd international conference on Machine learning. ACM, 2006.

Further work

  • Extend documentation
  • Add tests
  • Implementations:
    • CTC decoder
    • Parametrized initialization
    • Learned initialization
    • Annealed Gradient Descent
    • Mixing SGD with other optimizers like AdaDelta