FuzzyClassificator
This program uses neural networks to solve classification problems, and uses fuzzy sets and fuzzy logic to interpret the results. FuzzyClassificator is provided under the MIT License.
[WARNING] Development is frozen here and has moved to the Open DevOps community: https://github.com/devopshq/FuzzyClassificator
You can find the detailed user manual here: https://devopshq.github.io/FuzzyClassificator/
Please report new bugs or feature requests as new tasks in the Open DevOps community.
See the article about the math in FuzzyClassificator (in Russian): http://math-n-algo.blogspot.ru/2014/08/FuzzyClassificator.html
How to use
FuzzyClassificator uses ethalons.dat (by default) as learning data and candidates.dat (by default) as the data to classify (see the "Preparing data" chapter). The work consists of two steps:
- Learning. At this step the program parses the ethalon data, trains the neural network on it, and then saves the network configuration to a file.
- Classifying. At this step the program uses the trained network to classify the candidates from the data file.
Presets:
The simplest way to use FuzzyClassificator without trouble is to install Pyzo with the Anaconda interpreter, which contains all the required scientific libraries. Pyzo is a cross-platform Python IDE focused on interactivity and introspection, which makes it well suited for scientific computing. Anaconda is an open data science platform powered by Python. The open source version of Anaconda is a high-performance distribution of Python and includes most of the popular Python packages for scientific computing. All the examples below use an Anaconda Python interpreter.
Usage:
python FuzzyClassificator.py [options] [--learn]|[--classify] [Network_Options]
Optional arguments:
-h, --help
Show help message and exit.
-l [verbosity], --debug-level=[verbosity]
Use 1, 2, 3, 4, 5 or DEBUG, INFO, WARNING, ERROR, CRITICAL debug info verbosity,
INFO (2) by default.
-e [ethalon_filename], --ethalons=[ethalon_filename]
File with ethalon data samples, ethalons.dat by default.
-c [candidates_filename], --candidates=[candidates_filename]
File with candidates data samples, candidates.dat by default.
-n [network_filename], --network=[network_filename]
File with Neuro Network configuration, network.xml by default.
-r [report_filename], --report=[report_filename]
File for the classification report, report.txt by default.
-bn [best_network_filename], --best-network=[best_network_filename]
Copy best network to this file, best_nn.xml by default.
-bni [best_network_info_filename], --best-network-info=[best_network_info_filename]
File with information about best network, best_nn.txt by default.
-ic [indexes], --ignore-col=[indexes]
Columns in input files that should be ignored.
Use only dashes and commas to separate numbers; other symbols are ignored.
Example (no space after comma): 1,2,5-11
-ir [indexes], --ignore-row=[indexes]
Rows in input files that should be ignored.
Use only dashes and commas to separate numbers; other symbols are ignored.
The 1st (header) row is always ignored.
Example (no space after comma): 2,4-7
-sep [TAB|SPACE|separator_char], --separator=[TAB|SPACE|separator_char]
Column separator in raw data files.
It can be TAB or SPACE abbreviation, comma, dot, semicolon or other char.
TAB symbol by default.
--no-fuzzy
Add this key if you don't want to show fuzzy results, only real ones. Not set by default.
--reload
Add this key if you want to reload the network from file before use. Not set by default.
-u [epochs], --update=[epochs]
Update the error status after every [epochs] epochs, 5 by default.
This parameter affects training speed.
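The index list format accepted by --ignore-col and --ignore-row can be illustrated with a minimal sketch. The function name `parse_indexes` is hypothetical and is not taken from the program; treating a range such as 5-11 as inclusive on both ends is also an assumption.

```python
import re


def parse_indexes(spec):
    """Expand a spec such as '1,2,5-11' into a sorted list of unique integers.

    Hypothetical helper: it only illustrates the documented format.
    """
    # Drop every character except digits, commas and dashes,
    # since the help text says other symbols are ignored.
    cleaned = re.sub(r"[^0-9,\-]", "", spec)
    indexes = set()
    for part in cleaned.split(","):
        bounds = [int(b) for b in part.split("-") if b]
        if not bounds:
            continue  # empty item, e.g. from a trailing comma
        # A single number yields a one-element range;
        # 'a-b' is assumed to include both a and b.
        indexes.update(range(min(bounds), max(bounds) + 1))
    return sorted(indexes)


print(parse_indexes("1,2,5-11"))
```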
Work modes:
Learning Mode:
--learn [Network_Options]
Start program in learning mode, where Network_Options is a dictionary:
{
config=inputs,layer1,layer2,...,outputs
where inputs is number of neurons in input layer,
layer1..N are number of neurons in hidden layers,
and outputs is number of neurons in output layer
epochs=[int_num]
the number of training cycles, a positive integer greater than 0
rate=[float_num]
the learning rate, a float number in (0, 1]
momentum=[float_num]
the learning momentum, a float number in (0, 1]
epsilon=[float_num]
the threshold used when comparing the distance between two vectors, a float number in (0, 1]
stop=[float_num]
the stop condition of learning (percent of errors), a float number in [0, 100]
}
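As an illustration of the value ranges listed above, here is a minimal sketch that turns the key=value tokens following --learn into a checked dictionary. The function `parse_learn_options` is hypothetical; the program's own parser may behave differently.

```python
def parse_learn_options(tokens):
    """Turn ['config=3,3,2,2', 'epochs=10', ...] into a validated dict.

    Hypothetical helper that mirrors the documented ranges only.
    """
    options = {}
    for token in tokens:
        key, _, value = token.partition("=")
        key = key.strip().lower()
        if key == "config":
            options[key] = [int(n) for n in value.split(",")]
        elif key == "epochs":
            options[key] = int(value)
        elif key in ("rate", "momentum", "epsilon", "stop"):
            options[key] = float(value)
        else:
            raise ValueError("Unknown network option: " + key)

    # config needs at least an input and an output layer.
    if "config" in options:
        if len(options["config"]) < 2 or min(options["config"]) < 1:
            raise ValueError("config must list at least two positive layer sizes")
    if "epochs" in options and options["epochs"] < 1:
        raise ValueError("epochs must be greater than 0")
    # rate, momentum and epsilon are documented as floats in (0, 1].
    for key in ("rate", "momentum", "epsilon"):
        if key in options and not 0 < options[key] <= 1:
            raise ValueError(key + " must be in (0, 1]")
    # stop is documented as a percentage in [0, 100].
    if "stop" in options and not 0 <= options["stop"] <= 100:
        raise ValueError("stop must be in [0, 100]")
    return options
```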
Classifying Mode:
--classify [Network_Options]
Start program in classifying mode, where Network_Options is a dictionary:
{
config=inputs,layer1,layer2,...,outputs
where inputs is number of neurons in input layer,
layer1..N are number of neurons in hidden layers,
and outputs is number of neurons in output layer
}
Examples:
Start learning with the user's ethalon data file and the network options Config=(3,[3,2],2), 10 epochs, a learning rate of 0.1, a momentum of 0.05 and an epsilon of 0.01; stop learning when errors fall below 5%, and update the information in the log every 5 epochs:
python FuzzyClassificator.py --ethalons ethalons.dat --learn config=3,3,2,2 epochs=10 rate=0.1 momentum=0.05 epsilon=0.01 stop=5 --separator=TAB --debug-level=DEBUG --update 5
Classify all candidates from the file candidates.dat and write the results to report.txt:
python FuzzyClassificator.py --candidates candidates.dat --network network.xml --report report.txt --classify config=3,3,2,2 --separator=TAB --debug-level=DEBUG
Here 'python' is the full path to the Pyzo Python 3.3.2 interpreter.
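When the two steps are driven from a script, the same command lines can be assembled as argument lists. The sketch below only builds the lists; the helper names `build_learn_command` and `build_classify_command` are hypothetical, and only flags documented above are used.

```python
def build_learn_command(python="python", ethalons="ethalons.dat",
                        config=(3, 3, 2, 2), **options):
    """Build the argument list for learning mode (hypothetical helper)."""
    command = [python, "FuzzyClassificator.py", "--ethalons", ethalons,
               "--learn", "config=" + ",".join(str(n) for n in config)]
    # Extra network options such as epochs=10 or rate=0.1 follow config.
    command += ["{}={}".format(key, value) for key, value in options.items()]
    return command


def build_classify_command(python="python", candidates="candidates.dat",
                           network="network.xml", report="report.txt",
                           config=(3, 3, 2, 2)):
    """Build the argument list for classifying mode (hypothetical helper)."""
    return [python, "FuzzyClassificator.py", "--candidates", candidates,
            "--network", network, "--report", report,
            "--classify", "config=" + ",".join(str(n) for n in config)]


print(" ".join(build_learn_command(epochs=10, rate=0.1)))
print(" ".join(build_classify_command()))
```

Each list can be passed to `subprocess.run(command, check=True)`, with `python` set to the full interpreter path.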
Preparing data
ethalons.dat
This is the default file with the ethalon data set. This file contains tab-delimited data (by default) that looks like this:
<first header line with column names>
followed by rows that contain real or fuzzy values:
- M input columns: <1st value><tab>...<tab><M-th value>
- N output columns: <1st value><tab>...<tab><N-th value>
For each input vector, the output vector gives the level of membership in each class.
Example:
input1 input2 input3 1st_class_output 2nd_class_output
0.1 0.2 Min Min Max
0.2 0.3 Low Min Max
0.3 0.4 Med Min Max
0.4 0.5 Med Max Min
0.5 0.6 High Max Min
0.6 0.7 Max Max Min
To train on this data set, use the --learn key with the config parameter, for example:
--learn config=3,3,2,2
where the first config parameter means that the dimension of the input vector is 3, the last config parameter means that the dimension of the output vector is 2, and the middle "3,2" parameters mean that the neural network must be created with two hidden layers: three neurons in the 1st hidden layer and two neurons in the 2nd.
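The relation between the config string and the columns of an ethalon row can be sketched as follows. Both helpers, `split_config` and `split_row`, are hypothetical names used for illustration and are not part of the program.

```python
def split_config(config):
    """Split '3,3,2,2' into (inputs, hidden_layers, outputs)."""
    layers = [int(n) for n in config.split(",")]
    if len(layers) < 2:
        raise ValueError("config needs at least input and output sizes")
    # First number: input layer; last number: output layer;
    # everything in between: hidden layers.
    return layers[0], layers[1:-1], layers[-1]


def split_row(row, inputs, outputs):
    """Split one ethalon row into its input and output parts."""
    if len(row) != inputs + outputs:
        raise ValueError("row length does not match config")
    return row[:inputs], row[inputs:]


inputs, hidden, outputs = split_config("3,3,2,2")
row = "0.1\t0.2\tMin\tMin\tMax".split("\t")
print(inputs, hidden, outputs)
print(split_row(row, inputs, outputs))
```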
candidates.dat
This is the default file with the data set to classify. This file contains tab-delimited data (by default) that looks like this:
<first header line with column names>
followed by rows that contain real or fuzzy values:
- M input columns: <1st value><tab>...<tab><M-th value>
Example:
input1 input2 input3
0.12 0.32 Min
0.32 0.35 Low
0.54 0.57 Med
0.65 0.68 High
0.76 0.79 Max
To classify each of the input vectors, use the --classify key. All columns are used as values of the input vectors.
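Reading such a file outside the program takes only the standard csv module. This is a minimal sketch, not the program's own parser: the function `read_vectors` is a hypothetical name, and the sample text stands in for a real candidates.dat.

```python
import csv
import io


def read_vectors(stream, separator="\t"):
    """Return (header, rows) from a delimited text stream.

    The first line is treated as the header; empty lines are skipped.
    Values stay strings because a cell may hold a real number or a
    fuzzy level such as 'Min'.
    """
    reader = csv.reader(stream, delimiter=separator)
    header = next(reader, [])
    rows = [row for row in reader if row]
    return header, rows


# In-memory sample with the same layout as the example above.
sample = "input1\tinput2\tinput3\n0.12\t0.32\tMin\n0.32\t0.35\tLow\n"
header, vectors = read_vectors(io.StringIO(sample))
print(header)
print(vectors)
```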
If you train the network with the command:
python FuzzyClassificator.py --ethalons ethalons.dat --learn config=3,3,2,2 epochs=1000 rate=0.1 momentum=0.05
And then classify the candidate vectors with the command:
python FuzzyClassificator.py --candidates candidates.dat --network network.xml --report report.txt --classify config=3,3,2,2
Then you'll get a report.txt file with information that looks like this:
Neuronet: C:\work\projects\FuzzyClassificator\network.xml
FuzzyScale = {Min, Low, Med, High, Max}
Min = <Hyperbolic(x, {'a': 8, 'c': 0, 'b': 20}), [0.0, 0.23]>
Low = <Bell(x, {'a': 0.17, 'c': 0.34, 'b': 0.23}), [0.17, 0.4]>
Med = <Bell(x, {'a': 0.34, 'c': 0.6, 'b': 0.4}), [0.34, 0.66]>
High = <Bell(x, {'a': 0.6, 'c': 0.77, 'b': 0.66}), [0.6, 0.83]>
Max = <Parabolic(x, {'a': 0.77, 'b': 0.95}), [0.77, 1.0]>
Classification results for candidates vectors:
Header: [input1 input2 input3] [1st_class_output 2nd_class_output]
----------------------------------------------------------------------
Input: ['0.12', '0.32', 'Min'] Output: ['Min', 'Max']
Input: ['0.32', '0.35', 'Low'] Output: ['Low', 'High']
Input: ['0.54', '0.57', 'Med'] Output: ['Max', 'Min']
Input: ['0.65', '0.68', 'High'] Output: ['Max', 'Min']
Input: ['0.76', '0.79', 'Max'] Output: ['Max', 'Min']
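The intervals printed next to each level of the fuzzy scale overlap, so one real value can fall inside more than one level. The sketch below lists the levels whose interval contains a given value. It assumes that the bracketed interval is the range on which the level has non-zero membership, and it does not evaluate the membership functions themselves, so it does not reproduce how the program chooses a single level. `SUPPORTS` and `levels_covering` are hypothetical names.

```python
# Intervals copied from the FuzzyScale section of the report above.
SUPPORTS = {
    "Min": (0.0, 0.23),
    "Low": (0.17, 0.4),
    "Med": (0.34, 0.66),
    "High": (0.6, 0.83),
    "Max": (0.77, 1.0),
}


def levels_covering(x):
    """Return the names of all levels whose interval contains x."""
    return [name for name, (low, high) in SUPPORTS.items() if low <= x <= high]


print(levels_covering(0.2))
print(levels_covering(0.5))
```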
Work with program modules
FuzzyClassificator.py
This is the main module, which implements the command-line interaction with the user. Its main methods are LearningMode() and ClassifyingMode(), which implement the corresponding program modes. The module provides a user interface to the functionality implemented in PyBrainLearning.py.
Learning mode consists of the following steps, implemented by LearningMode():
- Creating PyBrain network instance with pre-defined config parameters.
- Parsing ra
