pyTsetlinMachineParallel


Multi-threaded implementation of the Tsetlin Machine (https://arxiv.org/abs/1804.01508), Convolutional Tsetlin Machine (https://arxiv.org/abs/1905.09688), Regression Tsetlin Machine (https://arxiv.org/abs/1905.04206, https://royalsocietypublishing.org/doi/full/10.1098/rsta.2019.0165, https://link.springer.com/chapter/10.1007/978-3-030-30244-3_23), and Weighted Tsetlin Machines (https://arxiv.org/abs/1911.12607, https://ieeexplore.ieee.org/document/9316190, https://arxiv.org/abs/2002.01245), with support for continuous features (https://arxiv.org/abs/1905.04199, https://link.springer.com/chapter/10.1007%2F978-3-030-22999-3_49) and multigranular clauses (https://arxiv.org/abs/1909.07310, https://link.springer.com/chapter/10.1007/978-3-030-34885-4_11).

Installation
Documentation
Tutorials
Examples
Further Work
Requirements
Acknowledgements
Tsetlin Machine Papers
Licence

Installation

pip install pyTsetlinMachineParallel

export OMP_NUM_THREADS=10
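
The thread count can also be set from inside a script instead of the shell. The lines below are a minimal sketch, assuming the OpenMP runtime reads OMP_NUM_THREADS when the compiled extension is first loaded, so the variable has to be set before the import. The value 10 simply mirrors the shell example above.

```python
import os

# Assumption: the OpenMP runtime picks up OMP_NUM_THREADS when the compiled
# extension is loaded, so set it before importing pyTsetlinMachineParallel.
os.environ["OMP_NUM_THREADS"] = "10"

# Import the library only after the variable is set, for example:
# from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
```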

Documentation

Documentation coming soon at https://pytsetlinmachineparallel.readthedocs.io/en/latest/

Tutorials

Convolutional Tsetlin Machine tutorial, https://github.com/cair/convolutional-tsetlin-machine-tutorial

Examples

Multiclass Demo

Code: NoisyXORDemo.py

from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
import numpy as np 

train_data = np.loadtxt("NoisyXORTrainingData.txt")
X_train = train_data[:,0:-1]
Y_train = train_data[:,-1]

test_data = np.loadtxt("NoisyXORTestData.txt")
X_test = test_data[:,0:-1]
Y_test = test_data[:,-1]

tm = MultiClassTsetlinMachine(10, 15, 3.9, boost_true_positive_feedback=0)

tm.fit(X_train, Y_train, epochs=200)

print("Accuracy:", 100*(tm.predict(X_test) == Y_test).mean())

print("Prediction: x1 = 1, x2 = 0, ... -> y = %d" % (tm.predict(np.array([[1,0,1,0,1,0,1,1,1,1,0,0]]))[0]))
print("Prediction: x1 = 0, x2 = 1, ... -> y = %d" % (tm.predict(np.array([[0,1,1,0,1,0,1,1,1,1,0,0]]))[0]))
print("Prediction: x1 = 0, x2 = 0, ... -> y = %d" % (tm.predict(np.array([[0,0,1,0,1,0,1,1,1,1,0,0]]))[0]))
print("Prediction: x1 = 1, x2 = 1, ... -> y = %d" % (tm.predict(np.array([[1,1,1,0,1,0,1,1,1,1,0,0]]))[0]))

Output

python3 ./NoisyXORDemo.py 

Accuracy: 100.00%

Prediction: x1 = 1, x2 = 0, ... -> y = 1
Prediction: x1 = 0, x2 = 1, ... -> y = 1
Prediction: x1 = 0, x2 = 0, ... -> y = 0
Prediction: x1 = 1, x2 = 1, ... -> y = 0

Interpretability Demo

Code: InterpretabilityDemo.py

from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
import numpy as np 

number_of_features = 20
noise = 0.1

X_train = np.random.randint(0, 2, size=(5000, number_of_features), dtype=np.uint32)
Y_train = np.logical_xor(X_train[:,0], X_train[:,1]).astype(dtype=np.uint32)
Y_train = np.where(np.random.rand(5000) <= noise, 1-Y_train, Y_train) # Adds noise

X_test = np.random.randint(0, 2, size=(5000, number_of_features), dtype=np.uint32)
Y_test = np.logical_xor(X_test[:,0], X_test[:,1]).astype(dtype=np.uint32)

tm = MultiClassTsetlinMachine(10, 15, 3.0, boost_true_positive_feedback=0)

tm.fit(X_train, Y_train, epochs=200)

print("Accuracy:", 100*(tm.predict(X_test) == Y_test).mean())

print("\nClass 0 Positive Clauses:\n")
for j in range(0, 10, 2):
	print("Clause #%d: " % (j), end=' ')
	l = []
	for k in range(number_of_features*2):
		if tm.ta_action(0, j, k) == 1:
			if k < number_of_features:
				l.append(" x%d" % (k))
			else:
				l.append("¬x%d" % (k-number_of_features))
	print(" ∧ ".join(l))

print("\nClass 0 Negative Clauses:\n")
for j in range(1, 10, 2):
	print("Clause #%d: " % (j), end=' ')
	l = []
	for k in range(number_of_features*2):
		if tm.ta_action(0, j, k) == 1:
			if k < number_of_features:
				l.append(" x%d" % (k))
			else:
				l.append("¬x%d" % (k-number_of_features))
	print(" ∧ ".join(l))

print("\nClass 1 Positive Clauses:\n")
for j in range(0, 10, 2):
	print("Clause #%d: " % (j), end=' ')
	l = []
	for k in range(number_of_features*2):
		if tm.ta_action(1, j, k) == 1:
			if k < number_of_features:
				l.append(" x%d" % (k))
			else:
				l.append("¬x%d" % (k-number_of_features))
	print(" ∧ ".join(l))

print("\nClass 1 Negative Clauses:\n")
for j in range(1, 10, 2):
	print("Clause #%d: " % (j), end=' ')
	l = []
	for k in range(number_of_features*2):
		if tm.ta_action(1, j, k) == 1:
			if k < number_of_features:
				l.append(" x%d" % (k))
			else:
				l.append("¬x%d" % (k-number_of_features))
	print(" ∧ ".join(l))

Output

python3 ./InterpretabilityDemo.py

Accuracy: 100.0

Class 0 Positive Clauses:

Clause #0:  ¬x0 ∧ ¬x1
Clause #2:   x0 ∧  x1
Clause #4:   x0 ∧  x1
Clause #6:  ¬x0 ∧ ¬x1
Clause #8:  ¬x0 ∧ ¬x1

Class 0 Negative Clauses:

Clause #1:   x0 ∧ ¬x1
Clause #3:   x0 ∧ ¬x1
Clause #5:   x1 ∧ ¬x0
Clause #7:   x1 ∧ ¬x0
Clause #9:   x0 ∧ ¬x1

Class 1 Positive Clauses:

Clause #0:   x1 ∧ ¬x0
Clause #2:   x1 ∧ ¬x0
Clause #4:   x0 ∧ ¬x1
Clause #6:   x0 ∧ ¬x1
Clause #8:   x0 ∧ ¬x1

Class 1 Negative Clauses:

Clause #1:   x0 ∧  x1
Clause #3:  ¬x0 ∧ ¬x1
Clause #5:  ¬x0 ∧ ¬x1
Clause #7:  ¬x0 ∧ ¬x1
Clause #9:   x0 ∧  x1
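
The four printing loops in InterpretabilityDemo.py differ only in the class index and the clause range, so the clause formatting can be factored into one helper. The sketch below is an illustration, not part of the library: format_clause and its included argument are hypothetical names, and a small stand-in replaces tm.ta_action so the example runs without a trained model.

```python
def format_clause(included, number_of_features):
    """Render a clause as a conjunction of literals.

    `included` is a hypothetical callable: included(k) returns 1 when
    literal k is part of the clause. Literals 0..n-1 are the plain
    features and n..2n-1 are their negations, as in the demo above.
    """
    literals = []
    for k in range(number_of_features * 2):
        if included(k) == 1:
            if k < number_of_features:
                literals.append(" x%d" % k)
            else:
                literals.append("¬x%d" % (k - number_of_features))
    return " ∧ ".join(literals)


# Stand-in for tm.ta_action(class, clause, k): a clause that includes
# x0 (k = 0) and ¬x1 (k = number_of_features + 1).
number_of_features = 20
toy_clause = {0, number_of_features + 1}
print(format_clause(lambda k: 1 if k in toy_clause else 0, number_of_features))
```

With a trained machine, the same helper could be called as format_clause(lambda k: tm.ta_action(0, j, k), number_of_features) for class 0 and clause j.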

2D Convolution Demo

Code: 2DNoisyXORDemo.py

from pyTsetlinMachineParallel.tm import MultiClassConvolutionalTsetlinMachine2D
import numpy as np 

train_data = np.loadtxt("2DNoisyXORTrainingData.txt")
X_train = train_data[:,0:-1].reshape(train_data.shape[0], 4, 4)
Y_train = train_data[:,-1]

test_data = np.loadtxt("2DNoisyXORTestData.txt")
X_test = test_data[:,0:-1].reshape(test_data.shape[0], 4, 4)
Y_test = test_data[:,-1]

ctm = MultiClassConvolutionalTsetlinMachine2D(40, 60, 3.9, (2, 2), boost_true_positive_feedback=0)

ctm.fit(X_train, Y_train, epochs=5000)

print("Accuracy:", 100*(ctm.predict(X_test) == Y_test).mean())

Xi = np.array([[[0,1,1,0],
		[1,1,0,1],
		[1,0,1,1],
		[0,0,0,1]]])

print("\nInput Image:\n")
print(Xi)
print("\nPrediction: %d" % (ctm.predict(Xi)[0]))

Output

python3 ./2DNoisyXORDemo.py 

Accuracy: 99.71%

Input Image:

[[0 1 1 0]
 [1 1 0 1]
 [1 0 1 1]
 [0 0 0 1]]

Prediction: 1
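
To give a feel for what a 2x2 window sees on a 4x4 image, the sketch below enumerates the patches in plain Python. It assumes the (2, 2) argument in the demo is the convolution window size and that the window moves one pixel at a time. extract_patches is a hypothetical helper written for this illustration, not a library function.

```python
def extract_patches(image, window):
    """Return every window-sized patch of a 2D list, scanning row by row."""
    rows, cols = len(image), len(image[0])
    height, width = window
    patches = []
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            # Slice `height` rows, then `width` columns from each row.
            patches.append([row[c:c + width] for row in image[r:r + height]])
    return patches


# The input image from the demo above.
image = [[0, 1, 1, 0],
         [1, 1, 0, 1],
         [1, 0, 1, 1],
         [0, 0, 0, 1]]

patches = extract_patches(image, (2, 2))
print(len(patches))   # 3 x 3 window positions
print(patches[0])     # top-left patch
```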

Continuous Input Demo

Code: BreastCancerDemo.py

from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
from pyTsetlinMachineParallel.tools import Binarizer
import numpy as np

from sklearn import datasets
from sklearn.model_selection import train_test_split

breast_cancer = datasets.load_breast_cancer()
X = breast_cancer.data
Y = breast_cancer.target

b = Binarizer(max_bits_per_feature = 10)
b.fit(X)
X_transformed = b.transform(X)

tm = MultiClassTsetlinMachine(800, 40, 5.0)

print("\nMean accuracy over 100 runs:\n")
tm_results = np.empty(0)
for i in range(100):
	X_train, X_test, Y_train, Y_test = train_test_split(X_transformed, Y, test_size=0.2)

	tm.fit(X_train, Y_train, epochs=25)
	tm_results = np.append(tm_results, np.array(100*(tm.predict(X_test) == Y_test).mean()))
	print("#%d Average Accuracy: %.2f%% +/- %.2f" % (i+1, tm_results.mean(), 1.96*tm_results.std()/np.sqrt(i+1)))

Output

python3 ./BreastCancerDemo.py 

Mean accuracy over 100 runs:

#1 Average Accuracy: 97.37% +/- 0.00
#2 Average Accuracy: 97.37% +/- 0.00
...
#99 Average Accuracy: 97.52% +/- 0.29
#100 Average Accuracy: 97.54% +/- 0.29
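
The Binarizer step turns each continuous feature into a fixed number of bits before training. The sketch below illustrates one common scheme, threshold (thermometer) encoding, in which a bit is set when the value reaches a threshold. It is an illustration only: the function name, the thresholds, and the >= comparison are assumptions and may differ from what Binarizer does internally.

```python
def thermometer_encode(value, thresholds):
    """Return one bit per threshold: 1 if value >= threshold, else 0."""
    return [1 if value >= t else 0 for t in thresholds]


# Hypothetical thresholds for a single feature, in ascending order.
thresholds = [1.0, 2.5, 4.0]

for value in (0.5, 2.5, 5.0):
    print(value, "->", thermometer_encode(value, thresholds))
```

Larger values switch on more bits, so the ordering of the original feature is preserved in the binary representation.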

MNIST Demo

Code: MNISTDemo.py

from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
import numpy as np
from time import time

from keras.datasets import mnist

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

X_train = np.where(X_train.reshape((X_train.shape[0], 28*28)) > 75, 1, 0) 
X_test = np.where(X_test.reshape((X_test.shape[0], 28*28)) > 75, 1, 0) 

tm = MultiClassTsetlinMachine(2000, 50, 10.0)

print("\nAccuracy over 250 epochs:\n")
for i in range(250):
	start_training = time()
	tm.fit(X_train, Y_train, epochs=1, incremental=True)
	stop_training = time()

	start_testing = time()
	result = 100*(tm.predict(X_test) == Y_test).mean()
	stop_testing = time()

	print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))

Output

python3 ./MNISTDemo.py 

Accuracy over 250 epochs:

#1 Accuracy: 94.91% Training: 4.43s Testing: 1.02s
#2 Accuracy: 96.06% Training: 3.66s Testing: 1.03s
#3 Accuracy: 96.46% Training: 3.24s Testing: 1.07s
...

#248 Accuracy: 98.19% Training: 1.77s Testing: 1.06s
#249 Accuracy: 98.19% Training: 1.90s Testing: 1.05s
#250 Accuracy: 98.21% Training: 1.70s Testing: 1.06s

MNIST Demo w/Weighted Clauses

Code: MNISTDemoWeightedClauses.py

from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
import numpy as np
from time import time

from keras.datasets import mnist

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

X_train = np.where(X_t
