Logical rules extracted from data

Computational Intelligence Laboratory | Department of Informatics | Nicolaus Copernicus University

Look at datasets to find more results obtained using different classifiers.

Confusion matrices: column labels refer to the true class and row labels to the assigned class; for medical data, healthy cases are listed first.

Appendicitis.

106 vectors, 8 attributes, two classes (88 acute +18 other),
obtained from Shalom Weiss.
Attribute names: WBC1, MNEP, MNEA, MBAP, MBAA, HNEP, HNEA

Rules found using PVM
Accuracy 89.6% in leave-one-out, 91.5% overall

C1: MNEA > 6600 OR MBAP > 11
C2: ELSE

Rules found using C-MLP2LN, no optimization
Accuracy 89.6% in leave-one-out, 91.5% overall

C1: MNEA > 6650 OR MBAP > 12
C2: ELSE

The second neuron classifies 3 more cases correctly using 2 rules, but we treat these as noise rather than as interesting rare cases.
Using L-units, another set of rules is generated with an overall accuracy of 89.6% (11 errors).

C1: WBC1 > 8400 OR MBAP >= 42
C2: ELSE

Confusion matrix:		Append.	Other
	Appendicitis	84	10
	Other	1	11

C4.5 generates 3 rules with overall 91.5% accuracy. It may also generate 7 rules with 97.2% accuracy, but this is strong overfitting, with each rule classifying only 1-2 cases.

Summary of accuracy (%) and references

Method	Accuracy	Reference
PVM	89.6	Weiss, Kapouleas
C-MLP2LN	89.6±?	our
RIAC rule induction	86.9	Hamilton et.al
CART, C4.5 (dec. trees)	84.9	Weiss, Kapouleas
FSM rules	???	our (RA)

S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990
H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Wisconsin breast cancer.

From UCI repository, 699 cases, 9 attributes (1-10 integer values),
two classes, 458 benign (65.5%) & 241 malignant (34.5%).
For 16 instances one attribute is missing.

Attributes: F0 (the id number) is removed from the original database (warning: some papers use the original feature numbers).
F1: Clump Thickness 1 - 10
F2: Uniformity of Cell Size 1 - 10
F3: Uniformity of Cell Shape 1 - 10
F4: Marginal Adhesion 1 - 10
F5: Single Epithelial Cell Size 1 - 10
F6: Bare Nuclei 1 - 10
F7: Bland Chromatin 1 - 10
F8: Normal Nucleoli 1 - 10
F9: Mitoses 1 - 10

C-MLP2LN results:

Rules S1: Single rule: IF f2 = [1,2] then benign else malignant

Confusion matrix (columns: original class, rows: calculated class; 1 = benign, 2 = malignant):
	1	2
1	417	12
2	41	229

Accuracy: 646 correct (92.42%), 53 errors; Sensitivity=0.9720, Specificity=0.8481
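
The figures quoted for rule set S1 can be re-derived from the four counts in the matrix. The following is a minimal Python sketch; the function name `summarize` and the variable names are illustrative and not part of the original page. As an observation about the arithmetic only, the quoted sensitivity and specificity coincide with ratios taken along the rows of the matrix as printed (417/429 and 229/270).

```python
def summarize(matrix):
    """Return (correct, errors, accuracy, row_ratios) for a square count matrix."""
    n = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    # Ratio of each diagonal entry to the total of its row.
    row_ratios = [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]
    return correct, n - correct, correct / n, row_ratios


# Counts copied from the matrix for rule set S1.
s1 = [[417, 12],
      [41, 229]]

correct, errors, accuracy, row_ratios = summarize(s1)
print(correct, errors)                    # 646 53
print(f"{100 * accuracy:.2f}%")           # 92.42%
print([f"{r:.4f}" for r in row_ratios])   # ['0.9720', '0.8481']
```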

Rules S2: 5 rules for malignant, overall accuracy of 96%.

R1	f1<6 &	f3<4 &	f6<2 &	f7<5		100%
R2	f1<6 &	f4<4 &	f6<2 &	f7<5		100%
R3	f1<6 &	f3<4 &	f4<4 &	f6<2		100%
R4	f1=[6,8] &	f3<4 &	f4<4 &	f6<2 &	f7<5	100%
R5	f1<6 &	f3<4 &	f4<4 &	f6=[2,7] &	f7<5	92.3% (36 correct, 3 errors)
	ELSE	benign

3 benign cases wrongly classified as malignant and 25 malignant cases wrongly classified as benign.

Rules S3: 4 malignant rules, overall accuracy of 97.7%, confusion matrix

Confusion matrix:		Benign	Malignant
	Benign	447	5
	Malignant	11	236

R1	f3<3 &	f4<4 &	f6<6 &	f9=1	99.5% (2 err)
R2	f1<7 &	f4<4 &	f6<6 &	f9=1	99.8% (5 err)
R3	f1<7 &	f3<3 &	f6<6 &	f9=1	99.5% (2 err)
R4	f1<7 &	f3<3 &	f4<4 &	f6<6	99.5% (2 err)
	ELSE	benign

According to the confusion matrix above, 11 benign cases are wrongly classified as malignant and 5 malignant cases are wrongly classified as benign.

Rules S4: Optimized rules: 1 benign vector classified as malignant (rule 1 and rule 5, the same vector).
ELSE condition makes 6 errors, giving 99.00% overall accuracy:

R1	f1<9 &	f4<4 &	f6<2 &	f7<5		100%
R2	f1<10 &	f3<4 &	f4<4 &	f6<3		100%
R3	f1<7 &	f3<9 &	f4<3 &	f6=[4,9] &	f7<4	100%
R4	f1=[3,4] &	f3<9 &	f4<10 &	f6<6 &	f7<8	99.8%
R5	f1<6 &	f3<3 &	f7<8			99.8%
	ELSE	benign				(6 errors)

Other solutions: 100% reliable rules rejecting 51 cases (7.3%) of all vectors.
For malignant class these rules are:

R1	f1<9 &	f3<4 &	f6<3 &	f7<6	100%
R2	f1<5 &	f4<8 &	f6<5 &	f7<10	100%
R3	f1<4 &	f3<2 &	f4<3 &	f6<7	100%
R4	f1<10 &	f4<10 &	f6=[1,5] &	f7<2	100%

For the benign cases rules are: NOT (R5 OR R6 OR R7 OR R8), where:

R5	f1<8 &	f3<5 &	f7<4			100%
R6	f1<9 &	f4<6 &	f6<9 &	f7<5		100%
R7	f1<9 &	f3<6 &	f4<8 &	f6<9		100%
R8	f1=6 &	f3<10 &	f4<10 &	f6<2 &	f7<9	100%

Summary of results (rules discovered for the whole data set).

Method	Accuracy %	Reference	Rules
C-MLP2LN	99.0
FSM	98.3	our (RA)
C4.5 (decision tree)	96.0	Hamilton et.al
RIAC (prob. inductive)	95.0	Hamilton et.al

Duch W, Adamczak R, Grąbczewski K, Żal G, Hybrid neural-global minimization method of logical rule extraction. Journal of Advanced Computational Intelligence 3 (5): 348-356.
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.

Papers on a smaller (569 cases) Wisconsin breast cancer dataset are on the O.L. Mangasarian page.

Cancer (Ljubljana data)

From UCI repository (restricted): 286 instances, 201 no-recurrence-events (70.3%), 85 recurrence-events (29.7%);
9 attributes, between 2-13 values each, 9 missing values

Rules found using PVM: 70% for training, 30% for test
Accuracy 77.4% train, 77.1% test

C1: Involved Nodes > 0 & Degree_malig = 3
C2: ELSE

C-MLP2LN more accurate rules: 78% overall accuracy
R1: deg_malig=3 & breast=left & node_caps=yes
R2: (deg_malig=3 OR breast=left) & NOT inv_nodes=[0,2] & NOT age=[50,59]

Method	Accuracy, % test	Reference
C-MLP2LN	77.4	our
CART	77.1	Weiss, Kapouleas
PVM	77.1	Weiss, Kapouleas
AQ15	66-72	Michalski et.al
Inductive	65-72	Clark, Niblett

Michalski,R.S., Mozetic,I., Hong,J., & Lavrac,N. (1986). The Multi-Purpose Incremental Learning System AQ15 and its Testing Application to Three Medical Domains. In Proceedings of the Fifth National Conference on Artificial Intelligence, 1041-1045, Philadelphia, PA: Morgan Kaufmann.

Clark,P. & Niblett,T. (1987). Induction in Noisy Domains. In: Progress in Machine Learning (from the Proceedings of the 2nd European Working Session on Learning), 11-30, Bled, Yugoslavia: Sigma Press.

CART & PVM 77.4% train, 77.1% test; S.M. Weiss, I. Kapouleas. An empirical comparison of pattern recognition, neural nets and machine learning classification methods, in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990

Duch W, Adamczak R, Grąbczewski K (1997) Extraction of crisp logical rules using constrained backpropagation networks, International Conference on Artificial Neural Networks (ICNN'97), Houston, 9-12.6.1997, pp. 2384-2389
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Hepatitis.

From UCI repository, 155 vectors, 19 attributes, 13 binary, other integer, class is first.
Two classes, 32 die (20.6%), 123 live (79.4%)
Missing values (here F1=class): F4(1), F6(1), F7(1), F8(1), F9(10), F10(11), F11(5), F12(5), F13(5), F14(5), F15(6), F16(29), F17(4), F18(16), F19(67)

C-MLP2LN rules, overall accuracy 88.4%, using F2=age, F13=Ascites, F15=bilirubin, F20=histology:

R1: age > 52 & bilirubin > 3.5
R2: histology=yes & ascites=no & age = [30,51]

C-MLP2LN, linguistic variables from L-units, overall accuracy 96.1%; this looks good but uses F19=protime, which has missing values in almost half of the cases.

age >= 30 & sex=male & antivirals=no & protime <= 50

Confusion matrix:		Live	Die
	Live	120	3
	Die	3	29

Method	Accuracy %	Reference
C-MLP2LN	???	Our
FSM	90	Our
PVM	??
CART (decision tree)	82.7

Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Cleveland heart disease.

From UCI repository, 303 cases, 13 attributes (4 cont, 9 nominal), many missing values.
2 (no, yes) or 5 classes (no, degree 1, 2, 3, 4).
Class distribution: 164 (54.1%) no, 55+36+35+13 yes (45.9%) with disease degree 1-4.

C-MLP2LN simplified rules 85.5% overall accuracy. Rules for healthy class:

R1: (thal=0 OR thal=1) & ca=0.0 (88.5%)
R2: (thal=0 OR ca=0.0) & cp NOT 2 (85.2%)
ELSE sick (89.2%)
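
Read as a decision procedure, the two rules for the healthy class and the ELSE branch can be sketched as below. This is only an illustration: the function name `cleveland_class` is made up here, and the inputs are assumed to use the same numeric codes for thal, ca and cp that appear in the rules.

```python
def cleveland_class(thal, ca, cp):
    """Return 'healthy' if R1 or R2 fires, otherwise 'sick'."""
    # R1: (thal=0 OR thal=1) & ca=0.0
    r1 = thal in (0, 1) and ca == 0
    # R2: (thal=0 OR ca=0.0) & cp NOT 2
    r2 = (thal == 0 or ca == 0) and cp != 2
    return "healthy" if (r1 or r2) else "sick"


print(cleveland_class(thal=1, ca=0, cp=2))  # healthy (R1 fires)
print(cleveland_class(thal=2, ca=0, cp=1))  # healthy (R2 fires)
print(cleveland_class(thal=2, ca=0, cp=2))  # sick (neither rule fires)
```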

Method	Accuracy %	Reference
C-MLP2LN	82.5	RA, estimated?
FSM	82.2	Rafał Adamczak

Statlog Heart disease.

13 attributes (extracted from 75), no missing values.
270=150+120 observations selected from the 303 cases (Cleveland Heart).

Cost Matrix =	Absence	Presence
	0	1
	5	0
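
A cost matrix of this kind is typically applied by weighting each cell of a confusion matrix with the matching cost. The sketch below shows only the arithmetic. Two things are assumptions and not stated on this page: that rows index the true class and columns the predicted class, in the order (absence, presence), and the counts themselves, which are invented for illustration (their row sums are chosen to match the 150+120 split).

```python
# ASSUMPTION: rows = true class, columns = predicted class,
# both in the order (absence, presence).
COST = [[0, 1],
        [5, 0]]

# HYPOTHETICAL counts, invented only to illustrate the arithmetic;
# they are not results of any classifier listed on this page.
counts = [[130, 20],
          [25, 95]]


def total_cost(cost, counts):
    """Sum of cost[i][j] * counts[i][j] over all cells."""
    return sum(c * k
               for cost_row, count_row in zip(cost, counts)
               for c, k in zip(cost_row, count_row))


n = sum(sum(row) for row in counts)
total = total_cost(COST, counts)
print(total, round(total / n, 3))  # 145 0.537
```

Under this reading, one kind of error is five times as expensive as the other, so a classifier with lower plain accuracy can still have lower average cost.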

Results without risk matrix

Method	Accuracy %	Reference
K*	76.7	WEKA, RA
C-MLP2LN	???	Our
1R	71.4	WEKA, RA
T2	68.1	WEKA, RA
FOIL	64.0	WEKA, RA
RBF	60.0	ToolDiag, RA
InductH	58.5	WEKA, RA

Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Diabetes.

From UCI repository, dataset "Pima Indian diabetes":
2 classes, 8 attributes, 768 instances, 500 (65.1%) healthy, 268 (34.9%) diabetes.

F2 is "Plasma glucose concentration (2 hours oral glucose tolerance) test"
F6 is "Body mass index (weight in kg/(height in m)^2)"

1 rule from SSV, overall accuracy 74.9%, Sensitivity=45.5, Spec.=90.6

IF F#2 > 144.5 then diabetes, else healthy

Rule from C-MLP2LN with L-units, overall accuracy 75%

IF ( F2<=151 AND F6<=47 ) THEN healthy, else diabetes

2 rules from SSV, overall accuracy 76.2%, Sensitivity=60.8, Spec.=84.4

IF F#2 > 144.5 OR (F#2 > 123.5 AND F#6 > 32.55) then diabetes, else healthy
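
The one-rule and two-rule SSV classifiers quoted above translate directly into code. A minimal sketch, with function names chosen here for illustration; `f2` is the plasma glucose concentration and `f6` the body mass index, as defined above.

```python
def ssv_one_rule(f2):
    """IF F2 > 144.5 THEN diabetes ELSE healthy."""
    return "diabetes" if f2 > 144.5 else "healthy"


def ssv_two_rules(f2, f6):
    """IF F2 > 144.5 OR (F2 > 123.5 AND F6 > 32.55) THEN diabetes ELSE healthy."""
    if f2 > 144.5 or (f2 > 123.5 and f6 > 32.55):
        return "diabetes"
    return "healthy"


# A case that the second rule set catches and the single rule misses.
print(ssv_one_rule(130))         # healthy
print(ssv_two_rules(130, 35.0))  # diabetes
```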

Estimation of accuracy (4 leaves in SSV): average of 10 runs, each 10xCV, accuracy 75.2 ±0.6

Confusion matrix:		Healthy	Diabetes
	Healthy	467	159
	Diabetes	33	109

Results from crossvalidation.

Method	Accuracy %	Reference
SSV 5 nodes/BF	75.3±4.8	WD, Ghostminer
SSV opt nodes/3CV/BF	74.7±3.5	WD, Ghostminer
SSV opt prune/3CV/BS	74.6±3.3	WD, Ghostminer
SSV opt prune/3CV/BF	74.0±4.1	WD, Ghostminer
SSV opt nodes/3CV/BS	72.9±4.3	WD, Ghostminer
SSV 5 nodes/BF	74.9±4.8	WD, Ghostminer
SSV 3 nodes/BF	74.6±5.2	WD, Ghostminer
CART	74.5±?	Statlog
DB-CART	74.4±?	Shang & Breiman
ASR	74.3±?	Ster & Dobnikar
CART	72.8±?	Ster & Dobnikar
C4.5	73.0±?	Statlog
Default	65.1±?
C-MLP2LN, overall	75.0±?	Our, 4/99

Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Hypothyroid.

Thyroid, From UCI repository, dataset "ann-train.data":
3772 learning and 3428 testing examples;
Training: 93+191+3488 or 2.47%, 5.06%, 92.47%
Test: 73+177+3178 or 2.13%, 5.16%, 92.71%
21 attributes (15 binary, 6 continuous); 3 classes

C-MLP2LN rules (all values of continuous features are multiplied here by 1000)

Initial rules:

primary hypothyroid: TSH>6.1 & FTI <65
compensated : TSH > 6 & TT4<149 & On_Tyroxin=FALSE & FTI>64 & surgery=False
ELSE normal
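
The initial rules can be sketched as a small function. The function name and argument names are illustrative. Continuous inputs are assumed to be on the scale used in the rules (original values multiplied by 1000), and the rules are assumed to be tried in the order listed, which matters because the two FTI conditions overlap between 64 and 65.

```python
def hypothyroid_class(tsh, fti, tt4, on_thyroxine, surgery):
    """Apply the initial rules in the order listed; fall through to 'normal'."""
    # primary hypothyroid: TSH > 6.1 & FTI < 65
    if tsh > 6.1 and fti < 65:
        return "primary hypothyroid"
    # compensated: TSH > 6 & TT4 < 149 & on thyroxine = false & FTI > 64 & surgery = false
    if tsh > 6 and tt4 < 149 and not on_thyroxine and fti > 64 and not surgery:
        return "compensated"
    return "normal"


print(hypothyroid_class(tsh=10, fti=50, tt4=100, on_thyroxine=False, surgery=False))
print(hypothyroid_class(tsh=10, fti=100, tt4=100, on_thyroxine=False, surgery=False))
print(hypothyroid_class(tsh=2, fti=100, tt4=100, on_thyroxine=False, surgery=False))
```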

Optimized more accurate rules: 4 errors on the training set (99.89%), 22 errors on the test set (99.36%)

primary hypothyroid: TSH>30.48 & FTI <64.27 (97.06%)
primary hypothyroid: TSH=[6.02,29.53] & FTI <64.27 & T3< 23.22 (100%)
compensated : TSH > 6.02 & FTI=[64.27,186.71] & TT4=[50, 150.5) & On_Tyroxin=no & surgery=no (98.96%)
no hypothyroid : ELSE (100%)

Method	% training	% test	Reference
C-MLP2LN rules + ASA	99.9	99.36	Rafał/Krzysztof/Grzegorz
CART	99.8	99.36	Weiss
PVM	99.8	99.33	Weiss

C-MLP2LN rules	99.7	99.0	Rafał/Krzysztof

3 crisp logical rules using TSH, FTI, T3, on_thyroxine, thyroid_surgery, TT4 give 99.3% accuracy on the test set.
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Other, non-medical data

Iris flowers

150 vectors, 50 in each class: setosa, virginica, versicolor
PL=x3=Petal Length; PW=x4=Petal Width

PVM Rules: accuracy 98% in leave-one-out and overall

Setosa	Petal Length <3
Virginica	Petal length >4.9 OR Petal Width >1.6
Versicolor	ELSE

C-MLP2LN rules:

7 errors, overall 95.3% accuracy

Setosa	PL <2.5	100%
Virginica	PL >4.8	92%
Versicolor	ELSE	94%

Higher accuracy: overall 98%

Setosa	PL <2.9	100%
Virginica	PL>4.95 OR PW>1.65	94%
Versicolor	PL=[2.9,4.95] & PW=[0.9,1.65]	100%

100% reliable rules reject 11 vectors, 8 virginica and 3 versicolor:

Setosa	PL <2.9	100%
Virginica	PL>5.25 OR PW>1.85	100%
Versicolor	PL=[2.9,4.9] & PW<1.7	100%
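
Rules that are 100% reliable leave some vectors without a decision. A minimal sketch of that behaviour follows, under two assumptions that the page does not state: the interval PL=[2.9,4.9] is closed at both ends, and a vector is rejected when no rule fires or when more than one rule fires. The function name `iris_reliable` is illustrative.

```python
def iris_reliable(pl, pw):
    """Return the class name, or None when the vector is rejected."""
    fired = []
    if pl < 2.9:
        fired.append("setosa")
    if pl > 5.25 or pw > 1.85:
        fired.append("virginica")
    if 2.9 <= pl <= 4.9 and pw < 1.7:
        fired.append("versicolor")
    # ASSUMPTION: reject unless exactly one rule fires.
    return fired[0] if len(fired) == 1 else None


print(iris_reliable(1.4, 0.2))    # setosa
print(iris_reliable(4.5, 1.5))    # versicolor
print(iris_reliable(6.0, 2.5))    # virginica
print(iris_reliable(5.0, 1.75))   # None (falls between the rules)
```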

Summary:

Method	Accuracy	Reference
PVM 1 rule	97.3	Weiss
CART (dec. tree)	96.0	Weiss
FuNN	95.7	Kasabov
NEFCLASS	96.7	Nauck et.al.
FuNe-I	96.7	Halgamuge
PVM 2 rules	98.0	Weiss, optimal result, corresponds to about 96% in CV tests
C-MLP2LN	98.0	Duch et.al.
SSV	98.0	Duch et.al.
Grobian (rough)	100	Browne; overfitting

References:
S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990
N. Kasabov, Connectionist methods for fuzzy rules extraction, reasoning and adaptation. In: Proc. of the Int. Conf. on Fuzzy Systems, Neural Networks and Soft Computing, Iizuka, Japan, World Scientific 1996, pp. 74-77
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
C. Browne, I. Duntsch, G. Gediga, IRIS revisited: A comparison of discriminant and enhanced rough set data analysis. In: L. Polkowski and A. Skowron, eds. Rough sets in knowledge discovery, vol. 2. Physica Verlag, Heidelberg, 1998, pp. 345-368
D. Nauck, U. Nauck and R. Kruse, Generating Classification Rules with the Neuro-Fuzzy System NEFCLASS. Proc. Biennial Conf. of the North American Fuzzy Information Processing Society (NAFIPS'96), Berkeley, 1996
S.K. Halgamuge and M. Glesner, Neural networks in designing fuzzy systems for real world applications. Fuzzy Sets and Systems 65:1-12, 1994

Mushrooms

8124 instances, 4208 (51.8%) edible and 3916 (48.2%) poisonous;
22 attributes (all symbolic): cap shape (6, e.g. bell, conical, flat...), cap surface (4), cap color (10), bruises (2), odor (9), gill attachment (4), gill spacing (3), gill size (2), gill color (12), stalk shape (2), stalk root (7, many missing values), surface above the ring (4), surface below the ring (4), color above the ring (9), color below the ring (9), veil type (2), veil color (4), ring number (3), spore print color (9), population (6), habitat (7).
Together 118 logical input values.
2480 missing values for attribute 11

C-MLP2LN rules:

Disjunctive rules for poisonous mushrooms, from most general to most specific:

No.	Rule	Accuracy
1	odor=NOT(almond.OR.anise.OR.none)	98.52%, 120 poisonous cases missed
2	spore-print-color=green	99.41%, 48 cases missed
3	odor=none.AND.stalk-surface-below-ring=scaly. AND.(stalk-color-above-ring=NOT.brown)	99.90%, 8 cases missed
4	habitat=leaves.AND.cap-color=white	100% accuracy

Alternative R4' rule: population=clustered.AND.cap_color=white

These rules involve 6 attributes (out of 22). Rule 1 may be replaced by:

odor = creosote.OR.fishy.OR.foul.OR.musty.OR.pungent.OR.spicy

Rules for edible mushrooms are obtained as negation of the rules given above, for example rule:

Re1: odor=(almond.OR.anise.OR.none).AND.spore-print-color=NOT.green

makes 48 errors, giving 99.41% accuracy on the whole dataset.
Several slightly more complex variations on these rules exist, involving other attributes, such as gill_size, gill_spacing, stalk_surface_above_ring, but the rules given above are the simplest found so far.
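
The four disjunctive rules and the edible rule Re1 can be sketched as predicates over a record. This is an illustration only: the function names are made up, the records in the usage lines are hypothetical, and attribute values are assumed to be spelled out as words as in the rules above (the UCI data file encodes them as single letters, so a mapping step would be needed first).

```python
SAFE_ODORS = {"almond", "anise", "none"}


def is_poisonous(m):
    """m maps attribute names (as written in the rules) to value names."""
    r1 = m["odor"] not in SAFE_ODORS
    r2 = m["spore-print-color"] == "green"
    r3 = (m["odor"] == "none"
          and m["stalk-surface-below-ring"] == "scaly"
          and m["stalk-color-above-ring"] != "brown")
    r4 = m["habitat"] == "leaves" and m["cap-color"] == "white"
    return r1 or r2 or r3 or r4


def is_edible_re1(m):
    """Re1: the negation of rules 1 and 2 only."""
    return m["odor"] in SAFE_ODORS and m["spore-print-color"] != "green"


# HYPOTHETICAL records, invented for illustration.
base = {
    "odor": "none", "spore-print-color": "brown",
    "stalk-surface-below-ring": "smooth", "stalk-color-above-ring": "white",
    "habitat": "woods", "cap-color": "brown",
}
print(is_poisonous(base))                            # False
print(is_poisonous({**base, "odor": "foul"}))        # True (rule 1)
print(is_edible_re1({**base, "spore-print-color": "green"}))  # False
```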

Other methods:

[1] BRAINNE: 300 rules, > 8000 antecedents, 91%
[2] STAGGER: asymptoted to 95% classification accuracy after reviewing 1000 instances.
[3] HILLARY algorithm, about 95%

References:

Duch W, Adamczak R, Grabczewski K (1996) Extraction of logical rules from training data using backpropagation networks, in: Proc. of the 1st Online Workshop on Soft Computing, 19-30.Aug.1996, pp. 25-30, available on-line at: http://www.bioele.nuee.nagoya-u.ac.jp/wsc1/
Duch W, Adamczak R, Grabczewski K, Ishikawa M, Ueda H, Extraction of crisp logical rules using constrained backpropagation networks - comparison of two new approaches, in: Proc. of the European Symposium on Artificial Neural Networks (ESANN'97), Bruges, Belgium 16-18.4.1997, pp. 109-114
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
Schlimmer,J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19), Doctoral dissertation, Department of Information and Computer Science, University of California, Irvine.
Iba,W., Wogulis,J., & Langley,P. (1988). Trading off Simplicity and Coverage in Incremental Concept Learning. In Proceedings of the 5th International Conference on Machine Learning, 73-79, Ann Arbor, Michigan: Morgan Kaufmann.

Monk 1

Original rule is: head shape = body shape OR jacket color = red
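
As a predicate, the target concept is a one-liner. A minimal sketch; the function name and the string encoding of the attribute values are assumptions made for illustration.

```python
def monk1(head_shape, body_shape, jacket_color):
    """True when head shape equals body shape or the jacket is red."""
    return head_shape == body_shape or jacket_color == "red"


print(monk1("round", "round", "blue"))    # True
print(monk1("round", "square", "red"))    # True
print(monk1("round", "square", "blue"))   # False
```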

C-MLP2LN:
100% accuracy with 4 rules + 2 exceptions, 14 atomic formulae.

Other systems: see the original paper:
S. Thrun, J. Bala, E. Bloedorn, I. Bratko, B. Cestnik, J. Cheng, K. De Jong, S. Dzeroski, R. Hamann, K. Kaufman, S. Keller, I. Kononenko, J. Kreuziger, R.S. Michalski, T. Mitchell, P. Pachowicz, B. Roger, H. Vafaie, W. Van de Velde, W. Wenzel, J. Wnek, and J. Zhang.
The MONK's problems: A performance comparison of different learning algorithms. Technical Report CMU-CS-91-197, Carnegie Mellon University, Computer Science Department, Pittsburgh, PA, 1991.

Monk 2

Original rule: exactly two of the six features have their first values
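
This concept is a counting condition, which is why it is hard to express with few crisp rules. A minimal sketch follows. The attribute names and the "first value" of each attribute are given here as recalled from the MONK's problems description and should be treated as assumptions; the robots in the usage lines are hypothetical.

```python
# ASSUMPTION: attribute names and their first values, as recalled from
# the MONK's problems description; verify against the original report.
FIRST_VALUES = {
    "head_shape": "round",
    "body_shape": "round",
    "is_smiling": "yes",
    "holding": "sword",
    "jacket_color": "red",
    "has_tie": "yes",
}


def monk2(robot):
    """True when exactly two of the six attributes take their first value."""
    hits = sum(robot[name] == first for name, first in FIRST_VALUES.items())
    return hits == 2


robot_a = {"head_shape": "round", "body_shape": "round", "is_smiling": "no",
           "holding": "flag", "jacket_color": "blue", "has_tie": "no"}
robot_b = {**robot_a, "has_tie": "yes"}
print(monk2(robot_a))  # True  (two first values)
print(monk2(robot_b))  # False (three first values)
```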

C-MLP2LN:
100% accuracy with 16 rules and 8 exceptions, 132 atomic formulae.
Other systems: see the Thrun et al. original paper: The MONK's problems

Monk 3

Original rule:

NOT (body shape = octagon OR jacket color = blue) OR (holding = sword AND jacket color = green)
was corrupted by 5% noise.
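
The noise-free version of this concept can be written as a predicate. A minimal sketch; the function name and the string encoding of values are assumptions made for illustration. By De Morgan's law the first term is the same as "body shape is not octagon AND jacket color is not blue".

```python
def monk3(body_shape, jacket_color, holding):
    """Noise-free target concept of the third problem."""
    return (not (body_shape == "octagon" or jacket_color == "blue")
            or (holding == "sword" and jacket_color == "green"))


print(monk3("round", "red", "flag"))        # True  (first term)
print(monk3("octagon", "green", "sword"))   # True  (second term)
print(monk3("octagon", "green", "flag"))    # False
```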

C-MLP2LN:
100% accuracy with 33 atomic formulae.
Other systems: see the Thrun et al. original paper: The MONK's problems
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Comparison of results:

Method	Monk-1	Monk-2	Monk-3	Remarks
AQ17-DCI	100	100	94.2	Michalski
AQ17-HCI	100	93.1	100	Michalski
AQ17-GA	100	86.8	100	Michalski
Assistant Pro.	100	81.5	100	Monk paper
mFOIL	100	69.2	100	Monk paper
ID5R	79.7	69.2	95.2	Monk paper
IDL	97.2	66.2	--	Monk paper
ID5R-hat	90.3	65.7	--	Monk paper
TDIDT	75.7	66.7	--	Monk paper
ID3	98.6	67.9	94.4	Monk paper
AQR	95.9	79.7	87.0	Monk paper
CLASSWEB 0.10	71.8	64.8	80.8	Monk paper
CLASSWEB 0.15	65.7	61.6	85.4	Monk paper
CLASSWEB 0.20	63.0	57.2	75.2	Monk paper
PRISM	86.3	72.7	90.3	Monk paper
ECOWEB	82.7	71.3	68.0	Monk paper
Neural methods
MLP	100	100	93.1	Monk paper
MLP+reg.	100	100	97.2	Monk paper
Cascade correlation	100	100	97.2	Monk paper
FSM, Gaussians	94.5	79.3	95.5	Duch et.al.
SSV	100	80.6	97.2	Duch et.al.
C-MLP2LN	100	100	100	Duch et.al.
Other methods
kNN, with VDM metric	--	--	98.0	K. Grudziński

NASA Shuttle

Training set 43500, test set 14500, 9 attributes, 7 classes
Approximately 80% of the data belongs to class 1.

Rules obtained from FSM, without optimization:

Class	15 rules, train 99.89%, test 99.81% accuracy	Correct/False
C1	F9 [-14,0]	15043/0
C1	F1 [27,39] and F2 [-16,13]	11612/0
C1	F2 [-22,110] and F9 [-14,2]	26014/0
C1	F2 [-25,7] and F3 [76,83] and F7 [36,58]	11648/0
C2	F2 [18,110] and F4 = 0 and F5 [-188,12]	25/0
C2	F1 [42,59] and F2 [10,50] and F6 [0,59] and F7 [19,37] and F9 [2,24]	10/0
C3	F2 [-118,-22] and F7 [5,71] and F8 [73,103] and F9 [16,86]	58/0
C3	F2 [-318,-31] and F5 [-188,34]	82/0
C3	F2 [-177,-19] and F5 [36,72] and F9 [6,54]	27/0
C3	F2 [-42,-17] and F3 [71,78] and F6 [-14,24] and F9 [2,26]	9/5
C4	F1 [51,67] and F2 [-18,17] and F9 [4,70]	6063/0
C4	F1 [53,66] and F2 [-60,24] and F4 [-29,30] and F9 [8,266]	5564/0
C4	F2 [-12,18] and F3 [64,79] and F7 [4,26] and F9 [8,82]	2634/0
C5	F7 [-48,5]	2458/2
C6	F2 [-4821,-386] and F5 [-46,34]	9/0
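
Each FSM rule in the table is a conjunction of interval conditions on individual features. A minimal sketch of how such a rule can be evaluated follows; it uses only the C5 and C6 rules from the table. The function names are illustrative, the input vectors are hypothetical, and treating the intervals as closed at both ends is an assumption.

```python
def rule_fires(rule, x):
    """rule: dict feature -> (low, high); x: dict feature -> value."""
    # ASSUMPTION: intervals are closed at both ends.
    return all(low <= x[name] <= high for name, (low, high) in rule.items())


# Two rules copied from the table (classes C5 and C6).
RULES = [
    ("C5", {"F7": (-48, 5)}),
    ("C6", {"F2": (-4821, -386), "F5": (-46, 34)}),
]


def classify(x, rules=RULES):
    """Return the labels of all rules that fire for x (possibly none)."""
    return [label for label, rule in rules if rule_fires(rule, x)]


# HYPOTHETICAL input vectors, invented for illustration.
print(classify({"F2": 0, "F5": 0, "F7": 0}))      # ['C5']
print(classify({"F2": -500, "F5": 0, "F7": 40}))  # ['C6']
print(classify({"F2": 0, "F5": 0, "F7": 40}))     # []
```

Returning a list keeps the two special cases visible: an empty list is an unclassified vector, and more than one label signals overlapping rules.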

Rules obtained from FSM, without optimization:

Class	19 rules, train 99.94%, test 99.87% accuracy	Correct/False
C1	F9 [-14,0]	15043/0
C1	F1 [27,44] and F2 [-20,18]	19316/0
C1	F2 [-15,51] and F9 [-14,2]	26003/0
C1	F6 [-13839,-41] and F9 [-356,10]	36/0
C1	F1 [27,50] and F2 [-27,8] and F9 [-14,24]	25563/1
C2	F2 [21,110] and F4 [0,0] and F5 [-188,26]	25/0
C2	F1 [40,57] and F2 [14,59] and F9 [8,22]	12/0
C3	F2 [-102,-37] and F9 [2,28]	46/0
C3	F1 [27,81] and F2 [-138,-24] and F9 [22,88]	60/0
C3	F2 [-64,-21] and F4 [-2,1] and F6 [-37,27] and F9 [2,48]	67/8
C4	F1 [53,61] and F2 [-46,45] and F7 [1,40] and F9 [18,126]	3805
C4	F1 [53,59] and F2 [-4821,275] and F5 [-188,46] and F7 [-48,28]	3512/2
C4	F1 [53,63] and F2 [-19,26] and F4 [-21,50] and F9 [4,126]	6735/0
C5	F4 [-2044,769] and F7 [-48,2]	690/0
C5	F7 [-19,5] and F9 [44,196]	1772/0
C5	F6 [-4,4] and F8 [36,38] and F9 [30,38]	203/0
C6	F2 [-4821,-4475]	3/0
C6	F2 [-4821,-908] and F5 [8,34]	9/0
C6	F2 [275,1958] and F7 [1,54]	6/2

17 optimized FSM rules make only 3 errors on the training set (99.99% accuracy), leaving 8 vectors unclassified, and no errors on the test set but leaving 9 vectors unclassified (99.94%). After Gaussian fuzzification of inputs (very small, 0.05%) only 3 errors and 5 unclassified vectors are obtained for the training and 3 vectors are unclassified and 1 error is made (with the probability of correct class for this case close to 50%) for the test set.

32 rules from SSV gave even better results: 100% correct on the training and only 1 error on the test set.

Satellite image dataset (STATLOG version)

4435 training and 2000 test cases, 36 semi-continuous [0 to 255] attributes (= 4 spectral bands x 9 pixels in neighbourhood) and 6 decision classes: 1,2,3,4,5 and 7 (class 6 has been removed because of doubts about the validity of this class).

Method	% train	% test	Time train	Time test
Dipol92	94.9	88.9	746	111
Radial	88.9	87.9	564	74
CART	92.1	86.2	330	14
Bayesian Tree	98.0	85.3	248	10
C4.5	96.0	85.0	434	1
New ID	93.3	85.0	226	53

Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Ionosphere

200 training, 150 test cases, 34 continuous attributes, 2 classes

Method	Accuracy %	Reference
3-NN + simplex	98.7	Our ???
3-NN	96.7	our
IB3	96.7	Aha
MLP+BP	96.0	Sigillito
C4.5	94.9	Hamilton
RIAC	94.6	Hamilton
C4 (no windowing)	94.0	Aha
Non-linear perceptron	92.0	Sigillito
FSM + rotation	92.8	our
1-NN	92.1	Aha
DB-CART	91.3	Shang, Breiman
Linear perceptron	90.7	Sigillito
CART	88.9	Shang, Breiman

N. Shang, L. Breiman, ICONIP'96, p.133
David Aha: k-NN+C4+IB3 (Aha & Kibler, IJCAI-1989), IB3 parameter settings: 70% and 80% for acceptance and dropping respectively.
RIAC, C4.5 from: H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.

Sonar

208 cases, 60 continuous attributes, 2 classes
From the CMU benchmark repository

Method	Train %	Test %	Reference
MLP+BP, 12 hidden	99.8±0.1	84.7±5.7	Gorman, Sejnowski
MLP+BP, 24 hidden	99.8±0.1	84.5±5.7	Gorman, Sejnowski
1-NN, Manhattan		84.2±1.0	our (KG)
MLP+BP, 6 hidden	99.7±0.2	83.5±5.6	Gorman, Sejnowski
FSM - methodology ?		83.6	our (RA)
1-NN Euclidean		82.2±0.6	our (KG)
DB-CART, 10xCV		81.8	Shang, Breiman
CART, 10xCV		67.9	Shang, Breiman

Our kNN results are also from 13xCV; results from 10xCV are quite similar, for example 84.5±0.9 for 1-NN Manhattan.

Vowel

528 training, 462 test cases, 10 continuous attributes, 11 classes
From the CMU benchmark repository

Method	Train	Test	Reference
CART-DB, 10xCV on total set	90.0		Shang, Breiman
CART, 10xCV on total set	78.2		Shang, Breiman

FSM initialization, methodology ?		84.4	our (RA)
9-NN		56.5	our ?
Square node network, 88 units		54.8	UCI
Gaussian node network, 528 units		54.6	UCI
1-NN		54.1	UCI
Radial Basis Function, 528 units		53.5	UCI
Gaussian node network, 88 units		53.5	UCI
Square node network, 22		51.1	UCI
Multi-layer perceptron, 88 hidden		50.6	UCI
Modified Kanerva Model, 528 units		50.0	UCI
Radial Basis Function, 88 units		47.6	UCI
Single-layer perceptron, 88 hidden		33.3	UCI

N. Shang, L. Breiman, ICONIP'96, p.133, used 10xCV instead of the test set.

Other Data

Glass: Shang, Breiman CART 28.6% error, DB-CART 29.4%

DNA-Primate splice-junction gene sequence

Links to other Duch-Lab projects, many talks and to other papers on various subjects. Maintained by Wlodzislaw Duch.
