An Online Semi-Supervised Clustering Algorithm Based on a Self-organizing Incremental Neural Network

Toshiaki Ishii

2007

This paper presents an online semi-supervised clustering algorithm based on a self-organizing incremental neural network (SOINN). Using labeled data and a large amount of unlabeled data, the proposed semi-supervised SOINN (ssSOINN) can automatically learn the topology of input data distribution without any prior knowledge such as the number of nodes or a good network structure; it can subsequently divide the structure into sub-structures as the need arises. Experimental results we obtained for artificial data and real-world data show that the ssSOINN has superior performance for separating data distributions with high-density overlap and that ssSOINN Classifier (S3C) is an efficient classifier.

I. INTRODUCTION

Numerous incremental learning algorithms based on neural networks, which are called incremental or competitive neural networks, have been proposed and applied in many application domains as classification models. A salient advantage of incremental neural networks is their capability of operating with information of new data incrementally. Thereby, they can improve their performance by gradually increasing their structural complexity. Well-known examples of incremental or competitive neural networks include Kohonen's self-organizing map (SOM) (1) and Growing Cell Structures (GCS) (2). Neural network architecture consisting of SOM or GCS layer followed by a Radial Basis Function output layer (3) can realize supervised learning. In addition, the Growing Neural Gas (GNG) architecture is a modification to the GCS, in which the dimensionality of topological structures is not predefined, but is instead discovered during training. In recent years, further modifications have been proposed: Life-long Learning Cell Structures (LLCS) (4) for supervised classification and Self-Organizing Incremental Neural Network (SOINN) (5) for unsupervised clustering. In these methods, the insertion of nodes is stopped automatically. Thereby, these methods avoid the permanent increase of nodes. They are intended to balance stability and plasticity.
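The general idea of an incremental network that grows its structure as data arrive can be illustrated with a minimal sketch. This is not SOINN, ssSOINN, GCS, or GNG; it is a generic online prototype learner written only for illustration. The function names (`nearest`, `online_update`), the fixed distance `threshold`, and the learning rate `lr` are all assumptions and do not come from the paper.

```python
import math

def nearest(nodes, x):
    """Return (index, distance) of the node closest to x."""
    best_i, best_d = -1, math.inf
    for i, w in enumerate(nodes):
        d = math.dist(w, x)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d

def online_update(nodes, x, threshold=1.0, lr=0.1):
    """Process one input: insert a new node or adapt the winner.

    threshold and lr are illustrative assumptions, not values from the paper.
    """
    if not nodes:
        nodes.append(list(x))          # first input becomes the first node
        return nodes
    i, d = nearest(nodes, x)
    if d > threshold:
        nodes.append(list(x))          # input is novel: grow the network
    else:
        # input is familiar: move the winning node slightly toward it
        nodes[i] = [w + lr * (xi - w) for w, xi in zip(nodes[i], x)]
    return nodes

# Inputs arrive one at a time; the network starts empty.
stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.0, 0.2)]
nodes = []
for x in stream:
    online_update(nodes, x)
```

With this stream, the learner ends with two nodes, one near each group of inputs. A fixed threshold is the simplest possible insertion rule; it does not by itself stop node growth on noisy data, which is the kind of issue the methods cited above are designed to address.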
A main challenge in the design of efficient and robust learning algorithms is that new datasets are continually added to an already huge database. The use of the huge dataset introduces two issues. The first is how to learn new knowledge without forgetting previous knowledge. This problem is considered to be a main difficulty of the incremental learning
