In partial fulfillment of the Requirements for the Degree of
Master of Science
In this thesis, we investigate the ability of three nonparametric density estimation techniques to distinguish between signal and background events in high energy physics, using the K* dataset for our experiments. Bayes’ classification is used with three different density estimation techniques: Parzen windows, k-nearest neighbors (kNN), and a new hybrid method, called the kNN-Parzen method, that combines the properties of the other two.
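As a rough illustration of how Bayes' classification can be paired with a Parzen-window or kNN density estimate, consider the minimal sketch below. It is not the thesis implementation: the Gaussian kernel, the bandwidth `h`, the value of `k`, the function names, and the two-feature toy data are all assumptions made for illustration, and the hybrid kNN-Parzen method is not shown because its definition is not given here.

```python
import math

def parzen_density(x, samples, h):
    """Parzen-window estimate of p(x) using a Gaussian kernel of bandwidth h (assumed kernel)."""
    d = len(x)
    norm = (2.0 * math.pi) ** (d / 2.0) * h ** d
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(samples) * norm)

def knn_density(x, samples, k):
    """kNN estimate p(x) ~ k / (n * V), where V is the volume of the ball reaching the k-th neighbor."""
    d = len(x)
    dists = sorted(math.dist(x, s) for s in samples)
    radius = max(dists[k - 1], 1e-12)  # guard against a zero radius
    unit_ball = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
    return k / (len(samples) * unit_ball * radius ** d)

def bayes_classify(x, class_samples, density):
    """Return the class that maximizes prior * estimated class-conditional density."""
    n_total = sum(len(s) for s in class_samples.values())
    best_label, best_score = None, -1.0
    for label, samples in class_samples.items():
        prior = len(samples) / n_total
        score = prior * density(x, samples)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy two-feature data, invented purely for illustration (not K* data).
train = {
    "signal": [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2)],
    "background": [(3.0, 3.1), (2.9, 2.8), (3.2, 3.0), (3.1, 2.9)],
}

parzen_label = bayes_classify((0.1, 0.0), train, lambda x, s: parzen_density(x, s, h=0.5))
knn_label = bayes_classify((3.0, 3.0), train, lambda x, s: knn_density(x, s, k=2))
```

The same decision rule is reused for both estimators; only the density function changes. With many features, the products and powers above can underflow or overflow, so a practical version would likely compare log-densities instead.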
The classifiers are applied to three versions of the K* dataset: (1) the full dataset containing all 43 features, (2) a dataset containing the top 5 features selected using Information Gain as the criterion, and (3) a dataset containing the best 18 features selected through a forward-backward search using the nearest neighbor method with leave-one-out validation.
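To make the Information Gain criterion concrete, the sketch below ranks features by the reduction in class entropy obtained after splitting each feature into bins. The equal-width binning, the number of bins, the function names, and the toy data are assumptions for illustration; how the continuous K* features were actually discretized is not stated here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, bins=4):
    """Entropy reduction from splitting one feature into equal-width bins (assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature: everything falls in one bin
    groups = {}
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), bins - 1)
        groups.setdefault(b, []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def top_features(rows, labels, n_top):
    """Indices of the n_top features with the highest information gain."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)[:n_top]

# Toy data invented for illustration: feature 0 separates the classes, feature 1 does not.
rows = [(0.0, 1.0), (0.1, 0.0), (0.2, 1.0), (0.9, 0.0), (1.0, 1.0), (0.8, 0.0)]
labels = ["s", "s", "s", "b", "b", "b"]
best = top_features(rows, labels, n_top=1)
```

In the setting described above, `n_top` would be 5 and each row would hold 43 feature values.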
Our overall best classifier used kNN on the full-feature dataset to achieve a testing accuracy of 95.09%. kNN also consistently outperformed the other classifiers on all the other datasets used. A comparison with results from nonparametric density estimation methods shows that our methods performed well and even outperformed the Random Forest method, which had previously been reported as having the best performance on the K* dataset.
Date: Thursday, December 1, 2005
Time: 4:00 PM
Place: 550-PGH
Faculty, students, and the general public are invited.
Thesis Advisor: Dr. Ricardo Vilalta