University of Houston
Department of Computer Science
Data mining is a powerful new technology for extracting hidden patterns and relationships from data. One of its most important applications is forecasting the future characteristics of data from past and present data. In this thesis, we propose a forecasting method, called the Fine Partitioning method, and demonstrate that prediction based on Fine Partitioning is more accurate than prediction without it. The method partitions the data into multiple sub-groups based on selected attributes. If the behavior of the past data used for forecasting is stable (i.e., it does not deviate much from its average value), we use it directly to forecast the future data. If the behavior is not stable, we divide the data into sub-groups and check the stability of each sub-group. We repeat this process until every sub-group is stable, and then predict the future data by linearly combining the data of the stable sub-groups. To test the accuracy of the method, we run a simulation on test data generated from a normal distribution and compare the results obtained with finely partitioned groups against those obtained without fine partitioning.

The second problem considered in this thesis is the following. Once a group is divided into sub-groups, the behavior of the overall data is available, but the behavior of some sub-groups might not be. We propose a method based on an optimization algorithm, Feasible Sequential Quadratic Programming, to estimate the behavior of these sub-groups.
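
The partitioning procedure described above (split until every sub-group is stable, then combine linearly) can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis. The row layout `(attrs, period, value)`, the relative tolerance `tol`, the stability test, and the choice of combination weights (each sub-group's share of the rows in the most recent period) are all assumptions made for illustration, as are the function names.

```python
from collections import defaultdict
from statistics import mean


def period_means(rows):
    """Average value per period for rows of the form (attrs, period, value)."""
    by_period = defaultdict(list)
    for _attrs, period, value in rows:
        by_period[period].append(value)
    return [mean(by_period[p]) for p in sorted(by_period)]


def is_stable(series, tol=0.1):
    """Assumed criterion: no period mean deviates from the average by more than tol (relative)."""
    avg = mean(series)
    if avg == 0:
        return all(v == 0 for v in series)
    return all(abs(v - avg) <= tol * abs(avg) for v in series)


def forecast(rows, split_attrs, tol=0.1):
    """Forecast the next-period average, splitting on attributes while the group is unstable."""
    series = period_means(rows)
    # A stable group (or one with no attributes left to split on) is used directly.
    if is_stable(series, tol) or not split_attrs:
        return mean(series)

    # Otherwise divide the group on the next attribute.
    attr, rest = split_attrs[0], split_attrs[1:]
    subgroups = defaultdict(list)
    for row in rows:
        subgroups[row[0][attr]].append(row)

    # Assumed weights: each sub-group's share of the rows in the latest period.
    last = max(period for _attrs, period, _value in rows)
    n_last = sum(1 for _attrs, period, _value in rows if period == last)

    total = 0.0
    for sub in subgroups.values():
        weight = sum(1 for _attrs, period, _value in sub if period == last) / n_last
        total += weight * forecast(sub, rest, tol)
    return total


# Hypothetical data: two segments with constant values but a shifting mix.
rows = (
    [({"segment": "A"}, 1, 10.0)] * 3 + [({"segment": "B"}, 1, 20.0)] * 1
    + [({"segment": "A"}, 2, 10.0)] * 2 + [({"segment": "B"}, 2, 20.0)] * 2
    + [({"segment": "A"}, 3, 10.0)] * 1 + [({"segment": "B"}, 3, 20.0)] * 3
)

with_partitioning = forecast(rows, ["segment"])
without_partitioning = forecast(rows, [])
```

In this made-up example the overall per-period averages (12.5, 15.0, 17.5) are unstable, while each segment is perfectly stable. Forecasting without partitioning returns the overall average, 15.0. Forecasting with partitioning combines the two stable segment values with the latest mix and returns 17.5.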