CMU-CS-05-145
Computer Science Department
School of Computer Science, Carnegie Mellon University



Automated Modeling and Nonlinear Axis Scaling

Leejay Wu

May 2005

Ph.D. Thesis



Keywords: Scaling, modeling, feature selection


This thesis examines nonlinear axis scaling and its impact on the modeling of inter-attribute relationships. Through automated methods, the described system identifies candidate scaling methods, decides which attributes serve as inputs or outputs, and builds regression trees that quantify these relationships. While the experiments focus on the accuracy and complexity of these models, both of which can be examined quantitatively, the results also consider applicability to the inherently more qualitative task of rule-based outlier or anomaly detection. The results demonstrate that nonlinear axis scaling, even in an automated system, can provide significantly more accurate models than the unscaled case without proportionally higher complexity costs, and can also help reveal unusual tuples in which what is unusual is not any individual value but the combination of values.

179 pages

