Title: Evaluating subset selection methods for use case points estimation
Author: Šilhavý, Radek; Šilhavý, Petr; Prokopová, Zdenka
Document type: Peer-reviewed article (English)
Source document: Information and Software Technology. 2018, vol. 97, p. 1-9
ISSN: 0950-5849
DOI: https://doi.org/10.1016/j.infsof.2017.12.009
Abstract: When the Use Case Points method is used for software effort estimation, users are faced with low model accuracy, which affects its practical application. This study investigates the significance of using subset selection methods for the prediction accuracy of Multiple Linear Regression models obtained by the stepwise approach. K-means, Spectral Clustering, the Gaussian Mixture Model and Moving Window are evaluated as appropriate subset selection techniques. The methods were evaluated according to several evaluation criteria and then statistically tested. Evaluation was performed on two independent datasets, which differ in project types and size. Both were split by the hold-out method. When clustering was used, the training sets were clustered into 3 classes, and an independent regression model was created for each class. These models were later used for the prediction of the testing sets. When Moving Window was used, window sizes of 5, 10 and 15 were tested. The results show that clustering techniques decrease prediction errors significantly when compared to the Use Case Points or Moving Window methods. Spectral Clustering was selected as the best-performing solution because it achieves a Sum of Squared Errors reduction of 32% for the first dataset and 98% for the second dataset. For the second dataset, the Mean Absolute Percentage Error is less than 1% for Spectral Clustering, 9% for Moving Window and 27% for Use Case Points. For the first dataset, prediction errors are significantly higher: 53% for Spectral Clustering, while Use Case Points produces 165%. It can be concluded that subset selection techniques are a significant method for improving the prediction ability of the linear regression models used for software development effort prediction. It can also be concluded that the clustering methods perform better than the Moving Window method.
Full text: https://www.sciencedirect.com/science/article/pii/S0950584917305153
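
The cluster-then-regress procedure described in the abstract (cluster the training set into 3 classes, fit one regression model per class, then predict each test project with the model of its class) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single hypothetical size predictor, a one-dimensional K-means with a deterministic start, and simple rather than stepwise multiple linear regression. All data values are synthetic and chosen only for illustration, and the helper names (`kmeans_1d`, `fit_clustered`, and so on) are invented here.

```python
from statistics import mean

def nearest(centroids, v):
    """Index of the centroid closest to scalar v."""
    return min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))

def kmeans_1d(values, k=3, iters=50):
    """Lloyd's algorithm on scalars (assumed stand-in for the paper's clustering)."""
    s = sorted(values)
    # Deterministic start: evenly spaced order statistics of the sorted data.
    centroids = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centroids, v)].append(v)
        new = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if new == centroids:
            break
        centroids = new
    return centroids

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b

def fit_clustered(xs, ys, k=3):
    """Cluster the training set, then fit one independent model per cluster."""
    centroids = kmeans_1d(xs, k)
    models = []
    for i in range(k):
        pts = [(x, y) for x, y in zip(xs, ys) if nearest(centroids, x) == i]
        # Fall back to the global fit if a cluster ends up empty.
        models.append(fit_line(*zip(*pts)) if pts else fit_line(xs, ys))
    return centroids, models

def predict(centroids, models, x):
    """Predict with the model belonging to the nearest cluster."""
    a, b = models[nearest(centroids, x)]
    return a + b * x

def sse(actual, predicted):
    """Sum of Squared Errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100 * mean(abs(a - p) / a for a, p in zip(actual, predicted))

# Synthetic hold-out split: size predictor (x) and effort (y), three regimes.
train_x = [10, 12, 14, 50, 52, 54, 100, 104, 108]
train_y = [200, 240, 280, 1500, 1560, 1620, 4000, 4160, 4320]
test_x = [11, 51, 106]
test_y = [220, 1530, 4240]

centroids, models = fit_clustered(train_x, train_y, k=3)
clustered_preds = [predict(centroids, models, x) for x in test_x]

# Baseline: one regression model fitted on the whole training set.
a, b = fit_line(train_x, train_y)
global_preds = [a + b * x for x in test_x]

print("SSE  clustered:", sse(test_y, clustered_preds), "global:", sse(test_y, global_preds))
print("MAPE clustered:", mape(test_y, clustered_preds), "global:", mape(test_y, global_preds))
```

Because the synthetic data follow a different linear relation in each size range, the per-cluster models fit the test points far better than the single global line. That is the effect the abstract reports, though the magnitudes here say nothing about the real datasets.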