Towards improving the efficiency of software development effort estimation via clustering analysis

Vo Van, Hai; Ho, Le Thi Kim Nhung; Prokopová, Zdenka; Šilhavý, Radek; Šilhavý, Petr

Document type:

Peer-reviewed scholarly article (English)

Source document:

IEEE Access. 2022, vol. 10, p. 83249-83264

ISSN:

2169-3536

DOI:

https://doi.org/10.1109/ACCESS.2022.3185393

Abstract:

Introduction: Precise estimation of software effort is a significant difficulty that project managers face during software development. Inaccurate forecasting leads to either overestimating or underestimating software effort, which can be detrimental to stakeholders. The International Function Point Users Group's Function Point Analysis (FPA) method is one of the most critical methods for software effort estimation. However, the practice of applying the FPA method in the same fashion across all software areas needs to be reexamined. Aim: We propose a model for evaluating the influence of data clustering on software development effort estimation and then finding the best clustering method. We call this model the effort estimation using machine learning applied to the clusters (EEAC) model. Method: We cluster the dataset using each clustering method and then apply the FPA and EEAC methods to the resulting clusters for effort estimation. The clustering methods used in this study comprise five categorical variable criteria (Development Platform, Industry Sector, Language Type, Organization Type, and Relative Size) and the k-means clustering algorithm. Results: The experimental results show that the estimation accuracy obtained with clustering consistently exceeds the accuracy obtained without clustering for both the FPA and EEAC methods. Notably, with the FPA method, the average improvement rate of clustering over no clustering was highest in terms of the RMSE, at 58.06%. With the EEAC method, this figure reached 65.53%. The Industry Sector categorical variable achieves the best estimation accuracy compared with the other clustering criteria and k-means clustering. The improvement in accuracy in terms of the RMSE when applying this criterion is 63.68% for the FPA method and 72.02% for the EEAC method. Conclusion: Better results are obtained with dataset clustering than without clustering for both the FPA and EEAC methods. The Industry Sector is the most suitable clustering method among those tested.
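The comparison described in the abstract (estimate effort once on the whole dataset, once per cluster, then compare the RMSE) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy project records are invented, the one-parameter model effort ≈ rate × size stands in for the FPA and EEAC estimators, the error is measured on the same records used for fitting, and the improvement rate is taken to be the relative reduction in RMSE, which may differ from the definition used in the article.

```python
from collections import defaultdict
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_rate(projects):
    """Least-squares slope through the origin for effort ~ rate * size.

    Hypothetical stand-in for a real estimator; not the article's model.
    """
    numerator = sum(p["size"] * p["effort"] for p in projects)
    denominator = sum(p["size"] ** 2 for p in projects)
    return numerator / denominator


def improvement_rate(rmse_baseline, rmse_clustered):
    """Relative RMSE reduction in percent (assumed definition)."""
    return 100.0 * (rmse_baseline - rmse_clustered) / rmse_baseline


# Toy records invented for this sketch; "sector" plays the role of a
# categorical clustering criterion such as Industry Sector.
projects = [
    {"sector": "banking", "size": 100, "effort": 1000},
    {"sector": "banking", "size": 200, "effort": 2100},
    {"sector": "banking", "size": 300, "effort": 2900},
    {"sector": "retail", "size": 100, "effort": 400},
    {"sector": "retail", "size": 200, "effort": 850},
    {"sector": "retail", "size": 300, "effort": 1150},
]
actual = [p["effort"] for p in projects]

# Baseline: a single rate fitted on the whole dataset (no clustering).
global_rate = fit_rate(projects)
baseline_pred = [global_rate * p["size"] for p in projects]

# Clustered: one rate per category value, applied to that cluster only.
clusters = defaultdict(list)
for p in projects:
    clusters[p["sector"]].append(p)
cluster_rates = {name: fit_rate(members) for name, members in clusters.items()}
clustered_pred = [cluster_rates[p["sector"]] * p["size"] for p in projects]

rmse_baseline = rmse(actual, baseline_pred)
rmse_clustered = rmse(actual, clustered_pred)
print(f"RMSE without clustering: {rmse_baseline:.2f}")
print(f"RMSE with clustering:    {rmse_clustered:.2f}")
print(f"Improvement rate:        {improvement_rate(rmse_baseline, rmse_clustered):.2f}%")
```

Because the sketch fits and evaluates on the same records, the clustered error can never exceed the baseline error; a realistic comparison would evaluate on held-out projects.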

Full text:

https://ieeexplore.ieee.org/document/9803030

Files in this record

Name: Preprint_1011054.pdf

Size: 1.333 MB

Format: PDF

Description: preprint

Citation (ČSN ISO 690:2011)

Journal article citation:
VO VAN, Hai, Le Thi Kim Nhung HO, Zdenka PROKOPOVÁ, Radek ŠILHAVÝ and Petr ŠILHAVÝ. Towards improving the efficiency of software development effort estimation via clustering analysis. IEEE Access [online]. 2022, vol. 10, pp. 83249-83264. [cit. 2026-07-29]. ISSN 2169-3536. Available from: https://ieeexplore.ieee.org/document/9803030.

These citations were generated by software and may contain errors. To verify their accuracy, consult the relevant citation standard or manual.