University of Western Ontario – Susmita Haldar and Luiz Fernando Capretz
Project Title: Interpretable Software Maintenance and Support Effort using Machine Learning
Research aims and objectives: This paper was presented at the 2024 International Conference on Software Engineering. Software maintenance and support effort consume a significant part of the software project budget. Manually estimating the total hours required for this phase can be very time-consuming, and often differs from the actual cost that is incurred. The automation of these estimation processes can be implemented with the aid of machine learning algorithms. This study contributes to the development of the maintenance and support effort prediction model. This study concluded that staff size, application size, and number of defects are major contributors to the maintenance and support effort prediction models.
University of Western Ontario – Susmita Haldar and Luiz Fernando Capretz
Project Title: Explainable Software Defect Prediction from Cross Company Project Metrics using Machine Learning
Research aims and objectives: This paper was presented at the 2023 7th International Conference on Intelligent Computing and Control Systems. Predicting the number of defects in a project is critical for project test managers. This study predicts defects from project-level information based on a cross-company project dataset.
University of Costa Rica – Marcelo Jenkins, Leonardo Villalobos-Arias, Christian Quesada-López, Jose Guevara-Coto, Alexandra Martínez
Project Title: Evaluating Hyper-parameter Tuning using Random Search in Support Vector Machines for Software Effort Estimation
Research aims and objectives:
In this paper, we investigate to what extent the random search (RS) hyper-parameter tuning approach affects the accuracy and stability of support vector regression (SVR) in software effort estimation (SEE). Results were compared to those obtained from ridge regression models and grid search (GS)-tuned models. A case study with four data sets extracted from the ISBSG 2018 repository shows that random search exhibits similar performance to grid search, rendering it an attractive alternative technique for hyper-parameter tuning. RS-tuned SVR achieved an increase of 0.227 standardized accuracy with respect to default hyper-parameters. In addition, random search improved prediction stability of SVR models to a minimum ratio of 0.840. The analysis showed that RS-tuned SVR attained performance equivalent to GS-tuned SVR.
Date: October 2020
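As a rough illustration of the random search tuning studied in this project, the following Python sketch samples hyper-parameter configurations at random and keeps the one with the lowest error. It is not the authors' implementation: the function `random_search`, the log-uniform sampling, the parameter names `C`, `epsilon` and `gamma`, their bounds, and the stand-in `toy_error` function are all assumptions made for this example. In practice `score` would train an SVR model and return its cross-validated error.

```python
import math
import random


def random_search(score, space, n_iter=50, seed=0):
    """Return (best_params, best_score) after n_iter random draws.

    score: callable mapping a dict of hyper-parameters to an error to minimise.
    space: dict mapping each name to (low, high) bounds, both positive.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_params, best_score = None, math.inf
    for _ in range(n_iter):
        # Draw each value log-uniformly so every order of magnitude is
        # equally likely to be explored.
        params = {
            name: math.exp(rng.uniform(math.log(low), math.log(high)))
            for name, (low, high) in space.items()
        }
        current = score(params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical search space for an SVR model (bounds are illustrative only).
SPACE = {"C": (1e-2, 1e3), "epsilon": (1e-4, 1.0), "gamma": (1e-4, 1.0)}


def toy_error(params):
    """Stand-in for a cross-validated error; replace with real model fitting."""
    return sum((math.log10(value) + 1.0) ** 2 for value in params.values())


if __name__ == "__main__":
    best, err = random_search(toy_error, SPACE, n_iter=60)
    print(best, err)
```

Grid search would instead evaluate every combination on a fixed grid; random search spends the same evaluation budget on independently drawn points.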
National Institute of Technology, Calicut – India – Sreekumar P.Pillai
Project Title: Software Effort Estimation Employing Machine Learning Techniques
Research aims and objectives: The research aims to explore and find the best possible Machine Learning/Deep Learning technique that can be applied to an industrial context in Software Effort Estimation.
Date: 2019
George Mason University – USA – Brett Josephson
Project Title: Leveraging Machine Learning Approaches to Create Predictive Metrics of Software Development Success and Failure
Research aims and objectives: Researchers have recently started leveraging data analytics and Machine Learning techniques on code directories to determine the correlation between software development effort, status, and issues and project success factors. Using the latest techniques in Machine Learning, our goal is to leverage the ISBSG project database to uncover key software development metrics that predict critical project success metrics, which managers and developers could then apply to improve their efforts.
Date: Dec 2019
George Washington University – USA – Sandra Forney
Praxis Title: An Agile Estimation Framework (ASEF)
Research aims and objectives: This project aims to develop a decision framework that can be used by project and engineering managers to assess the suitability of the Agile methodology for their project and estimate the likelihood of project success in terms of cost, schedule, and overall meeting stakeholder objectives.
Date: August 2019
Jaypee University of Engineering & Technology – India – Dinesh Kumar Verma
Project Title: Advancement in Software Development Lifecycle using Computational Techniques
Research aims and objectives: To facilitate the software development process by predicting required effort accurately and as early as possible.
Date: 2019
Johns Hopkins University – USA – John Piorkowski
Project Title: Improving Software Acquisition Practices with AI for Government Systems
Research aims and objectives: Use of data analytics and machine learning to provide insights into cost, time and quality of government projects.
Date: 2018
Thammasat University – Thailand – Tachanun Kangwantrakool
Project Title: Software Development Effort Estimation using LSTM Networks on Project Description
Research aims and objectives: This paper investigates the effectiveness of long short-term memory (LSTM) networks for estimating development effort, in man-days, from the project description of a software development project. Three architectures (average word-vector with softmax, single LSTM layer with dense net, and double LSTM layer with dense net) are compared in terms of man-day difference. This research will benefit stakeholders who work in project management.
Date: Dec 2018
Tomas Bata University – Czech Republic – Radek Silhavy
Project Title: Computational Statistics and Machine Learning Methods for Functional Points Analysis Optimisation
Research aims and objectives: Evaluate existing machine-learning models for FPA models. Study project productivity in businesses, companies and countries. Feature selection method designs for algorithm estimation.
Date: Dec 2018
King Mongkut’s Institute of Technology – Thailand – Katawut Kaewbanjong
Project Title: Predicting Software Project Outcomes
Research aims and objectives: Investigate whether performance for software projects can be predicted using neural networks and regression models.
Date: 2018
PSG College of Technology – India – P. Sampathkumar
Project Title: Application of Machine Learning Techniques in Software Engineering
Research aims and objectives: The use of machine learning algorithms to predict software project effort and cost estimation.
Date: 2018
University Putra – Malaysia – Koh Tieng Wei
Project Title: Solving the problem of subjectivity in the Technical Complexity Factors
Research aims and objectives: Each Technical Complexity Factor (TCF) sub-factor is rated by its degree of influence on an ordinal scale limited to six values (0, 1, 2, 3, 4 and 5). Six levels are sometimes clearly insufficient. Since the TCF ranges from 0.65 to 1.35, it can change the Unadjusted Function Point count by ±35%. Thus, the uncertainty inherent in the subjective sub-factor ratings can have a significant effect on the final function point count.
In addition, the TCF is limited to 14 factors, which are not enough to calculate the adjustment value of the function point count. This part can be improved by extending the current 14 factors to cover other characteristics of software, and by using Artificial Intelligence techniques such as fuzzy logic or other techniques.
Estimated date of completion: August 2007
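The ±35% range quoted in this entry follows from the standard IFPUG adjustment formula, TCF = 0.65 + 0.01 × (sum of the 14 degrees of influence). The Python sketch below shows that arithmetic. The formula is the published IFPUG one rather than anything specific to this project, and the function names and sample ratings are assumptions made for illustration.

```python
def technical_complexity_factor(degrees_of_influence):
    """Return the TCF for 14 ratings, each an integer from 0 to 5."""
    if len(degrees_of_influence) != 14:
        raise ValueError("expected exactly 14 degree-of-influence ratings")
    if any(not 0 <= rating <= 5 for rating in degrees_of_influence):
        raise ValueError("each rating must lie between 0 and 5")
    # All ratings 0 gives 0.65; all ratings 5 gives 0.65 + 0.70 = 1.35.
    return 0.65 + 0.01 * sum(degrees_of_influence)


def adjusted_function_points(unadjusted_count, degrees_of_influence):
    """Scale an Unadjusted Function Point count by the TCF."""
    return unadjusted_count * technical_complexity_factor(degrees_of_influence)


if __name__ == "__main__":
    ratings = [3] * 14  # hypothetical project rated 3 on every factor
    print(technical_complexity_factor(ratings))
    print(adjusted_function_points(100, ratings))
```

Because each rating moves the TCF by 0.01, changing a single subjective rating by one level shifts the adjusted count by one percent of the unadjusted count.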
University of Otago – New Zealand – George Benwell
Project Title: Application of Bayesian networks to software effort prediction
Research aims and objectives: Establish the applicability of Bayesian networks to software development and maintenance effort prediction. The results of the investigation will be used to develop a formal methodology for constructing Bayesian network software effort prediction models. Estimated completion date: September 2006
University of the West of England (UWE) – UK – Ayman Issa
Project Title: Development of new software cost estimation method(s) based on Use Case Models
Research aims and objectives: To seek answers to the following questions:
Can the Use Case model be utilised to size the effort required to develop a given software development project?
To what extent can this effort be accurate in the early stages of software development?
To what extent can Use Case models assist in assessing the complexity of software systems?
Does the current Use Case modelling approach need to be extended to provide a more accurate estimate of the effort required and the system complexity?
Estimated completion date: Sept 2006
Tel Aviv University – Israel – Berlin Stanislav
Project Title: Comparative analysis of effort estimation approaches
Research aims and objectives:
Classification and comparison of effort estimation methods. Identify methods for improving effort estimating accuracy. Estimated completion date: January 2006
George Washington University – USA – Kingshuk Banerjee
Project Title: Recognising patterns in software development lifecycles using data mining techniques
Research aims and objectives: Explore the correlation between different aspects of software projects. The correlation patterns that may be recognised in the process can help in the planning and scheduling of new projects.
Date: October 2005
Katholieke Universiteit Leuven – Belgium – Monique Snoeck
Project Title: Developing software effort estimation models using neural network rule extraction and support vector machines
Research aims and objectives: The goal of this project is to develop improved software effort estimation models based on a combination of LOC-based and FPA-based techniques. We want to investigate how specific patterns of technological choices affect the productivity of software development, both in terms of generated lines of code and development effort required.
We will tackle the above research question using both neural network rule extraction and support vector machines. Recently, neural networks have gained a lot of interest for developing software effort estimation models. However, a major obstacle that impedes the practical use of neural networks is their opacity. Although they can model complex non-linear relationships, their black-box property essentially turns them into incomprehensible and user-unfriendly models. Hence, we will try to mimic the behaviour of the trained neural network by a set of if-then rules, which are easier for the software engineer to interpret and understand. Furthermore, we will also investigate the power of support vector machines (SVMs) for software effort estimation. SVMs are a recently suggested technique for pattern recognition and circumvent many of the problems encountered when training neural networks (e.g. choice of the number of hidden neurons, regularisation parameter, …). Again, it will be investigated how SVMs can be turned into white-box, self-explanatory models using rule extraction.
Date: September 2005
University of Tennessee at Martin – USA – Denise Williams
Project Title: Exploring Project Success – please note that this is a working title and will probably be changed at a later time
Research aims and objectives: The research aim of this project is to test various hypotheses and models about information system project success by using the data in the database. A review of the IS literature in this area did not provide results which would indicate that these efforts would be redundant. This project would test new hypotheses and test hypotheses which are supported by the literature. Expected date of completion: August 2005
Stevens Institute of Technology – Hoboken, USA – Christine V Bullen
Project Title: Improving business information systems management using the SEI core measures
Research aims and objectives: To explore the benefits of organisational competence in software process measurement and strategies for overcoming various software process improvement challenges using the SEI core measures (size, time, effort and defects).
Date: June 2005
University of Calgary – Canada – Jingzhou Li
Project Title: Effort prediction for release planning by analogy and simulation support
Research aims and objectives: Our new analogy-based effort estimation method is based on different types of attribute values of features or requirements and will be used for release planning for incremental software development. Simulation technique is employed for exploring suitable methods according to different conditions.
Date: February 2005
Athens University of Economics and Business – Greece – Zorgios Yiannis
Project Title: Intellectual Capital Pricing Methods in Software Engineering
Research aims and objectives: Develop a model that estimates the real value of software engineers’ intellectual capital. Develop a new costing methodology for capturing costs related to Intellectual Capital creation and usage. Estimate software engineers’ average learning curve.
Date: 2005
George Mason University – USA – Hui Zeng
Project Title: Economical Model for Software Testing Effort
Research aims and objectives: Provide estimation models for software testing effort.
Date: Dec 2004
Tampere University of Technology – Finland – Pasi Vakaslahti
Project Title: Corporate networks: Software co-development in the context of product process, value chains and business models.
Date: Dec 2004
University of Oviedo – Project Engineering – Spain – Joaquin Villanueva Balsera
Project Title: Software Effort Estimation based using hybrid methods
Research aims and objectives: Develop software effort models based on hybrid techniques, combining neural networks with genetic algorithms and MARS (Multivariate Adaptive Regression Splines), and compare them to commonly used regression-type models.
Estimated completion date: December 2004
Universidad Politécnica de Valencia – Silvia Abrahão
Project Title: Comparative Evaluation of Web Projects
Research aims and objectives: The objective is to collect information about web projects developed using conceptual modelling techniques. These projects will be measured using a functional size measurement method for web applications (OOmFPweb) in order to generate project indicators (size, effort, cost, and duration). Then, these projects will be compared and benchmarked using the data stored in the ISBSG repository.
Completion Date: December 2004
University of Minnesota – MIS Research Centre, Carlson School of Management – USA – Chulmo Koo
Project Title: Software productivity analysis using data envelopment analysis; A comparison of development and maintenance.
Research aims and objectives: This study identifies a set of factors that influence cost estimation for both project construction and maintenance.
Date: December 2004
University of Western Ontario – Canada – Fabiano Tiba
Project Title: Estimation and classification of projects productivity
Research aims and objectives: To build a pattern recognition system that would analyse software projects’ data and from the parameters estimate productivity (measured as a ratio between UFP and size) and classify them according to it.
Estimated completion date: December 2004
Northumbria University – UK – Qin Liu
Project Title: Machine Learning Approaches to Estimating Software Development Effort
Research aims and objectives: Research the use of machine learning approaches in estimation related to software development processes. Techniques such as genetic algorithms and artificial neural networks will be hybridised to provide an adaptive estimation tool.
Date: September 2004
Villanova University – USA – Dr. Wenhong Luo
Project Title: Estimating Software Customization Cost in ERP Implementation
Research aims and objectives: This proposed research project seeks to develop and validate a cost estimation model for ERP software customization. The model should take into consideration the differences between packaged and custom-built software projects and include measures of different customization activities, such as table configuration and code customization. The proposed model will be validated using new data collected from the industry.
Completion Date: Fall 2004
University of Maribor – Faculty of Computer Science – Ales Zivkovic
Project Title: Improving early estimates in object oriented development using statistical data.
Research aims and objectives: According to existing mappings of OO concepts to FPA elements it is hard to define types of data element and/or transactional function. Therefore it might be better to use an industrial average for relationships between FPA elements. The same information is very useful in early estimates. Within the project we would like to analyse the ISBSG repository and extract those values predominantly for an OO project although data for other projects may be useful too.
Date: July 2004
University of South Florida – Michael Douglas
Project Title: The Impact of the Software Handoff on Software Development
Research aims and objectives: To improve software cost estimation models by including a missing variable, the software handoff. The software handoff deals with inter-group communication and coordination. Estimated completion date: May 2004
Universidad Nacional de Córdoba – Dr. Sergio A. Cannas
Project Title: Software Development Effort Estimating using Neural Networks
Research aims and objectives: Perform a statistical comparative study of the efficiency of effort estimation using algorithmic and neural network methods. The main objective is to obtain an efficient neural network-based method for estimating software development effort.
Estimated completion date: May 2004
Harvard University – USA – Feng Zhu
Project Title: The Trend in Software Pricing
Research aims and objectives: Measuring the trend in software pricing over time in order to develop price indices of software. Cluster analysis will be conducted to group relevant attributes, then hedonic regression analysis will be conducted.
Date: November 2004
ICREA – Complex Systems Lab – Spain – Sergi Valverde
Project Title: Modelling the Software Development Process
Research aims and objectives: Model the evolution of software systems looking for first principles explaining the Rayleigh equation relating to effort and size.
Date: December 2004
University of Western Ontario – Canada – Vivian Wei Xia
Project Title: Neuro-Fuzzy Function Points Model
Function Point analysis has been improved consistently up to the current version, IFPUG 4.1. Surprisingly, by contrast, the weights of Unadjusted Function Points, which are very important parameter values for measuring Function Points, have never been changed since they were first suggested by Albrecht at IBM in 1979, although software development has boomed since then. We try to evaluate whether the parameters assigned 20 years ago can still effectively reflect the trend of software today. We are proposing a model to calibrate the weights of Function Points using a Neuro-Fuzzy technique, which is expected to outperform the original Function Points in estimating software project work effort.
Research aims and objectives: Neuro-Fuzzy Function Points model prototype
Our new Neuro-Fuzzy Function Points model is composed of three layers: input, processing and output. In the input layer, the model derives fuzzy rule antecedents (IF part) from expert experience, which is subjective knowledge. The model also imports the project data provided by ISBSG, which is objective knowledge. In the processing layer, neural networks are used to train the weights of Unadjusted Function Points and the fuzzy rules. In the output layer, the Neural Network Training block generates the calibrated weights of Unadjusted Function Points and the fuzzy rule consequences (THEN part), which completes the fuzzy IF-THEN rules.
Estimated completion date: Sept 2004
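To make concrete what is being calibrated, the Python sketch below computes an Unadjusted Function Point count from the standard IFPUG complexity weights, which are the fixed values this project proposes to replace with trained ones. The weight table holds the published IFPUG values, not this project's calibrated output; the function name `unadjusted_function_points`, the `weights` parameter and the example counts are assumptions made for illustration.

```python
# Standard IFPUG weights per component type and complexity level.
IFPUG_WEIGHTS = {
    "EI": {"low": 3, "average": 4, "high": 6},     # External Inputs
    "EO": {"low": 4, "average": 5, "high": 7},     # External Outputs
    "EQ": {"low": 3, "average": 4, "high": 6},     # External Inquiries
    "ILF": {"low": 7, "average": 10, "high": 15},  # Internal Logical Files
    "EIF": {"low": 5, "average": 7, "high": 10},   # External Interface Files
}


def unadjusted_function_points(counts, weights=IFPUG_WEIGHTS):
    """Sum weight * count over every (component, complexity) pair.

    counts: dict mapping (component, complexity) to the number of
            components of that kind found in the application.
    weights: weight table; pass a calibrated table to compare results.
    """
    return sum(
        weights[component][complexity] * number
        for (component, complexity), number in counts.items()
    )


if __name__ == "__main__":
    # Hypothetical application: 2 simple inputs and 1 complex internal file.
    example = {("EI", "low"): 2, ("ILF", "high"): 1}
    print(unadjusted_function_points(example))
```

Passing a different table through `weights` is one simple way to compare a calibrated weight set against the standard one on the same project counts.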
Concordia University – Canada – Real Carbonneau
Project Title: Modelling and Predicting Software Development Project Outsourcing Decisions using Artificial Intelligence.
Research aims and objectives: The objective is to study the feasibility of using Artificial Intelligence to model and predict software development outsourcing decisions. Using data collected from software development projects, the model will learn what patterns lead to an in-source or outsource decision. The model will then be examined to learn which patterns assist in making the decision, and its predictions will be validated on a test set.
Date: May 2004
University of Houston – USA – Kimberly Kamsinsky
Project Title: Calibration of Software Development Effort Estimation using Genetic Programs
Research aims and objectives: To investigate the possibility of calibrating existing, or discovering new, effort multipliers using a Genetic Program.
Date: December 2003
University of Sunderland, UK – John Moses
Project Title: Assessing the influence of Problem Domain on effort estimation consistency
Research aims and objectives: To determine whether the influences of Problem Domain and related factors on effort estimate consistency are important enough to require additional contingency allowances and to quantify any needed allowances.
Estimated completion date: September 2003
CMTE, University of Toronto, Canada – Mette Asmild
Project Title: Software Productivity Analysis using Data Envelopment Analysis
Research aims and objectives:
Develop software productivity models based on the non-parametric Data Envelopment Analysis (DEA) approach. Compare them to the commonly used regression-type models.
Date: October 2003
Dakota State University, USA – Jayanth Sri Ranga Vadla
Project Title: Research into software cost estimation using fuzzy neural networks
Research aims and objectives:
To develop and prove a software cost estimation tool using fuzzy neural networks.
Date: July 2003
Bournemouth University, UK – Martin Shepperd
Project Title: Investigating the Use of Case-based Reasoning for Project prediction
Research aims and objectives:
Apply our CBR tool (archangel) to a large industrial data set.
Date: June 2003
National Taiwan University of Science and Technology – Dr. Sun-Jen Huang
Project Title: Establishing an Improvement Decision Model for the Priority of Process Areas Based on Continuous CMMI Model
Research aims and objectives:
To provide guidance on process improvement for a software development organization, based on the Capability Maturity Model Integration (CMMI) released by the Software Engineering Institute (SEI) in 2000.
Universität Augsburg, Germany – Johannes Maria Zaha
Project Title: Comparison of classic software engineering techniques to component oriented software engineering.
Research aims and objectives:
To estimate the potential for improvement of quality, time and cost when using component strategies.
Aristotle University Thessaloniki, Greece – Dr Ioannis Stamelos
Project Title: Software effort estimation based on multi-organisational project data
Research aims and objectives:
To investigate the feasibility and accuracy of various estimation approaches, when applied on a multi-organisational project database. In addition, various methods will be applied through the use of tools, (Angel, BRACE) dedicated to effort estimation.
Federal University of Paraná, Brazil – Prof. Aurora Trinidad Ramirez Pozo
Project Title: Artificial Intelligence on Software Engineering Applications
Research aims and objectives:
Study and development of tools to help software engineering activities, using the ISBSG data to estimate cost, effort and other metrics using machine learning tools.
École de technologie supérieure – Montreal, Canada
Project Title: Analysis of impact of work effort categories
Research aims and objectives:
Analyse how effort categories impact the quality of estimation models
La Trobe University – Australia – Claire Coleman
Project Title: Case based reasoning for prediction of specification change effect on software development.
Middle East Technical University – Ankara, Turkey – Cigdem Gencel
Project Title: Software size estimation in pre-development phase
Research aims and objectives:
Most of the software estimates should be performed at the beginning of the life cycle, when we do not yet know the problem we are going to solve. In the literature, there are few early size estimation methods, which can be utilized in the pre-development phases (before the software requirements are specified). The aim of this study is to develop a model that can estimate size early in the life-cycle.
National ICT Australia – Australia – Jacky Keung
Project Title: Multivariate Data Analysis on the ISBSG dataset
Research aims and objectives:
Build a new/improved analogy estimation model. Compare the prediction accuracy of Analogy approach to the proposed multivariate model.
IST Studies – Athens, Greece
Project Title: Software Development Effort and Quality Estimation using Neural Networks
Research aims and objectives:
Create neural network models for the estimation of quality and development effort of software projects. Check the efficiency of constructed models.
University of Cyprus – Cyprus – Andreas Andreou
Project Title: Modelling and prediction of software engineering empirical data via computational intelligence.
Université Libre de Bruxelles (ULB), Belgium – Frédéric Geurts, PhD
Project Title: Evaluation of estimation metrics: impact of analogy, neural network and statistical techniques on improving software development budget and duration estimation.
Research aims and objectives:
The objective is to benchmark and improve budget and duration estimation metrics applied to software development. We are conducting a review of the main techniques available, based on LOC and function points. In particular, we focus on analogy-based techniques inspired by recent successes in data mining. We are also investigating neural network-based approaches and more classical statistical techniques (PCA, multi-variance estimation etc).
University of Oxford – UK – Zhizhong Jiang
Project Title: The effect of project management and language choices on software development
Research aims and objectives:
Analysing the effect of development methodology on productivity, and the effect of project characteristics on the choice of development methodology.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Mónica Martínez-Gómez, Jose-Maria Torralba-Martinez
Project Title: Sensitivity of results to different data quality meta-data criteria in the sample selection of projects from the ISBSG data set
Research aims and objectives:
Most prediction models, e.g. for effort estimation, require preprocessing of data. Some datasets, such as ISBSG, contain data quality meta-data which can be used to filter out low-quality cases from the analysis. However, researchers have not yet reached agreement on these data quality selection criteria. Aims: This paper aims to analyze the influence of data quality meta-data criteria on the number of selected projects, which can in turn influence the models obtained.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Sanae Elmouaden, Jose-Maria Torralba-Martinez
Project Title: Software Effort Estimation using NBC and SWR. A comparison based on ISBSG projects
Research aims and objectives:
Compared to traditional methods, Bayesian networks are being increasingly used in software engineering because their use opens many possibilities. A main feature of Bayesian networks is their capability to combine data and expert knowledge. This paper seeks to reinforce the hypothesis that Bayesian networks are a competitive method for estimating software effort in terms of prediction accuracy. For this purpose, a Naive Bayes Classifier (NBC) model and a forward Stepwise Regression (SWR) model have been developed from a subset of the ISBSG dataset. Under homogeneous conditions we found similar results, provided that the discretization of the continuous variables is fine enough.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Fernando González-Ladrón-de-Guevara
Research aims and objectives:
The aim of this study is to determine how and to what extent, ISBSG has been used by researchers from 2000 until June of 2012. Method: A systematic mapping review was used as the research method, which was applied to over 129 papers obtained after the filtering process. The papers were published in 19 journals and 40 conferences. This work presents a snapshot of the existing usage of ISBSG in software development research. ISBSG offers a wealth of information regarding practices from a wide range of organizations, applications, and development types, which constitutes its main potential.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Fernando González-Ladrón-de-Guevara
Project Title: ISBSG variables most frequently used for software effort estimation – A mapping review
Research aims and objectives:
ISBSG data makes it possible to estimate a project’s size, effort, duration, and cost. The aim was to analyze the ISBSG variables that have been used by researchers for software effort estimation from 2000 until the end of 2013. A systematic mapping review was applied to over 167 papers obtained after the filtering process.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Fernando González-Ladrón-de-Guevara, Chris Lokan
Project Title: Use of ISBSG data fields in software effort estimation – a systematic mapping study
Research aims and objectives:
A common use of the ISBSG data is to investigate models to estimate a software project’s size, effort, duration, and cost. The aim of this paper is to determine which and to what extent variables in the ISBSG dataset have been used in software engineering to build effort estimation models. We propose guidelines that can help researchers make informed decisions about which ISBSG variables to select for their effort estimation models.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Fernando González-Ladrón-de-Guevara
Project Title: Application of mutual information-based sequential feature selection to ISBSG mixed data
Research aims and objectives:
There is still little research work focused on feature selection (FS) techniques including both categorical and continuous features in Software Development Effort Estimation (SDEE) literature. This paper addresses the problem of selecting the most relevant features from ISBSG data to be used in SDEE. The aim is to show the usefulness of splitting the ranked list of features provided by a mutual information-based sequential FS approach in two, regarding categorical and continuous features.
Universidad Politécnica de Valencia – Marta Fernández-Diego, Fernando González-Ladrón-de-Guevara
Project Title: Potential and limitations of the ISBSG dataset in enhancing software engineering research: A mapping review.
Universidad Politécnica de Valencia – Marta Fernández-Diego, José-María Torralba-Martínez
Project Title: Discretization Methods for NBC in Effort Estimation – An Empirical Comparison based on ISBSG Projects
University of Queensland, Australia – Sophie Cockcroft
Project Title: Metrics in information systems maintenance and volatility
Research aims and objectives:
The aim of this project is to test the hypotheses that a) maintenance estimates become more accurate as the project ages, b) estimate accuracy varies according to the type of work being done, and c) estimate accuracy varies according to industry sector, country or other demographic variable. Additionally, it is of interest to establish how the decision is made to continue to maintain a system rather than retire it.