Some industrial psychologists might be deeply concerned about the disruptive effects that artificial intelligence will have on humans in the workplace. Yet not all industrial psychologists in South Africa recognise the practical uses of machine learning for decision making about people’s potential to flourish. Data science unifies statistics and machine learning in ways that can be useful for analysing and making sense of social phenomena. For example, Hampton, Asadi, and Olson (2018) used machine learning to better predict income attainment from numerous biographical variables (for example, age, race, ethnicity, and height) as well as delay discounting behaviours (similar to impulsivity). No prior studies could account for all the predictive variables simultaneously, and multicollinearity between the independent variables presented a serious problem in running the predictive model (Hampton et al., 2018). The overlap between the independent variables was postulated to stem from a shared higher-order construct, namely socio-economic status (Hampton et al., 2018). Problems regarding multicollinearity, together with the need to model continuous, categorical, and dichotomous variables simultaneously, made machine learning a more promising alternative to traditional methods of data analysis (Hampton et al., 2018). According to Hampton et al. (2018), machine learning also offered algorithms that were less sensitive to outliers and the option not to assume a linear relationship between the variables (for example, the relationship between age and income is curvilinear). Through machine learning the researchers discovered that delay discounting was more predictive of income attainment than age, race, ethnicity, and height, a finding that provides important information about behaviours that should be encouraged and discouraged in society.
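To illustrate why the option not to assume linearity matters, the following minimal Python sketch uses synthetic, purely hypothetical data in which income rises until age 45 and then declines. The peak age, the income figures, and the helper name pearson_r are assumptions made for illustration only; they are not taken from Hampton et al. (2018). The sketch uses only the Python standard library.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: ages 25 to 65, with income peaking at age 45 (assumed).
ages = list(range(25, 66))
incomes = [50_000 - 40 * (age - 45) ** 2 for age in ages]

# A purely linear measure sees almost no relationship between age and income.
linear_r = pearson_r(ages, incomes)

# The same data are perfectly explained once the curve is taken into account.
curved_r = pearson_r([(age - 45) ** 2 for age in ages], incomes)

print(f"linear r = {linear_r:.3f}, curvilinear r = {curved_r:.3f}")
```

In this constructed example the linear correlation is essentially zero even though age determines income completely. A method that assumes linearity would therefore dismiss a strong predictor, whereas a method that can represent curved relationships would not.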
It is important to note that machine learning is not a magical tool that can make sense of poorly structured or insufficient data. The old principle of “garbage in, garbage out” still applies.
Hampton et al. (2018) went to great lengths to prepare their data through the removal of outliers, the discretization of features, and feature selection. Implicit biases in the collection of data were also considered. In the case of Hampton et al. (2018), a large amount of data was collected from a group (n = 2564) that was heterogeneous in terms of age, education, ethnicity, and race. The algorithm was, therefore, not trained and tested on the typical unrepresentative, homogeneous sample of Caucasian, well-educated, and wealthy students.
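Two of these preparation steps, outlier removal and discretization, can be sketched in a few lines of Python. The exact procedures used by Hampton et al. (2018) are not described here, so the interquartile-range rule, the equal-width binning, the function names, and the income values below are all assumptions chosen for illustration. The sketch uses only the Python standard library.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep values within k * IQR of the quartiles (a common rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def discretize_equal_width(values, n_bins=3):
    """Map each value to a bin index from 0 to n_bins - 1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would fall just past the last bin, so it is clamped.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical annual incomes; the final value is an implausible extreme.
incomes = [28_000, 31_000, 35_000, 40_000, 42_000,
           47_000, 52_000, 60_000, 950_000]

cleaned = remove_outliers_iqr(incomes)
bins = discretize_equal_width(cleaned, n_bins=3)

print(cleaned)
print(bins)
```

Here the extreme value is dropped before binning. Had it been kept, it would have stretched the range so far that every other income fell into the lowest bin, which is one small example of how unprepared data can undermine a model.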
If properly understood and used, machine learning algorithms can be a useful methodology for analysing complex data in order to find solutions to problems. However, learning the technique requires a certain level of computational prowess, which takes time to master. Industrial psychologists in South Africa are urged to experiment with the technique and to find potential applications of data science in their daily practices. Many open-source tools, such as Python, can be used to conduct machine learning.
Hampton, W. H., Asadi, N., & Olson, I. R. (2018). Good things for those who wait: Predictive modelling highlights the importance of delay discounting for income attainment. Frontiers in Psychology, 9, 1-10. https://doi.org/10.3389/fpsyg.2018.01545