Leadership Research Summary:
• Understanding the traits that define a leader is a perennial quest. An ongoing debate surrounds the complexity required to unravel the leader trait paradigm. With the advancement of machine learning, scholars are now better equipped to model leadership as an outcome of complex patterns in traits. However, interpreting those models is often the harder task. This paper guides researchers in applying machine learning techniques to uncover complex relationships. Specifically, it demonstrates how machine learning can help assess the complexity of a relationship, and it presents techniques that help interpret the outcomes of “black box” machine learning algorithms.
• To demonstrate these techniques, we use the Big Five Inventory and need for cognition to predict leadership role occupancy. In our sample (n = 3385), we find that the leader trait paradigm can benefit from modeling complexity beyond linear effects, and we generate several interpretable results.
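The idea of assessing whether a relationship is more complex than linear can be illustrated with a toy comparison under cross-validation. The sketch below is not the analysis reported in the paper. The data are synthetic (a hypothetical, noise-free U-shaped relation between a single trait score and an outcome), and a simple k-nearest-neighbour average stands in for a flexible learner such as RF so that the example needs only the Python standard library. All function names are illustrative assumptions.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour average: a flexible, non-parametric stand-in."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def cv_mse(fit, xs, ys, folds=5):
    """Mean squared error on held-out folds (interleaved fold assignment)."""
    errors = []
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        test = [i for i in range(len(xs)) if i % folds == f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((model(xs[i]) - ys[i]) ** 2 for i in test)
    return mean(errors)

# Synthetic U-shaped relationship (hypothetical trait score x, outcome y).
xs = [i / 10 for i in range(-20, 21)]
ys = [x ** 2 for x in xs]

linear_mse = cv_mse(fit_linear, xs, ys)
knn_mse = cv_mse(fit_knn, xs, ys)
print(f"linear CV MSE: {linear_mse:.3f}")
print(f"k-NN   CV MSE: {knn_mse:.3f}")
```

When the held-out error of the flexible model is clearly lower than that of the linear model, as it is for this deliberately curved toy relationship, that is a signal that non-linear effects may be worth testing deductively. Because both errors are computed on data the models did not see, the comparison is less prone to rewarding overfitting.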
Leadership Research Implications and Findings:
• The findings suggest that further exploration of the leader trait paradigm could benefit from (deductively) testing non-linear effects. In studying these effects, scholars should consider cross-validation to avoid overfitting and biased results. Moving forward, scholars can use algorithmic machine learning to find novel insights for deductive testing using conventional analytical approaches. Careful consideration should be given to the predictor variables selected for the models. As reported in the correlation table (Table 1), leadership role occupancy was more common among men and older individuals. In a post-hoc analysis, we examined whether personality yielded different effects when gender and age were included in the models.
• We found similar effects of personality, contingent on gender and age, such that the likelihood of leadership role occupancy was higher for older men. The more variables included in the algorithmic machine learning models, the more complex the patterns that can be explored. Insights derived from such models may change if important predictor variables are omitted. Using the output of algorithmic machine learning as input for conventional analytical approaches is needed to test the validity of findings, and it yields clear parametric outcomes suitable for meta-analytical tests. The takeaway is that LM and algorithmic methods such as random forest (RF) are not in competition. Rather, they are complementary tools for scientific discovery.
• The methods demonstrated in this paper are also relevant to practitioners and offer great opportunities for further exploration of the leader trait paradigm. Models like the ones we developed could inform organizations in the selection of future leaders and the succession planning for current leaders. We found that modeling more complex effects through algorithmic models is suitable when the goal is to precisely identify the majority of actual leaders rather than to broadly identify (or recall) all leaders. Thus, RF potentially reduces the trial-and-error costs of leader selection.
• As the study of machine learning and leadership matures, efforts can be made to further reduce such costs by combining trait data with other relevant factors, such as performance measures. Increasing data dimensionality will likely add to the RF advantage when it comes to identifying high potentials faster and with greater accuracy (Spisak et al., 2019).
• Incorporating algorithmic machine learning techniques into decision-making also allows for a better understanding of (a) who has a personality profile similar to individuals in leadership positions and (b) how effective they will likely be in a leadership position. It is important to make this distinction between leadership role occupancy and effectiveness given that the ability to occupy a leadership position does not necessarily translate into effectiveness.
• With machine learning techniques similar to those highlighted in this paper, academics and practitioners can explore the potentially complex differences between leadership role occupancy and leadership effectiveness to reduce the chance of false positives and false negatives (i.e., selecting leaders low on effectiveness, or not selecting effective leaders who are low on emergent qualities). Incorporating machine learning could help academics forge a new understanding of what makes an effective leader and help practitioners remove even more cost from the leader selection process.
• Despite its added value, machine learning is not without limitations. Data is a perpetual concern, particularly with respect to the “four V’s” of big data: volume, variety, velocity, and veracity. First, though the amount of data we used here was large relative to typical datasets in leadership research, it was by no means “big data”, the space where machine learning is at its best (i.e., volume). Second, in efforts to advance the leader trait paradigm, we focused on traits. However, several other features such as height and intelligence (Ilies et al., 2004) are important for leadership as well (i.e., variety).
• Third, we studied leader traits cross-sectionally and did not focus on changes in the data, such as changes in leadership role occupancy. These would be especially interesting to monitor as changes in contextual factors such as market competition and disruption alter who is preferred as a leader (i.e., velocity). Fourth, though the sample is from a trusted source and uses validated measures, there is the perennial concern with data quality (i.e., veracity). Scholars will need to remain vigilant when it comes to quality as the hunt for compelling datasets intensifies.
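The distinction drawn above between precisely identifying leaders and recalling all leaders, and the related notions of false positives and false negatives, can be made concrete with a small calculation. The sketch below is a minimal illustration, not the paper's evaluation: the labels are invented (1 = occupies a leadership role, 0 = does not), and the two prediction vectors are hypothetical stand-ins for a selective model and a broad one.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for binary labels coded as 0/1."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical data: 4 actual leaders among 10 people.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# A selective model flags few people, all of them actual leaders.
selective_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A broad model flags many people and captures every actual leader.
broad_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

selective = precision_recall(actual, selective_pred)
broad = precision_recall(actual, broad_pred)
print("selective (precision, recall):", selective)
print("broad     (precision, recall):", broad)
```

In this toy case the selective model makes no false positives but misses half of the actual leaders, while the broad model misses none but flags three non-leaders. Which trade-off is preferable depends on the relative cost of each error type in a given selection process.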