What inspires us

Academic Research

Our data science team is always learning and experimenting, which keeps it closely connected to research in this area around the world.

Personality testing in employment settings: Problems and issues in the application of typical selection practices

Why we love it

This paper just nails everything that is wrong with traditional psychometric testing for hiring. It is great to read a visionary paper, published in 2001, that foresaw all these issues almost 20 years ago.

What I learnt from it

Some of the issues highlighted are:

  1. the (in)appropriateness of linear selection models;
  2. the problem of personality-related self-selection effects;
  3. the multi-dimensionality of personality;
  4. bias associated with social desirability, impression management, and faking in top-down selection models; and
  5. the legal implications of personality assessment in employment contexts.

Why it’s a must

Before fixing the issues of traditional psychometric testing for hiring, it is important to understand and acknowledge them. It is also gratifying to see how PredictiveHire's approach addresses some of them. We use machine learning models that can capture complex non-linear relationships. Moreover, our models learn complex multi-dimensional relationships between the hiring outcome and a broad range of signal variables, not just personality. We assess candidates through a chat-like, text-based conversation and have found it much harder to game than the multiple-choice questionnaires used in psychometric testing. As for bias, we continually evaluate our models for gender and ethnicity bias and remediate any biases we find.

Who should pay attention

CHROs, Hiring managers, Talent Acquisition teams

Notes from the frontier: Tackling bias in AI (and in humans)

Why we love it

The authors provide a succinct yet comprehensive overview of how biases can be baked into AI, ways to mitigate them, and how well-developed AI can help reduce human bias in decision making. It is written for a non-technical audience but includes a great list of original research in the endnotes for anyone interested in further reading.

What I learnt from it

It is great to see our own experience building AI solutions in a domain (talent acquisition) where unconscious bias is commonplace resonate with what the authors describe, especially in how AI can help bring human bias to light and the steps to follow when building AI solutions with no measurable biases. Some of the key points include:

  • Having a clear and applicable definition of bias and fairness. This includes thinking about group vs. individual fairness, protected characteristics, predictive parity vs. error rate parity, etc.
  • Being aware of biases in data, data collection methods, and societally unacceptable correlations (algorithmic biases) learnt by algorithms, all of which are discoverable within a sound machine learning process.
  • Using methods such as Local Interpretable Model-agnostic Explanations (LIME) to explain the outcomes of seemingly complex algorithms that act as black boxes.
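The core idea behind LIME can be sketched without the `lime` library itself: explain a black-box model around one input by fitting a simple, interpretable model on perturbed samples near that input, weighted by closeness. The function and values below are a minimal one-dimensional illustration of that idea, not the actual library API or any model from the paper.

```python
import numpy as np

def black_box(x):
    # Stand-in for a complex model that acts as a black box; here f(x) = x^2.
    return x ** 2

def local_linear_explanation(f, x0, n_samples=500, width=0.5, seed=0):
    """Fit a weighted linear surrogate to f around x0 and return its slope."""
    rng = np.random.default_rng(seed)
    xs = x0 + rng.normal(0.0, width, n_samples)             # perturb around x0
    ys = f(xs)
    weights = np.exp(-((xs - x0) ** 2) / (2 * width ** 2))  # closeness kernel
    # Weighted least squares for y ≈ a*x + b in the neighbourhood of x0.
    X = np.column_stack([xs, np.ones_like(xs)])
    W = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(X * W, ys * W[:, 0], rcond=None)
    return coef[0]  # the local slope is the "explanation" of f at x0

slope = local_linear_explanation(black_box, x0=2.0)
print(round(slope, 1))  # close to the true local gradient of x^2 at 2, i.e. 4
```

The surrogate's slope recovers the black box's local behaviour even though we never look inside it, which is exactly what makes this family of methods useful for auditing opaque models.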

Why it’s a must

As AI becomes pervasive, the topic of algorithmic bias and fairness has attracted a lot of attention. This is a great short paper for decision makers, especially at C-level, to demystify the topic; I would call it a must-read for all CHROs exploring the use of AI in their workflows. In particular, the five suggestions listed in the paper's conclusion form a framework for maximising fairness and minimising bias when using AI.

Who should pay attention

CHROs, Hiring managers, Business leaders

Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach

Why we love it

These two papers highlight how remarkably well one's personality can be inferred from language use, and how these inferred personality profiles can be used to find the best congruence between one's personality and the personality demands of jobs. They in fact provided inspiration and a benchmark for our own 'language use to personality' model.

Links

https://doi.org/10.1371/journal.pone.0073791 and

https://doi.org/10.1073/pnas.1917942116

What I learnt from it

It was in fact scary at first to see how much we give away about who we are in our conversations. After coming to terms with that, it was encouraging for the data-science side of me to see how open-vocabulary approaches can outperform the dictionary-based closed-vocabulary approaches used in previous studies of language use. And when these models are applied to tweets from different professionals, it is amazing to see how different professions have distinct personality signatures. I can personally vouch for the high openness and low agreeableness found among top open-source developers.
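The contrast between the two feature-extraction styles discussed above can be sketched in a few lines. The tiny lexicon below is a toy stand-in for a hand-built closed-vocabulary dictionary (LIWC-style categories), not the actual resources used in these papers:

```python
from collections import Counter

# Toy stand-in for a closed-vocabulary lexicon mapping words to categories.
LEXICON = {"happy": "positive", "sad": "negative"}

def closed_vocab_features(tokens):
    # Closed vocabulary: count only words found in a fixed dictionary,
    # aggregated into its predefined categories.
    return Counter(LEXICON[t] for t in tokens if t in LEXICON)

def open_vocab_features(tokens):
    # Open vocabulary: let every word and word pair observed in the data
    # become a feature, with no predefined dictionary.
    unigrams = Counter(tokens)
    bigrams = Counter(" ".join(b) for b in zip(tokens, tokens[1:]))
    return unigrams + bigrams

tokens = "so happy to ship happy code".split()
print(closed_vocab_features(tokens))        # only lexicon hits survive
print(open_vocab_features(tokens)["happy"]) # every observed token counts
```

The open-vocabulary representation keeps signal the dictionary never anticipated, which is one plausible reading of why it outperforms closed-vocabulary features in these studies.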

Why it’s a must

These two pieces of research reaffirm the importance of congruence between one's personality and job or career. As Denissen et al. (2018) put it, "economic success depends not only on having a 'successful personality' but also, in part, on finding the best niche for one's personality". In evaluating this congruence, one's language use can reveal a lot about them, and machine learning models can capture the complex non-linear relationships between personality and the personality demands of jobs.

Who should pay attention

ML/DS practitioners, HR professionals, Personality junkies

Data Science through the looking glass and what we found there

Why we love it

The authors, from Microsoft, perform one of the largest analyses of data science projects to date, focusing on information that helps both data science solution builders and practitioners. They analyse publicly accessible Python notebooks on GitHub and machine learning pipelines from a corporate machine learning platform, AnonSys. While some findings are unsurprising, such as the four-fold growth in the number of notebooks from 2017 to 2019 and Python emerging as the de facto standard for data science, others are quite interesting.

What I learnt from it

Some of the interesting findings include:

  1. "Big" (i.e., most-used) libraries are becoming "bigger", consolidating their position in the DS field.
  2. Deep learning is becoming more popular, yet accounts for less than 20% of DS today.
  3. Analysis of the top libraries and top transformers used in data science pipelines shows how text, a source of unstructured data, is being tapped into.

Above all, it is fascinating to see how data science and machine learning are becoming ubiquitous technologies.

Why it’s a must

The paper uncovers the current state of, and a number of trends in, the data science and machine learning field. These trends give practitioners a good indication of which technologies and libraries they should invest their time in.

Who should pay attention

DS/ML practitioners

Bidirectional LSTM-CRF models for sequence tagging

Why we love it

I enjoy this paper because it introduces a new method for entity extraction, called BI-LSTM-CRF. The proposed method is more accurate than previous approaches on several widely applicable NLP tasks. It inspired us to explore applying deep learning methods in our own work.

What I learnt from it

I learnt that (1) BI-LSTM-CRF outperforms previous models such as a plain CRF or a pure LSTM on POS tagging, chunking, and NER tasks; and (2) its stacked structure lets it combine word-level features from the LSTM layer with sentence-level features from the CRF layer, so it can exploit syntactic and contextual information in the language efficiently.
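What the CRF layer adds on top of the LSTM can be sketched with a plain Viterbi decoder: given per-token emission scores (which the BiLSTM would produce) and tag-transition scores, it picks the best whole tag sequence rather than the best tag per token. The tags and scores below are toy values for illustration, not numbers from the paper.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence under emission + transition scores."""
    # emissions: list of {tag: score} per token; transitions: {(prev, cur): score}
    best = {t: emissions[0][t] for t in tags}   # best score ending in each tag
    back = []                                   # backpointers for path recovery
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for cur in tags:
            prev, score = max(
                ((p, best[p] + transitions[(p, cur)]) for p in tags),
                key=lambda pair: pair[1],
            )
            new_best[cur] = score + em[cur]
            ptr[cur] = prev
        back.append(ptr)
        best = new_best
    # Trace the highest-scoring path backwards.
    last = max(best, key=best.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

tags = ["O", "B-PER", "I-PER"]
transitions = {(p, c): 0.0 for p in tags for c in tags}
transitions[("O", "I-PER")] = -10.0   # sentence-level constraint: no I-PER after O
transitions[("B-PER", "I-PER")] = 2.0
emissions = [
    {"O": 2.0, "B-PER": 0.5, "I-PER": 0.0},
    {"O": 0.0, "B-PER": 1.5, "I-PER": 1.0},
    {"O": 0.2, "B-PER": 0.0, "I-PER": 1.0},
]
print(viterbi(emissions, transitions, tags))  # ['O', 'B-PER', 'I-PER']
```

The transition penalty is what enforces well-formed tag sequences: a per-token argmax over emissions alone could emit an I-PER with no preceding B-PER, which the CRF layer rules out.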

Why it’s a must

Published in 2015, this paper is one of the first works to show that LSTM-based methods can be applied to NLP tasks. According to Google Scholar, it has more than 1,000 citations.

Who should pay attention

NLP researchers, Machine Learning Engineers
