Heterogeneous Job and Skill Landscapes in the Italian Labor Market

March 2023

Abstract

Traditional labor market data reveal little about the type and skill content of jobs. Yet the profound labor market transformations brought about by rapid technological and organizational change call for a better understanding of work skill attributes and of how they characterize modern jobs. In this paper, I analyze a large, highly unstructured, and novel data source containing the near-universe of online job vacancies posted in Italy in 2014-2019. I exploit information on the type of contract, industry, occupation, and location associated with each job posting, as well as on hundreds of granular skill requirements contained in the postings, which take the form of short strings uniquely identified by tags. I first assign these short strings to a set of skill categories through a keyword-based matching routine built on the taxonomy proposed by Deming and Kahn (2018). I then illustrate the skill content of the resulting categories, examine the demand for each of them, and analyze how that demand varies across several dimensions. Next, I use a bag-of-tags approach to implement two unsupervised machine learning methods that detect co-occurrence patterns in skill requirements across job postings and group the data by skill similarity. First, I run a K-means model, which assigns jobs to mutually exclusive clusters representing job types. Second, I run a Latent Dirichlet Allocation model, which probabilistically assigns skill tags to mutually exclusive latent topics representing work domains and expresses each job posting as a weighted mixture of topics. I show that the groupings obtained with the two methods are strongly consistent. This work exemplifies how dimensionality reduction techniques can be used to draw meaningful insights from large and unstructured datasets.
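
As a rough illustration of the first two steps described above, the following minimal sketch shows what a keyword-based matching routine and a bag-of-tags representation might look like. It is not the paper's implementation: the keyword lists, the category names, and the helpers `categorize_tag`, `bag_of_tags`, and `category_shares` are all hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical keyword lists (assumption): the actual taxonomy and keywords
# used in the paper are not reproduced here.
SKILL_KEYWORDS = {
    "cognitive": ["problem solving", "analytical", "research"],
    "social": ["communication", "teamwork", "negotiation"],
    "software": ["python", "sql", "excel"],
}


def categorize_tag(tag):
    """Return the sorted categories whose keywords occur in the tag string."""
    text = tag.lower()
    return sorted(
        category
        for category, keywords in SKILL_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


def bag_of_tags(postings):
    """Build a binary posting-by-tag matrix (one row per job posting)."""
    vocabulary = sorted({tag for tags in postings for tag in tags})
    column = {tag: j for j, tag in enumerate(vocabulary)}
    matrix = []
    for tags in postings:
        row = [0] * len(vocabulary)
        for tag in tags:
            row[column[tag]] = 1  # presence/absence, not counts
        matrix.append(row)
    return vocabulary, matrix


def category_shares(postings):
    """Share of postings that require at least one tag in each category."""
    counts = Counter()
    for tags in postings:
        # Use a set so a category is counted once per posting.
        counts.update({c for tag in tags for c in categorize_tag(tag)})
    return {category: counts[category] / len(postings) for category in SKILL_KEYWORDS}
```

A binary matrix of this kind is the sort of input that a clustering model such as K-means or a topic model such as Latent Dirichlet Allocation could then be fitted on. Substring matching is a deliberate simplification here; a real routine would likely need tokenization and language-specific handling.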

Giuseppe Grasso

Postdoctoral Researcher