Guo Y., Krause J., Khadraoui D., Cortina S., Aeckerle-Willems C., Viroli F.
EPJ Data Science, vol. 15, no. 1, art. no. 16, 2026
New legislation requires companies to regularly perform environmental, social, and governance (ESG) risk assessments for their suppliers. This is typically done using public risk indicators for the countries and industries in which the suppliers operate. However, this approach often fails to accurately represent the actual risk stemming from a particular supplier. Moreover, risk indicators are usually updated only annually and do not reflect current developments. Big text data collected from media monitoring of the suppliers can therefore augment these risk indicators to provide more accurate and timely risk assessments. Using media texts for this purpose is challenging, however: such text usually has a low signal-to-noise ratio and requires reliable, complex text interpretation to be usable. Against this background, we propose a hybrid approach that combines large language models and deep learning classifiers to process media texts for automated ESG risk assessment. A set of 17 classifiers is investigated and tested against the direct use of large language models for classification. Performance is evaluated in Monte Carlo experiments on four distinct datasets.

