Measuring Visual Congruence and Brand-Influencer Matching with Deep Learning

Alessandro Rozza
Adam Elwood

How we use deep learning to match brands with influencers based on their Instagram posts

With the rise of social media as a channel for promoting branded products, demand for effective influencer marketing has increased. Brands are looking for better ways to identify valuable influencers among a vast catalogue. This is even more challenging for “micro-influencers” (people with between 5k and 100k followers), who are more affordable than mainstream influencers but difficult to discover. Marketing research has shown that matching influencers with brands well requires a certain degree of similarity in style and content between the images an influencer posts and those of the brand being endorsed. This similarity is usually referred to as “visual congruence”.

To contribute to this area, we have used deep learning techniques to match brands with influencers based on their Instagram posts. First, we proposed a novel deep learning architecture [1, 2] that achieved state-of-the-art results on micro-influencer ranking. We then applied a slightly modified version of this framework to a large sample of Instagram images, to extract a suitable measure of visual congruence between brand and influencer posts.

Our architecture (see image 1) is built around extracting embeddings that represent the semantic content of the images and text in each post. We start by taking embeddings from standard pre-trained image and text processing neural networks. We then combine them with multiple trainable layers, which are designed to produce a similarity score between pooled brand and influencer images. The architecture is trained using backpropagation with a list-wise learning-to-rank approach. On top of this, we include an auxiliary classification task that tries to classify a post by the category of the brand or influencer. This helps to stabilise the embeddings and produces a better result. For more details, see [1, 2].
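As a rough illustration of the pooling, scoring and list-wise ranking ideas described above, here is a minimal sketch in Python. It is not the architecture from [1, 2]: the trainable layers are replaced by plain mean pooling and cosine similarity, the embeddings are random stand-ins for the outputs of pre-trained networks, and the function names (`mean_pool`, `cosine_similarity`, `listwise_loss`) are hypothetical. The loss shown is a ListNet-style top-one cross-entropy, chosen here only as one common example of a list-wise objective.

```python
import numpy as np


def mean_pool(post_embeddings: np.ndarray) -> np.ndarray:
    """Average per-post embeddings of shape (n_posts, dim) into one account vector."""
    return post_embeddings.mean(axis=0)


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors (stand-in for a learned score)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def listwise_loss(scores, relevance) -> float:
    """ListNet-style loss: cross-entropy between softmax(relevance) and softmax(scores)."""
    target = softmax(np.asarray(relevance, dtype=float))
    predicted = softmax(np.asarray(scores, dtype=float))
    return float(-(target * np.log(predicted)).sum())


# Random embeddings stand in for pre-trained image/text features (assumption).
rng = np.random.default_rng(0)
brand = mean_pool(rng.normal(size=(5, 8)))
influencers = [mean_pool(rng.normal(size=(5, 8))) for _ in range(3)]

# Score every influencer against the brand, then rank from best to worst match.
scores = np.array([cosine_similarity(brand, inf) for inf in influencers])
ranking = np.argsort(-scores)
```

In a trained model the scores would come from learned layers and `listwise_loss` would be minimised over many brand-influencer lists; the sketch only shows how the pieces fit together.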

Image 1

Applying this architecture to a publicly available dataset of brands and influencers, we have shown that it produces state-of-the-art results on influencer ranking across a wide range of metrics.

To demonstrate the value of this approach to the marketing research community, we have shown that this ranking procedure can find influencers who engage well with their followers when posting content that is relevant to a brand. Starting from the above architecture, we defined a visual congruence score [3] that measures the match between brands and influencers based on the images in their posts. To assess whether this congruence is a good predictor of follower engagement, we ran a multiple linear regression model that uses the brand-influencer visual congruence to predict the engagement generated by an influencer post. This allowed us to show that brand-influencer visual congruence has a significant impact on the level of follower engagement with an influencer’s post that endorses a brand.
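To make the regression step concrete, the following is a small sketch of a multiple linear regression fitted by ordinary least squares. Everything in it is an assumption made for illustration: the data are synthetic, the control variable `log_followers` is hypothetical, and the coefficients used to generate the data are arbitrary. It does not reproduce the model or results from [3].

```python
import numpy as np

# Synthetic data (assumption): none of these numbers come from the study.
rng = np.random.default_rng(42)
n = 500
congruence = rng.uniform(0.0, 1.0, n)       # hypothetical congruence score in [0, 1]
log_followers = rng.normal(10.0, 1.0, n)    # hypothetical control variable
noise = rng.normal(0.0, 0.1, n)
engagement = 0.5 + 2.0 * congruence - 0.1 * log_followers + noise

# Design matrix with an intercept column, then an ordinary least squares fit.
X = np.column_stack([np.ones(n), congruence, log_followers])
coef, *_ = np.linalg.lstsq(X, engagement, rcond=None)

# Classical OLS standard errors and t-statistics for each coefficient.
residuals = engagement - X @ coef
dof = n - X.shape[1]
sigma2 = residuals @ residuals / dof
cov = sigma2 * np.linalg.inv(X.T @ X)
t_stats = coef / np.sqrt(np.diag(cov))

print("coefficients:", coef)
print("t-statistics:", t_stats)
```

A large t-statistic on the congruence coefficient is the kind of evidence that supports calling its effect on engagement significant, after accounting for the other predictors in the model.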

Up to this point, we have introduced techniques that perform well on metrics averaged over large datasets. However, as we rely heavily on deep neural networks, it can be difficult for a human to understand why the posts of a particular influencer lead to their ranking position for a certain brand. Being able to interpret why an influencer is matched well with a brand is of interest to both brands and influencers.

To do this, we build on the work above to define importance scores for the images we feed into our neural networks. These scores can be turned into a heatmap, which we superimpose over the input image, highlighting the areas most relevant to the final ranking.
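One generic way to obtain such a heatmap is occlusion sensitivity: mask one patch of the image at a time and record how much the model's score drops. The sketch below uses that technique purely as a stand-in, since the importance scores in our work are defined differently (see [1, 2]). The function `occlusion_heatmap` and the scoring function `toy_score` are hypothetical; `toy_score` takes the place of a real brand-matching model.

```python
import numpy as np


def occlusion_heatmap(image, score_fn, patch=4, fill=0.0):
    """Return an (H, W) map of the score drop caused by masking each patch."""
    base = score_fn(image)
    heat = np.zeros(image.shape, dtype=float)
    height, width = image.shape
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # A large drop means the masked region mattered for the score.
            heat[top:top + patch, left:left + patch] = base - score_fn(occluded)
    return heat


def toy_score(img):
    """Hypothetical model: only the top-left 8x8 region influences the score."""
    return float(img[:8, :8].sum())


image = np.ones((16, 16))            # grayscale stand-in for an input image
heatmap = occlusion_heatmap(image, toy_score)
```

Here the heatmap is non-zero only over the top-left region, which is the only part `toy_score` looks at. With a real model, the same map would be normalised and overlaid on the image.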

For example, image 2 shows an image containing a woman and a car (left), on which we have superimposed heatmaps of the parts most relevant for matching it to either a fashion brand (middle) or a car brand (right). In the case of the fashion brand, this importance measure suggests that the model pays most attention to the woman when calculating whether there is a good match. For the car brand, the model pays most attention to the car.

Image 2

A similar comparison is carried out in the next image (image 3), which could be used as a post for a food brand (middle) or a jewellery brand (right). In all these cases, the model is clearly paying attention to the part of the image one would intuitively expect to be most relevant to the brand in question.

Image 3

Overall, this work has allowed us to introduce an approach that could simplify the work of our Media business unit and give them an edge over the competition. On top of this, we have made a significant contribution to the academic communities in both marketing and multimedia computer science.


[1] Adam Elwood, Alberto Gasparin, and Alessandro Rozza. “Ranking Micro-Influencers: a Novel Multi-Task Learning and Interpretable Framework.” 2021 IEEE International Symposium on Multimedia (ISM). IEEE, 2021. (Best paper award)

[2] Adam Elwood, Alberto Gasparin, and Alessandro Rozza. “Ranking Micro-Influencers: a Multimedia Framework with Multi-Task and Interpretable Architectures.” International Journal of Semantic Computing (IJSC). World Scientific, 2022. (To appear)

[3] Adam Elwood, Alessandro Rozza, Elanor Colleoni, and Angelo Miglietta. “Measuring brand-influencer visual congruence on Instagram using deep learning and automated image recognition.” Micro & Macro Marketing. Il Mulino, 2022. (To appear)
