Inferring User Interests on Social Media From Text and Images

We propose to infer user interests on social media where multi-modal data (text, images, etc.) exist. We leverage user-generated data as a natural expression of users' interests. Our main contribution is exploiting a multi-modal space composed of images and text. This is a natural approach, since humans express their interests through a combination of modalities. We performed experiments using state-of-the-art image and text representations, such as convolutional neural networks, word embeddings, and bags of visual and textual words. Our experimental results show that jointly processing images and text increases the overall interest classification accuracy compared to uni-modal representations (i.e., using only text or only images).
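The idea of fusing the two modalities can be illustrated with a minimal sketch: extract a feature vector per modality, concatenate them into one joint representation, and classify in that joint space. The code below is not the paper's implementation; the feature dimensions, class names, and the nearest-centroid classifier are illustrative stand-ins (the paper uses CNN image features, word embeddings, and bags of visual/textual words).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: two interest categories, each with synthetic
# image features (stand-in for CNN activations) and text features
# (stand-in for word-embedding / bag-of-words vectors).
n_per_class, d_img, d_txt = 30, 8, 6
class_names = ["travel", "food"]  # illustrative category labels

def make_class(img_mean, txt_mean):
    """Sample one class's image and text feature matrices."""
    img = rng.normal(loc=img_mean, scale=1.0, size=(n_per_class, d_img))
    txt = rng.normal(loc=txt_mean, scale=1.0, size=(n_per_class, d_txt))
    return img, txt

img0, txt0 = make_class(img_mean=-2.0, txt_mean=-2.0)
img1, txt1 = make_class(img_mean=+2.0, txt_mean=+2.0)

def fuse(img, txt):
    """Early fusion: concatenate per-example image and text features."""
    return np.concatenate([img, txt], axis=1)

X = np.vstack([fuse(img0, txt0), fuse(img1, txt1)])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Nearest-centroid classifier in the joint (multi-modal) space.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(features):
    dists = np.linalg.norm(features[:, None, :] - centroids[None], axis=2)
    return dists.argmin(axis=1)

accuracy = (predict(X) == y).mean()
print(f"joint-space training accuracy: {accuracy:.2f}")
```

With well-separated synthetic classes the joint representation is trivially separable; the point is only the fusion pattern, where the concatenated vector lets a single classifier see evidence from both modalities at once.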

Figure: examples of pins (image-text pairs) together with the categories (top) our system predicts.
Yagmur Gizem Cinar, Susana Zoghbi, Marie-Francine Moens
International Workshop on Social Media Retrieval and Analysis in conjunction with the IEEE International Conference on Data Mining (ICDM 2015)
