Apparel Recommendation System

Problem:

Recommend to Amazon customers the products most similar to the current product. Given a product, I analysed its image, brand name, colour, and other attributes, and recommended the products whose features are most similar to it.

Data:

The Amazon apparel dataset has around 0.18 million (180,000) products with 7 features for each product. The features include both textual and image information, such as product type (shirt, pant, etc.), product brand, product colour, and product image.

Approach:

  • Performed data cleaning and preprocessing by removing unnecessary and duplicate rows; for text reviews, removed HTML tags, punctuation, and stopwords, and stemmed the words using the Porter Stemmer.
  • For each product, found the most similar products using sklearn's cosine_similarity and pairwise_distances on:
    • Textual feature vectors for the product's 'Title', 'Brand', and 'Type' attributes separately.
    • Weighted average of the textual feature vectors for the product's 'Brand' and 'Colour' attributes.
    • Image feature vectors for the product's Image attribute.
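
The similarity step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy catalogue, the choice of TF-IDF as the text vectoriser, the helper name most_similar, and the weights w_brand and w_colour are all assumptions made for the example. The weighted part combines per-attribute cosine distances, which is one plausible reading of the "weighted average" step.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import pairwise_distances
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy catalogue standing in for the real dataset.
titles = [
    "blue cotton mens shirt",
    "navy blue cotton shirt for men",
    "black leather womens boots",
    "red silk evening dress",
]
brands = ["acme", "acme", "bootco", "glam"]
colours = ["blue", "blue", "black", "red"]


def most_similar(features, query_idx, k=2):
    """Return the indices of the k rows closest to query_idx by cosine similarity."""
    sims = cosine_similarity(features[query_idx], features).ravel()
    sims[query_idx] = -np.inf  # never recommend the query product itself
    return np.argsort(-sims)[:k].tolist()


# Title-only similarity (TF-IDF is an assumed vectoriser).
title_vecs = TfidfVectorizer().fit_transform(titles)
title_neighbours = most_similar(title_vecs, query_idx=0)

# Weighted combination of brand and colour distances (weights are assumptions).
brand_vecs = TfidfVectorizer().fit_transform(brands)
colour_vecs = TfidfVectorizer().fit_transform(colours)
w_brand, w_colour = 2.0, 1.0

brand_dist = pairwise_distances(brand_vecs, brand_vecs[0], metric="cosine").ravel()
colour_dist = pairwise_distances(colour_vecs, colour_vecs[0], metric="cosine").ravel()
combined = (w_brand * brand_dist + w_colour * colour_dist) / (w_brand + w_colour)
combined[0] = np.inf  # exclude the query product
weighted_neighbours = np.argsort(combined)[:2].tolist()

print("Closest by title:", title_neighbours)
print("Closest by brand and colour:", weighted_neighbours)
```

Image feature vectors (for example, activations taken from a convolutional network) could be passed to the same most_similar helper, since it only needs one feature row per product.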

Technologies Used: Convolutional Neural Networks, Machine Learning, Python