Apparel Recommendation System
Problem:
Recommend to Amazon customers the products most similar to a given product. Given a product, I analysed its image, brand name, colour, and other attributes, and recommended the products whose features are most similar to it.
Data:
The Amazon Apparel dataset has around 180,000 (0.18 million) products with 7 features each. The data contains both textual and image attributes, such as product type (shirt, pant, etc.), product brand, product colour, and product image.
Approach:
- Performed data cleaning and preprocessing: removed unnecessary and duplicate rows, and for text reviews removed HTML tags, punctuation, and stopwords, then stemmed the words using the Porter Stemmer.
- For each product, found the most similar products using sklearn's cosine_similarity and pairwise_distances on:
  - Textual feature vectors for the product's 'Title', 'Brand', and 'Type' attributes separately.
  - Weighted-average textual feature vectors of the product's 'Brand' and 'Colour' attributes.
  - Image feature vectors for the product's Image attribute.
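
The similarity search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy `products` list, the helper names `cosine_dists`, `top_k`, and `weighted_dists`, the choice of TF-IDF and bag-of-words vectorizers, and the 0.7/0.3 weights are all assumptions made for the example. In particular, "weighted average" is read here as a weighted average of per-attribute cosine distances, which is only one possible interpretation.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import pairwise_distances

# Hypothetical toy catalogue standing in for the ~180,000 real products.
products = [
    {"title": "blue cotton slim fit shirt", "brand": "acme", "colour": "blue"},
    {"title": "navy cotton slim fit shirt", "brand": "acme", "colour": "navy"},
    {"title": "black leather ankle boots", "brand": "bootco", "colour": "black"},
    {"title": "red floral summer dress", "brand": "flora", "colour": "red"},
]

# One feature matrix per attribute (rows = products).
title_vecs = TfidfVectorizer().fit_transform([p["title"] for p in products])
brand_vecs = CountVectorizer().fit_transform([p["brand"] for p in products])
colour_vecs = CountVectorizer().fit_transform([p["colour"] for p in products])


def cosine_dists(features, query_index):
    """Cosine distance from one product (a row of `features`) to every product."""
    query = features[query_index:query_index + 1]  # slice keeps the row 2-D
    return pairwise_distances(features, query, metric="cosine").ravel()


def top_k(dists, query_index, k=2):
    """Indices of the k closest products, excluding the query product itself."""
    order = np.argsort(dists)
    return [int(i) for i in order if i != query_index][:k]


def weighted_dists(query_index, w_brand=0.7, w_colour=0.3):
    """Weighted average of brand and colour distances (weights are illustrative)."""
    d_brand = cosine_dists(brand_vecs, query_index)
    d_colour = cosine_dists(colour_vecs, query_index)
    return (w_brand * d_brand + w_colour * d_colour) / (w_brand + w_colour)


# Nearest neighbours of product 0 by title, and by weighted brand/colour.
by_title = top_k(cosine_dists(title_vecs, 0), 0, k=1)
by_brand_colour = top_k(weighted_dists(0), 0, k=1)
print(by_title, by_brand_colour)
```

The same `cosine_dists` helper would apply unchanged to a matrix of image feature vectors, since it only needs one row per product; how those image vectors are produced is outside the scope of this sketch.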