This is a team approach to solving the PetFinder competition using three types of data (images, text, and tabular):
https://www.kaggle.com/c/petfinder-pawpularity-score
a. Image Data:
Use InceptionV3 as the base model, train only the last layers, and extract features to represent the images.
b. Text Data:
Use GloVe embeddings to embed words, then learn an embedding for the whole document using an LSTM.
c. Tabular Data:
Use an ANN (a feed-forward neural network) to extract latent features.
d. Concatenate all of them and feed them to another model that predicts the final label (1, 2, 3, 4) as a continuous variable.
e. Clip the prediction to [1, 4] and round to the nearest integer.
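
Steps d and e can be sketched in a few lines of NumPy. This is only an illustration and not the team's actual code: the function names `fuse_features` and `to_label`, the feature shapes, and the sample predictions are all hypothetical, and the final model that sits between the two steps is left out. Note that `np.rint` rounds exact halves to the nearest even integer, which may differ from the rounding rule the team used.

```python
import numpy as np


def fuse_features(image_feats, text_feats, tabular_feats):
    """Concatenate per-sample feature blocks along the feature axis (step d)."""
    return np.concatenate([image_feats, text_feats, tabular_feats], axis=1)


def to_label(raw_pred, low=1, high=4):
    """Clip continuous predictions to [low, high] and round to integers (step e)."""
    clipped = np.clip(raw_pred, low, high)
    # np.rint rounds exact halves to the nearest even integer (2.5 -> 2).
    return np.rint(clipped).astype(int)


# Hypothetical shapes: 2 samples with 4 image, 3 text, and 2 tabular features.
fused = fuse_features(np.zeros((2, 4)), np.ones((2, 3)), np.full((2, 2), 2.0))

# Hypothetical continuous outputs of the final model, some outside [1, 4].
labels = to_label(np.array([0.3, 1.6, 2.5, 3.49, 7.2]))
```

Here `fused` has one row per sample and nine columns (4 + 3 + 2), and out-of-range predictions such as 0.3 and 7.2 are pulled back to the valid labels 1 and 4 before rounding.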