The LSTM model generates captions for input images from features extracted by a pre-trained VGG-16 model. (Computer Vision, NLP, Deep Learning, Python)
This is the first step of data pre-processing. The captions contain punctuation and special characters, numbers, and stop words, which must be removed (for example, with regular expressions) before the captions are fed to the model for training. T…
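The caption cleaning described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `clean_caption`, the small `STOP_WORDS` set, and the choice to drop single-character tokens are all assumptions made for illustration. A real pipeline might use a fuller stop-word list from a library instead.

```python
import re

# Hypothetical stop-word list for illustration only; the project may use
# a different, larger list.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on"}


def clean_caption(caption, stop_words=STOP_WORDS):
    """Lowercase a caption and strip digits, punctuation, and stop words."""
    caption = caption.lower()
    # Replace every character that is not a lowercase letter or whitespace
    # with a space; this removes digits and punctuation in one pass.
    caption = re.sub(r"[^a-z\s]", " ", caption)
    # Drop stop words and stray single letters (e.g. the "s" left over
    # from "dog's"); the single-letter filter is an assumption.
    tokens = [t for t in caption.split() if t not in stop_words and len(t) > 1]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_caption("A child in a pink dress is climbing up 2 stairs!"))
```

The cleaned strings can then be tokenized and padded before being passed to the LSTM. Replacing unwanted characters with a space rather than deleting them keeps neighbouring words from being fused together.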