This architecture is called Long Short Term Memory because it uses many short-term memory cells to create a long-term memory (link), meaning it is able to remember a long sequence of input, e.g. 5 years of historical stock data. The classic RNN inherently has a problem with long sequences ...
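As a rough illustration of that idea, here is a minimal sketch (assuming PyTorch; the model name, layer sizes, and the "5 years ≈ 1260 trading days" window are all illustrative assumptions, not the author's code) of an LSTM reading a long window of daily closing prices and predicting the next one:

```python
# Minimal sketch: a single-layer LSTM over a window of daily closing prices.
# All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class PricePredictor(nn.Module):
    def __init__(self, n_features=1, hidden_size=64):
        super().__init__()
        # The cell state carried across time steps acts as the "long term memory";
        # the per-step hidden state is the "short term" part.
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                      # x: (batch, seq_len, n_features)
        out, (h_n, c_n) = self.lstm(x)         # out: (batch, seq_len, hidden_size)
        return self.head(out[:, -1, :])        # last step -> next day's close

# Roughly 5 years of daily closes ~ 1260 trading days, fed as one long sequence.
model = PricePredictor()
dummy_prices = torch.randn(8, 1260, 1)          # a batch of 8 made-up price series
prediction = model(dummy_prices)                # shape: (8, 1)
```

The point is only that the sequence length (1260 steps here) is far beyond what a plain RNN can usefully carry information across.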
And that is the big “trick” in NLP. We can do all the lexical and syntactic analysis we want, but in the end we need to convert the words into vectors, and at the core of those vectors is the meaning of those words (the topic). So the meaning is also a vector! In the...
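A tiny sketch of what “meaning is a vector” buys us: words whose vectors point in similar directions are close in meaning. The three hand-made vectors below are illustrative assumptions, not real embeddings from any trained model.

```python
# Toy word vectors: similarity of meaning shows up as similarity of direction.
import numpy as np

embeddings = {
    "stock":  np.array([0.9, 0.1, 0.0]),
    "share":  np.array([0.8, 0.2, 0.1]),
    "banana": np.array([0.0, 0.1, 0.9]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# "stock" and "share" land close together; "banana" does not.
print(cosine_similarity(embeddings["stock"], embeddings["share"]))   # ~0.98
print(cosine_similarity(embeddings["stock"], embeddings["banana"]))  # ~0.01
```

Real embeddings (word2vec, GloVe, or the learned embedding layer of a network) do the same thing in hundreds of dimensions, with the vectors learned from text rather than written by hand.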