Convolutional long short term memory deep neural networks for image sequence prediction
© 2018 Although humans do so nearly effortlessly, digital systems cannot easily recognize images or make predictions from recent observations. Addressing these limitations with novel algorithms that improve image processing performance would have widespread implications in a variety of fields, including robotics, manufacturing, biomedicine, and automation. To give a computer this combined ability and transform it into an intelligent system, an algorithm must combine memory with an image decomposition procedure. Artificial neural networks (ANNs) are algorithms that aim to solve tasks such as classification, clustering, pattern recognition, and prediction by mimicking connections in the brain. Specifically, three ANNs have excelled in specific areas: deep neural networks (DNNs), which use intrinsic connections to create prediction maps; long short-term memory neural networks (LSTMs), which use recurrent connections to emulate a type of memory; and convolutional neural networks (CNNs), which can decompose complex data through layers for simpler analysis. Although each of these algorithms can solve certain parts of the image sequence prediction task, none can easily solve the entire problem on its own. Combining these networks, however, may make such problems tractable. Thus, this article evaluates the combination of ANNs into two novel algorithms developed with the aim of improving image sequence prediction: (i) a combination of CNNs and LSTMs to form a CLNN and (ii) a combination of CNNs, LSTMs, and DNNs to form a CLDNN. Although the developed algorithms require a longer training time, they require fewer training epochs to reach better accuracy than their predecessors. Furthermore, both developed methods accurately performed the image sequence prediction task, outperforming each individual method and correctly predicting both longer sequences and a greater number of sequences.
Overall, the developed algorithms were able to better decompose inputs, remember previous inputs, and more accurately predict sequences of images. This allows the prediction of the next step in the sequence, which can be used as part of an intelligent system to make an analysis and an informed decision on the next course of action.
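As a rough illustration of how the three components described above might be chained, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single convolution filter with ReLU for decomposition, one LSTM cell for memory, and one dense sigmoid layer for the prediction map. The function and class names, the 4x4 frame size, the 2x2 kernel, the hidden size of 8, and the random untrained weights are all hypothetical choices made for illustration.

```python
import math
import random


def conv2d_relu(image, kernel):
    """'Valid' 2D cross-correlation followed by ReLU (the CNN-style step)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(W, b, x):
    """Dense layer: W @ x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Standard LSTM cell; each gate acts on the concatenation [x, h]."""

    def __init__(self, rng, n_in, n_hidden):
        self.n_hidden = n_hidden
        # i: input gate, f: forget gate, o: output gate, g: candidate state
        self.W = {g: rand_matrix(rng, n_hidden, n_in + n_hidden) for g in "ifog"}
        self.b = {g: [0.0] * n_hidden for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input features and previous hidden state
        i = [sigmoid(v) for v in linear(self.W["i"], self.b["i"], z)]
        f = [sigmoid(v) for v in linear(self.W["f"], self.b["f"], z)]
        o = [sigmoid(v) for v in linear(self.W["o"], self.b["o"], z)]
        g = [math.tanh(v) for v in linear(self.W["g"], self.b["g"], z)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def cldnn_predict(frames, kernel, cell, W_out, b_out):
    """Convolve each frame, feed the features through the LSTM in order,
    then map the final hidden state to a flat next-frame prediction."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    for frame in frames:
        features = [v for row in conv2d_relu(frame, kernel) for v in row]
        h, c = cell.step(features, h, c)
    return [sigmoid(v) for v in linear(W_out, b_out, h)]


# Hypothetical toy setup: three 4x4 frames, untrained random weights.
rng = random.Random(0)
kernel = [[1.0, -1.0], [-1.0, 1.0]]             # 2x2 filter -> 3x3 = 9 features
cell = LSTMCell(rng, n_in=9, n_hidden=8)
W_out, b_out = rand_matrix(rng, 16, 8), [0.0] * 16
frames = [[[float((r + col + t) % 2) for col in range(4)] for r in range(4)]
          for t in range(3)]
prediction = cldnn_predict(frames, kernel, cell, W_out, b_out)  # 16 pixel values
```

Because the weights here are random, the output is not a meaningful prediction; the sketch only shows the data flow from frame decomposition, through recurrent memory, to a dense output. Dropping the final dense layer and reading the prediction from the LSTM state would correspond loosely to the CLNN variant, under the same assumptions.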