Section 2 gives an overview of the recent literature on deep gen- erative models, image encoders, and diagram-based tasks and datasets. Section 3 describes Paper2Fig100k, a novel dataset of research figures and texts. In Section 4 we pro- pose OCR-VQGAN, an image encoder focused in...
It then categorizes offline document images into different classes according to the nature of the challenges one faces, in an attempt to provide insight into various techniques presented in the literature. The pros and cons of various techniques are explained wherever possible. Along with the ...