| Title | Method | Architecture | Year | Conference | Paper | Code |
|---|---|---|---|---|---|---|
| Detecting text in natural image with connectionist text proposal network (CTPN) | regression-based | Faster RCNN | 2016 | ECCV | Paper | Code |
| Multi-oriented text detection with fully convolutional networks (MOTD) | segmentation-based | FCN | 2017 | CVPR | Paper | |
| Scene text detection via holistic, multi-channel prediction (STDH) | segmentation-based | FCN | 2017 | arXiv | Paper | |
| Detecting oriented text in natural images by linking segments (SegLink) | regression-based | SSD | 2017 | CVPR | Paper | Code |
| Single shot text detector with regional attention (SSTD) | regression-based | SSD | 2017 | ICCV | Paper | Code |
| Wordsup: Exploiting word annotations for character based text detection (Wordsup) | segmentation-based | FCN | 2017 | ICCV | Paper | |
| Textboxes: A fast text detector with a single deep neural network (TextBoxes) | regression-based | | 2017 | AAAI | Paper | Code |
| East: an efficient and accurate scene text detector (EAST) | hybrid, segmentation-based | FCN | 2017 | CVPR | Paper | Code |
| Deep direct regression for multi-oriented scene text detection (DDR) | regression-based | DenseBox | 2017 | ICCV | Paper | |
| R2CNN: Rotational region cnn for orientation robust scene text detection (R2CNN) | regression-based | Faster RCNN | 2018 | arXiv | Paper | |
| Textsnake: A flexible representation for detecting text of arbitrary shapes (Textsnake) | segmentation-based | U-Net | 2018 | ECCV | Paper | |
| Textboxes++: A single-shot oriented scene text detector (Textboxes++) | regression-based | SSD | 2018 | TIP | Paper | Code |
| Rotation-sensitive regression for oriented scene text detection (RRD) | regression-based | SSD | 2018 | CVPR | Paper | |
| Multi-oriented scene text detection via corner localization and region segmentation (Corner Localization) | segmentation-based | FCN | 2018 | CVPR | Paper | Code |
| Pixellink: Detecting scene text via instance segmentation (Pixellink) | segmentation-based | FCN | 2018 | AAAI | Paper | Code |
| Character region awareness for text detection (CRAFT) | segmentation-based | U-Net | 2019 | CVPR | Paper | |
| Efficient and accurate arbitrary-shaped text detection with pixel aggregation network (PAN) | segmentation-based | FPRM+FFM | 2019 | ICCV | Paper | |
| Pyramid mask text detector (PMTD) | hybrid | Faster RCNN | 2019 | arXiv | Paper | Code |
| Textfield: Learning a deep direction field for irregular scene text detection (Textfield) | hybrid | FCN | 2019 | arXiv | Paper | |
| Omnidirectional scene text detection with sequential-free box discretization (BDN) | hybrid | Faster RCNN | 2019 | IJCAI | Paper | Code |
| Scene Text Detection with Supervised Pyramid Context Network (SPCNet) | hybrid | FPN | 2019 | AAAI | Paper | |
| Shape robust text detection with progressive scale expansion network (PSENet) | segmentation-based | FPN | 2019 | CVPR | Paper | Code |
| TextFuseNet: Scene Text Detection with Richer Fused Features | hybrid | | 2020 | ECCV | Paper | Code |
| Real-time Scene Text Detection with Differentiable Binarization (DB) | hybrid | | 2020 | AAAI | Paper | Code |

| Title | Architecture | Year | Conference | Paper | Code |
|---|---|---|---|---|---|
| Robust scene text recognition with automatic rectification (RARE) | Attention-based | 2016 | CVPR | Paper | |
| Recursive recurrent nets with attention modeling for OCR in the wild (R2AM) | Attention-based | 2016 | CVPR | Paper | |
| STAR-Net: A spatial attention residue network for scene text recognition (STARNet) | CTC-based | 2016 | BMVC | Paper | |
| An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition (CRNN) | CTC-based | 2017 | IEEE Transactions on Pattern Analysis and Machine Intelligence | Paper | Code |
| Gated recurrent convolution neural network for OCR (GRCNN) | CTC-based | 2017 | NIPS | Paper | Code |
| Learning to read irregular text with attention mechanisms (L2RI) | Attention-based | 2017 | IJCAI | Paper | |
| Focusing attention: Towards accurate text recognition in natural images (FAN) | Attention-based | 2017 | ICCV | Paper | |
| Char-net: A character-aware neural network for distorted scene text recognition (Char-Net) | Attention-based | 2018 | AAAI | Paper | |
| Aon: Towards arbitrarily-oriented text recognition | Attention-based | 2018 | CVPR | Paper | Code |
| Edit probability for scene text recognition (EP) | Attention-based | 2018 | CVPR | Paper | |
| Scene text recognition from two-dimensional perspective (CA-FCN) | Attention-based | 2018 | AAAI | Paper | |
| Show, attend and read: A simple and strong baseline for irregular text recognition (SAR) | Attention-based | 2018 | AAAI | Paper | Code |
| Rosetta: Large scale system for text detection and recognition in images (ROSETTA) | CTC-based | 2018 | | Paper | |
| Aster: An attentional scene text recognizer with flexible rectification (ASTER) | Attention-based | 2018 | IEEE | Paper | Code |
| Synthetically supervised feature learning for scene text recognition (SSEF) | CTC-based | 2018 | ECCV | Paper | |
| What is wrong with scene text recognition model comparisons? dataset and model analysis (CLOVA) | Attention-based | 2019 | ICCV | Paper | Code |
| Aggregation cross-entropy for sequence recognition (ACE) | Attention-based | 2019 | | Paper | Code |
| Esir: End-to-end scene text recognition via iterative image rectification (ESIR) | Attention-based | 2019 | CVPR | Paper | |
| Moran: A multi-object rectified attention network for scene text recognition (MORAN) | Attention-based | 2019 | Pattern Recognition | Paper | Code |
| 2D-CTC for scene text recognition (2D-CTC) | 2D-CTC | 2019 | | Paper | Code |
| SCATTER: selective context attentional scene text recognizer | Attention-based | 2020 | CVPR | Paper | Code |
| Bidirectional Scene Text Recognition with a Single Decoder (Bi-STET) | Attention-based | 2020 | ECAI | Paper | Code |
| Towards accurate scene text recognition with semantic reasoning networks | Attention-based | 2020 | CVPR | Paper | |
| Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition | Attention-based | 2020 | arXiv | Paper | |

| Title | Architecture | Year | Conference | Paper | Code |
|---|---|---|---|---|---|
| Textboxes: A fast text detector with a single deep neural network | CTC,SSD | 2017 | AAAI | Paper | Code |
| Deep textspotter: An end-to-end trainable scene text localization and recognition framework (Deep TextSpotter) | CTC,Yolo | 2017 | ICCV | Paper | |
| Towards end-to-end text spotting with convolutional recurrent neural networks | Attention | 2017 | IEEE | Paper | |
| Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes (Mask TextSpotter) | CTC,R-CNN | 2018 | ECCV | Paper | Code |
| An end-to-end textspotter with explicit alignment and attention | Attention | 2018 | IEEE | Paper | Code |
| FOTS: Fast oriented text spotting with a unified network (FOTS) | CTC | 2018 | CVPR | Paper | Code |
| Textboxes++: A single-shot oriented scene text detector (TextBoxes++) | CTC | 2018 | TIP | Paper | Code |
| Total-text: A comprehensive dataset for scene text detection and recognition (Total-Text) | | 2018 | IEEE | Paper | Code |
| TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network (TextNet) | Attention | 2018 | ACCV | Paper | |
| Convolutional character networks (CharNet) | R-CNN | 2019 | ICCV | Paper | Code |
| TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting (TextDragon) | CTC | 2019 | ICCV | Paper | |
| Towards unconstrained end-to-end text spotting | | 2019 | ICCV | Paper | |
| Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting (Text Perceptron) | CTC,R-CNN | 2020 | AAAI | Paper | |
| All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting | Attention | 2020 | AAAI | Paper | |
| ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network (ABCNet) | CTC | 2020 | CVPR | Paper | Code |