forked from RapidAI/RapidOCR

A cross platform OCR Library based on PaddleOCR & OnnxRuntime & OpenVINO.

dtiku-cn/RapidOCR

 
 

Simplified Chinese | English

Open source OCR for the security of the digital world

Contents

Introduction

  • Completely open source and free, with support for offline deployment across multiple platforms and multiple languages.
  • Note for Chinese users: you are welcome to join our QQ group to download the models and test programs. QQ group number: 887298230
  • Motivation: PaddlePaddle's engineering support for deployment is limited. To make OCR inference convenient on a wide range of devices, we convert the models to ONNX format and port the inference code to various platforms using Python/C++/Java/Swift/C#.
  • Name origin: light, fast, economical, and smart. RapidOCR is a deep-learning-based OCR toolkit that focuses on small models, with speed as its mission and accuracy as its guiding priority.
  • Usage:
    • If the existing models in the repo meet your requirements → deploy directly with RapidOCR.
    • If they do not → fine-tune on your own data with PaddleOCR, then deploy with RapidOCR.
  • If this repo is helpful to you, please give it a star ⭐!
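
For the first usage path (the bundled models are good enough), a minimal Python sketch might look like the following. It assumes the `rapidocr_onnxruntime` package from PyPI, which this section does not name, and it assumes each result item has the shape `[box, text, score]`; check both against the version you install. The helper `extract_lines` and its `min_score` threshold are hypothetical names introduced only for illustration.

```python
from typing import Any, List, Sequence, Tuple


def run_ocr(image_path: str) -> List[Any]:
    """Run OCR on one image (assumes `pip install rapidocr_onnxruntime`)."""
    # Imported lazily so the helper below works without the package installed.
    from rapidocr_onnxruntime import RapidOCR

    engine = RapidOCR()  # loads the det/cls/rec ONNX models shipped with the package
    result, _elapse = engine(image_path)
    return result or []  # the result may be None when no text is found


def extract_lines(
    result: Sequence[Sequence[Any]], min_score: float = 0.5
) -> List[Tuple[str, float]]:
    """Keep (text, score) pairs whose confidence is at least min_score.

    Assumption: every item looks like [box, text, score]; the score may
    arrive as a string, so it is converted with float().
    """
    lines = []
    for _box, text, score in result:
        score = float(score)
        if score >= min_score:
            lines.append((text, score))
    return lines
```

Calling `extract_lines(run_ocr("page.png"))` would then give the recognized lines for a hypothetical image `page.png`, dropping low-confidence ones.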

Navigation

Recent updates (more)

🍜2023-05-22 api update:

  • The API has been decoupled from ocrweb and is now maintained as a separate module. For details, see API.
  • For rapidocr_web versions after 0.1.6, installing with pip install rapidocr_web[api] is no longer supported. Install and use the API directly with pip install rapidocr_api instead.

❤2023-05-20 ocrweb update:

🌹 2023-05-14 ocrweb v0.1.5 update:

  • The API version now also returns the coordinates of the detected text boxes (issue #85)
  • API mode adds base64 format input
  • For details, please refer to: link
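
As a rough illustration of the base64 input mentioned above, the sketch below encodes raw image bytes into a JSON payload and decodes them again. The field name `image_data` is a hypothetical placeholder, not the actual parameter name of the RapidOCR API; consult the linked API documentation for the real request format.

```python
import base64
import json


def build_payload(image_bytes: bytes, field: str = "image_data") -> str:
    """Wrap image bytes as base64 text inside a JSON document.

    `field` is a hypothetical key chosen for this example only.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({field: encoded})


def read_payload(payload: str, field: str = "image_data") -> bytes:
    """Server-side counterpart: recover the original image bytes."""
    return base64.b64decode(json.loads(payload)[field])
```

Base64 lets a binary image travel inside a text-only body such as JSON, at the cost of roughly one third more bytes on the wire.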

Overall Framework

flowchart LR
    subgraph Step
    direction TB
    C(Text Det) --> D(Text Cls) --> E(Text Rec)
    end

    A[/OurSelf Dataset/] --> B(PaddleOCR) --Train--> Step --> F(PaddleOCRModelConverter)
    F --ONNX--> G{RapidOCR Deploy\n<b>Python/C++/Java/C#</b>}
    G --> H(Windows x86/x64) & I(Linux) & J(Android) & K(Web) & L(Raspberry Pi)

    click B "https://github.com/PaddlePaddle/PaddleOCR" _blank
    click F "https://github.com/RapidAI/PaddleOCRModelConverter" _blank
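
The three-stage flow in the diagram (Text Det → Text Cls → Text Rec) can be sketched in Python as a chain of callables. This is only an illustration of the data flow, not RapidOCR's implementation: `OcrPipeline`, the `Box` tuple layout, and the stage signatures are all hypothetical, and real stages would wrap ONNX model sessions.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2).
Box = Tuple[int, int, int, int]


@dataclass
class OcrPipeline:
    detect: Callable[[Any], List[Box]]      # Text Det: image -> text boxes
    classify: Callable[[Any, Box], int]     # Text Cls: box -> rotation angle (0 or 180)
    recognize: Callable[[Any, Box, int], str]  # Text Rec: box + angle -> text

    def __call__(self, image: Any) -> List[Tuple[Box, str]]:
        results = []
        for box in self.detect(image):
            angle = self.classify(image, box)        # decide whether the crop is upside down
            text = self.recognize(image, box, angle)  # read the (possibly rotated) crop
            results.append((box, text))
        return results
```

Keeping each stage behind a plain callable makes it easy to swap one model (for example a different recognition model) without touching the other two.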

Demo

  • Online demo
    • For details, please refer to: ocrweb/README
    • The model combination (optimal combination) used for the demo is:
      ch_PP-OCRv3_det + ch_ppocr_mobile_v2.0_cls + ch_PP-OCRv3_rec
      
    • Demo:
  • Hugging Face Demo
    • The demo is built on Hugging Face's Spaces, generated by the Gradio library.
    • Demo:

TODO and Task Claim

Original initiator and start-up author

Acknowledgements

  • Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
  • Many thanks to zhsunlight for suggesting that GPU inference be enabled through a parameter, and for careful and thoughtful testing.
  • Many thanks to lzh111222334 for fixing several bugs in the rec preprocessing of the Python version.
  • Many thanks to AutumnSun1996 for the suggestion in #42.
  • Many thanks to DeadWood8 for providing the document on packaging rapidocr_web into an exe with Nuitka.
  • Many thanks to Loovelj for fixing the text box sorting bug. For details, see issue 75.

Sponsor

Sponsor Applied Products
  • If you want to sponsor the project, click the Sponsor button at the top of this page. Please include a note (e.g. your GitHub account name) so that you can be added to the sponsorship list above.

Authorization

  • The copyright of the OCR models belongs to Baidu, and the copyright of the other engineering code belongs to the owner of this repository.
  • This software is licensed under Apache 2.0. You are welcome to contribute code, open issues, or submit pull requests.
  • If you find this project useful in your research, please consider citing:
    @misc{RapidOCR2021,
        title={{Rapid OCR}: OCR Toolbox},
        author={MindSpore Team},
        howpublished = {\url{https://github.com/RapidAI/RapidOCR}},
        year={2021}
    }

Join us

  • For international developers, RapidOCR Discussions serves as our international community platform. All ideas and questions can be discussed there in English.

Demo

Demonstration with C++/JVM

Demonstration with .Net

Demonstration with multi_language
