3 篇文章带有标签 “paddleocr”

开源 OCR 引擎基准测试

OCR 引擎

EasyOCR

EasyOCR 支持 80+ 语言。

Abaza = 'abq'
Adyghe = 'ady'
Afrikaans = 'af'
Angika = 'ang'
Arabic = 'ar'
Assamese = 'as'
Avar = 'ava'
Azerbaijani = 'az'
Belarusian = 'be'
Bulgarian = 'bg'
Bihari = 'bh'
Bhojpuri = 'bho'
Bengali = 'bn'
Bosnian = 'bs'
Simplified_Chinese = 'ch_sim'
// ...

安装

pip install torch==2.0.1 torchvision==0.15.2 -i https://download.pytorch.org/whl/cpu
pip install easyocr

代码示例 import easyocr languages = ['ch_sim', 'en'] model = easyocr.

构建基于PaddlePaddle开发服务镜像

构建镜像

FROM paddlepaddle/paddle:2.2.2-gpu-cuda10.2-cudnn7
LABEL maintainer="wang-junjian@qq.com"

RUN apt-get update && apt-get install libjpeg-dev zlib1g-dev -y

RUN pip install -i https://mirrors.aliyun.com/pypi/simple/ \
    numpy fastapi paddleocr opencv-python

EXPOSE 20000

WORKDIR /inference-serving
ADD . ./

CMD ["python", "app.py"]

官方推荐:非安培架构的GPU,推荐使用CUDA10.2,性能更优。

自己构建 paddlepaddle 镜像

通过官方的 Docker Hub 没有找到 runtime 版本,想着节省几个G的空间,于是考虑自己来构建。

使用PaddleOCR进行文字识别

安装

pip install paddleocr

测试

import cv2
import numpy as np

from paddleocr import PaddleOCR


ocr = PaddleOCR(use_angle_cls=True)
image_path = 'test.jpg'
img = cv2.imread(image_path)

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_gray1 = img_gray[:,:, np.newaxis]
img_gray3 = np.concatenate([img_gray1, img_gray1, img_gray1], axis=-1)

texts = ocr.ocr(img_gray3)
for text in texts:
    """
    box   坐标1         坐标2
          坐标4         坐标3
    """
    box = text[0]
    t = text[1][0]
    score = text[1][1]

可视化(图像上画出文本和得分) import os import shutil import cv2 import numpy as np import uuid from PIL import ImageFo