This post was last edited by icestick8586 on 2018-11-30 12:06
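1. First, register a Baidu account, then create an application for the recognition type you need and note its AppID, API Key, and Secret Key. See Baidu's official getting-started guide: http://ai.baidu.com/docs#/Begin/top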
2. Official usage documentation: http://ai.baidu.com/docs#/OCR-Python-SDK/top
- # -*- coding: UTF-8 -*-
- # Prerequisite: the aip library must be installed in Python: pip install baidu-aip
-
- import os
- from aip import AipOcr
- APP_ID = 'the APP_ID of the application you created'
- API_KEY = 'the API_KEY of the application you created'
- SECRET_KEY = 'the SECRET_KEY of the application you created'
- aipOcr = AipOcr(APP_ID, API_KEY, SECRET_KEY)
- os.chdir("E:\\office\\src_pic") # directory containing the images to convert
- dirs = os.listdir()
- def get_file_content(filePath):
-     with open(filePath, 'rb') as fp:
-         return fp.read()
- options = {}
- options["language_type"] = "CHN_ENG"
- options["detect_direction"] = "true"
- options["detect_language"] = "true"
- options["probability"] = "true"
-
- print('Starting, '+str(len(dirs))+' images in total.')
- flag=0
- T = 0 # number of images processed successfully
- for filePath in dirs:
-     if filePath.split('.')[-1]=='txt':continue # skip text files left by earlier runs
-     flag+=1
-     print('Processing image '+str(flag))
-     try:
-         result = aipOcr.basicGeneral(get_file_content(filePath), options)
-     except Exception as e:
-         print(e)
-     else:
-         try:
-             with open(os.path.splitext(filePath)[0]+'.txt','w',encoding='utf-8') as f:
-                 for i in result['words_result']:
-                     f.write(i['words']+'\n')
-             T += 1
-         except Exception as e:
-             print(e)
-         else:
-             print('Done')
- print('{}All done!{}'.format("="*30,"="*30))
- print('{} images succeeded, {} images failed'.format(T,flag-T))
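The script above skips only .txt files and reads result['words_result'] directly, so any other non-image file in the folder gets sent to the API, and a failed request shows up only as a KeyError. Below is a rough sketch of two small helpers that could tighten this up. Both names (is_image_file, extract_lines) and the extension list are my own, not part of the SDK, and the error check assumes that a failed request comes back as a dict containing error_code and error_msg, which is how I understand Baidu's responses to look; please check the official docs before relying on it.

```python
import os

# Assumed list of extensions; adjust it to the formats you actually use.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp'}

def is_image_file(name):
    """Return True if the file name ends in one of the assumed image extensions."""
    return os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS

def extract_lines(result):
    """Return the recognized text lines from an OCR response dict.

    Assumption: an error response carries 'error_code' and 'error_msg'
    instead of 'words_result'.
    """
    if 'error_code' in result:
        raise RuntimeError('OCR failed: {} {}'.format(
            result.get('error_code'), result.get('error_msg')))
    return [item['words'] for item in result.get('words_result', [])]
```

In the loop you could then write `if not is_image_file(filePath): continue` in place of the .txt check, and iterate over `extract_lines(result)` when writing the output file, so that an API error is reported with its message and counted as a failure.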