小花喵 - Python PDF to TXT

小花喵 (site motto: "Even the old ox knows daylight is precious, and breaks into a trot without the whip.")

2019年12月29日 10:55:22

Python PDF to TXT

import re

import pdfplumber


def main():
    path = 'cooper.pdf'
    txtfile = 'cooper.txt'
    print('Converting, please wait...')

    # Context managers close the PDF and the output file even if
    # extraction fails part-way through.
    with pdfplumber.open(path) as pdf, \
            open(txtfile, 'w', encoding='utf-8') as mytxtfile:
        for page in pdf.pages:
            # page.extract_text() would return all text on the page,
            # including text inside tables; only the tables are used here.
            for table in page.extract_tables():
                for row in table:
                    # Empty cells come back as None or '', so test truthiness.
                    if row and row[0]:
                        # Squeeze each run of spaces in the first cell down
                        # to a single space.
                        myrow = re.sub(r' {2,}', ' ', row[0])
                        mytxtfile.write(myrow + '\n')
                        print(row)

    print('Conversion complete!')


if __name__ == '__main__':
    main()
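The row-cleaning step can be tried without a PDF. The sketch below is only an illustration: `first_column_lines` and the `sample` data are hypothetical names made up here, and the input is assumed to have the shape that `page.extract_tables()` yields for one table (a list of rows, each a list of cells that are strings or `None`).

```python
import re


def first_column_lines(table):
    """Return the cleaned first-column text of each row in one table.

    `table` is assumed to be a list of rows, each a list of cells that
    are either strings or None.
    """
    lines = []
    for row in table:
        if not row or not row[0]:
            continue  # skip empty rows and rows whose first cell is None or ''
        # Squeeze each run of two or more spaces down to a single space.
        lines.append(re.sub(r' {2,}', ' ', row[0]))
    return lines


# Hypothetical sample data standing in for one extracted table.
sample = [
    ['Name    Qty', 'x'],
    [None, 'ignored'],
    ['', 'ignored'],
    ['Bolt  M8   10', None],
]
print(first_column_lines(sample))
```

Keeping the cleaning in a small function like this makes it easy to check against hand-written rows before running it on a real file.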
