A small text-processing question - V2EX

A small text-processing question

  •   morainzh 2023-04-09 19:15:14 +08:00 1384 views

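    Some background: I'm a high school student, and I need to preprocess some documents so I can import them into Anki and review them day to day.

    The documents contain English words, English phrases, and English example sentences, each with its Chinese definition or translation.

    The raw text looks like this: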

    10.consultant n. 顾问,咨询者,会诊医生 v. consult n.consultation 咨询 consultant fee 咨询费 financial consultant 财务顾问 I will have to consult the plane time table first.(我得先查阅飞机时刻表。) 11.receptionist n. 接待员,前台小姐 n. reception 接待仪式;招待会;接待区;接待 v. receive 收到;经受;接见,招待;接纳;回应;接收 Low achievers in schools willreceivepriority. (在学校里成绩较差的人会拥有优先权。) My speech was very well received.(我的报告受到了热烈的欢迎。) She was warmly received.(她受到热情地接待。) 

    I need to turn it into strict "Chinese + | + English\n" records, like this:

    10.consultant |n. 顾问,咨询者,会诊医生
    v. consult n.consultation|咨询
    consultant fee |咨询费
    financial consultant |财务顾问
    I will have to consult the plane time table first.|(我得先查阅飞机时刻表。)

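    I'm not very good with regular expressions, and it doesn't feel easy to handle with Python or the like either. Does anyone have a solution?

    I'm happy to pay a small amount for it. Thanks in advance.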

    2 replies    2023-04-11 23:02:28 +08:00
    TrembleBeforeMe   #1   2023-04-09 20:56:58 +08:00
    Regex.ai: let AI write your regular expressions for you | no need to learn regex ever again https://www.appinn.com/regex-ai/
    hahahahahahahah   #2   2023-04-11 23:02:28 +08:00 via iPhone
    Could you post the file?
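
For anyone landing here with the same problem, a minimal Python sketch of one way to do the split the post asks for. It is only a sketch under assumptions: "Chinese" is taken to mean characters in the basic CJK range U+4E00 to U+9FFF, a record is taken to be "a stretch of non-Chinese text followed by the Chinese run after it", and a part-of-speech tag directly after a numbered headword (`10.consultant n.`) is moved to the definition side because the example in the post does that. The post's wording says Chinese first, but its example puts the English first, so the sketch follows the example. The names `to_anki_lines`, `PAIR`, and `HEADWORD` are made up for illustration, and the real file may well need extra rules.

```python
import re

# Assumption: Chinese text lives in the basic CJK Unified Ideographs block.
CJK = r"\u4e00-\u9fff"
# Punctuation allowed inside a Chinese run (full-width forms plus ASCII , and ;).
PUNCT = r",。;:、!?,;"
# A Chinese run: optional opening bracket, CJK text, optional closing bracket.
ZH_RUN = rf"[((]?[{CJK}][{CJK}{PUNCT}]*[))]?"
# One record: the shortest non-Chinese stretch, then the Chinese run after it.
PAIR = re.compile(rf"(?P<en>[^{CJK}]*?)\s*(?P<zh>{ZH_RUN})")
# A numbered headword followed only by part-of-speech tags, e.g. "10.consultant n."
HEADWORD = re.compile(r"^(?P<word>\d+\.\s*.+?)\s+(?P<pos>(?:[a-z]{1,4}\.\s*)+)$")


def to_anki_lines(text: str) -> list[str]:
    """Split flat vocabulary text into 'English|Chinese' records."""
    lines = []
    for m in PAIR.finditer(text):
        en, zh = m.group("en").strip(), m.group("zh")
        head = HEADWORD.match(en)
        if head:
            # Move the tags to the definition side, as in the post's example.
            en = head.group("word")
            zh = f"{head.group('pos').strip()} {zh}"
        if en:
            lines.append(f"{en}|{zh}")
        elif lines:
            # Chinese run with no English before it: append to the previous record.
            lines[-1] += " " + zh
    # Note: trailing English with no Chinese after it is silently dropped.
    return lines


SAMPLE = (
    "10.consultant n. 顾问,咨询者,会诊医生 v. consult n.consultation 咨询 "
    "consultant fee 咨询费 financial consultant 财务顾问 "
    "I will have to consult the plane time table first.(我得先查阅飞机时刻表。)"
)

if __name__ == "__main__":
    print("\n".join(to_anki_lines(SAMPLE)))
```

On the first entry from the post this gives `10.consultant|n. 顾问,咨询者,会诊医生` as the first record, with whitespace around the `|` trimmed. Joining the result with `"\n"` and writing it out as UTF-8 gives one record per line.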