
A question about the Python 3 re module
import re
s = " 以及: "
pattern = re.compile('[(](.+?)(:?.png|.jpg)[)]')
result = pattern.findall(s)
for i in result:
    print(i)
The matches come out as follows:
('/img/2020pic/02/1', '.jpg')
('/img/2020pic/02/2', '.png')
Why is each match split into a tuple? If I want to pull out /img/2020pic/02/1.jpg and the other .png path as standalone strings, how should I change the pattern?
1 ysc3839 2020-10-02 19:07:47 +08:00 via Android What does the last sentence mean? Could you give an example? |
2 WaterWestBolus OP @ysc3839 I mean that I want to write a regular expression that extracts ```/img/2020pic/02/1.jpg``` and ```/img/2020pic/02/2.png``` from the s above and puts them in a list. The expected list is:
```
['/img/2020pic/02/1.jpg','/img/2020pic/02/2.png']
``` |
3 1462326016 2020-10-02 19:21:28 +08:00 Maybe something like this?
```
import re
s = " 以及: "
pattern = re.compile(r'\((.+?[.png|.jpg])\)')
result = pattern.findall(s)
for i in result:
    print(i)
```
The formatting may get mangled... replies don't seem to support markdown |
4 WaterWestBolus OP @1462326016 Thanks for your reply. I just tried your code, and with s = ' (( RubberPencil) p).write("Hello");' it actually matches the string '( RubberPencil) p', which I find very puzzling. |
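One likely explanation for the surprising match in reply 4: inside square brackets, `[.png|.jpg]` is a character class, so it matches exactly one character out of `.`, `p`, `n`, `g`, `|`, `j` rather than the suffix `.png` or `.jpg`. The sketch below reproduces the reported result and contrasts it with a non-capturing alternation. The variable names are illustrative only.

```python
import re

s = ' (( RubberPencil) p).write("Hello");'

# [.png|.jpg] is a character class: ONE character from the set . p n g | j
char_class = re.compile(r'\((.+?[.png|.jpg])\)')
# \.(?:jpg|png) is a literal dot followed by the text "jpg" or "png"
alternation = re.compile(r'\((.+?\.(?:jpg|png))\)')

class_matches = char_class.findall(s)
alternation_matches = alternation.findall(s)

print(class_matches)        # the lone 'p' before ')' satisfies the class
print(alternation_matches)  # this string contains no .jpg/.png path
```

With the character class, any parenthesised text whose last character happens to be in that set will match, which is why the Java-like snippet is picked up.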
5 ysc3839 2020-10-02 20:32:31 +08:00 via Android |
6 iNaru 2020-10-02 20:52:06 +08:00 (?<=\().+?\.(?:jpg|png)(?=\)) |
7 AlisaDestiny 2020-10-02 21:38:05 +08:00
In [2]: p = re.compile(r'\((.+?\.(?:jpg|png))\)')
In [3]: p.findall(s)
Out[3]: ['/img/2020pic/02/1.jpg', '/img/2020pic/02/2.png'] |
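To spell out why the original pattern yields tuples: `findall` returns a list of tuples whenever the pattern contains more than one capturing group, and `(:?.png|.jpg)` is an ordinary capturing group that starts with an optional colon, not the non-capturing `(?:...)`. The sketch below compares the two patterns. The sample string is an assumption (the original value of `s` did not survive in the post), written as Markdown image links with hypothetical alt text.

```python
import re

# Assumed sample input; the thread's real string is not available.
s = "![a](/img/2020pic/02/1.jpg) 以及: ![b](/img/2020pic/02/2.png)"

# Two capturing groups: findall returns one tuple per match.
two_groups = re.compile(r'[(](.+?)(:?.png|.jpg)[)]')
# One capturing group wrapping a non-capturing alternation:
# findall returns plain strings.
one_group = re.compile(r'\((.+?\.(?:jpg|png))\)')

tuples = two_groups.findall(s)
paths = one_group.findall(s)

print(tuples)
print(paths)
```

Escaping the dot as `\.` also matters: an unescaped `.` matches any character, so `.png` would accept `xpng` as well.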
8 WaterWestBolus OP |
9 ysc3839 2020-10-02 23:26:42 +08:00 via Android |
10 ysc3839 2020-10-02 23:29:28 +08:00 via Android |
11 JCZ2MkKb5S8ZX9pq 2020-10-02 23:55:51 +08:00 re.compile(r'(?<=\]\().*?\.(?:png|jpg)(?=\))') This worked when I tried it. |
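The lookaround patterns from replies 6 and 11 contain no capturing groups at all, so `findall` returns the whole match, and the parentheses are checked without being consumed. A minimal sketch follows. The sample string is again an assumption standing in for the lost original, with hypothetical alt text.

```python
import re

# Assumed sample input; the thread's real string is not available.
s = "![a](/img/2020pic/02/1.jpg) 以及: ![b](/img/2020pic/02/2.png)"

# Reply 6: must be preceded by "(" and followed by ")".
after_paren = re.compile(r'(?<=\().+?\.(?:jpg|png)(?=\))')
# Reply 11: stricter, must be preceded by "](" as in a Markdown link.
after_bracket_paren = re.compile(r'(?<=\]\().*?\.(?:png|jpg)(?=\))')

paths_6 = after_paren.findall(s)
paths_11 = after_bracket_paren.findall(s)

print(paths_6)
print(paths_11)
```

Note that Python's `re` only accepts fixed-width lookbehinds, which both of these are.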
12 brucmao 2020-10-03 00:51:44 +08:00 |
13 krixaar 2020-10-03 08:35:38 +08:00 The goal is a list, so you don't necessarily have to solve this in the regex itself. result = [''.join(i) for i in pattern.findall(s)] would do it directly, wouldn't it? |
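A small sketch of this post-processing approach, keeping the original two-group pattern unchanged. The sample string is an assumption (the thread's real input was lost), with hypothetical alt text.

```python
import re

# Assumed sample input; the thread's real string is not available.
s = "![a](/img/2020pic/02/1.jpg) 以及: ![b](/img/2020pic/02/2.png)"

# The original pattern from the question, with its two capturing groups.
pattern = re.compile(r'[(](.+?)(:?.png|.jpg)[)]')

# Each match is a (stem, suffix) tuple; joining the parts rebuilds the path.
result = [''.join(parts) for parts in pattern.findall(s)]

print(result)
```

This works here because the two groups together cover exactly the text that is wanted. If the pattern had text between or around the groups that should be kept, joining would drop it.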
14 chaogg 2020-10-04 20:05:40 +08:00
>>> pattern = re.compile(r'\!\[.*?\]\((.+?\.(?:jpg|png))\)')
>>> pattern.findall(s)
['/img/2020pic/02/1.jpg', '/img/2020pic/02/2.png'] |
15 biglazycat 2020-10-24 16:01:09 +08:00
import re
line = " 以及: "
# Raw string, so that \( and \S are not treated as string escape sequences
pattern = re.compile(r'\((\S+)\)')
result = pattern.findall(line)
print(result) |
16 biglazycat 2020-11-23 04:54:37 +08:00
>>> s=" 以及: "
>>> re.findall(r'[/\w]+\/\w+\.\w+', s)
['/img/2020pic/02/1.jpg', '/img/2020pic/02/2.png'] |