Problem description
`links = sel.xpath('//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()` raises an error (置顶 means "pinned"): `ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters`

1 revotu Jun 28, 2017
See the article: [Solving the error raised when an XPath expression in Scrapy contains Chinese][1]

## Solutions ##

Method 1: make the whole XPath expression a Unicode string

```Python
links = sel.xpath(u'//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()
```

Method 2: build the XPath expression from a `title` variable that is already Unicode

```Python
title = u"置顶"
links = sel.xpath('//i[contains(@title,"%s")]/following-sibling::a/@href' %(title)).extract()
```

Method 3: use XPath variable syntax directly (a `$` sign followed by the variable name), here `$title`, and pass `title` as a keyword argument

```Python
links = sel.xpath('//i[contains(@title,$title)]/following-sibling::a/@href', title="置顶").extract()
```

[1]: http://www.revotu.com/solve-unicode-erros-using-xpath-in-scrapy.html
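To illustrate why the `u` prefix in the methods above helps, here is a minimal sketch of the kind of check the error message describes. `looks_xml_compatible` is a hypothetical helper written for this example, not the library's actual code, and it deliberately ignores the NULL byte and control character part of the rule. The assumption is that a plain Python 2 literal containing Chinese is a byte string, which is what `byte_query` simulates.

```python
# -*- coding: utf-8 -*-

def looks_xml_compatible(value):
    """Hypothetical, simplified stand-in for the check behind the ValueError.

    Unicode text is accepted; byte strings are accepted only if pure ASCII.
    """
    if isinstance(value, bytes):
        # bytearray() yields integers on both Python 2 and Python 3
        return all(b < 128 for b in bytearray(value))
    return True


xpath_template = u'//i[contains(@title,"%s")]/following-sibling::a/@href'

# Unicode query, as produced by method 1 or method 2
unicode_query = xpath_template % u"置顶"

# UTF-8 bytes, which is what an unprefixed Python 2 literal would hold
byte_query = unicode_query.encode("utf-8")

print(looks_xml_compatible(unicode_query))  # True
print(looks_xml_compatible(byte_query))     # False
```

A pure ASCII byte string such as `b'//a/@href'` passes the same check, which is why the problem only shows up once Chinese text appears in the expression.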
2 bsns Jun 28, 2017 I usually just add the `u` prefix.
4 NaVient Jun 29, 2017 For a standalone crawler project, please use py3.
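A small sketch of why moving to Python 3 sidesteps the problem, assuming Python 3.3 or later (where the `u` prefix is still accepted): every string literal is already Unicode text, so the prefixed and unprefixed forms are the same value. The variable names are only for illustration.

```python
# Assumes Python 3.3+; on Python 2 the two literals below would differ in type.
plain = '//i[contains(@title,"置顶")]/following-sibling::a/@href'
prefixed = u'//i[contains(@title,"置顶")]/following-sibling::a/@href'

# Both literals are Unicode text (str), never bytes
print(type(plain).__name__)
print(plain == prefixed)
```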