Handling Chinese text in Python 3

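Parsing JSON data with requests works when I run the script in a terminal, but it fails when the script runs in the background, with the following error: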

'ascii' codec can't encode characters in position 530-531: ordinal not in range(128)

The JSON contains Chinese characters, which is probably the cause.
Calling encode('utf-8') before json() doesn't fix it either.

How can I solve this?

Addendum 1, 2017-04-19 16:42:40 +08:00

In the end I solved it by exporting an environment variable in the script:
export PYTHONIOENCODING=UTF-8
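
To see why the environment variable helps, here is a minimal sketch that asks a child interpreter which encoding it picks for stdout. The helper name `child_stdout_encoding` is made up for illustration, and what the first call reports depends on the Python version and the locale of the machine.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env):
    """Run a child interpreter and return the encoding it chose for stdout."""
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)  # start from a clean slate
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


if __name__ == "__main__":
    # With a bare C locale, older Python 3 releases typically fall back to ASCII.
    print("LC_ALL=C only:        ", child_stdout_encoding({"LC_ALL": "C"}))
    # PYTHONIOENCODING overrides whatever the locale would have selected.
    print("plus PYTHONIOENCODING:",
          child_stdout_encoding({"LC_ALL": "C", "PYTHONIOENCODING": "UTF-8"}))
```

Background jobs (cron, init scripts and the like) often run with a minimal locale, which would explain why the same script works in an interactive terminal.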

8 replies, 2017-04-19 23:51:16 +08:00

clino

2017-04-19 16:17:35 +08:00

Then you first need to figure out what encoding the string is actually in.

Feva

2017-04-19 16:22:30 +08:00

# coding=utf-8
or
# -*- coding:utf-8 -*-
Does the file that runs in the background have one of these?

JasperYanky

2017-04-19 16:25:19 +08:00

@Feva The script worked fine before. The data in the JSON has changed since then, so some Chinese character is probably causing it.

JasperYanky

2017-04-19 16:26:04 +08:00

@clino Python 3 strings are Unicode by default, aren't they?
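
A small sketch of that point: a Python 3 `str` is already Unicode, and the error only shows up at the boundary where the text has to be turned into bytes with a codec that cannot represent it. The helper `try_encode` is a hypothetical name used only for this illustration.

```python
text = "你好"
assert isinstance(text, str)  # str holds Unicode code points, not bytes


def try_encode(value, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


utf8_result = try_encode(text, "utf-8")   # succeeds and returns bytes
ascii_result = try_encode(text, "ascii")  # fails and returns the error text

print(utf8_result)
print(ascii_result)
```

The second line printed has the same shape as the error in the question, which suggests that something in the background environment is encoding text as ASCII.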

Feva

2017-04-19 16:28:30 +08:00

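@JasperYanky This is clearly an error from handling non-ASCII characters.
If adding the encoding declaration at the top of the .py file doesn't help, also add the following code: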
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
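
Note that the snippet above is a Python 2 idiom: in Python 3, `reload` is no longer a builtin and `sys.setdefaultencoding` does not exist. A rough Python 3 counterpart, assuming the failure comes from writing text to a stream that was opened as ASCII, is to choose the stream encoding explicitly. The helper `encode_via_stream` is a made-up name that imitates such a stream in memory.

```python
import io
import sys


def encode_via_stream(text, encoding):
    """Write text through a TextIOWrapper and return the bytes it produced."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding=encoding)
    try:
        wrapper.write(text)  # raises UnicodeEncodeError for unencodable text
        wrapper.flush()
        return raw.getvalue()
    finally:
        wrapper.detach()     # release `raw` without closing it


if __name__ == "__main__":
    print(encode_via_stream("你好", "utf-8"))
    # On Python 3.7+ the real stdout can be switched in place.
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")
        print("你好")
```

Calling `encode_via_stream("你好", "ascii")` raises the same kind of `UnicodeEncodeError` as in the question.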

ipwx

2017-04-19 17:15:33 +08:00

The Python script doesn't specify an encoding, so the interpreter is raising the error. It's probably not JSON's fault.

ipwx

2017-04-19 17:15:55 +08:00

$ ipython
Python 3.6.0 |Anaconda 4.3.1 (x86_64)| (default, Dec 23 2016, 13:19:00)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.

In [1]: import json

In [2]: json.dumps('你好')
Out[2]: '"\\u4f60\\u597d"'
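
The session above relies on the default `ensure_ascii=True`, which is why the output contains only ASCII escapes. A short sketch of both modes, using a made-up sample dictionary:

```python
import json

data = {"greeting": "你好"}  # sample payload, for illustration only

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False keeps the characters as they are.
literal = json.dumps(data, ensure_ascii=False)

# The escaped form survives an ASCII-only channel; the literal form does not.
escaped.encode("ascii")
assert json.loads(escaped) == json.loads(literal) == data

print(escaped)
```

So serializing with `json.dumps` alone should not trigger the ASCII error; printing or writing the decoded Chinese text is the more likely trigger.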

yucongo

2017-04-19 23:51:16 +08:00

It may be that Chinese web pages don't fully follow the standard. Try resp.encoding = 'utf-8':
url = '...'
resp = requests.get(url)
resp.encoding = 'UTF-8'

=====
In [189]: url = 'http://www.baidu.com'

In [190]: import requests

In [191]: resp = requests.get(url)

In [192]: resp.encoding
Out[192]: 'ISO-8859-1' # bad!

In [200]: chardet.detect(resp.content)
Out[200]: {'confidence': 0.99, 'encoding': 'utf-8'}

In [201]: resp.encoding = 'utf-8'
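
What the wrong `resp.encoding` does to the text can be sketched without a network call. Here `body` is a made-up stand-in for `resp.content`, assumed to hold UTF-8 encoded JSON.

```python
import json

# Stand-in for resp.content: UTF-8 bytes as a server might send them.
body = '{"city": "北京"}'.encode("utf-8")

wrong = body.decode("iso-8859-1")  # what a mis-guessed encoding produces
right = body.decode("utf-8")       # the encoding the bytes actually use

garbled_city = json.loads(wrong)["city"]
correct_city = json.loads(right)["city"]

# ascii() keeps the output printable even on an ASCII-only stdout.
print(ascii(garbled_city))
print(ascii(correct_city))

# json.loads also accepts UTF-8 bytes directly (Python 3.6+).
assert json.loads(body)["city"] == correct_city
```

Decoding with the wrong codec does not raise an error here; it silently produces garbled text, so it is worth checking the encoding before reading `resp.text`.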