Python UnicodeEncodeError: 'ascii' codec can't encode characters: how to fix it

Posted: 2010-06-25
Python UnicodeEncodeError: 'ascii' codec can't encode characters, explained in detail
Create a new file, test.py:

#coding:utf-8

s='nihao中国'.decode('utf-8')
print type(s)
print s

Running it raises an error:
<type 'unicode'>
Traceback (most recent call last):
  File "/home/sdm/work/code/datadeal/tran_client/test_encode.py", line 5, in <module>
    print s
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-6: ordinal not in range(128)
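
To make the traceback above easier to read, here is a minimal sketch that triggers the same failure explicitly, without print. It assumes the failing step is an encode to ASCII, which is what the error message reports. The variable names are mine, and the u'' literal keeps the snippet valid on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
text = u'nihao中国'  # the same seven characters as in test.py

error = None
try:
    # ASCII only covers code points 0-127, so the two Chinese
    # characters at indexes 5 and 6 cannot be encoded.
    text.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc

# start is the first failing index; end is one past the last failing index.
print('%s %d %d' % (error.encoding, error.start, error.end))
```

The "position 5-6" in the message corresponds to the start and end attributes of the exception, with end being exclusive.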
-------------
Modify it as follows:
#coding:utf-8
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

s='nihao中国'.decode('utf-8')
print type(s)
print s
--------- Everything works -------
<type 'unicode'>
nihao中国
-------------------------
Next, also write the string to a file:
#coding:utf-8
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

s='nihao中国'.decode('utf-8')
print type(s)

fn='/tmp/test.txt'
f=open(fn,'w')
f.write(s)
f.close()
print open(fn).read()


--------------
No error:
<type 'unicode'>
nihao中国
---------------------------------------
Now comment out the sys.setdefaultencoding call:
#coding:utf-8
import sys
reload(sys)
#sys.setdefaultencoding('utf-8')

s='nihao中国'.decode('utf-8')
print type(s)

fn='/tmp/test.txt'
f=open(fn,'w')
f.write(s)
f.close()
print open(fn).read()


---------
This raises an error:
<type 'unicode'>
Traceback (most recent call last):
  File "test_encode.py", line 11, in <module>
    f.write(s)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-6: ordinal not in range(128)
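
The f.write(s) failure above happens because a unicode object is handed to a byte-oriented file and has to be encoded implicitly. As an alternative sketch (not what the original test does), the encoding can be stated explicitly with io.open, which exists on Python 2.6+ and Python 3. The temporary file path is a stand-in of mine for /tmp/test.txt.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

text = u'nihao中国'

# Scratch file standing in for /tmp/test.txt from the post.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    # io.open encodes on write and decodes on read with the codec
    # given here, so the interpreter's default encoding is not used.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(text)
    with io.open(path, 'r', encoding='utf-8') as f:
        round_trip = f.read()
finally:
    os.remove(path)

print(round_trip == text)
```

With the codec named at the call site, the result no longer depends on whether sys.setdefaultencoding was called.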

-------------------------
This shows that sys.setdefaultencoding changes the codec used when a unicode object is implicitly encoded to str.
So why is sys.setdefaultencoding only available after reload(sys)?
-------
#coding:utf-8
import sys
#reload(sys)
sys.setdefaultencoding('utf-8')
-----------------
Traceback (most recent call last):
  File "test_encode.py", line 4, in <module>
    sys.setdefaultencoding('utf-8')
AttributeError: 'module' object has no attribute 'setdefaultencoding'
--------------------
grep -r -i 'setdefaultencoding' /usr/lib/python2.6
shows:
/usr/lib/python2.6/site.py:        sys.setdefaultencoding(encoding) # Needs Python Unicode build !
/usr/lib/python2.6/site.py:    # Remove sys.setdefaultencoding() so that users cannot change the
/usr/lib/python2.6/site.py:    if hasattr(sys, "setdefaultencoding"):
/usr/lib/python2.6/site.py:        del sys.setdefaultencoding

------------
Inside site.py:
def main():
    global ENABLE_USER_SITE

    abs__file__()
    known_paths = removeduppaths()
    if ENABLE_USER_SITE is None:
        ENABLE_USER_SITE = check_enableusersite()
    known_paths = addusersitepackages(known_paths)
    known_paths = addsitepackages(known_paths)
    if sys.platform == 'os2emx':
        setBEGINLIBPATH()
    setquit()
    setcopyright()
    sethelper()
    aliasmbcs()
    setencoding()
    execsitecustomize()
    if ENABLE_USER_SITE:
        execusercustomize()
    # Remove sys.setdefaultencoding() so that users cannot change the
    # encoding after initialization.  The test for presence is needed when
    # this module is run as a script, because this code is executed twice.
    if hasattr(sys, "setdefaultencoding"):
        del sys.setdefaultencoding


main()
Here main() deletes sys.setdefaultencoding so that users cannot change the default encoding after initialization. I don't know why.
----------------------------------------
The site module is loaded automatically.
To verify:
python  -c "import sys;print 'site' in sys.modules"
True
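
The same check can be done from inside a script. This is a small sketch of mine, not part of the original test, that reports the three facts discussed above: whether site was imported, what the default encoding is, and whether sys.setdefaultencoding is still present. It assumes the interpreter was started normally, without the -S option.

```python
import sys

site_loaded = 'site' in sys.modules
default_encoding = sys.getdefaultencoding()
# False once site.py has run its cleanup; Python 3 has no such function at all.
has_setter = hasattr(sys, 'setdefaultencoding')

print('site loaded: %s' % site_loaded)
print('default encoding: %s' % default_encoding)
print('sys.setdefaultencoding present: %s' % has_setter)
```

On the Python 2.6 setup from the post, the default encoding reported should be ascii unless something has already changed it.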
------------------
This one raises the error:
#coding:utf-8
import sys
s='nihao中国'.decode('utf-8')
print type(s)

fn='/tmp/test.txt'
f=open(fn,'w')
f.write(s)
f.close()
print open(fn).read()

-------
Run directly in a console, this sometimes does not raise an error:
#coding:utf-8
import sys
s='nihao中国'.decode('utf-8')
print type(s)
print s

But it does fail when run under nohup:
nohup python test_encode.py
tail nohup.out
<type 'unicode'>
Traceback (most recent call last):
  File "test_encode.py", line 5, in <module>
    print s
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-6: ordinal not in range(128)

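To Python, standard output under nohup is not the same as in a console: on a terminal, sys.stdout.encoding is taken from the locale, but when output is redirected to a file it is None, and Python 2 falls back to the default ASCII encoding.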
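
To see the difference between the console and nohup, one can inspect the stream itself. The sketch below is my own illustration: describe is a hypothetical helper, and an io.BytesIO object stands in for a redirected stdout so that the snippet runs the same way on Python 2 and Python 3. It then wraps that byte stream in a UTF-8 writer from the codecs module, which is one way to stop relying on the detected encoding. Setting the PYTHONIOENCODING environment variable is another option on Python 2.6 and later.

```python
# -*- coding: utf-8 -*-
import codecs
import io
import sys


def describe(stream):
    """Hypothetical helper: return (is_tty, encoding) for a file-like object."""
    is_tty = hasattr(stream, 'isatty') and stream.isatty()
    return is_tty, getattr(stream, 'encoding', None)


# On a terminal this usually reports a locale encoding; when redirected
# on Python 2 the encoding is None.
print('stdout: tty=%s encoding=%s' % describe(sys.stdout))

# Stand-in for a redirected, byte-oriented stdout.
redirected = io.BytesIO()
writer = codecs.getwriter('utf-8')(redirected)
writer.write(u'nihao中国\n')  # encoded to UTF-8 by the writer
captured = redirected.getvalue()
```

On Python 2 the same wrapper is commonly applied to sys.stdout directly, so that print works the same way under nohup as in a console.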
-----------
Using twisted.python.log behaves much like nohup.
After adding these lines at the top of the code:
import  twisted.python.log as  log
log.startLogging(sys.stdout)
printing a unicode object still fails:
#coding:utf-8
import sys
reload(sys)
#sys.setdefaultencoding('utf-8')

import  twisted.python.log as  log
log.startLogging(sys.stdout)

s='nihao中国'.decode('utf-8')
print type(s)
print s

2010-06-18 14:53:19+0800 [-] Log opened.
2010-06-18 14:53:19+0800 [-] <type 'unicode'>
2010-06-18 14:53:19+0800 [-] <unicode instance at 3074179752 with str error Traceback (most recent call last):
       
          File "/usr/lib/python2.6/dist-packages/twisted/python/reflect.py", line 560, in safe_str
            return str(o)
       
        UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-6: ordinal not in range(128)
        >
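
The log above shows the failure coming from str(o) on a unicode object. One way around it, sketched here under my own assumptions, is to encode the text to UTF-8 bytes before handing it to print or to the logger, so that no implicit ASCII encode is needed. to_utf8 is a hypothetical helper, not part of Twisted. This is aimed at Python 2, where bytes and str are the same type; on Python 3 printing bytes would show their repr instead.

```python
# -*- coding: utf-8 -*-
text_type = type(u'')  # unicode on Python 2, str on Python 3


def to_utf8(value):
    """Hypothetical helper: encode text to UTF-8 bytes, pass other values through."""
    if isinstance(value, text_type):
        return value.encode('utf-8')
    return value


encoded = to_utf8(u'nihao中国')
```

In the Python 2 test script this would be used as print to_utf8(s) instead of print s.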

----
Conclusion:
Adding the following at the top of the file is relatively safe:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

----------



Posted: 2010-06-28
The standard approach is to create a sitecustomize.py file containing these two lines:

import sys
sys.setdefaultencoding('utf-8')
Posted: 2011-01-18
You really are an expert. I had assumed that converting to a string would be enough to solve it.