utf8-decode

Python popen() - communicate(str.encode(encoding="utf-8", errors="ignore")) crashes

Using Python 3.4.3 on Windows. My script runs a small Java program in the console and should capture its output:

    import subprocess
    p1 = subprocess.Popen([...], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    out, err = p1.communicate(str.encode("utf-8"))

This results in a plain 'UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 135: character maps t

encode utf-8 python python-3.x encoding subprocess popen
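
With `universal_newlines=True`, Python decodes the child's output using the locale's preferred encoding, which on Windows is often a "charmap" code page, and that is where this kind of error tends to come from. A minimal sketch of one way around it, assuming the Java program really emits UTF-8: capture raw bytes and decode them explicitly. The helper name `run_and_decode` and the stand-in child command are hypothetical; the real command is elided in the question.

```python
import subprocess
import sys


def run_and_decode(cmd, encoding="utf-8", errors="ignore"):
    """Run cmd, capture raw bytes, and decode them explicitly."""
    # No universal_newlines=True here, so the pipes yield bytes, not str.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return out.decode(encoding, errors), err.decode(encoding, errors)


# Hypothetical child process standing in for the Java program:
# it writes "ok" followed by the stray byte 0x9d.
child = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'ok\\x9d')"]
out, err = run_and_decode(child)
print(repr(out))
```

Note that the argument to `communicate()` is data sent to the child's stdin, not an encoding for its output, so `str.encode("utf-8")` there does not influence decoding.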

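python - Unescaping this string using decode() vs. a regex

I have the following string, and I am trying to work out the best practice for unescaping it. The solution has to be somewhat flexible, because I receive this input from an API and cannot be absolutely sure that the current character structure (\n rather than \r) will always be the same.

    '“If it is not broken, do not fix it.”\nGot a detailed car wash.\nThe attendant was driving the car into the tunnel. Note: my car is...'

This regex looks like it should work:

    text_excerpt = re.sub(r'[\s"\\]', '', raw_text_excerpt).strip()

I have also read that decode() might work (and is usually the better solution).

    raw_text_excerpt.decode('string_unescape')

Tried a few things along these lines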

python decode regex string escaping

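Python - Reading a text file in a strange utf-16 format

I am trying to read a text file into Python, but it seems to use some very strange encoding. I tried the usual approach:

    file = open('data.txt', 'r')
    lines = file.readlines()
    for line in lines[0:1]:
        print line,
        print line.split()

Output:

    0.0200197 1.97691e-005
    ['0\x00.\x000\x002\x000\x000\x001\x009\x007\x00', '\x001\x00.\x009\x007\x006\x009\x001\x00e\x00-\x000\x000\x005\x00']

Printing the line works fine, but when I try to split the line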

Python numpy encoding utf-16le
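
The interleaved `\x00` bytes in the output above are what UTF-16 little-endian text looks like when read as single bytes. A minimal sketch, assuming the file is UTF-16LE without a byte order mark (as the `utf-16le` tag suggests): open it with an explicit encoding so each line is decoded before it is split. The helper name `read_split_lines` and the temporary demo file are hypothetical.

```python
import io
import os
import tempfile


def read_split_lines(path, encoding="utf-16-le"):
    """Decode the file while reading, then split each line on whitespace."""
    with io.open(path, "r", encoding=encoding) as f:
        return [line.split() for line in f]


# Demo file with the two values shown in the question (assumed layout).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(u"0.0200197 1.97691e-005\r\n".encode("utf-16-le"))

rows = read_split_lines(path)
os.remove(path)
print(rows)
```

If the file does start with a byte order mark, `encoding="utf-16"` is probably the better choice, since that codec consumes the mark instead of leaving it in the first token.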

Python (nltk) - UnicodeDecodeError: 'ascii' codec can't decode byte

I am new to NLTK. I ran into this error and searched around for encoding/decoding, UnicodeDecodeError in particular, but this error seems to be specific to the NLTK source code. Here is the error:

    Traceback (most recent call last):
      File "A:\Python\Projects\Test\main.py", line 2, in <module>
        print(pos_tag(word_tokenize("John's big idea isn't all that bad.")))
      File "A:\Python\Python\lib\site-packages\nltk\tag\__init__.py", line 100, in pos_tag
        tagg

UnicodeDecodeError Python error-handling compiler-errors nltk

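python - PyODBC outputs incorrect UTF-16

I am trying to pull a list of table names from a MySQL database. The relevant part of the code is:

    conn = pyodbc.connect('...')
    cursor = conn.cursor()
    for table in cursor.tables():
        print table.table_name

For every table it prints a bunch of garbage (boxes and diamond question marks). Using repr(table.table_name) it prints:

    u'\U00500041\U004c0050\U00430049\U00540041\U004f0049'

for a table named "APPLICATION". If you treat each 32-bit character as two 16-bit characters, you get the string "PALPCITAOI". Swapping the character pairs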

python table mysql unicode pyodbc

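python - "decoder jpeg not available" using Pillow on AWS Elastic Beanstalk

I am having some trouble processing jpeg files with Python under AWS Elastic Beanstalk. I have this in my .ebextensions/python.config file:

    packages:
      yum:
        libjpeg-turbo-devel: []
        libpng-devel: []
        freetype-devel: []
    ...

So I believe I have libjpeg installed and working (I tried libjpeg-devel, but yum cannot find that package). Also, I have this in my requirements.txt:

    Pillow==2.5.1
    ...

So I believe I have Pillow installed and working in my environment. Then, since I have Pillow and lib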

Beanstalk thumbnail Pillow python amazon-web-services python-imaging-library amazon-elastic-beanstalk

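python - Can I set decode(errors="ignore") as the default for all strings in a Python 2.7 program?

I have a Python 2.7 program that writes out data from various external applications. When writing to a file, I kept getting exceptions until I added .decode(errors="ignore") to the strings being written out. (FWIW, opening the file with mode="wb" does not fix this.) Is there a way to say "ignore encoding errors on all strings in this scope"?

Best answer: You cannot redefine the methods of built-in types, nor can you change the default value of the errors parameter of str.decode(). There are other ways to get the desired behavior, though. A slightly better approach: define your own decode() function:

    def decode(s, encoding="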

python errors python-2.7 decode
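
The answer's own function is cut off above, so the following is only a sketch of the general idea rather than the answerer's code: a small wrapper that supplies `errors="ignore"` by default. The name `decode_lenient` and the default encoding of UTF-8 are assumptions made for illustration.

```python
def decode_lenient(data, encoding="utf-8", errors="ignore"):
    """Decode byte strings, silently dropping undecodable bytes.

    Values that are already text are returned unchanged.
    The default encoding is an assumption, not taken from the answer.
    """
    if isinstance(data, bytes):
        return data.decode(encoding, errors)
    return data


# 0xe9 is not valid UTF-8 on its own, so it is dropped.
cleaned = decode_lenient(b"caf\xe9 ok")
print(cleaned)
```

Calling the wrapper at the point where external data enters the program keeps the lenient behavior in one place, at the cost of silently losing bytes that do not decode.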

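python regex with a utf8 problem

I have a file containing multiple lines of plain utf-8 text, such as the one below, which, by the way, is in Chinese.

    PROCESS：类型：关爱积分[NOTIFY]交易号：2012022900000109订单号：W12022910079166交易金额：0.01元交易状态：true2012-2-29 10:13:08

The file itself is saved as utf-8. The file name is xx.txt. Here is my Python code; the environment is Python 2.7:

    #coding:utf-8
    import re
    pattern = re.compile(r'交易金额：(\d+)元')
    for line in open('xx.txt'):
        match = pattern.match(line.decod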

python utf8 utf-8 regex python-2.7
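
The question's code is truncated, so the actual cause is not visible here. Two things in the visible part would each prevent a match on the sample line, though: `match()` only matches at the start of the string, and `\d+` cannot span the decimal point in `0.01`. A minimal sketch under those assumptions, written for Python 3 rather than the question's Python 2.7, with a hypothetical adjusted pattern:

```python
import re

# Hypothetical pattern: allows an optional decimal part in the amount.
pattern = re.compile(r"交易金额：(\d+(?:\.\d+)?)元")

sample = (
    "PROCESS：类型：关爱积分[NOTIFY]交易号：2012022900000109"
    "订单号：W12022910079166交易金额：0.01元交易状态：true"
)

# Simulate reading the line as UTF-8 bytes, then decode before matching.
line = sample.encode("utf-8").decode("utf-8")

# search() scans the whole line; match() would only try position 0.
found = pattern.search(line)
amount = found.group(1) if found else None
print(amount)
```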

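python - 'str' object has no attribute 'decode' in Python3

I have some problems with the "decode" method in Python 3.3.4. Here is my code:

    for lines in open('file', 'r'):
        decodedLine = lines.decode('ISO-8859-1')
        line = decodedLine.split('\t')

But I cannot decode the lines because of this problem: AttributeError: 'str' object has no attribute 'decode'. Do you have any ideas? Thanks.

Best answer: One encodes strings, and one decodes bytes. You should read bytes from the file and decode them:

    for lines in open('file', 'rb'):
        de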

decodedLine python python-3.x python-3.3
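
The answer's loop is cut off above; the following sketch shows what "read bytes and decode them" could look like in full. It is not the answerer's exact code. The helper name `read_tab_separated` and the temporary demo file are hypothetical.

```python
import os
import tempfile


def read_tab_separated(path, encoding="ISO-8859-1"):
    """Read raw bytes line by line, decode each line, split on tabs."""
    rows = []
    with open(path, "rb") as f:
        for raw_line in f:
            decoded = raw_line.decode(encoding)
            rows.append(decoded.rstrip("\r\n").split("\t"))
    return rows


# Demo file: 0xe9 is "é" in ISO-8859-1.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"caf\xe9\tbar\n")

rows = read_tab_separated(path)
os.remove(path)
print(rows)
```

In Python 3, opening the file in text mode with `open(path, encoding="ISO-8859-1")` should give the same decoded lines without an explicit `decode()` call.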

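Python UTF-16 CSV reader

I have a UTF-16 CSV file that I have to read. The Python csv module does not seem to support UTF-16. I am using Python 2.7.2. The CSV file I need to parse is huge, with several GB of data.

Below are the answers to John Machin's questions:

    print repr(open('test.csv', 'rb').read(100))

Output for a test.csv whose content is just abc:

    '\xff\xfea\x00b\x00c\x00'

I think the csv file was created on a Windows machine in the USA. I am using Mac OS X Lion. If I use the code provided by phihag with a test.csv containing one record. Sample test.csv content used. Below is print repr(open(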

reader Python self csv utf-16
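
The question targets Python 2.7.2, and the code from phihag is not shown in this excerpt. As a rough sketch only, assuming Python 3 is an option: there the csv module works on decoded text, so the file can be opened with `encoding="utf-16"` and streamed row by row, which matters for a multi-gigabyte file. The generator name `iter_utf16_csv` is hypothetical; the demo bytes are the ones printed in the question.

```python
import csv
import os
import tempfile


def iter_utf16_csv(path):
    """Yield CSV rows one at a time without loading the whole file."""
    # The "utf-16" codec reads the byte order mark to pick endianness.
    # newline="" lets the csv module handle line endings itself.
    with open(path, "r", encoding="utf-16", newline="") as f:
        for row in csv.reader(f):
            yield row


# Demo file containing BOM + "abc", as shown in the question.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(b"\xff\xfea\x00b\x00c\x00")

rows = list(iter_utf16_csv(path))
os.remove(path)
print(rows)
```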
