I'm using the optparse module to retrieve a string value, but it only gives me byte strings (str), not unicode ones.
So let's say I start my script with:
./someScript --some-option ééééé
Because French characters such as 'é' arrive as str, they trigger a UnicodeDecodeError when the code reads them:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 99: ordinal not in range(128)
I played around a bit with the unicode built-in function, but either I get an error, or the character disappears:
>>> unicode('é')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> unicode('é', errors='ignore')
u''
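For reference, both results above look consistent with unicode() falling back to the ASCII codec when no encoding is given. Here is a minimal sketch, assuming the terminal sends 'é' as the UTF-8 bytes 0xc3 0xa9 (0xc3 is the byte named in the error message). The variable names are only illustrative.

```python
# Assumption: 'é' reaches the script as these two UTF-8 bytes.
raw = b'\xc3\xa9'

# Decoding as ASCII fails, because 0xc3 is outside range(128).
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# With errors='ignore' the undecodable bytes are dropped, leaving nothing.
ignored = raw.decode('ascii', 'ignore')

# Naming the real encoding yields a single text character.
text = raw.decode('utf-8')

print(ascii_failed, repr(ignored), len(text))
```

If the terminal uses a different encoding (Latin-1, for instance), the bytes and the codec name would differ.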
Is there anything I can do to get unicode/UTF-8 strings out of optparse?
The string itself can be retrieved and printed fine. The error only occurs later, when I use that string with SQLite (through the APSW module): cursor.execute("...") apparently tries to convert it to unicode.
Here is a sample program that causes the error:
#!/usr/bin/python
# coding: utf-8
import os, sys, optparse
parser = optparse.OptionParser()
parser.add_option("--some-option")
(opts, args) = parser.parse_args()
print unicode(opts.some_option)
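One possible direction, sketched here under assumptions rather than as a confirmed fix: decode the option value explicitly before handing it to anything that expects unicode. The helper names to_text and parse are hypothetical, and the 'utf-8' default assumes a UTF-8 terminal. sys.stdin.encoding or locale.getpreferredencoding() may be a better source for the codec name.

```python
import optparse


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings, pass everything else through."""
    if isinstance(value, bytes):  # on Python 2, bytes is the same type as str
        return value.decode(encoding)
    return value  # already text, or None when the option was not given


def parse(argv):
    """Parse argv (without the program name) and return --some-option as text."""
    parser = optparse.OptionParser()
    parser.add_option("--some-option")
    opts, args = parser.parse_args(argv)
    return to_text(opts.some_option)


if __name__ == '__main__':
    import sys
    value = parse(sys.argv[1:])
    print(repr(value))
```

With this, the value passed on to cursor.execute would already be text, so no implicit ASCII decode should be needed at that point.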