<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mithro rants about stuff &#187; $LANG</title>
	<atom:link href="http://blog.mithis.net/archives/tag/lang/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.mithis.net</link>
	<description></description>
	<lastBuildDate>Fri, 16 Dec 2011 06:02:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>$#%#! UTF-8 in Python</title>
		<link>http://blog.mithis.net/archives/python/91-utf-8-in-python</link>
		<comments>http://blog.mithis.net/archives/python/91-utf-8-in-python#comments</comments>
		<pubDate>Mon, 19 Jan 2009 03:54:08 +0000</pubDate>
		<dc:creator>mithro</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[$LANG]]></category>
		<category><![CDATA[encoding]]></category>
		<category><![CDATA[sys.stdout]]></category>
		<category><![CDATA[utf-8]]></category>
		<category><![CDATA[utf8]]></category>

		<guid isPermaLink="false">http://blog.mithis.net/?p=91</guid>
		<description><![CDATA[This is not a post about using UTF-8 properly in Python, but about doing evil, evil things. Python dutifully respects the $LANG environment variable on the terminal. It turns out that a lot of the time this variable is totally wrong: it&#8217;s set to something like C even though the terminal uses UTF-8. The problem [...]]]></description>
			<content:encoded><![CDATA[<p>This is <b>not</b> a post about using UTF-8 properly in Python, but about doing <i>evil, evil</i> things.</p>
<p>Python dutifully respects the $LANG environment variable on the terminal. It turns out that a lot of the time this variable is totally wrong: it&#8217;s set to something like C even though the terminal uses UTF-8.</p>
<p>The problem is that there is no easy way to change a file&#8217;s encoding after it&#8217;s open. Well, until this horrible hack! The following Python 2 code forces the output encoding of stdout to UTF-8 even if Python was started with LANG=C.</p>
<blockquote>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;"># License: MIT</span>
<span style="color: #ff7700;font-weight:bold;">try</span>:
    <span style="color: #ff7700;font-weight:bold;">print</span> u<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\u</span>263A&quot;</span>
<span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">Exception</span>, e:
    <span style="color: #ff7700;font-weight:bold;">print</span> e
&nbsp;
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">stdout</span>.<span style="color: black;">encoding</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Call the Python 2 C API directly; PyFile_SetEncoding returns 0 on failure.</span>
<span style="color: #ff7700;font-weight:bold;">from</span> ctypes <span style="color: #ff7700;font-weight:bold;">import</span> pythonapi, py_object, c_char_p
PyFile_SetEncoding = pythonapi.<span style="color: black;">PyFile_SetEncoding</span>
PyFile_SetEncoding.<span style="color: black;">argtypes</span> = <span style="color: black;">&#40;</span>py_object, c_char_p<span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> PyFile_SetEncoding<span style="color: black;">&#40;</span><span style="color: #dc143c;">sys</span>.<span style="color: black;">stdout</span>, <span style="color: #483d8b;">&quot;UTF-8&quot;</span><span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">ValueError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;could not set stdout encoding to UTF-8&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">try</span>:
    <span style="color: #ff7700;font-weight:bold;">print</span> u<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\u</span>263A&quot;</span>
<span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">Exception</span>, e:
    <span style="color: #ff7700;font-weight:bold;">print</span> e</pre></div></div>

</blockquote>
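<p>The hack above relies on PyFile_SetEncoding, which only exists in Python 2. As a rough sketch of the same idea on Python 3.7 or later (an assumption, not part of the original hack), an already-open text stream can be switched in place with the standard reconfigure() method. The names force_utf8 and demo are made up for illustration, and an ASCII-only wrapper stands in for a stdout that was opened under LANG=C.</p>

```python
import io


def force_utf8(stream):
    """Switch an already-open text stream to UTF-8 in place (Python 3.7+)."""
    # reconfigure() flushes pending output before swapping the encoder.
    stream.reconfigure(encoding="utf-8")
    return stream


def demo():
    # Stand-in for a stdout opened under LANG=C: an ASCII-only text wrapper.
    raw = io.BytesIO()
    fake_stdout = io.TextIOWrapper(raw, encoding="ascii")

    # Before the change, the smiley cannot be encoded.
    try:
        fake_stdout.write("\u263A")
        fake_stdout.flush()
        failed_before = False
    except UnicodeEncodeError:
        failed_before = True

    # After the change, the same write goes through as UTF-8 bytes.
    force_utf8(fake_stdout)
    fake_stdout.write("\u263A")
    fake_stdout.flush()
    return failed_before, raw.getvalue()
```

<p>In a real script the call would presumably be force_utf8(sys.stdout), made once at startup before anything is printed.</p>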
]]></content:encoded>
			<wfw:commentRss>http://blog.mithis.net/archives/python/91-utf-8-in-python/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

