{ Category Archives } Python

Reading cookies from most Firefox versions…

Yesterday, I wrote about how to read the cookies from Firefox 3.0 in Python. This code snippet extends the previous example by adding code which finds the cookie file on different operating systems (Windows, Linux and Mac OS X). Hope this helps people who need to do this.

#! /usr/bin/env python
# Reading the cookies from Firefox/Mozilla. Supports Firefox 3.0 and Firefox 2.x
#
# Author: Noah Fontes <nfontes AT cynigram DOT com>, 
#         Tim Ansell <mithro AT mithis DOT com>
# License: MIT
 
def sqlite2cookie(filename):
    from cStringIO import StringIO
    from pysqlite2 import dbapi2 as sqlite
 
    con = sqlite.connect(filename)
 
    cur = con.cursor()
    cur.execute("select host, path, isSecure, expiry, name, value from moz_cookies")
 
    ftstr = ["FALSE","TRUE"]
 
    s = StringIO()
    s.write("""\
# Netscape HTTP Cookie File
# http://www.netscape.com/newsref/std/cookie_spec.html
# This is a generated file!  Do not edit.
""")
    for item in cur.fetchall():
        s.write("%s\t%s\t%s\t%s\t%s\t%s\t%s\n" % (
            item[0], ftstr[item[0].startswith('.')], item[1],
            ftstr[item[2]], item[3], item[4], item[5]))
 
    s.seek(0)
 
    cookie_jar = cookielib.MozillaCookieJar()
    cookie_jar._really_load(s, '', True, True)
    return cookie_jar
 
import cookielib
import os
import sys
import logging
import ConfigParser
 
# Set up cookie jar paths
def _get_firefox_cookie_jar (path):
    profiles_ini = os.path.join(path, 'profiles.ini')
    if not os.path.exists(path) or not os.path.exists(profiles_ini):
        return None
 
    # Open profiles.ini and read the path for the first profile
    profiles_ini_reader = ConfigParser.ConfigParser()
    profiles_ini_reader.readfp(open(profiles_ini))
    profile_name = profiles_ini_reader.get('Profile0', 'Path', True)

    profile_path = os.path.join(path, profile_name)
    if not os.path.exists(profile_path):
        return None

    # Prefer the SQLite store (Firefox 3.0), then the text file (Firefox 2.x)
    for cookie_file in ('cookies.sqlite', 'cookies.txt'):
        cookie_path = os.path.join(profile_path, cookie_file)
        if os.path.exists(cookie_path):
            return cookie_path
    return None
 
def _get_firefox_nt_cookie_jar ():
    # See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/473846
    try:
        import _winreg
        import win32api
    except ImportError:
        logging.error('Cannot load winreg -- running windows and win32api loaded?')
        return None
    key = _winreg.OpenKey(_winreg.HKEY_CURRENT_USER, r'Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders')
    try:
        # ret is a (value, registry type) pair
        ret = _winreg.QueryValueEx(key, 'AppData')
    except WindowsError:
        return None
    else:
        key.Close()
        if ret[1] == _winreg.REG_EXPAND_SZ:
            result = win32api.ExpandEnvironmentStrings(ret[0])
        else:
            result = ret[0]
 
    # profiles.ini sits in the Firefox directory itself, next to Profiles
    return _get_firefox_cookie_jar(os.path.join(result, r'Mozilla\Firefox'))
 
def _get_firefox_posix_cookie_jar ():
    return _get_firefox_cookie_jar(os.path.expanduser(r'~/.mozilla/firefox'))
 
def _get_firefox_mac_cookie_jar ():
    # profiles.ini sits in the Firefox directory itself, next to Profiles
    result = _get_firefox_cookie_jar(os.path.expanduser(r'~/Library/Mozilla/Firefox'))
    if result is None:
        result = _get_firefox_cookie_jar(os.path.expanduser(r'~/Library/Application Support/Firefox'))
    return result
 
FIREFOX_COOKIE_JARS = {
    'nt': _get_firefox_nt_cookie_jar,
    'posix': _get_firefox_posix_cookie_jar,
    'mac': _get_firefox_mac_cookie_jar
}
 
cookie_jar = None
try:
    cookie_jar = FIREFOX_COOKIE_JARS[os.name]()
except KeyError:
    cookie_jar = None
 
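path = raw_input('Path to cookie jar file [%s]: ' % cookie_jar)
if path.strip():
    # Some input specified, set it
    cookie_jar = os.path.realpath(os.path.expanduser(path.strip()))

if cookie_jar is None:
    # Nothing was detected and nothing was entered
    sys.exit('No cookie jar file found')

if cookie_jar.endswith('.sqlite'):
    cookie_jar = sqlite2cookie(cookie_jar)
else:
    # Firefox 2.x text file: create the jar, then actually read the file
    cookie_jar = cookielib.MozillaCookieJar(cookie_jar)
    cookie_jar.load()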

Edit: The latest version of this code can be found at http://blog.mithis.com/cgi-bin/gitweb.cgi and includes numerous fixes and updates.
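
For readers on Python 3, the profile lookup above can be sketched with the standard library alone. This is an illustration under assumptions, not the updated code from the link: `find_cookie_file` is a name made up here, and `firefox_dir` is assumed to be the directory that holds profiles.ini.

```python
import configparser
import os


def find_cookie_file(firefox_dir):
    """Return the cookie file of the first profile under firefox_dir, or None.

    firefox_dir is assumed to be the directory that contains profiles.ini.
    """
    profiles_ini = os.path.join(firefox_dir, "profiles.ini")
    if not os.path.isfile(profiles_ini):
        return None

    parser = configparser.ConfigParser(interpolation=None)
    parser.read(profiles_ini)
    if not parser.has_option("Profile0", "Path"):
        return None

    profile_path = parser.get("Profile0", "Path")
    # IsRelative=1 means Path is relative to the profiles.ini directory.
    if parser.get("Profile0", "IsRelative", fallback="1") == "1":
        profile_path = os.path.join(firefox_dir, profile_path)

    # Prefer the SQLite store (Firefox 3.x), then the Firefox 2.x text file.
    for name in ("cookies.sqlite", "cookies.txt"):
        candidate = os.path.join(profile_path, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

On Linux this would be called as `find_cookie_file(os.path.expanduser("~/.mozilla/firefox"))`. Like the original, it only looks at the first profile.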

$#%#! UTF-8 in Python

This is not a post about using UTF-8 properly in Python, but doing evil, evil things.

Python dutifully respects the $LANG environment variable on the terminal. It turns out that a lot of the time this variable is totally wrong: it’s set to something like C even though the terminal actually uses UTF-8.

The problem is that there is no easy way to change a file’s encoding after it’s open. Well, not until this horrible hack! The following code forces the output encoding of stdout to UTF-8 even if Python was started with LANG=C.

# License: MIT
try:
    print u"\u263A"
except Exception, e:
    print e
 
import sys
print sys.stdout.encoding
 
from ctypes import pythonapi, py_object, c_char_p
PyFile_SetEncoding = pythonapi.PyFile_SetEncoding
PyFile_SetEncoding.argtypes = (py_object, c_char_p)
if not PyFile_SetEncoding(sys.stdout, "UTF-8"):
    raise ValueError
 
try:
    print u"\u263A"
except Exception, e:
    print e
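
The hack above depends on Python 2's C-level file objects, so it does not carry over to Python 3. A less evil sketch for Python 3 is to wrap the underlying byte stream in a new text layer. The helper name `utf8_text_stream` is made up for this example.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)
```

For stdout the binary stream would be `sys.stdout.buffer`. On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` changes the encoding in place without creating a second wrapper.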

Reading Firefox 3.x cookies in Python

I found the following code snippet on my hard drive today. It allows you to access Firefox 3.x cookies in Python. Firefox 3.x moved away from the older text file format to an SQLite database.

This code is useful if you want to access something behind an authentication gateway and you also access the page through your web browser. You can also use this code to convert an SQLite database into a cookie file that curl can read.

I didn’t write this code. It was written by Noah Fontes when we were doing some scraping of the Google Summer of Code website (before I joined Google).

#! /usr/bin/env python
# Protocol implementation for handling gsocmentors.com transactions
# Author: Noah Fontes nfontes AT cynigram DOT com
# License: MIT
 
import cookielib

def sqlite2cookie(filename):
    from cStringIO import StringIO
    from pysqlite2 import dbapi2 as sqlite
 
    con = sqlite.connect(filename)
 
    cur = con.cursor()
    cur.execute("select host, path, isSecure, expiry, name, value from moz_cookies")
 
    ftstr = ["FALSE","TRUE"]
 
    s = StringIO()
    s.write("""\
# Netscape HTTP Cookie File
# http://www.netscape.com/newsref/std/cookie_spec.html
# This is a generated file!  Do not edit.
""")
    for item in cur.fetchall():
        s.write("%s\t%s\t%s\t%s\t%s\t%s\t%s\n" % (
            item[0], ftstr[item[0].startswith('.')], item[1],
            ftstr[item[2]], item[3], item[4], item[5]))
 
    s.seek(0)
 
    cookie_jar = cookielib.MozillaCookieJar()
    cookie_jar._really_load(s, '', True, True)
    return cookie_jar
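
The conversion to a curl-readable cookie file mentioned above can be sketched as follows. This is a Python 3 variant that uses the standard `sqlite3` module instead of pysqlite2; the function name `sqlite_to_cookies_txt` is made up here, and the query assumes the same `moz_cookies` columns as the snippet above.

```python
import sqlite3

HEADER = "# Netscape HTTP Cookie File\n"


def sqlite_to_cookies_txt(sqlite_path, txt_path):
    """Dump moz_cookies from sqlite_path into a Netscape-format text file.

    Returns the number of cookies written.
    """
    con = sqlite3.connect(sqlite_path)
    try:
        rows = con.execute(
            "select host, path, isSecure, expiry, name, value from moz_cookies"
        ).fetchall()
    finally:
        con.close()

    with open(txt_path, "w", encoding="utf-8") as out:
        out.write(HEADER)
        for host, path, is_secure, expiry, name, value in rows:
            fields = [
                host,
                "TRUE" if host.startswith(".") else "FALSE",  # domain-wide flag
                path,
                "TRUE" if is_secure else "FALSE",
                str(expiry),
                name,
                value,
            ]
            out.write("\t".join(fields) + "\n")
    return len(rows)
```

The resulting file can then be passed to curl with `curl -b cookies.txt <url>`. It is safer to run this on a copy of cookies.sqlite, since the database may be locked while Firefox is running.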

Cool Python, Swapping two variables

Some people say “you learn something new every day” or something like that. Today someone on #python showed me a cool trick I never would have thought of on my own.

Often there is a time when you want to swap the contents of two variables. The most popular way to do this is using a third variable as shown below:

temp = a
a = b
b = temp

This looks sucky and doesn’t really express very well what you want to do. A much better way to do this in Python is with the following magic line:

a, b = b, a

Doesn’t that look so much better? And it is very clear to anyone who has used Python before what is going on. To think, I have been using Python for about 7 years now and never thought of doing that.
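
The trick works because the right-hand side is packed into a tuple before anything is assigned, and the tuple is then unpacked into the names on the left. A small sketch, with made-up values, showing that the same idea extends to more than two variables:

```python
a, b = 1, 2
a, b = b, a  # right-hand side is evaluated first, then unpacked
assert (a, b) == (2, 1)

# The same idea rotates three values without any temporaries.
x, y, z = "x", "y", "z"
x, y, z = y, z, x
assert (x, y, z) == ("y", "z", "x")
```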

Just thought I would share this tidbit.

Skimpy, Scheme in Python

For something a bit different, I decided to work on making embedding Scheme in Python easier. I’ve previously been using the cool PyScheme, however it hasn’t been updated in quite a long time (since 2004) and is quite slow.

The reason I would want to do something crazy like this is that Thousand Parsec uses a subset of Scheme called TPCL. This is used to transmit information from the server to clients about rules for creating designs. Servers also need to be able to parse TPCL for “dumb clients” which can’t parse TPCL for themselves.

Recent developments by DystopicFro on his Summer of Code project, a Ruleset Development Environment, have meant that he also needs the ability to parse TPCL (and specifically the ability to detect errors). This got us chatting about PyScheme and its inadequacies.

What I have decided to do is create a module called SchemePy (pronounced Skimpy). On platforms where speed is of no concern, we will fall back to using a modified version of PyScheme. However, we can also use C scheme systems such as Guile (or other libraries) to improve speed.

Why have multiple implementations? It stops us from using custom things in one Scheme implementation which are not compatible with other implementations. It also makes installation easier for the user, as they are much more likely to already have a compatible Scheme library installed. Different Scheme implementations also have different speed advantages.
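
One way to picture the fallback between implementations is a loader that tries each backend in order and keeps the first one that imports. This is only a sketch of the idea, not SchemePy's actual code: the module names in `BACKEND_MODULES` and the function `load_backend` are hypothetical.

```python
import importlib

# Hypothetical backend module names, fastest first. None of these are
# real SchemePy module names.
BACKEND_MODULES = ["schemepy_guile", "schemepy_mzscheme", "schemepy_pyscheme"]


def load_backend(candidates=BACKEND_MODULES):
    """Return the first backend module that imports cleanly, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except (ImportError, OSError):
            # ImportError: module missing. OSError: a ctypes wrapper could
            # not find its shared library. Either way, try the next one.
            continue
    return None
```

Putting the pure Python backend last means it is only used when no C library is available.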

So far I have got the Guile wrapper 95% working. It’s written mainly in Python using ctypes. I needed a small C helper module as well because of the extensive macros used by Guile. So far, you can convert between Guile and Python types easily, you can register Python functions into the Guile context, and exceptions are caught. There is also the ability to pass Python objects through the Scheme environment to Python functions. I would like to thank the guys who hang out on #guile for all their help; it has made doing this wrapper much easier.

I’m happy enough with the outcome. My guess is that it will be between 10 and 20 times faster than PyScheme, but I’ve yet to do any benchmarking. I’m going to move on to wrapping mzscheme soon enough. It should be much easier to do now that I have gotten most of the hard stuff sorted out. I think a lot of it will be common between the implementations.

What I really need to do is get a test suite working. Once I have more than one implementation working it will be very important to make sure that they all work the same way. The only way I can see to do that properly is to have a test suite which I can run every implementation against.
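
A shared test suite of this kind could be as simple as a table of expressions and expected values that is run against every backend. The sketch below assumes each backend can be reduced to a callable that takes Scheme source text and returns a Python value. `run_conformance` and the two stand-in evaluators are made up for the example.

```python
def run_conformance(backends, cases):
    """Run every case against every backend and return a list of failures.

    backends: dict mapping a name to a callable taking Scheme source text.
    cases: list of (source, expected_value) pairs.
    """
    failures = []
    for name, evaluate in sorted(backends.items()):
        for source, expected in cases:
            try:
                actual = evaluate(source)
            except Exception as exc:  # a crash counts as a failure too
                actual = exc
            if actual != expected:
                failures.append((name, source, expected, actual))
    return failures


# Stand-in evaluators so the sketch runs; real ones would wrap Guile etc.
CASES = [("(+ 1 2)", 3), ("(* 2 3)", 6)]
good = {"(+ 1 2)": 3, "(* 2 3)": 6}.get
buggy = {"(+ 1 2)": 3, "(* 2 3)": 5}.get

failures = run_conformance({"good": good, "buggy": buggy}, CASES)
```

Here only the `buggy` stand-in is reported, for the `(* 2 3)` case.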

One thing which might be really cool to investigate is using a system similar to lython, which compiles Lisp s-expressions directly to Python bytecode. If this was done well it should be the fastest method, as it would mean no type conversion needs to be done.
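
To give a feel for the idea, here is a toy sketch that compiles integer arithmetic s-expressions to a Python code object by way of the standard `ast` module. It is not how lython works and it covers nothing beyond `+`, `-`, `*` and `/` on integers. All function names are made up for the example.

```python
import ast
import re

OPS = {"+": ast.Add, "-": ast.Sub, "*": ast.Mult, "/": ast.Div}


def parse(source):
    """Turn '(+ 1 (* 2 3))' into nested lists: ['+', 1, ['*', 2, 3]]."""
    tokens = re.findall(r"[()]|[^\s()]+", source)

    def read(pos):
        token = tokens[pos]
        if token == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        try:
            return int(token), pos + 1
        except ValueError:
            return token, pos + 1  # an operator symbol

    expr, _ = read(0)
    return expr


def to_ast(expr):
    """Translate a parsed expression into a Python AST node."""
    if isinstance(expr, int):
        return ast.Constant(value=expr)
    op, first, *rest = expr
    node = to_ast(first)
    for arg in rest:  # (+ a b c) folds left into ((a + b) + c)
        node = ast.BinOp(left=node, op=OPS[op](), right=to_ast(arg))
    return node


def compile_sexpr(source):
    """Compile an arithmetic s-expression to a Python code object."""
    tree = ast.Expression(body=to_ast(parse(source)))
    ast.fix_missing_locations(tree)
    return compile(tree, "<sexpr>", "eval")
```

The result is ordinary bytecode, so `eval(compile_sexpr("(+ 1 (* 2 3))"))` runs at normal Python speed with plain Python integers and no conversion layer in between.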

Overall this has been quite a good learning experience. I have improved my ctypes skills quite a lot (this wasn’t my first ctypes wrapper, that being libmng-py, a Python wrapper around libmng). I also understand how Scheme works quite a lot better now.