Tim Hatch


Github Gist 14 Aug, 2008

Yet Another Pastebin

I found out at the SBonRails meeting last night that Github is now offering a pastebin. I know what you’re thinking: do we really need another pastebin? But this one is fast, not annoying, and provides the history of the evolution of a paste as a git repo. Pretty neat, and they use Pygments (a Python-based syntax highlighter I contribute to) for colorization, from what I assume is a Rails app. I’d love to see more about how they integrated it (and some sort of API that lets me grab the metadata for which filetype a given paste is, for running daily tests, but that’s another post).

Home Desk Setup 14 Aug, 2008

Berto asked about the state of my home work area, so here’s a quickly-annotated photo. The two desktops stay on all the time but with some powersaving (Moya could sleep if I play with WOL and get it working). Aeryn Sun is my G4 PowerBook (I joke it’s a SlowBook Pro), which is still fast enough for a whole lot of tasks. (Actually I’m writing this post on it!) I’m slowly becoming a Mac shop as machines are becoming fast enough to run all my testing VMs on a single Linux box.

Desk Photo, August 2008

The printer is an HL-2170w, which has a quaint Web interface for configuring it but works great with the Mac. You can’t really see the SUA1000XL, which is the most awesome UPS I’ve ever been able to hold off the ground without help. It’s around 55 pounds and able to run the whole setup for 39 minutes (something on the order of 38% load; when I leave and the monitors powersave, it goes down to 10% load and around 90 minutes runtime). Behind the printer is the D-Link gigabit switch, which works great and supports jumbo frames, and next to it (also not visible) is the first photo backup drive (HFS+, covered in a couple of paragraphs) and two external ReiserFS data drives. Of course the printer is not on a UPS, since it draws too much on startup (the main UPSes actually kick on because the line voltage drops too much at that point, since I only have one circuit to my bedroom).

Backup Flowchart

The backup scheme I use for photos is pretty solid, I think. Coming out of the camera, I keep the photos on the flash card until they’re backed up fully in Texas, at which point I mark the card as erasable. The last couple of months stay on my laptop, and everything before that is then on the external hard drive and in Texas, on a machine named Toothpaste (simply for the sheer joy of asking people “Where’s Toothpaste?” in everyday conversation). Yearly it also gets burned to DVD and the DVDs are kept in a box with the expectation that I’ll never have to use them. Toothpaste is a lowly Celeron 566 which is optimized to the hilt for low power consumption (around 25 W idle, 45 W with all disks spun up and transferring large amounts of data).

Astute readers who spend as much time at Ikea as I do might recognize the Galant desk surface and Vika Fagerlid legs. I actually got the desk surface first from the as-is area and decided on the legs later (as I was running short of shelf space). What you can’t see is that there is another, different leg at the back corner, from another series, which is 3/4” shorter than the others. The solution was pretty simple: I cut a piece of 1 × 4 and screwed it on (this probably qualifies me for a post on Ikea Hacker, since “cut the allen wrench and use in a drill” made it into a post last month).

TeamsServer 12 Aug, 2008

Strange. I was looking for libjpeg headers on my Leopard machine today and noticed /usr/share/wikid, which has some files unreadable by regular users that appear to be help files for some sort of collaboration server. Part of it is a Subversion checkout of the local path file:///Library/Collaboration/Groups/help/revisions, which of course doesn’t exist, and the owner (_teamsserver) has a home directory of /var/teamsserver, which also doesn’t exist.

This doesn’t seem to be officially exposed anywhere on the Leopard client, but I can find references to a collaboration server by searching online. Strange.

Quick Howto for PIL on Leopard 12 Aug, 2008

If you are trying to get PIL working and get messages about missing jpeg or freetype support, here's the quickest way to get going:

  • Download PIL 1.1.6 source package and have the Developer Tools already installed
  • Patch setup.py with this patch so it can find the Freetype you already have. (patch -p0 < leopard_freetype2.diff)
  • sudo apt-get install libjpeg if you have fink (otherwise, build by hand and adjust paths)

You can use it without jpeg support but im.show() has jpeg hardcoded as the filetype so you would need to work around that.

Spaceship Operator 08 Aug, 2008

Comic Sans Spaceship Operator

Berto thought it necessary to combine two of my peeves (Perl and Comic Sans MS) into one image and send it to me under an innocuous name like “Picture 1.png” in Skype. I closed it as soon as possible, but it just haunted me from the Leopard Dock when I returned from lunch.

undefined symbol: Py_InitModule4_64 07 Aug, 2008

We ran into this error on Debian this morning, with a system that had more than one Python.

Typically this means that an extension was compiled against the wrong Python.h: the calling Python interpreter and the extension don't use the same version (one is Python 2.4, the other 2.5). They were both 64-bit, but the name of the symbol changed with PEP 353: http://www.python.org/dev/peps/pep-0353/

Some commands to help debug this are ldd (to show dynamic library linking) and nm (to list the symbols a binary exports or needs).
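
As a rough sketch of the nm check, assuming GNU nm (its -D flag lists dynamic symbols) and a current Python to run the script: the helper names initmodule_symbols and check_extension are made up for illustration. The idea is simply to see which Py_InitModule4 variant a compiled extension refers to, then compare that with what the interpreter binary or libpython provides.

```python
import subprocess


def initmodule_symbols(nm_output):
    """Return the Py_InitModule4* symbol names found in nm output."""
    found = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # The symbol name is the last column, e.g. "U Py_InitModule4_64".
        if fields and fields[-1].startswith("Py_InitModule4"):
            found.add(fields[-1])
    return sorted(found)


def check_extension(path):
    """Run `nm -D` on a shared object and list its Py_InitModule4 variants.

    Assumes GNU nm is on the PATH; `path` is whatever .so you suspect.
    """
    out = subprocess.check_output(["nm", "-D", path], universal_newlines=True)
    return initmodule_symbols(out)
```

If the extension lists Py_InitModule4_64 but the interpreter loading it only provides plain Py_InitModule4, the two were most likely built for different Python versions.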

Earthquake Yesterday 30 Jul, 2008

Looks like we had a 5.6 earthquake in the greater L.A. area yesterday. We felt it at the office, but it wasn’t much since we are around 100 miles away (a bit like I was tapping my leg, only I wasn’t). So I have seen a wildfire, evacuation, a heat wave, and an earthquake. Now the only thing I need to be fully acclimated, the locals tell me, is a flood. Good thing I live on a hill!

Simulating a Rebase in Mercurial 26 Jul, 2008

I ran into this while doing some work with Pygments this morning — I continued working in a local repo without consciously realizing that a merge node had been created from my work yesterday. If I had just pulled that merge node before committing this morning, everything would be pretty. But it wasn’t, and I wanted to avoid merging the merge which would then cause another merge upstream. Forgive the ascii art.

       *   $PUSH1
       |\
       | * 
       |  \
       |   * $PUSH1
MERGE1 *--/ \
       |     \
       |\     * $CUR2, I don't want this.
       | \
       |  * $PUSH2, I want this, so I can...
MERGE2 *-/

Here’s how:

Background

I pushed $PUSH1, which got merged with $TRUNK, creating $MERGE1.

Some local changes (on top of $PUSH1) later, I am ready to push again (I have $CUR2 and want to end up with $PUSH2 on top of $MERGE1).

First, enable the transplant extension.

[extensions]
transplant=

Go ahead and pull so you have multiple heads in your local repo. Don’t push any of the changesets in ($MERGE1:$CUR2] to your personal remote, or you’ll have to go through the extra set of steps below first.

Rebasing in HG

# we start with two heads, $MERGE1 and $CUR2, and no uncommitted changes
hg update -C $MERGE1
hg transplant --branch $CUR2
# at this point, your $MERGE1 head is updated to what we'll call $MERGE2
cd ..

# this is the easiest way to remove $CUR2's head
hg clone -r $MERGE2 repo new_repo
mv repo old_repo
mv new_repo repo
cd repo
hg push

Extra steps if you pushed

If you did push something on $CUR2’s line, you’ll need to fix that first.

ssh remote

hg clone -r$MERGE1 remote new_remote
cp remote/.hg/hgrc new_remote/.hg/
# set permissions like they were before
mv remote old_remote
mv new_remote remote

A Good Profiling Decorator 20 Jul, 2008

To see where slowdowns are in your Python programs, Python provides some low-level hooks (sys.settrace) and a module to evaluate timings (profile and its faster cousin cProfile). It’s pretty easy to run these on your whole program (on Ubuntu, make sure you install python-profiler):

slow.py

def slow():
    x = "Hello world" * 100000
    y = ''.join(x)

slow()

Invocation

$ python -m cProfile ./slow.py

         6 function calls in 0.067 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.067    0.067 <string>:1(<module>)
        1    0.000    0.000    0.066    0.066 slow.py:1(<module>)
        1    0.001    0.001    0.066    0.066 slow.py:1(slow)
        1    0.001    0.001    0.067    0.067 {execfile}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.065    0.065    0.065    0.065 {method 'join' of 'str' objects}

This is a pretty powerful tool, so look into the options for sorting — in this case the slowdown is in str.join() which would be highlighted by sorting on tottime (add -s time before the file to run).
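
The same sorting can be done from inside a program. Here is a minimal sketch, assuming a current Python where cProfile and pstats are in the standard library; profile_sorted is a made-up helper name, and slow() is the function from slow.py changed only to return its result.

```python
import cProfile
import io
import pstats


def slow():
    x = "Hello world" * 100000
    return ''.join(x)


def profile_sorted(func, sort_key="tottime"):
    """Profile one call of func and return the report sorted by sort_key."""
    pr = cProfile.Profile()
    pr.runcall(func)
    buf = io.StringIO()
    # pstats.Stats accepts a Profile object directly, no stats file needed.
    pstats.Stats(pr, stream=buf).sort_stats(sort_key).print_stats()
    return buf.getvalue()


report = profile_sorted(slow)
print(report)
```

With the report sorted on tottime, the str.join() line should float to the top instead of being buried in name order.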

The built-in profile module doesn’t work so well on threaded programs (which I was working on last week), or when the entire app waits a lot on I/O (ditto), or when you’re running on a really slow CPU (ditto). So I came up with a good profiling decorator, which lets you see profiling stats for one function at a time:

profiler.py

try:
    import cProfile as profile
except ImportError:
    import profile

import os.path

__all__ = ['profiled']

def profiled(path, multi=True):
    """
    Decorator to allow individual functions to be profiled, without profiling
    the whole program.  This allows for much more targeted profiling, which is
    necessary for threaded programs or if performance becomes an issue.

    path: where to write the stats file; its directory must already exist.
    multi: if True, adds a sequential number to each profile run to prevent
           name collisions.
           if False, the last invocation wins.
    """
    # This extra layer of indirection is so the decorator can accept arguments
    # When the user doesn't provide arguments, the first arg is the function,
    # so detect that and show an error.
    if not isinstance(path, (str, unicode)):
        raise Exception("This decorator takes a path argument")
    # A bare filename has an empty dirname; treat that as the current dir.
    d = os.path.dirname(path) or os.curdir
    assert os.path.exists(d)
    p, q = os.path.splitext(path)
    i = [0]

    def decorator(func):
        def newfunc(*args, **kwargs):
            pr = profile.Profile()
            ret = pr.runcall(func, *args, **kwargs)
            if multi:
                fn = "%s.%d%s" % (p, i[0], q)
            else:
                fn = path
            i[0] += 1
            pr.dump_stats(fn)
            return ret
        # Be well-behaved
        newfunc.__name__ = func.__name__
        newfunc.__doc__ = func.__doc__
        newfunc.__dict__.update(func.__dict__)
        return newfunc
    return decorator
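
The listing above targets Python 2 (it refers to unicode). On Python 3 the same idea might look like the sketch below. This is an adaptation under stated assumptions, not a drop-in replacement: functools.wraps stands in for the manual attribute copying, itertools.count for the one-element list, and the stats are dumped in a finally block, so a call that raises still leaves a stats file (the original only dumps on success).

```python
import cProfile
import functools
import itertools
import os.path


def profiled(path, multi=True):
    """Profile each call of the decorated function, dumping stats to path."""
    if not isinstance(path, str):
        raise TypeError("This decorator takes a path argument")
    directory = os.path.dirname(path) or os.curdir
    if not os.path.isdir(directory):
        raise ValueError("directory does not exist: %r" % directory)
    base, ext = os.path.splitext(path)
    counter = itertools.count()

    def decorator(func):
        @functools.wraps(func)  # copies __name__, __doc__ and __dict__
        def wrapper(*args, **kwargs):
            pr = cProfile.Profile()
            try:
                return pr.runcall(func, *args, **kwargs)
            finally:
                # Number the files per call when multi is set,
                # otherwise keep overwriting the same path.
                n = next(counter)
                fn = "%s.%d%s" % (base, n, ext) if multi else path
                pr.dump_stats(fn)
        return wrapper
    return decorator
```

Like the original, the counter is not protected by a lock, so two threads finishing at the same moment could in principle race for a file name.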

Invocation

Edit your source file that needs profiling:

from profiler import profiled
@profiled("/tmp/func.stats")
def slow(): ...

Now just run it as usual (python slow.py). /tmp/func.0.stats contains the first run, so let’s look at it:

$ python -c 'import pstats; pstats.Stats("/tmp/func.0.stats").sort_stats("time").print_stats(30)'
Sun Jul 20 10:38:42 2008    /tmp/func.0.stats

         3 function calls in 0.051 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.050    0.050    0.050    0.050 {method 'join' of 'str' objects}
        1    0.001    0.001    0.051    0.051 slow.py:3(slow)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The argument to print_stats controls how many lines are printed, and there are alternate sorting keys (make sure you specify one, otherwise it autoselects “random”): See http://docs.python.org/lib/profile-stats.html
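
If the one-liner gets unwieldy, the same thing fits in a small helper. This is only a sketch, assuming a current Python; format_stats is a made-up name, and the path is whatever stats file the decorator wrote.

```python
import io
import pstats


def format_stats(stats_path, sort_key="time", limit=30):
    """Load a dumped stats file and return the top entries as text."""
    buf = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buf)
    # Always sort explicitly, then cap the number of lines printed.
    stats.sort_stats(sort_key).print_stats(limit)
    return buf.getvalue()
```

Calling print(format_stats("/tmp/func.0.stats")) should give a listing along the lines of the one above, and passing "cumulative" as sort_key orders by cumulative time instead.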

Update: Found a more complete (including coverage) decorator by Marius Gedminas whom I met briefly at Europython 2007, http://mg.pov.lt/blog/profiling.html.

Wireless Network Names 13 Jul, 2008

I’m amused by the names of wireless networks that people come up with. My first example is at my old apartment in Denton, and everyone was getting FIOS routers with random-looking 5-character SSIDs. “Ninjas > Pirates” I’m not sure if I agree with, but I’m glad that they’re at least opinionated enough to broadcast that one. Mine is minihub, on a WRT54GC so it’s small.

This one is at my new office, where there’s a Rails shop across the hall which actually has a scrum-style burndown chart on the wall. I’m assuming they also do pair programming, but don’t let people working alone on that network. (Given the existence of “pair”, “Pairs”, and “solo”, even the pair programmers must be separated somehow, maybe if they’re the top pair they get their own network?) nutricat’em (no apostrophe, just because), and gumnut are ours, and we don’t know who has the Apple one.