
Overwrite save() in Django to Sequentially Replace HTML Elements

The low-down on how I automatically number my footnotes.


Friday, 2011-03-18 | demongin.org, Django, Programming, Testing

If it don't fit, force it: if it breaks, it needed replacing in the first place.

Recently, in an effort to write more clearly, I have been writing essays with obscene amounts of footnotes.

Due to the simplicity of the editor[1] that I use to write these essays, this obscene amount of footnotes necessitated an obscene amount of manual tweaking: every time I wanted to add or remove a footnote in the middle of an essay, I had to 1.) find and 2.) manually increment or decrement all of the subsequent footnote numbers.

Yuck.

In order to save myself some time (and sanity) I decided to make two changes to my process:

  1. Indicate footnotes with a special, unusual tag and
  2. Overwrite the save() function of the Django model that holds the bodies of my essays.

I decided to solve my problem in this way because I wanted footnotes to keep track of themselves automatically, in a way that survives repeated editing (i.e. opening, saving, closing, re-opening) of an essay: it's not a perfect (or even a "best practice") design, but it fits my style/needs.

Without further ado:
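    # overwrite save() to automatically handle footnotes
    def save(self, *args, **kwargs):
        from re import compile, finditer, DOTALL
        tag = "foot"
        pattern = compile(r"<%s>(.*?)</%s>" % (tag, tag), DOTALL)

        # The match offsets refer to the original string, so build the new
        # body from slices of the original instead of re-slicing a string
        # whose length has already changed.
        pieces = []
        last = 0
        count = 1
        s = self.body
        for m in finditer(pattern, s):
            pieces.append(s[last:m.start()])                  # text before the tag
            pieces.append("<%s>%s</%s>" % (tag, count, tag))  # renumbered tag
            last = m.end()
            count += 1
        pieces.append(s[last:])                               # text after the last tag

        self.body = "".join(pieces)
        super(Post, self).save(*args, **kwargs)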

Basically, in my main model (which I call "Post", as in "blog post")[2], I save a field called "body":

    from django.db import models

    class Post(models.Model):
        body = models.TextField()

I like to write my own HTML into my posts (because I'm kind of a formatting micromanager), so the "body" field saves my essays and the raw HTML I write into them.

So when I save, I overwrite the built-in save() function of the "Post" model to do the following:
  1. compile a simple regex that captures my footnote tag and whatever is inside it
  2. assign the "body" field to "s" (for more legible code)
  3. call re.finditer() with my pattern and my body: this gets me an iterator of match objects with start() and end() methods that I can use to get string slice offsets
  4. iterate over the match objects, using each object's start() and end() offsets to slice my body, swapping out whatever was matched for a value that I increment at the end of the loop
Essentially, I write a paragraph like this:
This is a sentence. Here is a second sentence with a footnote<foot>sadfasdfsdfasdf</foot>. Here is a third.
...and, when I save, I save a string that looks like this:
This is a sentence. Here is a second sentence with a footnote<foot>1</foot>. Here is a third.
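
The round trip above can be tried outside Django with a plain function. What follows is only a minimal sketch: the name renumber_footnotes and the sample strings are made up for illustration and are not part of the original model code. It also shows why the tags are kept in the saved body: adding a footnote later and saving again renumbers everything.

```python
import re

TAG = "foot"
PATTERN = re.compile(r"<%s>(.*?)</%s>" % (TAG, TAG), re.DOTALL)


def renumber_footnotes(body):
    """Replace the contents of each <foot> tag with its position (1, 2, 3...)."""
    pieces = []
    last = 0
    for count, m in enumerate(PATTERN.finditer(body), 1):
        pieces.append(body[last:m.start()])               # text before this tag
        pieces.append("<%s>%s</%s>" % (TAG, count, TAG))  # renumbered tag
        last = m.end()
    pieces.append(body[last:])                            # text after the last tag
    return "".join(pieces)


# First "save": placeholder text inside the tags becomes 1, 2.
first = renumber_footnotes("One<foot>x</foot>. Two<foot>y</foot>.")
# Later edit: a new footnote is added in front, then the body is "saved" again.
second = renumber_footnotes("Zero<foot>new</foot>. " + first)
print(first)
print(second)
```

Running the function on an already-numbered body leaves it unchanged, so repeated saves are harmless.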
Pretty simple stuff, but not entirely un-clever: sequentially substituting values in a string is something that doesn't come up often and, when it does, it usually involves a generator expression (or something similarly complicated).
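
For comparison, here is a sketch of one common alternative, which is not the approach used in this post: re.sub() accepts a function as the replacement, and calls it once per match from left to right. Pairing that with itertools.count gives the same sequential numbering. The name renumber_with_sub is hypothetical.

```python
import re
from itertools import count

PATTERN = re.compile(r"<foot>(.*?)</foot>", re.DOTALL)


def renumber_with_sub(body):
    """Number <foot> tags in order using re.sub() with a function replacement."""
    counter = count(1)
    # sub() calls the replacement function once for each match, left to right,
    # so next(counter) yields 1 for the first tag, 2 for the second, and so on.
    return PATTERN.sub(lambda m: "<foot>%d</foot>" % next(counter), body)


result = renumber_with_sub("a<foot>x</foot> b<foot>longer text</foot> c")
print(result)
```

The finditer() version makes the slicing explicit; the sub() version hides it, which is shorter but gives less control over what happens between matches.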

Finally, even if you don't want to overwrite one of your own Django save() functions to alter a string in a programmatic/sequential fashion like I did here, you can see how this might be applicable in other projects: finditer() is kind of an awesome (and somewhat obscure) function in the re module, and knowing how to use it might save you some pain elsewhere (particularly in the sysadmin realm, where sequential find/replace is more likely to occur).
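
As one hypothetical example of that kind of use, the sketch below uses finditer() to report the line number of every match in a block of text. The function name find_with_line_numbers and the sample config text are invented for illustration.

```python
import re


def find_with_line_numbers(text, pattern):
    """Return (line_number, matched_text) for every match of pattern in text."""
    results = []
    for m in re.finditer(pattern, text):
        # Count the newlines before the match to get a 1-based line number.
        line_number = text.count("\n", 0, m.start()) + 1
        results.append((line_number, m.group(0)))
    return results


# Made-up sample text standing in for a config file read from disk.
sample = "listen 80;\nserver_name a;\nlisten 443;\n"
hits = find_with_line_numbers(sample, r"listen \d+")
print(hits)
```

Because each match object carries its own start() and end() offsets, the same loop could be extended to rewrite the matches in order, as in the footnote code above.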



  1. i.e. a very lightly tweaked version of the standard Django admin and Chrome.
  2. I don't actually do my imports inside the save() function: they're just here for the sake of clarity in this blog post.