Using strace to solve a problem which didn't exist
For the next few minutes, I’d like you to forget that web servers can rewrite URLs. It’s the only way any of this is going to make sense.
I’d decided it was time to switch from WordPress to a statically generated site. So, about five years after everyone else, I jumped on the Jekyll bandwagon. At this point it’s easy. There are tools to import your content from a bunch of different blogging platforms, simple deployment options, and enough of a userbase for you to find answers when things don’t work.
Of course I’d chosen to make things difficult for myself. As a learning exercise, I decided I’d deploy to my own server[1] using Capistrano. I’m not a fan of link rot, so I spent some time making sure my post URLs followed the same structure. The last path which didn’t match my old WordPress setup was the RSS feed. Jekyll generates the feed at /feed.xml, whereas WordPress served it at /feed.
In my mind, the solution was obvious: add a symlink. For the most part, that works. Lighttpd will follow symlinks by default, and serve whatever file they point at. Sadly, without the file extension, Lighttpd’s MIME type detection doesn’t work. /feed.xml is served as text/xml, but /feed gets the default application/octet-stream.
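To see why the missing extension matters, here is a minimal sketch of extension-based Content-Type lookup. It is not Lighttpd's actual code: the `MIME_TYPES` table and the `guess_content_type` helper are hypothetical, and a real server ships a far larger mapping.

```python
import os.path

# Hypothetical, deliberately tiny extension table (an assumption for illustration).
MIME_TYPES = {".xml": "text/xml", ".html": "text/html"}
DEFAULT_TYPE = "application/octet-stream"


def guess_content_type(path: str) -> str:
    """Pick a Content-Type from the file extension alone."""
    _, ext = os.path.splitext(path)
    # No extension means no match, so the default type is returned.
    return MIME_TYPES.get(ext.lower(), DEFAULT_TYPE)


print(guess_content_type("/feed.xml"))  # text/xml
print(guess_content_type("/feed"))      # application/octet-stream
```

The lookup only ever sees the requested name, so a symlink called feed gives it nothing to work with, whatever it points at.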
It’s cool though. Lighttpd wouldn’t just leave us hanging. It’ll let you use extended file attributes to set the Content-Type of a file[2]. Problem solved:
$ ln -s feed.xml feed
$ attr -s Content-Type -V text/xml feed
attr_set: Operation not permitted
Could not set "Content-Type" for feed
Well that’s not cool.
I spent the next hour looking through contradictory info in man pages. man 1 attr will tell you that attributes can be set on “all types of XFS inodes: regular files, directories, symbolic links, device nodes, etc”[3]. man 1 setfattr has a line on the --no-dereference option, which specifically exists to disable symlink dereferencing. Why wouldn’t this work?
I wasn’t getting anywhere, so I decided to test things out locally. This meant a small tweak to the command, since I use a Mac for development[4]:
$ ln -s feed.xml feed
$ xattr -s -w Content-Type text/xml feed
It worked. Now I had to figure out what was happening on the server. Enter strace. If you’ve not used it before, strace outputs every system call made by a process, and every signal it receives. I was looking for something related to setting attributes, so I went with:
$ strace attr -s Content-Type -V text/xml feed 2>&1 | grep attr
and got about 15 lines[5] back. One of them was exactly what I was looking for:
lsetxattr("feed", "user.Content-Type", "text/xml", 8, 0) = -1 EPERM (Operation not permitted)
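If grep still leaves too much to read, the failed calls can be picked out programmatically. The sketch below parses lines shaped like the one above. The `STRACE_LINE` pattern and the `failed_calls` helper are hypothetical names, and the pattern only covers simple decimal return values, so treat it as a starting point rather than a complete strace parser.

```python
import re

# Matches lines such as: name(args) = -1 ERRNO (message)
STRACE_LINE = re.compile(
    r"^(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)"
    r"(?:\s+(?P<errno>[A-Z0-9]+)\s+\((?P<msg>.*)\))?$"
)

# The line quoted in the post.
SAMPLE = (
    'lsetxattr("feed", "user.Content-Type", "text/xml", 8, 0)'
    " = -1 EPERM (Operation not permitted)"
)


def failed_calls(lines):
    """Yield (syscall, errno, message) for every call that returned -1."""
    for line in lines:
        match = STRACE_LINE.match(line.strip())
        if match and match.group("ret") == "-1":
            yield match.group("syscall"), match.group("errno"), match.group("msg")


for syscall, err, msg in failed_calls([SAMPLE, "close(3) = 0"]):
    print(syscall, err, msg)
```

Lines that do not fit the pattern, such as signals or unfinished calls, are skipped.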
attr was doing the right thing. It was calling lsetxattr, the specialised version of the setxattr syscall, which sets attributes on the symlink itself rather than its target. Something else was stopping this from working.
It turns out the answer wasn’t far from where I was looking; I just needed to switch to man 5 attr[6]:
extended user attributes are only allowed for regular files and directories
And that was that. Linux and ext3 were having none of it.
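The restriction can be reproduced without attr at all. The sketch below uses Python's `os.setxattr`, which is only available on Linux; with `follow_symlinks=False` it operates on the link itself, like lsetxattr. The helper names `try_set_xattr` and `demo` are hypothetical. What the call on the target returns depends on whether the filesystem holding the temporary directory supports user attributes, so no output is promised for that case.

```python
import errno
import os
import tempfile


def try_set_xattr(path, follow_symlinks):
    """Try to set user.Content-Type; return 'ok' or the errno name."""
    if not hasattr(os, "setxattr"):  # os.setxattr only exists on Linux
        return "unsupported"
    try:
        os.setxattr(path, "user.Content-Type", b"text/xml",
                    follow_symlinks=follow_symlinks)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def demo():
    """Create feed.xml plus a feed symlink in a temp dir and try both calls."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "feed.xml")
        link = os.path.join(tmp, "feed")
        with open(target, "w") as handle:
            handle.write("<rss/>")
        os.symlink("feed.xml", link)
        return {
            "target (link followed)": try_set_xattr(link, True),
            "link itself": try_set_xattr(link, False),
        }


for what, outcome in demo().items():
    print(what, "->", outcome)
```

On Linux, the "link itself" case should report an error rather than ok, matching the EPERM in the strace output.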
Fortunately, URL rewriting does exist:
# Preserve old feed url
url.rewrite = (
"^/feed" => "/feed.xml"
)
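One thing to be aware of is that the pattern is only anchored at the start. The sketch below uses Python's `re` module as a stand-in for the regex engine Lighttpd uses, which is close enough for a pattern this simple, to show which paths it catches. The `rewrite` helper and the stricter `ANCHORED` variant are hypothetical and not part of the original config.

```python
import re

ORIGINAL = re.compile(r"^/feed")    # pattern from the config above
ANCHORED = re.compile(r"^/feed$")   # hypothetical stricter variant


def rewrite(pattern, path, target="/feed.xml"):
    """Return the rewrite target if the pattern matches, else the path unchanged."""
    return target if pattern.search(path) else path


for path in ("/feed", "/feed.xml", "/feedback", "/about"):
    print(path, "->", rewrite(ORIGINAL, path), "|", rewrite(ANCHORED, path))
```

With the original pattern, anything beginning with /feed ends up at /feed.xml, which is harmless unless the site later grows a path like /feedback.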
If you enjoyed reading this, I’d recommend Julia Evans’ posts. She goes into a lot more detail, and uses some really neat examples to show off what you can do with strace.
[1] This isn’t worth the hassle unless you’re doing it to learn. Just use S3 or GitHub Pages. For reference, I’m doing this on Ubuntu 12.04 with an ext3 filesystem.
[2] This feature feels like a mistake waiting to happen. I just can’t figure out why.
[3] Alarm bells! Why is this man page talking about XFS specifically?
[4] Running OS X 10.10 (Yosemite).
[5] Down from nearer 100 without the grep. strace is verbose!
[6] man pages are split into sections. Section 5 is reserved for “File formats and conventions”, with section 1 being used for “General commands”. 1 is the default. I almost never remember to check the others.