Octopress

I'm still new to Octopress and so far I'm quite happy with it. While it always feels like a trade-off between power and simplicity, having the choice between HAML and Markdown is really nice.

Anyway, one of the things I wanted to hack together was a nice custom 404 page. I found a lot of info on the web but no complete walkthrough, so I decided to write a quick post about it.

Creating the page

First of all, we need to create a new page. Mine is located in source/error/404-not-found.html, but there aren't many restrictions on where it lives.

source/error/404-not-found.html

---
title: "404 Page Not Found"
comments: false
sharing: false
footer: false
---

<p style="text-align:right;font-family:Courier,monospace;">
  404_page_not_found();
  {% img center /images/404-not-found.scaled.jpg ENOENT %}
</p>

<style type="text/css">
  /* hide the page's title, it kinda gives it all away */
  article.hentry > header {
      display: none;
  }
</style>

By the way, I struggled to get the Liquid tag inserted correctly (i.e. not interpreted) into this code block until I found Jim's post about the workaround. Kudos to him ♥

Of course, you can customize it even more with a dedicated layout for error pages, use HAML, Markdown, etc.

Hide it!

Let's face the ugly truth: it is not a very exciting page.

We don't want robots to crawl our 404 page. Crawlers only look for a file named robots.txt at the root of the site, so the name matters. The following robots.txt will disallow the whole /error directory (where, if needed, we could add more pages for other types of errors like 403, 418, etc.).

source/robots.txt

---
---
User-agent: *
Disallow: /error/

Sitemap: {{ site.url }}/sitemap.xml

Then, we need to hack plugins/sitemap_generator.rb to keep our page out of the site's sitemap.xml. Find the EXCLUDED_FILES array and add the filename (not the path) to it. As an example, here is mine:

plugins/sitemap_generator.rb

# ...

module Jekyll

  # Change SITEMAP_FILE_NAME if you would like your sitemap file
  # to be called something else
  SITEMAP_FILE_NAME = "sitemap.xml"

  # Any files to exclude from being included in the sitemap.xml
  EXCLUDED_FILES = ["atom.xml", "404-not-found.html"]

  # Any files that include posts, so that when a new post is added, the last
  # modified date of these pages should take that into account
  PAGES_INCLUDE_POSTS = ["index.html"]

# ...

I know, it feels hacky. tecnobrat made a pull request to Octopress with a nice patch to handle this in a cleaner way, but it has been closed without explanation.
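
If you want to double-check that the exclusion worked, here is a minimal sketch that greps the generated sitemap for the page's filename. It assumes the site has been regenerated and that the output landed in public/ (adjust the path if yours differs). sitemap_excludes is a hypothetical helper name of mine, not something shipped with Octopress.

```shell
# Succeed (exit status 0) when the sitemap file does NOT mention the given name.
# Both arguments are placeholders; adjust them to your own setup.
sitemap_excludes() {
    sitemap=$1
    name=$2
    # A missing or unreadable sitemap is reported as an error, not as success.
    [ -r "$sitemap" ] || return 2
    # -q: quiet, -F: match the name as a fixed string rather than a regex.
    ! grep -qF -- "$name" "$sitemap"
}

# Example, assuming the generated site lives in public/:
#   sitemap_excludes public/sitemap.xml 404-not-found.html && echo "not listed"
```

Since it matches on the bare filename, any other page sharing that name would trigger a false alarm, which is good enough for a quick sanity check.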

Webserver configuration

Last but not least, we need to tell the webserver that we can handle 404 errors like a man. Here is how to do it with nginx:

nginx.conf

server {
    server_name kaworu.ch www.kaworu.ch;
    # [...]
    error_page 404 /error/404-not-found.html;      # preserve the 404 status code
    #error_page 404 = /error/404-not-found.html;    # use the status of the served page (200 for a static file)
    #error_page 404 =418 /error/404-not-found.html; # set the status code to 418
    # [...]
}

I believe keeping the 404 status code is the right choice here. If you're using Apache, you're on your own (but someone is whispering *.htaccess*).

Now reload your server's configuration and test it:

% curl -I https://kaworu.ch/this-does-not-exist-hahaha
HTTP/1.1 404 Not Found
Server: nginx/1.4.1
Date: Wed, 15 May 2013 18:18:33 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 3325
Connection: keep-alive
Vary: Accept-Encoding
ETag: "518fd711-cfd"
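
If you'd rather script that check than eyeball it, here is a small sketch that pulls the status code out of the header dump. http_status and is_404 are hypothetical helper names I made up for this post; the only real tools involved are curl and awk.

```shell
# Print the status code from the first line of an HTTP header dump read on stdin.
# The status line looks like "HTTP/1.1 404 Not Found", so the code is field 2.
http_status() {
    awk 'NR == 1 { print $2 }'
}

# Succeed only when the headers on stdin report a 404.
is_404() {
    [ "$(http_status)" = "404" ]
}

# Example (replace the URL with one that should not exist on your site):
#   curl -sI https://kaworu.ch/this-does-not-exist-hahaha | is_404 && echo "404 preserved"
```

With the `error_page 404 = ...` variant from the nginx configuration, the same check should fail, since the status code is no longer 404.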

Alright! A quick test in your browser to make sure the awesome page we created in step one is displayed, and we're all done. Now all you might need is a little bit of inspiration.