Jashank Jeremy 85f2dfffa6 faster_lsi: Massively accelerate LSI performance.
Currently, Classifier::LSI rebuilds the index every time an entry is
added.  This runs into massive performance overheads on my website;
theoretically, disabling automatic index rebuilds, and explicitly
rebuilding the LSI index at the end of the LSI repopulation should
speed things up nicely.
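
The idea can be sketched without any gems. The `ToyIndex` class below is a hypothetical stand-in for Classifier::LSI, not the gem's code; it only counts how often the expensive rebuild runs. As far as I know, the real gem offers the same switch through an `:auto_rebuild` option and a `build_index` method, but treat that mapping as an assumption and check it against the version you have installed.

```ruby
# Hypothetical stand-in for an index whose rebuild is expensive.
# It is NOT part of the classifier gem; it only counts rebuilds.
class ToyIndex
  attr_reader :rebuild_count

  def initialize(auto_rebuild: true)
    @auto_rebuild = auto_rebuild
    @items = []
    @rebuild_count = 0
  end

  # Add one item, rebuilding immediately only if auto_rebuild is on.
  def add_item(item)
    @items << item
    build_index if @auto_rebuild
  end

  # A real LSI rebuild would recompute over every item here.
  def build_index
    @rebuild_count += 1
    @sorted = @items.sort
  end
end

posts = %w[alpha beta gamma delta]

# Default behaviour: one rebuild per added post.
eager = ToyIndex.new(auto_rebuild: true)
posts.each { |post| eager.add_item(post) }

# faster_lsi approach: add everything, then rebuild once at the end.
lazy = ToyIndex.new(auto_rebuild: false)
posts.each { |post| lazy.add_item(post) }
lazy.build_index

puts "eager rebuilds: #{eager.rebuild_count}"
puts "lazy rebuilds:  #{lazy.rebuild_count}"
```

With N posts the eager variant rebuilds N times, each time over a growing index, while the lazy variant rebuilds exactly once. That would explain why the gap widens as the post count rises.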

As a side note, here, I use pandoc-ruby to provide a more featureful
Markdown transformer, so be mindful that the numbers I quote here have
artificially imposed I/O overheads.

With just the 76 posts I wrote this year (abysmal, I know), I come up
with the following figures:

    Without faster_lsi:
      jekyll --lsi  16.91s user 0.88s system 97% cpu 18.302 total
    With faster_lsi:
      jekyll --lsi  2.72s user 0.77s system 88% cpu 3.940 total

With 109 posts, we begin to see even better improvements:

    Without faster_lsi:
      jekyll --lsi  51.00s user 1.47s system 98% cpu 53.060 total
    With faster_lsi:
      jekyll --lsi  5.04s user 1.12s system 91% cpu 6.735 total

At this point, with faster_lsi active, I/O overhead starts to take
longer than the LSI itself.  I call that fairly conclusive.  But wait,
there's more.  I have 273 posts lying around... I wonder what happens
if I feed them all in.  With faster_lsi, it was nice and quick.
Without it, I simply gave up, and went and refilled my cup of tea.
And it was still going.

    Without faster_lsi:
      jekyll --lsi  1277.86s user 10.90s system 99% cpu 21:30.29 total
    With faster_lsi:
      jekyll --lsi  34.62s user 4.43s system 96% cpu 40.430 total

That is, in anyone's books, a major improvement.  Note, however, that
I don't know just how well this will perform with `jekyll --auto`
because I don't know how it does the LSI rebuilds.  I _think_ (but
please, don't quote me on this) that the LSI is rebuilt every time
Jekyll picks up a file change.

So, all up, the performance improvement is massive, and grows with
the number of posts you have.  At the last data point (273 posts),
the build is roughly 32 times faster (21:30 versus 40 seconds).
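
For anyone who wants to check the ratios, here is a small sketch that divides the wall-clock totals quoted above. The hash layout and the names are mine, purely for illustration; the numbers are copied from the timings.

```ruby
# Wall-clock totals in seconds, copied from the timings quoted above.
# 21:30.29 is converted to seconds as 21 * 60 + 30.29.
TIMINGS = {
  76  => { without: 18.302, with: 3.940 },
  109 => { without: 53.060, with: 6.735 },
  273 => { without: 21 * 60 + 30.29, with: 40.430 }
}

# Speedup = time without faster_lsi / time with faster_lsi.
speedups = TIMINGS.map { |posts, t| [posts, t[:without] / t[:with]] }.to_h

speedups.each do |posts, ratio|
  puts format("%3d posts: %.1fx faster", posts, ratio)
end
```

The ratio rises with the post count, from under 5x at 76 posts to about 32x at 273 posts.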

A better solution would be to cache the LSI index and/or content
data somehow.  I'll leave that until faster_lsi takes over ten
minutes to run.
2012-10-31 22:19:59 +11:00
bin Merge branch 'master' of https://github.com/daneharrigan/jekyll into daneharrigan-master 2012-04-23 16:48:18 -07:00
features Allow a custom 'layouts' directory 2012-05-30 21:39:43 -04:00
lib faster_lsi: Massively accelerate LSI performance. 2012-10-31 22:19:59 +11:00
test Cleanup for RDiscount TOC support. Closes #333. 2012-04-23 16:15:55 -07:00
.gitignore Update and clarify dependencies. 2011-11-26 18:48:51 -08:00
.travis.yml Update travis-ci configuration file 2012-06-12 00:35:21 +01:00
Gemfile Gemfile to help install the dependencies 2011-03-06 01:46:00 -08:00
History.txt Update History. 2012-04-23 17:23:11 -07:00
LICENSE convert to use rakegem 2010-04-21 13:55:01 -07:00
README.textile Merge remote-tracking branch 'jbw/ruby-v1.9' into devel 2011-04-23 18:08:34 +08:00
Rakefile No longer need pygments locally 2012-05-31 16:06:49 -04:00
cucumber.yml fixes problem in issue 64 fix where pages like about.md would be output as about.md/index.html. provides the output extension as a method rather than replacing the ext attribute as part of transform 2010-02-27 09:27:36 +00:00
jekyll.gemspec Swap out albino for pygments.rb 2012-05-31 15:51:34 -04:00

README.textile

h1. Jekyll

By Tom Preston-Werner, Nick Quaranto, and many awesome contributors!

Jekyll is a simple, blog-aware, static site generator. It takes a template directory (representing the raw form of a website), runs it through Textile or Markdown and Liquid converters, and spits out a complete, static website suitable for serving with Apache or your favorite web server. This is also the engine behind "GitHub Pages":http://pages.github.com, which you can use to host your project's page or blog right here from GitHub.

h2. Getting Started

* "Install":http://wiki.github.com/mojombo/jekyll/install the gem
* Read up about its "Usage":http://wiki.github.com/mojombo/jekyll/usage and "Configuration":http://wiki.github.com/mojombo/jekyll/configuration
* Take a gander at some existing "Sites":http://wiki.github.com/mojombo/jekyll/sites
* Fork and "Contribute":http://wiki.github.com/mojombo/jekyll/contribute your own modifications
* Have questions? Post them on the "Mailing List":http://groups.google.com/group/jekyll-rb

h2. Diving In

* "Migrate":http://wiki.github.com/mojombo/jekyll/blog-migrations from your previous system
* Learn how the "YAML Front Matter":http://wiki.github.com/mojombo/jekyll/yaml-front-matter works
* Put information on your site with "Template Data":http://wiki.github.com/mojombo/jekyll/template-data
* Customize the "Permalinks":http://wiki.github.com/mojombo/jekyll/permalinks your posts are generated with
* Use the built-in "Liquid Extensions":http://wiki.github.com/mojombo/jekyll/liquid-extensions to make your life easier

h2. Runtime Dependencies

* RedCloth: Textile support (Ruby)
* Liquid: Templating system (Ruby)
* Classifier: Generating related posts (Ruby)
* Maruku: Default Markdown engine (Ruby)
* Directory Watcher: Auto-regeneration of sites (Ruby)
* Pygments: Syntax highlighting (Python)

h2. Developer Dependencies

* Shoulda: Test framework (Ruby)
* RR: Mocking (Ruby)
* RedGreen: Nicer test output (Ruby)
* RDiscount: Discount Markdown Processor (Ruby)

h2. License

See LICENSE.