qipowl

the next generation markup environment for virtually everything


qipowl

The main idea of qipowl is to harness the power of DSLs in Ruby. The whole input text is treated as nothing more and nothing less than a DSL. That lets the user turn virtually every term of the input text into an operating entity.

Intro

Everybody knows that parsing is annoying, error-prone and generally hazardous. That’s because the common old-school approach to parsing suc^W^W failed.

The problems are greatly exaggerated. All we need to make the parsing process pleasant is to combine bootstrapping techniques, Ruby's DSL abilities and a bit of additional Ruby coding.

Like Baron Münchhausen, who allegedly pulled himself and the horse on which he was sitting out of a swamp by his own hair (or like Ouroboros if one prefers serpentariums to swamps), the text to be parsed should… Well, it should parse itself.

illustration by Oskar Herrfurth

Ruby DSL has all the prerequisites for that. Let’s see.

Bowler Basics

The main principle of qipowl is:

All the input text is interpreted as Ruby code.

There is a rake¹ in our way, and we are about to get hit on the forehead, so let’s watch our steps:

  • Get rid of all the symbols that may confuse the Ruby interpreter (e.g. ASCII punctuation); we simply gsub all the single-byte symbols to their “fullwidth” representation.
  • Split the input text into, say, “paragraphs” (by empty lines).
  • Process each line of the result as if it were plain Ruby (eval; it’s safe).
  • gsub the wide characters back to their single-byte forms.
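Steps 1, 2 and 4 can be sketched in a few lines. This is a minimal illustration, not qipowl's actual code: the helper names `to_fullwidth`, `to_halfwidth` and `paragraphs` are made up here, and the sketch assumes that "single-byte symbols" means printable ASCII except the space (U+0021 to U+007E), which sits exactly 0xFEE0 code points below the Unicode fullwidth forms (U+FF01 to U+FF5E).

```ruby
# Step 1: shift every printable ASCII character (space excluded)
# to its fullwidth twin, 0xFEE0 code points higher.
def to_fullwidth(text)
  text.gsub(/[!-~]/) { |char| (char.ord + 0xFEE0).chr(Encoding::UTF_8) }
end

# Step 4: shift the fullwidth characters back down to ASCII.
def to_halfwidth(text)
  text.gsub(/[\uFF01-\uFF5E]/) { |char| (char.ord - 0xFEE0).chr(Encoding::UTF_8) }
end

# Step 2: split the text into "paragraphs" on blank lines.
def paragraphs(text)
  text.split(/\n\s*\n/)
end

wide = to_fullwidth("I'm Brian, and so's my wife")
puts wide
puts to_halfwidth(wide) # prints: I'm Brian, and so's my wife
```

Spaces and newlines are left alone on purpose, so the interpreter still sees separate "words" and lines.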

Having done the four steps above, we run into “undefined methods/constants” errors. Let’s see why. For instance, suppose we are to process the following string:

I'm Brian, and so's my wife

After the 1st step we get:

I'm Brian, and so's my wife

Step 2 is skipped, since the input consists of a single string. Now let’s give step 3 a chance. Let’s stop here and try to guess how the Ruby interpreter will deal with this garbage. Take a look at this valid Ruby command:

puts rand hash

Surprisingly, it prints a huge random value. Ruby evaluates hash, passes the result to rand, evaluates rand and, chaining the result, finally prints it out. Aha! But there are no Ruby methods named wife, so's or I'm. It’s time for method_missing to enter the scene.

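# Accept any arguments: Ruby calls respond_to? with the method name
# (and an optional include_all flag).
def respond_to?(*_args)
  true
end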

def method_missing method, *args, &block
  method, *args = special_handler(method, *args, &block) \
    if self.private_methods.include?(:special_handler)
  [method, args].flatten
end

Here we simply pass the parameters further down the chain, leaving a hook for intervention: if a descendant defines a private special_handler method, it will be called.

With just these two methods we already have a void parser: our string will be executed as Ruby code, without any errors, from within a class containing these methods.

def parse string
  eval 'I'm Brian, and so's my wife'
end

# ⇒ 'I'm Brian, and so's my wife'

Nice scaffold, isn’t it? From here on we call this base class Bowler.


¹ True rake, I mean.

Bowler Scaffold

If it looks like a pan, stores like a pan, and sizzles like a pan, then it is a bowler.

I chose the name Bowler for the base class because every process usually has three stages:

  • prepare
  • perform
  • ??? (I’ve heard there is no such word as “postpare” in English)

My process also has these stages and I needed clear easy-to-remember names for them. Here we are:

  • defreeze
  • roast
  • serveup

Since the bowler introduced in the previous chapter does not do much on its own, we need more built-in helpers for future use. First of all, we need to fullwidth the input in defreeze, to break it into “paragraphs” in roast and to gsub it back to normal characters in serveup. Second, we need to be able to store some processing rules in a config file (which is to be YAML and should be loaded by the base class).
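The three stages might be wired together roughly like this. It is a sketch under assumptions, not the real base class: the stage names come from the text, while `SketchBowler`, `parse`, `HALF` and `FULL` are invented for illustration, the character mapping is written with `String#tr` instead of gsub for brevity, and the YAML config loading is left out entirely.

```ruby
# A hypothetical three-stage bowler: defreeze -> roast -> serveup.
class SketchBowler
  HALF = '!-~'            # printable ASCII, space excluded
  FULL = "\uFF01-\uFF5E"  # the matching fullwidth characters

  def method_missing(method, *args)
    [method, args].flatten
  end

  # Prepare: make the input harmless by turning it fullwidth.
  def defreeze(input)
    input.tr(HALF, FULL)
  end

  # Perform: split into paragraphs on blank lines, then evaluate
  # every line of every paragraph as Ruby code on this object.
  def roast(input)
    input.split(/\n\s*\n/).map do |paragraph|
      paragraph.lines.map { |line| Array(instance_eval(line)).join(' ') }.join("\n")
    end
  end

  # Finish: glue the paragraphs together and restore normal characters.
  def serveup(paragraphs)
    paragraphs.join("\n\n").tr(FULL, HALF)
  end

  def parse(input)
    serveup(roast(defreeze(input)))
  end
end

puts SketchBowler.new.parse("I'm Brian, and so's my wife")
# prints: I'm Brian, and so's my wife
```

With nothing but method_missing in place, the text goes in and comes out unchanged, which is exactly what a void parser should do.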

That’s all for now, we’ll turn back to this class later.

Processing chain

Within the default Bowler, the input string is simply eaten by the Ruby interpreter and then spat out again. Not so useful, huh?

Well, let’s recall that each “word” in the input string is currently handled by method_missing. What if we declare a new method within a class derived from Bowler?

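# I'm Brian and so's my wife
class Herald < Bowler
  # The method name is spelled in fullwidth characters,
  # since that is what the input looks like when it is evaluated.
  def Brian *args
    me = __callee__
    [me, ', the Herald, honour me,', *args]
  end
end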

Once the input string has been processed by this class, the output becomes:

I'm Brian, the Herald, honour me, and so's my wife

Cute, huh?

Ready-to-use examples are provided: a command-line parser, a parser turning Markdown-like markup into HTML, and even a YAML loader.