The Problem

A man and his son are driving in a car one day, when they get into a fatal accident. The man is killed instantly. The boy is knocked unconscious, but he is still alive. He is rushed to hospital, and will need immediate surgery. The doctor enters the emergency room, looks at the boy, and says “I can’t operate on this boy, he is my son.” How is this possible? The answer is simple: the doctor is the boy’s mother.

Sooner or later in each complex application involving data manipulation backed up by the database arises a necessity to have a parent-child relationship inside the same table/model. The easiest example that comes into my mind would be several related blog posts, maintaining the long story chapters. The sequence of these posts would be a whole topic but since people rare like to read more than 280 symbols at once nowadays we tend to split long writings into parts.

The wrong approach

These posts basically maintain a linked list. The naïve database approach to store the relation between neighbour chapters would be to add a reference field to the table/model.

class Post < ActiveRecord::Base
  has_one :next, class: 'Post', foreign_key: :next_post_id
  belongs_to :previous, class: 'Post', inverse_of: :next
end

Here is my strong advice: do not do that! Never ever.

The better approach

Create a new join table posts_relationships and use it to link the sibling posts chains. It sounds like overcomplicating things, but in practice it is not. It will pay back soon, helping to avoid a nighmare debugging session full of whys and wtfs while the deleted objects will keep resurrecting out of the ashes.

Let’s assume we use the straight approach and we are adding new options for the children objects. Like a check for whether the parent was updated, or whatever. We do:

class Post < ActiveRecord::Base
  ...
  def check_parent
    flash("Parent updated!") if previous.updated_at > updated_at
  end
end

The example is contrived, I do omit all the guards and checks. So far so good. Now we want to implement a functionality to unlink parent if needed. We do:

class Post < ActiveRecord::Base
  def unlink_parent
    update_attributes! previous: nil
  end
end

And here is when monsters come. If any other process had the instance of parent loaded at the moment, any call to save it would restore the link between objects.

One cannot rely on optimistic locking here, because the parent object is not stale by any mean.

One cannot rely on reload inside the parent code, because reload will trash all the wanted scheduled changes to the parent out.

One cannot rely on anything, save for an ugly explicit check of next.reload.previous.nil?.

With an intermediate relationships table, the issue goes away for free. The reference between objects is deleted from this table when parent.save is called, and the latter will fail immediately, violating referential integrity.

That simple. Do not be lazy to implement the relationships table. Or, better, do not use mutable variables to store objects in general and ActiveRecord in particular.