Monday, February 6, 2012

Where TK agrees with TK (or: why are triggers harmful)

Sofar we've explored playing around with a few triggers to implement a business rule. In this post I'd like to step back a bit and take a birds-eye view at the common use-cases that I see brought up for introducing triggers into the app-development picture.

The first use-case is: we use triggers to assign or modify column values of mutating rows (that is, rows currently being inserted, updated, or deleted). Here's a few typical examples of this use case that, I'm sure, must look familiar to you too.

For this use-case we always use a row-level trigger (for each row), as this type of trigger allows us to manipulate :new.[column-name] variables. We can re-assign supplied (by the trigger DML-statement) column values, or initialize ones that were not supplied. So for instance the use enters all column-values but the Created and By_User columns, which are initialized by a row-trigger we wrote.

The second use-case is: we use triggers to execute insert/update/delete statements. So we perform some DML against a table. This table has some trigger, in which we perform more DML.

Of course this can lead to a cascading effect: the DML that we perform from within our trigger body might be against a table that too has triggers. In which there might be more DML-statements performed, and so on and so forth. Typical examples of why we perform DML-statements inside triggers, are: we want to audit changes against the table, or we want to maintain redundant data stored elsewhere in the database design.

The third use-case is: we use triggers to execute non-transactional stuff. And by "non-transactional" I mean stuff that, once executed inside our transaction, does not participate in a rollback, in case we (the code-path) decides to perform one.

The most beautiful example of the last decade for this is, we call a web-service inside our trigger body. In case this is a web-service that actually 'does something' (changes/transacts information at/behind the endpoint of the service), then obviously when we rollback our triggering DML-statement, whatever was changed by the web-service, remains changed. Other examples of this use-case are: we send email from our trigger body, or perform file I/O. We can also perform an autonomous transaction (one that actually performs some DML which given it's executed inside the autonomous TX, cannot be rolled back by the parent transaction). This is actually a blend of this use-case with the previous one.

So what's harmful about all these use-cases? Well, they make stuff happen automagically (copyright the other TK).

What we mean by this is, that instead of only getting the effect of the triggering DML-statement, you also get the effect of whatever is being performed inside the trigger body. And exactly that is the harmful bit.

In a SQL DBMS, the insert, update and delete 'operators', constitute three primitive operators that we can use to manipulate the data held inside the database (tables). Primitive here is used in the sense of primary or basic. Sort of like Inc(rement) and Dec(rement) are primitives that can be used in machine language to increase or decrease a register's value. These primitive operators have well understood and universally known change-semantics. Say you start with an empty EMP table, you then insert a employee row, you will end up with a change of the EMP table: it now has a single row in it. And all other bits and pieces of the database remain unchanged.

Triggers executing automagic stuff change these universally well-understood change-semantics of our three primitive operators.

And this is harmful: because nobody (after you've gone) expects insert, update or delete to quietly behave  differently. For those of you who have done some machine programming in a distant past: it's like increasing one of your registers, and suddenly (as an automagic side-effect) the top of the execution stack is increased also and has some subroutine's address pushed on it. Not something you'd expect...

The TCO (total cost of ownership) of an application sits not in initially building the thing. It sits in the years, sometimes decades, the thing lives. And you, but more probably others, maintain it. And more often than not these developers expect their available primitive operators to follow the primitive change-semantics.

I'll finish this post with the following thought:

If you agree on the three use-cases discussed being harmful, do you then also agree with me that:
  • The cascade delete option of a foreign key, and
  • The default clause of a table column, and
  • The char (fixed-length) datatype
are to be considered harmful too.

Aren't they conceptually about the exact same problem? Introducing different change-semantics to our three primitive operators?

I'd be interested to hear of any other 'features' that change the way insert, update or delete behave.