Good explanation of cascade (ON DELETE/UPDATE) behavior
I don't design schemas everyday, but when I do, I try to setup cascade updates/deletes correctly to make administration easier. I understand how cascades work, but I can never remember which table is which.
For example, if I have two tables -
Child- with a foreign key on
ON DELETE CASCADE, which records trigger a cascade and which records get deleted by the cascade? My first guess would be the
Childrecords get deleted when
Parentrecords are deleted, since
Childrecords depend on
Parentrecords, but the
ON DELETEis ambiguous; it could mean delete the
Parentrecord when the
Childrecord is deleted, or it could mean delete the
Childrecord when the
Parentis deleted. So which is it?
I wish the syntax was
ON PARENT DELETE, CASCADE,
ON FOREIGN DELETE, CASCADEor something similar to remove the ambiguity. Does anyone have any mnemonics for remembering this?
If you like the
Childterms and you feel they are easy to be remembered, you may like the translation of
ON DELETE CASCADEto
Leave No Orphans!
Which means that when a
Parentrow is deleted (killed), no orphan row should stay alive in the
Childtable. All children of the parent row are killed (deleted), too. If any of these children has grandchildren (in another table through another foreign key) and there is
ON DELETE CASCADEdefined, these should be killed, too (and all descendants, as long as there is a cascade effect defined.)
FOREIGN KEYconstraint itself could also be described as
Allow No Orphans!(in the first place). No
Childshould ever be allowed (written) in the child table if it hasn't a
Parent(a row in the parent table).
For consistency, the
ON DELETE RESTRICTcan be translated to the (less aggresive)
You Can't Kill Parents!Only childless rows can be killed (deleted.)
I feel something is still missing in the analogy. Can't a child have more than one parent? In this case will killing one parent make the child an orphan?
@Jus12 No, foreign key constraints work with 1 parent only. It's not a good analogy regarding this aspect.
@ypercube: Is this not allowed? `Order(custID, itemID, orderID)` where `custID` refers to a primary key in `Customers` table and `itemID` refers to a primary key in `Items` table. Won't `Order` have two parents?
@Jus12 That is allowed of course but it would be 2 foreign key constraints. Then every child (order) would have a parent (customer) and a parent (item). The behaviours of the 2 FKs might differ though. (So, for example, it could be that killing customers would kill all their (orders) children but killing items would not kill their orders.)
The parent analogy can still work if we don't say "orphan". If there are two references to two separate parents on a child entry, this can still be seen as a child of a divorced couple. Restrict: "I won't let you kill my mom" Cascade: "If you kill my dad, I will also die"
For example, if I have two tables - Parent and Child - where Child records are owned by Parent records, which table needs the ON DELETE CASCADE?
ON DELETE CASCADE is an optional clause in a foreign key declaration. So it goes with the foreign key declaration. (Meaning, in the "child" table.)
...it could mean delete the Parent record when the Child record is deleted, or it could mean delete the Child record when the Parent is deleted. So which is it?
One way to interpret a foreign key declaration is, "All valid values for this column come from 'that_column' in 'that_table'." When you delete a row in the "child" table, nobody cares. It doesn't affect data integrity.
When you delete a row from the "parent" table--from "that_table"--you remove a valid value from the possible values for the "child" table. To maintain data integrity, you have to do something to the "child" table. Cascading deletes is one thing you can do.
Chapter and verse, from PostgreSQL docs.
Restricting and cascading deletes are the two most common options. RESTRICT prevents deletion of a referenced row. NO ACTION means that if any referencing rows still exist when the constraint is checked, an error is raised; this is the default behavior if you do not specify anything. (The essential difference between these two choices is that NO ACTION allows the check to be deferred until later in the transaction, whereas RESTRICT does not.) CASCADE specifies that when a referenced row is deleted, row(s) referencing it should be automatically deleted as well. There are two other options: SET NULL and SET DEFAULT. These cause the referencing column(s) in the referencing row(s) to be set to nulls or their default values, respectively, when the referenced row is deleted. Note that these do not excuse you from observing any constraints. For example, if an action specifies SET DEFAULT but the default value would not satisfy the foreign key constraint, the operation will fail.
There are five options for
ON DELETE, and
ON UPDATEthat can apply to the
FOREIGN KEY. These are called
<referential actions>, directly from the SQL:2011 spec
ON DELETE CASCADE: if a row of the referenced table is deleted, then all matching rows in the referencing table are deleted.
ON DELETE SET NULL: if a row of the referenced table is deleted, then all referencing columns in all matching rows of the referencing table to be set to null.
ON DELETE SET DEFAULT: if a row of the referenced table is deleted, then all referencing columns in all matching rows of the referencing table to be set to the column’s default value.
ON DELETE RESTRICT: it is prohibited to delete a row of the referenced table if that row has any matching rows in the referencing table.
ON DELETE NO ACTION(the default): there is no referential delete action; the referential constraint only specifies a constraint check.
The foreign key establishes the dependent relationship. The
<referential action>determines what happens when the relationship is dissolved.
Example / Metaphor / Explanation
For this example, we will accept the common model of society and economy: where every
businessis a company that maintains a relationship to the
CREATE TABLE bourgeoisie( fatcat_owner varchar(100) PRIMARY KEY ); INSERT INTO bourgeoisie(fatcat_owner) VALUES ( 'Koch Brothers' ); CREATE TABLE business ( name varchar(100), fatcat_owner varchar(100) REFERENCES bourgeoisie ); INSERT INTO business(name, fatcat_owner) VALUES ('Georgia-Pacific', 'Koch Brothers');
businesses are directly affected by
bourgeoisieby way of their
fatcat_ownerthen what do you do after the workers' revolution when that you purge the
fatcat_owners and have a classless society?
-- Viva la revolución BEGIN; DELETE FROM bourgeoisie; END;
You have a few options here,
- Stop the revolution. In SQL parlance,
RESTRICT. Some people believe this is the lesser evil, but they're usually wrong.
Allow it to go on. If so when the revolution happens SQL gives you four options,
SET NULL-- leave it blank. Who knows, maybe capitalism is restored the
bourgeoisiecomes up and oligarchs fill the roll of the
fatcat_owners. Important note, the column must be
NOT NULL) or this can never happen.
SET DEFAULT-- perhaps you had a
DEFAULTthat handled this? A
DEFAULTcan call a function. Maybe your schema is revolution-ready, already.
CASCADE-- there is no damage control. If the
bourgeoisiegoes, so does the
business. If a business must have a
fatcat_pig, then sometimes it makes more sense to lose the data rather than have a non-business in a
NO ACTION-- this is essentially a method of delaying the check, in MySQL it's no different than
RESTRICT, but in PostgreSQL, you'd be able to do
-- Not a real revolution. -- requires constraint be DEFERRABLE INITIALLY DEFERRED BEGIN; SET CONSTRAINTS ALL DEFERRED; DELETE FROM bourgeoisie; INSERT INTO bourgeoisie VALUES ( 'Putin' ); UPDATE business SET fatcat_pig = 'Putin'; END;
In such a system, the constraint is validated only before the transaction commits. This may result in stopping the revolution, but you can recover in the transaction -- for some degree of "recover."
Does `referenced` table means parent table and `referencing` table means child table?
well, perhaps we can rationalize the syntax. Let's take a Python example:
class Parent(self): # define parent's fields class Child(self): # define child's fields parent_pk_is_childs_foreign_key = models.ForeignKey(Parent, on_delete=models.CASCADE)
what this line says is on_delete of the Parent (which is accidentally mentioned in the statement), please cascade the deletion onto the child. That is why the CASCADE statement is defined at the child level, it marks those children that need to be deleted
For example if you had another class
class GrownUpChild(self): # define grown up child's fields parent_pk_is_childs_foreign_key = models.ForeignKey(Parent, on_delete=models.DO_NOTHING)
this structure would clearly show which of the children need to be removed (Child) and which are to stay (GrownUpChild) albeit orphaned
[Edit: Given the context of the discussion, specifically in cases of on_delete=models.CASCADE etc, ] in fact it is often a desired behavior to leave children of a deleted parent, due to auditing and reporting reasons, as well as recovering accidental deletions. [ of course enterprise level software will be built around such behavior and will flag deleted records as deleted=1 instead of actually deleting them and also will not include them in any queries for the front end, minus some specially designed reports. In addition it will have a function of purging the deleted==1 records from the database, which will usually be executed by the UI administrator, often avoiding any involvement from the database administrator's side.]
_'In fact it is often a desired behavior to leave children of a deleted parent, due to auditing and reporting reasons, as well as recovering accidental deletions'_ - that would be disastrous in a (sane) database.
@dezso thanks for your input. However, multiple enterprise level CRM systems do exactly that.
you sound like a savvy database admin :) I can totally hear your point. The software I mentioned above that does it, actually has a function to remove those deleted = 1 manually so it is up to the admin of the app to make this call. Normally database administrator is not even involved in maintaining this aspect. And besides, the entire software's database class is built around the concept, so it always checks for deleted flag in crud operations