Good explanation of cascade (ON DELETE/UPDATE) behavior
I don't design schemas everyday, but when I do, I try to setup cascade updates/deletes correctly to make administration easier. I understand how cascades work, but I can never remember which table is which.
For example, if I have two tables -
Parent
andChild
- with a foreign key onChild
that referencesParent
and hasON DELETE CASCADE
, which records trigger a cascade and which records get deleted by the cascade? My first guess would be theChild
records get deleted whenParent
records are deleted, sinceChild
records depend onParent
records, but theON DELETE
is ambiguous; it could mean delete theParent
record when theChild
record is deleted, or it could mean delete theChild
record when theParent
is deleted. So which is it?I wish the syntax was
ON PARENT DELETE, CASCADE
,ON FOREIGN DELETE, CASCADE
or something similar to remove the ambiguity. Does anyone have any mnemonics for remembering this?If you like the
Parent
andChild
terms and you feel they are easy to be remembered, you may like the translation ofON DELETE CASCADE
toLeave No Orphans!
Which means that when a
Parent
row is deleted (killed), no orphan row should stay alive in theChild
table. All children of the parent row are killed (deleted), too. If any of these children has grandchildren (in another table through another foreign key) and there isON DELETE CASCADE
defined, these should be killed, too (and all descendants, as long as there is a cascade effect defined.)The
FOREIGN KEY
constraint itself could also be described asAllow No Orphans!
(in the first place). NoChild
should ever be allowed (written) in the child table if it hasn't aParent
(a row in the parent table).For consistency, the
ON DELETE RESTRICT
can be translated to the (less aggresive)You Can't Kill Parents!
Only childless rows can be killed (deleted.)I feel something is still missing in the analogy. Can't a child have more than one parent? In this case will killing one parent make the child an orphan?
@Jus12 No, foreign key constraints work with 1 parent only. It's not a good analogy regarding this aspect.
@ypercube: Is this not allowed? `Order(custID, itemID, orderID)` where `custID` refers to a primary key in `Customers` table and `itemID` refers to a primary key in `Items` table. Won't `Order` have two parents?
@Jus12 That is allowed of course but it would be 2 foreign key constraints. Then every child (order) would have a parent (customer) and a parent (item). The behaviours of the 2 FKs might differ though. (So, for example, it could be that killing customers would kill all their (orders) children but killing items would not kill their orders.)
The parent analogy can still work if we don't say "orphan". If there are two references to two separate parents on a child entry, this can still be seen as a child of a divorced couple. Restrict: "I won't let you kill my mom" Cascade: "If you kill my dad, I will also die"
For example, if I have two tables - Parent and Child - where Child records are owned by Parent records, which table needs the ON DELETE CASCADE?
ON DELETE CASCADE is an optional clause in a foreign key declaration. So it goes with the foreign key declaration. (Meaning, in the "child" table.)
...it could mean delete the Parent record when the Child record is deleted, or it could mean delete the Child record when the Parent is deleted. So which is it?
One way to interpret a foreign key declaration is, "All valid values for this column come from 'that_column' in 'that_table'." When you delete a row in the "child" table, nobody cares. It doesn't affect data integrity.
When you delete a row from the "parent" table--from "that_table"--you remove a valid value from the possible values for the "child" table. To maintain data integrity, you have to do something to the "child" table. Cascading deletes is one thing you can do.
Chapter and verse, from PostgreSQL docs.
Restricting and cascading deletes are the two most common options. RESTRICT prevents deletion of a referenced row. NO ACTION means that if any referencing rows still exist when the constraint is checked, an error is raised; this is the default behavior if you do not specify anything. (The essential difference between these two choices is that NO ACTION allows the check to be deferred until later in the transaction, whereas RESTRICT does not.) CASCADE specifies that when a referenced row is deleted, row(s) referencing it should be automatically deleted as well. There are two other options: SET NULL and SET DEFAULT. These cause the referencing column(s) in the referencing row(s) to be set to nulls or their default values, respectively, when the referenced row is deleted. Note that these do not excuse you from observing any constraints. For example, if an action specifies SET DEFAULT but the default value would not satisfy the foreign key constraint, the operation will fail.
thanks but what other things can you do?
@leeCoder: Updated answer with references to the docs.
SQL:2011 Spec
There are five options for
ON DELETE
, andON UPDATE
that can apply to theFOREIGN KEY
. These are called<referential actions>
, directly from the SQL:2011 specON DELETE CASCADE
: if a row of the referenced table is deleted, then all matching rows in the referencing table are deleted.ON DELETE SET NULL
: if a row of the referenced table is deleted, then all referencing columns in all matching rows of the referencing table to be set to null.ON DELETE SET DEFAULT
: if a row of the referenced table is deleted, then all referencing columns in all matching rows of the referencing table to be set to the column’s default value.ON DELETE RESTRICT
: it is prohibited to delete a row of the referenced table if that row has any matching rows in the referencing table.ON DELETE NO ACTION
(the default): there is no referential delete action; the referential constraint only specifies a constraint check.
The foreign key establishes the dependent relationship. The
<referential action>
determines what happens when the relationship is dissolved.Example / Metaphor / Explanation
For this example, we will accept the common model of society and economy: where every
business
is a company that maintains a relationship to thebourgeoisie
through afatcat_owner
.CREATE TABLE bourgeoisie( fatcat_owner varchar(100) PRIMARY KEY ); INSERT INTO bourgeoisie(fatcat_owner) VALUES ( 'Koch Brothers' ); CREATE TABLE business ( name varchar(100), fatcat_owner varchar(100) REFERENCES bourgeoisie ); INSERT INTO business(name, fatcat_owner) VALUES ('Georgia-Pacific', 'Koch Brothers');
If all
business
es are directly affected bybourgeoisie
by way of theirfatcat_owner
then what do you do after the workers' revolution when that you purge thefatcat_owner
s and have a classless society?-- Viva la revolución BEGIN; DELETE FROM bourgeoisie; END;
You have a few options here,
- Stop the revolution. In SQL parlance,
RESTRICT
. Some people believe this is the lesser evil, but they're usually wrong. Allow it to go on. If so when the revolution happens SQL gives you four options,
SET NULL
-- leave it blank. Who knows, maybe capitalism is restored thebourgeoisie
comes up and oligarchs fill the roll of thefatcat_owners
. Important note, the column must beNULLABLE
(notNOT NULL
) or this can never happen.SET DEFAULT
-- perhaps you had aDEFAULT
that handled this? ADEFAULT
can call a function. Maybe your schema is revolution-ready, already.CASCADE
-- there is no damage control. If thebourgeoisie
goes, so does thebusiness
. If a business must have afatcat_pig
, then sometimes it makes more sense to lose the data rather than have a non-business in abusiness
table.NO ACTION
-- this is essentially a method of delaying the check, in MySQL it's no different thanRESTRICT
, but in PostgreSQL, you'd be able to do-- Not a real revolution. -- requires constraint be DEFERRABLE INITIALLY DEFERRED BEGIN; SET CONSTRAINTS ALL DEFERRED; DELETE FROM bourgeoisie; INSERT INTO bourgeoisie VALUES ( 'Putin' ); UPDATE business SET fatcat_pig = 'Putin'; END;
In such a system, the constraint is validated only before the transaction commits. This may result in stopping the revolution, but you can recover in the transaction -- for some degree of "recover."
Does `referenced` table means parent table and `referencing` table means child table?
@sg552 Yes, you understood it correctly.
A simple mnemonic would be
ON DELETE of parent CASCADE [by deleting] here
That tells you which deletes (deletes of the parent) get cascaded, where the ON DELETE CASCADE statement goes (on the child), and what gets deleted (the child).
well, perhaps we can rationalize the syntax. Let's take a Python example:
class Parent(self): # define parent's fields class Child(self): # define child's fields parent_pk_is_childs_foreign_key = models.ForeignKey(Parent, on_delete=models.CASCADE)
what this line says is on_delete of the Parent (which is accidentally mentioned in the statement), please cascade the deletion onto the child. That is why the CASCADE statement is defined at the child level, it marks those children that need to be deleted
For example if you had another class
class GrownUpChild(self): # define grown up child's fields parent_pk_is_childs_foreign_key = models.ForeignKey(Parent, on_delete=models.DO_NOTHING)
this structure would clearly show which of the children need to be removed (Child) and which are to stay (GrownUpChild) albeit orphaned
[Edit: Given the context of the discussion, specifically in cases of on_delete=models.CASCADE etc, ] in fact it is often a desired behavior to leave children of a deleted parent, due to auditing and reporting reasons, as well as recovering accidental deletions. [ of course enterprise level software will be built around such behavior and will flag deleted records as deleted=1 instead of actually deleting them and also will not include them in any queries for the front end, minus some specially designed reports. In addition it will have a function of purging the deleted==1 records from the database, which will usually be executed by the UI administrator, often avoiding any involvement from the database administrator's side.]
_'In fact it is often a desired behavior to leave children of a deleted parent, due to auditing and reporting reasons, as well as recovering accidental deletions'_ - that would be disastrous in a (sane) database.
@dezso thanks for your input. However, multiple enterprise level CRM systems do exactly that.
you sound like a savvy database admin :) I can totally hear your point. The software I mentioned above that does it, actually has a function to remove those deleted = 1 manually so it is up to the admin of the app to make this call. Normally database administrator is not even involved in maintaining this aspect. And besides, the entire software's database class is built around the concept, so it always checks for deleted flag in crud operations
Yes, that is a known and sane pattern - but then you should possibly change the wording above to reflect this.
License under CC-BY-SA with attribution
Content dated before 6/26/2020 9:53 AM
Jus12 6 years ago
I feel something is still missing in the analogy. Can't a child have more than one parent? In this case will killing one parent make the child an orphan?