I have three basic types of entities: People, Businesses, and Assets. Each Asset can be owned by one and only one Person or Business. Each Person and Business can own from 0 to many Assets. What would be the best practice for storing this type of conditional relationship in Microsoft SQL Server?
My initial plan is to have two nullable foreign keys in the Assets table, one for People and one for Businesses. One of these values will be null, while the other will point to the owner. The problem I see with this setup is that it requires application logic in order to be interpreted and enforced. Is this really the best possible solution or are there other options?
Introducing SuperTypes and SubTypes
I suggest that you use supertypes and subtypes. First, create
CREATE TABLE dbo.PartyType ( PartyTypeID int NOT NULL identity(1,1) CONSTRAINT PK_PartyType PRIMARY KEY CLUSTERED Name varchar(32) CONSTRAINT UQ_PartyType_Name UNIQUE ); INSERT dbo.PartyType VALUES ('Person'), ('Business');
CREATE TABLE dbo.Party ( PartyID int identity(1,1) NOT NULL CONSTRAINT PK_Party PRIMARY KEY CLUSTERED, FullName varchar(64) NOT NULL, BeginDate smalldatetime, -- DOB for people or creation date for business PartyTypeID int NOT NULL CONSTRAINT FK_Party_PartyTypeID FOREIGN KEY REFERENCES dbo.PartyType (PartyTypeID) );
Then, if there are columns that are unique to a Person, create a
Person table with just those:
CREATE TABLE dbo.Person ( PersonPartyID int NOT NULL CONSTRAINT PK_Person PRIMARY KEY CLUSTERED CONSTRAINT FK_Person_PersonPartyID FOREIGN KEY REFERENCES dbo.Party (PartyID) ON DELETE CASCADE, -- add columns unique to people );
And if there are columns that are unique to Businesses, create a
Business table with just those:
CREATE TABLE dbo.Business ( BusinessPartyID int NOT NULL CONSTRAINT PK_Business PRIMARY KEY CLUSTERED CONSTRAINT FK_Business_BusinessPartyID FOREIGN KEY REFERENCES dbo.Party (PartyID) ON DELETE CASCADE, -- add columns unique to businesses );
Usage and Notes
Asset table will look something like this:
CREATE TABLE dbo.Asset ( AssetID int NOT NULL identity(1,1) CONSTRAINT PK_Asset PRIMARY KEY CLUSTERED, PartyID int NOT NULL CONSTRAINT FK_Asset_PartyID FOREIGN KEY REFERENCES dbo.Party (PartyID), AssetTag varchar(64) CONSTRAINT UQ_Asset_AssetTag UNIQUE );
The relationship the supertype Party table shares with the subtype tables Business and Person is “one to zero-or-one”. Now, while the subtypes generally have no corresponding row in the other table, there is the possibility in this design of having a Party that ends up in both tables. However, you may actually like this: sometimes a person and a business are nearly interchangeable. If not useful, while a trigger to enforce this will be fairly easily done, the best solution is probably to add the
PartyTypeID column to the subtype tables, making it part of the PK & FK, and put a CHECK constraint on the
The beauty of this model is that when you want to create a column that has a constraint to a business or a person, then you make the constraint to the appropriate table instead of the party table.
Also, if desired, turning on cascade delete on the constraints can be useful, as well as an
INSTEAD OF DELETE trigger on the subtype tables that instead delete the corresponding IDs from the supertype table (this guarantees no supertype rows that have no subtype rows present). These queries are very simple and work at the entire-row-exists-or-doesn’t-exist level, which in my opinion is a gigantic improvement over any design that requires checking column value consistency.
Also, please notice that in many cases columns that you would think should go in one of the subtype tables really can be combined in the supertype table, such as social security number. Call it TIN (taxpayer identification number) and it works for both businesses and people.
ID Column Naming
The question of whether or not to call the column in the Person table
PersonPartyID is your own preference, but I think it’s best to call these
BusinessPartyID—tolerating the cost of the longer name, this avoids two types of confusion. E.g., someone unfamiliar with the database sees
BusinessID and doesn’t know this is a
PartyID, or sees
PartyID and doesn’t know it is restricted by foreign key to just those in the
If you want to create views for the
Business tables, they can even be materialized views since it’s a simple inner join, and there you could rename the
PersonPartyID column to
PersonID if you were truly so inclined (though I wouldn’t). If it’s of great value to you, you can even make
INSTEAD OF INSERT and
INSTEAD OF UPDATE triggers on these views to handle the inserts to the two tables for you, making the views appear completely like their own tables to many application programs.
Making Your Proposed Design Work As-Is
Also, I hate to mention it, but if you want to have a constraint in your proposed design that enforces only one column being filled in, here is code for that:
ALTER TABLE dbo.Assets ADD CONSTRAINT CK_Asset_PersonOrBusiness CHECK ( CASE WHEN PersonID IS NULL THEN 0 ELSE 1 END + CASE WHEN BusinessID IS NULL THEN 0 ELSE 1 END = 1 );
However, I don’t recommend this solution.
A natural third subtype to add is organization, in the sense of something that people and businesses can have membership in. Supertype and subtype also elegantly solve customer/employee, customer/vendor, and other problems similar to the one you presented.
Be careful not to confuse “Is-A” with “Acts-As-A”. You can tell a party is a customer by looking in your order table or viewing the order count, and may not need a Customer table at all. Also don’t confuse identity with life cycle: a rental car may eventually be sold, but this is a progression in life cycle and should be handled with column data, not table presence–the car doesn’t start out as a
RentalCar and get turned into a
ForSaleCar later, it’s a Car the whole time. Or perhaps a
RentalItem, maybe the business will rent other things too. You get the idea.
It may not even be necessary to have a
PartyType table. The party type can be determined by the presence of a row in the corresponding subtype table. This would also avoid the potential problem of the
PartyTypeID not matching the subtype table presence. One possible implementation is to keep the
PartyType table, but remove
PartyTypeID from the Party table, then in a view on the Party table return the correct
PartyTypeID based on which subtype table has the corresponding row. This won’t work if you choose to allow parties to be both subtypes. Then you would just stick with the subtype views and know that the same value of
PersonID refer to the same party.
Further Reading On This Pattern
Please see A Universal Person and Organization Data Model for a more complete and theoretical treatment.
I recently found the following articles to be useful for describing some alternate approaches for modeling inheritance in a database. Though specific to Microsoft’s Entity Framework ORM tool, there’s no reason you couldn’t implement these yourself in any DB development:
- Table Per Hierarchy
- Table Per Type (this is what I advocate above as the only fully normalized method of implementing inheritance in a database)
- Table Per Concrete Class
- Or a more brief overview of these three ways: How to choose an inheritance strategy
P.S. I have switched, more than once, my opinion on column naming of IDs in subtype tables, due to having more experience under my belt.