Anti-pattern: entity-attribute-value (EAV)

The Entity-Attribute-Value (EAV) data model has appeared in many guises over the years and goes under a variety of names, such as Object-Attribute-Value Model and Open Schema.

The basic idea is to record all data in a single table, typically with three columns: the entity, the attribute and the value. This munges all the data together, so the physical DB schema never changes and yet remains open to arbitrary data models. That flexibility seems particularly inviting, and the pattern has therefore been applied repeatedly on many projects. As a result a lot is known about how well (or badly) it works in practice.
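As a minimal sketch of the difference, with hypothetical table and column names (a patient record, chosen only because it echoes the clinical example mentioned below):

  -- Conventional design: one column per attribute, with types and constraints.
  CREATE TABLE patient (
      patient_id  INTEGER PRIMARY KEY,
      family_name VARCHAR(100) NOT NULL,
      birth_date  DATE         NOT NULL,
      blood_type  CHAR(3)
  );

  -- EAV design: one generic row per attribute value, everything stored as text.
  CREATE TABLE patient_eav (
      entity_id INTEGER      NOT NULL,  -- which patient
      attribute VARCHAR(100) NOT NULL,  -- 'family_name', 'birth_date', ...
      value     VARCHAR(4000),
      PRIMARY KEY (entity_id, attribute)
  );

  -- The single conventional row (1, 'Smith', 1970-01-01, 'A+') becomes three generic rows:
  INSERT INTO patient_eav VALUES (1, 'family_name', 'Smith');
  INSERT INTO patient_eav VALUES (1, 'birth_date',  '1970-01-01');
  INSERT INTO patient_eav VALUES (1, 'blood_type',  'A+');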

Arguably there are valid uses of the pattern. The example cited in Wikipedia is clinical data, where patient records would otherwise need thousands of columns to cover all the possible attributes of a patient.

However, most database designs do not exhibit that kind of attribute volatility, so the EAV pattern offers no particular benefit there, while carrying many significant disadvantages, as the experience reports and quotes below illustrate.

Multi-million-dollar projects have failed as a result of the EAV pattern. For example, see Tim Gorman's article Bad CaRMa (2006), which describes a debacle in the early 1990s with an estimated ROI of minus $10 million.

Radek Maziarka has written a blog post about it: entity-attribute-value fallacy (2018).

Here are some quotes from DB theorists and practitioners:

[Roy Hann on comp.databases.theory]:

... What you have re-invented here is the tired old entity-attribute-value (EAV) design (much beloved of medical research wonks for some reason). What you are asking to ignore is *the* killer argument against EAV. Whether you realize it or not yet, you are undertaking to write *all* the code to implement and enforce any rules about what makes sense. (Or else you are planning to do without those rules, in which case you can just go ahead and do as you like, but don't come back here looking for sympathy later.)

[David Cressey on comp.databases.theory]:

There is a very simple reason why people keep reinventing E-A-V. They want all the benefits that go with structured data together with all the benefits that go with unstructured data. They aren't sophisticated enough to see the trade-off in advance.

[Joe Celko on comp.databases.theory]:

You will see it called EAV (entity-attribute-value) in the literature. It is an attempt to put metadata into an RDBMS and it falls apart in about one year in production work. You cannot write any constraints, DRI is impossible and every typo becomes a new attribute. The simplest queries require huge resources and complex queries are unmaintainable.
... It is much simpler, easier, and effective to have a set of tools that lets you rapidly react to design changes.
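To make Celko's points about constraints and typos concrete, here is a hedged sketch reusing the hypothetical patient and patient_eav tables from above; none of it comes from his post:

  -- With real columns, the DBMS rejects both a bad value and a misspelled column:
  --   INSERT INTO patient (patient_id, family_name, birth_date)
  --   VALUES (2, 'Jones', 'next Tuesday');      -- type error
  --   INSERT INTO patient (patient_id, family_name, birht_date)
  --   VALUES (2, 'Jones', DATE '1980-02-02');   -- no such column

  -- In the EAV table both slip through without complaint:
  INSERT INTO patient_eav VALUES (2, 'birth_date', 'next Tuesday');  -- not a date
  INSERT INTO patient_eav VALUES (2, 'birht_date', '1980-02-02');    -- the typo is now a new attribute

  -- And a trivial question ("patients born before 1980") needs a cast that
  -- blows up as soon as any one stored value is not actually a date:
  SELECT entity_id
  FROM   patient_eav
  WHERE  attribute = 'birth_date'
  AND    CAST(value AS DATE) < DATE '1980-01-01';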

[Kenneth Downs on comp.databases.theory]:

... Some months ago the EAV thing took root in the mind of somebody who tried to hire me to implement it. He was convinced he had thought of it himself and that nobody else ever had. The politics of the situation meant I could not simply dismiss it, I had to take it apart piece by piece with him. He gave me the very strong impression he did not believe me and I simply was not trying hard enough, but it is good to come here and see that others have come to the same conclusion, that my judgement and experience are not unique.

[JOG on comp.databases.theory]:

Just because they do this doesn't make it correct. A user's failings do not reflect on the system they are using. When you employ EAV to circumvent a schema, you are not using the RM as the logical model, full stop.

[Joe Celko on comp.databases.theory]:

This design error is called EAV (entity-attribute-value) and it is common enough to have a name -- like "cancer" :)

[Bernard Peek on comp.databases.theory]:

The entity-attribute-value system is an alternative but I don't believe it is much better, and I speak as one who has built such systems. You still need every user to have pretty much the same training as a DBA. If you can't ensure that all of your users are of that calibre then the system will rapidly become unworkable. I sometimes use an e-a-v system as part of the process of designing a new system, but I haven't yet handed over a finished design with it still in place.

[Tony on comp.databases.theory]:

In a relational database you should be highly concerned with the integrity of the data. This is best done declaratively, via check constraints and foreign key constraints. If you are validating against code tables or "EAV" tables, you won't have that option. Will you use database triggers instead, or just rely on your application's validation? If you don't enforce the integrity in the database, it is sure to become corrupted in no time.
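A hedged sketch of the contrast Tony describes, using made-up zip_code, customer and customer_eav tables:

  -- Conventional design: validation against a code table is a single declaration.
  CREATE TABLE zip_code (
      zip CHAR(5) PRIMARY KEY
  );

  CREATE TABLE customer (
      customer_id  INTEGER PRIMARY KEY,
      name         VARCHAR(100) NOT NULL,
      zip          CHAR(5) NOT NULL REFERENCES zip_code (zip),
      credit_limit NUMERIC(10,2) CHECK (credit_limit >= 0)
  );

  -- EAV design: 'zip' and 'credit_limit' are just strings in the attribute
  -- column, so neither the foreign key nor the check constraint can be
  -- declared.  Validation moves into triggers or application code, or is
  -- simply not done.
  CREATE TABLE customer_eav (
      entity_id INTEGER      NOT NULL,
      attribute VARCHAR(100) NOT NULL,
      value     VARCHAR(4000),
      PRIMARY KEY (entity_id, attribute)
      -- a REFERENCES zip_code (value) here would have to hold for every
      -- attribute, which is not something declarative SQL can express
  );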

[Laconic2 on comp.databases.theory]:

One point I want to make clear about the EAV solution. At the conceptual level of data abstraction, this is tantamount to giving the users CREATE TABLE or ADD COLUMN privileges. In essence, what we are permitting the users to do is revise and extend the portion of the subject matter modeled by the database. That is why, IMO, "You still need every user to have pretty much the same training as a DBA."
Intelligent use of EAV and intelligent use of CREATE TABLE, ADD COLUMN require the same level of sophistication concerning mapping the subject matter into data. At other levels of abstraction they are, of course, different. But this is what the proponents of EAV sometimes overlook. People who can't do data modeling can't use EAV.
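As an illustration of Laconic2's equivalence, continuing the hypothetical customer and customer_eav tables above:

  -- Conventional schema: extending the model is a privileged, reviewed DDL change.
  ALTER TABLE customer ADD COLUMN loyalty_tier VARCHAR(20);

  -- EAV: anyone who can INSERT performs the same conceptual act on the fly,
  -- with no review, no type, and no constraints.
  INSERT INTO customer_eav VALUES (7, 'loyalty_tier', 'gold');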

[Tony on comp.databases.theory]:

... you use the ghastly "EAV" approach ...

[Michael Valentine Jones, DBA]:

I worked on a system where someone actually implemented this. It was a gigantic mess. The problem is that relational database engines are just not designed to work with this model, so instead of having the data model stored in the physical table structure, all knowledge about the data ends up being stored in the application.
Also, the queries become nightmares of inefficiency. Think about scanning a 40 million row table to do a zip code validation. Now do the same, only join it to itself 15 times for a variety of different attributes.
Basically, this approach takes the hard job of constantly updating a rapidly changing data model and turns it into something a hundred times worse.
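A rough sketch of the kind of query Jones describes, again against the hypothetical customer_eav table: reassembling one logical row costs one self-join per attribute.

  -- Rebuilding a single "row" from the generic table: one self-join per attribute.
  SELECT nm.entity_id,
         nm.value AS name,
         zp.value AS zip,
         cl.value AS credit_limit
  FROM   customer_eav nm
  JOIN   customer_eav zp ON zp.entity_id = nm.entity_id AND zp.attribute = 'zip'
  JOIN   customer_eav cl ON cl.entity_id = nm.entity_id AND cl.attribute = 'credit_limit'
  WHERE  nm.attribute = 'name';

  -- Fifteen attributes means fifteen joins (or fifteen correlated subqueries),
  -- each one probing the same multi-million-row table.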

[Paul Geoffrey on comp.databases.theory]:

About twice a year, the EAV data model would crop up. The first few times I took the trouble to try to do the right thing. When it works it's because the application is small and not very business critical.
But it's almost always an utter disaster. Once the data is in the application, it's impossible to get it out again without writing a small-fridge-sized (bigger than a bread-box, smaller than a washing-machine) chunk of code. And even if you can, the data integrity is typically shot. So you really can't migrate to any kind of replacement system.

[Frank Hamersley on comp.databases.theory]:

<Confession> I have (once only) deployed an EAV solution - as a very isolated interface to an ETL gizmo (ie. not actually a "system") where they couldn't define all the requirements but knew when it had to be finished. The dynamism they needed in the schema would not have survived the (production) change control process and still met the drop dead date so a special case was "approved". </Confession>
In mitigation it is not self referencing like the "best" examples seen in this thread...so it is quite plain in comparison.
The monster is still in place but a pig on space and performance albeit it only has to perform once a month. I have my .303 sight firmly trained on it and in the near term it will cop a slug right between the eyes!

[Joe Celko on comp.databases.theory]:

Every few months a Newbie re-invents the EAV fallacy and thinks they are so clever.