Better At Oracle

A collection of tutorials, code and tools to help you get better using Oracle


30 June 2011

Referential Integrity - uniqueness, nulls and checks

This will not be the last application that uses this database, you can be sure of that. Applications come and go, but the data lives on forever. No matter how good and fancy your application, it isn't going to please your users if the INTEGRITY of the data is not 100%. Surely the application can do a good job of keeping the data correct, or can it?

Email From the Boss

Great news guys!

After some hard fought negotiations we have won a fantastic contract to build a new online order portal, this is a great chance for your team to impress some senior people at this company and secure more work in the future.

The architecture of this new portal has already been decided - it will be a multi-tier solution, with a Java based application server and an Oracle database behind the scenes. Even better, Sally, our database design guru has already designed the database schema but before she could get things finished off, she disappeared on her round-the-world trip, so it's up to your team to get it completed.

The client has grown tired of all sorts of data errors in their existing paper based order system. It's a real mess, often with forms half filled out, missing postcodes and invalid email addresses. I have assured them that no order can be entered into our new system that does not have all the fields present and correct. Of course, we cannot help it if the order desk enters the wrong information, but we can damn well make sure something sensible is stored in each field.

This project is right up my street, but unfortunately I am off schmoozing with clients all week, so I am counting on you to get this right.

The Boss.

PS - this company is a small operation, but they are certain they will hit the big time soon, with millions of orders per day, so make sure the solution performs as well as it possibly can! If we do this well, a ton more work will come our way.

The email from the boss makes a few things clear. The main problem the client currently has is bad Data Integrity - paper based orders are often only half filled out and it causes all sorts of problems.

Another thing is clear - you better not mess this up!

Team Meeting

You are sure you have heard something about Data and Referential Integrity before, but you also overheard another team talking about how it is inflexible and slow ... time to call a team meeting to see what the experts have to say.

You: Right guys, this new project has me in a pickle. Just how are we going to ensure bad data doesn't get into the order system?

John: Simple - all we need to do is stick some Javascript validation on the web pages. That is quick and easy and will stop invalid data getting into our application code.

Sam: Hold on a moment John. Some people turn Javascript off in their browsers, I don't think we can depend solely on that.

John: Good point. The Javascript is nice as it gives the user instant feedback if they do something wrong, but it sounds like we need more. Perhaps we should implement the business rules in the Java application tier too.

You: Well, that sounds like we are duplicating logic in two places, and timescales are pretty tight here. Lets leave the Javascript for version 2, but performing checks in the application tier sounds like the way to go.

Tom: Hold on a moment guys. Have you heard of Referential Integrity? I was reading a blog article yesterday that claimed best practice is to place the business rules as close to the data as possible, ie in the database. Apparently Database Integrity Constraints are used for this.

You: Interesting, but I heard that Referential Integrity was some slow complex technology from the nineties?

Tom: Not according to this article. It claimed constraints in the database were the only way to go if you want a successful database application.

All: Why?

Tom: Well, lots of reasons actually, let me dig out the article ...

More than one way to do it

Well the team meeting was certainly useful. Tom's arguments along with some searching on the Internet showed there is a lot of confusion out there. The database gurus will say you simply must enforce business rules and data integrity at the database level, while Java programmers often dismiss this as nonsense - doing integrity checks in the database is way too expensive, they say, and it is much easier and more flexible to do it in the application anyway, win win! So who is correct?

The Boss did say this company is currently small, but is destined for great things. That could lead to all sorts of applications accessing the database, not just this web portal. There could be mobile applications, batch load applications, developers logging directly into the database to correct some data, and who knows what else.

You really don't want to implement the same business rules in all these places, who knows what mistakes will be made. There should be a central location that defines the data constraints in one place and the only common part of these applications is the database.

Looks like its time to go and investigate what the database can do, and if it really is as slow as the Java people say, but first lets looks at Sally's schema design to see just want sort of data constraints need enforced.

The schema

Sally's schema design

Lets look at each table in turn.

The Products Table

The first thing noted on the Products table is that no two products can have the same name. In database lingo, we say that Product_Name must be unique, and all modern databases have a way of enforcing uniqueness using a unique index. Anytime you come across a scenario where a column or a combination of columns must be unique, you should be thinking "I need to add a unique index on these columns". Lets see how it works. First create the products table:

ORA11G> create table products (
          product_id       integer,
          product_name     varchar2(255),
          sales_start_date date,
          sales_end_date   date,
          price            number
        );

Now, lets add our first constraint on this table:

ORA11G> create unique index products_uk1 on products (product_name);

Now insert some data and see what happens:

ORA11G> insert into products values (
          1,
          'First Product',
          sysdate,
          null,
          9.99
        );

ORA11G> insert into products values (
          2,
          'Second Product',
          sysdate,
          null,
          19.99
        );

ORA11G> commit;

If all goes to plan, Oracle should have created two rows in Products table. Now try and insert another product with the same name:

ORA11G> insert into products values (
          3,
          'First Product',
          sysdate,
          null,
          9.99
        );

ERROR at line 1:
ORA-00001: unique constraint (PRODUCTS_UK1) violated

Perfect - Oracle noticed that product_name was already in the table and prevented another row with the same name being inserted. Notice that the error message makes it very clear about what has gone wrong.

Watch out for NULL

The second note Sally has made about the Product_Name column is that all products must have a name. Going back to our database lingo, we say that Product_Name cannot be null. Lets try and insert some records into the products table with null (or blank) product names:

ORA11G> insert into products values (
  4,
  null,
  sysdate,
  null,
  9.99
);

ORA11G> insert into products values (
  5,
  null,
  sysdate,
  null,
  9.99
);

Ooops - Oracle quite happily inserted a row into Products with a blank productname - even worse, it actually inserted two rows with same blank ProductName! What is going on, this unique index isn't working at all.

You: Tom, I have been playing with these database constraints you mentioned, and I am confused!

Tom: Well I am no expert, but I will try and help, whats up?

You: Well, I created a unique index and Oracle prevented me from adding two products with the same name. Then I tried inserting a product with no name at all, which should not be allowed, but Oracle inserted it quite happily! Even worse, it let me insert two rows with no name - surely if I have one with no name, the unique constraint should stop me inserting more!?

Tom: Ah ha - you have been caught out by the mysteries of NULL!

You: NULL?

Tom: In database lingo, when a field is set to NULL it is in a special state. Null is not blank like an empty string in Java - null means unknown, that the database doesn't know what is in the field at all.

You: But surely if we have a unique constraint one null is the same as another null and shouldn't be allowed?

Tom: Nope. Always remember NULL is unknown. NULL is never equal to NULL. It's almost like all nulls are different, so your unique constraint is working correctly.

You: Huumm, OK. But how do I stop blank product names getting inserted?

Tom: Have a look at the "not null" constraint

So as Tom said, be careful with NULL fields. They are unknown values, and Oracle will never return true if you issue a query that contains a where clause like:

where null_field_one = null_field_two

NULL never equals NULL.

So what about this NOT NULL constraint Tom mentioned? Lets add one to the Products table:

ORA11G> alter table products modify product_name not null;

ERROR at line 1:
ORA-02296: cannot enable (APP.) - null values found

What happened here? Well, those rows that you added with null product names are in the table, and Oracle has noticed that. Oracle will never change your data to allow it to enable a constraint - it is up to you to fix it:

ORA11G> delete from products
        where product_name is null;

ORA11G> commit;

ORA11G> alter table products modify product_name not null;

Perfect, now Oracle should prevent blank product names, and duplicate product names getting into the products table - Try inserting some invalid data and see what happens.

Checking your values

The next point of note in the products table is that all products must have a price, and that price should always be greater than zero. You already know how to handle the first part:

ORA11G> alter table products modify price not null;

What about ensuring the price is always greater than zero? Time to speak to Tom once again ...

You: Tom, the unique and not null constraints are great! With hardly any code at all, I can keep my data nice and clean. However another thing has come up I don't think the database can handle.

Tom: You will be surprised what Oracle can do, its pretty impressive!

You: Well, Sally says that all products must have a price greater than zero. Surely Oracle cannot check the values going into a column?

Tom: Oh yes it can! You want to look into Check Constraints

You: Interesting ...

According to Tom, Oracle has another type of constraint called check constraints. They are used to check values being inserted into a column meet certain conditions or contain certain values. Lets see how to create one to meet our price requirement:

ORA11G> alter table products add constraint products_ck1
  check (price > 0);

Now try and insert a price with value less than zero:

ORA11G> insert into products values (
          6,
         'Product with negative price',
         sysdate,
         null,
         -9.99
       );

ERROR at line 1:
ORA-02290: check constraint (APP.PRODUCTS_CK1) violated

Excellent! Again notice how clear the error message is.

The final note Sally has on the Products table, states that the sale start date should always be less than the sale end date. This can also be enforced with a check constraint - they have the ability to look at more than one column in the same row:

ORA11G> alter table products add constraint products_ck2
          check (sales_start_date < sales_end_date);

What is a little surprising, but maybe not after Tom's warning on NULLs, is that the check constraint still passes even if sales_end_date is null. Try inserting a new product where the sales_end_date is before the sales_start_date.

What we have learned

That is the products table taken care of lets see what we have learned:

  • To enforce uniqueness, use a unique index on the column or columns
  • To prevent NULL (blank) values being inserted, use a not null constraint.
  • To check the values of a column meet some conditions, use a check constraint, even if it is comparing two or more columns in the same row.
  • Be careful when dealing with NULLs - many bugs can arise because of them, so its worth giving them some extra attention!
blog comments powered by Disqus