Note: This blog post is from 2008. Some content may be outdated--though not necessarily. Same with links and subsequent comments from myself or others. Corrections are welcome, in the comments. And I may revise the content as necessary.
Some of the most common questions people ask as they first consider using Derby (the embedded database in CF8) are the following: is it a development-only DB? Does it perform and scale well? Isn't it only a single-user DBMS? Is it the same database embedded in AIR? There's a lot of misinformation out there, and there are surprisingly positive answers to all these potential criticisms, which I address below.
I'll also address these points and more in my talk at Adobe Max, in my session "Using Apache Derby, the Open Source Database Embedded in ColdFusion 8", on Tuesday, November 18, from 4:30 pm - 5:30 pm. If you've not yet booked that slot, or are on the fence, I think you'll be VERY surprised to hear all that Derby can do. I'll share some tidbits over coming days.
Is it a development-only DB? How does it perform and scale?
No, it's not just a development DB. First, yes, of course you can build production applications with it (though redistribution would be according to the Apache license). But as for its performance capability, there's in fact a PDF of a presentation comparing Derby to MySQL and others. Still another is available from IBM.
I've heard that Derby is a single-user DBMS
That's a common misconception, and some of it stems from the fact that it can run on its own, embedded in another Java application or server (like ColdFusion), or it can operate using another feature called the Derby Network Server.
Let's look first at the simpler embedded form: Derby itself provides no communications abilities of its own, which helps keep it lightweight. As such, it can't be communicated with from outside applications; it can accept requests only from within that same JVM process. Does that mean it's single-user? Of course not. CF (like most web app servers) is multi-user, and IT handles that multi-user processing. Derby itself has no problem processing multiple requests within the CF process. It just can't receive them from outside of CF.
So why might a CF user care about the Network Server feature? Well, if you wanted to talk to Derby from another Java process, then you need it. You may think this applies only to other Java applications on the server or clients, but here's one that may trip you up: an IDE. The ij tool (mentioned later), for example, can't talk to your embedded database without enabling the Network Server.
That latter point, and indeed more on this whole question of "is it a multi-user database or not?", is covered in Chapter 1 of an available Derby book I'll discuss in a later entry, and that chapter is available online (and in PDF form). The discussion of interest here is on the next-to-last page. I'd like to quote a bit:
When developers refer to Apache Derby as an embeddable database, they are referring to the fact that the Apache Derby database runs within a JVM process. Without the Apache Derby network server, there would be no networking services, data access outside of the embedded JDBC driver in the database engine, or other infrastructure requirements; this accounts for its small footprint.
Understanding what the embedded concept entails is critical when developing applications. For example, one common misconception that developers have when they work with Apache Derby as a standalone database is that it's only a single-user database and does not have communication capabilities. They believe that it is a single-user, single-connection, single-threaded system and develop their applications accordingly. This is not true. Apache Derby as a standalone database can support as many connections as desired, so long as they are established from the same JVM hosting the Apache Derby engine.
For an Apache Derby database to be accessed from a process that resides outside the hosting JVM that loaded the Apache Derby database initially (even if the JVM process resides on the same server), you need to load the Apache Derby network server. Read that last sentence twice to ensure you understand it because it is often a source of confusion for Apache Derby developers when multiple JVMs reside on the same machine. The Apache Derby network server allows for communications between JVM processes. This means that this communication infrastructure isn't solely required to communicate between machines; it is needed even if two different JVM processes reside on the same machine and want to talk to the same database.
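To make the embedded-versus-Network-Server distinction concrete, here is a minimal Java sketch of the two JDBC connection URL shapes Derby uses. The class name `DerbyUrls`, its method names, and the database name `mydb` are hypothetical, chosen only for illustration; the URL prefixes and the default Network Server port of 1527 are Derby's own. The sketch only builds strings, so it runs without Derby on the classpath.

```java
// Illustrative sketch only: DerbyUrls and its methods are hypothetical names,
// not part of Derby or ColdFusion.
final class DerbyUrls {
    // Port the Derby Network Server listens on unless configured otherwise.
    public static final int DEFAULT_NETWORK_PORT = 1527;

    // Embedded form: no host or port, because the engine runs inside the
    // caller's own JVM (for CF users, the ColdFusion process).
    public static String embeddedUrl(String dbName) {
        return "jdbc:derby:" + dbName;
    }

    // Client form: host and port point at a Derby Network Server, which is
    // required for any other JVM, even one on the same machine.
    public static String networkUrl(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName;
    }

    public static void main(String[] args) {
        // "mydb" is a placeholder database name.
        System.out.println(embeddedUrl("mydb"));
        System.out.println(networkUrl("localhost", DEFAULT_NETWORK_PORT, "mydb"));
    }
}
```

With the Derby jars available, either string could be passed to `java.sql.DriverManager.getConnection`. Code inside the hosting JVM would use the first form and can open as many connections as it likes, while a separate JVM (a query tool, say) would need the second form and a running Network Server.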
Is this the same database engine that's embedded in Adobe AIR?
No, that's yet another open source (indeed, public domain) DB, called SQLite (not a typo: it's spelled with one "L"). One may ask why the different Adobe teams chose different open-source embedded DBMSs. It could be that the different groups weren't aware of each other's decision and simply chose what seemed best for them. (Is Derby embedded in CF the same way SQLite is in AIR? Well, that seems a matter of semantics. Yes, Derby is embedded in CF. The full DBMS is there. Nothing to install. Yes, you have to create a DSN (more later), but that's it.)
Could Derby have been used for AIR? Sure. Could SQLite have been used as the embedded DB for CF? It seems so. The two are very similar in being small, embedded, yet highly functional multiuser database engines.
There's a comparison of the two on the SQLite site, though obviously it has a SQLite perspective. For instance, it says "Derby only allows a single process to have the database open at a time in its embedded mode. However, Derby also offers a full client/server mode." That's easy to misread as "single-user," which is the common misconception discussed above: the limit in embedded mode is one JVM process, not one user or one connection.
Sign up for my Derby talk at Max
Again, I'll have lots more to say on Derby. But I realize it can be hard to keep up on any blog entry or series of entries. If I've piqued your interest at all, or if you want to show your support for CF (and Derby) talks at Max, consider signing up for my talk. I'll have more to say and show there, for sure. Also, if you have colleagues who maybe don't read blogs, let them know that Derby is a lot more than they may have been led to believe, and have them sign up. The Max folks use signups to measure interest in topics. This may be one that slipped under the radar of many considering talks. I'm just trying to help promote it, while sharing more Derby goodness along the way.