Getting Started with the Apache Derby Open Source Database in CF8
If you've not noticed, one of the many new features in CF8 is the Apache Derby database. Derby is a full-featured database with a 10-year heritage (formerly known as Cloudscape, and released to open source by IBM, who had acquired it from Informix, and also now distributed by Sun as JavaDB.)
For those not familiar with the Apache family of open-source projects, don't confuse this with the web server of the same name (which is just one of many, many open source projects hosted on the apache.org site.)
I'm sure as CF folks start exploring Apache Derby they'll want to know more. I've pulled together this page to serve as a resource to help others. If you'd like to make any additions or corrections, let me know.
Derby is fully transactional, secure, easy-to-use, standards-based — SQL, JDBC API, and Java EE — yet small, only 2MB. It has the advanced features you'd expect in any quality DBMS: stored procedures, triggers, views, referential integrity constraints (including cascading deletes), multi-user capable, cost-based query optimization, transactions, savepoints, schemas, encryption, and much more. It can even run entirely read-only off of a CD.
The Apache Derby project has a strong and growing community that includes developers from large companies such as Sun Microsystems and IBM as well as individual contributors.
One of my main motivations is simply to get people to start talking about the possibilities that it opens, being embedded within CF. I allude to a couple possibilities in the "why use Derby" question below. In fact, I've added a new section to the bottom to track any clever ideas others relate on how they're using Derby within CF.
Here are the topics I've started to cover in this resource:
- General Resources for Learning More about Derby
- Is it a pure development only DB?
- Is this the same database engine that's embedded in Adobe Air?
- I've heard that Derby is a single-user DBMS
- How can we use it/enable it in ColdFusion?
- Are there any IDE or query tools for talking to Derby?
- Why use Derby database verses using something like MySQL?
- Where can I find a reference for supported SQL, etc.?
- Is there a way to automate creation of a new Derby Database in CF?
- Where else can I learn more from other CFers using Derby?
- What are some of the clever ways others are using Derby in CF8?
General Resources for Learning More about Derby
Here are some resources to learn more about Derby:- Derby is an Apache project. Here's the "quick start" page for it at apache.org (of course, there are many more useful resources there)
- It used to be known as IBM Cloudscape, and there's an entire Cloudscape sub-site at ibm.com as well with tremendous resources
- Indeed, there's a book that recognizes both heritages:
- The engine is also distributed by Sun as JavaDB and there are resources there, including a useful FAQ.
- There's also a Wikipedia site
And of course, as you use Derby with CF (whether the "embedded" or "client" version), please do chime in with your experiences on the .
Is it a development-only DB?
No, it's not just a development DB. First, yes, of course you can build production applications with it (though redistribution would be according to the Apache license). But as for its performance capability, there's in fact a PDF of a presentation comparing Derby to MySQL and others. Still another is available from IBM.Is this the same database engine that's embedded in Adobe Air?
No, that's yet another open source (indeed, public domain) DB, called SQLite (not a typo: it's spelled with one "L"). One may ask why the different Adobe teams chose different open-source embedded DBMSs. It could be that the different groups weren't aware of each other's decision and simply chose what seemed best for them. (Is Derby embedded in CF the same way SQLite is in Air? Well, that seems semantics. Yes, Derby is embedded in CF. The full DBMS is there. Nothing to install. Yes, you have to create a DSN (more later), but that's it.)Could Derby have been used for Air? Sure. Could SQLite have been used as the embedded DB for CF? It seems so. The two are very similar in being small, embedded, yet highly functional multiuser database engines.
There's a comparison of the two on the SQLite site, though obviously it has a SQLite perspective. For instance, it says "Derby only allows a single process to have the database open at a time in its embedded mode. However, Derby also offers a full client/server mode." That's a common misconception, which I'll address in the next question.
I've heard that Derby is a single-user DBMS
That's a common misconception, and some of it stems from the fact that it can run on its own, embedded in another Java application or server (like ColdFusion), or it can operate using another feature called the Derby Network Server.Let's look first at the simpler embedded form: Derby itself provides no communications abilities of its own, which helps keep it lightweight. As such, it can't be communicated with from outside applications, indeed it can accept requests from within that same JVM process. Does that mean it's single-user? Of course not. CF (like most web app servers) is multi-user, and IT handles that multi-user processing. Derby itself has no problem processing multiple requests within the CF process. It just can't receive them from outside of CF.
So why might a CF user care about the Network Server feature? Well, if you wanted to talk to Derby from another Java process, then you need it. You may think this applies only to other Java applications on the server or clients, but here's one that may trip you up: an IDE. The ij tool (mentioned later), for example, can't talk to your embedded database without enabling the Network Server.
That latter point, and indeed more on this whole question of "is it a multi-user database or not?" is covered in Chapter 1 of the Derby book I'd mentioned, and that chapter is available online (and in pdf form). The discussion of interest here is on the next to last page. I'd like to quote a bit:
When developers refer to Apache Derby as an embeddable database, they are referring to the fact that the Apache Derby database runs within a JVM process. Without the Apache Derby network server, there would be no networking services, data access outside of the embedded JDBC driver in the database engine, or other infrastructure requirements; this accounts for its small footprint.
Understanding what the embedded concept entails is critical when developing applications. For example, one common misconception that developers have when they work with Apache Derby as a standalone database is that it’s only a single-user database and does not have communication capabilities. They believe that it is a single-user, single-connection, single-threaded system and develop their applications accordingly. This is not true. Apache Derby as a standalone database can support as many connections as desired, so long as they are established from the same JVM hosting the Apache Derby engine.
For an Apache Derby database to be accessed from a process that resides outside the hosting JVM that loaded the Apache Derby database initially (even if the JVM process resides on the same server), you need to load the Apache Derby network server. Read that last sentence twice to ensure you understand it because it is often a source of confusion for Apache Derby developers when multiple JVMs reside on the same machine. The Apache Derby network server allows for communications between JVM processes. This means that this communication infrastructure isn’t solely required to communicate between machines; it is needed even if two different JVM processes reside on the same machine and want to talk to the same database.
How can we use it/enable it in ColdFusion?
It's built into CF8, which means both that it shows up as an option in the drivers list when adding a new datasource (in the CF Admin), and also the database engine itself if embedded within CF8. Nothing to install. Just start using it. In fact, the CF8 example DBs are now provided in Derby, no longer Access.Are there any IDE or Query tools for talking to Derby?
As you consider this question (or consider the answers of others) be careful to recognize that people may mean different things. First, someone may mean they want to access Derby from within a favored IDE they already use (like Eclipse), while others may not care and will be happy to use a separately downloaded tool. Some will want to use such a downloaded tool that they may already have, but want to know how to use THAT tool with Derby, while others will be happy to use any tool that helps solve the problem, even if new to them.Finally, and perhaps most important, recognize that when people speak of IDEs for a database, they may have very different needs in mind. Different tools will meet these needs at different levels of success, if at all. Are you lookint to be able:
- to easily see the databases in a server, and the available tables and columns in a database, and so on?
- to see the data in a given table?
- to be able to enter SQL statements manually and see the results of their executions?
- visually build SQL query and update (DML, or data manipulation language) statements? perhaps with the tool showing available relationships between tables so as to build needed join clauses automatically?
- generate CFC code from a table, to perform CRUD (create, read, update, delete) operations, that you can call from any code?
- build complete HTML or Flex interfaces to perform CRUD operations?
- visually build a new databases, tables, columns, relationships, etc (generating DDL, or data definition language, statements)?
Of course you can connect to it using any tool that supports CF's RDS (Eclipse with the Adobe CF 8 extensions for Eclipse, Dreamweaver, HomeSite+/CF Studio). All 3 have offer features to view databases and their tables and columns, view data, execute and generate SQL DML. None offer visual tools to create new tables, columns, etc (DDL). Both the Adobe extensions for Eclipse and Dreamweaver have features to generate CFCs from tables.
If you use Eclipse, you have several choices:
- the RDS Dataview in the Adobe CF8 extensions for Eclipse just mentioned
- a specific Eclipse Derby plugin
- The Eclipse Data tools Platform (DTP) project, which was also discussed in 2 blog entries by CFer Todd Sharp The DTP project does add the means to create new tables and columns (DDL), but I myself have had difficulty using it yet with Derby. (I can't see Derby in the list of available drivers.)
Finally, any tool that supports JDBC will work. For instance, AquaData Studio doesn't list Derby directly but they do list support for "Generic JDBC Platform", and I've even found a couple of references to it perhaps supporting Derby. I list several other query tools at a new resource page I have on Query tools.
There is also the DataMgr tool, which now has support for Derby.
I should add, finally, that there are also tools that are not specifically IDEs or Query tools, but are code generators that work with CF and databases, such as those I list in my Resource page on Code generators.
Why use Derby database verses using something like MySQL?
Well, let me remind you of the resource I mentioned above comparing Derby to MySQL.I think the primary benefit would be for those developing (or using) a CF app built to use Derby, where the developer can know that the DB will be there (as long as on CF8), and on any platform. Same for the users: they don't need to "install" a database. Sure, for many of us, this is trivial, and we like the freedom to choose among available DBMSs, but anyone developing packaged apps can take advantage of this.
Similarly, the CF8 example apps are distributed using it. That was Adobe taking advantage of the same benefit. I'm sure, over time, there will be still more uses.
For instance, I hope that Adobe would change the default client storage from the Registry to a Derby DB. Certainly shops could choose it manually, in the meantime (though, sure, if they already have another DBMS installed they may prefer to use that for familiarity.) And clearly some shops would prefer anyway to create a different DSN per app for such a client var repository. I'm just saying it could be a matter of picking an app-specific one presumed to be Derby (without need for the user to create a new Derby database and DSN.) Just a thought.
There may be many other useful benefits from our having a DBMS embedded in CF, not the least of which is for those who are building apps to be deployed elsewhere. This fits very nicely with the WAR/EAR deployment mechanism of CF Enterprise (though that can just as well use any DBMS, of course.)
Beyond that, as for whether and why to use Derby over other DBMSs, I'd leave that debate for others to make. Certainly the Derby site will have many such articles making the case, and a google search for derby and mysql will turn up more. Just to clarify, I didn't create this page so much to evangelize the DB and convince people to use it, so much as just to create a place share resources with those exploring it now that it's included in CF8.
Where can I find a reference for supported SQL, etc.?
Both the Apache and IBM sites have complete reference documentation. See the Apache site's version of the SQL statement reference, Support for SQL-92 features, and Table of Contents.The equivalents at the IBM site: SQL statement referfence, Support for SQL-92 features, and Table of Contents.
Each of these resources have lots more documentation than what I've highlighted. See the tables of contents for more.
There's also a mapping of various levels of SQL-99 and SQL-2003 features mapped to Derby. This is part of a larger Derby wiki with lots of useful info.
Is there a way to automate creation of a new Derby Database in CF?
Well, do you really mean creating a database or a DSN? I'll assume you really mean the former (in addition to the latter). Both can be done via CFML. Since CF7, we've had the Admin API which includes methods for creating DSNs, and which could therefore pass in the attribute needed to create a new database while also creating a DSN. See the CF docs which offer examples of using the Admin API to create a datasource, and you can apply the same technique to a Derby DSN. Beyond that, though, Sam Farmer has released DerbyCFC, to assist in listing, creating Derby databases in CF.Where else can I learn more from other CFers using Derby?
There was an article in one of the last issues of the CFDJ before it became defunct: Working with the Apache Derby Database and ColdFusion, though note that it came out before the final release of CF8, so some of the problems are solved (like the hassle of creating new databases, now as easy as a checkbox in the DSN setup). There have also been a few CF/Derby-oriented blog entries (some of which also were before the final release):- Creating Derby embedded databases in ColdFusion 8
- Exploring Your DB With Eclipse 101 (Including Derby DBs!)
- Exploring Your DB With Eclipse 101 - Part 2
- Using the Derby Database
- Coldfusion 8: Creating a Derby database
- More On Scorpio And Apache Derby
Finally let me point out that I've started an Derby community on the ColdFusionCommunity.org site. There you'll find people discussing Apache Derby from a CFML perspective. I'll note, however, that I've taken all the discussions that preceded December 2007 and turned them into this blog entry.
What are some of the clever ways others are using Derby in CF8?
I shared above a couple of ideas of how I can foresee folks using the embedded Derby database, but what about you? What clever ways are you (or might you consider) using it?- As a client variable repository (Charlie Arehart)
- To create a packaged CF app in a WAR/EAR file, whereby no DBMS needs to be installed in the destination server (Charlie Arehart)
- (your name here)







