Recording of my Adobe eseminar session, "Monitoring #ColdFusion with FusionReactor"

After my barrage Friday of four entries on the CF Server Monitor, here's something instead on FusionReactor. Some may know that last week I did a talk on the Adobe ColdFusion eseminar series, "Monitoring ColdFusion with FusionReactor". I got word today that the recording link has been posted.

You can find the recording here. Note that you need to log in with an Adobe ID, just like when you download Adobe software or participate in their forums. (I have no control over that.)

Since that link just goes right to the recording, here is the description I'd used for the session, to help you decide if the recording may interest you. BTW, I clarify in the session that FR is useful for more than just ColdFusion, in that FusionReactor can be used for Railo, BlueDragon, and OpenBlueDragon, as well as in fact any Java server (Tomcat, JBoss, Jetty, Glassfish, WebSphere, etc.), and the session applies just as well to folks using those.

My session: Monitoring ColdFusion with FusionReactor

Recording
Session Description:
If your CF server starts acting up, how do you go about resolving problems? If you're on ColdFusion 8 or 9 Enterprise, you may know that you have a built-in ColdFusion Server Monitor. Did you know there is an alternative tool that supplements it well? FusionReactor is a commercial third-party tool, which can monitor not only any version of CF (6, 7, 8 or 9, whether Standard or Enterprise) but also LiveCycle and any other Java web application or server in your environment.

Such monitoring is about more than "watching a screen". You can arrange to receive email alerts with valuable information (sort of a black box recording before a crash), and FusionReactor also creates really valuable logs that can help with post-mortem analysis. They can also assist with deciding on CF server configuration settings, watching trends for hardware upgrades, and more. And as of FusionReactor 4, these logs now track information that previously only the CF Server Monitor displayed (but didn't log at all). Finally, an additional tool, FusionAnalytics, can help analyze and visualize that data over minutes, hours, days, weeks, months, and so on.

In this 50-minute session, veteran CF troubleshooter and independent consultant Charlie Arehart will introduce and demonstrate these and other key features of FusionReactor (including stack tracing and crash protection), and will end with a brief demo of FusionAnalytics.

See also what other Adobe ColdFusion talks are coming, or recordings of past talks

In fact, you can see that description posted now as one of what they call "on demand" eseminars, which is simply a list of all the CF eseminars that have been recorded.

Note that you can change the filter on the right to see eseminars for still other Adobe products. (Sadly, there's no way I can give a link directly to my preso on that page, or I'd have done that and not offered the description above. And while today it's the first in the list, in time that will no longer be the case.)

Similarly, you can find out about all upcoming CF Adobe eseminars, or use the same filter feature there to see those for other products.

Session feedback welcomed

Finally, I'd welcome feedback from anyone who did or does view the talk I gave. They offer no such mechanism. Comments here are fine, and I don't mean only any favorable ones. :-)

PS If anyone wonders why I've taken to using the #coldfusion hashtag in my blog entry titles, it's simply that many of the CF blog aggregating tools out there now retweet what gets posted, and so using the tag here increases the chance that CFers may notice the new entry if watching for that hashtag, even if they don't watch for blog entries from a specific person. It may look a little uglier, but I hope it's helpful for all concerned in the long run.

CF911: Want to monitor #ColdFusion "out of process" (from outside the instance itself)? Many ways.

I just blogged about how the hidden gem "enable monitoring server" option in CF 9.0.1 does NOT cause the CF Server Monitor to somehow magically run "out of process". See more on that.

Yet people will reasonably want to be able to have some mechanism that "watches" CF "from the outside", to know when it's gone down. How can you do that? That's what I'll point out in this entry.

And beyond talking about what goes along with the CF Enterprise Server Monitor, I'll also point out options for those who are NOT running CF 8, 9, or 10 Enterprise and therefore do not have the Enterprise Server Monitor. This also includes those on CF 6 or 7. There are solutions for you, and also for those running Railo, BD, or indeed any Java server. More on all that in a moment.

This is part 4 of an unexpected series of entries today on the CF Enterprise Server Monitor. :-) I got on a roll, and each seemed deserving of its own topic. See the "Related Blog Entries" below this entry for links to those.

What the CF Server Monitor is, and is not

To be clear, the CF Enterprise Server Monitor (and indeed, FusionReactor and SeeFusion in their basic configuration) is "just" a web interface (Flex-based) that talks to an embedded flex gateway component running within the CF instance/address space to get information about how the instance is doing. If that instance goes down, then the web interface will have nothing to talk to any more, and the "monitor" will no longer be of value.

But there are alternatives to watch the instance (indeed, multiple instances) from the outside

First: The CF Enterprise Multiserver monitor

First up, let me talk about what's built into CF 8/9/10 Enterprise, to go along with the CF Server Monitor. Many never notice it, or they misunderstand it: the Multiserver Monitor. This is different from (but closely related to) the Server Monitor.

It is a single Flex-based interface that watches whatever other CF Enterprise server monitor instances you tell it to watch. It does run "out of process" in that it runs on your desktop (as a Flex-based web page) and can show you if a monitored server has gone down or become unresponsive. It also shows a few key stats about each monitored instance in the one interface, while such monitored instances are up. If you want to see the full details of what's going on, you then can ask it to open the full CF Server Monitor (as another web page).

I talk about it more in an article I did back in 2008, part 4 of my 4-part series of articles on the Server Monitor.

But briefly, it's launched from the same "Server Monitor" page in the CF Admin. I suspect that many mistakenly assume that it's only for monitoring "instances in a multiserver deployment of CF". It can do that, but it can be configured to watch any other CF Enterprise instance (whether on the same machine or on another, and whether that's running in a Server, Multiserver, or indeed J2EE form of deployment.)

For more on how to configure the Multiserver Monitor itself, as well as how to configure a server to allow you to "watch" it (you need to tweak an XML file on the server to be monitored), and more, see that article above.

I should note, as well, that CF 9 (and 10) now offer a Server Manager, also launched from the same "server monitor" page in the CF Admin, and it does much the same as the Multiserver Monitor, and a little more (as well as a little less). You may want to look at both to decide for yourself. See the CF documentation for more on the Server Manager.

What if you don't have CF 8/9/10 Enterprise?

Since the CF Server Monitor and Multiserver Monitor come only with CF 8/9/10 Enterprise (and the MS Monitor can only monitor such an instance), what do you do if you're on CF Standard, or CF 6 or 7? Or indeed Railo, BlueDragon, or Open BlueDragon? There are solutions for you.

First up, as I talked about in an entry back in 2007, "CF8 monitor doesn't run on CF8 Standard, or any 6 or 7. What to do?", there are indeed alternative CF Server monitors, FusionReactor and SeeFusion, which both work on CF Standard as well as CF Enterprise.

FusionReactor even works with Railo, BD, OpenBD, or indeed any Java server. And by that last point, that means such things as LiveCycle and Flex Data Services, as well as generic tools that run on Java like JIRA and Confluence. It also means the Solr Server that comes with CF 9 and 10!

And of course both FR and SF also work just fine on a CF 8/9/10 Enterprise server, even if you have the CF Server Monitor running. They can all run with very little overhead, though of course see the entry I did back in 2007 on potential overhead concerns with the CF Server Monitor.

FusionReactor and SeeFusion each also offer an Enterprise Dashboard feature

But the point for this entry is that FusionReactor and SeeFusion each offer an Enterprise Dashboard feature, which provides a single interface to watch multiple monitored instances (on one or many machines), similar to the CF Enterprise Multiserver monitor.

You do need to have FusionReactor installed on any instance to be watched by its dashboard, and you need SeeFusion installed on any instance to be watched by its dashboard. And you do need the Enterprise edition of each for them to be watched by their respective Enterprise dashboards. But to be clear, the CF license does not need to be Enterprise (and again, FR can watch Railo, BD, and other Java servers.)

There is a page on the FR site devoted to FusionReactor Enterprise Dashboard. There's no specific page on the SeeFusion site about its Enterprise dashboard.

What if you're not sitting there in front of these multi-instance dashboards?

A reasonable question might be: all these dashboards to watch multiple servers at once are nice, but won't they fail to help if you're not looking at them when things go amiss? And that's true.

Now, before I talk more about these multi-instance dashboards, let me clarify that all 3 CF monitoring tools have means to notify you by email when an alert condition arises, as detected within the running instance. In the CF Server Monitor, they're called "Alerts" (and I discuss them in part 3 of my 4-part article series.) In FusionReactor, they're called "crash protection notifications", and that's discussed more here and here. And in SeeFusion, they are called Active Monitoring Rules (there's no page dedicated to them that I can point you to.) These alerts are available in the Standard edition of both FusionReactor and SeeFusion.

But if those individual instances become unresponsive or crash, then those alerting mechanisms will not notify you, either. (That said, they may well alert you when you can't get into the monitors, but the instance is still running. They can also alert you with info just before the instance goes down, and may have valuable info. So these alerts are indeed very worthwhile. I should write more about them separately some day.)

Back to the point of the multi-instance dashboards, where the one interface is watching the others, that's a different story.

The CF Enterprise Multiserver Monitor is just an interface: it has no means to email you that a monitored server has gone down. Neither does SeeFusion's dashboard, as far as I can tell.

FusionReactor offers several enhancements in its Enterprise Dashboard

But FusionReactor's Enterprise Dashboard has several enhancements over the others. I didn't set out in this entry to make this point. It just is worth noting.

First, it can in fact be configured to email you when a monitored instance becomes unresponsive (goes down) and comes back up. That can go to one or more people, and of course it could go to an address that could trigger an SMS or other text message. (More on mobile device notification in a moment.)

Second, the FR Enterprise Dashboard configuration page can also be configured to run a script when a monitored server goes down or comes back up. For more on that, see "Using FusionReactor Enterprise Scripting" in the FR documentation.

Third, FusionReactor offers an available AIR version of its dashboard, which can pop up an alert from your desktop status tray when a monitored server has problems.

Fourth and finally, and a delight to many I'm sure, there are available mobile app versions of the FusionReactor Enterprise Dashboard (iOS and Android).

So, whether you're coding away or playing Angry Birds, and whether you want a mobile interface, a text message, or good ol' email, you can get notification about your monitored servers.

So you can, indeed, monitor CF from "outside the process"

Again, the main point of this entry was to show how in fact you can monitor CF (and other CFML engines) from "outside the process". Those on CF 8-10 Enterprise have the MultiServer monitor and Server manager. Those on CF 6-10 Enterprise or Standard have FusionReactor and SeeFusion and their Enterprise Dashboards. FusionReactor also supports other CFML engines and indeed any Java server/app (whether those you have associated with CF, including Solr, LiveCycle, FDS, and more, or really any Java server like Tomcat, JBoss, Glassfish, WebSphere, etc.)

And all 3 tools offer email-based alerts from within the process, which certainly has its own value.

Still other non-CF solutions

Finally, someone may want to ensure that I point out that there are, of course, many other monitoring tools that can "watch a CF server" from "outside the process". I did focus here just on those that are devoted to CF itself. But there are many different kinds of other tools you may want to consider.

For instance, there are free (and paid) tools and services which will send requests to your server on a periodic basis and will then email you when they are down (and may track that in an interface). I talk about such tools in a category of my CF411 site: Web Site Uptime Monitoring Tools. Some may recognize some prominent ones like Pingdom, Watchmouse. There are many more I list.
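The kind of check such services perform can be sketched in a few lines. This is only a minimal illustration in Python (standard library only), not how any of the named services actually work: the function names and the choice to treat any status below 400 as "up" are my own assumptions, and a real service adds scheduling, retries, and the actual email alerting.

```python
import urllib.error
import urllib.request


def is_up(url, timeout_seconds=5):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors
        # (urllib raises HTTPError, a URLError subclass, for 4xx/5xx).
        return False


def status_changes(samples):
    """Yield (index, 'down' or 'up') each time the state flips between checks."""
    previous = None
    for index, up in enumerate(samples):
        if previous is not None and up != previous:
            yield index, "up" if up else "down"
        previous = up
```

A scheduler would call is_up() every minute or so against a page on the monitored server, and would send a notification only when status_changes() reports a flip, so you hear about the outage and the recovery rather than getting a message on every check.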

Second would be the category of system monitoring tools, like Nagios, ServerDensity, Spiceworks, ManageEngine, and others. I have a category for such tools, System Monitoring Tools, again listing many more, free and commercial.

Third would be Java Application Monitoring Tools like NewRelic, JaMon, Glassbox, Orion, and others. And I have another separate category for monitoring a JVM itself, but again the focus here is "out of process" monitoring.

Indeed, I have several other categories of monitoring tools for other aspects of a web environment. Check out the top-level category that those and the others above are in, Monitoring Tools/Services, to include also database monitoring tools, other (in-process) ColdFusion monitoring tools, SAN/NAS monitoring tools, web analytics tools, and more.

Bottom line: there are many ways to watch a CF instance from "outside the process". Go nuts, kids!

And let me know what you think.

Speaking next week at CFCamp in Germany, 3 topics: Zeus, FusionAnalytics, FusionReactor 4

Just wanted to share, for any who may be interested to hear, that I will be speaking next week at a new conference called CFCamp, being held in Munich, Germany, on Friday Oct 28 (and now sold out).

At the event, I'll be giving 3 talks. Well, two are sessions in the one-track conference, and one is a day-long class the day before.

The two session topics will be:

  • What's Next In Zeus, aka CF10
  • Continuously improve CF code quality, server availability & application stability

The descriptions for each of those are on that page for the conference program. As you'll also note there, the other speakers are Mark Drew, Gert Franz, Gary Gilbert, Luis Majano, and Bilal Soylu.

The day-long class I'll be doing (separately purchased, and nearly sold out) is:

If you haven't heard, both FusionReactor 4 and FusionAnalytics have been released in recent weeks. They're powerful tools that I help people use all the time in my independent CF troubleshooting consulting. If you haven't checked them out yet, do. And note the availability of both a live demo (nothing to download and install) and a free 10-day demo for each.

See you in Munich, or in the future

If you may be in the area and interested in attending, see that page (top right) for more on registering.

I'll note that I will likely give both the talks in other venues and formats (whether in-person or over the web) in the future. If you may be interested, let me know.

And if you'll be in Germany next week, I hope to see you there. (Sadly, my wife didn't get to come this time.)

Thanks to all the sponsors for helping make the event happen, for me and for all who will be attending.

I'm speaking this evening on the Adobe CF Developer Week webinars: mine on CF Server Monitor

Hey folks, just a heads up (for those who may not have seen all the tweets and list messages) that this week is the Adobe CF Developer Week series of free webinars.

Update, Recording: Note that this session was recorded. You can view it here, but note that you must log in with an Adobe ID to see it.

And I'm presenting a session tonight, Tuesday September 13, at 7pm Eastern, on "Understanding and Using the ColdFusion Server Monitor".

As many of you know, I'm pretty much a fanatic about the monitor, especially about truly understanding elements of it that many miss. And so my talk will not be just a dog and pony show: I will talk about practical experiences with it, in a way that should suit both those new to it and those experienced with it.

Note that the times for all these devweek sessions are shown (on the Adobe site) as being Pacific time, so again mine is at 7pm, not 4pm, Eastern.

And yes, the sessions are being recorded and seem to be made available the next day.

Finally, beware that there is no one URL you can use to join in on all the Connect sessions, nor can you get the Connect session URL by going to the event page (via the first link above). Instead, you must register for each event (free) from that first page, to get each session's Connect URL--and you'll want to do that at least several minutes in advance of any session to have time to register, get the email, log in, etc.

See you then.

PS Hey, while we're talking monitoring, note as well that if you've not heard, FusionReactor has come out with its new release 4, which has lots of great additions, especially FREC (or the FR Extensions for CF) which cause FR to grab and log lots of great info that the CF Server Monitor only shows and never logs. I'll be blogging about FR 4 soon, but there's plenty to see on their site. And FusionAnalytics is also just about to release, really!

I won't be discussing these at this talk, focused solely on the server monitor, but as I always tell folks, each tool has its use and often a single shop can benefit from having both (like I do, as do many of the clients I help with troubleshooting). You can find more from me about FR here in my blog. And I'll have lots more to say about FA and FR4 soon.

CF911: Lies, Damned Lies, and CF Request Timeouts...What You May Not Realize

How often have you seen (or seen others complain of getting) an error from ColdFusion such as:

The request has exceeded the allowable time limit Tag: cfoutput

Do you know what this means? It's usually not what you think. I've even seen experienced CF developers who get thrown by this challenge. In this entry I'll try to help explain a very common problem and correct some misconceptions. I'll even contend that this info is often useless and indeed misleading (and therefore the feature producing it ought not be relied upon, and should even be turned off). Along the way, I'll share some things that I've not seen documented elsewhere.

Strap on your seatbelts. We're going for a bit of a ride (if it was easy and could be understood in the length of a tweet, then perhaps everyone would already understand it!) As always, I welcome feedback.

What the error usually does NOT mean, though most assume it

People are often mystified: "Why in the heck would a CFOUTPUT take a long time?"

Or perhaps they're a little more savvy as to what's happening, and they assume, "No, it's just that the CF timeout time was reached when it got to the CFOUTPUT". That could be.

Sadly though, in most cases, neither is what has happened. CF is usually NOT reporting that "here is where the app timed out".

What the error usually DOES mean--the surprise

"OK, smarty-pants. What does the error really mean? Are you saying that CF is lying to me?" Well, often, yes, I'm afraid so, but it's not something nefarious.

Rather, what's more typically the explanation is that some previous activity in the page/request, such as a CFQUERY, CFHTTP, invocation of a web service, or the like is what really took a "long time".

If it was this which caused the request to exceed the timeout (either as defined in the CF Admin Settings page, or using CFSETTING RequestTimeout, or a Timeout attribute on a tag), you'd of course expect CF to report it then and there. The problem is that, often, it cannot report it "right then". And it's not its fault.

There are some operations CF/the JVM cannot interrupt

The problem is that CF (and the JVM) cannot interrupt a request while it's processing what's called a "native method". That is quite typically the mode that a request is in while it's waiting for a reply from a CFQUERY, CFHTTP, and so on. These operations talk to something outside of CF (like a database with CFQUERY, or another server with CFHTTP or web service call--which could even be requesting a page from the same CF instance, but technically the underlying Java httpclient process doesn't know that.) It could also happen with file or network operations.

So the request will wait for this long-running operation to finish. It can't stop it, not with the CF Admin request timeout, not with CFSETTING RequestTimeout, not with the kill features in the CF Server Monitor, FusionReactor, and SeeFusion. Nothing. It's like the Anti-Terminator: "it absolutely will not stop" (can't be terminated) until its task is completed.

So what happens when the long-running operation finishes? Is that when the request times out? An example

"Ok, I got it. The long-running operation (CFQUERY, CFHTTP, whatever) will not stop until it's finished. What happens then?"

Well, you see, that's where the confusion comes in. Let's use an example to make things crystal clear.

Say that the CF admin timeout is 60 seconds (not at all uncommon), or perhaps you set the timeout to 60 for a given template using CFSETTING. Anyway, let's say that the request in question gets 2 seconds into processing when it starts running a long-running query (for example). Let's say that query then takes 75 seconds. When the query is done, the request has now run for 77 seconds, which is 17 seconds beyond the timeout time.

We already know that it won't stop the CFQUERY itself (at least until the query is finished). But guess what: it also will NOT report that the timeout has been exceeded on that line (whatever it was, CFQUERY, CFHTTP, etc.) From my experience, CF doesn't check the time against the timeout at the end of operations, but rather at the beginning.

So instead, it will proceed to the next line of code. You'd think, "ok, then, it will stop on whatever is the next line of code and give you the error there, right?" Sadly, not necessarily, and it only adds to the confusion of the timeout message.

CF checks the time at the start of the next operation, but sadly only on SOME tags

So it's bad enough that it won't report the error on the tag that DID run long. Instead, we saw that it will proceed to the next tag/function. But curiously (tragically), CF will often NOT stop on THE next line of code.

Instead, I've observed that it only seems to check the time (against the timeout) at the beginning of CERTAIN tags, such as CFOUTPUT, CFLOOP, CFQUERY, and so on. Yes, I'm saying that I've confirmed that it will skip over various other tags (such as CFSET, or CFSCRIPT code, and more). I've not yet found any documentation as to the details of this.
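To make that behavior concrete, here is a minimal simulation, written in Python rather than CFML, and in no way CF's actual implementation. It assumes only what I've described above from observation: the elapsed time is compared to the timeout at the START of certain tags, and never at the end of an operation. The function name, the tag durations, and the particular set of "checked" tags are all hypothetical, chosen just for illustration.

```python
# Tags assumed (for illustration only) to compare elapsed time to the timeout
# when they start. The real list is not documented.
CHECKED_TAGS = {"cfoutput", "cfloop", "cfquery"}


def simulate_request(operations, timeout_seconds, checked_tags=CHECKED_TAGS):
    """Return (tag, elapsed) where the timeout would be reported, or None.

    operations is a list of (tag_name, duration_seconds) in page order.
    The check happens only at the START of a checked tag, never at the end
    of an operation, so a long-running tag is never the one reported.
    """
    elapsed = 0.0
    for tag, duration in operations:
        if tag in checked_tags and elapsed > timeout_seconds:
            return tag, elapsed
        elapsed += duration  # the operation always runs to completion
    return None


# The example from earlier: 2 seconds of work, then a 75-second query.
page = [
    ("cfset", 2),      # 2 seconds of processing before the query
    ("cfquery", 75),   # the real culprit; starts at 2s, so it passes the check
    ("cfset", 0.01),   # starts at 77s, but CFSET is not a checked tag
    ("cfoutput", 1),   # starts past the timeout: the error is reported here
]
reported = simulate_request(page, timeout_seconds=60)
print(reported)
```

In this run the tag reported is the CFOUTPUT, two operations after the CFQUERY that actually consumed the time, which is exactly the misleading situation the error message creates.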

So this is where the error gets confusing

So the bottom line is that not only does the request NOT stop on the tag/function that took a long time, it doesn't even stop on "the next line" after that, which can make things all the more confusing/challenging to resolve.

Indeed, this is why you often see the error reporting as having occurred on a tag other than what was really the problem, and why you also can't just look at whatever was *the* line of code preceding that.

So what can you do with this information?

I don't mean to paint an entirely bleak picture. All is not lost. It's just a little more challenging than it should be.

At least first of all you can now know that when you see this error, you should NOT assume that it's reporting the line that really caused the problem. You can and should consider whether some earlier operation in the code could have taken a long time. In my experience, this is usually the situation.

I'll talk in a moment about some other tools that can help you understand where the time is really being taken. First, I do want to offer a clarification, lest anyone read my meaning too literally.

Are you saying the error message is always lying?

Well, no. You'll notice that I peppered my opening paragraphs with "usually", because it's certainly possible that a request could indeed be stopped on the very line that DID exceed the timeout.

Consider, in our example, what would happen if the long-running query had run for only 57 seconds. Since the request had taken 2 seconds before that, it is now one second short of timing out. Let's say the request then proceeds to loop over a query resultset or do some other operations that might take it a couple more seconds. When it does finally exceed the timeout, it may well happen right on the very tag that CF reports as having "crossed" the timeout time.

But given the problem of how it only reports that on some tags (and not all), it could still be in this situation that it reports the wrong line of code. Just consider all the above as you evaluate what to make of the situation.

So how can I know what tag did take a long time?

So how can you know what tag is taking a long time, when a request is running long? Or what did take a long time, if it finished in the past? This is a bit more challenging. The good news is that there are tools that can help, including the CF Enterprise Server Monitor, FusionReactor, and SeeFusion.

Let's focus first on using these tools to catch requests while they're still running, which could be valuable if your server is hanging up because of some long-running requests. Then we'll talk about using the tools to capture the same information and make it available by email to review later.

The underlying feature/solution: stack tracing

In either case, whether watching requests live or capturing information about them to review in the future, and in all three tools, the solution to identifying why a request is long-running will be based on "stack tracing" that request.

This is a feature built into the JVM, which is exposed easily by these tools, but missed entirely by many. More than that, some misunderstand stack tracing as something only shown at the bottom of error pages. (That is indeed a stack trace, but it's not nearly as useful as what I'm referring to here, which is for getting information on requests while they're running, not when they have had an error.)

Stack tracing a running request will allow you to see exactly what line of CFML (if any) is running at that very moment, which again can be vital for resolving problems of long-running requests.

CF Enterprise Server Monitor

First, if you run CF Enterprise (8 or 9), you can use the CF Server Monitor to watch requests while running and see more details about them. If you use the available "start monitoring" button, you can see what requests are running in "Active Requests". Further, if you enable "start profiling", then if you double-click a running request, you can see in the middle of the next page a "stack trace", which shows the exact line of code that was executing at the time you double-clicked the request.

(Yes, I'm aware of the potential overhead of using the Server Monitor, though some people do over-state it in my experience. I'll point to other resources I've done on the Monitor, where I discuss its pros and cons, in a moment.)

Of course, viewing that stack trace at a random point in time during the life of a request could well mean simply that you'd see it executing just any random line, where perhaps a millisecond later CF will have moved on to another. The key is to refresh the stack trace, to see if CF indeed HAS moved on to a new line. If not, that line would be a smoking gun to investigate. Sadly the refresh icon in the Monitor doesn't update the details while viewing a running request. You need to go back to the list of active requests, open the request again, and repeat your observation.

Tools like FusionReactor and SeeFusion

Fortunately, tools like FusionReactor and SeeFusion make that refresh a lot easier, to obtain a stack trace while a request is running. Each offers a button to take a stack trace of a running request. From the page they show you can usually determine the line of CFML code that's running (they each offer a little more stack trace detail than the CF Server Monitor does, but I'll point you soon to a resource to help you better understand them.)

More important, each of these tools offers a refresh button to refresh the stack trace, so that you can properly determine if the line that's executing has changed while you're refreshing.
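The idea behind "refresh and compare" can be shown with a small analogy. The sketch below is Python, not the JVM, and is not how any of these tools are implemented: it simply samples one thread's stack several times and reports whether the executing line ever changed. The function names, the sample count, and the sleep that stands in for a slow query are all assumptions for illustration.

```python
import sys
import threading
import time
import traceback


def current_line(thread):
    """Return (filename, line number, function) the thread is executing now."""
    frame = sys._current_frames().get(thread.ident)
    if frame is None:
        return None  # the thread has finished
    top = traceback.extract_stack(frame)[-1]  # innermost frame is last
    return top.filename, top.lineno, top.name


def looks_stuck(thread, samples=3, interval_seconds=0.2):
    """Sample the thread repeatedly; True if it never moved off one line."""
    seen = set()
    for _ in range(samples):
        seen.add(current_line(thread))
        time.sleep(interval_seconds)
    return len(seen) == 1 and None not in seen


def slow_operation():
    time.sleep(5)  # stands in for a long-running query or HTTP call


worker = threading.Thread(target=slow_operation, daemon=True)
worker.start()
time.sleep(0.1)  # give the worker a moment to reach the sleep call
print(looks_stuck(worker))
```

A single sample tells you little, since the request may be on that line for only a millisecond. It's the same line showing up across several samples that makes it a smoking gun.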

That said, I will note that FusionReactor offers an important advantage with respect to that refresh operation: it ties the stack trace display to the specific CFML page that was being viewed (in Running Requests page) when you selected it. So if that request ends while you're looking at its stack trace, and you refresh it, FusionReactor will report that it's finished.

SeeFusion, on the other hand, would not. It knows only the thread id on which the requested page was running, so that if the request ends and you refresh the stack trace, it only knows to refresh the stack trace for whatever request is running on that thread. It can't (and won't) tell you if the given request has in fact ended, so you could now be looking at a new (and different) request, which could be quite confusing in this situation. It's incumbent upon you to notice (when using SeeFusion) that the stack trace you see is indeed for the same request you started with. (FR gives each request its own internal request id, which is how it avoids that problem.)
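The difference between the two lookups can be modeled in a few lines. To be clear, this is a toy model in Python and not either product's code: the class, its method names, the thread name, and the page names are all hypothetical. It only shows why a lookup keyed on the thread alone can silently land on a different request once the thread is reused.

```python
class RunningRequests:
    """Toy model of which request is currently running on each worker thread."""

    def __init__(self):
        self._by_thread = {}   # thread name -> (request id, page)
        self._next_id = 1

    def start(self, thread_name, page):
        request_id = self._next_id
        self._next_id += 1
        self._by_thread[thread_name] = (request_id, page)
        return request_id

    def finish(self, thread_name):
        self._by_thread.pop(thread_name, None)

    def refresh_by_thread(self, thread_name):
        """Report whatever is on the thread now, even if it is a new request."""
        entry = self._by_thread.get(thread_name)
        return entry[1] if entry else None

    def refresh_by_request(self, thread_name, request_id):
        """Report the page only if the SAME request is still on the thread."""
        entry = self._by_thread.get(thread_name)
        if entry is None or entry[0] != request_id:
            return "finished"
        return entry[1]


requests = RunningRequests()
first = requests.start("worker-1", "/slow.cfm")
requests.finish("worker-1")
requests.start("worker-1", "/other.cfm")   # the same thread is reused
print(requests.refresh_by_thread("worker-1"))
print(requests.refresh_by_request("worker-1", first))
```

The thread-only refresh happily shows the second page, while the refresh that also carries the request id can tell you the original request is gone.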

Catching running requests details when you're not watching the monitor tools

Of course, it's only possible to use the stack tracing features above if you can be on the server running the monitor tools when the problem occurs, right?

Well, not exactly: all three tools offer features to watch for a long-running request which can then send you notification by email of the details that would include a thread dump, which is a stack trace of all running requests.

In the CF Server Monitor, these are called Alerts. FusionReactor refers to them as Crash Protection notifications, and SeeFusion refers to this as "Active Monitoring Rules". See the documentation for each tool to find more information.

Learning more about stack tracing and the monitor tools

For more on all this, I discuss the idea of taking stack traces and thread dumps (a thread dump being a list of stack traces for all current threads) in another blog entry.
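If the relationship between the two terms is unclear, this small analogy may help. It's Python rather than the JVM, and the function name is my own, but it shows the same shape: a thread dump is nothing more than one stack trace per live thread, gathered at a single moment.

```python
import sys
import threading
import traceback


def thread_dump():
    """Return a text dump with one stack trace per live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    sections = []
    for ident, frame in sys._current_frames().items():
        header = f"Thread {names.get(ident, 'unknown')} (id {ident})"
        stack = "".join(traceback.format_stack(frame))
        sections.append(f"{header}\n{stack}")
    return "\n".join(sections)


print(thread_dump())
```

On a busy server the equivalent JVM dump can run to hundreds of threads, which is why it's so useful to have the monitoring tools capture it for you at the moment of trouble.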

I also discuss the CF Server Monitor, FusionReactor, and SeeFusion in several blog entries. The links just used are to the respective categories about each here in my blog. I've also discussed these topics (monitoring, stack tracing, and more) in various articles and presentations I've done.

The step debugger

Finally, some may point out that you can also get an idea of the time spent on any tag/function within a request if you use the interactive Step Debugger (whether that built into CFBuilder or the commercial FusionDebug alternative). As you step through the code, it would be clear if you got "hung up" on a line, though I don't know that I'd favor this as a solution here. Still, I've discussed these also in various blog entries, articles, and presentations.

(Sadly, you can't rely on the typical end-of-page debugging output, as enabled in the CF Admin, because that output is only shown if the request completes. We're referring here to pages that end in error.)

Can't I force CF to timeout some specific tags?

Again, in my experience (as I focus on CF server troubleshooting as a consultant), the root cause of problems in most "long-running" requests in CF is that some one tag or function is running long.

So could we perhaps force CF to timeout that specific tag? Well, yes and no.

You can (and should), in fact, consider whether the tag in question might have its own TIMEOUT attribute or feature (and whether it will really help, as I'll explain). Let's look at each of them.

Setting Timeout on CFHTTP, CFINVOKE, and others

There is indeed a TIMEOUT attribute on CFHTTP. Unfortunately, it won't ALWAYS keep the operation from exceeding that timeout. I've not quite put my finger on it (just haven't experimented completely), but if I had to guess, I'd say that it could be that if the operation is in the midst of returning data (from the server to CF), then it could perhaps time it out, whereas if it's waiting for the output then it may not be able to. Anyone know for sure?

There is also a TIMEOUT on CFINVOKE (for use when calling web services). Curiously, though, there is no TIMEOUT for use with CFOBJECT when calling web services (try it, it won't work, and none is documented). More curious still is that there IS a timeout available for use with createObject() (when calling a web service), though only by way of an argstruct argument that's new in CF8, which I have blogged about. Note as well that, according to the docs, that only times out the process of obtaining the WSDL, not the execution of any method in the web service.

There are also timeouts on various other operations that talk to something outside of CF (cfmail, cfftp on open/close operations, cfldap, cfpop, cffeed), though again it seems reasonable to expect that these may not always honor the timeout at the exact time given, as discussed above.

Setting Timeout on CFQuery

What about the elephant in the room, CFQUERY? Well, yes, it does have a TIMEOUT attribute, but many have found that it often does not time out the query. As with CFHTTP, I wonder if it may be a question of whether it's waiting for output (which likely can't be interrupted) or starting to receive it (which likely can be).

I will note that there's some promise in this regard, though for now not from CFML itself, but rather from the updated JDBC drivers in CF9 and the addition of a new timeout option in the CF Admin Datasource Advanced Settings page. You'll see that there is a new "query timeout" option that was not in CF before 9. I have blogged about it in more detail. It's not perfect: people are reporting different experiences with it (see the comments in the blog entry), and note (more important) that for now there seems no corresponding connection between this and the CFQUERY TIMEOUT attribute. (As I note there, I have raised a bug about this.) Still, it may be better than nothing and could help many, if you're on CF9.

So is there really nothing I can do for the hung requests?

OK, so we've explained why the requests don't timeout, often because they're talking to some remote process that is not responding. But what CAN you do when you're in this boat? Well, other than trying to add timeouts to the code as discussed above, generally nothing, at least for the requests that are already running.

And certainly a restart of CF will kill them off, or at least stop CF trying to talk to the remote process. (Of course, it's possible that upon restart, new requests will come in and try to connect to the same non-responsive or slow-responding remote process, so it could come right back.)

Stop the request on the remote server

But while you can't do much from WITHIN CF for these hung requests, there's one other way you may be able to stop the madness: stop the request on the remote server.

Once you can determine exactly what tag it is that's hung up (with the stack tracing tools above), you could then target whatever it was waiting for: the database server, a remote page called via CFHTTP, an exchange server using CFLDAP, etc.

Since the tools that let you stack trace the running request also show you the time the request started, you could use that info to go to the administrator of whatever service you're calling and ask if THEY may be able to kill the request. As soon as what you're waiting for stops, the CF request will continue. (Of course, it may only continue for a few milliseconds before it will be timed out by CF, as I discussed above, which is why I'm no fan of the CF request timeout feature, and think it should be turned off. More on that in a moment.)
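Here is a small Java sketch of that behavior at the JVM level. It is only an analogy under stated assumptions: it shows an ordinary Java thread blocked reading from a plain socket, which is not necessarily what any particular CF tag does internally, and all the names in it are made up. The "request" thread waits on a "remote server" that never answers. Interrupting the thread does not release it, but as soon as the remote side ends the conversation, the read returns and the thread carries on.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BlockedReadDemo {
    /** Returns {still alive after interrupt, still alive after the remote side closed}. */
    static boolean[] run() throws Exception {
        InetAddress local = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, local)) {
            Socket client = new Socket(local, server.getLocalPort());
            Socket remoteSide = server.accept(); // the "remote server": accepts, then never replies

            Thread request = new Thread(() -> {
                try {
                    client.getInputStream().read(); // blocks, waiting for a reply
                } catch (IOException ignored) {
                    // the socket was closed underneath us; nothing more to do
                }
            }, "hung-request");
            request.start();
            Thread.sleep(200); // give it time to block in read()

            // Attempt 1: try to stop it from "inside", the way a timeout would.
            request.interrupt();
            request.join(500);
            boolean aliveAfterInterrupt = request.isAlive();

            // Attempt 2: the remote side ends the work; read() now returns end-of-stream.
            remoteSide.close();
            request.join(2000);
            boolean aliveAfterRemoteClose = request.isAlive();

            client.close();
            return new boolean[] {aliveAfterInterrupt, aliveAfterRemoteClose};
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = run();
        System.out.println("still running after interrupt:     " + result[0]);
        System.out.println("still running after remote closed: " + result[1]);
    }
}
```

That is the same shape as the situation above: nothing done from the waiting side helps, while ending things on the remote side frees the request immediately.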

Beware: you may not always find the remote server still "hung up"

Back to this issue of finding and killing the remote process that CF may be waiting for (that's causing your hung request), I should note that there may be times when you would go to the remote administrator and say, "look, I have this long-running CF request that's waiting for this process (query, ldap request, web page, etc.) that is waiting forever for something that is running long on your server". And they may look and see nothing on their end that's running long. Doh!

Yep, it can happen, for various reasons, so just be sensitive to this. You may really then have no way at all to kill the hung request. But note, again, that you may be able to use this observation to do something more to prevent the problem in the future, perhaps on the remote server side.

For instance, I've heard some describe problems where CFQUERY processing has hung talking to an Oracle database and (if I've got it right) the problem is an inconsistency between the CF datasource connection timeout and Oracle's "session" timeout. If anyone has more details on that, please do share.

But my point is simply that CF may be "waiting" for a call that will never be answered and can't be terminated from the other end. Again, in such cases, you can only kill them by restarting CF, and then you need to investigate how/why the calls to the remote server are getting hung up in the first place. That's where logging information for diagnostic purposes may really come in handy, as is discussed next.

Logging what CF is getting hung up on, to show to the remote administrator

If this problem (of calls to remote servers that take too long or get hung up) is happening often, and/or you can't always be logged in to see when it's happening using the tools above, another idea is to log for yourself whenever you make such a call to a remote server (that you know tends to hang up), such as putting a CFLOG statement before and after the CFHTTP, CFLDAP, CFQUERY, etc.

At least then you'll be able to see when it does and doesn't take a long time. The log would also help you by showing when it logs a start but no stop.

You could also code it so that it only logs when it's slow, but being able to confirm that it's generally fast and only sometimes slow may be itself useful diagnostic info.
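In CFML you'd do this with a CFLOG before and after the tag in question. The same before-and-after pattern is sketched below in Java (the platform under CF) just to make the logic concrete. The class and method names are hypothetical, and an in-memory list stands in for a real log file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class TimedCallLog {
    // Stand-in for a real log file, so the sketch is self-contained.
    final List<String> lines = new ArrayList<>();

    /** Logs a START line, runs the call, then logs an END line with the elapsed time. */
    <T> T timed(String label, Supplier<T> remoteCall) {
        long start = System.nanoTime();
        lines.add("START " + label); // written first, so a call that hangs still leaves a trace
        try {
            return remoteCall.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            lines.add("END   " + label + " " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        TimedCallLog log = new TimedCallLog();
        String body = log.timed("call to remote inventory service", () -> "response body");
        log.lines.forEach(System.out::println);
        System.out.println("got: " + body);
    }
}
```

A START with no matching END is the signature of a call that never came back, and the elapsed times on the END lines show whether the call is generally fast and only sometimes slow.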

Note that CF 9.0.1 by default adds new logging that does automatically log the start and end of calls to cfhttp, cffeed, and more to corresponding new logs (http.log, feed.log, etc.), which could also help.

Finally, as for logging the queries, you can get that from FusionReactor and SeeFusion automatically, as their "jdbc wrapper" features allow you to log every query (or optionally only those slower than a certain time). There is also a new "log activity" feature in the "advanced settings" of a CF datasource definition that could also log DB activity, though it is quite verbose and a tad unwieldy (not one line/row per query like the other two tools).

Bottom line: I'm no fan of request timeout features

So all that said, I'll repeat and clarify that I'm no fan of request timeout features, not that in the CF Admin, nor that offered in CF monitoring tools that offer to "kill requests" automatically, like the CF Server Monitor Alerts, FusionReactor's crash protection, and SeeFusion's active monitoring rules. I don't think they should be used, personally.

Let me be clear: I do love those tools and use them and help people use them daily. And I do love and highly recommend the features in those tools for sending you *alerts* when requests exceed a given time. What I don't like is them trying to kill them automatically, for all the reasons I outlined above. So I tell clients to turn off the "timeout requests" feature (though it does still make sense to use TIMEOUT attributes on certain tags, or may make sense to implement the CFSETTING RequestTimeOut on some page where you know that the reason it runs long is not one of these things that can't be killed anyway.)

Instead, I recommend (and help my clients daily) to use the alert info (from the CF Server Monitor, or FR or SF) to be notified if/when requests ARE taking too long--and NOT to kill them. Note that these tools all send the notice *as soon as* the request takes too long (whatever time you set), whereas CF's "log slow requests" feature only logs when requests end--and that's only IF they do end without failing.

So yes, get notified that requests are taking too long. Use the info in the alerts, which includes the stack trace info I discuss above. Do find and resolve the problem. Don't rely on (or in my opinion even use) auto-kill features, when in fact they nearly never are able to kill really problematic requests anyway.

Yes, yes, I do realize that there are some requests that CAN be interrupted by these timeout/kill features, but I'll assert that such requests are far less commonly the cause of any serious problems. Your mileage may vary, of course. But I make my statement based on several hundred instances of helping folks solve typical CF server problems.

So why is the "timeout requests" setting there?

One last thought worth considering: someone might reasonably ask, "Charlie, why are you such a hater of the setting? If Adobe has it there, it must be for a good reason."

Here's what I'd say to that: sure, when CF originally ran on C++ (prior to CF 6), perhaps this setting could be reasonably relied upon to ensure that requests would not run any longer than the set time. (I don't recall, but perhaps even then there may have been at least SOME tags that it couldn't interrupt.) But clearly since CF 6, in the Java model, this is no longer the case.

And yet if you read the Admin page, or its help, or the docs, or the comments from nearly anyone who considers the setting, the presumption is that this WILL stop requests from running longer than the x number of seconds indicated.

Why am I so impassioned/manic about this?

I hope I've made clear in this entry why I think that's not only wrong to conclude (in nearly all cases), but worse it sets up a tragic misconception of how CF works. If you think this should and will stop long requests (or that the alert features of the monitors will kill them), then you're going to be in for a shock when requests do hang up for an extended period of time. What are the implications?

  • You may totally under-estimate how many simultaneous request threads you should enable.
  • You may never pay attention to tools like CFSTAT or jrun metrics (to observe at least "how many requests are running" at any given time), which will help you see if/when requests are hung.
  • You may never bother to learn how to use the CF Server Monitor (or FR or SF), all of which can go still further and show not just how many requests are running (possibly hung) but a) how long, b) what the URL is, c) what the IP address is, and so much more, which can help you find and resolve problems.
  • You may never bother to learn how to do the stack tracing that I discuss above, which is often vital to understanding where and why any given request is hung (or was at the time an alert was thrown)
  • You may never bother to analyze logs that show the activity patterns (how many requests are running at periodic intervals, such as the FusionReactor "resource log" reports.) It's really THAT information that is vital to your understanding what to set for your simultaneous requests setting.
  • and so on

All of this info (and understanding) is VITAL to a very important and common class of CF server troubleshooting: why is CF up but not responding? It may be that requests are hung.

But if you assume, "well, they can't be running any more than x seconds", then you'll start to think "so it must be something else", and you figure you may as well just restart CF. Or you start reading about how someone suggests you change your JVM settings (which may have NOTHING TO DO with this problem, and not only not solve it but could cause new ones), and so on.

Again, I see this all the time. I hope by this entry to have helped avoid some of the very common misunderstandings on this subject that I frequently see either on lists, or in emails to me, or in my consulting engagements. If I seem passionate about it, it's because I am. Same with the memory issues I discuss in the related and similarly titled entry, CF911: Lies, damned lies, and when memory problems not be at all what they seem, Part 1.

Need More Help?

I mentioned above that I provide CF Server Troubleshooting consulting. If you need some help understanding how to apply the information above to your specific problem (or need help with any CF server, or CFBuilder, problem), I'm happy to help.

I don't need to come on-site, nor do you need to give me remote access. Instead, we can work easily and securely right over the web using Adobe Connect.

And I don't have any minimum time-block requirement--and I even offer a satisfaction guarantee. To learn more, including rate plans, see my consulting page. (I hope some will forgive this brief commercial here. I don't generally mention it, but since some say that they didn't know I offer such services, it seemed an appropriate point to mention it.)

Conclusion

So phew, another really long blog entry. But I hope it may help some people (and help some who help others).

As always, I welcome your feedback, corrections, additions, etc. Really, I ask for your feedback. If it helped, please say so. My blog doesn't get the traffic of many others. I often see that hundreds of people have read things, but few ever comment. I can't know if it's that I've answered every question (I can hope so), or that you weren't impressed. Like the guy said in Dirty Harry, "I gots to know". :-) Sometimes, all it takes is a few people to "prime the pump" and start commenting to lead others to do so. Why not grab the handle? :-) And if you think this would be helpful info for others, please do share it (tweet about it, mention it on mailing lists/forums when you see the problem raised, etc.)

I'm planning to better organize and package CF server troubleshooting resources (mine and others). We have a lot of great info out there for those solving CF problems. It can just be a challenge to sort through it all. I hope to help solve that. Look for more news to come on that front in time.

How do I love FusionReactor? Let me count the ways (6 minute interview video)

The folks behind FusionReactor have started a YouTube video channel and they recently posted a 6-minute interview with me that we did at CFUnited. In it, they ask and I recount the reasons I appreciate and recommend it. Check out the video, embedded also below.

FusionReactor is one of the leading CF Server Monitor tools, which works not only with CF 6/7/8/9, either Standard or Enterprise, but it also works with Railo, Open BlueDragon, and even BlueDragon JX 7.1. In fact, it works with any J2EE/JEE server or servlet engine.

If you're running a site on any of those platforms and ever have problems of slowness, instability, or any other "curious" problems, or just need to better understand the nature of requests that CF is processing, and how well (or poorly) it's doing it, FusionReactor is a great tool, for the reasons I outline. It's like having x-rays into the app server.

I've written and spoken about the tool quite a bit, and have a FusionReactor blog category here with over a dozen entries, as well.

I'll be speaking at cf.Objective() on "Stack Tracing CFML Requests to Solve Problems"

Though I got the news a couple of weeks ago that my submission to cf.Objective() 2010 had been accepted, I only tweeted my delight about it and didn't blog it. Here's the description:

"CF911: Stack Tracing CFML Requests to Solve Problems"

Regardless of what CFML server monitoring tool(s) you have, or even if none, did you know that you can use a feature called "stack traces" to be able to pinpoint the exact line of code that a CFML request is running at any time? Did you know how to use that information to troubleshoot performance/stability problems? Do you know how to obtain that information either manually or automatically (such as during a crash while you're not watching)? Do you know how to obtain that information in any of the CFML Server Monitors (FusionReactor, SeeFusion, the CF8/9 Enterprise Server Monitor), or with free command line tools? And how to do this for any CFML engine (CF, Railo, BlueDragon, etc.)? Do you know how to interpret the information once you get it?

In this session, veteran CF troubleshooter Charlie Arehart will help remove the mystery from using stack traces. It really is amazingly simple with the right tools, and it can be incredibly useful to solve otherwise thorny problems, once you understand how to interpret the information.

Of course, I'm thrilled to be heading back to Minneapolis. I spoke there previously in 2008 and 2007 but couldn't attend in 2009. It'll be great to see all the fine folks who run and attend this unique conference.

BTW, I just saw also that CFUnited announced another round of topics accepted today and I see a topic whose title is very similar, "How to Read a Stack Trace", by the inimitable Daryl Banttari. It's hard to tell from his brief description how similar these will be, but Daryl is awesome so I'm sure I'll learn much from his. (I was literally just about to offer mine as another CFUnited submission but now won't of course. :-) Hopefully another of my submissions will be accepted, so I can keep my streak of having spoken at every CFUnited since they started.)

Anyway, the good news is that whichever conference you go to, this important (and often misunderstood) topic will be covered! :-)

Spying on ORM database interactions: Hibernate, Transfer, etc. on any CFML engine

As people use CF9's ORM feature (or other ORMs like Transfer and Reactor, or indeed Hibernate, on any version of CF6+ or indeed any other CFML engine), they may be left wondering what sort of SQL interactions happen "under the covers" between the ORM framework and the database engine (whether in a given request, or perhaps at startup of CF).

Well, there are several ways you can watch them, as this entry will discuss, and some may be better suited to the job than others. It can be very interesting to discover what's going on, especially if you're having any suspected performance problems which you think may be related to ORM processing (or just if you wonder what all it does for you).

As for spying on the SQL, of course ORM support is just a different way that the CFML engine (through the ORM framework) sends SQL to a database via a regular DSN, just like any other request, so there's nothing really "tricky" about this. It's just about realizing that while you don't write the SQL yourself, it's still generated by the CFML engine/ORM framework, and you may not realize/consider the available tools which can spy on it, just like any other DB processing from within CF. Indeed, some people may not even realize how many options exist to spy on JDBC interactions from their CFML engine to the database engine.

The good news is that there are several approaches, some included in CF (some depending on the edition), and some available separately which would work in any edition of CF or the other CFML engines (Open BlueDragon, Railo, etc.), and with any of the ORM frameworks. And again, some may be better than others for certain challenges.

(FWIW, besides the aforementioned Transfer and Reactor, there are still other ORM solutions for CFML, which I mention in my CF411 list as CFML ORM Frameworks. Indeed, note that you can run Hibernate on CF prior to CF9, if you want to. This is a recovery of a blog entry that no longer exists, recovered via archive.org.)

Built-in ORM Logging Option

First, note that for those using CF9+ ORM, there is indeed a built-in option in the CF ORM setup where one can enable logging, settable in the Application.cfc: see the this.ormsettings option and its available key/value pair, logSQL="true".

There are several resources where you can learn more on that (and a related log4j property file approach to logging this). Besides the CF9 docs page on the ORM settings, there is also a blog entry by Adobe engineer Rupesh Kumar.

The default is to log this information to the console, but you can manipulate those log4j settings to tell it to use a file (see the links above). Even so, this will result in quite a lot of data being logged, which you will then need to connect back to your specific requests. The following approaches may be preferable.

Using FusionReactor or SeeFusion

Users of any CF edition (6+) or any CFML engine (Railo, OpenBD, or BD 7+) can use tools like SeeFusion and FusionReactor, which have always had the ability to monitor database interactions by "wrapping" the datasource to be monitored. FusionReactor engineer John Hawksley has posted a recent article specifically on monitoring CF9's ORM interaction, in the FR Devnet site, Using FusionReactor's JDBC Driver Wrapper With ColdFusion 9 ORM. Its concepts would apply to any ORM, of course.
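If the idea of "wrapping" is unfamiliar, here is a minimal Java sketch of the concept. It is not the actual code of either tool's JDBC wrapper: QueryRunner is a hypothetical stand-in for the real driver, and the threshold parameter is my own illustration. It just shows how a wrapper can sit between the caller and the real thing, timing each query and logging it without the calling code changing at all.

```java
import java.util.ArrayList;
import java.util.List;

class QueryWrapperSketch {
    /** Hypothetical stand-in for the real driver; returns a row count. */
    interface QueryRunner {
        int run(String sql);
    }

    /** Wraps a runner so that every query taking at least slowerThanMs is recorded in the log. */
    static QueryRunner wrap(QueryRunner target, long slowerThanMs, List<String> log) {
        return sql -> {
            long start = System.nanoTime();
            try {
                return target.run(sql); // pass the query through to the real runner
            } finally {
                long ms = (System.nanoTime() - start) / 1_000_000;
                if (ms >= slowerThanMs) {
                    log.add(ms + " ms: " + sql);
                }
            }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        QueryRunner real = sql -> 3;                  // pretend driver: always "returns 3 rows"
        QueryRunner monitored = wrap(real, 0, log);   // threshold 0 means log every query
        monitored.run("select * from artists");       // could just as well be ORM-generated SQL
        log.forEach(System.out::println);
    }
}
```

Because the wrapper sees whatever SQL passes through it, it makes no difference whether that SQL was hand-written or generated by an ORM framework.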

Similarly, I've written generically about FusionReactor's database monitoring feature in What is the FusionReactor datasource monitoring feature? Why would I use it? Powerful stuff. As I point out in that article, the concepts discussed apply as well to SeeFusion's ability to monitor queries by wrapping datasources.

That said, it's worth noting that FusionReactor does have a couple of advantages, in that it provides for the display of all queries for a given request (while viewing the details of that request), whereas SeeFusion only lets you see the slowest query in a given request. FusionReactor also provides a separately available display of all the slowest queries (across all requests). It also logs every query (connecting it to a given request as well), while SeeFusion (Enterprise, at least) can also log the slowest queries to a database.

And note that both of these track any database requests coming out of CF, not just those associated with a given page request. So if there is ORM SQL that is associated with the startup of CF, that's tracked too. (And for those aware of issues with CF's Client Variables, such DB activity is also tracked, even that done by the hourly purge, which takes place on a background, non-jrpp thread.)

CF Enterprise Server Monitor

Those running CF 8 or 9 (Enterprise only) will find that its available Server Monitor does offer built-in monitoring of the SQL executed against CF datasources, at least, as long as you enable "Start Profiling" (which also enables other features, and overhead, as well). In this way, the Enterprise Server Monitor can monitor database interactivity, including ORM interactions.

Unlike FusionReactor (and like SeeFusion), it focuses only on showing queries that exceed certain limits, and at that it shows them only in a "Slowest Queries" interface, tracking the slowest queries among all requests. The CF Enterprise Server Monitor also has no logging ability at all.

Being able to see every single DB interaction for a given request (or across all requests) may be all the more interesting for discovering/observing what's happening with ORM interactivity.

Another alternative CF feature

Still another little-known feature for spying on JDBC interactions in CF is by way of the JDBC "spy" feature, which does in fact allow logging of all JDBC interactions made from within CF. This feature was first enabled by way of the DataDirect 3.5 driver update which was made available (as an optional upgrade for 6 and 7) in the CF 7.02 timeframe. I wrote about the Spy feature back in Aug 2006.

Since then, CF 8 (and now 9) offer it instead as a new "log activity" option in the "advanced settings" for a datasource definition in the CF Admin (which is disabled by default). I pointed this out in another entry from 2007 as one of many easily missed changes for the CF 8 Admin.

This "log activity" output is not as easy to interpret as FusionReactor's logs, and can indeed be voluminous (more so than FR's), so be careful. Anyway, it's one of the several ways you can monitor JDBC interactions between CFML and your DB engine. Again, any of these may be useful for monitoring any of your CFML/database interactions.

Generic DB Monitoring tools

Indeed, it's worth noting finally that while the focus here has been watching the DB interaction from CF (and the ORM framework) to the database (by watching the JDBC traffic going out of CF and returning), you could just as well watch the DB interactivity from the DB's perspective instead (watching it coming in and being returned).

There are many tools that can monitor database processing, available for each of the major databases (free and commercial). I list several such tools in one of my CF411 sections, Database/SQL Monitoring Tools.

Hope all that's helpful, whether you use ORM or not.

FusionReactor 3.5 update announced

Users of FusionReactor will want to know that version 3.5 was released today. Among its changes are support for CF9, support for various newer operating systems, and some other modest enhancements.

For readers not familiar with FusionReactor, it's a server monitor (and more) for CF, Railo, OpenBD, and indeed any J2EE/Java EE server. I've written and spoken about it a lot. Here is the link to the category of my blog entries about it.

As for 3.5, here is what has been shared in an email to their customers. I wanted to pass it along to everyone:

WHAT'S NEW

EXTENDED PLATFORM SUPPORT - FusionReactor 3.5 now supports ColdFusion 9 plus a range of new servers and operating system platforms - including Windows 7, Windows 2008 Server R2, Mac OS X 10.6 "Snow Leopard", Railo 3.1.1 and JBoss 5.1.

IMPROVED INSTALLER - supporting 64 bit Windows machines

INCREASED MONITORING PERFORMANCE - FusionReactor continues to be the ColdFusion production monitor of choice because of its incredibly LOW overhead of less than 1%!

EXTENDED FRAPI INTERFACE - FRAPI, the FusionReactor API, gives you the ability to access FusionReactor functionality from your ColdFusion pages. This interface has now been extended to include additional Request information.

NEW AMF PROCESSOR - Action Message Format (AMF) is the file format used with Flash Remoting and applications such as Flex 2 and 3. FusionReactor has a completely new AMF processor supporting externalizable Objects.

A NUMBER OF MINOR ENHANCEMENTS AND IMPROVEMENTS - With every new release we continue to extend FusionReactor to make it even more stable and secure.

Click here to see the FusionReactor 3.5 Release Notes and Resolved Issues http://www.fusion-reactor.com/support/kb/FRS-230.cfm.

To upgrade, please download FusionReactor 3.5 from the FusionReactor download page.

Click here to download 3.5: http://www.fusion-reactor.com/fr/downloads.cfm

If you have any questions or feedback please email sales@fusion-reactor.com.

CF911: Easier thread dumps and stack traces in CF: how and why

You may have heard the value of taking thread dumps or stack traces when trying to understand and resolve problems with CF. They can be valuable to see what's really running on your server at the time it may seem hung or slow to respond. The problem is that they can be challenging to obtain, so here's how to get them even more easily.

(If you're not familiar with the value of thread dumps or stack traces, read on. The resources I point to can help you to appreciate their usefulness.)

Update: I did a presentation on this topic, posted here. After reading the below, you may want to view that with more details on stack tracing any of many ways.

Well, I was tooling around looking for some help on a topic when I happened upon an old entry from the awesome and ever-valuable blog of Steven Erat, formerly of Adobe and now of Webapper. The entry is An easier way to take ColdFusion thread dumps on Windows, and he actually points to an entry in another classic blog by Brandon Purcell (formerly of Adobe and now with Universal Mind). Steven's entry also points to a classic Adobe technote on interpreting dumps and stack traces.

But both entries are from 2005, and since then there are still easier solutions for getting either a thread dump or stack trace, and I wanted to point them out (I'll also add a link to this entry in Steven's.)

For those on CF 6, 7, 8

First, for those running CF 6, 7, 8 (and other CFML engines, like Railo and Open BlueDragon), you can use either FusionReactor or SeeFusion, commercial monitoring tools, which each offer a simple single button to produce a thread dump (a stack trace for all threads running inside the JVM that underlies CF).

Better still, they also offer an option to get a stack trace for a single running CFML page request.

For those on CF 8 Enterprise

If you're on CF 8 Enterprise, you can take advantage of its Server Monitor, which also offers the ability to get a thread dump, by way of its "snapshots" feature.

I discuss the Server Monitor in a 4-part series of articles for Adobe, starting here.

Part 3 of the series discusses Snapshots in particular.

(I also have an article introducing FusionReactor.)

The CF 8 Server Monitor can also provide the stack trace of a given request, but you need both "start monitoring" and "start profiling" to be enabled. If they are, you can see currently running requests in the Active Requests screen, and if you click on one, the stack trace appears in the middle of the request detail page.

But maybe you don't need to do a thread dump at all

All that said, I should point out that one of the values of thread dumps was to see what requests were running, which is incredibly valuable.

But if you do get either FusionReactor or SeeFusion, or can use the CF8 Enterprise Server Monitor, note that they all have a feature to show what requests are running.

That, alone, can be what you really want to know, and their interface is a LOT easier to read than a thread dump. :-) And the stack trace feature lets you see exactly what line of CFML code a request is running at the time you request the stack trace.

Thread dumps are still valuable

Still, thread dumps can be valuable. For one thing, you may not always be on the server when you want to see what's running. Both FusionReactor and the CF8 Monitor have an option to trigger sending you a thread dump by email when certain conditions are met (too many requests, requests taking too long, too little memory, etc.).

Also, sometimes a problem is not caused by a running CF request. It could be another thread that's having a problem, whether one of the scheduler threads (which are used for background tasks like the client variable purge process), or cf-thread threads (used by CFTHREAD in CF8), and so on.

A tool to help analyze thread dumps

Indeed, Steven would also want to point out that Webapper (makers of SeeFusion) also have a free online tool called SeeStack which can help analyze thread dumps to make them a little easier to read.

Still more on thread dumps and stack traces

Further, the Adobe technote that Steven points to, Debugging thread dumps and server problems in ColdFusion MX 6.1 and 7.0, also gives considerable insight into understanding stack traces and thread dumps. It's also quite old and doesn't talk about things with respect to the more modern tools I've mentioned, but the information is valuable whether you use them or not.

In fact, much of it is about understanding stack traces (remember: a thread dump is a list of stack traces for all threads in the JVM), and connecting the dots in what they report and specific possible problems in your code or configuration.
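To make that definition concrete, here is a minimal Java sketch that produces a bare-bones thread dump from inside a JVM, using the standard Thread.getAllStackTraces() call. The class name and output layout are my own, and the output is much plainer than what the monitoring tools or the JVM's own dump facilities give you, but the structure is the same: one stack trace per thread.

```java
import java.util.Map;

class ThreadDumpSketch {
    /** Builds a simple text thread dump: a header line per thread, then its stack frames. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" id=").append(t.getId())
               .append(" state=").append(t.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n'); // one line per stack frame
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Run inside a busy server, the same loop would list every request thread along with the background threads, which is why a dump can show you problems that aren't tied to any running request.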

I plan to do some entries in the near future, walking through use of these tools to solve common problems of CF servers being unavailable. Until then, check out also the "related entries" listed at the bottom of Steven's entry.

Of course, there are many other fine classic blog entries (and bloggers) who talk about CF troubleshooting, too. I also plan to offer something to help with finding those.
