Monitoring ColdFusion web server connectors, more on Tomcat 'Status Workers'
Note: This blog post is from 2015. Some content may be outdated--though not necessarily. Same with links and subsequent comments from myself or others. Corrections are welcome, in the comments. And I may revise the content as necessary.
If you're running CF 10 or above, there was a very interesting post on the Adobe CF blog, from July 19 2015, entitled, Configuring Status Worker in Connectors. The Adobe blog post title may not have caught your attention, but it's about setting up a lightweight and built-in Tomcat monitoring feature for observing the status of the Tomcat web server connector.
You may want to consider enabling it, but I would add some caveats and observations that I share below. Note that it's really quite easy to enable, and DOES NOT require a restart of CF (only of your web server, or technically in IIS, a recycling of the application pool/s--a web site restart is not enough) to take effect.
(Update in 2018: The original 2015 Adobe post that I refer to was somehow lost in the move to Adobe's portal in 2017. The URL offered in my opening paragraphs is in fact a republication in 2018 of that original post. For the sake of posterity, you can still find the original post at the archive.org version from 2016, but I'll note that the content is identical.)
If you've not yet read the Adobe blog entry, go check it out and then come back here for several observations I have to share, some of which I think you'll agree could be very important.
What is the web server connector and why should I care to monitor it?
It may help some readers to explain that the web server connector is the means by which requests get from your web server (IIS, Apache, etc.) to CF. In CF9 and earlier, it was a JRun connector. In CF10 and above, it's a Tomcat connector.
While for most folks the connector is something they may never know or worry about, for others it's been a bane of their existence especially since moving to CF10 or 11 (for reasons outlined in comments in other Adobe blog posts, like ColdFusion 11 IIS Connector Tuning.)
And so this "status worker" lightweight monitoring feature could be helpful for everyone running CF10 and 11 to consider. I show here a portion of its report, as enabled on my own site just now. Some may recognize its output as looking a LOT like the Apache mod_status output.
(Let me also clarify, at the outset, that this Status Worker is not at all a substitute for a full-fledged request monitoring tool, like the CF Enterprise Server Monitor, FusionReactor, or SeeFusion. It does NOT list all running requests, like those do. As you can see, the report on the right does not list running requests, though it does list at least a count of running requests. I show one in the image, indicated by the "busy" count showing "1", discussed further below.)
This "Status Worker" is a Tomcat feature, not an Adobe creation
As another point of clarification, and as Chinoy does indicate, this "status worker" is not an Adobe creation. It's a built-in Tomcat feature. It's just not one that Adobe had mentioned before. You could have found it yourself in the Tomcat docs for it, which he helpfully points to.
(Indeed, the Tomcat web server connector is used by default for Lucee/Railo or indeed any Tomcat implementation which uses the AJP connector, and this status worker feature would apply to you. If you instead use the BonCode connector with Lucee/Railo or CF, then you are not using the Tomcat-provided connector and this status worker concept does not apply to you, I would think.)
Some challenges to beware
Still, even upon reading either just the Adobe blog, or indeed those Tomcat docs, you may find some challenges when trying to implement and understand the status worker, despite Chinoy's helpful explanation. I wanted to highlight my experience resolving some of those challenges here.
1) Don't leave it editable and/or unsecured
If this will be a server on the internet (or indeed, accessible to an intranet where someone could want to cause trouble), then please heed his warning to enable the read_only=True option when enabling/configuring it.
If you do not, then by default ANYONE who reads the Adobe blog post could then try their proposed /cf/status URL against your server and (assuming you have configured it per those steps) they will be able to not only MONITOR your web server connector status info but more important they will be able to change the settings in your web server connector.
Some may note that "these changes don't change the configuration permanently, they only last until the next restart of the connector", and that is true. Still, someone could cause trouble for you by changing some of the connector settings on the fly while your server is running.
As the Tomcat docs note, another option to consider is to use web server configuration features (IIS, Apache, etc.) to control who can access this /cf/status URL. For instance, with either Apache or IIS, it would be easy to use their available URL Rewrite feature, to look for a pattern /cf/status*, and "reject" the request if the {REMOTE_ADDR} "does not match" a given IP address (and you can repeat that condition again to name another remote_addr).
(And of course, you could use a different URL than /cf/status, but beware the oft-made warning about "security by obscurity".)
The wiser plan is to ALWAYS set this read_only option on, by default, if enabling this status worker. If you need to make changes, change them in the config files and restart the worker (restart Apache or IIS, or in IIS you could also just recycle the app pool associated with the connector to restart it.)
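As a quick way to double-check that setting, here is a minimal Python sketch that reads the text of a workers.properties file and lists any status workers that are not marked read-only. The worker name "status" in the samples is hypothetical, and the sketch only treats the literal value True (in any letter case) as read-only, which is an assumption on my part rather than the connector's full parsing rules.

```python
def writable_status_workers(properties_text):
    """Return names of workers of type status that lack read_only=True."""
    settings = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value

    # Collect worker names from keys shaped like worker.<name>.type=status
    status_workers = [
        key[len("worker."):-len(".type")]
        for key, value in settings.items()
        if key.startswith("worker.") and key.endswith(".type")
        and value.lower() == "status"
    ]

    # Assumption: only the value "True" (any case) counts as read-only here.
    return [
        name for name in status_workers
        if settings.get(f"worker.{name}.read_only", "false").lower() != "true"
    ]


if __name__ == "__main__":
    # Hypothetical sample; the worker name "status" is made up.
    sample = "worker.list=cfusion,status\nworker.status.type=status\n"
    print(writable_status_workers(sample))
```

An empty list back from that function would suggest every status worker in the file has the read_only option on.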
2) You need to configure the worker for each connector, if you have more than one
Note that this status worker feature is indeed enabled PER CONNECTOR (and I discuss later how you view it PER WEB SITE).
But my point here is that if you have configured more than one connector (if you have multiple numbered folders under [coldfusion10|11]\config\wsconfig), and you want to have a status worker working for each, then you need to specify this configuration (in the files Chinoy showed) within each connector folder. If you get a 404 error trying to access the /cf/status URL, this is probably your problem. (You can get that also, though, if you just forgot to restart the web server/recycle the app pool for the site in question.)
The way you would get to each status worker is to visit the site's URL by whatever way your web server's configuration defines a binding (for a domain/ip address and/or port), and then add the status URL you set up (like /cf/status).
Again, I'll have more to say on this in a moment, but if you had multiple web sites sharing one or multiple connectors, you would use the /cf/status URL to view the monitor for each site's domain/ip/port to see the status of connections from the web server to CF for that site. (Note that it's totally ok for the worker name and indeed the URL pattern to be the same in multiple connectors. Connectors are independent from each other.)
BTW, the "port" that he refers to (for accessing the status worker display) is indeed the port (if any) that you would use for requests made to your web site which uses the connector configured to be monitored. It does NOT refer to the AJP port, like 8012 or 8013 or 8014, as you may see listed in the workers.properties file. Nor does it refer to ColdFusion's internal web server port, like 8500, if you may have that enabled. This web server connector has nothing to do with using CF's internal web server. It's only how your external web server, like IIS, Apache, or nginx, is connected to CF.
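On the point about configuring each connector separately, the following Python sketch shows one way to spot connector folders that have no status worker defined. It assumes the numbered folders under wsconfig each hold a workers.properties file, and that a status worker is declared with a line of the form worker.<name>.type=status. The function name and the path in the usage line are hypothetical.

```python
import re
from pathlib import Path

# Matches a line such as: worker.status.type=status
STATUS_TYPE = re.compile(r"^\s*worker\.[\w-]+\.type\s*=\s*status\s*$", re.MULTILINE)


def connectors_missing_status_worker(wsconfig_dir):
    """Return the numbered connector folders whose workers.properties
    does not define a worker of type status."""
    missing = []
    for folder in sorted(Path(wsconfig_dir).iterdir()):
        props = folder / "workers.properties"
        # Only consider numbered folders that actually hold the file.
        if folder.is_dir() and folder.name.isdigit() and props.is_file():
            if not STATUS_TYPE.search(props.read_text()):
                missing.append(folder.name)
    return missing


if __name__ == "__main__":
    # Hypothetical path; point this at your own wsconfig folder.
    print(connectors_missing_status_worker("config/wsconfig"))
```

Any folder it lists would be a candidate explanation for a 404 on the status URL of a site served by that connector.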
3) Each status worker reports on connections for the one site through which you're viewing it, not all connections from all sites
(I have revised this since originally writing the blog post, to really make things clear.)
This is very important to understand: each status worker report will only report on connections to THAT one specific site through which you're viewing it. It does NOT report on requests going through any other site and/or connector.
So yes, this means that for you to know the status of connections against ALL your sites you'd need to run this tool AGAINST ALL YOUR SITES, one at a time.
Being able to see all the connector values for all sites at once could be valuable, because you may be having a problem with one site's use of connections and not another!
Another benefit would be to know how many connections are being used across ALL sites at one time, as would be counted against the total connection_pool_size for that connector, and the maxthreads defined in CF.
Sadly, there is no mechanism in this Tomcat-provided tool to get one interface that reports for ALL web sites. Nor is there any logging of this connection info, at all. The closest we have is the metrics log, about which I will blog more later.
[I will note, though, that if you have your web site bindings set up so that a given web site does respond to multiple domains and/or subdomains, then you only need to view the status worker for one of those domains, as it's per SITE not per domain within a site, technically. And FWIW, the status worker does not reflect any requests made using ColdFusion's built-in web server, such as you may have enabled to process requests for the CF Admin, for instance.]
As for a means to automatically monitor the status worker, I will also note that there is an option in the status worker to cause its output to be rendered as XML. One could create a mechanism to track that information programmatically, though it's beyond the scope of this post to explore that further.
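Just to give a flavor of what such a mechanism might look like, here is a minimal Python sketch that pulls per-worker counts out of an XML rendering. The sample XML, its element names, and its attribute names (name, state, busy, connected) are assumptions for illustration and are not copied from a real server, so compare them against what your own status worker returns. I believe the XML form is requested with a mime=xml query parameter per the Tomcat docs, but do verify that as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of status worker XML output. The element and
# attribute names are assumptions; check them against your own server.
SAMPLE_XML = """<jk:status xmlns:jk="http://tomcat.apache.org">
  <jk:ajp_workers count="1">
    <jk:ajp name="cfusion" type="ajp13" state="OK" busy="1" max_busy="3" connected="4"/>
  </jk:ajp_workers>
</jk:status>
"""


def worker_counts(xml_text):
    """Return {worker name: {"state", "busy", "connected"}} for every
    element that carries both a name and a busy attribute."""
    root = ET.fromstring(xml_text)
    counts = {}
    for element in root.iter():
        attrs = element.attrib
        # Matching on attributes keeps this independent of the tag names.
        if "name" in attrs and "busy" in attrs:
            counts[attrs["name"]] = {
                "state": attrs.get("state", "unknown"),
                "busy": int(attrs["busy"]),
                "connected": int(attrs.get("connected", 0)),
            }
    return counts


if __name__ == "__main__":
    for name, info in worker_counts(SAMPLE_XML).items():
        print(name, info)
```

Since the status worker itself does no logging, a script like this could be run on a schedule against each site's status URL and its results appended to a file, remembering that each site has to be polled separately.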
New/Updated 4) Beware that the status worker information only reflects the status of requests made SINCE the connector was last restarted (IIS App pool was restarted, for instance)
This is a new point I am adding, after originally writing the post. I have come to observe that the information displayed in the status worker report is REFRESHED (wiped out, starts over) if the connector is restarted. In the case of IIS, that's if the application pool that underlies the site in question gets restarted/recycled. The numbers are of course also reset if the web server (IIS, Apache, nginx) is itself restarted.
In the case of IIS, note also that application pools can be recycled not only manually (such as by right-clicking on one in the application pools list), but they can also recycle on their own. For instance, by default app pools recycle after 20 minutes of inactivity, and every 1740 minutes (29 hours, or 1 day and 5 hours)!
My point is that if you are viewing the output of the status worker, it is NOT necessarily cumulative over a very long period of time. It could reflect only minutes worth of information, even though CF has not itself restarted.
5) Beware that some status worker operations may hang up the connector momentarily
Note that some status worker operations, like the edit operation (to display the connector options which can be edited, as discussed above), may LOCK UP the connector briefly. This affects ALL requests using the connector, including from other users.
So if you see your status worker web request hanging up (which could be a few seconds), beware it may not be only YOUR browser (making that request) which is hung up: it may be the entire connector that's hung up, and other requests made by other users on the same connector will be hung up, again perhaps only momentarily.
I've only observed this with the initial display of that "edit" operation in the web interface, and only for at most a few seconds. Still, do beware. It's easy to think it was "only your request" that was hung up. I confirmed this while observing multiple concurrent requests.
6) The Status Worker info is helpful, but may not help solve all connector problems
The information reported by this Status Worker mechanism may be helpful for some problems, so do check it out. But it's not clear to me if it will help us all to understand/resolve problems related to proper configuration of the connector, as have been discussed in the Adobe blog entries like the one I mentioned before, at ColdFusion 11 IIS Connector Tuning.
My observation of things is that the "count" info (such as the counts of how many requests are "busy", or "connected") are the same information that can be seen in the metrics.log (new in CF10 and 11), assuming it's properly configured. (I've done a separate post on that, where I report how the port in the CF Admin Debugging Output page setting for this should be the connector/AJP port, at least if most requests to CF do connect through an external web server. Sadly, the CF Admin defaults to using CF's built-in web server port, 8500.)
Anyway, I can report that the "busy" value (in the status worker report, and the metrics.log) does reflect the number of currently running requests (if any), if you do have long-running CFML requests (running in that site and connector). That's interesting, but remember again that this only reflects requests in THAT site and connector. If you have requests running against OTHER sites and/or connectors, this total will NOT equal the TOTAL number of running requests in CF (which IS reported in the metrics log, CFSTAT, and tools like the CF Enterprise Server Monitor, FusionReactor, and SeeFusion).
The "connected" ("Con") value (in both) is more interesting. It seems to reflect the count of connections, which do indeed live on beyond the life of a request (and can be terminated after being inactive for a while by changing the connection_pool_timeout feature discussed in the Adobe IIS connector tuning blog entry mentioned above, or again are terminated if the web server or app pool in IIS is recycled.) That number may be helpful for knowing when the number of connections is being starved (also discussed in that IIS connector tuning blog entry, but again it's a challenge that this worker only reports for the one site against which you're running it.)
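Because each report covers only one site, getting a total means adding up the numbers yourself. Here is a small Python sketch of that tally. The site names and counts are made up, and the pool size of 500 is an assumed connection_pool_size for a single connector shared by all three sites, so substitute your own values.

```python
# Hypothetical counts copied by hand from each site's status worker page.
# Site names and numbers are made up for illustration.
per_site = {
    "site-a": {"busy": 3, "connected": 40},
    "site-b": {"busy": 0, "connected": 12},
    "site-c": {"busy": 7, "connected": 95},
}


def totals(sites):
    """Sum the busy and connected counts across all sites."""
    busy = sum(site["busy"] for site in sites.values())
    connected = sum(site["connected"] for site in sites.values())
    return busy, connected


def headroom(connected, pool_size):
    """Fraction of the pool still unused (0.0 means fully used)."""
    return max(pool_size - connected, 0) / pool_size


if __name__ == "__main__":
    busy, connected = totals(per_site)
    # 500 is an assumed connection_pool_size, not a recommendation.
    print(f"busy={busy} connected={connected} headroom={headroom(connected, 500):.0%}")
```

If the sites use different connectors, the tally would need to be kept per connector, since each connector has its own pool.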
The "state" value may be useful, if it's ever showing other than "OK". The legend indicates that we may want to watch out for a state value of "OK (busy)" (all connections busy), or perhaps also "ERR" (error, with possible sub-"states"), or maybe even "OK (idle)" (no requests handled). More on these in the Tomcat status worker reference docs.
Just remember, again, as discussed above, the info in this status worker report is only for requests against THAT site and reflects information since the web server (and/or IIS app pool underlying that site) was last restarted.
7) The Status Worker can help manage Tomcat/CF clusters
One more aside about the status worker: while my focus here has been on using it for monitoring and troubleshooting, it's worth noting that if your connector has been configured for load balancing, then the status worker both provides additional information about the cluster and adds features (in the web UI) to manage the cluster, including taking instances (workers) out of the cluster. Again, this is beyond the scope of this post, but see the discussion of options like "edit" and "recover" in the aforementioned Tomcat status worker reference docs.
8) Monitoring the connector with JMX
Before concluding, let me note that another tool which could help with monitoring the connectors is the underlying Tomcat JMX beans. I hope to someday do a post on them, and how to get access to them, and hopefully also how to log them to watch changes in monitored values over time.
9) Conclusion
So the status worker is an interesting Tomcat feature, which CF users may want to leverage for monitoring the web server connector, which handles communications between the web server and CF. It can help track how the connector is being used, and it may help understand configuration issues regarding the connector.
I'd love to hear if others have success using the status worker, and for sure if I get to work with someone with the problem and we discover something using the status worker, I'll update here or create a new post.
Hope all that's helpful.
For more content like this from Charlie Arehart:
- Signup to get his blog posts by email
- Follow his blog RSS feed
- View the rest of his blog posts
- View his blog posts on the Adobe CF portal
Need more help with problems?
- If you may prefer direct help, rather than digging around here/elsewhere or via comments, he can help via his online consulting services
- See that page for more on how he can help a) over the web, safely and securely, b) usually very quickly, c) teaching you along the way, and d) with satisfaction guaranteed
I guess our setting is not correct because the timeout setting in wsconfig's configuration files is set to 60 seconds and I see connectionTimeout="20000" in the server.xml file. Should I set connectionTimeout="60000" in the server.xml file to match wsconfig? Or should I change the value in wsconfig configuration files to be 20 seconds instead of 60? Or does it even matter? Thanks
Now, I say above that "according to Adobe" you should make the change, but you ask if it "really matters", and in fact I would note that many people have by mistake made the change only to the property file of the connector and not remembered to change the server.xml. Or they may make the change in both, but then "reconfigure" their connector and forget to put the timeout change back in the property file.
In both cases, I have not found it to be the cause of any significant problem, myself. Could it cause a problem, if they are not in sync? I suppose. I'm just saying that I don't think the implications are well-understood. It would be something useful to better understand, for sure, for everyone's sake.
But bottom line for you: just make them be in sync. :-)
server.xml
<Connector protocol="HTTP/1.1" port="#" redirectPort="#" maxThreads="500" connectionTimeout="60000"/>
And the connectors:
worker.cfusion.connection_pool_timeout=60
worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size=500
That said, the other settings, like max reuse and pool size may well vary between the connectors (or not), and I would note that no one can say from the outside whether the particular values one chooses would be "good ones" or not, as there are too many variables.
Again, someday I want to blog about that with more observations. :-) For now, you're at least set with respect to the timeouts being in sync between this connector and CF.
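For anyone wanting to check that the two timeouts really are in sync, here is a minimal Python sketch. It relies on connectionTimeout in server.xml being in milliseconds and connection_pool_timeout in the connector's properties file being in seconds. The sample strings are the settings quoted above (with the placeholder port values left as they were), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Samples copied from the settings quoted above; "#" ports are placeholders.
SERVER_XML = ('<Connector protocol="HTTP/1.1" port="#" redirectPort="#" '
              'maxThreads="500" connectionTimeout="60000"/>')
WORKERS_PROPERTIES = """\
worker.cfusion.connection_pool_timeout=60
worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size=500
"""


def connector_timeout_ms(connector_xml):
    """Read connectionTimeout (milliseconds) from a <Connector> element."""
    return int(ET.fromstring(connector_xml).attrib["connectionTimeout"])


def pool_timeout_seconds(properties_text, worker="cfusion"):
    """Read worker.<name>.connection_pool_timeout (seconds) from properties text."""
    key = f"worker.{worker}.connection_pool_timeout"
    for line in properties_text.splitlines():
        name, _, value = line.partition("=")
        if name.strip() == key:
            return int(value.strip())
    raise KeyError(key)


def timeouts_in_sync(connector_xml, properties_text):
    """True when the two timeouts describe the same duration."""
    return connector_timeout_ms(connector_xml) == pool_timeout_seconds(properties_text) * 1000


if __name__ == "__main__":
    print(timeouts_in_sync(SERVER_XML, WORKERS_PROPERTIES))
```

With the values from the original question (20000 in server.xml and 60 in the connector), the same check would report that they are not in sync.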
http://web.archive.o...://blogs.coldfusion.com/post.cfm/configuring-status-worker-in-connectors
And since that will likely break in this blog software's reformatting of the URL, here's a bitly link:
http://bit.ly/cftomc...
For more on that archive, and how to leverage it for a question like this, see my post here:
http://www.carehart....
As for this blog post being missing from the Adobe blog, that is indeed very odd. I don't know of it being intentional. I will ask some folks to see if they may reply about this (or put it back. They may do neither, of course).
https://coldfusion.a...
We came across this when researching a problem with scheduled task timeouts, when we would see these errors via Fusion Reactor;
java.net.SocketTimeoutException: Read timed out
and
private native int socketRead0(final FileDescriptor p0, final byte[] p1, final int p2, final int p3, final int p4) throws IOException;
We hope that modifying the connectionTimeout setting in Server.xml and the worker.cfusion.connection_pool_timeout setting in the workers profile will help.
As for the read timeout you are seeing, well I doubt those will be fixed by tweaking the connection_timeout. That is about how long an IDLE (no longer used) connection can remain available in the connection pool. That will NOT affect a RUNNING request.
Now, as I also point out in the post, these are in fact facets of TOMCAT connector tuning, and I point to the docs on that. And one of the pages there is just about timeouts: https://tomcat.apach... There are various timeouts that could be affecting you--but I will note that most default to 0 and so should NOT lead to timeouts waiting for long requests.
As for your saying that the request you make (that fails) is a CF scheduled task, well that then opens up still MORE possibilities, as your issue could be about the caller (CF in this case) or the server you're calling (you don't say if it's a CF page on your own server), and the web server that fronts that.
All of those could have an impact on what may timeout, and how that would relay back to your sched task as a "read timeout".
If you may be interested in help to get to the bottom of this, that's the kind of consulting I do and I can often solve problems (remotely) in less than an hour that might have plagued folks for days, weeks, or months. For more, see the consulting page on my site here (including my satisfaction guarantee, my rates, my approach, and more).
But if you figure it out on your own (or we might together), do share what you learned, for others to learn from.
We updated Apache to 2.4.37 and it runs successfully.
The problem is that users are now unable to access the page: they get a 400 error saying that IE was able to connect to the web server but there is now a problem with the address.
And this blog post is not therefore an appropriate place to have a back and forth on your challenge and getting to its solution. (And FWIW, I don't have any immediate suggestion, but I would ask many questions to guide you to the problem and solution.)
Instead, you could ask this as a question on the Adobe CF portal (https://coldfusion.a...) where I and others do watch for such questions and try to help.
Or if you prefer direct (and more immediate) assistance I should be able to help you resolve this quickly via my remote troubleshooting consulting. More on my rates, approach, and satisfaction guarantee at https://www.carehart...