How GlassFish DAS communicates with Node Agents and Instances ...
Question : Explain the communication details between Domain Admin Server, node-agents and server instances in Sun's Application Server 8.x and 9.x (GlassFish V2).
Terminology :
DAS : Domain Admin Server (one per domain) -- The process that controls the management of the entire domain.
NA : Node Agent -- Generally, one per box or Solaris container -- The process that controls the life cycle of server instances.
SI : Server Instance -- The real Java EE instances that run user applications in an enterprise.
Answer :
1. Background : The domain.xml controls the configuration. Each node agent also has a few configuration files of its own that it consults. See the NA section at docs.sun.com for details. The communication (for administration/management purposes) happens at the following points in time:
- DAS communicates with each NA : Only when DAS needs to know NA's running status.
- DAS communicates with each SI : When DAS needs to know SI's running status and when it needs to cascade the SI MBeans into the DAS's MBeanServer.
- NA communicates with DAS : During initial rendezvous (which may happen during creation of NA), synchronization of the NA itself and synchronization of each SI that NA is responsible for.
- SI communicates with the DAS : Never, explicitly.
Thus, the communication is mainly driven by DAS. When the domain is created, administration is configured to use an authentication realm named admin-realm. This realm is a FileRealm, that is, a security realm implementation backed by a file named admin-keyfile. If you look at the domain's configuration, you'll find this file in the config folder of that domain.
The communication happens over two channels: the HTTP channel and the RMI channel. For this purpose, there is a SynchronizationServlet (for HTTP) and a System JMX Connector (standard in JDK 5, for RMI). The DAS, every NA and every SI each start a JMX RMI ConnectorServer that can optionally be configured to use transport layer security.
Every NA communicates with DAS multiple times, but the key points are the initial hand-shake and synchronization. The initial hand-shake is when the NA makes the DAS aware of its existence, and the DAS responds accordingly if the NA has supplied the correct credentials. When the DAS is configured to have secure access (this is the default in an enterprise profile domain), both the HTTP and JMX/RMI channels use Transport Layer Security with SSL/v3. Note that during the initial hand-shake, the DAS only learns of the NA's existence. DAS does not release the contents of the domain's repository during this phase. The hand-shake happens over the HTTP channel, since creation of a node-agent takes the DAS's admin-port (default: 4848) as an option.
After an NA is created, the most natural step is to start that NA. This is done by executing the asadmin start-node-agent command. Since this is the first-time startup of the NA, the NA syncs up with the DAS. Note that startup of the NA requires the correct credentials (admin user name and admin password) to be supplied. The DAS compares them against its own admin-keyfile and the communication succeeds only when they match. The NA startup also requires the master password to be provided on the command line because, in order to start, the NA has to be able to unlock the security store (e.g. keystore.jks) that it synced from the DAS. Note that the master password is never put on the wire! It has to be provided at the time of both DAS startup and every NA startup. For advanced use cases, there is an unattended boot scenario that is handled by using the option --savemasterpassword, which should be used with care.
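To make the start-up step concrete, here is a minimal sketch of how a wrapper script might assemble the start-node-agent invocation without typing passwords interactively. It assumes the asadmin options --user and --passwordfile and the password-file entries AS_ADMIN_PASSWORD and AS_ADMIN_MASTERPASSWORD; the helper names write_password_file and start_node_agent_command are made up for illustration. The sketch only builds the argument list, it does not run asadmin.

```python
import os
import stat
import tempfile


def write_password_file(admin_password, master_password, directory=None):
    """Write an asadmin-style password file that only the owner can read."""
    # mkstemp creates the file with owner-only permissions on POSIX systems.
    fd, path = tempfile.mkstemp(prefix="asadmin-", suffix=".pw", dir=directory)
    with os.fdopen(fd, "w") as handle:
        # Entry names are assumed from the asadmin password-file convention.
        handle.write("AS_ADMIN_PASSWORD=%s\n" % admin_password)
        handle.write("AS_ADMIN_MASTERPASSWORD=%s\n" % master_password)
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


def start_node_agent_command(agent_name, admin_user, password_file):
    """Build (but do not execute) the argument list for starting a node agent."""
    return [
        "asadmin", "start-node-agent",
        "--user", admin_user,
        "--passwordfile", password_file,
        agent_name,
    ]
```

The resulting list can be handed to subprocess.run once asadmin is on the PATH. The password file stays on the local machine, which is in keeping with the master password never going on the wire, but it should be deleted as soon as the agent is up.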
The NA also needs the master password so that it can pass it on to the SIs it starts (as part of start-instance or start-cluster), so that these instances are able to unlock the security store to get the private keys and certificates.
The NA always communicates with the DAS over the JMX/RMI channel. Thus the NA opens an RMI connection to the DAS, where the DAS is listening for RMI/JMX connections. This is where the RMI Registry in the DAS (default port 8686) comes into the picture.
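When an NA cannot reach the DAS, a quick first check is whether the two default ports mentioned so far (4848 for the admin HTTP channel, 8686 for the RMI registry) accept TCP connections from the NA's machine. The following is a rough sketch of such a probe; the function names port_is_open and probe_das are hypothetical, and the ports are just the defaults quoted above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def probe_das(host, admin_port=4848, jmx_port=8686):
    """Check the admin HTTP port and the JMX/RMI registry port of a DAS."""
    return {
        "admin-http": port_is_open(host, admin_port),
        "jmx-rmi": port_is_open(host, jmx_port),
    }
```

Calling probe_das("das.example.com") (a placeholder host name) returns a dictionary with one boolean per channel. An open registry port is necessary but may not be sufficient: RMI can hand out stubs that point at a further port, so a firewall can still break the exchange even when this probe succeeds.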
When the domain is created, it uses a self-signed certificate aliased s1as for internal communication. This certificate is created anew every time a domain is created. The master password of a domain is what locks the server's keystore. In an enterprise profile domain, NSS is used to manage the secure store, whereas in a cluster profile domain, JKS manages the secure store. The semantics of the master password are the same in both cases.
The Server Instances are synced with the DAS as part of either:
- start-instance, or
- start-cluster, or
- start-node-agent --syncinstances procedure.
For this synchronization, they use the HTTP layer and communicate with the SynchronizationServlet that's listening for sync requests. This servlet is (of course) running in the DAS.
The server instances get the admin credentials from the node-agent process in a secure manner (using stdin). This is also evident when you try to use the startserv script that's located in the instance's bin folder.
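The point of using stdin is that the secret never shows up in the child's command line or environment, where other local users could see it in a process listing. Here is a generic sketch of that pattern, not the node agent's actual code: a parent hands a secret to a child over stdin. The child here is a stand-in that merely reports the length of what it read, and the names CHILD_CODE and launch_with_secret are made up for this example.

```python
import subprocess
import sys

# Stand-in child program: reads one line from stdin and prints its length.
CHILD_CODE = "import sys; secret = sys.stdin.readline().rstrip('\\n'); print(len(secret))"


def launch_with_secret(argv, secret):
    """Run a child process, passing the secret on stdin rather than argv or env."""
    completed = subprocess.run(
        argv,
        input=secret + "\n",   # written to the child's stdin, then stdin is closed
        capture_output=True,
        text=True,
        check=True,            # raise if the child exits with a non-zero status
    )
    return completed.stdout.strip()
```

For example, launch_with_secret([sys.executable, "-c", CHILD_CODE], "changeit") runs the stand-in child and returns whatever it printed. A real launcher would start the instance's JVM instead and would not echo anything derived from the secret.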
The DAS communicates with the NAs and the SIs in the same way in both cases: over RMI/JMX, in the other direction (the DAS initiates the connection).
2. Transport Layer Security :
This is achieved when we enable the security-enabled flag on the admin-listener and jmx-connector named system on the DAS and server instances. Note that admin-listener (HTTP/S) is started only in the DAS. There is no admin-listener in server instances.
It's of course possible to use another CA-signed certificate for this purpose. It needs additional configuration after importing those certs in the store.
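To see whether the security-enabled flags described above are actually set, you can inspect domain.xml directly. The sketch below parses a heavily abbreviated, made-up fragment; the element and attribute names in it (http-listener with an id, jmx-connector with a name, the server-config config) are written from memory and should be checked against a real domain.xml before relying on this. The function name admin_tls_flags is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative fragment; a real domain.xml has many more elements.
SAMPLE_DOMAIN_XML = """
<domain>
  <configs>
    <config name="server-config">
      <http-service>
        <http-listener id="http-listener-1" port="8080" security-enabled="false"/>
        <http-listener id="admin-listener" port="4848" security-enabled="true"/>
      </http-service>
      <admin-service>
        <jmx-connector name="system" port="8686" security-enabled="true"/>
      </admin-service>
    </config>
  </configs>
</domain>
"""


def admin_tls_flags(domain_xml_text, config_name="server-config"):
    """Report security-enabled for the admin-listener and the system jmx-connector."""
    root = ET.fromstring(domain_xml_text)
    flags = {}
    for config in root.iter("config"):
        if config.get("name") != config_name:
            continue
        for listener in config.iter("http-listener"):
            if listener.get("id") == "admin-listener":
                flags["admin-listener"] = listener.get("security-enabled") == "true"
        for connector in config.iter("jmx-connector"):
            if connector.get("name") == "system":
                flags["jmx-connector:system"] = connector.get("security-enabled") == "true"
    return flags
```

Running admin_tls_flags on the text of a domain's domain.xml gives one boolean per channel. For a server instance's config you would expect only the jmx-connector entry, since there is no admin-listener in server instances.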
3. Authentication and Credentials :
Please see: http://wiki.glassfish.java.net/attach/GlassFishAdministrationPages/admincreds.html
Thanks for this. It's useful to understand the communication that takes place between the DAS, node agents and server instances.
Ismael
Posted by Ismael Juma on October 17, 2008 at 11:58 PM PDT #
Hi Kedar!
Is it possible for a DAS and NAs to talk over RMI-IIOP? For some setups it's desirable to have NAs in less secure nets, and the DASes in more secure nets with firewalls between them. In this case plain RMI can cause headaches, no?
Posted by sysprv on January 09, 2009 at 04:30 AM PST #
Hi Kedar
I'm fairly new to glassfish and I'm wondering why nobody talks clustering of 2 or more DAS. I know I can cluster nodes but if I can have only one DAS per domain and that box becomes unavailable do I have no fail over at all ?
Posted by Alex on April 14, 2009 at 12:38 AM PDT #
sysprv,
Yes, that's an issue with GlassFish v2.
Alex,
Yes, admin availability is affected if DAS goes down. But the user applications will continue to have failover even if DAS does not run. Do you think you need a highly available DAS?
-Kedar
Posted by Kedar Mhaswade on April 14, 2009 at 02:44 PM PDT #
Kedar
I was just wondering if DAS clustering existed at all. Let's say I lose the DAS for a longer period of time; would it be enough to point all agents to a new DAS that I created? Could you describe a poor-man's DAS failover?
Posted by Alex Stuck on April 14, 2009 at 04:45 PM PDT #
Ah, that's a slightly different thing. Here's what you can do --
- Always take a backup of your domain after reaching a stable configuration. This can be done using "asadmin backup-domain". This creates a zip file with a proper date-stamp etc.
- Assuming you have a reasonably recent backup and your DAS machine (A) goes down, you quickly restore the backed up zip file on a new machine (B). Start this new incarnation of the DAS on machine B.
- Stop all node-agents and instances and edit the node-agent-folder/agent/config/das.properties and change the DAS location to B from A. Restart the node-agents. The hand-shake should occur and DAS should recognize the node-agents and instances.
This is poor-man's DAS resurrection (not really a fail-over).
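For the das.properties edit in the last step, a small script can save some typing when there are many node agents. This is only a sketch: the property key agent.das.host is an assumption, so check the actual key in your own das.properties and pass it as host_key if it differs. The function name repoint_das is made up.

```python
def repoint_das(properties_text, new_host, host_key="agent.das.host"):
    """Return properties_text with the value of host_key replaced by new_host."""
    updated = []
    found = False
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Leave comments and anything that is not a key=value pair untouched.
        if not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key == host_key:
                updated.append("%s=%s" % (host_key, new_host))
                found = True
                continue
        updated.append(line)
    if not found:
        # Fail loudly rather than silently writing back an unchanged file.
        raise KeyError(host_key)
    return "\n".join(updated) + "\n"
```

Read the file, pass its text through repoint_das with the host name of machine B, and write the result back while the node agent is stopped. Keep a copy of the original file in case the key turns out to be different.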
Let me know (by sending an e-mail to kedar.mhaswade@sun.com) if this works for you.
Posted by Kedar Mhaswade on April 14, 2009 at 05:03 PM PDT #
Kedar
Thanks I will try that. Sounds about right.
Hey do you know why my agents still ask me for username/masterpassword when I start them although I created them like this :
asadmin create-node-agent --host {dashost} --secure=true --savemasterpassword=true {host}_agent
From the docs:
"To enable the node agent to be started without prompting the user for a password, save the node agent's master password to a file when you create the node agent"
The file "master-password" exists. Any ideas ?
Posted by Alex Stuck on April 14, 2009 at 07:07 PM PDT #
Got it to work - wow that took some messing around.
Still trying to understand the architecture better.
I found your post on how agents need to sync with the DAS one time first before they can use the password-alias.
Thx
Posted by Alex Stuck on April 15, 2009 at 07:51 PM PDT #