Monday, June 16, 2008

RHQ - tip of the day: sync your clocks

RHQ and JBossON 2, which is built on top of RHQ, are distributed systems. The agents can (and will) live on different machines than the server.

Metric data that is gathered on the agents is sent to the server and stored in the database. If you want to compare metrics from two different sources (e.g. the load of two JBossAS servers in a cluster), the comparison only makes sense when the metrics were taken at the same time. A difference of a few seconds usually doesn't hurt, but if it is more, you will see a peak on one server at one time and a peak on the other server at another time. This will make you wonder why your load balancer was distributing the load in strange ways. In reality the peaks occurred at the same time and the system was working as it should.



In the above image you see on the left side two peaks that obviously do not occur at the same time. On the right, with synchronized clocks, you see that those peaks, although at different levels, occurred simultaneously.
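To make the effect concrete, here is a small Python sketch. It is not RHQ code: the helper names `peak_time` and `report`, the load values and the 3-minute offset are all made up for illustration. Both servers experience the same load peak in real time, but one agent stamps its samples with a clock that runs behind.

```python
from datetime import datetime, timedelta

def peak_time(samples):
    """Return the timestamp of the highest value in a list of (timestamp, value) pairs."""
    return max(samples, key=lambda s: s[1])[0]

def report(true_samples, clock_offset):
    """Stamp samples the way an agent with a skewed clock would (hypothetical helper)."""
    return [(t + clock_offset, v) for t, v in true_samples]

start = datetime(2008, 6, 16, 12, 0, 0)

# Illustrative load values, one sample per minute; the real peak is at minute 5.
load = [1.0, 1.2, 1.1, 1.3, 2.0, 4.5, 2.1, 1.2, 1.0, 1.1]
true_samples = [(start + timedelta(minutes=i), v) for i, v in enumerate(load)]

server_a = report(true_samples, timedelta(0))           # synchronized clock
server_b = report(true_samples, timedelta(minutes=-3))  # clock runs 3 minutes behind

# The peaks happened simultaneously, but the stored timestamps disagree.
apparent_gap = peak_time(server_a) - peak_time(server_b)
print(apparent_gap)  # 0:03:00
```

The apparent gap between the two peaks is exactly the clock offset, which is why the graphs line up again once the clocks are synchronized.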


Another issue that we have seen in the past is that metrics were not showing up in the GUI at all, and users thought that something was broken or that the agent was not collecting data (especially directly after install). The cause was that the agent clock was so far in the past or in the future that its values fell well outside of the displayed time range.



With this setting of a display range of 1 hour, you won't see metric data from an agent after it has started if its clock is 2 hours away from the master clock.
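The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration under the assumption that the GUI shows samples whose timestamps fall between "now minus display range" and "now" on the server; `is_visible` is a hypothetical helper, not part of RHQ.

```python
from datetime import datetime, timedelta

def is_visible(sample_time, server_now, display_range):
    """True if the sample's timestamp lies inside [server_now - display_range, server_now]."""
    return server_now - display_range <= sample_time <= server_now

server_now = datetime(2008, 6, 16, 12, 0, 0)
display_range = timedelta(hours=1)

# A sample taken one minute ago, stamped by three different agent clocks.
in_sync = server_now - timedelta(minutes=1)
two_hours_behind = in_sync - timedelta(hours=2)
two_hours_ahead = in_sync + timedelta(hours=2)

print(is_visible(in_sync, server_now, display_range))           # True
print(is_visible(two_hours_behind, server_now, display_range))  # False
print(is_visible(two_hours_ahead, server_now, display_range))   # False
```

In both skewed cases the data has been collected and stored, it just lies outside the window that is being displayed.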



