Accurate Time with NTP in a Hurry

Accurate timestamps are a focus of a great deal of regulation and scrutiny. While many environments require heavy-duty solutions like hardware-driven PTP for lock-step accuracy, a best effort is better than no effort. If nothing else, you should at least have NTP – the Network Time Protocol – running and properly configured.

Even without fine tuning, a basic NTP implementation can get your clocks into the single-digit millisecond range. It’s the bare minimum, and every single server you have should at least do this much.

Installing NTP

NTP must always be installed and running as a service.  (Details of this vary by Linux distro or OS.)  Install it via the distribution’s package manager (apt-get, yum, yast, etc.).  Do not install it directly from RPM or similar package files.  If you cannot access an official package repository, open the necessary firewall ports or set up an internal mirror. After all, regular patching of servers is the surest way to avoid security problems down the road.

Configuring NTP

NTP is configured via the /etc/ntp.conf file.  Restart the service after every change: ntpd reads this file only when it starts.

Example file:

server 165.193.126.229 iburst
server 206.246.122.250 iburst
server 128.138.140.44 iburst
server 66.219.116.140 iburst

restrict 10.80.5.0 mask 255.255.255.0 nomodify

driftfile /var/lib/ntp/drift/ntp.drift
logfile /var/log/ntp

Time Sources

Each time source is another NTP server, listed one per line in the following format:

server HOSTNAME/IP OPTIONS

You should always use exactly one option: iburst. This causes ntpd to synchronize more rapidly when first connecting or reconnecting to the source.  Thus:

server HOSTNAME/IP iburst

Use 3-5 time sources, preferably from the list of NIST servers if within the United States.  For other countries, follow local standard practices.

See this summary of additional options, though note that none of them should be necessary in most cases.

Note: Do not use the burst option. Many NTP servers will blacklist you if you do so. ONLY use iburst.
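The rules above are easy to check mechanically. The following is a minimal sketch, not part of NTP itself: check_time_sources is a hypothetical helper that scans the text of an ntp.conf and reports server lines that use burst, lack iburst, or fall outside the 3-5 source range.

```python
def check_time_sources(conf_text):
    """Return a list of warnings about the server lines in an ntp.conf string."""
    warnings = []
    servers = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore trailing comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "server":
            host, options = fields[1], fields[2:]
            servers.append(host)
            if "burst" in options:
                warnings.append(f"{host}: uses burst; remove it")
            if "iburst" not in options:
                warnings.append(f"{host}: missing iburst")
    if not 3 <= len(servers) <= 5:
        warnings.append(f"{len(servers)} time sources configured; expected 3-5")
    return warnings


if __name__ == "__main__":
    with open("/etc/ntp.conf") as handle:
        for warning in check_time_sources(handle.read()):
            print(warning)
```

An empty result means every server line uses iburst only and the source count is in range.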

 

Security

The restrict line defines which hosts can query this NTP server for time.  (All NTP servers both synchronize to servers above them and provide time synchronization to servers below them.)  In the example, we are allowing all hosts in the 10.80.5.0 subnetwork (netmask 255.255.255.0) to query our server for time, but not to make configuration changes (nomodify).
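To see which clients an address-and-mask pair covers, the standard ipaddress module can do the arithmetic. This is only an illustration of ordinary netmask matching, not ntpd's own code; restrict_covers is a hypothetical helper name.

```python
import ipaddress


def restrict_covers(address, mask, client):
    """True if `client` falls inside the range given by an address and netmask."""
    # strict=False tolerates a host address (e.g. 10.80.5.10) in place of the
    # network address; it is reduced to the enclosing network, 10.80.5.0/24.
    network = ipaddress.ip_network(f"{address}/{mask}", strict=False)
    return ipaddress.ip_address(client) in network


print(restrict_covers("10.80.5.0", "255.255.255.0", "10.80.5.77"))  # True
print(restrict_covers("10.80.5.0", "255.255.255.0", "10.80.6.1"))   # False
```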

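If you do not want any other servers to be able to query your server, then use the following instead.  Be aware that ignore drops all NTP packets, including the replies from your own time sources, so each source named on a server line then needs its own, more permissive restrict line: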

restrict default ignore

See the full security documentation if you need to configure more complex security.

 

Drift File

This file must be writable by the ntp daemon.  Ensure that it is created.  Its contents will look something like this:

39.351

The units for the drift file are “PPM” (parts per million). The number represents the drift of the motherboard’s hardware clock, and is used to “stay ahead” of any expected hardware clock drift.

1 PPM = 1 microsecond per second = 3.6ms per hour = 86.4ms per day

So, in this case, my server’s hardware clock is drifting by about 3.4s per day (39.351 × 86.4ms ≈ 3,400ms).
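The conversion is simple enough to script. A minimal sketch (drift_seconds_per_day is a hypothetical helper name):

```python
SECONDS_PER_DAY = 86_400


def drift_seconds_per_day(ppm):
    """Convert a drift-file value in PPM to seconds of drift per day."""
    # 1 PPM is one microsecond of error per second of elapsed time.
    return ppm * 1e-6 * SECONDS_PER_DAY


print(f"{drift_seconds_per_day(39.351):.3f} s/day")  # prints 3.400 s/day
```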

Note: If this drift number is suspected to have been calculated erroneously, you can safely stop ntpd, delete the file, and start it again. Time synchronization will take much longer once restarted, but the drift file will be recalculated. This can be useful if the ambient environment of the server has changed significantly.

 

Logfile

NTP must be able to write this logfile, though you will find what it writes to be mostly useless.

 

Detailed Statistics

If you suspect deeper issues with time synchronization, you can enable detailed stat generation with the following additional configuration lines:

statsdir /tmp/ntpstatdir                # directory for statistics files
filegen peerstats  file peerstats  type day enable
filegen loopstats  file loopstats  type day enable
filegen clockstats file clockstats type day enable

How to interpret these data
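As a starting point for reading these files, here is a minimal sketch that parses loopstats lines. It assumes the seven-field layout described in the ntpd monitoring documentation (modified Julian day, seconds past midnight, clock offset in seconds, frequency offset in PPM, jitter in seconds, wander in PPM, and the discipline time constant). Check that layout against your ntpd version before relying on it. The function names and the sample values in the usage note are made up for illustration.

```python
from collections import namedtuple

LoopSample = namedtuple(
    "LoopSample",
    "mjd seconds offset_s freq_ppm jitter_s wander_ppm time_constant",
)


def parse_loopstats_line(line):
    """Parse one loopstats record (assumed seven whitespace-separated fields)."""
    f = line.split()
    return LoopSample(
        int(f[0]), float(f[1]), float(f[2]), float(f[3]),
        float(f[4]), float(f[5]), int(f[6]),
    )


def worst_offset_ms(lines):
    """Largest absolute clock offset in the given records, in milliseconds."""
    samples = [parse_loopstats_line(line) for line in lines if line.strip()]
    return max(abs(s.offset_s) for s in samples) * 1000.0
```

For example, worst_offset_ms(open("/tmp/ntpstatdir/loopstats")) would report the largest offset recorded so far today, given the statsdir configured above.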

 

Troubleshooting NTP

Starting NTP

The time has to already be reasonably close to the correct time when ntpd is started: if the offset exceeds ntpd’s panic threshold (1000 seconds by default), ntpd logs an error and exits shortly after starting.  If NTP repeatedly exits this way, then use the following basic procedure once:

  1. Stop ntpd and confirm the service is stopped
  2. Force a single hard synchronization with “ntpd -gq” (or manually set the system time yourself)
  3. Start the ntp service again

If the single hard sync fails, the most likely cause is that you cannot reach the servers to which you are trying to synchronize.

After you have done this, the ntp service should run indefinitely.
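The three steps above can be sketched as a small script. The systemctl calls and the default service name are assumptions (some distros name the service ntp, others ntpd, and not all use systemd). Only “ntpd -gq” comes from the procedure itself. It must be run as root.

```python
import subprocess


def run_recovery(service="ntpd", runner=subprocess.run):
    """Stop the service, force one hard sync, and start the service again."""
    # Step 1: stop ntpd and confirm it is stopped.
    runner(["systemctl", "stop", service], check=True)
    # `systemctl is-active --quiet` exits 0 only while the unit is still active.
    if runner(["systemctl", "is-active", "--quiet", service]).returncode == 0:
        raise RuntimeError(f"{service} is still running")
    # Step 2: one-off hard synchronization; ntpd sets the clock and exits.
    runner(["ntpd", "-gq"], check=True)
    # Step 3: start the service again.
    runner(["systemctl", "start", service], check=True)


if __name__ == "__main__":
    run_recovery()
```

With check=True, a failing step raises subprocess.CalledProcessError, so the service is never restarted on top of a failed sync.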

Full command documentation

Note: Do not use cron jobs to start, stop, or synchronize time with ntp or ntpdate. ntp should always be running. Hard syncs like the above should only be used if the ntp service is unable to start.

Note: Do not use ntpdate for the once-off forced sync: this is long-deprecated.

Confirming NTP is Working

ntpstat

This command will tell you whether or not ntp is working, provide some basic stats, and give an exit code indicating whether or not the clock is synchronized.
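Because ntpstat reports its result through the exit code, it is easy to wire into a monitoring check. A minimal sketch, assuming the exit codes given in the ntpstat man page (0 synchronized, 1 not synchronized, 2 state indeterminate). The function names are hypothetical.

```python
import subprocess

NTPSTAT_MEANING = {
    0: "synchronized",
    1: "not synchronized",
    2: "state indeterminate (is ntpd running and reachable?)",
}


def describe_ntpstat(returncode):
    """Translate an ntpstat exit code into a short description."""
    return NTPSTAT_MEANING.get(returncode, f"unexpected exit code {returncode}")


def check_sync():
    """Run ntpstat and return (is_synchronized, description)."""
    result = subprocess.run(["ntpstat"], capture_output=True, text=True)
    return result.returncode == 0, describe_ntpstat(result.returncode)


if __name__ == "__main__":
    synced, description = check_sync()
    print(description)
    raise SystemExit(0 if synced else 1)
```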

Full documentation

ntpq

This is the general query tool for NTP.  Run it with the -p option to see the full list of peers to which this server is attempting to synchronize:

testlab3 ~ # ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 nist1-nj2.ustim .ACTS.           1 u  578 1024    0    0.000    0.000 4000.00
*nist1-pa.ustimi .ACTS.           1 u  982 1024  377   10.972   -4.568   1.265
+india.colorado. .NIST.           1 u  452 1024  377   74.793   -8.306   2.612
+nisttime.carson .ACTS.           1 u  817 1024  377   36.977    2.723  26.136

Here, we are synchronizing to nist1-pa.ustimi.  Our clock is offset by -4.568ms from its clock, and the network round-trip time (RTT) for NTP queries to it is 10.972ms.  Jitter measures how much the offset varies between successive samples, and so reflects the overall stability of the time sync.  Reach is an 8-bit shift register, displayed in octal, that records the last eight query attempts (377 is binary 11111111, meaning all eight succeeded; the 0 on the first line means none did).  Poll is how often, in seconds, our server queries this source.

The * indicates the current server to which we are synching.

The +s indicate other servers that are valid candidates for synchronization.
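If you want to monitor these values rather than eyeball them, the table is straightforward to parse. The following is a minimal sketch that assumes the ten-column layout shown above. parse_ntpq_peers is a hypothetical helper, and it leaves the "when" column unparsed because that column can carry unit suffixes.

```python
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 nist1-nj2.ustim .ACTS.           1 u  578 1024    0    0.000    0.000 4000.00
*nist1-pa.ustimi .ACTS.           1 u  982 1024  377   10.972   -4.568   1.265
+india.colorado. .NIST.           1 u  452 1024  377   74.793   -8.306   2.612
+nisttime.carson .ACTS.           1 u  817 1024  377   36.977    2.723  26.136
"""


def parse_ntpq_peers(output):
    """Parse `ntpq -p` text into a list of dicts, one per peer."""
    peers = []
    for line in output.splitlines():
        fields = line[1:].split()  # column 0 holds the tally character
        if len(fields) != 10 or fields[0] == "remote":
            continue  # skip the header, the ruler, and blank lines
        reach = int(fields[6], 8)  # reach is printed in octal
        peers.append({
            "tally": line[0],
            "remote": fields[0],
            "poll_s": int(fields[5]),
            "reach_bits": format(reach, "08b"),  # one bit per recent attempt
            "delay_ms": float(fields[7]),
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


for peer in parse_ntpq_peers(SAMPLE):
    if peer["tally"] == "*":
        print(peer["remote"], peer["offset_ms"], peer["reach_bits"])
```

On the sample above this selects nist1-pa.ustimi, whose reach of 377 decodes to eight successful attempts.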

Full NTPQ documentation

ntpdc

This is a more powerful and fiddly query tool for NTP.  To use it, simply run “ntpdc”, which opens an interactive prompt.  Then type “sysinfo” to get a data dump.

testlab3 ~ # ntpdc
ntpdc> sysinfo
system peer:          nist1-pa.ustiming.org
system peer mode:     client
leap indicator:       00
stratum:              2
precision:            -20
root distance:        0.01103 s
root dispersion:      0.04478 s
reference ID:         [206.246.122.250]
reference time:       d7e000df.608c8259  Wed, Oct  8 2014 14:28:47.377
system flags:         auth monitor ntp kernel stats
jitter:               0.014114 s
stability:            12.394 ppm
broadcastdelay:       0.003998 s
authdelay:            0.000000 s

See RFC 1305 for a full explanation of these variables.

Useful variables:

Precision (sys.precision, peer.precision, pkt.precision): This is a signed integer indicating the precision of the various clocks, in seconds to the nearest power of two. The value must be rounded to the next larger power of two; for instance, a 50-Hz (20 ms) or 60-Hz (16.67 ms) power-frequency clock would be assigned the value -5 (31.25 ms), while a 1000-Hz (1 ms) crystal-controlled clock would be assigned the value -9 (1.95 ms).
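The precision arithmetic can be checked in a few lines. This is a small sketch of the power-of-two rounding described above, using hypothetical helper names:

```python
import math


def precision_seconds(exponent):
    """Precision in seconds for a given power-of-two exponent."""
    return 2.0 ** exponent


def precision_exponent(tick_seconds):
    """Exponent of the next larger power of two for a clock tick in seconds."""
    return math.ceil(math.log2(tick_seconds))


print(precision_exponent(0.020), precision_seconds(-5))  # 50-Hz clock
print(precision_exponent(0.001), precision_seconds(-9))  # 1000-Hz clock
print(precision_seconds(-20))                            # value from sysinfo
```

The precision of -20 in the sysinfo output above therefore corresponds to roughly one microsecond.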

Root Delay (sys.rootdelay, peer.rootdelay, pkt.rootdelay): This is a signed fixed-point number indicating the total roundtrip delay to the primary reference source at the root of the synchronization subnet, in seconds. Note that this variable can take on both positive and negative values, depending on clock precision and skew.

Root Dispersion (sys.rootdispersion, peer.rootdispersion, pkt.rootdispersion): This is a signed fixed-point number indicating the maximum error relative to the primary reference source at the root of the synchronization subnet, in seconds. Only positive values greater than zero are possible.

Full NTPDC documentation

Multiple Servers Needing Time

If you have multiple servers in the same datacenter/subnet, only one of them should synchronize to the outside world. The others should synchronize locally to that one (as well as peer with one another). This gives the best case of local synchronization.

Ideally, have 2 internal NTP servers that synchronize each to 3-5 external sources and peer with one another.  Have all of your internal servers synchronize to both of these (and, optionally, peer with one another).
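As a rough illustration of that layout, the sketch below generates the time-source lines for each role. The hostnames ntp1.example.internal and ntp2.example.internal are hypothetical, the external addresses are the ones from the example file earlier, and the helper names are made up. Only the server and peer lines are produced, so restrict, driftfile, and logfile lines still need to be added.

```python
EXTERNAL_SOURCES = [
    "165.193.126.229",
    "206.246.122.250",
    "128.138.140.44",
    "66.219.116.140",
]


def internal_server_conf(external_sources, other_internal):
    """Source lines for one of the two internal NTP servers."""
    lines = [f"server {source} iburst" for source in external_sources]
    lines.append(f"peer {other_internal}")  # the two internal servers peer
    return "\n".join(lines)


def client_conf(internal_servers):
    """Source lines for every other host: sync to both internal servers."""
    return "\n".join(f"server {name} iburst" for name in internal_servers)


print(internal_server_conf(EXTERNAL_SOURCES, "ntp2.example.internal"))
print()
print(client_conf(["ntp1.example.internal", "ntp2.example.internal"]))
```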

Full NTP Documentation

How NTP synchronizes time

General documentation

I wrote this guide because I’ve realized, over the years, that almost no one uses NTP in what I would consider a useful way. It’s a poorly understood tool that is far more powerful than people often give it credit for being.

Time is critical to system administration. If you are a system administrator, understand NTP and use it.  You may end up using a more complex protocol like PTP for some specific, possibly esoteric, reason someday. But, until you can articulate exactly how a protocol like PTP buys you something over NTP, stick with NTP.

This little guide barely scratches the surface of NTP. If you really want or need to manage a time synchronization network, do yourself a favor and read the many links to further documentation I provided.  I glossed over, vaguely summarized, or otherwise took liberties with the finer details of the protocol to focus on the most important parts.

Your Next Steps

Choosing a time synchronization solution is, like anything, a cost/benefit analysis. There are certainly better solutions than NTP, and even the question of how accurate a clock actually is can be difficult to answer. This is not an authoritative guide on NTP, nor is NTP necessarily the right solution for you. But, NTP is the baseline. This is your starting point, the minimum investment required. Your needs, means, and expectations will guide your next steps.
