The Realities of NIS

13 NOV 1997 Rob Thomas robt@cymru.com

NOTE: This paper was written in response to the statement that the ONLY solution was NIS. I wanted to point out that NIS does in fact suffer from several limitations of which an admin MUST be aware. This is not intended to be an outright condemnation of NIS, nor do I believe that one is never justified in utilizing NIS. As I note later in this paper, NIS might in fact be the best of an overly mediocre bunch of centralized network admin tools.

NIS is a horrid solution that does not scale well, adds unnecessary network traffic, suffers from cross-platform limitations, and lowers overall system and network security.

NIS does not scale well. NIS clients suffer a bottleneck at the NIS server (be it slave or master) that only worsens as the network grows. Since NIS utilizes RPC, a given NIS server that is also serving any other RPC-based service (e.g. NFS, ClearCase, etc.) will suffer RPC timeouts and re-transmits as the traffic load increases in any one area. This punishment is then passed along to the clients in the form of increased times for NIS lookups. Note, too, that NIS clients do not cache much (if any) of the information given to them by the NIS servers. This means that each simple lookup results in a batch of RPC traffic between the NIS client and the NIS server.

NIS adds a great deal of otherwise unnecessary network traffic. One must consider how the kernel actually deals with certain seemingly mundane tasks. Let us consider a simple "ls -l" of a shared work-area directory. We will call this directory "foo", with the following files:

-rw-r--r--   1 nitefyre cust       22 Nov  2 06:23 18682.idx
-rw-r--r--   1 proknich cust  3360346 Oct 16 14:32 28979.idx
-rw-r--r--   1 crown    cust   104652 Oct  9 07:26 411.idx
-rw-r--r--   1 rfreeman cust      663 Oct 22 09:13 L104970TMP.html.save
-rw-r--r--   1 dehl     cust    13312 Oct 27 14:07 L118950TMP.bin
-rw-r--r--   1 bavarian cust    36108 Nov  3 23:30 L138582TMP.jpeg
-rw-------   1 cabbit   cust    94492 Nov  8 23:19 L173020TMP.qt
-rw-------   1 cabbit   cust 29980368 Nov  9 04:54 L1847015TMP.qt
-rw-r--r--   1 cgilmore cust    39176 Oct  6 13:53 L185030TMP.bin
-rw-r--r--   1 rpfries  cust   459183 Oct  6 15:06 L225900TMP.bin
-rw-r--r--   1 jnp      cust   772707 Oct 20 17:38 L240570TMP.bin
-rw-r--r--   1 rs       cust        0 Oct 12 17:16 L26421TMP.txt
-rw-r--r--   1 aroldan  cust   885020 Oct 25 21:34 L75490TMP.bin
-rw-r--r--   1 bobprice cust    55040 Oct 10 09:48 L90910TMP.bin
-rw-r--r--   1 mwt      cust  2517812 Oct  7 21:59 core

Now, ask yourself: What actually happens when I type "ls -l" in this directory? Remember that all the inode has is a UID and GID for the given file. So, an NIS call must be made for EVERY file you see here, to match BOTH the UID and the GID. Then you are presented with the output. That is a great many calls for 15 files. And since such information is only cached on Sun (nscd, and not a very large cache), you have just added a bevy of network traffic so that ONE user could list the files in a directory. But this grows worse: Consider that most shared work dirs are actually NFS mounted from a central point. Now what have you done to the network? You send an NFS getattr() call to the NFS server, which then sends several calls out to the NIS server, so that the NFS server can give the NFS client a listing of 15 files. I have seen, using the Etherman tool, a reliable 12% to 22% network utilization, by a single NFS server, for only NIS calls! Hardly an efficient use of already precious network bandwidth. So do you turn every NFS server into an NIS slave? How, then, do you prevent rogue NIS clients from binding to your NFS/NIS servers, and thereby increasing the number of overall RPC calls weathered by the portmapper/rpcbind process on your already busy NFS server?
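To put rough numbers on the "ls -l" example, here is a minimal Python sketch that counts name-service lookups for the 15 files in "foo", first with no client-side cache and then with a simple per-listing cache. The CountingNameService class and the helper functions are hypothetical stand-ins invented for illustration; they are not part of NIS or of any real library, and the owner names stand in for the numeric UIDs that the inode actually stores.

```python
# Owners of the 15 files in "foo", in listing order; every file is in group "cust".
# In reality the inode holds numeric IDs; the names here stand in for them.
OWNERS = ["nitefyre", "proknich", "crown", "rfreeman", "dehl", "bavarian",
          "cabbit", "cabbit", "cgilmore", "rpfries", "jnp", "rs",
          "aroldan", "bobprice", "mwt"]
GROUP = "cust"


class CountingNameService:
    """Hypothetical stand-in for an NIS server; counts each lookup it answers."""

    def __init__(self):
        self.calls = 0

    def lookup(self, map_name, key):
        self.calls += 1
        return key  # a real service would translate an ID into a name


def list_directory(service, cache=None):
    """Resolve owner and group for every file, as a long listing must."""
    for owner in OWNERS:
        for map_name, key in (("passwd", owner), ("group", GROUP)):
            if cache is None:
                service.lookup(map_name, key)          # always ask the server
            elif (map_name, key) not in cache:
                cache[(map_name, key)] = service.lookup(map_name, key)


def count_lookups(use_cache):
    service = CountingNameService()
    list_directory(service, cache={} if use_cache else None)
    return service.calls


uncached = count_lookups(use_cache=False)
cached = count_lookups(use_cache=True)
print(uncached, cached)  # prints "30 15"
```

Under these assumptions the uncached listing costs two lookups per file, 30 in all, while the cached listing costs one lookup per distinct owner (14) plus one for the single group. The sketch counts lookups only; each lookup may itself involve more than one RPC exchange on the wire.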

NIS suffers from cross-platform limitations. For example, DEC's OSF/1 NIS server will not properly serve either Sun (SunOS and Solaris) or HP (HP-UX 9.X and higher) NIS slaves or clients. Calls for password entries, in this case, result in a consistent NULL population of the u_ information. Furthermore, only HP allows the NIS server to ypxfr maps in parallel mode. This means that all other NIS servers (Sun, SGI, DEC, IBM) can only ypxfr the maps to one slave at a time, in serial fashion. Now, given a large-scale implementation of NIS, how can you promise data (map) integrity at any given moment? Which NIS slaves, then, do you elect to be updated first? Last? In other words, as your NIS domain grows, your promised integrity decreases in linear fashion.
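The linear growth of the inconsistency window can be illustrated with a little arithmetic. The following Python sketch is only a model: the function name, the slave count, and the per-transfer time are assumptions chosen for illustration, and the parallel case ignores any contention on the master or the network.

```python
def staleness_windows(num_slaves, seconds_per_transfer, parallel=False):
    """Seconds after a master update until each slave holds the new map.

    Serial mode: slave i must wait for every transfer queued ahead of it.
    Parallel mode (idealized): all transfers start at once.
    """
    if parallel:
        return [seconds_per_transfer] * num_slaves
    return [seconds_per_transfer * (i + 1) for i in range(num_slaves)]


# Hypothetical domain: 20 slaves, 30 seconds per map transfer.
serial = staleness_windows(20, 30)
parallel = staleness_windows(20, 30, parallel=True)
print(serial[-1], parallel[-1])  # prints "600 30"
```

In this model the last slave in a serial push serves stale data for 600 seconds, against 30 seconds in the idealized parallel case, and doubling the number of slaves doubles the serial figure.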

Note, too, that the industry at large is looking towards NIS+ to be the solution to NIS' many woes. Although Sun, in Solaris 2.6, officially brought NIS back to life (as opposed to the lightly supported NISkit), even HP is going towards NIS+ as part of the industry-wide push to ONC+ and SecureRPC. NIS+, however, is also fraught with errors and limitations, mostly linked to the effort to "re-write NIS", as opposed to crafting a better solution.

NIS is a horrible security hole. In fact, one could successfully argue that any RPC-based service is a security hole waiting to be exploited. However, NIS wins dubious honors here, if only because of the bevy of NIS exploit code freely available on the Internet. NIS is one of the easiest means by which a cracker can gain entry to your network.

Although the tone of this paper is certainly critical, one must reflect upon Theo de Raadt's infamous quote. Theo, one of the chief architects of BSD, was asked to state which UNIX branch was better: SVR4 or BSD. To which Theo replied:

The point here is that NIS is just one of many mediocre solutions to the distributed file conundrum. NIS, NIS+, Hesiod, rdist, and others are all contenders for the central-management nightmare and inefficiency award.

The point of all this is simple: Stating that centralized management of key system files is "easier" is just so much bunkum. Easier for whom? The admins? Perhaps. The hosts? Not at all. The user community? Hard to prove. In other words, choosing NIS should not be done lightly, nor simply to avert the deletion of key files. As Corey pointed out, he could just as easily have removed the password map from the NIS master. This would have, of course, resulted in the eventual lockout of ALL NIS clients in the given NIS domain, instead of lockout for a single machine. So, therein lies the rub: Centralized management tools are, in fact, easier. Easier to admin your network. Easier to completely destroy your network.

So choose your file distribution and syncing tool wisely. Know exactly WHAT issue you are trying to solve for the end CUSTOMER (not the admins), and then be sure that your chosen solution truly meets that task and standard.