@(#)Performance tips and tricks 01 SEP 2000
Rob Thomas robt@cymru.com

Cisco router performance tips and tricks

PREVENTING A "HUNG" PROCESS

The Cisco IOS is much like any operating system in that it schedules
and runs processes. Some of the processes support the essential tasks
of a router, such as IP Input. Others are for maintenance tasks, such
as Check heaps. It is possible, however, for a process to become
wedged, thus monopolizing the CPU. While this may not be immediately
apparent on some high-end routers (e.g. 7500 with VIPs, GSR12K), it
will cause woe eventually, and certainly on the shared memory routers.

To prevent a process from wedging on the CPU, we use the following
command:

    scheduler process-watchdog

There are four options to this command:

1. scheduler process-watchdog hang

   This will keep the process in the process table, but will no longer
   schedule it on the CPU.

2. scheduler process-watchdog normal

   This is the "default" behaviour as decided by the developers. The
   result is difficult to predict unless you have access to the source
   code.

3. scheduler process-watchdog reload

   This will reload the router. While this may seem a bit harsh, keep
   in mind that the loss of certain processes (e.g. IP Input) will
   almost certainly leave the router in an unusable state. In this
   case it would be better to reload.

4. scheduler process-watchdog terminate

   This choice will result in the termination of the process and the
   continued operation of the router. However, as noted above, the
   loss of some key process is certain to have an adverse effect on
   the router.

The choice you make is ultimately your own.

ALLOWING THE SCHEDULING OF LOWER PRIORITY PROCESSES

In cases where extremely high network load presents itself on the
interface of a router, it is possible that other tasks will not be able
to run. By default, the Cisco IOS allocates 5% of the CPU time to the
lower priority tasks.
During a high load event, such as a DDoS, this default may be
insufficient to ensure that other tasks, such as routing protocol
updates and CEF table maintenance, acquire CPU time.

To modify the default behaviour, we utilize the scheduler allocate
command. This is a global command, and is used thusly:

    scheduler allocate 4000 200

Where 4000 is the maximum number of microseconds to allocate to fast
switching within any single network interrupt context, and 200 is the
minimum guaranteed number of microseconds to allocate to process level
tasks while network interrupts are masked. The defaults are 4000 and
200, as shown above.

While the default settings are adequate for most steady-state
operation, you may find that modifications are necessary to ensure
console access to the routers during periods of extreme network stress.
Be careful, however, not to leave the network interrupt context -- the
heart and soul of the Cisco router -- ill served.

MODIFYING THE INTERVAL AT WHICH STATISTICS ARE GATHERED

By default, the Cisco IOS gathers statistics for the various counters
at an interval of 300 seconds, or five minutes. While this is suitable
for most states of operation, there may be select periods where a finer
(or coarser) interval is required. Keep in mind, however, that this
does have an effect on the performance of the router.

To change the statistics gathering interval, we use the load-interval
command, entered in interface configuration mode, thusly:

    load-interval 300

Where 300 is the number of seconds we wish to use as our interval. If
you set this to a shorter figure (e.g. 30 seconds), remember to set it
back when you have completed your data gathering.

--
Rob Thomas
http://www.cymru.com
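
A back-of-the-envelope way to reason about the scheduler allocate
values discussed above. The sketch below assumes a deliberately
simplified model: in the worst case the router spends the full
interrupt-time in network interrupt context, then the guaranteed
process-time at process level, and repeats. Under that assumption the
defaults of 4000 and 200 work out to a little under 5% for process
level tasks, which sits close to the 5% figure quoted earlier. The
helper name process_share is hypothetical, and this is not a
description of how the IOS scheduler is actually implemented.

```python
def process_share(interrupt_us: int, process_us: int) -> float:
    """Fraction of one worst-case cycle guaranteed to process-level tasks.

    Simplified model (an assumption, not IOS internals): each cycle is at
    most interrupt_us in interrupt context followed by at least process_us
    at process level.
    """
    if interrupt_us <= 0 or process_us <= 0:
        raise ValueError("both values must be positive microsecond counts")
    return process_us / (interrupt_us + process_us)


if __name__ == "__main__":
    # First pair is the default from the text; the others are illustrative.
    for interrupt_us, process_us in [(4000, 200), (4000, 1000), (3000, 1000)]:
        share = process_share(interrupt_us, process_us)
        print(f"scheduler allocate {interrupt_us} {process_us}: "
              f"about {share:.1%} guaranteed to process level")
```

Raising the second value, or lowering the first, increases the share
left for console access and routing protocol updates under load, at the
cost of time available to the network interrupt context.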