The network subsystem uses a memory management facility that revolves around a data structure called an mbuf. Mbufs are mostly used to store data in the kernel for incoming and outbound network traffic. Having mbuf pools of the right size can have a positive effect on network performance. If the mbuf pools are configured incorrectly, both network and system performance can suffer. The upper limit of the mbuf pool size, which is the thewall tunable, is automatically determined by the operating system, based on the amount of memory in the system. As the system administrator, you can tune the upper limit of the mbuf pool size.
The thewall network tunable option sets the upper limit for network kernel buffers. The system automatically sets the value of the thewall tunable to the maximum, and in general you should not change the value. You could decrease it, which would reduce the amount of memory the system uses for network buffers, but doing so might affect network performance. Because the system only uses the necessary number of buffers at any given time, if the network subsystem is not being heavily used, the total number of buffers should be much lower than the thewall value.
The unit of the thewall tunable is 1 KB, so a value of 1048576 represents 1048576 KB, which is 1024 MB or 1 GB of RAM.
The AIX 32-bit kernel has up to 1 GB of mbuf buffer space, consisting of up to four memory segments of 256 MB each. This value might be lower, based on the total amount of memory in the system. The size of the thewall tunable is either 1 GB or half of the amount of system memory, whichever value is smaller.
The AIX 64-bit kernel has a much larger kernel buffer capacity. It has up to 65 GB of mbuf buffer space, consisting of up to 260 memory segments of 256 MB each. With the 64-bit kernel, the size of the thewall tunable is either 65 GB or half of the amount of system memory, whichever value is smaller.
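The sizing rules above amount to taking the smaller of the kernel's architectural maximum and half of system memory. A minimal Python sketch of that arithmetic (the constants come from the text; the helper function itself is illustrative, not an AIX interface):

```python
# Sketch of how AIX caps the thewall tunable (values in MB).
# The constants reflect the text above; the function is illustrative.

KERNEL_MAX_MB = {
    "32-bit": 1 * 1024,    # up to 1 GB of mbuf buffer space
    "64-bit": 65 * 1024,   # up to 65 GB of mbuf buffer space
}

def thewall_cap_mb(kernel: str, system_memory_mb: int) -> int:
    """Return the smaller of the kernel maximum and half of system memory."""
    return min(KERNEL_MAX_MB[kernel], system_memory_mb // 2)

# A 1 GB system with the 32-bit kernel is limited by half its memory:
print(thewall_cap_mb("32-bit", 1024))        # 512 MB
# A 4 GB system with the 32-bit kernel hits the 1 GB kernel maximum:
print(thewall_cap_mb("32-bit", 4 * 1024))    # 1024 MB
# A 256 GB system with the 64-bit kernel hits the 65 GB kernel maximum:
print(thewall_cap_mb("64-bit", 256 * 1024))  # 66560 MB
```

The same system can therefore get a very different thewall cap depending only on which kernel it boots, which is the basis for the 64-bit recommendation that follows.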
Therefore, systems with large numbers of TCP connections, network adapters, or network I/O should consider using the 64-bit kernel if the mbuf pool is limiting capacity or performance.
The value of the maxmbuf tunable limits how much real memory is used by the communications subsystem. You can also use the maxmbuf tunable to lower the thewall limit. You can view the maxmbuf tunable value by running the lsattr -E -l sys0 command. If the maxmbuf value is greater than 0, the maxmbuf value is used regardless of the value of the thewall tunable.
The default value for the maxmbuf tunable is 0. A value of 0 for the maxmbuf tunable indicates that the thewall tunable is used. You can change the maxmbuf tunable value by using the chdev or smitty commands.
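The precedence between the two tunables can be expressed in one line. A hedged sketch (effective_mbuf_limit is an illustrative name, not an AIX routine):

```python
def effective_mbuf_limit(maxmbuf_kb: int, thewall_kb: int) -> int:
    """A nonzero maxmbuf overrides thewall; 0 means thewall applies."""
    return maxmbuf_kb if maxmbuf_kb > 0 else thewall_kb

# With the default maxmbuf of 0, thewall governs:
print(effective_mbuf_limit(0, 1048576))       # 1048576
# A nonzero maxmbuf wins regardless of thewall:
print(effective_mbuf_limit(524288, 1048576))  # 524288
```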
The sockthresh and strthresh tunables are upper thresholds that limit the opening of new sockets or TCP connections and the creation of new streams resources. They prevent buffer resources from being exhausted and ensure that existing sessions or connections have enough resources to continue operating.
The sockthresh tunable specifies the memory usage limit beyond which no new socket connections are allowed. The default value for the sockthresh tunable is 85%: once the total amount of allocated memory reaches 85% of the thewall or maxmbuf tunable value, new socket connections are refused, and the socket() and socketpair() system calls fail with the ENOBUFS error until buffer usage drops below 85%.
Similarly, the strthresh tunable limits the amount of mbuf memory used for streams resources; its default value is also 85%. The async and TTY subsystems run in the streams environment. The strthresh tunable specifies that once the total amount of allocated memory reaches 85% of the thewall tunable value, no more memory goes to streams resources, and streams calls to open streams, push modules, or write to streams devices fail with the ENOSR error.
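Both tunables follow the same threshold pattern. The sketch below models it with standard errno values (the helper names and the percentage bookkeeping are illustrative, not kernel interfaces):

```python
import errno

SOCKTHRESH_DEFAULT = 85  # percent of the thewall/maxmbuf limit
STRTHRESH_DEFAULT = 85   # percent of the thewall limit

def try_new_socket(allocated_kb, limit_kb, sockthresh=SOCKTHRESH_DEFAULT):
    """Return None if a new socket is allowed, else errno.ENOBUFS."""
    if allocated_kb * 100 >= limit_kb * sockthresh:
        return errno.ENOBUFS   # socket()/socketpair() would fail this way
    return None

def try_streams_alloc(allocated_kb, limit_kb, strthresh=STRTHRESH_DEFAULT):
    """Return None if streams may take more memory, else errno.ENOSR."""
    if allocated_kb * 100 >= limit_kb * strthresh:
        return errno.ENOSR     # open/push/write on streams would fail
    return None

# At 80% usage a new socket is allowed; at 90% it is refused:
print(try_new_socket(800, 1000))                    # None
print(try_new_socket(900, 1000) == errno.ENOBUFS)   # True
```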
You can tune the sockthresh and strthresh thresholds with the no command.
The mbuf management facility controls buffers of different sizes, ranging from 32 bytes up to 16384 bytes. The pools are created from system memory by making an allocation request to the Virtual Memory Manager (VMM). The pools consist of pinned pieces of kernel virtual memory, which always reside in physical memory and are never paged out. As a result, the real memory available for paging application programs and data is decreased by the amount that the mbuf pools have grown.
The network memory pool is split evenly among the processors. Each sub-pool is then split up into buckets, with each bucket holding buffers ranging in size from 32 to 16384 bytes. A bucket can borrow memory from other buckets on the same processor, but a processor cannot borrow memory from another processor's network memory pool.
When a network service needs to transport data, it can call a kernel service such as m_get() to obtain a memory buffer. If a buffer is already available and pinned, it is returned immediately. If no buffer is available and the upper limit has not been reached, a new buffer is allocated and pinned. Once pinned, the memory stays pinned but can be freed back to the network pool. If the number of free buffers reaches a high-water mark, a certain number of them are unpinned and given back to the system for general use; this unpinning is done by the netm() kernel process. The caller of the m_get() subroutine can specify whether to wait for a network memory buffer. If the M_DONTWAIT flag is specified and no pinned buffers are available at that time, a failed counter is incremented. If the M_WAIT flag is specified, the process is put to sleep until a buffer can be allocated and pinned.
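The allocation and unpinning behavior described above can be modeled with a toy pool. This is a Python sketch of the accounting only; m_get(), m_free(), and netm() are AIX kernel services, and the blocking M_WAIT path is deliberately left out:

```python
# Toy model of per-processor pinned-buffer accounting (illustrative only).

M_DONTWAIT, M_WAIT = 0, 1

class BufferPool:
    def __init__(self, limit, hiwat):
        self.limit = limit   # upper bound on pinned buffers (thewall-like)
        self.hiwat = hiwat   # high-water mark for free pinned buffers
        self.free = 0        # pinned buffers ready for immediate reuse
        self.pinned = 0      # total buffers currently pinned
        self.failed = 0      # failed M_DONTWAIT requests

    def m_get(self, how=M_DONTWAIT):
        if self.free:                 # a pinned buffer is ready: reuse it
            self.free -= 1
            return "buffer"
        if self.pinned < self.limit:  # allocate and pin a new buffer
            self.pinned += 1
            return "buffer"
        if how == M_DONTWAIT:         # no buffer and caller will not wait
            self.failed += 1
            return None
        raise NotImplementedError("M_WAIT would sleep until a buffer frees")

    def m_free(self, buf):
        self.free += 1
        if self.free > self.hiwat:    # the netm() step: unpin the excess
            excess = self.free - self.hiwat
            self.free -= excess
            self.pinned -= excess

pool = BufferPool(limit=2, hiwat=1)
a = pool.m_get()      # allocates and pins a new buffer
b = pool.m_get()      # pins a second buffer, reaching the limit
print(pool.m_get())   # None: limit reached, M_DONTWAIT fails
pool.m_free(a)
pool.m_free(b)        # free count exceeds hiwat: one buffer is unpinned
```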
Use the netstat -m command to detect shortages or failures of network memory (mbufs/clusters) requests. You can use the netstat -Zm command to clear (or zero) the mbuf statistics, which is helpful when running tests so that you start with a clean set of statistics. For each buffer size, the netstat -m output reports the number of buffers in use, the number of allocation calls, failed and delayed calls, free buffers, the high-water mark, and the number of buffers freed back to the system.
You should not see a large number of failed calls. There might be a few, which trigger the system to allocate more buffers as the buffer pool grows. The system starts with a predefined set of buffers of each size after each reboot and increases the number of buffers as necessary.
The following is an example of the netstat -m command output from a two-processor machine:
# netstat -m
Kernel malloc statistics:

******* CPU 0 *******
By size   inuse     calls  failed  delayed    free   hiwat   freed
32           68       693       0        0      60    2320       0
64           55       115       0        0       9    1160       0
128          21       451       0        0      11     580       0
256        1064      5331       0        0    1384    1392      42
512          41       136       0        0       7     145       0
1024         10       231       0        0       6     362       0
2048       2049      4097       0        0     361     362     844
4096          2         8       0        0     435     435     453
8192          2         4       0        0       0      36       0
16384         0       513       0        0      86      87     470

******* CPU 1 *******
By size   inuse     calls  failed  delayed    free   hiwat   freed
32          139       710       0        0     117    2320       0
64           53       125       0        0      11    1160       0
128          41       946       0        0      23     580       0
256          62      7703       0        0    1378    1392     120
512          37       109       0        0      11     145       0
1024         21       217       0        0       3     362       0
2048          2      2052       0        0     362     362     843
4096          7        10       0        0     434     435     449
8192          0         4       0        0       1      36       0
16384         0      5023       0        0      87      87    2667

***** Allocations greater than 16384 Bytes *****
By size   inuse     calls  failed  delayed    free   hiwat   freed
65536         2         2       0        0       0    4096       0

Streams mblk statistic failures:
0 high priority mblk failures
0 medium priority mblk failures
0 low priority mblk failures
The Address Resolution Protocol (ARP) is a protocol used to map 32-bit IPv4 addresses into the 48-bit host adapter addresses required by the data link protocol. ARP is handled transparently by the system. However, the system maintains an ARP cache, which is a table that maps each known 32-bit IP address to its 48-bit host address. You might need to increase the size of the ARP cache in environments where large numbers of machines (clients) are connected.
The no command tunable parameters that control the ARP cache are arptab_nb, arptab_bsiz, arpqsize, and arpt_killc.
The ARP table size is composed of a number of buckets, defined by the arptab_nb parameter. Each bucket holds arptab_bsiz entries. The defaults are 73 buckets with 7 entries each, so the table can hold 511 (73 x 7) host addresses. If a server connects to 1000 client machines concurrently, then the default ARP table is too small, which causes AIX to thrash the ARP cache. The operating system then has to purge an entry in the cache and replace it with a new address. This requires the TCP or UDP packets to wait (be queued) while the ARP protocol exchanges this information. The arpqsize parameter determines how many of these waiting packets can be queued by the ARP layer until an ARP response is received back from an ARP request. If the ARP queue is overrun, outgoing TCP or UDP packets are dropped.
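The capacity arithmetic above is simple enough to check directly (a Python sketch; arptab_nb and arptab_bsiz are the real no tunables, while the helper function and the example bucket count of 149 are only illustrative):

```python
def arp_table_capacity(arptab_nb=73, arptab_bsiz=7):
    """Total host addresses the ARP table can hold: buckets x entries."""
    return arptab_nb * arptab_bsiz

print(arp_table_capacity())               # 511 with the default values
# For roughly 1000 concurrent clients, raising the bucket count (149 is
# only an example value) gives headroom without lengthening each bucket:
print(arp_table_capacity(arptab_nb=149))  # 1043
```

Raising arptab_nb rather than arptab_bsiz keeps each bucket's search short while still growing the table past the client count.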
ARP cache thrashing has a negative impact on performance because every purged entry must later be rediscovered with another ARP exchange, and outgoing packets are queued while that exchange completes.
The arpqsize, arptab_bsiz, and arptab_nb parameters are all reboot parameters: because they alter tables that are built at boot time or at TCP/IP load time, the system must be rebooted for a change to take effect.
The arpt_killc parameter is the time, in minutes, before an ARP entry is deleted. The default value of the arpt_killc parameter is 20 minutes. ARP entries are deleted from the table every arpt_killc minutes to cover the case where a host system might change its 48-bit address, which can occur when its network adapter is replaced, for example. This ensures that any stale entries in the cache are deleted, as these would prevent communication with such a host until its old address is removed. Increasing this time would reduce ARP lookups by the system, but can result in holding stale host addresses longer. The arpt_killc parameter is a dynamic parameter, so it can be changed on the fly without rebooting the system.
The netstat -p arp command displays the ARP statistics. These statistics show how many total ARP requests have been sent and how many packets have been purged from the table when an entry was deleted to make room for a new one. If the purged count is high, your ARP table size should be increased. The following is an example of the netstat -p arp command.
# netstat -p arp
arp:
6 packets sent
0 packets purged
You can display the ARP table with the arp -a command. The command output shows which addresses are in the ARP table and into which buckets those addresses have been hashed.
? (10.3.6.1) at 0:6:29:dc:28:71 [ethernet] stored
bucket: 0 contains: 0 entries
bucket: 1 contains: 0 entries
bucket: 2 contains: 0 entries
bucket: 3 contains: 0 entries
bucket: 4 contains: 0 entries
bucket: 5 contains: 0 entries
bucket: 6 contains: 0 entries
bucket: 7 contains: 0 entries
bucket: 8 contains: 0 entries
bucket: 9 contains: 0 entries
bucket: 10 contains: 0 entries
bucket: 11 contains: 0 entries
bucket: 12 contains: 0 entries
bucket: 13 contains: 0 entries
bucket: 14 contains: 1 entries
bucket: 15 contains: 0 entries
...lines omitted...
There are 1 entries in the arp table.