ice FreeBSD* Base Driver for the Intel(R) Ethernet 800 Series of Adapters
==========================================================================

May 18, 2021

Contents
========

- Overview
- Identifying Your Adapter
- The VF Driver
- Building and Installation
- Configuration and Tuning
- Known Issues/Troubleshooting

Important Notes
===============

Firmware Recovery Mode
----------------------
A device will enter Firmware Recovery mode if it detects a problem that
requires the firmware to be reprogrammed. A device in Firmware Recovery
mode will not pass traffic or allow any configuration; you can only attempt
to recover the device's firmware. Refer to the "Intel(R) Ethernet Adapters
and Devices User Guide" for details on Firmware Recovery Mode and how to
recover from it.

Overview
========

This file describes the FreeBSD* driver for Intel(R) Ethernet. This driver
has been developed for use with all community-supported versions of
FreeBSD.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel Ethernet Adapter. All hardware requirements listed
apply to use with FreeBSD.

The associated Virtual Function (VF) driver for this driver is iavf.

Identifying Your Adapter
========================

The driver is compatible with devices based on the following:

* Intel(R) Ethernet Controller E810-C
* Intel(R) Ethernet Controller E810-XXV

For information on how to identify your adapter, and for the latest Intel
network drivers, refer to the Intel Support website:
http://www.intel.com/support

The VF Driver
=============

The VF driver is normally used in a virtualized environment where a host
driver manages SR-IOV and provides a VF device to the guest. In the FreeBSD
guest, the iavf driver is loaded and functions using the VF device assigned
to it.

The VF driver provides most of the same functionality as the core driver,
but is actually subordinate to the host. Access to many controls is
accomplished by a request to the host via what is called the "Admin queue."
These requests occur mainly during startup and initialization; once in
operation, the device is self-contained and should achieve near-native
performance.

Some notable limitations of the VF environment:

* The PF can configure the VF to allow promiscuous mode, using iovctl.
* Media info is not available from the PF, so it will always appear as
  auto.
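For reference, host-side SR-IOV configuration on FreeBSD is typically done
with iovctl(8). The following is a minimal sketch only; the device name
ice0, the file path, and the VF count are illustrative assumptions, not
values prescribed by this guide:

   # Hypothetical /etc/iov/ice0.conf
   PF {
           device : "ice0";    # PF to create VFs on (assumed unit)
           num_vfs : 2;        # number of VFs to create
   }

Apply the configuration with:

   # iovctl -C -f /etc/iov/ice0.conf

See the iovctl(8) and iovctl.conf(5) man pages for the full schema,
including per-VF subsections.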
Adaptive Virtual Function
-------------------------
Adaptive Virtual Function (AVF) allows the virtual function driver, or VF,
to adapt to changing feature sets of the physical function driver (PF) with
which it is associated. This allows system administrators to update a PF
without having to update all the VFs associated with it. All AVFs have a
single common device ID and branding string.

AVFs have a minimum set of features known as "base mode," but may provide
additional features depending on what features are available in the PF with
which the AVF is associated. The following are base mode features:

- 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs)
  for Tx/Rx
- iavf descriptors and ring format
- Descriptor write-back completion
- 1 control queue, with iavf descriptors, CSRs and ring format
- 5 MSI-X interrupt vectors and corresponding iavf CSRs
- 1 Interrupt Throttle Rate (ITR) index
- 1 Virtual Station Interface (VSI) per VF
- 1 Traffic Class (TC), TC0
- Receive Side Scaling (RSS) with 64-entry indirection table and key,
  configured through the PF
- 1 unicast MAC address reserved per VF
- 16 MAC address filters for each VF
- Stateless offloads - non-tunneled checksums
- AVF device ID
- HW mailbox is used for VF to PF communications (including on Windows)

Building and Installation
=========================

NOTE: This driver package is to be used only as a standalone archive; do
not attempt to incorporate it into the kernel source tree.

In the instructions below, x.x.x is the driver version as indicated in the
name of the driver tar file.

1. Move the base driver tar file to the directory of your choice. For
   example, use /home/username/ice or /usr/local/src/ice.

2. Untar/unzip the archive:

   # tar xzf ice-x.x.x.tar.gz

   This will create the ice-x.x.x directory.

3. To install the man page:

   # cd ice-x.x.x
   # gzip -c ice.4 > /usr/share/man/man4/ice.4.gz

4. To load the driver onto a running system:

   # cd ice-x.x.x/src
   # make
   # kldload ./if_ice.ko

   NOTE: Running the make command alone will not install the Dynamic Device
   Personalization (DDP) package and could cause the driver to fail to
   load. See step 7 below for more information.

5. To assign an IP address to the interface, enter the following, where X
   is the interface number for the device:

   # ifconfig iceX <IP_address>

6. Verify that the interface works. Enter the following, where
   <IP_address> is the IP address for another machine on the same subnet
   as the interface that is being tested:

   # ping <IP_address>

7. If you want the driver to load automatically when the system is booted:

   # cd ice-x.x.x/src
   # make
   # make install

   NOTE: It's important to do 'make install' so that the driver loads the
   DDP package automatically.

   Edit /boot/loader.conf, and add the following line:

   if_ice_load="YES"

   Edit /etc/rc.conf, and create the appropriate ifconfig_iceX entry:

   ifconfig_iceX="<ifconfig_settings>"

   Example usage:

   ifconfig_ice0="inet 192.168.10.1 netmask 255.255.255.0"

   NOTE: For assistance, see the ifconfig man page.
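After 'make install' and a reload of the driver, one way to sanity-check
that the module and DDP package loaded is sketched below. The unit number
0 is an assumption (use your device's number), and the exact DDP log
wording varies:

   # kldstat -n if_ice.ko
   # sysctl dev.ice.0.%desc
   # dmesg | grep -i ddp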
Configuration and Tuning
========================

Important System Configuration Changes
--------------------------------------

- Edit the file /etc/sysctl.conf, and add the following line (the default
  value is 1000):

  hw.intr_storm_threshold=0

- Best throughput results are seen with a large MTU; use 9706 if possible.

- The default number of descriptors per ring is 1024. Increasing this may
  improve performance, depending on your use case.

- If you have a choice, run on a 64-bit OS rather than a 32-bit OS.

Configuring for iflib
---------------------

iflib is a common framework for network interface drivers for FreeBSD that
uses a shared set of sysctl names. This driver works best with iflib in
FreeBSD 11.3 and later. See the iflib man page for more information.

Dynamic Device Personalization
------------------------------

Dynamic Device Personalization (DDP) allows you to change the packet
processing pipeline of a device by applying a profile package to the device
at runtime. Profiles can be used to, for example, add support for new
protocols, change existing protocols, or change default settings. DDP
profiles can also be rolled back without rebooting the system.

The ice driver automatically installs the default DDP package file during
driver installation.

NOTE: It's important to do 'make install' during initial ice driver
installation so that the driver loads the DDP package automatically.

The DDP package loads during device initialization. The driver looks for
the ice_ddp module and checks that it contains a valid DDP package file.

If the driver is unable to load the DDP package, the device will enter
Safe Mode. Safe Mode disables advanced and performance features and
supports only basic traffic and minimal functionality, such as updating the
NVM or downloading a new driver or DDP package. Safe Mode only applies to
the affected physical function and does not impact any other PFs. See the
"Intel(R) Ethernet Adapters and Devices User Guide" for more details on DDP
and Safe Mode.

NOTES:
- If you encounter issues with the DDP package file, you may need to
  download an updated driver or ice_ddp module. See the log messages for
  more information.
- You cannot update the DDP package if any PF drivers are already loaded.
  To overwrite a package, unload all PFs and then reload the driver with
  the new package.
- You can only use one DDP package per driver, even if you have more than
  one device installed that uses the driver.
- Only the first loaded PF per device can download a package for that
  device.

FW-LLDP (Firmware Link Layer Discovery Protocol)
------------------------------------------------

Use sysctl to change FW-LLDP settings. The FW-LLDP setting is per port and
persists across boots.

To enable LLDP:

   # sysctl dev.ice.<interface_num>.fw_lldp_agent=1

To disable LLDP:

   # sysctl dev.ice.<interface_num>.fw_lldp_agent=0

To check the current LLDP setting:

   # sysctl dev.ice.<interface_num>.fw_lldp_agent

or

   # sysctl -a | grep lldp

NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting
to take effect. If "LLDP Agent" is set to disabled, you cannot enable it
from the OS.

Jumbo Frames
------------

Jumbo Frames support is enabled by changing the Maximum Transmission Unit
(MTU) to a value larger than the default value of 1500.

Use the ifconfig command to increase the MTU size. For example, enter the
following where X is the interface number:

   # ifconfig iceX mtu 9000

To confirm an interface's MTU value, use the ifconfig command.

To confirm the MTU used between two specific devices, use:

   # route get <destination_IP>

NOTE: The maximum MTU setting for jumbo frames is 9706. This corresponds to
the maximum jumbo frame size of 9728 bytes.

NOTE: This driver will attempt to use multiple page-sized buffers to
receive each jumbo packet. This should help to avoid buffer starvation
issues when allocating receive packets.

NOTE: Packet loss may have a greater impact on throughput when you use
jumbo frames. If you observe a drop in performance after enabling jumbo
frames, enabling flow control may mitigate the issue.
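To make a jumbo MTU persistent across reboots, it can be appended to the
interface's /etc/rc.conf entry. The address and unit number below are
illustrative only:

   ifconfig_ice0="inet 192.168.10.1 netmask 255.255.255.0 mtu 9000"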
VLANS
-----

To create a new VLAN interface:

   # ifconfig <vlan_name> create

To associate the VLAN interface with a physical interface and assign a
VLAN ID, IP address, and netmask:

   # ifconfig <vlan_name> <IP_address> netmask <subnet_mask> vlan
     <vlan_ID> vlandev <physical_interface>

Example:

   # ifconfig vlan10 10.0.0.1 netmask 255.255.255.0 vlan 10 vlandev ice0

In this example, all packets will be marked on egress with 802.1Q VLAN
tags, specifying a VLAN ID of 10.

To remove a VLAN interface:

   # ifconfig <vlan_name> destroy

Checksum Offload
----------------

Checksum offloading supports both TCP and UDP packets and is supported for
both transmit and receive.

Checksum offloading can be enabled or disabled using ifconfig.

To enable checksum offloading:

   # ifconfig iceX rxcsum rxcsum6
   # ifconfig iceX txcsum txcsum6

To disable checksum offloading:

   # ifconfig iceX -rxcsum -rxcsum6
   # ifconfig iceX -txcsum -txcsum6

To confirm the current setting:

   # ifconfig iceX

Look for the presence or absence of the following line:

   options=3<RXCSUM,TXCSUM>

See the ifconfig man page for further information.

TSO
---

TSO (TCP Segmentation Offload) supports both IPv4 and IPv6. TSO can be
disabled and enabled using the ifconfig utility or sysctl.

NOTE: TSO requires Tx checksum offload; if Tx checksum is disabled, TSO
will also be disabled.

To enable/disable TSO in the stack:

   # sysctl net.inet.tcp.tso=0 (or 1 to enable it)

Doing this disables/enables TSO in the stack and affects all installed
adapters.

To disable BOTH TSO IPv4 and IPv6, where X is the number of the interface
in use:

   # ifconfig iceX -tso

To enable BOTH TSO IPv4 and IPv6:

   # ifconfig iceX tso

You can also enable/disable IPv4 TSO or IPv6 TSO individually. Simply
replace tso|-tso in the above command with tso4 or tso6. For example, to
disable TSO IPv4:

   # ifconfig iceX -tso4

To disable TSO IPv6:

   # ifconfig iceX -tso6

LRO
---

LRO (Large Receive Offload) may provide an Rx performance improvement.
However, it is incompatible with packet-forwarding workloads. You should
carefully evaluate the environment and enable LRO when possible.

To enable LRO:

   # ifconfig iceX lro

To disable LRO:

   # ifconfig iceX -lro

Rx and Tx Descriptor Rings
--------------------------

Allows you to set the Rx and Tx descriptor ring sizes independently. The
tunables are:

   hw.ice.rx_ring_size
   hw.ice.tx_ring_size

The valid range is 32-4096 in increments of 32.

Use kenv to configure the descriptor rings. Changes will take effect on
the next driver reload. For example:

   # kenv hw.ice.rx_ring_size=1024
   # kenv hw.ice.tx_ring_size=1280

You can verify the descriptor ring sizes by using the following sysctls:

   # sysctl dev.ice.<interface_num>.rx_ring_size
   # sysctl dev.ice.<interface_num>.tx_ring_size

If you are using iflib, use the following sysctls instead:

   # sysctl dev.ice.<interface_num>.iflib.override_nrxds
   # sysctl dev.ice.<interface_num>.iflib.override_ntxds
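Because kenv settings do not survive a reboot, the same tunables can also
be set in /boot/loader.conf so they apply on every boot. The sizes below
are examples only:

   hw.ice.rx_ring_size="1024"
   hw.ice.tx_ring_size="1024"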
Flow Control
------------

Ethernet Flow Control (IEEE 802.3x) can be configured with sysctl to enable
receiving and transmitting pause frames for ice. When transmit is enabled,
pause frames are generated when the receive packet buffer crosses a
predefined threshold. When receive is enabled, the transmit unit will halt
for the time delay specified when a pause frame is received.

NOTE: You must have a flow control capable link partner.

Flow Control is disabled by default.

Use sysctl to change the flow control settings for a single interface
without reloading the driver. The available values for flow control are:

   0 = Disable flow control
   1 = Enable Rx pause
   2 = Enable Tx pause
   3 = Enable Rx and Tx pause

Examples:

- To enable a flow control setting with sysctl:

   # sysctl dev.ice.<interface_num>.fc=3

- To disable flow control using sysctl:

   # sysctl dev.ice.<interface_num>.fc=0

NOTE: The ice driver requires flow control on both the port and link
partner. If flow control is disabled on one of the sides, the port may
appear to hang on heavy traffic.

NOTE: The VF driver does not have access to flow control. It must be
managed from the host side.

Forward Error Correction (FEC)
------------------------------

Allows you to set the Forward Error Correction (FEC) mode. FEC improves
link stability but increases latency. Many high-quality optics, direct
attach cables, and backplane channels provide a stable link without FEC.

NOTE: For devices to benefit from this feature, link partners must have
FEC enabled.

Use sysctl to configure FEC.

To show the current FEC settings that are negotiated on the link:

   # sysctl dev.ice.<interface_num>.negotiated_fec

To view or set the FEC setting that was requested on the link:

   # sysctl dev.ice.<interface_num>.requested_fec

To see the valid modes for the link:

   # sysctl -d dev.ice.<interface_num>.requested_fec
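For example, assuming "auto" is among the valid modes reported by the -d
query above (check your driver version's output), automatic FEC selection
could be requested on the first port with:

   # sysctl dev.ice.0.requested_fec=auto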
Firmware Logs
-------------

The ice driver allows you to generate firmware logs for supported
categories of events, to help debug issues with Intel Customer Support.
Firmware logs are enabled by default. Once the driver is loaded, it will
create the fw_log sysctl node under the debug section of the driver's
sysctl list.

The driver groups these events into categories, called "modules."
Supported modules include:

* general       - General (Bit 0)
* ctrl          - Control (Bit 1)
* link          - Link Management (Bit 2)
* link_topo     - Link Topology Detection (Bit 3)
* dnl           - Link Control Technology (Bit 4)
* i2c           - I2C (Bit 5)
* sdp           - SDP (Bit 6)
* mdio          - MDIO (Bit 7)
* adminq        - Admin Queue (Bit 8)
* hdma          - Host DMA (Bit 9)
* lldp          - LLDP (Bit 10)
* dcbx          - DCBx (Bit 11)
* dcb           - DCB (Bit 12)
* xlr           - XLR (function-level resets; Bit 13)
* nvm           - NVM (Bit 14)
* auth          - Authentication (Bit 15)
* vpd           - Vital Product Data (Bit 16)
* iosf          - Intel On-Chip System Fabric (Bit 17)
* parser        - Parser (Bit 18)
* sw            - Switch (Bit 19)
* scheduler     - Scheduler (Bit 20)
* txq           - TX Queue Management (Bit 21)
* acl           - ACL (Access Control List; Bit 22)
* post          - Post (Bit 23)
* watchdog      - Watchdog (Bit 24)
* task_dispatch - Task Dispatcher (Bit 25)
* mng           - Manageability (Bit 26)
* synce         - SyncE (Bit 27)
* health        - Health (Bit 28)
* tsdrv         - Time Sync (Bit 29)
* pfreg         - PF Registration (Bit 30)
* mdlver        - Module Version (Bit 31)

You can change the verbosity level of the firmware logs. You can set only
one log level per module, and each level includes the verbosity levels
lower than it. For instance, setting the level to "normal" will also log
warning and error messages. Available verbosity levels are:

   0 = none
   1 = error
   2 = warning
   3 = normal
   4 = verbose

To set the desired verbosity level for a module, use the following sysctl
command and then register it:

   # sysctl dev.ice.<interface_num>.debug.fw_log.severity.<module>=<level>

For example:

   # sysctl dev.ice.0.debug.fw_log.severity.link=1
   # sysctl dev.ice.0.debug.fw_log.severity.link_topo=2
   # sysctl dev.ice.0.debug.fw_log.register=1

To log firmware messages before the driver initializes, use the kenv
command to set the tunable. The on_load setting tells the device to
register the variable as soon as possible during driver load. For example:

   # kenv dev.ice.0.debug.fw_log.severity.link=1
   # kenv dev.ice.0.debug.fw_log.severity.link_topo=2
   # kenv dev.ice.0.debug.fw_log.on_load=1

To view the firmware logs and redirect them to a file, use the following
command:

   # dmesg > log_output

NOTE: Logging a large number of modules or too high a verbosity level will
add extraneous messages to dmesg and could hinder debug efforts.

Speed and Duplex Configuration
------------------------------

You cannot set speed, duplex, or autonegotiation settings.

To have your device advertise supported speeds, use the following:

   # sysctl dev.ice.<interface_num>.advertise_speed=<speed_mask>

Supported speeds will vary by device. Depending on the speeds your device
supports, available speed masks could include:

   0x0   - Auto
   0x2   - 100 Mbps
   0x4   - 1 Gbps
   0x8   - 2.5 Gbps
   0x10  - 5 Gbps
   0x20  - 10 Gbps
   0x80  - 25 Gbps
   0x100 - 40 Gbps
   0x200 - 50 Gbps
   0x400 - 100 Gbps
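The speed masks are bit flags, so, assuming the driver accepts a combined
mask (an assumption, not stated above), a device supporting both 10 Gbps
(0x20) and 25 Gbps (0x80) could advertise both with 0x20 + 0x80 = 0xA0:

   # sysctl dev.ice.0.advertise_speed=0xA0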
Known Issues/Troubleshooting
============================

Driver Buffer Overflow Fix
--------------------------

The fix to resolve CVE-2016-8105, referenced in Intel SA-00069, is included
in this and future versions of the driver.

Network Memory Buffer Allocation
--------------------------------

FreeBSD may have a low number of network memory buffers (mbufs) by default.
If your mbuf value is too low, it may cause the driver to fail to
initialize and/or cause the system to become unresponsive. You can check to
see if the system is mbuf-starved by running 'netstat -m'. Increase the
number of mbufs by editing the lines below in /etc/sysctl.conf:

   kern.ipc.nmbclusters
   kern.ipc.nmbjumbop
   kern.ipc.nmbjumbo9
   kern.ipc.nmbjumbo16
   kern.ipc.nmbufs

The amount of memory that you allocate is system specific, and may require
some trial and error.

Also, increasing the following in /etc/sysctl.conf could help increase
network performance:

   kern.ipc.maxsockbuf
   net.inet.tcp.sendspace
   net.inet.tcp.recvspace
   net.inet.udp.maxdgram
   net.inet.udp.recvspace

UDP Stress Test Dropped Packet Issue
------------------------------------

Under small packet UDP stress with the ice driver, the system may drop UDP
packets due to socket buffers being full. Setting the driver's flow
control variables to the minimum may resolve the issue. You may also try
increasing the kernel's default socket buffer sizes by raising the
kern.ipc.maxsockbuf and net.inet.udp.recvspace sysctls listed above.

Disable LRO when routing/bridging
---------------------------------

LRO must be turned off when forwarding traffic.

Lower than expected performance
-------------------------------

Some PCIe x8 slots are actually configured as x4 slots. These slots have
insufficient bandwidth for full line rate with dual port and quad port
devices. In addition, if you put a PCIe v4.0 or v3.0-capable adapter into a
PCIe v2.x slot, you cannot get full bandwidth. The driver detects this
situation and writes one of the following messages in the system log:

   "PCI-Express bandwidth available for this card is not sufficient for
   optimal performance. For optimal performance a x8 PCI-Express slot is
   required."

or

   "PCI-Express bandwidth available for this device may be insufficient
   for optimal performance. Please move the device to a different PCI-e
   link with more lanes and/or higher transfer rate."

If this error occurs, moving your adapter to a true PCIe v3.0 x8 slot will
resolve the issue. For best performance, install the device in a PCIe v4.0
x8 or v3.0 x16 slot.

Throughput lower than expected
------------------------------

In FreeBSD 11.3, you may observe lower than expected throughput. This is
due to an underlying OS limitation in FreeBSD 11.3. Using FreeBSD 12.0 or
newer should resolve the issue.

If your Rx throughput is lower than expected in FreeBSD 11.3 or 12.1, you
can also adjust the iflib sysctl variable 'rx_budget'. We have seen
performance benefits by increasing that value to at least 85. For example:

   # sysctl dev.ice.0.iflib.rx_budget=85

Fiber optics and auto-negotiation
---------------------------------

Modules based on 100GBASE-SR4, active optical cable (AOC), and active
copper cable (ACC) do not support auto-negotiation per the IEEE
specification. To obtain link with these modules, you must turn off
auto-negotiation on the link partner's switch ports.

Support
=======

For general information, go to the Intel support website at:
http://www.intel.com/support/

If an issue is identified with the released source code on a supported
kernel with a supported adapter, email the specific information related to
the issue to freebsd@intel.com.

Copyright(c) 2019 - 2021 Intel Corporation.

Trademarks
==========

Intel is a trademark or registered trademark of Intel Corporation or its
subsidiaries in the United States and/or other countries.

* Other names and brands may be claimed as the property of others.