summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/igb/igb_main.c
AgeCommit message (Collapse)Author
2018-09-23igb: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. igb uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-24igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init()Jia-Ju Bai
igb_sw_init() is never called in atomic context. It calls kzalloc() and kcalloc() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24igb: Use an advanced ctx descriptor for launchtimeJesus Sanchez-Palencia
On i210, Launchtime (TxTime) requires the usage of an "Advanced Transmit Context Descriptor" for retrieving the timestamp of a packet. The igb driver correctly builds such descriptor on the segmentation flow (i.e. igb_tso()) or on the checksum one (i.e. igb_tx_csum()), but the feature is broken for AF_PACKET if the IGB_TX_FLAGS_VLAN is not set, which happens due to an early return on igb_tx_csum(). This flag is only set by the kernel when a VLAN interface is used, thus we can't just rely on it. Here we are fixing this issue by checking if launchtime is enabled for the current tx_ring before performing the early return. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-16Merge tag 'pci-v4.19-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: - Decode AER errors with names similar to "lspci" (Tyler Baicar) - Expose AER statistics in sysfs (Rajat Jain) - Clear AER status bits selectively based on the type of recovery (Oza Pawandeep) - Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru Gagniuc) - Don't clear AER status bits if we're using the "Firmware-First" strategy where firmware owns the registers (Alexandru Gagniuc) - Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy Shevchenko) - Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas) - Defer DPC event handling to work queue (Keith Busch) - Use threaded IRQ for DPC bottom half (Keith Busch) - Print AER status while handling DPC events (Keith Busch) - Work around IDT switch ACS Source Validation erratum (James Puthukattukaran) - Emit diagnostics for all cases of PCIe Link downtraining (Links operating slower than they're capable of) (Alexandru Gagniuc) - Skip VFs when configuring Max Payload Size (Myron Stowe) - Reduce Root Port Max Payload Size if necessary when hot-adding a device below it (Myron Stowe) - Simplify SHPC existence/permission checks (Bjorn Helgaas) - Remove hotplug sample skeleton driver (Lukas Wunner) - Convert pciehp to threaded IRQ handling (Lukas Wunner) - Improve pciehp tolerance of missed events and initially unstable links (Lukas Wunner) - Clear spurious pciehp events on resume (Lukas Wunner) - Add pciehp runtime PM support, including for Thunderbolt controllers (Lukas Wunner) - Support interrupts from pciehp bridges in D3hot (Lukas Wunner) - Mark fall-through switch cases before enabling -Wimplicit-fallthrough (Gustavo A. R. Silva) - Move DMA-debug PCI init from arch code to PCI core (Christoph Hellwig) - Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is supplied (Heiner Kallweit) - Unify PCI and DMA direction #defines (Shunyong Yang) - Add PCI_DEVICE_DATA() macro (Andy Shevchenko) - Check for VPD completion before checking for timeout (Bert Kenward) - Limit Netronome NFP5000 config space size to work around erratum (Jakub Kicinski) - Set IRQCHIP_ONESHOT_SAFE for PCI MSI irqchips (Heiner Kallweit) - Document ACPI description of PCI host bridges (Bjorn Helgaas) - Add "pci=disable_acs_redir=" parameter to disable ACS redirection for peer-to-peer DMA support (we don't have the peer-to-peer support yet; this is just one piece) (Logan Gunthorpe) - Clean up devm_of_pci_get_host_bridge_resources() resource allocation (Jan Kiszka) - Fixup resizable BARs after suspend/resume (Christian König) - Make "pci=earlydump" generic (Sinan Kaya) - Fix ROM BAR access routines to stay in bounds and check for signature correctly (Rex Zhu) - Add DMA alias quirk for Microsemi Switchtec NTB (Doug Meyer) - Expand documentation for pci_add_dma_alias() (Logan Gunthorpe) - To avoid bus errors, enable PASID only if entire path supports End-End TLP prefixes (Sinan Kaya) - Unify slot and bus reset functions and remove hotplug knowledge from callers (Sinan Kaya) - Add Function-Level Reset quirks for Intel and Samsung NVMe devices to fix guest reboot issues (Alex Williamson) - Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD Controller (Bjorn Helgaas) - Remove Xilinx AXI-PCIe host bridge arch dependency (Palmer Dabbelt) - Remove Aardvark outbound window configuration (Evan Wang) - Fix Aardvark bridge window sizing issue (Zachary Zhang) - Convert Aardvark to use pci_host_probe() to reduce code duplication (Thomas Petazzoni) - Correct the Cadence cdns_pcie_writel() signature (Alan Douglas) - Add Cadence support for optional generic PHYs (Alan Douglas) - Add Cadence power management ops (Alan Douglas) - Remove redundant variable from Cadence driver (Colin Ian King) - Add Kirin MSI support (Xiaowei Song) - Drop unnecessary root_bus_nr setting from exynos, imx6, keystone, armada8k, artpec6, designware-plat, histb, qcom, spear13xx (Shawn Guo) - Move link notification settings from DesignWare core to individual drivers (Gustavo Pimentel) - Add endpoint library MSI-X interfaces (Gustavo Pimentel) - Correct signature of endpoint library IRQ interfaces (Gustavo Pimentel) - Add DesignWare endpoint library MSI-X callbacks (Gustavo Pimentel) - Add endpoint library MSI-X test support (Gustavo Pimentel) - Remove unnecessary GFP_ATOMIC from Hyper-V "new child" allocation (Jia-Ju Bai) - Add more devices to Broadcom PAXC quirk (Ray Jui) - Work around corrupted Broadcom PAXC config space to enable SMMU and GICv3 ITS (Ray Jui) - Disable MSI parsing to work around broken Broadcom PAXC logic in some devices (Ray Jui) - Hide unconfigured functions to work around a Broadcom PAXC defect (Ray Jui) - Lower iproc log level to reduce console output during boot (Ray Jui) - Fix mobiveil iomem/phys_addr_t type usage (Lorenzo Pieralisi) - Fix mobiveil missing include file (Lorenzo Pieralisi) - Add mobiveil Kconfig/Makefile support (Lorenzo Pieralisi) - Fix mvebu I/O space remapping issues (Thomas Petazzoni) - Use generic pci_host_bridge in mvebu instead of ARM-specific API (Thomas Petazzoni) - Whitelist VMD devices with fast interrupt handlers to avoid sharing vectors with slow handlers (Keith Busch) * tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (153 commits) PCI/AER: Don't clear AER bits if error handling is Firmware-First PCI: Limit config space size for Netronome NFP5000 PCI/MSI: Set IRQCHIP_ONESHOT_SAFE for PCI-MSI irqchips PCI/VPD: Check for VPD access completion before checking for timeout PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry PCI: Match Root Port's MPS to endpoint's MPSS as necessary PCI: Skip MPS logic for Virtual Functions (VFs) PCI: Add function 1 DMA alias quirk for Marvell 88SS9183 PCI: Check for PCIe Link downtraining PCI: Add ACS Redirect disable quirk for Intel Sunrise Point PCI: Add device-specific ACS Redirect disable infrastructure PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support PCI: Allow specifying devices using a base bus and path of devfns PCI: Make specifying PCI devices in kernel parameters reusable PCI: Hide ACS quirk declarations inside PCI core PCI: Delay after FLR of Intel DC P3700 NVMe PCI: Disable Samsung SM961/PM961 NVMe before FLR PCI: Export pcie_has_flr() PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers() ...
2018-08-07igb_main: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 200521 ("Missing break in switch") Addresses-Coverity-ID: 114797 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-06igb: Remove unnecessary include of <linux/pci-aspm.h>Bjorn Helgaas
The igb driver doesn't need anything provided by pci-aspm.h, so remove the unnecessary include of it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26igb: Use dma_wmb() instead of wmb() before doorbell writesVenkatesh Srinivas
igb writes to doorbells to post transmit and receive descriptors; after writing descriptors to memory but before writing to doorbells, use dma_wmb() rather than wmb(). wmb() is more heavyweight than necessary before doorbell writes. On x86, this avoids SFENCEs before doorbell writes in both the tx and rx refill paths. Signed-off-by: Venkatesh Srinivas <venkateshs@google.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-04igb: Add support for ETF offloadJesus Sanchez-Palencia
Implement HW offload support for SO_TXTIME through igb's Launchtime feature. This is done by extending igb_setup_tc() so it supports TC_SETUP_QDISC_ETF and configuring i210 so time based transmit arbitration is enabled. The FQTSS transmission mode added before is extended so strict priority (SP) queues wait for stream reservation (SR) ones. igb_config_tx_modes() is extended so it can support enabling/disabling Launchtime following the previous approach used for the credit-based shaper (CBS). As the previous flow, FQTSS transmission mode is enabled automatically by the driver once Launchtime (or CBS, as before) is enabled. Similarly, it's automatically disabled when the feature is disabled for the last queue that had it setup on. The driver just consumes the transmit times from the skbuffs directly, so no special handling is done in case an 'invalid' time is provided. We assume this has been handled by the ETF qdisc already. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04igb: Only call skb_tx_timestamp after descriptors are readyJesus Sanchez-Palencia
Currently, skb_tx_timestamp() is being called before the Tx descriptors are prepared in igb_xmit_frame_ring(), which happens during either the igb_tso() or igb_tx_csum() calls. Given that now the skb->tstamp might be used to carry the timestamp for SO_TXTIME, we must only call skb_tx_timestamp() after the information has been copied into the Tx descriptors. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04igb: Refactor igb_offload_cbs()Jesus Sanchez-Palencia
Split code into a separate function (igb_offload_apply()) that will be used by ETF offload implementation. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04igb: Only change Tx arbitration when CBS is onJesus Sanchez-Palencia
Currently the data transmission arbitration algorithm - DataTranARB field on TQAVCTRL reg - is always set to CBS when the Tx mode is changed from legacy to 'Qav' mode. Make that configuration a bit more granular in preparation for the upcoming Launchtime enabling patches, since CBS and Launchtime can be enabled separately. That is achieved by moving the DataTranARB setup to igb_config_tx_modes() instead. Similarly, when disabling CBS we must check if it has been disabled for all queues, and clear the DataTranARB accordingly. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04igb: Refactor igb_configure_cbs()Jesus Sanchez-Palencia
Make this function retrieve what it needs from the Tx ring being addressed since it already relies on what had been saved on it before. Also, since this function will be used by the upcoming Launchtime patches rename it to better reflect its intention. Note that Launchtime is not part of what 802.1Qav specifies, but the i210 datasheet refers to this set of functionality as "Qav Transmission Mode". Here we also perform a tiny refactor at is_any_cbs_enabled(), and add further documentation to igb_setup_tx_mode(). Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: pass extack pointer to block binds and cb registrationJohn Hurley
Pass the extact struct from a tc qdisc add to the block bind function and, in turn, to the setup_tc ndo of binding device via the tc_block_offload struct. Pass this back to any block callback registrations to allow netlink logging of fails in the bind process. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-25cls_flower: fix error values for commands not supported by driversJiri Pirko
-EOPNOTSUPP is the error value that should be reported if a flower command is not supported by a driver. Fix it in couple of Intel drivers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12treewide: kzalloc() -> kcalloc()Kees Cook
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-04igb: Wait 10ms just once after TX queues resetSergey Nemov
Move 10ms sleep out of function resetting TX queue. Reset all the TX queues in one turn and wait for all of them just once. Use usleep_range() instead of mdelay() in order not to affect transmission on other interfaces. Signed-off-by: Sergey Nemov <sergey.nemov@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04igb: Clear TSICR interrupts together with ICRJoanna Yurdal
Issuing "ip link set up/down" can block TSICR interrupts, what results in missing PTP Tx timestamp and no PPS pulse generation. Problem happens when the link is set up with the TSICR interrupts pending. ICR is cleared before enabling interrupts, while TSICR is not. When all TSICR interrupts are pending at this moment, time_sync interrupt will never be generated. TSICR should be cleared as well. In order to reproduce the issue: 1. Setup linux with IEEE 1588 grandmaster and PPS output enabled 2. Continue setting link up/down with random intervals between commands 3. Wait until PPS is not generated ( only one pulse is generated and PPS dies), and ptp4l complains constantly about Tx timeout. Signed-off-by: Joanna Yurdal <jyu@trackman.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-27net: intel: Cleanup the copyright/license headersJeff Kirsher
After many years of having a ~30 line copyright and license header to our source files, we are finally able to reduce that to one line with the advent of the SPDX identifier. Also caught a few files missing the SPDX license identifier, so fixed them up. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Shannon Nelson <shannon.nelson@oracle.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-25igb: Add support for adding offloaded clsflower filtersVinicius Costa Gomes
This allows filters added by tc-flower and specifying MAC addresses, Ethernet types, and the VLAN priority field, to be offloaded to the controller. This reuses most of the infrastructure used by ethtool, but clsflower filters are kept in a separated list, so they are invisible to ethtool. To setup clsflower offloading: $ tc qdisc replace dev eth0 handle 100: parent root mqprio \ num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 2@2 hw 0 (clsflower offloading depends on the netword driver to be configured with multiple traffic classes, we use mqprio's 'num_tc' parameter to set it to 3) $ tc qdisc add dev eth0 ingress Examples of filters: $ tc filter add dev eth0 parent ffff: flower \ dst_mac aa:aa:aa:aa:aa:aa \ hw_tc 2 skip_sw (just a simple filter filtering for the destination MAC address and steering that traffic to queue 2) $ tc filter add dev enp2s0 parent ffff: proto 0x22f0 flower \ src_mac cc:cc:cc:cc:cc:cc \ hw_tc 1 skip_sw (as the i210 doesn't support steering traffic based on the source address alone, we need to use another steering traffic, in this case we are using the ethernet type (0x22f0) to steer traffic to queue 1) Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25igb: Add the skeletons for tc-flower offloadingVinicius Costa Gomes
This adds basic functions needed to implement offloading for filters created by tc-flower. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25igb: Allow filters to be added for the local MAC addressVinicius Costa Gomes
Users expect that when adding a steering filter for the local MAC address, that all the traffic directed to that address will go to some queue. Currently, it's not possible to configure entries in the "in use" state, which is the normal state of the local MAC address entry (it is the default), this patch allows to override the steering configuration of "in use" entries, if the filter to be added match the address and address type (source or destination) of an existing entry. There is a bit of a special handling for entries referring to the local MAC address, when they are removed, only the steering configuration is reset. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25igb: Add support for enabling queue steering in filtersVinicius Costa Gomes
On some igb models (82575 and i210) the MAC address filters can control to which queue the packet will be assigned. This extends the 'state' with one more state to signify that queue selection should be enabled for that filter. As 82575 parts are no longer easily obtained (and this was developed against i210), only support for the i210 model is enabled. These functions are exported and will be used in the next patch. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25igb: Add support for MAC address filters specifying source addressesVinicius Costa Gomes
Makes it possible to direct packets to queues based on their source address. Documents the expected usage of the 'flags' parameter. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25igb: Enable the hardware traffic class feature bit for igb modelsVinicius Costa Gomes
This will allow functionality depending on the hardware being traffic class aware to work. In particular the tc-flower offloading checks verifies that this bit is set. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25igb: Fix queue selection on MAC filters on i210Vinicius Costa Gomes
On the RAH registers there are semantic differences on the meaning of the "queue" parameter for traffic steering depending on the controller model: there is the 82575 meaning, which "queue" means a RX Hardware Queue, and the i350 meaning, where it is a reception pool. The previous behaviour was having no effect for i210 based controllers because the QSEL bit of the RAH register wasn't being set. This patch separates the condition in discrete cases, so the different handling is clearer. Fixes: 83c21335c876 ("igb: improve MAC filter handling") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24igb: Fix the transmission mode of queue 0 for Qav modeVinicius Costa Gomes
When Qav mode is enabled, queue 0 should be kept on Stream Reservation mode. From the i210 datasheet, section 8.12.19: "Note: Queue0 QueueMode must be set to 1b when TransmitMode is set to Qav." ("QueueMode 1b" represents the Stream Reservation mode) The solution is to give queue 0 the all the credits it might need, so it has priority over queue 1. A situation where this can happen is when cbs is "installed" only on queue 1, leaving queue 0 alone. For example: $ tc qdisc replace dev enp2s0 handle 100: parent root mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc replace dev enp2s0 parent 100:2 cbs locredit -1470 \ hicredit 30 sendslope -980000 idleslope 20000 offload 1 Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23intel: add SPDX identifiers to all the Intel driversJeff Kirsher
Add the SPDX identifiers to all the Intel wired LAN driver files, as outlined in Documentation/process/license-rules.rst. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05igb: Fix a test with HWTSTAMP_TX_ONChristophe JAILLET
'HWTSTAMP_TX_ON' should be handled as a value, not as a bit mask. The modified code should behave the same, because HWTSTAMP_TX_ON is 1 and no other possible values of 'tx_type' would match the test. However, this is more future-proof, should other values be allowed one day. See 'struct hwtstamp_config' in 'include/uapi/linux/net_tstamp.h' This fixes a warning reported by smatch: igb_xmit_frame_ring() warn: bit shifter 'HWTSTAMP_TX_ON' used for logical '&' Fixes: 26bd4e2db06be ("igb: protect TX timestamping from API misuse") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-05igb: Do not call netif_device_detach() when PCIe link goes missingMika Westerberg
When the driver notices that PCIe link is gone by reading 0xffffffff from a register it clears hw->hw_addr and then calls netif_device_detach(). This happens when the PCIe device is physically unplugged for example the user disconnected the Thunderbolt cable. However, netif_device_detach() prevents netif_unregister() from bringing the device down properly including tearing down MSI-X vectors. This triggers following crash during the driver removal: igb 0000:0b:00.0 enp11s0f0: PCIe link lost, device now detached ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:352! invalid opcode: 0000 [#1] PREEMPT SMP PTI ... Call Trace: pci_disable_msix+0xc9/0xf0 igb_reset_interrupt_capability+0x58/0x60 [igb] igb_remove+0x90/0x100 [igb] pci_device_remove+0x31/0xa0 device_release_driver_internal+0x152/0x210 pci_stop_bus_device+0x78/0xa0 pci_stop_bus_device+0x38/0xa0 pci_stop_bus_device+0x38/0xa0 pci_stop_bus_device+0x26/0xa0 pci_stop_bus_device+0x38/0xa0 pci_stop_and_remove_bus_device+0x9/0x20 trim_stale_devices+0xee/0x130 ? _raw_spin_unlock_irqrestore+0xf/0x30 trim_stale_devices+0x8f/0x130 ? _raw_spin_unlock_irqrestore+0xf/0x30 trim_stale_devices+0xa1/0x130 ? get_slot_status+0x8b/0xc0 acpiphp_check_bridge.part.7+0xf9/0x140 acpiphp_hotplug_notify+0x170/0x1f0 ... To prevent the crash do not call netif_device_detach() in igb_rd32(). This should be fine because hw->hw_addr is set to NULL preventing future hardware access of the now missing device. Link: https://bugzilla.kernel.org/show_bug.cgi?id=198181 Reported-by: Ferenc Boldog <ferenc.boldog@gmail.com> Reported-by: Nikolay Bogoychev <nheart@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-05igb: add VF trust infrastructureCorinna Vinschen
* Add a per-VF value to know if a VF is trusted, by default don't trust VFs. * Implement netdev op to trust VFs (igb_ndo_set_vf_trust) and add trust status to ndo_get_vf_config output. * Allow a trusted VF to change MAC and MAC filters even if MAC has been administratively set. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24igb: Delete an error message for a failed memory allocation in ↵Markus Elfring
igb_enable_sriov() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24igb: Free IRQs when device is hotpluggedLyude Paul
Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon hotplugging my kernel would immediately crash due to igb: [ 680.825801] kernel BUG at drivers/pci/msi.c:352! [ 680.828388] invalid opcode: 0000 [#1] SMP [ 680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb] [ 680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G O 4.15.0-rc3Lyude-Test+ #6 [ 680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017 [ 680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0 [ 680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286 [ 680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c [ 680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178 [ 680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00 [ 680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298 [ 680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000 [ 680.836332] FS: 0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000 [ 680.836817] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0 [ 680.837954] Call Trace: [ 680.838853] pci_disable_msix+0xce/0xf0 [ 680.839616] igb_reset_interrupt_capability+0x5d/0x60 [igb] [ 680.840278] igb_remove+0x9d/0x110 [igb] [ 680.840764] pci_device_remove+0x36/0xb0 [ 680.841279] device_release_driver_internal+0x157/0x220 [ 680.841739] pci_stop_bus_device+0x7d/0xa0 [ 680.842255] pci_stop_bus_device+0x2b/0xa0 [ 680.842722] pci_stop_bus_device+0x3d/0xa0 [ 680.843189] pci_stop_and_remove_bus_device+0xe/0x20 [ 680.843627] trim_stale_devices+0xf3/0x140 [ 680.844086] trim_stale_devices+0x94/0x140 [ 680.844532] trim_stale_devices+0xa6/0x140 [ 680.845031] ? get_slot_status+0x90/0xc0 [ 680.845536] acpiphp_check_bridge.part.5+0xfe/0x140 [ 680.846021] acpiphp_hotplug_notify+0x175/0x200 [ 680.846581] ? free_bridge+0x100/0x100 [ 680.847113] acpi_device_hotplug+0x8a/0x490 [ 680.847535] acpi_hotplug_work_fn+0x1a/0x30 [ 680.848076] process_one_work+0x182/0x3a0 [ 680.848543] worker_thread+0x2e/0x380 [ 680.848963] ? process_one_work+0x3a0/0x3a0 [ 680.849373] kthread+0x111/0x130 [ 680.849776] ? kthread_create_worker_on_cpu+0x50/0x50 [ 680.850188] ret_from_fork+0x1f/0x30 [ 680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b [ 680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0 As it turns out, normally the freeing of IRQs that would fix this is called inside of the scope of __igb_close(). However, since the device is already gone by the point we try to unregister the netdevice from the driver due to a hotplug we end up seeing that the netif isn't present and thus, forget to free any of the device IRQs. So: make sure that if we're in the process of dismantling the netdev, we always allow __igb_close() to be called so that IRQs may be freed normally. Additionally, only allow igb_close() to be called from __igb_close() if it hasn't already been called for the given adapter. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach") Cc: Todd Fujinaka <todd.fujinaka@intel.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: stable@vger.kernel.org Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24igb: Clarify idleslope config constraintsJesus Sanchez-Palencia
By design, the idleslope increments are restricted to 16.384kbps steps. Add a comment to igb_main.c making that explicit and add one example that illustrates the impact of that. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24igb: add function to get maximum RSS queuesZhang Shengju
This patch adds a new function igb_get_max_rss_queues() to get maximum RSS queues, this will reduce duplicate code and facilitate future maintenance. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24igb: Allow to remove administratively set MAC on VFsCorinna Vinschen
Before libvirt modifies the MAC address and vlan tag for an SRIOV VF for use by a virtual machine (either using vfio device assignment or macvtap passthru mode), it saves the current MAC address and vlan tag so that it can reset them to their original value when the guest is done. Libvirt can't leave the VF MAC set to the value used by the now-defunct guest since it may be started again later using a different VF, but it certainly shouldn't just pick any random value, either. So it saves the state of everything prior to using the VF, and resets it to that. The igb driver initializes the MAC addresses of all VFs to 00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK netlink message, also visible in the list of VFs in the output of "ip link show"). But when libvirt attempts to restore the MAC address back to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel responds with "Invalid argument". Forbidding a reset back to the original value leaves the VF MAC at the value set for the now-defunct virtual machine. Especially on a system with NetworkManager enabled, this has very bad consequences, since NetworkManager forces all interfacess to be IFF_UP all the time - if the same virtual machine is restarted using a different VF (or even on a different host), there will be multiple interfaces watching for traffic with the same MAC address. To allow libvirt to revert to the original state, we need a way to remove the administrative set MAC on a VF, to allow normal host operation again, and to reset/overwrite the VF MAC via VF netdev. This patch implements the outlined scenario by allowing to set the VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF. igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0, so it's possible to reset the VF MAC back to the original value via the VF netdev. Note: Recent patches to libvirt allow for a workaround if the NIC isn't capable of resetting the administrative MAC back to all 0, but in theory the NIC should allow resetting the MAC in the first place. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Tested-by: Aaron Brown <arron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-11-21igb: Use smp_rmb rather than read_barrier_dependsBrian King
The original issue being fixed in this patch was seen with the ixgbe driver, but the same issue exists with igb as well, as the code is very similar. read_barrier_depends is not sufficient to ensure loads following it are not speculatively loaded out of order by the CPU, which can result in stale data being loaded, causing potential system crashes. Cc: stable <stable@vger.kernel.org> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-11-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Maintain the TCP retransmit queue using an rbtree, with 1GB windows at 100Gb this really has become necessary. From Eric Dumazet. 2) Multi-program support for cgroup+bpf, from Alexei Starovoitov. 3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew Lunn. 4) Add meter action support to openvswitch, from Andy Zhou. 5) Add a data meta pointer for BPF accessible packets, from Daniel Borkmann. 6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet. 7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli. 8) More work to move the RTNL mutex down, from Florian Westphal. 9) Add 'bpftool' utility, to help with bpf program introspection. From Jakub Kicinski. 10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper Dangaard Brouer. 11) Support 'blocks' of transformations in the packet scheduler which can span multiple network devices, from Jiri Pirko. 12) TC flower offload support in cxgb4, from Kumar Sanghvi. 13) Priority based stream scheduler for SCTP, from Marcelo Ricardo Leitner. 14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg. 15) Add RED qdisc offloadability, and use it in mlxsw driver. From Nogah Frankel. 16) eBPF based device controller for cgroup v2, from Roman Gushchin. 17) Add some fundamental tracepoints for TCP, from Song Liu. 18) Remove garbage collection from ipv6 route layer, this is a significant accomplishment. From Wei Wang. 19) Add multicast route offload support to mlxsw, from Yotam Gigi" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits) tcp: highest_sack fix geneve: fix fill_info when link down bpf: fix lockdep splat net: cdc_ncm: GetNtbFormat endian fix openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start netem: remove unnecessary 64 bit modulus netem: use 64 bit divide by rate tcp: Namespace-ify sysctl_tcp_default_congestion_control net: Protect iterations over net::fib_notifier_ops in fib_seq_sum() ipv6: set all.accept_dad to 0 by default uapi: fix linux/tls.h userspace compilation error usbnet: ipheth: prevent TX queue timeouts when device not ready vhost_net: conditionally enable tx polling uapi: fix linux/rxrpc.h userspace compilation errors net: stmmac: fix LPI transitioning for dwmac4 atm: horizon: Fix irq release error net-sysfs: trigger netlink notification on ifalias change via sysfs openvswitch: Using kfree_rcu() to simplify the code openvswitch: Make local function ovs_nsh_key_attr_size() static openvswitch: Fix return value check in ovs_meter_cmd_features() ...
2017-11-08net_sch: cbs: Change TC_SETUP_CBS to TC_SETUP_QDISC_CBSNogah Frankel
Change TC_SETUP_CBS to TC_SETUP_QDISC_CBS to match the new convention.. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27igb: Add support for CBS offloadAndre Guedes
This patch adds support for Credit-Based Shaper (CBS) qdisc offload from Traffic Control system. This support enable us to leverage the Forwarding and Queuing for Time-Sensitive Streams (FQTSS) features from Intel i210 Ethernet Controller. FQTSS is the former 802.1Qav standard which was merged into 802.1Q in 2014. It enables traffic prioritization and bandwidth reservation via the Credit-Based Shaper which is implemented in hardware by i210 controller. The patch introduces the igb_setup_tc() function which implements the support for CBS qdisc hardware offload in the IGB driver. CBS offload is the only traffic control offload supported by the driver at the moment. FQTSS transmission mode from i210 controller is automatically enabled by the IGB driver when the CBS is enabled for the first hardware queue. Likewise, FQTSS mode is automatically disabled when CBS is disabled for the last hardware queue. Changing FQTSS mode requires NIC reset. FQTSS feature is supported by i210 controller only. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Henrik Austad <henrik@austad.us> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26igb: Fix TX map failure pathJean-Philippe Brucker
When the driver cannot map a TX buffer, instead of rolling back gracefully and retrying later, we currently get a panic: [ 159.885994] igb 0000:00:00.0: TX DMA map failed [ 159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8 ... [ 159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8 Fix the erroneous test that leads to this situation. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-18ethernet/intel: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Switches test of .data field to .function, since .data will be going away. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10igb: check memory allocation failureChristophe JAILLET
Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: do not drop PF mailbox lock after read of VF messageGreg Edwards
When the PF receives a mailbox message from the VF, it grabs the mailbox lock, reads the VF message from the mailbox, ACKs the message and drops the lock. While the PF is performing the action for the VF message, nothing prevents another VF message from being posted to the mailbox. The current code handles this condition by just dropping any new VF messages without processing them. This results in a mailbox timeout in the VM for posted messages waiting for an ACK, and the VF is reset by the igbvf_watchdog_task in the VM. Given the right sequence of VF messages and mailbox timeouts, this condition can go on ad infinitum. Modify the PF mailbox read method to take an 'unlock' argument that optionally leaves the mailbox locked by the PF after reading the VF message. This ensures another VF message is not posted to the mailbox until after the PF has completed processing the VF message and written its reply. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: Remove incorrect "unexpected SYS WRAP" log messageCorinna Vinschen
TSAUXC.DisableSystime is never set, so SYSTIM runs into a SYS WRAP every 1100 secs on 80580/i350/i354 (40 bit SYSTIM) and every 35000 secs on 80576 (45 bit SYSTIM). This wrap event sets the TSICR.SysWrap bit unconditionally. However, checking TSIM at interrupt time shows that this event does not actually cause the interrupt. Rather, it's just bycatch while the actual interrupt is caused by, for instance, TSICR.TXTS. The conclusion is that the SYS WRAP is actually expected, so the "unexpected SYS WRAP" message is entirely bogus and just helps to confuse users. Drop it. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: protect TX timestamping from API misuseCliff Spradlin
HW timestamping can only be requested for a packet if the NIC is first setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the igb driver still allowed TX packets to request HW timestamping. In this situation, the _IGB_PTP_TX_IN_PROGRESS flag was set and would never clear. This prevented any future HW timestamping requests to succeed. Fix this by checking that the NIC is configured for HW TX timestamping before accepting a HW TX timestamping request. Signed-off-by: Cliff Spradlin <cspradlin@google.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: Fix error of RX network flow classificationGangfeng Huang
After add an ethertype filter, if user change the adapter speed several times, the error "ethtool -N: etype filters are all used" is reported by igb driver. In older patch, function igb_nfc_filter_exit() and igb_nfc_filter_restore() is not paried. igb_nfc_filter_restore() exist in igb_up(), but function igb_nfc_filter_exit() is exist in __igb_close(). In the process of speed changing, only igb_nfc_filter_restore() is called, it will take a position of ethertype bitmap. Reproduce steps: Step 1: Add a etype filter by ethtool $ethtool -N eth0 flow-type ether proto 0x88F8 action 1 Step 2: Change the adapter speed to 100M/full duplex $ethtool -s eth0 speed 100 duplex full Step 3: Change the adapter speed to 1000M/full duplex ethtool -s eth0 speed 1000 duplex full Repeat step2 and step3, then dmesg the system log, you can find the error message, add new ethtype filter is also failed. This fixing is move igb_nfc_filter_exit() from __igb_close() to igb_down() to make igb_nfc_filter_restore()/igb_nfc_filter_exit() is paired. Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-07igb: make a few local functions staticColin Ian King
Clean up a few sparse warnings, these following functions can be made static: drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_add_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_del_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_set_vf_mac_filter' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>