aboutsummaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-06-12flow_dissector: fix ipv6 dst, hop-by-hop and routing ext hdrsEric Dumazet1-2/+4
__skb_header_pointer() returns a pointer that must be checked. Fixes infinite loop reported by Alexei, and add __must_check to catch these errors earlier. Fixes: 6a74fcf426f5 ("flow_dissector: add support for dst, hop-by-hop and routing ext hdrs") Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-12flow_dissector: add support for dst, hop-by-hop and routing ext hdrsTom Herbert1-0/+17
If dst, hop-by-hop or routing extension headers are present determine length of the options and skip over them in flow dissection. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-12flow_dissector: Fix MPLS entropy label handling in flow dissectorTom Herbert1-2/+2
Need to shift after masking to get label value for comparison. Fixes: b3baa0fbd02a1a9d493d8 ("mpls: Add MPLS entropy label in flow_keys") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-12net: ipv4: un-inline ip_finish_output2Florian Westphal1-1/+1
text data bss dec hex filename old: 16527 44 0 16571 40bb net/ipv4/ip_output.o new: 14935 44 0 14979 3a83 net/ipv4/ip_output.o Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: remove obsolete check in tcp_set_skb_tso_segs()Eric Dumazet1-3/+0
We had various issues in the past when TCP stack was modifying gso_size/gso_segs while clones were in flight. Commit c52e2421f73 ("tcp: must unclone packets before mangling them") fixed these bugs and added a WARN_ON_ONCE(skb_cloned(skb)); in tcp_set_skb_tso_segs() These bugs are now fixed, and because TCP stack now only sets shinfo->gso_size|segs on the clone itself, the check can be removed. As a result of this change, compiler inlines tcp_set_skb_tso_segs() in tcp_init_tso_segs() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: fill shinfo->gso_size at last momentEric Dumazet2-12/+8
In commit cd7d8498c9a5 ("tcp: change tcp_skb_pcount() location") we stored gso_segs in a temporary cache hot location. This patch does the same for gso_size. This allows to save 2 cache line misses in tcp xmit path for the last packet that is considered but not sent because of various conditions (cwnd, tso defer, receiver window, TSQ...) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: tcp_set_skb_tso_segs() no longer need struct sock parameterEric Dumazet1-16/+14
tcp_set_skb_tso_segs() & tcp_init_tso_segs() no longer use the sock pointer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: fill shinfo->gso_type at last momentEric Dumazet2-9/+3
Our goal is to touch skb_shinfo(skb) only when absolutely needed, to avoid two cache line misses in TCP output path for last skb that is considered but not sent because of various conditions (cwnd, tso defer, receiver window, TSQ...) A packet is GSO only when skb_shinfo(skb)->gso_size is not zero. We can set skb_shinfo(skb)->gso_type to sk->sk_gso_type even for non GSO packets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: reserve tcp_skb_mss() to tcp stackEric Dumazet1-2/+2
tcp_gso_segment() and tcp_gro_receive() are not strictly part of TCP stack. They should not assume tcp_skb_mss(skb) is in fact skb_shinfo(skb)->gso_size. This will allow us to change tcp_skb_mss() in following patches. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11switchdev: fix BUG when port driver doesn't support set attr opScott Feldman1-1/+3
Fix a BUG_ON() where CONFIG_NET_SWITCHDEV is set but the driver for a bridged port does not support switchdev_port_attr_set op. Don't BUG_ON() if -EOPNOTSUPP is returned. Also change BUG_ON() to netdev_err since this is a normal error path and does not warrant the use of BUG_ON(), which is reserved for unrecoverable errs. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reported-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11net/ethtool: Add current supported tunable optionsHadar Hen Zion1-0/+12
Add strings array of the current supported tunable options. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: add CDG congestion controlKenneth Klette Jonassen3-0/+454
CAIA Delay-Gradient (CDG) is a TCP congestion control that modifies the TCP sender in order to [1]: o Use the delay gradient as a congestion signal. o Back off with an average probability that is independent of the RTT. o Coexist with flows that use loss-based congestion control, i.e., flows that are unresponsive to the delay signal. o Tolerate packet loss unrelated to congestion. (Disabled by default.) Its FreeBSD implementation was presented for the ICCRG in July 2012; slides are available at http://www.ietf.org/proceedings/84/iccrg.html Running the experiment scenarios in [1] suggests that our implementation achieves more goodput compared with FreeBSD 10.0 senders, although it also causes more queueing delay for a given backoff factor. The loss tolerance heuristic is disabled by default due to safety concerns for its use in the Internet [2, p. 45-46]. We use a variant of the Hybrid Slow start algorithm in tcp_cubic to reduce the probability of slow start overshoot. [1] D.A. Hayes and G. Armitage. "Revisiting TCP congestion control using delay gradients." In Networking 2011, pages 328-341. Springer, 2011. [2] K.K. Jonassen. "Implementing CAIA Delay-Gradient in Linux." MSc thesis. Department of Informatics, University of Oslo, 2015. Cc: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: David Hayes <davihay@ifi.uio.no> Cc: Andreas Petlund <apetlund@simula.no> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Nicolas Kuhn <nicolas.kuhn@telecom-bretagne.eu> Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11tcp: export tcp_enter_cwr()Kenneth Klette Jonassen1-0/+1
Upcoming tcp_cdg uses tcp_enter_cwr() to initiate PRR. Export this function so that CDG can be compiled as a module. Cc: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: David Hayes <davihay@ifi.uio.no> Cc: Andreas Petlund <apetlund@simula.no> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Nicolas Kuhn <nicolas.kuhn@telecom-bretagne.eu> Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10switchdev: fix handling for drivers not supporting IPv4 fib add/del opsScott Feldman1-2/+2
If CONFIG_NET_SWITCHDEV is enabled, but port driver does not implement support for IPv4 FIB add/del ops, don't fail route add/del offload operations. Route adds will not be marked as OFFLOAD. Routes will be installed in the kernel FIB, as usual. This was report/fixed by Florian when testing DSA driver with net-next on devices with L2 offload support but no L3 offload support. What he reported was an initial route installed from DHCP client would fail (route not installed to kernel FIB). This was triggering the setting of ipv4.fib_offload_disabled, which would disable route offloading after the first failure. So subsequent attempts to install the route would succeed. There is follow-on work/discussion to address the handling of route install failures, but for now, let's differentiate between no support and failed support. Reported-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10net: tcp: dctcp_update_alpha() fixes.Eric Dumazet1-10/+16
dctcp_alpha can be read by from dctcp_get_info() without synchro, so use WRITE_ONCE() to prevent compiler from using dctcp_alpha as a temporary variable. Also, playing with small dctcp_shift_g (like 1), can expose an overflow with 32bit values shifted 9 times before divide. Use an u64 field to avoid this problem, and perform the divide only if acked_bytes_ecn is not zero. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10Merge tag 'mac80211-next-for-davem-2015-06-10' of ↵David S. Miller27-457/+465
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== For this round we mostly have fixes: * mesh fixes from Alexis Green and Chun-Yeow Yeoh, * a documentation fix from Jakub Kicinski, * a missing channel release (from Michal Kazior), * a fix for a signal strength reporting bug (from Sara Sharon), * handle deauth while associating (myself), * don't report mangled TX SKB back to userspace for status (myself), * handle aggregation session timeouts properly in fast-xmit (myself) However, there are also a few cleanups and one big change that affects all drivers (and that required me to pull in your tree) to change the mac80211 HW flags to use an unsigned long bitmap so that we can extend them more easily - we're running out of flags even with a cleanup to remove the two unused ones. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10net/unix: support SCM_SECURITY for stream socketsStephen Smalley1-4/+16
SCM_SECURITY was originally only implemented for datagram sockets, not for stream sockets. However, SCM_CREDENTIALS is supported on Unix stream sockets. For consistency, implement Unix stream support for SCM_SECURITY as well. Also clean up the existing code and get rid of the superfluous UNIXSID macro. Motivated by https://bugzilla.redhat.com/show_bug.cgi?id=1224211, where systemd was using SCM_CREDENTIALS and assumed wrongly that SCM_SECURITY was also supported on Unix stream sockets. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10bridge: make br_fdb_delete also check if the port matchesNikolay Aleksandrov1-3/+5
Before this patch the user-specified bridge port was ignored when deleting an fdb entry and thus one could delete an entry that belonged to any port. Example (eth0 and eth1 are br0 ports): bridge fdb add 00:11:22:33:44:55 dev eth0 master bridge fdb del 00:11:22:33:44:55 dev eth1 master (succeeds) after the patch: bridge fdb add 00:11:22:33:44:55 dev eth0 master bridge fdb del 00:11:22:33:44:55 dev eth1 master RTNETLINK answers: No such file or directory Based on a patch by Wilson Kok. Reported-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10Merge tag 'linux-can-next-for-4.2-20150609' of ↵David S. Miller1-12/+56
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2015-05-06 this is a pull request of a two patches for net-next. The first patch is by Tomas Krcka, he fixes the (currently unused) register address for acceptance filters. Oliver Hartkopp contributes a patch for the cangw, where an optional UID is added to reference routing jobs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10mac80211: convert HW flags to unsigned long bitmapJohannes Berg19-158/+164
As we're running out of hardware capability flags pretty quickly, convert them to use the regular test_bit() style unsigned long bitmaps. This introduces a number of helper functions/macros to set and to test the bits, along with new debugfs code. The occurrences of an explicit __clear_bit() are intentional, the drivers were never supposed to change their supported bits on the fly. We should investigate changing this to be a per-frame flag. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-10Merge remote-tracking branch 'net-next/master' into mac80211-nextJohannes Berg362-4687/+10446
Merge back net-next to get wireless driver changes (from Kalle) to be able to create the API change across all trees properly. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-10mac80211: Fix a case of incorrect metric used when forwarding a PREQAlexis Green1-9/+8
This patch fixes a bug in hwmp_preq_frame_process where the wrong metric can be used when forwarding a PREQ. This happens because the code uses the same metric variable to record the value of the metric to the source of the PREQ and the value of the metric to the target of the PREQ. This comes into play when both reply and forward are set which happens when IEEE80211_PREQ_PROACTIVE_PREP_FLAG is set and when MP_F_DO | MP_F_RF is set. The original code had a special case to handle the first case but not the second. The patch uses distinct variables for the two metrics which makes the code flow much clearer and removes the need to restore the original value of metric when forwarding. Signed-off-by: Alexis Green <agreen@cococorp.com> CC: Jesse Jones <jjones@cococorp.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller41-264/+784
Antonio Quartulli says: ==================== Included changes: - use common Jenkins hash instead of private implementation - extend internal routing API - properly re-arrange header files inclusion - clarify precedence between '&' and '?' - remove unused ethhdr variable in batadv_gw_dhcp_recipient_get() - ensure per-VLAN structs are updated upon MAC change ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-09mac80211: Always check rates and capabilities in mesh modeAlexis Green2-1/+5
In mesh mode there is a race between establishing links and processing rates and capabilities in beacons. This is very noticeable with slow beacons (e.g. beacon intervals of 1s) and manifested for us as stations using minstrel when minstrel_ht should be used. Fixed by changing mesh_sta_info_init so that it always checks rates and such if it has not already done so. Signed-off-by: Alexis Green <agreen@cococorp.com> CC: Jesse Jones <jjones@cococorp.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09mac80211: fix the beacon csa counter for mesh and ibssChun-Yeow Yeoh3-0/+3
The csa counter has moved from sdata to beacon/presp but it is not updated accordingly for mesh and ibss. Fix this. Fixes: af296bdb8da4 ("mac80211: move csa counters from sdata to beacon/presp") Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09mac80211: Fix incorrectly named last_hop_metric variable in ↵Alexis Green1-9/+9
mesh_rx_path_sel_frame The last hop metric should refer to link cost (this is how hwmp_route_info_get uses it for example). But in mesh_rx_path_sel_frame we are not dealing with link cost but with the total cost to the origin of a PREQ or PREP. Signed-off-by: Alexis Green <agreen@cococorp.com> CC: Jesse Jones <jjones@cococorp.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09mac80211: ignore invalid scan RSSI valuesSara Sharon1-1/+7
Channels in 2.4GHz band overlap, this means that if we send a probe request on channel 1 and then move to channel 2, we will hear the probe response on channel 2. In this case, the RSSI will be lower than if we had heard it on the channel on which it was sent (1 in this case). The scan result ignores those invalid values and the station last signal should not be updated as well. In case the scan determines the signal to be invalid turn on the flag so the station last signal will not be updated with the value and thus user space probing for NL80211_STA_INFO_SIGNAL and NL80211_STA_INFO_SIGNAL_AVG will not get this invalid RSSI value. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09mac80211: release channel on auth failureMichal Kazior1-0/+3
There were a few rare cases when upon authentication failure channel wasn't released. This could cause stale pointers to remain in chanctx assigned_vifs after interface removal and trigger general protection fault later. This could be triggered, e.g. on ath10k with the following steps: 1. start an AP 2. create 2 extra vifs on ath10k host 3. connect vif1 to the AP 4. connect vif2 to the AP (auth fails because ath10k firmware isn't able to maintain 2 peers with colliding AP mac addresses across vifs and consequently refuses sta_info_insert() in ieee80211_prep_connection()) 5. remove the 2 extra vifs 6. goto step 2; at step 3 kernel was crashing: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: ath10k_pci ath10k_core ath ... Call Trace: [<ffffffff81a2dabb>] ieee80211_check_combinations+0x22b/0x290 [<ffffffff819fb825>] ? ieee80211_check_concurrent_iface+0x125/0x220 [<ffffffff8180f664>] ? netpoll_poll_disable+0x84/0x100 [<ffffffff819fb833>] ieee80211_check_concurrent_iface+0x133/0x220 [<ffffffff81a0029e>] ieee80211_open+0x3e/0x80 [<ffffffff817f2d26>] __dev_open+0xb6/0x130 [<ffffffff817f3051>] __dev_change_flags+0xa1/0x170 ... RIP [<ffffffff81a23140>] ieee80211_chanctx_radar_detect+0xa0/0x170 (gdb) l * ieee80211_chanctx_radar_detect+0xa0 0xffffffff81a23140 is in ieee80211_chanctx_radar_detect (/devel/src/linux/net/mac80211/util.c:3182). 3177 */ 3178 WARN_ON(ctx->replace_state == IEEE80211_CHANCTX_REPLACES_OTHER && 3179 !list_empty(&ctx->assigned_vifs)); 3180 3181 list_for_each_entry(sdata, &ctx->assigned_vifs, assigned_chanctx_list) 3182 if (sdata->radar_required) 3183 radar_detect |= BIT(sdata->vif.bss_conf.chandef.width); 3184 3185 return radar_detect; Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09mac80211: handle aggregation session timeout on fast-xmit pathJohannes Berg1-3/+6
The conversion to the fast-xmit path lost proper aggregation session timeout handling - the last_tx wasn't set on that path and the timer would therefore incorrectly tear down the session periodically (with those drivers/rate control algorithms that have a timeout.) In case of iwlwifi, this was every 5 seconds and caused significant throughput degradation. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09can: cangw: introduce optional uid to reference created routing jobsOliver Hartkopp1-12/+56
Similar to referencing iptables rules by their line number this UID allows to reference created routing jobs, e.g. to alter configured data modifications. The UID is an optional non-zero value which can be provided at routing job creation time. When the UID is set the UID replaces the data modification configuration as job identification attribute e.g. at job removal time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-06-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller8-18/+36
2015-06-08net: replace last open coded skb_orphan_frags with function callWillem de Bruijn1-9/+2
Commit 70008aa50e92 ("skbuff: convert to skb_orphan_frags") replaced open coded tests of SKBTX_DEV_ZEROCOPY and skb_copy_ubufs with calls to helper function skb_orphan_frags. Apply that to the last remaining open coded site. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-08ipv6: Fix protocol resubmissionJosh Hunt1-3/+5
UDP encapsulation is broken on IPv6. This is because the logic to resubmit the nexthdr is inverted, checking for a ret value > 0 instead of < 0. Also, the resubmit label is in the wrong position since we already get the nexthdr value when performing decapsulation. In addition the skb pull is no longer necessary either. This changes the return value check to look for < 0, using it for the nexthdr on the next iteration, and moves the resubmit label to the proper location. With these changes the v6 code now matches what we do in the v4 ip input code wrt resubmitting when decapsulating. Signed-off-by: Josh Hunt <johunt@akamai.com> Acked-by: "Tom Herbert" <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-08ipv6: fix possible use after free of dev statsRobert Shearman1-2/+9
The memory pointed to by idev->stats.icmpv6msgdev, idev->stats.icmpv6dev and idev->stats.ipv6 can each be used in an RCU read context without taking a reference on idev. For example, through IP6_*_STATS_* calls in ip6_rcv. These memory blocks are freed without waiting for an RCU grace period to elapse. This could lead to the memory being written to after it has been freed. Fix this by using call_rcu to free the memory used for stats, as well as idev after an RCU grace period has elapsed. Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07fib_trie: coding style: Use pointer after checkFiro Yang1-8/+13
As Alexander Duyck pointed out that: struct tnode { ... struct key_vector kv[1]; } The kv[1] member of struct tnode is an arry that refernced by a null pointer will not crash the system, like this: struct tnode *p = NULL; struct key_vector *kv = p->kv; As such p->kv doesn't actually dereference anything, it is simply a means for getting the offset to the array from the pointer p. This patch make the code more regular to avoid making people feel odd when they look at the code. Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07bridge: disable softirqs around br_fdb_update to avoid lockupNikolay Aleksandrov1-0/+2
br_fdb_update() can be called in process context in the following way: br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set) so we need to disable softirqs because there are softirq users of the hash_lock. One easy way to reproduce this is to modify the bridge utility to set NTF_USE, enable stp and then set maxageing to a low value so br_fdb_cleanup() is called frequently and then just add new entries in a loop. This happens because br_fdb_cleanup() is called from timer/softirq context. The spin locks in br_fdb_update were _bh before commit f8ae737deea1 ("[BRIDGE]: forwarding remove unneeded preempt and bh diasables") and at the time that commit was correct because br_fdb_update() couldn't be called from process context, but that changed after commit: 292d1398983f ("bridge: add NTF_USE support") Using local_bh_disable/enable around br_fdb_update() allows us to keep using the spin_lock/unlock in br_fdb_update for the fast-path. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 292d1398983f ("bridge: add NTF_USE support") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07Revert "bridge: use _bh spinlock variant for br_fdb_update to avoid lockup"David S. Miller1-2/+2
This reverts commit 1d7c49037b12016e7056b9f2c990380e2187e766. Nikolay Aleksandrov has a better version of this fix. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07mpls: fix possible use after free of deviceRobert Shearman2-1/+2
The mpls device is used in an RCU read context without a lock being held. As the memory is freed without waiting for the RCU grace period to elapse, the freed memory could still be in use. Address this by using kfree_rcu to free the memory for the mpls device after the RCU grace period has elapsed. Fixes: 03c57747a702 ("mpls: Per-device MPLS state") Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07bridge: use _bh spinlock variant for br_fdb_update to avoid lockupWilson Kok1-2/+2
br_fdb_update() can be called in process context in the following way: br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set) so we need to use spin_lock_bh because there are softirq users of the hash_lock. One easy way to reproduce this is to modify the bridge utility to set NTF_USE, enable stp and then set maxageing to a low value so br_fdb_cleanup() is called frequently and then just add new entries in a loop. This happens because br_fdb_cleanup() is called from timer/softirq context. These locks were _bh before commit f8ae737deea1 ("[BRIDGE]: forwarding remove unneeded preempt and bh diasables") and at the time that commit was correct because br_fdb_update() couldn't be called from process context, but that changed after commit: 292d1398983f ("bridge: add NTF_USE support") Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 292d1398983f ("bridge: add NTF_USE support") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07tcp: get_cookie_sock() consolidationEric Dumazet2-23/+6
IPv4 and IPv6 share same implementation of get_cookie_sock(), and there is no point inlining it. We add tcp_ prefix to the common helper name and export it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07batman-adv: change the MAC of each VLAN upon ndo_set_mac_addressAntonio Quartulli1-3/+9
The MAC address of the soft-interface is used to initialise the "non-purge" TT entry of each existing VLAN. Therefore when the user invokes ndo_set_mac_address() all the "non-purge" TT entries have to be updated, not only the one belonging to the non-tagged network. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07batman-adv: Remove unused post-VLAN ethhdr in batadv_gw_dhcp_recipient_getSven Eckelmann1-5/+0
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07batman-adv: Clarify calculation precedence for '&' and '?'Sven Eckelmann3-23/+23
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07batman-adv: Add required includes to all filesSven Eckelmann41-158/+703
The header files could not be build indepdent from each other. This is happened because headers didn't include the files for things they've used. This was problematic because the success of a build depended on the knowledge about the right order of local includes. Also source files were not including everything they've used explicitly. Instead they required that transitive includes are always stable. This is problematic because some transitive includes are not obvious, depend on config settings and may not be stable in the future. The order for include blocks are: * primary headers (main.h and the *.h file of a *.c file) * global linux headers * required local headers * extra forward declarations for pointers in function/struct declarations The only exceptions are linux/bitops.h and linux/if_ether.h in packet.h. This header file is shared with userspace applications like batctl and must therefore build together with userspace applications. The header linux/bitops.h is not part of the uapi headers and linux/if_ether.h conflicts with the musl implementation of netinet/if_ether.h. The maintainers rejected the use of __KERNEL__ preprocessor checks and thus these two headers are only in main.h. All files using packet.h first have to include main.h to work correctly. Reported-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07batman-adv: add bat_neigh_free APIAntonio Quartulli2-0/+10
This API has to be used to let any routing protocol free neighbor specific allocated resources Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07batman-adv: split name from variable for uint mesh attributesAntonio Quartulli1-14/+15
Some mesh attributes are behind substructs in the batadv_priv object and for this reason the name cannot be used anymore to refer to them. This patch allows to specify the variable name where the attribute is stored inside batadv_priv instead of using the name Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07batman-adv: Use common Jenkins Hash implementationSven Eckelmann7-62/+25
An unoptimized version of the Jenkins one-at-a-time hash function is used and partially copied all over the code wherever an hashtable is used. Instead the optimized version shared between the whole kernel should be used to reduce code duplication and use better optimized code. Only the DAT code must use the old implementation because it is used as distributed hash function which has to be common for all nodes. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2015-06-07bpf: allow programs to write to certain skb fieldsAlexei Starovoitov1-12/+82
allow programs read/write skb->mark, tc_index fields and ((struct qdisc_skb_cb *)cb)->data. mark and tc_index are generically useful in TC. cb[0]-cb[4] are primarily used to pass arguments from one program to another called via bpf_tail_call() which can be seen in sockex3_kern.c example. All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs. mark, tc_index are writeable from tc_cls_act only. cb[0]-cb[4] are writeable by both sockets and tc_cls_act. Add verifier tests and improve sample code. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07bpf: make programs see skb->data == L2 for ingress and egressAlexei Starovoitov3-25/+26
eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data. For ingress L2 header is already pulled, whereas for egress it's present. This is known to program writers which are currently forced to use BPF_LL_OFF workaround. Since programs don't change skb internal pointers it is safe to do pull/push right around invocation of the program and earlier taps and later pt->func() will not be affected. Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick around run_filter/BPF_PROG_RUN even if skb_shared. This fix finally allows programs to use optimized LD_ABS/IND instructions without BPF_LL_OFF for higher performance. tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o w/o JIT w/JIT before 20.5 23.6 Mpps after 21.8 26.6 Mpps Old programs with BPF_LL_OFF will still work as-is. We can now undo most of the earlier workaround commit: a166151cbe33 ("bpf: fix bpf helpers to use skb->mac_header relative offsets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07tcp: remove redundant checks IIEric Dumazet2-8/+0
For same reasons than in commit 12e25e1041d0 ("tcp: remove redundant checks"), we can remove redundant checks done for timewait sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>

Privacy Policy