A lesson in verifying Nexus 7000 MTU

The Nexus 7000 has its system jumbo MTU set to 9216 by default.  Even though the system jumbo MTU is set, notice the interface MTU:

N7K-1# sh run all | i mtu
  system jumbomtu 9216
N7K-1# sh int e3/7 | i MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec

Try to ping over that interface with a large packet and the DF bit set, and it will certainly drop:

N7K-1# ping 10.12.0.29 df-bit packet-size 8972
  PING 10.12.0.29 (10.12.0.29): 8972 data bytes
  Request 0 timed out
  Request 1 timed out
  ^C
  --- 10.12.0.29 ping statistics ---
  3 packets transmitted, 0 packets received, 100.00% packet loss
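The arithmetic behind that failure can be sketched in a few lines of Python. This is only an illustration, assuming a plain IPv4 header without options (20 bytes) and the 8-byte ICMP echo header; the function names are made up for this example. A ping with 8972 data bytes becomes a 9000-byte IP packet, which cannot leave a 1500-byte MTU interface when the DF bit forbids fragmentation.

```python
IP_HEADER = 20    # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo header


def ip_packet_size(payload: int) -> int:
    """Size of the IP packet produced by a ping with `payload` data bytes."""
    return payload + ICMP_HEADER + IP_HEADER


def max_ping_payload(mtu: int) -> int:
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def fits(payload: int, mtu: int) -> bool:
    """True if the ping fits the interface MTU without fragmentation."""
    return ip_packet_size(payload) <= mtu


print(ip_packet_size(8972))    # 9000
print(fits(8972, 1500))        # False
print(fits(8972, 9000))        # True
print(max_ping_payload(1500))  # 1472
```

So 8972 is exactly the largest payload an interface MTU of 9000 can carry with DF set.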

Change the interface MTU and try another ping across that interface:

N7K-1(config)# int e3/7
  N7K-1(config-if)# mtu 9000
N7K-1# sh int e3/7 | i MTU
  MTU 9000 bytes, BW 10000000 Kbit, DLY 10 usec
N7K-1# ping 10.12.0.29 df-bit packet-size 8972
  PING 10.12.0.29 (10.12.0.29): 8972 data bytes
  8980 bytes from 10.12.0.29: icmp_seq=0 ttl=254 time=3.321 ms
  8980 bytes from 10.12.0.29: icmp_seq=1 ttl=254 time=3.117 ms
  Request 2 timed out
  8980 bytes from 10.12.0.29: icmp_seq=3 ttl=254 time=3.374 ms
  8980 bytes from 10.12.0.29: icmp_seq=4 ttl=254 time=3.153 ms
--- 10.12.0.29 ping statistics ---
  5 packets transmitted, 4 packets received, 20.00% packet loss 

Great, we can ping with a large packet, so jumbo MTU is working!  However, there’s a problem: notice the dropped packet.  For some reason, whenever the ICMP message is larger than 4026 bytes (4018 data bytes plus the 8-byte ICMP header), I drop 1 out of every 4-5 packets.  Here’s the evidence:

N7K-1# ping 10.12.0.29 packet-size 8092
 PING 10.12.0.29 (10.12.0.29): 8092 data bytes
  8100 bytes from 10.12.0.29: icmp_seq=0 ttl=254 time=3.258 ms
  8100 bytes from 10.12.0.29: icmp_seq=1 ttl=254 time=3.056 ms
  Request 2 timed out
  8100 bytes from 10.12.0.29: icmp_seq=3 ttl=254 time=3.712 ms
  8100 bytes from 10.12.0.29: icmp_seq=4 ttl=254 time=3.965 ms
--- 10.12.0.29 ping statistics ---
  5 packets transmitted, 4 packets received, 20.00% packet loss
 round-trip min/avg/max = 3.056/3.497/3.965 ms
N7K-1# ping 10.12.0.29 packet-size 4019
 PING 10.12.0.29 (10.12.0.29): 4019 data bytes
  4027 bytes from 10.12.0.29: icmp_seq=0 ttl=254 time=3.085 ms
  4027 bytes from 10.12.0.29: icmp_seq=1 ttl=254 time=2.815 ms
  4027 bytes from 10.12.0.29: icmp_seq=2 ttl=254 time=2.765 ms
  4027 bytes from 10.12.0.29: icmp_seq=3 ttl=254 time=2.804 ms
  Request 4 timed out
--- 10.12.0.29 ping statistics ---
  5 packets transmitted, 4 packets received, 20.00% packet loss
 round-trip min/avg/max = 2.765/2.867/3.085 ms
N7K-1# ping 10.12.0.29 packet-size 4018
 PING 10.12.0.29 (10.12.0.29): 4018 data bytes
  4026 bytes from 10.12.0.29: icmp_seq=0 ttl=254 time=3.23 ms
  4026 bytes from 10.12.0.29: icmp_seq=1 ttl=254 time=3.018 ms
  4026 bytes from 10.12.0.29: icmp_seq=2 ttl=254 time=1.881 ms
  4026 bytes from 10.12.0.29: icmp_seq=3 ttl=254 time=2.984 ms
  4026 bytes from 10.12.0.29: icmp_seq=4 ttl=254 time=2.979 ms
--- 10.12.0.29 ping statistics ---
  5 packets transmitted, 5 packets received, 0.00% packet loss
 round-trip min/avg/max = 1.881/2.818/3.23 ms

The Root Cause:

The Nexus CoPP (Control Plane Policing) profile is set to strict, which protects the control plane from ICMP abuse.  Hop onto the Admin VDC and take a look at the CoPP statistics:

N7K-1# show policy-map interface control-plane | i "class|conform|violated"
 ...
  class-map copp-system-p-class-monitoring (match-any)
  conform action: transmit
  conformed 7281105 bytes,
  violated 221256 bytes,
  conformed 0 bytes,
  violated 0 bytes,
 ...

Let’s take a look at what’s included in that class-map:

policy-map type control-plane copp-system-p-policy-strict
  class copp-system-p-class-monitoring
    set cos 1
    police cir 130 kbps bc 1000 ms conform transmit violate drop
class-map type control-plane match-any copp-system-p-class-monitoring
  match access-group name copp-system-p-acl-icmp
  match access-group name copp-system-p-acl-icmp6
  match access-group name copp-system-p-acl-traceroute
ip access-list copp-system-p-acl-icmp
  10 permit icmp any any echo
  20 permit icmp any any echo-reply
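To get a feel for what that police statement does, here is a minimal token-bucket sketch in Python. It is a simplified model, not how the Nexus hardware actually implements policing: it assumes "bc 1000 ms" means the bucket holds one second of traffic at the committed rate (130 kbps, so 16,250 bytes), and the Policer class and its method names are hypothetical. It does not try to reproduce the exact 4026-byte threshold seen above.

```python
class Policer:
    """Single-rate token bucket: conform -> transmit, violate -> drop."""

    def __init__(self, cir_bps: float, bc_ms: float):
        self.rate = cir_bps / 8                 # refill rate in bytes per second
        self.capacity = self.rate * bc_ms / 1000  # bucket depth in bytes (assumed meaning of bc)
        self.tokens = self.capacity             # bucket starts full
        self.last = 0.0                         # time of the previous packet, in seconds

    def offer(self, now: float, size: int) -> bool:
        """Return True if a packet of `size` bytes conforms at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that passed, never beyond the bucket depth.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


policer = Policer(cir_bps=130_000, bc_ms=1000)
print(policer.capacity)  # 16250.0

# Three 8000-byte packets arriving back to back: the third finds the bucket nearly empty.
print([policer.offer(0.0, 8000) for _ in range(3)])  # [True, True, False]
```

The point of the model is only that large packets drain a 16,250-byte bucket after a handful of pings, while small packets barely dent it.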

Ah-ha!  As expected, ICMP is being policed and dropped.  I cleared the statistics just to verify:

N7K-1# clear copp statistics
N7K-1# show policy-map interface control-plane | i "class|conform|violated"
  Truncated to only show class-map copp-system-p-class-monitoring
class-map copp-system-p-class-monitoring (match-any)
  conform action: transmit
  conformed 625 bytes,
        violated 0 bytes,
N7K-1# ping 10.12.0.29 packet-size 8000
  PING 10.12.0.29 (10.12.0.29): 8000 data bytes
  8008 bytes from 10.12.0.29: icmp_seq=0 ttl=254 time=3.617 ms
  8008 bytes from 10.12.0.29: icmp_seq=1 ttl=254 time=3.359 ms
  8008 bytes from 10.12.0.29: icmp_seq=2 ttl=254 time=3.493 ms
  Request 3 timed out
  8008 bytes from 10.12.0.29: icmp_seq=4 ttl=254 time=3.687 ms
--- 10.12.0.29 ping statistics ---
  5 packets transmitted, 4 packets received, 20.00% packet loss
  round-trip min/avg/max = 3.359/3.538/3.687 ms

Now let’s look at the copp statistics:

class-map copp-system-p-class-monitoring (match-any)
  conform action: transmit
  conformed 33059 bytes,
          violated 8046 bytes,

Verified – Control Plane policing is dropping my pings, as it rightfully should!
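The violated counter lines up neatly with a single dropped ping. The Python sketch below parses the counter lines shown above and compares them with the size of one 8000-byte ping on the wire. The 18 bytes of Ethernet overhead (14-byte header plus 4-byte FCS) being included in the CoPP byte counters is an assumption on my part; the numbers happen to match, but that is not confirmed anywhere in the output. The helper names are made up for this example.

```python
import re

IP_HEADER = 20     # bytes, IPv4 header without options
ICMP_HEADER = 8    # bytes, ICMP echo header
ETH_OVERHEAD = 18  # assumption: 14-byte Ethernet header + 4-byte FCS are counted


def frame_size(payload: int) -> int:
    """On-wire size of a ping with `payload` data bytes, under the assumptions above."""
    return payload + ICMP_HEADER + IP_HEADER + ETH_OVERHEAD


def parse_counters(text: str) -> tuple[int, int]:
    """Sum every 'conformed N bytes' and 'violated N bytes' line in the output."""
    conformed = sum(int(n) for n in re.findall(r"conformed (\d+) bytes", text))
    violated = sum(int(n) for n in re.findall(r"violated (\d+) bytes", text))
    return conformed, violated


output = """class-map copp-system-p-class-monitoring (match-any)
  conform action: transmit
  conformed 33059 bytes,
          violated 8046 bytes,
"""

conformed, violated = parse_counters(output)
print(frame_size(8000))                    # 8046
print(divmod(violated, frame_size(8000)))  # (1, 0): exactly one dropped frame
```

Under that assumption, the 8046 violated bytes correspond to exactly one 8000-byte ping, which matches the single "Request 3 timed out" in the test.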
 

The Solution

To correct this, we’ll need to modify the CoPP profile.  The default strict profile is READ-ONLY, so you’ll need to make a copy before making any modifications:

N7K-1# copp copy profile strict prefix custom

If you do a show run copp, you’ll see the strict policy is copied with the prefix “custom-” added to it.  We now want to remove the ICMP class-map and apply this CoPP profile to our control plane.

N7K-1(config)# policy-map type control-plane custom-copp-policy-strict
  N7K-1(config-pmap)# no class custom-copp-class-monitoring

Notice, at the moment we are still using the default strict profile (not the copied custom one):

N7K-1# sh copp status
  Last Config Operation: copp copy profile strict prefix custom
  Last Config Operation Timestamp: 23:54:55 UTC Feb 18 2014
  Last Config Operation Status: Success
  Policy-map attached to the control-plane: copp-system-p-policy-strict  

Let’s change that:

N7K-1(config)# control-plane
  N7K-1(config-cp)# service-policy input custom-copp-policy-strict
N7K-1# sh copp status
  Last Config Operation: class custom-copp-class-monitoring
  Last Config Operation Timestamp: 23:59:42 UTC Feb 18 2014
  Last Config Operation Status: Success
  Policy-map attached to the control-plane: custom-copp-policy-strict 

Let’s try our ping now:

N7K-1# ping 10.12.0.29 packet-size 8000
  PING 10.12.0.29 (10.12.0.29): 8000 data bytes
  8008 bytes from 10.12.0.29: icmp_seq=0 ttl=255 time=0.539 ms
  8008 bytes from 10.12.0.29: icmp_seq=1 ttl=255 time=0.64 ms
  8008 bytes from 10.12.0.29: icmp_seq=2 ttl=255 time=0.321 ms
  8008 bytes from 10.12.0.29: icmp_seq=3 ttl=255 time=0.327 ms
  8008 bytes from 10.12.0.29: icmp_seq=4 ttl=255 time=0.408 ms
--- 10.12.0.29 ping statistics ---
  5 packets transmitted, 5 packets received, 0.00% packet loss
  round-trip min/avg/max = 0.321/0.446/0.64 ms

Success!!

Now, this was only for verification, and I really do want protection from ICMP DoS attacks, so I made sure to reapply the ICMP class-map.

Additional Note:

These tests were done using an M-Series card.  If you’re running F-Series cards, MTU is configured differently.  Interfaces on the F-Series cards only support an MTU of 1500 or the equivalent value of the system jumbomtu.  If you try to enter any other value, NX-OS will kindly remind you of this.  However, if specific QoS is applied, the F-Series will follow per-CoS MTU settings.

Future testing on Nexus 7Ks will always have me pondering…

fry-copp

3 comments

  1. If I only set "system jumbomtu 9216" and the interface MTU to 9216, will that not be enough for the port to support jumbo packets?

    1. No, it does not. The issue here was simply the CoPP policy, not the MTU configuration. The policy was dropping my icmp traffic hitting the control plane, so the modification lifted that restriction. Thanks for the question!
