Nathan Patwardhan | 1 Oct 2009 13:24

Re: NFSv4 issue with netapp filer

On Thu, Oct 1, 2009 at 5:12 AM, Guillaume Rousse
<Guillaume.Rousse <at> inria.fr> wrote:
> Nathan Patwardhan wrote:
>>
>> On Wed, Sep 30, 2009 at 12:28 PM, Trond Myklebust
>> <trond.myklebust <at> fys.uio.no> wrote:
>>>
>>> On Wed, 2009-09-30 at 18:05 +0200, Guillaume Rousse wrote:
>>>>
>>>> This seems to match the 'v4 server returned a bad sequence-id error on
>>>> an unconfirmed sequence f5462f44!' in client logs.
>>>
>>> This is clearly a server bug. The client is allowed to use whatever
>>> sequence id it likes for an OPEN with an unconfirmed open owner. I
>>> thought this bug had been fixed in OnTap, though. Are you running the
>>> latest version?
>>
>> I can confirm this still happens under OnTap 7.3.1 and NetApp has
>> confirmed the bug (filed a Sev 2 ticket awhile back about it).  It
>> appears that it's related to memory count on the filer and cannot be
>> tuned in OnTap.
>
> Can you give me the bug number ? I can't find it on netapp bug tracking
> system.

Funny you should ask.  I just got a response from NetApp last night.
The bug ID is 276821.

The NetApp tech's description of the bug is:

"
On that case, this system hit BUG 276821 after
upgrading Data ONTAP to 7.3.1P2 because the system limit was reached.
For a FAS3020, limits for NFSv4 are:

Max Clients     16384
Max Owners      32768
Max StateIds    32768

Reading the case notes on XXXXXXXXXXXX, the NFSv4 locks will not be
released until the point they are unlocked or the client does not stop
renewing. Thus, if the client does not send the ops that renew the
lease, this is not an issue and you would need to do some tuning on the
client side. However, if the client is not renewing the locks but NFS is
not freeing the resource on the filer side, it would be definitely an
issue.
"
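To make the limits quoted above concrete, here is a toy model of a fixed-size state table. It is only a sketch under loud assumptions: the `StateTable` class, the `first_refused` helper, and the idea that every open consumes a fresh owner entry are all hypothetical illustrations of mine, not a description of how Data ONTAP or the Linux client actually behaves. The only number taken from the NetApp text is the 32768 owner limit.

```python
MAX_OWNERS = 32768  # FAS3020 "Max Owners" figure quoted by the NetApp tech


class StateTable:
    """Toy fixed-capacity table. NOT how Data ONTAP is implemented."""

    def __init__(self, limit):
        self.limit = limit
        self.entries = set()

    def allocate(self, owner):
        """Return True if the owner has (or gets) a slot, False if the table is full."""
        if owner in self.entries:
            return True
        if len(self.entries) >= self.limit:
            return False
        self.entries.add(owner)
        return True

    def release(self, owner):
        """Free the slot held by this owner, if any."""
        self.entries.discard(owner)


def first_refused(table, n_opens, free_after_use):
    """Simulate n_opens opens, each with a new (hypothetical) owner.

    Returns the index of the first open that is refused, or None if all succeed.
    """
    for i in range(n_opens):
        if not table.allocate(i):
            return i
        if free_after_use:
            table.release(i)
    return None


# Entries freed after use: the table never fills up.
print(first_refused(StateTable(MAX_OWNERS), 100000, free_after_use=True))
# Entries never freed: the limit is hit after MAX_OWNERS allocations.
print(first_refused(StateTable(MAX_OWNERS), 100000, free_after_use=False))
```

In this model, a workload that touches many files only runs into trouble when entries are not freed, which is the case the NetApp tech calls "definitely an issue".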

>
> In any case, the list of bugs fixed between 7.3.1.1 and 7.3rc2 has 6
> NFS-related bugs, none of them nfs4-specific, nor matching this problem.
>
>> Looking at the release notes for 7.3.2RC, it appears that a number of
>> NFSv4 changes were made WRT delegation but given the nature of the bug
>> as it was explained to me, I'm thinking that the issue hasn't been resolved
>> given our filers and their existing hardware (3020).
>
> I didn't see anything related to NFSv4 changes between 7.3.1.1 and 7.3.2RC
> in the release notes. AFAIK, the changes you're mentioning are part of the
> 7.3 series.

Further, from NetApp, the bug has been addressed in 7.3.2RC:

"
In regards to your question about the Data ONTAP 7.3.2 as GA, Data ONTAP
7.3.2RC1 was released on July 2009. Normally, the GA classification is
given to the feature release of Data ONTAP approximately 1-2 months from
the first RC release. However, I cannot tell when the GA version will be
available because we do not have an estimate date for the next version.
On this release, BUG 276821 is fixed.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"

I believe that our issues are identical, btw.  A large rsync against an
NFSv4 mount to a 3020 filer will DoS NFSv4 services (CIFS and NFSv3
will still work).
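
If you want to check whether a client is seeing the same symptom, one rough way is to scan its kernel log for the message quoted earlier in this thread. The snippet below is a minimal sketch: the `find_bad_seqid` helper is a hypothetical name of mine, and it only matches the message text exactly as it was quoted here, so other kernel versions may word it differently.

```python
import re

# Matches the client log message quoted earlier in the thread and captures
# the value printed at the end of it.
PATTERN = re.compile(
    r"bad sequence-id error on an unconfirmed sequence (\w+)"
)


def find_bad_seqid(lines):
    """Return the captured sequence values from every matching log line."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append(match.group(1))
    return hits


# Sample input: the first line is the message as quoted in this thread,
# the second is an unrelated filler line.
sample = [
    "v4 server returned a bad sequence-id error on an unconfirmed sequence f5462f44!",
    "some unrelated kernel message",
]
print(find_bad_seqid(sample))
```

To use it on a real client, pass it the lines of whatever file or command output holds your kernel messages; where those live depends on the distribution.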

-- 
Nathan Patwardhan
