Chad M Stewart | 7 Jun 2011 18:31

Re: Debugging problems w/o tcpdump


On Jun 6, 2011, at 4:11 PM, Rolf E. Sonneveld wrote:

> Hi, Chad,

Hi Rolf, 

long time no chat 

> 
> On 06/06/2011 10:47 PM, Chad M Stewart wrote:
> 
> [...]
> 
>> 
>>> before it gets sent out.
>>         ^^^^^^^^^^^^^^^^
>> 
>> Those being the important words. In this case the messages are in the queue and the tcp_smtp_client is
>> unable to deliver.   It is the "unable to deliver" part that I want to resolve, but darn it all I feel as though
>> both of my hands are tied behind my back.  I've got messages going to random domains.  One recipient is in
>> Yahoo (spelled correctly, etc.), a dot-stuffed response.   Given the nature of the problems and the
>> randomness of the domains, I'm leaning towards something on the local network vs problems at remote sites.
>> 
>> I realize my case is probably a one-off and not common, as in I can't run tcpdump and it is very hard to make
>> configuration changes, but that is why I had the idea of a dynamic ability to increase the logging level for a
>> specific message that is already sitting in the queue.
> 
> What happens if you look up the MX records for one or two of the domains and telnet to port 25 on the various MX
> hosts? Can you complete an entire SMTP session, or do you encounter delays, broken links, or other problems?

Those are things I can do anytime during the day, and generally things work just fine.  I'm in an unusual
situation (an ISP, not an enterprise) regarding originator addresses.  Let's say the domain that has
its MX record pointing to servers I look after is XYZ.com.   For now, messages in the outbound servers I manage
would not have that as the return path domain.  Instead the return path domain could be one of about 12 other
domains.  Those other domains are all managed by other ISPs.  I'm finding it hard to emulate one of the
messages stuck in the queue past RCPT TO.  I do not want to send any test messages to either the sender or
the recipient.  I check basic things, like whether my outbound servers can connect, etc.  I also check from my own
servers to get a different network/server perspective on things.
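
For reference, the manual check Rolf describes (telnet to port 25 and walk the session) can be scripted. The following is only a minimal sketch in Python using the standard smtplib module. The function name probe_rcpt and the smtp_factory parameter are made up for illustration, and the MX host is assumed to have been looked up separately. It stops after RCPT TO and issues RSET, so no DATA is sent and nothing is delivered. That also means it will not exercise a stall that happens during transmission of the message data.

```python
import smtplib


def probe_rcpt(mx_host, helo_name, sender, recipient,
               port=25, timeout=30, smtp_factory=smtplib.SMTP):
    """Walk an SMTP session up to RCPT TO, then RSET and QUIT.

    No DATA command is issued, so no message is transmitted.
    Returns a list of (step, code, reply_text) tuples.
    smtp_factory is a hypothetical hook so a stand-in class can be
    substituted for smtplib.SMTP when testing without a network.
    """
    steps = []
    # Constructing the object connects and reads the server greeting.
    conn = smtp_factory(mx_host, port, timeout=timeout)
    try:
        for name, call in (
            ("EHLO", lambda: conn.ehlo(helo_name)),
            ("MAIL FROM", lambda: conn.mail(sender)),
            ("RCPT TO", lambda: conn.rcpt(recipient)),
            ("RSET", conn.rset),
        ):
            code, reply = call()
            if isinstance(reply, bytes):
                reply = reply.decode("ascii", "replace")
            steps.append((name, code, reply))
            if code >= 400:
                # Stop at the first temporary or permanent failure.
                break
    finally:
        try:
            conn.quit()
        except smtplib.SMTPServerDisconnected:
            pass
    return steps
```

A call such as probe_rcpt("mx.example.net", "probe.example.org", "sender@example.org", "rcpt@example.net") (all placeholder names) returns the reply code and text for each step, which can be compared across outbound servers or repeated at different times of day.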

> 
> Any Pix firewall between you and the Internet, with 'fixup' for smtp enabled? Or load balancers, SMTP
> proxies? Does the site have multiple Internet links, using BGP or similar?

I know there are firewalls; what I don't know (yet) is what, if anything, they do to the packets leaving the
outbound MTAs.  I know they provide NAT services.  I did inject a message and let it get delivered to my
servers, so I know some messages are working.

I learned late yesterday that I can run tcpdump; it just has to be during a maintenance window, and
unfortunately for my sleep that is in the early morning hours my time.   The good news is that very early this
morning I was able to capture a couple of traces.  (Note --  I'm glad job_controller did what I hoped it would
do: I listed the messages in the queue, found one that I wanted a trace on, set up snoop, and then did
rel N, and job_controller immediately submitted another delivery attempt.  I think this has been around a
while but I couldn't remember.)

Both of my traces show the servers communicating nicely, and then at some point during the transmission of
the message data the sending system stops processing the acknowledgements.   I've got to load the captures
up in Wireshark and other tools and analyze further.

-Chad

