|
Grex > Helpers > #149: Grex System Problems - Spring 2006 | |
|
| Author |
Message |
naftee
|
|
response 300 of 333:
|
Jun 12 20:27 UTC 2006 |
what on earth would sindii post on craigslist ?
does she write about GreX twits in the rants & raves section ?
|
cross
|
|
response 301 of 333:
|
Jun 12 20:56 UTC 2006 |
This response has been erased.
|
gull
|
|
response 302 of 333:
|
Jun 13 03:37 UTC 2006 |
The number of simultaneous connections is governed by the
'smtp_accept_max' variable. Unfortunately the Exim documentation
doesn't mention what the default setting of this variable is.
A couple other suggestions to streamline the process:
- Currently you wait up to 30 seconds for an ident response. This
happens during the connect phase. I'd shorten that. Set
"rfc1413_query_timeout" to something in the 10 second range, if you
absolutely can't live without it. I find ident isn't very useful these
days, so I tend to disable that lookup entirely.
- You have 'deliver_queue_load_max' set to 1.0, which means you aren't
doing any deliveries when the load average is above that value. I
might bump that up to 2.0, since Grex seems to run pretty high load
averages at times. This may be part of the reason you aren't getting
much mail through. This is a tricky one; you're potentially trading
off system responsiveness for getting more mail through.
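To make the load-average trade-off above concrete, here is a toy Python
model of the gate. It is not Exim's actual code: the function name
queue_run_allowed and its arguments are made up for illustration, and it
only assumes that queued deliveries are held back while the 1-minute load
average is above the configured limit.

```python
import os


def queue_run_allowed(load_limit, load_average=None):
    """Return True if queued deliveries may proceed under this toy model."""
    if load_average is None:
        # Use the 1-minute load average of the machine running this script.
        load_average = os.getloadavg()[0]
    # Deliveries are held back only while the load is above the limit.
    return load_average <= load_limit
```

With a load average of 1.5, a limit of 1.0 holds the queue while a limit
of 2.0 lets it run, which is the whole argument for raising the value on
a machine that is usually busy.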
If you've got more people who are familiar with postfix, switching
might be a good idea. MTAs are funny that way; they're complex
programs and people who know one generally find any others to be
incomprehensible. For example, I know Exim reasonably well, but I'd be
lost with Postfix, and I can just barely get Sendmail going.
|
nharmon
|
|
response 303 of 333:
|
Jun 13 12:01 UTC 2006 |
I'm familiar with exchange, but not out of choice.
|
keesan
|
|
response 304 of 333:
|
Jun 13 14:18 UTC 2006 |
I could never get mail from earthlink/mindspring, it always timed out (they
did not wait long enough). Is that related to the previous response?
|
mcnally
|
|
response 305 of 333:
|
Jun 13 18:36 UTC 2006 |
re #302: I tried bumping up the load average limits for queueing and
delivering mail to 2.5 or 3 or something like that before I also tried
cutting out several of the RBL checks. It didn't appear to make any
significant improvement in delivery or in the number of messages dropped.
If you're knowledgeable about exim configs you could have a look at the
configs and suggest specific changes (maybe edit a copy; I'll look at the
diffs and apply them) if you have the time.
|
davel
|
|
response 306 of 333:
|
Jun 13 20:01 UTC 2006 |
Re 297 (way back):
Mike, I certainly didn't mean to be dumping on you. My experience over the
years has been that you're willing to help people whenever you can,
& I'm confident that you approach staff stuff the same way. I know I
wasn't the only one complaining; but I for one wasn't pointing any fingers.
Meanwhile, it sure looks like someone did *something*. The dozen queued-up
messages that had been trying to go to Grex for several days now seem to have
suddenly vanished from the queue - & as I haven't gotten any bounce messages,
I think they must have gone through. But a new message I just sent seems
to be hanging there. Dunno.
|
keesan
|
|
response 307 of 333:
|
Jun 13 21:12 UTC 2006 |
I just tried to send myself a mail with pine. I did not get the usual error
messages but it took a long time to go from 0% to 100% sent and then got stuck
at 100%. Ctrl-C exited pine and told me I had sent the mail (after I waited
a few minutes for the prompt) but the mail has not arrived.
|
cross
|
|
response 308 of 333:
|
Jun 13 21:23 UTC 2006 |
This response has been erased.
|
keesan
|
|
response 309 of 333:
|
Jun 13 22:04 UTC 2006 |
finger root -- Charlie Root, mail last read June 12, logged in from msu.edu,
running sh which is using 14% of CPU, causing load average to be around 8.
Amazing that I can still see what I am typing. sh is a shell.
|
keesan
|
|
response 310 of 333:
|
Jun 13 22:06 UTC 2006 |
sh is now up to 16% of CPU time. It was 13% when I first looked.
Another hole to be plugged?
166 processes, of which 1 is running and the rest are stopped.
|
keesan
|
|
response 311 of 333:
|
Jun 13 22:07 UTC 2006 |
Make that the rest are idle or zombie, 1 on processor, and sh is 15%.
|
keesan
|
|
response 312 of 333:
|
Jun 13 22:10 UTC 2006 |
Charlie Root has been running ssh since approximately 5:47. jp2 has been
logged on since 5:38 and is running 'j p 2'. The rest of us are running bbs,
party, lynx, bash, and the like.
|
cross
|
|
response 313 of 333:
|
Jun 14 00:12 UTC 2006 |
This response has been erased.
|
gull
|
|
response 314 of 333:
|
Jun 14 00:25 UTC 2006 |
True, it's much better to log in as someone else and then su to root.
OpenBSD doesn't nag about it just to be contrary. ;)
|
keesan
|
|
response 315 of 333:
|
Jun 14 03:00 UTC 2006 |
I just got my first mail sent to grex in about a week, and was able to answer
it (I hope) in under 30 sec, but I still never got the mail I sent myself
earlier. Thanks if someone fixed something.
|
keesan
|
|
response 316 of 333:
|
Jun 14 03:16 UTC 2006 |
Load average is back down from 9 to .4. How would we mortals figure out what
is causing the load average to go up so we can report it to staff?
|
mcnally
|
|
response 317 of 333:
|
Jun 14 08:16 UTC 2006 |
One of the things which is causing mail failures, I believe,
(but not the only one) is that periodically /var/spool runs out
of free inodes. It runs out of free inodes because there's a
totally absurd number of files in several subdirectories of
/var/spool/exim which never, ever seem to get cleared out.
Can someone who's familiar with exim tell me what purpose the
various subdirectories of /var/spool/exim serve and how files
in there are supposed to be purged? Because we don't seem to
be doing it properly..
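For anyone who wants to watch this happen, a rough Python sketch along
these lines could report how many inodes are free and which spool
subdirectories hold the most files. The function names (inode_usage,
files_per_subdir, report) are made up for illustration, and it assumes
Python is available and the account running it can read the spool tree.

```python
import os


def inode_usage(path):
    """Return (free, total) inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_ffree, st.f_files


def files_per_subdir(spool_dir):
    """Count non-directory entries under each immediate subdirectory."""
    counts = {}
    for entry in os.scandir(spool_dir):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # Walk the subdirectory recursively and add up its files.
            for _root, _dirs, files in os.walk(entry.path):
                total += len(files)
            counts[entry.name] = total
    return counts


def report(spool_dir):
    """Print inode headroom, then subdirectories from fullest to emptiest."""
    free, total = inode_usage(spool_dir)
    print(f"{spool_dir}: {free} of {total} inodes free")
    counts = files_per_subdir(spool_dir)
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{count:10d}  {name}")
```

Calling report("/var/spool/exim") from cron and mailing the output
somewhere off Grex would at least show which subdirectory is growing
before the inodes run out. It does not purge anything.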
|
keesan
|
|
response 318 of 333:
|
Jun 14 21:52 UTC 2006 |
I am still getting mail from June 8 - better late than never. Thanks again
STeve et al.
|
cross
|
|
response 319 of 333:
|
Jun 14 23:18 UTC 2006 |
This response has been erased.
|
keesan
|
|
response 320 of 333:
|
Jun 15 00:48 UTC 2006 |
I was able to reply to a few June 7 mails that just showed up but now I am
getting an SMTP greeting failure when I try to send mail. We picked the wrong
week to try to sell a car. Turns out lots of people were interested a week
ago.
|
keesan
|
|
response 321 of 333:
|
Jun 15 01:22 UTC 2006 |
Today I have been getting mail progressively from June 9, June 8, June 7 and
now June 6. I wondered why people had stopped writing. Thanks again.
|
keesan
|
|
response 322 of 333:
|
Jun 15 01:56 UTC 2006 |
In among the June 7 I just got a May 23 mail! Rane was saying it took up to
12 days to get mail at grex, but this is about 21 days. How does the mail
manage to get so bogged down? Can exim be instructed to deliver the oldest
mails first instead of the newest ones?
|
steve
|
|
response 323 of 333:
|
Jun 15 04:09 UTC 2006 |
There are some old pieces of mail in the queue which I think will now
get liberated. Exim is likely going to get a little more tweaking
before this is all over. But mails are moving fairly quickly now.
It took about 1.5 minutes for mail to get from msu.edu to grex, and
faster than that the other way around.
|
rcurl
|
|
response 324 of 333:
|
Jun 15 07:57 UTC 2006 |
Wow! The e-mail dam has burst. But, thanks!
|