69 new of 333 responses total.
Re 261 re 259: I see this all the time - mail to cyberspace.org hangs in the queue for hours to days, often generating warnings after 4 hours. If I connect to the SMTP port, I see that "too many concurrent SMTP connections" message, & an immediate disconnect. And I've also gotten complaints about mail to my Grex address not getting through; & this is much more of a problem for Grace, who has no other email address. But is this really a problem with the delay? It seems to me that the delay must be really, really, REALLY long if this is the case; when I see the too-many-connections message the disconnect is *immediate*. What is the maximum allowable number of SMTP connections (or processes) set to? Can this be bumped up, if it realistically is much too low? I realize that the limit is set to keep mail from completely busying out the system. But that doesn't mean that the default limit (if that's what it is) is reasonable for Grex. If the limit is inherited somehow from the previous hardware, it's even less likely to be reasonable. (But that seems a bit unlikely, given that we went from a hacked sendmail to something else at that time.)
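For anyone who wants to repeat the connect-to-the-SMTP-port check without typing into telnet, here is a minimal Python sketch. It assumes the refusal arrives as a 421 reply (the standard SMTP "service not available" code); the exact wording Grex sends is whatever the post above quotes, and the function names are made up for illustration.

```python
import socket


def classify_greeting(line: str) -> str:
    """Map the first SMTP reply line to a rough status."""
    code = line[:3]
    if code == "220":
        return "ready"
    if code == "421":
        # 421 means the server is closing the channel, e.g. too many connections
        return "temporarily refusing connections"
    return "unexpected reply"


def read_greeting(host: str, port: int = 25, timeout: float = 60.0) -> str:
    """Connect to the SMTP port and return the first line the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = conn.recv(1024)
    # An empty read means the server hung up without saying anything
    return data.decode("ascii", errors="replace").splitlines()[0] if data else ""
```

Calling classify_greeting(read_greeting("cyberspace.org")) from a machine that is allowed to make outbound port 25 connections would show whether the server greets or refuses at that moment.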
Grex's mail config sucks, but I'm afraid I don't know enough about exim to fix it. The maillog shows we're dropping connections on the floor pretty much constantly, but I'm not sure how to fix that. I have a hypothesis that it may be due to our policy of introducing a 30s delay for any host listed in one of several RBLs, which I think causes a lot of tied up exim processes, but my attempts to reduce that delay to 1s by tweaking the exim.conf file have probably not been successful.
Several people report that they can never send me mail, including from mindspring/earthlink, which is impatient so it always times out.
Re #267: That's what prompted me to post what I did on my Web site. It's pointless people trying to send me email on Grex unless they happen to be another local user here.
I see what people mean about a problem with e-mail. I had sent a message here a few days ago that arrived within minutes, but last night I sent one that has still not arrived after nearly 12 hours.
ssh did not work just now but telnet does. Please someone remove the email not working message from last week (motd).
My unscientific observation is that my incoming and outgoing mail have stopped working for approximately the last 24 hours. I realize that some of this is still my fault for not having moved to a modern mail platform.
I have received some mail dated today, but not what I sent yesterday. Has it gone into a black hole?
I got three mails from AOL today (they have fixed the problem of refusing to accept mails from us too).
sdf.lonestar.org is working now, but freeshell.org is not, and the disk quotas all set themselves to 0 (with 400MB free) and I had to 'tweak' them - I wonder what people do who have not paid for the use of 'tweak'. I posted this info for people who signed up at freeshell.org for more reliable email than grex, which has been astonishingly reliable recently.
After a few days with no spam, the dam broke this morning and 21 spam messages poured in. But the message I sent two days ago has not. Where is it?
Your message must have drowned in the flood of spam.
I also am seeing much more spam. OTOH, SMTP connections are not being closed (or much, much less frequently) with that too-many-smtp-connections message. I suspect no mere coincidence.
In an attempt to end the backlogs that were causing many valid messages to be dropped, I disabled two of the three spam blacklist checks to see whether that would improve mail delivery. Apparently it hasn't had any effect on the delay problems people are complaining about, only facilitated the delivery of more spam. I'll put back the copy of the exim configuration I kept from before my changes, restoring the status quo ante. I apologize for the extra spam; it was an experiment to see whether some tuning of the system would help the situation; unfortunately I don't know enough about exim to figure out how to increase the number of simultaneous connections it will accept. I *do* know that the mail log is full of dropped connections, constantly, and it would be nice if whoever committed us to exim would look at them.
Mike, could you set up some simple script that would let people use spamassassin? I have never ever had a false positive (I only require three points to dump suspected mail) but still sometimes get false negatives. Spamassassin with 3 points was getting at least 3/4 of my spam. (I added some more filters on top of it). Unfortunately it puts some large files into a ./.spamassassin directory but they could be deleted automatically at login.
Re #278 - so my message to me here was killed by a Grex spam filter? From a UM server address? Why would any of those be in a spam blacklist?
In case you are unaware, some ISPs, etc, filter HARD on umich.edu. I know from experience because several times in the last year or so I've had that problem with another ISP I use blocking the umich mail. Their reasons were quite understandable. Apparently, a lot of spam or other problems involve umich.edu addresses. I was once told it had to do with all the freebies the students download using their umich accounts (presumably a reference to the hidden "zombieware" some freebies contain) though I don't know the details. In any case, if you haven't learned already, be warned now: a umich.edu address is apt to be filtered by any number of ISPs for any number of reasons. I'd suggest an alternate address for time-critical communications.
I cannot send mail from grex or even postpone it.
Sdf.lonestar.org is also inaccessible again. When I try to telnet it gets stuck - how do I exit from the attempt (in DOS, Ctrl-C does not work)?
I've never used telnet under DOS, but in UNIX if you enter Control-] you should get a prompt. Enter q and press enter and you should quit. If you don't get the prompt then I have no idea.
Thanks. First time today sdf at freeshell was 'down', an hour later it had been up for over 3 days. ???
I'm not able to get any mail at all into Grex.
Ctrl-] works when I am ssh'ed to grex, but what I need is a way to end a telnet attempt FROM grex to freeshell/lonestar.
re #280:
> so my message to me here was killed by a Grex spam filter?
I have no way of knowing that, but probably not. Delaying messages from sites that are believed to be spam sites potentially affects delivery of messages from all sites.
Let's say that Grex is configured to support N simultaneous mail connections at any given time. Now imagine a whole bunch of sites that are listed in these RBLs connect and attempt to deliver messages. Because they're listed in the RBLs their connections are intentionally delayed to slow down spam delivery. What happens if N of these sites are being kept waiting while your non-blacklisted mail site attempts to make the N+1th connection? If I understand the system properly, your connection, the N+1th, is rejected because the mail server is busy and it's assumed that the host trying to deliver it will reconnect later when the Grex server isn't busy.
But from what I see in the log files I think we're being more or less constantly bombarded with connections from other hosts and there's never a time when Grex's server is not busy and dropping connections from other hosts that want to connect. I really hope I'm misunderstanding something fundamental here, but whether I am or not, something is clearly very wrong with mail delivery.
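The N-slots argument above can be played out with a toy model. This is only a sketch of the reasoning, not a model of Exim itself: the slot count, arrival rates and one-second service time are all invented numbers, and the only thing varied is the delay imposed on RBL-listed hosts.

```python
def simulate(slots, delay, seconds, listed_per_sec, clean_every):
    """Return (accepted, rejected) counts for hosts that are NOT RBL-listed.

    slots          -- N, the maximum simultaneous SMTP connections (assumed)
    delay          -- seconds an RBL-listed host is deliberately kept waiting
    listed_per_sec -- RBL-listed hosts connecting each second (assumed)
    clean_every    -- one legitimate host connects every this many seconds
    """
    busy_until = []  # release time of every occupied slot
    accepted = rejected = 0
    for now in range(seconds):
        # Free the slots whose connections have finished
        busy_until = [t for t in busy_until if t > now]
        # Listed hosts hold a slot for the delay plus one second of real work
        for _ in range(listed_per_sec):
            if len(busy_until) < slots:
                busy_until.append(now + delay + 1)
        # A legitimate host needs a slot for one second
        if now % clean_every == 0:
            if len(busy_until) < slots:
                busy_until.append(now + 1)
                accepted += 1
            else:
                rejected += 1
    return accepted, rejected


# Same traffic, two delay settings (all figures hypothetical)
print(simulate(slots=20, delay=30, seconds=600, listed_per_sec=2, clean_every=5))
print(simulate(slots=20, delay=1, seconds=600, listed_per_sec=2, clean_every=5))
```

With these made-up figures the 30 second delay fills every slot with waiting listed hosts within about ten seconds, after which nearly every legitimate connection is turned away, while the 1 second delay leaves plenty of free slots. Real traffic is burstier, so treat this as an illustration of the mechanism only.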
Grex is not accepting mail for the last day or two, and yesterday (and probably today) was not sending mail either. What is the problem and is anyone working on it? I have a couple of craigslist ads listing my grex address, because freeshell was broken at the time.
fastmail.fm offers 10MB webmail without ads. Login at least 8 characters.
When are we going to get e-mail back on Grex? Or is now the time to "jump ship" from using Grex for e-mail?
Some of us "jumped ship" a long time ago.
As much as I hate to say this, I don't think that grex currently has the staff needed to maintain email up to the standards we all would like. This is especially true since there are so many excellent free email services out there (like gmail).
I tried to offer a gmail invitation to Grace. They supposedly sent her email containing the invitation. But since mail to Grex isn't going through, she never got it. AFAICS Grex is just plain not accepting mail at present. Always that same too-many-SMTP-connections message.
Last I knew grex was not sending outgoing email either. I don't know any other place besides freeshell to get a shell account that will let us use non webmail with mail, mutt, or pine, and set up spamassassin and procmail. Is anyone working on this problem?
Rane in resp:291 :: It was probably time to leave Grex's email service about a year ago. I'm still doing too much mail here, unsuccessfully, too.
Dave in resp:294 on Gmail invites: Have Grace get a hotmail account, then send the Gmail invite to the hotmail account?
Sindi in resp:295 :: There are probably good reasons there are very few public shell accounts with e-mail any more. Email has become a very difficult and hostile environment. There is little reason to expect volunteers to work like beavers to give you reliable 1985-style e-mail any more.
Here's the deal. I'm out of town, on the first vacation I've had in quite a while. As much as it annoys me (I conduct most of my personal e-mail through Grex and there's no telling what I'm missing, same as many of the rest of you..) I'm not planning on spending my vacation learning how to administer exim properly and fixing Grex's e-mail. Unfortunately nobody else from staff seems to be responding to, or possibly even reading, this conference's posts. When I get back I'm willing to take a shot at getting a mail configuration working on Grex, but if I do it I'd prefer to use postfix, a mailer I'm more familiar with. Also, there may be a period when mail doesn't work at all while things are swapped over. If people can live with that I'll give it a try when I get home.
Well, I'm at least reading this item. Don't know that I'll have time to work on exim either. (Mail server configuration is in general something I haven't had a lot of experience with.) As I recall the history, we're using exim because a staff member at the time we were transitioning to openbsd was intimately familiar with it from work and volunteered to set it up. Unfortunately, for reasons that I think were beyond his control, he's no longer an active staff member, so we no longer have a resident exim expert. I think that if you're willing to work on mail configuration and nobody else is, the mail software we use should be your call, so I would support switching to something you'd be more comfortable with.
Thanks Mike and John. Perhaps I should start an agora item in which we can post messages for other grexers. In the meantime I won't use the grex email address for craigslist postings.
what on earth would sindii post on craigslist ? does she write about GreX twits in the rants & raves section ?
This response has been erased.
The number of simultaneous connections is governed by the 'smtp_accept_max' variable. Unfortunately the Exim documentation doesn't mention what the default setting of this variable is.
A couple other suggestions to streamline the process:
- Currently you wait up to 30 seconds for an ident response. This happens during the connect phase. I'd shorten that. Set "rfc1413_query_timeout" to something in the 10 second range, if you absolutely can't live without it. I find ident isn't very useful these days, so I tend to disable that lookup entirely.
- You have 'deliver_queue_load_max' set to 1.0, which means you aren't doing any deliveries when the load average is above that value. I might bump that up to 2.0, since Grex seems to run pretty high load averages at times. This may be part of the reason you aren't getting much mail through. This is a tricky one; you're potentially trading off system responsiveness for getting more mail through.
If you've got more people who are familiar with postfix, switching might be a good idea. MTAs are funny that way; they're complex programs and people who know one generally find any others to be incomprehensible. For example, I know Exim reasonably well, but I'd be lost with Postfix, and I can just barely get Sendmail going.
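To make those suggestions easier to check against a config file, here is a rough Python sketch that reads simple "name = value" lines and flags the three settings discussed above. The sample text is a hypothetical excerpt whose values mirror what the post describes, not Grex's actual exim.conf; the ident option is spelled rfc1413_query_timeout as in the Exim specification, and the 10 second and 2.0 thresholds are just the figures suggested in the post.

```python
import re

UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_time(value: str) -> int:
    """Convert an Exim-style interval such as '30s' or '1h30m' to seconds."""
    parts = re.findall(r"(\d+)([smhdw])", value)
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"not a time interval: {value!r}")
    return sum(int(n) * UNITS[u] for n, u in parts)


def read_options(text: str) -> dict:
    """Collect 'name = value' lines from the main section of the config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("begin "):
            break  # later sections (ACLs, routers, ...) are not simple options
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[name.strip()] = value.strip()
    return options


def review(options: dict) -> list:
    """Return notes for the settings mentioned in the post."""
    notes = []
    if "smtp_accept_max" not in options:
        notes.append("smtp_accept_max is not set, so the built-in default applies")
    ident = options.get("rfc1413_query_timeout")
    if ident is not None and parse_time(ident) > 10:
        notes.append(f"ident timeout is {ident}; the post suggests about 10s or less")
    load = options.get("deliver_queue_load_max")
    if load is not None and float(load) < 2.0:
        notes.append(f"deliver_queue_load_max is {load}; the post suggests 2.0")
    return notes


# Hypothetical excerpt, for illustration only
SAMPLE = """
# main configuration settings
rfc1413_query_timeout = 30s
deliver_queue_load_max = 1.0
"""

for note in review(read_options(SAMPLE)):
    print(note)
```

The parser deliberately ignores macros, continuation lines and everything after the first "begin" line, so it is only good for a quick sanity check.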
I'm familiar with exchange, but not out of choice.
I could never get mail from earthlink/mindspring, it always timed out (they did not wait long enough). Is that related to the previous response?
re #302: I tried bumping up the load average limits for queueing and delivering mail to 2.5 or 3 or something like that before I also tried cutting out several of the RBL checks. It didn't appear to make any significant improvement in delivery or in the number of messages dropped. If you're knowledgeable about exim configs you could have a look at the configs, suggest specific changes (maybe edit a copy, I'll look at the diffs, and apply them) if you have the time..
Re 297 (way back): Mike, I certainly didn't mean to be dumping on you. My experience over the years has been that you're willing to help people whenever you can, & I'm confident that you approach staff stuff the same way. I know I wasn't the only one complaining; but I for one wasn't pointing any fingers. Meanwhile, it sure looks like someone did *something*. The dozen queued-up messages that had been trying to go to Grex for several days now seem to have suddenly vanished from the queue - & as I haven't gotten any bounce messages, I think they must have gone through. But a new message I just sent seems to be hanging there. Dunno.
I just tried to send myself a mail with pine. I did not get the usual error messages but it took a long time to go from 0% to 100% sent and then got stuck at 100%. Ctrl-C exited pine and told me I had sent the mail (after I waited a few minutes for the prompt) but the mail has not arrived.
This response has been erased.
finger root -- Charlie Root, mail last read June 12, logged in from msu.edu, running sh which is using 14% of CPU, causing load average to be around 8. Amazing that I can still see what I am typing. sh is a shell.
sh is now up to 16% of CPU time. It was 13% when I first looked. Another hole to be plugged? 166 processes, of which 1 is running and the rest are stopped.
Make that the rest are idle or zombie, 1 on processor, and sh is 15%.
Charlie Root has been running ssh since approximately 5:47. jp2 has been logged on since 5:38 and is running 'j p 2'. The rest of us are running bbs, party, lynx, bash, and the like.
This response has been erased.
True, it's much better to log in as someone else and then su to root. OpenBSD doesn't nag about it just to be contrary. ;)
I just got my first mail sent to grex in about a week, and was able to answer it (I hope) in under 30 sec, but I still never got the mail I sent myself earlier. Thanks if someone fixed something.
Load average is back down from 9 to .4. How would we mortals figure out what is causing the load average to go up so we can report it to staff?
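One way an ordinary user might answer that question is to grab the load average and a process listing and sort the listing by CPU. The sketch below only parses text in a "%CPU USER COMMAND" layout; the exact ps options that produce that layout differ between systems, so the command line is left to the reader, and the sample rows are hypothetical apart from the 14% sh figure mentioned earlier in the item.

```python
import os


def top_cpu(ps_text: str, n: int = 5) -> list:
    """Return the n busiest (cpu, user, command) rows from ps-style text.

    Expects a header line followed by rows of '%CPU USER COMMAND'.
    """
    rows = []
    for line in ps_text.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        try:
            cpu = float(fields[0])
        except ValueError:
            continue  # not a data row
        rows.append((cpu, fields[1], fields[2].strip()))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:n]


# Hypothetical listing, for illustration only
SAMPLE = "%CPU USER COMMAND\n 0.0 keesan lynx\n14.0 root sh\n 0.5 jp2 bbs\n"

print("load averages (1, 5, 15 min):", os.getloadavg())
for cpu, user, command in top_cpu(SAMPLE, 3):
    print(f"{cpu:5.1f}%  {user:<8} {command}")
```

The top few rows plus the three load figures are the sort of thing worth pasting into a report to staff.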
One of the things which is causing mail failures, I believe, (but not the only one) is that periodically /var/spool runs out of free inodes. It runs out of free inodes because there's a totally absurd number of files in several subdirectories of /var/spool/exim which never, ever seem to get cleared out. Can someone who's familiar with exim tell me what purpose the various subdirectories of /var/spool/exim serve and how files in there are supposed to be purged? Because we don't seem to be doing it properly..
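A quick way to see how close a filesystem is to running out of inodes, and which spool subdirectory holds the files, is sketched below in Python. It uses only os.statvfs and os.walk; the function names are made up, it needs read permission on the directories it walks, and the /var/spool/exim path in the usage note is simply the one named in the post.

```python
import os


def inode_usage(path: str) -> tuple:
    """Return (total, free, percent_used) inodes for the filesystem holding path."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    # Some filesystems report zero total inodes; avoid dividing by zero
    used_pct = 100.0 * (total - free) / total if total else 0.0
    return total, free, used_pct


def files_per_subdir(root: str) -> dict:
    """Count the files found anywhere beneath each subdirectory of root."""
    counts = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            counts[entry.name] = sum(
                len(files) for _, _, files in os.walk(entry.path)
            )
    return counts
```

Run as a user who can read the spool, inode_usage("/var/spool") and files_per_subdir("/var/spool/exim") would show how full the filesystem is and which subdirectory accounts for most of the files.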
I am still getting mail from June 8 - better late than never. Thanks again STeve et al.
This response has been erased.
I was able to reply to a few June 7 mails that just showed up but now I am getting an SMTP greeting failure when I try to send mail. We picked the wrong week to try to sell a car. Turns out lots of people were interested a week ago.
Today I have been getting mail progressively from June 9, June 8, June 7 and now June 6. I wondered why people had stopped writing. Thanks again.
In among the June 7 I just got a May 23 mail! Rane was saying it took up to 12 days to get mail at grex, but this is about 21 days. How does the mail manage to get so bogged down? Can exim be instructed to deliver the oldest mails first instead of the newest ones?
There are some old pieces of mail in the queue which I think will now get liberated. Exim is likely going to get a little more tweaking before this is all over. But mails are moving fairly quickly now. It took about 1.5 minutes for mail to get from msu.edu to grex, and faster than that the other way around.
Wow! The e-mail dam has burst. But, thanks!
Re resp:317: I don't have the permissions to see what's in /var/spool/exim here, but on my own machine I see four directories.
db/ - This only has four files in it, on my system. I think it contains Exim's retry database.
input/ - This holds the message queue. If I remember right there are two files per message, one containing just the headers and one containing the message body. If you've got a lot of crap in there, you need to figure out why messages are stacking up in the queue. A common culprit is undeliverable bounce messages, which Exim "freezes" and keeps in the queue until the time set in the "ignore_bounce_errors_after" option is reached. You want this time *short* because undeliverable bounces are almost always the worthless backscatter from spam runs. Right now Grex has this set to 2 days, which I'd say is on the long side.
msglog/ - This holds log information about messages that are in transit. The files here (one per message) are normally deleted once the message is delivered, although sometimes they get missed and linger. Again, if this is full, it might be because you have a long queue. This information is duplicated in the main log, so it's safe to say "message_logs=false" in the beginning part of exim.conf and delete the contents of this directory. This should help the inode problem.
scan/ - This is a temporary directory where messages are unpacked while they're scanned by external software like ClamAV or SpamAssassin. I don't think Grex is running either of those programs, so there shouldn't be anything in there.
The msglog directory has tens, or possibly hundreds, of thousands of files, some of them dating back to 2004. I presume those aren't from messages Grex is still attempting to deliver..
Remember that Grex crashed a lot, for a while. Most likely Exim never got to delete those files due to a crash, or perhaps there's a bug that causes them to occasionally escape deletion. Like I said, they're safe to delete, and if you set "message_logs=false" you shouldn't have to deal with them anymore. They're really only useful for troubleshooting delivery problems, and the same info can be gleaned from the other logs with a bit of effort.
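Before deleting anything, it may be worth listing which message logs are actually stale. The Python sketch below is a dry run only: it yields files whose modification time is older than a cutoff and removes nothing. The 30 day cutoff and the msglog path in the usage note are assumptions for illustration, not a recommendation from the Exim documentation.

```python
import os
import time


def stale_files(directory: str, max_age_days: float, now: float = None):
    """Yield paths of files under directory not modified for max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _, filenames in os.walk(directory):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime < cutoff:
                    yield path
            except FileNotFoundError:
                # The mailer may remove a log while the scan is running
                continue
```

Something like sum(1 for _ in stale_files("/var/spool/exim/msglog", 30)) would give a count to compare against the length of the live queue before anyone cleans the directory out.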
This response has been erased.
I'm not arguing with that. I've got four old message log files on my own small server, and I have no idea why. Frankly, I think the whole individual-message-log thing is a misfeature that should be removed, or turned off by default. UNIX software in general doesn't seem to clean up after itself very well. If it did there wouldn't be a need for periodic /tmp cleaning scripts to get rid of old, stale lockfiles and the like.
its officially summer
I could not dial either number just now and connect - they both just rang.
what ! m-net seems to be down :(
:( plz fix