|
|
| Author |
Message |
| 25 new of 547 responses total. |
scg
|
|
response 412 of 547:
|
Jun 6 07:07 UTC 2003 |
I should note that one of the places I receive mail through runs SpamAssassin,
and most of the spam I get scores considerably lower on the SpamAssassin
scores than most of my legitimate mail. Of course, I never see the mail
SpamAssassin does catch, and that's probably a considerable quantity of mail.
Rejecting spam in the MTA is obnoxious. If you're receiving the spam directly
from the sender, it probably makes a lot of sense. In general, though, most
of the spam I receive is forwarded through other lists and aliases, and the
return addresses are generally invalid, so bouncing spam just forwards it to
the postmaster of whatever mail server was forwarding it, disguised as bounce
messages of a sort the postmaster might actually need to look at. Spam
filters should throw the spam away silently, with the caveat that you have
to be really careful to err on the side of assuming mail is legitimate, since
the sender of legitimate mail won't know the mail isn't getting through. In
addition, if you're tagging spam in SpamAssassin and letting procmail do the
discarding, you give individual users some control over how much is being
deleted.
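
To make the tag-then-discard split concrete, here is a minimal sketch in Python. It is an illustration only, not anything Grex or procmail actually runs. It assumes the tagger has already added an X-Spam-Status header carrying either "score=" or "hits=" (the field name has varied between SpamAssassin releases), and the per-user `discard_at` threshold is a hypothetical setting.

```python
import re
from email import message_from_string

# Accept both "score=" and "hits=" as the name of the score field.
_SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")

def spam_score(raw_message):
    """Return the score from X-Spam-Status, or None if the mail is untagged."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status")
    if status is None:
        return None
    match = _SCORE_RE.search(status)
    return float(match.group(1)) if match else None

def disposition(raw_message, discard_at):
    """Decide what one user's delivery filter does with a tagged message.

    discard_at is that user's own threshold (hypothetical setting);
    None means the user never wants mail discarded on their behalf.
    """
    score = spam_score(raw_message)
    if score is None or discard_at is None:
        return "deliver"  # err on the side of assuming mail is legitimate
    return "discard" if score >= discard_at else "deliver"
```

The tagging happens once for everybody, while the decision to throw anything away stays with each user's own threshold.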
My complaint about SpamAssassin letting through too much spam doesn't mean
it's any worse than Grex's current filtering. I started automatically
discarding all my Grex mail because more spam was coming through than I could
manage.
|
cross
|
|
response 413 of 547:
|
Jun 6 07:33 UTC 2003 |
A good amount of the spam I receive these days comes through grex.
I don't bother forwarding it back to grex's uce alias since most of it
is the standard type of junk one typically sees. That is, I doubt anyone
is going to learn anything from it that isn't already common knowledge.
I will note that spamassassin now includes a Bayesian filter that can
be `trained' to recognize real spam. Running it and letting it do the
tagging can't hurt much.
There might be something to be said about generating bounces to
postmasters who are running open mail relays, but I tend to doubt it.
|
mdw
|
|
response 414 of 547:
|
Jun 6 07:51 UTC 2003 |
It's extremely difficult to collect good statistics about the
effectiveness of many anti-spam measures. Spammers tend not to complain
when they can't send spam, so we don't know how many gave up. Natural
selection has clearly favored the evolution of spammers who get by
grex's defenses.
The problem with Bayesian filters is that they have to be trained on an
individual basis to give the good results people quote. It's not enough
to give one spam, it also has to be given ham, and each person's spam
and ham are different enough that a filter trained to one person's ham
isn't going to do so well at another's ham. A group of people with
common interests may get acceptable results even so; and in a work
environment defining "common" may even be possible. But I doubt there's
enough commonality amongst all grex users to achieve results of any
great value.
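
A toy illustration of the per-user training point, in Python. This is a bare word-count naive Bayes classifier with add-one smoothing, not the algorithm any real filter uses, and the tiny corpus (user A as a sysadmin, user B as a pharmacist) is entirely made up.

```python
import math
from collections import Counter

class TinyBayes:
    """Toy word-count naive Bayes filter with add-one smoothing."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, label, text):
        self.words[label].update(text.lower().split())
        self.docs[label] += 1

    def _log_score(self, label, tokens):
        counts = self.words[label]
        total = sum(counts.values())
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + vocab))
        return score

    def is_spam(self, text):
        tokens = text.lower().split()
        return self._log_score("spam", tokens) > self._log_score("ham", tokens)

# Made-up corpus: both users see the same spam but have different ham.
SPAM = ["buy cheap pills now", "cheap pills online now"]
HAM_A = ["staff meeting about disk upgrade", "disk upgrade notes for staff"]
HAM_B = ["pharmacy order pills received", "pills order shipped to pharmacy"]

def build(ham_texts):
    """Train a fresh filter on the shared spam plus one user's ham."""
    f = TinyBayes()
    for text in SPAM:
        f.train("spam", text)
    for text in ham_texts:
        f.train("ham", text)
    return f

filter_a = build(HAM_A)
filter_b = build(HAM_B)
message_for_b = "pharmacy order pills shipped"
print(filter_a.is_spam(message_for_b))  # A's filter judging B's legitimate mail
print(filter_b.is_spam(message_for_b))  # B's own filter judging the same mail
```

With this corpus the filter trained on A's ham flags B's legitimate message (the only word it recognizes is "pills", which it has seen only in spam), while the filter trained on B's own ham lets it through.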
|
janc
|
|
response 415 of 547:
|
Jun 6 13:14 UTC 2003 |
Case in point: I'd be perfectly happy with a spam filter that rejects all
mail written in any languages other than English and German. If I trained
a Bayesian filter, that's probably part of what it would do. However, there
are lots of other users on Grex who for some strange reason like foreign
language mail. A suitable Bayesian filter for them would be very different
than one for me.
|
gull
|
|
response 416 of 547:
|
Jun 6 13:28 UTC 2003 |
Re #412: If a postmaster's site is acting as a spam relay, they deserve
to be annoyed.
Re #414: Where I work we're currently using a Bayesian filter with a
site-wide corpus, with pretty good results. But this is a small
company, with about 30 employees that tend to make similar decisions
about what is and isn't spam. (The majority of our spam is, for some
reason, for porn sites and penis enlargement products, which no one
admits to wanting in their work accounts.) We also don't bounce
messages based on the filter, just tag them for later filtering with
each user's mail client. I seriously doubt this approach would work for
Grex, because the user base here is too diverse.
To me, it's far, far more important that legitimate mail to my Grex
account not be rejected than that spam be blocked. For that reason I'd
oppose any heavy-handed spam filtering unless it was configurable on an
individual user basis. I'd also oppose any spam filtering that silently
dropped messages instead of bouncing them, because it's far better if a
legitimate message bounces than if it just disappears. People are
conditioned to assume that if an email doesn't bounce back, it arrived
intact.
|
cross
|
|
response 417 of 547:
|
Jun 6 14:31 UTC 2003 |
A global installation of spamassassin can use each individual user's
preferences. That is, it can be run by default and use the target user's
data; there doesn't have to be a system-wide immutable setting. Also,
spamassassin doesn't automatically delete spam, it just tags it and
lets the user decide what to do with it. I dump mine into a special
MH folder that I scan once or twice a day to pull out any false positives.
|
carson
|
|
response 418 of 547:
|
Jun 6 17:06 UTC 2003 |
(when installing spamassassin on a system, it's a good idea to run it in
a "learning" mode before using it to block mail. because it works using
Bayesian filters, it needs time to A) learn what to block and B) prove
that it's not dumping legitimate mail at an unacceptable rate.)
(Marcus wrote somewhere [I don't remember where off-hand] recently
indicating that, if Grex were to use an open relay blocking list, it would
occasionally reject mail to itself because Grex sometimes appears on RBL
lists. I'm not sure why he came to that conclusion. while I've
occasionally seen Grex on DNS blacklists, I've yet to see it on an open
relay list. at any rate, it seems trivial [to me, anyway] that, if Grex
were to use outside blacklists, it would be on its own local whitelist.)
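
The whitelist-before-blacklist ordering could look roughly like this in Python. It is only a sketch: the zone name `dnsbl.example.org` is a placeholder, and `is_listed` is a hypothetical caller-supplied function standing in for the real DNS lookup. The one standard piece is how the query name is formed, by reversing the IPv4 octets and appending the list's zone.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name a blocklist lookup queries: reversed octets + zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def should_reject(ip, local_whitelist, is_listed, zone="dnsbl.example.org"):
    """Consult the local whitelist first, and only then the outside list.

    is_listed is a hypothetical callable that does the actual DNS query
    and returns True when the name resolves (meaning the host is listed).
    """
    if ip in local_whitelist:
        return False  # our own hosts are never rejected, whatever the list says
    return is_listed(dnsbl_query_name(ip, zone))
```

Because the whitelist check comes first, a system that turns up on an outside list never gets around to rejecting its own mail.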
(Scott Vintinner wrote a document on how to set up an anti-spam gateway
using a combination of OpenBSD, Postfix, Amavisd-new, SpamAssassin, Vipul's
Razor, and DCC. it's at http://lawmonkey.org/anti-spam.html . he makes
some choices in implementation that we likely would not. still, it
provides a start, and there are some useful suggestions in the document
that are valid in and of themselves. the one question I'd have about the
gateway set-up is "how resource-intensive is it?", but I keep reminding
myself that NextGrex is more powerful than the current model and will
likely surprise us with what it can and can't handle.)
|
scg
|
|
response 419 of 547:
|
Jun 6 21:26 UTC 2003 |
re 416:
There's a rather large difference between an open relay forwarding spam
in random directions, and a well secured mail server handling mailing lists
or .forward files.
As a case in point, I used to host the Grex staff mailing lists on my mail
server. That is, people would send mail to the staff@grex.org address or some
other less publicized aliases, and it would be forwarded to my mail server,
which would then forward it on to the individual staffers on mail servers all
over the place. Then a couple of staff members started using spam filtering
that rejected spam in MTA, thus sending the spam back to the postmaster of
the mail server that was trying to deliver it to them, in this case me. One
staffer had this imposed on him by his mail provider, and didn't have much
of a choice in it, while another staff member had configured it himself and
refused to fix it. The result was that running a mailing list that sent mail
to those people was more trouble than it was worth.
|
dang
|
|
response 420 of 547:
|
Jun 10 18:19 UTC 2003 |
My provider decided a while ago to start bouncing mail with
SpamAssassin. For whatever reason, this was catching a lot of mail that
I wanted, and people were complaining. I left them and went to a site
that didn't block spam, as I run my own spam filters (a combination of
spamassassin with custom rules and bogofilter). Just a datapoint.
|
i
|
|
response 421 of 547:
|
Jun 21 13:13 UTC 2003 |
There was a decent item on SlashDot yesterday about noting the tuple of
sender, recipient, & IP address at the start of an SMTP session and
putting the e-mail off an hour (with "try again later" or some such) if
the tuple isn't in a kept-on-the-side database of tuples seen more than
one hour but less than 36 days ago.
This does a passable job of blocking most try-once-quickly-and-move-on
spam, but little try-again-per-the-RFC real e-mail (according to the
author). The 1 hour and 36 days were adjustable parameters; there were
other details & some real-use-experience statistics.
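
A minimal Python sketch of the tuple rule as described above. This is not the code from the article being summarized: the class and method names are made up, the table is an in-memory dict rather than a real database, and "tempfail" stands for whatever temporary 4xx "try again later" reply the mail server would give.

```python
HOUR = 3600
DAY = 24 * HOUR

class Greylist:
    """Accept only tuples first seen more than min_age but less than max_age ago."""

    def __init__(self, min_age=1 * HOUR, max_age=36 * DAY):
        self.min_age = min_age
        self.max_age = max_age
        self.first_seen = {}  # (sender, recipient, ip) -> first-seen timestamp

    def check(self, sender, recipient, ip, now):
        """Return "accept" or "tempfail" for one delivery attempt at time now."""
        key = (sender, recipient, ip)
        seen = self.first_seen.get(key)
        if seen is None or now - seen >= self.max_age:
            self.first_seen[key] = now  # new or expired tuple: start the clock
            return "tempfail"
        if now - seen > self.min_age:
            return "accept"  # the sender waited and retried
        return "tempfail"    # retried too soon; keep putting it off
```

A sender that tries once and moves on never gets past the first "tempfail", while a server that queues and retries gets through on its first attempt after the hour is up. This sketch never refreshes a tuple once it has been accepted, which a production version would probably want to do.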
This sounds like it has a number of features we're looking for in an
anti-spam tool for Grex...any thoughts?
|
other
|
|
response 422 of 547:
|
Jun 21 17:05 UTC 2003 |
Whatever will help stem the flow...
|
gull
|
|
response 423 of 547:
|
Jun 23 14:17 UTC 2003 |
It would tend to increase the network load. (Each valid message that
wasn't high-traffic would require two connections and attempts instead of
one.) Is the majority of spam really "try once and move on?" I imagine
that's probably true of direct-to-MX spam, but probably not true of open
relay spam.
|
keesan
|
|
response 424 of 547:
|
Jun 23 21:06 UTC 2003 |
I get the same spams numerous times.
|
cross
|
|
response 425 of 547:
|
Jun 23 21:33 UTC 2003 |
Perhaps, but it's not clear to me that the load on the network link
is excessive right now. It's thought to be, but no one's ever measured
it. I'd think the latency in getting email would be more troublesome.
Spam isn't a problem that's solved by adding arbitrary delays into the
mix.
|
i
|
|
response 426 of 547:
|
Jun 24 00:39 UTC 2003 |
We're close to having this ready for production use at work. Very early
results suggest that most spam doesn't come back to retry delivery later.
|
russ
|
|
response 427 of 547:
|
Jun 24 03:49 UTC 2003 |
It might be easier to deal with spam by accepting the first N pieces
of mail from a novel host, then delaying the rest; if the mail starts
showing up in the UCE bin, further mail from that host is refused for
an extended period (or perhaps permanently). This takes care of
hijacked relays as well, while passing the occasional e-mail from an
odd host without any delays at all.
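
One way to read that proposal, sketched in Python. The names, the value of N, and the one-report threshold are all assumptions, and the "extended period" is simplified here to a permanent refusal.

```python
from collections import Counter

class HostReputation:
    """Accept the first N messages from a new host, delay the rest,
    and refuse hosts whose mail has shown up in the UCE bin."""

    def __init__(self, free_messages=5, max_uce_reports=1):
        self.free_messages = free_messages      # the "N" in the proposal
        self.max_uce_reports = max_uce_reports  # hypothetical threshold
        self.accepted = Counter()               # host -> messages accepted so far
        self.uce_reports = Counter()            # host -> messages seen in UCE bin

    def report_uce(self, host):
        """Record that mail from this host was filed in the UCE bin."""
        self.uce_reports[host] += 1

    def decide(self, host):
        """Return "accept", "delay", or "refuse" for a new message."""
        if self.uce_reports[host] >= self.max_uce_reports:
            return "refuse"
        if self.accepted[host] < self.free_messages:
            self.accepted[host] += 1
            return "accept"
        return "delay"
```

The occasional message from an odd host is accepted immediately, and only a host that keeps sending gets delayed or, once it is reported, refused.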
While spam may not be a big bandwidth load, it sure isn't going to
stay small unless we act; it really behooves us to frustrate spammers
if we can.
Jan, is there any way to get Backtalk to salt its pages with fake
e-mail addresses that would be picked up by spam harvesters? That
would be one way to be certain that a host was sending spam to Grex.
|
gull
|
|
response 428 of 547:
|
Jun 24 13:11 UTC 2003 |
All of these delaying tactics sound like they have the potential to
cause problems for people who are on legitimate mailing lists.
|
janc
|
|
response 429 of 547:
|
Jun 24 15:34 UTC 2003 |
I think it's probable that email addresses are being harvested from Grex, but
I'm not sure of the extent of the problem. Grex backtalk has a robots.txt file
that requests honest robots not to harvest it, which is why google searches
don't find Grex items (somewhat mixed blessing). Obviously dishonest spammers
are likely to disregard this, and I have seen indications of robots walking
through Backtalk on Grex. In most Backtalk interfaces, clicking on the user
name will give user bio page - but on Grex that is just the .plan file, and
won't contain any clickable email addresses, which are probably the spammer's
favorite thing to harvest (on other systems Backtalk does have clickable email
links, an issue that I need to address). Some people will have their
email addresses in their .plan files, but most probably don't or have email
addresses for systems other than Grex, so spam to addresses harvested from
there would go to non-Grex addresses and be hard to recognize.
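
For anyone unfamiliar with how a robots exclusion file is honored, here is a small Python sketch using the standard library's parser. The Disallow path and the host name are placeholders, not Grex's actual file or URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion file; the path is a guess for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/backtalk/
"""

def may_fetch(url, agent="ExampleBot"):
    """Return True if a well-behaved robot would consider itself allowed."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)
```

The check is made entirely by the robot itself, which is why it only stops the honest ones.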
A spammer might be smart enough to go through the Backtalk pages, pick up login
names and attach "@grex.org" to the end of each. But I'd think this would be
uncommon.
I'm not really inclined to think that seeding a lot of bad addresses is
going to help enough to be worth the ugliness. However, you are certainly
welcome to include <A HREF=mailto:uce@grex.org>send spam here</A> links
in all your HTML postings on Grex.
|
scg
|
|
response 430 of 547:
|
Jun 27 22:40 UTC 2003 |
I don't doubt that delaying mail acceptance for an hour would be effective
against the current generation of spammers, but my general impression of
techniques for blocking spam by insisting on standards compliance is that
spammers are getting better and better at following standards. That strikes
me as something which, if done commonly, would put a lot of extra load on
legitimate mail servers, and would break the ability of e-mail to be used as
a fairly instantaneous back and forth communications tool.
|
i
|
|
response 431 of 547:
|
Jun 28 00:22 UTC 2003 |
More results from using this at work - over 75% of spammers do not come
back within 24 hours. No evidence that any legit e-mail has been lost.
Looking through the tuple database, substantial spam attacks are *really*
obvious...suggesting automated means to keep 'em locked out after their
tuples age out of probation...i think we're using 5 minutes for this now.
Yes, like any anti-spam technology, this has downsides and costs for both
the infrastructure & users. But *not* using any anti-spam technology
also has downsides & costs - like overflowing "in" boxes of left-'cause-
there-was-too-much-spam ex-users.
|
gull
|
|
response 432 of 547:
|
Jun 30 15:31 UTC 2003 |
Walter, are you using some kind of pre-made package to do this or did
you roll your own?
|
i
|
|
response 433 of 547:
|
Jul 2 01:08 UTC 2003 |
We rolled our own (adding a few lines of C to our mail server software
with MySQL on the back end). Graylisting is hardly more complex than
a bubblesort routine, and it took fewer lines of code than any bubblesort
that i can recall.
We started it in "observe, record, & report what you'd do" mode. We've
added more simple features (whitelist, etc.) to it as they've occurred to
us.
It looks like spammers are far more impatient on retrying than legit mail
servers - we're hoping to add the rule "if graylisted e-mail comes back
before the graylist time expires, then add a minute to the interval it
came back within, compare that sum to the remaining graylist interval,
and update the graylist interval to the greater of those two". Tweaking
the numbers promises to let almost all legit e-mail avoid additional
delays from this, and hopefully spam will delay itself to death (or blacklist).
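
The proposed retry rule is easier to follow written out. The Python below is one possible reading of it, not the C that was actually written: "the interval it came back within" is taken to mean the time since the previous attempt, and all names and the timestamps-in-seconds convention are assumptions.

```python
def updated_expiry(last_attempt, now, expires_at, penalty=60):
    """Return the new time at which the graylist hold on a tuple ends.

    last_attempt: time of the previous delivery attempt for this tuple
    now:          time of the current attempt
    expires_at:   time at which the current hold was due to end
    penalty:      the extra minute added to the retry interval
    """
    if now >= expires_at:
        return expires_at  # the hold is already over; nothing to extend
    came_back_within = now - last_attempt
    remaining = expires_at - now
    # The new remaining hold is the greater of (retry interval + penalty)
    # and whatever was left of the old hold.
    return now + max(came_back_within + penalty, remaining)
```

Under this reading, a sender hammering away every ten seconds keeps pushing its own expiry further out once the original hold is nearly up, while a server that waits out the hold before retrying is never touched.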
|
jep
|
|
response 434 of 547:
|
Oct 17 16:04 UTC 2003 |
How's the NextGrex project coming along? I haven't seen any updates
here since July. Is the computer still going to be new when this is
completed?
|
cross
|
|
response 435 of 547:
|
Oct 17 17:23 UTC 2003 |
It's stalled; too many staff have too many other things going on. I was
going to propose, and I suppose here is as good a place to do it, that
we move nextgrex from Jan's house to the pumpkin after the new version
of OpenBSD comes out. I think that'll make it easier to test and debug
subsystems, and ultimately easier to transition over from old grex to
new grex. Once we do that, we should set a schedule; a reasonable one,
that we can stick to, trying to anticipate people's time demands, for
switching over within, say, no more than three months.
Major subsystems I see needing configuration before we can switch are:
1) The BBS. Someone has to port PicoSpan (mdw?) or provide
an alternative (YAPP? Frontalk?)
2) Mail. We have to build out the mail system.
3) set up newuser
4) Configuration changes.
That's about it. Everything else, we can do once we've transitioned.
|
remmers
|
|
response 436 of 547:
|
Oct 17 17:30 UTC 2003 |
I'd say that's a fair summary. Only potential drawback that I can see
to locating it in the pumpkin is if the hardware still needs a fair
amount of hands-on attention, it'd be a nuisance for somebody to run
over there all the time to tend to it. If the hardware configuration
is pretty stable now, this isn't an issue.
|