Grex > Coop12 > #12: brainstorming solutions for the full disk problem

25 new of 203 responses total.

tenser
response 98 of 203:
Apr 2 01:17 UTC 2001
Regarding #97; Russ, it depends. Back in the days of yore, I seem
to recall that certain FTP daemons (like WU) would allow you to use
a configuration file to do things like specify regular expressions of
files that you didn't want people uploading or downloading. The problem
with those sorts of schemes is that they don't prevent someone from
sidestepping them by just renaming the file when they transfer it.
Also, they don't stop anyone from using e.g. SSH or email for their transfers.
Then again, if someone is sharp enough to think that they can run an IRC
bot here, they're probably also smart enough to figure out how to get
around it.
:-)
Perhaps one of the grex staffers can comment further. I'd be interested
to know if anyone has studied HOW copies of eggdrop etc get onto the
system. Is it really via FTP most of the time?

gull
response 99 of 203:
Apr 2 02:13 UTC 2001
I'd be in favor of a maximum file size on FTP PUT operations. Many
places now do this. We have a limit on incoming mail attachments, why
not on incoming FTP files as well?

cmcgee
response 100 of 203:
Apr 2 13:46 UTC 2001
I use Grex as my primary email address because:
- I don't have any other ISP.
- I can get to grex from anywhere in the US with a long distance phone call
  for dialin access. For the amount of time I am on email when I'm out of
  town, a long distance call is much cheaper than any other access to the
  'net that I've found.
- I don't need any fancy internet access, and there are fewer technology
  links to fail.
- "cyberspace.org" is the neatest domain name I know. Since the internet was
  opened to everyone, and .edu is no longer high-status, .org reflects my
  values. cyberspace is simply easy to remember.
- All the privacy and filter stuff that other people said.
- The staff is the coolest bunch of people I've met in a long time. They are
  patient with me, and impatient with the nasties that sometimes get through
  to my mailbox. And they respond, fast!
- Checking my email and checking bbs are close to the same thing. I like
  getting "messages" on bbs and staying up with what my friends are doing.

gelinas
response 101 of 203:
Apr 2 14:55 UTC 2001
I chose grex for my children's mail boxes simply because I don't like web
browser-based mail and because I don't want their e-mail on the machine
they happen to be using at the moment. Yes, I have an IP connection, so
we telnet in to grex and so could use just about anything in the world.
But grex has the kind of facilities I want them to learn to use. And it's
affordable.
I know the advantages of POP and IMAP. And their disadvantages. As long
as we all have to share a computer, we'll stick with Pine (and MH for me)
on UNIX boxes.

jared
response 102 of 203:
Apr 2 16:27 UTC 2001
What I've looked at doing (and am in the process of doing) on nether.net
is to have it mount the users' partition from an NFS server that is
cross-connected over 100Mb/s Fast Ethernet to the machine. I see low
utilization on this NFS link, and it allows me to set up a PC w/ 256M RAM
(memory is cheap these days :) and an 80G IDE disk. This means that we
have the users on "cheap" disk with this ~256M buffer between the disk
and everything else.
It works out quite well. The I/O performance isn't great for users who
want to suck down files to disk as fast as possible, but you just set up
rpc.rquotad and you have your quotas, etc. It means you're not buying
an expensive SCSI disk to do mass file storage (which is what any
home directory system is). The disk needn't be fast, just reliable.
Plus you can also do a hot-backup or nightly copy to another 80G
IDE in the box, providing an online backup. That reduces restore times
quite a lot.

mdw
response 103 of 203:
Apr 2 21:34 UTC 2001
We've certainly considered mail-to-user-directories, but haven't done
it. Here are some of the issues:
(1) security - if mail spool files are directly in home directories
there are more security issues to face, and more potential
for bad stuff to happen.
(2) separating mail from home directories means that one thing filling
up doesn't break everything.
(3) we'd like to back up home directories; we don't want to back up mail.
(4) having a separate mail directory facilitates moving into a
distributed environment. One possible direction would
be similar to Umich: mail delivery & processing on
a separate mail machine, home directories in AFS, and
a pool of login machines that share access to the common
resources.
Also, regarding NFS,
(1) NFS is not secure.
(2) mail and NFS don't play well together.

tenser
response 104 of 203:
Apr 3 00:04 UTC 2001
Regarding #103; Marcus, thanks for the informative note. However, as
usual, I have some questions. Regarding your point 1, what security
problems are encountered by moving mail to the user's home directory?
It's been my experience that this generally makes a system more secure
by avoiding all of the problems with /var/{spool/,}mail and friends.
Indeed, qmail has adopted this approach by default in order to *avoid*
security problems.
Point (2) is definitely a problem. I'm not sure I agree with (3),
but mail does suck up a lot of tape feet....
Regarding point (4), I'm not sure how having a separate mail area
facilitates moving to a distributed system, while mail in the home
directory doesn't. Going with AFS opens a pretty hefty can of worms,
not the least of which is licensing (though OpenAFS might help with that).
On the other hand, if mail exists in the user's directory, it just sort
of automagically follows him/her around.
Also, recent versions of NFS *are* pretty secure. Indeed, Solaris 8
includes code to authenticate NFS via Kerberos. This probably puts
it on par with AFS. Neither system, however, does encryption of file
data as far as I am aware.
Also, as long as NFS locking works (which it does in newer versions
of Solaris, though not really well in SunOS), it and mail are reasonably
good playmates.
Then again, Dan Bernstein, author of qmail, has the following to
say about NFS: ``NFS is the most unreliable computing environment
ever invented''.

scott
response 105 of 203:
Apr 3 00:09 UTC 2001
I can answer point (3). We don't want to back up mail for legal/social
reasons, mainly because we don't want to be on the ugly end of a "subpoena
all his/her email activity" demand. Plus inboxes are often the places that
a failed (created the account then disappeared) user fills up most.

scg
response 106 of 203:
Apr 3 00:15 UTC 2001
That's not the reason for not backing up mail. Mail is transient enough that
backing it up every few weeks won't really do anything useful.
This does, I suppose, make a user's old mail harder to subpoena, but that's
a side effect. We've always been pretty clear that Grex is not a system for
use for illegal purposes.

tenser
response 107 of 203:
Apr 3 00:21 UTC 2001
Regarding #105; Thanks Scott, that makes a lot of sense. Hmmm....
Regarding #102; Jared, that model (single 80GB IDE disk) will work,
but it won't scale. In particular, you end up stuffing close to 80
gigs of data onto one single drive, which then gets absolutely
hammered. The seek time alone will kill you if you have more than
a modest load.... It's much better to use a set of disks, and
spread the rotational latency and seek time across a set of spindles.
Also, even if you put multiple ``cheap'' IDE disks in a machine (only
two per controller, though), you end up with another problem: the IDE
controller can't do overlapping seeks (i.e., I can't tell one drive to
seek somewhere and then tell the other drive to seek somewhere while
the first is still seeking), which means that access to the bus is
effectively serialized. At least, this is how it used to work.
Obviously, this will make the solution rather slow as time goes by
(especially on a timesharing system).
The reason for going with SCSI isn't so much that it sounds cool, but
that it does away with these limitations, leading to a much higher
performance I/O subsystem.
Then again, maybe some of these issues have been addressed in more
recent versions of the IDE standard. Somehow, I kind of doubt it,
though. Think of IDE as the SunOS 4 SMP of disk architectures. ;-)

gelinas
response 108 of 203:
Apr 3 01:40 UTC 2001
File permissions on /var/spool are controlled by people who actually
understand unix file permissions. File permissions on user directories
are controlled by people who quite likely can't even define the term, much
less manage them. Dropping mail into a user's directory can be embarrassing
at best.

tenser
response 109 of 203:
Apr 3 02:24 UTC 2001
Regarding #108; Okay, so what exactly prevents me from doing `chmod a+rw'
to my maildrop in /var/spool/mail? What about if I do it, ``because
someone told me to....'' (``But he said he was Grex staff and he was
trying to fix a problem with mail! His login was mbw, that's Marcus
Watts, right?'')
On the other hand, what prevents me, a malicious user, from attempting
all sorts of race condition attacks against /var/spool/mail. Or from
attacking a whole slew of newly setgid or setuid programs, including
many MUA's. Suppose I find a race in temp file creation in a program
which was originally intended to be run as the user, but must now run as
root to create a file in /var/spool/mail. The list of potential problems
with /var/mail and friends goes on, as does the list of problems which
have been realized over the years.
Grex has taken steps to prevent much of this from happening, but I don't
think the measures are perfect, if for no other reason than the staff
must expend large amounts of time caring for the changes anytime they
want to upgrade almost any part of the mail system.
What's more, a very well respected (err, sorta [*]) paranoid security
freak has done a thorough analysis of the problem and determined
that delivering mail into $HOME isn't that bad, and is preferable to
the alternative. Delivery into $HOME is thus the default for qmail.
My own study of the problem over several years leads me to agree with
him. Perhaps you'd like to tell Dan Bernstein that his decision is
``embarrassing at best''?
One point I'd like to make is that discussions of arbitrary user screwups
in setting permissions on directories, files, and maildrops are misguided.
No technical solution can be found to problems that involve inadequate
user education, especially in the area of social engineering. Users who
don't even know who Unix is to ask him for permissions to read a file
aren't likely to just play around with the chmod command for no reason,
and thus aren't likely to mess anything up.
And I've seen a few professional sysadmins screw up permissions in
various places in a big way. Sometimes decentralizing something (like
mail) can significantly reduce the impact of such mistakes, resulting
in a net increase in overall system security.
As a final aside, please note that most MUA's move mail out of /var/mail
or /var/spool/mail and into folders in the user's home directory.
In particular, you've mentioned that you use MH elsewhere in this
discussion. MH's inc and rcvstore by default place mail into folders
which live beneath $HOME/Mail. Has this caused a problem in the past?
If not, then I see very little reason to assume a priori that moving
the MTA's mail delivery into $HOME will cause a problem in the future,
especially since other sites have been doing this for years now.
In particular, the Unix permissions argument doesn't hold up, since
folders would suffer from the same problem.
--
[*] DJB is respected in terms of his ability to write secure software.
His ability to win friends and influence people is considerably less
respected, however. It's telling, though, that he's willing to put
some of his own money on the line to prove his point; he's willing to
pay a cash reward to anyone who finds a security bug in qmail.

jared
response 110 of 203:
Apr 3 03:25 UTC 2001
Marcus, when was the last time you used nfs? v3 or v4? I suspect you
are missing out on recent updates.

gelinas
response 111 of 203:
Apr 3 04:47 UTC 2001
The 'default' umask most users get saddled with is '022': everyone gets to
read any file created. Until the user learns enough to change the umask,
they have no privacy. Since MH honors umask, yup, it's a problem.
Yes, 'social engineering' is a problem. At least the 'engineer' has to
introduce certain concepts to compromise /var/spool directories; $HOME is
compromised by default.

tenser
response 112 of 203:
Apr 3 17:35 UTC 2001
Regarding #111; Actually, MH ignores umask (or, rather, resets it
to something ``safe'') when creating directories. Have a look at
sbr/makedir.c in the NMH sources. Note that at least inc makes use of
this when creating folders (see uip/inc.c).
So, even if files were created by MH with world read permission, (and
note that they're chmod'ed to 0600 anyway) the fact that the directory
they're in is mode 0700 means that no one can read them other than the
owner of the directory (and root). About the only files in MH that
I know of which default to listening to the umask are .mh_profile, and
the global context file (the latter lives under a protected directory
anyway, though, and the default .mh_profile is hardly earth shatteringly
important enough to keep secret).
If that weren't the case, this would have been a problem for years on
grex and many other systems. However, this has not been a problem for
years on any other system that I know of. In fact, I've seen MH *not*
work if it felt that permissions on certain files were *too* lax (this
happened to a friend of mine once whose .mh_profile was group writable.
Certain MH commands flat out refused to work).
Regarding social engineering, I'd like to know what ``concepts'' an
attacker would have to introduce to make use of a social engineering
attack in /var/mail? The only one I can think of was the one introduced
by PT Barnum nearly a century ago, that of: ``There's a sucker born
every minute.''
Also, I fail to see how $HOME is ``compromised'' by default in any way
while /var/mail isn't.
Finally, whatever any MUA (such as MH) does when creating files is
independent of what an MTA would do if configured to deliver into a
user's directory.

scg
response 113 of 203:
Apr 3 19:03 UTC 2001
The Grex staff gets tons of mail from people who have messed with their home
directory permissions and locked their accounts. We don't seem to get this
sort of complaint about /var/spool/mail much. I don't know why.

tenser
response 114 of 203:
Apr 3 19:17 UTC 2001
Regarding #113; Interesting. btw- how do you define ``ton'' in this
context? Typing random octal numbers to chmod can definitely do weird
things to one's permissions. I think that's independent of the
security of any maildrop scheme, though.
Hey, some quantitative data on what staff sees most frequently might
be really interesting. Can someone post any here?

remmers
response 115 of 203:
Apr 3 21:24 UTC 2001
Staff is too busy answering the messages to count them. :)

tenser
response 116 of 203:
Apr 3 21:41 UTC 2001
Regarding #115. Heh, what, you mean you don't have a helpdesk system
like RT set up? :-)

scg
response 117 of 203:
Apr 3 22:58 UTC 2001
Nope.
Tons was probably an overstatement. I haven't been paying much attention to
staff mail for the last few years, beyond skimming the ones with interesting
subject lines, but my guess is it's probably not more than one or two a week,
if that. Still, that's many more people than screw up the permissions on
/var/spool/mail.
This is generally a matter of people trying to lock down the privacy of their
home directory, and locking it down so far that they can't even get in
themselves.

russ
response 118 of 203:
Apr 4 00:16 UTC 2001
Re #98: Banning certain filenames just provokes countermeasures. What
I'd like to see is a log which can be checked later, to clean up after
eggdroppers ASAP. Scanning mail for uuencode headers of such files might
also be useful for marking accounts for automatic scrutiny and cleanup.
Idea: Keep copies of eggdrop &c on-line, perhaps in CVS. Tell people
where they are, and that they won't work here. Follow up after the CVS
logs every few hours with a daemon and delete the files and binaries.
This should keep the total inventory down to a few copies at most.

gull
response 119 of 203:
Apr 4 01:48 UTC 2001
Re #118: Nah. When they tried the CVS version and it didn't work,
they'd just assume we'd done something to it to break it, and download
their own 'clean' source to try.

mdw
response 120 of 203:
Apr 4 08:56 UTC 2001
Putting mail in home directories creates mail race conditions, with
people renaming things. Sure, there are ways around it, but it's more
of a problem than when mailboxes live in a directory people can't munge.
It's not an unsolvable problem, and indeed, we're still living with some
of these security concerns, because we honor .forward's in people's home
directories.
Perhaps I should describe one, just to make people sweat: originally,
grex used the actual sunos mail programs, and even after we mostly
switched away, we were saddled with one remainder, /usr/ucb/mail.
/usr/ucb/mail, for ugly reasons, required write perms on
/usr/spool/mail, which meant /usr/spool/mail perms were in fact set like
/tmp--the sticky text bit was set. This meant user A couldn't mess with
user B's file if it already existed, which was fine. However, user A
could *link* his mailbox to user B's file, if it *didn't* exist. This
was not good, but livable. We lived with it for a while. Every so
often, weird stuff would happen because people would discover this, and
try to exploit it. When we moved to hierarchical mail files, we finally
got around to fixing /usr/ucb/mail, and got rid of write perms. End of
problem. Now, if we move back to putting mail in people's home
directories, people can now create accounts, and do a hard link to each
other's mail files. This can be fixed, sort of, but I'm not sure it can
be fixed in such a way that there aren't race conditions. Perhaps they
can be, and perhaps the qmail author has fixed them, but I, for one,
sure didn't want to think about them hard enough to be *sure* the result
was right.
So far as NFS/AFS goes. I'm not sure I want to spend a lot of time on
this, but yes, I know NFS has been advancing. Indeed, I know some of
the folks (at CITI) who are working on this, and I know NFS is evolving
in the direction of AFS--including much of its functionality, though not
always in the same way. There's an interesting conflict here though,
which is "secure by default" vs. "compatible with the past". We
obviously want "secure", but we'd have to be very careful to be sure
it's secure enough, and to avoid the temptation to relax something
because we got some legacy product (a terminal controller or backup
device) that required some insecure mode to be turned on in order to use
NFS. AFS supports integrity checking (by default) and will encrypt
things over the wire (if requested). There are definitely features in
AFS that are not yet in NFS (pt groups, global name space), although NFS
is catching up on caching and (maybe) security.
An issue with both NFS and AFS is file locking and reliability. This
becomes a major problem with mail, because these are both crucial to
reliable mail operation. It is hard to do file locking right with any
sort of network based scheme, because you've now introduced network
delays and retries, which can slow things down a great deal. I don't
believe NFS originally supported locks at all (granted, this is now purely
historical); modern versions support file locking, iff the server
doesn't crash... AFS isn't really much better about locking, although
since it's never tried to pretend file transactions are "stateless" it's
probably at least architecturally more "honest". Reliability gets to be
an interesting problem - what do you do when you're delivering mail, and
the file server goes down? Do you just drop it? Requeue it? How do you
know that the mail wasn't in fact stored before the server crashed and
you just didn't get the notification?
NFS and AFS file semantics are naturally going to be different than UFS,
so that means the mail delivery client has to know about these changes.
With AFS, for instance, files get flushed to the server upon fsync or
close. With NFS, this is implementation dependent and depends on the
caching strategy - I believe some modern implementations don't give you
any control at all over when things get flushed, because the API doesn't
permit it.
Security is another messy area. With any secure network filesystem
(where "secure" = secured by kerberos), there is no such thing as SUID
perms, and "root" becomes an interesting problem in definitions. One
solution is mail is delivered by "the mail guy" and you give "the mail
guy" read/write perms on your mailbox, if you want mail to work.
Another would be to forge kerberos credentials -- doable, but real
scary. Now, there *are* sites, some of them quite big that have mail
working acceptably via NFS or AFS. There are plenty of other sites that
have had scary results trying to do this.
There is one other point to consider in all of this; grex is
*different*. With virtually any large institution, users can be
classified into one of two groups, "us" and "them". Any of "us" that
misbehaves badly enough can be turned into one of "them". You're
forkbombing every time the campus president logs in? Off with your head.
You filled up the disk? You get a huge charge in your next student bill,
for all the file space you're "using". This leads naturally to the
"firewall" mentality, which says, put all of "us" inside the magic fire
ring, and don't let any of "them" cross the ring. Doesn't work with
grex. Any of "them" just has to take out a new account. Users have to
be presumed hostile, even if almost always innocent as lambs. This has
interesting implications for NFS.

jared
response 121 of 203:
Apr 6 00:00 UTC 2001
(My suggestion is only meant to get "cheap" decent disk attached to grex.
Despite the performance losses of IDE disk, there are not a lot
of applications (except for mail) that are very I/O intensive and
demand the disk performance that grex needs.)
I'm not for providing an insecure NFS/AFS/"networked file system",
but the lack of hardware support in the current platform for IDE
disks makes it more complicated/difficult.
One could purchase one of these new "SunBlade" workstations
($1k list, and it takes IDE disk, which can be obtained... and
at the price of memory these days, $88 for a 256M DIMM), turn it
into the "next grex", and ditch all the present disk. For
a complete investment of $3k (memory, disk, system) there could
be 100G+ of disk and ~1G of RAM online in the system.
(here's the url for the sunblade system:
http://store.sun.com/webconfig/BuildConfig.jhtml;$sessionid$WHDHN2IAAAS3BAMTA1ESQ1T5AAAACJ1K)
Someone w/ campus/student affiliation can get the solaris-8 src
and modify the kernel for the tcp/ip restrictions and we can be
off and running w/ more disk, faster system, etc.. :)

mdw
response 122 of 203:
Apr 6 01:29 UTC 2001
I believe IDE still doesn't support disconnected disk seeks, doesn't
support more than 2 disks per controller, doesn't support long cables,
and most IDE controllers are built into a motherboard or cards are
available for just a few bus architectures (i.e., not sbus, not vmebus,
which is what we could use today). Modern IDE drives (with the right
controller) do support DMA and linear block addressing, which at least
removes two of the most serious traditional faults of IDE. That makes
IDE competitive performance-wise with SCSI for low end systems.
For better or worse, grex does a fair bit of disk I/O - enough that disk
is at times the performance bottleneck, and enough to shorten the lives
of even fairly rugged disks. We have a fairly heavy mail load, pine is
a pig, conferencing and in particular backtalk regularly appear in the
process logs, and of course there are the endless parade of stupid
wannabe eggbotters. We also have a pile of cheap SCSI hardware,
including drives and controllers, which would do just fine on grex. The
only real bottleneck here is staff time and system downtime.
Putting everything on one spindle of a drive intended for a low duty
cycle operating environment, losing the expansion possibilities, and
incidentally throwing out the rest of our hardware & software environment,
does not sound like a good solution for reliability, stability, or
performance. I suppose, of course, that if we grew sufficiently
unreliable, the resulting drop in user demand would fall to meet the
performance.