|
|
| Author |
Message |
| 25 new of 547 responses total. |
lk
|
|
response 371 of 547:
|
May 24 16:18 UTC 2003 |
Sorry, jep, I didn't mean to imply that you (or others) were holding up
anything. I certainly have no idea what the implementation time frame is.
For all I know, Grex budgeted the next 3 months for such discussion
before finalizing NewGrex and putting it on-line. (:
There's a lot of worthy discussion here and many good suggestions.
But I do know how over-discussion can become negative on a BBS, and I don't
want to see that happen here. Not to sound like the US Patent Officer of
125 years ago, but I think all the constructive comments about RAID, with all
its pluses and minuses, have been made. It's time to make a decision....
These are the points I'd consider:
(Note that whether RAID is useful for Grex almost becomes a moot point)
1. We have no RAID controller
(and I'm not impressed by the list supported by OpenBSD)
2. The software RAID-5 performance rules that out.
3. Software RAID-1 remains a possibility.
(At least Walter and I think so.)
|
janc
|
|
response 372 of 547:
|
May 24 23:04 UTC 2003 |
Yeah, it occurred to me a little after I wrote my introduction to RAID that
there were more efficient ways to maintain parity on writing - you can read
the parity disk, and the value you are about to overwrite, and use those to
compute the new check sum. So 2 reads and 2 writes suffices no matter how
many disks you have in a RAID 5 array. So, Walter's correction is correct.
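The parity shortcut described above can be sketched in a few lines. This is only an illustration of the XOR arithmetic, assuming simple byte-wise parity over equal-sized blocks; the function names are made up for the example and are not taken from RAIDframe or any other implementation.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity computed the slow way: XOR every data block in the stripe."""
    return reduce(xor_blocks, blocks)


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after overwriting one block, using only two reads.

    new parity = old parity XOR old data XOR new data
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# A stripe with four data blocks (the values are arbitrary).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
          b"\xaa\xbb\xcc\xdd", b"\x0f\x0e\x0d\x0c"]
parity = full_parity(stripe)

# Overwrite block 1: read old data and old parity, write new data and new parity.
new_block = b"\xff\x00\xff\x00"
updated = small_write_parity(parity, stripe[1], new_block)
stripe[1] = new_block

# The shortcut matches a full recomputation across all blocks.
assert updated == full_parity(stripe)
```

The other data blocks in the stripe are never read, which is why the cost stays at 2 reads and 2 writes regardless of how many disks are in the array.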
I'm not at all unhappy with this discussion. I think we are still in a mode
of usefully exploring options and collecting data. If I feel the discussion
is stagnating, I'll bring it to completion, by declaring a solution by fiat
if necessary, though I'd prefer to boil it down to a few options and get some
consensus among staff. (If Marcus weren't out of town this month, I'd
probably call a staff meeting. We'll need one after he's back in any case.)
I'm interested in Leeron's RAID 1 suggestion. Two disks in RAID 1 and one
disk plain wastes 1/3 of our space, just as RAID 5 would have. If RAID 1
performs substantially better than RAID 5, then this might be a viable option.
The performance is going to have to be pretty good to convince me that this
is better than the rsync option though. However, I plan to rearrange two disks
into a RAID 1 array tonight, so we can benchmark that.
The other project I'm pursuing is improving my understanding of Grex's disk
usage patterns. If you're reading this in coop, you may want to check out
Garage item 150 (I think) where I recently posted some statistics on old
Grex's disk usage. Preliminary results seem to indicate that most of Grex's
disk usage is on the /var drive (which does not include /var/spool/mail).
Apparently what Grex does most of is logging. More than half the disk
activity is there, and it is almost all writes, not reads. I want to keep
investigating this.
I don't think we are in any special hurry to get the new Grex up, but I want
to keep the process in motion, not letting it stagnate or stall. We are not
stalled. Things are good.
|
janc
|
|
response 373 of 547:
|
May 25 03:08 UTC 2003 |
OK, I've re-arranged the disks once again. Now /sd0 is a plain filesystem
consisting of SCSI disk 0, and /raid is a RAID 1 array consisting of SCSI
drives 1 and 2.
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
scsi     2000 53754 43.4 54106 14.1 10090  2.6 60326 70.9 61067 11.5 201.2  0.8
raid 1   2000 16651 13.5 19368  3.4 10702  3.6 61614 73.5 68343 14.5 197.9  1.5
raid 5   2000  9520  6.8  7974  1.3  5706  2.0 50932 62.5 63815 13.0 147.9  1.6
This is definitely performing much better than RAID 5, but the writes are
still rather on the slow side. (Though we do seem to be getting a slight
win on the READ side - looks like it is balancing reads across the two
disks well enough to get a moderate performance win over a single disk.)
Dan's multi-process benchmark might be worth trying.
Leeron's idea was to use RAID 1 for the more ephemeral partitions -
partitions where data changes rapidly, and restoring from a week-old
backup tape after a crash might be unsatisfactory. RAID 1 would provide
a full backup of that data.
So, the RAID might have /bbs, root (mainly for /etc/passwd), /var
(current log files). The regular disk might have /usr, /usr/local, etc.
Dunno where users would go.
The problem is, that the partitions whose contents change a lot (and are
thus more interesting to keep a real-time mirror of) also tend to have
a lot of writes. So putting a partition like /var, which is almost
write-only, on RAID would be pretty unattractive from a performance
point of view. So there's a bit of a paradox here - RAID's advantage
over rsync is greatest when writes are frequent, but its performance
suffers most under those circumstances.
The one partition where RAID 1 looks good to me right now is /bbs.
More reading than writing certainly happens there, but there is enough
writing so that keeping a mirror would be nice. I guess user partitions
would be a possibility for RAID too.
|
cross
|
|
response 374 of 547:
|
May 25 04:03 UTC 2003 |
Hmm, I don't know. The more we look at the performance numbers, the
less and less impressed I am, to the point of actually being really
disappointed in RAIDframe. It almost doesn't seem worth it.
Doing something like RAID 1+0 might be better, but would require another
disk to really be useful.
I'm not sure I think doing RAID on one partition alone is really worth
it, the rationale being that if a disk dies, without using RAID everywhere,
you have to do a lot of work to bring it back online. Doing the same work
with one or two partitions more doesn't seem like that much of an added
incremental cost. That doesn't solve the problem of lost data, though.
One solution to that would be to leave a tape in the tape drive all the
time, and do a nightly full backup of /bbs and the user partitions (just
overwrite the tape). Every now and then, do a full backup of everything
on a separate tape and keep it for posterity.
|
cross
|
|
response 375 of 547:
|
May 25 04:15 UTC 2003 |
FYI, I logged back into the nextgrex machine and re-ran my simple
benchmark. The one that took 81 seconds on the RAID5 partition (using an
interleave size of 64; I didn't get a chance to try it on the one with an
interleave size of 256) took about 5 and a quarter seconds on average.
Roughly a 15-fold speed increase (81 / 5.25). Bonnie shows that performance on
a mirror is about 1/3 that of a straight disk. With another disk,
I'd champion using RAID 1+0, as I'm guessing that would be in the same
general area performance wise as `normal' partitions, while still giving
high availability. It'd cost another $200 to get another disk to do
it, though.
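The ratios being discussed can be checked directly against the numbers already posted. This is just arithmetic on the block-write column of the Bonnie table in response 373 and on the two timings quoted above; the variable names are made up for the example.

```python
# Block-write throughput in K/sec, copied from the Bonnie table in response 373.
bonnie_block_write_kps = {"scsi": 54106, "raid 1": 19368, "raid 5": 7974}

baseline = bonnie_block_write_kps["scsi"]
relative = {name: kps / baseline for name, kps in bonnie_block_write_kps.items()}

for name, ratio in relative.items():
    print(f"{name:7s} block writes at {ratio:.0%} of the plain disk")

# Simple benchmark timings quoted above: 81 s on RAID 5, about 5.25 s on RAID 1.
simple_benchmark_speedup = 81 / 5.25
print(f"RAID 1 vs RAID 5 on the simple benchmark: {simple_benchmark_speedup:.1f}x")
```

On these figures the mirror writes at a bit over a third of the plain disk's rate, and RAID 5 at roughly a seventh.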
|
janc
|
|
response 376 of 547:
|
May 25 13:51 UTC 2003 |
Yeah, I think later today I'll make another pass at designing a
RAID-less partition scheme. This has all been very educational, and
RAID 1 is almost good enough to use, but I don't feel it is quite good
enough.
|
scg
|
|
response 377 of 547:
|
May 27 18:51 UTC 2003 |
I should note that I'm not pushing hard for using RAID. My impression has
been that RAID is a good thing, all other things being equal, but I don't
know enough about it to make a good choice.
What I would object to, and what it seemed to me was being advocated in some
of the earlier arguments, is designing for low availability. There are all
sorts of things it makes sense to design for in various situations, such as
low cost, low maintenance, high performance, high availability, and so forth,
and declaring one of those to be a high priority generally involves tradeoffs
in other areas. If cost or performance are determined to be more important
than high availability, I might agree and I certainly wouldn't argue. It's
not doing RAID purely for the sake of not needing high availability that I
was objecting to, and it doesn't sound to me like that's what's going on here
anymore.
|
janc
|
|
response 378 of 547:
|
May 27 20:35 UTC 2003 |
No, that certainly isn't my thinking here. RAID costs a lot of disk space,
which we can probably afford. RAID, at least as implemented in software under
OpenBSD, seems to have a pretty huge performance penalty. Much bigger than
it theoretically should have. High availability *IS* the benefit of RAID.
(RAID 0 and well-implemented RAID 1 might give you performance benefits, but
in most versions of RAID other overhead will eat any performance benefit).
I like RAID, but my feeling is that high availability isn't important enough
to Grex to justify its other costs.
I may be wrong. It may be that this new computer is going to be so fast,
that running Grex will hardly load it, and the performance cost of RAID
wouldn't mean anything to it. If so, we should consider moving onto RAID
in the future. I don't think making that change later will be hard. We
need to rebuild the system every year and a half anyway, and changing the
disks from flat disks to RAID does not have broad implications for the
rest of the system configuration.
|
lk
|
|
response 379 of 547:
|
May 28 04:38 UTC 2003 |
With all due respect, check out the speed of M-Net these days (arbornet.org).
I'm not up-to-date on the hardware specs, but I'd assume it's running on
a CPU that is 1/3rd to 1/4 the horsepower and slower drives.
|
cross
|
|
response 380 of 547:
|
May 28 06:00 UTC 2003 |
Hmm, I use mnet...every couple of days or so. It's usually quite
fast, but I don't believe they're using RAID. What's more, they're
running FreeBSD, which has a different RAID implementation yet again.
Leeron, what are you referring to that one should note in terms of mnet's
performance?
|
janc
|
|
response 381 of 547:
|
May 28 14:42 UTC 2003 |
I haven't been on M-Net for a while - but it also generally had fewer users
than Grex.
Generally I expect that the new Grex will be way too fast for the load the
current user base will put on it. However, the user base may grow with better
performance. Also we will be turning on quotas, which is going to put some
drag on the disk performance - that's a lot more important to me than RAID.
Also Grex occasionally gets hit by vandals - I just spent some time tracking
down a mailbomber who was slowing the system down badly. How will the new
Grex perform under those conditions? I don't know. I think we'll need to
gain experience with the new computer before we can really decide this.
I think we can reconfigure to use RAID later if we feel the need. I think
we could do such a reconfiguration in a day, if needed.
|
cross
|
|
response 382 of 547:
|
May 28 16:13 UTC 2003 |
Sounds good to me. Also, going to the next grex allows one to do some
things that I think will be beneficial, such as turning off the queueing
telnet daemon (the queue is almost always empty, anyway, except in like,
5% of all cases), using a new version of SSH, ditching sendmail in favor
of something like postfix, etc.
|
tod
|
|
response 383 of 547:
|
May 28 16:58 UTC 2003 |
This response has been erased.
|
janc
|
|
response 384 of 547:
|
May 29 01:18 UTC 2003 |
Well, I didn't get much work on next Grex done today, but I built a
respectable castle out of Lego, so the day isn't entirely a waste.
|
aruba
|
|
response 385 of 547:
|
May 29 03:07 UTC 2003 |
We finally received our OpenBSD CDs today - it took them 16 days to get here
from Calgary. Stickers were included.
|
spooked
|
|
response 386 of 547:
|
May 29 11:00 UTC 2003 |
I have an inkling Marcus won't ditch sendmail as readily as you might
wish, Dan. It seems to be one of his favourite hacking toys.
|
janc
|
|
response 387 of 547:
|
May 29 13:43 UTC 2003 |
I don't know what his plans are, but I'd be surprised if he didn't seriously
consider alternatives. I think the port to OpenBSD is going to be a bit of
a "start over" for him even if he decides to stay with sendmail, because
moving all his modifications into a current sendmail release is going to
be nearly as much work as switching to a different program. I don't think
he's really all that fond of sendmail.
|
cross
|
|
response 388 of 547:
|
May 29 16:01 UTC 2003 |
He was talking about exim recently, but I still think postfix is a better
choice. Grex isn't for people's personal hacking toys, anyway.
|
gull
|
|
response 389 of 547:
|
May 29 17:05 UTC 2003 |
I use Exim, and it's certainly easier to configure than sendmail. It
has a good, flexible filter language, too. It doesn't have the same
privilege separation features as Postfix, though -- it's still a
monolithic binary.
|
cross
|
|
response 390 of 547:
|
May 29 19:31 UTC 2003 |
Yeah, that's one of my problems with exim. I honestly believe that
postfix is just as powerful, can be made to do everything that grex
wants/needs, and is more secure. I also argue that it's better documented.
|
mdw
|
|
response 391 of 547:
|
Jun 3 07:14 UTC 2003 |
Major overload here. Hm.
Regarding old hardware. Even when we switch over, we'll want to keep
the old stuff intact for at least a bit in case of some sort of truly
disastrous problem with the new hardware. Once we're comfortable, then
we can decommission it. The disks, being slightly newer, may have some
slight use for other small projects. Much of the data on them doesn't
really matter, but for mail, spool, user files, swap, and /etc, we
certainly want to scrub those before using them for other purposes or if
we decide to sell or give away any of them (even to my basement
collection). Scrubbing them *is* going to be an easier way to ensure
reasonable security than destroying the disks. This is because
sufficient physical destruction has its own issues. Disassembling
things then bashing them with a hammer and using a bulk eraser may make
data recovery more difficult, but it may still leave traces of data that
could be recovered by the same sort of determined adversary that could
recover data from a "single overwrite of all 0" drive. If that is the
level of security you want, then physical destruction would require
either a fairly good acid bath of all disk surfaces, or probably better
yet incineration of the aluminum platters. No doubt we have pyromaniacs
who would enjoy doing this, and there are certainly services that will
do this (for a fee), but we'd probably be better off reserving these
drives for future small projects (such as offloading mail processing,
kerberos, etc.) or doing a multiple overwrite scrub procedure then
selling them on ebay.
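A multiple-overwrite scrub of the kind mentioned above might look roughly like this. It is a minimal sketch, not a vetted disk-wiping tool: the `scrub` function, the pass count, and the pattern choice (random passes followed by a final pass of zeros) are all assumptions made for illustration, and the demo runs against a temporary file rather than a real drive. Sizing a raw device node would need a different approach than `os.path.getsize`.

```python
import os
import tempfile


def scrub(path: str, passes: int = 3, chunk_size: int = 1 << 20) -> None:
    """Overwrite the whole file `passes` times, in place.

    Every pass but the last writes random bytes; the last writes zeros.
    """
    size = os.path.getsize(path)  # assumption: a regular file, not a raw device
    with open(path, "r+b") as f:
        for pass_no in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                last_pass = pass_no == passes - 1
                f.write(b"\x00" * n if last_pass else os.urandom(n))
                remaining -= n
            # Push each pass to stable storage before starting the next one.
            f.flush()
            os.fsync(f.fileno())


# Demo on a throwaway file standing in for a disk.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"secret mail spool contents")
    demo_path = tmp.name

scrub(demo_path, passes=3)

with open(demo_path, "rb") as f:
    scrubbed = f.read()
os.remove(demo_path)
```

After the final pass the file is the same length as before but contains only zero bytes.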
Regarding kerberos. Cross and I clearly have an irreconcilable difference
of opinion here. I'm clearly not going to change his opinion, there
seems little likelihood he'll change mine, and I doubt most others share
even my level of paranoia or care all that much. So I don't want to
waste time arguing this. Cross (and any others who care) is welcome to
change his password just before & after switching to kerberos, which
should cover any personal concerns he may have. Root & other passwords
will almost certainly change or be addressed by new mechanisms - this is
almost inherent in any switchover in any case. Once we switch to k5,
barring unexpected changes to the standard, changing one's password will
likely result in a standards compliant k5 key at least potentially
useful from other machines.
Regarding mail. Hm, I think some of this scrolled off. Yes, we want to
keep hierarchical mail boxes. The 4 possible mta's include exim,
postfix, current sendmail, & legacy/hacked sendmail. Unfortunately,
mail is an area where we have significant functional requirements, which
means any stock solution out of the box will almost certainly prove
unacceptable. One functional requirement is mailbox quotas, which at
least has simple design parameters. Another functional requirement is
anti-spam logic, which is both controversial and important. A final
functional requirement, unfortunately, is that this all needs to come up
in some finite amount of time. I intend to look at exim & postfix, with
a view that one of these should be a good enough base to support the
functionality we want. As a fallback position, I am at least somewhat
willing to consider installing the current legacy/hacked sendmail, with
the understanding that it's both temporary and very very undesirable. I
hope to spend time coding to avoid this possibility rather than
composing lengthy responses defending whatever choices I make here.
Hm. Surely I've said at least half of this somewhere already? Is this
useful?
|
cross
|
|
response 392 of 547:
|
Jun 3 07:23 UTC 2003 |
Yes, it's useful.
Tell me, what do you think is so unique about grex's mail setup that
a stock solution won't work? Surely postfix+procmail+spamassassin
could handle the load grex would put on it, complete with hierarchical
directories and mailbox quotas. Much larger sites use that combo and it
works well. In fact, if you went with putting mail in $home/Mailbox,
you'd get hierarchical mail directories for free, and eliminate a
filesystem.
Regarding Kerberos: I'd feel more comfortable with using your hashing
algorithm if the guarantee was made that it would disappear from
the system's Kerberos implementation no more than one year after its
introduction, or some other suitable timeframe. There's no reason not
to agree to that.
|
carson
|
|
response 393 of 547:
|
Jun 3 08:10 UTC 2003 |
(re: anti-spam logic: I agree that a procmail/spamassassin combo would be
a good move. nearly all of the spam that I receive at my Grex account is
sent via open relays, which both SpamAssassin and SpamCop [via reporting]
recognize and flag as such. whatever Grex is currently using, obviously,
does not. given Grex's culture, I understand the reluctance to outright
block mail from open relays, but I'd like to think that, with Next Grex,
we should have sufficient processing capability to flag such mail.)
|
mdw
|
|
response 394 of 547:
|
Jun 3 11:59 UTC 2003 |
Mail has enough issues that perhaps it ought to be discussed in its own
item. At some point, I will need to come up with a list of what grex
mail currently has as "custom hacks"; that's not the same as a list of
functional specs, but might beat idle speculation that just because
there are "a lot" of solutions out there there's necessarily a set that
matches our needs. I hope we will be able to take advantage of other
people's work as much as possible. But I don't think there are any
guarantees that we will necessarily find exactly what we want.
Regarding procmail+spamassassin; this can't reject mail, which would be
a significant step backwards spam-wise from what we can currently do.
There are other issues regarding procmail+spamassassin (such as
enforcing mailbox quotas, running perl on every piece of mail) that I
don't find particularly attractive on a system-wide basis. I don't have
a problem with this as a user option, but I'm much more concerned what
to do for everybody else as a default.
Regarding RBL - grex gets listed on them just often enough there's no
way I can see us wanting to do this. RBL would be less unpalatable when
used in conjunction with other stuff as just one more clue something
"might" be spam. I hope that whatever we end up with will have the
flexibility to allow us such options, but RBL is not a sufficient or
reasonable solution on its own.
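The "one more clue" idea can be sketched as a weighted score in which no single signal is enough to act on. Everything here is hypothetical: the signal names, weights, and thresholds are invented for the example and do not describe what Grex, SpamAssassin, or any MTA actually does.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    weight: float  # hypothetical weight, chosen only for illustration
    fired: bool


def spam_score(signals) -> float:
    """Sum the weights of the signals that fired for this message."""
    return sum(s.weight for s in signals if s.fired)


def classify(signals, flag_at: float = 3.0, reject_at: float = 6.0) -> str:
    """Map a score to an action; the thresholds are assumptions."""
    score = spam_score(signals)
    if score >= reject_at:
        return "reject"
    if score >= flag_at:
        return "flag"
    return "deliver"


def signals_for(rbl: bool, open_relay: bool, forged_headers: bool):
    return [
        Signal("sender on an RBL", 2.0, rbl),
        Signal("sent via open relay", 2.5, open_relay),
        Signal("forged headers", 2.5, forged_headers),
    ]


rbl_only = classify(signals_for(True, False, False))
rbl_and_relay = classify(signals_for(True, True, False))
everything = classify(signals_for(True, True, True))
print(rbl_only, rbl_and_relay, everything)
```

With these made-up weights, an RBL listing alone still delivers the message, so a host that is occasionally listed by mistake is not blocked outright; only the combination of clues leads to flagging or rejection.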
Regarding kerberos - there's no guarantee that *the* standard will
necessarily do what grex needs, especially right off. Just for
starters, des/des3 have inadequate etype info, preauth methods to
reinforce weak passwords are lacking, aes is not yet fully standardized,
and there is argument that the default aes string to key ought to be
computationally intensive - fine for single/user workstations, not at
all a good match for a popular timesharing system. I very much hope the
standard evolves to a point where it fully meets the needs I think we
have for it here on grex. For the short-term, our schedule means we
probably shouldn't be, and when the standard converges to our needs is
not something we can dictate. So, I don't think such a promise as you
ask would be in grex's best interest.
It may be worth keeping in mind that until we deploy useful kerberized
distributed applications from grex, the ability to kerberos authenticate
to grex from elsewhere will be almost entirely only of academic
interest. The real compatibility issue we have to sort out in the short
term is not kerberos standards compliance, but how well does it fit into
openbsd supplied interfaces and the grex environment? We aren't even
close to worrying about distributed desktop applications, single
sign-on, or making sure user passwords never leave the desktop.
|
cross
|
|
response 395 of 547:
|
Jun 3 17:02 UTC 2003 |
That begs the question, why bother with Kerberos at all, then?
I don't understand how other, much larger sites get away with stuff like
using spamassassin on all incoming mail, and otherwise working on stock
anti-spam solutions, but grex can't do it.
Regarding timeframes; well, grex has already blown its one year timeframe.
Given that, it seems most profitable to just use the BSD login API to
deal with the custom hash algorithm and skip Kerberos for a later day.
|