Grex > Oldcoop > #380: Cyberspace Communications finances for November 2006
25 new of 124 responses total.

nharmon | response 25 of 124 | Dec 12 14:58 UTC 2006
About 10 minutes.

easlern | response 26 of 124 | Dec 12 15:43 UTC 2006
Two cents from the peanut gallery: it seems like we could prevent downtime
like the recent outages by policing accounts better. It's hard to see what
benefit there would be in giving anonymous accounts a more powerful system
to beat on the ISP with.

aruba | response 27 of 124 | Dec 13 14:34 UTC 2006
Re #22: Dan, I'm just not convinced that throwing money at the problem is
going to help at all. I'm not an expert on hardware, but I know that it's
not literally true that we "went cheap" when we bought the current machine.
The total initial cost of the current machine was $2,201, more than you're
proposing to spend.
It seems to me that there is some probability that any piece of hardware
will go bad, in any time interval. Grex pushes hardware pretty hard, so it
doesn't surprise me a whole lot that we lost a disk and a memory chip in the
3.5 years the machine has been running (2 years since it's been online).

cross | response 28 of 124 | Dec 13 14:53 UTC 2006
Well, consider the disk failure for instance: yes, you're absolutely right
that hardware components tend to fail over time, and there's not much that
can be done to prevent it. These things just wear out after a while. But,
if grex had invested in a hardware RAID solution, then losing a disk
wouldn't have necessarily brought the entire machine down. And repairing
the problem would have been about as easy as taking a spare to the colo and
yanking out the old disk and plugging in the new one. The hardware would
take care of the rest. This isn't magic; hot-swappable hardware RAID
controllers aren't hard to come by. And it would have prevented a week of
downtime. And it wouldn't have required Steve or anyone else to spend hours
and hours at the colo facility. And of course, had we used ECC memory, as
was discussed ad nauseam before buying the current hardware, the memory
chip just wouldn't have been an issue: the memory hardware would have told
the operating system it was bad, the OS would have logged a message, and
the chip could have been replaced without a tremendous amount of downtime
(if, indeed, that was the problem at all) or people going back and forth
to the colo facility to run diagnostics. What's more, it wouldn't have
taken down the machine. Is that worth it? You tell me.
As for the cost of the current grex hardware.... Remember that the Sun 4
that it replaced cost somewhere on the order of $100,000 when new. $2,201
is pretty cheap compared to that.
I guess I don't understand why you think that this is just ``throwing money
at the problem.'' Well, I'm not going to try to convince you. If you don't
think it's worth it, then you don't think it's worth it. But I just
consider it making wise investments.

slynne | response 29 of 124 | Dec 13 20:58 UTC 2006
I work with hot-swappable hard drives on servers and I have to admit
that I really do like them. Our setup has three hard drives and we can
lose one without having *any* downtime. Fixing it is pretty easy too: we
ship a hard drive to the retail location, where we have someone who is
almost completely computer illiterate install it. It is pretty cool.

drew | response 30 of 124 | Dec 14 05:01 UTC 2006
I was told by someone in the IT industry that RAID is only worthwhile
if your downtime costs are measured in dollars per minute. Nonetheless I
recommend installing hardware RAID anyway, for reasons given by others
in this item.
It also occurs to me that with a RAID system, producing an offsite
backup should consist mainly of pulling out one of the redundant hard
drives to take offsite, and putting an empty in its place. Much faster
and easier than babysitting a tape drive.

cross | response 31 of 124 | Dec 14 05:13 UTC 2006
(I'm not sure that last paragraph follows - in particular, if you do, say,
RAID 5, one disk won't necessarily give you complete information in a backup.)
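cross's caveat holds up in a toy model: under RAID 5 the data is striped with
parity across all the disks, so any one disk can be rebuilt from the others,
but no single disk contains the whole data set. A minimal sketch (block
contents are made up for illustration):

```python
from functools import reduce

def raid5_stripe(d0, d1):
    """Toy single stripe over 3 disks: two data blocks plus XOR parity."""
    parity = bytes(a ^ b for a, b in zip(d0, d1))
    return [d0, d1, parity]

def rebuild(disks, lost):
    """Reconstruct the lost disk by XORing all the surviving disks."""
    survivors = [d for i, d in enumerate(disks) if i != lost]
    return bytes(reduce(lambda x, y: x ^ y, cols) for cols in zip(*survivors))

disks = raid5_stripe(b"grexmail", b"confdata")

# Any one lost disk can be rebuilt from the other two...
assert rebuild(disks, 0) == b"grexmail"
assert rebuild(disks, 1) == b"confdata"

# ...but a single pulled disk holds only one data block (or just the
# parity) -- by itself it is not a complete copy of the data.
```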

mcnally | response 32 of 124 | Dec 14 09:00 UTC 2006
That's a terrible way to back up a RAID array, even one that's just
basic disk mirroring.

nharmon | response 33 of 124 | Dec 14 16:13 UTC 2006
Does Grex even need a backup system, let alone an offsite backup? It
seems to me that all Grex needs is some sort of "Recovery Kit", or a
collection of software for Grex that can be put on DVD and distributed
to staffers or maybe even given away as free OSS (assuming we used OSS).
User home directories should be the responsibility of end users. We
could recruit tech-savvy users to assist other people in backing up their
own data.

mcnally | response 34 of 124 | Dec 14 17:38 UTC 2006
Well, I still remember when STeve deleted all the mail on the
/var/spool/mail partition, so I'm inclined to think that Grex
ought to have a backup system. It'd also be kind of a bummer
if all the data in the conferencing system disappeared tomorrow
and couldn't be restored.
Users probably *should* back up their important data offsite,
but that process will certainly tax Grex's bandwidth if more
than a few people start to do that frequently.

cross | response 35 of 124 | Dec 14 17:48 UTC 2006
Email really ought to be delivered into the user's home directory, not a
separate partition. Then the mail spool area could be reallocated to more
user space. Backups of all a user's data would be pretty easy (just tar
up one directory instead of one directory and another file that the user
might not even know about). I suspect few enough people use grex
seriously that backups on an individual basis wouldn't really tax the
system's bandwidth.
Since I'm the politically incorrect firebrand right now anyway, I'll say
that the loss of /var/spool/mail was just due to poor planning. It's
interesting to note that grex's disks were repartitioned without any
consensus.
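The "just tar up one directory" backup cross describes can be sketched in a
few lines; a toy version using Python's tarfile module (the paths in the
usage comment are hypothetical):

```python
import tarfile
from pathlib import Path

def backup_home(home_dir, out_path):
    """Archive one user's entire home directory into a compressed
    tarball -- mail included, if it is delivered there rather than
    into a separate /var/spool/mail partition."""
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname keeps entries relative ("alice/...") instead of absolute
        tar.add(home_dir, arcname=Path(home_dir).name)
    return out_path

# Hypothetical example paths:
# backup_home("/home/alice", "/backups/alice.tar.gz")
```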

aruba | response 36 of 124 | Dec 15 13:47 UTC 2006
How much would it cost to add a hardware RAID system to our current machine?

maus | response 37 of 124 | Dec 15 14:29 UTC 2006
resp:36 It depends on a few things. Are we talking about adding a
two-drive mirror set or a RAID that spans many drives? Do we already own
the drives? Will we use SCSI or Serial ATA or IDE? Do we need hot-plug
capabilities? Do we want it to be battery-backed so it can finish
commits to disk even if the system loses power in the middle of a
commit? Does it need to support a hot spare? Will the drives be in the
server's chassis or do we also need a shelf/enclosure for the drives?
I'll try to get you a few quotes over the next few days once I have an
idea of what you need.

nharmon | response 38 of 124 | Dec 15 14:34 UTC 2006
Can we price out a 3TB fiberchannel SAN? :-)

aruba | response 39 of 124 | Dec 16 19:05 UTC 2006
Re #37: I don't know the answer to those questions, except that we currently
have a lot of SCSI disk. I want to say 3 x 18 gig, plus one more rebuilt
disk sitting on my desk.

maus | response 40 of 124 | Dec 16 20:39 UTC 2006
We could get comparable performance and significant capacity increase by
doing the following:
Array0:
4-port Serial ATA 3Ware Escalade RAID board
port 0: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
port 1: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
port 2: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
port 3: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
port0 + port1 as a RAID-Mirror
port2 + port3 as hot-spares
This could sustain the loss of *ANY* two drives, as long as they do not
fail simultaneously. Serial ATA is hot-pluggable, provided the drives are
in a cage with proper connectors (to assure that logic or data lines are
not asserted while power is off -- achieved by varying pin length so that
power uses the longest pins in the connectors).
Equipment proposed:
===================
RAID Controller: 3Ware 9550SX-4LP
Drive Enclosure: 3Ware RDC-400
Drive Cables: 3Ware Cables for 9590SE, 9550SX and 3ware Sidecar
High-Speed Drives: 4 x Seagate Barracuda ES Hard Drives
Would this setup do us for a while? If so, let me know and I will try to
get us quotes on this stack of kit. I will say that I have been
consistently pleased with 3Ware's kit, and Seagate has always been good
so long as I remember. OpenBSD 3.9 recognizes the 3Ware Escalade
automagically, and can use the array hanging off of it as a single SCSI
drive/LUN.
Also, just so you know, setting up the array on the 3Ware is easy,
provided you have console access to the server before the "boot>"
prompt.
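The failure-tolerance claim in the proposal above (any two drives, so long as
they don't fail at the same instant) can be sketched as a toy simulation,
assuming the controller finishes rebuilding onto a spare between failures:

```python
class MirrorWithSpares:
    """Toy model: a 2-way mirror plus a pool of hot spares.

    After each single failure the controller promotes a spare and
    rebuilds; two drives failing in the same instant lose both copies.
    """
    def __init__(self, spares=2):
        self.mirror = 2        # live copies of the data
        self.spares = spares

    def fail(self, simultaneous=1):
        self.mirror -= simultaneous
        if self.mirror == 0:
            return "ARRAY LOST"
        # rebuild onto spares before the next failure arrives
        while self.mirror < 2 and self.spares > 0:
            self.spares -= 1
            self.mirror += 1
        return "degraded" if self.mirror < 2 else "healthy"

a = MirrorWithSpares(spares=2)
assert a.fail() == "healthy"    # first failure: a spare is promoted
assert a.fail() == "healthy"    # second sequential failure: second spare

b = MirrorWithSpares(spares=2)
assert b.fail(simultaneous=2) == "ARRAY LOST"   # both copies at once
```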

maus | response 41 of 124 | Dec 16 20:40 UTC 2006
Re resp:38
That is totally inappropriate and a waste of time to look up. We could
not afford it and do not need it, so in short the answer to your
question is "eat me!".

nharmon | response 42 of 124 | Dec 16 22:50 UTC 2006
re 41: You're pretty quick to flame people who use emoticons to express
their sarcasm.

nharmon | response 43 of 124 | Dec 16 22:56 UTC 2006
And why would you mirror two of the drives and leave two as hot spares?
With that same hardware you could set up a RAID 5 array with three of
the drives and still have a hot spare. With RAID 5 you would have double
the amount of space versus what you proposed.
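The capacity arithmetic above checks out with the drive sizes from the
thread; a quick sketch (simplified: it ignores hot spares and formatting
overhead):

```python
def usable_gb(level, drives, size_gb):
    """Usable capacity of a toy array: a mirror keeps one drive's worth
    of space, RAID 5 loses one drive's worth to distributed parity."""
    if level == "raid1":
        return size_gb                  # two-drive mirror: one copy
    if level == "raid5":
        return (drives - 1) * size_gb   # one drive's worth is parity
    raise ValueError(level)

# maus's layout: 2 x 200 GB mirrored (plus spares) -> 200 GB usable.
# RAID 5 over 3 x 200 GB (keeping one hot spare)   -> 400 GB usable.
assert usable_gb("raid1", 2, 200) == 200
assert usable_gb("raid5", 3, 200) == 400
```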

maus | response 44 of 124 | Dec 16 23:09 UTC 2006
Sorry, that's fair. Looking back, my mind skipped over the smiley, so I
completely missed the humor. I apologize for my unkind response.
The reason for my proposed setup is several-fold:
- Grex server is maintained by volunteers, and they may not be able to
do a truck-roll when the first drive fails, and this setup allows time
to respond or even save up for the beginning of the month before the
array is even considered degraded
- The design as I spec'd it would nearly quadruple the available space.
In my (not humble) opinion, this would last a substantial amount of
time
- RAID-5 (striping with distributed parity, single-drive redundancy) is
expensive in terms of both read and write access times. With the large
number of files, especially the large number of small files that Grex
server accretes, this could translate into an io bottleneck during times
of heavy load. If we need more space than I proposed, I would recommend
that we get larger drives or set up a RAID 1 + 0 (a stripe of mirror
sets).
I will say that my suggestion to use two hot spares may be overly
cautious, and that we could step down to one hot spare without serious
risk to the system. If we are seriously concerned about availability, I
would recommend the two hot spares, and step from one 4-port RAID board
to a pair of 2-port boards (preferably on separate busses) so that even
if a board fails, the array is still being managed.
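The RAID-5 cost maus is worried about is the classic small-write penalty:
updating one block means reading back the old data and old parity before
writing both anew, versus two plain writes on a mirror. A sketch using the
standard textbook I/O counts (not measurements from any particular
controller):

```python
def ios_per_small_write(level):
    """Disk I/Os for one small (sub-stripe) write.

    raid1:  write both mirror copies                   -> 2
    raid5:  read old data, read old parity,
            write new data, write new parity           -> 4
    raid10: write both copies in one mirror set        -> 2
    """
    return {"raid1": 2, "raid5": 4, "raid10": 2}[level]

# RAID 5 does twice the I/O of a mirror per small write, which is the
# bottleneck maus expects with grex's many small files.
assert ios_per_small_write("raid5") == 2 * ios_per_small_write("raid1")
```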

cross | response 45 of 124 | Dec 17 03:52 UTC 2006
I like that configuration; Maus, do you have an offhand idea of how
much it would cost?
Regarding RAID-5: part of the idea of a hardware RAID controller is
that it handles all of that by itself and thus is ``fast.'' I concur
it won't be as fast as a simple mirror (due to the extra parity reads
and writes on every update), but that should be masked somewhat by
the controller.

maus | response 46 of 124 | Dec 17 04:23 UTC 2006
If Aruba gives it the green light, I will call around to vendors I know
and ask for an estimate.
Regarding RAID-5, the work of computing parity on every read and write
is offloaded, so it is less work for the host's processor, but it is
still not fast, and in our case probably not needed.
One small nitpick: it's maus, not Maus.

cross | response 47 of 124 | Dec 17 04:41 UTC 2006
Sorry!
I agree that RAID-5 is not needed, but I'm surprised the controller doesn't
do a faster job of it.

nharmon | response 48 of 124 | Dec 17 14:34 UTC 2006
Most controllers do, but my experience is limited to Adaptec SCSI
controllers. We use RAID 5 on systems that are very disk intensive with
no noticeable delays. But RAID 1 will work.

aruba | response 49 of 124 | Dec 17 15:06 UTC 2006
I asked the staff to look in on this item to give an opinion on maus's
proposed setup. I don't have the technical expertise to say how elaborate
a system Grex could use.