Grex > Oldcoop > #380: Cyberspace Communications finances for November 2006

cross
response 13 of 124:
Dec 6 23:54 UTC 2006
I could think of a few ways:
- Buy a hardware RAID controller and some more disk space. Revamp grex's
  storage solution.
- Buy a rackmount case and put grex in a rack instead of in a large
  tower-style case. That might further reduce costs by lowering the physical
  footprint at the colo facility.
- Upgrade the grex computer by getting a new processor, RAM, and motherboard.
  Put ECC memory and a faster processor onto a server-class motherboard that
  can handle serial BIOS consoles. That would eliminate the need for a ``PC
  Weasel'' card that is continually talked about and never bought.
- Pay janc to fix the outstanding bugs in fronttalk and replace the
  ever-buggier picospan. Or buy a YAPP license.
With the exception of the last item, these are roughly in decreasing order
of cost.

aruba
response 14 of 124:
Dec 7 05:18 UTC 2006
(We tried to buy a PC-Weasel, but it seems to be impossible to get one now.
So another solution is warranted.)
Those are good ideas. I don't think reducing our footprint will affect the
price we're paying at the colo facility - right now we're in the attic, and
it's a rather informal situation. And we're getting a good deal. If we had
to leave and find a new home, then it would be an advantage to be small.
How much would a server-class motherboard and processor cost us? What else
would we need to buy? I'm assuming we could move the disks over as they
are.

cross
response 15 of 124:
Dec 7 23:14 UTC 2006
Regarding #14: At the time grex switched to the current hardware, I
championed getting server-grade components, but it didn't happen. They're
not significantly more expensive than the current commodity
hardware. I'd estimate that a new motherboard might be $300-$400. A new
processor might run a couple hundred. A good 3U or 4U case might run
$400-$500. A couple of gigabytes of ECC RAM might be similarly priced (or
even cheaper...). A hardware SCSI RAID controller might be upwards of $600,
and new disks could run a grand or so. I'd champion replacing grex's
existing disks with new, larger capacity drives that are all the same size
and can thus be mirrored more easily. I'm not sure how much 4 x 72GB drives
would cost off the top of my head.
All in all, I'd say allocating $2000 to new hardware wouldn't be a bad
investment at all. Grex went cheap on the current hardware and that has
cost us: I can remember some things - like hardware RAID - being shot down
because they were ``too expensive'' and then grex being down for extended
periods due to disk failures. Similarly, a rackmount case was shot down
because it was deemed unnecessary since grex was still in the pumpkin, not
colocation.

gelinas
response 16 of 124:
Dec 8 01:41 UTC 2006
(Just a note: that "attic" is our host's server space. The last time I
was up there, they were using less than half the available floor space.)

aruba
response 17 of 124:
Dec 11 15:49 UTC 2006
This response has been erased.

aruba
response 18 of 124:
Dec 12 05:47 UTC 2006
Dan - I guess I'm not convinced that Grex's users will see much benefit from
that $2000 investment. Grex's hardware had a couple glitches (which cost a
lot less than $2000 to fix), but it's been pretty stable lately. So
convince me that we'll see $2000 worth of improvement if we spend that much
money.

mcnally
response 19 of 124:
Dec 12 06:24 UTC 2006
> Grex's hardware had a couple glitches (which cost a lot less than $2000
> to fix)
Actually, we had months of almost daily downtime, and we *still* have
periodic problems with user home directory partitions and (much more
frequently) /var/spool/mail filling up.
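The recurring full-partition problem could at least be caught before it bites; a minimal sketch of that kind of check (the path and the 90% threshold are illustrative, not anything grex actually runs):

```python
import shutil

def partition_usage(path):
    """Return the fraction of the filesystem holding `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return used / total

# A cron job could warn staff before the spool fills completely, e.g.:
# if partition_usage("/var/spool/mail") > 0.90: mail a warning to staff
```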

spooked
response 20 of 124:
Dec 12 06:30 UTC 2006
Mike and Dan are accurate in their arguments and comments.
However, buying better hardware will NOT fix the problems, because good
system administration is more about active monitoring, tailoring, and
anticipating problems --- none of which the Grex staff currently does
sufficiently.
I know this may sound harsh, but it is spot on. There really needs to be a
change in Grex's staff, its culture in particular, and its processes.

aruba
response 21 of 124:
Dec 12 13:52 UTC 2006
Re #19: I agree that when we were having memory problems, that was ugly, and
if throwing money at the problem would have fixed it, it would have been a
good thing. But we haven't had that problem for the last year, since STeve
pulled the bad memory chip. So I think it's a moot point.
I suppose we could buy a bigger disk and alleviate the mail spool problem
for a while. But it would just fill up again, right? So I'm not convinced
money can solve that problem.

cross
response 22 of 124:
Dec 12 14:30 UTC 2006
There have been downtime periods of greater than a week on grex, largely due
to hardware and (more frequently) software failures. How much does that cost
grex in terms of opportunity costs? How much does it cost the staff people
who have to turn around and fix those problems?
Sure, in a direct, apples-to-apples comparison you won't see $2000 of benefit
for a $2000 investment, but that's the wrong metric. Instead, judge it based
on how much money is *saved* from things like reduced staff time commitment,
improved reliability, etc. Would the mailbox partition fill up if staff could
have devoted more time several months ago (when staff *had* time) to tweaking
the mail system rather than figuring out why grex was crashing all the time?
What was the cost to Steve for nursing a sick grex back to health in terms
of time away from his job, his family, etc? Is that worth $2000?

keesan
response 23 of 124:
Dec 12 14:46 UTC 2006
How long would it take to write some program that deletes any mailbox which
has not been accessed for a month after the account was opened?
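A rough sketch of the kind of program being described (the spool directory and the 30-day window are illustrative; it only lists candidates, leaving actual deletion to staff judgment):

```python
import os
import time

def stale_mailboxes(spool_dir, max_idle_days=30):
    """List mailbox files in spool_dir that haven't been read in max_idle_days."""
    cutoff = time.time() - max_idle_days * 86400
    stale = []
    for name in os.listdir(spool_dir):
        path = os.path.join(spool_dir, name)
        # st_atime is the last-access time; an untouched mailbox falls behind it
        if os.path.isfile(path) and os.stat(path).st_atime < cutoff:
            stale.append(path)
    return sorted(stale)
```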

nharmon
response 24 of 124:
Dec 12 14:57 UTC 2006
This response has been erased.

nharmon
response 25 of 124:
Dec 12 14:58 UTC 2006
About 10 minutes.

easlern
response 26 of 124:
Dec 12 15:43 UTC 2006
Two cents from the peanut gallery: it seems like we could prevent downtime
like we've seen recently by policing accounts better. It's hard to see what
benefit there would be in giving anonymous accounts a more powerful system
to beat on the ISP with.

aruba
response 27 of 124:
Dec 13 14:34 UTC 2006
Re #22: Dan, I'm just not convinced that throwing money at the problem is
going to help at all. I'm not an expert on hardware, but I know that it's
not literally true that we "went cheap" when we bought the current machine.
The total initial cost of the current machine was $2,201, more than you're
proposing to spend.
It seems to me that there is some probability that any piece of hardware
will go bad, in any time interval. Grex pushes hardware pretty hard, so it
doesn't surprise me a whole lot that we lost a disk and a memory chip in the
3.5 years the machine has been running (2 years since it's been online).

cross
response 28 of 124:
Dec 13 14:53 UTC 2006
Well, consider the disk failure for instance: yes, you're absolutely right
that hardware components tend to fail over time, and there's not much that
can be done to prevent it. These things just wear out after a while. But,
if grex had invested in a hardware RAID solution, then losing a disk
wouldn't have necessarily brought the entire machine down. And repairing
the problem would have been about as easy as taking a spare to the colo and
yanking out the old disk and plugging in the new one. The hardware would
take care of the rest. This isn't magic; hot-swappable hardware RAID
controllers aren't hard to come by. And it would have prevented a week of
downtime. And it wouldn't have required Steve or anyone else to spend hours
and hours at the colo facility. And of course, had we used ECC memory, as
was discussed ad nauseam before buying the current hardware, the memory chip
just wouldn't have been an issue: the memory hardware would have told the
operating system it was bad, the OS would have logged a message, and the chip
could have been replaced without a tremendous amount of downtime (if, indeed,
that was the problem at all), or people going back and forth to the colo
facility to run diagnostics, etc. What's more, it
wouldn't have taken down the machine. Is that worth it? You tell me.
As for the cost of the current grex hardware.... Remember that the Sun 4
that it replaced cost somewhere on the order of $100,000 when new. $2,201
is pretty cheap compared to that.
I guess I don't understand why you think that this is just ``throwing money
at the problem.'' Well, I'm not going to try and convince you. If you
don't think it's worth it, then you don't think it's worth it. But, I just
consider it making wise investments.

slynne
response 29 of 124:
Dec 13 20:58 UTC 2006
I work with hot swappable hard drives on servers and I have to admit
that I really do like them. Our setup has three hard drives and we can
lose one without having *any* downtime. Fixing it is pretty easy too. We
ship a hard drive to the retail location where we have someone who is
almost completely computer illiterate install it. It is pretty cool.

drew
response 30 of 124:
Dec 14 05:01 UTC 2006
I was told by someone in the IT industry that RAID is only worthwhile
if your downtime costs are measured in dollars per minute. Nonetheless I
recommend installing hardware RAID anyway, for reasons given by others
in this item.
It also occurs to me that with a RAID system, producing an offsite
backup should consist mainly of pulling out one of the redundant hard
drives to take offsite, and putting an empty in its place. Much faster
and easier than babysitting a tape drive.

cross
response 31 of 124:
Dec 14 05:13 UTC 2006
(I'm not sure that last paragraph follows - in particular, if you do, say,
RAID 5, one disk won't necessarily give you complete information in a backup.)

mcnally
response 32 of 124:
Dec 14 09:00 UTC 2006
That's a terrible way to back up a RAID array, even one that's just
basic disk mirroring.

nharmon
response 33 of 124:
Dec 14 16:13 UTC 2006
Does Grex even need a backup system, let alone an offsite backup? It
seems to me that all Grex needs is some sort of "Recovery Kit", or a
collection of software for Grex that can be put on DVD and distributed
to staffers or maybe even given away as free OSS (assuming we used OSS).
User home directories should be the responsibility of end users. We
could recruit tech-savvy users to assist other people in backing up their
own data.

mcnally
response 34 of 124:
Dec 14 17:38 UTC 2006
Well, I still remember when STeve deleted all the mail on the
/var/spool/mail partition, so I'm inclined to think that Grex
ought to have a backup system. It'd also be kind of a bummer
if all the data in the conferencing system disappeared tomorrow
and couldn't be restored.
Users probably *should* back up their important data offsite,
but that process will certainly tax Grex's bandwidth if more
than a few people start to do that frequently.

cross
response 35 of 124:
Dec 14 17:48 UTC 2006
Email really ought to be delivered into the user's home directory, not a
separate partition. Then the mail spool area could be reallocated to more
user space. Backups of all a user's data would be pretty easy (just tar
up one directory instead of one directory and another file that the user
might not even know about). I suspect few enough people use grex seriously
that backups on an individual basis wouldn't really tax the system's
bandwidth.
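A minimal sketch of the two-piece backup being described, under today's layout (all names and paths here are illustrative, not grex's actual ones); with mail delivered into the home directory the second add would simply go away:

```python
import os
import tarfile

def backup_user(home_dir, mail_spool, out_path):
    """Bundle one user's home directory and mail spool file into a tarball."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(home_dir, arcname=os.path.basename(home_dir))
        # The extra file the user "might not even know about":
        if os.path.exists(mail_spool):
            tar.add(mail_spool, arcname="mail-spool")
    return out_path
```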
Since I'm the politically incorrect firebrand right now anyway, I'll say that
the loss of /var/spool/mail was just due to poor planning. It's interesting
to note that grex's disks were repartitioned without any consensus.

aruba
response 36 of 124:
Dec 15 13:47 UTC 2006
How much would it cost to add a hardware RAID system to our current machine?

maus
response 37 of 124:
Dec 15 14:29 UTC 2006
resp:36 It depends on a few things. Are we talking about adding a
two-drive mirror set or a RAID that spans many drives? Do we already own
the drives? Will we use SCSI or Serial ATA or IDE? Do we need hot-plug
capabilities? Do we want it to be battery backed so it can finish
commits to disks even if the system loses power in the middle of a
commit? Does it need to support a hot spare? Will the drives be in the
server's chassis or do we also need a shelf/enclosure for the drives?
I'll try to get you a few quotes over the next few days once I have an
idea of what you need.