Grex Oldcoop Conference

Item 380: Cyberspace Communications finances for November 2006

Entered by aruba on Sat Dec 2 03:04:32 2006:

Here is the treasurer's report on Cyberspace Communications, Inc. finances 
through November 30th, 2006.

Beginning Balance     $6,022.39

Credits                 $150.00         Member contributions
                          $1.18         Interest on our savings account
                   ------------
                        $151.18

Debits                  $100.00         Provide Net colocation (thru 12/22/06)
                         $48.98         Phone Bill
                         $29.90         Renewal of grex.org and cyberspace.org
                          $3.63         Paypal fees (income = $90)
                   ------------
                        $182.51

Ending Balance        $5,991.06

Our current balance breaks down as follows:

$5,814.69               General Fund
  $176.37               Silly Hat Fund

The money is distributed like this:

$4,076.75   Checking account
$1,914.31   Savings account earning 0.75% interest annually

We had one new member (easlern) in November. We are currently at 58 
members, 47 of whom are paid through at least December 15th.  (The others 
expired recently and are in a grace period.)

Notes:

- We renewed both domain names through the beginning of 2008.

Thanks to everyone who contributed in November:

arabella, easlern, keesan, and witling.

If you or your institution would like to become a member of Grex, it 
only costs $6/month or $60/year.  Send money to:

Cyberspace Communications
P. O. Box 4432
Ann Arbor, MI 48106-4432

If you pay by cash or money order, please include a photocopy of some 
form of ID.  We can't add you to the rolls without ID.  (If you pay 
with a personal check that has your name pre-printed on it, we 
consider that a good enough ID.)  Type !support or see 
http://www.cyberspace.org/member.html for more info.
124 responses total.

#1 of 124 by jep on Wed Dec 6 00:47:43 2006:

At $182.51 per month, Grex has over 32 months of expenses secured.  I
suppose there are other expenses as well, so that may not be exact,
but it seems to me that if Grex stopped taking in money entirely, it
would still be securely financed at the current level for at least 2 years.

In the event of a major expense, such as a new computer, Grex users will
no doubt step forward, as they have in the past, to contribute the
needed money.  That means there's no need for Grex to have a pile of
money like it does.

Grex has the funds to expand into new services or areas, increase its
level of service, or reduce its required membership contributions.

How about exploring some ways to use some of Grex's money?


#2 of 124 by mcnally on Wed Dec 6 00:52:51 2006:

 Or perhaps Grex could lower its membership dues and consider tailoring
 a special membership program to users in developing countries who 
 probably cannot afford $60/year but who might take a more active role
 on Grex if they were more engaged.


#3 of 124 by keesan on Wed Dec 6 02:13:41 2006:

The grex membership is steadily shrinking and I think we should hold onto the
reserve cash.  


#4 of 124 by slynne on Wed Dec 6 03:54:55 2006:

I think that considering a membership scheme that makes it easier for
people in other countries to become members doesn't necessarily have to
cost us a lot of money. 


#5 of 124 by jep on Wed Dec 6 16:28:31 2006:

I don't have much interest in sending in more membership money at
present.  Grex doesn't need the money.  We're not using it for anything
and not planning on using it for anything.  If $60 has got to sit in a
bank account, it might as well be my bank account.


#6 of 124 by keesan on Wed Dec 6 18:03:10 2006:

We are using the money to pay monthly expenses, including an internet
connection that lets you access grex.


#7 of 124 by cross on Wed Dec 6 19:22:37 2006:

...and jep's point is, that with the money in the bank, there's no need to
send in more money for another couple of years.


#8 of 124 by tod on Wed Dec 6 20:15:30 2006:

re #7
I'd totally disagree with that presumption.  One of the board's primary
responsibilities is fiduciary and as such should always strive to bring in
some funding.  Inflation and whatever else could easily make the current
reserves insufficient.


#9 of 124 by keesan on Wed Dec 6 21:29:15 2006:

If provide.net dumps us costs could go way up.


#10 of 124 by nharmon on Wed Dec 6 21:34:17 2006:

At which point, most of us will throw in some bucks. Now, if Grex wants 
to start using that money to carry out its mission statement instead of 
eating it all up in operating expenses....then JUST DO IT (tm).

But it bothers me that Grex says they need funds with no plan on how to 
spend them. Tell 'ya what, publish a budget, a plan of how you would 
spend money you don't have yet. If you have a vision, you may find more 
people willing to part with their hard earned scratch.


#11 of 124 by aruba on Wed Dec 6 22:13:22 2006:

Grex's financial situation is pretty stable at the moment, thanks to the
fact that we decreased our expenses a lot by moving into colocation two
years ago.  That's a good thing!  Before that our bank account was steadily
declining.

I agree that we don't need a cushion as big as we have right now, though I am
happy we have it.  And I agree we should talk about ways to use some of our
money to improve the infrastructure.


#12 of 124 by denise on Wed Dec 6 23:45:33 2006:

Maybe if we reduced the price of a membership, more people would become
members.


#13 of 124 by cross on Wed Dec 6 23:54:22 2006:

I could think of a few ways:

Buy a hardware RAID controller and some more disk space.  Revamp grex's
storage solution.

Buy a rackmount case and put grex in a rack instead of in a large
tower-style case.  That might further reduce costs by lowering the physical
footprint at the colo facility.

Upgrade the grex computer by getting a new processor, RAM, and motherboard.
Put ECC memory and a faster processor onto a server-class motherboard that
can handle serial BIOS consoles.  That would eliminate the need for a ``pc
weasel'' card that is continually talked about and never bought.

Pay janc to fix the outstanding bugs in fronttalk and replace the
ever-buggier picospan.  Or buy a YAPP license.

With the exception of the last item, these are roughly in decreasing order
of cost.


#14 of 124 by aruba on Thu Dec 7 05:18:56 2006:

(We tried to buy a PC-Weasel, but it seems to be impossible to get one now.
So another solution is warranted.)

Those are good ideas.  I don't think reducing our footprint will affect the
price we're paying at the colo facility - right now we're in the attic, and
it's a rather informal situation.  And we're getting a good deal.  If we had
to leave and find a new home, then it would be an advantage to be small.

How much would a server-class motherboard and processor cost us?  What else
would we need to buy?  I'm assuming we could move the disks over as they
are.


#15 of 124 by cross on Thu Dec 7 23:14:38 2006:

Regarding #14: At the time grex switched to the current hardware, I
championed getting server-grade components, but it didn't happen.  But,
they're not significantly more expensive than the current commodity
hardware.  I'd estimate that a new motherboard might be $300-$400.  A new
processor might run a couple hundred.  A good 3U or 4U case might run
$400-$500.  A couple of gigabytes of ECC RAM might be similarly priced (or
even cheaper...).  A hardware SCSI RAID controller might be upwards of $600,
and new disks could run a grand or so.  I'd champion replacing grex's
existing disks with new, larger capacity drives that are all the same size
and can thus be mirrored more easily.  I'm not sure how much 4 x 72GB drives
would cost off the top of my head.

All in all, I'd say allocating $2000 to new hardware wouldn't be a bad
investment at all.  Grex went cheap on the current hardware and that has
cost us: I can remember some things - like hardware RAID - being shot down
because they were ``too expensive'' and then grex being down for extended
periods due to disk failures.  Similarly, a rackmount case was shot down
because it was deemed unnecessary since grex was still in the pumpkin, not
colocation.


#16 of 124 by gelinas on Fri Dec 8 01:41:38 2006:

(Just a note: that "attic" is our host's server space.  The last time I
was up there, they were using less than half the available floor space.)


#17 of 124 by aruba on Mon Dec 11 15:49:31 2006:

This response has been erased.



#18 of 124 by aruba on Tue Dec 12 05:47:52 2006:

Dan - I guess I'm not convinced that Grex's users will see much benefit from
that $2000 investment.  Grex's hardware had a couple of glitches (which cost a
lot less than $2000 to fix), but it's been pretty stable lately.  So
convince me that we'll see $2000 worth of improvement if we spend that much
money.


#19 of 124 by mcnally on Tue Dec 12 06:24:22 2006:

 > Grex's hardware had a couple glitches (which cost a lot less than $2000
 > to fix)

 Actually, we had months of almost daily downtime, and we *still* have
 periodic problems with user home directory partitions and (much more
 frequently) /var/spool/mail filling up.  


#20 of 124 by spooked on Tue Dec 12 06:30:54 2006:

Mike and Dan are accurate in their arguments and comments.

However, buying better hardware will NOT fix the problems because good 
system administration is more about active monitoring, tailoring, and 
anticipating problems --- none of these 3 are currently sufficiently met 
by the Grex staff.

I know this may sound harsh, but it is spot on.  There really needs to be 
change in Grex staff, its culture in particular, and processes.



#21 of 124 by aruba on Tue Dec 12 13:52:30 2006:

Re #19: I agree that when we were having memory problems, that was ugly, and
if throwing money at the problem would have fixed it, it would have been a
good thing.  But we haven't had that problem for the last year, since STeve
pulled the bad memory chip.  So I think it's a moot point.

I suppose we could buy a bigger disk and alleviate the mail spool problem
for a while.  But it would just fill up again, right?  So I'm not convinced
money can solve that problem.


#22 of 124 by cross on Tue Dec 12 14:30:36 2006:

There have been downtime periods of greater than a week on grex, largely due
to hardware and (more frequently) software failures.  How much does that cost
grex in terms of opportunity costs?  How much does it cost the staff people
who have to turn around and fix those problems?

Sure, in a direct, apples-to-apples comparison you won't see $2000 of benefit
for a $2000 investment, but that's the wrong metric.  Instead, judge it based
on how much money is *saved* from things like reduced staff time commitment,
improved reliability, etc.  Would the mailbox partition fill up if staff could
have devoted more time several months ago (when staff *had* time) to tweaking
the mail system rather than figuring out why grex was crashing all the time?
What was the cost to Steve for nursing a sick grex back to health in terms
of time away from his job, his family, etc?  Is that worth $2000?


#23 of 124 by keesan on Tue Dec 12 14:46:31 2006:

How long would it take to write some program that deletes any mailbox which
has not been accessed for a month after the account was opened?  


#24 of 124 by nharmon on Tue Dec 12 14:57:44 2006:

This response has been erased.



#25 of 124 by nharmon on Tue Dec 12 14:58:12 2006:

About 10 minutes.
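
[Editor's aside: keesan's cleanup in #23 really is only a few lines.  This
is a rough illustrative sketch, not a script Grex actually ran; the spool
path and the atime-based "one month" policy are assumptions.]

```python
import os
import time

SPOOL = "/var/spool/mail"    # assumed spool location
CUTOFF = 30 * 24 * 3600      # "a month", in seconds

def stale_mailboxes(spool=SPOOL, cutoff=CUTOFF, now=None):
    """List mailbox files whose last access time is older than cutoff."""
    now = time.time() if now is None else now
    stale = []
    for name in os.listdir(spool):
        path = os.path.join(spool, name)
        if os.path.isfile(path) and now - os.stat(path).st_atime > cutoff:
            stale.append(path)
    return stale

# Actually removing them is then a one-liner:
#   for path in stale_mailboxes(): os.remove(path)
```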


#26 of 124 by easlern on Tue Dec 12 15:43:06 2006:

Two cents from the peanut gallery: seems like we could prevent downtime like
we've had recently by policing accounts better. It's hard to see what benefit
there would be in giving anonymous accounts a more powerful system to beat
on the ISP with. 


#27 of 124 by aruba on Wed Dec 13 14:34:27 2006:

Re #22: Dan, I'm just not convinced that throwing money at the problem is
going to help at all.  I'm not an expert on hardware, but I know that it's
not literally true that we "went cheap" when we bought the current machine.
The total initial cost of the current machine was $2,201, more than you're
proposing to spend.

It seems to me that there is some probability that any piece of hardware
will go bad, in any time interval.  Grex pushes hardware pretty hard, so it
doesn't surprise me a whole lot that we lost a disk and a memory chip in the
3.5 years the machine has been running (2 years since it's been online).


#28 of 124 by cross on Wed Dec 13 14:53:55 2006:

Well, consider the disk failure for instance: yes, you're absolutely right
that hardware components tend to fail over time, and there's not much that
can be done to prevent it.  These things just wear out after a while.  But,
if grex had invested in a hardware RAID solution, then losing a disk
wouldn't have necessarily brought the entire machine down.  And repairing
the problem would have been about as easy as taking a spare to the colo and
yanking out the old disk and plugging in the new one.  The hardware would
take care of the rest.  This isn't magic; hot-swappable hardware RAID
controllers aren't hard to come by.  And it would have prevented a week of
downtime.  And it wouldn't have required Steve or anyone else to spend hours
and hours at the colo facility.  And of course, had we used ECC memory like
was discussed ad nauseam before buying the current hardware, the memory chip
just wouldn't have been an issue: it would have told us it was bad (the
memory hardware would have told the operating system that would have logged
a message) and it could have been replaced without a tremendous amount of
downtime (if, indeed, that was the problem at all), or people going back and
forth to the colo facility to run diagnostics, etc.  What's more, it
wouldn't have taken down the machine.  Is that worth it?  You tell me.

As for the cost of the current grex hardware....  Remember that the Sun 4
that it replaced cost somewhere on the order of $100,000 when new.  $2,201
is pretty cheap compared to that.

I guess I don't understand why you think that this is just ``throwing money
at the problem.''  Well, I'm not going to try and convince you.  If you
don't think it's worth it, then you don't think it's worth it.  But, I just
consider it making wise investments.


#29 of 124 by slynne on Wed Dec 13 20:58:36 2006:

I work with hot swappable hard drives on servers and I have to admit
that I really do like them. Our set up has three harddrives and we can
lose one without having *any* downtime. Fixing it is pretty easy too. We
ship a hard drive to the retail location where we have someone who is
almost completely computer illiterate install it. It is pretty cool. 


#30 of 124 by drew on Thu Dec 14 05:01:14 2006:

    I was told by someone in the IT industry that RAID is only worthwhile
if your downtime costs are measured in dollars per minute. Nonetheless I
recommend installing hardware RAID anyway, for reasons given by others
in this item.

    It also occurs to me that with a RAID system, producing an offsite
backup should consist mainly of pulling out one of the redundant hard
drives to take offsite, and putting an empty in its place. Much faster
and easier than babysitting a tape drive.


#31 of 124 by cross on Thu Dec 14 05:13:13 2006:

(I'm not sure that last paragraph follows - in particular, if you do, say,
RAID 5, one disk won't necessarily give you complete information in a backup.)


#32 of 124 by mcnally on Thu Dec 14 09:00:35 2006:

 That's a terrible way to back up a RAID array, even one that's just
 basic disk mirroring.


#33 of 124 by nharmon on Thu Dec 14 16:13:18 2006:

Does Grex even need a backup system, let alone an offsite backup? It
seems to me that all Grex needs is some sort of "Recovery Kit", or a
collection of software for Grex that can be put on DVD and distributed
to staffers or maybe even given away as free OSS (assuming we used OSS).

User home directories should be the responsibility of end users. We
could recruit tech-savvy users to assist other people in backing up their
own data.


#34 of 124 by mcnally on Thu Dec 14 17:38:23 2006:

 Well, I still remember when STeve deleted all the mail on the 
 /var/spool/mail partition, so I'm inclined to think that Grex
 ought to have a backup system.  It'd also be kind of a bummer
 if all the data in the conferencing system disappeared tomorrow
 and couldn't be restored..

 Users probably *should* back up their important data offsite,
 but that process will certainly tax Grex's bandwidth if more
 than a few people start to do that frequently.


#35 of 124 by cross on Thu Dec 14 17:48:05 2006:

Email really ought to be delivered into the user's home directory, not a
separate partition.  Then the mail spool area could be reallocated to more
user space.  Backups of all a user's data would be pretty easy (just tar
up one directory instead of one directory and another file that the user
might not even know about).  I suspect sufficiently few people use grex
seriously enough that backups on an individual basis would really tax the
system's bandwidth.

Since I'm the politically incorrect firebrand right now anyway, I'll say that
the loss of /var/spool/mail was just due to poor planning.  It's interesting
to note that grex's disks were repartitioned without any consensus.
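
[Editor's aside: the "just tar up one directory" backup described above
could be sketched like this; paths and the gzip choice are assumptions,
not Grex's actual procedure.]

```python
import os
import tarfile

def backup_home(home_dir, out_path):
    """Archive an entire home directory (mail file included, once mail is
    delivered there) into a single compressed tarball."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(home_dir, arcname=os.path.basename(home_dir))
    return out_path
```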


#36 of 124 by aruba on Fri Dec 15 13:47:20 2006:

How much would it cost to add a hardware RAID system to our current machine?


#37 of 124 by maus on Fri Dec 15 14:29:53 2006:

resp:36 It depends on a few things. Are we talking about adding a
two-drive mirror set or a RAID that spans many drives? Do we already own
the drives? Will we use SCSI or Serial ATA or IDE? Do we need hot-plug
capabilities? Do we want it to be battery backed so it can finish
commits to discs even if the system loses power in the middle of a
commit? Does it need to support a hot spare? Will the drives be in the
server's chassis or do we also need a shelf/enclosure for the drives? 

I'll try to get you a few quotes over the next few days once I have an
idea of what you need. 


#38 of 124 by nharmon on Fri Dec 15 14:34:30 2006:

Can we price out a 3TB fiberchannel SAN? :-)


#39 of 124 by aruba on Sat Dec 16 19:05:02 2006:

Re #37: I don't know the answer to those questions, except that we currently
have a lot of SCSI disk.  I want to say 3 x 18 gig, plus one more rebuilt
disk sitting on my desk.


#40 of 124 by maus on Sat Dec 16 20:39:40 2006:

We could get comparable performance and significant capacity increase by
doing the following: 

Array0: 
4-port Serial ATA 3Ware Escalade RAID board
 port 0: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
 port 1: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
 port 2: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
 port 3: 200GByte Serial ATA drive (possibly Seagate or Maxtor)

port0 + port1 as a RAID-Mirror
port2 + port3 as hot-spares

This could sustain the failure of *ANY* two drives, as long as they do
not fail simultaneously. Serial ATA is hot-pluggable, provided the
drives are in a cage with proper connectors (to assure that logic or
data is not asserted while power is not on -- achieved by varying pin
length to have power use the longest pins in the connectors). 

Equipment proposed: 
===================
RAID Controller:   3Ware 9550SX-4LP
Drive Enclosure:   3Ware RDC- 400
Drive Cables:      3Ware Cables for 9590SE, 9550SX and 3ware Sidecar
High-Speed Drives: 4 x Seagate Barracuda ES Hard Drives


Would this setup do us for a while? If so, let me know and I will try to
get us quotes on this stack of kit. I will say that I have been
consistently pleased with 3Ware's kit, and Seagate has always been good
so long as I remember. OpenBSD 3.9 recognizes the 3Ware Escalade
automagically, and can use the array hanging off of it as a single SCSI
drive/LUN. 

Also, just so you know, setting up the array on the 3Ware is easy,
provided you have console access to the server before the "boot>"
prompt. 


#41 of 124 by maus on Sat Dec 16 20:40:45 2006:

Re resp:38

That is totally inappropriate and a waste of time to look up. We could
not afford it and do not need it, so in short the answer to your
question is "eat me!". 


#42 of 124 by nharmon on Sat Dec 16 22:50:04 2006:

re 41: You're pretty quick to flame people who use emoticons to express
their sarcasm.


#43 of 124 by nharmon on Sat Dec 16 22:56:57 2006:

And why would you mirror two of the drives and leave two as hot spares?
With that same hardware you could set up a RAID 5 array with three of
the drives and still have a hot spare. With RAID 5 you would have double
the amount of space versus what you proposed.


#44 of 124 by maus on Sat Dec 16 23:09:43 2006:

Sorry, that's fair. Looking back, my mind skipped over the smiley, so I
completely missed the humor. I apologize for my unkind response. 

The reason for my proposed setup is several-fold: 

 - Grex server is maintained by volunteers, and they may not be able to
do a truck-roll when the first drive fails, and this setup allows time
to respond or even save up for the beginning of the month before the
array is even considered degraded

 - The design as I spec'ed it would nearly quadruple the available
space. In my (not humble) opinion, this would last a substantial amount
of time 

 - RAID-5 (striping with distributed parity, single-drive redundancy) is
expensive in terms of both read and write access times. With the large
number of files, especially the large number of small files that Grex
server accretes, this could translate into an I/O bottleneck during times
of heavy load. If we need more space than I proposed, I would recommend
that we get larger drives or set up a RAID 1 + 0 (a stripe of mirror
sets). 

I will say that my suggestion to use two hot spares may be overly
cautious, and that we could step down to one hot spare without serious
risk to the system. If we are seriously concerned about availability, I
would recommend the two hot spares, and step from one 4-port RAID board
to a pair of 2-port boards (preferably on separate buses) so that even
if a board fails, the array is still being managed. 
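
[Editor's aside: the capacity arithmetic behind the mirror-vs-RAID-5
exchange, sketched with illustrative numbers.  200 GB drives as spec'ed in
the thread; formatting overhead and real-world rounding are ignored.]

```python
DRIVE_GB = 200   # per-drive capacity, as spec'ed in the proposal

def usable_capacity(level, drives, spares=0):
    """Usable space in GB for a few common RAID levels (simplified)."""
    active = drives - spares
    if level == "raid1":       # mirror: half the active drives hold copies
        return DRIVE_GB * active // 2
    if level == "raid5":       # one drive's worth of space goes to parity
        return DRIVE_GB * (active - 1)
    if level == "raid10":      # stripe of mirror pairs
        return DRIVE_GB * active // 2
    raise ValueError("unknown RAID level: " + level)

# maus's proposal:    mirror pair + 2 hot spares -> 200 GB usable
# nharmon's counter:  3-drive RAID 5 + 1 spare   -> 400 GB usable
```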


#45 of 124 by cross on Sun Dec 17 03:52:06 2006:

I like that configuration; Maus, do you have an offhand idea of how
much it would cost?

Regarding RAID-5: part of the idea of a hardware RAID controller is
that it handles all of that by itself and thus is ``fast.''  I concur
it won't be as fast as simple mirror (due to the parity calculations
being spread across drives), but that should be masked somewhat by
the controller.


#46 of 124 by maus on Sun Dec 17 04:23:03 2006:

If Aruba gives it the green light, I will call around to vendors I know and
ask for an estimate. 

Regarding RAID-5, the work of making the parity comparisons on every
read and write is offloaded, so it is less work for the host's
processor, but still it is not fast, and in our case, probably not
needed. 

One small nitpick, it's maus, not Maus. 


#47 of 124 by cross on Sun Dec 17 04:41:28 2006:

Sorry!

I agree that RAID-5 is not needed, but I'm surprised the controller doesn't
do a faster job of it.


#48 of 124 by nharmon on Sun Dec 17 14:34:31 2006:

Most controllers do, but my experience is limited to Adaptec SCSI
controllers. We use RAID 5 on systems that are very disk intensive with
no noticeable delays. But RAID 1 will work.


#49 of 124 by aruba on Sun Dec 17 15:06:21 2006:

I asked the staff to look in on this item to give an opinion on maus's
proposed setup.  I don't have the technical expertise to say how elaborate
a system Grex could use.


#50 of 124 by maus on Sun Dec 17 16:58:19 2006:

Aruba, thanks for sending this to them, and for your confidence in my
solution. If they like it, I'll start gathering quotes from places like
Altex, Cyberguys and a couple of others (I don't have my catalogs with
me at work, so I can't remember the names of the other good vendors). 


#51 of 124 by maus on Sun Dec 17 17:02:47 2006:

Cross and Nharmon,

About RAID 5 performance, I may be mistaken. I regularly work with the 2
port RAID boards for simple mirror sets (pretty much every day), but
have not used a RAID 5 array since back in the day of the Escalade 7000
series (IDE), and I hope that write performance has improved
substantially since then. What I typically do for size+redundancy is two
drives per controller as a mirror set, and then either make a stripe of
the mirror sets or concatenate them using LVM in RHEL or ccd in OpenBSD.
There may be better solutions, this is just how we've been doing it
here. I appreciate what I can learn from each of you guys. 


#52 of 124 by nharmon on Sun Dec 17 17:55:59 2006:

I have never implemented RAID 5 with SATA, only mirroring like maus
described. However, my experience with SCSI RAID 5 is that it performs
well when the number of drives in the array are low (say, 3-8 drives).
When our needs exceeded that, we've implemented RAID 50, which is
striping across multiple RAID 5 arrays. Most RAID controllers aren't
capable of this, and the ones that are cost too much for Grex.



#53 of 124 by maus on Sun Dec 17 21:00:34 2006:

Sorry for the dumb question, but what is "Silly Hat Fund"? 


#54 of 124 by aruba on Sun Dec 17 22:56:43 2006:

At a board meeting long ago, there was a discussion of fundraising, and 
whether we should collect money for particular goals (like buying a hard 
drive, for instance) or just put everything in the general fund.  Having 
concrete goals sometimes helps to get donations, but, someone pointed out, 
it's important to have money that is available to be spent on anything, 
"even silly hats".  So we agreed to set up funds and raise money for them 
as appropriate, but keep most of our money in the general fund.

After the meeting, Peter Riley (nestene) gave me $5 and said, "This is 
for the Silly Hat Fund".  So I dutifully created said partition of Grex's 
money, and it has been on the treasurer's report ever since.  Lots of 
people have donated to it over the years, and occasionally when the Grex 
walkers accumulated a little more cash than necessary to pay our lunch 
tab, the excess has gone into the Silly Hat Fund.

No one's come up with a good use for it, yet.  Whatever we spend it on, it 
should be something frivolous.


#55 of 124 by steve on Mon Dec 18 05:12:32 2006:

   Back when we were looking at new hardware for Grex, the goal was to
get as good a set of hardware for as little money as we could.  Had
we had infinite money I'd likely have gotten different things.  The
Antec tower and PS are known for their reliability; since we weren't in
a rackmount situation we could avoid possible heat problems by having
a large case, and also we could stuff a lot of disks in that case.
I didn't get ECC memory because the last memory failure I have seen
on any machine I've worked on was in 1999.

   I've been dealing with raid stuff at work and have gotten an
ARCO IDE RAID 1 card.  This is a hardware RAID system: it's invisible
to the operating system, appearing as a single IDE disk which
connects to the IDE controller.  This kind of RAID card gets
completely around drivers, etc.  I would think that's the way to go
for Grex.  The simpler we can make things the better off we are.


#56 of 124 by steve on Mon Dec 18 05:15:05 2006:

   One thing to remember about Grex is that having multiple spindles
running is a good thing.  A huge raid 5 array could hold everything,
but at the cost of losing multiple disks.  I'm not sure what the
best tradeoff is now, but I suspect SATA disks are the way to go,
since they're working out well in terms of reliability, and are
fast.


#57 of 124 by maus on Tue Dec 19 16:24:41 2006:

This is not an official quote, just a first approximation based on
prices listed on one vendor's web-page. I figure it should at least give
us an order of magnitude approximation that we can use for sanity
checks. Substituting Maxtor drives instead of Seagate would change the
grand total by less than $10, and neither brand seems to be sold out at
any of the vendors I've dealt with, so the brand of drives comes down to
preferences of staff. 

PROVANTAGE: 

1x 3ware 9550SX-4 - 4-pt PCI-X Low-profile SATA II RAID Card with Cables
1x RDC-400 SATA RAID Drive Cage 
4x Barracuda ES 250GB SATA NL35 7200RPM 16MB Cache 3Gb/s 8.5ms  

US$ Subtotal:   $ 827.70 


==========================

The vendors I will be contacting are (provided board and staff want me
to go ahead and get bids): 

Provantage
synnex.com
Bell Micro
CyberGuys
Altex
The Delcom Group


#58 of 124 by maus on Tue Dec 19 16:31:58 2006:

After more reading, it appears that the hit we would take on computing
parity and writing parity would be offset by parallelizing seeking. Does
anyone have numbers for this? Is a RAID 5 with one hot spare
sufficiently reliable for us? 

Also, I realized I made a big assumption. Does our chassis have physical
space for a drive cage that is the height of 3 CDRom drives and 5.25"
wide? 


#59 of 124 by maus on Tue Dec 19 16:40:34 2006:

Just wondering, since this has a decidedly technical side to it and is a
discussion of hardware for Grexserver, should this item be moved/linked
into Agorage conference? 


#60 of 124 by remmers on Tue Dec 19 17:17:27 2006:

Probably the Garage conference, which is specifically for public 
discussions of Grex technical issues.  I think the fairwitness there
is janc, so he'd be the person to link it.


#61 of 124 by maus on Tue Dec 19 17:25:38 2006:

I have a rough draft of the RFQ ready for staff and board to look over.
It has a couple of spaces that need to be filled in (the contact person
and the due date). Comments, criticisms, thoughts, etc are encouraged. 

==========================================================================


Introduction: 
Cyberspace Communications, Inc. (Grex) is a nonprofit organization
dedicated to the advancement of public education and scientific endeavor
through interaction with computers, and humans via computers, using
computer conferencing. Further purposes include the exchange of
scientific and technical information about the various aspects of
computer science, such as operating systems, computer networks, and
computer programming. Cyberspace Communications is investigating
increasing their conferencing server's capacity, capabilities and
reliability by adding a new drive array. 

Instructions:
Please provide an anticipated and a maximum quote. Specifications are
firm, and bidders are not to deviate from the specification without
prior written approval from Cyberspace Communications. Quantities are
firm. No under-runs. Cyberspace Communications will retain units over
the specified quantity and will not pay for over-runs.

Award Criteria: 
Consideration will be given to price, warranty terms, responsibility,
delivery time, delivery terms and experience in performance of similar
deals.

Contact For Specification:
********@cyberspace.org 

Alternates/Substitutes:
Alternate bids of substantially the same quality, style and features are
invited. In order to receive full consideration, such alternate bids
must be accompanied by sufficient descriptive literature and/or
specifications to clearly identify the offer and allow for a complete
evaluation.

Acceptance of Bids:
Any bid may be rejected as non-responsive in the judgment of Cyberspace
Communications should any of the following occur: Material alteration or
erasure of the RFP/RFQ documents; Failure to submit required bid
guaranty (when required); Failure to furnish requested pricing or other
information; Submission of a late bid. In addition, bids may be rejected
for any other justifiable reason including, but not limited to, failure
to perform on previous contracts with Cyberspace Communications. 

Withdrawal of Bids:
Bids may be withdrawn or modified upon written request from the properly
identified bidder, prior to the date and hour scheduled for the closing
of bids.

Receipt of Quotes: 
Responses to this RFQ should be emailed to ********@cyberspace.org by
********** at noon, Eastern Time. 





Item    Quantity        Description
0       1               3Ware 9550SX-4LP: PCI-X 4 port Serial ATA RAID
                        controller board
1       1               3Ware RDC-400: Serial ATA RAID Drive Cage
2       4               Seagate Barracuda ES 200 GByte Serial ATA Drives
                        *OR*
2       4               Maxtor DiamondMAX 10 200 GByte Serial ATA Drives
3       4               3Ware Cables for 9590SE, 9550SX and 3ware Sidecar





Earliest Delivery Date (after receipt of order): ____________________

Shipping Charges: INCLUDED_______EXTRA_______ COST________

Method of Shipping: _________________________________________________


#62 of 124 by mcnally on Tue Dec 19 18:16:32 2006:

 Is the "SATA RAID Drive Cage" an external SATA enclosure, or does
 "drive cage" in this parlance mean a sled that fits into another
 enclosure?  In other words, will these drives be in a separate 
 box or inside the main system unit?  Either way, what other cabling
 is necessary?


#63 of 124 by maus on Tue Dec 19 18:58:32 2006:

It is a metal box that sits inside the front of the server chassis, in 
the same place where one would normally put internal CDRom drives or 
tape drives. It has four slots in it, and you slide the drives (on 
trays) into those slots. You can see the photograph for it at 
http://3ware.com/products/ata.asp. 


#64 of 124 by mary on Tue Dec 19 20:53:40 2006:

Do we want to invest any money in non-rack mounted hardware?

If we ever wanted to move to another location we'd pretty much
need to fit in standard rack space.


#65 of 124 by nharmon on Tue Dec 19 21:28:08 2006:

This is an internal drive array, Mary.


#66 of 124 by keesan on Tue Dec 19 21:42:19 2006:

I think we should fix the mail problem before investing in more hardware, and
put back newuser.  Why spend money on grex if it is going to evaporate?


#67 of 124 by mcnally on Tue Dec 19 22:09:28 2006:

 re #65:  If it's built to fit into a tower case it's almost certainly
 not going to fit into a rack-mount chassis.


#68 of 124 by maus on Wed Dec 20 00:41:57 2006:

If the rack case can accommodate 3 CDRom drives (all the ones 3U or
taller that I have seen can), then this will fit just fine. Technically,
we could get by with mounting the drives directly in drive slots in the
chassis, but then we would lose hot-plug capabilities. It will fit into
any case that has three adjacent externally accessible 5.25 inch drive
bays (so a space 4.8 inches by 5.25 inches approximately). 



#69 of 124 by maus on Wed Dec 20 00:52:53 2006:

resp:66 Keesan, I think the storage issue is orthogonal to the spam
issue, and in some ways, I think the additional space would help, since
if we moved /var and our equivalent of /home onto the new drives, they
would not fill up as easily, thereby preventing the potential DOS from
overflowing /var. 


#70 of 124 by keesan on Wed Dec 20 01:39:59 2006:

Why not just prevent the spam from coming in instead of finding a place to
store it?  Or at least get rid of all unused accounts, to start with.


#71 of 124 by mcnally on Wed Dec 20 02:08:57 2006:

  re #70:  
  >  Why not just prevent the spam from coming in instead of finding
  >  a place to store it?

  We've answered this question time and time and time again -- why should
  you expect us to answer it again just because you don't like the answers?


#72 of 124 by nharmon on Wed Dec 20 03:40:36 2006:

> If it's built to fit into a tower case it's almost certainly not going 
> to fit into a rack-mount chassis.

If we moved to a rack mountable chassis, we would have to get one that
had enough 5.25" bays for the drive array. I have no experience with
putting PC hardware into a rack mountable chassis, so I would defer to
maus's expertise.


#73 of 124 by maus on Wed Dec 20 06:19:58 2006:

Most 3U and 4U chassis will accommodate this drive cage. If we want to
skip the cage and just put the drives directly into the server chassis
in the internal 3.5 inch slots, we can, but I recommend against doing
so, since we would lose hot-plug capabilities, and because the cage is
designed with the thermal characteristics of four drives in mind and
mitigates or dissipates the heat generated by running the four drives.
If we can give up hot-plug and if we want to do some air-flow
engineering of our own then go ahead and skip the cage, just make sure
that thermal damage does not void the warranty of the drives. 


#74 of 124 by maus on Wed Dec 20 06:26:04 2006:

Keesan, I agree that something needs to be done about the spam, but a
total 100% solution is probably not feasible, and the problem of spam
does not negate the fact that we need reliable, capacious storage. 

If negating spam is your top priority, perhaps you could donate
something like
http://www.barracudanetworks.com/ns/products/spam_overview.php for Grex
to use. 


#75 of 124 by spooked on Wed Dec 20 12:02:29 2006:

*MANY* solutions exist today, before-your-eyes on the Internet which could 
easily catch 95%+ of incoming spam into the Grex mail server.  They could 
be implemented quickly by any staff member with half-a-degree of 
intelligence.

Unfortunately, Grex is so backward and naive (did I mention 
anti-progressionary?) that it (in particular its staff) will find any 
excuse not to move forward from its ancient software and system 
architecture base.

So glad I resigned from those cronies.





#76 of 124 by nharmon on Wed Dec 20 13:01:54 2006:

You don't seem glad.


#77 of 124 by spooked on Wed Dec 20 13:11:28 2006:

*giggles* Thanks for the light amusement :)



#78 of 124 by mary on Wed Dec 20 13:22:27 2006:

The reason I bring up the rack-mount issue is I believe we'll
someday need to fit into the smallest space possible at some
other location than Provide.  When we moved from the Pumpkin,
to Provide, we were very lucky Provide had the space and inclination
to allow our hardware to occupy a footprint outside of their racks.
Every other affordable ISP I contacted wanted us in a rack and charged
for service based on the amount of rack space (and bandwidth) used. 

I would really like to see space considerations made part of any
hardware decisions we make at this point.  So, thanks for all 
the information on this.


#79 of 124 by ric on Wed Dec 20 13:34:51 2006:

I don't know of any spam fighting systems that are easily implemented that
actually eliminate 95% of spam without also blocking desired email.

Even greylisting, and using all sorts of DNS blacklists, does *NOT* reduce my
spam intake by 95% on my server, and I even use some more aggressive DNS
blacklists like spamcop.


#80 of 124 by nharmon on Wed Dec 20 14:00:13 2006:

I understand and appreciate the need to keep our physical footprint as
small as possible. If we needed to put Grex into a rack mountable case
right now, we would need one that was at least 3U to accommodate the PC
components that were used to build Grex (Rack space is measured in U's,
with each U being about 1.75 inches).

http://www.directron.com/ra349c00300w.html

This is a 3U rack chassis that would accommodate Grex's present
motherboard and cards as well as two of the drive cages that maus is
proposing.

If we wanted to venture into 2U or 1U territory we would be looking at a
complete system repurchase, and we might even have to get 2.5" (read:
laptop) hard drives in the case of a 1U solution. And laptop hard drives
are NOT cheap, nor as reliable, nor as spacious, as 3.5" drives.


#81 of 124 by maus on Wed Dec 20 20:22:42 2006:

I like that chassis you showed. And anything smaller than 3U would
require reengineering. Laptop hard drives are not mandatory for a 1U
chassis, but without them you would be limited to two or three normal-sized drives,
which means putting all of our eggs into one basket (in performance as
well as redundancy). If we only had 2U of space, what we could do is put
just our system drives into the host and data drives into a separate
drive shelf. 

An alternate solution might be to find out if our ISP offers a managed 
SAN option. In this case, we would simply pay the monthly fee instead of
amortizing out the cost of installing this storage equipment
ourselves. At worst, we would have to buy a gigabit NIC and an initiator
programme (though I have heard that the initiator programme from NetBSD
can be ported to OpenBSD with little work). 


#82 of 124 by aruba on Wed Dec 20 23:15:39 2006:

maus - thanks a lot for your work on this.  Your RFQ looks very 
professional.  I just want to make sure it doesn't commit us to anything, 
if we get a bid. Going forward with a RAID array will require some time to 
get the board and staff on board, so I don't want you to be annoyed if we 
get a quote and then sit on it for a while.  I hope we'll discuss it in 
depth soon, but the process of agreeing on what we want and then the 
logistics of the changeover may be much more elaborate than the actual 
purchase.

Here is Grex's current case:
  http://www.antec.com/pdf/drawings/PLUS1080AMG.pdf
It has room for 8 5.25 drives.  I tend to agree with Mary, though, that we
should think in terms of rack mounting in the future.


#83 of 124 by cross on Thu Dec 21 00:07:51 2006:

I think it's reasonable to look at 3U as a lower bound on space required.

Regarding #51; The thing about ccd or the like is that you can't boot off of
it.  I'd be less worried about a controller going bad and more worried about
having a good hot-swappable disk system.

Regarding #55; I agree, we need to make things as simple as possible.  I
further agree that an SATA RAID solution really looks promising for grex.

Regarding #56; RAID-5 *does* have multiple spindles, but they're all required
for reads and writes.  Something like RAID 0+1 would be a better fit for grex,
I think.

With respect to spam and newuser ...  You need a decent foundation to build
off of.


#84 of 124 by maus on Thu Dec 21 00:36:01 2006:

I would like to have one of our Legal Weasels read over the RFQ and make
sure it does not obligate us to anything. I think we should also wait
before sending it to the vendors until we get a commit from the board
that we will start the selection process as soon as the deadline clicks,
and time it so that the deadline is something like a week before a BOD
meeting so that we could decide on it fairly quickly. We should probably
have a stanza in there that says something to the effect of "we will
notify vendors within *** days of our decision".

Should we have a standardized worksheet for RFQs so that in the future
if we need gear over a certain dollar amount (maybe arbitrarily over
400$ or something), we can just fill in a few blanks and email it off? 

Also, who would be the recipient both for bids and for
questions/clarifications?


#85 of 124 by maus on Thu Dec 21 00:40:02 2006:

Cross, I know, ccd is only for a data volume, such as /var/www or
something that needs to be big without needing super performance. If you
need big and performance, RAID 1+0 is your friend. 

I still think we should use the new RAID for our equivalent of /home and
/var and have the system on the existing SCSI drives (maybe using
RAIDFrame (the software RAID) to mirror two system drives). 
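A minimal RAIDframe configuration for that kind of two-drive mirror
might look like the sketch below. This is an illustration only: the
device names (sd0e/sd1e) and layout numbers are assumptions, not Grex's
actual disk layout, so check raidctl(8) for the version we run before
using anything like it.

```
# raid0.conf -- RAID 1 mirror across two system drives
# (sketch; device names are hypothetical)
START array
# numRow numCol numSpare
1 2 0

START disks
/dev/sd0e
/dev/sd1e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1

START queue
fifo 100
```

Bringing it up would then be roughly "raidctl -C raid0.conf raid0",
"raidctl -I <some serial> raid0", and "raidctl -iv raid0" to initialize
the mirror, per the raidctl(8) man page.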


#86 of 124 by keesan on Thu Dec 21 02:11:04 2006:

Ric, what percentage of spam do you eliminate?  Remmers, when do you expect
to have something set up for people to use who are averse to copying over two
files and changing the login name and want a script to do it for them?


#87 of 124 by cross on Thu Dec 21 03:02:30 2006:

Regarding #85; Personally, I'd like to see the entire system on hot-swappable
media: both the user data, and the operating system.  We had an occasion
where the root filesystem got lost once, and grex was down for at least
several days.  If that filesystem had been RAIDed, we could have avoided that
downtime.  I don't believe you can boot from RAIDframe, either, which implies
that the root filesystem cannot be as redundant as we'd (perhaps) like.

I'm in favor of moving to a rack mount case with the storage system you
proposed, and disposing of the SCSI disks.  Perhaps selling them and the SCSI
controller would be a way to offset the cost --- at least partially --- of
getting this new hardware.


#88 of 124 by maus on Thu Dec 21 04:59:16 2006:

If I remember correctly, the way you do it is to make every filesystem
except / on software RAID and make an identical copy of / on the first
slice of the second drive, so no, it is not fault tolerant live, but if
you cannot boot from the normal /, then you just issue your boot command
to bring you up on the alternate copy of /. Things could have changed,
though, since I have not done RAIDframe-based RAID in a while, mostly
relying on 3ware boards for mirroring Serial ATA or IDE drives, and LSI
or Adaptec boards for mirroring SCSI drives. 


#89 of 124 by cross on Thu Dec 21 14:12:42 2006:

That is what you're supposed to do, but then you have to have some mechanism
for mirroring the root filesystem over to the spare partition; that gets
ugly after a bit.  What's more, if one of the disks running the root
filesystem goes down, you still have to manually reboot.  A hardware RAID
solution is better in the sense that this is handled for you automatically;
if one of the disks holding / dies, you just throw in another hot-swappable
disk and go on your merry way.  Sure, one can approximate this using our
existing SCSI disks and RAIDframe and mirroring the root filesystem, but
why bother?


#90 of 124 by maus on Thu Dec 21 14:50:22 2006:

I recommended keeping our existing SCSI infrastructure to capitalize on
sunk costs and because by keeping the system separate from the data, we
decrease a bottle-neck. I agree, with good, inexpensive hardware RAID,
RAIDframe kind of blows in comparison. 


#91 of 124 by cross on Thu Dec 21 15:50:47 2006:

You decrease a bottleneck, but at what cost?  Then you have the associated
maintenance costs if the root disk fails, which is what we're trying to avoid.
I'd say that the goal of an increase in performance at the expense of added
(or, unchanged) administrative burden is the opposite of what we're trying
to achieve (or, what we *should* be trying to achieve).  As for the sunk
costs....  Well, grex could do several things with the existing SCSI disks and
controller.  (1) Put up a satellite machine a la gryps to offload some of the
processing from the main machine.  For instance, a basic spam/virus blocker
for mail before it gets to the `main' grex machine, or running proxy servers
for web and/or DNS, having a serial port plugged into the serial console of
grex itself, etc.  (2) Sell them and use the proceeds to offset the new
hardware costs.  (3) I'm sure there are others.

The other factor is that I *really* don't believe that grex gets enough usage
to worry about bottlenecks through the I/O controller right now.


#92 of 124 by maus on Thu Dec 21 16:43:15 2006:

That's a fair assessment. Consider my mumblings about the bottleneck the
ramblings of a weary mind, and please ignore said mumblings.

So, silly question: if we are thinking about moving the system-space
onto a new disc subsystem, does this mean a fresh, new installation? Can
we use the opportunity to request new commands to be added and to
implement new controls and move to standards from odd Grex-isms? 



#93 of 124 by cross on Thu Dec 21 17:01:34 2006:

No, not at all; I think it's good to be challenged and be asked to justify
one's conclusions.  I thank you for that.

I think you can always request the installation of additional software.  And
yes, I *do* think it would mean a new installation of the basic system.  But,
that might not be a bad thing.  Any opportunity to move to standard commands
from weird customizations is a plus, in my opinion.


#94 of 124 by maus on Thu Dec 21 17:15:28 2006:

I agree that moving to standards would be a good thing, provided nothing
is broken in the process (if a command that users or staff depend on is
broken by the move to standardize, then the standardization is crap; if
no-one is hurt and we make the system easier to maintain and easier to
upgrade and actually match what the man pages and web-pages say, then we
have done a good thing by standardizing and deserve doughnuts). 

I didn't have specific commands or software in mind (or, at least,
nothing appropriate for this system), but I figured that if we were
facing a fresh installation, this would be the time to ask people what
commands they would like to see on here, and also see if there are
commands that users would want to see upgraded or replaced. 


#95 of 124 by maus on Thu Dec 21 17:29:57 2006:

On the new commands front, I was just looking through /usr/local/bin and
noticed javac and java and jar. I thought the port of native java to
OBSD was still a couple of years away. Did we build this by way of RHEL
emulation or something else entirely? Have we published somewhere how we
managed it?


#96 of 124 by remmers on Thu Dec 21 17:44:34 2006:

Hmm... I seem to recall that I saw the Java stuff sitting in either the 
OpenBSD ports or packages collection a few months ago and installed it.  
Didn't do any testing or anything (I'm not a Java person), so whether it 
all works is another issue.

Oh, I remember now.  Have a look at /usr/ports/lang/kaffe.


#97 of 124 by cross on Thu Dec 21 18:11:26 2006:

Regarding #94; That can be relative.  For instance, on the Sun4, staff
depended on a custom command to edit password information, because the
password stuff was so hacked.  But, the standard commands are better;
we made a net gain by leaving behind the old stuff we *had* depended on
and moving to a newer system.  Someone definitely deserved a Krispy
Kreme on that one.


#98 of 124 by maus on Thu Dec 21 19:17:25 2006:

Krispy Kreme? Ewwww!!!! Give me a nice Shipley's or a Dunkin' Donuts any
day. 

giggle


#99 of 124 by cross on Thu Dec 21 19:27:06 2006:

Blasphemy!


#100 of 124 by cmcgee on Thu Dec 21 19:43:17 2006:

Colleen hands out hot fresh Krispy Kremes to Cross, and warm, sugary Dunkin
Donuts to maus.

Any other staff members wanting special donuts should post here.  If donuts
were all it took to keep staff motivated, I'd take on the whole job myself.


#101 of 124 by gelinas on Fri Dec 22 02:00:36 2006:

Someone, somewhere, mentioned vipw not working.  Today, while not thinking of
anything important (or maybe it was yesterday; they all run together any
more), I realised why vipw is a bad idea on grex:  newuser is updating the
password file fairly often.  Since "last write wins," it's possible to
_really_ screw things up with two things modifying the password file at the
same time.

I'm hoping that "moduser" will either lock the password file or be agile
enough not to interfere with newuser.


#102 of 124 by cross on Fri Dec 22 02:05:43 2006:

vipw didn't work on old grex; it's all right on grex now; it does proper
locking of the passwd database as well rebuilding the DB files after changes.
If newuser is conflicting with it, then that's because newuser is broken.


#103 of 124 by remmers on Fri Dec 22 18:27:20 2006:

A cursory glance at the newuser source code indicates that it's using 
pw_mkdb(), pw_lock(), and pw_abort() to do its password file updating, so 
I suspect things are probably ok.


#104 of 124 by cross on Fri Dec 22 18:29:20 2006:

If it's calling those routines, then you're right; it's doing the same locking
that vipw is doing (essentially) and therefore vipw and newuser will play
nicely with each other (as will useradd, moduser, etc).


#105 of 124 by spooked on Sat Dec 23 09:50:19 2006:

For those hardware-inspired among you, please lend me your advice in item 
30 of the Garage conference.



#106 of 124 by maus on Fri Jan 5 22:04:52 2007:

Just for another thought, do we want to investigate something like
http://www.lacie.com/products/product.htm?pid=10876 as a NAS solution
instead of putting drives into the server itself? It looks fairly
inexpensive, and is very pretty. A good gigabit NIC that is known to
work well with OpenBSD is dirt cheap (under 50$ most of the time) and we
could grow storage space as needed by adding additional units. Lacie has
a pretty good reputation, especially amongst Apple people. We would also
probably want to see if Lacie guarantees the drives inside their device.
According to what I have seen, it looks like this would provide us
full-speed RAID 5 + hot spare if we wanted it.

Do we want to think about this or stay with the idea of drives inside
the server? 


#107 of 124 by nharmon on Fri Jan 5 22:13:19 2007:

I think drives inside a server would be best in order to minimize our
physical footprint in case we ever need to move to a rack.


#108 of 124 by scg on Sat Jan 6 08:22:21 2007:

Hi.  I'm doing my very occasional look through Grex, and saw this.

I'm a network person, not a systems person, so I'm way over my head when
talking about specific systems components.  That said, I do manage a lot
of "critical infrastructure" type services on a low budget with a small
staff, so I spend a lot of time thinking about how to keep services
reliable while also cheap and easy to operate.  And, for those of you
who are new in the last few years and don't know me, I'm a former Grex
staff and board member.

I'm a big fan of systems where the answer to a problem is to turn off
the malfunctioning component, and the users don't notice.  For that
reason, I like that hardware RAID systems and the like are being
discussed.  For partitions with lots of dynamic data that needs to stay
up to date, like the conferences, RAID or some equivalent is absolutely
the right way to go.

I also like that this is being thought about now, at a time when I
gather Grex has been relatively stable.  "If it ain't broke, don't fix
it" and "the number one cause of network outages is network engineers"
are both appropriate rules to keep in mind, but if you've got something
that looks really likely to fall apart it is easier to fix it before it
becomes an emergency.

However, there are a few other things about this discussion that worry
me.  Doing piecemeal upgrades to several year old hardware seems like a
good way to run into unexpected incompatibilities.  Using internal RAID
enclosures with the idea of moving them to as yet unspecified new
hardware seems like a big loss of flexibility.

If I were specing this out, and if it could fit within the budget,
here's what I would probably do:

Get a networked disk array, such as the one maus talks about in #106
(rack mountable, consuming as few rack units as possible), and put all
the dynamic data on it.  Then get a couple of 1U servers, standard and
self contained, with serial consoles and ideally some sort of "lights
out manager" thing.  Put the static non-changing stuff on the internal
hard drives, and set them up as clones of each other.  Add in a cheap
Ebay-purchased console server to manage it.  If the applications support
it, run the two 1U servers side by side, accessing the same data off the
RAID array and sharing the load; otherwise, keep one as a hot spare.

If one of the disks in the RAID array fails, pull and replace it.  If
one of the servers fails, turn it off and run on the other one.

I suspect you could fit this all in six U or so of rack space, which is
still not huge.

That said, I'd also question somewhat whether Grex should still be in
the hardware business.  It might be worth looking at some of the
"dedicated server hosting" companies, and see how what they're charging
to rent a server that includes colo space, network connectivity, and
hands on hardware support, compares to what you're paying ProvideNet.


#109 of 124 by mary on Sat Jan 6 12:33:20 2007:

Steve, nice to see you here and thanks for taking the time to jump in to 
the discussion.

I'm especially interested in hearing any comments on your last paragraph.  
For a while now I've been thinking about ways to get off of our own 
hardware.


#110 of 124 by nharmon on Sat Jan 6 13:08:31 2007:

It has been a while since I have priced it out but if I remember
correctly, ISP-provided servers are either very expensive, or
virtualized. So I'm not sure if it would be workable.


#111 of 124 by maus on Sun Jan 7 00:40:12 2007:

Working for a hosting provider (and having worked for others and knowing
the workings of lots of them), I would definitely say be cautious.
Hosting providers are notorious for using cheap (often recycled) gear,
and virtually every service incurs a nontrivial charge. Most of these
places have minimally skilled staff that try to make everything that
happens the direct result of something the customer did (even
catastrophic hardware failure) so that it is not their problem and is
something that they can charge an arm and a leg to fix.  Additionally,
most hosting providers will not run phone lines for you. 

Just as a single datapoint, consider Layered Technologies (one of the
larger players in the hosting business): 

 - Celeron 2 GHz (their lowest spec chassis)
 - 2 GBytes non-ECC Reg RAM
 - 2 x 200 GByte Serial ATA hard drive
 - Serial ATA RAID controller (set up in a mirror set)
 - OpenBSD
 - No control panel
 - a /29 netblock
 - 10 Mbit/sec uplink
 - 1500 GBytes total network throughput per month
 
This would run about 214$ per month


I do think that scg has some good insight, and his cluster spec is
sound. 


#112 of 124 by maus on Sun Jan 7 00:44:49 2007:

resp:108

scg,

Just curious how you handle high-availability. I have looked over a few
solutions, though I have only ever implemented them in a lab setting, so
I am not sure how they would behave in the wild. I particularly like: 

 - HSRP/CARP with synchronization data sent over a dedicated separate LAN
 - Round-robin NAT with IP-takeover (and using IPMI/LOM to "shoot the
   other guy in the head")

I am sure there are much better solutions, and would love to learn them.


#113 of 124 by scg on Sun Jan 7 02:03:08 2007:

My favorite dedicated server hoster is ServePath
(http://www.servepath.com).  I designed their network a few years ago,
and they treat me very well.  My view that they're doing things right
isn't unbiased.  I've also heard good things about RackSpace and
Affinity (http://www.valueweb.com), but I've never actually dealt with
them.

I also wouldn't be so quick to dismiss server virtualization, if you
find a hosting provider you like who is doing that.  True, you only get
a fraction of the server's processing power, but the servers will likely
be a lot more powerful than what Grex is running on now.  From what I
hear (no direct experience yet), it's hard to tell the difference from a
distance between a real server and a virtual server, and a hosting
provider will probably be far more motivated to avoid or deal quickly
with hardware problems taking down lots of customers' virtual servers
than with issues affecting only a single customer.

How to handle load balancing and failover depends on what you're doing.
 Simplest is round robin DNS with a low TTL, perhaps accompanied by a
nanny script that watches to see if one of the servers goes away and
removes its DNS entry.  That's what used to be done in the old days
before all those fancy load balancing boxes were invented, and is more
or less what some of the fancy load balancing boxes do.

(What I do at work is to scatter the servers around the world and source
BGP announcements from the servers (google for "anycast"), but that's
not very well suited for what Grex is doing.)


#114 of 124 by maus on Sun Jan 7 02:17:20 2007:

I need to amend my previous statement. I had forgotten about RackSpace;
they have a good rep. I don't know ServePath or Affinity. 


#115 of 124 by pfv on Sun Jan 7 20:24:03 2007:

I guess I fail to see what all this "powerful" gains anyone.


#116 of 124 by maus on Wed Feb 14 04:21:32 2007:

While sitting in a hollowed out baseboard and chewing on some wires, I
found an Ultra320 SCSI RAID board in what appears to be new condition.
Do we want me to send it in so that when we reinstall, we can mirror the
root volume on high-speed SCSI drives? 


#117 of 124 by steve on Sun Feb 18 04:20:28 2007:

   Do you have documentation for it?  Is it hardware only?


#118 of 124 by maus on Sun Feb 18 05:26:23 2007:

It is pure hardware RAID. I would have to check the make and model
(documentation should be on the mfc's webpage). 


#119 of 124 by maus on Sun Feb 18 06:01:29 2007:

http://www.adaptec.com/en-US/products/scsi_tech/value/ASR-2230SLP/

Mirroring / and /usr on SCSI and /home and /var on Serial ATA would make
a very nice, well-split-up, performant, capacious system. 


#120 of 124 by steve on Sun Feb 18 07:28:47 2007:

   Hardware raid is definitely what we want.  I will look at this.


#121 of 124 by maus on Fri Mar 30 02:32:01 2007:

Just curious, I have started seeing 10K RPM Serial ATA drives. Does the
increased rotational speed noticeably improve reading/writing of data?
Is the increase in data access speed a direct function of the rotational
velocity of the center spindle? Presuming it does, is this a real
bottleneck that we would face, or do 7200 RPM drives get to the data fast
enough that choke-points would be elsewhere in the system? I guess my
real question is "would we get benefit enough from 10K RPM drives to
justify the higher cost versus 7200 RPM drives?". 


#122 of 124 by nharmon on Fri Mar 30 11:20:00 2007:

Yes, 10k RPM drives have higher I/O performance than slower spinning
drives. They also tend to have a lower capacity and are more expensive.
The rule of thumb I usually use to calculate I/O performance is:

                         RPM/100 = iops

That is, RPMs divided by 100 gives you I/Os per second. Of course, I
mainly deal with fiber channel drives so this may be way off. Your
arrangement is as important as your individual disk performance too. A
RAID 10 array is much faster than a RAID 5 array, but sacrifices a lot
of storage space.


#123 of 124 by maus on Fri Mar 30 15:45:21 2007:

Thanks for the rule of thumb and for confirming what I suspected about
RAID 1+0 vs RAID 5 performance (where I worked, we did not do RAID 5
except on rare occasion, and when we did, they didn't trust the grunts
to set it up or maintain it, so I usually only saw RAID 1, RAID 1+0 or
LVM/concatenated over multiple RAID 1 sets). 


#124 of 124 by ric on Sat May 5 03:51:22 2007:

I've heard that these "perpendicular" drives at 7200 RPM are actually the
fastest for most situations.


There are no more items selected.

You have several choices: