Here is the treasurer's report on Cyberspace Communications, Inc. finances
through November 30th, 2006.
Beginning Balance $6,022.39
Credits $150.00 Member contributions
$1.18 Interest on our savings account
------------
$151.18
Debits $100.00 Provide Net colocation (thru 12/22/06)
$48.98 Phone Bill
$29.90 Renewal of grex.org and cyberspace.org
$3.63 Paypal fees (income = $90)
------------
$182.51
Ending Balance $5,991.06
Our current balance breaks down as follows:
$5,814.69 General Fund
$176.37 Silly Hat Fund
The money is distributed like this:
$4,076.75 Checking account
$1,914.31 Savings account earning 0.75% interest annually
We had one new member (easlern) in November. We are currently at 58
members, 47 of whom are paid through at least December 15th. (The others
expired recently and are in a grace period.)
Notes:
- We renewed both domain names through the beginning of 2008.
Thanks to everyone who contributed in November:
arabella, easlern, keesan, and witling.
If you or your institution would like to become a member of Grex, it
only costs $6/month or $60/year. Send money to:
Cyberspace Communications
P. O. Box 4432
Ann Arbor, MI 48106-4432
If you pay by cash or money order, please include a photocopy of some
form of ID. We can't add you to the rolls without ID. (If you pay
with a personal check that has your name pre-printed on it, we
consider that a good enough ID.) Type !support or see
http://www.cyberspace.org/member.html for more info.
124 responses total.
At $182.51 per month, Grex has over 32 months of expenses secured. There are no doubt other expenses as well, so that figure may not be exact, but it seems to me that even if Grex stopped taking in money entirely, it would still be securely financed at the current level for at least two years. In the event of a major expense, such as a new computer, Grex users will no doubt step forward, as they have in the past, to contribute the needed money. That means there's no need for Grex to sit on a pile of money like it does. Grex has the funds to expand into new services or areas, increase its level of service, or reduce its required membership contributions. How about exploring some ways to use some of Grex's money?
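For what it's worth, the arithmetic can be sanity-checked with a quick sketch (the dollar figures are copied straight from the report above; treating November's debits as a typical month is the same assumption the post makes):

```python
# Sanity check of the treasurer's figures quoted above.
beginning = 6022.39
credits = 150.00 + 1.18                    # member contributions + interest
debits = 100.00 + 48.98 + 29.90 + 3.63     # colo, phone, domains, PayPal fees

ending = beginning + credits - debits
print(f"Ending balance: ${ending:,.2f}")   # matches the $5,991.06 reported

months_of_runway = ending / debits         # one month's debits as "typical"
print(f"Months of expenses covered: {months_of_runway:.1f}")
```

The runway comes out to roughly 32.8 months, consistent with the "over 32 months" claim.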
Or perhaps Grex could lower its membership dues and consider tailoring a special membership program to users in developing countries who probably cannot afford $60/year but who might take a more active role on Grex if they were more engaged.
The grex membership is steadily shrinking and I think we should hold onto the reserve cash.
I think a membership scheme that makes it easier for people in other countries to become members wouldn't necessarily have to cost us a lot of money.
I don't have much interest in sending in more membership money at present. Grex doesn't need the money. We're not using it for anything and not planning on using it for anything. If $60 has got to sit in a bank account, it might as well be my bank account.
We are using the money to pay monthly expenses, including an internet connection that lets you access grex.
...and jep's point is that, with the money in the bank, there's no need to send in more money for another couple of years.
re #7 I'd totally disagree with that presumption. One of the board's primary responsibilities is fiduciary, and as such it should always strive to bring in some funding. Inflation and whatever else could easily make the current reserves insufficient.
If provide.net dumps us, costs could go way up.
At which point, most of us will throw in some bucks. Now, if Grex wants to start using that money to carry out its mission statement instead of eating it all up in operating expenses....then JUST DO IT (tm). But it bothers me that Grex says they need funds with no plan on how to spend them. Tell 'ya what, publish a budget, a plan of how you would spend money you don't have yet. If you have a vision, you may find more people willing to part with their hard earned scratch.
Grex's financial situation is pretty stable at the moment, thanks to the fact that we decreased our expenses a lot by moving into colocation two years ago. That's a good thing! Before that our bank account was steadily declining. I agree that we don't need a cushion as big as we have right now, though I am happy we have it. And I agree we should talk about ways to use some of our money to improve the infrastructure.
Maybe if we reduced the price of a membership, more people would become members.
I could think of a few ways: Buy a hardware RAID controller and some more disk space. Revamp grex's storage solution. Buy a rackmount case and put grex in a rack instead of in a large tower-style case. That might further reduce costs by lowering the physical footprint at the colo facility. Upgrade the grex computer by getting a new processor, RAM, and motherboard. Put ECC memory and a faster processor onto a server-class motherboard that can handle serial BIOS consoles. That would eliminate the need for a ``pc weasel'' card that is continually talked about and never bought. Pay janc to fix the outstanding bugs in fronttalk and replace the ever-buggier picospan. Or buy a YAPP license. With the exception of the last item, these are roughly in decreasing order of cost.
(We tried to buy a PC-Weasel, but it seems to be impossible to get one now. So another solution is warranted.) Those are good ideas. I don't think reducing our footprint will affect the price we're paying at the colo facility - right now we're in the attic, and it's a rather informal situation. And we're getting a good deal. If we had to leave and find a new home, then it would be an advantage to be small. How much would a server-class motherboard and processor cost us? What else would we need to buy? I'm assuming we could move the disks over as they are.
Regarding #14: At the time grex switched to the current hardware, I championed getting server-grade components, but it didn't happen. But they're not significantly more expensive than the current commodity hardware. I'd estimate that a new motherboard might be $300-$400. A new processor might run a couple hundred. A good 3U or 4U case might run $400-$500. A couple of gigabytes of ECC RAM might be similarly priced (or even cheaper...). A hardware SCSI RAID controller might be upwards of $600, and new disks could run a grand or so. I'd champion replacing grex's existing disks with new, larger capacity drives that are all the same size and can thus be mirrored more easily. I'm not sure how much 4 x 72GB drives would cost off the top of my head. All in all, I'd say allocating $2000 to new hardware wouldn't be a bad investment at all. Grex went cheap on the current hardware and that has cost us: I can remember some things - like hardware RAID - being shot down because they were ``too expensive'' and then grex being down for extended periods due to disk failures. Similarly, a rackmount case was shot down because it was deemed unnecessary since grex was still in the pumpkin, not colocation.
(Just a note: that "attic" is our host's server space. The last time I was up there, they were using less than half the available floor space.)
This response has been erased.
Dan - I guess I'm not convinced that Grex's users will see much benefit from that $2000 investment. Grex's hardware had a couple glitches (which cost a lot less than $2000 to fix), but it's been pretty stable lately. So convince me that we'll see $2000 worth of improvement if we spend that much money.
> Grex's hardware had a couple glitches (which cost a lot less than $2000
> to fix)
Actually, we had months of almost daily downtime, and we *still* have periodic problems with user home directory partitions and (much more frequently) /var/spool/mail filling up.
Mike and Dan are accurate in their arguments and comments. However, buying better hardware will NOT fix the problems, because good system administration is more about active monitoring, tailoring, and anticipating problems --- and none of these three is currently sufficiently met by the Grex staff. I know this may sound harsh, but it is spot on. There really needs to be a change in the Grex staff, its culture in particular, and its processes.
Re #19: I agree that when we were having memory problems, that was ugly, and if throwing money at the problem would have fixed it, it would have been a good thing. But we haven't had that problem for the last year, since STeve pulled the bad memory chip. So I think it's a moot point. I suppose we could buy a bigger disk and alleviate the mail spool problem for a while. But it would just fill up again, right? So I'm not convinced money can solve that problem.
There have been downtime periods of greater than a week on grex, largely due to hardware (and more frequently) software failures. How much does that cost grex in terms of opportunity costs? How much does it cost the staff people who have to turn around and fix those problems? Sure, in a direct, apples-to-apples comparison you won't see $2000 of benefit for a $2000 investment, but that's the wrong metric. Instead, judge it based on how much money is *saved* from things like reduced staff time commitment, improved reliability, etc. Would the mailbox partition fill up if staff could have devoted more time several months ago (when staff *had* time) to tweaking the mail system rather than figuring out why grex was crashing all the time? What was the cost to Steve for nursing a sick grex back to health in terms of time away from his job, his family, etc? Is that worth $2000?
How long would it take to write some program that deletes any mailbox which has not been accessed for a month after the account was opened?
This response has been erased.
About 10 minutes.
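A sketch of what such a ten-minute program might look like (the spool path and the 30-day aging rule are assumptions, and this simplifies the original question to "any mailbox unread for a month"; a real version would need to lock the spool and be careful around freshly delivered mail):

```python
# Rough sketch of the mailbox cleanup discussed above: find mailboxes in
# the spool whose last access time (atime) is more than 30 days old.
import os
import time

THIRTY_DAYS = 30 * 24 * 3600

def stale_mailboxes(spool="/var/spool/mail", max_age=THIRTY_DAYS):
    """Return mailbox paths not read in the last max_age seconds."""
    cutoff = time.time() - max_age
    stale = []
    for name in sorted(os.listdir(spool)):
        path = os.path.join(spool, name)
        if os.path.isfile(path) and os.stat(path).st_atime < cutoff:
            stale.append(path)
    return stale

# To actually reclaim the space, something like:
#     for path in stale_mailboxes():
#         os.remove(path)
```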
Two cents from the peanut gallery: seems like we can prevent downtime like what's been recent by policing accounts better. It's hard to see what benefit there would be in giving anonymous accounts a more powerful system to beat on the ISP with.
Re #22: Dan, I'm just not convinced that throwing money at the problem is going to help at all. I'm not an expert on hardware, but I know that it's not literally true that we "went cheap" when we bought the current machine. The total initial cost of the current machine was $2,201, more than you're proposing to spend. It seems to me that there is some probability that any piece of hardware will go bad, in any time interval. Grex pushes hardware pretty hard, so it doesn't surprise me a whole lot that we lost a disk and a memory chip in the 3.5 years the machine has been running (2 years since it's been online).
Well, consider the disk failure, for instance: yes, you're absolutely right that hardware components tend to fail over time, and there's not much that can be done to prevent it. These things just wear out after a while. But if grex had invested in a hardware RAID solution, then losing a disk wouldn't necessarily have brought the entire machine down. Repairing the problem would have been about as easy as taking a spare to the colo, yanking out the old disk, and plugging in the new one. The hardware would take care of the rest. This isn't magic; hot-swappable hardware RAID controllers aren't hard to come by. And it would have prevented a week of downtime, and it wouldn't have required Steve or anyone else to spend hours and hours at the colo facility.

And of course, had we used ECC memory like was discussed ad nauseam before buying the current hardware, the memory chip just wouldn't have been an issue: it would have told us it was bad (the memory hardware would have told the operating system, which would have logged a message) and it could have been replaced without a tremendous amount of downtime (if, indeed, that was the problem at all), or people going back and forth to the colo facility to run diagnostics, etc. What's more, it wouldn't have taken down the machine. Is that worth it? You tell me.

As for the cost of the current grex hardware: remember that the Sun 4 it replaced cost somewhere on the order of $100,000 when new. $2,201 is pretty cheap compared to that. I guess I don't understand why you think that this is just ``throwing money at the problem.'' Well, I'm not going to try to convince you. If you don't think it's worth it, then you don't think it's worth it. But I just consider it making wise investments.
I work with hot-swappable hard drives on servers and I have to admit that I really do like them. Our setup has three hard drives and we can lose one without having *any* downtime. Fixing it is pretty easy too. We ship a hard drive to the retail location, where we have someone who is almost completely computer illiterate install it. It is pretty cool.
I was told by someone in the IT industry that RAID is only worthwhile
if your downtime costs are measured in dollars per minute. Nonetheless I
recommend installing hardware RAID anyway, for reasons given by others
in this item.
It also occurs to me that with a RAID system, producing an offsite
backup should consist mainly of pulling out one of the redundant hard
drives to take offsite, and putting an empty in its place. Much faster
and easier than babysitting a tape drive.
(I'm not sure that last paragraph follows - in particular, if you do, say, RAID 5, one disk won't necessarily give you complete information in a backup.)
That's a terrible way to back up a RAID array, even one that's just basic disk mirroring.
Does Grex even need a backup system, let alone an offsite backup? It seems to me that all Grex needs is some sort of "Recovery Kit", or a collection of software for Grex that can be put on DVD and distributed to staffers or maybe even given away as free OSS (assuming we used OSS). User home directories should be the responsibility of end users. We could recruit tech-savvy users to assist other people in backing up their own data.
Well, I still remember when STeve deleted all the mail on the /var/spool/mail partition, so I'm inclined to think that Grex ought to have a backup system. It'd also be kind of a bummer if all the data in the conferencing system disappeared tomorrow and couldn't be restored. Users probably *should* back up their important data offsite, but that process will certainly tax Grex's bandwidth if more than a few people start to do that frequently.
Email really ought to be delivered into the user's home directory, not a separate partition. Then the mail spool area could be reallocated to more user space. Backups of all a user's data would be pretty easy (just tar up one directory instead of one directory and another file that the user might not even know about). I suspect few enough people use grex seriously that backups on an individual basis wouldn't really tax the system's bandwidth. Since I'm the politically incorrect firebrand right now anyway, I'll say that the loss of /var/spool/mail was just due to poor planning. It's interesting to note that grex's disks were repartitioned without any consensus.
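As a rough illustration of how simple the "just tar up one directory" backup could be once mail lives in the home directory, a sketch (the paths and the helper name are hypothetical, not Grex's actual layout):

```python
# Sketch of a per-user backup: once mail is delivered under the home
# directory, one archive captures all of a user's data.
import os
import tarfile

def backup_user(username, home_root="/home", dest_dir="/var/backups"):
    """Write a gzipped tar of one user's home directory; return its path."""
    src = os.path.join(home_root, username)
    dest = os.path.join(dest_dir, username + ".tar.gz")
    with tarfile.open(dest, "w:gz") as tar:
        tar.add(src, arcname=username)   # store paths relative to the user
    return dest
```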
How much would it cost to add a hardware RAID system to our current machine?
resp:36 It depends on a few things. Are we talking about adding a two-drive mirror set or a RAID that spans many drives? Do we already own the drives? Will we use SCSI or Serial ATA or IDE? Do we need hot-plug capabilities? Do we want it to be battery backed so it can finish commits to discs even if the system loses power in the middle of a commit? Does it need to support a hot spare? Will the drives be in the server's chassis or do we also need a shelf/enclosure for the drives? I'll try to get you a few quotes over the next few days once I have an idea of what you need.
Can we price out a 3TB fiberchannel SAN? :-)
Re #37: I don't know the answer to those questions, except that we currently have a lot of SCSI disk. I want to say 3 x 18 gig, plus one more rebuilt disk sitting on my desk.
We could get comparable performance and a significant capacity increase by doing the following:

Array0: 4-port Serial ATA 3Ware Escalade RAID board
  port 0: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
  port 1: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
  port 2: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
  port 3: 200GByte Serial ATA drive (possibly Seagate or Maxtor)
  port0 + port1 as a RAID mirror
  port2 + port3 as hot spares

This could sustain the loss of *ANY* two drives, as long as they do not fail simultaneously. Serial ATA is hot-pluggable, provided the drives are in a cage with proper connectors (to assure that logic or data is not asserted while power is not on -- achieved by varying pin lengths so that power uses the longest pins in the connectors).

Equipment proposed:
===================
RAID Controller: 3Ware 9550SX-4LP
Drive Enclosure: 3Ware RDC-400
Drive Cables: 3Ware Cables for 9590SE, 9550SX and 3ware Sidecar
High-Speed Drives: 4 x Seagate Barracuda ES Hard Drives

Would this setup do us for a while? If so, let me know and I will try to get us quotes on this stack of kit. I will say that I have been consistently pleased with 3Ware's kit, and Seagate has always been good as long as I can remember. OpenBSD 3.9 recognizes the 3Ware Escalade automagically, and can use the array hanging off of it as a single SCSI drive/LUN. Also, just so you know, setting up the array on the 3Ware is easy, provided you have console access to the server before the "boot>" prompt.
Re resp:38 That is totally inappropriate and a waste of time to look up. We could not afford it and do not need it, so in short the answer to your question is "eat me!".
re 41: You're pretty quick to flame people who use emoticons to express their sarcasm.
And why would you mirror two of the drives and leave two as hot spares? With that same hardware you could set up a RAID 5 array with three of the drives and still have a hot spare. With RAID 5 you would have double the amount of space versus what you proposed.
Sorry, that's fair. Looking back, my mind skipped over the smiley, so I completely missed the humor. I apologize for my unkind response. The reasons for my proposed setup are several-fold:
- The Grex server is maintained by volunteers, who may not be able to do a truck-roll when the first drive fails; this setup allows time to respond, or even to save up until the beginning of the month, before the array is even considered degraded.
- The design as I spec'ed it would nearly quadruple the available space. In my (not humble) opinion, this would last a substantial amount of time.
- RAID 5 (striping with distributed parity, single-drive redundancy) is expensive in terms of both read and write access times. With the large number of files, especially the large number of small files, that the Grex server accretes, this could translate into an I/O bottleneck during times of heavy load.
If we need more space than I proposed, I would recommend that we get larger drives or set up a RAID 1+0 (a stripe of mirror sets). I will say that my suggestion to use two hot spares may be overly cautious, and that we could step down to one hot spare without serious risk to the system. If we are seriously concerned about availability, I would recommend the two hot spares, and stepping up from one 4-port RAID board to a pair of 2-port boards (preferably on separate busses) so that even if a board fails, the array is still being managed.
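For reference, the usable-capacity arithmetic behind the two layouts being debated, as a rough sketch (drive count and size come from the proposal above; the helper functions are illustrative, not any real tool):

```python
# Usable capacity for the two layouts discussed, with four 200 GB drives.

def mirror_capacity(drives, drive_gb, hot_spares=2):
    """RAID 1 mirror pairs plus hot spares: each pair yields one drive."""
    pairs = (drives - hot_spares) // 2
    return pairs * drive_gb

def raid5_capacity(drives, drive_gb, hot_spares=1):
    """RAID 5 across the active drives loses one drive's worth to parity."""
    active = drives - hot_spares
    return (active - 1) * drive_gb

print(mirror_capacity(4, 200))  # 200 GB usable (one mirror pair, two spares)
print(raid5_capacity(4, 200))   # 400 GB usable -- double, as #43 notes
```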
I like that configuration; Maus, do you have an offhand idea of how much it would cost? Regarding RAID-5: part of the idea of a hardware RAID controller is that it handles all of that by itself and thus is ``fast.'' I concur it won't be as fast as simple mirror (due to the parity calculations being spread across drives), but that should be masked somewhat by the controller.
If Aruba gives it the green light, I will call around to vendors I know and ask for an estimate. Regarding RAID 5: the work of making the parity computations on every read and write is offloaded, so it is less work for the host's processor, but it is still not fast, and in our case probably not needed. One small nitpick: it's maus, not Maus.
Sorry! I agree that RAID-5 is not needed, but I'm surprised the controller doesn't do a faster job of it.
Most controllers do, but my experience is limited to Adaptec SCSI controllers. We use RAID 5 on systems that are very disk intensive with no noticeable delays. But RAID 1 will work.
I asked the staff to look in on this item to give an opinion on maus's proposed setup. I don't have the technical expertise to say how elaborate a system Grex could use.
Aruba, thanks for sending this to them, and for your confidence in my solution. If they like it, I'll start gathering quotes from places like Altex, Cyberguys and a couple of others (I don't have my catalogs with me at work, so I can't remember the names of the other good vendors).
Cross and Nharmon, About RAID 5 performance, I may be mistaken. I regularly work with the 2 port RAID boards for simple mirror sets (pretty much every day), but have not used a RAID 5 array since back in the day of the Escalade 7000 series (IDE), and I hope that write performance has improved substantially since then. What I typically do for size+redundancy is two drives per controller as a mirror set, and then either make a stripe of the mirror sets or concatenate them using LVM in RHEL or ccd in OpenBSD. There may be better solutions, this is just how we've been doing it here. I appreciate what I can learn from each of you guys.
I have never implemented RAID 5 with SATA, only mirroring like maus described. However, my experience with SCSI RAID 5 is that it performs well when the number of drives in the array is low (say, 3-8 drives). When our needs exceeded that, we've implemented RAID 50, which is striping across multiple RAID 5 arrays. Most RAID controllers aren't capable of this, and the ones that are cost too much for Grex.
Sorry for the dumb question, but what is "Silly Hat Fund"?
At a board meeting long ago, there was a discussion of fundraising, and whether we should collect money for particular goals (like buying a hard drive, for instance) or just put everything in the general fund. Having concrete goals sometimes helps to get donations, but, someone pointed out, it's important to have money that is available to be spent on anything, "even silly hats". So we agreed to set up funds and raise money for them as appropriate, but keep most of our money in the general fund. After the meeting, Peter Riley (nestene) gave me $5 and said, "This is for the Silly Hat Fund". So I dutifully created said partition of Grex's money, and it has been on the treasurer's report ever since. Lots of people have donated to it over the years, and occasionally when the Grex walkers accumulated a little more cash than necessary to pay our lunch tab, the excess has gone into the Silly Hat Fund. No one's come up with a good use for it, yet. Whatever we spend it on, it should be something frivolous.
Back when we were looking at new hardware for Grex, the goal was to get as good a set of hardware for as little money as we could. Had we had infinite money I'd have likely gotten different things. The Antec tower and PS are known for their reliability; since we weren't in a rackmount situation, we could avoid possible heat problems by having a large case, and we could also stuff a lot of disks in that case. I didn't get ECC memory because the last memory failure I had seen on any machine I'd worked on was in 1999. I've been dealing with RAID stuff at work and have gotten an ARCO IDE RAID 1 card. This is a hardware RAID system: it's invisible to the operating system, appearing as a single IDE disk which connects to the IDE controller. This kind of RAID card gets completely around drivers, etc. I would think that's the way to go for Grex. The simpler we can make things, the better off we are.
One thing to remember about Grex is that having multiple spindles running is a good thing. A huge raid 5 array could hold everything, but at the cost of losing multiple disks. I'm not sure what the best tradeoff is now, but I suspect SATA disks are the way to go, since they're working out well in terms of reliability, and are fast.
This is not an official quote, just a first approximation based on prices listed on one vendor's web page. I figure it should at least give us an order-of-magnitude approximation that we can use for sanity checks. Substituting Maxtor drives instead of Seagate would change the grand total by less than $10, and neither brand seems to be sold out at any of the vendors I've dealt with, so the brand of drives comes down to the preferences of staff.

PROVANTAGE:
1x 3ware 9550SX-4 - 4-pt PCI-X Low-profile SATA II RAID Card with Cables
1x RDC-400 SATA RAID Drive Cage
4x Barracuda ES 250GB SATA NL35 7200RPM 16MB Cache 3Gb/s 8.5ms
US$ Subtotal: $827.70
==========================

The vendors I will be contacting are (provided board and staff want me to go ahead and get bids): Provantage, synnex.com, Bell Micro, CyberGuys, Altex, and The Delcom Group.
After more reading, it appears that the hit we would take on computing parity and writing parity would be offset by parallelizing seeking. Does anyone have numbers for this? Is a RAID 5 with one hot spare sufficiently reliable for us? Also, I realized I made a big assumption. Does our chassis have physical space for a drive cage that is the height of 3 CDRom drives and 5.25" wide?
Just wondering, since this has a decidedly technical side to it and is a discussion of hardware for Grexserver, should this item be moved/linked into Agorage conference?
Probably the Garage conference, which is specifically for public discussions of Grex technical issues. I think the fairwitness there is janc, so he'd be the person to link it.
I have a rough draft of the RFQ ready for staff and board to look over.
It has a couple of spaces that need to be filled in (the contact person
and the due date). Comments, criticisms, thoughts, etc are encouraged.
==========================================================================
Introduction:
Cyberspace Communications, Inc. (Grex) is a nonprofit organization
dedicated to the advancement of public education and scientific endeavor
through interaction with computers, and humans via computers, using
computer conferencing. Further purposes include the exchange of
scientific and technical information about the various aspects of
computer science, such as operating systems, computer networks, and
computer programming. Cyberspace Communications is investigating
increasing their conferencing server's capacity, capabilities and
reliability by adding a new drive array.
Instructions:
Please provide an anticipated and a maximum quote. Specifications are
firm, and bidders are not to deviate from the specification without
prior written approval from Cyberspace Communications. Quantities are
firm. No under-runs. Cyberspace Communications will retain units over
the specified quantity and will not pay for over-runs.
Award Criteria:
Consideration will be given to price, warranty terms, responsibility,
delivery time, delivery terms and experience in performance of similar
deals.
Contact For Specification:
********@cyberspace.org
Alternates/Substitutes:
Alternate bids of substantially the same quality, style and features are
invited. In order to receive full consideration, such alternate bids
must be accompanied by sufficient descriptive literature and/or
specifications to clearly identify the offer and allow for a complete
evaluation.
Acceptance of Bids:
Any bid may be rejected as non-responsive in the judgment of Cyberspace
Communications should any of the following occur: Material alteration or
erasure of the RFP/RFQ documents; Failure to submit required bid
guaranty (when required); Failure to furnish requested pricing or other
information; Submission of a late bid. In addition, bids may be rejected
for any other justifiable reason including, but not limited to, failure
to perform on previous contracts with Cyberspace Communications.
Withdrawl of Bids:
Bids may be withdrawn or modified upon written request from the properly
identified bidder, prior to the date and hour scheduled for the closing
of bids.
Receipt of Quotes:
Responses to this RFQ should be emailed to ********@cyberspace.org by
********** at noon, Eastern Time.
Item  Quantity  Description
0     1         3Ware 9550SX-4LP: PCI-X 4-port Serial ATA RAID controller board
1     1         3Ware RDC-400: Serial ATA RAID Drive Cage
2     4         Seagate Barracuda ES 200 GByte Serial ATA Drives
                *OR*
2     4         Maxtor DiamondMAX 10 200 GByte Serial ATA Drives
3     4         3Ware Cables for 9590SE, 9550SX and 3ware Sidecar
Earliest Delivery Date (after receipt of order): ____________________
Shipping Charges: INCLUDED_______EXTRA_______ COST________
Method of Shipping: _________________________________________________
Is the "SATA RAID Drive Cage" an external SATA enclosure, or does "drive cage" in this parlance mean a sled that fits into another enclosure? In other words, will these drives be in a separate box or inside the main system unit? Either way, what other cabling is necessary?
It is a metal box that sits inside the front of the server chassis, in the same place where one would normally put internal CDRom drives or tape drives. It has four slots in it, and you slide the drives (on trays) into those slots. You can see the photograph of it at http://3ware.com/products/ata.asp.
Do we want to invest any money in non-rack mounted hardware? If we ever wanted to move to another location we'd pretty much need to fit in standard rack space.
This is an internal drive array, Mary.
I think we should fix the mail problem before investing in more hardware, and put back newuser. Why spend money on grex if it is going to evaporate?
re #65: If it's built to fit into a tower case it's almost certainly not going to fit into a rack-mount chassis.
If the rack case can accommodate 3 CDRom drives (all the ones 3U or taller that I have seen can), then this will fit just fine. Technically, we could get by with mounting the drives directly in drive slots in the chassis, but then we would lose hot-plug capabilities. It will fit into any case that has three adjacent externally accessible 5.25 inch drive bays (so a space of approximately 4.8 inches by 5.25 inches).
resp:66 Keesan, I think the storage issue is orthogonal to the spam issue, and in some ways, I think the additional space would help, since if we moved /var and our equivalent of /home onto the new drives, they would not fill up as easily, thereby preventing the potential DOS from overflowing /var.
Why not just prevent the spam from coming in instead of finding a place to store it? Or at least get rid of all unused accounts, to start with.
re #70: > Why not just prevent the spam from coming in instead of finding > a place to store it? We've answered this question time and time and time again -- why should you expect us to answer it again just because you don't like the answers?
> If it's built to fit into a tower case it's almost certainly not going
> to fit into a rack-mount chassis.
If we moved to a rack-mountable chassis, we would have to get one that had enough 5.25" bays for the drive array. I have no experience with putting PC hardware into a rack-mountable chassis, so I would defer to maus's expertise.
Most 3U and 4U chassis will accommodate this drive cage. If we want to skip the cage and just put the drives directly into the server chassis in the internal 3.5 inch slots, we can, but I recommend against doing so, since we would lose hot-plug capabilities, and because the cage is designed with the thermal characteristics of four drives in mind and mitigates or dissipates the heat generated by running the four drives. If we can give up hot-plug and if we want to do some air-flow engineering of our own then go ahead and skip the cage, just make sure that thermal damage does not void the warranty of the drives.
Keesan, I agree that something needs to be done about the spam, but a total 100% solution is probably not feasible, and the problem of spam does not negate the fact that we need reliable, capacious storage. If negating spam is your top priority, perhaps you could donate something like http://www.barracudanetworks.com/ns/products/spam_overview.php for Grex to use.
*MANY* solutions exist today, before-your-eyes on the Internet which could easily catch 95%+ of incoming spam into the Grex mail server. They could be implemented quickly by any staff member with half-a-degree of intelligence. Unfortunately, Grex is so backward and naive (did I mention anti-progressionary?) that it (in particular its staff) will find any excuse not to move forward from its ancient software and system architecture base. So glad I resigned from those cronies.
You don't seem glad.
*giggles* Thanks for the light amusement :)
The reason I bring up the rack-mount issue is I believe we'll someday need to fit into the smallest space possible at some other location than Provide. When we moved from the Pumpkin to Provide, we were very lucky Provide had the space and inclination to allow our hardware to occupy a footprint outside of their racks. Every other affordable ISP I contacted wanted us in a rack and charged for service based on the amount of rack space (and bandwidth) used. I would really like to see space considerations made part of any hardware decisions we make at this point. So, thanks for all the information on this.
I don't know of any easily implemented spam-fighting systems that actually eliminate 95% of spam without also blocking desired email. Even greylisting, combined with all sorts of DNS blacklists, does *NOT* reduce my spam intake by 95% on my server, and I even use some more aggressive DNS blacklists like spamcop.
I understand and appreciate the need to keep our physical footprint as small as possible. If we needed to put Grex into a rack mountable case right now, we would need one that was at least 3U to accommodate the PC components that were used to build Grex (Rack space is measured in U's, with each U being about 1.75 inches). http://www.directron.com/ra349c00300w.html This is a 3U rack chassis that would accommodate Grex's present motherboard and cards as well as two of the drive cages that maus is proposing. If we wanted to venture into 2U or 1U territory we would be looking at a complete system repurchase, and we might even have to get 2.5" (read: laptop) hard drives in the case of a 1U solution. And laptop hard drives are NOT cheap, nor as reliable, nor as spacious, as 3.5" drives.
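The rack-unit arithmetic mentioned above (each U is about 1.75 inches) can be sketched as a quick calculation; this is just an illustration of the conversion, not anything Grex-specific:

```python
# Rack heights are measured in rack units (U); each U is about 1.75 inches.
U_INCHES = 1.75

def chassis_height_inches(units: int) -> float:
    """Approximate exterior height of a chassis of the given rack-unit count."""
    return units * U_INCHES

# A 3U chassis, like the one proposed above, stands about 5.25 inches tall;
# a 1U "pizza box" is about 1.75 inches, which is why it forces 2.5" drives.
print(chassis_height_inches(3))  # 5.25
print(chassis_height_inches(1))  # 1.75
```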
I like that chassis you showed. And anything smaller than 3U would require a reengineering. Laptop hard drives are not standard for a 1U chassis, but you would be limited to two or three normal-sized drives, which means putting all of our eggs into one basket (in performance as well as redundancy). If we only had 2U of space, what we could do is put just our system drives into the host and data drives into a separate drive shelf. An alternate solution might be to find out if our ISP offers a managed SAN option. In this case, we would simply pay the monthly fee instead of amortizing the cost of installing this storage equipment ourselves. At worst, we would have to buy a gigabit NIC and an initiator programme (though I have heard that the initiator programme from NetBSD can be ported to OpenBSD with little work).
maus - thanks a lot for your work on this. Your RFQ looks very professional. I just want to make sure it doesn't commit us to anything, if we get a bid. Going forward with a RAID array will require some time to get the board and staff on board, so I don't want you to be annoyed if we get a quote and then sit on it for a while. I hope we'll discuss it in depth soon, but the process of agreeing on what we want and then the logistics of the changeover may be much more elaborate than the actual purchase. Here is Grex's current case: http://www.antec.com/pdf/drawings/PLUS1080AMG.pdf It has room for eight 5.25" drives. I tend to agree with Mary, though, that we should think in terms of rack mounting in the future.
I think it's reasonable to look at 3U as a lower bound on space required. Regarding #51; The thing about ccd or the like is that you can't boot off of it. I'd be less worried about a controller going bad and more worried about having a good hot-swappable disk system. Regarding #55; I agree, we need to make things as simple as possible. I further agree that an SATA RAID solution really looks promising for grex. Regarding #56; RAID-5 *does* have multiple spindles, but they're all required for reads and writes. Something like RAID 0+1 would be a better fit for grex, I think. With respect to spam and newuser ... You need a decent foundation to build off of.
I would like to have one of our Legal Weasels read over the RFQ and make sure it does not obligate us to anything. I think we should also wait before sending it to the vendors until we get a commit from the board that we will start the selection process as soon as the deadline clicks, and time it so that the deadline is something like a week before a BOD meeting so that we could decide on it fairly quickly. We should probably have a stanza in there that says something to the effect of "we will notify vendors within *** days of our decision". Should we have a standardized worksheet for RFQs so that in the future if we need gear over a certain dollar amount (maybe arbitrarily over 400$ or something), we can just fill in a few blanks and email it off? Also, who would be the recipient both for bids and for questions/clarifications?
Cross, I know, ccd is only for a data volume, such as /var/www or something that needs to be big without needing super performance. If you need big and performance, RAID 1+0 is your friend. I still think we should use the new RAID for our equivalent of /home and /var and have the system on the existing SCSI drives (maybe using RAIDFrame (the software RAID) to mirror two system drives).
Ric, what percentage of spam do you eliminate? Remmers, when do you expect to have something set up for people to use who are averse to copying over two files and changing the login name and want a script to do it for them?
Regarding #85; Personally, I'd like to see the entire system on hot-swappable media: both the user data, and the operating system. We had an occasion where the root filesystem got lost once, and grex was down for at least several days. If that filesystem had been RAIDed, we could have avoided that downtime. I don't believe you can boot from RAIDframe, either, which implies that the root filesystem cannot be as redundant as we'd (perhaps) like. I'm in favor of moving to a rack mount case with the storage system you proposed, and disposing of the SCSI disks. Perhaps selling them and the SCSI controller would be a way to offset the cost --- at least partially --- of getting this new hardware.
If I remember correctly, the way you do it is to make every filesystem except / on software RAID and make an identical copy of / on the first slice of the second drive, so no, it is not fault tolerant live, but if you cannot boot from the normal /, then you just issue your boot command to bring you up on the alternate copy of /. Things could have changed, though, since I have not done RAIDframe-based RAID in a while, mostly relying on 3ware boards for mirroring Serial ATA or IDE drives, and LSI or Adaptec boards for mirroring SCSI drives.
That is what you're supposed to do, but then you have to have some mechanism for mirroring the root filesystem over to the spare partition; that gets ugly after a bit. What's more, if one of the disks running the root filesystem goes down, you still have to manually reboot. A hardware RAID solution is better in the sense that this is handled for you automatically; if one of the disks holding / dies, you just throw in another hot-swappable disk and go on your merry way. Sure, one can approximate this using our existing SCSI disks and RAIDframe and mirroring the root filesystem, but why bother?
I recommended keeping our existing SCSI infrastructure to capitalize on sunk costs and because by keeping the system separate from the data, we decrease a bottle-neck. I agree, with good, inexpensive hardware RAID, RAIDframe kind of blows in comparison.
You decrease a bottleneck, but at what cost? Then you have the associated maintenance costs if the root disk fails, which is what we're trying to avoid. I'd say that the goal of an increase in performance at the expense of added (or, unchanged) administrative burden is the opposite of what we're trying to achieve (or, what we *should* be trying to achieve). As for the sunk costs.... Well, grex could do several things with the existing SCSI disks and controller. (1) Put up a satellite machine ala gryps to offload some of the processing from the main machine. For instance, a basic spam/virus blocker for mail before it gets to the `main' grex machine, or running proxy servers for web and/or DNS, having a serial port plugged into the serial console of grex itself, etc. (2) Sell them and use the proceeds to offset the new hardware costs. (3) I'm sure there are others. The other factor is that I *really* don't believe that grex gets enough usage to worry about bottlenecks through the I/O controller right now.
That's a fair assessment. Consider my mumblings about the bottleneck the ramblings of a weary mind, and please ignore said mumblings. So, silly question: if we are thinking about moving the system-space onto a new disc subsystem, does this mean a fresh, new installation? Can we use the opportunity to request new commands to be added and to implement new controls and move to standards from odd Grex-isms?
No, not at all; I think it's good to be challenged and be asked to justify one's conclusions. I thank you for that. I think you can always request the installation of additional software. And yes, I *do* think it would mean a new installation of the basic system. But, that might not be a bad thing. Any opportunity to move to standard commands from weird customizations is a plus, in my opinion.
I agree that moving to standards would be a good thing, provided nothing is broken in the process (if a command that users or staff depend on is broken by the move to standardize, then the standardization is crap; if no-one is hurt and we make the system easier to maintain and easier to upgrade and actually match what the man pages and web-pages say, then we have done a good thing by standardizing and deserve doughnuts). I didn't have specific commands or software in mind (or, at least, nothing appropriate for this system), but I figured that if we were facing a fresh installation, this would be the time to ask people what commands they would like to see on here, and also see if there are commands that users would want to see upgraded or replaced.
On the new commands front, I was just looking through /usr/local/bin and noticed javac and java and jar. I thought the port of native java to OBSD was still a couple of years away. Did we build this by way of RHEL emulation or something else entirely? Have we published somewhere how we managed it? Hurray!
Hmm... I seem to recall that I saw the Java stuff sitting in either the OpenBSD ports or packages collection a few months ago and installed it. Didn't do any testing or anything (I'm not a Java person), so whether it all works is another issue. Oh, I remember now. Have a look at /usr/ports/lang/kaffe.
Regarding #94; That can be relative. For instance, on the Sun4, staff depended on a custom command to edit password information, because the password stuff was so hacked. But, the standard commands are better; we made a net gain by leaving behind the old stuff we *had* depended on and moving to a newer system. Someone definitely deserved a Krispy Kreme on that one.
Krispy Kreme? Ewwww!!!! Give me a nice Shipley's or a Dunkin' Donuts any day. giggle
Blasphemy!
Colleen hands out hot fresh Krispy Kremes to Cross, and warm, sugary Dunkin Donuts to maus. Any other staff members wanting special donuts should post here. If donuts were all it took to keep staff motivated, I'd take on the whole job myself.
Someone, somewhere, mentioned vipw not working. Today, while not thinking of anything important (or maybe it was yesterday; they all run together any more), I realised why vipw is a bad idea on grex: newuser is updating the password file fairly often. Since "last write wins," it's possible to _really_ screw things up with two things modifying the password file at the same time. I'm hoping that "moduser" will either lock the password file or be agile enough not to interfere with newuser.
vipw didn't work on old grex; it's all right on grex now; it does proper locking of the passwd database as well as rebuilding the DB files after changes. If newuser is conflicting with it, then that's because newuser is broken.
A cursory glance at the newuser source code indicates that it's using pw_mkdb(), pw_lock(), and pw_abort() to do its password file updating, so I suspect things are probably ok.
If it's calling those routines, then you're right; it's doing the same locking that vipw is doing (essentially) and therefore vipw and newuser will play nicely with each other (as will useradd, moduser, etc).
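The point of the pw_lock()/pw_mkdb() discussion above is that every writer takes the same exclusive lock before a read-modify-write, so "last write wins" can't happen. A generic illustration of that pattern, using Python's fcntl.flock advisory locking (just an analogy: Grex's actual tools use the BSD pw_* routines in C, not this):

```python
import fcntl

def locked_update(path, mutate):
    """Read-modify-write a line-oriented file under an exclusive advisory
    lock, so two concurrent updaters (think vipw and newuser) serialize
    instead of clobbering each other's changes ("last write wins")."""
    with open(path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until any other holder releases
        entries = mutate(f.read().splitlines())
        f.seek(0)
        f.truncate()
        f.write("\n".join(entries) + "\n")
        # the lock is released when the file is closed
```

If either updater skipped the lock and rewrote the file from a stale copy, the other's entry would silently disappear; with the lock, both additions survive.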
For those hardware-inspired among you, please lend me your advice in item 30 of the Garage conference.
Just for another thought, do we want to investigate something like http://www.lacie.com/products/product.htm?pid=10876 as a NAS solution instead of putting drives into the server itself? It looks fairly inexpensive, and is very pretty. A good gigabit NIC that is known to work well with OpenBSD is dirt cheap (under 50$ most of the time) and we could grow storage space as needed by adding additional units. Lacie has a pretty good reputation, especially amongst Apple people. We would also probably want to see if Lacie guarantees the drives inside their device. According to what I have seen, it looks like this would provide us full-speed RAID 5 + hot spare if we wanted it. Do we want to think about this or stay with the idea of drives inside the server?
I think drives inside a server would be best in order to minimize our physical footprint in case we ever need to move to a rack.
Hi. I'm doing my very occasional look through Grex, and saw this. I'm a network person, not a systems person, so I'm way over my head when talking about specific systems components. That said, I do manage a lot of "critical infrastructure" type services on a low budget with a small staff, so I spend a lot of time thinking about how to keep services reliable while also cheap and easy to operate. And, for those of you who are new in the last few years and don't know me, I'm a former Grex staff and board member. I'm a big fan of systems where the answer to a problem is to turn off the malfunctioning component, and the users don't notice. For that reason, I like that hardware RAID systems and the like are being discussed. For partitions with lots of dynamic data that needs to stay up to date, like the conferences, RAID or some equivalent is absolutely the right way to go. I also like that this is being thought about now, at a time when I gather Grex has been relatively stable. "If it ain't broke, don't fix it" and "the number one cause of network outages is network engineers" are both appropriate rules to keep in mind, but if you've got something that looks really likely to fall apart it is easier to fix it before it becomes an emergency. However, there are a few other things about this discussion that worry me. Doing piecemeal upgrades to several year old hardware seems like a good way to run into unexpected incompatibilities. Using internal RAID enclosures with the idea of moving them to as yet unspecified new hardware seems like a big loss of flexibility. If I were specing this out, and if it could fit within the budget, here's what I would probably do: Get a networked disk array, such as the one maus talks about in #106 (rack mountable, consuming as few rack units as possible), and put all the dynamic data on it. Then get a couple of 1U servers, standard and self contained, with serial consoles and ideally some sort of "lights out manager" thing. 
Put the static non-changing stuff on the internal hard drives, and set them up as clones of each other. Add in a cheap Ebay-purchased console server to manage it. If the applications support it, run the two 1U servers side by side, accessing the same data off the RAID array and sharing the load; otherwise, keep one as a hot spare. If one of the disks in the RAID array fails, pull and replace it. If one of the servers fails, turn it off and run on the other one. I suspect you could fit this all in six U or so of rack space, which is still not huge. That said, I'd also question somewhat whether Grex should still be in the hardware business. It might be worth looking at some of the "dedicated server hosting" companies, and see how what they're charging to rent a server that includes colo space, network connectivity, and hands on hardware support, compares to what you're paying ProvideNet.
Steve, nice to see you here and thanks for taking the time to jump in to the discussion. I'm especially interested in hearing any comments on your last paragraph. For a while now I've been thinking about ways to get off of our own hardware.
It has been a while since I have priced it out but if I remember correctly, ISP-provided servers are either very expensive, or virtualized. So I'm not sure if it would be workable.
Working for a hosting provider (and having worked for others and knowing the workings of lots of them), I would definitely say be cautious. Hosting providers are notorious for using cheap (often recycled) gear, and virtually every service incurs a nontrivial charge. Most of these places have minimally skilled staff who try to make everything that happens the direct result of something the customer did (even catastrophic hardware failure) so that it is not their problem and is something that they can charge an arm and a leg to fix. Additionally, most hosting providers will not run phone lines for you. Just as a single datapoint, consider Layered Technologies (one of the larger players in the hosting business):
- Celeron 2 GHz (their lowest spec chassis)
- 2 GBytes non-ECC Reg RAM
- 2 x 200 GByte Serial ATA hard drives
- Serial ATA RAID controller (set up in a mirror set)
- OpenBSD
- No control panel
- a /29 netblock
- 10 Mbit/sec uplink
- 1500 GBytes total network throughput per month
This would run about 214$ per month. I do think that scg has some good insight, and his cluster spec is sound.
resp:108 scg, Just curious how you handle high-availability. I have looked over a few solutions, though I have only ever implemented them in a lab setting, so I am not sure how they would behave in the wild. I particularly like: - HSRP/CARP with synchronization data sent over a dedicated separate LAN - Round-robin NAT with IP-takeover (and using IPMI/LOM to "shoot the other guy in the head") I am sure there are much better solutions, and would love to learn them.
My favorite dedicated server hoster is ServePath (http://www.servepath.com). I designed their network a few years ago, and they treat me very well. My view that they're doing things right isn't unbiased. I've also heard good things about RackSpace and Affinity (http://www.valueweb.com), but I've never actually dealt with them. I also wouldn't be so quick to dismiss server virtualization, if you find a hosting provider you like who is doing that. True, you only get a fraction of the server's processing power, but the servers will likely be a lot more powerful than what Grex is running on now. From what I hear (no direct experience yet), it's hard to tell the difference from a distance between a real server and a virtual server, and a hosting provider will probably be far more motivated to avoid or deal quickly with hardware problems taking down lots of customers' virtual servers than with issues affecting only a single customer. How to handle load balancing and failover depends on what you're doing. Simplest is round robin DNS with a low TTL, perhaps accompanied by a nanny script that watches to see if one of the servers goes away and removes its DNS entry. That's what used to be done in the old days before all those fancy load balancing boxes were invented, and is more or less what some of the fancy load balancing boxes do. (What I do at work is to scatter the servers around the world and source BGP announcements from the servers (google for "anycast"), but that's not very well suited for what Grex is doing.)
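The "nanny script" idea above (watch the round-robin servers, drop a dead one from DNS) can be sketched roughly like this. This is a hypothetical illustration, not anything Grex runs: the probe just attempts a TCP connect, and the actual DNS update (regenerating the zone, nsupdate, etc.) is deliberately left as a comment since the post doesn't specify a mechanism:

```python
import socket

def is_up(host, port=22, timeout=3.0):
    """Probe a server by attempting a TCP connection (SSH port by default)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy(servers, probe=is_up):
    """Return the subset of servers that should keep their round-robin
    DNS A records; callers remove the rest from the zone."""
    return [s for s in servers if probe(s)]

# A cron job might diff healthy() against the current zone and, when a
# server has vanished, regenerate the zone without it.  A low TTL on the
# records keeps clients from caching a dead server's address for long.
```

The low-TTL caveat matters: without it, resolvers keep handing out the dead server's address until the old record expires.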
I need to amend my previous statement. I had forgotten about RackSpace; they have a good rep. I don't know ServePath or Affinity.
I guess I fail to see what all this "powerful" gains anyone.
While sitting in a hollowed out baseboard and chewing on some wires, I found an Ultra320 SCSI RAID board in what appears to be new condition. Do we want me to send it in so that when we reinstall, we can mirror the root volume on high-speed SCSI drives?
Do you have documentation for it? Is it hardware only?
It is pure hardware RAID. I would have to check the make and model (documentation should be on the manufacturer's webpage).
http://www.adaptec.com/en-US/products/scsi_tech/value/ASR-2230SLP/ Mirroring / and /usr on SCSI and /home and /var on Serial ATA would make a very nice, well-split-up, performant, capacious system.
Hardware raid is definitely what we want. I will look at this.
Just curious, I have started seeing 10K RPM Serial ATA drives. Does the increased rotational speed noticeably improve reading/writing of data? Is the increase in data access speed a direct function of the rotational velocity of the center spindle? Presuming it does, is this a real bottleneck that we would face, or do 7200 RPM drives get to the data fast enough that choke-points would be elsewhere in the system? I guess my real question is "would we get benefit enough from 10K RPM drives to justify the higher cost versus 7200 RPM drives?".
Yes, 10k RPM drives have higher I/O performance than slower spinning drives. They also tend to have a lower capacity and are more expensive. The rule of thumb I usually use to calculate I/O performance is: RPM/100 = iops. That is, RPMs divided by 100 gives you I/Os per second. Of course, I mainly deal with fiber channel drives so this may be way off. Your arrangement is as important as your individual disk performance too. A RAID 10 array is much faster than a RAID 5 array, but sacrifices a lot of storage space.
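The RPM/100 rule of thumb above, together with the RAID 10 vs RAID 5 trade-off, can be put into numbers. The write penalties used here (2 for RAID 10, 4 for RAID 5's parity read-modify-write) are a common rule of thumb and an assumption of this sketch, not something stated in the post:

```python
def disk_iops(rpm):
    """Rule of thumb from the post above: RPM / 100 ~= IOPS per disk."""
    return rpm / 100

def array_write_iops(rpm, disks, write_penalty):
    """Rough usable random-write IOPS for an array.  Assumed penalties
    (not from the post): RAID 10 = 2 (each write hits a mirror pair),
    RAID 5 = 4 (parity read-modify-write)."""
    return disk_iops(rpm) * disks / write_penalty

# Four 10k RPM disks:
raid10 = array_write_iops(10_000, 4, write_penalty=2)  # 200.0
raid5  = array_write_iops(10_000, 4, write_penalty=4)  # 100.0
# ...but RAID 10 yields only half the raw capacity, while RAID 5
# loses just one disk's worth, which is the trade-off described above.
```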
Thanks for the rule of thumb and for confirming what I suspected about RAID 1+0 vs RAID 5 performance (where I worked, we did not do RAID 5 except on rare occasion, and when we did, they didn't trust the grunts to set it up or maintain it, so I usually only saw RAID 1, RAID 1+0 or LVM/concatenated over multiple RAID 1 sets).
I've heard that these "perpendicular" drives at 7200 RPM are actually the fastest for most situations.
You have several choices: