At Saturday's meeting, STeve Andre proposed that Grex purchase hardware for the next Grex system now, and that the remaining development work be done on that system. Most people seemed willing to buy that idea, so there was quite a bit of discussion of what hardware to get. I want to move that discussion on-line.

First, universal agreement was reached on using an x86 system, not a SPARC. A number of people strongly prefer an AMD Athlon over an Intel Pentium, and nobody really objects to this, so we are likely going that way.

There is a lot of concern over quality. I believe that in recent years the PC marketplace has shifted from competition based on performance to competition based on price. It used to be that new desktop machines held price steady at a bit over $1000 while the performance steadily improved. But lately the prices have been falling (while performance has still steadily improved). This has placed substantial pressure on all manufacturers to cut costs where they can - power supplies and cases have been getting crappier, mechanical components of drives have gotten less reliable, and so forth. The feeling was that this trend had impacted a lot of companies that used to produce good stuff. Dell's servers, for example, aren't as solid as they used to be (though they are more powerful). The best approach to acquiring a good new computer is to carefully buy separate components and integrate it ourselves. STeve Andre is likely to take the lead on this, though there are other staff members with plentiful experience building systems (Dan Gryniewicz, for one).

STeve brought to the meeting a draft suggestion for a system. He is still working on refining it. His suggestion was:

CPU - Athlon XP 2800 (I think this is 2.2 GHz). About $400.
Motherboard - STeve wants to buy two, keeping one as a spare. I don't think a particular model was discussed. About $145 each.
RAM - buy lots. It's cheap. Say 1.5G for $270 or so.
Case/power supply - STeve likes Antec. About $250.
Misc parts, fans, etc. - STeve wants lots of cooling. About $100.
NIC - STeve likes Intel. 100 Mbit. $33.
SCSI controller - Ultra 160 at least, Ultra 320 if possible. About $200.
SCSI drives - two 18G IBM. About $142 each.
CD-ROM, floppy, this and that - maybe $250.

Adding up to around $2000. STeve also included in his list a monitor and keyboard, but Dan says he can probably donate these. He also suggested an 80G IDE drive for about $100. This has lower performance and reliability than the SCSI drives, but is fine for stashing non-critical or rarely used data. With this, and various additional slough factors, we were mostly talking about something in the $2500 range.
The spare motherboard is so that we can have two *identical* motherboards - often the "same" motherboard a month later will have some minor revisions which can cause problems with existing software configurations - I've noticed this as well as STeve.
You know, if we ordered two motherboards from the same vendor at the same time, I wouldn't be too amazed if we received two that were *not* the same. It's probably worth specifying when we order them that we want identical twins.
Incidentally, I'm not sure which Antec in particular it was that Steve wanted, but you can get an Antec (SX1040BII) case with a 400 watt power supply at CompUSA for $120. I have this case, and it's a wonderful case. It's a full tower, and easily fits my dual-Athlon setup. It has good cooling (4 80mm case fan slots, comes with two fans, I have three), is solid, and the power supply has been like a rock. I can understand if we're not sure 400 watts is enough. Monitor and keyboard are not an issue. I have several of each I can donate. As to motherboards, we might want to consider 64-bit/66MHz PCI, as that will give us much better performance out of our SCSI, especially if we get Ultra 320.
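The 64-bit/66MHz point is easy to quantify with a back-of-the-envelope sketch (these are theoretical bus maxima at the nominal PCI clock rates; real sustained throughput is lower):

```python
# Peak PCI bandwidth = bus width (in bytes) x bus clock.
# Nominal clocks: 33.33 MHz for plain PCI, 66.66 MHz for 64-bit/66MHz PCI.

def pci_peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak PCI transfer rate in MB/s."""
    return (width_bits / 8) * clock_mhz

plain = pci_peak_mb_per_s(32, 33.33)   # common desktop PCI slot
wide = pci_peak_mb_per_s(64, 66.66)    # 64-bit/66MHz server slot

print(f"32-bit/33MHz PCI: {plain:.0f} MB/s peak")   # ~133 MB/s
print(f"64-bit/66MHz PCI: {wide:.0f} MB/s peak")    # ~533 MB/s
# Ultra160 SCSI can burst 160 MB/s and Ultra320 up to 320 MB/s --
# either one exceeds what a plain 32-bit/33MHz slot can carry.
```

So an Ultra320 card in a plain PCI slot would be bus-limited, which is the argument for a 64-bit/66MHz board.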
Thanks Dan.
Now linked to Coop as Item 176; Garage 147
I suspect the board will start the process of buying the hardware for the new Grex at the meeting on Thursday. So if people have strong opinions on what items we should buy, they should speak up soon.
One thing I would suggest is a rack-mount case. While it's a little more expensive, it's also probably a little more rugged and can easily be fit into a colocation facility if, at some point in the future, that becomes desirable. I would suggest that, as part of this, grex either move out of the pumpkin, or try to do as much as possible to make it a more habitable place for the grex machines. In particular, the descriptions I've heard make it sound like it's just too hot during the summer. I suspect that has a lot more to do with grex's system reliability problems than any concerns of component quality or load.
Grex has had remarkably few hardware problems in the Pumpkin, though.
Nearly all new hardware supports internal temperature monitoring. If OpenBSD supports this, we could monitor the CPU core and case temperatures and see if they really are reaching unreasonable levels. That would, in my opinion, be a much better indication than the ambient room temperature. I know Linux supports reading most sensor chips via the 'sensors' package, but I don't know if OpenBSD has support for any of this yet. I agree a rack-mount case would be a good idea, but I don't feel too strongly about it because it would be relatively easy to shift the hardware into a rack-mount case later. What brand of motherboard are you thinking of using? Abit has had a lot of problems with defective capacitors lately and maybe should be avoided. I'm not sure who makes the best AMD boards right now.
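As a sketch of what such monitoring could look like (the sensor names, readings, and output format below are invented to imitate a sysctl-style listing; a real script would capture the live output of the sensors tool instead):

```python
# Sketch of a temperature watchdog. SAMPLE imitates the "name=value"
# style of an OpenBSD sysctl hw.sensors listing; the readings are
# made up for illustration, not taken from real hardware.

SAMPLE = """\
hw.sensors.cpu0.temp0=52.00 degC
hw.sensors.lm0.temp1=38.50 degC
hw.sensors.lm0.temp2=71.00 degC
"""

THRESHOLD_C = 65.0  # assumed alarm point; tune for the real hardware

def over_threshold(sensor_output, limit):
    """Return (sensor, temperature) pairs that exceed the limit."""
    hot = []
    for line in sensor_output.splitlines():
        name, _, value = line.partition("=")
        if "temp" in name and value.endswith("degC"):
            temp = float(value.split()[0])
            if temp > limit:
                hot.append((name, temp))
    return hot

for name, temp in over_threshold(SAMPLE, THRESHOLD_C):
    print(f"WARNING: {name} reads {temp:.1f} C")
```

Run from cron, something like this could mail staff before a disk actually cooks itself.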
Jim asks if grex would be interested in his basement once he gets it insulated, and the house rewired. He could have a separate entrance accessible at all hours. Might be a few years though.
Grex's hardware reliability problems in the past few years have mainly
been:
(1) DSL line flakiness. Almost certainly not heat related.
(2) random power weirdness. Almost certainly not heat related.
(Unless
you count all air conditioners in the state of Michigan.)
(3) random disk failures. These probably are heat related.
(4) weird modem problems. Wide range of potential causes.
The only one of the four we can control is (3). *However* -- we've gone to
some effort to secure the best cooling we can given our environment.
Some of this has included the use of extra large enclosures and a fair
amount of extra room. In a colocation, we'd have much less
space--smaller enclosures, less room, etc. Right there our improvements
go out the window. No doubt things are much better in NJ, but here in
SE Mich, it's not hard to collect interesting tales of various
colocation heating and cooling disasters. Backups are important - and
we definitely want to maintain our current advantage in terms of making
removable tape backups; this isn't just for disk failures, but also
covers floods and fires (both known risks in the local colocation
market) and vandals (a special and unique risk we also have to deal
with, which makes mirrored disks, normally a useful backup strategy,
much less attractive to us.) When you weigh cost, cooling, and
backup convenience, the pumpkin suddenly starts looking a lot less bad.
I said that our disk disasters probably are heat related. I suppose I
should expand on that. We've had several failures. We used to have
lousy disk enclosures. Eventually, we resorted to using box fans. It
was noisy and crude, but worked. We eventually got better disk
enclosures. Those have been basically adequate. We have been luckier
in our failures than perhaps we deserve -- our failures have generally
given us notice, often showing up during backups, so we've generally been
able to simply restore that last backup. Some have shown up as heat
sensitivity - letting the disk cool often eliminates the errors (at
least long enough for that last backup). In at least one memorable
case, the completely dead disk turned out to simply be packed with dust
-- cleaning it thoroughly resulted in proper operation, although we got
nervous and replaced it before it had a chance to turn traitor on us.
I'd like to think our luck is mostly due to backups, observation, and
paranoia. But, to the extent that heat has played a factor, it may have
actually worked in our favour, although I'd hardly recommend it as a
good strategy.
Keep in mind that we're running mostly used disks of elderly vintage,
and basically running them until they give up the ghost. This strategy is
guaranteed to eventually produce 100% mortality -- but it may
paradoxically produce more reliable storage meanwhile than constantly
purchasing new disks even though most of those won't fail before being
replaced. Perhaps this just shows that you can prove anything you want
with statistics.
I'll be the first to admit the pumpkin is far from perfect, but even so,
I'd have to say that in terms of dealing with disk disasters, it still
comes out way ahead of what we could manage for colocation deals. If
you're looking for that huge advantage that's bound to convince us to
move to colocation, this isn't it. The certain convenience in terms of
access and space is known to us.
Sindi - we really want Grex to be in a neutral property, instead of someone's house.
Besides, it would take Jim years to get set up to draw the copper wire for the cabling. And we don't want to run Grex on a refurbished 486, powered by a bicycle generator.
Regarding #11: Well, you mention several things that disturb me, notably dust and heat conditions in the pumpkin. If you're going to stay there, I suggest you make an effort to mitigate those to whatever extent is possible. Perhaps that means putting in a wall-mount A/C unit, or a bigger one if necessary; perhaps it means putting in a humidifier to keep down the dust; perhaps it means going over the whole room with a dust mop; perhaps it means throwing out old yellowing sheets of paper that have no further importance; it almost certainly means going with a new, server-grade case with adequate cooling. Perhaps it also means something else. I don't know, but it strikes me, and has been stated by several others, that grex could do a bit better to make sure the conditions in the pumpkin don't kill your new servers. I have no idea what the colocation facilities in New Jersey are like, since I live in New York City. Here, our colo facilities are, umm, quite different from the way you describe your options. That's fine, but if putting in a wall-mount A/C unit, giving the pumpkin a thorough cleaning, and removing a bunch of garbage from it will help improve grex's chances of not having a disk failure, I'd say go for it. In fact, that's all I'm saying.
How is the new system going to be financed? Might it make some sense to look at how much money is going to be available? Is Grex just going to write a check for the amount of the new computer? I don't see a tape drive listed. The computer I just ordered can have 2 GB installed. Whatever Grex gets, it'd seem to me to make sense to max out the RAM.
Indoor dust comes from people - the Pumpkin is quite dust-free, actually. My guess is that the bulk of the dust in that drive came from its previous life.
The pumpkin does not have windows. The owners might not appreciate a hole in the wall made by grex for an air conditioner.
Right, I think a wall-mount AC unit is not an option in the Pumpkin. Re #15: We plan to have a fundraiser to help pay for the hardware.
Ideally it would have been nice to have a fundraiser and buy hardware based on the money raised plus what we have already set aside for upgraded hardware. But instead what we have is a bit of a time crunch. Staff has time to put this together, nowish, but a big chunk of the work needs to be done before May. So instead of fundraiser first, purchase later, we are going to make a leap of faith that the users will want this badly enough to donate what they can, and get the project started. Do folks think this is a reasonable thing to do?
Wasn't there already a fundraiser for the last grex hardware, which ended up getting donated instead, plus a $1024 donation for new hardware that has not been spent yet?
Re #11: I'd also add that modern disks tend to run cooler. I bet the Pumpkin will be considerably cooler when Grex's old hardware is retired. Good airflow should definitely be a consideration when picking a case, of course. Thanks to the overclocker market, you can now get cases with truly awe-inspiring numbers of fans. Since noise isn't much of a consideration where Grex is, we should take advantage of that. Re #15: I'd guess our current tape drive will work with the new system. If I remember right it's an external SCSI drive. These are quite standardized; it'll just be a matter of the right cable, most likely.
Re #20: In 1998, we had a fundraiser for spare parts for the current Grex machine. Then most of the spare parts we needed were donated to Grex, so we asked everyone who had donated what to do with what they had sent in - some of it was refunded or converted into membership dues, the rest was converted into miscellaneous donations. The $1024 which is currently in the infrastructure fund came from a single user in 2001. Its purpose is indeed to upgrade Grex's hardware, so the goal of a fundraiser would be to add to that fund.
Why is there a deadline of May, and what has to be accomplished by then? Is the goal or plan to get Grex actively on the PC machine by then? Why not start the fundraising plan now? I bet we could get at least some idea how much money will be available by the time of the next Board meeting, if people are asked for pledges. If there's a lot more (or less) money coming in for the upgrade than what's expected, it might affect what would be purchased.
The board meeting is tomorrow. I decided to wait until then to give people a little time to discuss what hardware we'd like to buy. I expect to start the fundraiser on Friday.
Oops. I thought you meant next month's Board meeting. I'm not objecting to any deadlines, by the way, just asking for further information about them.
I agree that we should buy the system as soon as we can, since we have the cash on hand, and then ask users to help pay for it through fundraising.
I think buying hardware now is wise.
Here is the second cut of the list. Sorry I'm so late.
Approximate cost of a $2000 i386 Athlon box
--STeve 2/27/03
Here is the second cut of Grex's future hardware.
Quality always wins out over price. Given the state
of the world of computers at the moment, spending
extra makes sense. The amounts listed here are pretty
accurate, but things change.
Here is a short description of each item.
CPU - We want an Athlon over a Pentium. There are lots
of reasons for this; the easiest is that they simply
perform better. Given the cost differences, getting
the fastest one possible is reasonable.
Motherboard - We want two motherboards. Why? If the
motherboard fails we'll have an exact duplicate, and
be able to get back online *knowing* that we have
the same exact motherboard. Several times now, I have
been bitten by "small" changes in motherboards, ie
small revs on the artwork, etc that have made some
noticeable difference in the operation of the computer.
For $145 extra it's an excellent investment. The other
parts, ram, disk controller, etc aren't nearly as
persnickety; we can get more of those on a demand
basis.
Ram - Ram is *cheap*. Having hordes of ram means that
we won't run out, and can work on some optimizations
like keeping binaries in ram disks, etc.
Box/power supply - THIS IS IMPORTANT. The power supply
is one of those overlooked things which all too often
is of dismal quality. Antec goes against the grain
and still builds good power supplies. Spending half
this amount for a cheaper box is definitely possible
but is not reasonable.
Misc case fans - Heat is the killer of systems. I want
to be able to get several extra fans and over-engineer
the machine. It's cheap.
NIC - OpenBSD likes the intel nics. I've had excellent
results with them. This is a 100Mbit card. If we go to
1G, we'll have to do some research on this, and spend
more money. Good 1G cards cost a lot more than garden
variety ones do, so I'm assuming 100Mbit at this time.
SCSI card - We want SCSI over IDE for our main disks.
The card listed is an Ultra 160 speed card. We want to
try and go with an Ultra 320 system if possible. This
is the biggest unknown for me at this time. The card
listed is Ultra 160, not a slouch.
Floppy/CD-rom - standard items; I like Sony stuff.
SCSI disk - I like IBM disks; when kept cool they are
excellent disks. These are Ultra 320 disks.
IDE disk - Perhaps not needed. This would be very
useful for file storage, such as /usr/local/src and
other things that we don't use often.
Monitor - I am really beginning to hate most of the
monitors out there. The viewsonic is OK and not too
pricy.
Keyboard - generic keyboard, perhaps found for less.
Misc items - Those things like cable wraps, cables,
etc.
Item                      Price  Source   Comments
CPU - Athlon XP 2500      $180   Newegg
motherboard               $145   Newegg
motherboard               $145   Newegg
RAM, 3x 512M              $300   Crucial  $89-100 each
Antec box/ps              $250   Antec
misc case fans, etc       $100   Antec
NIC, Intel 100mbit        $ 33   Newegg
SCSI card, Adaptec 29160  $194   Newegg
floppy (Sony)             $ 18   Newegg
CD-ROM, Sony 48x          $ 51   Newegg
SCSI disk, IBM 18G        $142   Newegg   Ultra 160, 15K rpm!
IDE disk, IBM 80G         $ 98   Newegg
misc items                $150   misc
sub-total                 $1948
One question about the disk subsystem; why not go with a SCSI RAID card, and do hardware RAID? OpenBSD supports quite a few RAID controllers these days, and a RAID 5 setup would be a good investment.
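For anyone unfamiliar with what RAID 5 buys us: the parity block is the XOR of the data blocks, so the contents of any one failed disk can be recomputed from the survivors. A toy illustration (tiny byte strings standing in for disk blocks; not tied to any particular controller):

```python
from functools import reduce

# Toy RAID-5 parity demo: the parity block is the byte-wise XOR of the
# data blocks, so any single lost block is recoverable by XOR-ing the
# remaining blocks together.

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"grex", b"unix", b"conf"]   # three data "disks"
parity = xor_blocks(data)            # the parity "disk"

# Simulate losing disk 1 and rebuilding it from parity plus survivors:
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)   # b'unix'
```

The trade-off is that every write also costs a parity update, which is why a hardware controller (or plenty of CPU headroom) helps.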
I would recommend Tyan or Asus boards. I have always had very good luke with them in the past.
hahah luke=luck
The pumpkin doesn't have any exterior walls. Cutting a hole in the wall is only going to get us into somebody else's space. I don't think we've had any other drives die from dust. So Scott may very likely be right that this was mostly dust from a previous life. While the space isn't ideal, I think it's far from the deathtrap some people here seem to assume.
One can only go with what one's been told. Having never seen the space myself, I can't really judge except by what others have said. From what others have said, it runs too hot during the summer. Perhaps newer hardware will fix that; I really don't know.
I think we have pictures up on the web somewhere.
http://www.wwnet.net/~janc/grextech/pumpkin/
I wasn't aware that those pictures conveyed the sense of ambient heat in the room.
re resp:28: STeve, you only listed 1 SCSI drive. Doesn't Grex want two? $51 for a CD ROM sounds like a lot to me. I'll bet any number of us have one or more around that Grex can have for free. Same with floppy drives. Is there any disadvantage to getting such items donated? Neither would be used often. You didn't list a price for a monitor. Isn't that another item Grex could easily get donated? I'm assuming few people use the console for anything other than booting the computer. I wouldn't expect Grex would even require a color monitor, if a mono VGA one can be had any more. My point is that this is one more place Grex doesn't need to spend any money. No arguments on such "invisible" parts as fans, power supply and memory; the things no one will ever notice unless there are problems. It is not possible to have too much or too good of such things.
Hey, while you're at it, why not get a PCI multi-port serial I/O card and ditch the terminal server? They're only a couple of hundred dollars.
Re cooling: Is there anything *below* the pumpkin? ISTR it's in the basement. Maybe the floor can be used as a heat sink, if one is really needed. Definitely get RAM, double whatever you planned to get. And make sure the motherboards can handle as much as possible. A 32 bit processor should be able to directly address 4 gigabytes.
Steve likes new parts and he is putting in the actual time. We offered TWO 40X CD-ROM drives and we have several high-quality floppy drives which he turned down. I would not try to argue with his personal preferences considering he is the one going to be doing the work. Make him happy, it might get the new grex built sooner.
Something to budget for might be some flanges to attach to equipment fan outlets, and flexible tubing (dryer hose) to duct exhaust air from equipment to whatever passage carries air out of the Pumpkin. Probably wouldn't be more than about fifty bucks.
Again this would mean a hole in someone else's wall. If grex is surrounded by air conditioned offices, it might be sufficient just to reduce the amount of heat generated.
Heating is not going to be a major issue. The Sun-4/670 monster consumes several hundred watts of power and uses 35W SCSI disks. The new system will draw at least 100W less than that (more like 200W, I'll bet) and uses disks which eat less than 10W of power. The Pumpkin isn't great, but it's certainly no worse than the Dungeon was, and actually better than Ken's warehouse was. With fans at the side of the case we'll be OK. The heat that the new hardware doesn't generate will leave the room cooler, too. I think we need to worry over the power more than heat, and we have the Liebert UPS for that. All in all I'm pretty happy with the physical condition of the pumpkin.

As for random hardware pieces, I'm willing to use things like donated monitors and keyboards, because if they fail the system won't be affected. Things like CD drives are a different matter. We need to get good NEW equipment here. Moving onto a new platform is a major effort for Grex staff, and I simply do not want to skimp on anything. For one thing, PC equipment is now just about commodity stuff. It isn't that expensive, and while finding high quality stuff can be a little frustrating at times, it's out there. The second and more important reason to be picky about things is to try and prevent every kind of hardware failure that we can. Getting used equipment is more of a gamble and more prone to failures. I could supply Grex with everything but an Athlon and motherboard, but Grex really doesn't want to go the cheap route. We're going to do this the right way.

The CD drive might not be a little-used item, either. There are schemes that we will investigate where we may boot off the CD, and having that fail would not be good. So it isn't a case of making me happy, but of building a system that is as reliable as possible. These days that means building it yourself, using the best possible parts. Grex has never used inferior equipment, and it shows. Our Sun-4/670 is a marvelous beast. I can only hope that our i386 platform will be as reliable as that has been.
But STeve, getting things working is what *makes* you happy! I'd assumed the CD ROM would only be used for installation and then never used again. I don't mind Grex spending money on anything that will make any difference. I understand your goals, and support them. This is going to be a single CPU? No firewalls or mail servers? I know little about either of those things, but am curious as to why.
If there were TWO floppy drives and one failed you would have a backup. Are two currently good used ones as good as or better than one new one? I have never had a floppy drive fail and I use mine constantly. But we have been given machines with bad drives.
Regarding #45: This isn't 1987 anymore; grex isn't going to store data on floppies. A floppy drive is about 10 bucks; a CD-ROM about 30. That's not going to break the bank, so why bother worrying about it?
Steve's figures were actually about double that. But I am seriously suggesting that two used drives would give more reliability than one new one - put them both in the machine and you have a backup that does not require opening up the case. Same for CD-ROM drives, build with two of them. Or does this complicate the software?
Sindi, you aren't getting what I'm going for here. I want the next hardware to be as reliable as I can make it. I *don't* want to have a backup for something just because I think I'll need it; we will have a backup for the motherboard because of its unique nature in the list of hardware. I want to get equipment which is new, that we can break in and burn in, and have the highest factor of confidence in. If we start using used parts here and there, we might be OK. It's certainly possible that nothing would go wrong. But if I can get new parts for a piece of Grex which increase its reliability, then it's the right thing to do.

You'll notice that I omitted the keyboard and monitor from the list of things to buy. This is because if either fails, it won't affect the running of Grex immediately. We can boot the system up without a monitor (indeed, most of my OpenBSD machines have only had a monitor on them during the initial install and upgrades), and we have spare keyboards in the pumpkin already. So those are of a nature where a failure means we have to drag something over to Grex and use that instead. But other components--including the CD and floppy--are different. Those may not be used a lot, but when they do need to be used, there is a fairly high chance that they are really critical. And if we adopt a system where Grex boots off the CD, that becomes even more critical.

The bottom line here is that Grex needs to run as well as it possibly can. We don't have a full-time staff to work on things. Any real disaster means that someone is going to spend quality emergency time working on the system rather than sleeping, working, or living their life, and I'd like to see that kept to a minimum. By using the Sun-4/670 we've done about as well as possible. We have a commercial-quality system in terms of its reliability, and I'd like to see us keep that level on the new hardware.
As for having two floppies in the unit, that just strikes me as "wrong". If something like a floppy, which costs so little, is seen as untrustworthy, it makes more sense to me to buy a good one. Again, the idea here is to do it "right". Grex may be a hobby, but a large number of people see Grex as a utility. We should not cut corners when we can do something right. The one large advantage of the i386 platform is that most parts are fairly cheap. We have the possibility of doing things pretty well for not a huge sum of money, so I want to do that.
Consider booting from compact flash? I *think* I've seen a 512M one at Best Buy for not *too* much.
I would suggest a removable IDE bay. IDE, because it's cheap. Removable, because big IDE drives are downright cheap these days and we could use the same bays for doing backups. Just dd copies of the other partitions to the removable, and take it home for off-site security.
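A sketch of that copy-and-verify routine. In real use the source would be a partition device (a name like /dev/wd0a is hypothetical here); this version uses ordinary temp files as stand-ins so the logic is easy to follow:

```python
import hashlib
import os
import shutil
import tempfile

# Sketch of the removable-drive backup idea: raw-copy an image, then
# verify the copy by checksum before trusting it off-site. Temp files
# stand in for the live partition and the removable disk.

def sha256_of(path):
    """SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "partition.img")  # stand-in for the live partition
dst = os.path.join(tmp, "backup.img")     # stand-in for the removable disk

with open(src, "wb") as f:
    f.write(os.urandom(1 << 16))          # 64K of stand-in data

shutil.copyfile(src, dst)                 # the "dd" step
if sha256_of(src) == sha256_of(dst):      # the verify step
    print("backup verified")
```

Verifying before the disk leaves the building is the important part; a silently corrupt copy is worse than no copy.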
Looks good to me. Re #50: That wouldn't be very expensive, but I don't know as it's worth the effort to set up when you can boot off CD and there's already a CD-ROM specified. You also tie up an IDE device position with a CompactFlash adapter. Additionally, CompactFlash is pretty slow as IDE devices go, and if I remember right putting a slow device on an IDE cable also forces the other device on that cable to run slow.
A lot of thought has gone into this list. It looks good to me, too.
Re 45: if two floppies, the second DOES NOT back up the first. The only purpose for floppies on servers these days is interruption of the boot controller, and only the first one is bootable anyway.
My CMOS has a setting to switch A: and B:, would that help?
No, because there's no point in doing it. If grex's floppy drive dies (why do they even *need* a floppy drive, by the way? Just boot off the CD-ROM), the correct way to fix it is to go to the store and buy a new one, and chuck the old one in the nearest trashcan. Not pull out bubble gum and string and coax it into ``working.''
I suspect they're installing a floppy because floppy drives are really cheap, and every once in a while it's handy to have one. That's why we ordered them on the machines at work.
We have been quite successful in doing things the 'incorrect way' and recycling used equipment rather than using the trashcan. Have not needed any bubblegum. Cleaning a floppy drive every few years is probably less time consuming than a trip to the store.
Cleaning a floppy drive for two people who like to spend their time doing such things is less time consuming. Taking a system with >45,000 accounts offline for an hour to take out the floppy drive and clean it is probably a lot more time consuming, since you're no longer just wasting two people's time.
Does it take an hour to clean a floppy drive? There are cleaning diskettes that you can run without taking out the drive.
Sure, but then somebody has to *buy* a cleaning diskette and make sure it will be easily available 2 years from now when we actually need it.
I think we all appreciate that you're trying to save Grex money, Sindi, but I think the decision has already been made to use new hardware this time. Our system administrators are volunteers, and if they don't want to spend their time cleaning floppy drives and otherwise trying to nurse elderly hardware back to health I think we should respect that.
I already said that I think STeve should do this however he prefers. I thought we were talking theory here. I don't throw out my clothing or dishes when they need cleaning. And I do have two floppy drives in all my computers, which is handy when one needs cleaning (every 3-4 years, with heavy daily use).
You make heavy daily use of your floppy drive? I'm not sure I've ever used the floppy drive in my current computer.
Floppies are a dying media format. Current trend is for them to be optional rather than standard equipment.
I used them fairly heavily until I started networking everything together. Now it's rare for me to use one. When I have to I'm struck by how small, slow, and unreliable they are.
I actually still use mine a fair amount, for carrying files between home and my two workplaces.
I use mine to carry small stuff back and forth to a friend's place. Anything huge gets burned to a CD. I did indeed see 512M compact flash cards at Best Buy - retail for $179. This is almost as big as a CDROM, and costs not all that much more than a top o'the line burner. Advantage: no moving parts. Possible drawback: I don't know whether there's a hard write-protect switch. The idea of booting from CDR or CF is that the boot media should contain only the memory resident kernel, plus a .gz of the root filesystem to be used. It should expand to at least a gigabyte, possibly two depending on how good the compression is. In this manner, the system should run blindingly fast with most of it on a Ramdisk, and hard drives only used where absolutely needed. In addition, you get proof against root system corruption, as whatever hacks are committed get undone with a simple reboot. As for CF slowing down other devices, most modern boards come with two IDE chains, and besides, the hard drives are to be Scuzzy.
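The boot-media scheme described above rests on the root image compressing well. A minimal sketch of the compress/inflate round trip (the "image" is synthetic repeated text, so the ratio it prints is illustrative, not a measurement of a real filesystem):

```python
import gzip

# Illustration of the "kernel + root.gz expanded into a ramdisk" idea:
# text-heavy filesystem contents compress well, so a root image can be
# stored gzipped on read-only media and inflated into RAM at boot.
# The sample data here is synthetic, not a real filesystem image.

root_image = b"#!/bin/sh\necho grex\n" * 5000   # stand-in for / contents
compressed = gzip.compress(root_image)

ratio = len(root_image) / len(compressed)
print(f"image {len(root_image)} bytes -> {len(compressed)} gzipped "
      f"({ratio:.0f}x)")

# Rebooting undoes any tampering: re-inflating always yields the original.
assert gzip.decompress(compressed) == root_image
```

The same property is what makes the "hacks get undone on reboot" claim work: the running root lives in RAM, and the authoritative copy on the read-only media never changes.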
I don't think CF cards have a hardware write-protect switch. At least, I've never seen one that had one.
What is grex planning to use a floppy drive for?

Most of the files that I produce are translations of under 20K and they fit nicely on floppy disks. I use one disk per translation agency. My 360K disks are 95% reliable (I lose maybe 1-2 per year). I don't trust the higher density ones; they are always going bad. I move files between computers on the 720K disks. We have several computers at each of three locations. This is quicker and easier than uploading and downloading via grex or an ISP. Most of our DOS files fit on 720K disks. I have a little file splitting program that I have used on 2M files (such as a Linux distribution). A page of text, single spaced, is about 2K (page = screenful).

I don't recall ever having a floppy drive go bad; they just get dirty and won't read all the disks after a few years. We have been given computers with bad floppy drives (they will do a dir but won't read a file). To replace them you remove the monitor (unless it is a tower), unscrew a few screws, remove a cover, unscrew a few more screws, unplug two cables, replace the drive, and reverse the process. Total time perhaps five minutes. Before actually replacing the drive you can plug in the new one and make sure it fixes the problem.

In 50 Borders computers that were really heavily used for many years and were full of large dust bunnies there were a lot of bad floppy drives. Perhaps the dust causes them to overheat. It might be a good idea to vacuum out the grex computer once a year. You take them outside and blow the dust out.
Just some random thoughts. IBM is no longer in the drive manufacturing biz and hasn't been for some time now. They are sweet drives and stand up to lots of abuse, and it sure wasn't for technical reasons that IBM got out of that biz. Be careful of older 18G drives, though (sold as 'new' in 3rd-party channels). Adaptec is having financial problems; I seem to recall they just laid off a lot of people recently. It kinda doesn't matter if you don't expect support in the future anyway, but it kinda makes you wonder what manufacturing shortcuts might have taken place in the recent past. Nothing bad to say about AMD CPUs - more integer 'pute bang for the buck. (We are talking about delivering text to the Internet over a slow pipe, not massive floating point calculations.) I'd like to see some analysis of the current hardware and where its bottlenecks are. Seems to me that the huge theoretical performance allowed by Adaptec/fast SCSI drives is still limited by the transfer rate of the PCI bus. Thus the delivery of little chunks of data that constitute web browsing of raw ASCII text (the primary function of grex) might be equally well served by much larger, cheaper IDE disks. And it's easy to get 16 IDE disks in one system. (You were talking only two screamer SCSI drives, right?) Fat disk trying to send data to slim pipe? It doesn't matter if the PCI controller can read the screaming disk at 160/320 in theory if the best it can deliver is 66, and with most MBs 33. More memory is good. It's cheap and a whole lot faster than even the fastest disk. Don't just buy a spare identical motherboard which sits on a shelf gathering dust until it is needed. Take the next step. Invest the little extra in case/PWS/CPU/memory and put it online at the same time. Look at clustering/HA/failover/distributed computing. Seems to me there are a number of well-developed applications. Look at what you are doing. Sure, writing a post to the disk is atomic, but reading sure isn't.
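The bus-bandwidth argument above can be made concrete. These are the textbook peak numbers for PCI (clock times bus width); real sustained throughput is lower, and the poster's 33/66 figures presumably refer to something similar.

```python
# Theoretical peak bandwidth of a PCI bus: clock (MHz) x width (bytes).
# These are standard spec-sheet numbers, not measurements.

def pci_peak_mb_s(clock_mhz, width_bits):
    """Peak transfer rate in MB/s for a PCI bus of the given geometry."""
    return clock_mhz * (width_bits // 8)

print(pci_peak_mb_s(33, 32))   # common desktop PCI: 132 MB/s
print(pci_peak_mb_s(66, 64))   # server-class PCI: 528 MB/s
# An Ultra320 channel's 320 MB/s theoretical peak already exceeds
# what plain 33 MHz / 32-bit desktop PCI can carry.
```

Which is the point being made: the SCSI channel can outrun the bus it sits on, let alone the DSL line behind it.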
There is no reason that reading an item via a web page (probably the vast majority of the functionality) couldn't be a distributed task. Why BSD? I used to be a BSD bigot who felt serious stuff should be done on it. But now Linux, with major players spending tons of bucks, is where everything is happening, and it gets backported to BSD et al. With Linux you have things like MOSIX. Any thoughts of approaching name hardware manufacturers to donate for the tax write-off? I seem to recall that it has been mentioned in the past. Why spend yer own money in the first place? So you don't get the latest and greatest, but you get it for 'free'? Just my 2-cents worth.
Pvn, care to elaborate how you get 16 IDE disks in one system? At two disks per channel you'd need eight controllers to do that. One will be on the motherboard, but that still leaves seven, and most machines just don't have that many PCI slots.
M-Net looked into asking for donations for a server, several years ago. I don't recall Grex ever doing so. I don't think processing power will be any problem with a modern PC-based system. M-Net has never had problems processing the amount of data they required, and Grex is not that substantially different. M-Net is still very much usable with load averages into the 20s. In fact, as a user, one doesn't even notice such loads. I don't think we need to worry about distributed computing just now. Distributing the computing load would add complexity to a process that's already going to be complicated for the staff. I'd hope someday we can get a mail server, but let's let staff get the new computer up and running first. Then they can add frills. We don't have to have it all at once.
Re #72: Err, sorry, two will be on the motherboard, generally. Still, machines with six PCI slots aren't terribly common, and you'll use up a lot of precious IRQs...
Good point. SCSI needed.
Grex has had equipment fundraisers for hardware before. Since we've previously gone with trailing-edge CPUs, previous fundraisers have been for memory, hard disks, etc. I don't know if IDE commonly supports overlapped seeks yet. With only 2 devices per channel, there is of course less advantage to overlapping seeks, but all other things being equal, a 2-disk IDE chain that can't do overlapped seeks is going to perform less well than a 2-disk SCSI chain which can. With small block transfers, overlapped seeks and more spindles per given capacity (i.e., smaller drives) may be more important to us than transfer rates. There are 2-, 4-, and 6-channel IDE controllers. A 6-channel IDE controller can attach up to 12 IDE disks using one PCI slot. I've heard people claim that some of these mega-channel IDE-based systems are very fast disk machines, and that it even makes more sense to do software RAID than hardware, on account of the CPU having so much more memory to buffer things. I don't know how much truth there is in all this. One difference that is likely to be important to grex is that SCSI drives today typically go into "server" machines. IDE drives available via retail channels are most commonly going into desktop machines. There is a real split in the PC x86 world between server, desktop, and home machines, with a corresponding descent in quality (and reduction in reliability) between the three. This is a recent development, so I'm afraid Sindi won't have seen this in any of the machines she sees. Since SCSI drives mainly go into server-class machines, there is a chance they'll be more rugged and reliable.
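The overlapped-seek advantage can be shown with a toy model. The 9 ms seek time and the request count below are invented for illustration; this is an idealized best case, not a benchmark.

```python
# Toy model: time to service N small random requests spread evenly
# across 2 drives on one channel, with and without overlapped seeks.
# The 9 ms seek time and request count are invented for illustration.

SEEK_MS = 9.0

def serialized_ms(requests):
    """No overlap: every seek occupies the whole channel in turn."""
    return requests * SEEK_MS

def overlapped_ms(requests, drives=2):
    """Perfect overlap: the drives seek in parallel."""
    return requests * SEEK_MS / drives

print(serialized_ms(1000))   # 9000.0 ms
print(overlapped_ms(1000))   # 4500.0 ms
```

In this idealized case a 2-disk chain with overlapped seeks finishes the same seek-bound workload in half the time, which is why, for small random transfers, spindle count and overlap can matter more than raw transfer rate.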
Looking at the prices you guys are looking to pay for stuff, it really seems like you're spending a hell of a lot more money than you need to. You can get a 54X CD-ROM brand new for $28 at Sky-Tech in Ann Arbor. And it's not some cheap knock-off drive. Floppy drives cost no more than $14.95 locally, and you don't have to pay shipping. A lot of those costs that were listed really are inflated for what you're getting.
re#76: IDE drives are currently optimized for large storage (windoze bloat) and sequential access at high rate. (Speaking really generally, and it seems to me it's been years since there was any increase in rpm - like 7200 has been around for a while now and that is tops.) I absolutely do not disagree that SCSI drives have always been far superior at chunky random access. However, the reason IDE is not generally seen in server-class machines is because it is not hot-swappable - it's not a problem or question of being less reliable or lower-quality manufacture. Indeed, I think WD for example has a 5-year warranty on drives, and I don't recall any SCSI manufacturer offering more. The major reason IDE drives are so much cheaper is economy of scale - you sell a ton of them for every SCSI drive. Plus the IDE drives really are stupider, although that advantage has gone away over time. What you gain by using really fast SCSI drives you lose by going with PC hardware (remember what PC stands for). Your motherboard itself is suboptimal as a 'server class' machine in the first place - and I think this is an argument you know full well. That being said, the pure integer computes that modern PC CPUs deliver overcome that by brute force - so what if you "share IRQs" when you can do it so fast. So for grex the PC motherboard is sure appropriate. Steve is theoretically correct in that if grex were a server delivering an indexed 36G of data - such as a database - over a fast medium such as GigE, then absolutely the SCSI solution is the way to go. I'm just not sure that is a good model for what grex actually is or does. With a couple gig of memory I wouldn't be surprised if most browsing of conferences weren't satisfied from memory, in which case disk speed is irrelevant. I'm also sure that the use of SCSI won't hurt; I'm just not sure it will help as much as folk think. Again, to your users you are delivering ASCII content over a thin pipe.
re#72: And as mdw already pointed out, yer modern MB already typically has 2 IDE channels (2 IRQs) for 4 drives total. And there are PCI IDE controllers that can be add-ins (typically sharing an IRQ). I personally run one 'server' that has a total of 8 IDE drives (cheap Maxtor add-on controller). Theoretically I could easily be serving close to a TB of disk. (In fact I'm running mostly a bunch of 500M drives that I got from the trash of a firm that decided the proper method of data destruction of old drives was to toss them about 50 feet across a room into a dumpster. Of the 20 or so drives I 'dived', 17 were apparently good - data was probably intact, although I simply built Linux filesystems on them. My big difficulty was to sheet-metal-screw together the frames of two old 'tower' cases front to back and saw off the opposite sides of the cases in order to have enough bays for all the drives - that and the y-cables... it sure don't look too good, but it works and has now for about 2 years at least.) re#73: HA/clustering/whatever you want to call it has been around a long time now. This is no longer rocket science. Once set up, there really isn't that much more to do, especially for something like grex where generally the users all do the same thing they ever do, over and over, and over again. The advantage of spending a little more time on the front end is that your single point of failure becomes your upstream connection (which is a significant POF in my opinion, but one that you shouldn't bother to address - nobody dies if they can't get logged into grex). The other advantage is that it gives you the ability to do rolling backups and rolling upgrades. Unless you are a hardware maintenance organization, the concept of having perfectly good hardware sitting gathering dust is silly. Again, it's just my 2-cents worth and a couple minutes of typing based on years of experience fixing problems involving systems a little larger than grex.
And if you are gonna have an identical MB on the shelf gathering dust, then you should make sure that you have at least 2 identical PWS as well. (And no, you really don't need to pay for server class hardware either. So what if grex is down for a couple days? You don't lose money and nobody dies.)
I'll put in another plea for a rack mount case. Even if you don't want to colocate somewhere now (a bad decision being made by refusing to look at current information, but not really worth arguing about at this point), it will keep that option open for the future. More to the point, the rack Grex already has could easily hold a rackmount PC server case, the DSL router, modems, a spare server for development work, the keyboard and monitor, and have lots of room to spare, freeing up the rest of the space in the Pumpkin for whatever it is that people think we need the Pumpkin for.
re#79: Rack mount cases are appropriate for a lot of things; grex isn't one of them. First, they tend to be far less forgiving of environment. Second, they tend to be a lot more expensive. Shelves for racks holding standard PC cases are a lot cheaper and give the same space-saving quality. Throw out the rack, and cheap plastic shelving units perform the same function. If grex ever has to colo, then a cheap 2RU case at that time is probably better.
Regarding #80; what do you mean they're less forgiving of environment? It's been my experience that rackmount cases are far more rugged than your average tower.
Without a rack, rackmount cases are less convenient than mini-tower cases. With a rack, the rackmount cases become much more convenient. I just worry when I see people talking about getting a really expensive full tower case that the case will become a big limiter of future options. Full tower cases are fine when you've got one or two of them in a room (Grex's current situation), but they really don't scale.
Re #81: Rack mount cases are very restricted inside, which means cooling is more difficult and the ambient room temperature is much more critical. We have some 1U rack-mount servers at work. They have about five fans each, and the air that comes out of them is pretty hot. If we ever do decide to go with colocation, the hardware could be moved into a rack case.
1U rackmount cases are a special kind of beast, requiring special components to go inside (normal PCI expansion cards don't have room to stick vertically out of the motherboard). 1U rackmount cases certainly aren't the only kind of rackmount case out there.
Here we go, re-arranging deck chairs on the Titanic --- If you want hardware that will replace the current machine in its current environment, then cheap commodity PC stuff is the way to go. My only question is whether high-end 'server class' SCSI drives on a PC platform are the way to go, nothing more. (Is it RU or U? I'm not clear.) Point being, if you don't have a reasonable temperature environment, then rack mounted is not the way to go. Which is meaningless drift. I don't think anyone is suggesting spending the bucks for stupid racks instead of PC boxes. I merely suggest two things: first, that one reconsider SCSI in the first place, and second, that one budget for power supplies and have them on the shelf at least.
Just to give some of my experiences: with PC hardware, you want to have your /var/mail (mail spool) and swap on SCSI disk. The rest tends to be less relevant. I "skipped to the end", so if backups haven't been discussed, I suggest grabbing some cheap disk and having hot backups available on already-spinning media in the same room. I've found this invaluable in my environment, going across a crossover ethernet cable. You probably want daily (or even hourly?) backups of /etc (hourly of /etc/passwd perhaps?) in order to allow for easy recovery. I might be able to donate some hardware towards this.
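The hourly-snapshot idea above is simple to sketch. This is a minimal illustration using Python's tarfile module; the paths and naming scheme are assumptions, and a real setup would run from cron and prune old snapshots.

```python
# Minimal sketch of a timestamped snapshot of a config directory,
# in the spirit of the hourly /etc backups suggested above.
# Paths and naming are illustrative; a real setup would run from cron
# and prune old snapshots.

import os
import tarfile
import time

def snapshot(src_dir, dest_dir):
    """Write dest_dir/<basename>-YYYYmmdd-HHMMSS.tar.gz; return its path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = os.path.basename(src_dir.rstrip("/"))
    out_path = os.path.join(dest_dir, "%s-%s.tar.gz" % (base, stamp))
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(src_dir, arcname=base)
    return out_path
```

Pointing `dest_dir` at a cheap already-spinning disk (or at a mount served over that crossover cable) gives the fast-recovery copy described above, without replacing tape backups.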
re#86: With a lot of RAM, why worry so about swap space? And in grex's case - all data at best over a thin pipe - with proper tuning even a micro$oft OS can keep up with email over DSL or cable speeds using IDE drives (hopefully you don't have all that much local email over even 10mbs ethernet). With high-density IDE drives you have a lot of spare space to do lots of backups. And with mirroring IDE controllers, or even software RAID, you have a lot of fault tolerance. Even with neither, and simply more big disks (JABOCD), you still got lots of fault tolerance if you put your mind to it. As for rack cases: that dog don't hunt. If you need to cram a lot of CPUs into a small expensive space with climate control, then rackmounts are surely the way to go. Even in a modern office environment with HVAC and cleaning services, rackmount is questionable. I don't know what the grex current physical environment is, but I bet it's far more 'dirty' and with a much higher range of temperature than a modern office. More like 'home', and thus a cheap conventional PC case with lots of 'dead space' is the way to go for that - big fan and lots of room for dust.
re #87: Virtually all modern unices swap out unused processes, and with a high smtp and other load on the system you will see a continued need to swap out a few processes, as memory is used more efficiently as disk buffers. I'd rather have my shell process be swapped out while I'm in bbs, allowing caching of the password file for background smtp delivery, than keep the shell in memory and make the password lookups slower.
Here are a few comments from a perspective that a) is West Coast, and b) has developed over the past 25 years of building and operating large-scale data centers on a daily basis. First, I live in Silicon Valley, and visit the surplus sources here basically every weekend with a couple of friends for entertainment purposes. We've been doing this for nearly three years now. So, every time I hear about people making regular purchases at CompUSA, Best Buy, and similar national chains, I *CRINGE*. However, I lived in Ky. for 10 years or so before moving out here, so I understand how people get into that. Anyway, if Grex wants me to compare prices that I see out here with what's available in MI or via mail-order, let me know. I'll be happy to. Second, out here in Silicon Valley, the .com meltdown means that, literally, 200,000 jobs EVAPORATED. This has had an enormous impact on availability and pricing of all manner of commercial hardware. Example: I just bought what I refer to as my new "Not H-P" machine. It's "surplus." It has NO dust in it and is comprised of a 3.06 GHz P4, 512MB of PC2100 memory, a Radeon 9700 Pro graphics card, on-board Ethernet, a 120GB Maxtor IDE drive, floppy, CDRW+DVDRW combo drive, a DVD-RAM drive, and a modem card. My price? $1200. No kidding. Why do I call it "Not H-P"? Because it was made for H-P and was an overstock item. So, H-P surplused it and forced the OEM to paste pieces of plastic on the sides of the case where the H-P logo is normally visible. However, to anyone who has ever seen an H-P PC, it's OBVIOUS what it is. I saw an earlier mention of a "Liebert UPS" in this thread. I hope that means that Grex is in possession of what used to be regularly known as a "True Online" UPS.
One where utility AC power is converted by the UPS to DC, then BACK to AC, so that the hardware connected to the UPS is fed power that is totally clean because it has gone through a complete AC->DC->AC conversion and therefore has a perfect 60 Hz sine wave ALL THE TIME. No surges, no sags, etc. The value of such a UPS design cannot be overstated. It will prevent countless numbers of problems from ever occurring. To me, this issue is even more important than the details of the power supply and cooling within the case. Any dollars invested in a *real* UPS will last far longer than dollars invested in the computer itself. Finally, I would like to disagree on the subject of tower vs. rackmount cases. I'd vote for a rackmount unit. It doesn't have to be 1U. As someone mentioned earlier, that has undesirable side-effects. But consider that with rackmount cases, you can easily get LOAD SHARING POWER SUPPLIES! Such a case will be equipped with two of them, and the load can be handled by only one. Should a PS fail, the load is picked up by the surviving unit. Then, you just slide out the failing one and replace it. No rebooting, etc. John
There are tower cases available with hot-swappable power supplies, as well. It's nice, but given the reliability of power supplies and the non-critical nature of Grex, I think it's probably unnecessary. (Heck, Grex is often taken offline just to run backups.) It's worth doing if it doesn't cost much more, though. The big disadvantage I see, other than the cost of the case, is that you generally have to use 'special' power supplies then, instead of standard ATX ones.
'special' meaning higher price? Like the 'special' SCSI drives (36G total) instead of the 360G that one could get for the same price or less? I mean, if you are going 'commodity' PC hardware for the MB, why not go with commodity drives? If you want, for about the same amount of money as the SCSI drives you could do Fibre (using an obsolete controller, I'll grant you) and kick SCSI's butt. You could theoretically have 1Gbps over a medium that could theoretically deliver such over a 10KM distance using optical fibre. Odd thing is, even at 66MHz 64-bit PCI they all seem to be about the same when content is delivered over yer average Internet connection....
Because your 'commodity' drives rely on the central processor for all disk I/O, whereas using SCSI offloads that to a separate processor (on the SCSI controller). But the performance gains from SCSI are clear to be seen. On any system that gets the volume of mail and users that grex does, you need fast disk for the day-to-day operations. swap, mail, and /etc/passwd all take quite a hit. In my own personal mail/web/whatnot server, once I made a recent switch to SCSI from IDE (with the exception of my truly mass storage, i.e. /home and /mp3 partitions ;-) ) the system performance increased greatly. With a userbase the size of grex's, that type of benefit cannot be ignored.
Re #91: "Special" meaning proprietary to the particular case manufacturer.
Re rack mount case. I think the cooling thing is a non-issue. There
are plenty of people who take short cuts on cooling. A tower case with
"bad" cooling is no worse than a rack mount case with good cooling.
Any inherent disadvantage rack mount cases might have is probably going
to be cancelled out by the fact that rackmount cases go into
environments where more is expected of them. *Neither* of these
cases--tower or rack mount, is going to have cooling anywhere near
equal our current sun hardware. That's a reality we've already
accepted by going to x86 hardware. I think for grex the real issues
are:
A rack mount case is going to be *slightly* more expensive.
(the estimate I heard was $150 higher.)
I don't think grex is actually likely to move into a
rack mountable space in the next 18 months
If we do move, the expense to buy another case and move our
guts is probably the least of our "moving" expenses.
If we do rackmount, we should probably get 2U [ which would
probably impact our rental costs slightly were we
ever to colocate. ]
We could certainly do rackmount in the pumpkin today - we have the very
heavy sun rack mount case sitting there empty today. It's even got
some very impressive fans of its own. I see mostly small disadvantages
to the rackmount (slightly more expensive case, more electricity for
fans) that doesn't quite equal the potential "advantage" of moving to a
colocation space where we'd have to be rackmountable. But frankly this
doesn't seem like a big point to me.
Perhaps we should commission John Doyle to find really cheap
rackmountable cases in SV. If he can get cases no more expensive than
good tower cases, then I think that makes the difference insignificant
and worth going rackmountable. The negative to buying everything
surplus is much like our past bottom feeding habits - except the cost
is slightly more.
Jared is right that older IDE drives relied on the CPU to do
"programmed I/O". But this is no longer true, besides which there were
also even stupider SCSI controllers that also did programmed I/O
(mostly for the scanner market, so that's thankfully all been replaced
by USB today.) Basically, SCSI and IDE have been playing leapfrog with
each other, so today's fast IDE subsystem will outperform yesterday's
best SCSI. I don't think it's ever really been true that IDE drives
took fewer components than SCSI. The main win IDE used to have is that
it required fewer components *overall*; but I suspect this is both no
longer true (with ide dma) and no longer important (with the degree of
component integration we have today). What I think matters the most to
us is the relative markets SCSI and IDE aim for; SCSI aims for server
configurations, IDE aims for personal machines. Server configurations
are going to have greater demands for reliability and random I/O
throughput - at a price. We're going to have to pay attention to be
sure the advantage continues to be real, and that the price remains
acceptable. We will also have to accept that whatever we buy today
*will* be outclassed by something out next year - which will be both
faster *and* cheaper, and maybe even more reliable.
Marcus, I've noticed that even modern systems using the latest (E)IDE technology still see a considerable hit on the CPU for any disk I/O. This is something that I think is important to keep in mind for Grex, and to plan around to get a good price-performance ratio.
If the rationale for SCSI is reliability, then perhaps you should also consider mirrored IDE drives. (If at least one is in a removable bay, a backup could be as simple as swapping drives and letting the new drive be rebuilt. I haven't done this, so I'm not sure about implementation. The same would hold for SCSI, but it will be faster -- and more expensive.) If drive speed is a concern, get the 15K RPM U320 SCSI drives. (I don't believe this has been mentioned, so that might be the plan rather than 10K rpm drives. I'm not sure if these are available in 18 GB denominations or just 36 and above.) Lastly, as bdh mentioned, IBM doesn't really make their own SCSI drives any more. I'm not sure if this is true across the line, but some recent 36 GB 15K U320 drives I installed were actually Hitachi drives. (You should be able to get IBM 18 GB U160 10K drives for about $100.)
It would be interesting to know what the CPU bottleneck is with (E)IDE these days. I sure haven't had the time to actually look. It shouldn't be DMA, so a good kernel profiler would be entertaining to run. IBM sold *all* their hard disk stuff to Hitachi. They've been busy getting rid of all their magnetic storage stuff. Given the length of time they've been in the field, the only reason I can see for them doing this is that they have good reason to believe magnetic storage is going to become obsolete fairly soon. I don't think this is of any immediate importance to grex, but if I were investing in the stock market, I might consider this very interesting. I believe STeve is looking for 15K U320 SCSI. There are at least 2 problems with mirrored IDE -- proprietary controllers, and performance during that "rebuild". The most common chipset seems to be Adaptec "aac" - there are Linux & OpenBSD drivers for this, but it's not fully functional. The RAID management stuff in particular loses; I'm not 100% convinced we would necessarily even know we lost a disk -- until we lost the 2nd one and were screwed. Regarding performance - I think the "7/24" shop most people have in mind with RAID includes windows of relative idleness. If you have a truly disk-intensive load with no letup, then the rebuild never completes. Fortunately, grex doesn't have that, but we have seen that disk-intensive things start slowing everything else enough that the load average starts to pile up and build. If we had to go visit the machine in person to install a new drive, then leave it in single-user mode during the rebuild, I'm not sure we've really gained all that much vs. the traditional "restore from tape" model, *especially* if this is liable to happen more often. There's another issue to think about too -- our reason for doing tape backups is *not* just hardware reliability, but also to cover the case of vandals destroying information.
Online backups don't protect against this - if a vandal can destroy active filesystems, he can get at the backup just as easily - and one of the more attractive attacks he can make is to install a trojan in the backup then destroy the active filesystem. So, um, ya the mirrored IDE is an interesting option, but I'm skeptical that it makes sense for us. Sure, if we had extra time & money, mirrored storage could be fun, but I don't see it as really replacing the need either for reliable hardware in the first place, or backups to cover the case of vandals in the second.
Most ``24/7'' shops I've seen really are 24 hours a day. They use disk subsystems a lot more interesting than what you think, though.
Most activity I've seen is actually centered (somehow) around human schedules. Even in hospitals and in the travel industry this is true. To get something approximating 24 hours of real activity you pretty much need some sort of global presence (or some sort of artificial constraints that causes humans to rearrange their schedule to suit the computer). Despite the recent ubiquity of the internet, and the even more recent fall of the dollar in international markets, I doubt this is nearly as true of US business in general as Dan's experience apparently indicates. And, of course, silicon valley isn't necessarily designing for Dan's world either, despite the illusion their marketing droids cast. If they were, there'd be a lot more discussion about the possible performance hit while rebuilding a portion of a raid array.
Many factories in places like China run 24 hours to keep costs down.
Lots of factories in places like the US run 24 hours a day too.
I remember one site I had a mail account on that had some kind of external SCSI RAID storage array. One day they had a disk fail and the rebuild, which had to be done offline, took a week to complete. They were not amused.
#101: :)
Re #97: Incidentally, isn't it equally likely that the reason IBM is getting out of the disk business is that they feel there are no major new bit-density breakthroughs to be made? In the past they've done well by being on the cutting edge, but if capacities are going to plateau soon it's not going to be very profitable for them to compete with other companies in that area. I also recall they have a big class-action lawsuit against them over failures of some of their drives, and the selloff may be a way to get out from under that.
I'd expect that the price IBM gets for the disk business would fully reflect the lawsuit liability. The version I heard was that IBM couldn't get the kind of (financial) returns in the beaten-down, cut-price disk market that it could in most of its other lines of business; so IBM decided to move its investments to where the returns were better.
In the US, labour is expensive, and the economy is down. I don't see an expansion of economic activity at 3am happening in the near future. I haven't heard of any technical barrier to higher density disk drives in the near to medium term. I imagine we will see a slow-down in disk drive growth, but that will almost certainly be due to the economy and demand affecting research and nothing more.
I purchased the first component of NextGrex today - a CDRW drive from CompUSA.
Hooray! I feel like we should have a ceremony or something.
Or a meeting! ;)
I don't think Dan would be able to attend the meeting.
I didn't realize a CDRW was being considered. I have no objection, but does Grex have something in mind for that?
(backups?)
boot disks. if you boot from a cd-r with your custom system on it, it makes it impossible for anyone without physical access to the machine to make unauthorized mods to the system.
That's my understanding as well, from what STeve told me.
No, the commute is a little difficult.
We now have a case to go with our CDRW drive - an Antec Plus 1080 AMG File Server case, with 430 watt power supply, that I bought today at CompUSA, at STeve's request. ($159.99).
#115: wimp. =D
Mark, we're not as convenient as the corner store, but we could probably get you much better pricing on such components. For example, we normally sell the 1080AMG for $140 and could do better for Grex. I'm in town at least once a week, so I could even deliver (though I'd rather not bring out one component at a time). If you have a list of stuff you need, feel free to email me: LK @ stratcom.com
Thanks Leeron. I'll send mail.
Don't support Zionism, or I won't support Grex.
If we buy stuff from Leeron at a cut-rate price that he could actually sell to other people at list price, plus making him waste his time delivering it to us, then we are probably costing him more money than we are making him. So if you think Leeron is synonymous with Zionism, then this is a stroke against Zionism, and you should send us all your money so we can buy more stuff from Leeron. Assuming you have any money. Thanks Leeron.
*ROFLOL*
Leeron is extremely generous with his time, as attested by him making numerous trips to deliver at least 50 computers etc. to us for recycling that he had saved over the years rather than putting out in the trash. (Some of these have now been donated to Eritrea, in working order, for use in schools there.) He does get to Ann Arbor pretty frequently for non-delivery reasons, though.
That's really cool about Eritrea. Makes it all worthwhile. In good socialist Zionist tradition, we subsidize our clients. If I were a religious Zionist, I'd probably be doing something dorky, like giving away free matzah with the purchase of a computer. So to one-up them, we'll include Israeli chocolate. With or without nuts, your choice....
We provided 9 computers for Eritrea - some from Leeron, some from Tim, and others that we had been using but were able to replace with Leeron's pentiums (the half of them that worked) and a few other computers from other grexers given to us as dead. They will be used with Linux and a wordprocessor, 250M or more hard drive, 16M RAM. All 486s. We were also able to upgrade several friends with pentiums (thanks, Mary). And we have more left. Thanks Leeron. We did not send chocolate. When I changed money at the border going into Greece from Macedonia, the banks gave everyone a free Easter bread.
OK, we ordered the processor, motherboards, and SCSI controller from Leeron. We still need the disks and memory.
I hit a little snag trying to order from NewEgg - they won't ship to a P.O. Box or to an address that's not registered with our bank. Unfortunately the address registered with our bank *is* our P.O. Box, and they tell me they have no way to add another address to our account. I wrote to NewEgg for suggestions on how to get around the problem. I'd handle this by putting things on my credit card and then reimbursing myself, but then we would technically owe use tax on what we buy, whereas if the money comes directly from Grex we don't.
I spoke with Monique at NewEgg just now. I'm not convinced she knew what she was talking about, but when I suggested sending a check with a letter giving an alternate shipping address, she said that would work. So I'll do that. I'm waiting now to hear back from STeve on whether we should get the memory from NewEgg now too, 'cuz I don't really want to go through the process of sending them a check and waiting for them to deal with it twice. Can anyone explain to me the way memory is defined these days? Our motherboard has this in its description: 3 x DDR DIMM PC3200/2700/2100/1600 (DDR400/333/266) non ECC SDRAM (Note: PC3200 Max. to 2 banks only) which I think tells what type of memory it will accept. Can someone decipher that for me? In the memory aisle, I see stuff like this: Crucial Micron 512MB 64x64 PC2100 DDR RAM, 184-Pin, CL=2.5-Unbuffered 2.5V, 6-Layers CT6464Z265 Requires DDR supported Motherboard - Lifetime Warranty. Model#: CT6464Z265 -OEM But there are lots of other options for 512MB as well, with different numbers and different prices, and I don't know which we want.
OK, Steve says to get the memory from Crucial, not NewEgg, so I went ahead and ordered both of our disks from NewEgg. $212 for the 18GB SCSI, $96 for the 80GB IDE. Total $308.
Did the SCSI controller get ordered, too?
We ordered the processor, motherboards, and SCSI controller from Leeron. He's going to deliver them to me this weekend.
Cool....
This response has been erased.
I hope we got something that does ECC. STeve was having trouble
locating this, and the description above has me slightly worried. The
PC and Macintosh worlds think ECC is unnecessary; in the "server" world,
ECC has been pretty much universal for at least a decade.
3x = 3 times
DDR, SDRAM = different memory bus chip interfaces
DIMM = physical package style
PC3200/ etc. == probably different PC world standards for memory;
in this case, sounds like there's a collection of related
standards that probably only differ in speed.
banks = in this case, probably slots. More generally, a section
of memory that is addressed and responds as one unit.
6-layers = number of layers in PCB - not generally important
except as a measure of the cost and engineering in the design.
184 pin == # of conductors in a connector. The "pins" are more
often pads or fingers in modern designs.
unbuffered = no drivers. Generally faster but less fan-out.
ECC = error correction code. Generally such memory can
fix single-bit errors and detect double-bit errors.
parity = error detection. Can detect single-bit errors.
virtual parity = memory that lies and says it never has an error.
no parity = memory that can't detect any errors.
But I'd let STeve tell you what to get rather than spending too
much time trying to figure out what all the ciphers mean. As long
as the computer and the memory like each other, it's not really
important whether they measure things in terms of pins and banks,
or squirrels and pints. They'll have a different set of ciphers
next year anyways.
The number of memory slots has been declining in recent machine
designs - 2 or 3 slots is pretty common. This may be an indication
of where "unbuffered" becomes important; to get more slots they'd
probably have to add additional buffering which might slow things
down.
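To make those module names a bit more concrete, here's a quick sketch of the arithmetic behind the PC numbers (my own illustration, not anything from the vendor listings above): the PCxxxx rating is just peak bandwidth in MB/s, i.e. the DDRxxx effective transfer rate in MT/s times the 8-byte width of a DIMM's 64-bit data bus, with the result rounded for marketing.

```python
# Rough decoder for DDR module names: PC3200 = DDR400, etc.
# The "PC" number is peak bandwidth in MB/s: the effective
# (double-data-rate) transfer rate times an 8-byte-wide bus.
BUS_BYTES = 8  # a standard 184-pin DIMM has a 64-bit data bus

def ddr_to_pc(ddr_rate_mts):
    """Exact bandwidth in MB/s; marketing rounds it (2664 -> PC2700)."""
    return ddr_rate_mts * BUS_BYTES

def pc_to_ddr(pc_rating):
    """Approximate effective transfer rate back out of a PC rating."""
    return pc_rating // BUS_BYTES

print(ddr_to_pc(400))  # 3200 -> sold as PC3200
print(ddr_to_pc(333))  # 2664 -> sold as PC2700
print(ddr_to_pc(266))  # 2128 -> sold as PC2100
```

So the PC2700 modules ordered from Crucial are DDR333 parts, one notch below the PC3200/DDR400 ceiling the board only supports in 2 of its 3 slots.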
Thanks Marcus. Here's the full description of the motherboard that STeve picked out, and we ordered from Leeron: ASUS A7V8X 1000Mb/s LAN, Firewire IEEE1394, Serial ATA DDR400 AMD Athlon/Athlon XP/Duron Socket A, Processor Mother Board Specifications: Socket A - AMD Athlon/Athlon XP/ Duron Chipsets: VIA KT400/8235 FSB: 333/266/200 MHz 3 x DDR DIMM PC3200/2700/2100/1600 (DDR400/333/266) non ECC SDRAM (Note: PC3200 Max. to 2 banks only) Serial ATA Firewire IEEE1394 LAN BroadCom 1000Mbs Network card Ports: 1 x AGP, 6 x PCI, 6 x USB 2.0 Realtek 6-channel CODEC ATX form factor
NON-ECC? Hmm, why? And why bother with firewire and all the USB interfaces? Man....
I'm a bit disappointed we didn't go with ECC, too, but it probably won't matter. USB and Firewire are almost impossible to avoid; they're built into most motherboards now. You can disable them in the BIOS if you need the interrupts for something else.
Leeron didn't receive the shipment in time to get it to me this weekend, but if all goes well his partner will drop it off with me tomorrow.
Leeron's partner should be bringing the motherboards, SCSI controller, processor, and floppy drive by this afternoon. I ordered the memory from Crucial. Since our motherboard can apparently only handle 2 of the really fast memory chips at once, I ordered 3 of the next fastest. Here's the description: Module Size: 512MB Package: 184-pin DIMM Feature: DDR PC2700 Configuration: 64Meg x 64 DIMM Type: Unbuffered Error Checking: Non-parity Speed: 6ns Voltage: 2.5V SDRAM Timings:CL=2.5
Enjoy the chocolates ;)
OK, I am now in possession of
/------------------------------------------------------------\
| 1 AMD Athlon XP 2800, with a big honkin' heatsink and fan  |
| 2 Asus A7V8X motherboards with consecutive serial numbers  |
| 1 Adaptec 29160 SCSI controller card                       |
| 1 Sony floppy drive                                        |
\------------------------------------------------------------/
Courtesy of Leeron and his partner Matt. No chocolates were included, though I did get some bubble wrap. I also ordered, at STeve's request, a 3-CD copy of OpenBSD 3.3. It will ship Thursday, which is the release date. ($40).
STeve would also like to get a special OpenBSD keyboard - does anyone have one they could donate? I'm not sure exactly what's special about it - I asked STeve to explain it here.
Here's what STeve says he wants: They're USB keyboards, as in universal serial bus keyboards. These are new, which is why I think we'll wind up getting one ourselves, but it's always possible, I suppose, that we'll get one from someone. The older IBM AT standard keyboards are just about a buck each, used, but USB keyboards are a different beast. And unfortunately, the most common USB keyboards are made by Apple, and they're really pretty bad (the one bad thing about the new Macs of the last couple years).
Has STeve said anything about when all of this is going to invade my house? (He says I get to help play with it and I need to figure fitting it into my schedule.)
No, I haven't, but that's because I don't know when all the parts will have straggled in. I think we're pretty close now. Re #142, 'tis USB keyboard, not OpenBSD. I wonder what an OpenBSD keyboard would look like? Would it be suitable for blowfish? ;-)
Ah, I see I misread the email. Why do we need a USB keyboard?
STeve, do you have the time to work on it at this point? Originally you said you'd be busy starting May 1. If you can get to it, great. But if you think it will have to be fit in around a very busy life, then I'd like to look at having another volunteer, like dang, take this part of the project on. It needs to go forward.
That's partly why I get to play. :-) It will get done!
I'm at the end of the academic year, and that's going to make a HUGE difference for me. Yes. This will be the fun part... We need a USB keyboard because the Asus motherboard doesn't have much in the way of "legacy" I/O devices. For instance, it has none of the old ISA slots, only PCI slots. This makes the board faster since it doesn't need any glue logic to handle the old-style 8MHz 286 AT-style cards. We need a USB keyboard since it doesn't have an old-style keyboard port.
There's a big push in the PC world to get rid of "legacy" 8-bit devices. That includes serial ports, floppy drives, and of course keyboards & mice. Pretty much everybody (Apple, Sun, IBM) supports USB - this is clearly the wave of the future, and AT keyboards will soon be about as obsolete as AppleTalk keyboards are today, or 8" floppy drives 15 years ago. Basically, having to get a USB keyboard is a consequence of trying to get new high-end hardware.
Get a Happy Hacking Lite 2 keyboard with a USB interface; it's about the best Unix keyboard I've ever used. And doesn't have any extra and useless keys. And it's tiny and only takes up a real small amount of space. http://shop.store.yahoo.com/pfuca-store/haphackeylit1.html
Hum. I guess I'm kind of bummed about the RAM; what was the rationale with not getting *any* error checking on it? (Not even parity checking!)
$69 seems pretty steep for a keyboard. I'll take a look at CompUSA and see what they have. STeve will have to answer the questions about memory. He told me to buy from Crucial, and they don't seem to offer any memory that isn't either ECC or non-parity. (At least not any that will fit our system.)
Finding Athlon motherboards that use ECC RAM is really hard these days. In talking with hardware vendors, ECC is going away. The reliability of standard memory has improved to the point that the economics have driven ECC away. The good news is that RAM is truly more reliable than it was even five years ago. That, and the fact that ECC memory is slower, means it has been pushed out of the way by people who want the fastest systems possible. In looking at high-end motherboards I was struck by the fact that gamers now drive the market. Most people, and most businesses, don't need that Nth increase in speed, but gamers do. CompUSA has changed their marketing scheme to accommodate this, for example. I couldn't get a figure out of some managers there when talking about their gamer market, but they did say it was "a lot".
The motherboard was not designed for it, and in fact Crucial doesn't make ECC memory of that kind.
The Happy Hacking keyboard looks nice, but I wonder if spending $69 for it is worth it. Remember that this is going to be used 1) at the very start of installing OpenBSD, and then 2) for emergencies and backups when in the Pumpkin. NewEgg has several at $25 or less, as does CompUSA. I want to poke at one to see what they feel like.
Would folks see any use in having another meeting to include board, staff, and anyone else who cares to attend, prior to the assembly? The next BOD meeting is next Tuesday, May 6th, at Zing's. The next Plan Grex meeting is the afternoon of Saturday, May 18th. My only concern with simply adding this to the agenda for Tuesday's meeting is whether the involved people (staff) would be able to make it.
Saturday, May 18th doesn't exist, the 18th is a Sunday :-) STeve will not be available for either as we will be in Dayton attending the Ham Radio Convention.
I got an IBM USB keyboard at Staples a while back. I don't remember how much I paid for it, though. It's nice, but fairly large because it has a lot of 'extra' buttons that you wouldn't need for Grex's console.
Doesn't Leeron sell USB keyboards?
Mary, you want to slow things down? I don't, more than they already have been.
It is indeed *Sunday*, the 18th. Nope, I don't want to slow this down. I'll be frank, STeve. Staff should be making decisions regarding Grex, not any one person. I don't see it as a given that this project rests in your hands just because you see it that way. If the consensus among staff is that the project is best given to you, great. I'm looking for that consensus. It wouldn't take a face to face for that to happen. I'm sorry you can't make any of the planned meeting dates.
I wish I'd remembered the date for Dayton a little better. We will be leaving for home in the early afternoon, so it's still possible to make the next gen meeting, depending on when in the evening it is. When is it?
STeve, will you be available for the next board meeting, this coming Tuesday evening?
Well, as I said, I think we can make that. It's important, and it's always better to leave earlier than later. I'd like to get the components we have and start building. Can I do that, or are we going to have a meeting about it? I'd like to get them either tonight, if I'm back in town early enough, or tomorrow.
Yes, I can make that. I hope we don't delay things a week.
I received mail from NewEgg, letting me know that they got our check. So hopefully they'll be able to ship our disks this week. I also checked on our memory order, and it did indeed ship today.
Looks like the motherboards we got from Leeron aren't quite the right ones. (There are a lot of different versions of this board, with different options. And everyone seems to have different model numbers for the different versions, making it all very confusing.) Leeron is looking into exchanging them right away, so this shouldn't slow us down too much.
Leeron's going to ship us new motherboards - they should arrive Friday. NewEgg's waiting for our check to clear, which should take 3-5 days. Our memory left Salt Lake City sometime yesterday, so it should be in the midwest by now.
Indeed, when I got home I found that the memory arrived this afternoon. So now we're just waiting on disks and replacement motherboards.
Of the three places I know of that I should or want to be on the evening of Sunday 18 May, the one that wins is "on a plane from Denver."
Leeron certainly is as good as his word - the replacement motherboards arrived just now, and they appear to be just what we need. Thanks Leeron!
Today I bought two extra Antec fans at CompUSA, for $15.99 apiece.
Today, janc, dang, danr and I got together and began assembling Next Grex. All went well, and NextGrex is currently running an infinite memory test. (Since it doesn't have any hard disks yet, there's not too much else it can do.) Jan's going to make up a web page describing everything we did. But basically, we:
- Installed the CPU in the motherboard. Since cooling is an issue, I decided to spend $14.99 for a tube of "Arctic Silver" thermal grease, which goes between the CPU and the heatsink. All the web pages I looked at said it works much better than the stuff which came already attached to the heatsink.
- Installed the two extra fans I bought yesterday.
- Installed the CDRW in the case. This is very easy in our fancy case - just screw some little plastic runners on the drive, and it slides into the bay right from the front.
- Likewise with the floppy drive.
- Installed the port template which came with the motherboard on the back of the case.
- Screwed the motherboard into the case.
- Put the memory in.
- Connected lots of power wires to the motherboard, drives, and all the fans. Connected the USB ports on the front of the case, and FireWire ports on the back. Also the wires for the power switches, speaker, and LEDs on the front of the case.
- Installed the SCSI card.
dang had brought a video card with him, and we put that in, plugged in a monitor and keyboard, and booted it up. dang set the processor speed in the BIOS, and set the system to boot from CD. He put in a copy of Linux he had on CD and we successfully booted from it. It recognized all the hardware we had installed. So, all in all, very successful. Hopefully by next weekend the disks and OS will have arrived, and we can take the next step.
Wonderful. Sounds great!
This response has been erased.
Is the CPU a retail or OEM unit? If it was retail, the usage of heat sink compound has voided the warranty. For an OEM CPU, it isn't quite clear to me what is what. There is an article on this at http://www.xtremetek.com/info/index.php?id=14&page=1 that talks about this.
I'm glad to hear that it booted up and is running the memory test. Booting from a CD at least partly proves that the ide controller works, and that the CD works, too.
Wasn't the ide disk purchased? Seems one might be able to boot from that. But the booting from the cd should indicate that the onboard ide controller is OK.
The hard disks haven't arrived yet, because NewEgg needs our check to clear before they will mail us anything.
Re #177: That's an interesting article. Based on it and the other things I read on the web, I think using the fancy thermal compound was the right thing to do.
First draft of a web page on the system construction is at http://www.unixpapa.com/newgrex/ Some mediocre photos are included. This needs more work before it becomes a staff notes page.
When I left home this morning, NextGrex had run 15 cycles of the memory test (it takes about 67 minutes per cycle), with no memory errors.
This response has been erased.
A huge thank you to all involved. This is too cool.
NextGrex has been running over 24 hours now. It's completed 22 passes of the memory test, with no errors.
NextGrex has now completed 38 cycles of the memory test without an error. The room it's in is noticeably cooler than the rest of the house - I think those 8 fans are really pushing some air around.
lol at the fans
I wanted to see USB and FireWire available because the future is not clear, and having those abilities means we can use them if we want to. OpenBSD already has some FW support in it; I've been playing with it, and while not stable, it definitely works. Not quite yet ready for prime time, but that's OK, since we aren't using it. USB 2.0 work is evolving as well. Unless the specifics of the motherboard have changed again (it gets confusing looking at Asus stuff), we also have Serial ATA, should we want to go in that direction at some point. This means we have just about all the hardware options that exist: FireWire, USB, IDE, SCSI and SATA, for peripherals.
Yes, we definitely have SATA.
I pestered NewEgg about our disks, and they have moved the process along a bit; with luck they will ship today or tomorrow, and with a little more luck we'll have our disks by the weekend. I also pestered openbsd.org, which hasn't even charged our credit card yet. They will be a little longer - they always experience backlogs around the time of a release, and the version we ordered was released May 1st. They haven't gotten to our order date yet, and their ordering FAQ says they can be as much as 10 days behind at release time. Then, since it's coming from Alberta, it will be a week to 10 days before we receive our CDs. :(
But that doesn't matter -- we have what we'll install already. Too bad about newegg; I've never made an order by anything other than a cc; seems they are really optimized for that and nothing else.
OK, our disks have shipped. We should receive them on Friday.
Re #177: That's kind of bone-headed of AMD. I've seen evidence that the phase change material simply isn't adequate in some situations, and not just on overclocked CPUs, either. A friend of mine put on a heat sink with just the phase change material, no grease, and the CPU had overheating problems until he went back and did it with thermal conductive grease.
At the board meeting last night we decided to order two more 18 Gig disks. Since NewEgg is kind of a pain for us to deal with (though they're just fine if you want to ship to the address on your credit card), we decided to order from Leeron, even though his price is slightly higher. I called Leeron and did that, and he thinks the disks will be here by Friday. Our disks from NewEgg are in LA, according to FedEx.
How much did grex save overall by ordering from Leeron?
We saved $136 over NewEgg's price, on the stuff we ordered before. We'll lose a little bit of it on these disks, because NewEgg's price is lower. But we'll have them a lot faster, and returning them if there's a problem will be a lot easier.
The two disks from NewEgg arrived today - I picked them up at the FedEx office by the airport.
BTW, if anyone wants to see the list of what's in the new machine, go to
/----------------------------------------------------\
| http://www.cyberspace.org/~invent/item.cgi?num=256 |
\----------------------------------------------------/
That shows the data for the case, and at the bottom is a list of everything inside. You can click on those items for details about them.
Looks like our SCSI controller card has 68 pins while our disks need an 80-pin SCA connector. I wrote to Leeron to see if he can sell us some adapters.
If Leeron doesn't have one, try www.atozcables.com. That's where I got mine last time I needed one. They have them for either $20 or $28 each, depending on whether or not you need one with termination.
Thanks David. Leeron says he can order some adapters for us, for about $10 each, but they will take 7-10 days. I understand the difference between the cables now - the 80-pin cable (which our drive wants) includes not only the data interface, but also power and SCSI ID setting. (The SCSI ID is set by the adapter via software, instead of being a jumper setting right on the drive.) These Seagate drives come in two versions, one with an 80-pin connection and one with a 68-pin connection (plus power connection and SCSI ID jumper block). At the moment, I'm inclined to send back what we have and get the 68-pin version, so that our drives are compatible with our interface card. Getting adapters for all the drives seems like a hack, and will make the inside of the case more complicated than it needs to be. (Here's a picture of an adapter; it's got a little circuit board: http://www.mycableshop.com/popups/SCA806850.htm) Plus, we'd need two types of cables. Unless, that is, there's an important advantage to having 80-pin drives.
I'd vote for sending them back and getting the correct drives. I've used the adapters, and they're usually fairly shoddy (although they *do* work). I have a free 68-pin SCSI drive that I can temporarily donate for testing/burn-in purposes, so that this doesn't waste any time for us.
OK, I called NewEgg and got an RMA number to send back the 80-pin drive. It was going to be a pain to re-order the right drive from them, so I called Leeron and told him to send back the two he just got for us, and in their place get us 3 with the correct connectors. This will cost us about $17 more per drive than going through NewEgg, but Leeron is a lot faster and more accommodating. :)
The general rule with non-obvious changes and warranties is that you void the warranty if you tell them you made the change.
I turned off NextGrex last night because some big thunderstorms were approaching Ann Arbor, and I don't have a UPS. It had been up for over 5 days, running the memory test, with no errors. We'll put the IDE disk in tomorrow, and test the SCSI controller with a disk of dang's.
Sounds like the right decision, disk-wise. SCA connectors seem to be mostly made for plugging hotswappable drives into backplanes. Any other use of them is kind of a hack.
Ditto. Much better to fix it now than to forever curse the adapters.
I put the IDE disk in yesterday, and installed Windows 98 on it in order to test out our hardware. (Don't panic, it's only temporary.) I had to hack system.ini because Windows gets confused by how much memory we have, but now everything seems fine. I installed a driver for the ethernet chip on our motherboard, connected the computer to the LAN in my house, and created an internet connection through the router in the basement, and voila, here I am talking to OldGrex from NextGrex. Everything looks good.
WE SHOULD HAVE OLDGREX USABLE EVEN AFTER NEWGREX, YOU'RE SAYING?
We, I guess. No parts from old grex will be used in newgrex. However, I can't, off hand, think of any use for old grex, and don't think we have any plans to keep it running.
And, before anyone asks, once the user partitions are successfully copied to NextGrex, the disks will be destroyed to ensure the privacy of Grex's users. As far as I'm concerned, anyone willing to cart away the current machine after the new machine takes over (with an appropriate transition period) is welcome to it. (Minus the user disks, of course.)
I can't imagine why we'd destroy the disks, and I can't imagine Marcus and STeve agreeing that we don't need the old Grex anymore.
Sufficiently sophisticated disk-recovery tools can do some amazing things. The only way to ensure these tools don't work is physical destruction of the disks. I can see an argument that nothing on grex should be that sensitive, but we aren't talking about *my* data on grex. As long as we retain physical possession, there is no need to destroy the disks.
I can't imagine anyone being that interested in grex's user disks, despite what some folks think. I'd say scrub them and give them away.
I can't imagine there being any real value in the old Grex hardware.
What is it that is supposed to be kept private, the passwords?
Files in home directory, email, staff conference.
If you can get good enough random numbers, it might suffice to do a dd if=/dev/random of=/dev/sdx.
`shred' (a GNU coreutils program) claims that it can prevent recovery of erased data by successively writing several different bit patterns over the files. More details are in the paper "Secure Deletion of Data from Magnetic and Solid-State Memory", by Peter Gutmann (http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html). The only drawback is that, since it overwrites the disk several times, it is extremely slow. But, after the transition, I don't think time will be a problem for oldgrex. Also note that nothing is 100% effective, of course. Physical destruction is the only guaranteed way of safeguarding the disk contents. shred's info page goes to the extreme of saying that the /only/ 100% way is melting the disk in *-acid.
Why the heck would we KEEP the current Grex once we complete the migration to the next Grex? Do we still need the first Grex? Or the second? So why not give it away to someone else who might actually put it to use? Including Marcus or STeve, if they want to take it away. And, if we're not going to use the current disks on the new system, then why should we keep them? And if we're not going to keep them, then we damn well ought to destroy them, because it is the only way to absolutely ensure that their contents are unrecoverable. I don't think my comment was radical, and I DO think it was logically sound and consistent with both our past practices and our current philosophies.
Can't you simply overwrite the entire disk with 0's?
There are a lot of levels of sophistication of data recovery tools available, and I don't know how available products of any particular level are, but it is quite possible that no reasonable amount of overwriting with 1s, 0s and/or random ASCII values would entirely obliterate and render irretrievable someone's personal data on these disks.
/dev/zero is your friend. dd if=/dev/zero of=/dev/whatever bs=8192 (block size on grex is probably lower) There may be some concern about the disks being magnetic and the zeros not doing enough.
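For anyone who wants to try the single-pass zero wipe before touching a real drive, here's a rough sketch run against a scratch file instead of a device. The file name is just an illustration; on a real drive you'd point of= at /dev/sdX or similar, which is irreversible.

```shell
# Stand-in "disk": a scratch file filled with random data.
TARGET=/tmp/fake_disk.img
dd if=/dev/urandom of="$TARGET" bs=8192 count=16 2>/dev/null

# The wipe itself: one pass of zeros over the whole thing.
# conv=notrunc keeps dd from shortening the file (irrelevant
# on a real device, which has a fixed size anyway).
dd if=/dev/zero of="$TARGET" bs=8192 count=16 conv=notrunc 2>/dev/null

# Sanity check: strip the NUL bytes and count what's left.
# A successful wipe leaves nothing but zeros, so this prints 0.
tr -d '\0' < "$TARGET" | wc -c
```

shred does essentially the same thing with multiple patterned passes; for a whole drive you'd drop count= and let dd run until it hits the end of the device.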
(The question is not, "Who would be interested in the data on the disks?" The question is, "Who would be interested in _their_ data on the disks being released or revealed?" We've too many users to get ALL of them to answer that question negatively.)
The question I'd ask: is it easier to (potentially) crack root and see the data on the disks or to actually recover the data once reasonable precautions are taken to erase it. The point being that no one should ever expect that their data on a public access system is 100% secure. Of course, if STeve or mdw are interested in the old machine, that would solve the problem given that the scrubbed disks would be in safe hands (for some time to come).
I agree with #226; no one on grex has any sort of guarantee about the safety of their data. Indeed, grex is planning on using a password system on next grex that inherently compromises the data of all users if someone has managed to crack root. Going and getting the disks from someone in Michigan after they've been scrubbed is a lot more work than just getting the data off the disks now or after the transition to the next grex. I sympathize with Joe's sentiment about wanting to keep user data secure, but it's not going to be any less secure on a scrubbed disk than it is on grex now or in the future.
With a clean room for dissecting disk drives, some millions of dollars worth of exotic high-tech instruments, and skilled staff to match, it should be presumed that supposedly-totally-erased data can be recovered from drives. Anyone *that* interested in the data could get it far faster, sooner, and cheaper in a host of other ways, starting with simple physical break-in. Thus, it's reasonable to assume that any data on grex worthy of such efforts has already been stolen, and giving the hypothetical hostiles an extra copy is actually *good* tactics - they waste resources to read it.
Well. I guess I'M the one being anal about security this time. It's a rotating responsibility. Someone else take over, 'cause it looks like I'm done.
WE NEED TO STOP THE SUBVERSIVEs... SQUIRRLEy-Group?
The first Grex is (or at least was, the last time I saw it) in Marcus's basement. As of a couple years ago, when I was last in the Pumpkin, Grex 3 was still there. I think Grex 2 may have been as well, but Grex 2 may have been harvested for parts (2 and 3 were similar enough for some hardware to be interchangeable).
Regarding #229; There's nothing wrong with being anal; but if you're going to be anal about one thing, it's best to be anal about everything else as well. For instance, not just merging the existing contents of /etc/shadow into a Kerberos KDC for use as keys.... Security is all about tradeoffs. If people really wanted their data to be secure, they'd encrypt it, put it on some sort of tamper-resistant media, enclose that in a cube of lead with two-foot walls, enclose that in a block of concrete, booby-trap it so that if anyone tries to open it, they die, and dump it into the Mariana Trench; all in secret, so that nobody knew they'd done it. Even then, it wouldn't be totally secure. One has to do a risk analysis, and determine whether the cost of protecting the data from prying eyes is worth the value of the data. If it is, great; do whatever you need to do to make sure no one gets access to it. If not, then take some reasonable precautions, but don't lose sleep over it. Data from grex definitely falls in the latter category.
Oh, I'd say para 2 in resp:232 describes "total security" in real-world terms. There's no way to recover anything 7 miles into the ocean. Leeron in resp:226 and the next several comments describe my opinion about the need for disk security. Grex needs to reasonably match the security presently given to that data. That's all anyone has any right to expect. A good formatting of those drives ought to be easily sufficient to keep the data as secure as it is now. My goodness, how difficult would it be for someone to break into the Pumpkin right now and steal tapes, hard drives, or even all of Grex? Where else are backups kept? Any of those places could be breached by someone with such sophisticated specialized training as we probably all got from our parents when taught how to use a screwdriver. It'd be a lot easier to steal the data (and cheaper, and much more reliable) than to recover data from a formatted hard disk.
Yes, but one might throw out one's back trying to steal the current grex.
re #233: Out of curiosity, do you actually KNOW the location of the Pumpkin? When was the last time someone cracked root on Grex? What does it cost us to destroy the old disks? What if a user who wants their privacy doesn't know enough to know the real risks to the privacy of their data inherent in placing it on Grex? I'd say trashing the disks is less work and more security than wiping them a few times, and eliminates the risk of charges of carelessness with user data. (Whether that risk is real or imagined.) But I really don't care that much about it. I don't keep my SSN and credit card numbers on Grex... <shrug>
Regarding #235; No, I don't. But I'm willing to bet that someone who's going to go to the trouble of restoring user data off of the disks (how are they going to locate them, anyway?) does. When was the last time someone broke root? Well, how do you know that anyone other than the person or persons who did so knows? Someone who cares enough about Grex's data is likely to be able to find someone who could break in without anyone knowing. Besides, grex runs some insecure software. The version of sendmail it runs is (last time I checked, anyway) potentially vulnerable to some well-known holes. If a user stored data on grex without realizing that they had no expectation of the privacy of that data, well, tough. And besides, making a good-faith effort at protecting that data by scrubbing the disks is enough to avoid any charges of negligence (which are purely hypothetical anyway). Now, don't get me wrong. If you want to destroy the disks, go for it. But it's not necessary, and people should be educated about why that is.
This response has been erased.
The SCSI disks arrived yesterday. They have the right connectors. Thanks Leeron! I'll be putting them in this week, and if I can, testing them with Windows.
Aww.... At least test it with some variant of Unix. :-0
UNIX will get its chance, don't worry.
I've got plenty of Linux distros, Mark.
(I don't think the old disks would be vulnerable to targeted data recovery, but they could cause unintended disclosure: someone put something they really shouldn't have on the disk and then forgot about it. If the disks were sold to a user of grex, though, targeted data recovery becomes a higher probability. (Say, 30% instead of 15%, to pull some numbers from the air.))
Being a pack rat, I'd be tempted to keep the data intact in case anybody wants it for historical research in a hundred years, but that's just me.
Regarding #242; Joe, even if they scrub the entire disk? Just curious.
I'd say the amount of time necessary to recover data from a scrubbed Grex disk is going to be totally out of proportion to the value of any data likely to be on those disks. We're not talking about a situation where you can just run 'undelete' and get it all back, this is an expensive and time-consuming process.
re resp:234: I don't know the location of the Pumpkin, but don't imagine it would be difficult to find it out if I wanted to. I might even send you an e-mail: Hey, Eric! Where is the Pumpkin? Just curious. Would you refuse to answer such a request? If I sent it to staff@grex.org, shouldn't I expect to get an answer? I don't think Grex is all *that* security conscious.
I can give you the street address if you wish.
Shh! Don't *do* that! The evil ones might go and steal grex. At least it'll be easy to identify them at the hospital: they'll have hernias.
The address of the Pumpkin is not something Grex makes a point of publishing. For one thing, we don't want anyone to go to the Pumpkin (or send mail there) if they need to contact someone about Grex. For another thing, well, I don't know what the other thing is. But there's no real reason for anyone but staff to go there. But, as several people have pointed out, I'm sure it wouldn't be hard to find out if you wanted to. I just typed the address into google and it found someone who's listing Grex under that address. Hmmm, we should probably do something about that...
This response has been erased.
No, no. NORTH Huron! (Grex moved to Ypsi...)
*smiles* We guard it by BIG nasty dogs - they won't get too far :)
Dan, yes, even if the disks were scrubbed first. I know folks with lots of spare time on their hands. I know folks who have written their own disk-recovery software. (To the best of my knowledge, the intersection of those two sets, BTW, is the null set.) I can see someone with the time and interest using the grex disks as an experiment base for their own efforts. (They'd probably settle for *any* disk, not just grex's.)
Who's in charge of offing people who find out the address of the Pumpkin? Mark, don't freak out, but I have your address. See that car parked outside your house? The dark van, with the tinted windows? That's my sister. She's Mossad so you might not be able to spot the van. Nonetheless, since you have the new disks, it's only a matter of time before you deliver them either to the Pumpkin or to someone else who will ultimately take them there. Don't look over your back, she's following you. Take my word for it. And then, when we discover the location of the pumpkin, we'll contact G Gordon Liddy to break in and steal the tapes. Er, disks.... Actually, I got a good laugh from Walter's comment: > it's reasonable to assume that any data on grex worthy of such > efforts has already been stolen So we're just discussing how wide to leave open the barn doors. (: (Though obviously, some horses are still inside.)
So *that's* why that woman was following me yesterday.
What's the matter guys, can't figure out how I know the address even though I am ~2000 miles away.
This response has been erased.
It depends on whether Leeron's sister is smoking or not. Leeron, got a picture? Nyuk nyuk nyuk. Beware those Israeli women, though; though they smoke, they're heart breakers. Regarding #253; Joe, if you know someone who can recover data from a properly scrubbed disk, I'd almost be willing to say, give them the disks and see if they can get anything off of them.
My dad wouldn't let us drink colored food items in the 1960s, long before the FDA would ban them. So despite all those bad influences outside the house, none of the kids smoke and we all detest it. As a child, when I was offered a puff (by an "uncle" who has since died of emphysema), I exhaled to make the tip glow (every 8-year-old is a pyro). Like Clinton (or not), it never occurred to me that I was supposed to inhale and ingest the smoke.... To this day I tell my clients that, if they love their computers, they shouldn't smoke around them. Which brings us back to the subject (whew!). Instead of scrubbing the computers clean, can't we smoke them dirty?
I installed the SCSI disks today - the controller and fdisk recognized them right away, with no problems. I'm running surface scans now.
good, can't beat w98 scandisk for that (though why I don't know)
Re #253: I've written disk recovery software, too, for retrieving *deleted* stuff. But that's different than recovering data from a disk that's been wiped, because the electronics in the drive are unable to read the data at that point. Recovery is then a matter of removing the platters in a clean room environment and using sophisticated equipment to analyze the magnetic patterns left on them. I *seriously* doubt anyone is going to go to that expense with a home disk from Grex so they can read a bunch of archived spam mail and obsolete copies of eggdrop. ;)
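For concreteness, the kind of "wipe" being discussed here is just a sequential overwrite of the device. A minimal sketch of the idea, demonstrated on a throwaway file rather than a real disk (the raw-device name in the comment is an assumption; check the actual disklabel before doing this for real):

```shell
# Sketch of a one-pass scrub with dd(1).  For a whole disk you'd write to the
# raw device instead, e.g. something like /dev/rsd0c (assumed name).
f=$(mktemp)
echo "secret-data" > "$f"

# Overwrite the file's first block with zeros, in place.
dd if=/dev/zero of="$f" bs=512 count=1 conv=notrunc 2>/dev/null

# The original contents are no longer readable through normal channels.
grep -q "secret-data" "$f" || echo "secret gone"
rm -f "$f"
```

After an overwrite like this, recovery is exactly the clean-room exercise described above; nothing short of analyzing residual magnetization on the platters will get the old bits back.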
I ran surface scans of all four of our disks (3 SCSI and one IDE), and there were no bad sectors on any of them.
Good deal. I haven't found bad sectors on a new disk in a long time. Modern drives have "spare" sectors they remap to hide bad ones, so by the time you start seeing bad sectors on a disk it's already pretty sick, and probably has been getting worse for quite a while.
Excellent, Mark.
Hooray! So am I correct in understanding that all the hardware has now been acquired and installed, and we're ready to go with configuring NextGrex?
Yup, that's correct. All the hardware is in one box, and everything works.
Excellent.... Eeexxxccellet. Proceed with the next phase of the...operation...number two.
BTW, I know we don't need them, but it looks like our OpenBSD CDs shipped last Tuesday. At least our credit card was charged then - I haven't received a confirmation email.
This response has been erased.
re the responses to resp:246: I'm not going to embarrass anyone by posting the messages I got in response to my e-mail to staff, and I think some were humorously intended anyway. It took me 18 minutes to get an e-mailed response with the address of Grex. I was asked not to post the address. With all due respect, I think it's not a meaningful request. However, I do have a lot of respect for the Grex staff and since they asked, I won't post it. Certainly, anyone wishing to obtain anything from Grex could do so much more easily and certainly than by trying to recover data from a formatted hard disk. Eric's concerns were well-intended but not realistic. If anyone really wants anything from the old Grex hard disks, enough to use data recovery techniques, Grex might entertain competitive bids. If they gave Grex half the money they'd otherwise have to spend, there would never again be a financial crisis. The staff could be paid, the Board could be paid, and if I were given an appropriate commission for coming up with the idea, I would never have another financial concern.
Yeah, true. Also, Joe earlier posted some odds for getting data off of a scrubbed disk. He made it clear he was doing so without real data, but I want to get those numbers closer to reality; he said something like 15% odds of getting something good for a casual attacker, 30% for a determined attacker. However, in both cases, the attacker isn't expending a lot of financial resources; only time and cleverness. No access to fancy clean-rooms or the like. In that case, I'd give a casual attacker something like 0.00001%, and a determined attacker something like 0.0001%. Note that determination gives you an order of magnitude advantage. Those numbers are probably conservative; real numbers are probably a lot closer to zero.
Well, we had another Next Grex Meeting, somewhat sparsely attended - Mark Conger, Joe Gelinas, John Remmers, Valerie Mates, Jan Wolter. Steves Weiss and Andre' called to say they couldn't make it. Mark brought along the Next Grex, which we set up temporarily and tried booting off an OpenBSD boot floppy that John had brought along. Mostly looked good, but although it found the SCSI controller, it did not find the SCSI drives. So we went about the proper business of sitting around and talking about the computer. After everyone left, I moved the machine down to my office, where it could be plugged into the LAN, and tried various things. The second thing I tried was one of the other OpenBSD boot floppies. There are three in the distribution: one for standard systems, one with extra drivers for SCSI and RAID and gigabit ethernet, and one for laptops. John's was the second one, the one with SCSI stuff. I tried the standard one, floppy33.fs, and that found the SCSI drives without a problem. I have volunteered to take a first cut at partitioning the drives and installing OpenBSD. When I've done that, I'll get it on my LAN, open an SSH portal in through my firewall, and advertise it to staff. I'm currently scratching my head over partitioning options, having had somewhat less input from others than I would like, but I'll do something plausible. If it stinks, we'll redo it.
Thanks Jan - enjoy those seven fans. :)
Suggestions regarding partitioning. This is what I would do:
(a) Use RAIDframe across all the SCSI disks. Partition them
thusly:
32MB sd[0-2]a (The second-stage bootstrap and kernel)
1024MB sd[0-2]b (Swap, striped across 3 disks)
rest sd[0-2]d (Everything else)
Set up the RAIDframe partitions on sd[0-2]d.
Use RAID-5 with an interleave size of 64KB
(the size of an FFS1 ``extent'').
(b) Configure the following filesystems. You'll have to use
disklabel, but it's not particularly hard:
512MB / raid0a
2048MB /usr raid0d
4096MB /usr/local raid0e
4096MB /var raid0f
4096MB /grex raid0g
512MB /tmp raid0h
rest /u raid0i
80GB /scratch wd0d
(Yes, OpenBSD supports disklabels with more than 8
partitions on them....)
(c) Put mail in $home/Mailbox; that does away with the need
for a separate /var/mail partition.
(d) Merge /suidbin into /. / in 4.4BSD doesn't contain nearly
as much ``non-system'' stuff as did / in 4.3BSD and prior
versions. It's easiest to think of it as a ``system''
partition with a minimum of non-system related stuff in it;
having suid tools in /, if one restricts it to system purposes,
is just fine. In this configuration / can remain the only
partition that has suid binaries on it, and it can remain
writable. This has some advantages: (i) All the suid tools
are available in single user mode. (ii) It's writable, which
means that a bug in a suid program can be quickly corrected
by staff. (iii) It cleanly keeps all ``system'' related
files in one place.
(e) Create symbolic links from /usr/src and /usr/obj to
/var/src and /var/obj, respectively. Also, create a
/var/local hierarchy. Create symbolic links from
writable places in /usr/local to /var/local. By doing
this, you're able to (i) make both /usr and /usr/local
read-only most of the time, while (ii) retaining the
ability to keep the system sources up to date.
(f) Move the BBS and associated files into /grex; this is
the place for grex-specific software. Party, the BBS,
etc can go in there.
I'd further do the following:
(a) Split /suidbin into /suid/bin and /suid/sbin. This breaks
up functionality a bit; user-software that is used by
general users can go into /suid/bin. Sysadmin stuff goes
into /suid/sbin.
(b) Create /local; put local stuff that's useful in single user
mode in here. Ie, Kerberos, Kerberized sudo, SSH, maybe a
shell or something, etc.
(c) Remove some of the goofy symlinks from /; why is /b a symlink
to bbs?
(d) Change the startup scripts to newfs /tmp every time the
system boots.
The biggest changes here are putting more stuff in /, and doing away with
/var/mail. The latter is for security and convenience; the former is
purely for convenience.
Oh, a note on the difference between sd[0-2]a and raid0a. The OpenBSD
RAID software can get its root filesystem from raid, but cannot read the
kernel upon boot out of a RAID partition. sd0a would be a *really*
small partition, mirrored on sd1a and sd2a as well, that contains the
second-stage bootstrap and the kernel to boot from. A copy of the
exact same kernel would be in /; once the system started booting, it
would be transparent. The system can be booted off of any disk, and
the loss of any single drive wouldn't impact grex much. It could be
swapped out and the parity rebuilt while the system was operating.
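To make the RAIDframe half of this concrete, here's a sketch of what /etc/raid0.conf might look like under suggestion (a) above: three sd[0-2]d components, RAID 5, and a 64KB stripe unit, which at 512 bytes per sector works out to 128 sectors per stripe unit. This is illustrative only; see raidctl(8) for the real syntax.

```
# /etc/raid0.conf -- sketch only
START array
# numRow numCol numSpare
1 3 0

START disks
/dev/sd0d
/dev/sd1d
/dev/sd2d

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
# 128 sectors * 512 bytes = 64KB stripe unit
128 1 1 5

START queue
fifo 100
```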
Dan - Thanks. Looks like I'm going to have to read up more on RAID.
One note of possible concern: It's not clear how well our SCSI controller
is supported by OpenBSD 3.3. We have an Adaptec 29160 controller which
apparently uses the 7899G chipset. In the OpenBSD 3.3 file INSTALL.i386
it says the following in the list of supported hardware:
Adaptec AIC-789[29] chips and products like the
AHA-29160 based upon it which do 160MB/sec SCSI. [C]
(However, the 7899G card is currently not supported with
more than one device attached)
Well, we have more than one device on our card. Web searches show lots of
messages from people who had problems with OpenBSD and the 29160. For some
of them, using only one device on the controller worked fine. These messages
generally had followups saying that these problems were fixed in later
releases. This is a fairly popular card, and you'd expect getting it to
work would have been a priority for someone. However, as you see, they
didn't delete the note saying it didn't work with multiple devices from the
install document.
I'm guessing/hoping that this is just a reflection of their poor document
maintenance. Though the i386 install document seems to me to be the single
most important document to maintain, it seems to be fraught with errors,
mainly places where it appears not to have been updated when the software
was. In the section quoted above, for example, the [C] indicates that the
driver is not included on install floppy C. However, it appeared not to work
on install floppy B either, and there is no [B] there.
Hmm...raid support is on floppy C, our SCSI driver support is on B. Could
be an annoyance if we try to build a RAID system.
Sure thing, Jan. I'm not sure what to say about the 7899G based card, other than, ``try it and see if it works.'' I do see that a few people on various mailing lists say things like, ``I beat the living snot out of a 7899G with 20 drives on it and it worked just fine.'' (http://archives.neohapsis.com/archives/openbsd/2002-12/0019.html). It seems they probably cut and pasted the supported hardware list from the architecture specific hardware web page into the install document; I'd suggest that you're correct in guessing it's an artifact of a less than perfect document update process. btw- Just because it's absent from the floppy doesn't mean it's absent from the CD boot media. For instance, the drivers for both the SCSI card and RAID should be on ftp://ftp.openbsd.org/pub/OpenBSD/3.3/cdrom33.fs
OK, just starting to read about RAID. Looks like the Promise RAID controller on the motherboard is for IDE only (and may not be supported by OpenBSD anyway), so we are talking about software RAID, which in the case of OpenBSD is RAIDframe. Apparently OpenBSD 3.1 and later do support having the root partition mirrored on RAID. RAIDframe supports RAID levels 0, 1, 4 and 5 and miscellaneous other things. It's not in the generic kernel; we'd need to rebuild with it. There are a huge number of options here, just beginning with the question of which RAID level to use. My feeling is that RAID is a sensible answer for Grex. It can win us performance gains and added data security in the case of a disk crash. It wastes a lot of disk space, but we have the space to waste. It does not protect us from someone accidentally deleting files, so it is no substitute for backups. However, going the RAID route means (1) spending some time weighing which RAID configuration (if any) is right for Grex, and (2) spending some time getting it all set up. Doing this right is going to require a lot of time and a lot of staff members in the loop. I don't want to stall bringing the system on line for code porting and other development while we do this. Maybe I should do an OpenBSD install onto the IDE disk. We can work there, build a RAID kernel, configure RAID on the SCSI disks, then boot off that. This may be the best choice right now. It gets the system to a state where I can do what I know a lot about (software), and defers the decisions about disk setup a little longer to give other staff time to chime in if they want. If I'm going to implement Dan's plan, then I'll need to do it in two stages anyway, since it doesn't look like you can get a RAID system straight off the CD. The minus with doing the OpenBSD install on the IDE disk is that it doesn't give the three SCSI drives on the 29160 controller a good workout, and that's questionable enough to be worth beating on.
However, I can create some temporary partitions there, and start a program reading/writing stuff to them, just to give them a workout.
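A workout program like that can be a very small shell script: copy a test file between the drives repeatedly and verify each copy. A sketch, using temp directories as stand-ins for mount points on the three SCSI drives (every path here is made up for illustration):

```shell
# Disk exerciser sketch: bounce a file between three locations and compare.
# On the real machine, d1/d2/d3 would be mount points on sd0, sd1 and sd2.
d1=$(mktemp -d); d2=$(mktemp -d); d3=$(mktemp -d)
dd if=/dev/urandom of="$d1/testdata" bs=1024 count=64 2>/dev/null

i=0
while [ $i -lt 10 ]; do
    cp "$d1/testdata" "$d2/copy1"
    cp "$d2/copy1" "$d3/copy2"
    cmp -s "$d1/testdata" "$d3/copy2" || echo "MISMATCH on pass $i"
    i=$((i + 1))
done
echo "done: $i passes"
rm -rf "$d1" "$d2" "$d3"
```

Run a handful of these in parallel (and crank the pass count way up) and you have roughly the "six processes busily copying files around" kind of load described later in this item.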
The last time I did an OpenBSD install into RAIDframe, I think I did
something like the following:
(1) Installed onto a single drive.
(2) Recompiled the kernel and tested it.
(3) Booted single-user.
(4) Dumped /usr, /usr/local, /var and all the
partitions I was RAID'ing to temp space
somewhere.
(5) Reclaimed the space of all the partitions
I wanted to RAID'ify into one big partition
using disklabel.
(6) Configured and started up RAID.
(7) Edited the RAID set disklabel and set up my
partitions.
(8) Rebuilt the new RAID set's parity (which went
surprisingly quickly).
(9) newfs'ed the new partitions and mounted them.
(10)restored the earlier dumps to the new, RAIDed
partitions.
This is slightly more complex, but I think you could do something similar
to get RAID working on the SCSI disks. Certainly, installing onto the
IDE disk gives you the maneuverability to bootstrap the SCSI drives.
Of course, that doesn't resolve the issue of deciding on an optimal
configuration.
Some more suggestions: Use RAID level 5. You only have three disks;
if you had four I'd suggest using 1+0, but that's out. Anyway, RAID 5 will
give decent performance (particularly if coupled with soft updates on all
partitions), will protect against dropping a disk, and won't waste *too*
much disk space. With only three disks, you don't have much else in the
way of choices for RAID levels. Striping won't buy you any reliability,
RAID 4 is just dumb, and you don't have enough disk for mirroring.
The last real question is how big to set the interleaves. I'd say 64KB,
and the reason for that is that 4.4BSD's FFS implementation defines a
weak concept of an ``extent''; basically, it'll try to read or write up
to 64KB in a single burst from/to the disk, if it can. A 64KB interleave
size matches up with that idea of an extent as used by the filesystem,
and should give pretty good performance.
Oh, PS- Didn't remmers donate an OpenBSD machine that could be used for software porting and things of that nature, leaving time to get the nextgrex configuration right?
Yup, he did. Well, ran into another snag in the OpenBSD install. OpenBSD can't find a network interface. During the boot up, when it is polling the PCI bus, it lists: Broadcom BCM5702X rev 0x02 at pci0 dev 8 function 0 not configured. That means it sees it, but doesn't have a driver for it. On the list of supported hardware it says: # Broadcom BCM570x (a.k.a. Tigon3) based PCI adapters (bge): (A) (B) (C) The (A) (B) (C) business means that the driver isn't on any of the install floppies. I think this means I need the CD to do the install. I can't very easily do an ftp install without a network driver.
Wait; if you burn a CD with the CD-ROM boot image on it, does that have the driver? You should be able to boot with it and perform an installation from there.
(PS, to clarify: The CD boot image is different from the OpenBSD CD distribution, and can be downloaded from the OpenBSD web site. Given that there's a CD burner in Nextgrex, it shouldn't be hard to do. The URL for the CD-ROM image is: ftp://ftp.openbsd.org/pub/OpenBSD/3.3/cdrom33.fs )
Right. Unfortunately, I blew away the Windows98 install Mark did on Next Grex, so I'd need to first install something on NextGrex that can fetch that file over the network and can burn a CD. I have a number of different old OS's on CDs that I could try, but none are painless. I don't have another computer with a CD burner. The real live OpenBSD 3.3 CD was shipped from Alberta on Tuesday, apparently. It should arrive in the next few days. There are also probably lots of people who could make me a CD with that file on it. Either of these two paths seems much easier than reinstalling Windows98 on NextGrex.
We don't know for sure if our CDs shipped Tuesday, only that our credit card was charged then. I sent mail to the shipping guy at openbsd.org to ask if they really did ship. But anyway, dang offered to make Jan a CD, and he'll bring it over tonight.
Re #280: My machine is still online and available to any staffer who wants access. It's currently running OpenBSD 3.2. If it's going to be used to test out software, I should upgrade it to 3.3. If I have time to do that in the next couple of days I will, but to be honest free time is in somewhat short supply this week. I'll see how it goes. Dang installed a CVS server on it, the idea being to use that to document our work. The CVS server hasn't been used yet, and nothing much else has been done with the machine yet either, so it might not be too unreasonable to use the OpenBSD CDs, when they arrive, to install 3.3 from scratch on my machine, then ask dang politely to re-install the CVS server...
Turns out Valerie can burn CDs. I should have known that. So I've got a working boot CD now. I tried logging into John's machine and failed. I should give him a call and see what I've got wrong.
Hmmm. Got it installed on the disk, but boot from the disk seems to be hanging when the kernel tries to initialize the audio drivers. I'll investigate more later tonight and report back.
I had a little trouble with the audio in Windows 98, actually. It mostly worked, but occasionally produced static when it should have been playing a sound. I figured it was because I needed a different version of the driver, or it needed to be reinstalled.
Why are we worrying about making the audio drivers on the next Grex machine work?
OK, some details. As the kernel starts up, it prints out lots of messages describing the various devices. When booting from the CD or floppy, it finds the audio device but doesn't have a driver for it (of course, since this is an install disk and it doesn't need audio), so it says: "VIA VT8233 AC97 Audio" rev 0x50 at pci0 dev 17 function 5 not configured. When we boot from the hard disk, it finds the device and has a driver, but the driver seems to fail to initialize. It types the following, and then hangs forever with the cursor at the end of the line: auvia0 at pci0 dev 17 function 5 "VIA VT8233 AC97 Audio" rev 0x50_ It should go on to finish the line by typing something like: auvia0 at pci0 dev 17 function 5 "VIA VT8233 AC97 Audio" rev 0x50: irq 9. We never get the ": irq 9" part. (It is IRQ 9, according to the BIOS.) One fix would be to build a kernel without the audio driver. It's not like Grex needs audio. Any better ideas?
Eric slipped in. I don't care very much about making them work. Right now they are keeping us from booting, which I do care about. I'd slightly prefer to know what is causing it to fail. We are going to have to do an OpenBSD install on this machine every year or so. We need to figure out how to do it smoothly. It's worth a little effort to find the *best* way to deal with problems, not just some workaround kludge.
Hmm; can you disable the onboard audio in the BIOS? It sounds like it's hanging in the probe routine; perhaps it's having difficulties disambiguating the audio device from something else it might share an interrupt line with? Maybe there's a bug allocating an IRQ? Weird.
Found this: http://www.netsys.com/openbsd-misc/2003/01/msg00734.html Appears to be someone having the same problem.
The discussion of this problem above didn't find any sensible solutions, so I'm willing to just disable the device. Looks like there are two ways to do this without recompiling the kernel. http://www.openbsd.org/cgi-bin/man.cgi?query=boot_config&sektion=8&arch=i386 To make this work, you need to tell it to boot "/bsd -c" instead of /bsd. However, it doesn't prompt for a kernel to boot, and I don't know how to make it do so. http://www.openbsd.org/cgi-bin/man.cgi?query=config&sektion=8&arch=i386 To make this work, I need a reasonably running system. Booting off the CD and mounting the / partition under /mnt doesn't do it. The config program is not on the install CD, and the copy on the hard disk wants ld.so, which it can't seem to find while booted off the CD. I might be able to figure out how to make this work if I were less sleepy. Either way, I just need to do "disable auvia" and that should kill the audio card. I'm going to bed.
Maybe you'll dream up another solution, like disabling the audio in the BIOS so OpenBSD will never see it in the first place...? (I suppose it could be a jumper on the motherboard if not a BIOS option.)
Yeah Leeron! There is a thing in the BIOS to disable the Audio Controller,
and with that turned off, we can boot.
I'd still like to know how to tell OpenBSD to boot off something else.
It seems to sometimes show a "boot>" prompt briefly, and if you start
typing, it won't go ahead and boot on its own. However, when I started
booting off the CD, typed "boot wd0a:/bsd -c" at the boot> prompt,
it went ahead and booted off the CD anyway. Well, I'll get plenty more
chances to experiment with this.
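One guess, for whatever it's worth: the OpenBSD second-stage loader names disks BIOS-style, so the hard disk is hd0 at the boot> prompt even though the kernel calls it wd0, and that might be why "boot wd0a:/bsd -c" was ignored. Something like this may be the right incantation (untested here; -c drops you into the User Kernel Config prompt):

```
boot> boot hd0a:/bsd -c
[kernel messages...]
UKC> disable auvia
UKC> quit
```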
In the true OpenBSD spirit, after all this work, it greets me by telling me
I'm an idiot:
Don't login as root, use su
Root's the only account on the system, and that's a comma splice, you idiots.
Sorry, I have personality conflicts with OpenBSD.
Most of the world has personality conflicts with OpenBSD. Hey, disabling the audio device in the BIOS; I said that in #293! Comma splices are bad, use semicolons.
So Jan, just so I can be sure I understand what's going on; a minimal OpenBSD installation is on the IDE drive, and it's seeing all the devices now?
Yup. Staff has been informed, John Remmers has successfully logged in. My next step is to write some little scripts to copy data around fiercely on the three SCSI disks, just to increase my confidence that the controller really is working right with multiple drives. Yeah, you did say that didn't you? I was way too sleepy last night.
Yep, I logged in and created myself a "remmers" account.
Thought I'd do a survey of suid/sgid programs, many of which might have to
be moved if we do an /suidbin directory. There's a number of them, but many
don't actually need to be SUID on Grex (the ones marked 'X' in the list
below should probably lose their suid bits or not be moved to suidbin).
SUID files:
X -r-sr-xr-x 1 root bin /sbin/ping
X -r-sr-xr-x 1 root bin /sbin/ping6
X -r-sr-x--- 1 root operator /sbin/shutdown
-r-sr-xr-x 3 root bin /usr/bin/chfn
-r-sr-xr-x 3 root bin /usr/bin/chpass
-r-sr-xr-x 3 root bin /usr/bin/chsh
X -r-sr-sr-x 1 root daemon /usr/bin/lpr
X -r-sr-sr-x 1 root daemon /usr/bin/lprm
-r-sr-xr-x 1 root bin /usr/bin/passwd
X -r-sr-xr-x 1 root bin /usr/bin/rsh
-r-sr-xr-x 1 root bin /usr/bin/su
-r-sr-xr-x 1 root bin /usr/bin/sudo
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_chpass
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_krb4
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_krb4-or-pwd
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_krb5
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_krb5-or-pwd
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_lchpass
? -r-sr-xr-x 1 root auth /usr/libexec/auth/login_passwd
-r-sr-xr-x 1 root bin /usr/libexec/lockspool
-r-sr-xr-x 1 root bin /usr/libexec/ssh-keysign
? -r-sr-sr-x 1 root authpf /usr/sbin/authpf
X -r-sr-xr-- 1 root network /usr/sbin/ppp
X -r-sr-xr-- 1 root network /usr/sbin/pppd
X -r-sr-xr-- 1 root network /usr/sbin/sliplogin
X -r-sr-xr-x 1 root bin /usr/sbin/timedc
X -r-sr-xr-x 1 root bin /usr/sbin/traceroute
X -r-sr-xr-x 1 root bin /usr/sbin/traceroute6
SGID files:
X -r-xr-sr-x 4 root crontab /usr/bin/at
X -r-xr-sr-x 4 root crontab /usr/bin/atq
X -r-xr-sr-x 4 root crontab /usr/bin/atrm
X -r-xr-sr-x 4 root crontab /usr/bin/batch
X -r-xr-sr-x 1 root crontab /usr/bin/crontab
? -r-xr-sr-x 1 root kmem /usr/bin/fstat
-r-xr-sr-x 1 root auth /usr/bin/lock
X -r-xr-sr-x 1 root daemon /usr/bin/lpq
? -r-xr-sr-x 1 root _lkm /usr/bin/modstat
? -r-xr-sr-x 1 root kmem /usr/bin/netstat
-r-xr-sr-x 1 root auth /usr/bin/skeyaudit
-r-xr-sr-x 1 root auth /usr/bin/skeyinfo
-r-xr-sr-x 1 root auth /usr/bin/skeyinit
-r-xr-sr-x 1 root _sshagnt /usr/bin/ssh-agent
-r-xr-sr-x 1 root kmem /usr/bin/systat
-r-xr-sr-x 1 root kmem /usr/bin/vmstat
-r-xr-sr-x 1 root tty /usr/bin/wall
-r-xr-sr-x 1 root tty /usr/bin/write
-r-xr-sr-x 1 root games /usr/games/atc
-r-xr-sr-x 1 root games /usr/games/battlestar
-r-xr-sr-x 1 root games /usr/games/canfield
-r-xr-sr-x 1 root games /usr/games/cfscores
-r-xr-sr-x 1 root games /usr/games/cribbage
-r-xr-sr-x 1 root games /usr/games/hack
-r-xr-sr-x 1 root games /usr/games/robots
-r-xr-sr-x 1 root games /usr/games/sail
-r-xr-sr-x 1 root games /usr/games/snake
-r-xr-sr-x 1 root games /usr/games/tetris
? -r-xr-sr-x 4 root _token /usr/libexec/auth/login_activ
? -r-xr-sr-x 4 root _token /usr/libexec/auth/login_crypto
? -r-xr-sr-x 1 root _radius /usr/libexec/auth/login_radius
? -r-xr-sr-x 1 root auth /usr/libexec/auth/login_skey
? -r-xr-sr-x 4 root _token /usr/libexec/auth/login_snk
? -r-xr-sr-x 4 root _token /usr/libexec/auth/login_token
-r-xr-sr-x 1 root smmsp /usr/libexec/sendmail/sendmail
X -r-xr-sr-x 1 root daemon /usr/sbin/lpc
X -r-xr-s--- 1 root daemon /usr/sbin/lpd
-r-xr-sr-x 1 root kmem /usr/sbin/pstat
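A survey like the one above can be regenerated with find(1). The demo below scans a throwaway directory containing one fake suid file; on the real system the starting path would be / (probably with -xdev added to stay on one filesystem at a time), and -exec ls -l {} \; instead of -print would give the full-permission listing shown above.

```shell
# List set-uid and set-gid files under a directory.
demo=$(mktemp -d)
touch "$demo/fakesuid"
chmod 4755 "$demo/fakesuid"   # make it set-uid for the demo

find "$demo" -type f \( -perm -4000 -o -perm -2000 \) -print

rm -rf "$demo"
```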
You know, this is the next Grex hardware item. I should probably move future comments to a "software" item.
I've got six processes busily copying files around on the SCSI drives. No problems at all yet.
You're great, janc.
Regarding #304; Cool.
Regarding #303; My suggestion is to leave most of the `normal' binaries
that aren't on / (only ping, ping6, and shutdown are) alone and put
copies in /suid/{s,}bin, then put those directories in the user PATH
before the system default directories. As long as /usr and /usr/local
are mounted nosuid, it won't hurt anything if (mode & 06000) != 0 on
some files therein. It also makes it easier than trying to find what
was symlinked where when the system is upgraded.
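As a hypothetical illustration of the (mode & 06000) != 0 test above, here's a small Python sketch (the function name is my own invention) that flags set-id files under a directory, producing the kind of listing shown at the top of this item:

```python
import os
import stat

def find_setid_files(root):
    """Walk `root` and yield (path, mode) for files whose setuid or
    setgid bit is on -- the (mode & 06000) != 0 test."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.stat(path).st_mode
            except OSError:
                continue  # broken symlink, permission problem, etc.
            # 0o6000 masks both setuid (0o4000) and setgid (0o2000)
            if mode & 0o6000:
                yield path, oct(stat.S_IMODE(mode))
```

Note that on a filesystem mounted nosuid these bits are simply ignored at exec time, which is exactly why leaving them set on /usr is harmless under the scheme proposed above.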
Some suggestions: there's no reason to keep pstat, vmstat, systat,
modstat, netstat, etc. executable by normal users. Definitely that's true
of fstat; that's a privacy violation waiting to happen. ssh-keysign
can be restricted to members. I don't see the need to strip the suid
bit off of shutdown, as that's only executable by root or users in
group operator, and there aren't likely to be many of the latter.
I would only put login_krb5 in /suid/sbin, as that's the only one
that's likely to be useful, if grex really moves to krb5. I would
further disable the skey stuff; I doubt anyone uses or would use it.
Certainly, authpf doesn't need to be executable by normal users, either.
Wall shouldn't be available to normal users, I don't think. If you go
with an alternate mailer like Postfix, then there's no reason to worry
about keeping sendmail sgid. Finally, the games are only sgid to write
high score files. Probably harmless, and I don't see why not to put
them in /suid/bin (or /suid/games, so they can be shut off easily if
there's a problem). Also, going with Kerberos means that stock `passwd'
can be disabled. /usr/bin/lock isn't likely to be useful if Kerberos
is used, either.
I wouldn't worry about ping and ping6 being suid since the only reason
they are is to send ICMP packets, and those would be blocked by the PF
rules preventing non-member users from sending random stuff out onto
the Internet. It's conceivable someone could find another hole in it,
but I think it's highly unlikely.
Generally Grex doesn't let users use 'ping' at all, from what I've been able to tell.
Yes; my point is that the PF filters will take care of that without modifying anything in the base-system filesystem.
Ah, I see. But if no one who doesn't have root privileges is allowed to use it anyway, why keep the setuid bit?
So that you don't have to remember to turn it off the next time you upgrade the system. :-) In general, it's more of a hassle to turn off the setuid bit if it doesn't do anything than to ignore it.
Dan, Re#298, 293, and others: Dreams can be like that. You never really know who said what, if it was real... but hey, it worked. Nonetheless, since you've provided so much other helpful information and I have not, I'm going to claim credit for this "fix". After all, you phrased it as a question. I said do it! (:
Another thing.... Grex might have turned off ping to avoid the problem of a malicious user using the `flood ping' -f option against another host. This mode sends packets to a remote host as fast as it can, effectively clogging the network link between the two. On grex's slow connection, this could clearly be a problem. However, OpenBSD's version of ping checks that the real user ID is 0 (i.e., you're root) before allowing you to use the -f option for flood pinging. Given that any program that wants to create an ICMP socket must be running as root, and that the standard ping doesn't let joe user flood ping anymore, perhaps it'd be acceptable to stop restricting access to ping. Still, someone might be able to DoS grex by sending a ping request to some big broadcast address, so maybe it's a good idea to keep restricting it.
Hrmph, Leeron! :-)
I have no intention of "remembering" to turn off suid bits. I'm for documenting it, in this case in the form of a script that does it. I'd turn off all the suid-root bits that don't need to be on (or leave them on a nosuid partition where the suid-bit doesn't matter). It's hard to imagine a security hole turning up in 'ping', but anything is possible. I'm much less inclined to be aggressive about the sgid scripts.
I'm not sure if you can effectively use different RAID strategies on different partitions without having different disk sets for them, but I'm still thinking different RAID strategies make sense for different partitions. I think /usr is another example where data redundancy seems of less value. If you lose /usr and restore it from a month-old backup, you are probably fine. Just striping seems perfectly adequate for partitions like that. Where RAID 5 pays off mostly is in places like /var, /bbs, /home.
Regarding #314; Well, if you keep / as the only partition that honors the suid bit, then you only have to change permissions on two binaries: ping and ping6 (I still say ignore shutdown, since only users in group operator can run it, anyway). Regarding #315; The thing is that if you lose /usr, the system is unusable; similarly with /usr/local, /, etc. RAID isn't just about data security, it's also about availability.
Right, but that isn't Grex's highest priority. We aren't amazon.com that can't be off line for a few days without making headlines. Heck, we currently shut down for backups. If a disk melts down, taking a few days to come back up is no disaster, if we can do it without loss of data. If I'd have been at the last board meeting, I'd have argued against the third SCSI disk. Grex doesn't need that much disk in the near future. But, we've got the disk, so we might as well use it. I think the best use may be to do a RAID setup and win a bit better performance and a bit more data security. Right now I think the strongest argument against using RAID on Grex is the KISS argument. RAID certainly has benefits, but it adds complexity, and extra complexity is always a minus. Using RAID means one more potentially buggy piece of software in a critical function. It means one more complex subsystem staff members need to understand, administer, and reinstall on every upgrade. I think a sound argument could be made that the benefits aren't worth the complexity. Skip RAID. Divide the partitions among the disks and hope the loads balance out approximately. rsync critical partitions to the IDE disk frequently. Remember to do backups. We don't lose much by taking that easier path, and it is significantly simpler to install and administer. You could do ccd on some partitions, if you want the same performance benefits (slightly more even) at a lower complexity level.
All the reports about problems with multiple drives on our SCSI controller seemed to describe failures that occurred really frequently. I've had all three drives busy reading and writing for all they are worth for a day now and have seen no problems at all. So I think we can probably consider that problem solved. I'll let them grind for a bit longer though.
Regarding #317; Well, to me, splitting up partitions is more complex. Maybe I'm smoking my hair, but it seems a lot simpler conceptually to think of a RAID as one giant partition that you can chunk up as you like, and the performance issues and load balancing are yours free. You get some modicum of resistance to failures as a side benefit. As for reliability.... RAIDframe has been in OpenBSD for several years now. It seems just as solid as FFS itself or even soft updates. Could it go wrong? Yeah, but there could also be bugs lurking in FFS. Configuration is pretty simple: one or two configuration files, and you're basically good to go. The only really annoying thing is that you can't directly boot from it. But, at the end of the day, I'm not on grex staff and am not charged with keeping it running. It seems simpler to me, and RAID-5 everywhere seems to fit grex like a glove (especially if it's planned on a few partitions already), but that's just me.
To some degree I'm arguing all sides of the question to make up for the lack of people arguing. But the large number of Grex staff who have no opinion on RAID is a bit worrisome. Administrative complexity hits at three points - first, right now, deciding which RAID setup to use and implementing it. Second, on each system upgrade, when we need to reinstall the kernel customizations and config files. Mostly we can document this and make a step-by-step procedure that most anyone can follow. Third, when a disk has a problem, or when we want to change the disk configuration. RAID can help with problems like this, but only if you know what you are doing. Doing the wrong thing can hose your data. In a volunteer-run system, the level of knowledge that may be on hand on the day when a disk dies is unpredictable. It may make sense to keep things simple so lots of people feel like they can help. This argument applies equally to Kerberos. Both confer modest benefits that I'm not sure we need, at the cost of complexity that makes the size of the hump you have to get over to become an effective system administrator substantially larger. I fear they will reduce the size of the pool of potential system administrators. Or maybe they will make the system cooler and more interesting, thus attracting more potential system administrators. I think I may experiment with setting up RAID on Grex2003, just to get a better feeling for the complexity.
I'm in favor of RAID because I think it has the potential to *reduce* the amount of staff time needed for recovery if a disk fails. Assuming you don't have a multiple failure, recovery is reduced to a few steps, presumably simple ones, although I'm not familiar with RAIDframe specifically. Generally there's some way to tell the RAID subsystem you're going to offline a disk (this may have been done automatically if the disk failed), then you'd shut down the system, swap the disks, then boot and tell the RAID subsystem to rebuild the failed disk. During all of this except shutdown the RAID array is generally still usable, just in a "degraded" mode. (It will be slower.) There are some RAID systems where that isn't true, but I'd hope RAIDframe would have implemented online recovery. I think, if we have time, the concerns about complexity could be addressed by developing a step-by-step disk failure recovery procedure that any staff member could follow. It shouldn't really be any more complex than restoring from a backup, just different. If you're going to use RAID, I think it's best to look at the RAID array as one big disk, and not try to spread things out with different RAID strategies for different partitions. That seems unnecessarily complex to me.
Regarding #321; I concur. A barebones recovery plan should be developed in any event, regardless of whether RAID is used.
Thanks David. I definitely value input on this question. I built two new kernels. The first is simply GENERIC minus a mess of stuff we don't need - mostly device drivers for devices we haven't got. The second is the same, but turns RAID on (and does various stuff to make sure SCSI drives don't get renumbered when one fails). I also pushed the "maxusers" parameter from 32 to 64. Maxusers isn't really the maximum number of users. It's a voodoo number that is used to estimate sizes for all sorts of system parameters, which can be fine-tuned separately by editing lower level definitions. I saw various posts by people who had set it higher than 64 and got a warning message about that. One seemed to have some crashes after that and thought it might be related. However, one of these guys got no response that is in the archive, and the other was only told that he was an idiot. (These OpenBSD mailing list archives are such a valuable resource.) So for the moment I thought I'd set it to 64. It'll be easy enough to fine tune it later if we have problems with that setting. The OpenBSD FAQ discourages building new kernels without a danged good reason, threatening lack of technical support for problems with non-generic kernels. However, since their technical support is laughable anyway and Marcus is guaranteed to have changes to make to the kernel anyway, I decided we might as well get started, even if we don't end up using RAID. The stripped down GREX kernel is about half the size of the GENERIC kernel, which is a plus, if not a great big one:
-rw-r--r-- 1 root wheel 4579691 May 21 07:01 /bsd.generic
-rwxr-xr-x 1 root wheel 2719734 May 21 07:03 /bsd.new
-rwxr-xr-x 1 root wheel 3133519 May 21 06:59 /bsd.raid
It is currently running on the bsd.raid kernel, and that is the default. I haven't, however, set up any RAID array yet. I've also now got a draft document on kernel building.
OK, I've created a RAID array on new Grex - just for experimental purposes at this point. First, I sliced up the three SCSI disks into two partitions each, each disk identically:
sd0a: 20479825 blocks = ~10 Gig
sd0d: 15361127 blocks = ~7 Gig
The sd0a, sd1a, and sd2a partitions are clustered into a RAID5 array, with just one partition, /dev/raid1a, on it (it can be sliced into smaller partitions). This is mounted as /raid. The sd0d, sd1d, and sd2d partitions are mounted as /sd0, /sd1 and /sd2 respectively. My idea was that if we want to do any benchmarks, this lets us access the same disks, with or without raid. All four partitions are rw-all so anyone with an account can create stuff there and look at the stats. df looks like this:
Filesystem   1K-blocks     Used     Avail Capacity  Mounted on
/dev/sd0d      7438613        1   7066682     0%    /sd0
/dev/sd1d      7438613        1   7066682     0%    /sd1
/dev/sd2d      7438613        1   7066682     0%    /sd2
/dev/raid1a   19852909        1  18860263     0%    /raid
Note that the available space (18.8 Gigs) is about 61% of the disk we put into this (30 Gigs), most of the rest being used for parity, some of the rest being eaten by filesystem overhead of various sorts.
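That ~61% figure can be sanity-checked: with three disks, RAID 5 keeps (n-1)/n = 2/3 of raw capacity as data, and FFS reserve and metadata eat a bit more on top. A rough Python check, using the sd0a partition size from the disklabel above:

```python
def raid5_usable_fraction(n_disks):
    """RAID 5 dedicates one chunk per stripe to parity, so with n disks
    the data fraction is (n - 1) / n."""
    return (n_disks - 1) / n_disks

BLOCK = 512                        # disklabel blocks are 512 bytes
blocks_per_partition = 20479825    # size of sd0a/sd1a/sd2a from above
raw_gb = 3 * blocks_per_partition * BLOCK / 1e9
usable_gb = raw_gb * raid5_usable_fraction(3)
# raw is ~31 GB; RAID 5 leaves ~21 GB of data space; the FFS reserve
# and filesystem metadata bring df's "Avail" down further, to ~61% of raw.
```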
Yeah, from what I've seen a lot of OpenBSDers are a bit elitist and don't suffer newbies gladly. It's an unfortunate attitude.
Hmmm...I'm trying to run the bonnie benchmark (http://www.textuality.com/bonnie) on the raid disk, but I'm not sure it will work. Bonnie wants me to use a file size several times larger than main memory. Main memory is 1.5 Gig, so I told it to use 4 times that: 6144 Meg. But the first thing it said is: File './Bonnie.28521', size: -2147483648 Uh-oh. Someone may be using signed longs for the file size. If that's the case, then the biggest file size I can use is probably around 2048 Meg, which isn't several times the size of our memory. Well, I'll let it run and see what happens.
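That suspicion checks out arithmetically: 6144 Meg doesn't fit in a signed 32-bit long, and the wrapped value is exactly the number Bonnie printed. A quick Python check:

```python
def to_int32(n):
    """Interpret the low 32 bits of n as a signed 32-bit integer,
    mimicking overflow of a signed C long on a 32-bit machine."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n

requested = 6144 * 1024 * 1024                   # 6144 Meg, in bytes
wrapped = to_int32(requested)                    # exactly -2147483648
largest_safe_mb = (2**31 - 1) // (1024 * 1024)   # 2047 Meg still fits
```

So the practical ceiling is 2047 Meg, which lines up with the "around 2048 Meg" guess.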
So, if we went with RAID, what would we do?
On the disks we'll have partitions
sd0a - pretty tiny. A place to store kernels. We'll boot from here.
sd1a - A copy of /dev/sd0a, so we can boot if sd0 dies
sd0b - swap partitions, one Gig each. You can put swap on raid, but
sd1b it doesn't appear to be a great idea. We'll trust OpenBSD to
sd2b balance swap load over the three spindles.
sd0d - the remainders of the disks, about 16 Gig each.
sd1d
sd2d
Now, sd0d, sd1d and sd2d will be clustered together into a RAID 5 array, called
raid0. To all intents and purposes, this appears as a single big disk. It
should come out at about 29 Gig, a more than adequate amount of space for all
of Grex's needs for a while. Raid0 gets partitioned into all the various
partitions we need, with root on raid0a, usr on raid0d and so on.
The 80 Gig IDE disk doesn't participate in this. We could put the boot
partition on this, but I'd want copies on two disks, so we'd still need at
least some non-raid partitions on the SCSI disks; let's leave
everything critical off the IDE.
Why not make sd2a a copy of sd0a as well? It wouldn't hurt anything, and might help, since each disk would be exactly like every other disk in terms of how the partitions are laid out. That makes partitioning easy; you can keep a copy of the disklabel for one of the disks around in a file somewhere, and just write it to a new disk with the disklabel command if necessary. Then, just plop the new disk in, tell RAIDframe to rebuild it, and let it go on its merry way.
Probably would. Actually, you don't even have to keep the layout in a file. You can just copy it from one disk to another: disklabel sd0 | disklabel -R sd1 /dev/fd/0 That's how I built the current setup.
Bonnie croaked while doing some seeks. Try it again with a smaller file to see if that works better.
I thought the IDE was mainly for a comprehensive backup of the boot partition plus storage for sources.
Yeah, you can do that, but if you also keep the disk label around in another file, you can label the disk on any machine with a SCSI controller. Does that matter? I don't know. It might be slightly more convenient. It's just a nit, though; it's trivial to get a copy of the disklabel once the machine's set up, and I doubt it would matter....
Oh yeah.... That's an idea. Put /usr/src and /usr/obj on the IDE drive, and then you don't have to do anything hacky with linking them to /var as in my latest proposal. /var can be decreased accordingly, and more space allocated to /u.
OK, with a file size of 2000M, I get results from Bonnie. The validity of
these results is, however, questionable, since a lot of the file may have been
in memory instead of on disk.
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
raid 5 2000 9520 6.8 7974 1.3 5706 2.0 50932 62.5 63815 13.0 147.9 1.6
scsi 2000 53754 43.4 54106 14.1 10090 2.6 60326 70.9 61067 11.5 201.2 0.8
We have two lines of results. The first was using the raid 5 array of three
SCSI disks. The second was on a single plain ordinary SCSI disk.
For each test we have the speed and the % of CPU used.
There are three output tests:
Per Char - file written sequentially with 2 billion calls to putc()
Block - file written with block writes
Rewrite - each block read, changed and rewritten
There are two input tests
Per Char - 2 billion calls to getc()
Block - block reads
And a seek test
Seeks - four child processes each execute 4000 seeks and reads. After
10% of these they change and rewrite the block.
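To make the Per Char / Block distinction concrete, here's a small illustrative Python sketch of the two output patterns (the function names are my own; Bonnie itself is C). The per-call overhead, not the disk, is why the Per Char tests burn so much more CPU:

```python
def write_per_char(path, data):
    """One library call per byte, like a loop over putc().  Stdio-style
    buffering keeps this from being one *system* call per byte, but the
    per-call overhead still dominates CPU time."""
    with open(path, "wb") as f:
        for i in range(len(data)):
            f.write(data[i:i + 1])

def write_block(path, data, blocksize=64 * 1024):
    """One call per block -- Bonnie's Block output pattern."""
    with open(path, "wb") as f:
        for i in range(0, len(data), blocksize):
            f.write(data[i:i + blocksize])
```

Both produce identical files; they differ only in how the work is chopped up.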
So, on writing, RAID was 5 to 6 times slower. Notice that the supposedly
optimum block writes were actually slower than the character writes for the
RAID. The SCSI was twice as fast as RAID on the rewrite test.
On read the RAID array was still slower than the plain disks on the Per
Char reads, but a bit faster on the block reads. It was substantially slower
on the seeks.
Admitting that the benchmark is seriously questionable due to the small size
of the file relative to the large size of memory, this is not at all an
impressive result.
I reran the tests and got similar results.
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
raid 5 2000 9520 6.8 7974 1.3 5706 2.0 50932 62.5 63815 13.0 147.9 1.6
raid 5 2000 8745 6.4 7654 1.3 5717 2.2 51345 63.5 64022 14.6 150.0 1.1
scsi 2000 53754 43.4 54106 14.1 10090 2.6 60326 70.9 61067 11.5 201.2 0.8
scsi 2000 54058 43.4 54618 14.1 10129 2.8 60552 71.0 60865 11.1 203.4 0.9
I suppose the main advantage in performance is in balancing load among multiple
spindles, but this would really only be noticable if multiple processes were
reading/writing the disk at once. With a single process, we aren't going to
gain much. Only in the seek test are there multiple processes, and then only
four.
Are softupdates turned on on the raid filesystem?
No. They are not even enabled in the kernel. From what little I understand of it, it improves performance only with respect to metadata updates - updating inodes when files are created or destroyed. That wouldn't affect these benchmarks. I don't get a clear feeling that it is super stable yet either.
Every write and every read is also a metadata update (mtime and atime). Soft updates are definitely stable at this point; they're enabled by default in FreeBSD. OpenBSD tends to be somewhat more conservative, though. Gads; security be damned. Grex would've been better off with FreeBSD.
Well, I argued that. I have the impression softupdates are more mature in FreeBSD than OpenBSD. It's not really clear though.
For the heck of it, I ran eight copies of the Bonnie benchmark simultaneously
on the RAID 5 partition. Below, A through H were started simultaneously.
The last line is just one benchmark process running
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
A 2047 8087 5.9 5867 1.2 331 0.1 1647 2.0 1775 0.4 22.5 0.1
B 2047 1889 1.4 7770 1.4 545 0.2 890 0.9 1646 0.3 14.2 0.1
C 2047 1020 0.8 7038 1.2 417 0.1 1929 2.7 1578 0.3 19.9 0.2
D 2047 8647 6.3 7474 1.3 253 0.1 1905 2.4 4597 1.1 89.4 0.8
E 2047 3997 2.9 6946 1.2 215 0.1 23458 27.9 29250 6.4 155.5 1.4
F 2047 8314 6.2 7149 1.3 369 0.1 1333 1.6 1707 0.3 21.2 0.1
G 2047 8926 6.3 7899 1.4 512 0.2 865 1.1 1132 0.3 15.0 0.1
H 2047 4280 3.2 7861 1.3 458 0.1 954 1.2 1649 0.4 19.1 0.1
raid 5 2000 9520 6.8 7974 1.3 5706 2.0 50932 62.5 63815 13.0 147.9 1.6
They didn't stay well synchronized - you can tell that process E continued
running long after the others had finished (process scheduling doesn't seem
to be very fair). The write speeds didn't suffer too badly from the
competition, but the read times took a terrific beating - they are mostly
around 1/25 of the speed of one process. Note that there were probably some
write processes still running while the read processes were going.
Here's a more sensible test, a comparison against the SCSI and IDE drives,
in non-RAID configuration, with just one process running:
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
scsi 2000 53754 43.4 54106 14.1 10090 2.6 60326 70.9 61067 11.5 201.2 0.8
ide 2000 27188 21.7 27038 6.9 9634 2.6 24889 29.9 25640 5.2 99.0 0.8
Seems the SCSI is about twice as fast on most benchmarks, and about the same
on the Rewrite test.
RAID 5 is always going to be slower than a single disk, especially using software RAID. There's more processing overhead, and you're doing a third more reads/writes because of the parity. Still, I'm surprised to see it 5 times slower. That doesn't seem very acceptable at all.
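For the curious, the parity in question is just a byte-wise XOR across each stripe; any one lost chunk can be rebuilt from the rest. A minimal Python sketch:

```python
def parity(chunks):
    """XOR equal-length data chunks byte-by-byte to form the parity chunk."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

def reconstruct(surviving, parity_chunk):
    """Rebuild the one missing chunk: XOR the parity chunk with every
    surviving data chunk."""
    return parity(list(surviving) + [parity_chunk])
```

This also shows the small-write penalty: updating one chunk means reading the old data and old parity, XORing, and writing both back (a read-modify-write cycle), which is a large part of why software RAID 5 writes lag a single disk.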
RAID would be nice, and if we're making such a huge jump in processing power then I don't think the performance penalty (assuming it's only 2-1 or something less) is an issue.
I'm beginning to suspect that some of these fast read times are
coming out of buffers. The drastic crash in read speed when I ran 8 bonnies
could be because instead of trying to buffer one 2G file in 1.5G of memory, we
were trying to buffer a total of 16G of files in 1.5G of memory. Some of
these really fast speeds (the ones around 50M/sec) are likely being done
largely out of cache. This makes the results pretty meaningless.
Anyway, I ran three simultaneous bonnies on a plain SCSI. I couldn't run
8 because I didn't have a 16 Gig partition.
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
A 2047 15240 16.1 20747 5.0 2018 0.6 3882 4.5 9506 1.7 164.7 0.6
B 2047 16768 13.7 20491 5.3 3016 0.9 4543 5.3 5598 1.1 31.6 0.2
C 2047 16812 13.6 17945 4.6 2977 0.9 4145 4.9 4513 0.8 46.5 0.2
scsi 2000 53754 43.4 54106 14.1 10090 2.6 60326 70.9 61067 11.5 201.2 0.8
scsi/3 2000 17918 43.4 18035 14.1 3363 2.6 20108 70.9 20355 11.5 67.1 0.8
The last line is just the one-process SCSI values divided by three. Notice
the write statistics for the three processes are all pretty close to one
third of the write statistics for a single process. The reads are way lower.
Is this an artifact of buffering? The seeks are a bit hard to tell, because
by that time the processes were pretty much out of synchronization.
The degradation in read performance is similar in magnitude to what we saw
on the raid (keeping in mind that we only have 3 processes instead of 8).
I think there must be a buffering thing going on here. The write statistics
are much better for the RAID - most of the 8 processes wrote much faster than
1/8 of the single process.
Note that in both cases, the single processes read faster than they write,
while the multiple processes write faster than they read. That's just weird.
Jan - can you fool the OS into thinking Grex has less memory than it really does? Or tell it not to cache disk reads?
Re #341: We are certainly taking a huge jump in processing power, but the
disk I/O performance improvement, while good, probably isn't as spectacular.
Disk speeds just haven't been growing as fast as processor speeds, and old
Grex's disks aren't nearly as old as its processor. So the performance jump
in disk I/O from old Grex to new Grex might not be that huge. (Maybe I
should run some benchmarks on old Grex to compare with - will everyone please
log off?). I expect the new Grex will have memory to spare, cpu to spare,
disk space to spare, but maybe not disk bandwidth to spare (and certainly not
net bandwidth to spare).
I think the main benefits of RAID are:
- Availability. If a disk dies, the system can keep running. Performance
degrades, but it still works. If you have a hot spare disk, it can
be brought on line, replacing the dead disk, without interruption in
service.
I do not consider this very important to Grex. We can afford short
downtimes in the case of disaster.
- Data Protection. If a disk dies, the data on the drives is not lost.
This is important to Grex. However, it can be achieved other ways.
We could do daily rsync's from /var, /bbs, /home, and /etc to the IDE
drive (or even another machine). You might copy certain critical files
(/etc/passwd) more frequently. This has a performance penalty, of course.
In the case of a crash, your backup will not be fully up to date, so there
will be some data loss, but it should be tolerable. In the case of
accidental (or deliberate) deletion of data, this gives you a much better
safety net than RAID, so much so that we'll want to do at least some
of this even if we have RAID.
- Performance. RAID can balance the load over the drives nicely.
Yes, but so can ccd (pretty much equivalent to RAID 0).
So this doesn't really make a strong argument for RAID. However, there is
a bit of a flaw in the above break-down. These three aspects are not fully
separable. Suppose we merge our three SCSI drives into one big virtual ccd
drive and partition it up. Load balancing over the drives should be great.
Then one SCSI drive fails. You just lost a third of your data, scattered
randomly all over the system. You still have the other two thirds, but
doing anything with it is going to be a nightmare. Effectively a single
drive failure cooks all your data, instead of 1/3 of your data. I don't think
the performance improvement given by ccd or RAID 0 is worth the increased
risk of losing the whole system.
So I think the real alternative to RAID is what I originally proposed -
simple partitions, scattered across the drives in an ad hoc manner in hopes
of balancing the load across the spindles, with rsyncs to the IDE drive
for data protection.
I'm really starting to feel that might be the best choice. The advantages
of RAID for Grex are faint enough so that they don't quite overwhelm the
KISS factor in my estimation.
Re 343: probably - but I'm not sure how. I thought of just creating a RAMDISK and letting that eat up much of the memory (I could also run the benchmark on a ramdisk, which might be interesting), but it looks like you need to do a lot of kernel work to bring up a ramdisk, and I'm insufficiently motivated.
One can lower the amount of memory the kernel will use for caching by mucking with the kernel. It looks like, when caching is taken out of the picture, performance between RAID and the straight SCSI disks is more or less on par?
Hmmm...the faq (http://www.openbsd.org/faq/faq11.html) talks about the BUFCACHEPERCENT kernel value. It says the default is 5%. I haven't touched it, so if I'm reading this right, there should be 75M or less of disk cache. Hmmm...Linux uses all free memory as disk cache. A much nicer setup. Well, if that's the case then I'm not sure what makes those benchmark numbers so goofy.
Re #344: I think that's starting to make sense, yes. Unless it turns out the performance hit you're seeing is an artifact of your testing method, we may be better off going with using the disks "straight". Getting only 20% of the potential performance of the disk subsystem in exchange for easier recovery on the rare occasions when we have disks fail doesn't seem like a good tradeoff. I'm still having trouble believing RAIDframe is *that* inefficient, though.
So am I; it seems unreasonably slow, and it looks vaguely like the numbers start to converge when you have many processes working at once, which is the normal mode of operation. I'd be interested in seeing what a test simulating a timesharing load would be like.
One simple way to "fool" the kernel into thinking that NextGrex has less memory is... to remove all but one memory module. Guaranteed to work. (: You might also want to test mirroring. Might be more efficient (less CPU utilization for striping and no extra parity data) while offering both availability and redundancy. The "cost" here is 50% drive overhead. The boot disk, with the system partitions (and /tmp or was that IDE?) could be one disk while the other pair could be mirrored.
I'm reluctant to take the machine apart for such purposes. Anyway, I'm a
software guy.
I certainly agree that we need better benchmarks, but I'm not sure how to
obtain them. Anyone with better ideas is welcome to suggest them. Those of
you with accounts on the system can probably run them yourselves, as the
relevant disk partitions are permitted 777. We really want to get some sense
of how RAID would affect a realistic multi-user load.
I tried running a benchmark with a really small file, one where you should be
getting lots of use from cache. Here's the 50 MB and 2000 MB results. Explain
this, if you will:
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
raid 5 50 24882 20.9 22641 3.5 3731 1.2 7555 9.6 64346 14.7 511.7 3.1
raid 5 2000 9520 6.8 7974 1.3 5706 2.0 50932 62.5 63815 13.0 147.9 1.6
The small run has much faster output, and significantly faster seek times.
The block read is about as fast as the large file (suggesting that it is
mostly reading from buffer). But what's going on with the per char read?
Note that the sequence of the tests is:
Per Char Output
Rewrite Output
Block Output
Per Char Input
Block Input
Seek
So it may be that the Per Char read was from disk, but left the entire file
in cache, so the block read was then very fast. But why wouldn't it already
be in cache after the block output? And why would the same speed be
achieved on the Block Read with the 2M file, which can't have all been
in cache.
I don't think I know enough about how buffering and disk I/O works in openBSD
to really interpret this stuff.
The information on the Bonnie web page (http://www.textuality.com/bonnie/) makes it sound like the tests are designed to correct for caching. There's some info there on how to interpret the results.
Maybe to help more people figure out what is being discussed here,
I should give a brief overview of RAID.
RAID stands for "Redundant Array of Inexpensive Disks" (the I-word
varies). Someone wrote a paper once upon a time surveying various options for
putting a lot of small disks together, and named the variations RAID 1,
RAID 2, RAID 3, RAID 4, and RAID 5. The RAID 0 name was coined later and
isn't really RAID. The interesting ones are RAID 0, RAID 1 and RAID 5.
I'll also discuss RAID 4 because understanding it makes RAID 5 easier
to understand.
Suppose you needed a 100 Gig disk, and all you had was ten 10 Gig disks.
Well, you could put them all together in a box, and write a little
controller that would write the first 10 Gig to disk one, the next 10
Gig to disk two and so on. To the computer, your box would look like
a single disk.
The performance of this disk array wouldn't be so hot though. Most
programs access files sequentially, so as the 100 Gig file was read,
we'd first have disk one very busy while the other nine sit idle, then
disk two would be busy, and so forth. It'd be nice to balance the load
among the disks.
Which brings us to RAID 0 - also known as striping. We slice the disks
into 32K chunks. As you write a big file to the disk, the first 32K
goes to disk one, the second 32K to disk two, on through the tenth
32K chunk going to disk ten. That completes a stripe. The eleventh
32K chunk goes to disk one again. This balances the load over all ten
disks, so you get better performance. You can vary the chunk size for
different applications.
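The chunk arithmetic just described is simple enough to sketch. Here's a tiny Python illustration of the mapping (purely hypothetical, just to show the idea, not how any real RAID driver is written):

```python
# Sketch of RAID 0 striping: map a byte offset on the virtual disk
# to (physical disk, offset on that disk). Hypothetical illustration.

CHUNK = 32 * 1024   # 32K chunk size, as in the example
NDISKS = 10         # ten 10 Gig disks

def raid0_map(offset):
    chunk_index = offset // CHUNK        # which 32K chunk we're in
    disk = chunk_index % NDISKS          # chunks rotate over the disks
    stripe = chunk_index // NDISKS       # full stripes before this one
    disk_offset = stripe * CHUNK + (offset % CHUNK)
    return disk, disk_offset

# The eleventh 32K chunk (index 10) lands back on disk one (disk 0 here):
print(raid0_map(10 * CHUNK))  # (0, 32768)
```

As a big sequential read walks through consecutive chunks, the modulus cycles through all ten disks, which is exactly the load balancing described above.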
So RAID 0 gets you a large virtual disk and balances load over your
drives. It doesn't give you any increase in reliability. Quite the
contrary. If a drive dies, then instead of losing a 10 Gig hunk of data,
you lose lots of 32K hunks of data scattered through all your data.
This is probably harder to restore.
Load balancing over multiple spindles would be nice for Grex, but not
vital. We don't have just a single process reading the disk sequentially.
Increasing the difficulty of reconstructing the file system after a
disk crash is too high a cost to pay for slightly better load balancing.
I think we can rule RAID 0 out as an option.
There is no Redundancy in RAID 0 (so it should really be called "AID 0").
Real RAID starts with RAID 1 - also called "mirroring". We are still
trying to make a virtual disk out of many real disks. This time we'll
group our ten 10Gig disks into five pairs, disk 1A, 1B, 2A, 2B, etc.
Whenever we write data to disk 1A, we also write a copy of the same data
to the corresponding location on disk 1B. The first obvious effect is
that our virtual disk only contains 50 Gig instead of 100 Gig. But now,
if disk 1B dies, we have an up-to-the-nano-second backup copy. We can
replace the disk 1B with a new disk, copy the contents of disk 1A onto
it, and be back up and running with no loss of data.
Ideally, in RAID 1, we'd do the writes to the two disks simultaneously,
so writing is no slower than reading. (In software implementations of
RAID 1, this may not entirely work.) On reads, we don't have to read
from both disks. We just select the one that is less busy at the moment
and read from that. So, we get decent performance and the capability
to survive a single drive failure, but at the cost of half our disk space.
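The bookkeeping for that read policy can be sketched in a few lines of Python (a hypothetical toy, not RAIDframe's actual logic): every write goes to both disks, and each read is sent to whichever disk has the shorter queue.

```python
# Toy RAID 1 scheduler: writes hit both disks, reads go to the less
# busy disk. Purely illustrative - a real driver tracks actual
# in-flight I/O, not a simple counter.

class Mirror:
    def __init__(self):
        self.pending = [0, 0]          # queued I/Os per disk

    def write(self):
        # a mirrored write is issued to both disks at once
        self.pending[0] += 1
        self.pending[1] += 1

    def read(self):
        # pick whichever disk currently has less queued work
        disk = 0 if self.pending[0] <= self.pending[1] else 1
        self.pending[disk] += 1
        return disk

m = Mirror()
m.write()            # both disks now have one pending I/O
m.pending[0] += 3    # pretend disk 0 got busy
print(m.read())      # the read is routed to disk 1
```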
I've heard of RAID 0+1, but not read much about it. I assume it's just
striping over the 5 pairs of mirrored disks in the example above.
RAID 4 is an attempt to get the same benefits as RAID 1, but with less
loss of disk space. This time we call 9 of our disks "data disks" and
the other one a "parity disk". Parity just means "even" or "odd". The
129th bit stored on the parity disk depends on the values of the 129th
bit stored on the other nine drives. If an odd number of those nine bits
are 1's, then a 1 is stored at that location on the parity disk. If an
even number of them are 1's, then a 0 is stored at that location on the parity
disk. In geek terms, the content of the parity disk is just a bit-wise
exclusive-OR of the contents of all the other drives.
Suppose a drive dies. If it was the parity drive, we can just recompute its
value from the other drives. But what if a data drive dies? Well, we have
all the other drives and the parity drives. So for each bit we have something
like:
data1 data2 data3 data4 data5 data6 data7 data8 data9 parity
  1     0     1     X     0     1     0     0     1      1
The parity bit is 1, so we originally had an odd number of 1's on the
data disks. There are 4 ones on the surviving drives, so the bit on the
dead drive must have been 1. (In fact the dead drive's contents are just
the bit-wise exclusive-OR of all the surviving data and parity drives, so
the reconstruction process for a dead data drive is identical to the
reconstruction process for a dead parity drive.)
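The parity arithmetic above is just XOR, so the bit example can be checked mechanically. A quick Python sketch (single bits only, purely illustrative):

```python
# RAID 4/5 parity in miniature: parity = XOR of the nine data bits,
# and a dead drive's bit = XOR of all the survivors (parity included).
from functools import reduce
from operator import xor

data = [1, 0, 1, 1, 0, 1, 0, 0, 1]   # the nine data bits; data4 is the 1
parity = reduce(xor, data)            # five 1's -> odd -> parity is 1

# Now drive 4 (index 3) dies. XOR everything that survives:
survivors = data[:3] + data[4:] + [parity]
lost_bit = reduce(xor, survivors)
print(lost_bit)                       # 1, matching the table above
```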
So, this is cool. We now have a virtual drive holding 90Gig of data, so
we've lost only 10% of our storage, and we can still reconstruct all the
data on any lost drive.
There are some additional performance costs though. The first problem is
the parity drive. Every time you write data to a drive, you have to update
the data on the parity drive. So though data writing is split over nine
drives, parity writing all lands on one drive, making that drive nine times
as busy as the others. It becomes a performance bottleneck.
The solution to this problem is RAID 5 - stripe the parity data over all the
drives. For example, the parity data for the first 32K of all the drives
would be on drive 1, the parity for the second 32K of all the drives would
be on drive 2, and so on. So there is no one parity drive and parity is
spread over all disks. (Note that disk reconstruction doesn't change -
you still just exclusive-OR all the other drives to reconstruct the
lost drive.)
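Where the parity for a given stripe lands can be sketched as a one-liner: stripe number modulo the number of disks (a hypothetical layout; real implementations vary in exactly how they rotate parity).

```python
# RAID 5 parity placement: rotate the parity chunk across the drives
# so no single drive absorbs all the parity writes. Illustrative only.

NDISKS = 10

def parity_disk(stripe):
    return stripe % NDISKS

# Stripe 0's parity sits on drive 0, stripe 1's on drive 1, ...,
# and stripe 10 wraps back around to drive 0.
layout = [parity_disk(s) for s in range(12)]
print(layout)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
```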
There is a second performance hit in RAID 4 and 5 though. Like RAID 1, every
write is to two drives - data to one drive and parity to another. However,
before we can write the parity, we have to compute the parity, and that means
we need to read the corresponding data from the other eight data drives. So
a simple write turns into 8 reads and 2 writes.
Also, in RAID 1, we were able to improve read performance by always reading
the data from the less busy drive of the two that had the data. In RAID 4
and 5, the data is on only one drive, so we can only read it from that drive.
However, we like to assume the striping in RAID 5 will balance the load among
the drives pretty well anyway.
There are lots of hardware RAID devices that optimize this kind of thing, but
we can't afford them. The option we are considering is software RAID, which
is implemented in the OpenBSD kernel by a program called RAIDframe. It's
pretty solid and rather nice. You can set up a RAID array, possibly with
spare drives. If a drive fails, and there is a spare on-line, it will
automatically bring the spare on line, reconstruct the lost data and proceed
without interruption of service. If there are no spares, it'll run with a
drive short (in RAID 5, any read from the dead drive is simulated by reading
from all the others and exclusive-Oring them). This is all terrific if you
need a server up 24x7, which Grex doesn't, really.
Note that the redundancy in RAID gives you some protection against single
disk failures (it's assumed that you do something before the second disk
dies). It does not replace a backup. If you accidentally delete the wrong
file, or a vandal breaks in and alters all your files, the RAID will give
you nice redundant copies of the altered files, not the original ones.
So RAID is not a substitute for backups. It's protection against hardware
failure and that's all.
RAID 0 can give you some performance enhancements by load balancing. The
other versions of RAID are all likely to be slower than a non-RAID setup,
especially if implemented in software. RAID 0 doesn't cost you any disk
space. The other versions are going to eat up some of your disk space.
In our case, since we have 3 drives, RAID 1 doesn't quite work and RAID 5
would eat up 1/3 of our disk space.
OK, that wasn't so brief. But writing it just made me more sure that RAID isn't right for Grex. The problem it is primarily designed to solve isn't an important issue for Grex. I may do some experimenting with rsync, and see if I can get a sense of how expensive it would be to regularly rsync to the IDE disk.
Where I work, we use rsync to keep a mirror of about 50 gigs worth of data. We're doing it across the Internet, via a T1, as well. It does cause a fair amount of disk thrashing on both ends when it figures out what files need to be transferred (very much like doing a 'find' across the filesystem) but overall it seems very efficient. It's worked well for us. My guess is the "expense" of doing an rsync to another local disk a couple times a day is going to be pretty low, especially since you're not transferring over a network and so won't need to involve ssh or compression.
I ran a benchmark last night, one of my own design. It's nothing really fancy or scientific; I wrote it a few years ago to try and get a feel for how various disk subsystems and filesystem types handled a load I thought was fairly typical of timesharing-style machines. Basically, it just copies a bunch of 32KB files all over the place. Running on both the IDE and SCSI drives took about 4 seconds. Running on the RAID took around 80 seconds. Something is wrong here; there's no reason RAIDframe should be *20 times* slower than a `normal' filesystem, I just can't believe it's that bad. Perhaps I'm wrong about the stripe size; maybe 64 is just too small. Jan, could you up it to 256 and see if that helps any? I see at least one post from someone who says they used an interleave size of 168 and got decent performance, but 32 (and probably 64) was too small.
Right. I installed rsync from the ports tree (I like the ports tree). I then went to /sd0 (the test partition on the first SCSI disk) and did time rsync -ax /usr . This should copy the whole /usr partition from the IDE disk to the SCSI disk (which is backwards from the direction we would be going) and give me some statistics. The /usr partition contains 664,632K of data. The result from 'time' was: 12.0u 24.5s 4:28.34 13.6% 0+0k 161046+660947io 36pf+0w So it took 4.5 minutes elapsed time, eating 13.6% of an otherwise idle CPU. I then reran it. In this case it should be checking the two copies against each other, and copying over only what changed (little or nothing). The time result was: 3.8u 3.5s 0:46.13 16.1% 0+0k 47000+1454io 1pf+0w This took 45 seconds. In real life we'd want the --delete option on the command, so files that don't exist on the source are removed from the copy, but I didn't do it in the test because I was paranoid about getting the arguments backward. Even so, we'd want our target partitions rather larger than the source partitions. Maybe just one big target partition instead of separate ones corresponding to the different source partitions, the whole thing readable only by root and possibly unmounted when it isn't being updated. Doing this a couple times a day seems a much lower-impact way to get data redundancy than RAID. It'd be tempting to keep two copies of some partitions, and update them on alternate days. Dunno if that's necessary. This is not a substitute for real backups to tape, of course.
Dan slipped in. I'll try reconfiguring the RAID.
OK, I reconfigured it with a 256 K stripe size. The current config file is in /etc/raid1.conf. Running bonnie now.
How much would a RAID controller cost? I'm sure Jan is right that it'd cost too much, but if there are substantial benefits maybe the users would spring for some more money. I'm not sure the benefits would be all that substantial in any case. We're going to be on brand new spiffy hardware, and I expect that will already mean a big improvement in reliability. Grex isn't unreliable even now. But it seems like it'd be easier to discuss it now than after the new machine is in place and in use.
I want to dispute the claim that since Grex doesn't have to be up all the time, the high availability provided by RAID isn't important. Grex doesn't pay anything for its staff time, but it is a scarce resource. The difference in staff time required to format a new disk and restore data to it, versus just putting in a new disk and letting it happen automatically, is huge. I too am curious about the costs of hardware RAID controllers. It's been years since I looked at such things, but given that they were widely available three or four years ago, I'm surprised to hear the price hasn't come down.
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
raid 64 2000 9520 6.8 7974 1.3 5706 2.0 50932 62.5 63815 13.0 147.9 1.6
raid 256 2000 9483 7.1 8768 1.9 5443 2.6 56017 67.9 70599 14.3 183.1 1.3
scsi 2000 53754 43.4 54106 14.1 10090 2.6 60326 70.9 61067 11.5 201.2 0.8
OK, the second line is RAID with stripe size of 256 kiB instead of 64 kiB.
Generally things are better, but not dramatically so. (Doing
'raidctl -sv raid1' confirms that it did get reconfigured.)
Generally, if you do a large number of small reads and writes to small files,
then a large stripe size is better, and if you read a smaller number of larger
files, a smaller stripe size is better. Grex probably belongs on the larger
end of the spectrum.
Note that we have a hardware RAID controller on our motherboard, a "Promise" device whose model number I've forgotten. It works only with IDE drives and is not supported by OpenBSD (and they don't seem to think they are ever going to support such things). So, there is a wide range of hardware RAID controllers with different capabilities and prices. Recovering from a disk crash certainly costs less staff time with RAID. But how often does it happen? If you have a recent snapshot on another disk, recovering from a disk crash isn't all that hard even without RAID. Amortize the time difference over the low frequency with which it happens, and I don't see much weight to that argument.
I did a quick search on RAID controllers, and saw prices in the mid-hundreds ($300-700). I don't know anything about what value would be provided by the different types. I'm not in a position to analyze the number and effects of disk hardware failures, either. I'm only asking a question.
Also, OpenBSD's hardware support is pretty limited even compared to other open-source operating systems, so you can't buy just any RAID controller and expect it to work.
http://www.openbsd.com/i386.html#hardware includes a list of hardware RAID controllers supported by OpenBSD. Not that I think we should get one.
As jep said, you can get a decent RAID controller for about $400. OpenBSD drivers, though, are another matter. I think Grex needs to move forward. The second-guessing can continue for years, but the hardware is already in place (perhaps there should have been more discussion earlier). Keep in mind that what we're "bickering" over is what may (or may not) be a little bit better than the alternative. Having said that, what about my idea?! Have one boot disk with all the (rarely changing) system directories on it, and then configure the other two "data" disks as RAID 1 (mirroring). It entails 50% disk "waste", but shouldn't have the performance hit while retaining availability and redundancy. After all, we live in compromising times.... (:
I didn't have the impression I was holding anything up, or that anyone else was, either, with the questions about RAID. Dan has been making what appear to be useful suggestions -- I can conclude that, if only because Jan has been accepting some of them. As for my part, I think it's clear enough to everyone here that I shouldn't have any input about RAID. I've never set up a RAID system. If there's a choice for a staffer between doing anything about the new system, and answering one of my questions or comments, by all means, work on the system. (As if I even have to say that.)
Back in janc's "Intro to RAID": RAID 5 turns a disk write into 2 reads & 2 writes. Better than what janc suggested grex (with 3 disks, not 10) would face, but still not good when (I believe) grex is doing plenty of writes. (Is it?) Good hardware RAID (with dedicated hardware to do parity calculations, lots of private cache memory to reduce disk activity, etc.) could improve this. But disk space is cheap enough these days to make RAID 1 the way to go if one wants redundancy in a "lots of writes" situation. (At least for our size & budget.) RAID 1 is also considerably easier to do "acceptably" in software, and great software RAID is obviously not a priority for OpenBSD. If we're eager to avoid downtime, a spare hard drive's great to have. When a dead drive has you down or limping, there's often a huge downtime difference between "have an identical, well-tested spare drive on hand" and "rush to research suitable replacement models, where they might be bought, costs, and lead times". *Especially* since different generations of SCSI hard drives sometimes fail to "play well together" in flaky, intermittent ways.
Hmm. It would appear that RAID 5 performance is just unacceptably slow with RAIDframe in OpenBSD. Weird; I'd have thought it'd be better. Oh well, it's not the first time I've been wrong. If a hardware RAID controller is $400, one would have to weigh the cost of buying one of those versus buying another SCSI disk for $200 and using RAID 0+1 (mirroring, and striping over the mirrors). That, I am reasonably confident, would be fast. Is it worth it for grex? That's another matter. I agree with scg that it is, but I'm not paying all the bills. I disagree with Leeron that doing mirroring by itself is the way to go; I think the price/performance ratio isn't worth it.
Sorry, jep, I didn't mean to imply that you (or others) were holding up anything. I certainly have no idea what the implementation time frame is. For all I know, Grex budgeted the next 3 months for such discussion before finalizing NewGrex and putting it on-line. (: There's a lot of worthy discussion here and many good suggestions. But I do know how over-discussion can become negative on a BBS, and I don't want to see that happen here. Not to sound like the US Patent Office commissioner of 125 years ago, but I think all the constructive comments about RAID, with all its pluses and minuses, have been made. It's time to make a decision.... These are the points I'd consider: (Note that whether RAID is useful for Grex almost becomes a moot point) 1. We have no RAID controller (and I'm not impressed by the list supported by OpenBSD). 2. The software RAID-5 performance rules that out. 3. Software RAID-1 remains a possibility. (At least Walter and I think so.)
Yeah, it occurred to me a little after I wrote my introduction to RAID that there were more efficient ways to maintain parity on writing - you can read the parity disk, and the value you are about to overwrite, and use those to compute the new parity. So 2 reads and 2 writes suffice no matter how many disks you have in a RAID 5 array. So, Walter's correction is correct. I'm not at all unhappy with this discussion. I think we are still in a mode of usefully exploring options and collecting data. If I feel the discussion is stagnating, I'll bring it to completion, by declaring a solution by fiat if necessary, though I'd prefer to boil it down to a few options and get some consensus among staff. (If Marcus weren't out of town this month, I'd probably call a staff meeting. We'll need one after he's back in any case.) I'm interested in Leeron's RAID 1 suggestion. Two disks in RAID 1 and one disk plain wastes 1/3 of our space, just as RAID 5 would have. If RAID 1 performs substantially better than RAID 5, then this might be a viable option. The performance is going to have to be pretty good to convince me that this is better than the rsync option though. However, I plan to rearrange two disks into a RAID 1 array tonight, so we can benchmark that. The other project I'm pursuing is improving my understanding of Grex's disk usage patterns. If you're reading this in coop, you may want to check out Garage item 150 (I think) where I recently posted some statistics on old Grex's disk usage. Preliminary results seem to indicate that most of Grex's disk usage is on the /var drive (which does not include /var/spool/mail). Apparently what Grex does most of is logging. More than half the disk activity is there, and it is almost all writes, not reads. I want to keep investigating this. I don't think we are in any special hurry to get the new Grex up, but I want to keep the process in motion, not letting it stagnate or stall. We are not stalled. Things are good.
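The parity shortcut described above - 2 reads and 2 writes no matter how wide the array is - works because XOR is its own inverse. A hypothetical single-bit sketch in Python:

```python
# Updating RAID 5 parity without touching the other data drives:
# new_parity = old_parity XOR old_data XOR new_data.
# Single-bit toy example, not real driver code.
from functools import reduce
from operator import xor

data = [1, 0, 1, 1, 0, 1, 0, 0, 1]
parity = reduce(xor, data)           # full parity over all nine bits

old, new = data[3], 0                # read 1: the old data bit
parity = parity ^ old ^ new          # read 2: old parity, then XOR trick
data[3] = new                        # write 1: new data
                                     # write 2: new parity (updated above)

assert parity == reduce(xor, data)   # matches a full recompute
```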
OK, I've re-arranged the disks once again. Now /sd0 is a plain filesystem
consisting of SCSI disk 0, and /raid is a RAID 1 array consisting of SCSI
drives 1 and 2.
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
scsi 2000 53754 43.4 54106 14.1 10090 2.6 60326 70.9 61067 11.5 201.2 0.8
raid 1 2000 16651 13.5 19368 3.4 10702 3.6 61614 73.5 68343 14.5 197.9 1.5
raid 5 2000 9520 6.8 7974 1.3 5706 2.0 50932 62.5 63815 13.0 147.9 1.6
This is definitely performing much better than RAID 5, but the writes are
still rather on the slow side. (Though we do seem to be getting a slight
win on the READ side - looks like it is balancing reads across the two
disks well enough to get a moderate performance win over a single disk.)
Dan's multi-process benchmark might be worth trying.
Leeron's idea was to use RAID 1 for the more ephemeral partitions -
partitions where data changes rapidly, and restoring from a week-old
backup tape after a crash might be unsatisfactory. RAID 1 would provide
a full backup of that data.
So, the RAID might have /bbs, root (mainly for /etc/passwd), /var
(current log files). The regular disk might have /usr, /usr/local, etc.
Dunno where users would go.
The problem is, that the partitions whose contents change a lot (and are
thus more interesting to keep a real-time mirror of) also tend to have
a lot of writes. So putting a partition like /var, which is almost
write-only, on RAID would be pretty unattractive from a performance
point of view. So there's a bit of a paradox here - RAID's advantage
over rsync is greatest when writes are frequent, but its performance
suffers most under those circumstances.
The one partition where RAID 1 looks good to me right now is /bbs.
More reading than writing certainly happens there, but there is enough
writing so that keeping a mirror would be nice. I guess user partitions
would be a possibility for RAID too.
Hmm, I don't know. The more we look at the performance numbers, the less and less impressed I am, to the point of actually being really disappointed in RAIDframe. It almost doesn't seem worth it. Doing something like RAID 1+0 might be better, but would require another disk to really be useful. I'm not sure doing RAID on one partition alone is really worth it, the rationale being that if a disk dies, without using RAID everywhere, you have to do a lot of work to bring it back online. Doing the same work with one or two partitions more doesn't seem like that much of an added incremental cost. That doesn't solve the problem of lost data, though. One solution to that would be to leave a tape in the tape drive all the time, and do a nightly full backup of /bbs and the user partitions (just overwrite the tape). Every now and then, do a full backup of everything on a separate tape and keep it for posterity.
FYI, I logged back into the nextgrex machine and re-ran my simple benchmark. The one that took 81 seconds on the RAID 5 partition (using an interleave size of 64; I didn't get a chance to try it on the one with an interleave size of 256) took about 5 and a quarter seconds on average. Almost a 20-fold speed increase. Bonnie shows that performance on a mirror is about 1/3 that of a straight disk. With another disk, I'd champion using RAID 1+0, as I'm guessing that would be in the same general area performance-wise as `normal' partitions, while still giving high availability. It'd cost another $200 to get another disk to do it, though.
Yeah, I think later today I'll make another pass at designing a RAID-less partition scheme. This has all been very educational, and RAID 1 is almost good enough to use, but I don't feel it is quite good enough.
I should note that I'm not pushing hard for using RAID. My impression has been that RAID is a good thing, all other things being equal, but I don't know enough about it to make a good choice. What I would object to, and what it seemed to me was being advocated in some of the earlier arguments, is designing for low availability. There are all sorts of things it makes sense to design for in various situations, such as low cost, low maintenance, high performance, high availability, and so forth, and declaring one of those to be a high priority generally involves tradeoffs in other areas. If cost or performance is determined to be more important than high availability, I might agree and I certainly wouldn't argue. It's skipping RAID purely for the sake of declaring that we don't need high availability that I was objecting to, and it doesn't sound to me like that's what's going on here anymore.
No, that certainly isn't my thinking here. RAID costs a lot of disk space, which we can probably afford. RAID, at least as implemented in software under OpenBSD, seems to have a pretty huge performance penalty. Much bigger than it theoretically should have. High availability *IS* the benefit of RAID. (RAID 0 and well-implemented RAID 1 might give you performance benefits, but in most versions of RAID other overhead will eat any performance benefit.) I like RAID, but my feeling is that high availability isn't important enough to Grex to justify its other costs. I may be wrong. It may be that this new computer is going to be so fast that running Grex will hardly load it, and the performance cost of RAID wouldn't mean anything to it. If so, we should consider moving onto RAID in the future. I don't think making that change later will be hard. We need to rebuild the system every year and a half anyway, and changing the disks from flat disks to RAID does not have broad implications for the rest of the system configuration.
With all due respect, check out the speed of M-Net these days (arbornet.org). I'm not up-to-date on the hardware specs, but I'd assume it's running on a CPU that is 1/3 to 1/4 the horsepower, with slower drives.
Hmm, I use mnet...every couple of days or so. It's usually quite fast, but I don't believe they're using RAID. What's more, they're running FreeBSD, which has a different RAID implementation yet again. Leeron, what are you referring to that one should note in terms of mnet's performance?
I haven't been on M-Net for a while - but it also generally had fewer users than Grex. Generally I expect that the new Grex will be way too fast for the load the current user base will put on it. However, the user base may grow with better performance. Also we will be turning on quotas, which is going to put some drag on the disk performance - that's a lot more important to me than RAID. Also Grex occasionally gets hit by vandals - I just spent some time tracking down a mailbomber who was slowing the system down badly. How will the new Grex perform under those conditions? I don't know. I think we'll need to gain experience with the new computer before we can really decide this. I think we can reconfigure to use RAID later if we feel the need. I think we could do such a reconfiguration in a day, if needed.
Sounds good to me. Also, going to the next grex allows one to do some things that I think will be beneficial, such as turning off the queueing telnet daemon (the queue is almost always empty, anyway, except in like, 5% of all cases), using a new version of SSH, ditching sendmail in favor of something like postfix, etc.
This response has been erased.
Well, I didn't get much work on next Grex done today, but I built a respectable castle out of Lego, so the day isn't entirely a waste.
We finally received our OpenBSD CDs today - it took them 16 days to get here from Calgary. Stickers were included.
I have an inkling Marcus won't ditch sendmail as readily as you might wish, Dan. It seems to be one of his favourite hacking toys.
I don't know what his plans are, but I'd be surprised if he didn't seriously consider alternatives. I think the port to OpenBSD is going to be a bit of a "start over" for him even if he decides to stay with sendmail, because moving all his modifications into a current sendmail release is going to be nearly as much work as switching to a different program. I don't think he's really all that fond of sendmail.
He was talking about exim recently, but I still think postfix is a better choice. Grex isn't for people's personal hacking toys, anyway.
I use Exim, and it's certainly easier to configure than sendmail. It has a good, flexible filter language, too. It doesn't have the same privilege separation features as Postfix, though -- it's still a monolithic binary.
Yeah, that's one of my problems with exim. I honestly believe that postfix is just as powerful, can be made to do everything that grex wants/needs, and is more secure. I also argue that it's better documented.
Major overload here. Hm. Regarding old hardware. Even when we switch over, we'll want to keep the old stuff intact for at least a bit in case of some sort of truly disastrous problem with the new hardware. Once we're comfortable, then we can decommission it. The disks, being slightly newer, may have some slight use for other small projects. Much of the data on them doesn't really matter, but for mail, spool, user files, swap, and /etc, we certainly want to scrub those before using them for other purposes or if we decide to sell or give away any of them (even to my basement collection). Scrubbing them *is* going to be an easier way to ensure reasonable security than destroying the disks. This is because sufficient physical destruction has its own issues. Disassembling things, then bashing them with a hammer and using a bulk eraser, may make data recovery more difficult, but it may still leave traces of data that could be recovered by the same sort of determined adversary that could recover data from a "single overwrite of all 0" drive. If that is the level of security you want, then physical destruction would require either a fairly good acid bath of all disk surfaces, or probably better yet incineration of the aluminum platters. No doubt we have pyromaniacs who would enjoy doing this, and there are certainly services that will do this (for a fee), but we'd probably be better off reserving these drives for future small projects (such as offloading mail processing, kerberos, etc.) or doing a multiple-overwrite scrub procedure and then selling them on ebay. Regarding kerberos. Cross and I clearly have an irreconcilable difference of opinion here. I'm clearly not going to change his opinion, there seems little likelihood he'll change mine, and I doubt most others share even my level of paranoia or care all that much. So I don't want to waste time arguing this.
Cross (and any others who care) is welcome to change his password just before & after switching to kerberos, which should cover any personal concerns he may have. Root & other passwords will almost certainly change or be addressed by new mechanisms - this is almost inherent in any switchover in any case. Once we switch to k5, barring unexpected changes to the standard, changing one's password will likely result in a standards-compliant k5 key at least potentially useful from other machines. Regarding mail. Hm, I think some of this scrolled off. Yes, we want to keep hierarchical mail boxes. The 4 possible MTAs include exim, postfix, current sendmail, & legacy/hacked sendmail. Unfortunately, mail is an area where we have significant functional requirements, which means any stock solution out of the box will almost certainly prove unacceptable. One functional requirement is mailbox quotas, which at least has simple design parameters. Another functional requirement is anti-spam logic, which is both controversial and important. A final functional requirement, unfortunately, is that this all needs to come up in some finite amount of time. I intend to look at exim & postfix, with a view that one of these should be a good enough base to support the functionality we want. As a fallback position, I am at least somewhat willing to consider installing the current legacy/hacked sendmail, with the understanding that it's both temporary and very very undesirable. I hope to spend time coding to avoid this possibility rather than composing lengthy responses defending whatever choices I make here. Hm. Surely I've said at least half of this somewhere already? Is this useful?
Yes, it's useful. Tell me, what do you think is so unique about grex's mail setup that a stock solution won't work? Surely postfix+procmail+spamassassin could handle the load grex would put on it, complete with hierarchical directories and mailbox quotas. Much larger sites use that combo and it works well. In fact, if you went with putting mail in $home/Mailbox, you'd get hierarchical mail directories for free, and eliminate a filesystem. Regarding Kerberos: I'd feel more comfortable with using your hashing algorithm if the guarantee was made that it would disappear from the system's Kerberos implementation no more than one year after its introduction, or some other suitable timeframe. There's no reason not to agree to that.
(re: anti-spam logic: I agree that a procmail/spamassassin combo would be a good move. nearly all of the spam that I receive at my Grex account is sent via open relays, which both SpamAssassin and SpamCop [via reporting] recognize and flag as such. whatever Grex is currently using, obviously, does not. given Grex's culture, I understand the reluctance to outright block mail from open relays, but I'd like to think that, with Next Grex, we should have sufficient processing capability to flag such mail.)
Mail has enough issues that perhaps it ought to be discussed in its own item. At some point, I will need to come up with a list of what grex mail currently has as "custom hacks"; that's not the same as a list of functional specs, but might beat idle speculation that, just because there are "a lot" of solutions out there, there's necessarily one that matches our needs. I hope we will be able to take advantage of other people's work as much as possible. But I don't think there are any guarantees that we will necessarily find exactly what we want.

Regarding procmail+spamassassin; this can't reject mail, which would be a significant step backwards spam-wise from what we can currently do. There are other issues regarding procmail+spamassassin (such as enforcing mailbox quotas, running perl on every piece of mail) that I don't find particularly attractive on a system-wide basis. I don't have a problem with this as a user option, but I'm much more concerned what to do for everybody else as a default.

Regarding RBL - grex gets listed on them just often enough that there's no way I can see us wanting to do this. RBL would be less unpalatable when used in conjunction with other stuff, as just one more clue that something "might" be spam. I hope that whatever we end up with will have the flexibility to allow us such options, but RBL is not a sufficient or reasonable solution on its own.

Regarding kerberos - there's no guarantee that *the* standard will necessarily do what grex needs, especially right off. Just for starters, des/des3 have inadequate etype info, preauth methods to reinforce weak passwords are lacking, aes is not yet fully standardized, and there is argument that the default aes string-to-key ought to be computationally intensive - fine for single-user workstations, not at all a good match for a popular timesharing system. I very much hope the standard evolves to a point where it fully meets the needs I think we have for it here on grex.
For the short term, our schedule means we probably shouldn't wait, and when the standard converges to our needs is not something we can dictate. So, I don't think such a promise as you ask would be in grex's best interest. It may be worth keeping in mind that until we deploy useful kerberized distributed applications from grex, the ability to kerberos-authenticate to grex from elsewhere will be almost entirely of academic interest. The real compatibility issue we have to sort out in the short term is not kerberos standards compliance, but how well it fits into openbsd-supplied interfaces and the grex environment. We aren't even close to worrying about distributed desktop applications, single sign-on, or making sure user passwords never leave the desktop.
That begs the question, why bother with Kerberos at all, then? I don't understand how other, much larger sites get away with stuff like using spamassassin on all incoming mail, and otherwise working on stock anti-spam solutions, but grex can't do it. Regarding timeframes; well, grex has already blown its one year timeframe. Given that, it seems most profitable to just use the BSD login API to deal with the custom hash algorithm and skip Kerberos for a later day.
Oh yes, regarding the Kerberos standard; I'm fairly sure the idea of burning a bunch of CPU time has been discredited and rejected. If not, it's a tunable parameter, anyway.
I think the problem he has with spamassassin is more the principle of it. We still have to accept the mail, we can't slam the door in the spammer's face. With the new system the extra processing penalty of having to accept the whole message before deeming it spam is probably not as important. I've seen some really poor spamassassin results on a few mailing lists I've been on, but part of that may have been poor configuration. On one list in particular spamassassin seemed to flag every message sent from Mozilla as spam, while missing a lot of *real* spam. I've kind of shied away from it for that reason.
I hate to break it to you guys, but 99% of the time, you're not slamming the door in the spammer's face as it is now.
(my vague recollection is that Kerberos will allow Grex to offload mail processing to another machine in the future.)
I don't think that'll buy much on the new machine, frankly. btw- I've started a new item, #156 in garage, for discussing mail issues.
(The problem we are trying to solve, if I recall correctly, is network bandwidth, NOT CPU/disk/etc. In that sense, the door is being closed fairly quickly: my understanding is that the mail is being rejected before the SMTP 'data' command.)
Postfix will do that. btw- has anyone ever measured how much bandwidth grex is really using?
Kerberos does 3 things for us: gives a more graceful way to deal with our existing password hash data, allows us to start developing distributed applications (such as file service, mail, conferencing, etc.), and eventually (if standards and availability permit) may allow us to offload client side stuff to user workstations. The first is immediately useful to us all on its own. The 2nd could be useful in the next 1-5 years. The 3rd is "pie in the sky" for now, but not impossible.
Did I misunderstand that? The modified password hash algorithm was justified because it would ease the transition to Kerberos. Now Kerberos is justified because it works better with the modified password hash algorithm? Seems like the case for kerberos has to rest on points two and three.
Again, I think it would be easier to use the BSD login API to deal with the custom hash algorithm now, and find some other way to transition to a standard Kerberos KDC. Particularly if you're not considering moving to distributed services for another 1 to 5 years.
This response has been erased.
The first is the reason to do kerberos now rather than wait another 1-5 years. The second is the reason we want to do kerberos. The third is the biggest but longest term and least likely win. kerberos 5 does do md5. But the md5 there has nothing to do with Unix md5 password hash. The unix md5 password hash was an early example of "computationally expensive" hash logic -- see above.
It's an extremely bad idea to take the hashed passwords from /etc/shadow and use them as keys in a Kerberos KDC. I already advocated a method for solving this problem cleanly and easily, but Marcus basically ignored it. See item 134 in the Garage conference.
(Spamassassin can be used to reject mail in the MTA, rather than via procmail. This is not as good as the spam rejection we do now, because the entire mail must be on Grex to run spamassassin against it. However, it will reject considerably more spam. It's a tradeoff.)
Yeah, I gotta say, the spam-rejection we're doing now is letting a whole lot of spam through.
Maybe it's best to delay this discussion until we know exactly what the mail filters do now....
I should note that one of the places I receive mail through runs SpamAssassin, and most of the spam I get scores considerably lower on the SpamAssassin scores than most of my legitimate mail. Of course, I never see the mail SpamAssassin does catch, and that's probably a considerable quantity of mail. Rejecting spam in the MTA is obnoxious. If you're receiving the spam directly from the sender, it probably makes a lot of sense. In general, though, most of the spam I receive is forwarded through other lists and aliases, and the return addresses are generally invalid, so bouncing spam just forwards it to the postmaster of whatever mail server was forwarding it, disguised as bounce messages of a sort the postmaster might actually need to look at. Spam filters should throw the spam away silently, with the caveat that you have to be really careful to err on the side of assuming mail is legitimate, since the sender of legitimate mail won't know the mail isn't getting through. In addition, if you're tagging spam in SpamAssassin and letting procmail do the discarding, you give individual users some control over how much is being deleted. My complaint about SpamAssassin letting through too much spam doesn't mean it's any worse than Grex's current filtering. I started automatically discarding all my Grex mail because more spam was coming through than I could manage.
A good amount of the spam I receive these days comes through grex. I don't bother forwarding it back to grex's uce alias since most of it is the standard type of junk one typically sees. That is, I doubt anyone is going to learn anything from it that isn't already common knowledge. I will note that spamassassin now includes a Bayesian filter that can be `trained' to recognize real spam. Running it and letting it do the tagging can't hurt much. There might be something to be said about generating bounces to postmasters who are running open mail relays, but I tend to doubt it.
It's extremely difficult to collect good statistics about the effectiveness of many anti-spam measures. Spammers tend not to complain when they can't send spam, so we don't know how many gave up. Natural selection has clearly favored the evolution of spammers who get by grex's defenses. The problem with bayesian filters is that they have to be trained on an individual basis to give the good results people quote. It's not enough to give a filter spam; it also has to be given ham, and each person's spam and ham are different enough that a filter trained on one person's ham isn't going to do so well on another's. A group of people with common interests may get acceptable results even so; and in a work environment defining "common" may even be possible. But I doubt there's enough commonality amongst all grex users to achieve results of any great value.
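To make the per-user-training point concrete, here is a toy word-level Bayesian filter (a sketch only, not SpamAssassin's actual implementation): its scores come entirely from the spam and ham corpora it is fed, which is exactly why a filter trained on one person's ham transfers poorly to another's.

```python
import math
from collections import Counter

class BayesFilter:
    """Toy naive-Bayes spam scorer over whitespace-separated words."""

    def __init__(self):
        self.spam = Counter()   # word counts seen in spam
        self.ham = Counter()    # word counts seen in ham
        self.nspam = 0          # number of spam messages trained on
        self.nham = 0           # number of ham messages trained on

    def train(self, text, is_spam):
        words = text.lower().split()
        if is_spam:
            self.spam.update(words)
            self.nspam += 1
        else:
            self.ham.update(words)
            self.nham += 1

    def spam_score(self, text):
        """Log-odds the message is spam; > 0 leans spam, < 0 leans ham."""
        score = math.log((self.nspam + 1) / (self.nham + 1))
        stotal = sum(self.spam.values()) + 1
        htotal = sum(self.ham.values()) + 1
        for w in text.lower().split():
            # add-one smoothing so unseen words don't zero out the score
            score += math.log((self.spam[w] + 1) / stotal)
            score -= math.log((self.ham[w] + 1) / htotal)
        return score
```

Train two copies on two different users' mailboxes and the same message can score on opposite sides of zero, which is the commonality problem described above.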
Case in point: I'd be perfectly happy with a spam filter that rejects all mail written in any language other than English and German. If I trained a Bayesian filter, that's probably part of what it would do. However, there are lots of other users on Grex who for some strange reason like foreign language mail. A suitable Bayesian filter for them would be very different from one for me.
Re #412: If a postmaster's site is acting as a spam relay, they deserve to be annoyed. Re #414: Where I work we're currently using a Bayesian filter with a site-wide corpus, with pretty good results. But this is a small company, with about 30 employees that tend to make similar decisions about what is and isn't spam. (The majority of our spam is, for some reason, for porn sites and penis enlargement products, which no one admits to wanting in their work accounts.) We also don't bounce messages based on the filter, just tag them for later filtering with each user's mail client. I seriously doubt this approach would work for Grex, because the user base here is too diverse. To me, it's far, far more important that legitimate mail to my Grex account not be rejected than that spam be blocked. For that reason I'd oppose any heavy-handed spam filtering unless it was configurable on an individual user basis. I'd also oppose any spam filtering that silently dropped messages instead of bouncing them, because it's far better if a legitimate message bounces than if it just disappears. People are conditioned to assume that if an email doesn't bounce back, it arrived intact.
A global installation of spamassassin can use each individual user's preferences. That is, it can be run by default with the target user's data; there doesn't have to be a system-wide immutable setting. Also, spamassassin doesn't automatically delete spam, it just tags it and lets the user decide what to do with it. I dump mine into a special MH folder that I scan once or twice a day to pull out any false positives.
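As an illustration of tag-then-file (rather than reject at the MTA), here is a sketch of a per-user delivery decision keyed on SpamAssassin's X-Spam-Flag header, standing in for the procmail recipe a user would actually write; the function name and folder names are made up for the example.

```python
import email

def deliver_folder(raw_message, spam_folder="spam", inbox="inbox"):
    """Pick a target folder from SpamAssassin's X-Spam-Flag tag.

    SpamAssassin adds 'X-Spam-Flag: YES' to messages it scores as spam;
    everything else goes to the inbox. Deletion stays a per-user choice.
    """
    msg = email.message_from_string(raw_message)
    if msg.get("X-Spam-Flag", "").strip().upper() == "YES":
        return spam_folder
    return inbox
```

Because the tag is just a header, each user can route on it (or ignore it) however they like, which is the point being made above.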
(when installing spamassassin on a system, it's a good idea to run it in a "learning" mode before using it to block mail. because it works using Bayesian filters, it needs time to A) learn what to block and B) prove that it's not dumping legitimate mail at an unacceptable rate.) (Marcus wrote somewhere [I don't remember where off-hand] recently indicating that, if Grex were to use an open relay blocking list, it would occasionally reject mail to itself because Grex sometimes appears on RBL lists. I'm not sure why he came to that conclusion. while I've occasionally seen Grex on DNS blacklists, I've yet to see it on an open relay list. at any rate, it seems trivial [to me, anyway] that, if Grex were to use outside blacklists, it would be on its own local whitelist.) (Scott Vintinner wrote a document on how to set up an anti-spam gateway using a combination of OpenBSD, Postfix, Amavisd-new, SpamAssassin, Vipul's Razor, and DCC. it's at http://lawmonkey.org/anti-spam.html . he makes some choices in implementation that we likely would not. still, it provides a start, and there are some useful suggestions in the document that are valid in and of themselves. the one question I'd have about the gateway set-up is "how resource-intensive is it?", but I keep reminding myself that NextGrex is more powerful than the current model and will likely surprise us with what it can and can't handle.)
re 416:
There's a rather large difference between an open relay forwarding spam
in random directions, and a well secured mail server handling mailing lists
or .forward files.
As a case in point, I used to host the Grex staff mailing lists on my mail
server. That is, people would send mail to the staff@grex.org address or some
other less publicized aliases, and it would be forwarded to my mail server,
which would then forward it on to the individual staffers on mail servers all
over the place. Then a couple of staff members started using spam filtering
that rejected spam in MTA, thus sending the spam back to the postmaster of
the mail server that was trying to deliver it to them, in this case me. One
staffer had this imposed on him by his mail provider, and didn't have much
of a choice in it, while another staff member had configured it himself and
refused to fix it. The result was that running a mailing list that sent mail
to those people was more trouble than it was worth.
My provider decided a while ago to start bouncing mail with SpamAssassin. For whatever reason, this was catching a lot of mail that I wanted, and people were complaining. I left them and went to a site that didn't block spam, as I run my own spam filters (a combination of spamassassin with custom rules and bogofilter). Just a datapoint.
There was a decent item on SlashDot yesterday about noting the tuple of sender, recipient, & IP address at the start of an SMTP session and putting the e-mail off an hour (with "try again later" or some such) if the tuple isn't in a kept-on-the-side database of tuples seen more than one hour but less than 36 days ago. This does a passable job of blocking most try-once-quickly-and-move-on spam, but little try-again-per-the-RFC real e-mail (according to the author). The 1 hour and 36 days were adjustable parameters; there were other details & some real-use-experience statistics. This sounds like it has a number of features we're looking for in an anti-spam tool for Grex...any thoughts?
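The scheme described above (greylisting) can be sketched in a few lines, with an in-memory dict standing in for the kept-on-the-side database and the 1-hour/36-day parameters from the description; class and method names are invented for the example.

```python
import time

class Greylist:
    """Tuple-based greylisting sketch: defer unknown (sender, recipient, ip)
    tuples, accept ones first seen more than `delay` but less than `expiry`
    seconds ago."""

    def __init__(self, delay=3600, expiry=36 * 24 * 3600, now=time.time):
        self.delay = delay      # 1 hour before a tuple is accepted
        self.expiry = expiry    # 36 days until a tuple is forgotten
        self.seen = {}          # (sender, recipient, ip) -> first-seen time
        self.now = now          # injectable clock, for testing

    def check(self, sender, recipient, ip):
        """Return 'accept' or 'defer' (an SMTP 4xx 'try again later')."""
        t = self.now()
        key = (sender, recipient, ip)
        first = self.seen.get(key)
        if first is None or t - first > self.expiry:
            self.seen[key] = t      # new (or expired) tuple: start probation
            return "defer"
        if t - first >= self.delay:
            return "accept"         # they retried per the RFC; let it through
        return "defer"
```

A standards-compliant mail server will retry after the deferral and get through; try-once spamware never comes back, which is where the blocking effect comes from.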
Whatever will help stem the flow...
It would tend to increase the network load. (Each valid message that wasn't high-traffic would require two connections and attempts instead of one.) Is the majority of spam really "try once and move on?" I imagine that's probably true of direct-to-MX spam, but probably not true of open relay spam.
I get the same spams numerous times.
Perhaps, but it's not clear to me that the load on the network link is excessive right now. It's thought to be, but no one's ever measured it. I'd think the latency in getting email would be more troublesome. Spam isn't a problem that's solved by adding arbitrary delays into the mix.
We're close to having this ready for production use at work. Very early results suggest that most spam doesn't come back to retry delivery later.
It might be easier to deal with spam by accepting the first N pieces of mail from a novel host, then delaying the rest; if the mail starts showing up in the UCE bin, further mail from that host is refused for an extended period (or perhaps permanently). This takes care of hijacked relays as well, while passing the occasional e-mail from an odd host without any delays at all. While spam may not be a big bandwidth load, it sure isn't going to stay small unless we act; it really behooves us to frustrate spammers if we can. Jan, is there any way to get Backtalk to salt its pages with fake e-mail addresses that would be picked up by spam harvesters? That would be one way to be certain that a host was sending spam to Grex.
All of these delaying tactics sound like they have the potential to cause problems for people who are on legitimate mailing lists.
I think it's probable that email addresses are being harvested from Grex, but I'm not sure of the extent of the problem. Grex backtalk has a robots.txt file that requests honest robots not to harvest it, which is why google searches don't find Grex items (somewhat mixed blessing). Obviously dishonest spammers are likely to disregard this, and I have seen indications of robots walking through Backtalk on Grex. In most Backtalk interfaces, clicking on the user name will give the user's bio page - but on Grex that is just the .plan file, and won't contain clickable email addresses, which are probably the spammer's favorite thing to harvest (on other systems Backtalk does have clickable email links, an issue that I need to address). Some people will have their email addresses in their .plan files, but most probably don't, or have email addresses for systems other than Grex, so spam to addresses harvested from there would go to non-Grex addresses and be hard to recognize. A spammer might be smart enough to go through the Backtalk pages, pick up login names and attach "@grex.org" to the end of each. But I'd think this would be uncommon. I'm not really inclined to think that seeding a lot of bad addresses is going to help enough to be worth the ugliness. However, you are certainly welcome to include <A HREF=mailto:uce@grex.org>send spam here</A> links in all your HTML postings on Grex.
I don't doubt that delaying mail acceptance for an hour would be effective against the current generation of spammers, but my general impression of techniques for blocking spam by insisting on standards compliance is that spammers are getting better and better at following standards. That strikes me as something which, if done commonly, would put a lot of extra load on legitimate mail servers, and would break the ability of e-mail to be used as a fairly instantaneous back-and-forth communications tool.
More results from using this at work - over 75% of spammers do not come back within 24 hours. No evidence that any legit e-mail has been lost. Looking through the tuple database, substantial spam attacks are *really* obvious...suggesting automated means to keep 'em locked out after their tuples age out of probation...I think we're using 5 minutes for this now. Yes, like any anti-spam technology, this has downsides and costs for both the infrastructure & users. But *not* using any anti-spam technology also has downsides & costs - like overflowing "in" boxes of left-'cause-there-was-too-much-spam ex-users.
Walter, are you using some kind of pre-made package to do this or did you roll your own?
We rolled our own (adding a few lines of C to our mail server software with MySQL on the back end). Graylisting is hardly more complex than a bubblesort routine, and it took fewer lines of code than any bubblesort that I can recall. We started it in "observe, record, & report what you'd do" mode. We've added more simple features (whitelist, etc.) to it as they've occurred to us. It looks like spammers are far more impatient on retrying than legit mail servers - we're hoping to add the rule "if graylisted e-mail comes back before the graylist time expires, then add a minute to the interval it came back within, compare that sum to the remaining graylist interval, and update the graylist interval to the greater of those two". Tweaking the numbers promises to let almost all legit e-mail avoid additional delays from this; hopefully spam will delay itself to death (or blacklist).
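As I read it, the proposed retry rule reduces to a one-line update (hypothetical function name; times in seconds):

```python
def update_graylist_interval(remaining, came_back_within):
    """Rule: when a graylisted message retries early, add one minute to the
    elapsed time and keep the greater of that sum and the remaining interval.

    remaining        -- seconds left on the current graylist interval
    came_back_within -- seconds elapsed when the early retry arrived
    """
    return max(came_back_within + 60, remaining)
```

The effect is that a server retrying impatiently (well before the interval expires) never shortens its wait, while one retrying just under the wire only adds about a minute; legit servers with sane retry schedules are barely penalized.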
How's the NextGrex project coming along? I haven't seen any updates here since July. Is the computer still going to be new when this is completed?
It's stalled; too many staff have too many other things going on. I was
going to propose, and I suppose here is as good a place to do it, that
we move nextgrex from Jan's house to the pumpkin after the new version
of OpenBSD comes out. I think that'll make it easier to test and debug
subsystems, and ultimately easier to transition over from old grex to
new grex. Once we do that, we should set a schedule; a reasonable one,
that we can stick to, trying to anticipate people's time demands, for
switching over within, say, no more than three months.
Major subsystems I see needing configuration before we can switch are:
1) The BBS. Someone has to port PicoSpan (mdw?) or provide
an alternative (YAPP? Frontalk?)
2) Mail. We have to build out the mail system.
3) set up newuser
4) Configuration changes.
That's about it. Everything else, we can do once we've transitioned.
I'd say that's a fair summary. Only potential drawback that I can see to locating it in the pumpkin is if the hardware still needs a fair amount of hands-on attention, it'd be a nuisance for somebody to run over there all the time to tend to it. If the hardware configuration is pretty stable now, this isn't an issue.
This response has been erased.
Could we keep mail as is, on old Grex, for a while? Maybe get it up using YAPP and Frontalk only? I, like everyone else, would hate to see this project stagnate much longer. If that means coming up less complete, that would be better than not at all.
I agree. Re #437: My understanding is that Grex is running a legal copy of Picospan, though we do not have an official license. The validity of our copy comes from Marcus, so he needs to explain how that works.
Sorry for the sharp-sounding comment at the end of my question. It didn't come out the way I had imagined it would. No criticism was intended. If the hardware is moved to the Pumpkin, are there people who can do the things which need doing? Is the direction of the mail system determined closely enough that another staffer can do it, or is this a "Marcus-only" project? How about newuser? I assume Picospan is something only Marcus can do. I'd hope the staff is avoiding things wherever possible that can only be done by one specific person, whether it's Marcus or anyone else. Specifically... all of the world uses e-mail; Grex should be able to implement an e-mail system, too. It isn't even that important. If Grex didn't have e-mail at all, every Grexer would be able to get by anyway by using any of their dozens of other mail addresses. It would be silly to find out NextGrex was being held up only because a specific person wasn't available to set up a customized e-mail system. For other topics; is there agreement on what needs to be done for newuser? Is someone working on that? I'm not particularly parsimonious... but all of the hardware was purchased months ago, some of it at premium prices as I understand it to obtain stuff that's state of the art. Computer hardware doesn't have much of a shelf life for being state of the art. Grex got donations, and bought all of the stuff at a pretty quick pace... then nothing much seems to have happened, at least judging from the information I'm seeing coming out. It made me wonder about why things aren't happening any more. Is it a brief slowdown for specific reasons, and things will be picking up again soon? Thanks!
This response has been erased.
Re 440: John, some users have no other email address whatsoever. I happen to live with 3 of them. Other users have enough correspondents who email to their Grex addresses that notifying them all would be a big problem - and suddenly having mail start bouncing without much warning would be a much bigger problem. I'd say we could do without conferencing for a while about as well as doing without email.
Regarding #440; We're not going live without a working mail system. It's not rocket science to put one together, but it does take time, and that's what everyone is lacking right now. Regarding Remmers's comment about hardware configuration; As far as I can tell, with the exception of the ethernet interface (which seems flakey), it's pretty much set. Everything left is purely software configuration.
I'll accept the point about e-mail, but that's not the main point. resp:0 brings up the idea of buying the hardware right away... February 17, 2003. There was enthusiasm at that time, maybe even some urgency. The first component was purchased on April 8, ordering continued through April, assembly began on May 4, and by May 17, aruba said it was all assembled, tested and working. There was enthusiastic testing through May. The final physical component arrived on May 28 -- the OpenBSD CDs. And work stopped, at least as visible to interested outsiders. It has been 4 1/2 months since May 28; approximately twice as long as it took to acquire, assemble and test all of the hardware. I presume OpenBSD has been installed on the new hardware so that *something* can have been done since then. In 4 1/2 more months, it will have been a year since this item was entered. Where's the urgency now? Or at least enthusiasm? Shall I wait until February and ask again then? Or maybe hold my horses until April? Or will that have been too soon to expect results from the purchase of all new hardware that was fully assembled by the end of May? If we hadn't bought anything yet, how much further behind would we be, compared to where we are now? Put another way, how long do we wait before the hardware needs to be replaced because it's too old? I understand very well about being too busy... I also understand it's not always *everyone* who's too busy. That's not even very likely. There have been many group projects that never happen at all because everyone waits forever on one person. It would be a real shame if this project is one like that. The treasurer bemoans the lack of donations all the time. Donations are declining... but a lot of people rushed right in to send money when they were told it was needed for the new NextGrex computer. You folks on the staff gave every indication you were ready to set it up so we could start using it. We all understood it would take time... but how much time? 
It takes a *really* *long* time to finish a project if no one is working on it. If that's the case, as it appears to me it is, maybe it it could be time to look at alternatives?
Hey, I agree with you 100%. It's ridiculous that it's taken this long to get things rolling. I think we should be able to move over to next grex within three months, and I can think of no reason we shouldn't be able to: this has stalled long enough. There is some work happening, but you won't see it if you just read coop. Most of it happens in garage, and some (to a much lesser extent) in other places. In particular, despite juggling kids and other time commitments, Jan has made some stellar progress setting up facets of nextgrex. Regardless, we're stalled right now. Here's my take on some of what happened. There's been disagreement among staff about how _best_ to proceed in certain technical areas. But traditionally, certain staff members have maintained domains of responsibility comprising various subsystems on grex (like how Marcus maintains PicoSpan). Some of this is necessary (Marcus and PicoSpan is a good example; he's the only one with the source code), some is contrived. Those parts have been, if not off-limits to other staffers, then at least considered that individual staffer's responsibility and left up to them. So, despite some staffers having more time than others, certain areas of nextgrex remain untouched pending the staffers who have less time, but traditional responsibility of those areas, to become available. I don't think this is working out too well. I think maybe we should consider just saying, ``screw it; we need to get the new machine online. Let's figure out the quickest way to do that and go from there.'' *THAT* said, we also have to be careful. Grex, right now, is a real mess in my opinion. There are patches upon patches upon bandaids upon kludges upon hacks upon more patches stacked up so high, it's difficult to see over them all. I think it's scary for newer staff (well, for me anyway) to *do* anything because everything is so customized. 
A lot of the work Jan has been doing on nextgrex is meticulously documenting *everything* he's done so that rebuilding grex from scratch is going to be an easy process for a reasonably competent Unix system administrator. This is good, and necessary, and we really do need to do this with all the major components of the system so that, moving forward, we don't end up in the hole we're in right now (it really is a hole). I think we can get back on track if a couple of us agree to donate a few hours or so a day for the next couple of weeks, to get OpenBSD 3.4 to where we're at now with OpenBSD 3.3 (note: OpenBSD 3.4 comes out on the first of November). If we then put the machine in the pumpkin, we should be good to go with getting nextgrex online in under three months.
This response has been erased.
I've proved the security of grex's password hash to be the same as that of sha1, which is provably mathematically superior to MD5. Also, isn't YAPP shareware? I thought you had to pay a significant amount of money for anything other than casual use?
This response has been erased.
Blowfish in OpenBSD is pretty slow. I've argued many times that building our own scrambling algorithm was a bad idea, and I certainly would have done it differently, but it happened before I came on board. Sorry, them's the breaks. Sometimes you just have to accept what you're given and work with it as best you can: we've got anywhere from 20,000 to 40,000 users whose passwords are hashed with it. But, we've also gotten it working with OpenBSD (I've got it working with login.conf on nextgrex). Jamie, I think you have a lot of good ideas on how to move grex forward, but don't blow it by being your polemical self. Three months *is* a reasonable amount of time for a complete build out of nextgrex. And I don't care how long it took Salcedo to get mnet up and running. This isn't mnet, and we want to have an amount of downtime that's only minimally longer than the time it takes to transfer the data from oldgrex to nextgrex (five weeks just to move things over---now that's outrageously long, in my opinion). I don't even live anywhere *near* Ann Arbor (otherwise, I'd volunteer to take in the hardware and crank away on it over the next two or three weekends), and other staffers don't have a lot of time, so it's unlikely we'll be able to halve the amount of time I already projected. As for YAPP, I'm all for it. We need something up and running sooner than later. However, ``you-think-you-heard-someone-say-maybe'' isn't a license agreement, and if your primary objection to PicoSpan is a licensing issue, then trading that can of worms for the mess that is a verbal licensing agreement for YAPP doesn't seem much better. Now, if I had my druthers, I'd just as soon see FrontTalk built out and used as a replacement for both. We're sure of the license for that; unfortunately, that would require someone taking a lot of time to make it happen, and, as we know, time is an issue. Look, you're preaching to the choir here.
So cut the confrontational crap and let's figure out how to move forward.
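For readers curious what the login.conf hookup mentioned a couple of responses back might look like, here's a minimal illustrative sketch of OpenBSD's BSD Authentication configuration. The style name `grexhash` is invented for this example, not the actual nextgrex setup; OpenBSD maps an auth style name to an authenticator program under /usr/libexec/auth/.

```
# /etc/login.conf (illustrative fragment only): select a custom BSD Auth
# style for the default login class.  A style named "grexhash" would be
# handled by a program installed as /usr/libexec/auth/login_grexhash,
# which can check the legacy hash and fall back to the stock passwd style.
default:\
	:auth=grexhash,passwd:\
	:tc=auth-defaults:
```

After editing, running `cap_mkdb /etc/login.conf` rebuilds the capability database so the change takes effect.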
This response has been erased.
I would like to see us stay with picospan. I left m-net when they switched to yapp. I tried it for a couple of weeks, hated it and never logged in again.
This response has been erased.
This response has been erased.
Four and a half months is long enough to wait for one or two formerly essential people to do what they have apparently promised. If there are holdups who aren't going to do what Grex needs, then they need to be replaced. Otherwise, the future of Grex is being held hostage, and those one or two people are being treated as more important than all of the rest of us. I can understand it if no one on the staff wants to state it that baldly. There seems to be little enough of teamwork for this project. So there it is, I'll say it. I have used YAPP for years, and Picospan for years, btw, and have not noticed any deficiencies in text YAPP. (WebYAPP is horrible, but Backtalk is available.) Let's put it this way, I'd rather have YAPP and NextGrex than Picospan and the rotting remains of obsolete Sun hardware.
Regarding #450; You're making all the same arguments I did. They haven't worked yet, but I'm willing to transition to another algorithm. It just takes convincing people that it's the right thing to do. However, the specific suggestion you make, to use the password expiration feature to force switching to another algorithm, would take a year (and is also something I suggested several years ago). Regarding #451; What specifically didn't you like about it? To me, they seem functionally equivalent, though I confess to not using any of the more exotic features of either. Regarding #454; Okay, that's fair. Btw- it's not just picospan; there are a number of things that are waiting on the laying on of hands of one or two people. I confess I myself am guilty of stalling on at least one thing. The way I look at it today, it's silly to try and do anything before OpenBSD 3.4 comes out on the first of November. Assuming there are going to be a few rough edges around the release, let's say it's reasonable to assume we can do an FTP install on the 4th or so. From today, that gives us about three weeks before we can *really* make any software changes. I propose this: we start the clock for the transition to nextgrex today. We have the period between now and the 4th of November to argue about how to do things, and then once November hits, we have two months and one week to get the new system online, with all of the data transferred, and the Sun turned off. I feel pretty confident we could make that deadline, and if someone has a pet project they can't squeeze in before that, too bad. Comments?
Regarding #448; By the way, the details of the grex password hash, as well as the details about the configuration of many of the subsystems on grex, have been publically available for some time through the, ``grex staff notes'' on the web. Indeed, there's even a link to the code that implements it (using that was how I ported it to nextgrex). See http://www.cyberspace.org/staffnote/passwd.html, with the actual code at: http://www.cyberspace.org/staffnote/mkp2.c.
This response has been erased.
Responding to #457: (1) No. It has its own recognizable token that doesn't match $[0-9]+$, but it's easy enough to switch off. /etc/master.passwd can be put together with awk or perl. Indeed, I already have a script to do it, and authentication uses the standard BSD login.conf framework. There are two possible places to go from here. (a) we move to Kerberos. This could be done by modifying the password changing program to automatically register principals using a standard string-to-key algorithm, or by using the modified KDC Marcus wants that uses his hash algorithm, where we'd use the existing contents of /etc/shadow as the key database. I think the latter is a really bad idea. (b) keep the same hash algorithm, using the login.conf framework to deal with it. Okay, there are really three: (c) switch over to one of the system standard algorithms after a suitable period of time, by modifying the password changing program to do it. Our customizations would disappear after a year or so. I favor this route. If we want to go Kerberos, it'd be best to go with a standard string-to-key algorithm, but this would be easy using methods I've already outlined elsewhere, using login.conf to make them transparent to the user. (2) That's the equivalent of a site-dependent salt. The idea is that if the same algorithm is used on more than one site, the hashes shouldn't come out to be the same. (3) That's not even the case with conventional cryptography (the algorithm doesn't get *easier* to crack, it just doesn't get appreciably harder). Regardless, this isn't conventional cryptography; it's a compression based hash algorithm. Marcus's password scrambling algorithm is essentially the HMAC construction applied to password hashing. The thing is, HMAC doesn't provide any additional security to password hashing over simple hashing, because it's designed to solve a different problem: authenticating messages over an insecure network, using a shared secret key. 
The thing is, with password hashing, either the key or the message is fixed, so you don't get any additional security. (4) That would require a change in the current configuration. But, there's no point. We have the situation with the password hashing algorithm well in hand on nextgrex. See (1). Don't bother arguing about the custom hash algorithm. It was a mistake (well, sort of. It was better than Unix crypt(3)); everyone knows that.
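To make the HMAC point above concrete, here's a small Python sketch. Everything in it is invented for illustration (the site constant, the function names, and SHA-1 standing in for whatever hash is actually used): once the key is a fixed public constant, both constructions are just deterministic functions of the password, so an attacker with the hash file can run the same dictionary attack against either.

```python
import hashlib
import hmac

# Hypothetical site-dependent constant (public, fixed for the whole site).
SITE_SALT = b"grex.example"

def simple_hash(password: bytes) -> str:
    """Plain hash of the site constant plus the password."""
    return hashlib.sha1(SITE_SALT + password).hexdigest()

def hmac_hash(password: bytes) -> str:
    """The HMAC construction with a fixed key, applied to the password."""
    return hmac.new(SITE_SALT, password, hashlib.sha1).hexdigest()

if __name__ == "__main__":
    pw = b"hunter2"
    # Both are deterministic functions of the password alone, so the same
    # guess-and-compare dictionary attack works against either one:
    guesses = [b"password", b"hunter2", b"letmein"]
    cracked = [g for g in guesses if hmac_hash(g) == hmac_hash(pw)]
    print(cracked)  # the attack succeeds identically for simple_hash
```

HMAC's security argument is about an attacker who does *not* know the key forging message authenticators; with the key public and only the message (the password) secret, that argument buys nothing extra.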
What reason is there to believe that anything will start on November 4, when it didn't start on May 28? Dan, are you speaking for the staff?
Regarding #459; No, I'm not speaking for all of staff. I am, however, speaking for myself as *part* of staff. There is of course a difference, though my experience tells me that timeline is reasonable if the rest of staff commits, say, 4 or 5 hours a week to make it happen. I'm hoping other staff members will chime in here saying either, ``no, that's completely unreasonable, and here's why...'' or, ``Yes, I think we could do that.'' My reason for stating that November 4 is a good starting date is that the next version of OpenBSD *will* be released on the 1st of November, and therefore it's reasonable to assume the 4th will be a date when most of the major problems of the release will be worked out, and it would be feasible to do an install.
There is a change in object file format with the next release, if I've properly understood the discussion. We (staff) have been discussing when to make the switch, with most wanting to wait for the release that includes/requires it. So November 1 sounds like a good starting point to me. I expect this to be a significant part of the agenda of Wednesday's staff meeting.
On expiring passwords: ssh does not work well with expired passwords. Blanket expiration of all passwords would cause a lot of trouble.
I'd like to respond to the earlier thread, of how we've gotten to the point that the move to New Grex has stalled. Because I'm going to name names and be pretty straightforward here, I'm going to be very clear this is my opinion and I could be dead wrong on a lot of it. One of the things that makes Grex so cool is that we are the sum of volunteer effort. It's also a problem as there really isn't anyone to call on the carpet when project goals slide. And not only are we dependent on volunteers but a very few pretty much call the shots for how anything goes. Partly this is out of respect for their opinions, partly it's because if the wrong decision is made they are the ones who will have to clean up the mess. But there is something else working there too, that I can't quite put into kind words, but it's working against the system as a whole right now. It can be seen in the difficulty our staff seems to have working as a team. We are all at fault for allowing one or two people to be so central to Grex's future. It's my understanding that Marcus is under a lot of stress at the moment, dealing with family issues. Been there, done that, but I was fortunate not to have a community of thousands breathing down my neck to simultaneously keep their project growing. If Marcus is holding things up right now it's not Marcus' fault. It's our fault for letting *any one person* be in the position of being that necessary. So, where I'm going is toward a shift in philosophy. We need to not only move Grex to new hardware but to a way of working that fosters teamwork, uses software that is team customizable, and where any one person could walk away and the rest of staff could pick up and carry on. It's past time this happened. I don't think Picospan is going to fit that goal. I'd like to take a serious look at software that would. It's scary, for sure. It probably would mean we'd not be too happy until we'd had a chance to mold it to our specific needs. 
But the point is we could mold it and the staff and users would get to decide how. And we'd be far far less dependent on one person. A win/win after the agony of working through the change. I'd also like to applaud Jan and other staff, who are working hard to document exactly what they are doing with New Grex, and looking to standardize as much of the hardware and software as possible. They are looking out for us.
Mary, in my opinion, that came off as both tactful and straight. I don't think I've ever managed to do both at the same time. Nice job.
I basically agree with Mary here, and like everyone I am frustrated that we haven't moved forward. I'd like to see some input from more staffers.
Well, there is another issue at play. Replacing Picospan is fine with me, but what are we going to replace it with? An old version of YAPP is available in source form, but even mnet doesn't run that, and there are no indications that it's completely free (as in not paying for it), or that the new version is available in source form, free, or that we have any chance of getting a license for it without paying many thousands of dollars. What alternatives do we have? Is frontalk there yet? That's the *only* reasonable alternative if getting YAPP doesn't pan out.
This response has been erased.
What's the email address? I'll send a message asking, but I'm not convinced it's the right direction.
This response has been erased.
Okay, I sent an email to that address. We'll see what comes back.
We sure will.
I did a lot of work on NextGrex for a while. After a while I lost momentum. Now I'm also short on time, having a lot of work to do. There are still things I ought to do, but I don't think anything is actually blocked on me. I don't think Picospan is a problem. Ask Marcus, and he'll deliver. An OpenBSD version already exists. Installing it would require very little of his time. He hasn't done it because he's thinking Grex isn't coming up right away anyway. If you want to move away from Picospan to something we can get a source license to, then that's a more significant project. I'm inclined to discourage Yapp. I've worked with it on Spring and M-Net and in both places it was much flakier than Picospan. There seems to be something where it deletes hunks of the following response when a response is censored, and other flakiness. Spring especially seems to be riddled with mangled item files that can't be correctly parsed any more. Fronttalk is buggier still, but the bugs are all in the user interface. Some of the search things don't work right, I think. It doesn't mung up the item files, because that part is done by Backtalk, which is pretty solid at that level. I do about 75% of my conferencing on Grex with the Fronttalk running on Grex. (Type "ft" at the shell prompt to run it - start up is slow but after that it's fine. Type "help differences" to see how it differs from Picospan. Mostly in good ways.) But I don't think it will be necessary to replace Picospan. Mail, not picospan, is the biggest single blocker. It would be good to have Marcus involved in that, at least in a spec-writing mode. Mostly there are a lot of smallish tasks that need doing. Some people need to put in some serious time. I don't have it right now. I'd be delighted if someone else did.
We should officially change the name to "NextGreX". It's a lot more symmetrical.
Several points. I'm starting to get really leery of major software components that only one or two staffers can fix, install, whatever. This is certainly the case with PicoSpan, and for legal reasons; it's not even a question of technical knowledge (which could presumably be transferred)! Surely that's bad. What's more, it's not at all clear to me that we have a legal license for it. But let me ask this: why is PicoSpan still `closed source'? I don't think there are too many people running it. Would it be possible to ask the holders of the intellectual property to release it under an open source style license? Or under a dual-license so that non-profits can use it for free? That would eliminate the problem once and for all. But even then, I'm not sure that's going in the right direction. Something like frontalk, which works across machines, is in my opinion where we should be directing our efforts. Mail isn't a big deal. Give me a day and I'll have it set up. But the time for elaborate spec writing and endless back and forth has passed: we've used up our time trying to create a system that satisfies everyone, and in the end, we've satisfied no one. Let's just start from a few basic guiding principles, a reasonable design, and go from there. If some aspect or other of the system isn't to someone's liking, too bad; at least we'll have a place to start addressing that person's concerns from.
I agree wholeheartedly with the last paragraph. I have never understood why Grex needs to have a mail system that's much more hacked than anyone else's. Why can't we use a (mostly) off-the-shelf solution for mail? (I'd be happy to be enlightened.)
I agree with Jan that Picospan probably isn't the holdup here, especially since an OpenBSD version already exists. As one can see by reading Dan's summary, quite a bit of work on NextGrex has already been done and some critical issues have been resolved; it's mostly been reported in the Garage conference, not Coop. I'm glad to see that Dan doesn't think mail will be a big issue, because that had been worrying me. Even if NextGrex comes up initially running Picospan for conferencing, which it probably will, I do believe that we should think in terms of moving away from it in the long run and towards something that is open source, non-proprietary, and is a bit more modern in its underlying architecture (that the user doesn't see). This rules out YAPP, of course, which on balance I'd have to say that I don't like as well as Picospan anyway (I've used both extensively). The most promising effort in this direction that I've seen is Jan's FrontTalk, which I hope he (or somebody) puts some more effort into getting ready for primetime. Fronttalk is essentially a Picospan-like front end to Backtalk, so users would continue to have a familiar interface, but the technical hassles involved in getting text-based conferencing to play well with web-based conferencing that one has with both Picospan and YAPP would basically disappear. Because of its client-server architecture, FrontTalk can also handle distributed conferencing, i.e. can access conferences on more than one machine. FrontTalk is slower than Picospan, but I think that once we're on the new machine, speed differences won't be particularly noticeable.
This response has been erased.
I'd like to think we aren't waiting for new software to be developed for NextGrex, because that would appear to me to be an indication of major further delays. Why were we in such a hurry to buy the hardware if we're so far away from using it? My expectation, when I saw we were buying new hardware, was that "we" (the staff) knew how it was going to be used, or thought they would know soon and had reason to believe they could begin using it. If we know we're using Picospan and that it's going to be ready in short order, that's terrific; it doesn't need to be worried about. Resp:474 implies the holdup is designing a mail system. Folks, this is not the right time to be designing a new mail system! The right time was before the hardware was bought. Maybe Next^2Grex can have the perfect mail system, after a few years of development. I'd like to respectfully request you quit screwing around with garbage like that, at least quit allowing it to hold up the new Grex machine, and install something available now.
Regarding #477; If you can get it done in the next three months, we'll consider it. We'll consider anything. However, the priority now should be doing whatever will take the least amount of time. Regarding #478; That's what I said. I plan on just building a mail system and moving on. If it's imperfect, we can fix it later. But right now, the goal *has* to be getting something reasonable up in short order. The only thing it would be silly not to do is wait for OpenBSD 3.4 to come out. That's less than two weeks away, though, so won't be a huge deal. On that, we're hamstrung by a megalomaniac in Canada, though.
re resp:479: Dan, it's my impression you need buy-in from the rest of the staff for the new mail system and that you don't have it. I think I was speaking to the staff as a whole rather than to you, indicating that, in my opinion, we have to go ahead. If my impressions are wrong, you have enough freedom to go ahead with implementing a standard mail system, have committed to doing so, and that major roadblock (as described earlier in this item) is not going to be a hold-up... that's terrific. Congratulations to you and to all of us. But it still leaves open the question, which I raised on Friday... what's the next hold-up? What's going to be done about it? Finally, in the end... when do we, the non-staff users, get to be on the new hardware? However pointedly, brusquely (or other euphemisms for rudely) I've asked it, I think it's a valid question to be asking, and I don't feel like I have received an answer yet.
You're right, I do need buy-in from the rest of staff. But what I think we need buy-in for is to just *do* things without a huge amount of debate and public discussion. Something needs to be done, let's just do it instead of trying to appease everyone. That can come later. Like I said, I think three months is a reasonable timeline. But again, that's dependent on staff buying into it, and on us agreeing to just do the work.
For those who are worried the new hardware will be obsolete before we get onto it: I don't see this as a concern. We're not building a system to play the latest shoot 'em up game here. It doesn't matter if our hardware is obsolete as long as it's fast enough to do what we need it to. Grex has been running on obsolete hardware for as long as I've been using it. There are reasons to be unhappy about the delays, but fear of obsolescence is not one of them. It's important that we get this system up and running soon. But it's also vitally important that it be set up right, and documented thoroughly. Thorough documentation is our only way out of the "we need to wait for person X to fix it, because only they know how that works" syndrome that curses us right now. That's worth spending extra time on, and I'm willing to wait for it.
There's a difference between doing a thorough, but *speedy* job, and continuing to do what we have been, which is nothing, while we make noises about needing more time to do it right. Remember, it doesn't matter whether we do it right or wrong if we don't do it at all!
We bought brand new equipment with the idea of running Grex on hardware that *wasn't* several years old. That brand new equipment will have passed its warranty period before NextGrex is anywhere close to being up and running.
The idea was to make a significant infrastructure improvement, and of the ways to do that, the best investment was in new hardware. The relative age of the Grex hardware versus the current standard has * never* been an issue of concern (except as it relates to the ability of Grex to provide even the basic services for which it exists), and very likely never will be.
Now, now, let's not mince words about our position. We *did* buy top of the line new hardware, with the intention of moving to the newer machine sooner than later. I will agree it wasn't our primary goal to have a super-computer to run grex on, but that doesn't excuse us being this slow. We really do have to do better. I'm going to suggest that, if staff agrees to move to nextgrex within three months, we end this thread of discussion. Pointing fingers and complaining about what we didn't do isn't going to help us achieve the goal of moving to the new machine sooner than later. Let's not lose sight of what's important here: getting nextgrex running. That said, I think we have a limited amount of time available until OpenBSD 3.4 comes out. Let's make profitable use of that time by trying to hash out some of the remaining technical issues before we have to start slinging away with configuring and documenting the new machine. Let's just get this thing done and move on to other issues.
John - if it's any consolation, NextGrex has been up and running since not long after we got the hardware. So most warrantable problems would have showed up by now.
Let me clarify my position a little bit. It's my intention to ask questions and to get the answers. I just wanted to find out why NextGrex isn't being used yet, and when it will be used. I still don't know either of those things. Dan Cross is certainly stepping forward and setting expectations, but I don't know if the staff as a whole is going to accept and meet those expectations. It's not my intention to bash anyone. I don't think the NextGrex implementation needs to be about personal feelings. If person A isn't going to get the job done -- for *whatever* reasons; I haven't asked for the reasons and no one has volunteered them -- then if there is a person B, that person should be allowed to give it a shot. There are "person B" people available, by the way. Grex is lucky that it doesn't have to be dependent on one individual for its survival and future direction. re resp:485: I apologize for any of my comments which may seem to stretch a point or two. Grex could have bought a 3.0 GHz machine in February, instead of a 2.2 GHz machine, so in that sense its new hardware is not "state of the art". And I don't care if it's state of the art in that sense; that wasn't my point. It *was* new then, and is not new now. If we had waited a year to buy, we'd be better off. That's what probably should and would have happened, had we known there was no commitment to put Grex on the new hardware last winter and spring. Having the NextGrex hardware sitting unused is hurting Grex. *That* was my point. re resp:487: Yeah, the warranty remark was a stretch. A hardware warranty isn't of that much relevance.
I don't think the current state is "hurting" grex, but I do agree it's not really helping. And waiting a year to buy the hardware wouldn't have helped, either. We are setting up a checklist of things to be done. The next step is to install OpenBSD 3.4, which janc agreed to do.
Re #488: > It *was* new then, and is not new now. If we had waited a year to > buy, we'd be better off. That's what probably should and would have > happened, had we known there was no commitment to put Grex on the new > hardware last winter and spring. Honestly, I think this is a bit of a catch-22. You're not going to get people to commit time and effort until they know there's a will to spend money on the hardware.
I disagree that we should have waited to buy. As I understand the process, you have to make sure the software runs correctly on a specific hardware configuration, not a hypothetical one. While we might have moved faster if we weren't a volunteer organization, or if more people had more time, I just don't see that we should have waited on the software.
Well we tried waiting, for over a year, with the idea that we would install the OS on two different types of hardware and then pick one. Nothing happened. Last winter the consensus was that we should commit to Intel hardware and buy what we need, because then staff would be more likely to do something. So we bought it. We made some progress at first, but now we're stalled. I'm glad cross is trying to jumpstart the process.
Oh boy...time drags on... re 489 There are dozens of reasons that show the current state is hurting GreX.
Errr, in the last line of 491, software=hardware.
I agree with moving ahead if we were ready. I'm piqued that we spent the money then stopped. I'll raise another point, too. For years, there have been two people who have been most adamantly against switching from Sun to Intel hardware; Marcus and STeve. These are the two people who were most needed to do anything new with Grex. I am on the outside; I'm not involved with the process, but... these are the two people who have stalled the move. Right?
They have not stalled it on purpose. Both are in the midst of other time crunches at work. Between volunteer work and salaried work, salaried work wins out. I, personally, know that if STeve spends much more time away from home and family, there will be no home or family. I have seen very little of him since about 2 weeks before the big power outage because he is spending all of his awake time fighting various incarnations of Microsoft yuckiness. That, coupled with my insane class/work schedule means we get no 'us' time as it is. He is not dragging his feet on the NextGrex. There is only so much time available and right now it is overfull. Believe me, he feels badly that he doesn't have more time. But, I have put my foot down. I don't want him to have another stroke because he is pushing the envelope too far. As much as I love Grex, it just isn't worth it.
Neither STeve nor Marcus, in my opinion, have been the roadblock; we just haven't made as much progress as we would like. There are things we could have used Marcus for, but we've been making some progress despite his not being available.
Glenda, I don't want to slight either STeve or Marcus. I definitely don't wish either of them bad health. They have done a lot of things for Grex that I appreciate. I understand about having a life, as well. However, if there's something that needs doing, that's waiting on you, that you can't do, there is nothing wrong with saying, "Someone else do this. I can't." It occurred to me that that could possibly resemble the situation here.
There's no one person who has stalled things. Marcus had some things he wanted to do that we've been waiting on that I don't think are realistic at this point: moving to Kerberos being the primary one. I'm interested in what happened at the staff meeting. Obviously, I can't attend them since I'm so far away, but I'm hoping some consensus was reached on how to proceed. Joe mentioned in party the other night that Jan is going to install OpenBSD 3.4 on nextgrex; hopefully we can proceed from there.
Other structural issues such as those raised by Mary (in #463) and others need to be discussed, but that really ought to be separated from discussion of the tactics of getting nextgrex into production as soon as possible. Seems to me you have X amount of work waiting to be done, Y number of staff trusted and capable of doing this kind of work, and only a subset available to do it. We want to promote nextgrex into production in as little time T as possible. How much time that is can be (and has been) argued, but it remains that to reduce the time you can:

1) Reduce the amount of work required for go-live. Possible. Depends on what is critical-path. Throwing in non-critical-path distractions like replacing Picospan, as was suggested by someone earlier, does little to reduce T. Shortcuts may be an option in some areas, but must be weighed against security/integrity risks, and a call made whether the deferred cost in staff time (remember, shortcuts need to be cleaned up) is worth it. Personally, I'm willing to give up some time to ensure our new system is well documented, stable and secure.

2) Increase the number of people working on the upgrade. This is tougher. You can't just expand the pool of trusted technical staff overnight. Trust is something earned over time and a track record of delivering. Basically, to reduce T here we either discover underutilized people resources within the existing group of trusted or semi-trusted staff, or possibly implement some system of work delegation to non-staff volunteers that applies limited staff time to reviewing work done by them. However, there may be practical limitations on the value of mixing in more people or farming out the work.

Moving forward, we should take a hard look at how to redistribute technical responsibilities to reduce the pressure on individual staff and avoid single points of failure, but our options for the immediate term are limited by the above. 
From what I gather, the main tasks remaining are:

* Recompile Picospan for OpenBSD
* Set up newuser
* Configuration tuning, whatever smallish tasks janc refers to in #472
* Mail
* Transfer the data

Mail seems to be the only significant one. Perhaps we can sort out an agreement from Jan and Marcus that Dan move ahead on what he would like to do, with the understanding that he document as he builds it out and they make some time to review. For the record, if there are tasks that can be delegated to non-staff, please add my name to the roster of volunteers. My apologies for letting this get so long. Price of having lurked for so long, I guess :-)
(I thought I had heard that Marcus has already compiled Picospan for OpenBSD. He'll probably have to do it again, before the migration, but it doesn't seem to be a show-stopper.)
My understanding is that OpenBSD is totally changing their binary format and breaking backwards compatibility, so it will have to be recompiled. However, if he's already compiled it once, there's probably nothing time-consuming to do.
Regarding #500; Well, I think there are some things we can seriously cut down on to save time. For instance, we don't need to build out a complex authentication system using Kerberos right now. The reason for trying to use something other than PicoSpan is to cut down on time (Marcus isn't around a lot these days), and partly because using software we don't have the source for is going to cause problems as time goes on. For instance, it'd be swell if someone could just compile picospan on nextgrex and have done with it. Unfortunately, it's just not that easy. As far as mail goes, I'd love to put mail on another machine, and just use lmtp to deliver locally. Unfortunately, I don't know that's going to be practical.
What's the big deal with Picospan? From what I've heard, Marcus has control of the source. He can put it where he wants. Switching now doesn't gain us anything.
Maybe my understanding is flawed. It's that we don't really have a legal license for it, Marcus sort of has a copy of the source as an artifact, and the legalities of it all are really rather fuzzy, and he doesn't have much control over the source at all. If the source could be opened up, all of my objections would evaporate, but it really seems rather more complex than that.
Re #495 - Intel vs. Sun. Marcus and STeve are in favor of moving from Sun to Intel hardware. In fact, at the meeting where we decided on what type of hardware to purchase, STeve made that recommendation, speaking for himself and Marcus. Subsequent to that, STeve was actively involved in the hardware acquisition process. If I had to speculate on the reasons for slow progress - the fact is that several key staff members had unexpected situations come up in their lives that limited the time they had available to work on NextGrex. I think everyone on staff is committed to getting the work done and is making an honest effort to do so within the time constraints they have to deal with.
People who have been with Grex a long time may also be unconsciously comparing "how long it took to do X" when staff was much younger, had fewer family members living with them, and had jobs that were less consuming.
Regarding #506: I think the thing that's upsetting some folks, and perhaps rightfully so, is this idea that there are some staff members who are `key' at all. In reality, that's the way it's always going to be, but that doesn't mean the rest of us should be paralyzed by it.
Right. But I don't think that's what's been holding things up.
I concur. I just wanted to attempt to clarify.
A lot of things happen behind the scenes, where most of us don't know anything about them. When it looks like nothing is happening, that can be frustrating. We all had a lot of expectations, I think, and everyone cheerfully donated all the money that was asked for... then nothing visible happened for months. That's fine, but I don't feel guilty for asking some questions now. So, there was a staff meeting on Saturday... was there any buy-in to Dan's idea of moving ahead?
Keep an eye on your mailbox, folks. My OpenBSD 3.4 CD (and t-shirt) just arrived 30 minutes ago. Grex's ought to be arriving any day now.
John, the staff meeting was on Wednesday, October 22. The staff report in the minutes of tonight's BoD meeting pretty much sums up the discussion. The Next Step is installing OpenBSD 3.4.
The question was asked in general, has OpenBSD 3.4 been installed?
Not last I checked.
I understand from the earlier discussion that Jan and the other staff configuring grex have been documenting the configuration in detail. For folks like myself curious about the technical nitty-gritty, is any of that documentation publicly available yet?
I haven't read all of what's above. I should have a lighter work-load over the holidays, but the kids won't be in school as much, so I might not have all that much time to work on Grex either. Still, I expect to be able to do some work. Last night I started work on upgrading to OpenBSD 3.4. It's up and working, and I am about half way through the business of following the instructions to redo the installs and configuration changes that had already been documented. As I've been working, I've also been updating and clarifying the install documents. Mostly the install documents have worked fine. It's not just documentation. A lot of it is custom scripts. So setting up the /suidbin partition, moving appropriate suid files to it and replacing the old copies with symbolic links took about 7 minutes. Full install and setup of party took four commands and four minutes (most of the time to ftp the source over). Configuring Apache and the external authenticator took about 4 minutes too. There are still some glitches - my scripts to install Orville-Write seem to have failed. However, the goal is to be able to build a new Grex in fairly short order, and we've made good progress toward that. I don't have a good way to make these documents public right now. It's nothing amazingly interesting. One bit of good news - I've done lots of reboots as I installed stuff, rebuilt kernels, and such. So far the ethernet interface has initialized correctly every time. I don't know if the ethernet driver got fixed in the 3.4 release, or if my new router just plays better with OpenBSD, but it looks like this issue is solved. Right now I'm just playing catch-up to get the system back to where it was before we upgraded to 3.4. I hope to get a substantial amount of forward progress done over the holidays. I hope other staff members will too.
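For anyone curious what the /suidbin step looks like in practice, here is a rough sketch of the idea, not Jan's actual scripts (names, flattening scheme, and paths are all illustrative guesses): each setuid binary gets moved onto the /suidbin partition and replaced with a symbolic link at its original location.

```shell
#!/bin/sh
# Illustrative sketch only -- not the actual nextgrex scripts.
# For each setuid binary named on the command line, move it onto the
# /suidbin partition and leave a symlink behind in its place.
SUIDBIN=${SUIDBIN:-/suidbin}
for f in "$@"; do
    base=$(echo "$f" | tr / _)    # flatten path: /usr/bin/passwd -> _usr_bin_passwd
    mv "$f" "$SUIDBIN/$base"      # relocate the setuid binary
    ln -s "$SUIDBIN/$base" "$f"   # symlink from the original location
done
```

Candidates could presumably be found with something like `find / -perm -4000 -type f`.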
Great! Okay, how about relocating the machine to the pumpkin?
For the next few weeks, I'll likely have some time to work on the thing. I don't know what advantage moving it to the pumpkin would have, at least during that time period. However, if there is any strength of opinion favoring that, I'd actually love to have it off my desk. Its fans are loud and it takes up scarce desk space.
If it's coming up on the network reliably now, the advantages are that (a) we can test out network services other than those that you poke holes in your firewall for, (b) it's closer to oldgrex, and (c) it's already in place for when grex shifts to it.
All good reasons, but I'd like to see it a bit closer to being ready for use before moving it. I'd like to see it move early in January, earlier if possible.
Thanks, Jan.
Actually being able to make it accessible via http and smtp and things like that may be useful for testing. Well, for other people. I can access those services just fine :). I'll move it as soon as any staff member says they'd find it easier to do their work if it was moved, or at the end of the first week of January, at which point I'm booting it out my house no matter what state it is in.
I think it'd be a lot easier to set up a decent mail configuration if it were moved earlier.
OK.
However, before we move it out from behind my firewall, we need to check that this isn't going to be a security problem. Are there any services we need to turn off?
This response has been erased.
I would find http useful.
As a general principle, I agree with #527.
Taking a quick look at what's currently running on nextgrex, I would turn off
tcp and udp ports:
daytime (13)
time (37)
auth (113)
I don't see any particular need for any of these to be running.
Leaving ssh, www, 8080, https, smtp open should be fine, with the caveat that
we may want to populate /var/www/htdocs with something closer to the real
grex html files before opening it generally.
I would turn off "submission" (587) in the sendmail cf files beneath /etc/mail
as we don't currently offer that on old grex.
finger (79) is currently off, but presumably you will want to turn that on
at some point since we do offer that on old grex.
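If it helps, turning off those small services on a 3.4 box is mostly a matter of commenting their lines out of inetd.conf and signalling inetd to reload. A rough sketch, untested against nextgrex itself (and note the following response argues for leaving auth/ident on, so only daytime and time are touched here):

```shell
#!/bin/sh
# Rough sketch: comment out the daytime and time entries (tcp and udp)
# in inetd.conf, then make inetd re-read its configuration.
# CONF can be pointed at a copy of the file for a dry run.
CONF=${CONF:-/etc/inetd.conf}
sed -e 's/^daytime/#daytime/' \
    -e 's/^time/#time/' "$CONF" > "$CONF.new" &&
mv "$CONF.new" "$CONF"
# inetd only re-reads its config on SIGHUP
[ -f /var/run/inetd.pid ] && kill -HUP "$(cat /var/run/inetd.pid)"
```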
auth/ident should be left open, I think. It's one we've traditionally left open.
http and https should be OK to leave open. I've already configured those (https with a self-issued certificate). /var/www/htdocs is no longer the document root. The document root is /usr/local/www, as on the traditional Grex, and it currently contains only a place-holder index.html and some backtalk images. I should probably delete /var/www/htdocs, or symlink it to /usr/local/www. I'm not exactly sure how to schedule the move. I'd pretty much have to do it at night. Wouldn't hurt to have someone else around to help. Anyone know what IP addresses are free in the pumpkin? I suppose it would be safe to use the old grease IP address.
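For the curious, a self-issued certificate of the sort mentioned, plus the htdocs symlink, would look something like the following. This is a guess at the general shape, not what's actually on nextgrex; the file names, key size, and one-year lifetime are all illustrative.

```shell
#!/bin/sh
# Illustrative sketch: generate a self-signed certificate for https and
# point the old htdocs location at the real document root.  Paths are
# overridable so this can be tried outside /etc and /var.
KEY=${KEY:-/etc/ssl/private/server.key}
CRT=${CRT:-/etc/ssl/server.crt}
HTDOCS=${HTDOCS:-/var/www/htdocs}
# one command makes both the key and the self-signed cert
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout "$KEY" -out "$CRT" -subj '/CN=nextgrex'
rm -rf "$HTDOCS"                  # drop the unused directory...
ln -s /usr/local/www "$HTDOCS"    # ...and symlink to the real root
```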
(Jan, if you just need physical help in moving, I can be available.)
Don't think I really need physical help. It's not a heavy computer. I don't suppose I really need help at all. Figuring out how to get it onto the network, getting it configured, moving junk around to make space for it, someone to hold the door while someone else carries it... it'd be pleasanter with two people, but it'll work with one, and the difficulty of scheduling time in advance means one is probably the best choice. I guess I'll tentatively aim at moving it this evening, sometime after the kids are in bed.
The only security problem with ident that I'm currently aware of is it can be used to determine what username servers are running under. It's probably worth running it on Grex because it lets other sites inform us of which of our many users is causing them trouble, in the event of abuse.
I'm aborting the plan to move Next Grex to the pumpkin tonight. I just realized that I haven't got a monitor for it. Right now it's on the secondary input of my dual-input monitor. The only monitor we have free in the pumpkin is not a VGA monitor. (Monochrome CGA, I think.) To set it up in the pumpkin I'd need to borrow the monitor and keyboard from gryps. I could do that, but it's not a very satisfactory solution. I think we should let the move wait till we have a monitor and keyboard. The reasons to do it now are not all that strong, and a monitor should not be all that hard to find. People all over town are paying money to get rid of them. I think I have a spare keyboard someplace. I'd have to dig around a bit. I think Dan Gryniewicz (dang, if you must have a last name like that, I wish you'd get a unique first name so I wouldn't have to type the last one all the time) had offered the donation of a monitor. I don't know if he's even in town right now.
Steve Weiss says he has a monitor. Maybe I'll grab that and make the move tomorrow night.
We bought all the hardware & didn't get monitor & keyboard?
Right. See responses 43 and 48 of this item:
As for random hardware pieces, I'm willing to use things like
donated monitors and keyboards, because if they fail the system
won't be affected (STeve Andre, #43).
and
You'll notice that I omitted the keyboard and monitor from the
list of things to buy. This is because if either fails, they
won't affect the running of Grex immediately. We can boot the
system up without a monitor (indeed, most of my OpenBSD machines
have only had a monitor on them during the initial install and
upgrades), and we have spare keyboards in the pumpkin already.
So those are of a nature where a failure means we have to drag
something over to Grex and use that instead (STeve Andre, #48).
Jan, I can lend a 15" monitor to the cause. It's easy for me to swing by your house on my way to or from work (before 9:00 or after 6:00) tomorrow. Call me to arrange a time (unless you want to hold out for a bigger screen).
It's exciting to see that things are starting to move again. Thanks Jan and the rest of the staff!
A small screen is fine. I'll call Jeff.
I've got some monitors here at work that the company is going to have to pay to get rid of anyway. Grex would be welcome to have one. They're 14". The tubes are worn enough that the display contrast is a bit low for Windows, but they'd work fine for text consoles. Just thought I'd offer one in case you don't come up with something better.
We have 12 or 13" mono VGA monitors unless Jim threw them out. Do you need color? These are easier to move around.
Jeff dropped off his 15" today. I'll use that. Thanks for all the other offers. I can temporarily loan a keyboard. My plan is to move it tonight.
OK, it's in the pumpkin. We have a UPS issue, though. The UPS in the pumpkin is at capacity, so I couldn't plug it into that (didn't try, actually). So it's on wall power, which is reputed to be none too clean. I don't really know what to do about this, so I'll just hope that someone else solves the problem.
I would recommend Grex buy an inexpensive UPS if you're going to be running the system for any length of time. It would cost under $200 and could be used as a spare later if we ever have to take our main UPS down for servicing again. APC makes okay small UPS's. Stay away from Cyberpower.
You have several choices: