|
Grex > Coop > #297: Grex: It's time to switch operating systems. | |
|
| Author |
Message |
cross
|
|
Grex: It's time to switch operating systems.
|
Nov 29 20:48 UTC 2010 |
So after fighting the OpenBSD package system for the last week or so,
I'm about to give up. The stuff just doesn't work. OpenBSD sucks.
Background: When Grex was founded, and for the first few years of its
existence, it ran on Sun hardware using Sun's operating system,
SunOS. At the time, SunOS was a derivative of the Berkeley
distribution of Unix (Sun later switched to AT&T's System V Release 4,
which is more commonly known as "Solaris" or "Solaris 2", but formally
known as SunOS version 5 [SunOS 1 through SunOS 4.x were the BSD
derived versions; actually, the earliest might have been 7th Edition
derived, but that's neither here nor there and really getting into the
weeds on something that's only tangentially related]. Confusingly, as
a marketing tactic, they later retroactively named the older, BSD-
derived SunOS Solaris 1). The hardware originally used Motorola's
68000-derived family of processors, but eventually switched to the
RISC SPARC processor.
Around 2002 or 2003, Grex's hardware was starting to break down, the
BSD-derived SunOS had been EOL'd by Sun, and the price/performance
point for Sun hardware was really being eclipsed by x86 hardware.
Further, there were several free, very stable and mature operating
systems available for x86 hardware. It became clear that it was time
to at least upgrade; after a lively round of debate, it was decided to
use an x86-compatible AMD processor, new SCSI disks, etc. Then the
debate around operating systems started in earnest, with the field
eventually narrowed to two reasonable candidates: OpenBSD and
FreeBSD. Of these, FreeBSD has a much larger userbase, much better
third party software support, and is generally more advanced. OpenBSD
focuses more on software security and correctness, with many
developers actively auditing the source base looking for (and often
finding and fixing) security problems before they become real
problems. Of note, and relevant to this discussion, is that FreeBSD
does the same thing; also, something that's come up recently but that
was not, and is not now, a consideration, is that M-Net runs FreeBSD.
Marcus and Steve both argued passionately for OpenBSD on security
grounds, I argued for FreeBSD on technical grounds, everyone else was
ambivalent. Ultimately, the decision was made to go with OpenBSD.
Hardware was purchased, and then ... nothing happened for a year. The
hardware sat in somebody's house with no one really looking at it.
Finally, Jan Wolter took the initiative to get Grex up and running on
the new system; he developed a less than favorable impression of
OpenBSD along the way (see some of the old archives in the garage
conference). Joe Gelinas and I did some additional work to get things
going, someone (probably Jan) moved the data over from the Sun and put
the new machine into production in December of 2004; it's been running
ever since, modulo some new hardware to replace failed components
(particularly hard discs) and upgrades of the operating system.
Since sometime in 2007, I've been what you might call the primary
staff guy, in the sense that I've done most of the upgrades and a lot
of work on the software, the web site, etc. In that time, I've formed
the opinion that OpenBSD was a mistake, and that we really, truly
would be better off with FreeBSD. I want to bring this back up for
discussion now.
Some basic technical points:
a) Grex's current hardware is getting long in the tooth and needs to
be replaced at some point. Modern computers are based around
multicore processors; OpenBSD doesn't support these particularly well;
certainly not as well as FreeBSD.
b) I do not believe that OpenBSD is any more secure than FreeBSD. In
fact, we've seen instances of OpenBSD security holes *on Grex* that
didn't appear in FreeBSD. Many of the flaws that Chad and Mickeyd
used to crash Grex continuously don't appear in FreeBSD: despite
trying to do the same things on M-Net, that system remained stable.
c) Many virtual hosting providers support FreeBSD, but relatively few
support OpenBSD. Grex has probably got one more round on its own
hardware before going virtual and living in the cloud. It would
behoove us to transition to a software stack that's going to make that
transition as painless as possible.
d) FreeBSD has a LOT more third party software support. Things that
could be a big draw for potential users exist in the FreeBSD ports
tree, but not OpenBSD; sure, those things *may* eventually get ported
to OpenBSD, but why wait?
I can go into more depth, but I think it's time to get serious about
switching.
|
| 29 responses total. |
kentn
|
|
response 1 of 29:
|
Nov 29 22:22 UTC 2010 |
Thanks for the overview, Dan. It seems like a good idea to me to use
whatever OS that staff feel is best, that supports the applications we
currently run, applications we want to run, and an OS staff would like
to continue working on. We definitely need something that will make
life easier for staff as well as something that makes it easier for Grex
to provide the services our users want. And being compatible with
the offerings from hosting providers sounds like a plus. I'm curious
what other people think.
|
jep
|
|
response 2 of 29:
|
Nov 29 23:08 UTC 2010 |
I think most of us are going to make, or go along with, any decisions on
the basis of non-technical information. I don't know enough about the
differences between OpenBSD and FreeBSD to make any difference. Neither
do most of the users, or most of the Board.
I would be willing to bet there are two or three people involved in
running Grex who should decide such things, and the rest of us will go
along. I'm interested in details like whether the slime who hack Grex
and bring it down can be stopped, not how they can be stopped. If there
are cool capabilities to be gained, I would be interested in what those
are. Can Grex be opened up again so people can log in and use it right
away, the way I did when it first opened to the public? I want it open
again, and I want it as soon as possible.
How long is it going to take to transition to FreeBSD, once the decision
has been made to make that move? I mean in weeks on my calendar, not
hours of staff time. If you can do it in a week, great. Start this
Sunday and we'll see you in a week. If not... then how long? Are you
going to do it yourself, Dan? If not, do you have the help you need?
How long until Backtalk will be available? It's what I use here, so
it's what I care about.
Also, is there any cost? How much? Are there users who will pay for
it? As I said elsewhere, I can chip in some.
Tell me things like that.
|
nharmon
|
|
response 3 of 29:
|
Nov 30 02:23 UTC 2010 |
Users probably would not see much of a difference between FreeBSD and
OpenBSD, so I think whatever is easier for staff is the way we should go.
I might also mention that Dan really does seem to be our main staffer
right now, if not the only active staffer. I think it would be a good
idea for the board to give him some authority to take charge of
technical operations and make decisions like this.
|
cross
|
|
response 4 of 29:
|
Nov 30 15:08 UTC 2010 |
resp:2 Backtalk, etc, would be available immediately (that is, FreeBSD
wouldn't be put into production until the software had been ported and
tested). I don't know how long it would take, but total down time
would be minimal (like, a day or two max). That is, most of the work
would proceed in parallel with keeping the current Grex running.
There would be no cost.
|
vsrinivas
|
|
response 5 of 29:
|
Nov 30 15:14 UTC 2010 |
Using FreeBSD would allow Grex to switch to the ZFS filesystem; I think
that alone would be an excellent reason in its favour.
FreeBSD release branches are supported for considerably longer than
OpenBSD releases. While that isn't a major deal for Grex (it is keeping
up-to-date fairly well), it would keep upgrade pains confined to longer
intervals.
-- vs
|
cross
|
|
response 6 of 29:
|
Nov 30 15:23 UTC 2010 |
ZFS is another great point.
|
tsty
|
|
response 7 of 29:
|
Dec 1 19:45 UTC 2010 |
did a little digging .. especially because of the starved-ram cross ran
into.
these may be worth the click & read:
http://www.undeadly.org/cgi?action=article&sid=20100618041150
http://www.osnews.com/comments/23978
http://mongers.org/openbsd/interview-espie-ports
http://onlamp.com/pub/a/bsd/2004/03/18/marc_espie.html
|
tsty
|
|
response 8 of 29:
|
Dec 1 19:59 UTC 2010 |
also, from sys.cf there is this:
Item 100: Linus Torvalds on OpenBSD
Entered by John H. Remmers (remmers) on Thu, Jul 17, 2008 (10:18):
Ran across this today at
http://article.gmane.org/gmane.linux.kernel/706950
|
cross
|
|
response 9 of 29:
|
Dec 1 20:30 UTC 2010 |
Pretty much all of that, except for the OpenBSD 4.8 announcement
link, is a couple of years old. Grex is running OpenBSD 4.8.
I found the problem installing packages; an old dependency that had
been removed in the GNOME libraries (needed through a complicated
set of dependencies by the RT package) had been installed as a port,
but then that port had been removed. When pkg_add (the tool Marc
Espie wrote to add or update packages under OpenBSD) went to upgrade
that package, it got itself into an infinite loop trying to navigate
what it thought was a cyclic dependency chain. I guess every time
through that loop it set a variable, or added something onto an
array (these tools are written in Perl) or something similar until,
eventually, the thing just ran out of memory. I tracked it down
by manually following the dependency chain until I found the cycle,
and force-removing the no-longer-existing package (and everything
that depended on it).
Now, this to me sort of exemplifies what I dislike about OpenBSD.
Espie's package tools are often held up as examples of what they
do *right*, but actually, they've got some pretty serious bugs in
them. I mean, really; the tool didn't bother to keep some sort of
"visited node" list when it traversed the package dependency graph?
Detecting a cycle in a directed graph isn't that hard. Similarly,
having some new packages and some old packages on the system, without
doing any sort of maintenance of the original dependency information,
just invites troubles. Databases get around this by having a notion
of an atomic transaction: either everything succeeds, or it all
fails and is "rolled back" in a way that is transparent to the
consumer of the data...none of this half-updated business.
The FreeBSD people seem to do a lot better with portsnap and
portupgrade.
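
To make the "visited node" point concrete, here is a minimal Python sketch of cycle detection over a dependency graph using a depth-first search. It is only an illustration under stated assumptions: the graph layout, the function name find_cycle, and the package names in the usage note are all made up for the example, and it says nothing about how pkg_add (which is written in Perl) is actually structured internally.

```python
from typing import Dict, List, Optional

# Hypothetical dependency map: package name -> packages it depends on.
DependencyGraph = Dict[str, List[str]]

# Node states: unvisited, on the current search path, fully explored.
WHITE, GRAY, BLACK = 0, 1, 2


def find_cycle(graph: DependencyGraph) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if acyclic."""
    color: Dict[str, int] = {}
    path: List[str] = []  # packages on the current search path, in order

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the loop is closed.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found is not None:
                    return found
        # Everything reachable from node is explored; never revisit it.
        path.pop()
        color[node] = BLACK
        return None

    for package in graph:
        if color.get(package, WHITE) == WHITE:
            found = visit(package)
            if found is not None:
                return found
    return None
```

Called on a made-up graph such as {"rt": ["gnome-lib"], "gnome-lib": ["old-dep"], "old-dep": ["gnome-lib"]}, this reports the loop between gnome-lib and old-dep instead of walking it forever. Because each package is marked once and never re-entered, memory use stays proportional to the number of packages rather than growing on every trip around the loop.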
|
tsty
|
|
response 10 of 29:
|
Dec 2 05:46 UTC 2010 |
re 9 ... great about finding the prob & fix!! now, it should never appear
again ???? pkg_add was too new compared to that old dependency for it to
have been considered ?? just asking.
|
cross
|
|
response 11 of 29:
|
Dec 2 08:50 UTC 2010 |
No, it'll probably happen again (in fact, it did later, with another package).
The point about pkg_add is that it has bugs in it, and those bugs appear, at
first inspection, to be pretty deep into its architecture.
|
cross
|
|
response 12 of 29:
|
Dec 2 09:34 UTC 2010 |
Here's another interesting point about OpenBSD: they will categorize
security problems that affect their system, but weren't discovered
by them, as "reliability" problems. For instance, the recent OpenSSL
vulnerability (for which the FreeBSD project released a security
advisory) was listed as a reliability problem by OpenBSD.
Well, I guess their project can continue to claim that they've had
some artificially low number of security holes in "a heck of a long
time" if they just don't call security holes security holes.
At the last staff meeting back in March, Steve made mention of a
bug in FreeBSD's ftpd, citing that as a reason we should stick with
OpenBSD. Unfortunately, that *exact same bug* affected OpenBSD and
existed on Grex. Again, the OpenBSD project marked this as a
"reliability fix." Despite their supposedly superior auditing,
they didn't catch either of these problems.
So this is another way of saying that I don't buy OpenBSD's security
claims. I'm not saying they don't do a good job, but so does everyone
else. In a lot of ways, it appears they do a better job, but that's
partly self-selection on their part: when your definition of what a
"security hole" is is so tightly focused, it's not that hard to make
it look like you're light years ahead of everybody else, but are you
really? I say no. And in that case, the main rationale for why we've
stayed on OpenBSD for so long is, I think, demonstrably false.
|
remmers
|
|
response 13 of 29:
|
Dec 3 13:53 UTC 2010 |
I've done some setting up of FreeBSD systems, and as a result - although
I'm not nearly as experienced in the nuts-and-bolts of system
administration as Dan is - I've found FreeBSD straightforward to manage
and tend to agree that FreeBSD would be a better choice for a system
like Grex than OpenBSD is. The fact that FreeBSD supports modern
processor architectures better than OpenBSD is another point in its favor.
How stable is ZFS on FreeBSD nowadays? The last time I looked (which
was a while ago) the implementation was somewhat experimental.
FreeBSD is well-supported in the cloud, OpenBSD not so much I think.
Given the likelihood of Grex's moving to the cloud eventually, that's
another reason for abandoning OpenBSD.
Should cloud support factor into decisions we make now, even if we're
not going to move to the cloud quite yet? I'm thinking of Amazon's EC2
service, which is widely used by some big players (e.g. by Netflix) and
offers Linux and Solaris virtual machines but not FreeBSD at this point.
At the risk of being accused of heresy ;-), should we be considering
going with Linux?
|
veek
|
|
response 14 of 29:
|
Dec 3 14:19 UTC 2010 |
Solaris has stable zfs (from what i've seen at wrk but my exposure is
minimal) Why don't we create a test partition and see..
I think we are looking at it the wrong way (Linux/ZFS/etc).. We have a
bunch of volunteers right? A, B, C etc.. If A wants to try something
let him try it so long as he doesn't create unwanted work for B! OR in
other words B pre-approves the task.. Eg: If cross wants ZFS he should
get it - If Tsty approves of it beforehand (because TS/or-someone-
else will have to go reset the box/re-install the b0x if ZFS b0rks)..
so get it pre-approved by WHOEVER has to clean up the mess..
This won't create problems and heated debates about which is the
"better" solution. The actual "better solution" is ultimately people
using the box.. if you got 50ppl in a bullock-cart and 1 in a car.. the
cart is better simply because it serves more ppl..
|
cross
|
|
response 15 of 29:
|
Dec 3 15:44 UTC 2010 |
ZFS on FreeBSD is quite stable nowadays; certainly, production ready.
I think that it's always good to think ahead: cloud support (as you
put it) should definitely be a consideration. It may be a while
before we move there, but the whole world is heading in that direction
and I think it would be silly if Grex tried to resist that tide. I
believe one of the reasons we're in the malaise we are in now is that
we spent too long trying to hold back other tides with teaspoons.
|
remmers
|
|
response 16 of 29:
|
Dec 4 20:48 UTC 2010 |
Interesting - you were the resister and quite opposed to a move to the
cloud when I suggested it a couple of years ago. What's changed since then?
And how might a move to the cloud in the future affect our choice of OS
today?
|
cross
|
|
response 17 of 29:
|
Dec 5 01:04 UTC 2010 |
resp:16 I wasn't opposed in the long term; I was opposed in the
near term, and still am (in the near term). I don't think the
present offerings are mature enough, or offer a compelling enough
price point over our own hardware. I also think there are a host
of legal issues to be thought through, and I think most of the
technical benefits of virtualization can be realized by a combination
of a remote console capability at the hardware level, and maybe
virtualizing our own hardware (e.g., run Grex under Xen or VMware
or something, but on a computer we own and control).
That said, I think in the long term, jumping into the cloud is
inevitable. One cannot fight the march of time. Grex has tried,
and I think a lot of the current predicament is a result. I also
think that, in about five years, precedent will have been set for
the legal ramifications of running a service like Grex on a virtualized
hosting provider, and the price point will continue to get better
for the sorts of capabilities we'd like.
In other words, I don't think it's the best avenue of approach now,
but I think we would do well to prepare for it in the future.
|