In the staff conference I've been attempting to make the case
that our upgrade of OpenBSD should not be to 4.1, but rather to
what is currently called "4.2-beta". Since we haven't done very
well at timely upgrades in the past (and I'm definitely a big
part of that problem), I want to see Grex running the latest
version, so at least we aren't behind the day we finish the
upgrade. We've been lucky in not having many problems running
outdated versions.
This is going to be a long item. Sorry. I don't think I
can simplify it much. My contention is that 4.2-beta is close
enough to the final 4.2 that we can upgrade to this, and then
do a src upgrade and recompile to the final version. I've been
tracking 4.1-current since it started, and use -current machines
at work. I feel comfortable with it for my job.
Wednesday I sent email to two long time OpenBSD people, Nick
Holland and Henning Brauer, asking them what they thought of my
idea of using 4.2-beta. Nick responded; the emails are at the
end of this item.
We've been talking about this in staff. Here are some of the
responses I put there. Comments about the responses are in brackets.
[ comment after Dan expressed reservations about using -beta]
#11 of 18: by STeve Andre' (steve) on Thu, Aug 2, 2007 (21:50):
Dan, I'm using -current right now *in my job*. If I screw up, I'm in
trouble. I play with stuff a lot before jumping to it. Partly it's because
I've been living on -current for five years now on my laptop to test new
things, and partly to be able to jump to newer releases when something
comes out that I want to use.
The ports tree just underwent a soft-lock today: no new things are
going to be added unless there is a really good reason for it, and
folks are testing stuff now. Since I build my own packages, and have
for a few years now, I have a pretty good sense of where things
are, and they're pretty good right about now. KDE 3.5.7 is the weakest
thing right now and we're not going to use that. If we decide to
wait for the official release of 4.2 we're waiting 'till October
sometime, assuming we preorder the CD very early and get it earlier
than the posted date. Do we want to do that?
[ my response to Dan's comments. I think this is standalone]
#13 of 18: by STeve Andre' (steve) on Tue, Aug 7, 2007 (21:10):
I think I have perspective here, Dan. Mis-matching a kernel with userland
is *bad*. If an api to a kernel call changed between 3.8 and 3.9, we
could have been in real trouble. It does seem that we escaped this particular
bullet, and I'm very glad of that, so let's move on to the more important
thing, which is what version to upgrade to.
You are very correct in that we haven't had a good record in upgrading
the system overall. This is why I really want to see 4.2-beta in place.
First though, for benefit of those reading who aren't so familiar with
the way OpenBSD versions come out we should talk about that. When 4.1
came out in May of this year, the developers were working on what is
called "4.1-current". This is the 4.1 code base, with new things. In
six months from the introduction of 4.1, 4.2 sees the light of day--
this is what 4.1-current becomes. By the time 4.2 is released, development
of 4.2-current will have started and the process goes on.
4.2 will be officially available about November 1st. Before that can
happen 4.2 has to be frozen, since it takes time to make up the releases
and create the master disks, and then send them off to the factory to
make the 4.2 disk sets that will go on sale. At the end of each six
month development cycle, things start to slow down; things are not added
except in cases of fixes to things. This is where OpenBSD is, right now
on August 7th. OpenBSD can be thought of as two big parts: the operating
system itself, and the "packages", the some 4,496 or so programs that
have been ported to OpenBSD. The package tree is in a "soft-lock" state
right now: no new packages can be added, nor can minor upgrades to existing ones.
The focus has shifted to working on the testing of things, to ensure stuff
works right.
The operating system proper isn't quite to the soft-lock stage yet, but it
will be before long. Right now huge things aren't going into the source code
tree (the "src tree") but more little things. When the src tree lock occurs
folks will be testing stuff, and only real problems will be worked on; the
idea is to make the release as good as can be.
The packages are possibly more work to test than the operating system is,
hence the lock on the packages tree right now, but both will be in the
final testing stage soon.
Between the start and end of the development cycle for each OpenBSD version
there are people like me who live on OpenBSD-current. This means that on
my laptop I have the src tree, and every few days I recompile my world
so as to stay near the top of what's current in -current. There are a few
times when this is not advisable, such as the "Hackathons" when lots of
the developers get together and work on stuff which can make -current a
very unstable place, or when some huge new thing gets inserted into the
src tree for everyone to test. I've been running -current as my main
version of OpenBSD for five years now. Twice I've screwed myself: once
by upgrading during a hackathon, which left me with a system that had
problems until I manually dropped back to the state of -current just
before the Hackathon, and once during a Flag Day (where something changes
in an incompatible way, such that you generally need to reinstall) where
I didn't follow the published instructions, and had to extract myself from
the mess I made.
I don't know of any other op system today where I can live on the
development version, and be OK. The OpenBSD developers are very very
good about testing things which could impact the system, before they
go in. The src tree almost always compiles correctly--if someone puts
something in that doesn't compile (thus breaking the tree) they fix it
quickly, fearing the wrath of Theo otherwise. That doesn't happen
very often.
For the last 2.5 years or so, I've been compiling the entire package
tree myself as well, such that I have complete control over the systems
I use for myself and at work. Compiling the packages means having a
system that runs -current, along with all the code for the packages,
and following the ports tree (ports are the instructions for making
a package, a package is a compiled port, ready to use). I now have
a Core 2 Duo machine for package building, so I can build everything
in just under 2.5 days (about 58 hours). So once or twice a week I
bring the package machine up to -current, compile it, and then start
the package build. As I am writing this I'm getting ready for vista
to start up again; about 20 changes have been made to various packages
fixing small things.
Now, back to Grex. Grex has not had a good record of upgrades. This
has bothered me, and the last time I tried to upgrade the system what
I did was over-written by someone else, because what I did bothered
other staff people. I certainly didn't explain things well enough
back then, so I hope that this is an explanation of what I'd like
to see happen, and why.
We're at 4.2-beta right now, which means a mature -current which will
become 4.2 before long. There are changes to things going on daily,
but not the brand-new-virtual-memory-upgrade type of thing. I've
been working on this and feel comfortable enough with -current that
both my web server and fileserver will be running the latest -current
later this week. This is my job we're talking about: if I mess up
polisci.msu.edu that's bad. There is always stress when upgrading
systems, but I don't have qualms about -current.
OK then, once we run 4.2-beta, we're stuck with it, right? No, not
really. When the 4.2 src tree comes out, we can upgrade our tree
to that, and recompile. The kernel gets built first, then "userland",
i.e. the rest of the system. Then for completeness the installed
packages can be removed and replaced with the 4.2 set. That doesn't
have to be done if things like libraries aren't changed, but let's
assume that the packages all need to be replaced. We need to keep
track of what was installed and what config files there are, then
do a pkg_delete on them and a pkg_add for the new set.
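The bookkeeping described above (note what is installed, then pkg_delete and pkg_add) could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the function names are made up for this example, it assumes saved `pkg_info` output has one `name-version  description` entry per line, and it only builds the command lines rather than running anything. It ignores dependency ordering and config files, which would still need handling by hand.

```python
import re

# Hypothetical helper names; nothing here is part of OpenBSD's pkg tools.
# Assumption: each line of saved `pkg_info` output starts with "stem-version",
# where the version begins at the first hyphen that is followed by a digit.
_PKG_RE = re.compile(r"^(.+?)-(\d.*)$")


def parse_pkg_info(text):
    """Return (stem, version) pairs from saved `pkg_info` output."""
    packages = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        match = _PKG_RE.match(fields[0])
        if match:
            packages.append((match.group(1), match.group(2)))
    return packages


def plan_commands(packages):
    """Build pkg_delete / pkg_add command lines without running them."""
    deletes = ["pkg_delete %s-%s" % (stem, ver) for stem, ver in packages]
    adds = ["pkg_add %s" % stem for stem, _ in packages]
    return deletes, adds
```

Keeping the output as plain lists of strings means the plan can be reviewed (and reordered for dependencies) before anyone types the commands on the real machine.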
This is why I'm not worried about running 4.2-beta right now. If
you look at http://openbsd.org/plus.html you will see a list of
most of the changes between 4.1 and 4.2-beta. This isn't a 100%
complete list, but it's pretty good. There are some 1100 changes;
I'll bet that at least 900 of them apply to our platform (remember
that OpenBSD runs on 17 different hardware platforms). Another
advantage of using 4.2-beta is that all the patches which we'd
have to apply to 4.1 are already in 4.2-beta. Obviously more will
come about in time, but at least all of them to date are already
applied. Given Grex's speed at doing things, this is a good
thing.
Next, a few specific points to Dan's comments about using -current
on machines I maintain:
+ do I let the world get on my machines? No -- but Grex does, and
having the very latest version of OpenBSD on Grex *is a very good thing*.
Read the changes list; all those little (and not so little) things
are in 4.2.
+ what's the use on my machines, general or specific? Pretty much
specific stuff, to be sure. But the reasoning for the previous part
holds here, too: Grex *does* get pounded by vandals in just about
every conceivable way, so given that OpenBSD gets better with time,
which is more vandal proof? I say the later version. And yes, this
touches on our shameful record of not keeping up to date on stuff.
That's bad -- this at least gets us about 90% of the way to the next
version, which I think is better for us all.
+ who fixes things if they break? We do. Or we do it in what's
been our slow way of doing things. I'll not deny that we've been
pretty bad about that, but again, going back to the first point,
I think we're better off with newer stuff. Better off includes
having newer software. The developers have found that keeping
people off -current is needed for sanity's sake: someone who doesn't
know anything about OpenBSD shouldn't be there, because if something
horrid happens you have to deal with it. Running a -current just
before it becomes the next -stable is a different thing.
[my reasoning as to why Grex is different]
Yes Jan, Grex is different. Different in that we have all sorts of
uses going on at once, meaning that the latest version *really* is the
best idea. I'm really sad that the rest of Grex's staff doesn't see
this. That we've been so bad about not upgrading *is* bad, and I'm
definitely a part of that. This is why upgrading to something of less
than the latest version is so bad. We *NEED* the latest code. And
we're willfully not doing this.
------------------------------------------------------
Here is the email I sent to Nick and Henning
>Hello Nick and Henning,
>
>I have a favor to ask the two of you, which I hope you won't mind too
>much. I'd like your thoughts on an upgrade for a system I help maintain.
>
>Grex (grex.org) is a computer conferencing system that's been around
>since 1991. Since 2003 or so we've been running OpenBSD-i386. Grex
>offers a number of services, and basically looks like a text-based
>fossil these days. The staff of Grex (I'm one of them) hasn't been very
>good about upgrading to new versions of OpenBSD lately. Sadly, we're
>currently running 3.8 and about to do an upgrade to 4.1.
>
>I'm pushing to install 4.2-beta on it instead. So far my reasons
>have fallen on deaf ears. I'd really appreciate hearing from you folks
>about this. As I see it, 4.1-current is very close to 4.2, and the
>package tree being under its soft-lock is about as far along. Given
>that major changes aren't going into the system right now, and that
>I have been tracking -current on my own laptop, I feel perfectly safe
>in doing this. I've been building packages for a couple of years now
>and have a set for -current of a few days ago.
>
>I'm not sure I'd advocate the running of a non-stable version of
>anything else other than OpenBSD, but my problems with -current over
>the last several years have always been my fault, mainly remembering
>not to try things out during the Hackathon.
>
>Grex is a fairly dynamic place, where we offer shell access among
>other things. To my amazement, there are still people (and schools!)
>who use Grex for C/C++ programming, and learning about unix. That's
>one of the reasons we've been in existence, teaching. The downside
>of course are the vandals who try various things from Grex, or to
>attack the system itself. We've had to ramp down initial access to
>things because of this. Sadly, the net isn't what it used to be. I
>mention this because we get hit on all the time, by people trying to
>break various things. Thankfully 99% of the vandals are pretty inept
>but we do get intelligent nasty people from time to time. This is
>why I really want to see the latest version installed.
>
>Given that the staff of Grex is fairly bad at doing things like
>upgrades, I think that jumping to 4.2-beta is a reasonable thing to do,
>and then update the src tree later and come up to 4.2-stable. Given
>that the packages I've built have been pretty good and that I don't
>think we're using anything that doesn't build, I don't see the package
>issue as a problem.
>
>So then, am I out in left field? I am the only one who advocates
>this. Our upgrade is scheduled for this weekend, but I am hoping that
>comments from you might persuade folks to think about this, or your
>comments will shut me up. I'd like permission to put your comments
>(either way) in one of our conferences so that people can read them.
>
>Thanks to both of you for the neat things you've done for OpenBSD over
>the years...
>
>--STeve Andre
Here is the response from Nick:
>My thought is, 4.2-no-longer-beta-but-not-quite-release.
>
>If this is a NEW install, ABSOLUTELY, grab the most recent snapshot.
>On -release day or whenever afterwards, install the 4.2 release by
>just copying over the kernels and untarring the .tgz files (with the
>'-p' option!!), reboot and you are running 4.2 with little additional
>effort. 4.2 is 95+% done. It is unlikely anything revolutionary is
>going to happen between now and tagging day. If you asked me if you
>should install -current or -release in June, I'd probably have
>leaned towards -release. Today, no, -current is the answer. It's
>too close to 4.2.
>
>If you are UPGRADING the existing system, it's a bit more difficult
>a question. If you follow upgrade39.html, upgrade40.html, and
>upgrade41.html, you will be in fine shape on OpenBSD 4.1. The
>problem is, there is no upgrade42.html yet, and I'm not likely to
>finish it by this weekend. Or even start it. :)
>
>Now, if you wish to "roll your own" upgrade process, you can probably
>do so, just look at all the files that changed in etc41.tgz (release)
>to etc42.tgz (snapshot), and figure out which you wish to copy over
>verbatim and which you wish to patch.
>
>OR... just upgrade /etc to 4.1 instructions, and do the 4.2 changes
>later. All my machines running -current (which is ALL my personal
>systems) are running a 4.1-release /etc directory. USUALLY, you only
>run into problems doing this when you are trying to run new features.
>After I publish the upgrade42.html process, you can then do the /etc
>upgrades, reboot, and you should be up and running. Granted, I try
>to avoid making /etc changes so I have a lot more machines to test
>my upgrade42.html process on, but it works. Usually. :)
>
>
>Another option would be, since you are a number of upgrades behind,
>unpack etc42.tgz directory along with the rest of the *.tgz files,
>and simply re-configure the machine from scratch. Icky, but
>sometimes, it is easier. Make sure you have a hard console on
>the machine, I've done this by remote, but it's scary and I did lose
>a machine this way once (forgot to set a password on the root account,
>and no other accounts existed, so I couldn't log in by remote to
>finish the job!). However, I don't see a lot of benefit in doing this
>for your task.
>
>Really, however, this "4.1 vs. 4.2-pre-release" is kinda moot. You
>really should have updates and upgrades part of routine maintenance.
>Maybe just plan first (something) in December and June as your "upgrade
>day". Using 4.2-prerelease allows you to skip one of these upgrade
>days (or at least, makes it trivial), and that is not a good thing if
>it allows you to fall back to bad habits. :)
>
>So, my best reason I can think of for going with 4.1 over 4.2-PR is
>completely non-technical: it might help encourage a healthy upgrade
>habit.
>
>The other reason I should point out for avoiding 4.2 at this time is
>that in the unlikely event there is a serious security issue discovered
>between this weekend and Nov. 1, errata will not be published for it
>until Nov. 1, so you will be on your own to "fix" the problem. In your
>case, that's really not a huge issue, as you have been on your own in
>that way for over a year now.
>
>On an upgrade of this "size", I'd recommend practicing at least once
>beforehand on an off-line system. I'm sure you know that, I just
>feel the need to always point that out. (upgradeXX.html is my most
>terrifying commit of the cycle -- I realize if I screw up, LOTS of
>systems will break.)
>
>"-stable" is an unfortunate name for our patch branch. It does imply
>(incorrectly) that everything else is "unstable". I just made a change
>to the FAQ (faq5.html#Flavors) that makes it clear(er?), if you are
>looking for the "Best" version of OpenBSD, it's -current, not -stable.
>Really, in my mind, the ONLY reason to install -stable over -current
>is packages: If you install a -current today and a few packages now,
>then wish to install some more in a month, you will probably have to
>upgrade to the new -current at that time to run the new -current
>packages. -stable makes it much easier...you have six months where
>you can install software without worrying about the OS underneath.
>
>Nick.
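Nick's "roll your own" suggestion, looking at which files differ between etc41.tgz and etc42.tgz, could be approached with something like the following Python sketch. It is a rough illustration, not a tested upgrade tool: the function names are invented here, and it assumes both sets have been downloaded as ordinary gzip-compressed tar files. It only reports which regular files were added, removed, or changed; deciding which to copy verbatim and which to patch is still a manual job.

```python
import hashlib
import tarfile


def digest_members(path):
    """Map each regular file in a .tgz archive to the SHA-256 of its contents."""
    digests = {}
    with tarfile.open(path, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def compare_sets(old_path, new_path):
    """Return (added, removed, changed) file name lists between two sets."""
    old = digest_members(old_path)
    new = digest_members(new_path)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(name for name in set(old) & set(new)
                     if old[name] != new[name])
    return added, removed, changed
```

Calling `compare_sets("etc41.tgz", "etc42.tgz")` (paths assumed) would give three lists to work through; the "changed" list is the one that needs a careful diff against the local /etc.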
81 responses total.
[This is really more appropriate to the garage conference, not coop.] It sounds like he suggested just upgrading to 4.1 via the procedure we'd already planned on and then moving to 4.2 when it becomes available. He gives a number of reasons for avoiding 4.2 right now, including ``encouraging a healthy upgrade habit.'' 95% is not 100%. Steve, I've asked you a number of times, and this time, I'd like a substantial technical answer. Why, *technically* should we move to 4.2 *right now* instead of 4.1? What does it give us that 4.1-stable does not? Again, I'd like a technical answer that's relevant *right now*, not just, ``well, *all* of the changes!'' I'd like a *technical* argument, not a philosophical one.
The two are intertwined, Dan. Some people see that and others don't. As I have said before, read the change log. Only that isn't good enough for you, as you want 'hard' answers. They're there, but intermixed with the reasons (philosophy) why I state this. And now, Nick Holland.
Btw: here are some of my comments from the discussion in the staff conference,
to fill in the gaps around what Steve posted.
[The first paragraph of #8 is in response to Steve noting that we were running
the 3.9 kernel on top of the 3.8 userland as an experiment to fix our TCP
connectivity problems.]
#8 of 19: by Dan Cross (cross) on Thu, Aug 2, 2007 (14:53):
Regarding #7; Just as a test. Running the next-release kernel with the old
userland is, believe it or not, a supported thing to do: it's the standard
method for doing an in-place upgrade, and the newer kernels tend to be
backwards compatible with older binaries (generally, everything except a few
low-levelish utilities work). I was not at all surprised to see that it
booted.
I'm strongly opposed to upgrading to a beta release. If we're that close to
the 4.2 release, then we should upgrade to that. But we really need to do
*something* to fix the various TCP/IP issues we're experiencing. Jan was
going to do an upgrade to 4.1 about a month ago, but it never happened. :-(
#12 of 19: by Dan Cross (cross) on Fri, Aug 3, 2007 (01:20):
Steve, let's get some perspective here.
Given our track record, we do an operating system upgrade on grex about once
every 1.5 *years*. That is, about once every *3* OpenBSD releases. Are you
seriously suggesting that we run Beta software for the next 1.5 years? That
software is in Beta for a reason; it is not production ready. It will be
out of Beta and production ready at some point in the future, one hopes, but
it isn't there yet. But it is in Beta because people know there are still
bugs and untested code in it. Is that really something we want to lock
ourselves into for a year and more?
Compare this to booting grex on an OpenBSD 3.9 kernel for a couple of days
or so as an experiment. Note that /bsd is the OpenBSD 3.8 kernel; the next
time Grex reboots, we'll be back to exactly where we were this morning.
This is not something we've locked ourselves into for a year and a half; it
wasn't even an upgrade, or anything of that nature. In fact, I'm really
rather surprised there has been as much attention paid to it as there has
been, particularly from a group that once replaced the fork system call on
another version of Unix! The OpenBSD people even support it; heck, note
this document: http://www.openbsd.org/faq/upgrade39.html expressly has as
a step, ``Reboot on the new kernel ... the new kernel will run old userland
apps....''
Note also that, from what I understand, the OpenBSD people will flame you if
you post to one of their mailing lists and aren't running a GENERIC kernel.
In fact, it seems like they'll flame you for a lot of things, so the quality
of information one gets from knowing that the OpenBSD people will flame you
over this or that isn't very high. (That's a joke...but you guys are
usually a really tough crowd.)
Now, I understand that you run OpenBSD-current on your work and personal
machines, and that's all well and good, but please consider the following:
1) Do you give access to those machines to anyone in the world?
2) Do you use them for general time sharing, or for servers running
a rather limited set of software? E.g., is this a web server, in
which case, the thing most often used is, say, Apache and maybe some
disk package?
3) Are you not paid to fix those machines if they break? Who's paid to
do that for grex? We've already had lots of downtime from running
supposedly stable software that no one was available to support. Now
we're thinking about running software that the developers have expressly
told the world is still buggy?
I can assure you that where I work, we run some pretty exotic kernels.
And if they break, it's not just that I or any of my co-workers get in
trouble, but rather that it shows up on the front page of the New York
Times. That said, we run those systems in a tightly controlled environment
running, when you get down to it, a rather limited set of software that we
understand very, very well (because we wrote it). I would certainly never
champion running a general purpose timesharing system on our platform.
So to recap, let's not compare apples and oranges. Booting a kernel for
a couple of days is supported and not locking us into anything for over
a year; claiming otherwise is FUD. Anecdotal evidence is not sufficient
to prove that it's going to work in our fundamentally different environment.
Given our track record with bugs in OpenBSD, I'm wary of running something
that we *know* to be even more buggy. Even if we accept the position
that the software is fine and the problems are our misconfigurations,
I'm not comfortable throwing software bugs into that mix.
Personally, I'd be really rather happier to run OpenBSD 4.1 for a while
than run 4.2-Beta. We're going to be out of date shortly anyway; why not
be out of date with a stable release as opposed to a slightly newer beta?
[Jan pointed out that we've had this discussion before, and always decided
to run the current stable software, as opposed to betas, pre-releases, or
-current, etc.]
#15 of 19: by Dan Cross (cross) on Wed, Aug 8, 2007 (12:58):
I concur with Jan.
Beta exists because software is not yet release quality. ``Works for me''
is not a reasonable criterion for installation here, even if it's working for
someone in a production environment; grex is too different.
As for booting a 3.9 kernel on top of a 3.8 userland. If it's so dangerous,
then why do the OpenBSD people tell you to do it in their upgrade procedures?
Regarding #2; Nope. You're the one pushing to do the odd thing. The burden of convincing us is really on you. If you want to upgrade to OpenBSD 4.2-BETA, then please post a compelling technical reason. ``It's better!'' doesn't convince me.
Another question: Steve, you're the only one who wants to do this. I'll be honest with you; I don't want to support failures in non-production software, and neither, I don't think, does anyone else on staff. Granted, my ability to fix those sorts of failures is limited to begin with, given that I'm not local. But anyway, if OpenBSD 4.2-BETA breaks, what sorts of guarantees are you willing to give the Grex community about your availability to fix it? If there's a kernel crash, will you be available to debug and fix it?
Another thing: We have serious issues with TCP incompatibility between Grex and machines running certain versions of Linux 2.6 (including the GMail servers and UMich's email servers) as well as Windows Vista. We have strong reason to believe that upgrading Grex to OpenBSD 4.1 will fix those issues; in your email to board just now, you say you would like to see the upgrade to 4.1 delayed. Won't this prolong the amount of time we experience those connectivity problems? I.e., won't it extend the amount of time that large chunks of the Internet's population cannot connect to us?
My question is: Does upgrading to 4.1 now preclude us from upgrading to 4.2 -beta *at any time we choose*? From what I can see, upgrading to 4.2 -beta is the next step after we get to 4.1. Since we're doing a step-wise upgrade, can't we just stop at 4.1 and then move to 4.2 -beta if there is a staff consensus to do so?
No, that makes too much sense.
Regarding #7; I don't see why not.
I think realistically we need to choose one or the other; the full release of 4.2 comes out in November, and the odds are slim that anyone will want to do another upgrade between this one and that time. STeve, I'm not convinced. Maybe the beta version is as stable as you say it is, but even so, it makes me nervous just because of the name. So without a strong advantage, I don't see what it buys us. It's possible that the next upgrade won't be for another year and a half. Do we want to be running on a beta version for that long? I'd rather not.
I for one consider it a *bad* idea to run anything pre-release, and have chewed out an assistant for doing so after giving the reasons why we don't. Either wait for the first of November (well, I'd wait until the end of November, in case any early errata comes out) or upgrade to 4.1 and do the upgrade later. I do think we need to get into the habit of discussing each new release and consciously deciding whether or not there is a case for upgrading, rather than letting upgrade cycles slip by through negligence. I think rushing to have a flag day every 6 months is ludicrous, but so is failing to at least check out what is being fixed.
Why do you like to chew people out so much, maus? You're not particularly clever yourself, you know.
Bad attitude. I'm just now realizing how abrasive I've been and this week have started working on making amends to certain people. If you want more information, ask me privately; I don't need to make a sob-story in front of world+dog.
This response has been erased.
For the most part, I enjoy and encourage maus' comments, and am not offended by them.
Nor am I. I find Maus's comments insightful and not at all rude.
I learn a great deal from maus, every time I read something he's posted. Abrasive, well, yeah, from time to time.
I received a response from Henning Brauer (the second OpenBSD developer I sent my query to). His response: "nick has said everything ;-)"
Coleen, if Grex staff had a better record of doing timely upgrades, then I wouldn't be making a fuss. But the sad record of our upgrades is that we won't be doing an upgrade soon. If we move to 4.1 today, we'll be at the current version for about 9 weeks--folks who order CDs early usually get them early, so we could have an official CD in 9 weeks or so. If we kept copies of the source tree before 4.2 became 4.2-current, we'd have 4.2 around Oct 1st. Let me reiterate that two long time people working on OpenBSD agree with my assessment of where 4.2-beta is, and would use it themselves. Each day that we argue about it brings us a day closer to the final code changes that will make 4.2. Something I probably haven't made clear enough is that we aren't committed to 4.2-beta forever. When 4.2 comes out, regardless of how we get the code, we can recompile the kernel, reboot, recompile the world, and change to the official set of packages *IF* any libraries have changed since now. Grex would be down for perhaps 3 hours during the time to compile the kernel and user-land. At that point we'd be at 4.2-stable, meaning the stock off-the-CD version.
OK, but *why* is it better?
The time between OpenBSD releases is 6 months. We have nearly three to go
until 4.2 comes out. Steve, so far, your argument for *why* we should do
this just isn't compelling, and if you read the emails that you yourself
posted, they don't really back you up. In particular, Nick had the
following to say [slightly edited for formatting]:
"If you are UPGRADING the existing system, it's a bit more difficult
a question. If you follow upgrade39.html, upgrade40.html, and
upgrade41.html, you will be in fine shape on OpenBSD 4.1."
This indicates to me that he thinks it would be just fine to run under
OpenBSD 4.1. Am I missing something here?
He also says this [again, slightly edited for formatting]:
"The other reason I should point out for avoiding 4.2 at this time is
that in the unlikely event there is a serious security issue discovered
between this weekend and Nov. 1, errata will not be published for it
until Nov. 1, so you will be on your own to "fix" the problem. In your
case, that's really not a huge issue, as you have been on your own in
that way for over a year now."
"between this weekend and Nov. 1" is a long time, and certainly, the
probability of finding some sort of show-stopping bug between now and then
is greater than after the software is released; that's why it is called beta
software, and not a release.
In any event, Nick says we'll be "in fine shape on OpenBSD 4.1" and his
other comment doesn't exactly give me a warm, fuzzy feeling. And you've
posted no real technical rationale as to why we should upgrade to 4.2.
"It's better" just doesn't cut it with me, and whether we're out of date
*now* or in twelve weeks is immaterial to me, since we're going to be out of
date at some point anyway. As it stands, we're not *that* close to the
OpenBSD 4.2 release to justify going with pre-release code. It's not like
this stuff is coming out next week and thus we could make a solid case for
delaying.
And you've failed to address the issue of delaying the upgrade continuing to
impact actual, real users. Honestly now; these aren't idle suspicions. We
know that users cannot connect to Grex *right now* because we're out of
date. What about them? We can't get email from GMail or UMich; what about
that? I'd really, really like to see you address that question. I don't
see a compelling reason to
And you've failed to address the issue of support.
As it stands now, barring any major objections, I plan on doing the upgrade
to OpenBSD 4.1-stable as already planned.
Continued improvements in the code. It's that simple, and that complex. One of the ideas in OpenBSD is fixing little things, just because they aren't good style, or whatever. I think there is a Japanese term for this, called something like "Kizen", continued improvement. So random little things are in there, and larger issues that while not a security problem are annoying. One example is a pointer arithmetic problem in make, which has been around for 10 years. When crashed into make just sat there, eating up some amount of cpu and never doing anything. Not a security problem but annoying. An example of a reliability problem, a problem that also existed in Free- and OpenBSD for years is this: CVSROOT: /cvs Module name: src Changes by: dim@cvs.openbsd.org 2007/07/05 03:04:04 Modified files: sys/sys : socketvar.h Log message: From FreeBSD: Fix a bug in sblock() that has existed since revision 1.1 from BSD: correctly return an error if M_NOWAIT is passed to sblock() and the operation might block. This remarkably subtle macro bug appears to be responsible for quite a few undiagnosed socket buffer corruption and mbuf-related kernel panics. "diff is correct" todd@, "should go in asap" markus@ This is a "CVS entry", an example of what a submission to the src tree looks like in the Concurrent Versions System, the program that keeps track of source code, and revisions for each file, etc. I have seen a couple of panics on Grex which puzzled me. I will not say that this is the cause and fix for it, but it wouldn't surprise me at all. Things get better with time. The three main (well, maybe four now, with Dragonfly) BSDs get and share things all the time. This would certainly be something useful to have. Look at http://openbsd.org/plus.html for a list of most of the changes since 4.1 came out (I say most, because sometimes no one updates that in a timely manner). Someone on staff said that it didn't matter to them that 1100 some things have been made. 
It does to me, for two reasons. The first is that the annoying little things are nice to see go away. That's nice, but not serious. The more important reason is to have problems fixed. A number of times I have seen vandals doing really bizarre things on Grex. One that comes to mind was a small program that opened up a number of files, and did multiple (like 1000s of) opens and closes on them, over and over. This didn't do anything here, and I classified that person as another vandal wannabe, which we get by the boxcar load. But perhaps a couple of months after this I was talking at Penguicon with friends and heard of a problem with some less popular unix type system (aix? hp-ux?) that had some sort of problem with repeated file closes. I believe that person I saw was attempting an attack. It is ABSOLUTELY the case that we get attacks here all the time. In my root directory there are hundreds of vandal accounts where I have copies of their code. Most are the now familiar things like Windows RPC attacks, perl scripts to attack web sites, etc. A few are OpenBSD specific, like the attack for the SSH bug in OpenBSD-3.1, etc. Most, indeed the vast majority, aren't even specifically aimed at us. But there are people who comb through the OpenBSD CVS logs to see what's been changed, and then twiddle with that item to see if they can cause a problem. Every bug fix in OpenBSD is potentially something that some vandal will try to see if they can exploit. 99.9%+ of the time they haven't a clue, but the intent is there. This brings up the idea that being on any open source system means that the vandals have access to the same stuff that we do, which means that there is a potential problem there. We've been *AWFUL* at keeping Grex up-to-date. I'm part of that problem. Because of this I want to see Grex running the latest code. And, we can upgrade to 4.2 in a matter of months.
Dan's comments came in as I was doing other things here at work. I have to work on other stuff at the moment. I'll point out right now that with 4.2-beta, all the errata is in the src tree right now. There is no errata yet--if something has been fixed, it's there right now. When it comes time to upgrade to 4.2 all the other changes will be in there as well, at which point we're at 4.2-stable and ready to add patches as need be.
Regarding #22; A 10 year old bug in make that no one's run into before, and a problem with a sockets related macro? Those (a) sound not so serious (what panics are you referring to that might have been caused by the networking change? I've never seen a panic with a stack backtrace going into the sockets code on grex). (b) They both also sound like things that *might* be in 4.1-stable, and (c) they both sound trivial enough that we could cherry pick them into the stable source tree if necessary. Neither compels me to want to install a beta release of the operating system, especially since it's not clear that either affects us at all. Of course, we can upgrade to 4.2 in a matter of months, but why not upgrade to 4.1 now, and then roll to 4.2 when the time comes? Regarding #23; That's what I'm talking about. No one has the sort of time available to support a pre-release version of the operating system. Note also: the most serious bug fix here is from FreeBSD, not organic to OpenBSD. I really, really wish we were running FreeBSD instead.
Dan, that is completely ridiculous. You should know that the BSD projects get stuff from one another all the time. Come on! For any one bug, there has to be an origin, right? All three (four?) projects get stuff from elsewhere. To cast aspersions on OpenBSD because of one example is not reasonable.
My point is that OpenBSD is not the end-all-be-all of operating systems, Steve. We originally went with it because of its supposedly bullet-proof reputation, but it's proven, again and again, to be less than deserved. Frankly, I think FreeBSD would be a better platform. But, that's my opinion, and not really relevant to the discussion at hand, which I'd really like to concentrate on.
I have never said it was the end-all. Hardware problems do not count here, and we've definitely had some. So yes, let us concentrate on the discussion at hand, and deal with that. Dan: why are you so against upgrading to the pre-4.2 right now, and then upgrading later? You know that we can grab a copy of the src tree right when 4.2 becomes 4.2-current, and that will be at least one month before the earliest CDs are shipped, making it about mid to late September that we could get it.
Steve, the burden is on you to convince us that we *should* upgrade to 4.2-BETA, not for me or anyone else to convince you why we *should not*. You are the one in departure from established practices, and to whom practically everyone with an opinion has said, ``that is not a good idea.'' Even the OpenBSD people whose opinions you posted expressed some reservations and said that 4.1-stable would work fine for us. As for why I am against this, I've posted that reasoning repeatedly. But to recap:
a) The software is in beta for a reason: it is not production ready; it will not be officially released for nearly three months. I do not think it is acceptable to run pre-release software on Grex.
b) Delaying now will continue to negatively impact our existing users.
c) There is not a sufficiently compelling reason to go to the beta release. A one-line code change, that we can cherry pick ourselves, doesn't cut it.
d) There is a support issue that has remained unaddressed.
e) Given our track record, I think it's unlikely that someone will do an upgrade to 4.2-stable once it's finally released. Thus, we are likely to be running on a Beta for a long time. I do not think that is acceptable.
And the other reasons I've posted throughout this thread.
If our remote upgrade strategy proves successful, and we upgrade to 4.1 now, and Dan documents the process thoroughly in Grexdoc as requested, wouldn't it be a pretty straightforward process to upgrade to 4.2 in a few weeks when it becomes available and the upgrade process is still fresh in people's minds? If so, I don't see why it would be a major problem to wait. I'm having trouble understanding why this issue has generated such heated and lengthy argument.
(While I was composing #29, more argumentation slipped in via responses #27 and #28. Still mystified here...)
If we were good about upgrades, I'd definitely agree John. But we haven't been, which is why I've been advocating using 4.2. Remember, we should be upgrading every six months.
Given that we are bad about doing upgrades, why then do you think we'll upgrade to 4.2 once it comes out?
Regarding #30; I wouldn't consider it argumentation, but rather discussion. Hopefully, Steve feels the same way. Steve? Surely you don't think we're arguing?
Not only that, but if we are bad about doing upgrades, why then should we delay the upgrade to 4.1 when we have someone ready, willing and able to do it now?
Re #31: I guess my point is that if upgrading to 4.1 is successful *AND* Dan thoroughly documents the procedure, we're in a position to be better about upgrades next time. Re #33: Discussion and argumentation aren't mutually exclusive. But if staff members are going to split hairs over words like *that*, I'd have to question whether they're making the best use of their time.
You know, I dislike any discussion that starts with someone pulling someone out of their hat and saying "here this guy is a big expert and here is what he says". I have been, at various times of my life, the indisputable world's number one expert on a variety of narrow topics, and I can say from personal experience that experts are, as a rule, full of shit, and their opinions are just personal opinions like everybody else's. Now, I don't generally believe in trumpeting my own expertise, because, as previously mentioned, I don't believe in experts so the whole thing is fundamentally hypocritical, but for those who love experts, I'd like to say that on the subject of upgrading Grex, "Hello, I'm the expert." Though lots of people helped, I was the one who took the lead on upgrading Grex first to the Sun 4/670, then to the first OpenBSD system, and then all the upgrades to OpenBSD since then. The documentation that exists for Grex's upgrade procedures was written 95% by me. The last time that someone other than me upgraded Grex's OS was probably when we moved from the Sun 3 to the Sun 4/260. I think that was mostly Greg Cronau. I think STeve's experts are thinking in terms of systems where nearly all the software is OpenBSD software in a stock configuration. You upgrade them by applying the OpenBSD upgrades, and you are done. Any programs you have on the system are probably Perl scripts or python programs, and they'll work just fine. Grex, alas, has a substantial amount of our own custom C code that all needs to be rebuilt after each upgrade. We have various strange local configurations like the suidbin area, that need to be regenerated. Upgrading Grex is substantially more complex than just installing a new version of OpenBSD. We don't do it often, because I hate doing it, and until Dan showed up, we had no other volunteers. Maybe he can evolve a faster and more efficient upgrade procedure and we'll be able to do it more often. That'd be great.
STeve's expert speaks of upgrading now to the 4.2-beta, then when 4.2 is released, we just apply the diffs and, voila, we have 4.2-stable. But, oops, upgrading Grex isn't that easy. How much of our code will we have to rebuild after that? Will the diffs apply correctly to the programs we moved to /suidbin instead of leaving them in their normal places? I don't know. Neither does STeve's expert who's never heard of suidbin. If we are on a -stable release, and a major bug appears, then a patch gets released. We give the patch a quick look over to make sure it isn't trying to patch something we moved to a different place, and we apply the patch, maybe after some simple mods. If we are on a beta release, and a major bug appears, then there is no patch. I think we need to make our own patch or do a full upgrade to a newer version of the OS. Yes, it's possible. No, it's not terribly easy, and it's much harder because we run a modified version of OpenBSD. I can't get hot and bothered about the fact that we will be missing out on 3 months of improvements to OpenBSD if you go with 4.1-stable. So what? If any of these was a really major security fix, there'd have been a patch to 4.1 released. By our standards, only 3 months obsolete counts as brand spanking new. In any case, an effort is now underway to buy a new computer. I expect that by the time that is ready to go, 4.2-stable will be out. Or if things proceed at the usual Grexian pace, 6.7-stable. That'll be a great opportunity to upgrade again. Almost makes me want to not bother upgrading the current Grex. As far as I know, there is only one thing besides general ancientness that's bothering us about the current 3.8 software: the TCP bug that keeps me from being able to connect to Grex from my computer. Rumor has it that the 4.1-stable will fix that.
If you told me we needed 4.2 to fix that, then I might be tempted, because there is something that we actually need enough to do something that is going to be as wacky and likely to cause trouble down the road as upgrading to a Beta release.
Well, we certainly have a disagreement. You and I see OpenBSD very differently. I know that we both want what we think is good for Grex; it's just that the paths to get there are different. As for the upgrade procedure, I think it's less a matter of the how-to, than finding the time to do it. Time is the single most rare commodity these days. But certainly documentation is never a bad thing.
Many people slipped in. But we've had this same discussion before every upgrade, and I'm tired of it. I'd much rather people worked on getting a backup made so the upgrade could proceed. I, alas, have more than used up the little time I have available today.
Jan, I have to go, but the packages will all be available so I think that's covered. The next things to think of are our custom items. What would force a compile would be a major version number bump in a library. That could happen, especially if some bug is found. A disgusting way to get back up would be to make a symlink between libsnarf.so.2 and .3 such that programs linked to .2 would still work, as we compiled things. This is one of the things to consider when using -current, but we're close enough to the release date that I don't think that will happen, which is why I think we ought to use it.
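The symlink workaround described in this response can be sketched as follows. This is only an illustration run in a scratch directory: `libsnarf` is the poster's placeholder library name, the file contents are fake, and the helper `link_old_major` is hypothetical. Real OpenBSD shared library names also carry a minor number, which is left out here to match the post. The sketch shows the mechanics of pointing the old major-version name at the new file; it says nothing about whether binaries linked against .2 would actually behave correctly against .3.

```python
import os
import tempfile

def link_old_major(libdir: str, name: str, old: int, new: int) -> str:
    """Make lib<name>.so.<old> a symlink to lib<name>.so.<new> inside libdir."""
    target = f"lib{name}.so.{new}"
    link = os.path.join(libdir, f"lib{name}.so.{old}")
    # Refuse to create a dangling link if the new library is missing.
    if not os.path.exists(os.path.join(libdir, target)):
        raise FileNotFoundError(target)
    os.symlink(target, link)  # relative link, resolved inside libdir
    return link

# Scratch directory standing in for a library directory (assumption).
scratch = tempfile.mkdtemp()
with open(os.path.join(scratch, "libsnarf.so.3"), "wb") as f:
    f.write(b"placeholder, not a real shared library")

link = link_old_major(scratch, "snarf", old=2, new=3)
print(link, "->", os.readlink(link))
```

A full rebuild of the affected programs is the clean fix; the link only keeps old binaries loading in the meantime.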
Steve, we're just not that close to the release date. That's still several months away.
Regarding #37; Well, I have time this weekend, and am going to do an upgrade to 4.1-stable. If people want to do an upgrade to 4.2-beta later, and it's decided that that's the thing to do, then we do the upgrade from 4.1-stable. Really, I don't see that as being a problem at all. Barring any strong objections, that's what I'm going to do. Regarding #38; I have a plan for the backups. We've got space on the /grex partition to back up the SCSI drives. Then we just leave that drive quiescent during the actual backup.
The official release date is November 1st. People get early CDs about mid to late October. It takes about 5 weeks to get the release masters ready and get CDs back from the factory. Assuming October 25th as the date for early releases, we're looking at September 17th for the cut off date for 4.2-stable. Today is August 10th. That's one month and one week left till we could get a copy of the src for 4.2. So it isn't several months. It's five weeks.
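The date arithmetic in this response can be checked with a short sketch. It assumes the year is 2007 (taken from the dates elsewhere in this item) and uses the poster's own figures: early CDs around October 25th and about five weeks of production lead time. The function name `src_cutoff` is made up for illustration. Counting back exactly five weeks lands a few days after the September 17th given above, so either date should be read as approximate.

```python
from datetime import date, timedelta

def src_cutoff(early_cd_date: date, lead_weeks: int = 5) -> date:
    """Count back the CD production lead time from the early-CD date."""
    return early_cd_date - timedelta(weeks=lead_weeks)

today = date(2007, 8, 10)        # "Today is August 10th" (year assumed)
early_cds = date(2007, 10, 25)   # early-release date assumed in the post
cutoff = src_cutoff(early_cds)

print("estimated cutoff:", cutoff)
print("days until the post's Sept 17 date:", (date(2007, 9, 17) - today).days)
print("days until the computed cutoff:", (cutoff - today).days)
```

Under those assumptions the wait for the 4.2 source is roughly five to six weeks; the official release on November 1st is a separate, later date.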
I don't see why we can't move ahead as planned this weekend (as long as the hardware cooperates). I have seen no compelling technical reason to move to the -beta. I may not understand everything correctly, but it seems to me that moving to beta leaves us MORE vulnerable, simply because staff availability is our choke point.
The *official release* isn't slated to happen until November 1. What about an 11th hour change that goes into the `release' as errata? But that is an academic distinction. You still haven't clearly articulated why we absolutely *need* to upgrade to 4.2-BETA. I'm a software engineer, I guarantee that I will understand any technical argument you can throw at me: convince me. Also, you have not articulated why we shouldn't do an upgrade to 4.1-stable now, while we have resources to do so, regardless of whether we upgrade to 4.2-BETA at some later point? You yourself said time was our biggest missing resource. I have time now; why should I *not* do an upgrade to 4.1-stable tomorrow? Nor have you addressed any of the other concerns that I raised. If the totality of your argument is, `it's better!' then I will remain unconvinced.
Actually, I would like to see the upgrade done this weekend. If something goes poorly, Dan will have time to fix it. If we delay, the upgrade will probably get delayed until sometime after November because of known staff time constraints. Why should we not try to fix the connection/email problem right now?
re #29: John wrote:
> I'm having trouble understanding why this issue has
> generated such heated and lengthy argument.
I'm not. In my opinion it's not really about the difference between 4.1 and 4.2-BETA, it's a proxy battle about control over the system between two participants determined to demonstrate that they're the alpha geek. They may not even realize that's what they're doing, but I'm convinced that's the best explanation to fit the facts. Being somewhat inclined in that direction, I used to love a good technical argument myself. Then I grew up and realized that unless there's a critical flaw involved, fewer than 1 in 1000 people truly give a rat's ass what revision of software their computer is running.
Wow. I think I'm going to cry now. :-)
Admitting you have a problem is the first step towards finding a solution. :-\ If it's any consolation, I'm leaning your way in the actual dispute. It's just that I think the whole thing is kind of pointless, except insofar as it REALLY needs to be established whether STeve retains the authority to stonewall any plans he doesn't approve of.
I'm adding you and remmers to the list of people who've made comments I agree with in this item. What I find odd about STeve's "argument" is that part of it is based on the lack of staff time. The reason I find this odd is because it presumes the status quo of relatively inactive staff will continue. From a historical standpoint, he's probably on pretty solid ground. OTOH, I am not alone in calling for "new blood." Perhaps it's naive of me to expect grex will soon have that new blood, and an upgrade to 4.2 would then not be such a big deal. Anyway, I just think STeve's reference to time reveals an "old" mindset compared to what others are working towards now.
If we don't have staff time to maintain what we've got, we definitely don't have the staff time to babysit and constantly rev beta code.
Sorry Mike, you got that dead wrong. I'm thinking of what's best for Grex. We're doing an upgrade, which is good. But since we're bad at timely upgrades, we'd be in the best position to get to the most recent version. Two OpenBSD developers agree that this is the way to do it. Sadly, Grex staff doesn't "get" OpenBSD. The board doesn't understand, and so can't raise an objection. It's not about "alpha geek". I don't give a shit about that. I have no idea how you'd ever measure it, anyway. Cyklone, if I'm of the old mind-set, so be it. But show me where one can get six-packs of time, will ya? If we had the staff of a few years ago where we had 11 roots, we could do this upgrade, and then do another later. That would be great. Dan, you do an extremely good job of finding reasons not to do the upgrade. You have a future I think, in the legal profession should you ever tire of computers. That isn't meant as a jab. To respond to your *official* release statement: at some point in September (or later) the code is going to change from 4.2 to 4.2-current. At that point we know what 4.2-stable is. Now, the "errata" that you speak of on November 1st is all available in the CVS logs. You know that as well as I do. If you subscribe to the CVS change logs you'll see things. Bad stuff is identifiable. You have to know something about code, but it's all there. It would certainly be possible to add those if we deemed them important. But I'm not going to push any more. Folks don't see what I'm saying and I can't do any more than I have, so I consider that I've "lost" this, and we'll upgrade, which is good, but be in the position of being behind in about 2.5 months. Oh well. Let us talk about two other points. Point one is that Provide has cut back hours that staff are there, and might not be there at all on Sunday. Did anyone call John A to see if he'd be around such that perhaps he could reboot the box for us if things get into a weird state?
Point two is the question of whether telnetd will compile and run right on 4.1. It should, I think, but I'm not sure. If there are any parts of it that know about ipx packets that will have to get changed as 4.2 no longer supports them. Point three (OK, I can't count) is that I don't have Picospan ready yet. I am quite busy at work trying to get systems ready for the coming academic year, and I just found that my sole 4.1 box is dead. So I'm trying to get more hardware together to make such a box, but I have limited time tonight so I'm not sure I can get 4.1 up. Picospan proper should be a breeze to compile, taking about 1 minute to make.
The physical part of the box for compiling Picospan will be set up before I go to bed. I would then help STeve do the OS install and compile except that he won't be home until after midnight and tomorrow I start my annual 50+ hr week at work (I just love working 10-20 hrs/wk all year except this one fun week from hell!) and have to be getting to bed shortly.
Thanks for that Glenda. Assuming the little Dell you stole from the cats works, we should have picospan for 4.1 tomorrow some time.
I'll take care of telnetd; it shouldn't care about IPX at all. No, we didn't call provide, but then, we anticipated a lack of onsite coverage, which is why we carefully worded the notices about the upgrades to say, ``the weekend or longer.'' I think Picospan is the biggest issue. The general consensus at the board meeting was that we should upgrade regardless of whether Picospan had been compiled, falling back to fronttalk if it really wasn't ready. Steve: in the worst case, could you move the source to grex and compile it there, then remove the source and leave the binary? Sort of like the anti-"make clean"?
#53 slipped in. Big thanks, Steve!
STeve, those "six packs of time" will come about when (a) more people volunteer, and (b) they receive the proper training and support from existing staff (and perhaps former staff). Rather than just ignoring that point, what are you willing to do to make it a reality?
Re #54: Fortunately, Picospan is not as mission-critical as it once was, since we have Fronttalk as a fallback. It's not in a completely polished state, but I've had enough experience with it to know that it's pretty stable and is 90+% indistinguishable from Picospan. So I agree with folks who say that we shouldn't let whether Picospan is ready or not affect the timing of the upgrade. As to Grex's other custom software - I was pretty heavily involved with the upgrade to 3.8 and found Jan's Grexdoc CVS system to be invaluable. I recommend sticking with it.
Yes, as we've already discussed, I will be updating and enhancing grexdoc (though it might be renamed to grexconfig, since there's already another grexdoc that contains board meeting minutes and the like. Naturally, that's a purely cosmetic change).
(It's now moot, since the upgrade is in progress, but I'm going to add my comments anyway.) We have the staff to do an upgrade this weekend. We may not have the staff to do an upgrade in November. So how can ANYONE, _in good conscience_, argue for a solution that DEPENDS upon having staff to do an upgrade in November? The procedure I've seen propounded is to upgrade to -beta now and to -stable when it is released. *IF* we have the staff available in November, they can upgrade from -stable to -stable just as easily as from -beta to -stable. If we don't have the staff for an upgrade, we stay on -stable INSTEAD OF ON -BETA. (And yes, I'm shouting, 'cus some don't seem to be hearing.)
Looks like we're up and running within the announced time period. Kudos!
Yeah, I think we beat the time line by about 20 minutes. :-)
Dan, you did it, and I didn't see any indications of a problem. You go!
Hurrah! we are up and running. neat.
Well done, Dan. :)
thanks much, Dan!
Wonderful. Thanks, Dan (and anyone else who helped with this one).
Nice work!
Thanks, folks! There's still quite a bit to be done, but I'm glad that (so far) there haven't been any major problems.
I just read this for the first time. I don't see why it was the best thing to start the discussion here on Friday, for an upgrade that was to begin Saturday. My other comment is that I've known Nick Holland for over 20 years, since our college days at Michigan Tech. He introduced me to M-Net, which directly led to me getting a job in Ann Arbor. Not that that could possibly be valuable or interesting to anyone here...
Are you sure you wouldn't like a ballcap that says, "I upgraded Grex to OpenBSD 4.1 and all I got was this stinkin hat" Of course, that may be too many words to fit on a ballcap.
I'd prefer a hat that says, ``when you can't get up off the floor.'' :-)
It does seem as if the upgrade went well. Nothing has crashed that I can see, and we're still up. ;-) Joe: what I argued for meant a simple upgrade when 4.2-stable came out. There is a rather large difference between what Dan did to get us to 4.1 as opposed to installing the 4.2-beta and then getting the updated code and recompiling. Think of it as applying patches and compiling the kernel and user-land.
Are we going to have to do the same thing Dan did this past weekend to go to 4.2 in November?
I'd recommend that we wait a couple of weeks to see the postings on the web and in mailing lists and see if there is anything in 4.2 that needs work before installing. While OBSD is far better than others, all new software should be looked at with a wary eye its first week. Let the bleeding-edge early adopters like STeve be the beta testers, and we can move to it once we see that there are no critical flaws.
Not a bad idea, but it doesn't answer my question. ;)
Regarding #73; Short answer, yes, but it's unlikely to take as long. Longer answer: I'm hoping to have a lot of the procedure more firmly nailed down and (most importantly) automated via scripts and the like, so it should be much more straightforward. My big emphasis for this upgrade is, ``change the base operating system as little as possible.'' Hence, most changes have been pushed to places that are *expected* to change. The rationale here, of course, is that if we change the base operating system very little, then that's less we have to change to do an upgrade, and less opportunity to make mistakes.
Is there a precompiled lynx 2.8.6 for OBSD?
Probably not; is there a compelling reason to get Lynx 2.8.6? Some critical feature or bug fix?
2.8.5rel1 dates from Feb. 2004 (3.5 years ago). There are many pages of bug fixes and improvements listed (18 pages at 1024 resolution) at http://lynx.isc.org/release/lynx-2-8-6/CHANGES. 2.8.6rel1 is Oct 2006 and not much has changed since then. They are working on 2.8.6dev5 now. You also need to update cacert.pem (cert.pem) every 6 months or so in order not to get messages about certificates. I have the address for it. Some of the fixes have to do with memory leaks. OpenBSD 4.1 still has lynx 2.8.5rel4.
Yes, but that's the version that comes with the operating system, which means it's the one supported by the OpenBSD folks: the newer version might not be supported.
I probably missed something... WHY is it so important to run -current, i.e. now 4.2pre instead of 4.1 ??? Version numbering in OpenBSD isn't features related, it's just that De Raadt decided to put out a release every 6 months. Moreover Grex is a console system (no need for getting latest port upgrades of GUI stuff) running "old" hardware (no need for drivers for latest wip, sata, new chipsets, ...).
You have several choices: