Grex Helpers Conference

Item 140: Grex System Problems - Spring 2005

Entered by i on Tue Mar 22 12:11:57 2005:

277 new of 457 responses total.


#181 of 457 by tod on Mon May 2 16:00:33 2005:

re #180
Every password cracker in existence relies on assumptions like that.


#182 of 457 by gull on Mon May 2 16:18:37 2005:

Re resp:180: Why not just write a procmail filter to drop 100K+
messages, if that's what you want?
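
For what it's worth, the size test such a filter would apply is simple.
Here is a minimal Python sketch of the idea.  It is not procmail syntax;
the 100K threshold is taken from the suggestion above, and
keep_small_messages is a made-up name used only for illustration.

```python
# Hypothetical illustration of the size test a mail filter would apply.
SIZE_LIMIT = 100 * 1024  # "100K+" from the discussion, read here as 100 KiB


def keep_small_messages(messages, limit=SIZE_LIMIT):
    """Split raw messages (bytes objects) into (kept, dropped) by total size."""
    kept, dropped = [], []
    for raw in messages:
        # Anything at or above the limit is dropped; smaller mail is kept.
        (dropped if len(raw) >= limit else kept).append(raw)
    return kept, dropped
```

A real filter would discard or bounce the dropped messages rather than
return them; returning both lists just makes the decision easy to inspect.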


I hope staff gets the kinks worked out of Grex's new system, soon.  So
far, it looks like the old Sun was more reliable.


#183 of 457 by naftee on Mon May 2 17:35:05 2005:

i hope nobody hacked my account :(


#184 of 457 by albaugh on Mon May 2 18:59:03 2005:

> Alternative: stamp your little feet and declare  yourself a "paying member"
> of grex, which, I understand, accepts contributions.

News flash:  That won't help the situation ONE BIT.  Despite whatever revenues
grex may take in - which can pay for hardware and utilities - its labor and
brain power comes only from human volunteers.  If the system is not reliable -
and lately it has not been - then grex users are at the mercy of whatever time
volunteer staff may or may not have available to give to grex, and access to
the new facility.

I have seen the light:  grex has been so reliable for so long that I have come
to rely on that.  And that was unwise, and unfair.  If I'm smart I will start
to wean myself off grex...


#185 of 457 by nharmon on Mon May 2 19:14:03 2005:

Or you can volunteer your time to help Grex get back up to being super
reliable.


#186 of 457 by tod on Mon May 2 19:21:05 2005:

And see if the STeve of Oz lets you.


#187 of 457 by albaugh on Mon May 2 19:22:09 2005:

Nope, I don't have the expertise necessary, nor the proximity to the location,
nor the inclination.  If I want a reliable system, I should not be relying
on grex - that is the conclusion that all should recognize.


#188 of 457 by tod on Mon May 2 19:23:15 2005:

MOVE ON -Mary Remmers


#189 of 457 by gull on Mon May 2 20:23:00 2005:

And I'm sure many of us will remember being told to "move on," when it
comes time to renew our memberships...


#190 of 457 by naftee on Mon May 2 21:20:49 2005:

GreX should adopt a no-ID policy wrt memberships


#191 of 457 by steve on Mon May 2 21:31:43 2005:

   I have no evidence that Grex was vandalized.  Believe me, I looked
and I've not seen anything that indicates that.  Based on previous
examples of vandalism on Grex and M-net and other such systems, a
rm -rf / is one of the more common things done to systems.

   The Sun-4/670 was incredibly reliable.  It's tough to match that,
but I think we can.  I realize that the current system has had problems,
which combined with other stuff has made for long periods of downtime.

   STeve of Oz, eh?  Ok.


#192 of 457 by steve on Mon May 2 21:38:52 2005:

   Changing one's password is never a bad idea.  How often do people actually
change their passwords, I wonder.

   Another reason I don't think Grex was vandalized was the fact that the
nulogfile that newuser writes data into was completely weird for many
accounts.  Someone breaking in wouldn't do that.


#193 of 457 by tod on Mon May 2 21:40:13 2005:

re #191
I wasn't referring to "vandalism".  Did you miss the bit about "cracking"?
A corrupt passwd file should prompt staff to request all users to change their
passwords ASAP, don't you think?


#194 of 457 by steve on Mon May 2 21:43:03 2005:

   Cracking is vandalism.


#195 of 457 by mary on Mon May 2 22:03:00 2005:

Re: 188 & 189  You guys are such drama queens.  

Grex is unreliable at the moment and probably will be for some time 
to come.  If you have important files here, make sure they are 
backed up.  If you need mail, I mean NEED mail, Grex should not be 
your primary mail drop.  Realize that, find reliable email 
elsewhere, and stop acting like clients not getting their expected 
service.  We aren't into service.  We're into community.  Don't 
donate based on services you expect to receive.  That's expecting 
more than we're prepared to give.  We are run by volunteers who are 
pretty thinly spread at the moment.  And we can't beat them into 
giving more of their time.  So it's up to the users to back off and 
be as supportive and patient as they can.  Not into that?  Then you 
really should move on.  Grex isn't going to work for you.


#196 of 457 by jor on Mon May 2 22:19:13 2005:

        But what if the "paying members"
        stamp their little feet?

        Just teasin' ya.

        tod has motivated me to change passwords.

        man I wish I could lern to be one o'them thar Drama Queens



#197 of 457 by tod on Mon May 2 22:23:28 2005:

re #195
 Re: 188 & 189  You guys are such drama queens.
I prefer "prima attore" if you don't mind!  8P


#198 of 457 by naftee on Mon May 2 22:43:24 2005:

i changed my passwd.  thanks, tod !


#199 of 457 by janc on Mon May 2 23:00:45 2005:

I'm not that sure that we need a bigger mail partition.  The one time I
saw it fill up, it was due to a single user's mail file that grew way,
way big.  When I deleted that file, mail was only like 46% full.  I
probably should have studied the file before deleting it, because
theoretically no mail file should ever be able to grow that big.  The
mailbox quota software I wrote should prevent it.  But evidently there
is a weakness in that somewhere.

If we got a bigger mail partition, but didn't fix this bug, then I
presume it would still fill up.  If we do fix the bug, maybe we don't
need a bigger mail partition.
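
One cheap way to catch a runaway mail file before it fills the partition
is to scan the spool for anything over the quota.  A rough Python sketch,
assuming one plain file per user in the spool directory; the function
name and the limit are placeholders, and this is not the quota software
mentioned above.

```python
import os


def oversized_mailboxes(spool_dir, limit_bytes):
    """Return (name, size) for plain files in spool_dir larger than limit_bytes."""
    found = []
    with os.scandir(spool_dir) as entries:
        for entry in entries:
            # Skip directories, symlinks and anything else that is not a plain file.
            if not entry.is_file(follow_symlinks=False):
                continue
            size = entry.stat(follow_symlinks=False).st_size
            if size > limit_bytes:
                found.append((entry.name, size))
    # Largest first, so the worst offender is at the top of the report.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Run from cron against the mail spool, a report like this would at least
show which file is growing before the partition is full.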


#200 of 457 by steve on Tue May 3 00:36:53 2005:

   That's a good point, Jan.  I saw the account that's been hogging so
much of /var/mail, so I did a talk at him.  Turns out he's been
harassed by someone where he is, who subscribed him to a bunch of
things.  He said he was in the process of getting off them, and I
think that perhaps he did it, since I haven't seen much activity
on that account lately.


#201 of 457 by cross on Tue May 3 02:47:06 2005:

This response has been erased.



#202 of 457 by steve on Tue May 3 04:33:02 2005:

   Anything that is misconfigured is going to be a problem, Dan.
You know that.  I don't understand your negative attitude towards
OpenBSD.  I think you know that the panics we've seen have been
related to the networking card.  OpenBSD 3.5 had a few problems
with it, and we've seen them.

   But I'll tell you what we haven't seen: random breakins with
filesystems being destroyed, mail from root to staff ala "we 0wn u",
mail sent from others accounts by someone with root, or any of
the other marvelous little things that vandals do.  And believe
me, people are trying, *all the time*.  The number of exploits
I've seen brought over here, and the number of strange little 
one screen programs making odd kernel calls is as high as I
have ever seen, only they're more tailored for OpenBSD.

   Given the history of security on SunOS, we're a trophy.  I've
been told this at least three times.  There are people who really
dislike us just because we had the sense to lock down the system
back when NFS exploits and remote mountings were common enough
that Grex not doing those things was rare.

   I wish I'd had more time in the last several months to collect
some of the crap I've seen people try and run.  But that we're
still here, haven't lost the system says *a lot*.

   Have we had problems?  Yes.  We started out on the wrong
version of OpenBSD, by not using the latest.  I do not fault
Jan for this; after all, he just about single handedly got the
system up.  I was certainly useless during most of that time.
So we've been running a version behind from the start.  Our
upgrade is going to change a lot of things, and I'm going to
be very surprised if we don't see an elimination of the nic
problem.  The quota problem we saw I think is fixed but I 
haven't looked at the cvs logs closely.

   Nothing is perfect.  As Marcus and I have said in the past
we've embarked upon an interesting journey here, in going to
a modern operating system.  SunOS was stable, but also had
the advantage of being obscure enough so as to not to attract
a lot of vandal attention.  Now that we're out of that world
we live in an environment where people do try active exploits,
and have access to the source.  Ultimately the open source
model makes for much better security, but if we do find that
an exploit is out there we'll have to act quickly.  I'll point
out that FreeBSD is in exactly the same position here.  If
something comes out, swift reaction is needed.

   Had we a pile of money I'd go for the raid design.  When
we were first talking of building a new Grex it hadn't occurred
to me that we wouldn't have constant access to the hardware.
I guess I fumbled on that one--it should have been talked
about more.  Given Grex's budgetary restraints at the moment
I don't think that this is the best thing to do at the 
moment.  We need to get Grex 1) reliable, 2) deal with the
amazing Spam problem.  Once we have a system that resembles
the stability of the Sun-4/670 we can start thinking of 
other improvements; raid would certainly be a nice thing
to have.

   Lastly, Grex does beat on its disks Dan.  You've never
been in the Pumpkin and listened to the disks.  I remember
a time some years ago that Marcus and I sat next to Grex
just listening to the noises the disks were making.  The
activity lights were furiously blinking away and we could
hear the HP disk and at least one other as people were doing
things.  Long time Grex folks will remember the various
disks we beat to death over the years, going all the way
back to the little 40M (not gig--meg) SMD disk that Marc
Unangst and I battled.  What may be different now is our
current system; given the extra ram we have we might be
significantly reducing disk movement.  I don't know.


#203 of 457 by mcnally on Tue May 3 06:12:56 2005:

I agree with Dan:  Grex is not (or, not quite the same thing, should not
be) an exceptionally disk-intensive system for a multi-user server.
If disk is really thrashing that much we ought to (a) examine the partitioning
scheme, (b) look at other tuneable disk parameters, (c) maximize caching.
Even with 30-40 users logged on most of the time it doesn't make sense to
me that Grex's disks are getting pounded so hard, especially not the user
partitions.

I also agree with him that the explanation for the recent crash was quite
surprising to me.  Did I gather correctly that an ordinary unprivileged
user can take Grex down with a fork bomb?  Haven't we set per-user file,
memory, and process limits to reasonable values?  I think I read recently
that OpenBSD doesn't set those in the default install, but are they set now?


#204 of 457 by gull on Tue May 3 13:31:59 2005:

Re network card-induced panics: I recently had that same problem with
FreeBSD.  Near as I can tell, the RealTek driver is buggy.  OpenBSD
shares most of the same driver code, so if you have any RealTek cards in
use, you may want to change them.  This may be a bug that only rears its
head on Alpha platforms, though.


#205 of 457 by janc on Tue May 3 14:04:04 2005:

I've seen no clear evidence on what caused the last crash.  We do have a core
dump and kernel image saved from that panic.  It looked like something went
very bad while attempting to close a file descriptor.  That could be a network
issue, or it could be just about anything else.

We've twice now had the password file corrupted.  STeve attributed the first
one to a disk error.  I assume he had evidence for that, but the few times
I tried to access that disk, I never got a single error message off it.  Now
we have a second crash where the passwd database got corrupted and I can't
help wondering if something else might be going on.  I wish somebody had time
to properly analyze these things.


#206 of 457 by tod on Tue May 3 15:14:19 2005:

Yea, Dan. It's the DISKS, Dan.  It gives worms to ex-girlfriends, Dan.
<pats self on back like bird flapping wings>

What's wrong with the RAID suggestion?  It makes sense.  If RAID won't fix
the problem then the OS needs to be replaced.


#207 of 457 by cross on Tue May 3 15:39:46 2005:

This response has been erased.



#208 of 457 by cross on Tue May 3 15:41:11 2005:

This response has been erased.



#209 of 457 by tod on Tue May 3 15:42:49 2005:

re #207
 Okay, I'll let the cat out of the bag: Staff had a report of a
 security problem where a random, unauthorized user could run *cat*
 on a tty device and see users connecting and typing their passwords.
This was on Grex? Were the users notified that they should change their
passwords?  


#210 of 457 by gull on Tue May 3 15:46:00 2005:

FWIW, I think the security arguments for OpenBSD over FreeBSD are
overstated.  FreeBSD gets the benefits of OpenBSD's code audits, because
a lot of code is shared.  I also suspect FreeBSD has a larger installed
base, which tends to flush out driver problems sooner.  I never ran into
them on my x86 machines.  I've run into a few on my AlphaPC, but Alpha
is a minority platform that doesn't receive as much testing.

I'm not trying to weigh in on one side or the other here.  I'm just
saying that the two operating systems are, from my perspective,
extremely similar, so I think in many ways it's an arbitrary decision. 
Migrating from OpenBSD to FreeBSD, if you choose to do so, would
probably be fairly painless; much of the configuration is kept in the
same places.  It still may not be easier than fixing what you've got,
though.

(Incidentally, keep in mind that OpenBSD's much-touted "only one hole in
the last 8 years" security claim applies only to *remote* exploits. 
That suggests to me that security in a situation where you have local
shell users may not be their first priority.)


#211 of 457 by cross on Tue May 3 16:32:46 2005:

This response has been erased.



#212 of 457 by tod on Tue May 3 16:36:18 2005:

re #203
 Did I gather correctly that an ordinary unprivileged
 user can take Grex down with a fork bomb?  Haven't we set per-user file,
 memory, and process limits to reasonable values? 
That's what I read from the explanation.

re #211
 What's more, most of the security auditing that happens is poorly done
 by amateurs.  I wouldn't rely on it to run a bank.
I would hope you would want better security for the financial or healthcare
sector than for Grex, too. ;)  At least with Grex, I'd hope we could find a
way to keep the system from crashing for several days at a time.


#213 of 457 by mcnally on Tue May 3 16:53:42 2005:

  Since I don't want to see this dissolve into a BSD-vs-BSD flame-war
  death match, I'm going to try to subvert the conversation by proposing
  some concrete suggestions that don't require immediate consensus on
  the OS issue and don't require planning an OS upgrade or replacement
  any time soon.

  How about if we begin by:

  Immediate, Critical:
  1)  Fixing the TTY security problem if it isn't already solved.
  2)  Make sure that sensible run-time limits are enforced and that
      no ordinary user can cripple or crash the system with a fork
      bomb.

  Very Important:
  3)  Ensure that Exim version is sufficient to use recent SpamAssassin
      integration features and begin testing system load under more
      aggressive spam-filtering program.
  4)  Verify whether network driver support for RealTek NICs really is
      affected by known bugs and add new ethernet card based on better-
      supported chipset to system if so.
  5)  Consider setting up CVS or other versioning system to checkpoint
      multiple backup copies of critical system files like /etc/passwd.
  6)  Research OpenBSD disk problems to see whether others are experiencing
      similar crashes, in which case we should reconsider OpenBSD, or,
      if not, we should consider the possibility of hardware problems as
      a root cause.
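
To make suggestion 5 concrete: short of a real CVS setup, even a dumb
checkpoint script would keep several generations of a critical file.
Below is a minimal Python sketch of that stand-in.  It is plain copying,
not CVS, and the naming scheme (basename.sequence.hash) is invented here
purely for illustration.

```python
import hashlib
import os
import shutil


def checkpoint(path, backup_dir):
    """Copy path into backup_dir unless the newest backup has the same content.

    Returns the path of the new backup, or None if nothing changed.
    """
    os.makedirs(backup_dir, exist_ok=True)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()[:12]

    base = os.path.basename(path)
    # Backups are named base.NNNNNN.digest, so sorting by name sorts by age.
    existing = sorted(n for n in os.listdir(backup_dir) if n.startswith(base + "."))
    if existing and existing[-1].endswith("." + digest):
        return None  # unchanged since the last checkpoint

    target = os.path.join(backup_dir, f"{base}.{len(existing) + 1:06d}.{digest}")
    shutil.copy2(path, target)  # copy2 also preserves timestamps and mode bits
    return target
```

Unlike a versioning system this keeps full copies and no history of who
changed what, but it would be enough to get back a recent good copy of a
file after a corruption.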


#214 of 457 by nharmon on Tue May 3 17:42:34 2005:

>  I wouldn't rely on it to run a bank.

With most financial institutions, security is concentrated on the perimeter,
usually because the mainframe systems that run banking software use insecure
operating systems (Windows 2000 Datacenter comes to mind).


#215 of 457 by tod on Tue May 3 17:50:21 2005:

re #214
 With most financial institutions, security is concentrated on the perimeter
Actually, security is concentrated "in depth" as in at multiple layers like
a fortress with a moat, gate, guard tower, huge wall, etc
A firewall simply doesn't cut it anymore when you have GLBA worries, IT
productivity problems, password headaches, etc..
The least you should have are 2 firewalls with different flavors at the
perimeter of a financial institution but this is not a DMZ or IPS discussion.
The fact is, Grex had a security flaw and it wasn't reported to the users.

I'm disheartened at how this and the subpoena discussions have been buried
from the public discussion.


#216 of 457 by marcvh on Tue May 3 17:58:24 2005:

Sure, and also financial security is based on the concept of transactions
and auditability.  Grex doesn't have such beasts.


#217 of 457 by cross on Tue May 3 18:30:04 2005:

This response has been erased.



#218 of 457 by nharmon on Tue May 3 18:32:14 2005:

Re #215 - In Grex's defense, perhaps the Coop conference is a more appropriate
place to discuss Grex policies regarding notifying users. I've posted an item
that hopefully attracts some comments on the pros/cons.

Re #216 - It used to be that Banks didn't have to care very much about their
customer's names and addresses, etc...because this data was regularly bought
and sold to other companies. But the GLBA now requires us to safeguard this
information with the utmost diligence...to the extent that some banks will
fire employees for not locking their PCs and leaving them with customer
information still on the screen.


#219 of 457 by steve on Tue May 3 18:39:04 2005:

   We use a Broadcom 5702x nic.

   Grex isn't a transaction system.  I will agree that such a thing presents
more of a load than Grex does, but it also has hardware better suited to that
task.  We've *listened* to the disks Dan.  Honestly.  There were times on the
Sun-4/670 that you could just sit there and hear them madly running around.
Perhaps I didn't say it well enough but OpenBSD may be significantly different
from SunOS in this regard; maybe it will be kinder on the disks due to caching
issues.  I guess we'll see.


#220 of 457 by tod on Tue May 3 18:49:58 2005:

Maybe IDE would be kinder than SCSI?


#221 of 457 by naftee on Tue May 3 18:51:04 2005:

i use FreeBSD and Realtek and am pleased by the performance of both.


#222 of 457 by steve on Tue May 3 18:56:50 2005:

   I don't think the disk interface matters much.  However, it has occurred
to me in the last few minutes that we're swimming in RAM compared to what
we had under SunOS: 256M there, and 1.5G here.  That will eliminate swapping
and use about 75M ram for file caching which will also help.

   I just changed the default limits in /etc/login.conf for maxproc to 32.
Maxproc-max was at 128.
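
For anyone wondering what a maxproc cap amounts to underneath: it is a
per-process resource limit that children inherit.  The Python sketch
below only illustrates that mechanism and is not how login.conf is
applied on Grex.  The value 32 comes from the change described above;
both function names are invented for the example.

```python
import resource
import subprocess


def capped_soft_limit(max_procs, hard):
    """Pick a soft limit no higher than max_procs or the existing hard limit."""
    if hard == resource.RLIM_INFINITY:
        return max_procs
    return min(max_procs, hard)


def run_with_process_cap(argv, max_procs=32):
    """Run argv in a child whose soft RLIMIT_NPROC is lowered before exec."""
    def apply_limit():
        # Runs in the child after fork and before exec, so only the child
        # (and whatever it spawns) sees the lower limit.
        _soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        resource.setrlimit(
            resource.RLIMIT_NPROC, (capped_soft_limit(max_procs, hard), hard)
        )
    return subprocess.run(argv, preexec_fn=apply_limit)
```

With a cap like this in place, a fork bomb started from that child should
run into the limit and start failing to fork instead of filling the
process table.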


#223 of 457 by steve on Tue May 3 19:06:20 2005:

   Now, as for sd0 having a problem, I just mounted it and tried copying
spwd.db to /dev/null.  It failed.  The message in /var/log/messages is

May  3 15:00:59 grex /bsd: sd0(ahc1:0:0): Check Condition on opcode 0x28
May  3 15:00:59 grex /bsd:     SENSE KEY: Media Error
May  3 15:00:59 grex /bsd:    INFO FIELD: 116647
May  3 15:00:59 grex /bsd:      ASC/ASCQ: Unrecovered Read Error
May  3 15:00:59 grex /bsd:      FRU CODE: 0xe4
May  3 15:00:59 grex /bsd:          SKSV: Actual Retry Count: 134
May  3 15:01:00 grex /bsd: sd0(ahc1:0:0): Check Condition on opcode 0x28
May  3 15:01:00 grex /bsd:     SENSE KEY: Media Error
May  3 15:01:00 grex /bsd:    INFO FIELD: 116647
May  3 15:01:00 grex /bsd:      ASC/ASCQ: Unrecovered Read Error
May  3 15:01:00 grex /bsd:      FRU CODE: 0xe4
May  3 15:01:00 grex /bsd:          SKSV: Actual Retry Count: 134

There are other errors on the disk as well.  When I tried to dd the
entire disk I brought the system down, the day Joe said that newuser
was failing.
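
Since nobody else had noticed these, a small script that tallies such
errors per device might be worth having.  A minimal Python sketch,
assuming the two-line pattern seen in the log excerpt above (a "Check
Condition" line naming the device, then a "SENSE KEY" line); the regular
expressions are fitted to that excerpt only, not to any documented log
format.

```python
import re
from collections import Counter

# Patterns fitted to the excerpt above; other kernels may log differently.
CHECK_RE = re.compile(r"/bsd: (\w+)\([^)]*\): Check Condition")
SENSE_RE = re.compile(r"/bsd:\s+SENSE KEY: (.+?)\s*$")


def count_sense_errors(lines):
    """Count (device, sense key) pairs found in kernel log lines."""
    counts = Counter()
    device = None
    for line in lines:
        check = CHECK_RE.search(line)
        if check:
            device = check.group(1)  # remember which disk reported the condition
            continue
        sense = SENSE_RE.search(line)
        if sense and device is not None:
            counts[(device, sense.group(1))] += 1
            device = None  # one sense key per Check Condition
    return counts
```

Fed the lines above, this would report two Media Error events on sd0.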

Have to go back and do work work now...


#224 of 457 by naftee on Tue May 3 19:12:44 2005:

work work work


#225 of 457 by steve on Tue May 3 19:15:31 2005:

work plod work


#226 of 457 by nharmon on Tue May 3 19:29:27 2005:

plod no work,... abort, retry, or ignore?


#227 of 457 by cross on Tue May 3 20:02:14 2005:

This response has been erased.



#228 of 457 by steve on Tue May 3 20:11:23 2005:

   Dan you are sliding off into fantasy land here.  THE DISK HAS PROBLEMS.
It is as simple as that.  If no one else saw the errors it was because no
one looked at /var/log/messages.  I will point out that you could have
rummaged around there yourself to find errors.   Sigh, I don't know why
I'm bothering to respond to some of your comments, but I will say that I
think Marcus and I know the difference between the sound of bearings
and the noise a disk makes when the heads are constantly moving.


#229 of 457 by tod on Tue May 3 20:17:57 2005:

We're not worthy.


#230 of 457 by gull on Tue May 3 20:28:22 2005:

Re resp:211: In my (admittedly limited) experience, banks run on Windows
and proprietary mainframes.  The bank I worked for had *no* Internet
connections at all, though.  All branch-to-branch connections were on
leased lines.


Re resp:213: #3 is a minor issue.  AFAIK there are no major bugfixes in
recent versions of Exim.  While versions earlier than 4.50 do not come
with Exiscan out of the box, it's easy to patch in, and the OpenBSD port
probably already includes a flag you can toggle to include it. 
FreeBSD's does.


Re resp:219: Are we swapping?  Maybe we need more RAM.  Even if we're
not swapping, more RAM means more disk cache.  RAM is cheap.


Re resp:227: Those messages pretty clearly indicate a hardware problem,
and if they were the result of an incorrect request on the part of the
driver we'd be seeing them on the other disks, too.  I think you're
really reaching to blame OpenBSD here, which is unfortunate, because it
makes this look like a matter of religion on your part instead of a
technical argument.

If you're really convinced that OpenBSD is somehow causing the illusion
of a hardware failure on this disk, I suggest connecting it to another
system running a different OS and trying to access it.  That should
settle the issue.


#231 of 457 by steve on Tue May 3 20:32:53 2005:

   No, we're doing fine for ram, now.  The 256M swapping situation was
on the Sun-4/670.  Unless spam processing eats up Grex's hardware I 
think the 1.5G we have will last for some time.  Sorry I wasn't clear
on that.


#232 of 457 by gull on Tue May 3 20:40:49 2005:

I just took a look, and we've currently got 1.2 gigabytes free, so I
think you're right.  I don't know how to find out how much OpenBSD is
using as disk cache.  In FreeBSD it's reported by "top", but that
doesn't seem to be the case here.


#233 of 457 by gull on Tue May 3 20:43:39 2005:

Also, someone on staff should please read my last set of comments in the
Exim item in the Garage conf.  I clear up a couple of apparent
misunderstandings in Grex's current exim.conf file, and point out some
stuff that was copied verbatim from my example and shouldn't have been.
It looks like those things still haven't been fixed.


#234 of 457 by steve on Tue May 3 20:52:48 2005:

   Want to make up a diff?


#235 of 457 by cross on Tue May 3 22:00:05 2005:

This response has been erased.



#236 of 457 by steve on Tue May 3 22:39:47 2005:

   1) The problem with permissions on the tty was our fault, I
believe.  Do you think that this was a part of the release, and
that no one ever found it?  We messed up, not the OS itself.  I
ask you to prove otherwise.  If it's a real bug in the distribution
others would have seen it.

   2/3) At least some of the crashes have been due to our nic. That
code was worked on post 3.5 release.  I won't say that we've not
crashed for other reasons but have we properly analyzed it?  The
quota code could well have some problems.  I'll bet we're pushing
it.  You are right that in our current configuration we are more
disk intensive than if softupdates were on.  I'm pretty sure that
the softupdate code was changed post 3.5 and in visiting the
changelog between 3.5 and 3.6, we find

   "Big FFS softdep merge with FreeBSD, fixing a number of bugs."

In the changelog between 3.6 and 3.7 we find

   "Fix a soft dependencies problem that caused processes to get stuck."

This was a part of some stuff from FreeBSD which wasn't complete
apparently, and was then fixed.

   You know as well as I do that saying something "will be fixed
in ..." is a dangerous thing to say.  I'm not going to let your
bias against OpenBSD make me say things that shouldn't be
promised.  But yes, I *do* think that 3.7 is going to be a good
move for Grex, as will 3.8 and so on.


#237 of 457 by mcnally on Tue May 3 22:57:45 2005:

 >  I really want to know why people take one hypothesis I propose
 >  (which I clearly stated was a hypothesis) and fixating on that,

My own theory is that it's due to the somewhat confrontational tone
of your messages.  Even though I agree with most of your conclusions
I'll admit I'm somewhat put off by the way you've worded your responses.
Presumably it's because you feel strongly about the issue, which I
applaud. 

But if Grex wants to know why volunteer staff resources are drying up,
we need look no further than the way this and other "discussions" about
the system develop.  Very few people will volunteer to join a flame war
already in progress.  Of course it takes more than one party to have a
good fight but whoever's responsible for the tone maybe we can all just
back off a little bit and instead of concentrating on what possibly
*should* have been done, figure out what to do now.

 >  Okay, so here are my major points:
 >
 >  (1) We've had one *major* security hole in OpenBSD.  

        Agreed, and frankly it's a baffling one for a supposedly 
        secure OS.  Granted the security promise offered by OpenBSD
        partisans is usually "Only <n> root exploits since <time t>"
        but world-readable ttys is bizarre.

 >      I don't think it was a configuration issue.       

        I'm not so sure about this -- it seems incredible to assume
        that if this were really the system default it wouldn't be
        very widely known, and OpenBSD rightly slammed for it.

        I think when I get home tonight I'm going to have to install
        OpenBSD on a spare computer and test to see whether this is,
        in fact, the way things work out of the box on OpenBSD.

 >  (2) OpenBSD crashes quite a bit more than I or anyone else am
 >      comfortable with.  It doesn't appear to be because of the network
 >      driver.  It often crashes when filesystem errors, or, apparently,
 >      because the proc table gets full.

        Because of the time investment required to change OSes again 
        and the fact that we don't know for sure that FreeBSD will be
        better, I'm inclined to give OpenBSD more time to prove itself 
        providing:

         (a) we can guarantee a fix to the fork-bomb vulnerability, and
         (b) we put another disk in place of the one that's erroring.

        Of course I'm not sure I even get a vote, but that would be my
        recommendation if my advice were solicited.

 >  (3) Our *application* is not disk intensive, but because things
 >      like soft metadata updates aren't reliable on OpenBSD, we're
 >      *making it* disk intensive.  If that's chewing up drives, then
 >      fine, but it's not grex has such a high volume of *usage* that
 >      it *has* to be that way.  *REAL* high volume usage
 >
 >  Steve, you were one of the people who were adamant about OpenBSD.
 >  Please respond to these problems.  Should I believe that they'll
 >  be solved in the newest version of OpenBSD?  Or did we make a mistake
 >  going with OpenBSD?

        Aren't there disk-usage monitoring tools we can use to get some
        sense of what's going on with our disks?  And would it help   
        relieve thrashing if we made /tmp a 512MB RAM disk or picked   
        something like that? 
                           
        Something's already seriously wrong if / (containing /etc) is
        being written to so often that disk corruption is regularly
        bringing the system down.  How many things need to write to /
        anyway?  Newuser?  What else?  Virtually everything else on the
        system seems like it should write to /var, /tmp, or a bbs or
        homedir partition.


#238 of 457 by steve on Tue May 3 23:16:18 2005:

   sd0a went because of a general problem with it.  I seriously doubt
that the runs of newuser caused this.  Though /a was on sd0 and that
did have lots of i/o.


#239 of 457 by gull on Tue May 3 23:54:28 2005:

Re resp:234: Okay, I'll make one up and email it to you. 
 
 
Re resp:237 (1): I took a quick look at an OpenBSD 3.6 system at work.  
It's used strictly as a firewall, no local logins except root for 
maintenance, but that's not really relevant. 
 
What I found is that pseudo-ttys appear to be world-readable until 
they're used.  For example, with root logged in on ttyp0: 
crw--w----  1 root  tty       5,   0 May  3 19:44 /dev/ttyp0 
crw-rw-rw-  1 root  wheel     5,   1 Dec 18 13:53 /dev/ttyp1 
crw-rw-rw-  1 root  wheel     5,   2 Dec 18 13:53 /dev/ttyp2 
(etc.) 
 
Now, if I open another ssh connection, again as root: 
crw--w----  1 root  tty      5,   0 May  3 19:46 /dev/ttyp0 
crw--w----  1 root  tty      5,   1 May  3 19:46 /dev/ttyp1 
crw-rw-rw-  1 root  wheel    5,   2 Dec 18 13:53 /dev/ttyp2 
(etc.) 
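
The same check can be scripted rather than eyeballed from ls output.  A
small Python sketch that reports whether "other" has read or write
access to a path; world_accessible is an invented name, and this only
looks at permission bits, so it says nothing about whether reading an
idle pty is actually exploitable.

```python
import os
import stat


def world_accessible(path):
    """Return (mode_string, other_can_read, other_can_write) for path."""
    mode = os.stat(path).st_mode
    return (
        stat.filemode(mode),        # ls-style string, e.g. 'crw-rw-rw-'
        bool(mode & stat.S_IROTH),  # read bit for "other"
        bool(mode & stat.S_IWOTH),  # write bit for "other"
    )
```

Looping this over the /dev/ttyp* nodes while users are logged in would
show whether any in-use tty is still world-readable.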
 


#240 of 457 by cross on Wed May 4 02:02:13 2005:

This response has been erased.



#241 of 457 by steve on Wed May 4 02:27:11 2005:

   Let's reverse the tty problem for a minute--if it wasn't us, then
it was in the release of OpenBSD 3.5.  I've been looking for comments
about that and haven't seen any so far.  It could be the case that we
didn't do anything, but I tend to think that the collective set of
people who worked on the machine could have done something.  I agree
with you that we should dig into things.

   We crashed at least twice with a trace leading back to a bge symbol.
I think that's fairly good evidence that it was in the nic.

   I'm obviously pro OpenBSD.  I came to be that way after staring at
several Linux flavors, then Net- and FreeBSD, then OpenBSD.  Since Oct
1999 I've been using it exclusively and have found it rock stable
except for when hardware problems have messed things up.  I know of
no other system that puts security first and takes the pro-active stance of
fixing things and developing enhancements like the write xor execute
system.  Grex needs these things.  We get hit on by enough people
that we need all the help we can get.

   It occurs to me that we ought to order a 3.7 CD set.


#242 of 457 by nharmon on Wed May 4 02:38:25 2005:

 > Why don't we take some of the money we have in the bank and buy a
 > SCSI hardware RAID controller, and do disks properly, with 0+1
 > striping of mirrors, so that in the event one disk dies, we don't
 > end up in these situations?

I agree that a RAID set up would provide system continuity until a staff
member can replace a drive. Personally, I prefer a RAID 5 set up with a hot
spare (or RAID 5EE if we don't have a spare drive to spare...but I don't know
if this is an IBM-only thing or what, so it might not be possible). Further,
RAID 5 wouldn't batter the drives as much as RAID 0+1...

BTW, I might be mistaken, but isn't RAID 1+0 more reliable just by the fact
that a multiple-disk failure resulting in catastrophic data loss is
statistically more likely with 0+1?
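
The 1+0 versus 0+1 question can be checked by brute force for a small
array.  The Python sketch below assumes four disks, failures that are
independent and equally likely, and no rebuild between the first and
second failure; those are simplifying assumptions for illustration, not
a description of any particular controller.

```python
from fractions import Fraction
from itertools import combinations

DISKS = range(4)
PAIRS = [{0, 1}, {2, 3}]  # the two 2-disk groups in a 4-disk array


def raid10_lost(failed):
    """Stripe over mirrors: data is lost once both disks of one mirror fail."""
    return any(pair <= failed for pair in PAIRS)


def raid01_lost(failed):
    """Mirror of stripes: data is lost once every stripe set has a failure."""
    return all(pair & failed for pair in PAIRS)


def fatal_fraction(lost):
    """Fraction of two-disk failure combinations that lose the array."""
    combos = [set(c) for c in combinations(DISKS, 2)]
    return Fraction(sum(lost(c) for c in combos), len(combos))
```

Under those assumptions fatal_fraction(raid10_lost) comes out to 1/3 and
fatal_fraction(raid01_lost) to 2/3, which agrees with the point that a
second failure is more likely to be fatal with 0+1.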


#243 of 457 by gull on Wed May 4 04:14:46 2005:

Dan, to be honest, I was with you until you started insisting that 
OpenBSD had caused a good disk to generate read errors.  To me, that 
made it seem like you were really reaching for more reasons to dislike 
OpenBSD, and I'm having a tough time believing you're really taking an 
objective position, now. 
 


#244 of 457 by cross on Wed May 4 11:28:50 2005:

This response has been erased.



#245 of 457 by nharmon on Wed May 4 12:15:31 2005:

Is the disk bad? Have we plugged it into another computer and verified it has
problems?


#246 of 457 by aruba on Wed May 4 14:31:31 2005:

No, the disk is still attached to Grex.


#247 of 457 by steve on Wed May 4 15:03:42 2005:

   I should point out that in putting the sd0 disk in some other machine,
it might work.  It might appear OK for an hour, or a week.  Moving a
damaged disk jostles things.  I had a small IDE disk at work which did
exactly this.  It was flaky in the machine it was running on, but
ran OK for some while on a test machine I had.  Finally, after several
days of pounding on it, the exact same error cropped up.  This is rare,
but if the problem involves something in the head or arm mechanics, 
anything can happen.  I do not believe that will happen in this case
but moving a suspect disk around can lead to unexpected results.


#248 of 457 by cross on Wed May 4 15:32:15 2005:

This response has been erased.



#249 of 457 by tod on Wed May 4 16:02:47 2005:

Let us know how the 3.7 disc works out.


#250 of 457 by twenex on Wed May 4 16:59:23 2005:

If Plan9 has "dd", why not "fsck"? After all, "dd" isn't even (originally)
native to Unix.


#251 of 457 by cross on Wed May 4 17:18:33 2005:

This response has been erased.



#252 of 457 by twenex on Wed May 4 17:23:31 2005:

Yeeees, but you could still call it "fsck"....


#253 of 457 by mcnally on Wed May 4 17:52:34 2005:

 They could also call it "scandisk".  After all, lots more people are used
 to scandisk than fsck, right?

 What does it matter to you what they called it?


#254 of 457 by twenex on Wed May 4 18:01:51 2005:

Just seems arbitrary to name Plan9 "dd" after Unix "dd" but not do the same
with fdisk, that's all.


#255 of 457 by gull on Wed May 4 18:06:56 2005:

A lot of such decisions are arbitrary.  Heck, on Linux, 'fsck' is really
just a front end that calls any of a number of more specific
filesystem-checking tools, depending on the type of filesystem in question.
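
The front-end idea can be sketched roughly as follows. This is an illustration only, assuming the common "fsck.<type>" naming for the type-specific helpers; the device names are hypothetical, and a real front end does considerably more (reading fstab, ordering passes, handling exit codes).

```python
def fsck_helper(fstype):
    """Return the name of the type-specific checker a front end would run."""
    # Assumption: helpers follow the "fsck.<type>" naming, e.g. fsck.ext3.
    if not fstype or "/" in fstype:
        raise ValueError("not a plausible filesystem type: %r" % (fstype,))
    return "fsck." + fstype

def plan_checks(mounts):
    """Map (device, fstype) pairs to the command lines a front end might build."""
    return [[fsck_helper(fstype), device] for device, fstype in mounts]

# Hypothetical devices, for illustration only; nothing is executed here.
for cmd in plan_checks([("/dev/hda1", "ext3"), ("/dev/hda2", "vfat")]):
    print(" ".join(cmd))
```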


#256 of 457 by drew on Wed May 4 21:01:51 2005:

FWIW, I've had a disk *image file* (created with 'dd if=/dev/hdc of=filename')
produce read errors when used in the virtual machine it was attached to.


#257 of 457 by keesan on Wed May 4 21:14:02 2005:

Three times now, with two different modems, we have dialed into grex and got
garbage.  The second dial logged us in.  Another grexer reports that the modem
on 484-0513 works but the first one does not, from his location.  Is there
any other reliable modem that could be switched with the 0512?


#258 of 457 by steve on Wed May 4 23:03:49 2005:

   I think first we need to verify that the line and connection is OK,
physically.   Sindi, do you know when these problems started?  That
would be good to know.


#259 of 457 by cross on Thu May 5 00:53:50 2005:

This response has been erased.



#260 of 457 by keesan on Thu May 5 01:02:05 2005:

The garbage on dialin happened this week, probably in the last three days.
Jim mentioned it to me yesterday but I had already noticed. It might just have
started yesterday. It occurred again this afternoon.
Jim tried switching from 38 to 19K which did not help.


#261 of 457 by steve on Thu May 5 01:30:31 2005:

   Is it always the same modem that messes up?


#262 of 457 by albaugh on Thu May 5 15:22:54 2005:

Drift:  Does anyone else think that the fsck program name was partially chosen
because it looks like a get-past-the-censors-disguise for the f-word?  ;-)


#263 of 457 by keesan on Thu May 5 16:11:06 2005:

We always dial 0512 but I don't know which modem we actually reach.  Someone
said the 0512 modem does not work for him but 0513 does, something about
distance from the phone company.


#264 of 457 by twenex on Thu May 5 16:42:51 2005:

Re: #259 - ah, I see.

Re: #262 - Heh. I bet that is exactly the reason! :-)


#265 of 457 by tsty on Thu May 5 16:52:38 2005:

hullo disk problems!           there *IS* the best disk repair/recover
software for everyone - and now spinrite 6  will also do xnix drives
and mac drives. 
  
the procedure for xnix formatted drives is a tad more detailed but
if you put *any* worth on your drives you simply must (sorry if that's
preachy) run spinrite on them about every 6 months. 
  
i feel as if i *should* be preaching to the choir when i state and
restate the obvious, but the choir is still out of tune, it seems.
  
spinrite 6     grc.com    ...............................  please!
  


#266 of 457 by tsty on Thu May 5 16:53:08 2005:

oh, i also want to thankx STeve & company for all the extra efforts
on behalf of grex. thank you very mulch.


#267 of 457 by jor on Thu May 5 17:01:29 2005:

        Nonsense. None of that thank you stuff.
        All we do is criticise, stamp our little feet,
        and declare ourselves to be "paying members".

        tsty get with the program dude.


#268 of 457 by steve on Thu May 5 18:12:49 2005:

   Heh...

   The problem with disk "fixing" software is that in nearly all cases it's
a giant kludge.  With bit densities being hundreds of millions+ per square
inch, the most minute impurities left inside the disk case can cause disasters,
and the tolerances for everything mechanical have shrunk to amazing proportions.
This means that when something in a disk goes wrong it's far harder to fix.
When bad disks come in, the disk OEMs look at the control electronics and
the disk case (mechanical) for problems.  Depending on which they find bad
they throw that away and put another "known good" component in, test it and
then have a refurb disk for replacements.  I'm not really happy with that
but that's the way things are.

   Trying to alter a disk's surface by rewriting something just isn't a good
idea now.  Back in the era of 300M disks it worked to some extent.

   Lastly, when you think of the sheer amount of data that you can put on
a 100G+ disk, you have a huge investment in that data, be it personal or
professional.  It just doesn't make sense to trust kludges.  Disks are too
cheap not to replace; data is too expensive to replace.


#269 of 457 by keesan on Thu May 5 18:57:42 2005:

Regarding modems, I got garbage again dialing 0512 and before the garbage
there was briefly something about tty00.  A second dialin immediately after
connected me properly - does this imply that I got the second modem this time
because the first was still tied up?  If so can someone replace the first
modem, or at least confirm the problem?  We got it at two locations.


#270 of 457 by cross on Thu May 5 19:31:29 2005:

This response has been erased.



#271 of 457 by naftee on Thu May 5 19:41:47 2005:

thanks for the mulch, tsty !


#272 of 457 by twenex on Thu May 5 19:44:51 2005:

Snicker.


#273 of 457 by drew on Thu May 5 20:54:54 2005:

Re #268:
    You have a point about attempting to 'fix' a bad disk. However, a program
that gets the disk electronics to cough up the truth about how much of the
disk is *really* damaged, and how many reserve sectors are left, might be
useful for monitoring purposes.

Re #270:
    Humor or not, it may not be far from the truth. cf. HTML content, flash
animations, spam, etc.


#274 of 457 by steve on Fri May 6 05:41:30 2005:

   Drew, the problem with that is when things are damaged, how can you trust
the electronics?  As an example, IBM has something called SMART for their
disks.  It's a system where you can run a drive fitness test on a disk to
get a sense of its health.  I've found it to be useful in telling me what's
wrong with a dead disk, usually.  But it has failed me several times when
testing a disk that the user said had acted weirdly.  To be fair, it did
catch a disk that was on the verge of going bad, but I still think the
technology is ripe for improvement.


#275 of 457 by gull on Fri May 6 15:00:29 2005:

Re resp:270: It's definitely true where I work.


I'm not sure how valuable Spinrite-style products really are these days.
 You can't directly address sectors on a disk anymore for testing -- the
drive electronics hide all those details and remap bad sectors from a
pool of spares.  By the time there are actually visible bad sectors, the
disk has been going south for a long time.
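
A toy model of that remapping, as a sketch only: it assumes a fixed pool of spares and a drive that silently redirects each newly bad sector until the pool runs out. The class and method names are made up for illustration and do not correspond to any real drive interface.

```python
class Disk:
    """Toy model of a drive that hides bad sectors behind a pool of spares."""

    def __init__(self, spares):
        self.spares_left = spares
        self.remapped = set()     # bad sectors silently redirected to spares
        self.visible_bad = set()  # bad sectors the host can actually see

    def sector_goes_bad(self, lba):
        if lba in self.remapped or lba in self.visible_bad:
            return
        if self.spares_left > 0:
            # The drive absorbs the failure; the host notices nothing.
            self.spares_left -= 1
            self.remapped.add(lba)
        else:
            # Spares exhausted: from now on failures show up as read errors.
            self.visible_bad.add(lba)

    def read_ok(self, lba):
        return lba not in self.visible_bad

disk = Disk(spares=3)
for lba in (10, 20, 30):
    disk.sector_goes_bad(lba)
print(disk.read_ok(20), disk.spares_left)    # True 0
disk.sector_goes_bad(40)
print(disk.read_ok(40), len(disk.remapped))  # False 3
```

In this model the first visible bad sector appears only after three earlier failures were quietly absorbed, which is the sense in which a visible error means the disk has been degrading for a while.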


#276 of 457 by tsty on Fri May 6 16:18:43 2005:

re #267 ... oh, right, i forgot.    stomp sTomP SToMp st0MP, ds al coda
  
ummmmmmmmm, about spinrite. every 'objection/dismissal' above demonstrates
that not one of you has read up on *what* it does nor *how*! dammit!!
  
there is *NO* comparable program in the universe. it FSCKING works!
and all the hidden shit is bypassed, obviated, shunted, circumvented,
counteracted and evaded <insert further descriptions here>.
  
steve gibson is long overdue for a macarthur grant, imnsho.
  
and one of its best built-ins is that it catches stuff and fixes it
BEFORE shit hits the fan.  /sheeeeeeeeeeeeeeeeeeeeeeeeeeeSH!
  


#277 of 457 by russ on Sun May 8 00:13:29 2005:

Configuration of the new disk isn't quite done; /var/log/wrttmp isn't there.


#278 of 457 by richard on Sun May 8 01:33:09 2005:

I have noticed that now when you !finger anybody to see when they logged in
last, it says "never logged in"  For any user that I've tried.  I guess all
recent login information has been lost?


#279 of 457 by keesan on Sun May 8 01:57:30 2005:

I got garbage again the first time I dialed in but different looking garbage,
and the same wrttmp message.


#280 of 457 by drew on Sun May 8 03:30:28 2005:

Re #274 and 275:
    Out of curiosity I checked out a copy of Spinrite. It seems to do the
usual sector checking and attempted 'fixing', with five different levels of
intervention. But in addition, I've found a screen called "SMART settings"
that went something like this:

   attribute   event cnt   margin
---------------------------------
 ecc corrected: 0             149
rd chan margin: not reported
relocated sect: 0              60
realloc events: 0
   seek errors: 0              49
 spin-up retry: 0              49
 recal retries: 0              49
cabling errors: 0
 uncorrectable: 0
  write errors: 0             149
   temperature: 40'c /104'f
 power-on time: 5,271

Not sure it means anything; a 'margin' of 149 sectors out of 63*255*thousands
seems rather tiny. But I think this is supposed to be the 'bypassing the drive
electronics' part of the program.

    Spinrite also works on VMWare virtual machines. However, the SMART screen
does not appear; instead there's a message saying that SMART data is not being
reported.


#281 of 457 by steve on Sun May 8 05:22:26 2005:

   /var/log/wrttmp seems OK to me.  Is anyone else seeing problems
with anything?  The modems are a separate issue, I think.


#282 of 457 by drew on Sun May 8 05:28:07 2005:

/var/log/wrttmp seems to be restricted-read. is it supposed to be?


#283 of 457 by russ on Sun May 8 12:41:36 2005:

The wrttmp problem is breaking "amin".


#284 of 457 by aruba on Sun May 8 13:27:45 2005:

I get this when I log on:

mesg: Unable to open /var/log/wrttmp to read/write

I guess the mesg program needs access to that directory.


#285 of 457 by eprom on Sun May 8 15:17:42 2005:

looks like /log wasn't permed correctly

last: /var/log/wtmp: Permission denied

mesg: Unable to open /var/log/wrttmp to read/write
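
For anyone wanting to poke at errors like these, here is a minimal sketch of inspecting a file's permissions, assuming Python 3 on a POSIX system. The helper names are invented for illustration, and this is not a description of how staff diagnosed or fixed the problem.

```python
import os
import stat
import tempfile

def describe_access(path):
    """Report a file's mode string and what the current user can do with it."""
    parent = os.path.dirname(path) or "."
    return {
        "mode": stat.filemode(os.stat(path).st_mode),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        # Opening a file also requires search (x) permission on its directory.
        "dir_searchable": os.access(parent, os.X_OK),
    }

def world_readable(path):
    """True if the 'other' read bit is set on the file itself."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)

# Demonstrate on a throwaway file rather than anything under /var/log.
with tempfile.NamedTemporaryFile() as f:
    os.chmod(f.name, 0o600)
    print(describe_access(f.name)["mode"], world_readable(f.name))
```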


#286 of 457 by steve on Sun May 8 21:41:14 2005:

   The /log problem is fixed.


#287 of 457 by mary on Sun May 8 23:25:23 2005:

Thanks, STeve, for spending a gorgeous Saturday, working on Grex.  And 
another thanks to Mark for picking up the new disk and seeing it got to 
STeve and that STeve got to Provide.  

Is newuser still off?


#288 of 457 by jor on Mon May 9 00:52:17 2005:



#289 of 457 by steve on Mon May 9 01:35:45 2005:

   Yes, it's still down.  I need to finish resurrecting the accounts that
got munged.  This next week looks to be better crazy wise, so I should
get it done in the next day or so.


#290 of 457 by keesan on Mon May 9 02:20:57 2005:

I have changed my script to dial the second modem (0513) and no longer get
garbage.  Could someone replace the first modem, assuming we have another
working one?  Steve et al, thanks.


#291 of 457 by steve on Mon May 9 02:50:36 2005:

   I'm sure we have spares.  I forgot to reset the modei yesterday.  That
might be a good thing to do.
   So you've narrowed it down to the first modem.  Good and thanks.


#292 of 457 by twenex on Mon May 9 15:45:38 2005:

Thanks, Steve and team.


#293 of 457 by keesan on Tue May 10 01:12:50 2005:

Someone else told me the first modem was bad.  At least switch the two.


#294 of 457 by steve on Tue May 10 01:36:29 2005:

   Next time I'm there I will.  We need to have both of them
working.


#295 of 457 by albaugh on Tue May 10 22:41:55 2005:

It looks like grex went off the 'net for about half an hour just after 6pm.


#296 of 457 by tsty on Wed May 11 17:47:17 2005:

re #280 ... that disk was in pretty darn good shape based on those data.
  
i have seen errors found/fixed in the    200,000 + range (ForReal!)(tm).
they were primarily ecc errors but other junk showed up in the 100,000+
range on other 'stuff'.
  


#297 of 457 by albaugh on Wed May 11 19:16:33 2005:

Does the bounce below from excite.com imply that one or more grex twits are
responsible for getting grex blacklisted re: mail delivery to excite.com?
(identifying information suppressed)


From MAILER-DAEMON Fri May 06 12:54:00 2005
Envelope-to: a_user@cyberspace.org
Delivery-date: Fri, 06 May 2005 12:54:00 -0400
X-Failed-Recipients: a_user@excite.com
Auto-Submitted: auto-generated
From: Mail Delivery System <Mailer-Daemon@cyberspace.org>
To: a_user@cyberspace.org
Subject: Mail delivery failed: returning message to sender
Date: Fri, 06 May 2005 12:54:00 -0400

This message was created automatically by mail delivery software.

A message that you sent could not be delivered to one or more of its
recipients. This is a permanent error. The following address(es) failed:

  a_user@excite.com
    SMTP error from remote mailer after RCPT TO:<a_user@excite.com>:
    host xmxpita.excite.com [208.45.133.107]: 554 Service unavailable; Client
    host [216.86.77.194] blocked using dynablock.excite.com; Your message could
    not be delivered due to complaints we received regarding the IP address
    you're using or your ISP. See http://blackholes.excite.com/ Error:
    WS-02

------ This is a copy of the message, including all the headers. ------

Return-path: <a_user@cyberspace.org>
Received: from a_user by grex.cyberspace.org with local (Exim 4.42)
        id 1DU65L-0006qP-U4
        for a_user@excite.com; Fri, 06 May 2005 12:53:55 -0400
To: a_user@excite.com
Subject: a_subject
Message-Id: <E1DU65L-0006qP-U4@grex.cyberspace.org>
From: a_user's_name <a_user@cyberspace.org>
Date: Fri, 06 May 2005 12:53:55 -0400

a_subject
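
For context on the "blocked using dynablock.excite.com" line in the bounce above: DNS-based blocklists are conventionally queried by reversing the octets of the sending IP address and appending the list's zone. The sketch below only builds that query name and does no network lookup. Whether excite.com's list followed this convention is an assumption, and the function name is made up for illustration.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name a mail server would look up to test an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return reversed_octets + "." + zone

print(dnsbl_query_name("216.86.77.194", "dynablock.excite.com"))
# 194.77.86.216.dynablock.excite.com
```

By the usual convention, an address is treated as listed if a name like this resolves, which means a block of this kind is keyed to the IP address and not to any individual sender.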



#298 of 457 by scholar on Wed May 11 22:03:44 2005:

better get steve to write a few  e-mails to abuse@gmail.com


#299 of 457 by naftee on Wed May 11 22:33:44 2005:

a_steve


#300 of 457 by keesan on Fri May 13 14:01:47 2005:

Are we supposed to have a ping command?  It tells me 'not found'.  I was
wondering why sdf.lonestar.org suddenly froze up and cannot be accessed now.


#301 of 457 by rksjr on Fri May 13 17:32:24 2005:

Re. #300: I tried the "finger" command circa 1:05 p.m. and got the 
following results:

> finger @sdf.lonestar.org
[sdf.lonestar.org/192.94.73.1]
must provide username

> finger staff@sdf.lonestar.org
[sdf.lonestar.org/192.94.73.1]
finger: staff: no such user

Given that the attempt to "finger" didn't time-out, I assume that they 
are at least partially "up".

If lonestar has an uptime report page similar to that of Grex, you might 
find more information there.

          http://www.cyberspace.org/cgi-bin/uptime


#302 of 457 by keesan on Fri May 13 20:13:00 2005:

They are online again now, but I was simply wondering why we have no ping.


#303 of 457 by tsty on Sat May 14 07:10:20 2005:

abuse ... but you'll have to read it from someone else, i guess.
,.


#304 of 457 by naftee on Sun May 15 02:30:21 2005:

.,


#305 of 457 by gelinas on Sun May 15 03:43:52 2005:

Ping was disabled/removed for all but staff because it can be (and has been)
used for denial-of-service attacks.


#306 of 457 by naftee on Sun May 15 11:29:51 2005:

Just like /etc/passwd files, right, gelinas ?


#307 of 457 by keesan on Fri May 20 04:42:10 2005:

I dialed 4840513 and got a busy number but 4840512 was free.  Isn't it
supposed to hunt through the 'queue' of both numbers?


#308 of 457 by i on Fri May 20 23:14:22 2005:

Isn't 484-0512 the 1st number and -0513 the 2nd number in the queue?
My experience is that phone number queues don't "wrap around"...this
works fine as long as people don't dial numbers further up the queue
...but features like "call back" break this assumption.


#309 of 457 by keesan on Sat May 21 02:40:34 2005:

I have been dialing the second number because the first modem seemed to be
bad.  Has someone replaced it?


#310 of 457 by gull on Sat May 21 21:05:14 2005:

Re resp:308: That's the way it works on our phone system at work.  A
number "lower" in the hunt group will work its way up to "higher"
numbers, but if it reaches the top it doesn't try from the bottom again;
you just get a busy signal.
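
The no-wrap-around behavior described here can be sketched as a small simulation. It assumes a two-line hunt group in the order 0512 then 0513, as suggested earlier in the item; the function name is invented for illustration.

```python
def hunt(lines, busy, dialed):
    """Return the line that answers, or None for a busy signal.

    lines  -- numbers in hunt order, first number in the group first
    busy   -- set of numbers currently in use
    dialed -- the number the caller actually dialed
    """
    start = lines.index(dialed)
    # Hunting only moves forward from the dialed number; it never wraps around.
    for number in lines[start:]:
        if number not in busy:
            return number
    return None

group = ["484-0512", "484-0513"]
print(hunt(group, {"484-0513"}, "484-0513"))  # None (busy signal)
print(hunt(group, {"484-0513"}, "484-0512"))  # 484-0512
print(hunt(group, {"484-0512"}, "484-0512"))  # 484-0513
```

The first call matches the reported case: dialing the second number directly gives a busy signal even though the first line is free.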


#311 of 457 by naftee on Sat May 28 13:10:05 2005:

I get a permission error when trying to access http://grex.org/cgi-bin/wnu


#312 of 457 by jep on Tue May 31 16:19:35 2005:

The login screen has said for weeks that newuser is temporarily 
disabled, and that the staff is still working on restoring some 
accounts.  How many accounts are still being restored?  How long until 
newuser will be turned back on?

Thanks!


#313 of 457 by naftee on Tue May 31 16:27:52 2005:

Thanks, jep !


#314 of 457 by mary on Tue May 31 16:29:57 2005:

That we know of there is only one staff member left on Grex who has the 
skills to work through this issue.  That's STeve.  STeve knows about the 
problem.  I have to assume his life is very busy or he'd be working on it.  
There is no estimate on when this will be fixed.  


#315 of 457 by albaugh on Tue May 31 21:20:29 2005:

The following is not to put pressure on STeve.  But despite my own level of
"eh" re: newuser being shut off, it's not very tenable that grex, with its
stated mission, a) should be running for an extended period of time with
newuser down, and b) should have only *ONE* staffer able to address the
situation.


#316 of 457 by tod on Tue May 31 21:28:19 2005:

Perhaps Cyberspace should dissolve itself.


#317 of 457 by nharmon on Tue May 31 21:31:55 2005:

Perhaps we need more volunteer sysadmins?


#318 of 457 by albaugh on Tue May 31 21:41:22 2005:

If it dissolved itself, would that really be a "solution"?  ;-)


#319 of 457 by mary on Tue May 31 22:25:27 2005:

It's a lot easier to be provocative than to find reasonable solutions.


#320 of 457 by naftee on Wed Jun 1 00:18:31 2005:

i volunteer.


#321 of 457 by slynne on Wed Jun 1 00:24:23 2005:

resp:315 as it happens, I agree with you. I can't think of any good
solutions though. I don't personally have the time to learn how to fix
grex's issue myself. I don't know anyone else who might have the skills
and who would be willing to help out. I wish I did. 


#322 of 457 by jep on Wed Jun 1 13:11:14 2005:

Yeah, identifying the problem in this case is easier than fixing it.  
Grex's staff has been very exclusive and non-trusting of other users.  
It's worked pretty well in the past, but now that they're losing 
interest and dropping off in activity level, there are no replacements 
and no method for bringing in any new staff members.  It might be a 
good idea for the Board to put some serious thought into the matter and 
formulate a plan of some kind.  I know the idea has been mentioned in 
coop before, but nothing came of it.

Mdw and valerie are gone, janc and steve are about half here (with 
amazing bursts of energy at times), i just keeps cranking along 
invisibly but isn't participating in conferences much any more.  With 
all due respect for all of these people, they don't seem like active 
users in Grex's current environment.  It's harder to imagine any of 
them returning to active participation than just going away.  When they 
do, or at least if they do, who runs Grex?


#323 of 457 by mary on Wed Jun 1 15:49:30 2005:

I think there are some important questions that deserve to be mentioned, 
like, why should a qualified person want to volunteer their time for Grex?  
Is the sense of community persuasive enough to be worth the effort?  Is 
the time donated appreciated? Do they like the people that will benefit 
from their generosity?  Is Grex a fun place worthy of support?

It may not be the fault of the staff and potential volunteer that we're in 
this pickle.


#324 of 457 by other on Wed Jun 1 16:04:05 2005:

Mary has a good point.  Anyone doing a simple cost/benefit analysis of
being root staff of grex would have to be either crazy to do it, or
they'd very likely have some kind of agenda.  In either case, they're
not very attractive candidates for the job.  That leaves only those who
are really interested in the job, or at least in the skills that having
the job will help develop (like a healthy insensitivity to harsh and
worthless criticism).


#325 of 457 by nharmon on Wed Jun 1 16:20:37 2005:

From looking at http://www.grex.org/staffnote/, one would get the
impression that Grex is overflowing with staff members. And this might
discourage people from volunteering. Perhaps we can trim this list?

Re: 323 and 324, We all understand Grex Board's reluctance to just hand over
the keys to someone they don't know. This is why it is necessary to constantly
bring in new volunteers, people who gradually are trusted with more of the
system. But that doesn't seem to be happening. How many people on the list
of Grex staff started working as staff in the past 2 years?


#326 of 457 by mary on Wed Jun 1 16:35:45 2005:

That list is out of date.  Current staff, as I know it:

steve (STeve Andre)
kip (Kip deGraff)
gelinas (Joe Gelinas)
i (Walter Cramer)
cross (Dan Cross)
spooked (Michelangelo Giansiracusa)
arthurp (Charles Mitchell)
remmers (John Remmers)
mdw (Marcus Watts)
srw (Steve Weiss)
janc (Jan Wolter)

Of those eleven, three are new to staff within the past two years.
I hope I haven't missed anyone. 


#327 of 457 by rcurl on Wed Jun 1 17:11:37 2005:

How many of these can fix newuser? 


#328 of 457 by albaugh on Wed Jun 1 17:47:51 2005:

If grex cannot fulfill its mission - which necessarily means having a working
newuser - then it has no business asking for donations to support that
mission.  The board must realize this as being part of its responsibility to
address.  If everyone is weary of grex for one reason or another, then fine,
turn off its nonexempt / charitable status, and turn it into a private club,
one which may or may not work at any particular time, depending on whether
or not anyone feels like addressing any problems which may crop up.

Call a spade a spade!

(this really belongs in coop, I know)


#329 of 457 by gull on Wed Jun 1 17:57:04 2005:

Re resp:317: My understanding is you don't volunteer, you get invited.


Re resp:322: Given the beatings staff takes in the conferences, if I
were a staff member I'd probably avoid them, too.


#330 of 457 by eprom on Wed Jun 1 18:11:30 2005:

I'm getting backtalk errors when I try to erase or hide a response


#331 of 457 by cross on Wed Jun 1 18:38:56 2005:

This response has been erased.



#332 of 457 by jep on Wed Jun 1 18:41:34 2005:

Mary, how many of your list are actively working on anything for Grex?  
Loginids kip and mdw haven't logged in since February so I assume they 
are "staff emeritus".

I am not familiar with the activities of the staff in general.  I am 
only aware of actions by i, steve and janc in the last 3 months.  I 
thought cross had resigned from staff, but if he's active, that's great.

I don't want to blame anyone for anything.  The staff has done a 
terrific job for years, and so has the Board.  I do want to suggest 
that, if the staff members we have can't keep Grex running, then 
something needs to be done.  Maybe Grex can ask for volunteers from the 
users.  The newuser program is critical to Grex.  I suggest it's 
important enough to the survival of Grex that it's worth changing 
Grex's time honored practices in order to get it working.


#333 of 457 by mary on Wed Jun 1 21:48:31 2005:

That I know of there are only a couple of people on that list that know 
the software well enough to fix newuser.  And one of them is pretty much 
not around.

I'm wondering, again, if it isn't time to drop PicoSpan and our newuser
program and go to Backtalk and Frontalk entirely.  Now, I'm talking out of
ignorance here that Backtalk even has some type of an account setup
procedure that would allow us to dump newuser.  But if so, it may make
sense.  I know Jan has Backtalk working elsewhere on the web, what do
those systems use for account maintenance?

I'm very anxious to get newuser running.  Even without some of the 
damaged accounts restored.  At least those affected would be able to 
setup new accounts and again participate.  We really, really must 
make it a priority to not be using software or hardware that is 
owned or configured in such a way that only one or two people can
dig us out if there are problems.


#334 of 457 by naftee on Wed Jun 1 22:28:04 2005:

re 322 You're making yourself invisible by having your responses deleted. 
oops


#335 of 457 by cross on Thu Jun 2 01:15:38 2005:

This response has been erased.



#336 of 457 by keesan on Thu Jun 2 01:48:44 2005:

I think grex is nearly perfect the way it is, apart from the broken newuser
and an occasional crash.


#337 of 457 by mary on Thu Jun 2 02:15:18 2005:

After speaking with John this evening, about the newuser program, I 
realize I really don't know enough about it to suggest changes.  Moving to 
Backtalk we'd still need newuser.  It's not there just for Picospan.  I'm 
learning.  


#338 of 457 by mary on Thu Jun 2 02:28:22 2005:

As to Dan's comments - I'm sorry you're feeling so burned on Grex.  I'm 
worried and looking to improve things, which I guess means I'm not 
burned.  I've been here from day one and we're not big on fast changes, 
that's for sure, but we are big on being a user run system, where things 
happen by consensus, if at all.  And we're an open system where anyone 
who finds us gets to act out in ways they couldn't in grade school.  And 
we don't have much money.  And we depend on volunteers to keep the wheels 
spinning.  I consider it sheer magic we've lasted this long.  Wow.  
Think about it.

I'm also pretty darn good at hanging in there with problems rather than 
throwing up my hands and walking away.  Someday, Grex will end.  Yep.  
It will.  But I'm hoping it's not for a while yet.  

So, my next project: Lighting a Fire Under STeve.  I plan to call him 
and explain how Grex is at his mercy.  I'll offer to take him to Zing's 
for dinner.  Shameless manipulation.  Wish me luck.


#339 of 457 by cross on Thu Jun 2 03:01:27 2005:

This response has been erased.



#340 of 457 by naftee on Thu Jun 2 03:06:51 2005:

I enjoy reading cross' GreXsoft item in the garage conference.  Everyone
should check it out.


#341 of 457 by jep on Thu Jun 2 03:38:10 2005:

Good luck, Mary.

Let me give a little of my perspective on what Dan was talking about, 
and how it conflicts with what Mary was talking about in resp:338.

Mary said Grex is run by the users.  It is, to a point, but the users 
are given the level of consent allowed to them by a very small and 
select group.  Mary is "in".  Jan is "in", and Marcus, STeve, Valerie, 
John Remmers, and as many as several others, but certainly no more 
than several.  That group makes all of the decisions about the staff, 
and all of the real decisions about how Grex is going to be run.  It 
always dominates the Board and always has, and it always will.  It's 
the group recognized by the peripheral users like myself as the group 
which "gets it" about how Grex is supposed to run.  If any of the 
peripheral users didn't agree with that group, they would go away (and 
this has happened), because it's just not worth it to push anything 
against that group.

When the in group stuck with STeve about SunOS for an extra decade, 
then it was dang well written in stone that Grex was going to be on 
SunOS.  When that group finally conceded it was plain unreasonable to 
keep running on SunOS, it took 3 more years to get onto PC hardware.

I've been rooting for cross to gain influence and bring some new ideas 
into the staff for years... something like 5 years I think.  He has 
fought and clawed and scratched and beaten people up to edge his way 
into having some say for all of that time (from my peripheral 
perspective) and now he says he's worn out.  It is *hard* to break 
into that circle, if it's possible at all, and it's not worth it for 
anyone but a fanatic.  But the "in" group is anti-fanatic, too.  Aruba 
entered it, and maybe srw could have.  So I guess there's a way but it 
seems pretty dang rare to me and I'm not sure it's possible any more.

It's been fine, the "in" group has done a fine job of keeping things 
usable and comfortable for everyone else.  It's not a tyrannical group 
at all, as long as you don't oppose any of its core principles such 
as the technical infallibility of mdw and steve.

But now the "in" group is shrinking and moving on to other things.  I 
already referred to mdw and valerie abandoning Grex, and others being 
less interested.  I don't blame them; everyone changes over time, but 
there's no one around and acceptable to take their place.

Given an unexpandable core, but one which can diminish, it's 
inevitable that an organization is going to fade in time.  M-Net's 
entire core vanished, then the peripheral group collapsed and formed a 
new core, then that vanished, too.  The result can be seen.  (M-Net 
has 6 eligible voters, and didn't manage to have an election in April 
as required by the by-laws.)  There but for the grace of <insert deity 
here> goes Grex.

Grex needs to bulk up its core, but the "in" group here has as one of 
its very highest principles that no one else "gets it".  The silent 
majority of us peripheral users support their belief, too.


#342 of 457 by naftee on Thu Jun 2 04:33:52 2005:

jep, you are one messed-up guy.  M-net has several very fine and talented
staff members who actually enjoy keeping the system running.


#343 of 457 by eprom on Thu Jun 2 04:47:12 2005:

it'd be nice if we could post edit our posts....I made a boo-boo :O


#344 of 457 by glenda on Thu Jun 2 07:10:47 2005:

STeve is working on it as he has time.  He is having a real time crunch at
work with many machines there acting up, budget time for purchase of new
machines with their installation problems (and what looks like a bad batch
of Dells).  When choosing between the paying job and volunteer Grex to spend
limited time and energy on, I'm sorry, but the paying job has to win. 
Especially this summer when my job gets dropped from 20 hrs/wk during the
semester to just 4 wks of work from May through September.


#345 of 457 by cross on Thu Jun 2 13:06:42 2005:

This response has been erased.



#346 of 457 by twenex on Thu Jun 2 13:34:31 2005:

Anyone doing a simple cost/benefit analysis of
 being root staff of grex would have to be either crazy to do it, or
 they'd very likely have some kind of agenda. 

Anyone who doesn't have an agenda is probably not the kind of person you want
to have around when the going gets tough. That happens a LOT in system
administration - on any platform you care to name.


#347 of 457 by gull on Thu Jun 2 13:53:41 2005:

Re resp:345: Well, information hiding is one way an "in group" stays
exclusive.  Information is power in any organization.


#348 of 457 by naftee on Thu Jun 2 14:58:58 2005:

re 344 That means you can start cooking at home ! :-0


#349 of 457 by nharmon on Thu Jun 2 15:06:33 2005:

The old argument of "anyone who wants to be {root,ircop,sysop,admin} just
wants it for the power, or to further their agenda, otherwise, they wouldn't
want to be {root,ircop,sysop,admin}" has been used for decades to discourage
people from volunteering their time to improve a system. 

I mean, am I the only one who was somewhat offended that, when people started
suggesting that we need more volunteers, the response was that the users
were to blame because of how staff is treated in BBS?

Which, by the way, I don't buy for a second. After system crashes, there is
usually an outpouring of gratitude and support from the users. Save for a few
trolls here and there, people generally support the staff.


#350 of 457 by tod on Thu Jun 2 15:26:35 2005:

re #344
Thanks for the status update.  I can certainly sympathize.  Best of luck to
STeve.


#351 of 457 by rcurl on Thu Jun 2 16:22:59 2005:

Would STeve be willing to have some staff apprentices work with him? 


#352 of 457 by nharmon on Thu Jun 2 17:14:27 2005:

How about Grex call out for people who are willing to volunteer to help out
with the system. These volunteers can list their abilities, things they are
good at. Then you take that, create a skills matrix, and assign projects;
problems; etc. to teams of people with the skills applicable to that project;
problem; etc.

Then all Steve would really need to do is make sure things get done, and are
routed to the appropriate teams.


#353 of 457 by tod on Thu Jun 2 17:30:55 2005:

re #352
But that sounds like WORK!  :(


#354 of 457 by keesan on Thu Jun 2 17:59:05 2005:

Glenda, could STeve show you how to help with this during the period when you
are not working?  


#355 of 457 by twenex on Thu Jun 2 18:48:11 2005:

Re: #349. Nate, you're right.

To be fair, not all the sysadmins on Grex blame the users. For various
reasons, some of them unrelated to their capacities as sysadmin, I've
suspected for quite a while that some of those who do have a generalized
attitude problem.


#356 of 457 by jep on Thu Jun 2 19:07:56 2005:

re resp:351: In fairness to the staff members such as STeve, training 
someone else to do a job is hard.  If he doesn't have time to fix 
newuser, he probably doesn't have time to teach someone else how to fix 
it.


#357 of 457 by twenex on Thu Jun 2 19:16:14 2005:

 Grex needs to bulk up its core, but the "in" group here has as one of
 its very highest principles that no one else "gets it".  The silent
 majority of us peripheral users support their belief, too.

If that's true, which it very well may be, then it's the "in group" that need
to "get it". Many of our most regular users (myself included) have some level
of UN*X expertise. Many more of them, perhaps excluding only the trolls,
subscribe to some version of our "philosophy". Indeed, iirc, complaints about
the (perceived or real) abuse of the system or its principles have, when
specific, usually come from the users and been directed at staff, not the other
way around; other staff have also either kept silent on the issue or defended
the target(s) of the complaints. 

It's easier to inculcate technical expertise than philosophy. Perhaps none of 
those outside the "in group" have the expertise to hack on a binary-only copy 
of newuser, but given that those who do won't last forever 
(for whatever reason), who cares? It may be time to start replacing our 
proprietary sw with open-source versions, or at least with versions which have
source code open, but only to staff. The more sysadmins we have, the more
time they will collectively be able to spend on projects like this. If staff
want more colleagues, and they don't accept that people who might have the
ability to join them might not know Grex inside out but can be shown the ropes,
and that their expertise can grow over time (they can learn by doing), they are
going to *have* to accept it. If not, I can only hope that those users who
care revolt, and set up their own system, a la Grex, just like what happened in
the early nineties to M-Net. I don't know if there's enough momentum for that
to happen, though.


#358 of 457 by tod on Thu Jun 2 19:25:39 2005:

re #357
 revolt, and set up their own system, a la Grex, just like what happened in
 the early nineties to M-Net. 
I think you got it backwards.  There is too much apathy and ego invested here
for skilled volunteers to "step on toes" I mean..."get trained" by existing
staff.


#359 of 457 by rcurl on Thu Jun 2 19:29:13 2005:

Re #356: that's a recipe for nothing ever getting fixed when current staff
fades away. But I don't believe it. I suspect there are quite a few
members that need only know what the fault is, and could look at the code
and fix it. OK - ask them to do a "beta" fix, and STeve can check it out
and test it, before installation.  But things would move forward - and
some new people would learn how Grex software works.


#360 of 457 by tod on Thu Jun 2 19:46:40 2005:

I bet Mike McNally could fix newuser without too much tutorial on where it
is and operates but you don't see anyone welcoming him with open arms.
It's a shame more folks aren't invited to volunteer because the qualifications
aren't THAT specialized.


#361 of 457 by happyboy on Thu Jun 2 19:48:28 2005:

mike doesnt have the proper level of
aspberger's syndrome to fit into that
role.


#362 of 457 by tod on Thu Jun 2 19:59:18 2005:

If you don't show up for the Grex Trundle and Trough-off then you're not
trustworthy.


#363 of 457 by mary on Thu Jun 2 20:04:13 2005:

Is this the part where we're being nice to staff?


#364 of 457 by jep on Thu Jun 2 20:07:59 2005:

re resp:363: I see offers of help, and tod and happyboy being 
irrelevantly trivial which is normal for them.  What do you think would 
help?


#365 of 457 by happyboy on Thu Jun 2 20:14:14 2005:

you just responded to yourself, nerse ratchet


#366 of 457 by naftee on Thu Jun 2 20:29:21 2005:

would you trust a gay man with your children ?

would you trust TWENEX with your COMPUTER ?!


#367 of 457 by tod on Thu Jun 2 20:36:08 2005:

re #363
WHAT staff?  Why is it staff that has the authority to decide who can
volunteer and who can't?  Why doesn't the BoD step up to the plate and make
some management decisions instead of letting things drag on over and over?


#368 of 457 by mary on Thu Jun 2 21:01:54 2005:

The board has always been of the opinion that we're not going to micro-
manage staff.  That has worked pretty well in the past.  We've encouraged, 
asked what we could do to help, and facilitated as we could.  Staff, on 
the other hand, has, for the most part, never trucked off on their own 
without consulting with the board and the membership on important issues.  
It's been teamwork.

Every once of people skills I own is telling me now is not the time to 
change that policy.  You may disagree.  You may want to run for the board 
next time around, stating that's how you'd do business, and see how it 
goes.  


#369 of 457 by mary on Thu Jun 2 21:02:42 2005:

s/once/ounce/


#370 of 457 by tod on Thu Jun 2 21:18:48 2005:

I understand not wanting a barn full of admins running amok and fscking up
the system but what if the opposite is happening and the barn is empty?
Right now, newuser is defunct.  There is "one" staff person that everyone is
aware of that can fix it.  That "one" staff person is tight on time and has
had health concerns.  Is that how you want to do business with Grex?


#371 of 457 by nharmon on Thu Jun 2 21:22:08 2005:

> You may want to run for the board next time around, stating that's how
> you'd do business, and see how it goes.

Ancient Chinese proverb speaks of being careful of what one wishes for.


#372 of 457 by glenda on Thu Jun 2 22:31:05 2005:

STeve is working on the problem, he just called to ask me if I needed the car
tomorrow so that he can work on it tonight for as long as it takes to get it
done.  Even if it means he misses his ride to get enough sleep to be useful
at work and has to drive himself in tomorrow.  He is sorry for not getting
it done sooner.  He will also be showing me how to do such things so that I
can help more in the future.  (Mary missed me on the staff list.)


#373 of 457 by rcurl on Thu Jun 2 22:58:10 2005:

This is a good beginning for the immediate problem, but it seems there is
still need for a longer range solution, which is the development of additional
volunteer staff members.


#374 of 457 by steve on Thu Jun 2 23:48:13 2005:

  Indeed,  more are needed.

  I am getting the data right now to finish the fixing of master.passwd.


#375 of 457 by drew on Fri Jun 3 00:30:13 2005:

When exactly did newuser go fubar? Was it coincident with going to the new
system? Was it coincident with moving into the co-lo?


#376 of 457 by nharmon on Fri Jun 3 00:50:26 2005:

I believe it had something to do with a drive going bad, and the password file
being messed up.


#377 of 457 by cross on Fri Jun 3 03:40:06 2005:

This response has been erased.



#378 of 457 by slynne on Fri Jun 3 03:50:15 2005:

I have a feeling that both glenda and steve would agree that it isn't
fair to dump everything in their lap. I am sure they will correct me if
I am wrong.

I agree that the lack of staff is a board issue. I am not sure exactly
what the solution is here. I want to make people feel welcomed enough to
feel that they can volunteer to be on staff without being in some sort
of in-crowd. I want the staff to let those people who volunteer do things. 

Part of the problem, as I see it, is security. The staff tend to allow
people they know onto staff because those are the people they know they
can trust. It is true that the best way to become staff on grex is to be
invited. It is an "in crowd" on staff. On the board too I suppose
although it is probably easier to get on the board than it is to get
onto staff. 

I really don't know what the best solution is though. We could always
double the staff's pay ;)


#379 of 457 by aruba on Fri Jun 3 03:52:01 2005:

Re #375: Drew - the problems with newuser are more recent than either the
move or the change to NewGrex.

jep's assessment of the staff situation in #342 (I think) was pretty
accurate a couple of years ago.  But we definitely passed the point where
the board felt it was a good idea to depend on STeve and Marcus to fix
things.  A few years ago we acquired 3 or 4 new staff members, and we really
hoped that would solve the problem.  Unfortunately, for various reasons,
we're largely back where we were.  When this latest crisis came up, for
instance, no one but STeve stepped up to work on it, and then he got it half
fixed and moved on to other crises, leaving Grex in limbo.

We certainly need more staff, and it's certainly true that there have been
many barriers to getting onto the staff.  We need a procedure for giving
potential staff members some responsibility to see how they do, then
"promoting" them if they do well.  But the board/staff also needs the
discretion to ignore applications from people who are clearly just trying to
cause trouble.

There's a needle to be threaded there.


#380 of 457 by aruba on Fri Jun 3 03:53:19 2005:

(Lynne slipped in.)


#381 of 457 by glenda on Fri Jun 3 07:36:01 2005:

Re #354:  Because Glenda is burned out from almost 6 years of intensive
computer classes and needs a break, that is one of the reasons she is not
working regular classes, just the 4 one week long special sessions this
summer.  She is going to spend most of her time working on organizing the
house, get her spinning wheels and looms up and working, stitching, beading,
and other general crafting, and reading fiction and crafting books.  The most
technical reading will be SciFi (my favorite genre).  The only real computer
work I am planning on doing is building the computer that I have had
components for since just before the emergency surgery in December 2004.  It
is a 64 bit processor machine with a minimum of 250G hard drive (maybe more),
a gig of memory and will run OpenBSD as main OS, Net and Free BSDs and SuSE
to play with, haven't decided whether it will have a small windows area or
not.  Some of my needlework, weaving, and beading software only runs on
windows since most of the people using it aren't really computer literate and
only use windows.


#382 of 457 by scholar on Fri Jun 3 08:33:12 2005:

Right, because computer literacy is defined by Unix.


#383 of 457 by jadecat on Fri Jun 3 12:44:47 2005:

In regards Cross's comment in #377- it does seem that more information 
needs to be shared between current staff members.

It's hard for other staffers to help when the only one that knows isn't 
sharing info. I've seen Cross ask several times, in this item, for more 
information on what's wrong so he (or someone else) can help, but 
haven't seen ANY indication that STeve (or glenda) is sharing any of 
that. Maybe this is happening in staff e-mail, but I would think that 
if it was Cross wouldn't be here repeatedly asking for more info.

Right now I'm one of those people that can telnet in to my account but 
can't access the mooncat account via Backtalk (which makes me glad I 
created this account on a whim a few months ago).


#384 of 457 by jep on Fri Jun 3 13:42:19 2005:

It was a lot easier for me to write my perspective about how Grex is 
limited by lack of trust than it is to overcome that limitation.

I think resp:379 is part of the right approach for bringing in 
additional volunteers to the staff.  There's also a need for some sort 
of training procedure to bring new staffers up to speed with the 
philosophy and practices of the staff.  That will require time and 
effort from someone on the staff.

It's also going to require patience and flexibility from the staff and 
all of Grex, because new people coming in are going to have their own 
ways of doing things.  It's not fair to expect them to suppress their 
personalities and the techniques they have used in the past in other 
contexts to volunteer for the staff of Grex.  They do have to fit into 
the team which exists, but the team has to adjust, too.

In the short term, it's easier for the existing staff to just do things 
themselves to get by, but right now it seems apparent the short term 
doesn't last forever.


#385 of 457 by twenex on Fri Jun 3 14:55:37 2005:

Hear, hear.


#386 of 457 by naftee on Fri Jun 3 15:36:03 2005:

has steVE still not told cross what is wrong ?! i believe cross is available
to fix.


#387 of 457 by tod on Fri Jun 3 16:02:55 2005:

Cross is not in the Burns Family "circle of trust"
I've got my EYES on you, Focker.  ;)


#388 of 457 by naftee on Fri Jun 3 16:59:04 2005:

that movie was OK, but got kind of tiresome, i think


#389 of 457 by mary on Fri Jun 3 19:15:56 2005:

Sorry I missed you on staff, Glenda, and thanks for the correction.  Did I 
miss anyone else?

And a huge thanks to STeve for looking at the newuser problem last night.  
Is there an update?  Is newuser back on?


#390 of 457 by naftee on Fri Jun 3 19:49:17 2005:

thanks, huge mary !


#391 of 457 by tod on Fri Jun 3 19:49:48 2005:

Don't call her huge, looni Jim


#392 of 457 by naftee on Fri Jun 3 19:52:05 2005:

whoops,  big slip :(   

sorry, mary !


#393 of 457 by glenda on Fri Jun 3 19:57:54 2005:

Re 389:  Not yet, still working on it.  

Putting an item in Agora asking STeve for information doesn't work.  He hasn't
read Agora in years unless I tell him of an item he should look at.  I forget
more items in Agora than I read, so I don't always see them either.  I also
heavily filter what/who I read just so that I can keep up.  About the only
CF that STeve reads is staff, and I am not sure how often he gets there.


#394 of 457 by tod on Fri Jun 3 20:52:54 2005:

That's good to hear! Right, Mary?


#395 of 457 by naftee on Sat Jun 4 01:23:48 2005:

I sure hope steVE reads the system problems' item.

oh wait; that's THIS item.


#396 of 457 by tsty on Sat Jun 4 02:21:25 2005:

... ???? mdw is gone?   huh?!


#397 of 457 by jor on Sat Jun 4 16:24:37 2005:

        May I have permission to just *look*
        at the source code for newuser?
        I can take "no" for an answer.

        Where is it?




#398 of 457 by naftee on Sat Jun 4 17:14:31 2005:

/grex/grexdoc/newuser


#399 of 457 by drew on Sat Jun 4 20:02:35 2005:

Newuser was working just before the disk went bad, was it not? My guess is
that it isn't the code itself that's broken, but one or more data files that
newuser depends on.

What exactly happens if you put newuser online and have someone run it?


#400 of 457 by gelinas on Sun Jun 5 04:39:05 2005:

I've updated the staff list at http://www.grex.org/staffnote/


#401 of 457 by scholar on Sun Jun 5 20:33:27 2005:

thanks, joe!


#402 of 457 by naftee on Sun Jun 5 21:19:49 2005:

thanks, scholar !


#403 of 457 by tsty on Mon Jun 6 16:26:50 2005:

  
  
grex% ls -las /grex/grexdoc/newuser
total 36
2 drwxrwxr-x   7 root  staff  1024 Jan 18  2004 .
2 drwxr-xr-x  31 root  staff  1024 Jan  3 13:13 ..
4 -rw-r--r--   1 root  staff  1396 Jan 18  2004 00-account
2 -rw-rw-r--   1 root  staff   164 Dec 27  2003 01-newuser
2 -rw-rw-r--   1 root  staff   160 Dec 30  2003 02-wnu
2 drwxrwxr-x   2 root  staff   512 Oct 14  2004 CVS
2 -rwxr-xr-x   1 root  staff    45 Dec 28  2003 build_newuser
2 -rwxr-xr-x   1 root  staff    65 Dec 29  2003 build_wnu
2 drwxr-xr-x   3 root  staff   512 Jan  3 17:29 datafiles
2 -rwxr-xr-x   1 root  staff   684 Oct 14  2004 install_newuser
2 -rwxr-xr-x   1 root  staff   367 Dec 30  2003 install_wnu
4 drwxr-xr-x   3 root  staff  1536 Oct 16  2004 nu
4 drwxr-xr-x   3 root  staff  1536 Dec 28  2003 src
4 drwxr-xr-x   3 root  staff  1536 Oct 17  2004 wnu
  


#404 of 457 by albaugh on Mon Jun 6 22:00:05 2005:

BTW, if newuser doesn't run properly due to munged data, then it is deficient.
(yes, yes, there are limits on what one can expect any program to do when it
is asked to work with bad data)
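
One way to spot munged password data before newuser ever touches it is a quick field check. This is only a sketch: it assumes the standard BSD master.passwd layout of ten colon-separated fields with numeric uid and gid, and the function name check_passwd_fields is made up for the example. Nothing here is taken from Grex's actual tools.

```shell
# check_passwd_fields FILE
# Hypothetical helper: print "lineno: line" for every entry that does not
# look like a BSD master.passwd record (10 colon-separated fields,
# non-empty login name, numeric uid and gid).
# Returns 1 if any bad entry was found, 0 otherwise.
check_passwd_fields() {
    awk -F: '
        NF != 10 || $1 == "" || $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/ {
            printf "%d: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Run it against a copy of the file rather than the live one, e.g. check_passwd_fields /tmp/master.passwd.copy. It only catches structural damage; it says nothing about whether the entries themselves are right.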


#405 of 457 by jor on Tue Jun 7 17:29:48 2005:


/a: write failed, file system is full
Got error 28 (No space left on device) in buffer write




#406 of 457 by naftee on Wed Jun 8 15:28:25 2005:

por jor


#407 of 457 by keesan on Thu Jun 9 02:05:26 2005:

Is it time to remove the 'grex was down from April 26 to April 29' message
or is this meant to show how long we have been up since then?


#408 of 457 by russ on Thu Jun 9 03:00:21 2005:

I took it upon myself to do so.


#409 of 457 by naftee on Thu Jun 9 04:28:50 2005:

thanks, russ !


#410 of 457 by nharmon on Thu Jun 9 16:00:21 2005:

Looks like /a is full.


#411 of 457 by russ on Fri Jun 10 05:08:27 2005:

/a is full again.


#412 of 457 by eprom on Fri Jun 10 18:53:30 2005:

Jesus H christ, can someone make more room on /a? Whatever 
happened to the 3 month reaper?


#413 of 457 by tod on Fri Jun 10 19:19:16 2005:

/a whatchit


#414 of 457 by naftee on Fri Jun 10 20:17:18 2005:

get a home directory on /c, dipshits


#415 of 457 by nharmon on Fri Jun 10 20:44:47 2005:

Shaddup shuttin' up, or I'll give ya a fat lip!


#416 of 457 by eprom on Fri Jun 10 21:26:25 2005:

re #414

hey assclown, I would if it weren't for the fact that newuser is also in a
state of disrepair.


#417 of 457 by jor on Fri Jun 10 21:44:23 2005:

        "is this the humor item?"


#418 of 457 by mcnally on Fri Jun 10 21:45:12 2005:

Simple suggestion for clearing some space on /a:

  find /a -mtime +3 -name .pine-debug\? -exec rm \{\} \;

(for the non-Unixy people, that command will find all of
the files named .pine-debug? (where ? is a wildcard that
matches a single character) that were last modified more
than three days ago, and remove them.)

My search to find out how much space they're taking
shows they're still taking rather a lot of space.

  grex% find . -name .pine-debug\* -ls | ~/pdb_total.pl
   7521 files
  87279280 bytes

That's 87,000,000 bytes of space that's being wasted by
stuff that nobody will ever look at or ever miss..  In terms
of modern disks that's not huge but hey, 87MB here, 100MB
there, pretty soon it adds up..
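
For anyone who wants to repeat that count without the pdb_total.pl script (which isn't shown here), the following is a rough stand-in. The function name pine_debug_total is invented for this sketch, and it only measures; it removes nothing.

```shell
# pine_debug_total [DIR]
# Hypothetical helper: count the .pine-debug* files under DIR (default:
# current directory) and add up their sizes in bytes.
pine_debug_total() {
    dir=${1:-.}
    # "wc -c < file" prints just the byte count, one number per file,
    # so awk only has to count lines and sum the first field.
    find "$dir" -type f -name '.pine-debug*' \
        -exec sh -c 'wc -c < "$1"' sh {} \; |
        awk '{ files++; bytes += $1 }
             END { printf "%d files\n%d bytes\n", files, bytes }'
}
```

Running pine_debug_total /a before and after a cleanup gives a quick idea of how much space was actually recovered.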


#419 of 457 by jor on Fri Jun 10 21:54:14 2005:

        cat | butter | toast > feet


#420 of 457 by naftee on Fri Jun 10 21:58:58 2005:

re 416
i forgot about newuser being broken.  oops !

get the GreX stafferz to give you a home directory on /c


#421 of 457 by jor on Fri Jun 10 23:47:46 2005:

        login banner mentions item 28


#422 of 457 by nharmon on Sun Jun 12 20:31:44 2005:

load averages:  2.40,  2.06,  1.89                                    16:29:02
94 processes:  2 running, 88 idle, 4 stopped
CPU states: 71.5% user,  0.0% nice, 28.5% system,  0.0% interrupt,  0.0% idle
Memory: Real: 198M/340M act/tot  Free: 1170M  Swap: 0K/3072M used/tot

  PID USERNAME PRI NICE  SIZE   RES STATE WAIT     TIME    CPU COMMAND
25988 leemak    64    0  280K  912K run   -       21:08 92.97% warcaby
 5250 _mysql     2    0   34M   16M sleep poll    16:06  0.00% mysqld




#423 of 457 by gelinas on Mon Jun 13 01:47:04 2005:

The reaper was to be replaced by an automated process.  I should check to see
if the one we were using got moved over.


#424 of 457 by mcnally on Mon Jun 13 07:48:42 2005:

/a out of disk space again.  I removed some files to clear enough disk space
to enter this response but it'll be out again in no time flat.
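
Until the reaper is back, a small check like the one below could at least flag a filesystem before it hits 100%. It is a sketch only: it assumes the POSIX "df -P" column layout (capacity in column 5, mount point in column 6, no spaces in mount points), and fs_usage_warn is a made-up name, not an existing Grex script.

```shell
# fs_usage_warn LIMIT
# Hypothetical helper: read "df -P" output on stdin and print a line for
# every filesystem whose capacity is at or above LIMIT percent.
fs_usage_warn() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)          # strip the trailing percent sign
        if (pct + 0 >= limit + 0)
            printf "%s is %d%% full\n", $6, pct
    }'
}
```

Typical use would be df -P | fs_usage_warn 95, run by hand or from cron.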


#425 of 457 by naftee on Mon Jun 13 16:35:46 2005:

/c :)


#426 of 457 by naftee on Thu Jun 16 02:41:41 2005:

!last -2 keesan
keesan    tty01                             Wed Jun 15 22:24   still logged in
keesan    ttyp9    pm918-21.dialip.mich.net Wed Jun 15 19:18 - 19:27  (00:09)


!!!! 



#427 of 457 by jor on Thu Jun 16 09:09:52 2005:

        ????


#428 of 457 by albaugh on Thu Jun 16 18:13:27 2005:

I REALLY HATE IT when my agora conf. participation file gets stomped when the
idle zapper gets me when I'm sitting in bbs at agora!!!!!!!!!!!!!


#429 of 457 by tod on Thu Jun 16 18:15:42 2005:

Maybe you shouldn't idle?


#430 of 457 by albaugh on Thu Jun 16 18:24:34 2005:

Han Solo:  "Even I get boarded sometimes."


#431 of 457 by mcnally on Thu Jun 16 18:31:07 2005:

 re #428:  I lost a near-finished nethack game to the idle zapper earlier.
 D'oh!


#432 of 457 by naftee on Thu Jun 16 18:52:20 2005:

the GreX idle zapper should take jor's model and knock you off after an hour,
not twenty minutes


#433 of 457 by jor on Thu Jun 16 21:02:24 2005:

        And hey. Mine is configurable during run time,
        you don't even have to stop it and start it
        to alter the settings.

        Plus it's one tenth the source code of idled.c,
        ready for efficient customization.
 
        www.arbornet.org/~jor/id2002.htm




#434 of 457 by cross on Fri Jun 17 00:49:55 2005:

This response has been erased.



#435 of 457 by naftee on Fri Jun 17 02:34:54 2005:

jor is the ID man


#436 of 457 by nharmon on Fri Jun 17 02:46:25 2005:

paedophile


#437 of 457 by naftee on Fri Jun 17 02:56:38 2005:

Telegram from nharmon on ttyp2 at 22:56 EDT ...
What if I was to kick the ever loving shit out of you?
EOF (nharmon)

"if I were to"


#438 of 457 by jor on Fri Jun 17 09:43:23 2005:

        "is this the humor item?"


#439 of 457 by naftee on Fri Jun 17 13:39:09 2005:

i'm not klg :(


#440 of 457 by aruba on Fri Jun 17 17:29:49 2005:

I get an "Abort Trap" message from Picospan when I try to read agora item
174.


#441 of 457 by aruba on Fri Jun 17 17:42:09 2005:

Looks like response 4 of that item has a long line, and either less or my
twitfilter is having a hard time dealing with it.  Does anyone else have a
problem seeing response 4 of item 174?


#442 of 457 by mcnally on Fri Jun 17 17:48:57 2005:

 I have no trouble seeing it using Picospan with "more -d" as my pager.


#443 of 457 by aruba on Fri Jun 17 17:50:32 2005:

Looks like it's my twit filter.  The response has a line of length 316
characters.  I bet OldGrex enforced a 256 character limit on line length,
and that's why I've never seen the error before.


#444 of 457 by albaugh on Fri Jun 17 19:39:02 2005:

Drift:  How did you determine that line length?


#445 of 457 by jor on Fri Jun 17 22:01:43 2005:

        He counted using fubgrs *and* toes.

        Or by using a text editor, writing to
        a file, and ls -l.

        High tech.



#446 of 457 by naftee on Fri Jun 17 22:10:48 2005:

stuff you'd expect from a treasure.r.


#447 of 457 by jor on Sat Jun 18 00:18:49 2005:

        fingers




#448 of 457 by aruba on Sun Jun 19 04:07:13 2005:

Re #444: I downloaded the file /bbs/agora53/_174 and looked at it in a text
editor.
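
The same measurement can be done without leaving the shell. A minimal sketch, with longest_line being a name invented for the example; note that awk's length() counts characters or bytes depending on the awk and the locale, so treat the number as approximate for non-ASCII text.

```shell
# longest_line FILE
# Hypothetical helper: print "LINENO LENGTH" for the longest line in FILE.
longest_line() {
    awk '{ n = length($0); if (n > max) { max = n; line = NR } }
         END { printf "%d %d\n", line, max }' "$1"
}
```

For instance, longest_line /bbs/agora53/_174 would report which line of the item file is the long one and how long it is.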


#449 of 457 by naftee on Sun Jun 19 14:55:52 2005:

fubgrs, ur smart, aruba.,


#450 of 457 by jor on Tue Jun 21 13:30:48 2005:

        It's been summer for 8 hours and 41 minutes


#451 of 457 by jor on Tue Jun 21 15:39:43 2005:

        Correction. It has now been summer for 8 hours and 49 minutes.

        http://www.archaeoastronomy.com/2005.shtml




#452 of 457 by twenex on Tue Jun 21 15:41:44 2005:

Wikipedia says the summer solstice marks MIDsummer.


#453 of 457 by rcurl on Tue Jun 21 16:06:32 2005:

Isn't the Wikipedia that online font of misinformation?


#454 of 457 by naftee on Tue Jun 21 16:39:08 2005:

what ! what's wrong with wikipedia ?


#455 of 457 by jor on Tue Jun 21 17:11:38 2005:

        midsummer is six weeks away.


#456 of 457 by aruba on Tue Jun 21 21:44:34 2005:

Right - for some reason, the first day of summer is sometimes called
"midsummer's day".


#457 of 457 by keesan on Wed Jun 22 02:23:40 2005:

Summer used to be defined differently, as the 1/4 of the year surrounding the
solstice, when the days were longest.
Similarly for winter.

