|
Grex > Coop7 > #33: Setting time limits for the new modems on Grex | |
|
| 25 new of 57 responses total. |
popcorn
|
|
response 25 of 57:
|
Apr 21 12:46 UTC 1995 |
Hm. Then I guess I misunderstood the last paragraph of response #19.
|
tsty
|
|
response 26 of 57:
|
Apr 22 13:54 UTC 1995 |
Correct; you did. Though I happen to have been involved in one event,
I'd wager that there has been more than one "nohup zap" in the
history of Grex. My singularity could not possibly be the inclusive
universe of zapped nohups.
|
srw
|
|
response 27 of 57:
|
Apr 22 18:09 UTC 1995 |
You said "it depends on the root who is investigating at the time".
It does not. Nohups are zapped automatically and without prejudice.
Neither you nor anyone else is singled out for better or worse.
|
lilmo
|
|
response 28 of 57:
|
Apr 23 04:22 UTC 1995 |
What are nohups, and why are they zapped, and why must that be done manually?
|
mdw
|
|
response 29 of 57:
|
Apr 23 05:58 UTC 1995 |
They're jobs run in the background, they're zapped if the person has
logged off because they can take up lots of CPU to the detriment of
interactive users, and it was felt this was not a reasonable use of the
system, and, no, it's done automatically not manually.
|
popcorn
|
|
response 30 of 57:
|
Apr 23 12:55 UTC 1995 |
The idle-process-zapper hits some things automatically.
Other things are hit manually by staffers.
|
popcorn
|
|
response 31 of 57:
|
Apr 23 12:56 UTC 1995 |
To re-state what Marcus said: "nohup" is the name of a command you
can use to say "run this other command, and leave it going even after
I log off". I think "nohup" is short for "no hang-up (the phone)".
|
lilmo
|
|
response 32 of 57:
|
Apr 23 19:36 UTC 1995 |
So if they are not allowed, why are they allowed? uh... I mean, if one
MAY not use it, why CAN one use it?
|
srw
|
|
response 33 of 57:
|
Apr 24 04:20 UTC 1995 |
nohup is a program that starts your program as a subprocess.
It handles the C signal HUP (the hang-up signal) but does nothing with it.
If it were not handled, it would kill your program.
nohup is very handy on most unix systems, but has been deemed an inappropriate
use on Grex. The main function of the idle zapper is to catch buggy programs
that don't terminate conveniently sometimes. It zaps nohuppers as a byproduct.
|
tsty
|
|
response 34 of 57:
|
Apr 24 07:42 UTC 1995 |
I was not aware, until #27, that "nohup zapping" had become automated.
|
steve
|
|
response 35 of 57:
|
Apr 24 12:37 UTC 1995 |
Greg wrote the script more than a year ago, and we talked about
it in either an item in coop or agora.
The reason for it was simple: we were running out of pty's, rather
often. If a person on a pty left something in the background, then
that pty was withheld from the pool of available pty's, until that
process finished. Over time, we saw "holes" in the ttyuse log, where
all the lower numbered pty's (ttyp0, p1, ...) had less use than the
q or r series.
So because of that, and the fact that people were intentionally
leaving things to run in the background, we decided to disallow them.
|
popcorn
|
|
response 36 of 57:
|
Apr 24 14:03 UTC 1995 |
The other reason for Greg's idle-zapper was that programs like lynx
and trn were getting "stuck" in such a way that they ate up large amounts
of cpu time, even though the user who had been using the program was
logged off. Zapping these "stuck" processes lowers the cpu load
for everyone else.
|
davel
|
|
response 37 of 57:
|
Apr 25 02:03 UTC 1995 |
(Don't forget tin.)
These *very* often got left by a user who waited (say) half an hour or more
for trn to come up, & then gave up and disconnected. News was *slow*,
& slows everything else down under those circumstances.
|
popcorn
|
|
response 38 of 57:
|
Apr 25 02:13 UTC 1995 |
Actually, though, something was wrong with lynx and trn, such that when
a user logged off and left these running, they would get into an infinite
loop of trying to get user input, failing (because the user had disconnected),
and trying again immediately to get user input. This loop sucked up a lot
more cpu time than leaving the programs running normally would.
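
A hedged sketch of the input loop done right, assuming a POSIX file descriptor. The names `read_commands` and `handle` are hypothetical and are not taken from lynx or trn. The point is only that an empty read or a read error ends the loop instead of triggering an immediate retry.

```python
import os


def read_commands(fd, handle):
    """Read input from fd until the other end goes away.

    Returns the number of chunks passed to handle().
    """
    handled = 0
    while True:
        try:
            data = os.read(fd, 1024)
        except OSError:
            # The terminal is gone (for example after a hang-up): stop.
            break
        if not data:
            # An empty read means end-of-file: the user disconnected.
            # A buggy program would retry here and spin forever.
            break
        handle(data)
        handled += 1
    return handled


# Demonstration with a pipe standing in for the user's terminal.
r, w = os.pipe()
os.write(w, b"n\n")
os.close(w)  # the "user" disconnects
seen = []
count = read_commands(r, seen.append)
os.close(r)
```

After the write end is closed, the second read returns nothing and the loop exits, so the process goes away quietly instead of burning cpu.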
|
srw
|
|
response 39 of 57:
|
Apr 25 05:51 UTC 1995 |
(don't forget nethack)
Yes, these programs have bugs that cause this bad behavior, but there were
quite a number of such buggy programs. If implemented right, a program
should not suck cpu when it gets disconnected, no matter how rudely.
|
davel
|
|
response 40 of 57:
|
Apr 25 11:17 UTC 1995 |
Agreed. It's just that some of those buggy programs were such
*****incredible***** resource hogs in the first place - & at least some
were so slow to come up (partly in consequence) that people assumed they'd
died & disconnected. Bleah.
This drift really isn't drift. It would be really nice if we could come
up with a way to allow people to leave stuff running, possibly with some
restrictions. There are also compelling reasons for the decision that was
made to zap things left running. Is there room to accommodate both?
|
tsty
|
|
response 41 of 57:
|
Apr 25 12:09 UTC 1995 |
hmmm, could nohup be a membership perk? I fully understand/understood
the reasons for zapping nohups, I was only commenting that the
automated process was new to me. And, this discussion reinforces
some of the decisions made in the past, and brings +everyone+ up
to date. Thankxx steve.
|
steve
|
|
response 42 of 57:
|
Apr 25 15:07 UTC 1995 |
We could I suppose distinguish between members and non-members when
getting ready to kill background processes, but is that a good policy
decision?
Our real problem is that we don't have enough CPU. Once we are
using the 4/200 we bought, we'll be on the road to getting news up.
When that's running, the various newsreaders will eat the extra CPU,
and we'll be on the hunt for the next processor upgrade. I think we're
going to be deficient in CPU power for some time now; as we expand,
we'll just find more things to fill up the CPU. Because of this, I
don't think we should allow any more background processes than really
needed (which is already too much!).
|
ajax
|
|
response 43 of 57:
|
Apr 25 16:20 UTC 1995 |
What types of things do people do that they run in the background
for more than a half hour under a normal load? Seems like it must
involve a *lot* of disk access or number crunching.
|
remmers
|
|
response 44 of 57:
|
Apr 25 16:27 UTC 1995 |
Compiling large software packages is one type of thing. This is
something that staff people do occasionally.
|
gregc
|
|
response 45 of 57:
|
Apr 26 09:08 UTC 1995 |
I've been away for awhile. I'll respond to a number of things:
1.) We do not have an "idle zapper" program on grex. The current run state
of a process is not examined. What we have is an "orphan process zapper".
Every 15 minutes, this program wakes up and looks for any running,
non-root, process whose owner is no longer logged in on the tty/pty
that the process was started on.
2.) This program was written not just to kill programs that were intentionally
left running, but to kill a number of programs (lynx, nethack, trn, tin)
that had bugs in their character input routines. This bug caused
these programs to consume *enormous* amounts of CPU under the right
conditions. Unfortunately, it would be impossible to allow certain
users to leave nohup programs running, because there would be no good way
to distinguish between a good background program and one that had gone
flaky. This just isn't an option.
3.) *A lot* of modern modems, in fact almost all of them, have inactivity
timers. Most have had them for *years*.
4.) The modem is the best place to do this. Certain programs can deceive
UNIX into believing that a tty is idle when it isn't, but the modem
will *always* know. Unfortunately, there is no way to conditionalize
this. The modem knows nothing about UNIX, and users, and processes. It
just sees streams of characters. It could be hooked to a Mac for all
it knows.
5.) Just to be clear: We are only talking about hanging up the 9600 baud
lines, not the 2400 baud lines, and we're only talking about hanging
up lines that have been *IDLE* for 20 minutes, not setting a hard time
limit. Actually, I thought I was being overly *generous* when I set the
modems for 20 minutes. There are only 4 high speed dialins. They are
a limited resource. There is no good reason for somebody to sit idle
on one for 10 minutes, let alone 20 minutes. If you want to run something
that takes longer than 20 minutes to run and doesn't produce *any*
output, what the hell are you doing on a high speed line??? Please, log
off and reconnect to one of the 2400 baud lines for this kind of work,
don't tie up a high speed line for 20 minutes for a job that *DOESN'T
PRODUCE ANY OUTPUT*!!!!
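
The orphan check described in point 1 above can be sketched roughly as follows. This is an illustration and not the actual Grex script: the `Process` and `Session` records and the function name are hypothetical, and a real version would have to collect this data from the system (for example from process and login listings) every 15 minutes and then kill whatever it finds.

```python
from collections import namedtuple

# Hypothetical records: one per running process, one per login.
Process = namedtuple("Process", "pid user tty")
Session = namedtuple("Session", "user tty")


def find_orphans(processes, sessions):
    """Return non-root processes whose owner is no longer logged in
    on the tty/pty the process is attached to."""
    active = {(s.user, s.tty) for s in sessions}
    return [
        p for p in processes
        if p.user != "root" and (p.user, p.tty) not in active
    ]


procs = [
    Process(101, "alice", "ttyp0"),  # owner still logged in here
    Process(102, "bob", "ttyp1"),    # owner has logged off
    Process(103, "root", "ttyp2"),   # root processes are never zapped
]
sessions = [Session("alice", "ttyp0")]
orphans = find_orphans(procs, sessions)
```

Note that nothing here looks at whether a process is idle or busy, which matches the distinction drawn in point 1 between an "idle zapper" and an "orphan process zapper".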
|
remmers
|
|
response 46 of 57:
|
Apr 26 12:23 UTC 1995 |
Well, being the good net-citizen that I am, I would like to run the job
nohup and get off the line altogether. Unfortunately, then I get bit
by a certain orphan process zapper. (The phrase "between a rock and
a hard place" comes to mind...)
|
remmers
|
|
response 47 of 57:
|
Apr 26 12:27 UTC 1995 |
(Yes, I know you were referring to the high-speed lines only, but I
couldn't resist...)
|
steve
|
|
response 48 of 57:
|
Apr 26 15:33 UTC 1995 |
John, are there non-staffing things you'd like to leave in the
background? I thought about this a while ago, and concluded that all
the things I've put into the background have been things related to
the running of the system. So, if it's a staffish thing, I'd suggest
becoming root and doing it that way.
|
gregc
|
|
response 49 of 57:
|
Apr 27 10:58 UTC 1995 |
I could also modify the orphan killer to bypass certain user ID's. This
would allow staff to leave things running. Unfortunately, it would also
prevent the program from killing an errant lynx, tin, trn, nethack, etc. that
went into brain-damaged-mode after a staffer logged off.
There may be other solutions, let me think on this one.
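
A small sketch of the bypass idea and its trade-off, under the assumption that the decision is made per process from the owner's name alone. The function name, parameters, and account names are all hypothetical.

```python
def should_zap(user, owner_on_tty, exempt_users=frozenset()):
    """Decide whether an orphan killer should zap one process.

    user: owner of the process
    owner_on_tty: True if that user is still logged in on the
        process's tty
    exempt_users: hypothetical bypass list of user IDs
    """
    if user == "root" or user in exempt_users:
        return False
    return not owner_on_tty


staff = frozenset({"staffer"})  # hypothetical account name
# An ordinary user's orphaned process is zapped.
zap_ordinary = should_zap("guest", False, staff)
# An exempt user's orphan survives, whether it is a long compile
# or a newsreader stuck in an input loop.
zap_exempt = should_zap("staffer", False, staff)
```

Because the check never looks at which program is running, an exempt user's runaway lynx is spared along with the legitimate background job, which is the drawback noted above.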
|