|
Grex > Coop8 > #83: "All ports are busy" vs. A countdown-- which is better? | |
|
ajax
|
|
response 50 of 196:
|
Jul 15 18:55 UTC 1996 |
- A minute is a good time-out period. Rane's suggestion of 10 seconds
is too quick, requiring that someone sit at the computer, probably
not even multitasking much unless the machine can always rapidly
switch tasks (many OS's have modal dialog boxes and non-preemptive
multitasking that can make rapid task switching difficult). With
waits of a half hour or an hour, people need to do other things
(answer a phone, pee, whatever). If it turns out a lot of people
are missing the 1 minute mark, thus wasting a lot of minutes for
people waiting in line, maybe it should be reduced a bit, but not
to 10 seconds.
- I like the suggestion of a beep when you get to the front after
waiting in a queue. I'd think most people would want a beep; if
so, it should beep by default, with a command added to disable it.
- I agree with Rane's observation that the change seems to
significantly increase the average waiting period for those who
ultimately log in. This is because people who would ordinarily try
a couple times and give up, will now stick around and wait in line
(to reiterate Rane's explanation). I still prefer the queue so
far, but I do think this is a valid observation. (You can shoot
holes in this by playing with the definition of "average," but I
don't want to take even more space to use bullet-proof wording :-).
- I have some suggestions for purely cosmetic changes:
When people first connect, a "Welcome to Grex" message
might be nice. Possibly followed by a sentence explaining
"Grex is filled up right now; your place in line is..."
In the help screen, there's an "L" command listed, which
produces output something like "65 in line, 23 users,
34283 head." I don't even have a guess as to what the
last two numbers mean. Perhaps it could be explained with
a sentence? (Or if it's debug info, eventually remove it?)
Add an "about" command to explain basically what's happening,
something like "Grex can only support 64 Internet users
on-line at one time. Right now, more than that number want
to connect, so the most recent people are put in a line to
log in. When a user logs off, the person at the front of
the line can log in, and everyone's position moves up by
one. The number displayed after the '...' is your current
place in line."
Maybe even add some intro info on Grex that people can
read while they wait. A lot of users will never have used
Grex before, and it would be a small drain on resources.
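
The line behaviour described in the proposed help text above can be sketched roughly as follows. This is only an illustration, not the actual telnetd code: the class name `LoginQueue` and its methods are hypothetical, and the limit of 64 is simply the figure quoted in the help text.

```python
from collections import deque


class LoginQueue:
    """Toy model of a first-come, first-served wait line (hypothetical)."""

    def __init__(self, max_ports=64):
        self.max_ports = max_ports   # port limit quoted in the help text
        self.online = set()          # callers currently logged in
        self.waiting = deque()       # callers in line, front at index 0

    def connect(self, caller):
        """Return 0 if the caller gets a port at once, else their place in line."""
        if len(self.online) < self.max_ports and not self.waiting:
            self.online.add(caller)
            return 0
        self.waiting.append(caller)
        return len(self.waiting)

    def logoff(self, caller):
        """Free a port; the person at the front of the line takes it."""
        self.online.discard(caller)
        if self.waiting and len(self.online) < self.max_ports:
            self.online.add(self.waiting.popleft())

    def place(self, caller):
        """Current 1-based place in line (raises ValueError if not waiting)."""
        return self.waiting.index(caller) + 1
```

Each logoff moves everyone still waiting up by one place, which is the number the help text says is shown after the '...'.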
|
ajax
|
|
response 51 of 196:
|
Jul 15 19:09 UTC 1996 |
Oops, the last part drifted into some not "purely" cosmetic changes. :)
Also: thank you, Marcus, for working on this.
|
janc
|
|
response 52 of 196:
|
Jul 15 20:21 UTC 1996 |
Hmmm...so since we made it easier to get on, more people keep trying to get
on instead of abandoning the effort, so it gets harder to get on.
|
brighn
|
|
response 53 of 196:
|
Jul 15 21:25 UTC 1996 |
Valerie Mates added a line to the MOTD telling people to type
"exit" at the login: prompt if they want to end the connection.
Would it be possible to add a line to the login lines (just before
grex login: or after someone presses enter without a handle at the login:)?
It seems like the information would be more useful (and noticeable) there --
especially if there's someone who finds Grex by accident and in fact
doesn't even want to log on once.
The queue clogged up again, but this afternoon it seems to be all right.
RobH cleaned it out (thanks) when it was clogged... is the clogging
problem being fixed, or does it have to be manually cleaned?
|
rcurl
|
|
response 54 of 196:
|
Jul 15 21:58 UTC 1996 |
I don't give a damn about who Marcus thinks is better than whom among
those calling him arrogant, but I'm glad to hear that there is apparently
a consensus in that regard.
If you had spent a little time, maybe even less than the ca. 86 lines of
after-the-fact justification in #44, we could have discussed it rationally
and known about the advantages and disadvantages. I have always supported
*experimentation* with new systems, and I am sure I would in this case
too. Having it "sprung" upon everyone, however, sure does not conform to
the much heralded Grex method of "consensus".
But, it doesn't work, and still wastes time. I tried to telnet in twice
this afternoon. The first time I came in as #58 (with 25 *users*), and it
took 12 minutes to count down, at which point I was instantaneously
disconnected. I tried again, came in as #7 (59 users), and 5 minutes later
I connected and was again instantaneously disconnected.
I would like to know what fraction of those finally reaching the login
actually log in. That 1 minute timeout could be a large part of the delay
if any significant fraction of users don't log in. This is a statistic
that should be maintained during the whole experimental phase.
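
A rough sketch of the statistic being asked for, assuming the two counts could be logged somewhere. The function name and both counters are hypothetical, and it assumes every abandoned slot holds its port for the full timeout, which may overstate the loss.

```python
def timeout_stats(reached_login, logged_in, timeout_seconds=60):
    """Return (fraction who logged in, seconds of port time lost to timeouts).

    reached_login: how many queued callers got as far as the login prompt.
    logged_in:     how many of those actually logged in.
    """
    if reached_login == 0:
        return 0.0, 0
    abandoned = reached_login - logged_in
    # Assumption: each abandoned slot ties up a port for the whole timeout.
    return logged_in / reached_login, abandoned * timeout_seconds
```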
This system only becomes fair if everyone getting on the queue intends to
and does use the system. As I have said before, there is absolutely no
impetus to quit the queue, rather than just forget about it. So, I'm
speculating that a lot of the wait is for timeouts - but data would answer
this.
The timeout could be shortened to 10 seconds if the beeping is at one
second intervals once connection is made. I see no reason for waiting a
full minute for users that aren't interested enough to keep attending to
their place in the queue. If you wander away from a queue in real life,
you lose your place (if you don't have a stand in, which users could have
for this system too).
Let's start involving the members (and other interested users) in the
development of Grex - again.
|
brighn
|
|
response 55 of 196:
|
Jul 15 22:14 UTC 1996 |
i would suggest a compromise of 30 seconds
i agree that a full minute is too long, but 10 seconds is too short
i have several applications which frequently freeze my system for 10
to 15 seconds (loading, printing, and so on)... further, not all of
us get the beeps. i don't... at least, i don't from other boards, and i doubt
this board would be any different
Once the stats have been gathered, it might be possible to approximate the
wait based on the number of users in the queue ... nothing complicated, just,
say, if it turns out that a tty frees up every 45 seconds, then give a time of
45 seconds X place in queue.
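
As a minimal sketch of that estimate: the function names are hypothetical, and 45 seconds is only the made-up figure from the sentence above, to be replaced by whatever the gathered stats show.

```python
def estimated_wait_seconds(place_in_queue, seconds_per_free_tty=45):
    """Approximate wait: average time between freed ttys times place in line."""
    return place_in_queue * seconds_per_free_tty


def format_wait(seconds):
    """Render a wait in whole minutes, rounding up so short waits never show 0."""
    minutes = -(-seconds // 60)  # ceiling division
    return f"Your wait will be approximately {minutes} minutes."
```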
Skip the long message someone suggested... if you don't know what a queue is
already, then the explanation isn't likely to help.
Instead, when someone is placed in the queue, the message *I'd* like to see:
Welcome to Grex!
Grex is currently full.
You are in queue position #58
Your wait will be approximately 45 minutes.
Press q at any time to leave the queue.
Press ? for help.
{wait transpires}
{finally:}
Welcome to Grex!
yada yada yada
Type exit to leave Grex, newuser to create an account,
or your handle.
Login:
Or something along those lines.
But that's just *my* vision.
|
scg
|
|
response 56 of 196:
|
Jul 15 23:11 UTC 1996 |
I'm actually concerned that the timeout is too short, if anything. I find
that if I mistype my password once or twice (due mostly to having enough
different passwords on enough different systems that I have trouble keeping
track of which is which), it times out and I get kicked off. Standard
timeouts, from what I've seen, generally tend to be a lot longer than that.
Dropping it down to ten seconds, as some have suggested, wouldn't be enough
time for a fast typist who got their password right on the first try to get
in, assuming the system was somewhat slow.
I'm a little concerned that Rane seems to feel so strongly that we don't have
a consensus, and shouldn't have done this. So far, from what I've seen, this
got batted around for months with nobody objecting to it, and now that it's
been tried there are only two people strongly objecting to it, Rane and
kerouac. Given that kerouac objects to absolutely everything, seemingly only
for the sake of objecting to things, that leaves Rane as the only person with
serious objections. Rane's concerns are certainly valid, and certainly need
to be looked at, but I'm afraid I really don't see this huge lack of consensus
that he is complaining about. With so many people saying they support it,
I think we have about as close to a consensus on this as we ever do on
anything.
|
brighn
|
|
response 57 of 196:
|
Jul 16 02:03 UTC 1996 |
When I suggested 30 seconds, Steve, I wasn't aware the timeout included
the time it takes to type your login and PW... I assumed the timeout ended
on the first keystroke. If it's really the case that you only have x amount
of time to enter your handle and password *and get it right*, then yeah, 10
seconds is obscene and 30 seconds is difficult.
|
rcurl
|
|
response 58 of 196:
|
Jul 16 07:39 UTC 1996 |
Re #56: no item had been entered in coop on the subject of a queue, and
how it would work, so that there could be discussion of the pros and
cons. I looked at the citations for the "got batted around for months",
and it seems it was mentioned a couple of times as something Marcus
was working on (and which Greg didn't think he'd ever do...). I think it
is fair to say that members and users were never directly approached with
the proposal to institute a queue, with a request for our comments and
suggestions.
I still can't telnet in. Every try counts down - flashes "login" for
a fraction of a second, and then dumps the connection. This has happened
every time I have tried today.
Even without that problem, I think I will abandon telnet connections,
unless for something requiring TCP/IP, because of the 10X delay that
has been introduced into logging in. Today I did come prepared with
other things to do, so I combined the long wait on the queue with other
useful activities. Though, as I said, I was disconnected in the end anyway.
I had an awful thought. Malicious persons can write scripts (I presume)
that can attack telnet and spawn unlimited connections, expanding the
queues enormously - and of course abandoning all of them so they each
take a minute (or whatever you choose) to time out. I even did the
experiment, and merrily joined the queue again and again...well, five
times (but killed all but one before any got to the head). What protection
is there against this?
|
tsty
|
|
response 59 of 196:
|
Jul 16 08:35 UTC 1996 |
<<finally, some little change that generates as much finger-flailing
as the "sky is falling" co-op login screen. <g> >>
all seriousness aside, grexers are participating in the .03beta
version of a reasonable balance between muscle and brains.
maybe mdw can get some $$ for this code, he's certainly worth it.
.
|
tsty
|
|
response 60 of 196:
|
Jul 16 08:35 UTC 1996 |
... .when the 1.0 version is implemented .....
|
mdw
|
|
response 61 of 196:
|
Jul 16 09:54 UTC 1996 |
Rane is right, I didn't make much of an effort to seek out input on the
"queuing" process. In fact, nearly everyone has an opinion on how it
"should" be done - trying to please them all is guaranteed to be more
work than the programming itself. Furthermore, the programming itself
promised to be quite enough of a challenge.
Naturally, I had some pretty definite ideas of my own. Frankly, I don't
really like queues at all, any more than the next guy. One of the main
reasons I did this, was so that I could at least pick my own brand of
poison. In fact, I had 4 main criteria in my design. I wanted
something that would be reliable, fair, efficient, & scale well.
Obviously, the initial reliability sucked. Hopefully it will improve, a
lot. The current system is dead fair; everyone, members, staff, users,
people from india, goes through the same queue. (Of course, that
doesn't mean it's convenient; but at least everyone, staff, members,
users, people from india, and Rane, go through the same logic.) One of
the particular forms of unfairness I wanted to avoid at all costs, was
giving members preferential access (a la M-net.) To me, that generates
a much more selfish sort of membership. I very much want members, and
indeed everyone on the system, to be thinking in terms of "the users",
not "the members". So far at least, it doesn't seem too piggish. The
current system can also clearly accommodate hundreds of people in the
wait queue without killing the system.
This is not to say other ideas or queueing systems are without merit.
Several other staff people talked seriously of building "queue" logic
into login. Since this design would still be limited by "pty's" - there
would be a definite limit on the # of people who could be waiting. So
very probably, even if we do in the end go with a login queueing system,
it would still be useful to have the telnetd non-pty queueing logic
available for when the pty's run out. Since a login queuing system
could apply *after* authenticating the person, it could easily implement
various forms of unfairness. There were several other staff members
talking seriously about doing such a thing; if people at large think
that's a good idea, I'm sure those staff members would be willing to
sink more effort into it.
I'm sure there are plenty of other ways queuing could work; this
conference makes a wonderful place to discuss any serious proposal as to
how they should work, & certainly everyone on staff would be quite
willing to discuss ways in which any particular proposal could be
implemented. I'm also certain staff would be willing to implement any
particular proposal if we reach a consensus on it. The current queuing
logic is not meant to be the end all final solution to queuing; it is
merely a first step in an incremental process to improve life on grex.
It may not even be the right solution; perhaps after experimenting with
it, it will seem to most people not to have been worth it.
Other specific things people mentioned:
I made absolutely no changes to login. The delays and such are the same
as always. The delay is 60 seconds. In general, network lag could
account for as much as 10-20 seconds against the 60 seconds, and the
user has to get their password & login in before the time is up; so
users could easily be facing a 40 second window. I don't think it would
make much sense to reduce the delay much. It might make more sense to
make the delay "per-prompt" rather than "both prompts". A user who was
facing truly horrific lag could be faced with less time. For
instance, a bad high speed dial-up connection can easily generate what
*looks* like network lag, although it really isn't.
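
To make the "per-prompt" versus "both prompts" distinction concrete, here is a small model. It is a sketch, not login's real logic: the function name is hypothetical, and charging the same lag against each prompt is an assumption.

```python
def login_succeeds(lag, type_login, type_password, timeout=60, per_prompt=False):
    """Return True if the user finishes before being timed out.

    All arguments are in seconds. `lag` is the delay charged against each
    prompt (network or modem lag); the other two are typing times.
    """
    login_time = lag + type_login
    password_time = lag + type_password
    if per_prompt:
        # Each prompt gets its own full timeout.
        return login_time <= timeout and password_time <= timeout
    # One timeout covers both prompts together.
    return login_time + password_time <= timeout
```

Under this model a user with heavy lag can fail the shared timeout while comfortably passing a per-prompt one.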
The initial message can't currently be customized. I would argue
against any attempt to make it at all long; because this is a message
*everyone* sees *every time*. Making long messages is usually
counter-productive. People are much less likely to read long things.
Note the # of members who keep up in co-op & read long responses such as
this one. Anyone reading this sentence can certainly be considered "one
of the elite" on grex.
Janc's observation, that "We made it easier for people to get on, so
more people will try getting on, so it will get harder to get on," bears
repeating. Unless we somehow suffer a *very* impressive windfall, we
will always be faced with a tradeoff in terms of resource constraints.
We *can*, however, offer slightly better service to a much larger pool
of users, by opening up more pty's, which means as necessary, buying
more system, more network bandwidth, & making more efficient use of what
we have. It's up to the members to provide the resources to buy more
system & network bandwidth. It's up to us staff to make it so, and to
make the best use of what we have.
|
remmers
|
|
response 62 of 196:
|
Jul 16 11:03 UTC 1996 |
One problem I had yesterday, twice: When I got to the front of
the queue and received the login prompt, I was disconnected within
two seconds, before I had a chance to type my login id. Should have
made a note of the exact times this happened but didn't -- believe
it was early to mid afternoon.
|
brighn
|
|
response 63 of 196:
|
Jul 16 15:40 UTC 1996 |
I'm confused by Rane's complaint that people could clog
the queue with deliberate ghosts combined with his own problem
that his login: prompt is only up for a brief flash.
He seems to be admitting that the delay is a minute, but alleging
that he himself isn't getting that minute.
Very confusing.
All I can say, Rane, is that I know for a fact that I mentioned
this queue program in at least two items in Coop. One mention
turned into a flamewar about the telnetd 2-minute timeout problem,
in which I was told that I wasn't a Good Little Grexer with the
Right Community Spirit.
If nobody who cared about the topic stopped the convo at the
time and said "Queue program? What queue program? Why, my
goodness, we must have a discussion of this right now!", well,
seems to me the barn doors are bein closed after them cows have
gone AWOL.
(*brighn ponders finding said entries* *brighn is aware of
how incredibly difficult it is to start an item on a conference,
the advance technical and programming skill and the special permissions
such a venture takes, which explains why so many users seem
incapable of starting items, and complain so frequently about
items not having been started*)
|
janc
|
|
response 64 of 196:
|
Jul 16 16:32 UTC 1996 |
Couple things:
- If we had discussed this in advance, all the staff would have agreed
that "the average waits will be no longer than they are now". We would
not have anticipated the problem with netlag and/or modemlag eating up
the full minute allowed to login. The problems people are complaining
about weren't known. You guys with 20/20 hindsight may find it easy
to say everything would have been better if we had told you about it in
advance, but we didn't have anything to tell you then except our
incorrect theories on how much this would help. We didn't make a big
deal about it, because we didn't know it had a dark side. I don't
believe the rest of you did either, because I still haven't heard a
coherent theory about why the queues get so big.
- My best guess as to why the queues get so big is that we were thinking
wrong about the way people telnet to Grex. We were kind of assuming that
many users attack-telnet in a fairly consistent way. If that were true,
I think queuing would have sped up connection times. But apparently
what happens is that at peak times people try telnetting a few times,
and then try again much later (likely at a non-peak time). The whole
process has to be about shifting users from peak times to non-peak times.
So now, instead of trying to telnet in a couple times, and giving it up
in frustration, they say "wow, I'm number 783 on the queue!" and wait...
and wait...and wait...and slowly grow furious at the size of the wait.
This isn't stupidity on the user's side either. Since leaving and
coming back later puts you back on the end of the list, there is a
disincentive to do so. With the queuing system, the fastest way to
get on is to wait. The difference is that you get a lot of people
saying "I had to wait an hour to get on Grex this afternoon" instead of
"I couldn't get on Grex this afternoon".
I don't think that is the whole story though. I think the queues build
up so large that they stretch out peak times, so that the system is
inaccessible more often. I really don't feel like I understand the
process that leads to the behavior we see.
Do we have data from which you could build a chart of queue length as
a function of time? That might be helpful.
I wonder if setting a maximum queue size of say 10 users, and giving
anyone beyond that point a "No connections available" message would
work better. You'd only be putting people on the queue if there is
a rather short wait anticipated. Otherwise you'd be giving them the
bounce and, in effect, telling them to "come back much later". If my
incomplete theory of the queuing process is correct, this might work
better. It automates the natural human "try a few times and give up"
strategy.
- One of my problems with the current queuing implementation is that
it doesn't address what I thought was the major reason for getting
into this at all. We would like to stop using the number of pty
devices as the limit on the number of telnet connections. Pty devices
are useful for all sorts of other things, most importantly connections
from a terminal server, but also screen sessions. So we'd like to
still limit the number of telnet sessions, but have lots of extra
pty devices available for people coming in from terminal servers or
people who want to run "screen".
When I wrote one of the early proposals for a queuing system, my main
idea was that we would modify things so it would limit the number of
incoming telnets to a number *lower* than the actual number of pty
devices (note: this would not be a decrease in the number of telnet
connections, it would be an increase in the number of pty devices).
That would disconnect the telnet limit from the pty limit and leave
us free to do various nice things. As long as we were doing that,
I thought putting in a queuing system would be a cool idea, but I
thought that was a secondary goal, not a primary goal. Marcus's
queuing system still limits the number of connections to the number
of pty devices. I have no idea how hard that would be to change, but
it's not obvious that this telnetd gets us any closer to being able
to use the terminal servers for dialin.
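
The maximum-queue-size idea floated earlier in this response could look something like the sketch below. The function name and the "place in line" wording are hypothetical; 10 is just the example cap mentioned above, and nothing here reflects how telnetd actually stores its wait list.

```python
from collections import deque

MAX_QUEUE = 10  # the example cap from the suggestion above


def admit(waiting, caller, max_queue=MAX_QUEUE):
    """Queue the caller if the line is short; otherwise bounce them."""
    if len(waiting) >= max_queue:
        # Line is full: turn the caller away instead of queueing them.
        return "No connections available"
    waiting.append(caller)
    return f"Your place in line is {len(waiting)}"
```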
|
pfv
|
|
response 65 of 196:
|
Jul 16 16:33 UTC 1996 |
Hmmmm... here's a neat little "poison apple": why not search the queued
list as you try to add a user and if the new site is already in the
queue, THEN you KILL the OLD entry? This would certainly ruin the day of
any 'attack telnet' users ;-)
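
A rough sketch of that idea, assuming the wait list can be keyed by originating host. The function name and the plain-list representation are hypothetical, and the sketch cannot tell apart different people connecting from the same machine.

```python
def enqueue_unique(waiting, host):
    """Append `host` to the wait list, dropping any older entry for it.

    `waiting` is a plain list of host addresses, front of the line first.
    Returns the caller's new 1-based place in line.
    """
    if host in waiting:
        waiting.remove(host)   # the old entry loses its place
    waiting.append(host)       # the new connection starts at the back
    return len(waiting)
```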
|
ajax
|
|
response 66 of 196:
|
Jul 16 17:23 UTC 1996 |
Re Marcus:
> The initial message can't currently be customized. I would argue
> against any attempt to make it at all long; because this is a message....
> *everyone* sees *every time*.
I agree...when I suggested a "Welcome to Grex" message, I meant that
literally, adding those 15 characters. I've forgotten what it says now,
but I remember feeling it was *too* austere. The other suggestion I
made for the initial message was also under one line. The longer messages
I suggested were as a response to a command requesting the information,
not as something displayed to everyone every time.
Re Jan:
> because I still haven't heard a coherent theory about why the queues
> get so big.
Other theories were incoherent? :)
> I wonder if setting a maximum queue size of say 10 users, and giving
> anyone beyond that point a "No connections available" message would
> work better.
My first reaction was that that's utterly absurd and pointless. But
on thinking about it, I think its results would be very unpredictable;
maybe good, maybe bad, depending on people's reaction to it.
Certainly it wouldn't have the "fairness" of an unlimited queue, and
we'd still have attack-telnetting most of the time, but possibly less
attack-telnetting because 10 potential attack-telnetters are already
waiting in line, or possibly more attack-telnetting because people
perceive their chances of getting in the queue as better than they
were of getting a login prompt without any queues. And that perception
may be right or wrong, depending on other people's perceptions, which
is why I think it's unpredictable!
Re Pete's idea of nuking queue spots coming from the same location,
different people can legitimately be telnetting from the same machine.
For example, people using shell accounts from the same ISP often have
the same address (unless the ISP has multiple machines).
|
albaugh
|
|
response 67 of 196:
|
Jul 16 17:25 UTC 1996 |
Hey, I like #65! :-) Actually, though, why anyone would request another
slot in the queue is beyond me, since it doesn't help anything - unless the
person didn't know he was already in the queue, or if he were malicious.
But to answer the question about malicious queue-clogging, I assume that
the queue entries are tied to IP addresses, so it would be easy to detect
multiple, duplicate entries from the same person.
|
rcurl
|
|
response 68 of 196:
|
Jul 16 17:31 UTC 1996 |
It could be a different user from the same site.
Re #62: what John describes - being dumped quickly after getting the
login prompt - is what I am experiencing on all attempts to telnet in.
Are there any data I should record to report along with the fact that
this has happened?
Re #63: you should not be confused, brighn, by the existence of a "bug"
in software. The 1-minute timeout is intentional. The disconnection that
both I and John have described is a problem.
Re #64: the reason for discussing the proposal and how it would work, etc,
in advance, is not for the benefit of staff, who have already discussed
it in board and staff meetings, but for the benefit of members and users
so they will know what to expect in general, and what to remark upon
that deviates from those expectations. It is also to include members and
users in the process of development of new ways of operation of Grex, to
answer their questions, and obtain their comments and suggestions.
Well, I would have *predicted* that the queues would have gotten very
big, if I had been given a chance to consider the matter by being told
exactly what was being proposed. Jan is figuring out some of the
possible reasons in his #64, but with more heads considering the matter,
I am sure that at least some of the problems would have been thought of
in advance. Maybe even the solutions!
The queueing system does, of course, very much favor the user with the
time and inclination to wait around. Its use will, in time, change the
composition of the user body of Grex - a demographic change induced by
a technical modification. It should be considered whether that demographic
change - toward the heavier computer users - is desirable.
|
rcurl
|
|
response 69 of 196:
|
Jul 16 17:34 UTC 1996 |
#s 66 and 67 slipped in.
|
mdw
|
|
response 70 of 196:
|
Jul 16 18:49 UTC 1996 |
Limiting telnetd to a subset of the pty pool is in the "not yet done"
part of the to do list. I had already made quite enough changes to get
this far, I wanted to debug them thoroughly before going on. The subset
logic should be quite trivial, and will even have the added benefit of
being considerably nicer on the system (trying all 64 ptys until one
can be opened is not actually very nice at all). I did want to be very
sure that the code path to efficiently deal with "no more pty's" was
very well debugged before putting in more logic that was likely to
ensure the "no more pty's" was only rarely if ever used.
Actually, there's another mode this version can run in: a sort of
modified "lottery" system. If there's no waitlist file, this version
won't just say "no ports", but it will then go into a modified form of
the "waiting for a port" state. It won't automatically try for a port,
but if you say 'x', it will try. Presumably, the user who says 'xxx'
the most wins. This is obviously more efficient than attack telnet's,
but less efficient than a queue. I suppose we could make the 'xxx' more
complicated. Perhaps making the user answer "trivial pursuits"
questions correctly in order to try for a free port?
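
For what it's worth, the "subset of the pty pool" bookkeeping might be modelled like this. It is only a sketch: the function name is hypothetical, 48 is an arbitrary example limit, and a real telnetd would be opening devices rather than checking a set in memory.

```python
def pick_pty(in_use, total_ptys=64, telnet_limit=48):
    """Return the index of a free pty telnetd may use, or None if none is free.

    `in_use` is a set of pty indexes already taken. Only the first
    `telnet_limit` ptys are offered to telnet; the rest stay reserved
    for other uses.
    """
    for index in range(min(telnet_limit, total_ptys)):
        if index not in in_use:
            return index
    return None
```

A `None` result is the point at which a caller would go onto the wait list rather than being handed a pty.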
|
rcurl
|
|
response 71 of 196:
|
Jul 16 19:15 UTC 1996 |
How about people on the queue bidding against one another for the next
available port?
|
robh
|
|
response 72 of 196:
|
Jul 16 19:46 UTC 1996 |
"You are currently #213 in line for a telnet port. To advance
in the queue, give us your credit card number now and authorize
us to take $1 for every slot you want to advance. E.g. to advance
to slot #1, give us $212."
A few weeks of that, and we could afford ISDN. >8)
|
jenna
|
|
response 73 of 196:
|
Jul 16 21:15 UTC 1996 |
Well I got sick of reading the 50 responses of
weirdness and redundancy. It seems to be working just fine now...
and i'm glad of it! much better than the telnetd thing!
|
brighn
|
|
response 74 of 196:
|
Jul 16 21:23 UTC 1996 |
Jan's suggestion of a maximal queue size seems reasonable,
though with 62 or so slots on Grex, a max of 10 seems small.
I'd suggest 25.
That's approximately a ten minute wait.
|