

Grex Info Item 272: Newuser duplicate checking shouldn't take so long. What's it doing?
Entered by mikep on Thu Jan 11 00:15:24 UTC 1996:

Why does newuser go through this whole long drawn-out process of
counting down when it's ostensibly checking for duplicate login-ids?
It took 60 seconds when I went through newuser today, and I did a

        time grep stealth /etc/passwd

shortly thereafter, and it came back with 0.5 seconds real time.
I'm sure a getpwent() would be even quicker than that.
What the heck is newuser doing when it claims that it's checking
for duplicates?

3 responses total.



#1 of 3 by davel on Thu Jan 11 04:40:30 1996:

<dave experiences nostalgia, remembering the days when 10 or 20 minutes was
common>


#2 of 3 by orinoco on Mon Jan 15 14:26:01 1996:

I often have wondered what the delay is for, but 60 seconds is faster than
it took me for any of my multitudinous pseudos many moons ago.  


#3 of 3 by mdw on Sun Jan 28 04:48:41 1996:

Actually, getpwent is quite slow, because it has to parse each line.
Newuser doesn't actually use getpwent, for portability reasons.  The
routine it uses does the same work, however: it calls fgets to get a
line, & then a custom routine to copy characters up to the next
colon or newline.  In addition, newuser marks a bitmap with UID's in use
(so that it can allocate the next free UID), does a string compare to
see if the loginid matches, & does a "ftell" to decide when to make
noise.  In most implementations of stdio, ftell will do a "lseek".  What
that all means is, each character of the password file is copied 3 times
as it's processed - once from disk to the buffer pool, once from the
buffer pool into the process's stdio buffer, and once from the stdio
buffer into a line buffer.  There's also a 4th per-character pass "in
place" that replaces the colons & newlines with nul's, as the line is
parsed.
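
(A rough sketch of the kind of single pass described above, written in
Python for brevity rather than with fgets and stdio.  This is not the
newuser source; the names scan_passwd, first_uid and max_uid are made up
for illustration, and "lowest unused UID" is only an assumed allocation
policy.  The ftell progress noise is left out.)

```python
def scan_passwd(lines, wanted_login, first_uid=100, max_uid=65535):
    """Make one pass over passwd-style lines.

    Returns (is_duplicate, next_free_uid); next_free_uid is None when
    every UID in [first_uid, max_uid] is already taken.
    """
    in_use = bytearray(max_uid // 8 + 1)    # bitmap: one bit per UID
    duplicate = False

    for line in lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) < 3:
            continue                        # skip malformed lines
        try:
            uid = int(fields[2])
        except ValueError:
            continue                        # non-numeric UID field
        if fields[0] == wanted_login:       # the string compare
            duplicate = True
        if 0 <= uid <= max_uid:
            in_use[uid >> 3] |= 1 << (uid & 7)

    # Assumed policy: hand out the lowest UID whose bit is still clear.
    for uid in range(first_uid, max_uid + 1):
        if not in_use[uid >> 3] & (1 << (uid & 7)):
            return duplicate, uid
    return duplicate, None
```

Note that the duplicate check alone could stop at the first match; it is
the UID bitmap that forces the loop to read every line.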

Even though this loop takes time in newuser, it's still somewhat speedy.
There's another program, "passwd", where things get even worse.  In the
standard vendor version of "passwd", the program makes *multiple* passes
through the password file, as it verifies the person's old password, and
then updates the file - which it does by doing "getpwent" on each line,
changing the line it wants, writing it all out into a temporary file,
then copying the file back (the last, thankfully, is at least done with
read/write & not stdio.) Even so, this is an expensive process & it
loses, badly, on a system the size & speed of grex.  In fact, it's so
excessively lousy that we no longer use it - the passwd program that's
on grex is a custom version that uses a very different algorithm - it
uses "mmap" to map the password file in, and does much of its logic "in
place".  In fact, most of the time, it can write the password right into
the same place in the file, & avoid the need to use a temporary file.
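
(To illustrate the in-place idea only: a minimal sketch in Python using
its mmap module, not the grex passwd program.  The function name
set_hash_in_place is hypothetical, and the sketch assumes the simple case
where the new password field has exactly the same length as the old one;
anything else is reported back so a caller could fall back to rewriting
the file.  Locking is ignored, and an empty file cannot be mapped.)

```python
import mmap


def set_hash_in_place(path, login, new_hash):
    """Overwrite login's password field directly in the mapped file.

    Returns True if the field was rewritten in place, False if the
    login was not found or the new field has a different length.
    """
    prefix = login.encode() + b":"
    new = new_hash.encode()
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        pos = 0
        while pos < len(m):
            end = m.find(b"\n", pos)
            if end == -1:
                end = len(m)                # last line, no newline
            if m[pos:pos + len(prefix)] == prefix:
                start = pos + len(prefix)   # first byte of the field
                stop = m.find(b":", start, end)
                if stop == -1 or stop - start != len(new):
                    return False            # would change the file size
                m[start:stop] = new         # same length: write in place
                m.flush()
                return True
            pos = end + 1
    return False
```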

Using mmap is a clever strategy, but it does have its price.  mmap is a
recent addition to Unix, & it's still not real standardized.  Both AIX &
SunOS, for instance, have their special differences that the code needs
to understand.  mmap is an even more recent addition in netbsd/freebsd;
it didn't exist in the older releases, & even in current versions, there
are consistency problems if you try to write to a file that is mmap'd
in.

Another strategy is to avoid the use of flat files entirely.  Indeed, on
grex, there are dbm files that contain the information in the password
file, and a few high demand programs, such as mail, & ls, have been
modified to use the dbm file instead.  This does speed things up, but at
a price - the library to do this is somewhat crazed, & the result
required hundreds of lines of changes to sendmail to make it all work.
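
(For illustration, here is the general shape of a keyed lookup, using
Python's standard dbm module rather than the library grex uses.  The
function names build_index and lookup, and the choice of storing the
whole passwd line under the login name, are assumptions made for this
sketch.)

```python
import dbm


def build_index(passwd_lines, db_path):
    """Create a fresh dbm file mapping login name -> full passwd line."""
    with dbm.open(db_path, "n") as db:
        for line in passwd_lines:
            login = line.split(":", 1)[0]
            db[login.encode()] = line.rstrip("\n").encode()


def lookup(db_path, login):
    """Fetch one user's fields by key, without scanning the flat file."""
    with dbm.open(db_path, "r") as db:
        try:
            record = db[login.encode()]
        except KeyError:
            return None
    return record.decode().split(":")
```

A program that only needs one user's record does a single keyed fetch
instead of parsing every line ahead of it.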

Unfortunately, while newuser has to update the dbm files, it can't take
advantage of them for duplicates - because it still has to scan through
all the lines to find the next available UID to allocate.  Even so,
obviously, newuser *could* be improved - however, it's not the main CPU
consumer on the system.  The program that would have the largest
optimization pay-back is actually pine; if we could speed that up, then
that would definitely have a substantial effect on system performance.
Even so, our version of newuser is not that bad off.  I understand the
version of newuser that's on m-net has to rebuild the entire dbm file
from scratch every time it's run.  Our version is smart enough to just
update two records in the dbm file, a much cheaper operation.
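
(A sketch of what an incremental update might look like, again with
Python's dbm module.  Which two records newuser writes isn't stated
above; this assumes one record keyed by login name and one keyed by UID,
and the "name:" / "uid:" key prefixes and the name add_user are invented
for the example.)

```python
import dbm


def add_user(db_path, passwd_line):
    """Add one new account by writing two records into an existing dbm file.

    Assumed layout: the same passwd line is stored under a by-name key
    and a by-UID key, so either kind of lookup is a single fetch.
    """
    fields = passwd_line.rstrip("\n").split(":")
    login, uid = fields[0], fields[2]
    record = ":".join(fields).encode()
    with dbm.open(db_path, "c") as db:      # "c": open, create if missing
        db[b"name:" + login.encode()] = record
        db[b"uid:" + uid.encode()] = record
```

The cost is two stores however large the file has grown, whereas a
rebuild from scratch rewrites a record for every existing account.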


- Backtalk version 1.3.30 - Copyright 1996-2006, Jan Wolter and Steve Weiss