|
Grex > Coop12 > #183: Grex's UPS was the reason for the downtime on 4/15/2003 | |
|
| Author |
Message |
steve
|
|
Grex's UPS was the reason for the downtime on 4/15/2003
|
Apr 16 06:29 UTC 2003 |
Grex's downtime yesterday was caused by our UPS, it seems.
When I got to the Pumpkin I could hear a squealing noise,
coming from the UPS. Its "fault" light was on and everything
was dead. Resetting the UPS gave power to everything, but
about 1 second into the boot process the noise came back on,
the UPS clamped down and that was that.
The solution to getting Grex back up was to take Grex itself
(the Sun-4/670) off the UPS, taking a lot of the power burden
off the Liebert. That made it much happier, and after about 20
minutes Grex was back on the air, undergoing a mailstorm as I
left it.
We have a problem, however, which I still need to work on. We
have two possibilities here. First, the batteries could be bad
enough that it won't operate under heavy load with them, or we
could have an internal fault with the unit itself. There are
several LEDs on the front panel which tell the state of things
when problems arise. I now know what they are, so can tell
what's going on when I'm next at the Pumpkin, which will be
very soon. I don't like the idea of Grex being off the UPS.
Fortunately the disks are still being protected, but I'd like
to see everything back on it.
This is a commercial UPS we have, made by Liebert, Inc. We got
it at the 1999 Dayton Hamvention, sold by Liebert employees who
were getting rid of stuff they couldn't sell as "new", but which
was still new as it hadn't been used. It's been in service for
just under 4 years now. The manual says that the batteries should
be replaced every five years, and considering that the batteries
were not new (weak, I think they were called), they could well be
the source of the problem. I have email in to them about this, so
with luck we'll know sometime tomorrow what's going on. In the
worst case we'll have to send part of it back to them for evaluation
and (hopefully) repair. This has been a wonderful unit--well built,
reliable and one of those things that can be largely forgotten
about. Nothing is perfect, but our Liebert sure has come close.
I'm of course hoping that the unit isn't damaged, as that was
about $1800 new (list price); we paid $150 or $175 for it.
I'll post updates as I know about them.
|
| 34 responses total. |
tod
|
|
response 1 of 34:
|
Apr 16 16:53 UTC 2003 |
Thanks for the update STeve. Please let us know if there are any expected
costs.
|
steve
|
|
response 2 of 34:
|
Apr 16 18:16 UTC 2003 |
Well, we do need to get batteries. We talked about that at a
board meeting and authorized some money for that. Just how much
I still don't know. The first set of batteries I found were
correct in terms of capacity but wouldn't fit in the case. I
now have the reference manual and can get the right part #.
I think that the problem is more likely the batteries as
opposed to a more serious fault. I say this after talking
with Liebert people--it's unlikely though possible that the
unit would clamp down just by being heavily used but work at
40% load. Anyway, I need to go back and play with the unit
again to see what the diagnostic LEDs say.
|
steve
|
|
response 3 of 34:
|
Apr 16 22:59 UTC 2003 |
I really like Liebert. It turns out I got the wrong PDF manual
for our UPS, so they sent me the right one. I still don't know if
the problem is batteries or the unit, but now I have the exact doc
and will be looking for the error codes next time I'm at the Pumpkin.
|
dpc
|
|
response 4 of 34:
|
Apr 18 13:52 UTC 2003 |
That's cute - our "uninterruptible" power system interrupted us. 8-)
|
steve
|
|
response 5 of 34:
|
Apr 18 20:07 UTC 2003 |
Yes, after some 30,000 hours of continuous use. Things do eventually
break. And really, it still works--it's just wounded. ;-)
|
scg
|
|
response 6 of 34:
|
Apr 30 06:35 UTC 2003 |
The five year rating for battery life sounds wrong to me, based on experience.
I'd expect it to be about half that, at best, under heavy load.
I seem to recall that this UPS, even when new, gave Grex only a few minutes
of run time. That presumably means we've been loading it far beyond what it
was designed for for years.
I'd be very nervous about leaving anything else on it that matters, if it's
been shutting off. I'd suggest taking everything off it until it gets fixed.
|
steve
|
|
response 7 of 34:
|
Apr 30 18:44 UTC 2003 |
The way Liebert works, the batteries themselves aren't under heavy load
till they actually are used. That's one of the things I like about Liebert.
I have seen several installations of them where they really did last five
years. Granted it was a little used UPS, but that's the best test condition.
I have mail out about pricing for replacements. They should be $29 each
or less, for four of them.
|
steve
|
|
response 8 of 34:
|
May 6 17:22 UTC 2003 |
It's pretty much assured that the glitch in the UPS is due to
the batteries. The UPS is happy now, running at full load--
I could not make it fail. In talking with the folks at Liebert just
now, the problem is much more likely the batteries than the UPS itself,
which is nice. Right now it doesn't like surging beyond 100%. I have the
battery specs and think that they're available here. As soon as I'm
at work I will be making some calls. Sounds like they are about the
same locally or by mail order, since the shipping on them isn't
nothing. I expect they will be $29 each; we need four of them. If
I remember right, the board authorized $150 for them? In any event
I will get pricing today.
|
aruba
|
|
response 9 of 34:
|
May 6 17:51 UTC 2003 |
Looks like we allocated $200 for UPS batteries on 12/5/2001, but they were
never bought; then we allocated $120 for the same thing on 12/2/2002.
|
davel
|
|
response 10 of 34:
|
May 7 12:56 UTC 2003 |
So we wait another year before we actually buy them?
8-{)]
|
aruba
|
|
response 11 of 34:
|
May 7 14:30 UTC 2003 |
Heh. No, it seems our need has become urgent.
|
steve
|
|
response 12 of 34:
|
May 8 23:48 UTC 2003 |
I have them now. They look quite reasonable and shouldn't
be hard to install. Thanks to Mark for contacting the place
in Lansing, so krj and I could easily get them.
|
scg
|
|
response 13 of 34:
|
Jun 7 07:02 UTC 2003 |
There's been a lot of talk on the NANOG list recently about Liebert UPSs
catching fire.
|
aruba
|
|
response 14 of 34:
|
Jun 7 18:32 UTC 2003 |
Have you heard any model numbers, Steve?
|
lk
|
|
response 15 of 34:
|
Jun 8 16:54 UTC 2003 |
That's impressive. Catching fires? Smokey would be proud. (:
|
polytarp
|
|
response 16 of 34:
|
Jun 9 05:12 UTC 2003 |
You're a Zionist, Learon Kopelman.
|
lk
|
|
response 17 of 34:
|
Jun 9 23:49 UTC 2003 |
Smokey the bear is a Zionist, too.
|
polytarp
|
|
response 18 of 34:
|
Jun 10 00:37 UTC 2003 |
AH HA!! YOU ADMITE YOU"RE A ZIONICST!
|
davel
|
|
response 19 of 34:
|
Jun 10 01:30 UTC 2003 |
"admite"?
|
mdw
|
|
response 20 of 34:
|
Jun 10 06:22 UTC 2003 |
It's probably some sort of biblical term, related to "smite".
|
polytarp
|
|
response 21 of 34:
|
Jun 10 11:22 UTC 2003 |
Learon is the word's biggest Zionist.
|
tod
|
|
response 22 of 34:
|
Jun 10 15:51 UTC 2003 |
Polytarp is jealous that he has to sit in his own "outskirt development"
|
polytarp
|
|
response 23 of 34:
|
Jun 10 19:06 UTC 2003 |
Nah, I just don't like Learon.
|
lk
|
|
response 24 of 34:
|
Jun 11 00:57 UTC 2003 |
Just because I refused your advances?
You'd better start getting used to rejection.
|