steve
|
|
Grex needs to buy a new disk, pronto
|
Feb 19 04:28 UTC 2007 |
Grex has a failing disk at the moment. I'm pleased to say that
it didn't just crap out as we've had in the past, but it's definitely
sick and we're living on borrowed time.
This last Saturday I spent time on Grex, first making backups of
nearly all the system and then replacing the failing disk with the
replacement disk we got from Seagate when we had our last disk
disaster. This replacement was a "certified repaired" disk we got,
which of course was certified bad--in the process of restoring
data to our new disk all sorts of random errors cropped up, and
playing with it more only revealed more weirdness. At this point
it was getting close to 7pm, so I put the original dying disk
back in service.
So once again we've managed to skirt around a disk disaster,
at least today. We need to get a new disk, and soon.
We can't get an 18G disk; they aren't made any more. We can
still, however, get a 36G disk, just like last time, for about $250.
This isn't a bad thing, as the three partitions on the disk are
/tmp (4g), /c (5g) and /var/mail (8g). Having a 36G disk there
would mean we could have a 24G /var/mail partition, so we could
hold more spam. ;-)
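A quick back-of-the-envelope check of those sizes, as a sketch only. It uses the nominal sizes quoted above and assumes /tmp and /c stay as they are; real formatted capacity will come out somewhat lower.

```python
# Nominal partition sizes in GB, taken from the post above.
current = {"/tmp": 4, "/c": 5, "/var/mail": 8}
new_disk_gb = 36  # nominal size of the replacement disk

# Assumed layout: keep /tmp and /c unchanged, grow /var/mail to 24G.
proposed = dict(current)
proposed["/var/mail"] = 24

used_now = sum(current.values())
used_new = sum(proposed.values())
spare = new_disk_gb - used_new

print(f"current layout uses {used_now}G")
print(f"proposed layout uses {used_new}G of {new_disk_gb}G, leaving {spare}G spare")
```

So a 24G /var/mail fits on a 36G disk with a few gigabytes to spare, at least on paper.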
We've been talking about getting a raid system, so in a way
this is spending money only to change things later, but I think
we don't have much of a choice here. We need a replacement now,
and given the problems with the lack of /var/mail space, getting
a 36G disk makes a lot of sense. Add the fact that "certified
repaired" disks are all too often not, getting a new disk is the
most reasonable thing.
I sent mail to Leeron Kopelman to see if his place still
sells disks, since we've gotten things from him in the past.
If he doesn't, Newegg.com has them for $250.
We need to act on this in the next day or two. We're
being given extra time here. When we get a replacement I'll
take time off from work to install it if I have to.
|
| 26 responses total. |
cross
|
|
response 1 of 26:
|
Feb 19 04:58 UTC 2007 |
It occurs to me that the disk that is dying is sd2. However, grex has
plenty of reserved space on sd1 to take over the duties of sd2 until we
could implement a more robust disk storage subsystem.
In particular, currently, /b is empty. We could dump the contents of /c
into /b (for a negligible overall reduction in space) and remount /dev/sd0k
(which currently holds /b) on /c. Similarly, we could dump the contents of
/tmp into sd1e, which is currently being mounted as /alt/usr (and which
isn't likely to change that much over the next few weeks) and remount that
as /tmp; as it is, /tmp is ridiculously oversized and while the partition
we'd copy it onto is only one quarter the size of what we have now, we'd
still be close to 0% utilization on it. Finally, we could dump /var/mail to
sd1f, which is presently mounted as /alt/usr/local (again, not likely to
change drastically over the next couple of weeks), and remount that onto
/var/mail; that partition and the current partition are close in size.
To summarize:
CURRENT                   GETS REMAPPED TO
/tmp      (/dev/sd2a)     /alt/usr       (/dev/sd1e)
/c        (/dev/sd2d)     /b             (/dev/sd0k)
/var/mail (/dev/sd2e)     /alt/usr/local (/dev/sd1f)
This increases the load on sd0 and sd1, but only for a short time until we
can get and configure a RAID system and it saves us $250. Plus, this is
something we can do *now*, instead of waiting for a new disk to be
delivered, someone to install it into grex, partition it, newfs it, etc.
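The remap table can be written down as data and sanity-checked before anyone touches a partition. This is only an illustrative sketch: the device names come from the table, while the helper names (`disk_of`, `check_plan`, `describe`) are made up here, and nothing in it runs any real copy or mount command.

```python
# Remap plan from the table: (mount point, old device, donor mount, donor device)
REMAP = [
    ("/tmp", "/dev/sd2a", "/alt/usr", "/dev/sd1e"),
    ("/c", "/dev/sd2d", "/b", "/dev/sd0k"),
    ("/var/mail", "/dev/sd2e", "/alt/usr/local", "/dev/sd1f"),
]
FAILING_DISK = "sd2"


def disk_of(device):
    """Return the disk name for a partition device, e.g. /dev/sd1e -> sd1.

    Assumes a single trailing partition letter.
    """
    return device.rsplit("/", 1)[-1][:-1]


def check_plan(remap, failing):
    """Check that every source is on the failing disk, no target is,
    and no donor partition is used twice."""
    targets = [new_dev for _, _, _, new_dev in remap]
    assert len(set(targets)) == len(targets), "a donor partition is used twice"
    for mount, old_dev, _, new_dev in remap:
        assert disk_of(old_dev) == failing, f"{mount} is not on {failing}"
        assert disk_of(new_dev) != failing, f"{mount} would stay on {failing}"
    return True


def describe(remap):
    """Return one human-readable step per remapped file system."""
    return [
        f"copy {mount} ({old_dev}) into {donor} ({new_dev}), "
        f"then remount {new_dev} on {mount}"
        for mount, old_dev, donor, new_dev in remap
    ]


check_plan(REMAP, FAILING_DISK)
for step in describe(REMAP):
    print(step)
```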
|
cmcgee
|
|
response 2 of 26:
|
Feb 19 13:33 UTC 2007 |
Thanks Steve for spending so much time, and for being willing to take time
off to make sure we stay running!
|
janc
|
|
response 3 of 26:
|
Feb 19 15:49 UTC 2007 |
My vote as board member and staff is to purchase immediately:
(1) new disk drive as STeve recommends.
(2) a DVD-W drive for Grex.
The DVD-W drive is to make backups easier, and ordering it at the same time
is so that we can install both drives at the same time.
|
slynne
|
|
response 4 of 26:
|
Feb 19 16:12 UTC 2007 |
That sounds like a good idea to me. I am fully in support of that.
|
cross
|
|
response 5 of 26:
|
Feb 19 16:20 UTC 2007 |
I wonder why we want to buy a new disk when we can use the disk we already
have and start moving towards a RAID solution.
|
cross
|
|
response 6 of 26:
|
Feb 19 16:26 UTC 2007 |
Regarding #3, #4; Is there a reason why either of you disagree with #1?
|
nharmon
|
|
response 7 of 26:
|
Feb 19 16:47 UTC 2007 |
The refurbed drive isn't under warranty?
|
steve
|
|
response 8 of 26:
|
Feb 19 17:35 UTC 2007 |
Dan has a most excellent idea. I am abashed to say that I had
forgotten all about the /b partition. I'm used to thinking of /b
as the picospan code, rather than a partition for users.
With that, I think Dan is right and we have the space to make
the alt partitions usable for other things. The /tmp space would
be 1/4 the size, but I think we can live with that for the time
being. /var/mail would be within a few percent of its original
size, and moving /c to /b is about the same thing.
Thanks Dan -- I think we can do this. Let me do work work
for a bit as I ponder this; if it doesn't work out we can always
get a new disk.
|
cross
|
|
response 9 of 26:
|
Feb 19 17:40 UTC 2007 |
No problem, Steve! My pleasure to help out! If you need any backup, and
there's anything I can do, please let me know. I'm home sick and crawling
the walls with boredom. :-)
|
drew
|
|
response 10 of 26:
|
Feb 19 21:11 UTC 2007 |
$250 for 18G sounds excessive to me, even for Scuzzy.
Best Buy has hard drives on sale this week:
160GB Western Digital EIDE or SATA, $59.99
320GB WD (probably EIDE) for $109.99
250GB Seagate for $99.99
Instant savings, no rebates involved.
I've been happy personally with Western Digital.
Don't modern motherboards have built-in EIDE controllers? Get a
160GB drive from Best Buy, put it in, and move the whole system
to /dev/hda[1-n].
|
steve
|
|
response 11 of 26:
|
Feb 19 21:23 UTC 2007 |
We've been using scsi disks because of their speed; the ones
we have are 15K rpm. When I was testing stuff, I was getting
about 70M/sec transfer rates. To contrast that with my laptop
(udma mode 5), I can get about 42M/sec via dd. These are also
U320 disks; we have a U160 controller currently, but if we
decided to stay with scsi we could get a u320 disk controller
and have better disk i/o.
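To put those two rates in perspective, here is a rough sketch of how long it would take to move an 8G partition (the /var/mail size quoted earlier) at each speed. It assumes "M/sec" means megabytes per second, that a gigabyte is 1024 megabytes, and that the rate is sustained for the whole copy, which real copies of many small files will not manage.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes


def copy_seconds(size_gb, rate_mb_s):
    """Seconds needed to move size_gb of data at a sustained rate_mb_s."""
    return size_gb * MB_PER_GB / rate_mb_s


var_mail_gb = 8    # /var/mail size quoted earlier in the thread
scsi_rate = 70     # MB/s, the SCSI figure above
laptop_rate = 42   # MB/s, the laptop (udma mode 5) figure above

scsi_s = copy_seconds(var_mail_gb, scsi_rate)
laptop_s = copy_seconds(var_mail_gb, laptop_rate)

print(f"at {scsi_rate} MB/s: about {scsi_s / 60:.1f} minutes")
print(f"at {laptop_rate} MB/s: about {laptop_s / 60:.1f} minutes")
```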
|
ric
|
|
response 12 of 26:
|
Feb 21 14:39 UTC 2007 |
If y'all don't mind me asking... $250 *IS* outrageously expensive.. why don't
you hit ebay for a replacement drive? There are *MANY* listings for 18 gig,
15k RPM U160 drives on ebay.
|
other
|
|
response 13 of 26:
|
Feb 22 03:52 UTC 2007 |
A study has just come out based on real-world usage of a vast array of
disks, and one of the conclusions was that failure rates between
commercial and consumer grade drives did not substantially differ.
Does this mean there are more inexpensive disks we should consider
purchasing?
|
cross
|
|
response 14 of 26:
|
Feb 22 04:03 UTC 2007 |
Interesting, but believable. Eric, do you have a cite?
|
mcnally
|
|
response 15 of 26:
|
Feb 22 05:41 UTC 2007 |
There're two papers he could be talking about. Both made Slashdot headlines
in the past couple of days; one was from Google and the other was from some
large research consortium if I remember correctly.
|
other
|
|
response 16 of 26:
|
Feb 22 16:09 UTC 2007 |
Them's the ones. I only saw the lead in my RSS reader.
|
ric
|
|
response 17 of 26:
|
Feb 22 19:39 UTC 2007 |
I wanna get me a couple of those 'perpendicular' hard drives like the
Barracuda 7200.10...
|
steve
|
|
response 18 of 26:
|
Feb 24 23:59 UTC 2007 |
The study Eric is talking about compares the normal disks to
"extended duty" disks. IBM's travelstar disks in laptops were
like that. The bottom line is that all of them are getting
better, such that the differences between those two flavors
didn't amount to much.
However, the type of disk does matter. You can see it today
in the length of the warranty offered. IDE disks are typically
1 year warranty. Their rock-bottom price coupled with ever
increasing performance meant something had to give, and that
was, sadly, quality. SCSI disks are a lot more, offer much
better transfer rates (well, they used to) and had better
warranties. SATA disks, the newest kind, are interesting:
they are cheap, have some pretty decent transfer rates, and
have a failure rate in the field of about 0.5%. I'm trying
to get that study so I can post it. SCSI disks of the kind
we have are still the fastest disks, in that they rotate at
15K rpm and are ultra-320 speed, for bus rates up to 320MB/sec. But
they cost a *lot* more, and I'm not sure that Grex's next
generation of disks needs to be SCSI any more. Time
marches on.
We're now off of the defective sd2 disk, using other
partitions that weren't used. Thanks to Dan for that
idea, as now we have a little breathing room without
spending money on another scsi disk.
|
cross
|
|
response 19 of 26:
|
Feb 25 04:10 UTC 2007 |
Here's a pointer to Google's study. Most of the disks in use in google data
centers are serial and parallel ATA.
http://labs.google.com/papers/disk_failures.pdf
Here's a speed comparison chart Seagate has put together; a 7200RPM SATA
Barracuda is somewhere between a SCSI 10K RPM Cheetah and a SCSI 15K Cheetah,
except that the Cheetahs are between two and three times faster in terms of
access time. I think a RAID controller with a lot of cache memory would
amortize most of that difference.
|
cross
|
|
response 20 of 26:
|
Feb 25 04:10 UTC 2007 |
Whoops, here's the pointer to the Seagate page:
http://www.seagate.com/www/en-us/support/before_you_buy/speed_considerations/
|
jared
|
|
response 21 of 26:
|
Mar 7 19:05 UTC 2007 |
re#11
Laptop drives are some of the worst to test against because they typically
run at lower rpms (even as low as 4200rpm) to keep noise and heat at a much
lower level.
I've been getting cheap disks for my hosts from 3btech.net (located in
Indiana and free ground shipping) for several years now without any
failures. I typically buy the OEM and/or Refurb disks and use them
for my backup solutions for cheap storage.
http://3btech.net/ideover160.html
We could also use an ATA or SATA hardware raid controller (I have a SATA one
I could donate) to do raid 0+1 across 2 or 4 disks. Even if they're slower
disks, you will see better than the 70MB/s out of the SCSI disk if you have
multiple spindles and do round-robin reads.
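A toy sketch of the round-robin idea: each read goes to the next disk in turn, so with mirrored copies the load splits evenly and the best-case read rate scales with the number of spindles. The 40 MB/s per-disk figure is a made-up illustration, not a measurement, and a real controller will fall short of the ideal.

```python
from collections import Counter
from itertools import cycle


def round_robin(blocks, disks):
    """Assign each block read to the next disk in turn."""
    return list(zip(blocks, cycle(disks)))


def ideal_read_rate(per_disk_mb_s, n_disks):
    """Best-case aggregate read rate when all disks serve reads in parallel."""
    return per_disk_mb_s * n_disks


# Eight block reads spread over two mirrored disks.
assignment = round_robin(range(8), ["disk0", "disk1"])
load = Counter(disk for _, disk in assignment)
print(dict(load))

# Hypothetical slower disks at 40 MB/s each.
print(ideal_read_rate(40, 2), "MB/s best case with 2 spindles")
print(ideal_read_rate(40, 4), "MB/s best case with 4 spindles")
```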
I've also stopped partitioning my systems quite as much as grex
currently is partitioned. I agree that on a public host you need to
divide things up some, but we're not talking about 20MB disks these
days. Going with something like a set of 250G "white label"
(refurb/OEM Western Digital) drives for $56 each would give another ~500G
of mirrored space for around $250 (buy 4, plus an IDE raid controller,
cables, etc.).
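The arithmetic behind that estimate, as a sketch. The drive price, size, and count are the figures quoted above; whether the remainder really covers a controller and cables is not something this checks.

```python
drive_price = 56      # dollars per "white label" drive, as quoted
drive_size_gb = 250   # nominal size per drive
n_drives = 4
budget = 250          # the rough total mentioned above

drives_cost = drive_price * n_drives
raw_gb = drive_size_gb * n_drives
mirrored_gb = raw_gb // 2  # mirroring halves the usable space
left_for_extras = budget - drives_cost

print(f"drives: ${drives_cost} for {raw_gb}G raw, {mirrored_gb}G mirrored")
print(f"left for controller and cables: ${left_for_extras}")
```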
|
arthurp
|
|
response 22 of 26:
|
Mar 27 13:46 UTC 2007 |
Yep. Hardware mirroring with hot spares, and good OS support would be
the way to go. Speed increase on reads. Auto reliability. Cheap.
I like 3btech as well.
|
maus
|
|
response 23 of 26:
|
Mar 30 02:24 UTC 2007 |
Is 3btech a vendor that you order through? If I have been looking at
references to the right group (http://3btech.net), they appear to be a
vendor with a very strong reputation. A strong reputation nearly always
beats a bargain, IMHO. Can you get an estimated quote on the stack of
Serial ATA drives and the RAID board and the cage and cables that we
were looking at?
|
krokus
|
|
response 24 of 26:
|
Apr 2 18:10 UTC 2007 |
I guess that depends on if it's a good reputation, and how severe
the bargain is.
|