Oxidisation of Seagate & WDC PCBs

In comp.sys.ibm.pc.hardware.storage Mike Tomlinson <mike@none.invalid> wrote:
In article <82cni4F42iU1@mid.individual.net>, Arno <me@privacy.net>
writes

That sounds like BS to me. A soft pencil eraser cannot remove silver
sulfide, it is quite resilient.

It's a technique that has been used on edge connectors for many years.

It works with a harder eraser, and it works for tin contacts with
a soft one. But it does not work for silver contacts; you need
to have at least some sand in the eraser for that.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Franc Zabkar <fzabkar@iinternode.on.net> wrote:
On Thu, 8 Apr 2010 14:03:39 -0700 (PDT), whit3rd <whit3rd@gmail.com>
put finger to keyboard and composed:

On Apr 8, 12:11 am, Franc Zabkar <fzab...@iinternode.on.net> wrote:

Is this the fallout from RoHS?

Maybe not. There are other known culprits, like the drywall (gypsum board,
sheetrock... whatever it's called in your region) that outgasses hydrogen
sulphide. Some US construction of a few years ago is so bad with this toxic
and corrosive gas emission that demolition of nearly-new construction is
called for.

Corrosion of nearby copper is one of the symptoms of the nasty product.

It's not just Russia that has this problem. The same issue comes up
frequently at the HDD Guru forums.

I'm right here in the US and I had 3 of 3 WD 1TB drives fail at the same
time in RAID1, thus making the entire array dead. It is not that you can
simply buff that dark stuff off and you're good to go. The drive itself tries
to recover from failures by rewriting service info (remapping etc.) but the
connection is unreliable and it trashes the entire disk beyond repair. Then
you have that infamous "click of death"... BTW, it is not just WD; others
are also that bad.

It is extremely unlikely for a slow chemical process to achieve this
level of synchronicity. So unlikely that it would be fair to call
it impossible.

Your array died from a different cause that would affect all drives
simultaneously, such as a power spike.

Yes, they did not die from contacts oxidation at that very same moment. I
can not even tell they all died the same month--that array might've been
running in degraded mode with one drive dead, then after some time second
drive died but it was still running on one remaining drive. And only when
the last one crossed the Styx the entire array went dead.

Ah, I see. I did misunderstand that. May still be something
else but the contacts are a possible explanation with that.

I don't think it is something else but everything is possible...

I don't use Windows so my machines are never turned off unless there
is a real need for this. And they are rarely updated once they are
up and running so there are no reboots. Typical uptime is more than a
year.

So your disks worked and then refused to restart? Or you are running
a RAID1 without monitoring?

They failed during weekly full backup. One of the file reads failed and they
entered that infinite loop of restarting themselves and retrying. Root
filesystem was also on that RAID1 array so there was no other choice than
to reboot. And on that reboot all 3 drives failed to start with the same
"click of death" syndrome.

I don't know though how I could miss a degradation alert if there was any.

Well, if it is Linux with mdadm, it only sends one email per
degradation event in the default settings.

Yep, I probably missed it when shoveling through mountains of spam.
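
Since a single alert mail is easy to lose, the array state can also be checked directly. What follows is only a minimal sketch, assuming the usual Linux /proc/mdstat layout in which each array stanza carries a status such as "[3/2] [U_U]" (expected members / working members). The function name and the sample text are made up for the illustration and are not part of mdadm.

```python
import re

# Matches the status part of an mdstat stanza, e.g. "[3/2] [U_U]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Matches the first line of an array stanza, e.g. "md0 : active raid1 ...".
ARRAY_RE = re.compile(r"^(md\S*)\s*:")


def degraded_arrays(mdstat_text):
    """Return (array, expected, working) for every array missing a member."""
    found = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected, working = int(status.group(1)), int(status.group(2))
            if working < expected:
                found.append((current, expected, working))
            current = None
    return found


# Hypothetical sample: a 3-way RAID1 with one failed member, plus a healthy array.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdc1[2] sdb1[1](F) sda1[0]
      976630336 blocks [3/2] [U_U]
md1 : active raid1 sdb2[1] sda2[0]
      1048576 blocks [2/2] [UU]
unused devices: <none>
"""

print(degraded_arrays(SAMPLE))  # prints [('md0', 3, 2)]
```

On a real machine the text would come from reading /proc/mdstat, and the check could run from cron so that a degraded array keeps nagging until it is fixed.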

All 3 drives in the array simply failed to start after reboot. There were
some media errors reported before reboot but all drives somehow worked. Then
the system got rebooted and all 3 drives failed with the same "click of
death."

The mechanism here is not that oxidation itself killed the drives. It never
happens that way. It was the main cause of the failure, but the drives actually
committed suicide, much as the body's immune system kills the body when
overreacting to some kind of hemorrhagic fever or so.

The probable sequence is something like this:

- Drives run for a long time with majority of the files never
accessed so it doesn't matter if that part of the disk where they
are stored is bad or not

I run long smart selftest on all my drives (RAID or no) every
14 days to prevent that. Works well.

- When the system is rebooted RAID array assembly is performed

- While this assembly is being performed a number of sectors on a
drive are found to be defective and the drive tries to remap them

- Such action involves rewriting service information

- Read/write operations are unreliable because of failing head
contacts so the service areas become filled with garbage

- Once the vital service information is damaged the drive is
essentially dead because its controller can not read vital data to
even start the disk

- The only hope for the controller to recover is to repeat the read
in the hope that it might somehow succeed. This is that infamous
"click of death" sound when the drive tries to read the info again and
again. There is no way it can recover because the data are
trashed.

- Drives do NOT fail while they run, the failure happens on the next
reboot. The damage that would kill the drives on that reboot
happened way before that reboot though.

That suicide also can happen when some old file that was not accessed for
ages is read. That attempt triggers the suicide chain.

Yes, that makes sense. However you should do surface scans on
RAIDed disks regularly, e.g. by long SMART selftests. This will
catch weak sectors early and other degradation as well.

I know but I simply didn't think all 3 drives can fail... I thought I have
enough redundancy because I put not 2 but 3 drives in that RAID1... And I
did have something like a test with regular weekly full backup that reads
all the files (not the entire disk media but at least all the files on it)
and that was that backup that triggered disk suicide.

Anyway lesson learned and I'm taking additional measures now. It was not a
very good experience losing some of my work...
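
To make the difference between "reads all the files" and "reads the entire disk media" concrete, here is a minimal sketch of a file-level read check, roughly what a full backup does as a side effect. The function name and chunk size are hypothetical. Note the assumptions: it only touches sectors that currently belong to files, so free space and filesystem metadata are not exercised, and recently used files may be served from the page cache rather than from the platters.

```python
import os


def read_all_files(root, chunk_size=1 << 20):
    """Read every regular file under root; return (files_read, failures)."""
    files_read = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Discard the data; only the success of the read matters.
                    while handle.read(chunk_size):
                        pass
                files_read += 1
            except OSError as exc:
                # An I/O error here is the kind of event that would show up
                # as a failed file read during a backup.
                failures.append((path, str(exc)))
    return files_read, failures
```

Calling read_all_files("/home") returns the number of files read and a list of (path, error) pairs. A long SMART selftest covers the whole surface instead, which is why the two checks complement each other.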

BTW, I took a look at brand new WDC WD5000YS-01MPB1 drives, right out of the
sealed bags with silica gel and all 4 of those had their contacts already
oxidized with a lot of black stuff. That makes me very suspicious that
conspiracy theory might be not all that crazy--that oxidation seems to be
pre-applied by the manufacturer.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
Sergey Kubushyn wrote:
BTW, I took a look at brand new WDC WD5000YS-01MPB1 drives, right out
of the sealed bags with silica gel and all 4 of those had their
contacts already oxidized with a lot of black stuff. That makes me
very suspicious that conspiracy theory might be not all that
crazy--that oxidation seems to be pre-applied by the manufacturer.

MUCH more likely that someone fucked up in the factory.
 
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
That suicide also can happen when some old file that was not accessed for
ages is read. That attempt triggers the suicide chain.

Yes, that makes sense. However you should do surface scans on
RAIDed disks regularly, e.g. by long SMART selftests. This will
catch weak sectors early and other degradation as well.

I know but I simply didn't think all 3 drives can fail... I thought I have
enough redundancy because I put not 2 but 3 drives in that RAID1... And I
did have something like a test with regular weekly full backup that reads
all the files (not the entire disk media but at least all the files on it)
and that was that backup that triggered disk suicide.

Anyway lesson learned and I'm taking additional measures now. It was not a
very good experience losing some of my work...

Yes, I can imagine. I have my critical stuff also on a 3-way RAID1,
but with long SMART selftests every 2 weeks and 3 different drives,
two from WD and one from Samsung. One additional advantage of the
long SMART selftest is that with smartd you will get a warning
email on every failing test, i.e. one every two weeks. For additional
warning you can also run a daily short test, e.g..
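
smartd normally handles this scheduling itself through its configuration. Purely as an illustration of the routine described here (a long test every two weeks, otherwise a daily short test), the following is a minimal Python sketch. The helper names are hypothetical; the only real tool assumed is smartctl with its "-t short" and "-t long" options. The sketch only builds the commands and does not run them.

```python
import datetime

LONG_INTERVAL = datetime.timedelta(days=14)  # "every 2 weeks" from the post
SHORT_INTERVAL = datetime.timedelta(days=1)  # "daily short test" from the post


def is_due(last_run, today, interval):
    """True if the test never ran or the last run is at least `interval` old."""
    return last_run is None or today - last_run >= interval


def selftest_command(device, kind):
    """Build the smartctl argv that starts a self-test of the given kind."""
    if kind not in ("short", "long"):
        raise ValueError("kind must be 'short' or 'long'")
    return ["smartctl", "-t", kind, device]


def plan(device, last_long, last_short, today):
    """Return the commands that should be started today for one device."""
    # A drive runs one self-test at a time, so the long test takes priority.
    if is_due(last_long, today, LONG_INTERVAL):
        return [selftest_command(device, "long")]
    if is_due(last_short, today, SHORT_INTERVAL):
        return [selftest_command(device, "short")]
    return []
```

Each returned command could be passed to subprocess.run(cmd, check=True) by a root cron job. The dates of the last runs would have to be stored somewhere, which is left out here.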

BTW, I took a look at brand new WDC WD5000YS-01MPB1 drives, right out of the
sealed bags with silica gel and all 4 of those had their contacts already
oxidized with a lot of black stuff. That makes me very suspicious that
conspiracy theory might be not all that crazy--that oxidation seems to be
pre-applied by the manufacturer.

Urgh. These bags are airtight. No way the problem happened on your
side then. My two weeks old WD5000AADS-00S9B0 looks fine on the top
of the PCB. I think I will have a look underneath later.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
Yes, I can imagine. I have my critical stuff also on a 3 way RAID1,
but with long SMART selftests every 2 weeks and 3 different drives,
two from WD and one from Samsung. One additional advantage of the
long SMART selftest is that with smartd you will get a warning
email on every failing test, i.e. one every two weeks. For additional
warning you can also run a daily short test, e.g..

No matter what you do you can not prevent an occasional disaster :( One
MUST remember that "backup" is not a noun but a verb in imperative.

BTW, I took a look at brand new WDC WD5000YS-01MPB1 drives, right out of the
sealed bags with silica gel and all 4 of those had their contacts already
oxidized with a lot of black stuff. That makes me very suspicious that
conspiracy theory might be not all that crazy--that oxidation seems to be
pre-applied by the manufacturer.

Urgh. These bags are airtight. No way the problem happened on your
side then. My two weeks old WD5000AADS-00S9B0 looks fine on the top
of the PCB. I think I will have a look underneath later.

Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
No matter what you do you can not prevent an occasional disaster :( One
MUST remember that "backup" is not a noun but a verb in imperative.

Indeed.

Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
[...]
Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

That makes me wonder why are they silver-plated. It is definitely
not the best material longevitywise, especially for such low-level
signals. It makes me even more suspicious and adds to the conspiracy
theory.

Well, maybe. However I tend to think that "never attribute to
malice what can be adequately explained by stupidity" may apply.

These contacts should be gold plated with high quality gold. It is
also possible that the HDD vibration (always present with a running
HDD) and thermal variation allow the process to creep between the
contacts and kill them. Maybe a young, inexperienced engineer was
hired to replace an older, experienced (but more expensive) one
and that person made a pretty bad judgement call due to
inexperience, wanting to save a few cents on the design.

I have to say that the last time I saw silver plating as contact
protection was in vacuum tube equipment. Modern electronics
typically uses gold, or tin for low insertion cycle contacts.

I also found a statement on Wikipedia that silver-plated
copper, once the copper is exposed in a place, will rapidly
corrode all over because of some electro-chemical process.
No idea whether this is true or not.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
Sergey Kubushyn wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
That makes me wonder why are they silver-plated. It is definitely not
the best material longevitywise, especially for such low-level signals.

Likely just some fool's reaction to the price of gold.

It makes me even more suspicious and adds to the conspiracy theory.

Nope.
 
Sergey Kubushyn wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
You know of course that the black silver layer is still conductive
for low level signals??
 
In comp.sys.ibm.pc.hardware.storage Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
[...]
Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

You know of course that the black silver layer is still conductive
for low level signals??

Silver sulfide is a (bad) conductor? That will help for the
R/W signal. However, the lines for the moving coil go through
the same connector and they need a low-resistance path.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
Sergey Kubushyn wrote:
In sci.electronics.repair Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
cut
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.
That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
You know of course that the black silver layer is still conductive
for low level signals??

It is not. Look at low level signal relays with stated _MINIMAL_ current
capacity and think why none of them has silver contacts.

I have done brain wave registration, eye movement detection and
skin resistance measurement with silver-chloride electrodes,
and they conducted nicely.
 
In sci.electronics.repair Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
In sci.electronics.repair Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
cut
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.
That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
You know of course that the black silver layer is still conductive
for low level signals??

It is not. Look at low level signal relays with stated _MINIMAL_ current
capacity and think why none of them has silver contacts.

I have done brain wave registration, eye movement detection and
skin resistance measurement with silver-chloride electrodes,
and they conducted nicely.

That is a totally different application. Yes, silver sulfide is not a perfect
dielectric but it is not a good conductor either. And modern HDD heads are
magnetoRESISTIVE.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
In sci.electronics.repair Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
That suicide also can happen when some old file that was not accessed for
ages is read. That attempt triggers the suicide chain.

Yes, that makes sense. However you should do surface scans on
RAIDed disks regularly, e.g. by long SMART selftests. This will
catch weak sectors early and other degradation as well.

I know but I simply didn't think all 3 drives can fail... I thought I have
enough redundancy because I put not 2 but 3 drives in that RAID1... And I
did have something like a test with regular weekly full backup that reads
all the files (not the entire disk media but at least all the files on it)
and it was that backup that triggered disk suicide.

Anyway lesson learned and I'm taking additional measures now. It was not a
very good experience losing some of my work...

Yes, I can imagine. I have my critical stuff also on a 3 way RAID1,
but with long SMART selftests every 2 weeks and 3 different drives,
two from WD and one from Samsung. One additional advantage of the
long SMART selftest is that with smartd you will get a warning
email on every failing test, i.e. one every two weeks. For additional
warning you can also run a daily short test, e.g.

No matter what you do you can not prevent an occasional disaster :( One
MUST remember that "backup" is not a noun but a verb in imperative.

Indeed.

BTW, I took a look at brand new WDC WD5000YS-01MPB1 drives, right out of the
sealed bags with silica gel and all 4 of those had their contacts already
oxidized with a lot of black stuff. That makes me very suspicious that
conspiracy theory might be not all that crazy--that oxidation seems to be
pre-applied by the manufacturer.

Urgh. These bags are airtight. No way the problem happened on your
side then. My two weeks old WD5000AADS-00S9B0 looks fine on the top
of the PCB. I think I will have a look underneath later.

Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************

You know of course that the black silver layer is still conductive
for low level signals??

It is not. Look at low level signal relays with stated _MINIMAL_ current
capacity and think why none of them has silver contacts.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
[...]
Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

That makes me wonder why are they silver-plated. It is definitely
not the best material longevitywise, especially for such low-level
signals. It makes me even more suspicious and adds to the conspiracy
theory.

Well, maybe. However I tend to think that "never attribute to
malice what can be adequately explained by stupidity" may apply.

I agree but it looks like there is a pattern here...

These contacts should be gold plated with high quality gold. It is
also possible that the HDD vibration (always present with a running
HDD) and thermal variation allows the process to creep between the
contacts and kill them. Maybe a young, inexperienced engineer was
hired to replace an older, experienced (but more expensive one)
and that person made a pretty bad judgement call due to
inexperience, wanting to save a few cents on the design.

They did not save anything on that design. Gold plating is a common
procedure, it is everywhere, most card-edge connectors (e.g. PCI) are
gold and they are even called "gold fingers" by Chinese PCB manufacturers.

Silver, on the other hand, is almost unheard of and I'm pretty sure PCB
makers would charge extra for this if they agree to do it at all. And it is
NOT that the entire board is silver-plated; there are gold-plated parts on
that same board that makes it have at least 2 different platings so it will
be more expensive than simple gold all over.

I have to say that the last time I saw silver plating as contact
protection was in vacuum tube equipment. Modern electronics
typically uses Gold, or Tin for low insertion cycle contacts.

Yep. Silver plating was usually used in microwave equipment, HF coils etc.
where skin effect was so profound that current only ran through that silver
(that was quite thick, btw.) Silver is also used for HIGH CURRENT relay
contacts where the corrosion is removed by mechanical action of closing
contacts and burned through with high current.
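
To put a rough number on the skin-effect remark above, here is a minimal sketch using the standard skin-depth formula for a good conductor, delta = sqrt(rho / (pi * f * mu)). The resistivity is a textbook room-temperature value for pure silver and the function name is made up for illustration; real plating will differ somewhat.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space, H/m
RHO_SILVER = 1.59e-8    # resistivity of pure silver near 20 C, ohm*m (textbook value, assumed)


def skin_depth(freq_hz, resistivity=RHO_SILVER, mu_r=1.0):
    """Depth (in metres) at which current density falls to 1/e of its surface value."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0 * mu_r))


if __name__ == "__main__":
    for f in (10e6, 100e6, 1e9):
        print(f"{f / 1e6:7.0f} MHz: {skin_depth(f) * 1e6:.2f} um")
```

At 1 GHz this gives roughly 2 micrometres, so a plating layer a few micrometres thick already carries most of the RF current, while at DC or audio frequencies the plating contributes almost nothing to conduction.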

If you look at low current signal relays with stated minimal current
capacity _NONE_ of them have silver contacts. It is usually gold, platinum,
rhodium, or a mix thereof.

I am all pro Occam's Razor but all this looks like deliberate effort to make
it fail after some time. It is NOT easier or cheaper to put silver there
because it is an _ADDITIONAL_ step and not so common one.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
Arno wrote:
In comp.sys.ibm.pc.hardware.storage Sjouke Burry
<burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
[...]
Mine is fine on both sides. However there is quite a bit of
contact area that looks and feels silver-plated to me, most
notably around the screws and on the bottom the contacts to the
head assembly.

That makes me wonder why are they silver-plated. It is definitely
not the best material longevitywise, especially for such low-level
signals. It makes me even more suspicious and adds to the
conspiracy theory.

You know of course that the black silver layer is still conductive
for low level signals??

Silver sulfide is a (bad) conductor?
Nope.

That will help for the R/W signal. However the lines for the moving coil
go through the same connector and they need a low resistance path.
The black silver layer conducts that fine.
 
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
[...]
Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

Mine is fine on both sides. However there is quite a bit of contact
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.

That makes me wonder why are they silver-plated. It is definitely
not the best material longevitywise, especially for such low-level
signals. It makes me even more suspicious and adds to the conspiracy
theory.

Well, maybe. However I tend to think that "never attribute to
malice what can be adequately explained by stupidity" may apply.

I agree but it looks like there is a pattern here...

These contacts should be gold plated with high quality gold. It is
also possible that the HDD vibration (always present with a running
HDD) and thermal variation allows the process to creep between the
contacts and kill them. Maybe a young, inexperienced engineer was
hired to replace an older, experienced (but more expensive one)
and that person made a pretty bad judgement call due to
inexperience, wanting to save a few cents on the design.

They did not save anything on that design. Gold plating is a common
procedure, it is everywhere, most card-edge connectors (e.g. PCI) are
gold and they are even called "gold fingers" by Chinese PCB manufacturers.

Silver, on the other hand, is almost unheard of and I'm pretty sure PCB
makers would charge extra for this if they agree to do it at all. And it is
NOT that the entire board is silver-plated; there are gold-plated parts on
that same board that makes it have at least 2 different platings so it will
be more expensive than simple gold all over.

Good points. An exotic process would be more expensive than a
common one and two processes instead of one as well. I also happen
to know that putting gold directly on silver is problematic, but
putting it directly on copper is fine. At least that is for galvanics
on jewelry and if I remember this correctly.

I have to say that the last time I saw silver plating as contact
protection was in vacuum tube equipment. Modern electronics
typically uses Gold, or Tin for low insertion cycle contacts.

Yep. Silver plating was usually used in microwave equipment, HF coils etc.
where skin effect was so profound that current only ran through that silver
(that was quite thick, btw.) Silver is also used for HIGH CURRENT relay
contacts where the corrosion is removed by mechanical action of closing
contacts and burned through with high current.

That explains it. I have indeed seen it in power relays as well.

If you look at low current signal relays with stated minimal current
capacity _NONE_ of them have silver contacts. It is usually gold,
platinum, rhodium, or a mix thereof.

I am all pro Occam's Razor but all this looks like deliberate effort
to make it fail after some time. It is NOT easier or cheaper to
put silver there because it is an _ADDITIONAL_ step and not so
common one.

Well, it only makes the required level of stupidity larger,
because (if we have this right) they also need to mess up the
economic angle. If we assume they are competent, then indeed this
looks very much like a deliberate and rather bad design error.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
In comp.sys.ibm.pc.hardware.storage Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
In sci.electronics.repair Sjouke Burry <burrynulnulfour@ppllaanneett.nnll> wrote:
Sergey Kubushyn wrote:
cut
area that looks and feels silver-plated to me, most notably around
the screws and on the bottom the contacts to the head assembly.
That makes me wonder why are they silver-plated. It is definitely not the
best material longevitywise, especially for such low-level signals. It makes
me even more suspicious and adds to the conspiracy theory.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
You know of course that the black silver layer is still conductive
for low level signals??

It is not. Look at low level signal relays with stated _MINIMAL_ current
capacity and think why none of them has silver contacts.

I have done brain wave registration, eye movement detection and
skin resistance measurement with silver-chloride electrodes,
and they conducted nicely.

Silver chloride and silver sulfide are two different things.

Ok, finally looked it up:

Silver Sulfide (Ag2S) is black and forms when silver is
exposed to the atmosphere by a reaction with hydrogen sulfide.
As to conductivity, it seems this really messes up contact
characteristics, including forming diode-like effects and the
like. I found an abstract of an IEEE article from 1970 online:
"Electrical Characteristics of Contacts Contaminated
with Silver Sulfide Film"
So it seems it does conduct, but not well, uniformly or even
in an ohmic fashion. Very bad. This would explain the HDD
failures, I think. If such a noise source is found in the
signal path from/to the heads and the moving coil, I think
this can cause all sorts of problems.

Silver Chloride (AgCl) is a white crystal used as reference
electrode, because it has very stable characteristics, giving
you 230mV +/-10mV against a standard hydrogen electrode. I
conclude from this that Silver Chloride conducts reasonably
well and mostly in an ohmic fashion.


Sorry, but your observation does not apply to the discussion
at hand.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 
Arno wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net>
wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net>
wrote:
In sci.electronics.repair Arno <me@privacy.net> wrote:
[...]
Those 4 were fine on the top of PCB. Black stuff was underneath,
on those pads contacting with springy heads pins.

Mine is fine on both sides. However there is quite a bit of
contact area that looks and feels silver-plated to me, most
notably around the screws and on the bottom the contacts to the
head assembly.

That makes me wonder why are they silver-plated. It is definitely
not the best material longevitywise, especially for such low-level
signals. It makes me even more suspicious and adds to the
conspiracy theory.

Well, maybe. However I tend to think that "never attribute to
malice what can be adequately explained by stupidity" may apply.

I agree but it looks like there is a pattern here...

These contacts should be gold plated with high quality gold. It is
also possible that the HDD vibration (always present with a running
HDD) and thermal variation allows the process to creep between the
contacts and kill them. Maybe a young, inexperienced engineer was
hired to replace an older, experienced (but more expensive one)
and that person made a pretty bad judgement call due to
inexperience, wanting to save a few cents on the design.

They did not save anything on that design. Gold plating is a common
procedure, it is everywhere, most card-edge connectors (e.g. PCI)
are gold and they are even called "gold fingers" by Chinese PCB
manufacturers.

Silver, on the other hand, is almost unheard of and I'm pretty sure
PCB makers would charge extra for this if they agree to do it at
all. And it is NOT that the entire board is silver-plated; there are
gold-plated parts on that same board that makes it have at least 2
different platings so it will be more expensive than simple gold all
over.

Good points. An exotic process would be more expensive than a
common one and two processes instead of one as well. I also happen
to know that putting gold directly on silver is problematic, but
putting it directly on copper is fine. At least that is for galvanics
on jewelry and if I remember this correctly.

I have to say that the last time I saw silver plating as contact
protection was in vacuum tube equipment. Modern electronics
typically uses Gold, or Tin for low insertion cycle contacts.

Yep. Silver plating was usually used in microwave equipment, HF
coils etc. where skin effect was so profound that current only ran
through that silver (that was quite thick, btw.) Silver is also used
for HIGH CURRENT relay contacts where the corrosion is removed by
mechanical action of closing contacts and burned through with high
current.

That explains it. I have indeed seen it in power relays as well.

If you look at low current signal relays with stated minimal current
capacity _NONE_ of them have silver contacts. It is usually gold,
platinum, rhodium, or a mix thereof.

I am all pro Occam's Razor but all this looks like deliberate effort
to make it fail after some time. It is NOT easier or cheaper to
put silver there because it is an _ADDITIONAL_ step and not so
common one.

Well, it only makes the required level of stupidity larger,
because (if we have this right) they also need to mess up the
economic angle. If we assume they are competent, then indeed this
looks very much like a deliberate and rather bad design error.

Or some fool has focussed on the price of gold metal and has
lost sight of the fact that a more complex PCB manufacturing
process negates any advantage of using the cheaper metal.

MUCH more likely than any conspiracy to shaft the user.
 
In sci.electronics.repair Sergey Kubushyn <ksi@koi8.net> wrote:

Just took a brand spanking new WD5000AAKS drive out of sealed bag with
silica gel and all that stuff. The PCB is all _SILVER_ plated, no gold. And
that silver is almost totally black right out of the bag.

In sci.electronics.repair Arno <me@privacy.net> wrote:
In comp.sys.ibm.pc.hardware.storage Sergey Kubushyn <ksi@koi8.net> wrote:
[...]
That suicide also can happen when some old file that was not accessed for
ages is read. That attempt triggers the suicide chain.

Yes, that makes sense. However you should do surface scans on
RAIDed disks regularly, e.g. by long SMART selftests. This will
catch weak sectors early and other degradation as well.

I know but I simply didn't think all 3 drives can fail... I thought I have
enough redundancy because I put not 2 but 3 drives in that RAID1... And I
did have something like a test with regular weekly full backup that reads
all the files (not the entire disk media but at least all the files on it)
and it was that backup that triggered disk suicide.

Anyway lesson learned and I'm taking additional measures now. It was not a
very good experience losing some of my work...

Yes, I can imagine. I have my critical stuff also on a 3 way RAID1,
but with long SMART selftests every 2 weeks and 3 different drives,
two from WD and one from Samsung. One additional advantage of the
long SMART selftest is that with smartd you will get a warning
email on every failing test, i.e. one every two weeks. For additional
warning you can also run a daily short test, e.g.

No matter what you do you can not prevent an occasional disaster :( One
MUST remember that "backup" is not a noun but a verb in imperative.

BTW, I took a look at brand new WDC WD5000YS-01MPB1 drives, right out of the
sealed bags with silica gel and all 4 of those had their contacts already
oxidized with a lot of black stuff. That makes me very suspicious that
conspiracy theory might be not all that crazy--that oxidation seems to be
pre-applied by the manufacturer.

Urgh. These bags are airtight. No way the problem happened on your
side then. My two weeks old WD5000AADS-00S9B0 looks fine on the top
of the PCB. I think I will have a look underneath later.

Those 4 were fine on the top of PCB. Black stuff was underneath, on those
pads contacting with springy heads pins.

---
******************************************************************
* KSI@home KOI8 Net < > The impossible we do immediately. *
* Las Vegas NV, USA < > Miracles require 24-hour notice. *
******************************************************************
 
