Hello,
I am running an ESXi 5.0 host with an Adaptec 29320LPE SCSI PCI-E controller card and a Quantum DAT160 SCSI tape drive connected to it. The tape drive is passed through as a SCSI device to a virtual machine running Windows Server 2003 SBS with the VMware paravirtual SCSI controller.
For a long time I have had strange problems with the backup: backups to "big" DAT160 tapes worked without errors, but the daily backup to smaller (and cheaper) "DAT72" tapes showed strange errors (input/output, parity, inconsistency, etc.), most of the time during verify.
I have tried many things to get over these problems:
- I tried a different driver for the Adaptec card than the one shipped in the ESXi image: https://my.vmware.com/de/group/vmware/details?downloadGroup=DT-ESXi50-PMC-Sierra-aic79xx-5101002&productId=285
- I searched forums, etc. for similar problems and tried many of the suggestions, such as http://communities.vmware.com/thread/329905?start=0&tstart=0, but the problems persisted
- I tried PCI passthrough of the Adaptec card - sometimes this worked a little better (!), but at other times it showed other strange "parity" and similar errors, and even when it worked the performance was very poor - a backup to the big tapes was about three times slower
This week, I started to study the kernel log for SCSI messages (cat /var/log/vmkernel.log|grep -i "scsi") in the time range of the aborted backup, and I found some strange SCSI errors such as "Transmission error detected", something with "...FIFO..." and, last but not least, a "device overrun (status a) on 0:3:0".
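To narrow the log down to just these kinds of messages, something like the following could be used. This is only a sketch: filter_scsi_errors is a made-up helper name, and the search patterns are simply guessed from the message fragments quoted above, so they may need adjusting to the exact wording in your log.

```shell
# Hypothetical helper: show only log lines that look like the SCSI errors
# quoted above. The patterns are assumptions based on those fragments.
filter_scsi_errors() {
    grep -iE "transmission error|fifo|overrun" "$1"
}

# On the ESXi host, for example:
# filter_scsi_errors /var/log/vmkernel.log
```

Comparing the timestamps of the matching lines with the start and end time of the failed backup job shows whether the errors really belong to that job.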
Then I remembered the symptom that the big DAT160 tapes work much better, without errors - the difference is more capacity and higher transfer rates...
Even though I am not a SCSI expert, I wondered: maybe the tape drive could not deliver the data fast enough in the case of the smaller tapes?
So I searched for "SCSI device overrun" errors and similar terms, and I found an interesting article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1008113.
-> I have enabled "SCSI LUN queue depth throttling", as described in this article, and ... IT WORKS FINE NOW!
I made many tests with "NTBackup" and the "Quantum XTalk" tape software to be sure that this setting really solves these problems and errors.... and that's it.
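For anyone who wants to try the same thing, here is a rough sketch of what enabling the throttling looks like on the ESXi 5.0 shell. The KB article controls it through the two advanced settings QFullSampleSize and QFullThreshold. The values 32 and 4 below are only assumed example values, not necessarily the right ones for every setup, so please take the actual values from the KB article or the device vendor.

```shell
# Show the current values (a sample size of 0 means throttling is disabled)
esxcfg-advcfg -g /Disk/QFullSampleSize
esxcfg-advcfg -g /Disk/QFullThreshold

# Enable throttling; 32 and 4 are assumed example values, not a recommendation
esxcfg-advcfg -s 32 /Disk/QFullSampleSize
esxcfg-advcfg -s 4 /Disk/QFullThreshold
```

On ESXi 5.0 these are host-wide settings, so they also affect the other storage devices attached to the host.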
BTW: Some tests with the XTalk tape software, writing and then reading a full tape, aborted while reading the tape - the solution was to stop the Windows 2003 "Removable Storage" service (Wechselmediendienst) during these tests - it seems that this Windows service interferes with the SCSI transfers of the tape software.