Disabling SATA NCQ on an LSI 9211-8i SAS HBA under Solaris 11.2

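Why disable SATA NCQ?

My LSI 9211-8i has six SATA drives attached. Under NexentaStor 4.0.3, the system hung whenever I copied a large amount of data between zpools.

Before installing NexentaStor I had upgraded the LSI 9211-8i firmware from P17_IT to P20_IT.

So I first suspected a firmware compatibility problem and downgraded the firmware back to P17_IT, but the problem persisted.

Next I suspected NexentaStor itself. I switched to OmniOS and hit the same problem. Finally I installed Solaris 11.2, and the problem was still there.

I was out of ideas. What now?

Then it occurred to me: the LSI 9211-8i is a SAS controller driving SATA drives, so could SATA NCQ be causing the problem?

Following that idea, I went looking for the SATA NCQ setting on the LSI 9211-8i. I found nothing in the card's BIOS, and under Solaris neither MegaCLI nor StorCLI recognized the card.

Fortunately there is lsiutil (LSI Logic MPT Configuration Utility). I found version 1.63, which runs on Solaris 11.2: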

root@solaris:~# ./lsiutil

LSI Logic MPT Configuration Utility, Version 1.63, June 4, 2009

1 MPT Port found

   Port Name Chip Vendor/Type/Rev  MPT Rev  Firmware Rev  IOC
 1.mpt_sas0  LSI Logic SAS2008 03  200      11000100       0

Select a device:  [1-1 or 0 to quit]  

One LSI 9211-8i was found. Enter 1 to select the HBA:

 Select a device:  [1-1 or 0 to quit] 1

  1.Identify firmware, BIOS, and/or FCode
  2.Download firmware (update the FLASH)
  4.Download/erase BIOS and/or FCode (update the FLASH)
  8.Scan for devices
 10.Change IOC settings (interrupt coalescing)
 13.Change SAS IO Unit settings
 16.Display attached devices
 20.Diagnostics
 21.RAID actions
 23.Reset target
 42.Display operating system names for devices
 43.Diagnostic Buffer actions
 45.Concatenate SAS firmware and NVDATA files
 59.Dump PCI config space
 60.Show non-default settings
 61.Restore default settings
 66.Show SAS discovery errors
 69.Show board manufacturing information
 97.Reset SAS link, HARD RESET
 98.Reset SAS link
 99.Reset port
 e   Enable expert mode in menus
 p   Enable paged mode
 w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit]  

Enter e to switch to expert mode:

Enabled expert mode in menus

  1.Identify firmware, BIOS, and/or FCode
  2.Download firmware (update the FLASH)
  3.Upload firmware
  4.Download/erase BIOS and/or FCode (update the FLASH)
  5.Upload BIOS and/or FCode
  6.Download SEEPROM
  7.Upload SEEPROM
  8.Scan for devices
  9.Read/change configuration pages
 10.Change IOC settings (interrupt coalescing)
 13.Change SAS IO Unit settings
 14.Change IO Unit settings (multi-pathing, queuing, caching)
 16.Display attached devices
 17.Show expander routing tables
 18.Change SAS WWID
 19.Test configuration page actions
 20.Diagnostics
 21.RAID actions
 23.Reset target
 24.Clear ACA
 33.Erase non-volatile adapter storage
 34.Remove device from initiator table
 35.Display Log entries
 36.Clear (erase) Log entries
 37.Force full discovery
 40.Display current events
 42.Display operating system names for devices
 43.Diagnostic Buffer actions
 44.Program manufacturing information
 45.Concatenate SAS firmware and NVDATA files
 46.Upload FLASH section
 47.Display version information
 48.Display chip VPD information
 49.Program chip VPD information
 50.Dump MPT registers
 51.Dump chip memory regions
 52.Read/modify chip memory locations
 54.Identify FLASH device
 55.Force firmware to fault (with C0FFEE)
 56.Read/write expander memory
 57.Read/write expander ISTWI device
 58.Alta diagnostics
 59.Dump PCI config space
 60.Show non-default settings
 61.Restore default settings
 66.Show SAS discovery errors
 67.Dump all port state
 68.Show port state summary
 69.Show board manufacturing information
 70.Dump all device pages
 80.Set SAS phy offline
 81.Set SAS phy online
 90.Send SCSI CDB
 95.Send SATA request
 96.Send SMP request
 97.Reset SAS link, HARD RESET
 98.Reset SAS link
 99.Reset port
  e   Disable expert mode in menus
  p   Enable paged mode
  w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit]  

Expert mode adds a lot of extra menu entries, very impressive!
Select 14. Change IO Unit settings (multi-pathing, queuing, caching):

Multi-pathing:  [0=Disabled, 1=Enabled, default is 0]  

Multi-pathing is disabled by default. Press Enter to continue.

SATA Native Command Queuing:  [0=Disabled, 1=Enabled, default is 0]  

Here, at last, is NCQ. It was previously enabled, so set it to 0 (Disabled) and press Enter to continue.

SATA Write Caching:  [0=Disabled, 1=Enabled, default is 0]  

SATA write caching was also previously enabled, so I disabled it while I was at it. Press Enter to continue.

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 68  

Option 68, Show port state summary, displays the current state so the changes can be confirmed:

Current Port State
------------------
SAS2008's links are down, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, down, 6.0 G

Software Version Information
----------------------------
Current active firmware version is 11000100 (17.00.01)
Firmware image's version is MPTFW-17.00.01.00-IT
  LSI Logic
  Not Packaged Yet
x86 BIOS image's version is MPT2BIOS-7.33.00.00 (2013.07.18)

Firmware Settings
-----------------
SAS WWID:                       ****************
Multi-pathing:                  Disabled
SATA Native Command Queuing:    Disabled
SATA Write Caching:             Disabled
SATA Maximum Queue Depth:       32
SAS Max Queue Depth, Narrow:    0
SAS Max Queue Depth, Wide:      0
Device Missing Report Delay:    0 seconds
Device Missing I/O Delay:       0 seconds
Phy Parameters for Phynum:      0    1    2    3    4    5    6    7
  Link Enabled:                 Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes
  Link Min Rate:                1.5  1.5  1.5  1.5  1.5  1.5  1.5  1.5
  Link Max Rate:                6.0  6.0  6.0  6.0  6.0  6.0  6.0  6.0
  SSP Initiator Enabled:        Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes
  SSP Target Enabled:           No   No   No   No   No   No   No   No
  Port Configuration:           Auto Auto Auto Auto Auto Auto Auto Auto
Interrupt Coalescing:           Enabled, timeout is 10 us, depth is 4

Main menu, select an option:  [1-99 or e/p/w or 0 to quit]  

Both SATA Native Command Queuing and SATA Write Caching now show Disabled.
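If you would rather not check the summary by eye, the check can be scripted. The following is only a rough sketch, assuming the text of the port state summary has been saved or pasted into a string; the helper names (parse_settings, link_states, ncq_and_cache_disabled) are made up for illustration and are not part of lsiutil. It splits each "name: value" line of the summary and counts the links that are not reported as down.

```python
import re

# Excerpt of the "Show port state summary" output shown in this post.
SAMPLE = """\
Current Port State
------------------
SAS2008's links are down, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, down, 6.0 G

Firmware Settings
-----------------
Multi-pathing:  Disabled
SATA Native Command Queuing:Disabled
SATA Write Caching: Disabled
SATA Maximum Queue Depth:   32
"""


def parse_settings(text):
    """Collect 'name: value' lines into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Keep only lines that have a colon and a non-empty value.
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings


def link_states(text):
    """Return the per-phy link states from the 'links are ...' line."""
    match = re.search(r"links are (.+)", text)
    if not match:
        return []
    return [state.strip() for state in match.group(1).split(",")]


def ncq_and_cache_disabled(text):
    """True only if both NCQ and write caching are reported as Disabled."""
    settings = parse_settings(text)
    return (settings.get("SATA Native Command Queuing") == "Disabled"
            and settings.get("SATA Write Caching") == "Disabled")


if __name__ == "__main__":
    up = [s for s in link_states(SAMPLE) if s != "down"]
    print("NCQ and write cache disabled:", ncq_and_cache_disabled(SAMPLE))
    print("Links up:", len(up))
```

For the summary above, the six links that are up correspond to the six SATA drives attached to the card.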

Finally, and most importantly, select 99. Reset port to reset the HBA so that the new settings take effect.

I then retested the original problem: copying nearly 3 TB of data between two zpools completed without a single issue ^_^