Linux Guide/Monitoring
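This page is in TODO status. Anyone is free to complete it or contribute to it. At present (2010-06-11) it contains some random notes I have been collecting.
The TODO mark stands for "to do" (some editing tools automatically recognize "TODO" as a to-do item).
The following link provides a quick script for rescanning the SCSI bus in Linux.
In most cases there is a simpler way that works fine.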
echo "- - -" > /sys/class/scsi_host/host0/scan
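The one-liner above rescans only host0. A minimal sketch that applies the same write to every SCSI host is shown here; the function name `rescan_scsi_hosts` and its optional directory argument are assumptions made for illustration (the argument exists only so the loop can be tried on a scratch directory), and on a real system it has to run as root.

```shell
#!/bin/bash
# Hypothetical helper: write "- - -" to the scan file of every SCSI host.
# The three dashes are wildcards for channel, target and LUN.
rescan_scsi_hosts() {
    # Default to the real sysfs location; a different directory can be
    # passed in for a dry run on scratch files.
    local base="${1:-/sys/class/scsi_host}"
    local scan
    for scan in "$base"/host*/scan; do
        # Skip the unexpanded pattern when no host directory exists.
        [ -e "$scan" ] || continue
        echo "- - -" > "$scan"
    done
}

# Usage (as root):
#   rescan_scsi_hosts
```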
A slightly more complex script example for QLogic cards.
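#!/bin/bash
# Ask every QLogic HBA listed under /proc/scsi/qla2xxx to rescan its bus.
for HBA in /proc/scsi/qla2xxx/*
do
    # Skip the unexpanded pattern when the directory is empty or missing.
    [ -e "${HBA}" ] || continue
    echo "scsi-qlascan" > "${HBA}"
done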
Alternatively, iscsiadm can be used if it is available.
iscsiadm -m discovery --type sendtargets --portal <IP>
iscsiadm -m node --targetname <targetname> --portal <IP> --login
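The discovery and login steps can be wrapped in one small function, sketched below. It uses `-m` to select the iscsiadm mode (discovery, then node). The function name `iscsi_discover_and_login` and the `ISCSIADM` override variable are assumptions introduced for this example; setting `ISCSIADM=echo` prints the commands instead of running them.

```shell
#!/bin/bash
# Command to run; override with ISCSIADM=echo to preview the calls.
ISCSIADM="${ISCSIADM:-iscsiadm}"

# Hypothetical helper: discover the targets offered by a portal, then
# log in to one named target on that portal.
#   $1 = portal IP address, $2 = target name (IQN)
iscsi_discover_and_login() {
    local portal="$1" target="$2"
    # Stop if discovery fails, since login would fail as well.
    "$ISCSIADM" -m discovery --type sendtargets --portal "$portal" || return 1
    "$ISCSIADM" -m node --targetname "$target" --portal "$portal" --login
}

# Usage (as root), with placeholder values:
#   iscsi_discover_and_login <IP> <targetname>
```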
Among the other documents available on the web, the Red Hat Enterprise Linux 5 Online Storage Reconfiguration Guide can also be helpful.
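Dmidecode reports information about the system hardware as described in the system BIOS, according to the SMBIOS/DMI standard (see sample output). This information typically includes the system manufacturer, model name, serial number, BIOS version, and asset tag, along with many other details whose interest and reliability vary by manufacturer. It usually also includes the usage status of the CPU sockets, expansion slots (e.g. AGP, PCI, ISA), and memory module slots, as well as a list of I/O ports (e.g. serial, parallel, USB).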
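As a rough illustration, the identification fields mentioned above can be read one at a time with dmidecode's `-s` (string keyword) option. The function name `dmi_summary` and the `DMIDECODE` override variable are assumptions made for this sketch; dmidecode itself normally needs root privileges.

```shell
#!/bin/bash
# Command to run; can be overridden to point at a stub for testing.
DMIDECODE="${DMIDECODE:-dmidecode}"

# Hypothetical helper: print a few identification strings, one per line,
# using the string keywords that dmidecode accepts with -s.
dmi_summary() {
    local key
    for key in system-manufacturer system-product-name \
               system-serial-number bios-version chassis-asset-tag; do
        printf '%s: %s\n' "$key" "$("$DMIDECODE" -s "$key")"
    done
}

# Usage (as root):
#   dmi_summary
# Slot and module details come from the typed views instead, for example:
#   dmidecode -t memory
```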
What is IPMI?
The Intelligent Platform Management Interface (IPMI) specification defines a set of interfaces for platform management. It is implemented by a large number of hardware manufacturers to support system management on motherboards. The features of IPMI that most users will be interested in are sensor monitoring (i.e. CPU temperatures, fan speeds), remote power control, and serial-over-LAN (SOL).
What is FreeIPMI?
FreeIPMI provides in-band and out-of-band IPMI software based on the IPMI v1.5/2.0 specification. FreeIPMI provides tools and libraries for users to access and read IPMI sensor readings, system event log (SEL) entries, serial-over-LAN (SOL), remote power control functions, field replaceable unit (FRU) device information, and more.
More information about FreeIPMI can be found at the FreeIPMI webpage at: http://www.gnu.org/software/freeipmi/index.html
************************************************************************
~# smartctl -d cciss,0 -a /dev/cciss/c0d0
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP DH072ABAA6 Version: HPD7
Serial number: 3PD19ZMN0000983153B8
Device type: disk
Transport protocol: SAS
Local Time is: Sat Jul 19 20:09:09 2008 CEST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     29 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 899299930
  Blocks received from initiator = 14843797
  Blocks read from cache and sent to initiator = 3793967485
  Number of read and write commands whose size <= segment size = 48565840
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 945.00
  number of minutes until next internal SMART test = 7

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count: 0
No self-tests have been logged
Long (extended) Self Test duration: 840 seconds [14.0 minutes]
************************************************************************
~# smartctl -d cciss,1 -a /dev/cciss/c0d0
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP DH072ABAA6 Version: HPD7
Serial number: 3PD19ZPV000098315CX2
Device type: disk
Transport protocol: SAS
Local Time is: Sat Jul 19 20:09:12 2008 CEST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     30 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 920490987
  Blocks received from initiator = 14368268
  Blocks read from cache and sent to initiator = 3755437180
  Number of read and write commands whose size <= segment size = 48820139
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 945.02
  number of minutes until next internal SMART test = 8

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count: 0
No self-tests have been logged
Long (extended) Self Test duration: 840 seconds [14.0 minutes]
************************************************************************
~# smartctl -d cciss,2 -a /dev/cciss/c0d0
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP DH072ABAA6 Version: HPD7
Serial number: 3PD1A0SD000098300K39
Device type: disk
Transport protocol: SAS
Local Time is: Sat Jul 19 20:09:15 2008 CEST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     31 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 913141941
  Blocks received from initiator = 11455509
  Blocks read from cache and sent to initiator = 3697098775
  Number of read and write commands whose size <= segment size = 49159966
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 944.93
  number of minutes until next internal SMART test = 18

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count: 0
No self-tests have been logged
Long (extended) Self Test duration: 840 seconds [14.0 minutes]
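When only the overall verdict matters, the "SMART Health Status" line can be pulled out of the smartctl output shown above. The sketch below assumes that line format (the one printed for these SAS disks); the function name `smart_health_status` is made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: read smartctl output on stdin and print only the
# value of the "SMART Health Status:" line (for example "OK").
smart_health_status() {
    sed -n 's/^SMART Health Status:[[:space:]]*//p'
}

# Usage (as root), checking the three disks behind the cciss controller
# with smartctl's health-only option -H:
#   for n in 0 1 2; do
#       printf 'disk %s: ' "$n"
#       smartctl -d cciss,"$n" -H /dev/cciss/c0d0 | smart_health_status
#   done
```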
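Installing OMSA on Dell servers for hardware monitoring
OMSA makes it possible to monitor RAID health and motherboard/disk/chassis temperatures, generate alerts, set or modify BIOS options, view the installed devices, and so on.
To install it under Debian
1.- Add the following line to /etc/apt/sources.list
deb ftp://ftp.sara.nl/pub/sara-omsa dell sara
2.- Run
apt-get update && apt-get install dellomsa
This installs OMSA in /opt/dell.
3.- To start the OMSA services
~# /opt/dell/srvadmin/dataeng/bin/dsm_sa_datamgr32d -run
~# /opt/dell/srvadmin/dataeng/bin/dsm_sa_eventmgr32d -run
To check the health of the disks attached to controller 0
~# /etc/delloma.d/oma/bin/omreport.sh storage pdisk controller=0
The output will look similar to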
List of Physical Disks on Controller PERC 4e/Di (Embedded)

Controller PERC 4e/Di (Embedded)
ID                        : 0:0
Status                    : Ok
Name                      : Physical Disk 0:0
State                     : Online
Failure Predicted         : No
Progress                  : Not Applicable
Type                      : SCSI
Capacity                  : 68.24 GB (73274490880 bytes)
Used RAID Disk Space      : 68.24 GB (73274490880 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : MAXTOR
Product ID                : ATLAS10K5_73SCA
Revision                  : JNZY
Serial No.                : J20KVCTK
Negotiated Speed          : 320
Capable Speed             : 320
Manufacture Day           : Not Available
Manufacture Week          : Not Available
Manufacture Year          : Not Available
SAS Address               : Not Available

ID                        : 0:1
Status                    : Ok
Name                      : Physical Disk 0:1
State                     : Online
Failure Predicted         : No
Progress                  : Not Applicable
Type                      : SCSI
Capacity                  : 68.24 GB (73274490880 bytes)
Used RAID Disk Space      : 68.24 GB (73274490880 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : MAXTOR
Product ID                : ATLAS10K5_73SCA
Revision                  : JNZY
Serial No.                : J20KV5RK
Negotiated Speed          : 320
Capable Speed             : 320
Manufacture Day           : Not Available
Manufacture Week          : Not Available
Manufacture Year          : Not Available
SAS Address               : Not Available

ID                        : 0:2
Status                    : Ok
Name                      : Physical Disk 0:2
State                     : Online
Failure Predicted         : No
Progress                  : Not Applicable
Type                      : SCSI
Capacity                  : 68.24 GB (73274490880 bytes)
Used RAID Disk Space      : 68.24 GB (73274490880 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : MAXTOR
Product ID                : ATLAS10K5_73SCA
Revision                  : JNZY
Serial No.                : J20KTS8K
Negotiated Speed          : 320
Capable Speed             : 320
Manufacture Day           : Not Available
Manufacture Week          : Not Available
Manufacture Year          : Not Available
SAS Address               : Not Available
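For unattended checks, a report like the one above can be filtered so that only problem disks are printed. This is a sketch under the assumption that the output keeps the "key : value" layout shown here, with "Ok" as the healthy status and "Yes" as the failure-predicted flag; the function name `report_problem_pdisks` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "omreport storage pdisk" output on stdin and
# print one line per disk whose Status is not "Ok" or whose
# "Failure Predicted" field is "Yes". Prints nothing if all disks are fine.
report_problem_pdisks() {
    awk '
    {
        # Split each line at the first colon only, because values such
        # as the disk ID "0:1" contain colons themselves.
        sep = index($0, ":")
        if (sep == 0) next
        key = substr($0, 1, sep - 1)
        val = substr($0, sep + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", key)
        gsub(/^[ \t]+|[ \t]+$/, "", val)

        if (key == "ID")
            id = val
        else if (key == "Status" && val != "Ok")
            print id ": status " val
        else if (key == "Failure Predicted" && val == "Yes")
            print id ": failure predicted"
    }'
}

# Usage:
#   /etc/delloma.d/oma/bin/omreport.sh storage pdisk controller=0 | report_problem_pdisks
```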
To check the status/configuration of the RAID
~# /etc/delloma.d/oma/bin/omreport.sh storage vdisk controller=0
The output will look like
Virtual Disk 0 on Controller PERC 4e/Di (Embedded)

Controller PERC 4e/Di (Embedded)
ID                  : 0
Status              : Ok
Name                : Virtual Disk 0
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 136.48 GB (146548981760 bytes)
Device Name         : /dev/sda
Type                : SCSI
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Direct I/O
Stripe Element Size : 64 KB
To get a summary of the server
~# /etc/delloma.d/oma/bin/omreport.sh system summary
System Summary

------------------
Software Profile
------------------

Systems Management
Name        : Information not available.
Version     : 3.2.0
Description : Systems Management Software

Operating System
Name               : Linux
Version            : Kernel 2.6.18.2 (i686)
System Time        : Sun Nov 25 18:30:37 2007
System Bootup Time : Fri Oct 12 15:20:31 2007

--------
System
--------

System
Host Name       : MySuperServidor
System Location : Please set the value

---------------------
Main System Chassis
---------------------

Chassis Information
Chassis Model       : PowerEdge 2850
Chassis Service Tag :
Chassis Lock        : Present
Chassis Asset Tag   :

Processor 1
Processor Manufacturer : Intel
Processor Family       : Xeon
Processor Version      : Model 4 Stepping 3
Current Speed          : 3200 MHz
Maximum Speed          : 3600 MHz
External Clock Speed   : 800 MHz
Voltage                : 1400 mV

Processor 2
Processor Manufacturer : Intel
Processor Family       : Xeon
Processor Version      : Model 4 Stepping 3
Current Speed          : 3200 MHz
Maximum Speed          : 3600 MHz
External Clock Speed   : 800 MHz
Voltage                : 1400 mV

Memory
Total Installed Capacity   : 2048 MB
Memory Available to the OS : 2023 MB
Total Maximum Capacity     : 16384 MB
Memory Array Count         : 1

Memory Array 1
Location           : System Board or Motherboard
Use                : System Memory
Installed Capacity : 2048 MB
Maximum Capacity   : 16384 MB
Slots Available    : 6
Slots Used         : 2
ECC Type           : Multibit ECC

Slot PCI1
Adapter        : [Not Occupied]
Type           : PCI X
Data Bus Width : 64 Bits
Speed          : 133 MHz
Slot Length    : Long
Voltage Supply : 3.3 Volts

Slot PCI2
Adapter        : [Not Occupied]
Type           : PCI X
Data Bus Width : 64 Bits
Speed          : 133 MHz
Slot Length    : Long
Voltage Supply : 3.3 Volts

Slot PCI3
Adapter        : PRO/100 S Server Adapter
Type           : PCI X
Data Bus Width : 64 Bits
Speed          : 133 MHz
Slot Length    : Short
Voltage Supply : 3.3 Volts

BIOS Information
Manufacturer : Dell Inc.
Version      : A04
Release Date : 09/22/2005

--------------
Network Data
--------------

IP Address Data
IP Address 0 : 192.168.2.2
IP Address 1 : 192.168.0.115

--------------------
Storage Enclosures
--------------------

Storage Enclosures
Name        : Backplane
Service Tag : 62P00P8
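| Syntax | Brief description |
|---|---|
| top | Monitors and manages running processes. Press "q" to quit and "k" to kill a process. |
| htop | Similar to top, but with a friendlier, menu-based user interface. |
| lsof | Shows which processes are "touching" a file or directory, and which files a process is accessing (including network sockets, pipes, and devices). |
| netstat | Reports statistics on network usage and on connections (established and listening). |
| vmstat | Reports statistics on memory usage. |
| iostat | Reports statistics on reads from and writes to external devices. |
| inotifywatch, inotifywait | Modern Linux kernels can notify a process (user application) immediately of any access or change to a file. The "inotifywatch" and "inotifywait" commands wait for new event notifications from the kernel about anything related to a set of files/directories. |
| strace -p <pid> | Monitors the system calls (calls to services provided by the kernel) made by a user application. |
| stap | Monitors the kernel in real time and in great detail (SystemTap). A tutorial is available online. |
| oprofile and perfmon2 | Give access to the hardware performance counters. Tutorials are available online. |
| AMD CodeAnalyst | Graphical front end for OProfile. Introductions and tutorials are available online. |
| Intel VTune | Supports performance tuning on Intel hardware. |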
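The statistics tools in the table print column-oriented text, which makes them easy to post-process. As a small sketch, the function below extracts one named column (such as "free") from vmstat output; the function name `vmstat_column` is an assumption for this example, and it relies on vmstat printing a header row of column names before the data rows.

```shell
#!/bin/bash
# Hypothetical helper: read vmstat output on stdin and print the values
# of the column whose header name is given as $1 (for example "free").
vmstat_column() {
    awk -v col="$1" '
        # Until the header row with the column names has been seen,
        # look for the requested name and remember its position.
        idx == 0 {
            for (i = 1; i <= NF; i++)
                if ($i == col) idx = i
            next
        }
        # Every later row is data: print the remembered field.
        { print $idx }
    '
}

# Usage: sample once per second, five times, and print free memory:
#   vmstat 1 5 | vmstat_column free
```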