Sun Enterprise Server Maintenance

IT-ETC-033

Revision E, June 2001, Brian Jackson

Sun Microsystems Ltd
Citygate
Cross Street
Sale
Manchester M33 7JF
UK


Copyright © 2001 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, California 94303, U.S.A. All rights reserved.

This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.

Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.

Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.

Sun, Sun Microsystems, the Sun Logo, SunVTS, OpenBoot, Sun Enterprise, UltraSPARC, Solstice SyMON, Gigaplane, SPARCstorage, RSM, RSM Array, SunFastEthernet, SunFDDI, StorEdge, SunDiag, SunPCI, SunBus, AnswerBook, and OBDiag are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

The OPEN LOOK and Sun Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.

U.S. Government approval required when exporting the product.

RESTRICTED RIGHTS: Use, duplication, or disclosure by the U.S. Govt is subject to restrictions of FAR 52.227-14(g)(2)(6/87) and FAR 52.227-19(6/87), or DFAR 252.227-7015 (b)(6/95) and DFAR 227.7202-3(a).

DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.


Contents

1  Introduction to Sun Enterprise Servers
    Additional Resources
    Enterprise Introduction
    Ex000 servers versus Ex500 servers
    Server Specifications
        Sun Enterprise 3000
        Sun Enterprise 3500
        Sun Enterprise 4500
        Sun Enterprise 5500
        Sun Enterprise 6500
    Reliability, Availability, and Serviceability Features
    Reliability
    Availability
    Serviceability
    Scalability
    Concurrent Maintenance Tools
        Dynamic Reconfiguration
        Alternate Pathing
    Monitoring and Administration
        Solstice SyMON

2  Hardware Component Overview
    The Sun Enterprise 3000 Server
        Specifications
    The Sun Enterprise 3500 Server
        Specifications
    The Sun Enterprise 4000/4500 Server
        Specifications
    The Sun Enterprise 5500 Server
        Specifications
    The Sun Enterprise 6500 Server
        Specifications
    Gigaplane Architecture
        Centerplane Configuration
        Centerplane Numbering Scheme
    PCM/Slot layout
    Performance
    Hot Swap and Hot Plug
    Power Supplies
        Power/Cooling Module (PCM)
        Peripheral Power Supply (PPS)
    Hot Pluggable Boards
        Hot Plug Architecture
    Sun Enterprise Deskside Chassis Designs
    Common and unique components
    Exercise: Component Removal and Replacement

3  Bus Structures and Types
    UPA Bus Architecture
        The CPU/memory Board and the UPA Bus
        Ultra Port Architecture Features
    SBus Architecture
        SBus Features
    PCI Architecture
        PCI Mechanical Specifications
        PCI Electrical Specifications
        PCI Board connectors
    SCSI Introduction
        Small Computer System Interface Features
        Fast SCSI – Higher Bus Speed
        Wide SCSI – Wider Is Better
        Differential SCSI – Less Interference
        Ultra2 SCSI
        Termination
        Cable quality
        Conclusion
    SCSI implementation on I/O boards
    Fibre Channel Interface

4  CPU/Memory and Clock Boards
    CPU/Memory+ Board
        CPU Module
        400 MHz, 8MB Ecache Module
        CPU Module Handling Precautions
        Removing and Replacing a CPU Module
    Memory
    Board Status Indicators
    Clock+ Board
        Console Bus
        Clocks
        Reset logic
        TOD/NVRAM
        Serial, keyboard, mouse ports
        JTAG
        Remote console commands
        XIR
        LED Status codes
    Passive Boards
        Filler Panel
        Load Board

5  I/O Boards
    Types of I/O Boards:
        I/O Addressing
    SBus I/O Boards:
        SBus I/O Boards – Type 1
        SBus+ I/O Board – Type 4
        SBus I/O Boards – Type 1
        SBus+ I/O Boards – Type 4
    Graphics I/O Boards:
        Graphics I/O Board – Type 2
        Graphics+ I/O Board – Type 5
        Graphics I/O Board – Type 2
        Graphics+ I/O Board – Type 5
    PCI I/O Boards:
        PCI+ I/O Board – Type 3
    Board Status Indicators
        Enterprise 3500 Fibre Channel Interface Board
        SCSI Disk Board
        SCSI Disk Board Addressing

6  Open Boot PROM / NVRAM
    Introducing OBP
    Features of OBP
    The OBP User Interface
    System Testing Commands
    Informational Commands
    The Device Tree
    Displaying the Device Tree
        Using the .properties Command
        Using the dev Command
    Listing System Devices
    Displaying Device Aliases
        Device Alias Commands
        nvalias Command
    Open Boot PROM Commands for the NVRAM
        The printenv Command
    General NVRAM parameters
    Platform specific NVRAM parameters
    Environmental monitoring
    NVRAM security
    NVRAMRC editing commands
    Updating Flash PROM and FCode
    Correcting a Faulty Flash PROM
    Synchronizing NVRAM/TOD chips

7  Power on self test (POST)
    Introducing POST
    Self test overview
    POST control commands
        s-flag
        v-flag
    POST Menus
        option 7
    POST Board status messages
    Sample error messages
    POST error reporting
        show-post-results
    When things go wrong
    Accessing and Displaying POST
        tip session

8  Internal Disk Subsystems
    Internal Storage Capacities
        The SCSI Disk Board
        The SCSI Disk Board Addressing
    Disk Addressing
        Examples
    Sun Enterprise 3500
        Fibre Channel Interface Board
    Disk Addressing
        probe-fcal-all
        world-wide numbers
    E3500 boot disk replacement
    E3500 data disk replacement
    Sun Enterprise 3000
        I/O Addressing test

9  Solaris Support Utilities
    How Solaris References System Components
        Logical Device Names
        Physical Device Names
        Instance Names
    Configuring Components in Solaris
        Automatic Device Configuration
    Displaying System Configuration Information
        The prtconf Utility
        The sysdef Utility
        The format Utility
    Displaying Diagnostic Information
        The dmesg Command
        The prtdiag Command
    Setting NVRAM Configuration Parameters From Solaris
        The eeprom Command

10  SunVTS System Diagnostics
    Introduction
        SunVTS Software Overview
    Test categories
    Hardware and software requirements
    Starting the SunVTS Software
    The SunVTS Graphical Interface
    The SunVTS Window Panels
    The SunVTS Window Icons
    The SunVTS Menu Selections
    The Schedule Options Menu
    The Test Execution Menu
    The Advance Options Menu
    Intervention Mode
    Performance Monitor Panel
    Using SunVTS in TTY Mode
    Negotiating the SunVTS TTY Interface
    Running SunVTS Remotely
        Requirements
        Running SunVTS Through a Remote Login
        Running SunVTS Through telnet or tip
    SunVTS Test Summary
        Advanced Frame Buffer Test
        SunATM Adapter Test
        Audio Test
        Bidirectional Parallel Port Printer Test
        Compact Disc Test
        Frame Buffer, GX, GX+ and TGX Options Test
        Disk and Floppy Drives Test
        ECP 1284 Parallel Port Printer Test
        Sun Enterprise Network Array Test
        StorEdge 1000 Enclosure Test
        Frame Buffer Test
        Fast Frame Buffer Test
        SunVTS Test Summary
        Floating Point Unit Test
        Sun GigabitEthernet Test
        Intelligent Fibre Channel Processor Test
        Dual Basic Rate ISDN (DBRI) Chip
        M64 Video Board Test
        Multiprocessor Test
        Network Hardware Test
        SPARCstorage Array Controller Test
        Physical Memory Test
        Prestoserve Test
        Serial Asynchronous Interface Test
        Sun Enterprise Cluster 2.0 Network Hardware Test
        Environmental Sensing Card Test
        Soc+ Host Adapter Card Test
        Serial Parallel Controller Test
        Serial Ports Test
        SunButtons Test
        SunDials Test
        HSI Board Test
        Sun PCi Test
        System Test
        Tape Drive Test
        Virtual Memory Test
    Test Message Syntax

A  Alternate Pathing
    Introducing Alternate Pathing
    Supported Devices
        Disk Devices
        Network Devices
    Installing AP
    How AP Works
    Physical paths
    Metadisk
    Disk Pathgroup
    Metanetwork
    AP With Mirroring
    AP and DR
    AP State Database
    Creating the AP State Database
        The apinst Utility
    Creating a disk pathgroup and metadisks
    Using the metadisks
    Placing your boot disk under AP control
    Manually switching the active path
    Automatic disk pathgroup switching (AP2.1)
    Creating a network pathgroup
    Alternately pathing the primary network interface
    Switching a network pathgroup

B  Dynamic Reconfiguration
    Introducing Dynamic Reconfiguration
        What Is Dynamic Reconfiguration?
        Benefits of DR
        Disadvantages of DR
        Supported Hardware
        DR Limitations
    Displaying Board Status
        Basic Status Display
        Detailed Status Display
    Reconfiguration Considerations
        Device driver interface DDI
        Suspend-Safe and Suspend-Unsafe Devices
        Hot-Plug Hardware
        Permanent memory management
        Required additions to /etc/system
    Dynamic Reconfiguration Procedures
        Removing a CPU/Memory Board
        Installing or Replacing a CPU/Memory Board
        Removing an I/O Board
        Removing Boards that Use Detach-Unsafe Drivers
        Installing a New I/O Board
        Installing a Replacement I/O Board

Copyright 1999 Sun Microsystems, Inc. All Rights Reserved. Enterprise Services June 2001, rev E

1  Introduction to Sun Enterprise Servers

Additional Resources

● http://docs.sun.com
● Server Rack Installation Manual, Part Number 802-7573
● Sun Enterprise 6500/5500/4500 Systems Installation Guide, Part Number 805-2631
● SPARC Hardware Platform Guide, Part Number 802-5341
● Solstice SyMON User's Guide, Part Number 802-5355
● Sun Enterprise 6x00, 5x00, 4x00, and 3x00 Systems Dynamic Reconfiguration User’s Guide, Part Number 806-0280-05
● Sun Enterprise Expansion Cabinet Installation and Service Manual, Part Number 805-4009
● Sun Enterprise 6/5/4/3x00 Systems SIMM Installation Guide, Part Number 802-5032
● SBus+ and Graphics+ I/O Boards (100 MB/sec. Fibre Channels) for Sun Enterprise 6/5/4/3x00 Systems, Part Number 805-2704
● PCI+ I/O Board Installation and Component Replacement for Sun Enterprise 6/5/4/3x00 Systems, Part Number 805-1372
● Sun Enterprise 3500 System Reference Manual, Part Number 805-2630
● Sun Enterprise 6500/5500/4500 System Reference Manual, Part Number 805-2632
● Sun Enterprise Server Alternate Pathing User's Guide, Part Number 805-5444
● Sun Enterprise 6x00/5x00/4x00 Disk Board Installation Guide, Part Number 802-6740
● Sun Enterprise Systems Peripheral Power Supply Installation Guide, Part Number 802-5033
● Sun Enterprise Systems Power/Cooling Module Installation Guide, Part Number 802-6244
● Sun Enterprise 6/5/4/3x00 Systems Board Installation Guide, Part Number 805-4007


Enterprise Introduction

This course introduces concepts and hardware that may be new to you. It is intended to give you an adequate understanding of the enterprise computing environment and of how Sun servers, software, and applications fit into that enterprise. After you have been introduced to the systems and understand their capabilities, you will have an opportunity to take the systems apart and put them back together.

A main goal of this course is to help you understand the enterprise computing environment well enough to develop an appropriate concurrent maintenance strategy. Troubleshooting a system in the enterprise computing environment is quite different from troubleshooting a desktop system.

You must understand the function that the system you are working on has in a company’s enterprise computing environment, and how critical it is that the system continue to operate while you troubleshoot and repair it. A company can no longer afford to shut down a mission-critical element of its enterprise operation while you perform maintenance on that system.

Sun Microsystems has developed several products that help you perform your tasks with minimal effect on the customer’s enterprise computing environment. This course introduces those products and tools and shows you how to use them proficiently so that you can work safely on Sun Enterprise servers.


Ex000 Servers versus Ex500 Servers

The original Enterprise servers, the E3000, E4000, E5000 and E6000, have been upgraded, a process the marketing people called a “mid-life enhancement”.

The enhanced servers are called the E3500, E4500, E5500 and E6500.

Note – The key difference is the interconnect speed. The Ex000 servers run their interconnect at 83 MHz.

The E3500, E4500 and E5500 run their interconnect at 100 MHz using enhanced system boards and centreplane.

The E6500 is constrained to run at a maximum interconnect speed of 90 MHz.

● E6000 v E6500

The E6000 is housed in a 56” cabinet whilst the E6500 is housed in

a 68” cabinet. This makes room for an additional A5000 or D1000.

● E5000 v E5500

The E5000 is housed in a 56” cabinet whilst the E5500 is housed in

a 68” cabinet. This again makes room for an additional A5000 or

D1000.

● E4000 v E4500

No major difference, apart from faster interconnect.

● E3000 v E3500

Very different. The E3500 has been totally redesigned. There are too many differences to outline briefly here, but we shall cover them all.


Server Specifications - E3000

Figure 1-1 The Sun Enterprise 3000 Cabinet

Main system features and options:

● Deskside chassis

● Enterprise 3000 is a four-slot model

● One CPU/memory+ and one I/O+ board minimum

● 1 to 6 UltraSPARC CPU modules

● 64 Mbytes to 12 Gbytes of RAM

● Up to ten internal SCSI disk drives

● Ultra SCSI-2 CD-ROM32 and 4mm- or 8mm-tape drive


Server Specifications - E3500

Figure 1-2 The Sun Enterprise 3500 Cabinet

Main system features and options:

● Deskside chassis

● Five-slot system (Enterprise 3000 is a four-slot model)

● One CPU/memory+ and one I/O+ board minimum

● 1 to 8 UltraSPARC CPU modules

● 64 Mbytes to 16 Gbytes of RAM

● Up to eight internal FC-AL disk drives

● Ultra SCSI-2 CD-ROM32 and 4mm- or 8mm-tape drive

● Over 6 Tbytes of external storage


Server Specifications - E4500

Figure 1-3 The Sun Enterprise 4500 Cabinet

Main system features and options:

● Desktop chassis

● Eight-slot system, four in front and four in back

● One CPU/Memory+ and one I/O+ board minimum

● 1 to 14 UltraSPARC CPU modules

● 64 Mbytes to 28 Gbytes of RAM

● Up to 33.6 Gbytes of internal storage mounted on four disk

boards.

● Ultra SCSI-2 CD-ROM32 and 4mm- or 8mm-tape drive

● Over 10 Tbytes of external storage


Server Specifications - E5500

Figure 1-4 The Sun Enterprise 5500 Cabinet

Main system features and options:

● Datacentre cabinet

● An E4500, without cosmetic panels, mounted in a cabinet

● 1 to 14 UltraSPARC CPU modules

● 64 Mbytes to 28 Gbytes of RAM

● Up to 720 Gbytes of internal storage, comprising four disk boards

and A5000 or D1000 disk trays.

● Ultra SCSI-2 CD-ROM32 and 4mm- or 8mm- tape drive

● Over 10 Tbytes of external storage


Server Specifications - E6500

Figure 1-5 The Sun Enterprise 6500 Cabinet

Main system features and options:

● Datacentre cabinet

● Sixteen-slot system, eight in front and eight in back

● Minimum configuration: one CPU/memory+ and one I/O+ board

● 1 to 30 UltraSPARC CPU modules

● 64 Mbytes to 60 Gbytes of RAM

● Up to 576 Gbytes of internal storage, comprising two disk boards

and A5000 or D1000 disk trays.

● Ultra SCSI-2 CD-ROM32 and 4mm- or 8mm- tape drive

● Over 20 Tbytes of external storage
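The maximum CPU and memory figures in the specification lists above follow a simple pattern. The sketch below reproduces them under an assumption that is inferred from those figures rather than stated in this module: every slot except the one holding the mandatory I/O board carries a CPU/memory board with two CPU modules and 4 Gbytes of RAM. The constant and function names are illustrative only.

```python
# Assumed per-board capacities, inferred from the specification lists
# (not stated explicitly in the course text).
CPUS_PER_BOARD = 2
GBYTES_PER_BOARD = 4

# Slot counts quoted in the specification lists.
SLOTS = {"E3000": 4, "E3500": 5, "E4500": 8, "E5500": 8, "E6500": 16}


def max_config(slots: int) -> tuple[int, int]:
    """Return (max CPU modules, max Gbytes of RAM) for a given slot count."""
    cpu_boards = slots - 1  # one slot must hold an I/O board
    return cpu_boards * CPUS_PER_BOARD, cpu_boards * GBYTES_PER_BOARD


for model, slots in SLOTS.items():
    cpus, ram = max_config(slots)
    print(f"{model}: up to {cpus} CPU modules, {ram} Gbytes of RAM")
```

Under these assumptions the results match the listed maxima: 6 CPUs and 12 Gbytes for the E3000, 8 and 16 for the E3500, 14 and 28 for the E4500 and E5500, and 30 and 60 for the E6500.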


Reliability, Availability, and Serviceability Features

Reliability, availability, and serviceability (RAS) is a set of enterprise computing technologies that furnish a high degree of protection for corporate data (reliability), provide near-continuous data access (availability), and incorporate procedures to correct problems with minimal business impact (serviceability).

These capabilities are a standard part of traditional monolithic, centralized processing systems. Many businesses today are moving to network computing, where the flexible, scalable architecture enables them to expand IT systems easily as their needs grow while maintaining a reliable, stable computing environment. Sun Microsystems has become a trusted vendor of safe, innovative network computing solutions by delivering mainframe-class RAS features and capabilities in its commercial computing solutions.

New features that improve data integrity, system reliability, and

availability include a simpler system design, improved environmental

and hardware monitoring tools, redundant power and cooling, and

hot plug design for some components. Hot plug means that these

system components can be replaced or added while the server is up

and running. Serviceability features include requiring only one tool for

disassembly and re-assembly (a Phillips screwdriver), identical

components across the Sun Enterprise server family, and improved

diagnostics utilities.

The focus of the RAS feature set is to warn the operator about problems and to act on their effects. New sensors in the hardware are monitored by the software for just about everything. For example, the software monitors the temperature not only of each board but also of each central processing unit (CPU) module, as well as the state of each fan. There are unique monitoring tools, such as Sun Management Centre, which can display the state of the machine down to the board level and which work on a “predictive failure” model. For example, Sun Management Centre provides the system administrator with warnings indicating the likely effects of a detected problem on the system.


Reliability

Sun Enterprise Ex500 systems have many features that improve their

reliability, which is defined as their ability to run continuously and

correctly. These features demonstrate continuous improvement and

Sun’s commitment to quality systems. The goal is to minimize the

burden on system operators and system administrators.

ECC and Parity Protection

● End-to-end error checking and correction (ECC) protection of data

● Address and control lines are parity protected

● Improved hardware monitors (time-outs and parity)

Enhanced Environmental Monitoring

● Advanced monitoring tools for power supplies, fans,

CPU/memory, input/output (I/O) boards, disks, and system

temperatures.

For example, if a CPU module overheats, the system takes it off-line.


Availability

The following describes some of the availability features of the Sun

Enterprise servers:

System Monitoring

System monitoring enhancements improve reliability by directing

error messages to other applications that can dynamically alter the

system’s configuration without stopping or rebooting the system.

New capabilities of power on self test (POST) analyze parts and

report failures to the automatic reconfiguration software.

Automatic System Reconfiguration (ASR)

ASR uses the POST output to identify and remove failed components from the system’s configuration before rebooting the system.

Hot-pluggable power supplies and disk drives that have failed can be replaced without any system downtime or reboot, which increases the system’s availability.

Redundant Components

This feature provides for an immediate replacement of a failed component. A redundant power supply provides the current necessary for the system to continue to operate if another power supply fails. Large systems have multiple power supplies, each capable of providing power for a specific number of boards (not specific boards or slots). Should two or more power supplies fail, the system’s ASR software would reconfigure for fewer boards, reducing the power requirements to that of the available power supplies, and continue to operate in a reduced capacity until the failed power supplies are replaced.
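The relationship between working power supplies and the number of boards that can stay configured can be sketched as follows. This is an illustration only, not the ASR implementation: the value of two boards per supply is an assumption chosen for the example, and the function names are hypothetical.

```python
import math

# Assumed for illustration: the text says each supply powers "a specific
# number of boards" without giving the number in this section.
BOARDS_PER_SUPPLY = 2


def supplies_needed(boards: int, redundant: bool = True) -> int:
    """Supplies required for a board count, plus one spare if redundant (N+1)."""
    base = math.ceil(boards / BOARDS_PER_SUPPLY)
    return base + 1 if redundant else base


def boards_to_deconfigure(boards: int, working_supplies: int) -> int:
    """Boards that must be dropped so the remaining supplies can carry the load."""
    supported = working_supplies * BOARDS_PER_SUPPLY
    return max(0, boards - supported)


# Eight boards with N+1 redundancy: five supplies under the stated assumption.
total = supplies_needed(8)
print(total)                                  # supplies installed
print(boards_to_deconfigure(8, total - 1))    # one failure: spare absorbs it
print(boards_to_deconfigure(8, total - 2))    # two failures: reduced capacity
```

With one failed supply the redundant unit covers the load and no boards are removed. With two failures the example system would drop two boards and keep running at reduced capacity.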


Serviceability

The following describes some of the serviceability features of the Sun

Enterprise servers:

Hot-Plug and Hot-Swap Components

These do away with the need for downtime when a component is replaced.

Dynamic Reconfiguration

This eliminates the need for a reboot to logically attach a new or replacement board.

Improved Diagnostics

These identify a system component failure more accurately.

The tests that run on system components at power on illuminate status

light emitting diodes (LEDs).


Scalability

The modular design allows customers to expand and enhance the system as they require. Because Sun has leveraged the same technology across the entire line of servers, from small (2-4 CPU) workgroup servers to large (up to 30 CPU) enterprise servers, upgrade costs can be kept to a minimum and customers can protect their investments.

The following lists the hardware components that are the same in

workgroup servers and enterprise servers:

● CPU/Memory(+), and I/O(+) boards

● Clock boards

● Power and cooling modules

● Peripheral power supplies (184-watt and 195-watt models)


Concurrent Maintenance Tools

Dynamic Reconfiguration

Dynamic Reconfiguration (DR) is the ability to alter the configuration

of a running system by bringing components online or taking them

off-line without disrupting system operation or requiring a system

reboot. With DR, system boards can be logically and physically

included in the system configuration, or logically and physically

removed while the system is running.

This is useful in mission-critical environments if a system board has

failed and needs to be replaced or if new system boards need to be

added to the system for additional performance and capacity. DR is a

critical part of the concurrent maintenance strategy prevalent in the

enterprise computing environment.

Alternate Pathing

Alternate Pathing (AP) creates a new layer of device drivers called

meta-disks and meta-networks, which route access to one of two

physical device drivers. Applications and the operating system

components, including the disk management software, use the meta-

device name to access the resource. Only the drivers know the actual

physical paths.

The active path can be manually switched from the primary to the

alternate, at any time, with no interruption to data traffic. With AP

software operating and configured, automatic switch-over to the

alternate path occurs if the primary path fails. A manual AP switch

back to the primary path is required after service has been completed.

Meta-device definitions are stored in an AP state database that is used

early in the boot process. There are usually several copies of this

database. You must create the meta-devices yourself; the system does

not automatically create these for you.
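The path-switching behaviour described above can be modelled with a small toy class. This is a conceptual sketch only: it does not use or represent the real AP commands or driver interfaces, and the class, method, and path names are all hypothetical.

```python
class MetaDevice:
    """Toy model of a meta-device with a primary and an alternate path."""

    def __init__(self, name: str, primary: str, alternate: str):
        self.name = name
        self.paths = {"primary": primary, "alternate": alternate}
        self.healthy = {"primary": True, "alternate": True}
        self.active = "primary"

    def physical_path(self) -> str:
        """Only the meta-device knows which physical path is in use."""
        return self.paths[self.active]

    def fail(self, which: str) -> None:
        """Mark a path failed; switch over automatically if it was active."""
        self.healthy[which] = False
        other = "alternate" if which == "primary" else "primary"
        if which == self.active and self.healthy[other]:
            self.active = other

    def repair(self, which: str) -> None:
        """Mark a path serviceable again; the active path is NOT changed."""
        self.healthy[which] = True

    def switch(self, which: str) -> None:
        """Manual switch, as required after service is completed."""
        if not self.healthy[which]:
            raise RuntimeError(f"{which} path of {self.name} is not healthy")
        self.active = which


disk = MetaDevice("metadisk0", primary="controller-A", alternate="controller-B")
disk.fail("primary")          # automatic switch-over to the alternate path
print(disk.physical_path())
disk.repair("primary")        # service completed, still on the alternate path
disk.switch("primary")        # manual switch back
print(disk.physical_path())
```

Applications in this model only ever hold the `MetaDevice` object, which mirrors the point that users of the meta-device name never see the physical path.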


Monitoring and Administration

Sun Management Centre

Sun Management Centre, formerly known as SyMON, is a comprehensive system monitoring tool for the Sun Enterprise servers. Its graphical user interface (GUI) and intuitive design make it easy to learn and use.

Sun Management Centre is a powerful system management solution

that dramatically increases RAS by allowing system administrators to

monitor and quickly manage large enterprise system configurations.

Sun Management Centre addresses the following system management functions:

● Manages thousands of systems

● Supports heterogeneous GUI (Java technology-based)

● Supports full Simple Network Management Protocol (SNMP)

connectivity

● Supports active configuration management controls (supports DR)

● Supports historical data storage

● Supports system management capabilities


Hardware Component Overview

The Sun Enterprise 3000 Server

The Enterprise 3000 is a deskside tower enclosure. All the boards plug into the rear of the E3000.

The clock board is located in the lower right, next to board slot 1. The clock board has its own slot and does not use one of the four slots for the CPU/memory or I/O boards.

There are four slots in the bottom portion of the cabinet for CPU/memory boards and I/O boards. The slots are numbered 1, 3, 5, and 7, from right to left.

A fully loaded E3000 requires two power/cooling modules (PCMs), the first located above slots 1 and 3, the second located above slots 5 and 7. A third PCM can be used for redundant power in a fully loaded system.

If a third PCM is not used, a fan tray must be installed above the peripheral power supplies to provide cooling.

The peripheral power supply is located in the lower left of the cabinet rear. A spot for a redundant peripheral power supply is located to the right of the first peripheral power supply.

Internal Disk Drives

The E3000 holds up to ten internal hot-plug disk drives. The disks are all driven from the I/O board in slot 1.

Disk addressing is covered in chapter 8.

Figure: Sun Enterprise 3000 board and power supply layout. Labels: three 300-watt PCMs; four board slots (7, 5, 3, 1); clock board; peripheral power supply #1; peripheral power supply #2 (optional).


The Sun Enterprise 3000 Server

Specifications

Table 2-1 Sun Enterprise 3000 Server Specifications and Features

Features                                  System/Board Configuration
Number of Gigaplane slots                 Four slots. Minimum configuration requires one I/O and one system board
Number of processors                      One to six Superscalar SPARC Version 9, UltraSPARC microprocessor modules
CPU interface                             One to six 128-bit Ultra Port Architecture (UPA) slots
Memory                                    256 Mbytes to 12 Gbytes
System interconnect                       Gigaplane, 2.68 GB/sec at 83 MHz
Three different power supply systems      Up to three power and cooling modules (PCM) (power supply + fan module) for system and I/O boards. A peripheral power supply (PPS1) for auxiliary power and a peripheral power supply/AC (PPS0)
Internal disk                             Up to ten 3.5 inch hot-pluggable SCSI disk drives
Internal tape                             8 mm, 4 mm, and .25 inches
CD-ROM                                    SunCD12 drive standard
Height                                    65 cm (25.5 inches)
Width                                     43 cm (17.0 inches)
Depth                                     60 cm (23.5 inches)


The Sun Enterprise 3500 Server

The Sun Enterprise 3500 is vastly different from the E3000.

There are five slots in the bottom portion of the cabinet for

CPU/memory boards and I/O boards. The slots are numbered 1, 3, 5,

7, and 9 from right to left.

The Sun Enterprise 3500 server comes with at least one power/cooling

module located above slots 1 and 3. If a second power/cooling module

is required, it would fit above slots 5 and 7, to the left of the first

power/cooling module.

A fan tray above the peripheral power supply is also included in an

entry configuration.

A third power/cooling module can be used for redundant power in a

system with three or more boards. To install the third power/cooling

module, the existing fan tray, located to the left of the second

power/cooling module, must be removed. The third power/cooling

module fits into this slot.

In addition to three power/cooling modules, a second peripheral

power supply is required for full N+1 power supply redundancy in a

five-board Sun Enterprise 3500 server configuration.

The first peripheral power supply is located in the lower left of the

cabinet rear. A spot for the second, optional peripheral power supply

is located in the lower right of the Sun Enterprise 3500 cabinet front.

On the Sun Enterprise 3000 system, this second peripheral power supply is located in the rear of the cabinet. It was redesigned and moved to the front of the Sun Enterprise 3500 system cabinet in order to provide space for the additional system slot.

The second peripheral power supply on the Sun Enterprise 3500 server is now a 195-watt unit, instead of the 184-watt peripheral power supply used on the Sun Enterprise 3000 server.


Internal FC-AL Drives

The Sun Enterprise 3500 server has two internal disk banks (four disks per disk bank), which support up to eight 9.1-GB FC-AL disks with optional dual-port connections. The number of internal disks supported in the Sun Enterprise 3500 server was reduced in order to provide room for the additional system slot in the rear of the server.

The inclusion of the fifth system slot in the back of the cabinet required that the optional second peripheral power supply be redesigned and moved to the front of the cabinet, resulting in less space in the front of the cabinet for disk drives.

The newer drives, however, can be configured to provide better disk availability than that offered by the Sun Enterprise 3000 server. Each of the two disk banks can have one or two FC-AL loops connected to the installed drives for a total of up to four loops. Dual-loop configurations provide a highly available, redundant hardware configuration.

Because the two banks are independent, a full configuration of eight disk drives requires a minimum of two loops: one for each bank of four drives. On the other hand, a minimum configuration requires only one FC-AL connection for up to four disk drives.

The new FC-AL drives in the Sun Enterprise 3500 server still provide the hot-swap capability offered with the internal SCSI drives on the Sun Enterprise 3000 server.
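The loop-count rules above can be expressed as a short calculation. The sketch assumes drives fill the first bank before the second is used, which the text does not state, and the function name is illustrative.

```python
import math

DRIVES_PER_BANK = 4   # four disks per disk bank
BANKS = 2             # two independent internal banks
LOOPS_PER_BANK = 2    # each bank accepts one or two FC-AL loops


def loop_range(drives: int) -> tuple[int, int]:
    """Return (minimum, maximum) FC-AL loops for a number of internal drives.

    Assumes the first bank is filled before the second is used.
    """
    if not 1 <= drives <= DRIVES_PER_BANK * BANKS:
        raise ValueError("the E3500 holds one to eight internal FC-AL drives")
    banks_used = math.ceil(drives / DRIVES_PER_BANK)
    return banks_used, banks_used * LOOPS_PER_BANK


print(loop_range(4))  # one bank: single loop minimum, dual loop maximum
print(loop_range(8))  # both banks: two loops minimum, four loops maximum
```

The minimum corresponds to a single loop per populated bank; the maximum corresponds to the dual-loop, redundant configuration on every populated bank.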

Disk addressing is covered in chapter 8.

Figure: Sun Enterprise 3500 front and rear views. Labels: three 300-watt power/cooling modules; five board slots (9, 7, 5, 3, 1); clock board; peripheral power supply with FC-AL interface board; key switch; second peripheral power supply; fan tray; internal FC-AL disks; 32X CD-ROM; tape drive; AC inlet.


The Sun Enterprise 3500 Server

Specifications

Table 2-2 Sun Enterprise 3500 Server Specifications and Features

Features                                  System/Board Configuration
Number of Gigaplane slots                 Five slots. Minimum configuration requires one I/O and one system board
Number of processors                      One to eight Superscalar SPARC Version 9, UltraSPARC microprocessor modules
CPU interface                             One to eight 128-bit Ultra Port Architecture (UPA) slots
Memory                                    256 Mbytes to 16 Gbytes
System interconnect                       Gigaplane, 2.68 GB/sec at 83 MHz, 3.2 GB/sec at 100 MHz
Three different power supply systems      Up to three power and cooling modules (PCM) (power supply + fan module) for system and I/O boards. A peripheral power supply (PPS1) for auxiliary power and a peripheral power supply/AC (PPS0)
Internal disk                             Up to eight 3.5 inch hot-swappable FC-AL disk drives with dual porting
Internal tape                             8 mm, 4 mm, and .25 inches
CD-ROM                                    SunCD32 drive standard
Height                                    65 cm (25.5 inches)
Width                                     43 cm (17.0 inches)
Depth                                     60 cm (23.5 inches)


The Sun Enterprise 4000/4500 Server

A compact mid-range server with tremendous computing power, this server nearly doubles the expandability of the Sun Enterprise 3500 server.

You can install up to fourteen UltraSPARC II processor modules in a single chassis

with four CPU/memory boards in the front and three CPU/memory boards in back.

You can install up to four Sun Enterprise 4500 servers in a single data center cabinet.

When properly configured, each Enterprise 4500 system can support over 4

Terabytes of disk storage.

The Enterprise 4500, like the Sun Enterprise 5500 and 6500 servers, uses a horizontal

card cage.

Figure: Sun Enterprise 4000/4500 front and rear views. Labels: power/cooling modules; peripheral power supply; key switch; tape drive (optional); 32X CD-ROM drive; CPU/memory, I/O and disk boards; clock board.


The Sun Enterprise 4500 Server

Specifications

Table 2-3 Sun Enterprise 4500 Server Specifications and Features

Features                                  System/Board Configuration
Number of Gigaplane slots                 Eight slots. Minimum configuration requires one I/O and one system board
Number of processors                      Two to 14 Superscalar SPARC Version 9, UltraSPARC II microprocessor modules
CPU interface                             One to 14 128-bit Ultra Port Architecture (UPA) slots
Memory                                    256 Mbytes to 28 Gbytes
System interconnect                       Gigaplane, 2.68 GB/sec (E4000 at 83 MHz), 3.2 GB/sec (at 100 MHz)
Two different power supply modules used   Up to four PCM (300 watt power supply + fan module) for system and I/O boards. One PPS1 (184 watt peripheral power supply) for auxiliary power
Internal disk                             Up to eight 9.1 GByte disk drives on up to four disk boards
Internal tape                             8 mm, 4 mm, and .25 inches
CD-ROM                                    SunCD32 drive standard
Height                                    34 cm (13.5 inches)
Width                                     50 cm (19.7 inches)
Depth                                     56 cm (22 inches)


The Sun Enterprise 5500 Server

The Sun Enterprise 5500 is a 68-inch data center cabinet with an 8-slot E4500 card cage mounted inside. The data center cabinet provides power distribution and cooling for the system and up to one half terabyte of disk space. Each Enterprise 5500 data center rack can accommodate up to four StorEdge A5000 disk subsystems. The Sun Enterprise 5000 server can accommodate up to six removable storage modules (RSMs). The system, when configured with the proper features and options, can support over six terabytes of disk space. This does require additional disk expansion racks.

Note: You can have A5000s or D1000s

Figure: Sun Enterprise 5500 front and rear views. Labels: CPU/memory, I/O, and disk board slots; power/cooling modules; power sequencer; optional second power sequencer; peripheral power supply; clock board; key switch; 32X CD-ROM drive; tape drive (optional); cabinet fan tray; Sun StorEdge Library FlexiPack tray or hub tray; Sun StorEdge A5000 arrays; Sun StorEdge D1000 arrays.


The Sun Enterprise 5500 Server

Specifications

Table 2-4 Sun Enterprise 5500 Server Specifications and Features

Features                                  System/Board Configuration
Number of Gigaplane slots                 Eight slots. Minimum configuration requires one I/O and one system board
Number of processors                      Two to 14 Superscalar SPARC Version 9, UltraSPARC II microprocessor modules
CPU interface                             Up to 14 128-bit Ultra Port Architecture (UPA) slots
Memory                                    256 Mbytes to 28 Gbytes
System interconnect                       Gigaplane, 2.68 GB/sec (E5000 at 83 MHz), 3.2 GB/sec (at 100 MHz)
Two different power supply modules used   Up to four PCM (300 watt power supply + fan module) for system and I/O boards. A PPS1 (184 watt peripheral power supply) for auxiliary power
Internal disk                             Up to eight 9.1 GByte disk drives on up to four disk boards
A5200 option                              Up to four subassemblies for over 1 TByte of storage
Internal tape                             8 mm, 4 mm, and .25 inches
CD-ROM                                    SunCD32 drive standard
Height                                    173 cm (68.3 inches)
Width                                     77 cm (30 inches)
Depth                                     99 cm (39 inches)


The Sun Enterprise 6500 Server

The Sun Enterprise 6500 server is a 68-inch data center cabinet with a 16-slot card cage: eight board slots in the front and eight in the back. The E6000 holds one fewer storage array, since it is housed in a 56-inch cabinet.

Note: You can have A5000s or D1000s

Figure: Sun Enterprise 6500 front and rear views. Labels: CPU/memory and I/O board slots; power/cooling modules; power sequencer; optional second power sequencer; peripheral power supply; clock board; key switch; CD-ROM drive; tape drive (optional); cabinet fan tray; Sun StorEdge Library FlexiPack tray or hub tray; Sun StorEdge D1000 arrays; Sun StorEdge A5000 arrays.


The Sun Enterprise 6500 Server

Specifications

Table 2-5 Sun Enterprise 6500 Specifications and Features

Features                                  System/Board Configuration
Number of Gigaplane slots                 16 slots. Minimum configuration requires one I/O and one system board
Number of processors                      Two to 30 Superscalar SPARC Version 9, UltraSPARC II microprocessor modules
CPU interface                             Up to 30 128-bit Ultra Port Architecture (UPA) slots
Memory                                    256 Mbytes to 60 Gbytes
System interconnect                       Gigaplane, 2.68 GB/sec at 84 MHz
Two different power supply modules used   Up to eight PCM (300 watt power supply + fan module) for system and I/O boards. A PPS1 (184 watt peripheral power supply) for auxiliary power
Internal disk                             Up to four 18.2 GByte disk drives on two disk boards, slots 14 and 15 only
A5200 option                              Up to three subassemblies for over 760 GByte of storage
Internal tape                             8 mm, 4 mm, and .25 inches
CD-ROM                                    SunCD32 drive standard
Height                                    6500: 173 cm (68.3 inches); 6000: 141 cm (56 inches)
Width                                     77 cm (30 inches)
Depth                                     99 cm (39 inches)


Gigaplane Architecture

Ultra Port Architecture (UPA)

The gigaplane interconnect is based around the Sun4u (UPA)

architecture. Each board within the gigaplane is assigned 2 UPA port

numbers, which are used by the system to derive addressing

information which is passed to the Solaris kernel.

Board Layout

● CPU/memory boards are usually in even-numbered slots in the

front (component side down) of E4500, E5500, and E6500 systems.

● I/O boards are usually in odd-numbered slots in the back.

▼ I/O boards are in the back because of the interface ports and

connected cables.

Note – You can install any CPU/Memory board in any slot, front or

back and you can install any I/O board in any slot, front or back.

You must install an I/O board in slot 1 to drive the internal CD-ROM

and tape unit.

The clock board has its own special slot, which is numbered slot 16 in

all the systems.


Packet Switched Bus

● 256-bit data width (plus error correction)

● Out-of-order completion

▼ A centerplane transaction does not tie up the bus. Due to the

packet nature of bus data, you can have up to 112 transactions

waiting for completion. Because there are no unused cycles

when different boards access the centerplane, we have a

sustained bandwidth that is 97 percent of the maximum.

● Pipeline transactions

▼ Up to 7 outstanding transactions from each processor

▼ Up to 7 outstanding transactions from each board on the

Gigaplane.
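The figures in this list can be checked with a little arithmetic. The sketch below reads the 112 outstanding transactions as 16 board slots (the largest card cage in the family) times 7 transactions per board; that reading is an inference, not something the text states. The 3.2 Gbytes per second used in the usage line is the 100 MHz peak figure quoted elsewhere in this module.

```python
MAX_BOARD_SLOTS = 16            # largest card cage in the family (E6x00)
TRANSACTIONS_PER_BOARD = 7      # outstanding transactions allowed per board
SUSTAINED_FRACTION = 0.97       # sustained bandwidth as a share of the maximum


def max_outstanding(slots: int = MAX_BOARD_SLOTS) -> int:
    """Total transactions that can be awaiting completion on the bus."""
    return slots * TRANSACTIONS_PER_BOARD


def sustained_bandwidth(peak_gbytes_per_sec: float) -> float:
    """Sustained bandwidth implied by the 97 percent figure."""
    return peak_gbytes_per_sec * SUSTAINED_FRACTION


print(max_outstanding())            # 16 slots x 7 transactions
print(sustained_bandwidth(3.2))     # sustained rate at the 100 MHz peak
```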

Gigaplane Speed

● Sun Enterprise x000 systems use a clock speed of 83 MHz

▼ 83 MHz provides for up to 2.6 Gbytes per second of bandwidth

● Sun Enterprise x500 systems use a clock speed of 100 MHz

▼ 100 MHz provides for up to 3.2 Gbytes per second of bandwidth

Note – You can install a 100 MHz board in the 83 MHz system and it

should operate properly, although the board will only run at 83MHz.

But, installing an 83 MHz board in a 100 MHz system changes the

gigaplane speed to 83 MHz.

The 100 MHz boards are identified with a plus (+) sign in their

product name.
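The bandwidth figures and the mixed-board rule can be sketched as below. The calculation assumes one 256-bit data transfer per clock cycle and decimal Gbytes, neither of which is stated explicitly here. With a nominal 83 MHz clock it gives roughly 2.66 Gbytes per second, close to the 2.6 and 2.68 figures quoted in this module; the small differences presumably come from rounding of the clock rate. The 90 MHz limit mentioned for the E6500 is not modelled, and the function names are illustrative.

```python
DATA_WIDTH_BITS = 256  # Gigaplane data width, excluding error correction bits


def peak_bandwidth_gbytes(clock_mhz: float) -> float:
    """Peak bandwidth in Gbytes/sec, assuming one transfer per clock cycle."""
    bytes_per_cycle = DATA_WIDTH_BITS // 8
    return bytes_per_cycle * clock_mhz / 1000


def gigaplane_clock_mhz(system_mhz: int, board_ratings_mhz: list[int]) -> int:
    """The interconnect runs at the slowest of the system and its boards."""
    return min([system_mhz, *board_ratings_mhz])


# A 100 MHz (+) board in an 83 MHz system runs at 83 MHz.
print(gigaplane_clock_mhz(83, [100, 83]))
# One 83 MHz board in a 100 MHz system slows the whole Gigaplane to 83 MHz.
clock = gigaplane_clock_mhz(100, [100, 100, 83])
print(clock, peak_bandwidth_gbytes(clock))
# All (+) boards in a 100 MHz system keep the full rate.
print(peak_bandwidth_gbytes(gigaplane_clock_mhz(100, [100, 100])))
```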


Centerplane Configuration

The centerplane is a backplane with more connections to the bus for

the same linear space.

It does not matter which type of board plugs into which side or slot, with the exception of slot 1, which we will talk about later.

The main considerations are that you want the boards as close to one

another as possible to reduce noise and latency.

You should place boards with external cabling in the back of the

system.

The next page gives a layout of the UPA port numbers assigned to

each gigaplane slot. We have included the SCSI assignments for the

slots; we will cover this later in the course.

Figure: Centerplane block diagram. Labels: address bus; data bus; I/O boards and CPU/memory boards on the system front and system rear; clock board.


Centerplane Slot Assignment


E3000 PCM and Slot Layout

Note: If you do not have PS5 in place, you will need to fit a fan tray in

its place to provide cooling for the PPSs


E3500 PCM and Slot Layout

Note: If you do not have PS5 in place, you will need to fit a fan tray in

its place to provide cooling for the PPSs


E4500 - 6500 PCM and Slot Layout


Performance

Memory performance

Memory performance is improved by:

● 512 bits plus ECC (error correction code) = 576 bits transfer per

CPU clock cycle

● Cache-to-cache transfers, with the same-line buffer reducing

latency and processor intervention

● High memory bandwidth

▼ 500 Mbytes per second per bank (600 Mbytes per second for 2

banks on one board)

▼ Up to 16-way interleaved memory

● Address and data packets, 2-cycles each, so contention delay is

small

I/O performance

I/O performance is improved by:

● Multiple I/O boards for greater bandwidth

● Efficient interrupt processing

▼ Interrupt packets carry data and interrupts route to any CPU

● Two SBus controllers, three SBus slots per SBus I/O+ board

▼ 64-bit, 25 MHz, 64-byte bursts

▼ 100 Mbytes per second direct memory access (DMA) read, 120 Mbytes per second DMA write for each SBus

▼ Double-buffered streaming buffers for read-ahead, write-behind

● Graphics I/O card replaces one SBus and slot with a UPA bus and

graphics adapter slot. Other components and ports are the same.


Hot Swap and Hot Plug Devices

Be aware of the difference between the two:

Hot Swap

The unit automatically detaches from the system software.

Examples are:

● PCMs

● PPSs

Hot Plug

The unit has to be manually detached from the system software.

Examples are:

● Disk drives

● CPU/Memory boards

● I/O boards


Power Supplies

Power/Cooling Module (PCM)

AC Input, 100-240V AC @ 5.5A, DC Output

PCM power supplies are used in Enterprise 3x00, 4x00, 5x00, and 6x00

systems. There must be a 300W PCM for every two adjacent boards in

the system, because the fans inside the PCM are the only cooling for

those boards. This means that if a board is added to the system, there

must be an associated PCM. If one is not present, it must be added.

Each 300W PCM supplies enough power for two boards, although in a

fully loaded configuration one supply can be lost and there will still be

enough power for the remaining boards (N+1).

The PCMs:

● Are hot pluggable

● Supply cooling for two adjacent boards

● Operate in redundant current share mode (N+1)

+3.3V    +5V    +2.0V    Maximum continuous

51A      32A    5A       300 watts
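The rule that every two adjacent boards need their own PCM can be checked mechanically. The sketch below is illustrative only: the slot-to-PCM pairing is a hypothetical placeholder (the real pairing comes from the PCM and slot layout figures for each model), and the function name is invented for this example.

```python
# Hypothetical pairing of board slots to PCM positions. Replace it with the
# pairing shown in the PCM and slot layout figure for the system in question.
HYPOTHETICAL_SLOT_PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7)]


def pcms_required(occupied_slots, slot_pairs):
    """Return the indexes of the slot pairs that need a PCM fitted.

    A pair needs a PCM as soon as either of its two slots holds a board,
    because the PCM fans are the only cooling for those boards.
    """
    occupied = set(occupied_slots)
    return [i for i, pair in enumerate(slot_pairs) if occupied & set(pair)]


if __name__ == "__main__":
    # Boards in slots 0, 1, and 2 need the PCMs for pair 0 and pair 1.
    print(pcms_required([0, 1, 2], HYPOTHETICAL_SLOT_PAIRS))
```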


Peripheral Power Supply (PPS)

The PPS is used in Enterprise systems to power internal SCSI devices

(CD, tape, and disks), in addition to the devices below.

There are two types of PPS; one with an AC input which is specific to

the E3x00 systems and one without an AC Input common to all the

servers.

Backup PPS

You will find one PPS per 4x00, 5x00, or 6x00 system, and one or two PPSs in the E3x00. This is because the PPS in the 4x00, 5x00, and 6x00 systems powers the CD-ROM and tape only, whilst the PPS in an E3x00 powers the internal disks.

Losing a PPS in an E3x00 brings the system down, hence the backup.

The PPS provides the following:

● +5Vdc and +12Vdc peripheral tray power

● +5Vdc and +12Vdc drive precharge (nonredundant)

● +3.3Vdc and +5Vdc system precharge (nonredundant)

● +5Vdc redundant system power

● +12Vdc redundant power for PCM fans

● +12Vdc redundant power for E3000/E3500 Auxiliary Fan Module

● +5Vdc auxiliary power for Clock Board remote console serial port

● E4000/E4500 Keyswitch Assembly fan power

● E5000/E5500 and E6000/E6500 AC Input Box fan power

Internal Disk Board

It is the PCM, not the PPS, that powers the disk board.


Peripheral Power Supply (PPS)

184 Watt PPS, used in the E4x00, E5x00, and E6x00. Used as a backup PPS in an E3000. Part number 300-1301.

184 Watt PPS with AC Input, used as a main PPS in an E3000. Part number 300-1307.


195 Watt PPS, used as a backup PPS in an E3500. Part number 300-1358.

300-1358 - AC Input 100-240V AC @ 3A, DC Output

+5V    +5V    +12.0V    -12.0V    +14V    Maximum continuous

20A    5A     13A       1.5A      1A      195 watts

300-1301/1307 - AC Input 100-240V AC @ 3A, DC Output

+5V    +5V    +12.0V    +12.0V    Maximum continuous

20A    5A     13A       1.5A      184 watts


PCM and PPS Status Lights

Status LED Codes

Green    Yellow    Description

Off      Off       No AC input

On       Off       Normal operation

On       On        Fan failure

Off      On        DC output failure
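The status table can be expressed as a small lookup, which is handy when scripting checks or building a quick reference. This is only a sketch of the table above; the function and dictionary names are illustrative.

```python
# Lookup built directly from the PCM and PPS status LED table.
# Keys are (green_on, yellow_on).
PSU_STATUS = {
    (False, False): "No AC input",
    (True, False): "Normal operation",
    (True, True): "Fan failure",
    (False, True): "DC output failure",
}


def decode_psu_leds(green_on, yellow_on):
    """Translate the two LED states into the condition they indicate."""
    return PSU_STATUS[(bool(green_on), bool(yellow_on))]


if __name__ == "__main__":
    print(decode_psu_leds(green_on=True, yellow_on=True))
```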


Hot Pluggable Boards

Hot Plug Architecture

The CPU/Memory+ boards and the I/O+ boards are hot pluggable

under certain conditions.

You can remove a system board only when its amber light alone is lit, and even then there are checks to be made to ensure the board may be removed.

In the middle of the centerplane connector are three large pins that are

larger and longer than the Gigaplane connector pins.


These connectors provide for connection to the power bus before data

and address pins make contact in the Gigaplane connectors. Each of

the power connectors is a different length, which provides for a

sequential connection process.

The first pin to make contact when a board is plugged into the card

slot is the ground pin.

Next is the precharge voltage connection. This applies a low voltage to

the logic and prepares the logic for full voltage with less current drain

at contact than would be required if the precharge was not provided.

This eliminates the power surge that would otherwise corrupt data and address lines and cause the system to halt when a board is inserted.

Warning – The precharge voltage is provided by the PPS. Ensure the

precharge is available before attempting a hot-plug.

# /usr/platform/sun4u/sbin/prtdiag -v | grep -i precharge

Trigger Pin

Just before the data and address pins in the Gigaplane are connected, a logic pin called the trigger pin makes connection. This informs the clock board to suspend activity on the Gigaplane for 200 ms whilst the board insertion completes.
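The staged connection described above depends only on the relative lengths of the pins: the longest pin touches first. A minimal sketch, assuming made-up relative lengths (only their ordering reflects the text):

```python
# Relative pin lengths are hypothetical; only their ordering matters.
# Longer pins make contact earlier as the board slides into the slot.
PIN_LENGTHS = {
    "ground": 4,
    "precharge": 3,
    "trigger": 2,
    "data/address": 1,
}


def contact_order(pin_lengths):
    """Return pin names in the order they make contact during insertion."""
    return sorted(pin_lengths, key=pin_lengths.get, reverse=True)


if __name__ == "__main__":
    for step, pin in enumerate(contact_order(PIN_LENGTHS), start=1):
        print(step, pin)
```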

Caution – You cannot hot-plug the clock board, for two reasons. Firstly, there are no precharge connections for its slot; secondly, it is the clock board that controls board insertion.


Common and Unique components

Common Components

One of the features of the Exx00 range is the commonality between major components. Some common components include:

● CPU/Memory boards

● I/O boards

● CPU Modules

● Memory

● PCMs

● PPSs

● Clock Boards

● Disk Boards

Unique Components

Some unique components are:

● AC Input units

● Media bays

● Load boards

● E3500, IB boards

● E3500, auxiliary PPS

● E3500, FC-AL drives

● E6500 load boards


Cooling Considerations

Filler Panel

The filler panel shown below directs airflow inside the card cage and

helps shield electromagnetic interference (EMI) type emissions from

interfering with normal operations.

Caution – Empty slots in Enterprise 4X00 and 5X00 systems must have

a filler panel installed. Whenever you remove a board and do not

immediately replace it, you must install a filler panel.

[Figure: filler panel, with the springfingers labeled.]


Cooling & Loading Considerations

Load Board

The load board shown below does the same tasks as the filler panel,

but it also helps maintain a constant load on the power supply system,

reducing the occurrences of voltage spikes.

Whenever you remove a system board in an E6x00 and it is not

immediately replaced, you must install a load board in its place.

Caution – Load Boards are used only in Enterprise 6X00 Systems.

All slots in Enterprise 6X00 systems that do not contain system boards

must have a load board installed.

[Figure: load board, with the springfingers and centerplane connector labeled.]


Exercise: Component Removal and Replacement

Sun Enterprise 4500, 5500, and 6500 Systems FRU Removal Procedures

Caution – Before beginning any procedure to remove static-sensitive components from any Sun Enterprise server, attach an approved ESD wrist strap to your wrist and connect the other end to the system chassis. Connect the ESD mat provided to the same chassis and verify that it is properly grounded before proceeding. Always place removed system components on the ESD mat provided.

Removing the Power and Cooling Modules

Note – Remember the following rules for hot-plug replacement of a

PCM: The peripheral power supply must be operational (to provide

precharge current). Hot-plugging requires adequate redundancy of

electrical power or an overload condition might occur when a power

supply is removed. Use the prtdiag command to determine if

precharge current is present before removing or installing a hot

pluggable power supply.

1. Use a #1 Phillips screwdriver to turn each quarter-turn access screw on the power supply to the unlocked position.


2. Pull the end of the extraction lever outward to release the power

supply from the centerplane.

3. Slide the power and cooling module out of the chassis.

Figure 2-1 Extracting a Power and Cooling Module (front and rear views; extraction levers toward near side)


Removing the Peripheral Power Supplies

1. Use a #1 Phillips screwdriver to unlock the quarter-turn access screws on the power supply.

2. Pull the ends of the extraction levers outward to release the power supply from the centerplane.

Figure 2-2 E5500/6500 PPS Removal

Figure 2-3 E3500 PPS/AC Removal


Removing the Auxiliary Peripheral Power Supply 1 (PPS1) From the E3500

1. Release the power supply from the system chassis by loosening

the captive screws.

2. Pull the ends of the extraction levers outward to release the power

supply from the centerplane.

3. Pull the power supply straight out.

Figure 2-4 E3500 Auxiliary Peripheral Power Supply 1 Removal


Removing the Removable Media Tray

1. E3500/4500, remove the front bezel.

a. Grasp the front bezel on both sides near the center.

b. Place your thumbs on top of the front bezel and place your

other fingers at the slight indentations under the front bezel

for leverage.

c. Pull the front bezel straight out toward you and set it aside.

2. Loosen the bottom two captive screws that secure the media tray

to the chassis tray.

Figure 2-5 E3500/E4500 Media Tray Removal

3. Use a screwdriver in the notch at the bottom center of the tray to

assist in separating the media tray from the rear slip connectors,

and pull out the tray.

E5500/6500

1. Remove the left side panel.

2. Release the device enclosure from the media tray by removing

three screws on the left side of the media tray.


3. Pull the device enclosure forward and disconnect the data and

power cables from the rear of each device.

4. After the cabling is removed, remove the device enclosure from

the media tray.

Figure 2-6 E5500/6500 Media Tray Removal


Bus Structures and Types 3


UPA Bus Architecture

The CPU/memory Board and the UPA Bus

The figure below shows the relationship between the CPU modules

and the system board. The area within the shaded box is supported by

the UPA bus.

The table below shows you the bus widths for different system

functions.

UPA and Gigaplane bus widths

UPA bus:

    Processor: 128 data + 16 ECC

    SYSIO: 64 data + 8 ECC

    FFB: 64 data

Gigaplane bus:

    256 data + 32 ECC

    41 address

[Figure: CPU modules on the system board, with the area supported by the UPA bus shaded.]


Ultra Port Architecture Features

The Ultra Port Architecture (UPA) supports the high-performance

UltraSPARC design. Sun Microsystems created this new component

interconnect bus to optimize data transfers between devices and

system boards. Designed specifically for multitasking, multiprocessing

environments, the UPA interconnect handles multiple simultaneous

requests for data transfers between processors, memory, and I/O devices.

UPA features include:

● Packet-switched bus

● High speed (1.6 Gbytes/second)

● High bandwidth

● Direct CPU to memory without crossbar switching

● Improved 3D graphics acceleration

This new high-performance architecture has a processor-to-memory

interconnect using the UPA bus. The UPA bus runs at one-half the

CPU clock rate because it is twice as wide. This enables the CPU to

load each half of the bus’s data before the next bus cycle.

To increase the data flow between the CPU and other subsystems, the

UPA uses crossbar packet switching. Packets from various subsystems,

such as memory, graphics, and I/O devices can be multiplexed. This

allows multiple transactions to occur seemingly simultaneously, with

peak transfers in excess of 1.6 Gbytes per second.
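The 1.6 Gbytes per second figure can be reproduced from the bus width and clock. The sketch below assumes the 128-bit processor data path from the bus-width table and the one-half clock ratio stated above; the 200 MHz CPU clock is a hypothetical round number chosen for illustration, not a statement about any particular CPU module.

```python
# Worked arithmetic for the UPA peak transfer rate.
# Assumptions: 128 data bits on the processor port (ECC excluded) and a UPA
# clock of one-half the CPU clock, as described in the text above.
UPA_PROCESSOR_DATA_BITS = 128


def upa_peak_gbytes(cpu_clock_mhz):
    """Peak UPA bandwidth in Gbytes per second for a given CPU clock."""
    upa_clock_mhz = cpu_clock_mhz / 2
    bytes_per_cycle = UPA_PROCESSOR_DATA_BITS // 8  # 16 bytes per UPA clock
    return bytes_per_cycle * upa_clock_mhz * 1e6 / 1e9


if __name__ == "__main__":
    # Hypothetical 200 MHz CPU: 16 bytes x 100 MHz = 1.6 Gbytes per second.
    print(f"{upa_peak_gbytes(200):.1f} Gbytes per second")
```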


SBus Architecture

SBus Features

Sun Microsystems designed the SunBus™ (SBus) to provide the SPARC™ products with a high-performance, space-efficient, and cost-effective system bus. The 25 MHz 32-bit data and address SBus specifications have been adopted by the Institute of Electrical and Electronics Engineers (IEEE) and are available to third-party developers.

SBus provides for device autoconfiguration. Installing SBus expansion

boards is easy because of an EPROM containing machine-independent

Forth code that describes the board’s function and contains a POST

that is compatible with Sun systems POST commands. The system

retrieves configuration information from the expansion boards at

power-up, thereby identifying and initializing all devices connected on

the SBus.


PCI Bus Architecture

PCI Mechanical Specifications

PCI boards have two basic form factors, standard or long length (312

mm) and short length (119-167 mm). Board edge connectors are keyed

for 3.3V signaling, 5V signaling, or universal signaling. Universal

boards are designed to fit in 3.3V or 5V connectors.

The 32-Bit, 124 pin PCI connector has 120 signal pins and 4 key pins.

The 32-Bit connector defines the system signaling as 3.3V or 5V. An

optional 64-Bit extension is built into the same connector molding

extending the number of pins to 184.

A 32-Bit PCI board identifies itself for 32-Bit transfers when it is

installed in a 32-Bit or 64-Bit connector. A 32-Bit PCI board can be

installed in either a 32-Bit or 64-Bit connector.

A 64-Bit PCI board identifies itself for 32-Bit transfers when it is

installed in a 32-Bit connector. A 64-Bit PCI board identifies itself for

64-Bit transfers when it is installed in a 64-Bit connector.
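The four board and connector combinations above reduce to one rule: the transfer width is the narrower of the two. A minimal sketch of that rule (the function name is illustrative):

```python
def pci_transfer_width(board_bits, connector_bits):
    """Width a PCI board identifies itself for in a given connector.

    Both arguments must be 32 or 64. The transfer width is the narrower
    of the board and the connector it is installed in.
    """
    for bits in (board_bits, connector_bits):
        if bits not in (32, 64):
            raise ValueError("PCI widths are 32 or 64 bits")
    return min(board_bits, connector_bits)


if __name__ == "__main__":
    for board in (32, 64):
        for connector in (32, 64):
            width = pci_transfer_width(board, connector)
            print(f"{board}-bit board in {connector}-bit connector: {width}-bit")
```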

The signals that enable 64-bit operation are REQ64 and ACK64. They

are Side A Pin-60 and Side B Pin-60 of the 32-bit connector.

PCI Electrical Specifications

The PCI specification provides for 3.3V and 5V signaling. Signaling is

determined by the motherboard. Signaling for a 3.3V PCI board is at

3.3V. Signaling for a 5V PCI board is at 5V. Signaling for a universal

PCI board is at 3.3V or 5V.
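The signaling rules can be summarized as a compatibility check. This sketch only restates the rules above; the string labels and function name are chosen for illustration.

```python
def board_fits_slot(board_keying, slot_signaling):
    """Return True if a PCI board can operate in a slot.

    board_keying is "3.3V", "5V", or "universal".
    slot_signaling is "3.3V" or "5V" and is set by the motherboard.
    """
    if slot_signaling not in ("3.3V", "5V"):
        raise ValueError("slot signaling must be 3.3V or 5V")
    if board_keying == "universal":
        return True  # universal boards signal at whatever the slot uses
    return board_keying == slot_signaling


if __name__ == "__main__":
    print(board_fits_slot("universal", "5V"))   # universal fits either slot
    print(board_fits_slot("3.3V", "5V"))        # keyed boards must match
```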

All PCI connectors require four power rails: +3.3V, +5V, +12V, and -12V. The distinction between 3.3V and 5V PCI boards is in the signaling protocol, not the connector power rails. The maximum power allowed for a PCI board is 25 Watts from all four power rails combined.


PCI Board Connections

PCI Boards are shown with the solder side up because this is the

orientation in many PCI systems.


SCSI SBus card

You will find a number of SCSI connections within the Exx00 servers. There are SBus SCSI cards, PCI SCSI cards, and each I/O board has an on-board SCSI port which is driven by a FEPS chip on the board.

Single-Ended Fast/Wide (SunSwift), part number 501-2739


SCSI PCI card

Single-Ended Ultra/Wide (SunSwift PCI), part number 501-2741

This is a PCI SCSI card. Note the driver chip. This card will provide an

Ultra-SCSI bus.


SCSI Features - Fast SCSI

Small Computer System Interface Features

The Small Computer System Interface (SCSI)-1 standard defines two modes of data transfer: asynchronous (handshaking) and synchronous (streamed). SCSI-1 synchronous transfer rates are limited to 5 Mbytes per second. In many environments this is acceptable. But in configurations with multiple high-performance devices on the bus, 5 Mbytes per second can make the bus a bottleneck.

Besides a better-defined set of required features, the SCSI-2 standard defines several optional features that have an impact on users: Fast, Wide, differential, and tagged queueing. A specific implementation can be SCSI-2-compliant, yet implement none of these four features. All current Sun Microsystems SCSI disk and CD-ROM products, as well as the tape drive devices, are compliant with SCSI-2. There are many more features in the SCSI-2 standard than these four options. This section discusses only these options, because they are the most commonly used features of SCSI-2.

Fast SCSI – Higher Bus Speed

The SCSI-2 standard defines an option known as Fast SCSI, which increases the synchronous transfer rate to 10 Mbytes per second. The terms Fast SCSI and 10-Mbyte SCSI are synonymous, and are used interchangeably. The term SCSI-2 is often incorrectly used to mean Fast SCSI.

Devices running at 10 Mbytes per second, at 5 Mbytes per second, and asynchronously can be mixed on a SCSI bus. Transfer rates are negotiated on an individual basis between the host and each device.
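Per-device negotiation means each device ends up at the fastest rate that both it and the host adapter support. The sketch below models only that outcome, not the SCSI message exchange itself; the device names and the function name are hypothetical.

```python
def negotiate(host_max_mbytes, device_max_mbytes):
    """Return the rate agreed between the host and one device.

    device_max_mbytes is None for a device that only transfers
    asynchronously. Otherwise the slower of the two parties sets the rate.
    """
    if device_max_mbytes is None:
        return "asynchronous"
    return min(host_max_mbytes, device_max_mbytes)


if __name__ == "__main__":
    host_max = 10  # a Fast SCSI host adapter
    # Hypothetical devices sharing one bus.
    devices = {"disk0": 10, "tape0": 5, "scanner0": None}
    for name, device_max in devices.items():
        print(name, negotiate(host_max, device_max))
```

Each device gets its own result, so a slow tape drive does not force a fast disk on the same bus down to 5 Mbytes per second.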

Fast SCSI requires the proper protocol chips in both the host adapter and device controller, as well as a modified software driver. Solaris 2.0 (and higher) software supports Fast SCSI.

The SPARC desktop systems developed after the SPARCstation™ 10

offer the fast SCSI host adapter on the system board. There are also

additional host adapter SBus cards available from Sun Microsystems

that support Fast SCSI.


Wide SCSI, Differential SCSI

Wide SCSI – Wider Is Better

In SCSI-1, all data transfer paths are parallel and 8 bits wide. The SCSI-2 standard defines two options that widen the bus to 16 or 32 bits. Each of these options is referred to as Wide SCSI. Most implementations of Wide SCSI are 16 bits wide and also implement the Fast option, thus yielding burst-transfer rates of 20 Mbytes per second.

Differential SCSI – Less Interference

The SCSI standard defines two types of electrical interfaces: single-ended and differential. Single-ended uses a 50-pin, high-density connector. Differential SCSI uses special hardware drivers and receivers that reference the signals to each other rather than to ground. Sun Microsystems' differential implementation uses a slightly larger, industry-standard, 68-pin, high-density connector.

There is no performance benefit to differential SCSI, but it

accommodates considerably longer SCSI bus lengths than does the

single-ended interface. Differential SCSI busses can be up to 25 meters

(82 feet) in length.

Single-ended SCSI is limited to 6 meters (19.7 feet) total bus length. In

fact, the SCSI-2 standard recommends that busses with Fast SCSI

devices be limited to 3 meters. However, with high-quality shielded

cables and proper active (regulated) bus termination, 6-meter Fast

busses are quite reliable.


SCSI Termination, Ultra-SCSI

Termination

SCSI buses need to be correctly terminated. If the bus is not terminated, signal reflections on the bus can cause SCSI transport errors. There are two types of termination: active (or regulated) and passive (or standard). Active termination is the better of the two.

Ultra-SCSI

Ultra-SCSI is also known as Fast-20. It combines the features of Fast SCSI with Wide SCSI and doubles the transfer rate to 40 Mbytes per second. Handling this rate requires the faster (33 MHz) PCI bus systems.
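The SCSI burst rates quoted in this chapter all come from the same product: bus width in bytes times transfers per second. In the sketch below the 20 million transfers per second for Ultra-SCSI follows from the Fast-20 name; the 5 and 10 figures are inferred from the quoted rates on an 8-bit bus, so treat them as assumptions.

```python
# (bus width in bits, millions of transfers per second) for each mode.
SCSI_MODES = {
    "SCSI-1 synchronous": (8, 5),
    "Fast SCSI": (8, 10),
    "Fast/Wide SCSI": (16, 10),
    "Ultra/Wide SCSI": (16, 20),
}


def burst_rate_mbytes(bus_bits, mega_transfers_per_second):
    """Burst transfer rate in Mbytes per second."""
    return (bus_bits // 8) * mega_transfers_per_second


if __name__ == "__main__":
    for mode, (bits, rate) in SCSI_MODES.items():
        print(f"{mode}: {burst_rate_mbytes(bits, rate)} Mbytes per second")
```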


SCSI Icons, cable quality

Cable Quality

The following figures assume Sun cables are being used. Ensure your customer is using these cables, or cables of a similar quality.

SCSI icons

Two icons denote single-ended and differential SCSI. One icon marks a single-ended SCSI controller or terminator; the other marks a differential SCSI controller or terminator.

[Figure: single-ended and differential SCSI icons.]


Conclusion - SCSI Cable Lengths

The signal frequency and the electrical wiring can then be used to calculate the maximum cable length. The following table shows the maximum cable length in meters (m):

Cable length

Signal freq.        Devices    Single ended    Differential

SCSI-2 Fast/wide    1-16       6.00m           25.00m

Ultra-SCSI          1-4        3.00m           12.50m

Ultra-SCSI          4-8        1.50m           6.25m


SCSI Implementation on Ex00 I/O Boards

Caution – You must include the internal cable lengths of the I/O boards and peripherals in your calculations.

Device         Internal cable length

I/O boards     0.5 m

Disk boards    1.0 m

I/O Board in Slot 1

This is a special case, since the I/O board in slot 1 drives the internal

CD-ROM and Tape drive.

Rules

E3500          4.5 m cable length supported

E4x00          4.5 m cable length supported

E5x00, 6x00    SCSI devices are not supported on slot 1 in an E6500, apart from the internal CD-ROM and tape.

I/O Boards in all other slots

All other slots support 5.5 m of cable length
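A cable budget check is just a sum compared against a limit. The sketch below uses the internal lengths from the table above and, as its default limit, the 6 m single-ended Fast/wide figure from the cable length table. The function names and the example cable runs are hypothetical.

```python
# Internal cable lengths from the table above, in meters.
INTERNAL_LENGTH_M = {"io_board": 0.5, "disk_board": 1.0}

# Single-ended SCSI-2 Fast/wide limit from the cable length table.
SINGLE_ENDED_FAST_WIDE_LIMIT_M = 6.0


def total_bus_length_m(external_cables_m, internal_parts):
    """Sum the external cables plus the internal length of each part."""
    internal = sum(INTERNAL_LENGTH_M[part] for part in internal_parts)
    return sum(external_cables_m) + internal


def within_limit(external_cables_m, internal_parts,
                 limit_m=SINGLE_ENDED_FAST_WIDE_LIMIT_M):
    """Return True if the whole bus fits inside the length limit."""
    return total_bus_length_m(external_cables_m, internal_parts) <= limit_m


if __name__ == "__main__":
    # Two hypothetical 2 m external cables on an I/O board port.
    print(total_bus_length_m([2.0, 2.0], ["io_board"]))   # 4.5
    print(within_limit([2.0, 2.0, 2.0], ["io_board"]))    # False: 6.5 m
```

With the 0.5 m internal length of an I/O board subtracted from the 6 m limit, 5.5 m remains for external cabling, which matches the figure quoted for all other slots.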


Fibre Channel Interface

Fibre Channel

SCSI is by far the most common peripheral interconnect today, although others are in common use. The primary disk interconnect used by Sun today is Fibre Channel (FC), an ANSI standard (ANSI X3T9.3) that defines a SCSI-like command set carried over a fiber optic connection instead of copper wires. Sun's SPARCstorage Array uses a Fibre Channel connection to carry standard SCSI-2 commands and data. Although Fibre Channel is an ANSI standard, it has been brought under the SCSI-3 umbrella. Future FC standards will be generated as a subset of the SCSI-3 specification, which includes a bewildering variety of options for command sets, interconnect media, and interoperability.

Fibre Channel Topologies

The familiar SCSI-2 really has only one or two ways to connect: a tree

of peripherals is connected to a host. Alternatively, the peripheral tree

is connected to two hosts via some sort of multi-initiator arrangement.

Fibre Channel has three very different topology options:

● point-to-point, in which a device connects to exactly one other

device;

● arbitrated loop (normally abbreviated FC-AL), in which the

peripherals and one or more hosts are connected together in a ring

topology using many point-to-point links;

FC-AL is architecturally similar to a full-duplex FDDI;

● fabric, in which switches and hubs are used to create an arbitrarily

complex network, possibly including multiple paths from a host to

a peripheral.

These topologies are shown overleaf.


Topologies


World Wide Names (WWN)

Fibre Channel devices use a flat, universal addressing structure in

which every device is assigned a unique address, known as the world

wide name (WWN). The WWN must be unique in the FC topology;

because Fibre Channel domains can potentially be connected into

arbitrary fabrics, the usual practice is to assign completely unique

WWNs to devices, in much the same way that Ethernet addresses are

assigned uniquely.
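Because every WWN must be unique within the topology, a quick duplicate check over a list of collected WWNs can catch a clash before devices are connected. This is a minimal sketch: the WWN values shown are made up for illustration, and how you gather the list is outside its scope.

```python
from collections import Counter


def duplicate_wwns(wwns):
    """Return the WWNs that appear more than once, in normalized form.

    Colons and letter case are ignored, so the same WWN written two
    different ways still counts as a duplicate.
    """
    normalized = [wwn.replace(":", "").lower() for wwn in wwns]
    counts = Counter(normalized)
    return sorted(wwn for wwn, seen in counts.items() if seen > 1)


if __name__ == "__main__":
    # Hypothetical WWNs; the first two are the same value written differently.
    collected = [
        "20:00:00:20:37:0a:0b:0c",
        "20:00:00:20:37:0A:0B:0C",
        "20:00:00:20:37:0a:0b:0d",
    ]
    print(duplicate_wwns(collected))
```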

The SPARCstorage Array uses the simplest of these options, a point-to-

point link that connects a disk array controller to one or two hosts. The

controller connects to a host via a point-to-point link using a two-

strand fiber cable. Fibre Channel is a full-duplex medium, requiring a

strand for each direction. The SPARCstorage Array can be connected

to two hosts through the simple expedient of having two

(independent) FC interfaces. Expanding the point-to-point mechanism

into a more complex network is impossible without resorting to hubs

and switches and the use of a fabric.

Fibre Channel Transfer specifications

The FC standard defines several classes of signal, corresponding to

different capabilities when combined with actual fiber connectors.

Each signal type uses a different type of laser, so the varieties are not

interchangeable. The classes are normally described in terms of their

data speed: 25 MB/sec, 50 MB/sec, and 100 MB/sec. Because FC is

a full-duplex standard, transferring between two devices at double

these speeds is theoretically possible, although in practice few devices

are capable of handling this much data. Although Sun has fielded over

20,000 SPARCstorage Arrays using FC-25, the industry as a whole

deferred acceptance of Fibre Channel until the arrival of FC-100 parts.

The market seems to have bypassed FC-50 completely. A few vendors

are now delivering products capable of FC-100 interoperability, but

little volume has been achieved to date (mid 1996). However, every

major storage vendor is planning FC-100 products in late 1996 or early

1997, and a safe bet is that high-end storage will be dominated by FC-

100 products by 1998.


Fibre Channel Distance Capability

One of the most useful capabilities of the FC medium is that its lasers

are capable of transmitting signals reliably over distances that are far

in excess of those attainable using standard copper SCSI technology.

Whereas SCSI-2 is limited to six meters in single-ended

implementations and 25 meters using differential transceivers, Fibre

Channel uses 50 micron multimode fiber capable of 2 km transmission

distance, although Sun itself offers cable lengths only up to 15 meters.

The FC standard permits distances up to 10 km.

Fibre Channel also makes it possible to

disperse storage geographically across much wider

distances than with other technologies.

With a practical cabling distance of several kilometers, it is possible to

mirror data onto two different disk arrays located on opposite ends of

a campus, or even nearby in a metropolitan area. Because the FC

connection operates at full disk subsystem speed, disaster recovery can

be simplified without loss of performance. This capability is similar to

that offered by a few mainframe disk vendors, with one major

exception: the FC operates at full FC speeds with negligible

transmission latency, whereas the wide-area disk mirroring available

on some mainframe storage units is subject to significant delays due to

wide-area networking latency. For bandwidth-sensitive applications,


Fibre Channel Cable

As its name implies, the fibre channel devices use a glass fibre instead

of a copper wire to carry the signal from the source to the destination.

The glass fibre shown in Figure 3-1 is about the thickness of three

sheets of paper.

Figure 3-1 Cross Section of a Fibre Optic Cable

The jacket on the fibre-cable provides something a connecting device

can bond with because the glass fibre is too thin and fragile for direct

access. The connector ends of the cable are precision manufactured to

guide the end of the glass fiber so it matches up exactly with the

transceiver port. If the glass fibre is not aligned perfectly with the laser

LED, the light does not pass along the cable.

Caution – Handle fibre cable carefully. Do not bend it more tightly

than its minimum bend radius.

[Figure 3-1 labels: buffer coating; 125 micron cladding; 62.5 micron core of pure glass fibre]


Fibre Channel Interface - FC/OM and GBIC

The jacket helps prevent the cable from being bent or kinked. Any

damage to the glass causes a loss of signal. If the cable is bent sharply,

the laser beam will not go around the corner. If the cable is cracked or

crushed, the laser beam bounces back because it cannot pass through.

Figure 3-2 FC/OM and GBIC Optical Cable and Connector

The fibre channel optical module (FC/OM, predecessor to the GBIC)

and GBIC fibre cable plug and module connectors are keyed so they

can connect together only one way. Always observe the two pieces and

ensure they are properly aligned before connecting them.

Dual Porting

Fibre-channel allows disk drives and arrays to be dual ported. This

gives a great RAS advantage; alternate pathing or dynamic multi-pathing (DMP) software can be installed which protects the storage

from a failing I/O path.

Dual porting has implications for device addressing, which we shall

look at in chapter 5.


CPU/Memory and Clock Boards 4


Sun Enterprise 3x00/4x00/5x00/6x00 CPU/Memory Boards

CPU/Memory+ board block diagram showing the major component groups and the

interconnecting buses.

The CPU/Memory+ board includes:

● An Address Controller (AC+)

● Eight Data Controllers (DC+s)

● A Bootbus Controller, also known as the fhc

● Onboard devices (including a Flash PROM and SRAM)

● Two UPA bus CPU processor slots


CPU/Memory Board - Overview

CPU/Memory Board Component Layout.

Note the plastic cover over the address and data controllers. It is there

to prevent the heatsinks from being knocked loose during board insertion or

removal.

Loose heatsinks are a frequent cause of unreliability in the field.

If you find a loose heatsink in the field, replace the board.

The older boards do not have this cover. Be especially careful with

these boards.


CPU/Memory Board - Physical

The 501-2976 supports 2 MB cache modules and runs at 83 MHz.

The 501-4312 supports 8 MB cache modules and runs at 83 MHz.

The 501-4882 supports 8 MB cache modules and runs at 100 MHz.


Memory DIMMs

Each CPU/Memory+ board has 16 DIMM sockets, which are divided

into two banks of 8 DIMMs each.

Bank 0 and bank 1 DIMMs occupy alternate slot locations; bank 0

DIMMs are in the even numbered slots, and bank 1 DIMMs are in odd

numbered slots.

Memory DIMMs come in sizes ranging from 8 MBytes to 128 MBytes

each.

Memory must be installed in a complete bank of eight DIMMs with

each DIMM being the same size, type, and speed. Bank 0 can contain

different size DIMMs than bank 1.

UPA Ports

Proc 0 is assigned the first port number associated with the slot, proc 1

the second.
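As a rough illustration of how the two port numbers for a slot can be worked out, the sketch below assumes the usual Enterprise convention that the board in slot n owns UPA port numbers 2n and 2n+1. That numbering is not stated in this section, so treat it as an assumption and confirm it against the FE Handbook; the function name is hypothetical.

```python
def upa_ports_for_slot(slot: int) -> tuple[int, int]:
    """Return (first, second) UPA port numbers for a board slot.

    ASSUMPTION (not from this section): slot n owns ports 2n and 2n+1.
    """
    if slot < 0:
        raise ValueError("slot must be non-negative")
    return 2 * slot, 2 * slot + 1


first, second = upa_ports_for_slot(3)
print(f"slot 3: proc 0 -> UPA port {first}, proc 1 -> UPA port {second}")
```

Under that assumption, proc 0 on the board in slot 3 would appear as CPU 6 and proc 1 as CPU 7.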

DC-DC converters

These ensure that the CPU modules get the voltage they

require. You do not necessarily have to upgrade a CPU/Memory board

if you upgrade the CPU module.


CPU Modules

Processing power on each CPU/Memory+ board is provided by one or

two UltraSPARC II CPU modules, with one to four Mbytes of local

high-speed external cache memory. Supported modules are as listed

below.

167 MHz, 0.5/1.0 MB Ecache

250 MHz, 1.0/4.0 MB Ecache

336 MHz, 4.0 MB Ecache

400 MHz, 4.0/8.0 MB Ecache

Figure 4-1 UltraSPARC II CPU Module

A CPU/Memory+ board is not required to contain an UltraSPARC II

processor module and can operate as a memory-only board.

[Figure 4-1 labels: 288-pin connector and 144-pin connector, each with screws]


400 MHz, 8 MB Ecache processor modules

When trying to install Solaris 2.5.1 HW 11/97 or 2.6 HW 3/98 on a

Ex000 server with a 400MHz/8MB cache CPU module, booting from

CD-ROM or network install server gives the error message:

Fast Data Access MMU Miss error

or panics with:

mutex_enter: bad mutex.

This is because there is no support for the 8MB cache without the

following patches. The procedure is as follows.

NOTE: This procedure requires downloading and applying patches so

the install client must have a network connection.

1. Verify the OBP version by typing at the ok prompt:

ok .version

Or check at the UNIX prompt.

# /usr/sbin/prtconf -V

If needed, upgrade to at least flash PROM version 3.2.21 using patch

103346-22 or greater.

2. ok setenv auto-boot? false

3. ok reset

4. ok limit-ecache-size

5. ok boot cdrom (at least 2.5.1 HW 11/97 or 2.6 HW 3/98)

6. Install the OS but do not allow auto-reboot!

7. # init 0

8. ok reset (usually not needed with 2.6)


9. ok limit-ecache-size

10. ok boot

11. Make sure you have a network connection, FTP to

sunsolve.sun.com, and get the latest kernel patch (minimum levels to

support the 400 MHz/8 MB cache are listed):

Solaris 2.5.1 --> 103640-27 and prtdiag patch 104595-08.

Solaris 2.6 --> 105181-14

12. Change run level to single-user mode using init S.

13. Install patches from ftp download directory.

14. Reboot.
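The version checks in this procedure are easy to get wrong by eye, because a revision such as 3.2.9 is older than 3.2.21 even though it sorts later as text. The following is a minimal sketch of the comparisons, using the minimum levels quoted above. The function names are illustrative, and the sketch assumes a patch is only comparable with another revision of the same base patch ID; it does not know about patches that have been superseded by a different ID.

```python
# Minimum kernel patch levels quoted in the procedure above.
MINIMUM_KERNEL_PATCH = {"2.5.1": "103640-27", "2.6": "105181-14"}
MINIMUM_OBP = "3.2.21"


def parse_patch(patch_id: str) -> tuple[int, int]:
    """Split a patch ID such as '103640-27' into (base, revision)."""
    base, rev = patch_id.split("-")
    return int(base), int(rev)


def patch_meets_minimum(installed: str, minimum: str) -> bool:
    """True if 'installed' is the same base patch at an equal or later revision."""
    inst_base, inst_rev = parse_patch(installed)
    min_base, min_rev = parse_patch(minimum)
    return inst_base == min_base and inst_rev >= min_rev


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '3.2.21' into (3, 2, 21) so fields compare numerically."""
    return tuple(int(part) for part in version.split("."))


def obp_meets_minimum(version: str, minimum: str = MINIMUM_OBP) -> bool:
    """True if the flash PROM version is at least the quoted minimum."""
    return version_tuple(version) >= version_tuple(minimum)


print(patch_meets_minimum("103640-31", MINIMUM_KERNEL_PATCH["2.5.1"]))
print(obp_meets_minimum("3.2.9"))
```

The installed values passed in the two print calls are made-up examples; on a real system they would come from the OBP banner and the installed patch list.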


CPU Module Handling Precautions

Use the following precautions when handling UltraSPARC II modules:

Caution – Do not handle the modules by touching the gold pins on the

compression connectors. The natural oils on your hands cause these

connectors to oxidize and corrode over a period of time. Corroded

connector pins cause the module to fail, requiring you to replace the

module again.

Caution – Handle the UltraSPARC modules by the edges only. Do not

handle them by the heatsinks because they can break easily.

Warning – The heatsinks attached to the UltraSPARC processor chip

can get very hot. Avoid touching the heatsink because you can get a

severe burn. You could damage the module if you drop it.


Removing and Replacing a CPU Module

Use a 3/32 hex-driver to loosen all screws on each of the compression

connectors on the module to be removed (three screws for the 288-pin

connector, two screws for the 144-pin connector).

Lift the module straight up, off the board mating surface and the

single standoff that positions the module on the board.

Figure 4-2 Removing a CPU Module


Removing and Replacing a CPU Module

Each module is located on the main board with a single standoff and is

connected to the main board by two spring loaded connectors. The

pins within the connectors are compressed to the corresponding

board’s mating surfaces by a compression bar which, when secured

with screws, connects the module connector pins to the board’s

corresponding connector surface.

To ensure that the connectors are correctly aligned, you must align the

post on the MLB with the corresponding hole in the module. When

you have the post and hole aligned, you can insert the five hex-

socketed screws and finger tighten them. Now you must torque the

screws, in the order described below to six inch-pounds using the

torque-driver (Sun part number 560-2324) supplied with the system.

Ignore the reference to Method B. The torque sequence has gone

through a number of changes.

Take up the slack on each screw, then go

around the screws in the order shown

putting a 1/4 turn on each screw.

Each screw should reach the correct torque

setting at the same time.

FOLLOW THIS PROCEDURE. IT IS

IMPORTANT.

DO NOT MAKE UP YOUR OWN

SEQUENCE.

DO NOT RUSH THIS PROCEDURE.


Memory Interleaving

Enterprise servers allow up to 16-way interleaving. There is an OBP

parameter which sets up interleaving.

memory-interleave

min disables interleaving, max sets interleaving to the maximum

possible factor. How you populate memory will have a major effect on

system performance. The rules are below.

Note – You must set memory-interleave=min to allow dynamic

reconfiguration of CPU/Memory boards

Memory Configuration Rules

The following rules apply to configuring the system's memory:

● DIMMs are 72-pin.

● Eight DIMMs form a bank.

● All DIMMs in a bank must have the same capacity.

● The first bank of memory can be either Bank 0 or Bank 1.

● Many smaller banks give better performance than fewer, larger

banks.

● Install one bank on each CPU/Memory board before installing the

second bank on any board.

● Install the largest density banks (128MB DIMMs) first, then

medium density banks (32MB DIMMs), and finally the smallest

density banks (8MB DIMMs).

All DIMMs in a bank should have the same speed rating. If DIMMs of

different speeds are mixed in a bank, the bank will function, but at the

lowest speed.
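The bank rules above lend themselves to a simple check. The sketch below covers only the rules that can be tested from a list of DIMM capacities: eight DIMMs per bank, equal capacity within a bank, and largest density installed first. The function names are hypothetical and speed ratings are not modelled.

```python
BANK_SIZE = 8  # DIMMs per bank, per the rules above


def bank_problems(dimm_sizes_mb: list[int]) -> list[str]:
    """Return a list of rule violations for one bank (empty list if none)."""
    problems = []
    if len(dimm_sizes_mb) != BANK_SIZE:
        problems.append(
            f"bank needs {BANK_SIZE} DIMMs, found {len(dimm_sizes_mb)}")
    if len(set(dimm_sizes_mb)) > 1:
        problems.append("all DIMMs in a bank must have the same capacity")
    return problems


def install_order(bank_densities_mb: list[int]) -> list[int]:
    """Order banks for installation: largest DIMM density first."""
    return sorted(bank_densities_mb, reverse=True)


print(bank_problems([32] * 8))          # a valid bank
print(bank_problems([32] * 7 + [8]))    # mixed capacities
print(install_order([8, 128, 32]))
```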


CPU/Memory Board Status Indicators

CPU/Memory+ boards have three LEDs indicating the status of that

board.

With the advent of dynamic reconfiguration (DR), the meaning of the

amber service LED has changed.

Before DR, the only time a board had an amber light on was when it

had failed POST. The correct meaning of a lit amber light, as

highlighted below, is that the board is in low power mode. Either it has a

fault or it has been DR’d out.

Table 4-1

Power  Service  Running  Condition
Off    Off      Off      Board has no electrical power
Off    On       Off      Board is in low power mode, can be unplugged
Off    Off      On       Undefined
Off    On       On       Undefined
On     Off      Off      System is hung, either in POST/OpenBoot or in the operating system
On     Off      On       Hung in OS
On     On       Off      Hung in POST/OBP or hung in OS and has failed component on board
On     On       On       Hung in POST/OBP or hung in OS and has failed component on board
On     Off      Flash    OS running
On     On       Flash    OS running and failed component on board
On     Flash    Off      Slow flash = POST. Fast flash = OBP.
On     Flash    On       Undefined
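Table 4-1 is effectively a lookup keyed on the three LED states. A minimal sketch of that lookup follows; the dictionary simply transcribes the table, and the function name and the lower-case state strings are illustrative choices rather than anything defined by the system.

```python
HUNG_WITH_FAULT = ("Hung in POST/OBP or hung in OS and has failed "
                   "component on board")

# (power, service, running) -> condition, transcribed from Table 4-1.
BOARD_LED_STATES = {
    ("off", "off", "off"): "Board has no electrical power",
    ("off", "on", "off"): "Board is in low power mode, can be unplugged",
    ("off", "off", "on"): "Undefined",
    ("off", "on", "on"): "Undefined",
    ("on", "off", "off"): ("System is hung, either in POST/OpenBoot "
                           "or in the operating system"),
    ("on", "off", "on"): "Hung in OS",
    ("on", "on", "off"): HUNG_WITH_FAULT,
    ("on", "on", "on"): HUNG_WITH_FAULT,
    ("on", "off", "flash"): "OS running",
    ("on", "on", "flash"): "OS running and failed component on board",
    ("on", "flash", "off"): "Slow flash = POST. Fast flash = OBP.",
    ("on", "flash", "on"): "Undefined",
}


def decode_board_leds(power: str, service: str, running: str) -> str:
    """Look up the condition for one combination of LED states."""
    key = (power.lower(), service.lower(), running.lower())
    return BOARD_LED_STATES.get(key, "Combination not listed in Table 4-1")


print(decode_board_leds("Off", "On", "Off"))
```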


CPU/Memory Board Status Indicators

The General Rules

The following lists the general LED condition rules for the

CPU/Memory+ boards:

● If no LEDs are lit, there is no electrical power to the board.

● If the green Power and Running LEDs are not lit, and only the

amber light is lit, the board is ready for removal.

● If no LEDs are flashing, the system is hung or in the process of

booting up.

● It used to be the case that the board required service if the amber

Service LED was lit continuously (not flashing).

The amber light is not a fault light; it is a low power indicator.

There may well be a fault, or equally the board may have been

dynamically reconfigured out of operation.

● It is a normal condition for the Service LED to flash during POST

testing.


Clock+ Board Introduction

There are, at the time of writing, four different clock boards. The main

difference between them is the clock ratio.

501-2975 provides a 1:2 clock ratio.

501-4286 supports 1:2 and 1:3 clock ratios.

501-4946 supports 1:2, 1:3, and 1:4 clock ratios.

501-5365 supports 1:2, 1:3, 1:4, 1:5, and 1:6 clock ratios.

These ratios are used to derive the Gigaplane frequency. The maximum speed is 100 MHz.

So, for example, for 400 MHz processors you would need a 501-4946, whose 1:4 ratio gives a 100 MHz Gigaplane.

Note – Full details of which clock board is used alongside which

processor module are provided in the FE Handbook.
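The relationship between processor speed, clock ratio, and clock board can be sketched as follows. The part numbers and ratios are those listed above. The selection rule is an assumption made for illustration: choose the smallest divisor that keeps the Gigaplane at or below 100 MHz, and report the first board in the list that offers it. The FE Handbook remains the authority on supported combinations.

```python
# Clock boards and the CPU-to-Gigaplane divisors they support (1:2 -> 2, etc.).
CLOCK_BOARDS = [
    ("501-2975", (2,)),
    ("501-4286", (2, 3)),
    ("501-4946", (2, 3, 4)),
    ("501-5365", (2, 3, 4, 5, 6)),
]
MAX_GIGAPLANE_MHZ = 100


def pick_divisor(cpu_mhz: float, divisors: tuple[int, ...]):
    """Return the smallest divisor giving a Gigaplane speed within the limit."""
    for divisor in sorted(divisors):
        if cpu_mhz / divisor <= MAX_GIGAPLANE_MHZ:
            return divisor
    return None


def first_suitable_board(cpu_mhz: float):
    """Return (part number, divisor, Gigaplane MHz) or None if nothing fits."""
    for part, divisors in CLOCK_BOARDS:
        divisor = pick_divisor(cpu_mhz, divisors)
        if divisor is not None:
            return part, divisor, cpu_mhz / divisor
    return None


for mhz in (167, 250, 336, 400):
    print(mhz, first_suitable_board(mhz))
```

For 400 MHz the sketch reports 501-4946 at 1:4, which agrees with the example in the text.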


Clock+ Board Block Diagram

The Clock+ board block diagram below shows a high level view of the

functionality of the Clock+ board.

Clock+ Board Block Diagram

The Clock+ board consists of the following subsystems:

● Console Bus

● Clocks

● Reset logic

● JTAG logic and interface port for factory testing only

● Centerplane connector signals monitoring

[Block diagram labels: LEDs; serial ports; keyboard/mouse; Console; Clocks; Reset; JTAG; clock frequency; led [2.0]; console bus; clock bus; reset bus; JTAG bus; centerplane connector; reset button; reset button (XIR)]


Clock+ Board - Physical

Backpanel and Connectors


Clock+ Board Console Bus

Note – The console bus carries information such as environmental

data and POST results between boards. It is a ‘back door’ path between

boards.


Clock+ Board Console Bus

Console Bus

The Console Bus provides CPU/Memory+ boards access to global

system control and status as well as to the keyboard, mouse, and serial

ports. In addition, there is a NVRAM/time of day (TOD) chip that

maintains the date and time and 8 Kbytes of data when the power to

the system is shut off.

The state of physical hardware conditions is maintained in registers on

the Clock+ board. Each of these registers has inputs generated from

other subsystems on the Clock+ board, from other boards, or from the

power supplies in the system. Some Clock+ board registers are

reserved for controlling various states of the machine.

The Clock+ board allows you to connect an ASCII terminal to the

serial port and a Sun keyboard and mouse to the keyboard port. This

allows you to interface to the local system console. The serial port

allows POST messages to be displayed to a local ASCII terminal.

You can configure the serial port for standard serial devices, such as

modems and printers.

Clocks

The clock subsystem generates the clocks for the entire system. The

base clock is synthesized and then divided into various frequencies.

These clock signals are then distributed to the centerplane by an array

of driver chips. Two clocks for processor slots and one system timing

clock go to each of the board slots on the centerplane.

Clock synthesizer and drivers. The clock synthesizer generates the

base clock signal, which is divided into several different signals by the

clock divider. These clocks are then distributed to the centerplane by

the clock drivers.


Clock+ Board - Overview

Reset Logic

Generates and sends reset commands to all system boards when either

an XIR or POR reset signal is received.

TOD/NVRAM

Centralized Time-of-day (TOD) chip that includes NVRAM. You can

copy the contents to each I/O board in the system for redundancy and

backup.

Serial, keyboard and mouse ports

There are two tty connections, along with the kbd/mouse.

JTAG

There is a JTAG (Joint Test Action Group) connection between the

system ASICs and the Clock board. POST information is passed

around the system via JTAG. There is a further connection on the clock

board which is blanked off and used for factory testing only.


Clock+ Board Reset Logic

There are four circuits that control system reset and error state.

● Manual Reset

● System Reset

● System Error Reset

● Externally initiated reset (XIR)

We can initiate resets in a number of ways:

● Power the machine off & on. This is the Power-on Reset POR

● Type reset at the ok prompt. This is a software reset SOR

● Use the Reset Buttons on the clock board. The button labelled POR

will initiate a power-on reset.

The button labelled XIR will run an externally initiated reset (See

below)

● We can use the remote console commands

Remote Console Commands

The remote console feature is a very basic method of controlling the

Exx00 servers. A customer may send reset commands to the servers

via the ttya port. The system is constantly monitoring ttya for the

commands listed below.

CR CR ~ CNTL SHFT P    Power cycle reset
CR CR ~ CNTL SHFT R    Software reset
CR CR ~ CNTL SHFT X    XIR

On receiving the key sequences on ttya, the system will initiate the

appropriate reset.


Clock+ Board - XIR resets

Note – The secure position of the keyswitch disables the remote

console.

Enter remote console characters with a 0.5 to 5 second delay.
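To make the remote console key sequences and their pacing concrete, the sketch below builds the characters for each reset and pairs each one with a delay. It only constructs the data and sends nothing. It assumes that CNTL SHFT plus a letter produces the ordinary ASCII control code for that letter, which is how most terminals behave but is not confirmed by this section; all names are illustrative.

```python
# Final letter of each remote console sequence, from the list above.
RESET_LETTERS = {"power_cycle": "P", "software": "R", "xir": "X"}
MIN_DELAY_S, MAX_DELAY_S = 0.5, 5.0  # allowed gap between characters


def control_code(letter: str) -> bytes:
    """ASSUMPTION: CNTL SHFT <letter> sends the ASCII control code."""
    return bytes([ord(letter.upper()) & 0x1F])


def remote_console_sequence(reset: str) -> list[bytes]:
    """Return the characters for one reset: CR, CR, ~, control code."""
    return [b"\r", b"\r", b"~", control_code(RESET_LETTERS[reset])]


def paced(sequence: list[bytes], delay_s: float = 1.0):
    """Pair each character with the delay to wait before sending it."""
    if not MIN_DELAY_S <= delay_s <= MAX_DELAY_S:
        raise ValueError("delay must be between 0.5 and 5 seconds")
    return [(char, delay_s) for char in sequence]


for name in RESET_LETTERS:
    print(name, remote_console_sequence(name))
```

Remember that the sequences have no effect when the keyswitch is in the secure position.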

Externally Initiated Reset XIR

This is a useful reset to use if you are resetting a hung

machine. When an XIR occurs, memory is cleared and a

“snapshot” of the CPU registers and processes is saved.

To view this snapshot of CPU registers, you must be at the ok

prompt. Type

ok .xir-state-all

This displays information similar to the following:

CPU ID#1
TL=1 TT=3
TPC=e0028688 TnPC=e0028688 TSTATE=9900001e06
CPU ID#5
TL=1 TT=3
TPC=e002755c TnPC=e0027560 TSTATE=4477001e03

Note – It is outside the scope of this course to go into decoding the XIR

log reports. An XIR does not override the NVRAM auto-boot? variable.

You can initiate an XIR either by using the XIR button on the

Clock+ board or the remote console XIR sequence.


Clock+ Board Status Indicators

LED States

Note – The Clock+ Board LEDs display the same information as the

system LEDs.

This has led people in the past to assume that the clock board has a

fault on it.

Always check for other fault conditions before assuming a clock board

fault.

Table 4-2 Clock+ Board LED States

Power  Service   Cycling   Condition
Off    Off       Off       No power
Off    On        Off       Failure mode
Off    Off       On        Failure mode
Off    On        On        Failure mode
On     Off       Off       Hung in POST/OBP or OS
On     Off       On        Hung in OS
On     On        Off       Hung in POST/OBP, or hung in OS / failed component
On     On        On        Hung in POST/OBP, or hung in OS / failed component
On     Off       Flashing  OS running normally
On     On        Flashing  OS running / failed component
On     Flashing  Off       Slow flash = POST. Fast flash = OBP.
On     Flashing  On        OS or OBP error


I/O Boards 5


I/O Boards

Types of I/O Boards:

The enterprise systems support the five types of I/O boards identified

as follows.

● Type 1 – SBus I/O board with FC-OM Fibre Channel

● Type 4 – SBus+ I/O board with FC-AL Fibre Channel

● Type 2 – Graphics I/O board with FC-OM Fibre Channel

● Type 5 – Graphics+ I/O board with FC-AL Fibre Channel

● Type 3 – PCI+ I/O board

The + denotes boards capable of connecting to the 100MHz Gigaplane

bus in the X500 series. Each board has three LEDs that provide board

status codes.

I/O Addressing

It is essential that you fully understand how disk subsystems,

networks, SBus cards, and PCI cards are addressed.

If your customer has errors on the database /engineering/parts, you

need to find where this partition is mounted.

If your customer tells you hme4 is faulty, where do you start?


I/O Addressing

We will be going through many examples of I/O addresses.

Physical paths are derived using UPA port numbers and device driver

names.

These are the most common driver names that may appear in a device

path.

fas - driver for fast/wide SCSI FEPS controllers

hme - driver for Fast Ethernet

isp - driver for differential SCSI controllers and the SunSwift card

sf - driver for soc+ or socal Fiber Channel Arbitrated Loop (FC-AL)

soc - driver for SPARC Storage Array (SSA) controllers

socal - driver for serial optical controllers for FCAL (soc+)

pln - SPARCstorage Array Nexus Driver
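When reading a physical path, it helps to pick out the driver names mechanically. The sketch below splits a path into components, drops each unit address and any vendor prefix such as SUNW, and looks the remaining name up in the driver list above. The example path is illustrative only and is not taken from a particular system; the function name is hypothetical.

```python
# Driver names and descriptions, transcribed from the list above.
DRIVERS = {
    "fas": "fast/wide SCSI FEPS controllers",
    "hme": "Fast Ethernet",
    "isp": "differential SCSI controllers and the SunSwift card",
    "sf": "soc+ or socal Fibre Channel Arbitrated Loop (FC-AL)",
    "soc": "SPARCstorage Array (SSA) controllers",
    "socal": "serial optical controllers for FC-AL (soc+)",
    "pln": "SPARCstorage Array nexus driver",
}


def drivers_in_path(path: str) -> list[tuple[str, str]]:
    """Return (driver, description) for each known driver named in a path."""
    found = []
    for component in path.strip("/").split("/"):
        name = component.split("@", 1)[0]   # drop the unit address
        name = name.split(",", 1)[-1]       # drop a vendor prefix like "SUNW,"
        if name in DRIVERS:
            found.append((name, DRIVERS[name]))
    return found


example = "/sbus@3,0/SUNW,fas@3,8800000/sd@0,0"  # illustrative path
print(drivers_in_path(example))
```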

System Slot 1

Slot 1 in an Enterprise server will always have an I/O board installed,

because its on-board SCSI FEPS chip drives the internal CD-ROM

and tape drive.


SBus I/O Boards

Block diagram of the SBus I/O board showing two SBuses connecting

the components and SBus card slots.

Onboard devices include a Flash PROM, SRAM, and environmental

sensors.


SBus I/O Board – Type 1

The Type 1 was the original 83 MHz SBus I/O board.

The SBus I/O board provides the following interface connections:

● Two SBus channels for three SBus slots

● SunFastEthernet

● Fast/wide SCSI-II

● Two OLC sockets for FC/OM (Fibre Channel – Optical Module)

interface converter modules

Part Numbers 501-2977, 501-4287 (83 MHz)

[Figure labels: SBus 0, SBus 1, SBus 2]


SBus + I/O Board – Type 4

A Type 4 I/O board is the newer 100 MHz SBus I/O board, which

differs from a Type 1 in its on-board serial optical controller.

The SBus+ I/O board provides the following interface connections:

● Two SBus channels for three SBus slots

● SunFastEthernet

● Fast/wide SCSI-II

● Two FC-AL sockets for hot-pluggable gigabit interface converter

(GBIC) modules

Part Numbers 501-4266 (83 MHz), 501-4883 (83, 90, 100MHz)

[Figure labels: SBus 0, SBus 1, SBus 2]


SBus I/O Board – Type 1, Physical layout

This is the original dual SYSIO board. Type 1 boards have an on-board

SOC chip, which drives two on-board Fibre channel optical modules

(FC-OM). These are otherwise known as optical link controllers (OLC).

The on-board FC-OMs are used to drive a SPARCstorage Array. You

may drive 2km of fibre cable from these boards.

Note the connector layout. pln@a is on the right and pln@b is on the

left.


SBus + I/O Board – Type 4, Physical layout

A Type 4 board has an on-board SOC+, otherwise known as the socal

(SOC arbitrated loop).

The SOC+ drives two on-board GBICs, which are used to drive the

A500 disk systems. You may drive 500m of fibre cable from these

boards.

The GBIC on the right is addressed as sf@0, the one on the left is

addressed as sf@1


Graphics I/O Boards

The Graphics(+) I/O board is similar to the SBus(+) I/O board, with the

following differences:

● The Graphics I/O boards (Type 2 and Type 5) have one SBus

implemented with one SYSIO chip with two SBus card slots.

● The Graphics I/O board has one UPA port number assigned to the

SYSIO chip, and one UPA port for a fast-frame buffer.


Graphics I/O Board – Type 2

The Graphics I/O board shown below provides you with the SBus you

need and a UPA interface for those systems on which you need to

install a monitor.

The Graphics I/O board provides the following interface connections:

● One SBus channel, for two SBus slots

● One UPA slot for Creator and Creator3D graphics cards

● SunFastEthernet

● Fast/wide SCSI-II

● Two OLC sockets for FC/OM interface converter modules

Part Numbers 501-2749, 501-4288 (83 MHz)

[Figure labels: SBus 0, SBus 1, UPA Bus]


Graphics+ I/O Board – Type 5

The Graphics+ I/O board shown below is the 100 MHz “+” version of

the Type 2 board.

The Graphics+ I/O board provides the following interface connections:

● One SBus channel, for two SBus slots

● One UPA slot for Creator and Creator3D graphics cards

● SunFastEthernet

● Fast/wide SCSI-II

● Two FC-AL sockets for hot-pluggable gigabit interface converter

(GBIC) modules

Part Number 501-4884 (83, 90, 100MHz)

[Figure labels: SBus 0, SBus 1, UPA Bus]


Graphics I/O Board – Type 2, Physical layout

The difference from a Type 1 is that both sbus0 and sbus2 are driven

from one SYSIO chip, which takes the second UPA port number for the

board.

The first UPA port number is assigned to the Creator 3d graphics card.


Graphics+ I/O Board – Type 5, Physical layout

The difference from a Type 4 is that both sbus0 and sbus2 are driven

from one SYSIO chip, which takes the second UPA port number for the

board.

The first UPA port number is assigned to the Creator 3d graphics card.


PCI+ I/O Board – Type 3

The PCI+ I/O board provides the following interface connections:

● There are risers for 32- or 64-bit cards, 33- or 66-MHz cards, and

3.3- or 5-volt cards. The riser must match the specification of the

PCI card used

● One on-board 10/100-Mb-per-second Ethernet port (twisted pair)

● Ultra SCSI


PCI+ I/O Board – Type 3

The diagram of the PCI interface board shown below has two PCI

interface connectors to which you must connect a riser for the specific

type of PCI card you are installing.

The PCI+ I/O board provides the following interface connections:

● Four PCI bus channels for two configurable interface riser card

slots

● SunFastEthernet

● On-board SCSI implemented by an ISP 1040 controller, which

gives an Ultra SCSI connection.

Note: Ultra SCSI transfer rates are not supported as of 6/98, and

should be disabled.

Refer to PCI I/O Product Note, 805-3364-10 of September 1997.

Part Numbers 501-4325 (83 MHz), 501-4926 (100MHz)

[Figure labels: PCI Bus 0, PCI Bus 1]


PCI+ I/O Board – Type 3, Physical layout

Type 3 boards have PSYCHO chips instead of SYSIO chips.

PCI0 on the right takes the first UPA port number.

PCI1 on the left takes the second UPA port number.


PCI+ I/O Board – Type 3 Port Definitions

/pci@x,4000/SUNW,hme@1,1

is the device path (or physical name) for the onboard Fast Ethernet port on a PCI I/O board. This port is controlled by the PCI 0 Psycho chip on the board.

/pci@y,4000/SUNW,isptwo@3

is the device path (or physical name) for the onboard UltraSCSI port on a PCI I/O board. This port is controlled by the PCI 1 Psycho chip on the board.

The PCI slot labelled J3200 is driven from PCI0 and has a device path beginning with

/pci@x,2000/

which denotes that it can drive PCI cards at 33 MHz or 66 MHz.

Similarly, the PCI slot labelled J4200 is driven from PCI1 and has a device path beginning with

/pci@y,2000/

which denotes that it can drive PCI cards at 33 MHz or 66 MHz.


Board Status Indicators

CPU/Memory+ boards and I/O boards have three LEDs indicating

the status of that board.

With the advent of dynamic reconfiguration (DR), the meaning of the

amber service LED has changed.

Before DR, the only time a board had an amber light on was when it had failed POST. The correct meaning of a lit amber light, as highlighted below, is that the board is in low power mode: either it has a fault or it has been DR'd out.

LED Status Codes

Table 5-1 LED Codes for the CPU/Memory+ and I/O Boards

Power  Service  Running  Condition
Off    Off      Off      Board has no electrical power
Off    On       Off      Board is in low power mode, can be unplugged
Off    Off      On       Undefined
Off    On       On       Undefined
On     Off      Off      System is hung, either in POST/OpenBoot or in the operating system
On     Off      On       Hung in OS
On     On       Off      Hung in POST/OBP or hung in OS and has failed component on board
On     On       On       Hung in POST/OBP or hung in OS and has failed component on board
On     Off      Flash    OS running
On     On       Flash    OS running and failed component on board
On     Flash    Off      Slow flash = POST. Fast flash = OBP.
On     Flash    On       Undefined
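
Table 5-1 can be read as a lookup keyed on the three LED states. The Python sketch below is an illustration only, not Sun software: the function name board_condition and the "off"/"on"/"flash" string encoding are assumptions made for this example.

```python
# Illustrative lookup for Table 5-1. The names and the string encoding of the
# LED states are assumptions for this example, not part of any Sun tool.
# Each key is (power, service, running).
LED_CODES = {
    ("off", "off", "off"): "Board has no electrical power",
    ("off", "on", "off"): "Board is in low power mode, can be unplugged",
    ("off", "off", "on"): "Undefined",
    ("off", "on", "on"): "Undefined",
    ("on", "off", "off"): "System is hung, either in POST/OpenBoot or in the operating system",
    ("on", "off", "on"): "Hung in OS",
    ("on", "on", "off"): "Hung in POST/OBP or hung in OS and has failed component on board",
    ("on", "on", "on"): "Hung in POST/OBP or hung in OS and has failed component on board",
    ("on", "off", "flash"): "OS running",
    ("on", "on", "flash"): "OS running and failed component on board",
    ("on", "flash", "off"): "Slow flash = POST. Fast flash = OBP.",
    ("on", "flash", "on"): "Undefined",
}


def board_condition(power, service, running):
    """Return the Table 5-1 condition for one board's three LED states."""
    key = (power.lower(), service.lower(), running.lower())
    return LED_CODES.get(key, "Not listed in Table 5-1")


print(board_condition("Off", "On", "Off"))
```

Keeping the table as data rather than as nested conditions makes it easy to check each row against the printed table.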


The General Rules

The following lists the general LED condition rules for the

CPU/Memory+ and I/O+ boards:

● If no LEDs are lit, there is no electrical power to the board.

● If the green Power and Running LEDs are not lit, and only the

amber light is lit, the board is ready for removal.

● If no LEDs are flashing, the system is hung or in the process of

booting up.

● It used to be the case that the board required service if the amber

Service LED was lit continuously (not flashing).

The amber light is not a fault light; it is a low power indicator. There may well be a fault, or equally the board may have been dynamically reconfigured out of operation.

● It is a normal condition for the Service LED to flash during POST

testing.


Enterprise 3500 Fibre Channel Interface Board

This is a new board designed to provide connectivity to the internal

disk drives in the Sun Enterprise 3500 server. The internal disk drives

operate with the fibre channel arbitrated loop (FC-AL) architecture.

Each of the four potential FC-AL loops corresponds to one of four

gigabit interface converter (GBIC) modules on the Fibre channel

interface board.

[Figure labels: GBIC LA, GBIC LB, GBIC UA, GBIC UB]

Part Number 501-4820


SCSI Disk Board

You can install up to four SCSI disk boards in the Sun Enterprise 4x00 and 5x00 systems, and two in the Sun Enterprise 6x00. Each SCSI disk board can contain one or two 2.1, 4.2 or 9.1 GByte 7200 RPM disk drives.

SCSI Disk Board Addressing

SCSI disk addressing depends on the drive position and on the Gigaplane slot that the SCSI disk board is plugged into. Addressing is covered in chapter 8.

Part Numbers 501-3113 (no disks) 501-4168, 501-5137

[Figure label: High-density UltraSCSI connector]


Open Boot PROM/NVRAM 6


Introducing OBP

History

The original SPARC boot PROM was based on revision 1.x. A boot command at this revision was of the form

>b sd(3,0,0)

The first OpenBoot PROM was OBP 2.x. The disadvantage of this revision was that upgrading the firmware meant changing the chip.

Enterprise servers run OBP 3.x, which has the advantage that it is downloadable.

The OpenBoot architecture provides a significant increase in

functionality and portability when compared to proprietary systems of

the past. Although this architecture was first implemented by Sun

Microsystems as OpenBoot on SPARC systems, its design is processor-

independent.

!Caution – Don’t get mixed up between NVRAM and OBP.

The OBP holds device drivers and POST code, and provides some user diagnostics.

The NVRAM holds the hostid, MAC address, time-of-day and

parameters which dictate how the OBP code will interact with the

system.

Refer back to your desktop course notes.


Open Boot PROM on each CPU/Memory Board

The proms on each CPU/Memory board all contain the same OBP and

POST and should all be at the same revision. The OBP loaded into

memory at boot time will be from the POST master.

Open Boot PROM on each I/O Board

The proms on the I/O boards will hold FCODE and iPOST specific to

that type of board.

Master NVRAM

Resides on the Clock board.

Backup NVRAM

Resides on each I/O board. There are no backup NVRAM chips on the CPU/Memory boards.


POST and OpenBoot work together in the system to test and manage

system hardware. When the system is turned on, or if a system reset is

issued, POST detects and tests buses, power supplies, boards, CPUs,

DIMMs, and many board functions.

Only POST can configure the system hardware at power up, and only

POST can enable hot-pluggable boards (if DR and AP are not present

and operating).

ok prompt

Once POST is completed, OBP checks the NVRAM parameters to see

how it should configure the system. The OBP is then loaded into main

memory. The system may then return to the ok prompt, assuming it

has been setup to do so.

{6} ok

Note – The number preceding the ok prompt identifies the POST/JTAG master. It is usually the first CPU module in the system.
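
When console output is captured to a log, the number in braces can be picked out mechanically. The following Python sketch is a hedged illustration; the function name prompt_master is hypothetical, and the value is returned as text because the notes do not state which number base the prompt uses.

```python
import re

# Matches a prompt of the form "{6} ok" as shown above.
PROMPT_RE = re.compile(r"^\{(\w+)\}\s+ok$")


def prompt_master(line):
    """Return the text inside the braces of an ok prompt, or None if absent."""
    match = PROMPT_RE.match(line.strip())
    return match.group(1) if match else None


print(prompt_master("{6} ok"))
```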


Features of OBP

Plug-in Device Drivers

A plug-in device driver is usually loaded from a plug-in device,

such as an SBus card. You can use a plug-in device driver to boot

the operating system from a device other than the default boot

device. Another example would be to display text on an output

device, other than the one attached to ttya, before the operating

system has loaded its own device drivers.

FCode Interpreter

Plug-in drivers are written in a machine-independent interpreted

language called FCode. Each OpenBoot system PROM contains an

FCode interpreter. This means that the same device and driver can

be used on machines with different types of CPUs (SPARC, Intel).

Device tree

The device tree is a data structure describing the devices

(permanently installed and plug-in) attached to a system. Both the

user and the operating system can determine the hardware

configuration of the system by inspecting the device tree.

Forth toolkit

The OpenBoot User Interface is based on the interactive

programming language Forth. You can combine sequences of user

commands to form complete programs. This provides a powerful

capability for debugging hardware and software.


Flash Programmable

Because the PROM is flash programmable, upgrading the system's POST, OBP, and I/O device FCode is fast, easy, and inexpensive.

You can upgrade several Sun Enterprise servers with little

downtime to the enterprise. The new OBP program information

can come from a CD-ROM or a network server.

POST

The code to run power on self tests resides within the OBP chip. It too

can be upgraded to include tests for new boards which come out.


Recovery Features

These keyboard functions reset variable parameters in the

NVRAM configuration file.

Note – These keyboard functions work only from a local keyboard. They do not work from an ASCII terminal or remote access terminal connected to the system's serial port A.

If your system is down because it does not complete POST, you must

connect a Sun keyboard to the keyboard connector to enable these

recovery functions.

To activate these recovery functions:

1. Start with power off.

2. Press and hold the Stop key and action key simultaneously.

3. Apply power to the system while continuing to hold the keys

down until the keyboard LEDs flash.

The key combinations and functions available are:

Stop-F

Forces I/O to ttya. Enter Forth command mode on ttya before

probing hardware. Use fexit to continue probing hardware.

Stop-N

Resets NVRAM contents to default values.

Stop-D

Sets the diag-switch? parameter variable to true and enables

verbose output during POST.


The OBP User Interface

The OBP user interface is based on an interactive command interpreter

that gives you access to an extensive set of functions for hardware and

software development, fault isolation, and debugging.

You can enter the OpenBoot environment, that is, get to the ok prompt, in the following ways:

● Shut down the operating system.

# shutdown -y -g0 -i0

● Execute the Stop-A keystroke sequence. You will sometimes see Stop-A referred to as L1-A.

● Press the reset switch on systems equipped with one (not recommended unless absolutely necessary).

● Power-cycle the system (also not recommended).

Note – A reset only gets you to the OpenBoot user interface (the ok prompt) if the OBP parameter auto-boot? is set to false.


System Testing Commands

The Open Boot PROM contains many commands used to test the

system hardware.

test-all

Tests all devices that have built-in self test methods. Testing starts with the current device node, or the specified device, and includes all children.

test (device-specifier)

Tests the specified device. The NVRAM diag-switch? parameter

and the front panel keyswitch control the verbosity and depth of

the test command.

!Caution – After entering the OpenBoot command to probe something,

a WARNING message is displayed. It informs you that if the operating

system has been running, you must type the reset-all command

before you probe anything. Failure to do this causes the system to

hang (lock up).

probe-scsi

Identifies devices attached to the (primary) SCSI bus.

probe-scsi-all

Identifies devices attached to all SCSI host adapters on all system

boards.

probe-fcal-all

Identifies devices on the FC-AL loops within the E3500.


watch-clock

Tests the clock function.

watch-net

Monitors the network connection.

probe-net-all

Monitors all network connections of built-in and plugged-in

networking cards.


Informational Commands

Some OpenBoot commands provide information about the system

components, including their contents if applicable.

banner

Displays the power-on banner.

.enet-addr

Displays the current Ethernet address.

.idprom

Displays the “ID PROM” contents.

.traps

Displays a list of SPARC trap types.

.version

Displays the PROM version for all the boards in the system.


The Device Tree

Devices are attached to a host computer through a hierarchy of

interconnected buses.

OpenBoot represents the interconnected buses and their attached

devices as a tree of nodes.

Such a tree is called the device tree. A node representing the host

computer’s main physical address bus forms the tree’s root node.

The physical address generally represents a physical characteristic

unique to the device (such as the bus address or the slot number

where the device is installed).

The use of physical addresses to identify devices prevents device

addresses from changing when other devices are installed or removed.

Note – The system generates the device tree structure after POST and

passes it to memory.

It is this structure which maps low level addresses to high level

addresses.

E.g. /sbus@3,0/SUNW,fas@3,8800000/sd@0,0 maps to

/dev/dsk/c0t0d0s0
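
Each component of a device path has the general form name@unit-address, optionally followed by :arguments. The Python sketch below splits a path into those parts; it is an illustration only, and the function name parse_device_path is an assumption made for this example.

```python
def parse_device_path(path):
    """Split a device path into (name, unit_address, arguments) tuples.

    Missing parts are returned as None. This is a plain string split and
    does not check that the path exists on any system.
    """
    nodes = []
    for component in path.strip("/").split("/"):
        component, _, arguments = component.partition(":")
        name, _, unit_address = component.partition("@")
        nodes.append((name, unit_address or None, arguments or None))
    return nodes


for node in parse_device_path("/sbus@3,0/SUNW,fas@3,8800000/sd@6,0:f"):
    print(node)
```

In this example the first component names the SBus, the second the SCSI controller on it, and the third the disk target, with f as the argument.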


Typical Device Tree

OpenBoot deals directly with hardware devices in the system. Each

device has a unique name representing the type of device and where

that device is located in the system addressing structure. The

following example shows a typical device tree.

Figure 6-1 Typical Device Tree

[Figure: device tree rooted at the machine node. Node labels shown: ac, fhc, sbus, upa, central, cpu-module, memory, ethernet, SUNW,hme, SUNW,fas, SUNW,socal, sf, ssd, scsi-disk, scsi-tape, eeprom, flashprom, zs, clock-board]


Displaying the Device Tree

You can browse the device tree to examine and modify individual device tree nodes. The device tree browsing commands are similar to the Solaris commands for changing (cd), listing (ls) and displaying (pwd) the current directory in the Solaris file system. Selecting a device node makes it the current node.

Table 6-1 Commands for Browsing the Device Tree

Command                   Description
.properties               Displays the names and values of the current node's properties.
dev device-path           Chooses the indicated device node, making it the current node.
dev node-name             Searches for a node with the given name in the subtree below the current node, and chooses the first such node found.
dev ..                    Chooses the device node that is the parent of the current node.
dev /                     Chooses the root machine node.
device-end                Exits the device tree.
ls                        Displays the names of the current node's children.
pwd                       Displays the device path name that names the current node.
show-devs [device-path]   Displays all the devices directly beneath the specified device in the device tree. The show-devs command used by itself shows the entire device tree.
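
The way dev, ls and pwd move a "current node" around the tree can be modelled in a few lines. The Python sketch below is a toy model built from a flat list of device paths; the class name DeviceTreeBrowser is hypothetical, only absolute paths, "/" and ".." are handled (not the dev node-name search), and the real ls output is formatted differently.

```python
class DeviceTreeBrowser:
    """Toy model of dev / ls / pwd over a flat list of device paths."""

    def __init__(self, paths):
        self.paths = set(paths)
        self.current = "/"

    def dev(self, target):
        """Select a node: '/', '..', or an absolute device path."""
        if target == "/":
            self.current = "/"
        elif target == "..":
            parent = self.current.rsplit("/", 1)[0]
            self.current = parent or "/"
        elif target in self.paths:
            self.current = target
        else:
            raise ValueError("no such device: " + target)

    def pwd(self):
        """Return the path of the current node."""
        return self.current

    def ls(self):
        """Return the sorted names of the current node's direct children."""
        prefix = "/" if self.current == "/" else self.current + "/"
        children = set()
        for path in self.paths:
            if path.startswith(prefix) and path != self.current:
                children.add(path[len(prefix):].split("/")[0])
        return sorted(children)


browser = DeviceTreeBrowser([
    "/sbus@3,0",
    "/sbus@3,0/SUNW,fas@3,8800000",
    "/sbus@3,0/SUNW,fas@3,8800000/sd",
    "/sbus@3,0/SUNW,hme@3,8c00000",
])
browser.dev("/sbus@3,0")
print(browser.pwd(), browser.ls())
```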


Using the .properties Command

The .properties command displays the names and values of all the

properties in the current node:

ok dev /zs@1,f0000000
ok .properties
address            ffee9000
port-b-ignore-cd
port-a-ignore-cd
keyboard
device_type        serial
slave              00000001
intr               0000000c 00000000
interrupts         0000000c
reg                00000001 0000000 00000008
name               zs
ok

Using the dev Command

The dev command sets the current node to the named node so you can view its contents. For example, to make the ACME company's SBus device named ACME,widget the current node:

ok dev /sbus/ACME,widget

The find-device command is identical to the dev command,

differing only in the way the input pathname is passed.

ok /sbus/ACME,widget find-device

Note – After choosing a device node with dev or find-device, you usually cannot execute that node's methods because dev does not

establish the current instance. For a detailed explanation of this issue,

refer to Writing FCode 3.x Programs, part number 802-3239-10.


Listing System Devices

The show-devs command displays a listing of all devices currently available in the system. If a device has been added to a disable list (discussed in the next section) but the system has not been reset or gone through a POST, the device still shows up in the listing. Conversely, a device can be physically installed in the system chassis but not show up in the following listing because it is on the disabled-board-list. You must remove the entry from the disabled-board-list after the board has been replaced, and then reset the system so that POST and OBP add the device back to the device listing.

The following device listing is from a Sun Enterprise 4000.

ok show-devs
/SUNW,ffb@2,0
/counter-timer@7,3c00
/sbus@7,0
/counter-timer@6,3c00
/fhc@6,f8800000
/sbus@6,0
/counter-timer@3,3c00
/sbus@3,0
/fhc@2,f8800000
/SUNW,UltraSPARC@5,0
/SUNW,UltraSPARC@4,0
/fhc@4,f8800000
/SUNW,UltraSPARC@1,0
/SUNW,UltraSPARC@0,0
/fhc@0,f8800000
/central@1f,0
/virtual-memory
/memory@0,0
/aliases
/options
/chosen
/openprom
/packages
/sbus@7,0/SUNW,fas@3,8800000
/sbus@7,0/SUNW,hme@3,8c00000
/sbus@7,0/SUNW,fas@3,8800000/st
/sbus@7,0/SUNW,fas@3,8800000/sd
/fhc@6,f8800000/sbus-speed@0,500000
/fhc@6,f8800000/eeprom@0,300000
/fhc@6,f8800000/flashprom@0,0
/fhc@6,f8800000/environment@0,400000
/fhc@6,f8800000/ac@0,1000000
/sbus@6,0/SUNW,soc@d,10000
/sbus@3,0/SUNW,fas@3,8800000
/sbus@3,0/SUNW,hme@3,8c00000
/sbus@3,0/SUNW,soc@d,10000
/sbus@3,0/SUNW,fas@3,8800000/st
/sbus@3,0/SUNW,fas@3,8800000/sd
/sbus@3,0/SUNW,soc@d,10000/SUNW,pln@a0000000,78c0c9
/sbus@3,0/SUNW,soc@d,10000/SUNW,pln@a0000000,78c0c9/SUNW,ssd
/fhc@2,f8800000/sbus-speed@0,500000
/fhc@2,f8800000/eeprom@0,300000
/fhc@2,f8800000/flashprom@0,0
/fhc@2,f8800000/environment@0,400000
/fhc@2,f8800000/ac@0,1000000
/fhc@4,f8800000/flashprom@0,0
/fhc@4,f8800000/sram@0,200000
/fhc@4,f8800000/environment@0,400000
/fhc@4,f8800000/simm-status@0,600000
/fhc@4,f8800000/ac@0,1000000
/fhc@0,f8800000/flashprom@0,0
/fhc@0,f8800000/sram@0,200000
/fhc@0,f8800000/environment@0,400000
/fhc@0,f8800000/simm-status@0,600000
/fhc@0,f8800000/ac@0,1000000
/central@1f,0/fhc@0,f8800000
/central@1f,0/fhc@0,f8800000/clock-board@0,900000
/central@1f,0/fhc@0,f8800000/zs@0,904000
/central@1f,0/fhc@0,f8800000/zs@0,902000
/central@1f,0/fhc@0,f8800000/eeprom@0,908000
/openprom/client-services
/packages/disk-label
/packages/obp-tftp
/packages/deblocker
/packages/terminal-emulator
ok


Listing System Available Devices

!Caution – If you boot the operating system, exit from the operating

system into OpenBoot without resetting the system, then use some

OpenBoot commands, the commands might not work as expected. In

this case, you might have to power cycle the system to restore normal

operation.

For example, suppose you boot the operating system, exit to OpenBoot, then execute the probe-scsi command. You may find that probe-scsi fails and hangs the system, and that you cannot resume the operating system (ok go). To regain control of the system, you must perform a hardware reset (power cycle or reset switch).

The correct method for executing OpenBoot probe commands is to reset the system before entering the command. You must type reset-all as the first OBP command, then invoke the desired probe command, as shown:

ok reset-all
ok probe-scsi-all

sifting Command

sifting acts very much like the UNIX grep command: it lists the commands whose names contain the text you supply. If you cannot remember the exact name of a command, for example one of the test commands, type:

ok sifting test


Displaying Device Aliases

The devalias command prints a listing of shortcuts, or nicknames, for long device paths. The system has no trouble remembering long device paths, but humans do, so the device alias list was created. You should be familiar with one or two of these aliases, such as disk and cdrom, because you have used both of these to boot the system. You can always use the entire device path at the ok prompt when booting.

Systems usually have predefined device aliases for the most

commonly used devices, such as the following listing taken from a Sun

Enterprise 3500.

ok devalias
disk         /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0
disksocal    /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0
disk         /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
diskbrd      /sbus@3,0/SUNW,fas@3,8800000/sd@a,0
diskisp      /sbus@3,0/QLGC,isp@0,10000/sd@0,0
net          /sbus@3,0/SUNW,hme@3,8c00000
cdrom        /sbus@3,0/SUNW,fas@3,8800000/sd@6,0:f
tape         /sbus@3,0/SUNW,fas@3,8800000/st@4,0
scsi         /sbus@3,0/SUNW,fas@3,8800000
disk0        /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
disk1        /sbus@3,0/SUNW,fas@3,8800000/sd@1,0
disk2        /sbus@3,0/SUNW,fas@3,8800000/sd@2,0
disk3        /sbus@3,0/SUNW,fas@3,8800000/sd@3,0
disk4        /sbus@3,0/SUNW,fas@3,8800000/sd@4,0
disk5        /sbus@3,0/SUNW,fas@3,8800000/sd@5,0
tape0        /sbus@3,0/SUNW,fas@3,8800000/st@4,0
tape1        /sbus@3,0/SUNW,fas@3,8800000/st@5,0
ttya         /central/fhc/zs@0,902000:a
ttyb         /central/fhc/zs@0,902000:b
keyboard     /central/fhc/zs@0,904000
keyboard!    /central/fhc/zs@0,904000:forcemode
name         aliases
ok
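
If a devalias listing has been captured from the console, it can be turned into a lookup table. The Python sketch below is a hedged illustration that assumes each alias appears on its own line as "alias path"; the function name parse_devalias is hypothetical.

```python
def parse_devalias(output):
    """Parse 'alias path' lines from captured devalias output into a dict.

    Lines whose second field is not a device path (the prompt, the
    'name aliases' property line) are skipped. If an alias is listed
    twice, the later line wins.
    """
    aliases = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].startswith("/"):
            aliases[parts[0]] = parts[1].strip()
    return aliases


sample = """ok devalias
net /sbus@3,0/SUNW,hme@3,8c00000
cdrom /sbus@3,0/SUNW,fas@3,8800000/sd@6,0:f
name aliases
ok"""
print(parse_devalias(sample)["cdrom"])
```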


Device Alias Commands

A device alias, or simply, alias, is a shorthand representation of a

device path.

For example, the boot disk, partition a, can be aliased as disk, which

represents the complete device path name to the boot disk drive.

The devalias command is used to examine, create, and change aliases.

Table 6-2 Device Alias Commands

Command                      Description
devalias                     Displays all current device aliases.
devalias alias               Displays the device path name corresponding to alias.
devalias alias device-path   Creates and defines an alias representing device-path. If an alias with the same name already exists, the new value supersedes the old.

!Caution – User-defined aliases are lost after a system reset or power cycle.

To create permanent aliases, use the nvalias command.

ok devalias disk
disk /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0
ok devalias disk /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
ok devalias disk
disk /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
ok

This changed the disk alias, and therefore the default boot disk, from one in a storage subsystem connected to a GBIC (socal@d) to a local disk on a fast SCSI SBus card.


nvalias command

An easy method of setting up an alias is to use the show-disks

command.

Example

We will set up a boot device on the first disk on a disk board located in

slot 3.

{0} ok show-disks
a) /sbus@7,0/SUNW,fas@3,8800000/sd
b) /sbus@3,0/SUNW,fas@3,8800000/sd
q) NO SELECTION
Enter Selection, q to quit: a
/sbus@7,0/SUNW,fas@3,8800000/sd has been selected.
Type ^Y ( Control-Y ) to insert it in the command line.
e.g. ok nvalias mydev ^Y
         for creating devalias mydev for
/sbus@7,0/SUNW,fas@3,8800000/sd
{0} ok nvalias bootdisk CTRL-Y

Pressing Control-Y here inserts /sbus@7,0/SUNW,fas@3,8800000/sd. You must then append @a,0 to complete the device path.

To set the boot device, change the boot-device NVRAM parameter:

ok setenv boot-device bootdisk
ok reset
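
The difference between devalias and nvalias is easiest to see across a reset. The Python sketch below is a toy model of that behaviour only; the class name AliasStore and its methods are hypothetical, and how the firmware actually stores permanent aliases is not modelled.

```python
class AliasStore:
    """Toy model: devalias entries are volatile, nvalias entries persist."""

    def __init__(self, builtin):
        self.builtin = dict(builtin)   # predefined aliases
        self.persistent = {}           # stands in for aliases saved by nvalias
        self.active = dict(builtin)    # what devalias would currently list

    def devalias(self, name, path=None):
        """With one argument, look up an alias; with two, define it."""
        if path is None:
            return self.active.get(name)
        self.active[name] = path       # a new value supersedes the old
        return path

    def nvalias(self, name, path):
        """Define an alias that survives a reset."""
        self.persistent[name] = path
        self.active[name] = path

    def reset(self):
        """After a reset only predefined and nvalias aliases remain."""
        self.active = dict(self.builtin)
        self.active.update(self.persistent)


store = AliasStore({"disk": "/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0"})
store.devalias("disk", "/sbus@3,0/SUNW,fas@3,8800000/sd@0,0")
store.nvalias("bootdisk", "/sbus@7,0/SUNW,fas@3,8800000/sd@a,0")
store.reset()
print(store.devalias("disk"))
print(store.devalias("bootdisk"))
```

After the reset in this model, disk reverts to its predefined path while bootdisk is still defined.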


Open Boot PROM Commands for the NVRAM

Whenever you are not sure of the correct command or what the command is used for, you can ask for help. The OBP displays a listing of commands available.

ok help

After listing and selecting a command you think might be the one you

want, you can ask for help on that one command.

Type help command-name or help category-name for more specific help.

Note – Use ONLY the first word of a category-name or category

description.

For example, type help select

ok help select
Main categories are:
Repeated loops
Defining new commands
Numeric output
Radix (number base conversions)
Arithmetic
Memory access
Line editor
System and boot configuration parameters
Select I/O devices
Floppy eject
Power on reset
Diag (diagnostic routines)
Resume execution
File download and boot
nvramrc (making new commands permanent)
Enable/Disable selected hardware subsystems
Environmental monitor


OBP Commands for Displaying and Changing the NVRAM Parameters

The printenv Command

The printenv command displays NVRAM parameter names, current

values, and default values.

The following is a listing of current parameter names. Each system

type and model can have different parameters available. Desktops

have one set, single main logic board (MLB) servers, such as the Sun

Enterprise 250, have a different set and multiple CPU board servers,

such as the Sun Enterprise 5500 have another set of parameters.

To display the contents of the NVRAM, use the printenv command.

ok printenv

Variable Name           Value           Default Value
disabled-memory-list
disabled-board-list
configuration-policy    board           component
memory-interleave       max             max
diag-passes             1               1
diag-verbosity          0               0
diag-continue?          false           false
tpe-link-test?          true            true
scsi-initiator-id       7               7
keyboard-click?         false           false
keymap
ttyb-mode               9600,8,n,1,-    9600,8,n,1,-
ttya-mode               9600,8,n,1,-    9600,8,n,1,-
ttyb-rts-dtr-off        false           false
ttyb-ignore-cd          true            true
ttya-rts-dtr-off        false           false
ttya-ignore-cd          true            true
reboot-flag             false           false
reboot-posc             4294582272      0
reboot-posl             0               0
reboot-cmd              boot net -r
diag-level              min             min
env-monitor             enabled         enabled
#power-cycles           4
system-board-serial#    802F01F0
system-board-date       34cf6a6b
fcode-debug?            false           false
output-device           screen          screen
input-device            keyboard        keyboard
load-base               16384           16384
boot-command            boot            boot
auto-boot?              true            true
auto-boot-on-error?     false           false
watchdog-reboot?        false           false
diag-file
diag-device             net             net
boot-file
boot-device             net             disk net
local-mac-address?      false           false
ansi-terminal?          true            true
screen-#columns         80              80
screen-#rows            34              34
silent-mode?            false           false
use-nvramrc?            false           false
nvramrc
security-mode           none
security-password
security-#badlogins     0
oem-logo
oem-logo?               false           false
oem-banner
oem-banner?             false           false
hardware-revision
last-hardware-update
diag-switch?            false           false

ok
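
A common use of the printenv listing is to spot parameters that no longer hold their default value. The Python sketch below shows that comparison on a few rows copied from the listing above; the function name changed_from_default and the dictionary layout are assumptions made for this example.

```python
def changed_from_default(settings):
    """Return the names whose current value differs from the default."""
    return [name for name, (value, default) in settings.items()
            if value != default]


# name -> (current value, default value), taken from the listing above
settings = {
    "configuration-policy": ("board", "component"),
    "memory-interleave": ("max", "max"),
    "reboot-posc": ("4294582272", "0"),
    "diag-level": ("min", "min"),
}
print(changed_from_default(settings))
```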


To show a specific parameter, for example the diag-switch? variable, type printenv and the variable name.

ok printenv diag-switch?

diag-switch? = true

ok

You can modify the values of the configuration variables, and any

changes you make remain in effect even after a power cycle.

!Caution – Configuration variables should be adjusted cautiously.

These NVRAM variables determine the startup routine of the system

so their configuration, if incorrect, can cause the system to operate in

an unexpected manner.

To change a parameter, use the setenv command. To change the

diagnostic switch:

ok setenv diag-switch? true

ok set-defaults

The set-defaults command restores the default setting of all

parameters.

ok set-default variable

The set-default variable command resets the value of variable to the default setting.
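
The relationship between setenv, set-default and set-defaults can be summarised in a small model. The Python sketch below is illustrative only; the class name NvramModel is hypothetical, and real NVRAM parameters have types and validation that are not modelled here.

```python
class NvramModel:
    """Toy model of NVRAM parameters with current and default values."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)
        self.values = dict(defaults)

    def printenv(self, name):
        """Return 'name = value' for one parameter."""
        return "{} = {}".format(name, self.values[name])

    def setenv(self, name, value):
        """Change one parameter; unknown names are rejected."""
        if name not in self.defaults:
            raise KeyError(name)
        self.values[name] = value

    def set_default(self, name):
        """Reset one parameter to its default (models set-default)."""
        self.values[name] = self.defaults[name]

    def set_defaults(self):
        """Reset every parameter to its default (models set-defaults)."""
        self.values = dict(self.defaults)


nvram = NvramModel({"diag-switch?": "false", "auto-boot?": "true"})
nvram.setenv("diag-switch?", "true")
print(nvram.printenv("diag-switch?"))
nvram.set_default("diag-switch?")
print(nvram.printenv("diag-switch?"))
```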


General NVRAM Parameters

Below are the NVRAM parameters which apply to all Sun servers. The

list is as compiled by the Solaris eeprom command.

Note – Not all OpenBoot systems support all parameters. Defaults

may vary depending on the system and the PROM revision.

List of NVRAM Configuration Parameters

Variable TypicalDefault Description

auto-boot? true If true, boot automatically after power-on or reset.

boot-command boot Command executed if auto-boot? is true.

boot-device disk net Device from which to boot.

boot-file empty string File to boot. An empty string lets the secondarybooter choose the default.

diag-device net Diagnostic boot source device.

diag-file empty string File from which to boot in diagnostic mode.

diag-level platform-dependent

Diagnostics level. Values include off, min, max andmenus.

diag-switch?fcode-debug?

falsefalse

If true, run in diagnostic mode.If true, includes name parameter for plug-in deviceFCodes

hardware-revision N/A System version information.

input-device

keyboard-click?

keyboard

false

Input device used at power-on (usually keyboard,ttya, or ttyb).If true, enable keyboard click.

last-hardware-update

N/A System update information.

local-mac-address? false If true, network drivers use their own MAC address,not system’s.


Variable           Typical Default  Description
nvramrc            empty            Contents of NVRAMRC.
output-device      screen           Output device used at power-on (usually screen, ttya, or ttyb).
screen-#columns    80               Number of on-screen columns (characters/line).
screen-#rows       34               Number of on-screen rows (lines).
scsi-initiator-id  7                SCSI bus address of host adapter, range 0-7.
security-mode      none             Firmware security level (options: none, command, or full). If set to command or full, system will prompt for PROM security password.
security-password  N/A              Firmware security password (never displayed). Can be set only when security-mode is set to command or full.
selftest-#megs     1                Megabytes of RAM to test. Ignored if diag-switch? is true.
tpe-link-test?     true             Enable 10baseT link test for built-in twisted pair Ethernet.
ttya-mode          9600,8,n,1,-     TTYA line discipline (baud rate, #bits, parity, #stop, handshake).
ttyb-mode          9600,8,n,1,-     TTYB line discipline (baud rate, #bits, parity, #stop, handshake).
ttya-ignore-cd     true             If true, operating system ignores carrier-detect on TTYA.
ttyb-ignore-cd     true             If true, operating system ignores carrier-detect on TTYB.
ttya-rts-dtr-off   false            If true, operating system does not assert DTR and RTS on TTYA.
ttyb-rts-dtr-off   false            If true, operating system does not assert DTR and RTS on TTYB.
use-nvramrc?       false            If true, execute commands in NVRAMRC during system start-up.


Platform specific NVRAM Commands

The OpenBoot PROM Version 3.x used in Sun Enterprise server

systems now includes additional parameters for managing the

hardware. These new parameters include:

● disabled-board-list

A list of boards, identified by system backplane slot number, to be disabled at boot up. This example puts the boards in slots 4 and 6 in the NVRAM disabled-board-list parameter:

ok setenv disabled-board-list 46

To return disabled-board-list to its default value, type:

ok set-default disabled-board-list

● disabled-memory-list (whole board at a time)

A list of CPU boards whose memory is to be disabled and left unused by the operating system. Each character of the value is a slot number written as a hexadecimal digit; for example, 7a names the CPU boards in slots 7 and 10. There is no way to disable individual memory banks at this time.

The CPU modules, if any, on the board continue to operate normally.

To disable the memory on the CPU boards in slots 7 and 10, type:

ok setenv disabled-memory-list 7a

● memory-interleave

Used to enable or disable memory interleaving. Values are min to

disable memory interleaving and max to set the maximum possible

memory interleaving.


● configuration-policy

Defines how the system handles devices when they fail POST.

The values are component, board, or system.

For example, if a SYSIO chip on an I/O board in slot 5 fails its self

test, POST disables the entire board if the variable is set to board.

POST disables only the SBus if the variable is set to component.

● sbus-probe-default

sbus-probe-default d3120

This variable defines the SBus device probe order on an I/O

board per SBus, where:

d = On-board SOC

3 = On-board FEPS

0-2 = SBus slots 0, 1, and 2

On a Type 2 and a Type 5 I/O board, since there is only 1 SBus, the

probe order will be:

d 3 2 0 (no slot 1)

To change the default probe order to ‘123d0’, enter the following at

the ok prompt:

ok setenv sbus-probe-default 123d0

Remember that this changes the default probe order for all boards in the system. You can also use this variable to skip an SBus slot: simply leave it out of the list of devices to probe. To change the probe order for a specific board, use the sbus-specific-probe variable.


● sbus-specific-probe

This variable controls the SBus probe order on a given list of

boards. To set the probe order as 320 on I/O board 4, enter the

following at the ok prompt:

ok setenv sbus-specific-probe 4:320

The number preceding the ‘:’ is the slot number; the numbers

following it are the SBus device numbers in the desired probe

order. All unlisted I/O boards in the system will use the default probe order as defined by the sbus-probe-default NVRAM variable.

Multiple boards can be defined by this variable as follows:

ok setenv sbus-specific-probe 4:320 6:d3210
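The list-valued variables described above can be read mechanically. The following Python sketch is only an illustration, not part of OpenBoot: the function names are hypothetical, and it assumes that the slot number in front of the ':' is a single hexadecimal digit, as it is in disabled-board-list and disabled-memory-list.

```python
# Illustrative decoder for the list-valued NVRAM variables described above.
# All function names are hypothetical; nothing here talks to real firmware.

# Device codes for the SBus probe order, as listed in the text.
SBUS_DEVICES = {
    "d": "on-board SOC",
    "3": "on-board FEPS",
    "0": "SBus slot 0",
    "1": "SBus slot 1",
    "2": "SBus slot 2",
}


def parse_slot_list(value):
    """Turn a slot list such as '46' or '7a' into slot numbers.

    Each character is one hexadecimal digit naming a backplane slot.
    """
    return [int(ch, 16) for ch in value.strip()]


def describe_probe_order(order):
    """Expand a probe order such as 'd3120' into device descriptions."""
    return [SBUS_DEVICES[ch] for ch in order.strip().lower()]


def parse_specific_probe(value):
    """Split the 'slot:order' pairs of sbus-specific-probe into a dict.

    Assumption: the slot in front of ':' is a hexadecimal digit.
    """
    result = {}
    for entry in value.split():
        slot, _, order = entry.partition(":")
        result[int(slot, 16)] = describe_probe_order(order)
    return result


if __name__ == "__main__":
    print(parse_slot_list("46"))
    print(parse_slot_list("7a"))
    print(parse_specific_probe("4:320 6:d3210"))
```

Decoding a value on paper in this way before typing setenv is a cheap check that the string names the boards and devices you intended.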


Environmental Monitoring

Some OBP functions take no input from the user. These functions are preprogrammed operations that start automatically after the system has booted. Some take input from the Solaris operating system and perform tasks as described in their initial configuration. These settings might not be configurable by you or by the operating system.

● ok disable-environmental-monitor

Stops the monitoring of power supply status, board temperatures,

and board hot plug while the screen displays the ok prompt.

● ok enable-environmental-monitor

Starts monitoring power supply status, board temperatures and

board hot plug while the screen displays the ok prompt.

Note – This environmental-monitor function is enabled by default.

Console messages for environmental conditions appear as follows:

● PROM NOTICE: Overtemp detected on board <n>.

● PROM NOTICE: System has cooled down.

● PROM WARNING: Board <n> is too hot.

● PROM NOTICE: Insufficient power detected.

● PROM NOTICE: Power supply restored.

● PROM NOTICE: Board insert detected.

● PROM NOTICE: Reset Initiated...

If a board is over the predetermined temperature, then the PROM sends a warning message to the console and performs a reset command, resulting in POST disabling the faulty board and the system rebooting the operating system.

If insufficient power is detected and is not fixed in 30 seconds, the OBP

initiates a reset to allow POST to deconfigure some of the boards

according to the amount of available power.
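When you collect console logs from a customer system, the messages listed above are easy to pick out by program. The Python sketch below is a hypothetical helper, not a Sun tool: it splits a console line into severity and text, and restates the 30-second insufficient-power rule as a function. The function names and the treatment of exactly 30 seconds as "due" are assumptions.

```python
import re

# Matches the PROM console message shapes listed in the text.
PROM_MESSAGE = re.compile(r"^PROM (NOTICE|WARNING): (.+)$")

# The text gives 30 seconds as the grace period for insufficient power.
POWER_GRACE_SECONDS = 30


def parse_prom_message(line):
    """Return (severity, text) for a PROM console line, or None."""
    match = PROM_MESSAGE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def power_reset_due(seconds_since_detected, power_restored):
    """True once insufficient power has persisted for the grace period."""
    return (not power_restored) and seconds_since_detected >= POWER_GRACE_SECONDS


if __name__ == "__main__":
    print(parse_prom_message("PROM WARNING: Board 3 is too hot."))
    print(power_reset_due(45, power_restored=False))
```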


NVRAM Security

The NVRAM system security variables are:

● security-mode

Sets the firmware security level (options: none, command, or full). Default is none.

● security-password

Sets the firmware security password (never displayed). No default.

● security-#badlogins

Holds the number of incorrect security password attempts. No default.

!Caution – Do not set a password at the OBP level.

Your customer may or may not wish to set one.

If the customer does set one and then forgets it, there is no way to recover to the default.


NVRAMRC Editing Commands for the NVRAM

The script editor, nvedit, lets you create and modify the script using the commands listed in Table 6-3.

Table 6-3 nvramrc Script Editor Commands

Command                    Description
nvalias alias device-path  Stores the command "devalias alias device-path" in the script. The alias persists until either nvunalias or set-defaults is executed.
$nvalias                   Performs the same function as nvalias, except that it takes its arguments, name-string device-string, from the stack.
nvedit                     Enters the script editor. If data remains in the temporary buffer from a previous nvedit session, resumes editing those previous contents. If not, reads the contents of nvramrc into the temporary buffer and begins editing it.
nvquit                     Discards the contents of the temporary buffer, without writing it to nvramrc. Prompts for confirmation.
nvrecover                  Recovers the contents of nvramrc if they have been lost as a result of the execution of set-defaults; then enters the editor as with nvedit. nvrecover fails if nvedit is executed between the time that the nvramrc contents were lost and the time that nvrecover is executed.
nvrun                      Executes the contents of the temporary buffer.
nvstore                    Copies the contents of the temporary buffer to nvramrc; discards the contents of the temporary buffer.
nvunalias alias            Deletes the specified alias from nvramrc.
$nvunalias                 Performs the same function as nvunalias except that it takes its argument, name-string, from the stack.


NVRAM Command Precautions

There are two commands you should understand along with their

implications:

● set-defaults and the escape hatch Stop-N

▼ Sets all NVRAM variables to the default values

Note – Key switch in secure position will inhibit Stop key functions.

● use-nvramrc? Set to false

▼ Clears the nvramrc memory location.

If any device aliases had been set, they would have been stored in nvramrc, along with any other tests or code required to execute during POST and boot.

The nvrecover command can restore the contents, provided you have not run the nvstore command since typing the set-defaults command. If nvstore has been run, the contents of the nvramrc memory area are not recoverable. This is one more reason why it is important that you write down the contents of nvramrc before attempting any changes to it.


Updating Flash PROM and FCode

Do you need to update?

At the ok prompt, type .version. The banner command gives the OBP

revision but not the FCode revisions.

ok .version
Slot 1 - I/O Type 4 FCODE 1.8.7 1997/12/08 15:39 iPOST 3.4.4 1997/08/26 17:37
Slot 3 - I/O Type 3 FCODE 1.8.7 1997/05/09 11:18 iPOST 3.0.2 1997/05/01 10:56
Slot 7 - I/O Type 1 FCODE 1.8.3 1997/11/14 12:41 iPOST 3.4.6 1998/04/16 14:22
Slot 9 - CPU/Memory OBP 3.2.16 1998/06/08 16:58 POST 3.9.4 1998/06/09 16:25
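If you record .version output for several systems, each slot line can be split into fields by program. The Python sketch below is a hypothetical helper, not a Sun utility. It assumes each slot is reported on one line in the layout of the example above, with a decimal slot number and a date and time after each revision.

```python
import re

# Layout assumed from the example output:
# Slot <n> - <board type> <OBP|FCODE> <rev> <date> <time> <POST|iPOST> <rev> <date> <time>
VERSION_LINE = re.compile(
    r"^Slot (?P<slot>\d+) - (?P<board>.+?) "
    r"(?P<fw_name>OBP|FCODE) (?P<fw_rev>\S+) \S+ \S+ "
    r"(?P<post_name>i?POST) (?P<post_rev>\S+) \S+ \S+$"
)


def parse_version_line(line):
    """Return a dict of fields for one .version slot line, or None."""
    match = VERSION_LINE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["slot"] = int(info["slot"])
    return info


if __name__ == "__main__":
    sample = ("Slot 9 - CPU/Memory OBP 3.2.16 1998/06/08 16:58 "
              "POST 3.9.4 1998/06/09 16:25")
    print(parse_version_line(sample))
```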

You can use the .properties command to display the CPU/Memory

Board Flash PROM revision in hexadecimal ASCII, but this is a long

way round to get to the information above.

It is included to demonstrate how the flash-proms connect to the fhc,

aka fire-hose controller, aka bootbus controller.

Note – Remember that the show-devs command lists all the devices in

the OpenBoot device tree, which you need for the following

commands.

ok cd /fhc@12,f8800000/flashprom@0,0
ok .properties
version 4f 42 50 20 20 20 33 2e 32 2e 31 36 20 31 39 39 39
model SUNW,525-1431
name flashprom

Note – 4f 42 50 20 20 20 33 2e 32 2e 31 36 20 31 39 39 39 is the hex code

for OBP 3.2.16 1999


ok cd /fhc@e,f8800000/flashprom@0,0
ok .properties
version 46 43 4f 44 45 20 31 2e 38 2e 33 20 31 39 39 37
model SUNW,525-1432
name flashprom

Note – 46 43 4f 44 45 20 31 2e 38 2e 33 20 31 39 39 37 = FCODE 1.8.3

1997
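The two notes above convert the version property by hand. As a minimal sketch, the same conversion in Python is a single call, because each pair of hex digits is one ASCII character. The function name is hypothetical.

```python
def decode_version_property(hex_bytes):
    """Decode a space-separated hex dump, as printed by .properties, to text."""
    return bytes.fromhex(hex_bytes).decode("ascii")


if __name__ == "__main__":
    cpu_board = "4f 42 50 20 20 20 33 2e 32 2e 31 36 20 31 39 39 39"
    io_board = "46 43 4f 44 45 20 31 2e 38 2e 33 20 31 39 39 37"
    print(repr(decode_version_property(cpu_board)))
    print(repr(decode_version_property(io_board)))
```

Note that the CPU/Memory board string contains three consecutive spaces (20 20 20) between OBP and the revision number.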

Use the .properties command to display the I/O Board SOC

Controller FCode revision.

ok cd /sbus@2,0/SUNW,soc@d,10000
ok .properties
soc-fcode 1.3 95/09/28
model 501-2069
name SUNW,soc

Use the .properties command to display the I/O Board SOC+

Controller FCode revision.

ok cd /sbus@2,0/SUNW,socal@d,10000
ok .properties
version @(#) FCode 1.11 97/12/07
model 501-3060
name SUNW,socal

Checking version under UNIX

At the UNIX prompt, you can obtain the OBP revision level using:

# prtconf -V


Where do I obtain the latest revisions?

At the time of this writing, patch 103346-24 updates the OBP to 3.2.24.

The patch is available on the Sunsolve CD and from sunsolve.sun.com.

Both the Flash PROM and FCode updates are included in this patch.

!Caution – You cannot use patchadd or installpatch to upgrade the Flash PROM and FCode.

You must obtain the patch, uncompress it and extract the files.

After that you use the Flash PROM programming utility to update the

OpenBoot PROM on the CPU/Memory board and FCode on the I/O

boards.

Example

If the patch was packed with compress (.tar.Z):

# zcat 103346-24.tar.Z | tar xvf -

If the patch was packed with gzip (.tar.gz):

# gzcat 103346-24.tar.gz | tar xvf -

The gzcat utility does not come as standard on Solaris 2.6 systems, but is available on the Sunsolve CD, under the directory /cdrom/cdrom0/gzip/bin/svr4


Procedure to update FlashPROM and FCode

!Caution – As a consequence of the upgrade, the system's NVRAM configuration variables may be reset to their default values.

If you have any custom NVRAM configuration, note it down before proceeding.

Change to the directory extracted in the previous step. The flash update is performed by running the UNIX program in that directory.

# cd 103346-24

# ./flash-update-<latest-rev>

Generating flashprom driver...
Generating SUNW,Ultra-Enterprise flash-update program...

Current System Board PROM Revisions:
---------------------------------------------------------
Board 0: cpu OBP 3.2.23 1999/10/01 10:07 POST 3.9.23 1999/10/01 17:54
Board 2: cpu OBP 3.2.23 1999/10/01 10:07 POST 3.9.23 1999/10/01 17:54
Board 1: Dual SBus + IO Board
FCODE 1.8.23 1999/10/01 10:07 iPOST 3.4.23 1996/03/16 17:55
Board 3: Dual PCI IO Board
FCODE 1.8.23 1999/10/01 10:07 iPOST 3.0.23 1999/10/01 17:55

Available 'Update' Revisions:
-----------------------------------------
CPU/Memory Board:
OBP 3.2.24 1999/12/23 17:31
POST 3.9.24 1999/12/23 17:35
IO Graphics Board:
I/O Type 2 FCODE 1.8.24 1999/12/23 17:29
iPOST 3.4.24 1999/12/23 17:34
IO Graphics + Board:
I/O Type 5 FCODE 1.8.24 1999/12/23 17:34
iPOST 3.4.24 1999/12/23 17:34
Dual Sbus IO Board:
I/O Type 1 FCODE 1.8.24 1999/12/23 17:29
iPOST 3.4.24 1999/12/23 17:34
Dual Sbus + IO Board:
I/O Type 4 FCODE 1.8.24 1999/12/23 17:30
iPOST 3.4.24 1999/12/23 17:34
Dual PCI IO Board:
I/O Type 3 FCODE 1.8.24 1999/12/23 17:30
iPOST 3.0.24 1999/12/23 17:34

Verifying Checksums: Okay

Do you wish to flash update your firmware? y/[n]: y
Are you sure? y/[n]: y

Updating Board 0: Type 'cpu'
1 Erasing... Done.
1 Verifying Erase... Done.
1 Programming... Done.
1 Verifying Program... Done.

Updating Board 2: Type 'cpu'
1 Erasing... Done.
1 Verifying Erase... Done.
1 Programming... Done.
1 Verifying Program... Done.

Updating Board 1: Type 'dual-sbus'
1 Erasing... Done.
1 Verifying Erase... Done.
1 Programming... Done.
1 Verifying Program... Done.

Updating Board 3: Type 'upa-sbus'
1 Erasing... Done.
1 Verifying Erase... Done.
1 Programming... Done.
1 Verifying Program... Done.

#
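Before you answer y, it is worth confirming that the available revisions really are newer than the installed ones. The Python sketch below shows one way to compare dotted revision strings such as those in the listing above. It is an illustration with hypothetical function names, and it makes no claim about how the flash-update utility itself decides. Revisions must be compared field by field as numbers, because a plain string comparison would rank 3.9.4 above 3.9.23.

```python
def parse_revision(text):
    """Convert a dotted revision such as '3.2.23' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def needs_update(current, available):
    """True if the available revision is newer than the current one."""
    return parse_revision(current) < parse_revision(available)


if __name__ == "__main__":
    print(needs_update("3.2.23", "3.2.24"))  # OBP on the CPU/Memory boards
    print(needs_update("1.8.23", "1.8.24"))  # FCODE on the I/O boards
    print(needs_update("3.2.24", "3.2.24"))  # already current
```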

NOTE: The flash proms are write protected by either of the

following two conditions:

a) Front panel key switch in secure mode.

b) Jumper (P601) removed on clock board.

At the time of writing this document systems are shipped with the

jumper on the clock board installed.

This means that only the front panel key switch being in secure

position write protects the proms.

If the proms are detected to be write protected then the flash update

process will fail with the following message:

FPROM Write Protected: Check Write Enable Jumper or Front Panel Key Switch.

!Caution – If there is a power failure while the flash proms are being

upgraded then you need to follow steps listed on the following pages.


Correcting a Faulty Flash PROM

You will have a problem if you lose power in the middle of a Flash PROM update. If the system has only one CPU/Memory board, you may need to replace it.

But, if there are two CPU/Memory boards, there are a number of

options for recovery.

update-proms

Assuming the system gets to the ok prompt, there will be a message

stating that

xxxxx

Synchronize all Flash PROMs of the same board type to the most current level available in the system by typing:

ok update-proms

prom-copy

You can copy the contents of the Flash PROM on one I/O board (slot 3, for example) to another I/O board (slot 9). To do this, type:

ok prom-copy 3 9


Correcting a Faulty Flash PROM - Updating within Extended POST

You can reprogram a corrupted PROM if another board of the same

type with uncorrupted code is available.

Refer to the Flash PROM Programming Guide, 805-5579, for more

information.

To reprogram a faulty FlashPROM:

1. Connect an ASCII terminal to Serial Port A.

2. Remove the board with corrupted code from the backplane.

3. Install a known good board in any available slot.

4. Turn the keyswitch to On.

5. Wait 15 seconds and press s to enter Extended POST.

6. Select f for fcopy from the Extended POST Menus.

7. Insert the board with corrupted code into the backplane (the board

is hot-pluggable).

8. Select 4 for Activate System Board and follow the instructions.

9. Select 1 to copy the code and follow the instructions.

10. Turn the keyswitch to Standby.


Synchronizing NVRAM/TOD chips

The NVRAM/TOD chips on the Clock board and on all I/O boards contain the same information, including the NVRAM environmental variables and configuration settings.

The master NVRAM/TOD parameters are kept on the NVRAM chip

held on the Clock board.

On occasion, you will see a message at the ok prompt stating:

Clock TOD does not match any I/O board

This means the NVRAM/TOD chip on the Clock board and the chips on the I/O boards have got out of step.

Figure 6-2 illustrates how to recover a corrupted TOD Clock value.

Figure 6-2 NVRAM/TOD Contents Can Be Copied Automatically or Manually From One Source to Another


This happens, for example, when a new I/O board is fitted.

To correct the time of day, copy the correct information from the clock

board to the I/O boards.

ok copy-clock-tod-to-io-boards

Correcting a Corrupted NVRAM/TOD

It could happen that the master chip gets corrupted.

If this happens, copy the contents from an I/O board with the correct

data to the clock board TOD chip.

ok (ioboard# in hex) copy-io-board-tod-to-clock-tod

In this example the correct data is on the I/O board in slot three.

ok 3 copy-io-board-tod-to-clock-tod
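Because the board number is given in hexadecimal, slots above 9 are easy to mistype. The Python sketch below builds the command text for a given slot. It is only an illustration: the function name is hypothetical, and the range check assumes slots 0 to 15, which you should adjust for the chassis in front of you.

```python
def tod_copy_command(slot):
    """Build the OBP command that copies an I/O board's TOD to the clock board.

    Assumption: slots are numbered 0-15 and written as one hex digit.
    """
    if not 0 <= slot <= 15:
        raise ValueError("slot must be between 0 and 15")
    return f"{slot:x} copy-io-board-tod-to-clock-tod"


if __name__ == "__main__":
    print(tod_copy_command(3))   # the example from the text
    print(tod_copy_command(10))  # slot 10 is entered as a
```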


Power On Self Test (POST) 7


Introducing POST

Always runs after a reset

The Sun Enterprise servers always execute the power on self tests

(POST) at power up and whenever a system reset is initiated. The

POST initializes all of the hardware devices before OBP starts booting

the operating system. The POST also identifies new boards that have

been installed in the system and makes them available to the OBP and

the system.

Checks the environment

Once POST is complete, the OpenBoot PROM environmental monitoring process checks the temperature sensors in the system to detect any overheated conditions. If the temperature sensed is above the predefined level, a warning message is written to the system console. If the temperature sensed exceeds a higher predefined level, the OBP disables the board and places it into low power mode.

POST Output on Serial Port A

To service Exx00 servers effectively, there must be either a VT100-type terminal connected to ttya or a tip session from another system.

POST resides on each system board

POST resides in the OBP on each CPU/Memory+ board.

POST sets LED indicators

POST controls the status LEDs on the system front panel and all

boards.

Only POST can configure the system hardware at power up, and only

POST can enable hot-pluggable boards (if DR and AP are not present

and operating).


Level of testing

Over 90 percent of system board interconnects

Over 80 percent of each system board ASICs

Identify 95% of detectable faults to FRU level

Performance

Runtime should be less than 90 seconds (diag-level set to minimum)

Code size should be less than 256 Kbytes for CPU boards

Code size should be less than 64 Kbytes for I/O boards

Coverage

POST is designed to test just about everything that is internal to the

system and the system boards. POST tests the following:

● CPU modules and caches

● System board ASICs (DC, AC, and FHC)

● Busses (SBus, UPA, centerplane, boot-bus)

● I/O ASICs (Sysio, FEPS, SOC)

● Clock board and console bus devices (NVRAM, TOD, EEPROM)

● DIMMS

● Environmental sensors

What POST doesn’t cover

POST will not test SBus cards or PCI cards.

In fact, there is a jumper on the PCI riser 501-8888 to enable or disable

JTAG. Disable it or POST may hang.


Bootbus Controller

Otherwise known as the fhc (fire hose controller). Each board in the system has an fhc, which connects to a bootbus running on the gigaplane, and to various on-board ASICs including the SRAM and temperature sensors.

The purpose of the bootbus is twofold. It passes the POST data around

the system, and is used by the clock board to pass NVRAM

parameters to the CPU/Memory boards.

Also connected to the fhc is the JTAG scan controller.

JTAG

JTAG is a 4-wire connection between various ASICs in the system. The specification was developed by the Joint Test Action Group, from which it takes its name, and is standardized as IEEE 1149.1.

Its purpose here is to pass POST information between boards and ASICs, assuming the ASICs are JTAG compliant.

Warning – Not all ASICs in the system are JTAG compliant. Certainly

not the ASICs on the PCI cards plugged into a Type 3 I/O board.

Set the JTAG jumper on the PCI riser appropriately.

For details regarding JTAG specs, scan rings etc refer to

http://solutions.sun.com/embedded/databook/pdf/whitepapers/WPR-0018-01.pdf


POST Master

After a power-on reset (POR), each CPU module checks itself, its cache and its gigaplane interface using JTAG loops via the bootbus. POST runs in SRAM on each board. The first CPU that passes is elected the POST Master, normally (0,0).

Once the master has been determined, the CPU and the OBP on the master system board run the self-test routines for each I/O board. The master then sets the I/O board configuration parameters according to the resident firmware.
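The election rule can be pictured with a small Python sketch. This is a simplified model with hypothetical names, not the firmware's algorithm. In particular, it treats "first" as the lowest (board, cpu) pair that passed, which is an assumption consistent with the master normally being (0,0); the real election depends on which CPU finishes its checks first.

```python
def elect_post_master(results):
    """Pick the POST Master from a dict of {(board, cpu): passed}.

    Simplifying assumption: the lowest-numbered passing CPU wins.
    Returns None if no CPU passed its self-checks.
    """
    for cpu_id in sorted(results):
        if results[cpu_id]:
            return cpu_id
    return None


if __name__ == "__main__":
    # CPU (0,0) failed its checks, so the next passing CPU takes over.
    print(elect_post_master({(0, 0): False, (0, 1): True, (2, 0): True}))
```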

OBP Parameters

diag-switch?  false  Diagnostic level determined by diag-level parameter
              true   Full (verbose) diagnostics run
diag-level    min    Minimum diagnostics run
              max    Full (verbose) diagnostics run

Keyswitch Positions

Normal power-on      Diagnostic level determined by diag-level parameter
Diagnostic power-on  Full (verbose) diagnostics run

Note – The diag-switch? and diag-level parameters are not

particularly useful on the Enterprise servers, since if you want to run

full diagnostics, you can power on the system by turning the

keyswitch to the diagnostic position.
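The interaction between the keyswitch and the two parameters can be summarized as a small decision function. The Python sketch below simply restates the tables above. The function name and the string values used for the keyswitch position are hypothetical.

```python
def effective_diag_level(keyswitch, diag_switch, diag_level):
    """Return the diagnostic level POST runs at, per the tables above.

    keyswitch   -- "normal" or "diagnostic" (illustrative labels)
    diag_switch -- value of diag-switch? as a bool
    diag_level  -- value of diag-level, for example "min" or "max"
    """
    if keyswitch == "diagnostic" or diag_switch:
        return "max"  # full (verbose) diagnostics
    return diag_level


if __name__ == "__main__":
    print(effective_diag_level("normal", False, "min"))
    print(effective_diag_level("diagnostic", False, "min"))
```

The keyswitch overrides both parameters, which is why the note above calls them not particularly useful on these servers.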


Power on Self Test Overview

Sample Output

The following is an example of what you see if you have an ASCII

terminal device connected to serial port ttya on the clock board of an

Enterprise x000/x500 server.

POST runs a complete and in-depth set of tests when the system

keyswitch is set to the diagnostic position or the NVRAM parameter

diag-switch? is set to true.
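Each line of POST output begins with a prefix such as 7,0> followed by the test name. If you capture the output from a tip session, a few lines of Python can split the prefix from the message. The sketch below is a hypothetical helper, and it assumes the two numbers are the board and the CPU reporting the line, in the same (board, cpu) form used for the POST Master.

```python
import re

# Prefix assumed to be "<board>,<cpu>>" followed by the message text.
POST_LINE = re.compile(r"^(\d+),(\d+)>(.*)$")


def parse_post_line(line):
    """Return (board, cpu, message) for a POST output line, or None."""
    match = POST_LINE.match(line)
    if match is None:
        return None
    board, cpu, message = match.groups()
    return int(board), int(cpu), message.strip()


if __name__ == "__main__":
    print(parse_post_line("7,0>Board 7 CPU FPROM Test"))
    print(parse_post_line("7,0>   Ecache Probe"))
```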

Hardware Power ON

POST COMPLETE
7,0>
7,0>@(#) POST 3.9.4 1998/06/09 16:25
7,0> SelfTest Initializing (Diag Level 10, ENV 0000ff00) IMPL 0011 MASK 20
7,0>Board 7 CPU FPROM Test
7,0>Board 7 Basic CPU Test
7,0> Set CPU UPA Config and Init SDB Data
7,0> SRAM Mode = 22, Clock Mode = 4:1, PCON = 6fa, MCAP = 0
7,0>Board 7 MMU Enable Test
7,0> DMMU Init
7,0> IMMU Init
7,0> Mapping Selftest Enabling MMUs
7,0>Board 7 Ecache Test
7,0> Ecache Probe
7,0> Ecache Tags
7,0> Ecache Quick Verify
7,0> Ecache Init
7,0> Ecache RAM
7,0> Ecache Address Line
7,0> Configure Ecache Limit
7,0>Ecache Size = 00400000, Limited to 00400000
7,0>Board 7 FPU Functional Test
7,0> FPU Enable
7,0>Board 7 Board Master Select Test
7,0> Selecting a Board Master
7,0>Board 7 FireHose Devices Test
7,0>Board 7 Address Controller Test
7,0> AC Initialization
7,0> AC DTAG Init


7,0>Board 7 Dual Tags Test
7,0> AC DTAG Init
7,0>Board 7 FireHose Controller Test
7,0> FHC Initialization
7,0>Board 7 JTAG Test
7,0> Verify System Board Scan Ring
7,0>Board 7 Centerplane Test
7,0> Centerplane Join
7,0>Setting JTAG Master
7,0>Clear JTAG Master
7,0>Board 7 Setup Cache Size Test
7,0> Setting Up Cache Size
7,0>Board 7 System Master Select Test
7,0> Setting System Master
7,0>POST Master Selected (JTAG,CENTRAL)
7,0>Board 16 Clock Board Test
7,0> Clock Board Initialization
7,0> Clock Board Temperature Check
7,0>Board 16 Clock Board Serial Ports Test
7,0>Board 16 NVRAM Devices Test
7,0> M48T59 (TOD) Init
7,0>Board 7 System Board Probe Test
7,0> Probing all CPU/Memory BDA
7,0> Probing System Boards
7,0> Probing CPU Module JTAG Rings
7,0>Setting System Clock Frequency
7,0> CPU Module mid 14 Checked in OK (speed code = 4)
7,0> CPU mid 18 Version=00170011.20000507
7,0> CPU Module mid 18 Checked in OK (speed code = 4)
7,0> CPU mid 19 Version=00170011.20000507
7,0> CPU Module mid 19 Checked in OK (speed code = 4)
7,0> ******** Clock Reset - retesting
7,0>System Frequency (MHz),fcpu=248, fmod=124, fsys=82, fgen=496
7,0>
7,0>@(#) POST 3.9.4 1998/06/09 16:25
7,0> SelfTest Initializing (Diag Level 40, ENV 0000ff80) IMPL 0011 MASK 20
7,0>Board 7 CPU FPROM Test
7,0> CPU/Memory Board FPROM Checksum Test
7,0>Board 7 Basic CPU Test
7,0> FPU Registers and Data Path Test
7,0> Instruction Cache Tag RAM Test
7,0> Instruction Cache Instruction RAM Test
7,0> Instruction Cache Next Field RAM Test
7,0> Instruction Cache Pre-decode RAM Test
7,0> Data Cache RAM Test


7,0> Data Cache Tags Test
7,0> DMMU Registers Access Test
7,0> DMMU TLB DATA RAM Access Test
7,0> DMMU TLB TAGS Access Test
7,0> IMMU Registers Access Test
7,0> IMMU TLB DATA RAM Access Test
7,0> IMMU TLB TAGS Access Test
7,0> Set CPU UPA Config and Init SDB Data
7,0> SRAM Mode = 22, Clock Mode = 3:1, PCON = 6fa, MCAP = 0
7,0>Board 7 MMU Enable Test
7,0> DMMU Init
7,0> IMMU Init
7,0> Mapping Selftest Enabling MMUs
7,0>Board 7 Ecache Test
7,0> Ecache Probe
7,0> Ecache Tags
7,0> Ecache Quick Verify
7,0> Ecache Init
7,0> Ecache RAM
7,0> Ecache 6N RAM Pattern Test
7,0> Ecache Address Line
7,0> Configure Ecache Limit
7,0>Ecache Size = 00400000, Limited to 00400000
7,0>Board 7 FPU Functional Test
7,0> FPU Enable
7,0>Board 7 Board Master Select Test
7,0> Selecting a Board Master
7,0>Board 7 FireHose Devices Test
7,0> PROM Datapath Test
7,0> FHC CPU SRAM Test
7,0>Board 7 Address Controller Test
7,0> AC Registers Test
7,0> AC Initialization
7,0> Memory Registers Test
7,0> Memory Registers Initialization Test
7,0> AC DTAG Init
7,0>Board 7 Dual Tags Test
7,0> AC DTAG Test
7,0> AC DTAG Init
7,0>Board 7 FireHose Controller Test
7,0> FHC Initialization
7,0>Board 7 JTAG Test
7,0> Verify System Board Scan Ring
7,0>Board 7 Centerplane Test
7,0> Centerplane and Arbiter Check Test
7,0>Setting JTAG Master


7,0>Clear JTAG Master
7,0> Centerplane Join
7,0>Setting JTAG Master
7,0>Clear JTAG Master
7,0>Board 7 Setup Cache Size Test
7,0> Setting Up Cache Size
7,0>Board 7 System Master Select Test
7,0> Setting System Master
7,0>POST Master Selected (JTAG,CENTRAL)

Note – At this point POST has completed the system board testing and

assigned a master to start testing other boards on the backplane. For

example, each I/O board has its own PROM containing information

about the board (type, revision, speed) and tests for components and

interfaces. The tests are initiated by the master CPU. I/O POST reports

from these tests are sent to the master, indicating the state of the

system. The master CPU deactivates I/O boards or components

according to the report.

7,0>Board 16 Clock Board Test
7,0> Clock Board Registers Test
7,0> Clock Board Initialization
7,0> Clock Board Temperature Check
7,0>Board 16 Clock Board Serial Ports Test
7,0> 85C30 Register Test
7,0> 85C30 Serial Ports Test
7,0> Keyboard Loopback
7,0> Mouse Loopback
7,0> Serial Port B Loopback
7,0> Remote Serial Port A Loopback
7,0> Remote Serial Port B Loopback
7,0>Board 16 NVRAM Devices Test
7,0> M48T59 (TOD) Init
7,0> M48T59 (TOD) Functional Part 1 Test
7,0> NVRAM (Non-Destructive) Test
7,0>Board 7 System Board Probe Test
7,0> Probing all CPU/Memory BDA
7,0> Probing System Boards
7,0> Probing CPU Module JTAG Rings
7,0>Setting System Clock Frequency
7,0> CPU Module mid 14 Checked in OK (speed code = 4)
7,0> CPU mid 18 Version=00170011.20000507
7,0> CPU Module mid 18 Checked in OK (speed code = 4)


7,0> CPU mid 19 Version=00170011.20000507
7,0> CPU Module mid 19 Checked in OK (speed code = 4)
7,0>System Frequency (MHz),fcpu=248, fmod=124, fsys=82, fgen=496
7,0>TESTING BOARD 1
7,0>Board 1 JTAG Test
7,0> Verify System Board Scan Ring
7,0>Board 1 Centerplane Test
7,0> Centerplane Check
7,0>Board 1 Address Controller Test
7,0> AC Registers Test
7,0> AC Initialization
7,0>Setting Freq to 25MHZ
7,0> Memory Registers Test
7,0> Memory Registers Initialization Test
7,0> AC DTAG Init
7,0>Board 1 FireHose Controller Test
7,0> FHC Initialization
7,0>Board 1 NVRAM Devices Test
7,0> M48T59 (TOD) Init
7,0> M48T59 (TOD) Functional Part 1 Test
7,0> NVRAM (Non-Destructive) Test
7,0>TESTING BOARD 3
7,0>Board 3 JTAG Test
7,0> Verify System Board Scan Ring
7,0>Board 3 Centerplane Test
7,0> Centerplane Check
7,0>Board 3 Address Controller Test
7,0> AC Registers Test
7,0> AC Initialization
7,0>Setting Freq to 25MHZ
7,0> Memory Registers Test
7,0> Memory Registers Initialization Test
7,0> AC DTAG Init
7,0>Board 3 FireHose Controller Test
7,0> FHC Initialization
7,0>Board 3 NVRAM Devices Test
7,0> M48T59 (TOD) Init
7,0> M48T59 (TOD) Functional Part 1 Test
7,0> NVRAM (Non-Destructive) Test
7,0>Re-mapping to Local Device Space
7,0>Begin Central Space Serial Port access
7,0>Enable AC Control Parity
7,0>Hotplug Trigger Test
7,0>Init Counters for Hotplug
7,0>Board 7 Cross Calls Test
7,0> Cross Calls Test


7,0>Displaying PROM Versions
7,0>Slot 1 IO Type 4 FCODE 1.8.7 1997/12/8 15:39 iPOST 3.4.6 1998/4/16 14:22
7,0>Slot 3 IO Type 4 FCODE 1.8.7 1997/12/8 15:39 iPOST 3.4.6 1998/4/16 14:22
7,0>Slot 7 CPU/Memory OBP 3.2.16 1998/6/8 16:58 POST 3.9.4 1998/6/9 16:25
7,0>Slot 9 CPU/Memory OBP 3.2.16 1998/6/8 16:58 POST 3.9.4 1998/6/9 16:25
7,0>Board 7 Environmental Probe Test
7,0> Environmental Probe
7,0>Checking Power Supply Configuration
7,0>Power is more than adequate, load 4 ps 3
7,0>Reconfig memory due to POR or CLOCK RESET
7,0>Reconfig memory due to DIAG_LEVEL
7,0>Board 7 Probing Memory SIMMS Test
7,0> Probe SIMMID
7,0> Populated Memory Bank Status
7,0> bd # Size Address Way Status
7,0> 9 256 Normal
7,0>Board 7 Memory Configuration Test
7,0> Memory Interleaving
7,0> Total banks with 8MB SIMMs = 0
7,0> Total banks with 32MB SIMMs = 1
7,0> Total banks with 128MB SIMMs = 0
7,0> Total banks with 256MB SIMMs = 0
7,0> Overall memory default speed = 60ns
7,0>Do OPTIMAL INTLV
7,0> Board 9 AC rev 5 RCTIME = 0 (Tras 71)
7,0> Memory Refresh Enable
7,0>Board 7 SIMMs Test
7,0> MP Memory SIMM Clear Test
7,0> Memory Size is 256Mbytes
7,0> CPU MID 18 clearing 00000000.00004000 to 00000000.05500000
7,0> CPU MID 19 clearing 00000000.05500000 to 00000000.0aa00000
7,0> CPU MID 14 clearing 00000000.0aa00000 to 00000000.10000000
7,0> CPU MID 14 clearing 00000000.00000000 to 00000000.00004000
7,0> Memory Walking Rows and Columns Test
7,0> MP Memory SIMM (6N RAM Patterns) Test
7,0> Memory Size is 256Mbytes
7,0> CPU MID 18 testing 00000000.00000000 to 00000000.05500000
7,0> CPU MID 19 testing 00000000.05500000 to 00000000.0aa00000
7,0> CPU MID 14 testing 00000000.0aa00000 to 00000000.10000000
7,0> MP Memory SIMM (moving inverse) Test
7,0> Memory Size is 256Mbytes
7,0> CPU MID 18 testing 00000000.00000000 to 00000000.05500000


7,0> CPU MID 19 testing 00000000.05500000 to 00000000.0aa00000
7,0> CPU MID 14 testing 00000000.0aa00000 to 00000000.10000000
7,0>Slave CPU Functional Tests
7,0> Slave CPU MID 18 started
9,0>Board 9 Functional CPU 0 Test
9,0> Dcache Init
9,0> Dcache Enable Test
9,0> Dcache Functionality Test
9,0> Ecache Stress Test
9,0> Ecache Functional Test
9,0> CPU Dispatch (Multi-Scalar) Test
9,0> SPARC Atomic Instructions Test
9,0> SPARC Prefetch Instructions Test
9,0> CPU Softint Registers and Interrupts Test
9,0> Uni-Processor Cache Coherence Test
9,0> Branch Memory Test
9,0> SDB ECC CE Test
9,0> SDB ECC Uncorrectable Test
9,0> FPU Instruction Test
7,0> Slave CPU MID 19 started
9,1>Board 9 Functional CPU 1 Test
9,1> Dcache Init
9,1> Dcache Enable Test
9,1> Dcache Functionality Test
9,1> Ecache Stress Test
9,1> Ecache Functional Test
9,1> CPU Dispatch (Multi-Scalar) Test
9,1> SPARC Atomic Instructions Test
9,1> SPARC Prefetch Instructions Test
9,1> CPU Softint Registers and Interrupts Test
9,1> Uni-Processor Cache Coherence Test
9,1> Branch Memory Test
9,1> SDB ECC CE Test
9,1> SDB ECC Uncorrectable Test
9,1> FPU Instruction Test
7,0>Board 7 Functional CPU 0 Test
7,0> Dcache Init
7,0> Dcache Enable Test
7,0> Dcache Functionality Test
7,0> Ecache Stress Test
7,0> Ecache Functional Test
7,0> CPU Dispatch (Multi-Scalar) Test
7,0> SPARC Atomic Instructions Test
7,0> SPARC Prefetch Instructions Test
7,0> CPU Softint Registers and Interrupts Test
7,0> Uni-Processor Cache Coherence Test


7,0> Branch Memory Test
7,0> SDB ECC CE Test
7,0> SDB ECC Uncorrectable Test
7,0> FPU Instruction Test
7,0>TESTING IO BOARD 1
7,0>Board 1 I/O FPROM Test
7,0> I/O Board EPROM checksum Test
7,0>@(#) iPOST 3.4.6 1998/04/16 14:22
7,0> TESTING IO BOARD 1 ASICs
7,0> TESTING SysIO Port 0
7,0>Board 1 SysIO Registers Test
7,0> SysIO Register Initialization
7,0> IOMMU Registers and RAM Test
7,0> Streaming Buffer Registers and RAM Test
7,0> SBus Control and Config Registers Test
7,0> SysIO RAM Initialization
7,0>Board 1 SysIO Functional Test
7,0> Clear Interrupt Map and State Registers
7,0> SysIO Interrupts Test
7,0> SysIO Timers/Counters Test
7,0> IOMMU Virtual Address TLB Tag Compare Test
7,0> Streaming Buffer Flush Test
7,0> DMA Merge Buffer Test
7,0> SYSIO ECC Correctable Test
7,0> SYSIO ECC UnCorrectable Test
7,0> SysIO Sbus Probe Test
7,0> SysIO Register Initialization Test
7,0> SysIO RAM Initialization Test
7,0> Clear Interrupt Map and State Registers Test
7,0>Board 1 OnBoard IO Chipset (SOC) Test
7,0> SOC SRAM Test
7,0> SOC Registers Test
7,0> SOC Interrupt Test
7,0> Clear Interrupt Map and State Registers Test
7,0> TESTING SysIO Port 1
7,0>Board 1 SysIO Registers Test
7,0> SysIO Register Initialization
7,0> IOMMU Registers and RAM Test
7,0> Streaming Buffer Registers and RAM Test
7,0> SBus Control and Config Registers Test
7,0> SysIO RAM Initialization
7,0>Board 1 SysIO Functional Test
7,0> Clear Interrupt Map and State Registers
7,0> SysIO Interrupts Test
7,0> SysIO Timers/Counters Test
7,0> IOMMU Virtual Address TLB Tag Compare Test


7,0> Streaming Buffer Flush Test
7,0> DMA Merge Buffer Test
7,0> SYSIO ECC Correctable Test
7,0> SYSIO ECC UnCorrectable Test
7,0> SysIO Sbus Probe Test
7,0> SysIO Register Initialization Test
7,0> SysIO RAM Initialization Test
7,0> Clear Interrupt Map and State Registers Test
7,0>Board 1 OnBoard IO Chipset (FEPS) Test
7,0> FAS366 Registers Test
7,0> ESP FAS366 DVMA burst mode read/write Test
7,0> FAS366 FIFO TO DMA Test
7,0> DMA TO FAS366 FIFO Test
7,0> FEPS (Ethernet) Registers Test
7,0> FEPS Ethernet (BM, DP83840, Twister) Internal Loopbacks Test
7,0> SysIO Register Initialization Test
7,0> SysIO RAM Initialization Test
7,0> Clear Interrupt Map and State Registers Test
7,0>IO BOARD 1 TESTED
7,0>TESTING IO BOARD 3
7,0>Board 3 I/O FPROM Test
7,0> I/O Board EPROM checksum Test
7,0>@(#) iPOST 3.4.6 1998/04/16 14:22
7,0> TESTING IO BOARD 3 ASICs
7,0> TESTING SysIO Port 0
7,0>Board 3 SysIO Registers Test
7,0> SysIO Register Initialization
7,0> IOMMU Registers and RAM Test
7,0> Streaming Buffer Registers and RAM Test
7,0> SBus Control and Config Registers Test
7,0> SysIO RAM Initialization
7,0>Board 3 SysIO Functional Test
7,0> Clear Interrupt Map and State Registers
7,0> SysIO Interrupts Test
7,0> SysIO Timers/Counters Test
7,0> IOMMU Virtual Address TLB Tag Compare Test
7,0> Streaming Buffer Flush Test
7,0> DMA Merge Buffer Test
7,0> SYSIO ECC Correctable Test
7,0> SYSIO ECC UnCorrectable Test
7,0> SysIO Sbus Probe Test
7,0> SysIO Register Initialization Test
7,0> SysIO RAM Initialization Test
7,0> Clear Interrupt Map and State Registers Test
7,0>Board 3 OnBoard IO Chipset (SOC) Test
7,0> SOC SRAM Test


7,0> SOC Registers Test
7,0> SOC Interrupt Test
7,0> Clear Interrupt Map and State Registers Test
7,0> TESTING SysIO Port 1
7,0>Board 3 SysIO Registers Test
7,0> SysIO Register Initialization
7,0> IOMMU Registers and RAM Test
7,0> Streaming Buffer Registers and RAM Test
7,0> SBus Control and Config Registers Test
7,0> SysIO RAM Initialization
7,0>Board 3 SysIO Functional Test
7,0> Clear Interrupt Map and State Registers
7,0> SysIO Interrupts Test
7,0> SysIO Timers/Counters Test
7,0> IOMMU Virtual Address TLB Tag Compare Test
7,0> Streaming Buffer Flush Test
7,0> DMA Merge Buffer Test
7,0> SYSIO ECC Correctable Test
7,0> SYSIO ECC UnCorrectable Test
7,0> SysIO Sbus Probe Test
7,0> SysIO Register Initialization Test
7,0> SysIO RAM Initialization Test
7,0> Clear Interrupt Map and State Registers Test
7,0>Board 3 OnBoard IO Chipset (FEPS) Test
7,0> FAS366 Registers Test
7,0> ESP FAS366 DVMA burst mode read/write Test
7,0> FAS366 FIFO TO DMA Test
7,0> DMA TO FAS366 FIFO Test
7,0> FEPS (Ethernet) Registers Test
7,0> FEPS Ethernet (BM, DP83840, Twister) Internal Loopbacks Test
7,0> SysIO Register Initialization Test
7,0> SysIO RAM Initialization Test
7,0> Clear Interrupt Map and State Registers Test
7,0>IO BOARD 3 TESTED
7,0>SYSTEM LEVEL TESTING
7,0>Board 7 Cache Coherency Test
7,0> Multi-Processor Cache Coherence Test
7,0> Testing CPU MID 18
7,0> Testing CPU MID 19
7,0>Probing for Disk System boards
7,0>Board 7 System Interrupts Test
7,0> System Interrupts Test
7,0>Checking Power Supply Configuration
7,0>Power is more than adequate, load 4 ps 3 (Four boards, and 3 power supplies)
7,0> Check Board Present Test


7,0> Board Present Interrupt Test
7,0>
7,0> System Board Status
7,0>-----------------------------------------------------------------
7,0> Slot   Board Status      Board Type      Failures
7,0>-----------------------------------------------------------------
7,0>   0  | Not installed   |               |
7,0>   1  | Normal          |+IO Type 4     |
7,0>   2  | Not installed   |               |
7,0>   3  | Normal          |+IO Type 4     |
7,0>   4  | Not installed   |               |
7,0>   5  | Not installed   |               |
7,0>   6  | Not installed   |               |
7,0>   7  | Normal          |+CPU/Memory    |
7,0>   8  | Not installed   |               |
7,0>   9  | Normal          |+CPU/Memory    |
7,0>  16  | Normal          | Clock Board   |
7,0>-----------------------------------------------------------------
7,0>
7,0> CPU Module Status
7,0>-----------------------------------------------------------------
7,0> MID   OK   Cache   Speed   Version
7,0>-----------------------------------------------------------------
7,0>  14 | y  | 4096  | 248   | 00170011.20000507
7,0>  18 | y  | 4096  | 248   | 00170011.20000507
7,0>  19 | y  | 4096  | 248   | 00170011.20000507
7,0>-----------------------------------------------------------------
7,0>System Frequency (MHz),fcpu=248, fmod=124, fsys=82, fgen=496
7,0> Populated Memory Bank Status
7,0> bd #  Size  Address  Way  Status
7,0>  9    256   0        0    Normal
7,0>
7,0> POST COMPLETE
7,0>Entering OBP


POST Control Commands

The following are the control commands for POST.

Note – These commands are entered on the terminal connected to ttya, or on the keyboard of the workstation running the tip session. Do not try to enter these commands on the Sun keyboard connected to the clock board.

The toggle keys turn the feature on and off with each stroke of the key.

There are two particularly useful commands:

s - Toggle Stop flag

When this flag is set, POST stops on completion and enters the extended POST menus. Get into the habit of hitting the s key during POST, which will then put you into the extended POST.

v - Toggle verbose print flag

Normally, the only way to get a display of POST to ttya is to power on in diagnostic mode or have diag-switch? set to true. If you hit the v key during a normal power-on, POST output is displayed on ttya.


L - Toggle Loop on full POST


POST Menus


Option 7... Display system summary

This is the most useful command, since it gives a display of the final

system configuration:

7,0> System Board Status
7,0>-----------------------------------------------------------------
7,0> Slot   Board Status      Board Type      Failures
7,0>-----------------------------------------------------------------
7,0>   0  | Not installed   |               |
7,0>   1  | Normal          |+IO Type 4     |
7,0>   2  | Not installed   |               |
7,0>   3  | Normal          |+IO Type 4     |
7,0>   4  | Not installed   |               |
7,0>   5  | Not installed   |               |
7,0>   6  | Not installed   |               |
7,0>   7  | Normal          |+CPU/Memory    |
7,0>   8  | Not installed   |               |
7,0>   9  | Normal          |+CPU/Memory    |
7,0>  16  | Normal          | Clock Board   |
7,0>-----------------------------------------------------------------
7,0>
7,0> CPU Module Status
7,0>-----------------------------------------------------------------
7,0> MID   OK   Cache   Speed   Version
7,0>-----------------------------------------------------------------
7,0>  14 | y  | 4096  | 248   | 00170011.20000507
7,0>  15 | y  | 4096  | 248   | 00170011.20000507
7,0>  18 | y  | 4096  | 248   | 00170011.20000507
7,0>  19 | y  | 4096  | 248   | 00170011.20000507
7,0>-----------------------------------------------------------------
7,0>System Frequency (MHz),fcpu=248, fmod=124, fsys=82, fgen=496
7,0> Populated Memory Bank Status
7,0> bd #  Size  Address  Way  Status
7,0>  9    256   0        0    Normal


Warning – Note the MID address for the processors. POST numbers processors in decimal (as does Solaris), whereas OBP numbers the processors in hex. Be aware of this difference.
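The decimal-versus-hex difference is easy to check with a small conversion helper. The following is a minimal sketch in Python, not part of POST or OBP; the function names are hypothetical, and the MID values are the ones listed in the CPU Module Status table.

```python
# Hypothetical helpers: convert CPU module IDs (MIDs) between the decimal
# form printed by POST and Solaris and the hexadecimal form used by OBP.

def mid_to_obp(mid: int) -> str:
    """Return a decimal MID as the hex string OBP would use."""
    return format(mid, "x")


def mid_from_obp(text: str) -> int:
    """Return the decimal MID for a hex string taken from OBP."""
    return int(text, 16)


if __name__ == "__main__":
    # MIDs from the CPU Module Status table in the system summary.
    for mid in (14, 15, 18, 19):
        print(f"POST/Solaris MID {mid:2d} = OBP MID {mid_to_obp(mid)}")
```

So the processor that POST reports as MID 18 is MID 12 at the ok prompt.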

Experiment with the POST menus. Some of the tests return the message STILL UNDER DEVELOPEMENT and should not be relied upon too heavily for fault finding.


POST Board Status Messages

On completion of testing, POST will display the status of each board.

There are four board status types:

Normal

On-line/Failed    A component on that board has failed POST.

Low-power mode    Either the whole board has failed POST, or the OBP parameter configuration-policy is set to board, or the board has been detached using dynamic reconfiguration (DR).

Not Installed

7,0> System Board Status
7,0>-----------------------------------------------------------------
7,0> Slot   Board Status      Board Type      Failures
7,0>-----------------------------------------------------------------
7,0>   0  | Not installed   |               |
7,0>   1  | Normal          |+IO Type 4     |
7,0>   2  | Not installed   |               |
7,0>   3  | Low Power Mode  |+IO Type 4     | AC
7,0>   4  | Not installed   |               |
7,0>   5  | Not installed   |               |
7,0>   6  | Not installed   |               |
7,0>   7  | Online/failure  |+CPU/Memory    | CPU 1
7,0>   8  | Not installed   |               |
7,0>   9  | Normal          |+CPU/Memory    |
7,0>  16  | Normal          | Clock Board   |
7,0>-----------------------------------------------------------------
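When POST output is captured from a tip session, the boards that need attention can be picked out of the System Board Status table automatically. The following Python sketch assumes the row layout shown in the sample above (prompt, slot, then three fields separated by | characters); the function name and the regular expression are illustrative, not part of any Sun tool.

```python
import re

# A few rows copied from the sample System Board Status table.
SAMPLE = """\
7,0> Slot Board Status Board Type Failures
7,0> 0 | Not installed | |
7,0> 1 | Normal |+IO Type 4 |
7,0> 3 | Low Power Mode |+IO Type 4 | AC
7,0> 7 | Online/failure |+CPU/Memory | CPU 1
7,0> 9 | Normal |+CPU/Memory |
7,0> 16 | Normal | Clock Board |
"""

# prompt such as "7,0>", slot number, then status | board type | failures
ROW = re.compile(r"^\d+,\d+>\s*(\d+)\s*\|([^|]*)\|([^|]*)\|(.*)$")


def problem_boards(log):
    """Return (slot, status, board type, failures) for boards needing attention."""
    found = []
    for line in log.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separator or unrelated POST output
        slot, status, board_type, failures = (f.strip() for f in match.groups())
        if status.lower() not in ("normal", "not installed"):
            found.append((int(slot), status, board_type.lstrip("+"), failures))
    return found


if __name__ == "__main__":
    for board in problem_boards(SAMPLE):
        print(board)
```

With the sample rows, only slots 3 and 7 are reported, together with the failing component named in the Failures column.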


Sample Error Messages


POST Error Reporting

You can view the output from the last POST by running the show-post-results command. You can examine the report for error messages. The report generated by the show-post-results command displays a synopsis of the POST tests in a less confusing manner than the actual POST output you observed using the serial port connection.

The symbols used in the show-post-results report are defined as follows:

● P = present

● *** = failed component

● NOT = Not found

● 0 = no failures


The following is sample output from a show-post-results command.

ok show-post-results

Slot 0 - Status=Okay, Type: CPU/Memory
Cpu0=P Cpu0-OK=P FailCode=0 Cpu1=P Cpu1-OK=P FailCode=0
AC=P FHC=P SRAM=P FPROM=P LabCon=Not Ovtemp=Not
Bank0=0 Bank1=0 DTag0=P DTag1=P JTAG=P CntrPl=P
Bank0=P Bank1=Not DC=ff

Slot 1 - Status=Fail, Type: IO board Type 2
Sysio0=P Sysio1=P FEPS=P FEPSFC=0 SOC=P FFB=P
Sbus0=P Sbus2=P
AC=P FHC=P SRAM=P FPROM=P LabCon=Not Ovtemp=Not
TODC=*** JTAG=P CntrPl=P DC=ff

Slot 2 - Status=Okay, Type: CPU/Memory
Cpu0=P Cpu0-OK=P FailCode=0 Cpu1=P Cpu1-OK=P FailCode=0
AC=P FHC=P SRAM=P FPROM=P LabCon=Not Ovtemp=Not
Bank0=0 Bank1=0 DTag0=P DTag1=P JTAG=P CntrPl=P
Bank0=Not Bank1=Not DC=ff

Slot 16 - Status=Fail, Type: Clock
Clock=P Serial=P KbdMse=P PPS-DC=P DCReg0=P DCReg1=P
AC=P ACFan=P KeyFan=P PSFail=0 Ovtemp=Not TODC=P RKFan=P

P = Present or Passed
*** = Failed Component
Not = Not present

ok
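A captured show-post-results report can be scanned for the *** marker to list failed components per slot. The Python sketch below assumes the name=value layout shown in the sample output; the function and pattern names are hypothetical, and the report text is an abbreviated copy of the sample.

```python
import re

# Abbreviated copy of the sample show-post-results output.
REPORT = """\
Slot 0 - Status=Okay, Type: CPU/Memory
Cpu0=P Cpu0-OK=P FailCode=0 Cpu1=P Cpu1-OK=P FailCode=0
AC=P FHC=P SRAM=P FPROM=P LabCon=Not Ovtemp=Not
Slot 1 - Status=Fail, Type: IO board Type 2
Sysio0=P Sysio1=P FEPS=P FEPSFC=0 SOC=P FFB=P
TODC=*** JTAG=P CntrPl=P DC=ff
P = Present or Passed
*** = Failed Component
"""

SLOT = re.compile(r"^Slot (\d+) - Status=(\w+), Type: (.+)$")
FIELD = re.compile(r"(\S+?)=(\S+)")  # name=value with no spaces around "="


def failed_components(report):
    """Map slot number to the list of fields whose value is ***."""
    failures = {}
    slot = None
    for line in report.splitlines():
        header = SLOT.match(line.strip())
        if header:
            slot = int(header.group(1))
            continue
        if slot is None:
            continue  # text before the first slot header
        for name, value in FIELD.findall(line):
            if value == "***":
                failures.setdefault(slot, []).append(name)
    return failures


if __name__ == "__main__":
    print(failed_components(REPORT))
```

For the sample report this points at the time-of-day clock (TODC) on the I/O board in slot 1. The legend lines are ignored because their = signs are surrounded by spaces.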

The following definitions provide a key to the show-post-results output.


POST Error Reporting - definitions

CPU/Memory Board

Cpu0/Cpu1 CPU modules on the board

CPU{0,1}-OK CPU module status

FailCode Failure code (valid only if CPU failed)

AC Address Controller

SRAM Static RAM

FPROM Flash PROM

FHC Fire Hose Controller

LabCon Lab Console

Ovtemp Overtemp

Bank0 Bank0 status (a bit indicates a missing or failed SIMM)

Bank1 Bank1 status (a bit indicates a missing or failed SIMM)

DTag0 DTags0 status

DTag1 DTags1 status

JTAG JTAG status

CntrPl Centerplane status

DC Data Controllers (0 bit indicates a failed DC)

I/O Board

Sysio0 SysIO 0 status

Sysio1 SysIO 1 status

FEPS Onboard FEPS chip

FEPSFC FEPS fail code (valid only if failed)

SOC Onboard SOC status

FFB FFB card status

Sbus0 SBus0 slot status

Sbus1 SBus1 slot status

Sbus2 SBus2 slot status

AC Address Controller

FHC Fire Hose Controller

SRAM Static RAM

FPROM Flash PROMs

LabCon Lab Console

Ovtemp Overtemp

TODC Time of Day Clock

JTAG JTAG status

CntrPl Centerplane status

DC Data Controllers (0 bit indicates a failed DC)


Disk Board

Disk0 Disk0 ID (valid only if disk present)

Disk1 Disk1 ID (valid only if disk present)

Disk0P Disk0 Present

Disk1P Disk1 Present

VDDOK SCSI VDD status

Fan Fan Fail status

JTAG JTAG status

Clock Board

Clock Clock running

Serial Serial Port

KbdMse Keyboard/Mouse status

PPS-DC Peripheral PS ok (all DC levels OK)

AC AC power status

ACFan AC box fan status

KeyFan KeySwitch fan status

PSFail Power Supply fail status (bit position indicates which power supply has failed)

Ovtemp Overtemp

TODC Time of Day Clock

V5-P Peripheral 5V

V12-P Peripheral 12V

V5-Aux Auxiliary 5V

V5P-PC Peripheral 5V Precharge

V12-PC Peripheral 12V Precharge

V3-PC System 3.3V Precharge

V5-PC System 5.0V Precharge

RKFan Rack Fan Status

3.3V Clock board 3.3 V

5.0V Clock board 5.0 V
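Several of the fields above (DC, Bank0, Bank1, PSFail) are bit fields. The definitions do not give the field width or the bit numbering, so the Python sketch below makes loudly stated assumptions: values are hexadecimal, fields are 8 bits wide, and bit 0 is the least significant bit. The function names are hypothetical.

```python
# ASSUMPTIONS (not stated in the key above): values are hexadecimal,
# fields are 8 bits wide, and bit 0 is the least significant bit.

def cleared_bits(hex_value, width=8):
    """Bit positions that are 0; for DC, a 0 bit indicates a failed DC."""
    value = int(hex_value, 16)
    return [bit for bit in range(width) if not value & (1 << bit)]


def set_bits(hex_value, width=8):
    """Bit positions that are 1; for Bank0/Bank1/PSFail a set bit flags a problem."""
    value = int(hex_value, 16)
    return [bit for bit in range(width) if value & (1 << bit)]


if __name__ == "__main__":
    print("DC=ff  failed DCs:", cleared_bits("ff"))
    print("DC=fb  failed DCs:", cleared_bits("fb"))
    print("Bank0=0 flagged SIMMs:", set_bits("0"))
```

Under these assumptions, DC=ff and Bank0=0 in the sample report both mean that nothing is flagged.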


When things go wrong...

What constitutes a minimum system?

If you have a system that hangs under POST, or is unpredictable in its results, run POST with a minimum configuration.

You can run POST with a clock board and a CPU/Memory board with one CPU module and no memory. You do not need any memory for POST, since it runs in SRAM on each board.

Frequency Margining

Again, if you have intermittent faults, increase the frequency of the Gigaplane interconnect to trap these faults.

Do not margin it too high, since POST will then fail automatically.

Loop on Diagnostics

Remember the loop function, which you can set on the POST control menu.

Warning – POST does not check SBus cards, or peripherals. It is no use

running POST with a loop command and with frequency margined

high, if the fault is that the system will not see any disks.


Accessing and Displaying POST

To access the host’s operating system from the console and to interact

with OBP and POST programs, you must access the system’s serial

port A. For interactive capability you must have an ASCII terminal

with keyboard attached to serial port A.

tip session

The best method of getting POST output is to tip into serial port A from another Sun system. Typically, you will tip out of port B on a workstation. The method is outlined below.

workstation# more /etc/remote | grep hardwire
hardwire:dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:
workstation# tip hardwire
connected

Note the tip commands:

~# break (Stop-A)

~. exit

[Figure: a null modem cable connects serial port A on the server to an ASCII terminal or workstation.]


Module 8: Internal Disk Subsystems


Internal Storage Capacities

Sun Enterprise systems have the following maximum internal storage

capacities:

● Sun Enterprise 3000 – Up to ten 18.2-Gbyte SCSI drives can be used to populate the internal bays

● Sun Enterprise 3500 – Up to eight 36.4-Gbyte FC-AL dual-ported disk drives can be used to populate the internal bays

● Sun Enterprise 4x00 and Enterprise 5x00 – Up to eight 18.2-Gbyte SCSI drives, mounted on four disk boards

● Sun Enterprise 6x00 – Up to four 18.2-Gbyte SCSI drives, mounted on two disk boards
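The maximum internal capacity of each model follows directly from the drive counts and sizes in the list above. The short Python sketch below only multiplies those figures; the dictionary and function names are illustrative.

```python
# (maximum drive count, drive size in Gbytes), taken from the list above.
MAX_INTERNAL = {
    "Enterprise 3000": (10, 18.2),
    "Enterprise 3500": (8, 36.4),
    "Enterprise 4x00/5x00": (8, 18.2),
    "Enterprise 6x00": (4, 18.2),
}


def total_gbytes(model):
    """Maximum internal capacity in Gbytes: drive count times drive size."""
    count, size = MAX_INTERNAL[model]
    return round(count * size, 1)


if __name__ == "__main__":
    for model in MAX_INTERNAL:
        count, size = MAX_INTERNAL[model]
        print(f"Sun {model}: {count} x {size} = {total_gbytes(model)} Gbytes")
```

This gives 182.0, 291.2, 145.6 and 72.8 Gbytes respectively.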


Disk Subsystems

Sun Enterprise Servers can support several terabytes of disk storage

when external assemblies are used.

This module focuses on disks that are configured as internal devices.

The SCSI Disk Board

With the exception of the Sun Enterprise 3500 and 3000 systems, the

Sun Enterprise x500 servers support dual-SCSI disk boards that

contain one or two UltraSCSI disk drives.

The disk board capacity for these servers varies as follows:

● The Sun Enterprise 4x00 supports up to four disk boards.

● The Sun Enterprise 5x00 supports up to four disk boards.

● The Sun Enterprise 6x00 supports a maximum of two disk boards.

This is because the disk boards do not put a load on the Gigaplane. Indeed, the only thing the disk board takes from the Gigaplane is power. Putting more than two disk boards in an E6x00 would leave spaces on the bus, which is not allowed. (This is why load boards are fitted in empty slots.)

Disk boards are limited to slots 14 and 15 only, which are the slots closest to the Gigaplane terminators.

The SCSI Disk Board Addressing

SCSI addressing is assigned according to the Gigaplane slot in which

the board is installed, as shown in Table 8-1.


Note – The SCSI disk board requires a SCSI-2 interface from an I/O board that connects to the external SCSI-2 port. The SCSI disk boards can be daisy-chained, so only one interface is required.

Table 8-1 Default Drive Address Settings

SLOT  DISK 0 ADDRESS  DISK 1 ADDRESS    SLOT  DISK 0 ADDRESS  DISK 1 ADDRESS
0     4               5                 8     10              11
1     6               7                 9     0               1
2     0               1                 10    12              13
3     10              11                11    2               3
4     2               3                 12    14              15
5     12              13                13    8               9
6     8               9                 14    0               1
7     14              15                15    10              11

Jumpers J0702 and J0703 override the default drive address settings assigned by the centerplane slot position, as shown in Table 8-2.

Table 8-2 SCSI Disk Board Disk Addressing Override Jumper Configurations

JUMPER  PINS   SETTING      DESCRIPTION
J0702   1-2    Out          Disk 0 default address selection
        1-2    In           Disk 0 manual address selection
        A0-A3  As required  Disk 0 address select
J0703   1-2    Out          Disk 1 default address selection
        1-2    In           Disk 1 manual address selection
        A0-A3  As required  Disk 1 address select
J0705   1-2    As required  Disk 0 delay spin
J0706   1-2    As required  Disk 1 delay spin
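The default addresses in Table 8-1 can be expressed as a simple lookup, which also makes it easy to spot two disk boards whose default addresses would coincide. The Python sketch below copies the table values; the function names are hypothetical, and treating a shared address as a reason to use the override jumpers is an assumption that applies when the boards are daisy-chained on one SCSI bus.

```python
from collections import defaultdict

# Slot -> (disk 0 address, disk 1 address), copied from Table 8-1.
DEFAULT_SCSI_IDS = {
    0: (4, 5), 1: (6, 7), 2: (0, 1), 3: (10, 11),
    4: (2, 3), 5: (12, 13), 6: (8, 9), 7: (14, 15),
    8: (10, 11), 9: (0, 1), 10: (12, 13), 11: (2, 3),
    12: (14, 15), 13: (8, 9), 14: (0, 1), 15: (10, 11),
}


def default_addresses(slot):
    """Default SCSI addresses of disk 0 and disk 1 for a board in this slot."""
    return DEFAULT_SCSI_IDS[slot]


def address_clashes(slots):
    """Addresses claimed by more than one of the given slots (same SCSI bus assumed)."""
    claimed = defaultdict(list)
    for slot in slots:
        for address in DEFAULT_SCSI_IDS[slot]:
            claimed[address].append(slot)
    return {addr: users for addr, users in claimed.items() if len(users) > 1}


if __name__ == "__main__":
    print(default_addresses(14), default_addresses(15))
    print(address_clashes([14, 15]))
    print(address_clashes([3, 8]))
```

Boards in slots 14 and 15 do not clash by default, whereas boards in slots 3 and 8 would both claim addresses 10 and 11.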


Disk Addressing

You can type a complete physical path name or a complete logical path

name to specify the device or controller. How Solaris derives device

addresses is covered in the upcoming Solaris module. In this module,

you are given sample addresses both for SCSI devices and FC-AL

devices.

● Physical addresses are designed to follow a hardware tree to a specific device.

● Logical addresses allow applications to point to a specific device on a specific bus.

● Solaris performs the translation between logical and physical addresses transparently to the end user.

Examples

A typical physical path name for a disk device is:

/sbus@3,0/SUNW,fas@3,880000/sd@0,0:a,raw

or

/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037000f96,0:a,raw

A typical logical path name is:

c2t1d0s1

Additional information on addressing that is specific to the server type

is covered with the individual servers.
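A logical name such as c2t1d0s1 follows the Solaris cXtYdZsN convention: controller, target, disk (LUN) and slice. The Python sketch below splits a name into those parts; the function name and the returned dictionary keys are illustrative.

```python
import re

# Solaris logical disk names follow cXtYdZsN:
# controller, target, disk (LUN) and slice numbers.
LOGICAL_NAME = re.compile(r"^c(\d+)t(\d+)d(\d+)s(\d+)$")


def parse_logical(name):
    """Split a logical disk name such as c2t1d0s1 into its numeric parts."""
    match = LOGICAL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a cXtYdZsN name: {name!r}")
    controller, target, disk, slice_no = (int(part) for part in match.groups())
    return {
        "controller": controller,
        "target": target,
        "disk": disk,
        "slice": slice_no,
    }


if __name__ == "__main__":
    print(parse_logical("c2t1d0s1"))
```

For c2t1d0s1 this yields controller 2, target 1, disk 0, slice 1.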


Sun Enterprise 3500

Enterprise 3500 Fibre Channel Interface Board

This is a new board designed to provide connectivity to the internal

disk drives in the Sun Enterprise 3500 server. The internal disk drives

operate with the fibre channel arbitrated loop (FC-AL) architecture.

Each of the four potential FC-AL loops corresponds to one of four

gigabit interface converter (GBIC) modules on the Fibre channel

interface board.

Figure 8-1 Sun Enterprise 3500 Fibre Channel Interface Board

The Fibre channel interface board comes with two hot-pluggable GBIC

modules. The 2-meter fibre channel cables establish a loop or

connection with the internal disk drives. This board is part of the

standard internal disk drive option. If no internal drives are ordered,

this board is not present.

Table 8-3 GBIC to Disk Drive Bay and Drive Port Connection

Disk Drives   Drive Port   GBIC name
0, 1, 2, 3    A            GBIC LA (lower bank)
0, 1, 2, 3    B            GBIC LB (lower bank)
4, 5, 6, 7    A            GBIC UA (upper bank)
4, 5, 6, 7    B            GBIC UB (upper bank)

[Figure 8-1 labels: GBIC LA, GBIC LB, GBIC UA, GBIC UB. Part Number 501-4820]


The Sun Enterprise 3500 can be ordered without internal disk drives.

Any of the bootable external Sun StorEdge disk products (such as the

Sun StorEdge UniPack, MultiPack, D1000, A3500, and A5X00

products) can be used as a boot device for a Sun Enterprise 3500

without internal disk drives. Such a configuration does not require an

FC-AL Interface board because the FC-AL Interface board’s only

purpose is to connect to internal disk bays.

The interface boards can be connected to the SBus I/O Board and the Graphics I/O Board, which both come with a pair of on-board 100 MB/second FC-AL sockets. In addition, both types of boards support an SBus Host Adapter that has a pair of 100 MB/second FC-AL sockets. Each of these pairs of sockets can support the internal disk drives in the Sun Enterprise 3500 or the Sun StorEdge A5000, but they cannot be split up so that one socket supports one type of device while the other socket supports a different type of device.

However, a PCI-only configuration in a Sun Enterprise 3500 does not

provide a way to connect the internal FC-AL disk drives. This is

because the PCI I/O Board does not have on-board FC-AL sockets and

there currently is no PCI FC-AL card available. So, if you want to use

the internal disk drives in the Sun Enterprise 3500, you must have at

least one SBus I/O or one Graphics I/O Board installed. There are no

plans to add on-board FC-AL sockets to the PCI I/O Board because

there is not enough physical space on the board to accommodate

on-board FC-AL sockets.

Even though the FC-AL connection cannot be split between internal

and external connection, the individual FC-AL connections on the

FC-AL Interface board are logically independent. The components do

get their power through a single connection. However, the power to

the FC-AL Interface board comes from the backplane which is

supported by redundant power supplies. Therefore the design has

practically no single point of failure.


Fibre Channel Interface Board

The FC-AL board comes with two GBIC modules and one 2-meter

fibre channel cable to establish one loop (connection).

Figure 8-2 Basic FC-AL Loop

One GBIC module is installed on the FC-AL Interface board and, typically, the other is installed on the I/O board (or SBus card), leaving three empty FC-AL sockets on the FC-AL Interface board. Each additional loop requires two additional GBIC modules and one 2-meter fibre channel cable. The GBIC modules on the FC-AL Interface board are exactly the same as those used in the Sun StorEdge A5X00 arrays, FC-AL SBus Host Adapter, and on the SBus I/O board.

[Figure 8-2 labels: SBus I/O board (Gigaplane bus connector, address control, data control, UPA bus, three SBus cards, SCSI, Ethernet) connected by a fibre channel cable to the Interface board, whose GBICs lead to the lower disk bays and to the upper disk bays.]


Sun Enterprise 3500 - Disk Addressing

A typical configuration, as illustrated in the Sun Enterprise 3500 Disk Configuration figure, takes advantage of the dual-ported capability of the Sun Enterprise 3500 disk structure. Having two paths to each disk eliminates the path to disk as a single point of failure.

Sun Enterprise 3500 Disk Configuration

In the Sun Enterprise 3500, the lower four drives are configurable as one group of disks, or they can be accessed as two smaller independent groups of disks. The configuration is application dependent. The same is true for the upper four disk bays.

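[Figure labels: two I/O boards and the interface board (IB) with GBICs LA, LB, UA and UB; disk bays 0, 1, 2, 3 and 4, 5, 6, 7; loop IDs ef, e8, e4, e2 and e1, e0, dc, da.]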


probe-fcal-all

A new command has been introduced to look at the FC-AL disk on an

E3500.

{e} ok probe-fcal-all
/sbus@6,0/SUNW,socal@d,10000/sf@1,0
/sbus@6,0/SUNW,socal@d,10000/sf@0,0
WWN 200d080020940232 Loopid 1
WWN 21000020370cbc0e Loopid e1
Disk SEAGATE ST19171FCSUN9.0G117E9804P938
/sbus@2,0/SUNW,socal@d,10000/sf@1,0
/sbus@2,0/SUNW,socal@d,10000/sf@0,0
WWN 2005080020940232 Loopid 1
WWN 21000020370d8ad0 Loopid ef
Disk SEAGATE ST19171FCSUN9.0G117E9814T324

Each disk in an E3500 has an independent world-wide number

(WWN). These numbers are assigned by the manufacturer and are

unique to the disk. The FC-AL specification states that each

component in a fibre channel loop must have a unique WWN. This

includes the interface boards.

The WWN of the IBs is derived from the host MAC address, in this

case 8:00:20:94:02:32
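The relationship between the interface board WWNs and the host MAC address can be checked by hand: the last 12 hex digits of 200d080020940232 and 2005080020940232 are both 080020940232. A minimal Python sketch of that check follows; the function name is hypothetical, and it assumes only that the MAC address occupies the trailing 48 bits of the 64-bit WWN, as the two values above suggest.

```python
def mac_from_ib_wwn(wwn: str) -> str:
    """Format the trailing 48 bits of a 16-digit hex WWN as a MAC address."""
    digits = wwn.strip().lower()
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"expected 16 hex digits, got {wwn!r}")
    tail = digits[4:]  # drop the leading 16 bits, keep the last 12 hex digits
    return ":".join(tail[i:i + 2] for i in range(0, 12, 2))
```

Both interface board WWNs in the probe-fcal-all output reduce to the same MAC address, while the disk WWNs do not.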

The WWN is mapped to a logical path at install time.

Do a long listing on the logical path to view how the numbers relate.

# ls -l /dev/dsk/c0t0d0s0

lrwxrwxrwx 1 root root 74 Jan 22 15:00 /dev/dsk/c0t0d0s0 ->
../../devices/sbus@2,0/SUNW,socal@d,10000/ssd@w21000020370d8ad0,0:a
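To see how the numbers relate, the link target can be split into its parts: the WWN follows ssd@w, the logical unit follows the comma, and the trailing letter names the slice (a corresponds to s0). The following Python sketch pulls those fields out of a link target string; the function name and the returned dictionary keys are illustrative assumptions.

```python
import re

# Matches the final component of an FC-AL disk path, e.g.
# ssd@w21000020370d8ad0,0:a
_SSD_RE = re.compile(r"ssd@w([0-9a-f]{16}),([0-9a-f]+):([a-h])")


def parse_ssd_target(target: str):
    """Return the WWN, LUN, and slice number found in a /devices link target."""
    match = _SSD_RE.search(target)
    if match is None:
        return None
    wwn, lun, letter = match.groups()
    return {
        "wwn": wwn,
        "lun": int(lun, 16),
        "slice": ord(letter) - ord("a"),  # a -> 0, b -> 1, ...
    }
```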

Fortunately, we don’t have to boot the device using the WWN. We can

boot using the disk id.

ok boot /sbus@2,0/SUNW,socal@d,10000/ssd@0,0

The proper approach is to put the above in the boot-device parameter

of the NVRAM and then boot from the alias

ok devalias disk

disk=/sbus@2,0/SUNW,socal@d,10000/ssd@0,0

ok boot disk

Sun Enterprise 3500 - Boot Disk Replacement

A host that boots from a non-mirrored FCAL disk (either an A5000 or

the E3500 internal disks) will have to overcome the hard-coded World

Wide Number (WWN) that each of these disks uses as an integral part

of their device path.

On failure of the boot disk the systems administrator must ensure that

this WWN is correctly updated throughout the system to ensure it will

reboot.

Procedure

When the boot disk is replaced, and a system is booted from CD-ROM,

a device tree is built in memory as part of the boot sequence.

But, when the data is restored from a backup tape, the old path_to_inst file, with the old WWN, is put back on the disk.

To recover, mount the root filesystem which you have now restored on

/a. Run the following commands to re-build the devices tree:

# drvconfig -r /a -p /a/etc/path_to_inst
# cd /devices
# find . -print | cpio -pduVm /a/devices
# disks -r /a
# devlinks -r /a

NOTE: It is currently necessary to use both "drvconfig" and "find |

cpio" due to bugid 4161768, drvconfig does not work properly with

socal disks.
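After the device tree has been rebuilt, it can be useful to confirm that the restored path_to_inst no longer refers to the WWN of the failed disk. The following Python sketch scans the file contents for ssd@w entries and reports any WWN that is not in a list of disks known to be present (for example, taken from probe-fcal-all). It is only a sanity check under stated assumptions: the function name is hypothetical, and the sample entries used to exercise it are made up for illustration.

```python
import re

_WWN_RE = re.compile(r"ssd@w([0-9a-f]{16})")


def stale_wwns(path_to_inst_text: str, present_wwns) -> list:
    """List WWNs referenced in path_to_inst that are not currently present."""
    present = {w.lower() for w in present_wwns}
    stale = []
    for wwn in _WWN_RE.findall(path_to_inst_text.lower()):
        if wwn not in present and wwn not in stale:
            stale.append(wwn)  # keep first-seen order, no duplicates
    return stale
```

An empty result suggests that every FC-AL disk entry in the file corresponds to a disk that is actually on the loop.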

Restore the other filesystems on that disk, or comment out the entries

for them from /a/etc/vfstab. At least you must have all the Solaris

filesystems (root, /var, /usr, /opt, etc.) recovered.

Reboot the system from the recovered disk.

For full details, see Internal SRDB 17658

Sun Enterprise 3500 - Data Disk Replacement

We will still have to overcome the hard-coded World Wide Number

(WWN) that each of these disks uses as an integral part of their device

path.

Procedure

Ensure that the following patches, or later revisions, are installed:

Solaris 2.6

sf/socal/ib/luxadm patch - 105375-10

ssd patch - 105356-08

Solaris 2.5.1

sf/socal/ib/luxadm patch - 105310-08

ssd patch - 104708-16

These provide support for the luxadm commands on the E3500.
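A patch identifier has the form base-revision, so "installed or higher" means the same base number with an equal or larger revision. The Python sketch below expresses that comparison; the function name is hypothetical, and obtaining the installed patch list (for example from showrev -p output) is left outside the sketch.

```python
def patch_at_least(installed: str, required: str) -> bool:
    """True if 'installed' is the same patch as 'required' at an equal or later revision."""
    inst_base, inst_rev = installed.split("-")
    req_base, req_rev = required.split("-")
    return inst_base == req_base and int(inst_rev) >= int(req_rev)
```

For example, 105375-12 satisfies the Solaris 2.6 requirement of 105375-10, while 105375-09 does not.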

Unmount the disk and then stop it with

# luxadm stop <logical path, physical path or WWN ...>

Remove the device entries with the following command:

# luxadm remove_device <logical path, physical path or WWN ...>

Replace the disk and then

# luxadm insert <no arguments required>

This recreates the device entries; the device is now ready to be used.

For full details, see Internal SRDB 18595

Sun Enterprise 3000 Disk Addressing

As you look at the front of an E3000, the top four disks are assigned
SCSI targets 0-3, and the bottom six disks are assigned SCSI targets 10-15.
Note that the system addresses the disks in hex, so targets 10-15 appear as a-f in device paths.

Note: All ten drives plus the tape unit and CD-ROM are driven from

the onboard scsi controller in slot 1.
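Because the physical device path carries the target in hex while the logical name carries it in decimal, a disk such as c0t12d0 appears as sd@c,0 in its device path. A short Python sketch of the conversion follows; the function names are illustrative only.

```python
def sd_unit_address(target: int, lun: int = 0) -> str:
    """Return the sd unit address as it appears in a physical device path."""
    return f"sd@{target:x},{lun:x}"


def logical_disk_name(controller: int, target: int, lun: int = 0) -> str:
    """Return the decimal cNtNdN form used under /dev/dsk."""
    return f"c{controller}t{target}d{lun}"
```

For the bottom six E3000 bays, targets 10 through 15 become sd@a,0 through sd@f,0.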

I/O Addressing Test

The following output has been generated from an E3500.

Outline all the boards within the system, with part numbers.

You may assume that we have 400MHz processors.

{e} ok show-disks

a) /pci@b,4000/SUNW,isptwo@3/sd
b) /sbus@7,0/SUNW,fas@0,8800000/sd
c) /sbus@7,0/SUNW,fas@3,8800000/sd
d) /sbus@6,0/SUNW,socal@d,10000/sf@1,0/ssd
e) /sbus@6,0/SUNW,socal@d,10000/sf@0,0/ssd
f) /sbus@3,0/SUNW,fas@3,8800000/sd
g) /sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd
h) /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd
q) NO SELECTION

Solaris Support Utilities 9

How Solaris References System Components

In the Solaris 2.x and 7 operating environments, system components

are referenced in three different ways:

● Logical device names – Names used by system administrators and

software to access system resources.

● Physical device names – Names that represent the full device path

name in the device information hierarchy (or tree).

● Instance names – The kernel’s abbreviated names for every

possible device on the system. dmesg displays instance names,

such as sd0 and sd1.

Logical Device Names

These names are symbolically linked to their corresponding physical
device (/devices) names. The logical names are located in the /dev directory and are created at the same time as the physical names.

It is important to remember that in most cases, software applications

and system administrators view system resources (such as disk)

through their logical names. When a system fault occurs, it might be

necessary to translate a device’s logical name to some physical

identifier so that you can repair the problem. The next few pages will

show you the relationship between the logical name and the physical

name.

The following examples show the logical names of a diskette drive and

hard disk drive 0.

# ls /dev/diskette*
/dev/diskette
/dev/diskette0

# ls /dev/rdsk/c0t0d0*
c0t0d0s0 c0t0d0s1 c0t0d0s2 c0t0d0s3
c0t0d0s4 c0t0d0s5 c0t0d0s6 c0t0d0s7

Figure 9-1 shows the relationship of the hard disk drive logical name

syntax to traditional SCSI components.

Figure 9-1 Logical Name Syntax

/dev/[r]dsk/c#t#d#s#

c# - Controller number
t# - Target number
d# - Disk or logical unit number (LUN)
s# - Slice or partition number
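The logical name syntax is regular enough to take apart mechanically. The following is a minimal Python sketch that splits a /dev/[r]dsk name into its controller, target, disk, and slice numbers; the DiskName type and function name are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional


class DiskName(NamedTuple):
    raw: bool        # True for /dev/rdsk, False for /dev/dsk
    controller: int
    target: int
    disk: int        # disk or logical unit number (LUN)
    slice_no: int    # slice or partition number


_NAME_RE = re.compile(r"^/dev/(r?)dsk/c(\d+)t(\d+)d(\d+)s(\d+)$")


def parse_logical_name(path: str) -> Optional[DiskName]:
    """Split a logical disk name into its numbered fields, or return None."""
    match = _NAME_RE.match(path)
    if match is None:
        return None
    raw, c, t, d, s = match.groups()
    return DiskName(raw == "r", int(c), int(t), int(d), int(s))
```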

Physical Device Names

The physical names are located in the /devices directory, where the
entries are created during installation, during subsequent automatic device
configuration, or by using the drvconfig command. The device file
provides a pointer to the kernel device drivers.

● The following examples show the relationship of the diskette drive

and hard disk drive 0 physical names to their logical names.

Note – The following example is from an Enterprise 450.

# ls -l /dev/diskette*
lrwxrwxrwx 1 root root 49 Aug 5 13:52 /dev/diskette ->
/devices/pci@1f,4000/ebus@1/fdthree@14,3023f0:c
lrwxrwxrwx 1 root root 49 Aug 5 13:52 /dev/diskette0 ->
/devices/pci@1f,4000/ebus@1/fdthree@14,3023f0:c

# ls -l /dev/rdsk/c0t0d0s0
lrwxrwxrwx 1 root root 45 Aug 5 13:52 /dev/rdsk/c0t0d0s0 ->
/devices/pci@1f,4000/scsi@3/sd@0,0:a,raw

● The next two examples show the corresponding OBP device tree and devalias entries for the same two devices.

ok show-devs
.
/pci@1f,4000/ebus@1/fdthree@14,3023f0:c
.
/pci@1f,4000/scsi@3/disk
.

ok devalias
.
floppy /pci@1f,4000/ebus@1/fdthree
.
disk0 /pci@1f,4000/scsi@3/sd@0,0
.

Instance Names

In the Solaris 2.x and 7 environments, the instance name is bound to

the physical name by references in the /etc/path_to_inst file.

The device instance is the number on the right side of the file (the

number is in bold in the displayed output for each device in the

following example). The kernel uses these names to identify every

possible device instance.

The instance numbers are assigned in order of insertion/configuration

and therefore do not necessarily follow any recognizable or usable

pattern. However, they do map to groupings of the minor device

numbers listed in the /devices/... sub-directories.

The following example shows the entries in the /etc/path_to_inst file for the same diskette drive and hard disk drive 0 seen earlier.

"/pci@1f,4000/ebus@1/fdthree@14,3023f0" 0 "fd"
"/pci@1f,4000/scsi@3/sd@0" 0 "sd"
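Each path_to_inst entry holds three fields: the quoted physical path, the instance number, and the quoted driver name. Joining the driver name and instance number gives the instance name that dmesg reports, such as sd0. The Python sketch below builds that mapping from the file contents; the function name is hypothetical, and the sketch assumes plain double quotes and skips comment lines beginning with #.

```python
import re

# "physical path" instance-number "driver"
_ENTRY_RE = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"\s*$')


def instance_names(path_to_inst_text: str) -> dict:
    """Map instance names (driver + number) to physical device paths."""
    names = {}
    for line in path_to_inst_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _ENTRY_RE.match(line)
        if match:
            path, instance, driver = match.groups()
            names[driver + instance] = path
    return names
```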

Configuring Components in Solaris

Automatic Device Configuration

The kernel, consisting of a small generic core with a platform-specific

component and a set of modules, is configured automatically in the

Solaris environment.

A kernel module is a hardware or software component that is used to

perform a specific task on the system. An example of a loadable kernel

module is a device driver that is loaded when the device is accessed.

The system determines what devices are attached to it at boot time.

Then the kernel configures itself dynamically, loading needed modules

into memory. At this time, device drivers are loaded when devices,

such as disk and tape devices, are accessed for the first time. This

process is called autoconfiguration because all kernel modules are

loaded automatically when needed.

Adding New Components to Solaris

Note – Use the following procedure only when configuring components
that are not hot-pluggable or when Dynamic Reconfiguration is unavailable.

If Solaris is running, perform the following steps:

1. Become superuser.

2. Create the /reconfigure file.

# touch /reconfigure

The /reconfigure file causes the Solaris software to check for the

presence of any newly installed devices the next time you turn on

or boot your system.

3. Shut down the system.

# shutdown -i0 -g30 -y

4. Turn off power to the system after it is shut down.

5. Turn off the system.

6. Install the device.

7. Turn on the power to the system.

The system will boot to multiuser mode and the login prompt will

be displayed.

8. Verify that the device has been configured.

Note – If the system is in OBP, execute the boot -r command to force

a Solaris reconfiguration.

How to Add a Device Driver

This procedure assumes that the device has already been added to the

system.

1. Become superuser.

2. Place the tape, diskette, or CD-ROM into the appropriate drive.

3. Use the pkgadd command to install the driver.

# pkgadd -d device package-name

where

-d device

Identifies the device pathname.

package-name

Identifies the package name that contains the device driver.

4. Verify that the package has been added correctly by using the

pkgchk command. The system prompt returns with no response if

the package is installed correctly.

# pkgchk package-name

Displaying System Configuration Information - prtconf, sysdef, and format

Solaris provides you with a variety of utilities that you can use to

monitor Sun Enterprise systems. The following is a list of utilities to

display system and device configuration information:

● prtconf – Displays system configuration information, including

total amount of memory and the device configuration as described

by the system’s device hierarchy. The output displayed by this

command depends upon the type of system.

● sysdef – Displays device configuration information including

system hardware, pseudo devices, loadable modules, and selected

kernel parameters.

● format – Displays both logical and physical device names.

The prtconf Utility

The following prtconf output is displayed on an Enterprise 450
system. To execute the prtconf command, type the following:

# /usr/sbin/prtconf
System Configuration: Sun Microsystems sun4u
Memory size: 256 Megabytes
System Peripherals (Software Nodes):

SUNW,Ultra-4
    packages (driver not attached)
        terminal-emulator (driver not attached)
        deblocker (driver not attached)
        obp-tftp (driver not attached)
        disk-label (driver not attached)
        ufs-file-system (driver not attached)
    openprom (driver not attached)
        client-services (driver not attached)
    options, instance #0
    aliases (driver not attached)
    memory (driver not attached)
    virtual-memory (driver not attached)
    associations
        slot2disk
        slot2led

        slot2dev
    pci, instance #0
        ebus, instance #0
            auxio (driver not attached)
            power, instance #0 (driver not attached)
            SUNW,pll (driver not attached)
            sc (driver not attached)
            se, instance #0
            su, instance #0
            su, instance #1
            ecpp, instance #0 (driver not attached)
            fdthree, instance #0
            eeprom (driver not attached)
            flashprom (driver not attached)
.
.

The sysdef Utility

The following sysdef output is displayed on an Enterprise 450 system.

To execute the sysdef command, type the following:

# /usr/sbin/sysdef
*
* Hostid
*
  8095febb
*
* sun4u Configuration
*
*
* Devices
*
packages (driver not attached)
    terminal-emulator (driver not attached)
    deblocker (driver not attached)
    obp-tftp (driver not attached)
    disk-label (driver not attached)
    ufs-file-system (driver not attached)
openprom (driver not attached)
    client-services (driver not attached)
options, instance #0
aliases (driver not attached)
memory (driver not attached)
virtual-memory (driver not attached)
associations (driver not attached)
    slot2disk (driver not attached)
    slot2led (driver not attached)
    slot2dev (driver not attached)
counter-timer (driver not attached)
pci, instance #0
    ebus, instance #0
        auxio (driver not attached)
        power, instance #0 (driver not attached)
        SUNW,pll (driver not attached)
        sc (driver not attached)
        se, instance #0
        su, instance #0
        su, instance #1
        fdthree, instance #0
        eeprom (driver not attached)
        flashprom (driver not attached)

        SUNW,envctrl, instance #0
    network, instance #0 (driver not attached)
    scsi, instance #0
        disk (driver not attached)
        tape (driver not attached)
        sd, instance #0
        sd, instance #1
        sd, instance #2
        sd, instance #3
        sd, instance #4 (driver not attached)
        sd, instance #5 (driver not attached)
        sd, instance #6 (driver not attached)
        sd, instance #7 (driver not attached)
        sd, instance #8 (driver not attached)
        sd, instance #9 (driver not attached)
        sd, instance #10 (driver not attached)
        sd, instance #11 (driver not attached)
        sd, instance #12 (driver not attached)
        sd, instance #13 (driver not attached)
        sd, instance #14 (driver not attached)
    scsi, instance #1
        disk (driver not attached)
        tape (driver not attached)
        sd, instance #15
        sd, instance #16
        sd, instance #17
        sd, instance #18
        sd, instance #19 (driver not attached)
        sd, instance #20 (driver not attached)
        sd, instance #21 (driver not attached)
        sd, instance #22 (driver not attached)
        sd, instance #23 (driver not attached)
        sd, instance #24 (driver not attached)
        sd, instance #25 (driver not attached)
        sd, instance #26 (driver not attached)
        sd, instance #27 (driver not attached)
        sd, instance #28 (driver not attached)
        sd, instance #29 (driver not attached)
pci, instance #1
mc (driver not attached)
    bank (driver not attached)
        dimm (driver not attached)
        dimm (driver not attached)
        dimm (driver not attached)
        dimm (driver not attached)
    bank (driver not attached)

    bank (driver not attached)
    bank (driver not attached)
SUNW,UltraSPARC-II (driver not attached)
pci, instance #2
pci, instance #3
pci, instance #4
    SUNW,m64B, instance #0
pci, instance #5
pseudo, instance #0
    clone, instance #0
    ip, instance #0
    tcp, instance #0
.
.
*
* Loadable Objects
*
* Loadable Object Path = /platform/SUNW,Ultra-4/kernel
*
misc/platmod
misc/sparcv9/platmod
*
* Loadable Object Path = /platform/sun4u/kernel
*
cpu/sparcv9/SUNW,UltraSPARC-II
cpu/sparcv9/SUNW,UltraSPARC-IIi
cpu/sparcv9/SUNW,UltraSPARC
*
* Loadable Object Path = /kernel
*
drv/isp
drv/log
drv/le
.
.
*
* Loadable Object Path = /usr/kernel
*
drv/sparcv9/tnf
drv/sparcv9/audiocs
drv/sparcv9/dbri
strmod/u8lat2
*
* System Configuration
*
  swap files

swapfile             dev  swaplo blocks   free
/dev/dsk/c0t0d0s1   32,1      16 308800 308800
*
* Tunable Parameters
*
 5095424 maximum memory allowed in buffer cache (bufhwm)
    3898 maximum number of processes (v.v_proc)
      99 maximum global priority in sys class (MAXCLSYSPRI)
    3893 maximum processes per user id (v.v_maxup)
      30 auto update time limit in seconds (NAUTOUP)
      25 page stealing low water mark (GPGSLO)
       5 fsflush run rate (FSFLUSHR)
      25 minimum resident memory for avoiding deadlock (MINARMEM)
      25 minimum swapable memory for avoiding deadlock (MINASMEM)
.
.

The format Utility

The format utility is normally used to prepare a disk drive for access

by the Solaris operating system. Maintenance personnel also use this

utility as a visibility tool to determine which disk drives can be “seen”

by Solaris. To execute the format command, type the following:

# format
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /pci@1f,4000/scsi@3/sd@0,0
       1. c0t3d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /pci@1f,4000/scsi@3/sd@3,0
Specify disk (enter its number):

This format example identifies two 9-Gbyte SCSI disk drives (sd@0,0 and sd@3,0).

Note – Press Control-d to exit the format utility.

Here is a rather more realistic example:

# format
AVAILABLE DISK SELECTIONS:
       0. c0t12d0 <SUN2.1G cyl 2733 alt 2 hd 19 sec 80>
          /sbus@3,0/SUNW,fas@3,8800000/sd@c,0
       1. c0t13d0 <SUN2.1G cyl 2733 alt 2 hd 19 sec 80>
          /sbus@3,0/SUNW,fas@3,8800000/sd@d,0
       2. c1t0d0 <SUN2.1G cyl 2733 alt 2 hd 19 sec 80>
          /sbus@7,0/SUNW,fas@3,8800000/sd@0,0
       3. c1t1d0 <SUN2.1G cyl 2733 alt 2 hd 19 sec 80>
          /sbus@7,0/SUNW,fas@3,8800000/sd@1,0
       4. c2t4d0 <SYMBIOS-RSMArray2000-0204 cyl 8182 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,0
       5. c2t4d1 <SYMBIOS-RSMArray2000-0204 cyl 8182 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,1
       6. c2t4d2 <SYMBIOS-RSMArray2000-0204 cyl 8182 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,2
       7. c3t5d3 <SYMBIOS-RSMArray2000-0204 cyl 8182 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@3/rdriver@5,3
       8. c3t5d4 <SYMBIOS-RSMArray2000-0204 cyl 8182 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@3/rdriver@5,4
       9. c3t5d5 <SYMBIOS-RSMArray2000-0204 cyl 8182 alt 2 hd 64 sec 64>

          /pseudo/rdnexus@3/rdriver@5,5
      10. c4t5d0 <SYMBIOS-RSMArray2000-0205 cyl 8108 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@4/rdriver@5,0
      11. c4t5d2 <SYMBIOS-RSMArray2000-0205 cyl 8106 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@4/rdriver@5,2
      12. c4t5d3 <SYMBIOS-RSMArray2000-0205 cyl 8106 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@4/rdriver@5,3
      13. c4t5d4 <SYMBIOS-RSMArray2000-0205 cyl 8108 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@4/rdriver@5,4
      14. c5t4d1 <SYMBIOS-RSMArray2000-0205 cyl 8106 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@5/rdriver@4,1

Specify disk (enter its number):

Displaying Diagnostic Information

In addition to monitoring utilities, Solaris provides you with

commands that you can use to display diagnostic information. The

following commands are used for this purpose:

● dmesg – Looks in a system buffer for recently printed diagnostic

messages and prints them on the standard output.

● prtdiag – Displays system configuration and diagnostic
information. The diagnostic information lists any failed Field
Replaceable Units (FRUs) in the system.

Note – /var/adm/messages – Contains error messages relative to the

current operating system initialization.

The dmesg Command

The following dmesg output is from an Enterprise 450 system. To

execute the dmesg command, type the following:

# /usr/sbin/dmesg
Mon Aug 9 12:50:07 MDT 1999
Aug 5 14:02:31 proto144 unix: pseudo-device: winlock0
Aug 5 14:02:31 proto144 unix: winlock0 is /pseudo/winlock@0
Aug 5 14:02:31 proto144 unix: pseudo-device: devinfo0
Aug 5 14:02:31 proto144 unix: devinfo0 is /pseudo/devinfo@0
Aug 5 14:02:32 proto144 unix: pseudo-device: vol0
Aug 5 14:02:32 proto144 unix: vol0 is /pseudo/vol@0
Aug 5 14:02:32 proto144 unix: pseudo-device: llc10
Aug 5 14:02:32 proto144 unix: llc10 is /pseudo/llc1@0
Aug 5 14:02:32 proto144 unix: pseudo-device: pm0
Aug 5 14:02:32 proto144 unix: pm0 is /pseudo/pm@0
Aug 5 14:02:32 proto144 unix: pseudo-device: tod0
Aug 5 14:02:32 proto144 unix: tod0 is /pseudo/tod@0
Aug 5 14:02:32 proto144 unix: ecpp0 at ebus0: offset 14,3043bc
Aug 5 14:02:32 proto144 unix: ecpp0 is /pci@1f,4000/ebus@1/ecpp@14,3043bc
Aug 5 14:02:59 proto144 unix: SUNW,hme0: Link Down - cable problem?
Aug 5 14:03:09 proto144 last message repeated 2 times
Aug 6 10:07:50 proto144 unix: syncing file systems...
Aug 6 10:07:50 proto144 unix: done

Aug 6 10:08:37 proto144 unix: SunOS Release 5.7 Version Generic_106541-04 64-bit [UNIX(R) System V Release 4.0]
Aug 6 10:08:37 proto144 unix: Copyright (c) 1983-1999, Sun Microsystems, Inc.
Aug 6 10:08:37 proto144 unix: Ethernet address = 8:0:20:95:fe:bb
Aug 6 10:08:37 proto144 unix: mem = 262144K (0x10000000)
Aug 6 10:16:45 proto144 unix: avail mem = 250568704
Aug 6 10:16:45 proto144 unix: root nexus = Sun Enterprise 450 (UltraSPARC-II 296MHz)
Aug 6 10:16:45 proto144 unix: pci0 at root: UPA 0x1f 0x4000
Aug 6 10:16:45 proto144 unix: pci0 is /pci@1f,4000
Aug 6 10:16:45 proto144 unix: pci1 at root: UPA 0x1f 0x2000
Aug 6 10:16:45 proto144 unix: pci1 is /pci@1f,2000
.
.
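The "name is path" lines in dmesg output tie each instance name to its physical device path, which is often the quickest way to translate a kernel message back to a piece of hardware. The following Python sketch collects those pairs from saved dmesg text; the function name is hypothetical, and it assumes the message layout shown in the example above.

```python
import re

# Matches lines such as:
#   Aug 6 10:16:45 proto144 unix: pci0 is /pci@1f,4000
_IS_RE = re.compile(r"unix: (\S+) is (/\S+)$")


def instance_paths(dmesg_text: str) -> dict:
    """Map instance names to physical paths from 'X is /path' dmesg lines."""
    paths = {}
    for line in dmesg_text.splitlines():
        match = _IS_RE.search(line.strip())
        if match:
            paths[match.group(1)] = match.group(2)
    return paths
```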

The prtdiag Command

To execute the prtdiag command, type the following:

# /usr/platform/platform-name/sbin/prtdiag -v

Note – The command options are -l , which logs information to disk if

any error is found, and -v , which provides verbose output.

The following is an example of a prtdiag command output.

● CPU

========================= CPUs =========================
                   Run   Ecache   CPU    CPU
Brd  CPU  Module   MHz     MB    Impl.   Mask
---  ---  ------   ----   ----   ------  ----
 7    14    0      248    4.0    US-II   2.0
 9    18    0      248    4.0    US-II   2.0
 9     9    1      248    4.0    US-II   2.0

● Memory group

=============================== Memory ==================================
                                            Intrlv   Intrlv.
Brd  Bank   MB   Status   Condition  Speed  Factor   With
---  -----  ---  -------  ---------  -----  -------  ------
 9     0    256  Active   OK         60ns   1-way

● I/O boards

========================= IO Cards =========================
     Bus   Freq
Brd  Type  MHz   Slot  Name                       Model
---  ----  ----  ----  -------------------------  --------
 1   SBus   25     0   DOLPHIN,sci
 1   SBus   25     3   SUNW,hme
 1   SBus   25     3   SUNW,fas/sd (block)
 1   SBus   25    13   SUNW,socal/sf (scsi-3)     501-3060
 3   SBus   25     0   DOLPHIN,sci
 3   SBus   25     3   SUNW,hme
 3   SBus   25     3   SUNW,fas/sd (block)
 3   SBus   25     3   SUNW,socal/sf (scsi-3)     501-3060

● Detached boards

No failures found in System
===========================

● Fatal hardware reset

▼ This information is collected from components after a

hardware failure. This information is useful in determining the

correct FRU to be replaced.

No failures found in System
===========================

● POST-detected failures

No System Faults found
======================

● OS detected system faults

▼ System-detected faults light the yellow LED on the failing
board.

▼ You can repair system-detected faults. These faults are
removed from the display when they are repaired (overtemp,
fan failure, power supply failure).

Most recent AC Power Failure:
=============================
Fri Mar 12 10:44:07 1999

● Environmental display

========================= Environmental Status =========================
Keyswitch position is in Normal Mode
System Power Status: Redundant

System LED Status:    GREEN    YELLOW    GREEN
Normal                ON       OFF       BLINKING

Fans:
-----
Unit   Status
----   ------
Rack   OK
Key    OK
AC     OK

System Temperatures (Celsius):
------------------------------
Brd   State    Current  Min  Max  Trend
---   -------  -------  ---  ---  -----
 1    OK          8      37   38  stable
 3    OK         44      44   44  stable
 7    OK         40      39   41  stable
 9    OK         44      43   45  stable
CLK   OK         35      35   35  stable

Power Supplies:
---------------
Supply                          Status
---------                       ------
0                               OK
1                               OK
2                               OK
3                               OK
PPS                             OK
    System 3.3v                 OK
    System 5.0v                 OK
    Peripheral 5.0v             OK
    Peripheral 12v              OK
    Auxilary 5.0v               OK
    Peripheral 5.0v precharge   OK
    Peripheral 12v precharge    OK
    System 3.3v precharge       OK
    System 5.0v precharge       OK
AC Power                        OK

● Firmware levels

========================= HW Revisions =========================

ASIC Revisions:
---------------
Brd  FHC  AC  SBus0  SBus1  PCI0  PCI1  FEPS  Board Type      Attributes
---  ---  --  -----  -----  ----  ----  ----  ----------      ----------
 0    1    5                                  CPU             98MHz Capable
 1    1    5    1      1                 22   Dual-SBus-SOC+  98MHz Capable
 2    1    5                                  CPU             98MHz Capable
 3    1    5    1      1                 22   Dual-SBus-SOC+  98MHz Capable
 4    1    5                                  CPU             98MHz Capable
 6    1    5                                  CPU             98MHz Capable

System Board PROM revisions:
----------------------------
Board 0: OBP 3.2.24 1999/12/23 17:31 POST 3.9.24 1999/12/23 17:35
Board 1: FCODE 1.8.24 1999/12/23 17:30 iPOST 3.4.24 1999/12/23 17:34
Board 2: OBP 3.2.24 1999/12/23 17:31 POST 3.9.24 1999/12/23 17:35
Board 3: FCODE 1.8.24 1999/12/23 17:30 iPOST 3.4.24 1999/12/23 17:34
Board 4: OBP 3.2.24 1999/12/23 17:31 POST 3.9.24 1999/12/23 17:35
Board 6: OBP 3.2.24 1999/12/23 17:31 POST 3.9.24 1999/12/23 17:35

Setting NVRAM Configuration Parameters From Solaris

The eeprom Command

Solaris provides system administrators and service personnel with the

ability to change system configuration parameters in NVRAM so that

they can take effect when the system is restarted. This is accomplished

by using the eeprom command.

The eeprom command displays or changes the values of parameters in

the EEPROM.

It processes parameters in the order given. When processing a

parameter accompanied by a value, eeprom makes the indicated

alteration to the EEPROM; otherwise it displays the parameter’s value.

When given no parameter specifiers, eeprom displays the values of all

EEPROM parameters.

The following are examples of the eeprom commands available:

● To display all configuration parameter settings, type

# eeprom

● To display the current setting of the auto-boot? parameter, type

# eeprom auto-boot?

● To disable boards in slots 3 and 5, type

# eeprom disable-board-list=35

● To set configuration policy to board, type

# eeprom configuration-policy=board
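When eeprom is run with no arguments, each parameter is printed as a name=value line, which makes the output easy to capture and compare before and after a change. The Python sketch below turns saved eeprom output into a dictionary and expands a disable-board-list value into slot numbers. The function names are hypothetical, and the one-hex-digit-per-slot reading of disable-board-list is an assumption based on the "slots 3 and 5" example above.

```python
def parse_eeprom(text: str) -> dict:
    """Parse 'name=value' lines from saved eeprom output into a dict."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # ignore lines without '=' (for example, 'data not available')
            params[name.strip()] = value.strip()
    return params


def disabled_slots(board_list: str) -> list:
    """Expand a disable-board-list value such as '35' into [3, 5].

    Assumption: each character is one slot number written as a hex digit.
    """
    return [int(ch, 16) for ch in board_list]
```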

SunVTS System Diagnostics 10

Introduction

SunVTS Software Overview

SunVTS is Sun’s online validation test suite. With VTS, you can verify

the functionality of most of Sun’s hardware devices. You can use the

SunVTS tests to stress certain areas of the system as needed for

diagnostic and troubleshooting purposes.

The SunVTS diagnostic software is the successor to SunDiag™
diagnostics, which shipped with the Solaris 2.4 operating system and
earlier releases. SunVTS runs on the Solaris 2.5 operating system and
later releases.

Like its SunDiag predecessor, SunVTS software can run concurrently
with customer applications and the Solaris operating system. SunVTS
is a vital part of the Sun Enterprise server concurrent maintenance
strategy.

Test Categories

SunVTS consists of many individual tests that support testing of a
wide range of products and peripherals. Most of the tests are capable
of testing devices in a 32-bit or 64-bit Solaris environment.

Use SunVTS to test one device or multiple devices. Some of the major

test categories are:

● Audio Tests

● Communication (Serial and Parallel) Tests

● Graphic/Video Tests

● Memory Tests

● Network Tests

● Peripherals (Disks, Tape, CD-ROM, Printer, Floppy) Tests

● Processor Tests

● Storage Tests

Hardware and Software Requirements

The following are required to run SunVTS Version 3.1
software successfully in the Common Desktop Environment (CDE):

● The Solaris 7 3/99 operating environment

● The SunVTS 3.1 package

● Operating system kernel configured to support all peripherals to

be tested

● Superuser access to start the SunVTS software

● Connection of loopback connectors, installation of test media, or

the availability of disk space

Note – In this module, all references to SunVTS imply SunVTS 3.1.

Page 251: Sun Enterprise Server Maintenance

Starting the SunVTS Software

The SunVTS program is run when the superuser types one of the

following commands. The /opt/SUNWvts/bin directory needs to be

defined as part of the PATH variable.

● sunvts – Runs the SunVTS kernel and default graphical interface

(CDE) on the local machine

● sunvts -l – Runs the SunVTS kernel and OpenLook graphical

interface on the local machine

● sunvts -t – Runs the SunVTS kernel and the TTY interface (vtstty)

● sunvts -h host_name – Runs the graphical interface on the local

machine while connecting and testing a remote machine

Note – The SUNWvts package and, if needed, the SUNWvtsx package must

be installed on both local and remote machines to perform remote

diagnostics.
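
The PATH requirement above can be met with a small helper. The sketch below is one possible way to do it in a Bourne-compatible shell; the function name path_add is hypothetical and is not part of SunVTS.

```shell
#!/bin/sh
# Hypothetical helper (not part of SunVTS): append a directory to PATH
# only if it is not already listed, so repeated calls do not duplicate it.
path_add() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present, nothing to do
        *) PATH="$PATH:$1" ;;         # append to the end of PATH
    esac
}

path_add /opt/SUNWvts/bin
export PATH
echo "$PATH"
```

Placing these lines in the superuser's shell startup file would make the sunvts command available without typing its full path.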

Page 252: Sun Enterprise Server Maintenance

The SunVTS Graphical Interface

The initial SunVTS graphic menu is shown in Figure 10-1.

Figure 10-1 SunVTS Graphical Interface

Page 253: Sun Enterprise Server Maintenance

The SunVTS Window Panels

The five major panels of the SunVTS window are:

● System Status Panel – Test status, host name, model type, number

of passes and errors, and elapsed time are displayed in the upper

area of the SunVTS menu.

● System Map – This area of the initial menu displays a logical device

view consisting of a selectable list of devices to test by default. You

can turn each device test on or off by clicking on the check box.

You can select particular devices, such as CPUs, network

interfaces, or disks, by clicking on the plus sign box.

● Select Devices – This area of the SunVTS menu enables you to

quickly select the devices to test, including a default set (shown in

Figure 6-2).

● Select mode – A SunVTS test session runs in one of two test modes:

Connection test mode and Functional test mode.

▼ Connection Test Mode

In Connection test mode, the tests determine if the devices are

connected to the system you are testing and if they are

accessible. Functional testing is not done in this mode, but the

devices are accessed to establish system connection and

accessibility.

You can safely run this mode when the system is online. When

SunVTS testing is started in Connection test mode, each test is

run sequentially until all tests are run.

The limited nature of the tests in this mode makes it possible to

run periodic checks for configuration verification on the system.

▼ Functional Test Mode

In Functional test mode, the tests check the operation of the system devices. This mode finds

any faults and exercises the system by running tests to

increase the load and stress on the system.

Do not run critical applications on the system or use the

system for production purposes in Functional test mode.

● Test Messages – This area displays any information or error

messages that are issued during test executions.

Page 254: Sun Enterprise Server Maintenance

The SunVTS Window Icons

Seven icons are provided at the top of the SunVTS menu. These are:

● Start – Begins the test, according to the selections made in the

System map, Select devices, and Select mode areas. Progressive

updates are displayed in the Information Panel during testing.

● Stop – Stops current testing, without exiting SunVTS.

● Reset – Sets the System map area to its previous state.

● Host – Provides a submenu in which you can enter a remote host

name for a test connection. This host must be reachable, with

SunVTS installed.

● Log – Displays the log file, and provides a menu to select the

amount of information to log, including errors, information, and

UNIX messages (/var/adm/messages).

● Meter – Invokes the performance monitor utility, which

graphically displays system resource activity during testing.

● Quit – Exits the SunVTS program.

Page 255: Sun Enterprise Server Maintenance

The SunVTS Menu Selections

The top horizontal bar of the SunVTS menu has four selections with

lists of associated submenus.

● Commands – This menu provides the following commands:

▼ Start testing – Begins testing

▼ Stop testing – Halts testing

▼ Connect to host – Specifies the host name of the target host

▼ Trace test – Selects a test to trace, and a location for the output

▼ Reprobe system – Probes the hardware

▼ Quit VTS – Exits SunVTS

● View – This menu provides two options:

▼ Open System map – Displays full device selection list

▼ Close System map – Displays default device selection list

● Options – The following selections are available:

▼ Thresholds – Specifies number of passes, errors, and time to run

▼ Notify – Specifies a user to mail with test status information

▼ Schedule, Test Execution, and Advanced – Runs specified number of tests with stress, verbose, core file, or run on error option (see the next page)

▼ Option files – Loads, stores, or removes a test options file

Page 256: Sun Enterprise Server Maintenance

● Reports – Two selections are provided:

▼ System configuration – Displays the system configuration report as obtained with the prtconf command

▼ Log files – Displays the log file and allows selection of the level of information to log

Page 257: Sun Enterprise Server Maintenance

The Schedule Options Menu

Clicking on the Schedule option beneath the Options selection on the

horizontal bar of the SunVTS window displays the window in

Figure 10-2.

Figure 10-2 Schedule Options Window

The available options are:

● Auto Start – Runs tests selected in a previously saved option file

using a command-line specification when sunvts is invoked.

● Single Pass – Runs only one pass of each selected test.

● System Concurrency – Specifies the maximum number of tests that

can be run concurrently on the machine being tested.

● Group Concurrency – Specifies the number of tests to be run at the

same time in the same group.

Page 258: Sun Enterprise Server Maintenance

The Test Execution Menu

Clicking on Test Execution beneath the Options selection on the upper

horizontal bar of the SunVTS menu displays the window in

Figure 10-3.

Figure 10-3 Test Execution Options Window

Page 259: Sun Enterprise Server Maintenance

The following is a list of options available in the Test Execution menu:

● Stress – Runs certain tests in stress mode, working the system

harder than normal.

● Verbose – Enables more information to be logged and displayed

during testing.

● Core file – Allows a core dump to be generated in the SunVTS bin directory when abnormal conditions occur. The core file name

format is core.testname.xxxxxx.

● Run on Error – Continues testing until the max_errors value is

reached.

● Max Passes – Specifies the maximum number of passes that tests

can run. A value of zero indicates no limit.

● Max Errors – States the maximum number of errors any test allows

before stopping. A value of zero causes tests to continue regardless

of errors.

● Max Time – Specifies the maximum number of minutes tests are

allowed to run. A value of zero indicates no limit.

● Number of Instances – Specifies the number of instances to run for all

tests that are scalable.
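
The "zero means no limit" rule for Max Passes, Max Errors, and Max Time can be illustrated with a short shell predicate. This is only a sketch of the semantics described above, not SunVTS code; the function name limit_reached is hypothetical.

```shell
#!/bin/sh
# limit_reached CURRENT LIMIT
# Succeeds (exit status 0) when CURRENT has reached LIMIT.
# A LIMIT of 0 means "no limit", so it is never reached.
limit_reached() {
    [ "$2" -ne 0 ] && [ "$1" -ge "$2" ]
}

# Example: 5 passes completed with Max Passes set to 5.
if limit_reached 5 5; then
    echo "max passes reached, testing stops"
fi

# Example: 1000 errors logged with Max Errors set to 0 (unlimited).
if ! limit_reached 1000 0; then
    echo "no error limit set, testing continues"
fi
```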

Page 260: Sun Enterprise Server Maintenance

The Advanced Options Menu

Clicking on Advanced beneath the Options selection on the top horizontal bar of the

SunVTS window displays the window in Figure 10-4.

Figure 10-4 Advanced Options Window

The available options are:

● System Override – Supersedes group and test options in favor of

the options selected in a Global Options window; sets all options on

all test group and test option menus.

● Group Override – Supersedes specific test options in favor of the

group options set in a Group Options window.

● Group Lock – Protects specific group options from being changed

by the options set at the system level. (System Override

supersedes this option.)

● Test Lock – Protects specific test options from being changed by

options set at the group or system level. (System Override and

Group Override supersede this option.)
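
The precedence among these four settings can be sketched as two shell predicates. This is only an illustration of the rules as stated above, not SunVTS code; the function names and the 1 (enabled) / 0 (disabled) flag convention are assumptions made for the example.

```shell
#!/bin/sh
# test_options_protected SYSTEM_OVERRIDE GROUP_OVERRIDE TEST_LOCK
# Succeeds when a test's own options are shielded from group- and
# system-level settings: Test Lock is on and neither override is on.
test_options_protected() {
    [ "$3" -eq 1 ] && [ "$1" -eq 0 ] && [ "$2" -eq 0 ]
}

# group_options_protected SYSTEM_OVERRIDE GROUP_LOCK
# Succeeds when a group's options are shielded from system-level
# settings: Group Lock is on and System Override is off.
group_options_protected() {
    [ "$2" -eq 1 ] && [ "$1" -eq 0 ]
}

# Test Lock on, but Group Override also on: the lock is superseded.
if test_options_protected 0 1 1; then
    echo "test options are protected"
else
    echo "test options can be overwritten"
fi
```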

Page 261: Sun Enterprise Server Maintenance

Intervention Mode

Certain tests require that you intervene before you can run the test

successfully. These include tests that require media or loopback

connectors.

● Loopback connectors are required to run certain tests, such as

serial port tests, successfully.

See the SunVTS Test Reference Manual for more information about

loopback connectors, and which tests need them.

● Media (tapes, diskettes, or CD-ROMs) must be present in the

drive(s) before the system is probed at SunVTS startup. If this is

not done, the following error message is displayed:

Using old or damaged tapes and diskettes may cause errors in corresponding tests.

You cannot select these tests until you enable the intervention mode.

This setting reminds you that you must intervene before the test can

be successfully completed.

Page 262: Sun Enterprise Server Maintenance

Performance Monitor Panel

The performance monitor displays system resource activity. A brief

description of each component is provided on the next page.

Figure 10-5 Perfmeter Window

Page 263: Sun Enterprise Server Maintenance

The information displayed with the SunVTS Performance Monitor is

the same as that displayed by the operating system perfmeter utility.

● cpu – Percentage of CPU being used

● pkts – Ethernet packets per second

● page – Paging activity in pages per second

● swap – Jobs swapped per second

● intr – Number of device interrupts per second

● disk – Disk use in transfers per second

● cntxt – Number of context switches per second

● load – Average number of runnable processes over the last minute

● colls – Collisions per second detected on the Ethernet

● errs – Errors per second on receiving packets

Page 264: Sun Enterprise Server Maintenance

Using SunVTS in TTY Mode

If you use the SunVTS software in TTY mode, no frame buffer is

required. To run in TTY mode, perform the following steps:

1. Start the SunVTS kernel with the vtsk command.

# /opt/SUNWvts/bin/vtsk

2. Start the SunVTS TTY User Interface with the vtstty command

# /opt/SUNWvts/bin/vtstty

or the sunvts command with the -t option.

# /opt/SUNWvts/bin/sunvts -t

Figure 10-6 SunVTS Window

Page 265: Sun Enterprise Server Maintenance

Negotiating the SunVTS TTY Interface

The SunVTS TTY interface provides a screen with four working

panels: Message, Status, Control, and System map.

The following keys operate as follows with the TTY interface:

● Tab – Selects a screen panel for keyboard input

● Spacebar – Selects an option within a panel

● Arrows – Moves between the options in a panel

● Esc – Closes pop-up option windows

● Control-l – Refreshes the TTY window

Figure 10-7 Various Working Panels of the SunVTS TTY Interface

Control panel

Status panel

System map

Message area

Page 266: Sun Enterprise Server Maintenance

Running SunVTS Remotely

A testing session can be run across a network or even a modem.

Both the kernel and the user interface components are used in remote

testing.

Requirements

The following requirements must be met to run SunVTS on a remote

system:

● There must be network connectivity between the local and remote

system.

● You must install the same revision of SunVTS on both the local

and remote system.

Running SunVTS Through a Remote Login

1. Use the xhost command to allow the remote system to display on

your local system.

$ /usr/openwin/bin/xhost +remote_hostname

where remote_hostname is the name of the remote system.

2. Log in to the remote system and substitute user to root.

$ rlogin remote_hostname
$ su -

3. Start SunVTS.

# /opt/SUNWvts/bin/sunvts -display local_hostname:0

where local_hostname is the name or IP address of the local system.

Note – The SunVTS kernel starts on the remote system and the user

interface displays on your system.

Page 267: Sun Enterprise Server Maintenance

4. Configure SunVTS for the test session and start the tests.

5. Review the SunVTS logs for test results.

You can view the remote system test logs through the local

SunVTS interface. The log files are stored on the system under test

(SUT).
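
When the graphical log viewer is not convenient, the logs can also be inspected from a shell on the system under test. The sketch below counts the lines in a log file that contain a keyword. The function name count_log_matches is hypothetical, and both the default path /var/opt/SUNWvts/logs/sunvts.err and the keyword ERROR are assumptions; check the log location and message format on your installation.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that contain a keyword.
# Prints 0 if the file does not exist or cannot be read.
count_log_matches() {
    keyword=$1
    logfile=$2
    if [ -r "$logfile" ]; then
        # grep -c prints the count; "|| true" hides its non-zero exit
        # status when the count is 0.
        grep -c -- "$keyword" "$logfile" || true
    else
        echo 0
    fi
}

# Assumed location of the SunVTS error log; override LOGFILE if it differs.
LOGFILE=${LOGFILE:-/var/opt/SUNWvts/logs/sunvts.err}
echo "Lines containing ERROR in $LOGFILE: $(count_log_matches ERROR "$LOGFILE")"
```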

Page 268: Sun Enterprise Server Maintenance

Running SunVTS Through telnet or tip

You can run SunVTS on a remote system, with the TTY interface,

through a telnet or tip session.

You need to set the correct terminal type and number of columns and

rows before starting the interface. The steps below describe this

process.

1. Use the echo command to display the value of the TERM variable:

Note – In this example, the TERM variable is a Korn or Bourne shell

variable and the value is sun-cmd.

$ echo $TERM
sun-cmd

2. Use the stty command to display the settings of your terminal:

$ stty

speed 9600 baud; -parity hupcl
rows = 60; columns = 80; ypixels = 780; xpixels = 568;
switch = <undef>;
brkint -inpck -istrip icrnl -ixany imaxbel onlcr
echo echoe echok echoctl echoke iexten

Note – You must have a minimum of 80 columns and 24 rows to run

the SunVTS TTY interface.

Write down the values of your TERM variable and rows and

columns settings. You will need these values later.

3. Connect to the remote system using either the telnet or tip commands.

Page 269: Sun Enterprise Server Maintenance


4. Become superuser on the remote system.

5. Identify your terminal type and settings in the telnet (or tip)

session window:

# TERM=sun-cmd
# stty rows 60
# stty columns 80

6. Start SunVTS with the TTY interface:

# /opt/SUNWvts/bin/sunvts -t

7. Configure SunVTS for the test session and start the tests.

8. Review the SunVTS logs for test results.

You can view the remote system test logs through the local

SunVTS TTY interface. The log files are stored on the system under

test (SUT).
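
The terminal-size check in steps 2 and 5 can be scripted. The sketch below parses the "rows = N; columns = N;" form of stty output shown in step 2 and tests it against the 24-row by 80-column minimum. The function names are hypothetical, and other systems may print stty settings in a different format, so treat the parsing as an assumption tied to the example output above.

```shell
#!/bin/sh
# tty_size_ok ROWS COLUMNS
# Succeeds when the terminal is at least 24 rows by 80 columns.
tty_size_ok() {
    [ "$1" -ge 24 ] && [ "$2" -ge 80 ]
}

# parse_stty_dim NAME
# Reads stty output on standard input and prints the number that follows
# "NAME = ", for example "rows = 60;" prints 60.
parse_stty_dim() {
    sed -n "s/.*$1 = \([0-9][0-9]*\);.*/\1/p" | head -1
}

# The settings line from the stty example in step 2.
settings='rows = 60; columns = 80; ypixels = 780; xpixels = 568;'
rows=$(printf '%s\n' "$settings" | parse_stty_dim rows)
cols=$(printf '%s\n' "$settings" | parse_stty_dim columns)

if tty_size_ok "$rows" "$cols"; then
    echo "terminal is large enough for the TTY interface"
else
    echo "terminal is too small: need at least 24 rows and 80 columns"
fi
```

In a live session, the settings line would come from running stty on the local terminal rather than from a fixed string.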

Page 270: Sun Enterprise Server Maintenance

SunVTS Test Summary

SunVTS supports a number of tests that are applicable to Sun

Enterprise servers. This section gives a brief description of these

tests. Further details on each test can be found in the SunVTS 3.x Test Reference Manual.

Advanced Frame Buffer Test

The afbtest verifies the functionality of the Advanced Frame Buffer.

Note – This test supports Function Test mode only.

Caution – Do not run any other application or screen saver program

that uses the AFB accelerator port while running afbtest. This

combination causes SunVTS to return incorrect errors.

SunATM Adapter Test

The atmtest checks the functionality of the SunATM-155 and

SunATM-622 SBus and PCI bus adapters. It runs only in loopback

(external or internal) mode. The Asynchronous Transfer Mode (ATM)

adapter, and ATM device driver must be present.

To run the atmtest in external loopback mode, a loopback connector

must be attached to the ATM adapter. The internal loopback mode

does not require a loopback connector.

Note – This test supports Function Test mode only.

Note – Do not run nettest while running atmtest.

Note – Bring the ATM interface down to make sure that the interface is

in offline mode before running atmtest.

Page 271: Sun Enterprise Server Maintenance

Audio Test

The audiotest verifies the hardware and software components of the

audio subsystem. This test supports all Sun audio implementations.

Note – This test supports Connection and Function Test modes.

Note – The audio device is an exclusive use device. Only one process

or application can interface with it at a time.

Bidirectional Parallel Port Printer Test

The bpptest verifies the functionality of the bidirectional parallel port.

The bpptest verifies that your SBus card and its parallel port are

working properly by attempting to transfer a data pattern from the

SBus card to the printer.

Note – This test supports Connection and Function Test modes.

Compact Disc Test

The cdtest checks the CD-ROM unit by reading the CD. cdtest is

not a scalable test. Each track is classified as follows:

● Mode 1 uses error detection/correction code (288 bytes).

● Mode 2 uses that space for auxiliary data, or as an audio track.

Note – Load a compact disc into the drive before starting the test.

Note – This test supports Connection and Function Test modes.

Page 272: Sun Enterprise Server Maintenance

Frame Buffer, GX, GX+ and TGX Options Test

The cg6 test verifies the cgsix frame buffer and the graphics options

offered with most SPARC based workstations and servers.

Note – This test supports Function Test mode only.

Disk and Floppy Drives Test

The disktest verifies the functionality of hard disk drives and floppy

drives using three subtests: Media, File System, and Asynchronous

I/O. Most disk drives, such as SCSI disks, native or SCSI floppy disks,

IPI, and so on, are supported. The type of drive being tested is

displayed at the top of the Test Parameter option menu.

The WriteRead option of the Media subtest is allowed only if a

selected partition is not mounted. By default, disktest does not

mount any partitions.

Caution – If a power failure occurs while the Media subtest is running

in WriteRead mode, disk data might be destroyed.

Caution – Running the Media subtest on a disk partition in the

WriteRead mode can cause data corruption if the same partition is

being used by other programs. Only select this mode when the system

is offline (not used by any other users or programs).

Note – This test supports Connection and Function Test modes.
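
Before selecting WriteRead, it can help to confirm from a shell that the partition is not mounted. The sketch below looks for a device name in the first field of the mount table. The function name is_mounted and the device name in the example are hypothetical, and /etc/mnttab is assumed as the default table. A partition in use as swap or by another program would not be caught by this check, so it does not replace the cautions above.

```shell
#!/bin/sh
# is_mounted DEVICE [TABLE]
# Succeeds when DEVICE appears as the first field of a line in TABLE.
# TABLE defaults to /etc/mnttab (assumed mount table location).
is_mounted() {
    table=${2:-/etc/mnttab}
    [ -r "$table" ] || return 1
    while read -r special rest; do
        if [ "$special" = "$1" ]; then
            return 0
        fi
    done < "$table"
    return 1
}

dev=/dev/dsk/c0t0d0s7    # hypothetical partition name, for illustration only
if is_mounted "$dev"; then
    echo "$dev is mounted: do not select WriteRead"
else
    echo "$dev is not listed in the mount table"
fi
```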

Page 273: Sun Enterprise Server Maintenance

ECP 1284 Parallel Port Printer Test

The ecpptest verifies the functionality of the ecpp IEEE 1284 parallel

printer port device.

Note – The ecpp device is an exclusive use device. Only one

application can interface with it at a time.

Note – This test supports Connection and Function Test modes.

Sun Enterprise Network Array Test

The enatest is used to provide configuration verification, fault

isolation, and repair validation of the Sun Enterprise Network Array.

The Sun Enterprise Network Array is a high availability mass storage

subsystem consisting of:

▼ SCSI fibre channel protocol host adapters with dual

100-Megabyte FC-AL ports.

▼ A disk enclosure.

▼ A front panel display for configuration information.

▼ Up to two interface boards in the enclosure, which provide

FC-AL connections to the enclosure and also provide status

information and control of the conditions within the enclosure.

▼ Other field-replaceable units (FRUs) within the enclosure

include power supply units, fan trays and backplane.

enatest detects all Sun Enterprise Network Array™ enclosures

connected to the host and collects relevant configuration information.

Note – This test supports Connection and Function Test modes.

Page 274: Sun Enterprise Server Maintenance

StorEdge 1000 Enclosure Test

The enctest tests the StorEdge 1000 enclosures. The enclosure can

support either twelve 1-inch 4-Gbyte drives or eight 1.6-inch 9-Gbyte drives and has

redundant power and cooling. Two enclosure models are available:

● StorEdge A1000 - Disk Tray with the hardware RAID controller

● StorEdge D1000 - Disk Tray without the hardware RAID

controller.

You can use enctest for validation, configuration

verification, repair verification, and fault isolation of both models.

The enctest probe detects all the connected StorEdge enclosures and

displays the status of the various elements in the enclosure.

Note – This test supports Connection and Function Test modes.

Frame Buffer Test

The fbtest is a generic test for all dumb frame buffers used with the

Solaris 2.x and Solaris 7 software.

Note – This test supports Function Test mode only.

Fast Frame Buffer Test

The ffbtest verifies the functionality of the Fast Frame Buffer.

ffbtest can detect and adapt to the video modes of single- and

double-buffer versions of the fast frame buffer (FFB).

Note – This test supports Function Test mode only.

Page 275: Sun Enterprise Server Maintenance

Floating Point Unit Test

The fputest checks the floating point unit on machines with the

SPARC-based architecture.

Note – This test supports Connection and Function Test modes.

Sun GigabitEthernet Test

The gemtest provides functional test coverage of the Sun

GigabitEthernet SBus and PCI bus adapters. It runs in loopback

(external/internal) mode and must not be selected together with

nettest. The gemtest provides better fault isolation than

nettest.

Note – This test supports Function Test mode only.

Intelligent Fibre Channel Processor Test

The ifptest tests the functionality of the PCI FC_AL card when there

are no devices attached to the loop. The driver checks for devices on

the fibre loop. If devices are detected, the driver blocks any diagnostic

commands.

Note – When devices are attached to the loop, do not run ifptest.

Instead, run disktest tests on the individual devices. This will test the

whole subsystem including the FC_AL controller.

Note – This test supports Connection and Function Test modes.

Page 276: Sun Enterprise Server Maintenance

Dual Basic Rate ISDN (DBRI) Chip

The isdntest verifies the functionality of the ISDN portion of the

Dual Basic Rate ISDN (DBRI) chip.

Note – This test supports Function Test mode only.

M64 Video Board Test

The m64test tests the PCI-based M64 video board by performing the

following subtests:

● Video Memory test

● RAMDAC test

● Accelerator Port test

Caution – Do not run any other application or screen saver program

that uses the Pineapple accelerator port while running m64test. Do

not run power management software. These programs cause SunVTS

to return incorrect errors.

Note – This test supports Function Test mode only.

Multiprocessor Test

The mptest verifies the functionality of multiprocessing hardware.

mptest can test up to 256 processors.

Note – This test supports Connection and Function Test modes.

Page 277: Sun Enterprise Server Maintenance

Network Hardware Test

The nettest checks all the networking hardware on the system CPU

board and separate networking controllers (for example, a second

SBus Ethernet controller). For this test to be meaningful, the machine

under test must be attached to a network with at least one other

system on the network.

Note – This version of nettest is used for all networking devices,

including Ethernet (ie and le), token ring (tr, trp), quad Ethernet

(QED), fiber optic (fddi, nf, bf, pf), SPARCcluster™ 1 System, ATM

(sa, ba), HiPPI, and 100-Mbits per second Ethernet (be, hme) devices.

Note – This test supports Connection and Function Test modes.

SPARCstorage Array Controller Test

The plntest checks the functionality of the controller board on the

SPARCstorage™ Array.

The SSA controller card is an intelligent, CPU-based board with its

own memory and ROM-resident software. In addition to providing a

communications link to the disk drives, it also buffers data between

the host system and disk drives in its nonvolatile RAM (NVRAM). For

data to go from the host to a particular disk, it must first be

successfully transferred to this NVRAM space.

The host machine, SBus host adapter card, fiber-channel connection,

and the SSA controller board must be working properly to perform

this data transfer operation. By verifying and stressing this operation,

plntest can isolate failures on the SSA disk drives from failures on

the SSA controller board.

Note – This test supports Connection and Function Test modes.

Page 278: Sun Enterprise Server Maintenance

Physical Memory Test

The pmemtest checks the physical memory of the system. The

pmemtest locates parity errors, hard and soft error correction code

(ECC) errors, memory read errors, and addressing problems.

This test reads through all available physical memory. It does not write

to any physical memory location.

Note – This test supports Connection and Function Test modes.

Prestoserve Test

Prestoserve is a Network File System (NFS) accelerator. It reduces the

frequency of disk I/O access by caching the written data blocks in

nonvolatile memory. Prestoserve then flushes the cached data to disk

asynchronously, as necessary.

The pstest verifies the Prestoserve accelerator’s functionality with the

following three checks:

● Board battery check

● Board memory check

● Board performance and file I/O access check

Note – This test supports Function Test mode only.

Page 279: Sun Enterprise Server Maintenance

Serial Asynchronous Interface Test

The saiptest checks the functionality of the Serial Asynchronous

Interface card through its device driver.

Note – You must run the saiptest in intervention mode.

Note – This test supports Function Test mode only.

Sun Enterprise Cluster 2.0 Network Hardware Test

The scitest verifies the functionality of the Sun Enterprise Cluster 2.0

by checking the networking hardware. For this test to be meaningful,

the cluster must already be configured before the test is run.

After finding the cluster nodes (targets), scitest performs the

following tests:

● Random test sends out 256 packets with random data length and

random data.

● Incremental test sends out packets with length from minimum to

maximum packet size using incremental data.

● Pattern test sends 256 packets of maximum length.

Note – This test supports Function Test mode only.

Page 280: Sun Enterprise Server Maintenance

Environmental Sensing Card Test

The sentest checks the SCSI Environmental Sensing card (SEN)

installed in the SPARCstorage RSM to monitor the enclosure

environment. The SEN card monitors the enclosure’s over-temperature

condition, fan-failures, power-supply failures, and drive activity.

sentest verifies the following control functions in the enclosure:

● Alarm (enable/disable) – sentest toggles the alarm to the disable

state, then to the enable state.

● Alarm time (0-0xff seconds) – sentest sets the time (from 0 to

4095), then reads it back to verify the time setting.

● Drive fault LED (DL0-DL6) – sentest toggles each LED to its OFF

and ON states.

Note – This test supports Connection and Function Test modes.

Soc+ Host Adapter Card Test

The socaltest aids the validation and fault isolation of the SOC+

host adapter card. In the case of a faulty card, the test tries to isolate

the fault to the card, the Gigabit Interface Controller (GBIC) module,

or the DMA between the host adapter card and the host memory.

Note – This test supports Function Test mode only.

Page 281: Sun Enterprise Server Maintenance

Serial Parallel Controller Test

The spiftest accesses card components such as the cd-180 and ppc2

chips, and the serial and parallel ports through the serial parallel

controller device driver.

Note – The spiftest must be run in Intervention mode.

Note – This test supports Function Test mode only.

Serial Ports Test

The sptest checks the system’s on-board serial ports (zs[0,1], zsh[0,1],

se[0,1], se_hdlc[0,1]), as well as any multi-terminal interface (ALM2)

boards (mcp[0-3]). Data is written and read in asynchronous and

synchronous modes utilizing various loopback paths.

Note – The sptest must be run in Intervention mode.

Note – This test supports Connection and Function Test modes.

SunButtons Test

The sunbuttons test verifies that the SunButtons graphics

manipulation device is working correctly.

Note – This test supports Function Test mode only.


SunDials Test

The sundials test verifies that the SunDials graphics manipulation

device controls are working properly. sundials also verifies the

connection between the dialbox and serial port.

Note – This test supports Function Test mode only.

HSI Board Test

The sunlink test verifies the functionality of the SBus and PCI bus

High Speed Serial Interface (HSI) boards by using the High-level Data

Link Control (HDLC) protocol. sunlink initializes and configures the

selected channel.

Note – This test will not pass unless you install the correct loopback

connectors or port to port cables on the ports you are testing.

Note – This test supports Function Test mode only.

SunPCi Test

The sunpcitest tests the SunPCi™ plug-in PCI card, which is an x86

processor embedded in an add-on card.

Note – This test supports Function Test mode only.


System Test

The systest checks the CPU board by exercising the I/O, memory,

and CPU channels simultaneously as threads. There is no quick test

option for systest; it is a CPU stress test.

Note – This test supports Function Test mode only.

Tape Drive Test

The tapetest synchronous I/O test writes a pattern to a specified

number of blocks (or, for a SCSI tape, writes to the end of the tape).

The tapetest then rewinds the tape and reads and compares the data

just written. The tapetest asynchronous I/O test sends a series of up

to five asynchronous read/write requests to the tape drive, writing to

the tape and then reading and comparing the data. The tapetest file

test writes four files to the tape and then reads them back, comparing

the data. For tape library testing, the pass count is incremented only

after all tapes in the library have been tested.

Note – A blank writable tape (scratch tape) must be loaded before you

start this test.

Note – This test supports Connection and Function Test.

Virtual Memory Test

The vmemtest checks virtual memory; that is, it tests the combination

of physical memory and the swap partitions of the disk(s).

Note – This test supports Function Test mode only.


Test Message Syntax

All SunVTS test messages follow this format:

SUNWvts.testname[.subtest_name].message_number date time testname device_name [FRU_path] ERROR|FATAL|INFO|WARNING|VERBOSE message

Table 10-1 lists the SunVTS test message arguments and gives a brief

description.

Table 10-1 SunVTS Test Message Arguments

Argument         Description

SUNWvts          SunVTS package name

testname         SunVTS test name

subtest_name     The subtest module name (optional)

message_number   The message identifier, which is a unique number for the test. The number is usually within the following ranges: VERBOSE: 1 - 1999; INFO: 2000 - 3999; WARNING: 4000 - 5999; ERROR/FATAL: 6000 - 7999; FATAL: 8000 - 9998. (The number 9999 is reserved for any possible old message types in previous SunVTS releases for compatibility reasons.)

date time        Tells when the error occurred

testname         The name of the test reporting the error

device_name      The device being tested when the error occurred

FRU_path         A full Solaris device path of the failed FRU; this argument varies, depending on the type of test running when the error occurred

message          Contains test messages, in addition to probable cause and recommended action
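
The message format and the number ranges in Table 10-1 can be checked mechanically. The following Python sketch is illustrative only: it assumes that fields are separated by white space and that the date and the time are each a single token, and the sample line is hypothetical rather than taken from a real log.

```python
import re

# Message-number ranges as listed in Table 10-1 (the table says "usually").
NUMBER_RANGES = [
    (1, 1999, "VERBOSE"),
    (2000, 3999, "INFO"),
    (4000, 5999, "WARNING"),
    (6000, 7999, "ERROR/FATAL"),
    (8000, 9998, "FATAL"),
    (9999, 9999, "RESERVED"),  # kept for old message types
]

# Assumption: whitespace-separated fields; date and time are single tokens.
MESSAGE_RE = re.compile(
    r"^SUNWvts\.(?P<test>[^.\s]+)"
    r"(?:\.(?P<subtest>[^.\s]+))?"          # optional subtest name
    r"\.(?P<number>\d+)\s+"
    r"(?P<date>\S+)\s+(?P<time>\S+)\s+"
    r"(?P<reporter>\S+)\s+(?P<device>\S+)\s+"
    r"(?:(?P<fru>\S+)\s+)?"                 # optional FRU path
    r"(?P<type>ERROR|FATAL|INFO|WARNING|VERBOSE)\b:?\s*"
    r"(?P<message>.*)$"
)


def expected_type(number):
    """Return the message type that Table 10-1 associates with a number."""
    for low, high, name in NUMBER_RANGES:
        if low <= number <= high:
            return name
    return "UNKNOWN"


def parse_message(line):
    """Split one message line into its fields, or return None if it does not fit."""
    match = MESSAGE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["number"] = int(fields["number"])
    return fields


# Hypothetical sample line, written only to exercise the parser.
SAMPLE = ("SUNWvts.disktest.6012 06/01/01 10:15:00 disktest c0t0d0 "
          "ERROR hypothetical failure text")
```

Comparing expected_type(fields["number"]) with fields["type"] is a quick way to spot messages whose number falls outside the usual range for their type.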


Appendix A: Alternate Pathing


Introducing Alternate Pathing

Alternate Pathing (AP) provides high availability to storage and

network devices. With AP, you have two physical paths to the same

A5000 or SSA storage array or network interface, transparent to the

operating system.

Only one path can be active at a time. If a path fails, the alternate path

can be switched in place of the failed path. Path switching does not

always occur automatically; you might need to switch it manually.

The system uses the meta-device, a name representing the end object

(such as the disk partition or network interface), but does not use the

physical path names to access the device.

Note – The AP material covered in this module applies to the AP 2.2

support that Solaris 7 provides for the Sun Enterprise x000 and x500

servers.


Supported Devices

Disk Devices

AP supports the StorEdge A5000 and SPARCstorage arrays.

SCSI devices are not supported. The StorEdge A3000 is not supported,

but has its own internal AP capability.

After you set up AP for disks, you can use Solstice DiskSuite Version

4.1 and Sun Volume Manager Versions 2.3, 2.4, and 2.5 normally.

(However, on installation, Dynamic Multipathing (DMP) automatically

disables itself in Volume Manager 2.5 if AP is already installed.)

Caution – You must make sure that any AP devices used by these

products are used by their meta-device names only.

You can place your boot disk and primary network interface under AP

control. This makes it possible for the system to boot unattended, even

if the primary network or boot disk controller is not accessible, as long

as a usable alternate path for these devices is defined and available.

Network Devices

The following network devices are supported by AP 2.2:

● SunFastEthernet 2.0 (hme)

● Sun FDDI™ 5.0 (nf) SAS and DAS

● Lance Ethernet (le)

● Quad Ethernet (qe)

● Sun Quad FastEthernet (qfe)

● GigabitEthernet (ge)


Installing AP

Solaris 7

Solaris 7 supports AP 2.2.

Solaris 2.6

Solaris 2.6 supports AP 2.1.

Solaris 2.5.1

Solaris 2.5.1 supports AP 2.0.

Installing AP

Install the following packages:

● SUNWapr – AP subsystem (root)

● SUNWapu – AP subsystem (usr)

● SUNWapdv

Documentation:

● SUNWabap – AP AnswerBook (AP 2.0 only; AP 2.1 documentation is in the Hardware AnswerBook, SUNWabhdw)

● SUNWapdoc – AP man pages

● Apply all appropriate patches

The installation process uses the pkgadd command to

install the AP packages. There is no order dependency.


How AP Works

AP creates a new layer of device drivers (meta-disks and meta-

networks), which accesses one of two physical device drivers to access

the device. Applications and the OS components, including the disk

management software, use the meta-device name to access the

resource. Only the drivers know the actual physical paths.

No component other than AP is aware that the normal device paths

are to the same device. This can cause problems for applications that

use the physical paths instead of the meta-devices to scan or inspect

disk or network devices; they might identify the two physical paths as

separate devices.

AP automatically switches from the active disk path to the alternate

disk path if the active path fails. Additionally, you can manually

switch the active path to the alternate, at any time, with no

interruption to active traffic that uses the meta-device, for both disks and

networks.

Note – In the Enterprise x000 and x500 computers there is no

automatic switch-over to the alternate path during a DR operation.

Meta-device definitions are stored in an AP state database that is used

early in the boot process. There are usually several copies of this

database. You must create the meta-devices yourself; the system will

not automatically create these for you.

Note – AP can be used with Sun Enterprise Volume Manager (SEVM)

or Solstice DiskSuite (SDS).


Physical Paths

For the purposes of AP, an I/O device is either a disk or network device.

The only types of disk device currently supported by AP are the

StorEdge A5000 and the SPARCstorage Array (SSA). In this module,

the term disk always refers to one of these devices.

An I/O adapter is the controller for an I/O device such as an A5000

SOC+ adapter.

A device node is a path in the devices directory that is used to access a

physical device, such as /dev/dsk/c0t1d1s0.

The term physical path refers to the electrical path from the host to a

disk or network.


Metadisk

A metadisk is a logical name that enables you to access a physical disk

device without having to specify the particular path to the device. You

reference a metadisk just as if it were a real device, using an AP-specific

device node, such as /dev/ap/dsk/mc0t1d1s0. The AP

software determines which path is active and uses that path to access

the device.

The path /dev/ap/dsk/mc0t1d1s0 is used to access a slice on a

metadisk, regardless of which pln port is currently active (handling

I/O) for the metadisk. For the A5000, the sf ports (representing an

SOC+ adapter) are where AP activates the paths.
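
The relationship between a physical device node and its metadisk node can be written down as a small helper. This Python sketch infers the naming rule (an m prefix on the primary path's disk name, under /dev/ap) from the examples in this module; the function name is hypothetical and it is not part of the AP software.

```python
import re

# Matches physical disk nodes such as /dev/dsk/c0t1d1s0 or /dev/rdsk/c0t1d1s0.
PHYSICAL_NODE_RE = re.compile(r"^/dev/(r?dsk)/(c\d+t\d+d\d+(?:s\d+)?)$")


def metadisk_node(physical_node):
    """Return the AP metadisk node for a physical node on the primary path.

    Assumption: the metadisk name is the primary path's disk name with an
    "m" prefix, as in the examples in this module.
    """
    match = PHYSICAL_NODE_RE.match(physical_node)
    if match is None:
        raise ValueError("not a /dev/dsk or /dev/rdsk node: %r" % physical_node)
    directory, name = match.groups()
    return "/dev/ap/%s/m%s" % (directory, name)
```

For example, metadisk_node("/dev/dsk/c0t1d1s0") gives /dev/ap/dsk/mc0t1d1s0, the node used in the text above.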


Disk Pathgroup

A disk pathgroup consists of two physical paths leading to the same storage array. When a physical path is part of a pathgroup, it is called

an alternate path. An alternate path to a disk can be uniquely identified

by the pln or sf port that the alternate path uses.

Make sure that you understand the use of the term alternate. It means

either possible path, not just the spare path. The path in use is the

active alternate.

Only one alternate path at a time is allowed to handle disk I/O. The

alternate path that is currently handling I/O is called the active alternate.

One of the alternate paths is designated the primary path. The primary

path is initially made the active alternate. Although you can change

which path is the active alternate, the primary path is always the same.

The primary path has several properties.

● It is initially the active alternate.

● It provides the metadisk name.

● It identifies the metadisk.

You reference a disk pathgroup by specifying the pln or sf port (such

as pln1 or sf7) that corresponds to the primary path. For example, if

the primary path is sf1, the pathgroup name is msf1.

Some considerations are:

● Both array interfaces in a pathgroup must be attached to the same

array

● Only one interface is active at a time through the meta-device

● There must be exactly two adapters in a pathgroup

● If you have two interface boards, consider connecting a path to

each

● If you are using hubs in your configuration, use a separate hub for

each interface
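
The terms primary path, alternate path, and active alternate can be summarized in a small model. The Python class below is a teaching sketch under the rules stated in this section (exactly two paths, the primary starts as the active alternate, the primary never changes); the class and attribute names are hypothetical and it does not talk to AP.

```python
class DiskPathgroup:
    """Toy model of a disk pathgroup; not the AP implementation."""

    def __init__(self, primary, secondary):
        if primary == secondary:
            raise ValueError("a pathgroup needs two different ports")
        self.primary = primary        # for example "sf1"; never changes
        self.secondary = secondary    # the other alternate path
        self.active = primary         # the primary is initially the active alternate

    @property
    def name(self):
        # The pathgroup name is derived from the primary path (sf1 -> msf1).
        return "m" + self.primary

    def switch(self):
        """Make the other alternate path the active alternate."""
        if self.active == self.primary:
            self.active = self.secondary
        else:
            self.active = self.primary
        return self.active
```

After group = DiskPathgroup("sf1", "sf7") and group.switch(), the active alternate is sf7, but group.primary is still sf1 and group.name is still msf1.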


Metanetwork

A metanetwork, just like a metadisk, is a logical interface that enables

you to access a network through either of two physical paths without

having to reference either path explicitly within your scripts and

programs. You reference a metanetwork by using a metanetwork interface name such as mle1.

Interface mle1 is used to access the metanetwork, regardless of which

physical adapter (le1 or le6) is currently active for the metanetwork

device.


Network Pathgroup

Similar to a disk pathgroup, a network pathgroup consists of two

network adapters connected to the same physical network.

To specify a network pathgroup, use the metanetwork interface name,

such as mle1. As with a disk pathgroup, you use this name when you

switch the active alternate.

Some considerations are:

● Network adapters in a pathgroup must be attached to the same

subnet

● Only one adapter is active at a time

● Use a separate hub for each path for even more redundancy

● There must be exactly two adapters in a pathgroup

● Both network adapters must be of the same device type


AP With Mirroring

AP is similar to, but not the same as, disk mirroring. Disk mirroring

replicates data to separate devices and thus achieves data redundancy.

AP, on the other hand, achieves pathing redundancy. Disk mirroring

and AP are complementary; you can use them together to achieve both

data redundancy and pathing redundancy.

Mirroring occurs on top of AP, which enables switching of the

underlying adapters used to implement the mirror from one board to

another without disruption of the disk mirroring or any active I/O.

AP does not provide mirroring itself.


AP and DR

AP supports DR, which is used to logically attach and detach system

boards from the operating system without having to halt and reboot.

For example, with DR you can detach a board from the operating

system, physically remove and service the board, and then re-insert

the board and attach it to the operating system again. You can do all of

this without halting the operating system or terminating any user

applications.

If the board that you want to detach is connected to an I/O device, and that I/O

device is alternately pathed, you can first use AP to redirect the I/O

flow to a controller on a different board. You can then use DR to

detach the system board without interrupting the I/O flow.


The AP State Database

AP maintains a database that contains information about

all defined meta-disks, meta-networks, and their

corresponding alternate paths and properties. Each

system will have its own database.

Conceptually, a single AP database is maintained in a

single system. However, you should set up multiple

copies of this database. In this way, if a given database

copy is not accessible or becomes corrupted, AP can

automatically begin to use a current, non-corrupted

database copy. All of the AP databases synchronize their

contents during system initialization and DR operations.

● You must dedicate an entire raw disk slice, of at least

300 Kbytes, to each AP database copy. As configured

at the factory, slice 4 of the root disk is appropriately

sized for an AP database (2 Mbytes) and is not

allocated to any other purpose.

When choosing partitions for the AP database, remember

that:

● You should set up three to five database

copies.

● The database copies should have no I/O adapters in

common with each other. This helps protect against

an adapter failure.

● The copies can be on any slice of any type of disk

device. They do not need to be on devices that AP

supports, and do not need to have alternate paths.

● Especially if you are using Dynamic Reconfiguration

(DR), the database copies should be on I/O adapters

on different system boards so that at least one

database copy is always accessible if one of the

system boards is detached. Generally, you should

have one separate copy per system board.
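
The placement guidelines above lend themselves to a simple checklist. The following Python sketch checks a planned set of database copies against them; the dictionary keys, adapter names, and board numbers are hypothetical inputs that you would fill in by hand, and the function does not query the system.

```python
MIN_SLICE_KBYTES = 300   # minimum slice size for one database copy
MIN_COPIES = 3           # lower end of the "three to five" recommendation


def check_database_plan(copies):
    """Return a list of guideline violations for a planned set of copies.

    copies: list of dicts with the (hypothetical) keys
            "slice", "adapter", "board", and "kbytes".
    """
    problems = []
    if len(copies) < MIN_COPIES:
        problems.append("fewer than %d database copies" % MIN_COPIES)
    for copy in copies:
        if copy["kbytes"] < MIN_SLICE_KBYTES:
            problems.append("%s is smaller than %d Kbytes"
                            % (copy["slice"], MIN_SLICE_KBYTES))
    adapters = [copy["adapter"] for copy in copies]
    if len(set(adapters)) < len(adapters):
        problems.append("two or more copies share an I/O adapter")
    boards = {copy["board"] for copy in copies}
    if len(copies) > 1 and len(boards) < 2:
        problems.append("all copies are on the same system board")
    return problems
```

An empty result means the plan meets the guidelines that the function knows about; it says nothing about whether the slices are actually free.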


Creating the AP Database

Before you can begin configuring AP, you must create at

least one AP database. The AP database is created with

the apdb command. You can use apdb to create the

original database or a copy.

The apdb Command

# apdb -c /dev/rdsk/c0t3d0s4 -f

The -c (create) option is followed by the raw disk slice

that will contain the new AP database copy. Each copy

requires its own dedicated slice, which must be at least

300 Kbytes in size.

The -f (force) option is only necessary to create the first AP database copy. It is not used otherwise.

If you want an AP database copy to reside on an AP

disk, you must create two copies of the AP database. The

AP configuration process can only access database

locations by the physical disk slice address, and is not

aware of meta-devices at this level.

You must create this database copy twice, specifying

each of the physical paths to the AP meta-disk. For

example, if c1 and c9 are connected to the same AP

pathgroup, to create a copy of the AP database residing

on target 3, slice 4, use the following two commands:

# apdb -c /dev/rdsk/c1t3d0s4 -f
# apdb -c /dev/rdsk/c9t3d0s4

The AP software will be aware of two copies of the

database when actually there is only one, because the

disk is accessible through two paths. This database

"alias" is safe, because AP always updates and accesses

its database copies sequentially. The AP copy is updated

twice with the same information, but this is insignificant

overhead.


The whole process works outside of AP. AP is not aware

that the two database entries refer to the same physical copy.
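
The rule that a database copy on an AP disk must be created once per physical path can be expressed as a small command generator. This Python sketch only builds the command strings for review; it uses the apdb -c and -f options exactly as shown above, assumes disk d0 on the target as in the example, and the function name is hypothetical.

```python
def apdb_create_commands(controllers, target, slice_number, first_copy=False):
    """Build one "apdb -c" command per physical path to the same slice.

    controllers: controller names of both paths, for example ["c1", "c9"].
    first_copy:  True only when no AP database copy exists yet.
    Assumption: the disk number is d0, as in the example in the text.
    """
    commands = []
    for index, controller in enumerate(controllers):
        device = "/dev/rdsk/%st%dd0s%d" % (controller, target, slice_number)
        command = "apdb -c %s" % device
        if first_copy and index == 0:
            command += " -f"   # -f is needed only for the very first copy
        commands.append(command)
    return commands
```

Calling apdb_create_commands(["c1", "c9"], 3, 4, first_copy=True) reproduces the two commands in the example above.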


Example

# apdb -c /dev/rdsk/c0t1d0s4 -f

The -c option specifies the raw disk slice (under /dev/rdsk) where

you want to create the database copy. You must dedicate an entire disk

partition to each database copy. The disk partition must have at least

300 Kbytes. The -f (force) option is only necessary to create the first

AP database copy.

# apconfig -D

path: /dev/rdsk/c3t3d0s1
major: 32
minor: 145
timestamp: Wed Mar 10 18:45:58 1999
checksum: 2636010350
default: yes
corrupt: no
inaccessible: no

path: /dev/rdsk/c3t3d0s6
major: 32
minor: 150
timestamp: Wed Mar 10 18:50:43 1999
checksum: 2636010350
default: no
synced: yes
corrupt: no
inaccessible: no
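
If you capture the apconfig -D listing in a file, a short script can pick out the copies that are still usable. The Python sketch below assumes the layout of one "key: value" pair per line, with each record starting at a path: line; the function names are hypothetical.

```python
def parse_apconfig_d(text):
    """Split an apconfig -D listing into one dict per database copy.

    Assumption: one "key: value" pair per line, and every record
    begins with a "path:" line.
    """
    copies = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(":")   # split at the first colon only
        key, value = key.strip(), value.strip()
        if key == "path":
            current = {}
            copies.append(current)
        if current is not None:
            current[key] = value
    return copies


def usable_copies(copies):
    """Return the paths of copies that are neither corrupt nor inaccessible."""
    return [copy["path"] for copy in copies
            if copy.get("corrupt") == "no" and copy.get("inaccessible") == "no"]


LISTING = """\
path: /dev/rdsk/c3t3d0s1
major: 32
minor: 145
timestamp: Wed Mar 10 18:45:58 1999
checksum: 2636010350
default: yes
corrupt: no
inaccessible: no

path: /dev/rdsk/c3t3d0s6
major: 32
minor: 150
timestamp: Wed Mar 10 18:50:43 1999
checksum: 2636010350
default: no
synced: yes
corrupt: no
inaccessible: no
"""
```

Comparing the checksum values of the parsed copies is one way to confirm that they hold the same contents.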


AP Utility Examples

Identifying Disk Host Adapter Instances

The apinst command identifies all ports and provides the name, instance number, and disk

special file (/dev/dsk) targets attached to each port.

# apinst

isp0
        /dev/dsk/c0t0d0
        /dev/dsk/c0t1d0
        /dev/dsk/c0t2d0
pln0
        /dev/dsk/c1t0d0
        /dev/dsk/c1t1d0
        /dev/dsk/c1t2d0
        /dev/dsk/c1t3d0
        /dev/dsk/c1t4d0
        /dev/dsk/c1t5d0
pln1
        /dev/dsk/c2t0d0
        /dev/dsk/c2t1d0
        /dev/dsk/c2t2d0
        /dev/dsk/c2t3d0
        /dev/dsk/c2t4d0
        /dev/dsk/c2t5d0
sf0
        /dev/dsk/c3t0d0
        /dev/dsk/c3t1d0
        /dev/dsk/c3t2d0
        /dev/dsk/c3t3d0
        /dev/dsk/c3t4d0
        /dev/dsk/c3t5d0
sf1
        /dev/dsk/c4t0d0
        /dev/dsk/c4t1d0
        /dev/dsk/c4t2d0
        /dev/dsk/c4t3d0
        /dev/dsk/c4t4d0
        /dev/dsk/c4t5d0
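
A saved apinst listing can be turned into a port-to-disk map, which is handy when you decide which two ports form a pathgroup. This Python sketch assumes that each port name is on its own line, followed by one /dev/dsk entry per line; the function names are hypothetical.

```python
import re


def parse_apinst(text):
    """Map each port name (for example "pln0") to its list of disk nodes.

    Assumption: a line starting with /dev/ is a disk; any other
    non-blank line is a port name.
    """
    ports = {}
    current = None
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue
        if entry.startswith("/dev/"):
            if current is None:
                raise ValueError("disk listed before any port: %r" % entry)
            ports[current].append(entry)
        else:
            current = entry
            ports[current] = []
    return ports


def controller_of(ports, port):
    """Return the cN controller name used by the first disk on a port."""
    first_disk = ports[port][0].rsplit("/", 1)[-1]    # for example "c1t0d0"
    return re.match(r"c\d+", first_disk).group(0)


LISTING = """\
isp0
        /dev/dsk/c0t0d0
        /dev/dsk/c0t1d0
pln0
        /dev/dsk/c1t0d0
        /dev/dsk/c1t1d0
pln1
        /dev/dsk/c2t0d0
        /dev/dsk/c2t1d0
"""
```

Two ports that list the same number of disks with matching target numbers are candidates for being two paths to the same array, but only the cabling confirms it.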


Meta-Disk Configuration

# ssaadm disp c1

SPARCstorage Array 110 Configuration
(ssaadm version: 1.20 97/05/14)
Controller path:
/devices/sbus@45,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a0e2f:ctlr

DEVICE STATUS
slot    TRAY 1        TRAY 2        TRAY 3
1       Drive: 0,0    Drive: 2,0    Drive: 4,0
2       NO SELECT     NO SELECT     NO SELECT
3       NO SELECT     NO SELECT     NO SELECT
4       NO SELECT     NO SELECT     NO SELECT
5       NO SELECT     NO SELECT     NO SELECT
6       Drive: 1,0    Drive: 3,0    Drive: 5,0
7       NO SELECT     NO SELECT     NO SELECT
8       NO SELECT     NO SELECT     NO SELECT
9       NO SELECT     NO SELECT     NO SELECT
10      NO SELECT     NO SELECT     NO SELECT

CONTROLLER STATUS
Vendor:        SUN
Product ID:    SSA110
Product Rev:   1.0
Firmware Rev:  3.12
Serial Num:    00000083BE1D
Accumulate Performance Statistics: Enabled

For A5000s, you would use:

# luxadm disp c2

Note that the luxadm command includes the ssaadm command functionality. You could use luxadm to obtain

information for both A5000 and SSA devices.


Creating a Disk Pathgroup and Meta-Disks

1. Use apdisk to create an uncommitted disk

pathgroup. The apdisk command creates the meta-

disk names and updates the AP database with the

alternate paths for all six SSA disks.

# apdisk -c -p pln0 -a pln1

The -c operand specifies creation of a pathgroup,

and the -p and the -a operands specify the primary

and alternate paths, respectively.

2. Verify the results with apconfig -S -u.

# apconfig -S -u

c1      pln0  P A
c3      pln1
        metadiskname(s):
                mc1t5d0 U
                mc1t4d0 U
                mc1t3d0 U
                mc1t2d0 U
                mc1t1d0 U
                mc1t0d0 U

Note that the entries are uncommitted.

3. Use apdb -C to commit the new database entries.

# apdb -C

4. Use apconfig -S to view the new disk entries in the

database. Note that the U is now gone.

# apconfig -S

c1      pln0  P A
c3      pln1
        metadiskname(s):
                mc1t5d0
                mc1t4d0
                mc1t3d0
                mc1t2d0
                mc1t1d0
                mc1t0d0


5. Run drvconfig to create the new metadevice entries

in the /devices directory. The -i operand ensures

that only AP metadevices are created.

# drvconfig -i ap_dmd

6. Use the ls command to confirm that the device

nodes have been created.

# ls /devices/pseudo/ap_dmd*
/devices/pseudo/ap_dmd@0:128,blk
/devices/pseudo/ap_dmd@0:128,raw
/devices/pseudo/ap_dmd@0:129,blk
/devices/pseudo/ap_dmd@0:129,raw
/devices/pseudo/ap_dmd@0:130,blk
/devices/pseudo/ap_dmd@0:130,raw
...

7. Use apconfig -R to create the /dev directory links

to the new /devices directory nodes. /dev/ap/dsk and /dev/ap/rdsk links for each possible partition

on each drive will be created, just as the disks command does for regular disk devices.

# apconfig -R

8. Use the ls command to confirm that the /dev links

to the device nodes have been created.

# ls -l /dev/ap/dsk
total 8
lrwxrwxrwx 1 root 40 Jul 27 16:47 mc1t0d0s0 -> ../../../devices/pseudo/ap_dmd@0:128,blk
lrwxrwxrwx 1 root 40 Jul 27 16:47 mc1t0d0s1 -> ../../../devices/pseudo/ap_dmd@0:129,blk
lrwxrwxrwx 1 root 40 Jul 27 16:47 mc1t0d0s2 -> ../../../devices/pseudo/ap_dmd@0:130,blk

Similar entries will exist for /dev/ap/rdsk.
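
The apconfig -S listings in steps 2 and 4 can be checked by script, for example to confirm that no entry is still marked U before you continue. This Python sketch assumes that path lines have the form "controller port flags", that metadisk names look like mc1t5d0, and that a U after a name marks it as uncommitted; the function names are hypothetical.

```python
import re

METADISK_RE = re.compile(r"^mc\d+t\d+d\d+$")


def parse_apconfig_s(text):
    """Parse an apconfig -S listing into a list of pathgroups."""
    groups = []
    group = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "metadiskname(s):":
            tokens = tokens[1:]
            if not tokens:
                continue
        if METADISK_RE.match(tokens[0]):
            if group is None:
                raise ValueError("metadisk listed before any path line")
            for token in tokens:
                if METADISK_RE.match(token):
                    group["metadisks"].append({"name": token, "uncommitted": False})
                elif token == "U":
                    group["metadisks"][-1]["uncommitted"] = True
            continue
        # A path line after metadisk names starts the next pathgroup.
        if group is None or group["metadisks"]:
            group = {"paths": [], "metadisks": []}
            groups.append(group)
        group["paths"].append(
            {"controller": tokens[0], "port": tokens[1], "flags": set(tokens[2:])})
    return groups


def active_port(group):
    """Return the port flagged A (active alternate) in a pathgroup."""
    return next(p["port"] for p in group["paths"] if "A" in p["flags"])


LISTING = """\
c1      pln0  P A
c3      pln1
        metadiskname(s):
                mc1t5d0 U
                mc1t4d0 U
"""
```

After apdb -C, the same check should report no uncommitted entries.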


Using the Meta-Devices

You must modify every reference to a physical device

node (such as a path name that begins with /dev/dsk or

/dev/rdsk) to use the corresponding meta-disk device

node, the path that begins with /dev/ap/dsk or

/dev/ap/rdsk.

If a partition is currently mounted under a physical path

name, it should be unmounted and remounted under the

meta-disk path name. This can be done by changing the

vfstab file and having the meta-device become active on

the next reboot.

Do not do this for the boot device.

If you are placing the boot disk under AP control, you

will need to modify the vfstab file by using the apboot command. Refer to the following section for further

information.
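
For disks other than the boot disk, the vfstab change is a mechanical substitution. The following Python sketch rewrites one vfstab line at a time; it assumes the m-prefix naming shown in this module, leaves the boot disk untouched so that apboot can handle it, and the function and parameter names are hypothetical. Review the result before replacing the real file.

```python
import re

# Physical nodes such as /dev/dsk/c1t0d0s6 or /dev/rdsk/c1t0d0s6.
NODE_RE = re.compile(r"/dev/(r?dsk)/(c\d+t\d+d\d+)(s\d+)")


def to_metadisk_line(line, ap_disks, boot_disk=None):
    """Rewrite one vfstab line so that AP-controlled disks use metadisk nodes.

    ap_disks:  disk names under AP control, for example {"c1t0d0"}.
    boot_disk: disk name to leave alone (apboot edits those entries).
    """
    if line.lstrip().startswith("#"):
        return line                      # leave comments unchanged

    def replace(match):
        directory, disk, slice_part = match.groups()
        if disk not in ap_disks or disk == boot_disk:
            return match.group(0)        # not under AP, or left for apboot
        return "/dev/ap/%s/m%s%s" % (directory, disk, slice_part)

    return NODE_RE.sub(replace, line)
```

Lines that already use /dev/ap nodes are not changed, so running the helper twice gives the same result.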


Placing the Boot Disk Under AP Control

1. Create an AP pathgroup for the physical path that

includes the boot disk.

2. Run apboot, specifying the boot meta-disk name, to

define the new AP boot device. apboot modifies

/etc/vfstab and /etc/system.

# apboot mc2t0d0

where mc2t0d0 is the meta-disk name of the boot

disk.

apboot examines /etc/vfstab and replaces the

physical device name of the boot disk, such as

/dev/dsk/c2t0d0sx, with the meta-disk name,

such as /dev/ap/dsk/mc2t0d0sx. It also edits

/etc/system so that the drivers required for AP

boot disk usage are force loaded.

Do not manually replace the physical devices in

/etc/vfstab with meta-disks for the boot disk.

Instead, use the apboot command to ensure that all

required changes are made. Just changing

/etc/vfstab will prevent the system from booting.

3. Set the OBP environment variable boot-device to

the physical path most likely to be used for booting.

Do not use multiple device names from the devalias command, including the other path.

4. Define an OBP devalias for the alternate boot

device physical path in case you need to perform a

manual boot from the alternate path. To boot from it, set the OBP

boot-device parameter to this name; do not add it

to the existing boot-device parameter value.

5. At this point, just reboot the system to begin using

the AP boot device.

Warning – If you want to create a new AP database copy

after you have placed the boot disk under AP control,

and the new database copy is to be located on a partition


controlled by a pln port that does not control any of the

current AP database copies, you must first remove the

boot disk from AP control. Make sure that the new AP

database has been created. Then place the boot disk

under AP control again. Failure to follow this procedure

may cause the AP database to become inaccessible

during boot.


Manually Switching the Active Path

Note – You can perform a switch at any time, even while

I/O is occurring on the device. You might want to

experiment with the switching process to verify that you

understand it and that your system is set up properly,

rather than wait until a critical situation occurs.

1. Use apconfig -S to view the current configuration:

# apconfig -S

c1      pln0  P A
c3      pln1
        metadiskname(s):
                mc1t5d0
                mc1t4d0
                mc1t3d0
                mc1t2d0
                mc1t1d0

2. To perform the switch, use apconfig -P -a, where

-P identifies the pathgroup and -a specifies the path

to become active.

# apconfig -P pln0 -a pln1

3. Verify the results with the apconfig -S command.

You can see that the active alternate has been

switched to pln1.

# apconfig -S

c1      pln0  P
c3      pln1  A
        metadiskname(s):
                mc1t5d0
                mc1t4d0
                mc1t3d0
                mc1t2d0
                mc1t1d0


Note – Remember that switch operations take effect

immediately.


Automatic Disk Pathgroup Switching (AP 2.1)

AP 2.1 provides the ability to automatically switch the

active path of a disk pathgroup. This will occur only

under two conditions:

● The currently active path has failed

● DR requests the switch (Enterprise 10000 only)

If AP detects that a path has failed, it will be marked

with a T in the apconfig -S output.

# apconfig -S

c1      pln0  P A
c3      pln1  T
        metadiskname(s):
                mc1t5d0
                mc1t4d0
                mc1t3d0
                mc1t2d0
                mc1t1d0

When a path is marked T (tried), AP will not

automatically switch to it. You can reset the tried flag by:

● Rebooting the system

● Using DR to detach and then reattach the board

● Resetting the flag manually with apdisk -w. Specify

the tried path, not the pathgroup name.

# apdisk -w pln1
#

Note – Resetting the flag manually should only be done

after the cause of the failure has been repaired.

You can still manually switch to a path marked tried

with the apconfig -P command.
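
The effect of the tried flag on automatic switching can be illustrated with a small model. This Python sketch follows only the behavior described in this section (a failed path is marked tried, and AP does not switch automatically to a tried path); the data layout and function names are hypothetical and it is not how AP is implemented.

```python
def handle_path_failure(group):
    """Model what happens when the active path of a disk pathgroup fails.

    group: dict with "ports" (two port names), "active" (one of them),
           and "tried" (set of ports currently marked T).
    Returns True if an automatic switch took place.
    """
    failed = group["active"]
    group["tried"].add(failed)                 # the failed path is marked T
    other = [port for port in group["ports"] if port != failed][0]
    if other in group["tried"]:
        return False                           # no automatic switch to a tried path
    group["active"] = other
    return True


def reset_tried(group, port):
    """Model "apdisk -w": clear the tried flag once the fault is repaired."""
    group["tried"].discard(port)
```

In this model, a second failure before the first path has been repaired and reset leaves the pathgroup without an automatic switch, which is why the flag should be cleared promptly after a repair.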


Creating a Network Pathgroup

This example assumes that you are creating a network

pathgroup using physical interfaces le0 and le2, with

le0 as the primary interface.

1. Use apnet to create an uncommitted network

pathgroup. The apnet command creates the meta-

interface names and updates the AP database with

the alternate paths.

# apnet -c -p le0 -a le2

The -c operand specifies creation of a pathgroup,

and the -p and the -a operands specify the primary

and alternate paths, respectively.

2. Verify the results with apconfig -N -u.

# apconfig -N -u

metanetwork: mle0 U
physical devices:
        le2
        le0 P A

3. Use apdb -C to commit the new database entries.

# apdb -C

4. Use apconfig -N to view the new network entries

in the database. Note that the U is now gone.

# apconfig -N

metanetwork: mle0
physical devices:
        le2
        le0 P A
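
The apconfig -N listing can be parsed in the same way as the disk listing, for example to report which adapter is active. This Python sketch assumes a "metanetwork:" line, a "physical devices:" line, and then one adapter per line with optional P and A flags; the function names are hypothetical.

```python
def parse_apconfig_n(text):
    """Parse an apconfig -N listing into a list of metanetworks."""
    networks = []
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "metanetwork:":
            current = {"name": tokens[1],
                       "uncommitted": "U" in tokens[2:],
                       "devices": []}
            networks.append(current)
            continue
        if tokens[:2] == ["physical", "devices:"]:
            tokens = tokens[2:]          # an adapter may follow on the same line
            if not tokens:
                continue
        if current is None:
            raise ValueError("adapter listed before any metanetwork")
        current["devices"].append({"name": tokens[0], "flags": set(tokens[1:])})
    return networks


def adapter_with_flag(network, flag):
    """Return the adapter carrying a flag ("P" or "A"), or None."""
    for device in network["devices"]:
        if flag in device["flags"]:
            return device["name"]
    return None


LISTING = """\
metanetwork: mle0 U
physical devices:
        le2
        le0 P A
"""
```

Here adapter_with_flag(network, "A") names the active adapter and adapter_with_flag(network, "P") names the primary one; after a switch the two differ.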


Alternately Pathing the Primary Network Interface

The primary network interface between your system and the other

machines on the network is difficult to configure down. There are

three ways to solve this problem:

● Create the appropriate AP database entries, create a new

/etc/hostname.mxxx file or rename the corresponding

/etc/hostname.xxx file, and then reboot your system.

● Set up a script file to perform the transition in your system

without rebooting.

Log in to your system from another network interface so that you

can stay connected when the primary network interface is

disabled.

You can also execute these commands all on one line, separated

with semicolons. Ensure that you do not have any syntax errors.

Remember to remove any /etc/hostname.qe0 and

/etc/hostname.qe4 files, and add the /etc/hostname.mqe0 file.

# ifconfig qe0 down unplumb
# ifconfig qe4 down unplumb
# ifconfig mqe0 plumb
# ifconfig mqe0 inet 136.162.22.45 netmask + broadcast + up

An example of a script to perform this operation is shown

below.


● Generate a script to configure the qe0 and qe4 interfaces down,

then configure up the meta-network interface. This method does

not require you to reboot your system, but you will briefly lose allcommunication over the primary network interface.

# ifconfig -a
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
        inet 127.0.0.1 netmask ff000000
qe0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
        inet 136.162.22.45 netmask ffffff00 broadcast 136.162.22.255
        ether 0:0:be:0:8:c5
# cat > /tmp/washington.restart
ifconfig qe0 down unplumb
ifconfig qe4 down unplumb
ifconfig mqe0 plumb
ifconfig mqe0 inet 136.162.22.45 netmask + broadcast + up
^D
# chmod 700 /tmp/washington.restart
# nohup /tmp/washington.restart &
# ifconfig -a
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
        inet 127.0.0.1 netmask ff000000
mqe0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
        inet 136.162.22.45 netmask ffffff00 broadcast 136.162.22.255
        ether 0:0:be:0:8:c5
#

Boot Time Interface Failure

If the primary network path fails at boot time, AP will switch the

primary interface to the other alternate. An automatic switch due

to an error will not occur at any other time.

Switching a Network Pathgroup

Remember that you can switch the active interface of a

network pathgroup while the meta-interface is active.

The change is recorded in the state databases. The new

active path will be used until you switch back, even after

a reboot.

To switch the active interface, use the apconfig command. The change will occur immediately. There is

no commit process for pathgroup switching.

# apconfig -P mle0 -a le2

You can see that the switch has occurred by using the

apconfig -N command.

# apconfig -N

metanetwork: mle0
physical devices:
        le2 A
        le0 P

Note – Remember that switch operations take effect

immediately; there is no commit process for them.

Warning – When you switch interfaces, AP does not

check that the interface you are going to is the correct

path. AP does not know if the new interface is connected

to the wrong subnet, disconnected, or inoperative.
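
Because AP does not validate the path you switch to, it can be useful to confirm which device is active before and after the switch. The following is a minimal sketch, not part of the AP software: the function name ap_active_path is hypothetical, and it assumes that apconfig -N prints a "metanetwork:" line, a "physical devices:" line, and then one device per line followed by its P and A flags, as in the listing above. It is written for a POSIX shell and an awk such as nawk.

```shell
# Print "<metanetwork> <device>" for every physical device flagged A
# (active) in `apconfig -N` output read from standard input.
# Assumed layout (an assumption, not confirmed by the AP documentation):
#   metanetwork: mle0
#   physical devices:
#           le2 A
#           le0 P
ap_active_path() {
    awk '
        $1 == "metanetwork:" { meta = $2; next }
        $1 == "physical"     { next }
        {
            # Fields after the device name are flags such as P and A.
            for (i = 2; i <= NF; i++)
                if ($i == "A") print meta, $1
        }
    '
}

# Usage on a live system (not executed here):
#   apconfig -N | ap_active_path
```

Comparing the result before and after apconfig -P confirms that the database records the switch; it does not prove that the new path is cabled to the correct subnet.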

Dynamic Reconfiguration B

Introduction to Dynamic Reconfiguration

What Is Dynamic Reconfiguration?

Dynamic Reconfiguration (DR) is the ability to alter the configuration

of a running system by bringing components online or taking them

offline without disrupting system operation or requiring a system

reboot. With the availability of DR, system boards can be logically and

physically included in the system configuration, or logically

deactivated and removed while the system is running.

DR is useful in mission-critical environments if a system board fails

and must be replaced or if new system boards need to be added to the

system for additional performance and capacity. It is a critical part of

the concurrent maintenance strategy prevalent in the enterprise

computing environment.

Note – DR capability requires that the system OBP be at revision 3.2.22

or later (refer to the prtconf -V command) and the operating system

be Solaris 7 5/99 or later (refer to the /etc/release file).
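
The OBP minimum in this note can be checked mechanically. The sketch below compares two dotted revision strings component by component. The function name version_at_least is hypothetical, and the usage line assumes that prtconf -V prints a line such as "OBP 3.2.22 ..." with the revision in the second field. It relies on POSIX parameter expansion, so it needs ksh, bash, or /usr/xpg4/bin/sh rather than the legacy Bourne shell.

```shell
# Succeed (exit status 0) if dotted revision $1 is at least revision $2.
# Components are compared numerically from left to right; a missing
# component counts as 0, so 3.2 is older than 3.2.22.
version_at_least() {
    have=$1 need=$2
    while [ -n "$have" ] || [ -n "$need" ]; do
        h=${have%%.*} n=${need%%.*}
        h=${h:-0} n=${n:-0}
        [ "$h" -gt "$n" ] && return 0
        [ "$h" -lt "$n" ] && return 1
        # Drop the component just compared from each string.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $need in *.*) need=${need#*.} ;; *) need= ;; esac
    done
    return 0
}

# Usage on a live system (not executed here; output format assumed):
#   rev=$(prtconf -V | awk '{ print $2 }')
#   version_at_least "$rev" 3.2.22 || echo "OBP $rev is too old for DR"
```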

Benefits of DR

DR increases system availability and flexibility by allowing the hot-

swap CPU/memory and I/O board functionality that the Sun

Enterprise 3000-6000 server hardware has supported from the

beginning. Hot-swap functionality means that the components can be

physically and logically removed or added while the system is

running.

DR includes:

● Dynamic attachment of system boards making them available for

use without rebooting the system

● Dynamic detachment of system boards making them ready for

physical removal without rebooting the system

● Display of board status

● Initiation of board testing

Disadvantages of DR

The main disadvantage is that to dynamically add and remove

CPU/Memory boards, you must set memory_interleaving to min (that is,

disable it), because DR cannot handle memory spread across boards.

This has a major impact on performance.

Supported Hardware

Table 2-1 lists the supported system board types that the cfgadm command displays. System I/O boards are classified by numerical type value.

Table 2-1 DR Supported Boards

Type         Name and Identifying Characteristics
CPU/mem      CPU/memory board with at least one CPU module
Mem          CPU/memory board with no CPU module
Disk board   System board containing two SCSI disk drives
Type 1       SBus I/O board with 3 SBus slots and 2 FC-OMs
Type 2       Graphics I/O board with 1 UPA slot, 2 SBus slots, and 2 FC-OMs
Type 3       PCI+ I/O board with 2 PCI card adapter slots
Type 4       SBus+ I/O board with 3 SBus slots and 2 GBICs
Type 5       Graphics+ I/O board with 1 UPA slot, 2 SBus slots, and 2 GBICs

!Caution – Do not assume that because an I/O board supports DR, the

SBus cards on it also support DR. For a complete list of supported hardware,

refer to http://sunsolve5.sun.com/sunsolve/Enterprise-dr/

Limitations to Dynamic Reconfiguration

Slot 1 board cannot be removed

● The slot 1 board provides the electrical path to devices on the clock

board, and is normally the lowest-numbered working I/O board.

First CPU board cannot be removed

● The POST master is also set up as the JTAG master. Because the

JTAG master controls the DR POST, this board cannot be removed

using DR.

It is not difficult to crash the system

● Inserting a failed board can immediately crash the system.

Connecting a bad board that passes POST can also crash the

system.

● Bending a pin when inserting a board can crash the system. Hardware

slots are not isolated.

● Inserting a board too slowly can panic Solaris. If an interrupt is

in flight when the pause pin is asserted during insertion for more

than one second, Solaris will panic.

Fails using 168 MHz modules

● POST fails during DR connect on a 168 MHz machine. A DR connect

operation with a CPU/Memory board that has UltraSPARC I

modules can fail or take a long time.

Fails in single-user mode

● DR connect operations performed in single-user mode cause the

system to hang.

Displaying Board Status

Basic Status Display using cfgadm

When used without options, the cfgadm command displays

information about all known attachment points, the collective term for a

board and its card cage slot (or receptacle).

There are two types of system names for attachment points:

● Physical attachment point – Describes the software driver and

location of the card cage slot. For example:

/devices/central@1f,0/fhc@0,f8800000/clock-board@0,900000:sysctrl,slot0

● Logical attachment point – An abbreviated name created by the

system to refer to the physical attachment point. For example:

sysctrl0:slot0

DR displays the status of the slot, the board, and the attachment point.

The DR definition of a board also includes the devices connected to it.

The term occupant is used to refer to the combination of board and

attached devices.

The following display shows a typical cfgadm output:

Ap_Id            Receptacle    Occupant      Condition
ac0:bank0        connected     configured    ok
ac0:bank1        empty         unconfigured  unknown
ac1:bank0        connected     configured    ok
ac1:bank1        empty         unconfigured  unknown
ac2:bank0        connected     configured    ok
ac2:bank1        empty         unconfigured  unknown
sysctrl0:slot0   connected     configured    ok
sysctrl0:slot1   connected     configured    ok
sysctrl0:slot2   disconnected  unconfigured  unknown
sysctrl0:slot3   connected     configured    ok
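
The procedures later in this appendix repeatedly ask you to check the three status columns for one attachment point. The following sketch shows one way to pull those values out of the basic display. The function names ap_state and ap_problems are hypothetical, and the code assumes the four-column layout shown above (Ap_Id, Receptacle, Occupant, Condition). It uses awk -v, so on Solaris it would need nawk or /usr/xpg4/bin/awk rather than the older /usr/bin/awk.

```shell
# Print "receptacle occupant condition" for the attachment point named
# in $1, reading basic `cfgadm` output from standard input.
ap_state() {
    awk -v ap="$1" '$1 == ap { print $2, $3, $4 }'
}

# List attachment points whose condition is neither "ok" nor "unknown"
# (for example failing, failed, or unusable), skipping the header line.
ap_problems() {
    awk 'NF >= 4 && $1 != "Ap_Id" && $4 != "ok" && $4 != "unknown" {
        print $1, $4
    }'
}

# Usage on a live system (not executed here):
#   cfgadm | ap_state sysctrl0:slot2
#   cfgadm | ap_problems
```

For the listing above, ap_state sysctrl0:slot2 would report a disconnected, unconfigured board in unknown condition, which is the state expected for a board that has been inserted but not yet connected.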

The following tables list the possible states of the receptacle and occupant.

LED states are given in order as you look at the LEDs from left to right.

Receptacle Status   Explanation

Empty               No board is present in the slot. All LEDs are off.

Disconnected        A board is present but is electrically disconnected. The
                    system can identify the board type. The board is in low
                    power mode and can be unplugged at any time.
                    LED state: off on off

Connected           The board is electrically connected and powered up. The
                    system is actively monitoring the board for temperature
                    and cooling.
                    LED state: on off off

Occupant Status     Explanation

Configured          Devices on the board are fully initialized and can be
                    mounted or configured for use.
                    LED state: on off blink

Unconfigured        The unconfigured state covers all device states that are
                    not configured, including receptacles in the empty state.
                    LED state: on off off

Condition   Explanation

Unknown     The current condition cannot be determined. This situation
            results either when a new board is inserted in a running system,
            or when a board is placed on the disabled board list before a
            reboot. A transition to a connected receptacle state changes an
            attachment point condition from unknown to either OK or Failed.

OK          No problems detected. This condition occurs only after a board
            has been connected. This condition persists either until the
            board is physically removed, or a problem is detected. An OK
            condition requires correct hardware compatibility, correct
            firmware revision, adequate power, adequate cooling, and
            adequate precharge.

Failing     A failing condition occurs when a board that was in the OK
            condition develops a problem.

Failed      The board has failed POST/OBP. A failed condition can occur
            either during bootup or after a failed connect attempt. This
            condition is considered uncorrectable and will persist until the
            board is physically removed.

Unusable    Either an attachment point has incompatible hardware, or an
            empty attachment point lacks power, cooling, or precharge
            current.

Detailed Status Display using cfgadm -v

For a more detailed status report, use the command cfgadm -v. The

-v option turns on expanded (verbose) descriptions.

Figure B-1 shows a breakdown of each field found in the output of the

cfgadm -v command. The example shown is of a 64MB memory

module.

Figure B-1 Detail status display entry

ac0:bank0 connected configured ok slot0 64mb base 0x00000000
May 1 13:00 memory n /devices/fhc@0,f8800000/ac@0,1000000/bank0

Field                                        Meaning
ac0:bank0                                    Attachment point
connected                                    Slot electrical condition
configured                                   Board status
ok                                           Board operational condition
slot0 64mb base 0x00000000                   Location
memory                                       Board type
n                                            Board activity (board not busy)
/devices/fhc@0,f8800000/ac@0,1000000/bank0   Physical ID and location

Reconfiguration Considerations

Device Driver Interface (DDI)

For a device to fully conform to DR, its device driver must support

DDI_ATTACH, DDI_DETACH, and DDI_SUSPEND/RESUME.

All drivers support DDI_ATTACH, but not all drivers support

DDI_DETACH and DDI_SUSPEND/RESUME.

A DR detach must pause (quiesce) the operating system, and to do

this the driver must be suspend-safe.

Suspend-Safe and Suspend-Unsafe Devices

A driver is suspend-safe if it supports operating system quiescence, that

is, one that does not access memory or interrupt the system while the

operating system is in quiescence (suspend/resume).

It also guarantees that when a suspend request is successfully

completed, the device that the driver manages does not attempt to

access memory, even if the device is open when the suspend request is

made.

A suspend-unsafe device is one that allows a memory access or a system

interruption while the operating system is in quiescence.

Suspend-safe drivers allow the operating system to:

● Stop user threads.

● Execute the DDI_SUSPEND call in each device driver.

● Stop the clock and CPUs.

The operating system refuses a quiescence request if a suspend-unsafe

device is open.

To manually suspend the device, you must issue a modunload command.

Testing for Suspend-Safe Drivers

The quiesce-test option tests for suspendable drivers. For example:

# cfgadm -x quiesce-test sysctrl0:slot<number>

Note – All tape drivers are considered suspend-unsafe.

Hot-Plug Hardware

Hot-plug boards and modules have special connectors that supply

electrical power to the board or module before the data pins make

contact. Boards and devices that do not have hot-plug connectors

cannot be inserted or removed while the system is running.

!Caution – Before inserting a board into the centreplane, it is essential

that the precharge voltages are present. Ensure the PPS is supplying

these voltages by typing:

/usr/platform/sun4u/sbin/prtdiag -v | grep precharge

I/O boards and CPU/memory boards used in Enterprise x000 and

x500 systems are hot-plug devices. Some devices, such as the clock

board, are not hot-plug modules and cannot be removed while the

system is running.

Permanent Memory Management

Certain parts of memory cannot be paged out during a detach. This

permanent memory includes the kernel and OBP.

The kernel is loaded to high order memory during boot up.

The kernel must be confined to one system board, a process known as

caging the kernel.

The only system board that cannot be removed from an operating

system is the board in the lowest numerical slot.

It is recommended that steps be taken to force the kernel to load on

that board so only one system board is restricted.

Required additions to /etc/system

The following entries must be added to the /etc/system file. The

following enables DR on I/O boards:

set soc:soc_enable_detach_suspend=1
set pln:pln_enable_detach_suspend=1

The following enables DR on CPU/Memory boards:

set kernel_cage_enable=1
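
Before attempting a DR operation, it can help to confirm that these entries are actually present. The following sketch reports any of the three entries that a given file lacks. The function name missing_dr_settings is hypothetical, and the check is deliberately strict: it looks for each entry as an exact, whole line, so an entry written with different spacing, or commented out with a leading asterisk, is reported as missing.

```shell
# Print each DR-related entry that is missing from the file named in $1
# (default: /etc/system). Prints nothing if all three are present.
missing_dr_settings() {
    file=${1:-/etc/system}
    for entry in \
        'set soc:soc_enable_detach_suspend=1' \
        'set pln:pln_enable_detach_suspend=1' \
        'set kernel_cage_enable=1'
    do
        # -F: fixed string, -x: match the whole line, -q: no output.
        grep -Fqx -e "$entry" "$file" || printf '%s\n' "$entry"
    done
}

# Usage on a live system (not executed here):
#   missing_dr_settings /etc/system
```

Changes to /etc/system take effect at the next reboot, so a clean result from this check only describes the file, not the running kernel.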

Procedures - Removing a CPU/Memory Board

!Note – Performing the following board removal procedures is the

responsibility of the system administrator. However, it is important for

you to understand these procedures in order to assist where possible.

The memory modules on a CPU/memory board can be shared by

other CPU/memory boards. Therefore, you must halt all use of

memory modules on a board before you can remove the board.

1. Log in to the system console as root.

2. Use the cfgadm command to determine the system name for the

CPU/memory board and associated memory banks.

Note – A CPU/memory board can have up to two banks of memory.

Memory banks have logical names of the form ac<number>:bank<number>. The term ac<number> identifies the driver instance, but the number is not

directly related to the board slot number. The bank number is either

bank0 or bank1.

Note – For the example in this procedure, the board is ac1, which has

one memory bank (bank1).

Also, verify that you can relocate the memory modules on the

CPU board.

# cfgadm -v

You cannot unconfigure non-relocatable memory pages in the

memory span (a section of memory that is reserved for system

use). Non-relocatable memory is identified as permanent in a

cfgadm listing.

3. If the memory is relocatable, stop all activity in the memory

modules on the board.

# cfgadm -c unconfigure ac1:bank1

This step halts all accesses by other CPU/memory boards and

prevents any further memory use until the board is replaced.

4. Verify that the CPUs on the board are not bound to any processes

running in the system.

If a CPU is bound to a process, the board cannot be removed until

the process is unbound.

The CPUs are identified by numbers that are related to the board

number. The first CPU number is twice the board number (2*n).

The second CPU number is twice the board number, plus one (2*n

+ 1).

To list all bound processes, use the pbind command. If any of the

listed processes show the CPUs in question, the related boards

cannot be removed until those processes are unbound.

The following example shows that process ID 1145 is bound to

processor 10 (board number 5, CPU 0). The pbind -u (unbind)

command unbinds the process. The pbind -q (query) command

shows that process ID 1145 is no longer bound.

# pbind

process id 1145: 10

# pbind -u 1145
# pbind -q 1145

process id 1145: not bound

5. Unconfigure the board.

# cfgadm -c unconfigure sysctrl0:slot<number>

where <number> is the slot location (number) in the card

cage.

6. If the previous step did not also disconnect the board, disconnect

the board by typing the following command:

# cfgadm -c disconnect sysctrl0:slot<number>

7. When the LEDs on the board indicate that the board is ready for

removal (two outer LEDs off and the middle LED on), you can

physically remove the board.
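
The CPU numbering rule in step 4 (2*n and 2*n + 1 for board n) can be combined with the pbind listing to find the processes that block removal of a particular board. The following is a sketch only: the function names board_cpus and bound_to_board are hypothetical, and the parsing assumes pbind lines of the form "process id 1145: 10" as shown in the example above. It uses awk -v, so on Solaris it would need nawk.

```shell
# Print the two CPU numbers carried by the board in slot $1.
board_cpus() {
    echo "$((2 * $1)) $((2 * $1 + 1))"
}

# Read `pbind` output on standard input and print the IDs of processes
# that are bound to either CPU on board $1.
bound_to_board() {
    first=$((2 * $1))
    second=$((first + 1))
    awk -v a="$first" -v b="$second" '
        $1 == "process" && ($4 == a || $4 == b) {
            sub(/:$/, "", $3)    # strip the colon after the process ID
            print $3
        }
    '
}

# Usage on a live system (not executed here):
#   for pid in $(pbind | bound_to_board 5); do pbind -u "$pid"; done
```

For board 5, board_cpus prints 10 and 11, which matches the example in step 4 where process 1145 is bound to processor 10.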

!Caution – If a replacement board is not available and you remove the

board, you must fill the empty slot to maintain the proper flow of

cooling air in the card cage. For Sun Enterprise 3000, 3500, 4000, 4500,

5000, and 5500 systems, use a dummy board (part number 504-2592).

For Sun Enterprise 6000 or 6500 systems, use a load board

(part number 501-3142).

Procedures - Installing or Replacing a CPU/Memory Board

1. Verify that the selected board slot can accept a board.

# cfgadm

The states and conditions should be:

▼ Empty, Unconfigured, Unknown

2. Physically insert the board into the slot and watch for an

acknowledgment on the system console or in the system log file.

The acknowledgment is of the form,

Name inserted into slot <number>

where Name is the name of the system board being installed and

<number> is the slot location (number) in the card cage.

After a CPU/memory board is inserted, the states and conditions

should become:

▼ Disconnected, Unconfigured, Unknown

Note – Any other states or conditions should be considered an error.

Operating System Quiescence

During an insertion operation of a board, the operating system is

briefly paused, which is known as operating system quiescence. All

operating system and device activity on the backplane must cease for a

few seconds during a critical phase of the operation. You must reply

with a yes to continue or no to stop the configuration process and

allow the operating system to continue operating normally.

Before quiescence can be achieved, the operating system must

temporarily suspend all processes, CPUs, and device activities. If the

operating system cannot achieve quiescence, it displays the reasons,

which can include the following:

● A user thread did not suspend

● Real-time processes are running

● A device exists that cannot be paused by the operating system

The conditions that cause processes to fail to suspend are generally

temporary. Examine the reasons for the failure. If the operating system

encountered a transient condition causing a failure to suspend a

process, you can try the operation again.

3. Configure the board.

# cfgadm -v -c configure sysctrl0:slot<number>

This command should both connect and configure the receptacle.

Use the cfgadm command to verify this.

The states and conditions for a connected and configured

attachment point should be:

▼ Connected, Configured, OK

Now the system is aware of the usable devices on the board and

the devices can be used.

4. Configure the memory devices on the board in Solaris.

# drvconfig -i ac

5. Determine the system numbers of the new CPU modules.

# psrinfo

4 on-line since 5/15/99 08:01:14
5 on-line since 5/15/99 08:01:19
6 powered-off since 5/16/99 09:27:21

In this example, there is one new CPU module (system number 6).

The module has not yet been enabled, so it is listed as being

powered off.

The system number for a CPU is equal to twice the board number,

plus 0 for CPU module 0, or 1 for CPU module 1. In the example

shown, system number 6 represents module 0 on board number 3.

6. Enable the new CPU module or modules.

# psradm -n 6

where 6 is the system number of the CPU module to be enabled.

7. Test the new memory banks.

# cfgadm -o test_type -t ac<number>:bank0
# cfgadm -o test_type -t ac<number>:bank1

where test_type is one of three memory tests:

▼ Quick – Writes a pattern of ones and zeros.

▼ Normal – Detects specific memory address failures.

▼ Extended – Tests interference between memory cells.

Note – The ac<number> name can be found in the basic or detailed status

display.

8. Configure the new memory banks.

# cfgadm -c configure ac<number>:bank0
# cfgadm -c configure ac<number>:bank1

9. Verify that the board and the memory banks are configured.

▼ For the CPU status, use the psrinfo or mpstat commands.

▼ For the memory status, use the prtconf or vmstat commands.
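
Steps 5 and 6 depend on reading psrinfo output and mapping a system number back to a board and module. The sketch below does both. The function names cpu_location and offline_cpus are hypothetical, and the parsing assumes that the state is the second field of each psrinfo line, as in the example in step 5.

```shell
# Translate a CPU system number into its board and module numbers,
# using the rule system number = 2 * board + module.
cpu_location() {
    echo "board $(( $1 / 2 )) module $(( $1 % 2 ))"
}

# Read `psrinfo` output on standard input and print the system numbers
# of CPUs that are not on-line (for example, powered-off).
offline_cpus() {
    awk 'NF >= 2 && $2 != "on-line" { print $1 }'
}

# Usage on a live system (not executed here):
#   for cpu in $(psrinfo | offline_cpus); do
#       echo "CPU $cpu: $(cpu_location "$cpu")"
#   done
```

For the example in step 5, offline_cpus reports 6 and cpu_location 6 gives board 3, module 0, which agrees with the text.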

Procedures - Removing an I/O Board

This procedure assumes that all activity going to the I/O board to be

removed has been stopped, file systems have been unmounted, and

network interfaces have been shut down.

Or, if AP is in use, all I/O functions have been switched to the

alternate I/O board.

1. Verify that all I/O activity to the board has been terminated.

2. Check the status of the board.

# cfgadm

For a board removal or replacement, the states and conditions

must be one of the following sets:

If the board is ok, state is:

▼ Connected, Configured, OK

If the board is failing, state is:

▼ Connected, Configured, Failing

3. Unconfigure the board.

# cfgadm -c unconfigure sysctrl0:slot<number>

4. Use the cfgadm command to confirm that the board is

unconfigured.

If the unconfigure operation failed, verify that:

▼ The board is Detach-Safe.

▼ Activity on the board has been quiesced.

!Caution – A failure of step 4 results in a partially unconfigured

condition. If this happens, attempt to unconfigure again. A

configuration operation is not permitted at this point.

5. When the board is unconfigured, you can do one of the following:

▼ Leave the board in the system unconfigured

▼ Configure the board

▼ If the unconfigure operation did not disconnect the board

automatically, disconnect it manually by typing the following

command:

# cfgadm -v -c disconnect sysctrl0:slot<number>

6. If you wish to remove the board from the card cage, first verify the

board status.

▼ Use the cfgadm command to verify that the board is logically

disconnected.

▼ Check the LEDs on the board to verify that the board is

electrically disconnected. The two outer LEDs must be off and

the middle LED must be on.

!Caution – If a replacement board is not available and you remove the

board, you must fill the empty slot to maintain the proper flow of

cooling air in the card cage. For Sun Enterprise 6000 or 6500 systems,

use a load board (part number 501-3142). For all other systems, use a

dummy board (part number 504-2592).

Procedures - Removing Boards that Use Detach-Unsafe Drivers

Some drivers do not yet support DR on Sun Enterprise 3x00, 4x00,

5x00, and 6x00 systems. DR cannot detach these drivers, but you can

remove some undetachable drivers manually.

1. Halt all use of the device controller.

2. Halt the use of all other controllers of the same type on all boards

in the machine.

The remaining controllers can be used again after the DR

unconfigure operation is complete.

3. Use Unix commands to manually close all such drivers on the

board and use the modunload command to unload them.

# modinfo | grep tape
107 f66a0000 dfe9 33 1 st (SCSI tape driver 1.1)

# modunload -i 107
#

4. Disconnect the board.

# cfgadm -c disconnect sysctrl0:slot<number>

The disconnected board can be physically removed now or at a

later time.

!Caution – Many third-party drivers (those purchased from vendors

other than Sun Microsystems) do not yet properly support the

standard Solaris software modunload interface. Test these driver

functions during the qualification and installation phases of any third-

party device.
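
In step 3, the module ID passed to modunload is read by eye from the modinfo listing. The following sketch extracts it by driver name instead. The function name module_ids is hypothetical, and the code assumes the modinfo column order Id, Loadaddr, Size, Info, Rev, Module Name, which matches the example line above. It uses awk -v, so on Solaris it would need nawk.

```shell
# Read `modinfo` output on standard input and print the ID of each
# loaded module whose module-name column equals $1 (for example "st").
module_ids() {
    awk -v name="$1" '$6 == name { print $1 }'
}

# Usage on a live system (not executed here):
#   id=$(modinfo | module_ids st)
#   [ -n "$id" ] && modunload -i "$id"
```

Matching on the module-name column rather than grepping the description avoids unloading an unrelated module whose description happens to contain the same word.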

Procedures - Installing a New I/O Board

1. Verify that the selected board slot is ready for a board.

# cfgadm

The states and conditions should be:

▼ Empty, Unconfigured, Unknown

2. Physically insert the board into the slot and look for an

acknowledgment on the console in the form of

Name board inserted into slot <number>

After an I/O board is inserted, the states and conditions should

become:

▼ Disconnected, Unconfigured, Unknown

Note – Any other states or conditions should be considered an error.

3. Connect any peripheral cables and interface modules to the board.

4. Configure the board with the following command:

# cfgadm -v -c configure sysctrl0:slot<number>

Note – This command should both connect and configure the

receptacle.

5. Verify with the cfgadm command.

The states and conditions for a connected and configured

attachment point should be

▼ Connected, Configured, OK

Now the system is also aware of the usable devices that reside on

the board and all devices that can be mounted or configured to be

used.

If the command fails to connect and configure the board and slot,

try the connection and configuration as separate steps:

a. Connect the board and slot by typing the following:

# cfgadm -v -c connect sysctrl0:slot<number>

The states and conditions for a connected attachment point

should be:

▼ Connected, Unconfigured, OK

Now the system is aware of the board, but not the usable

devices which reside on the board. Temperature is monitored

and power and cooling affect the attachment point condition.

b. Configure the board and slot by typing the following:

# cfgadm -v -c configure sysctrl0:slot<number>

The states and conditions for a configured attachment point

should be:

▼ Connected, Configured, OK

Now the system is also aware of the usable devices that reside

on the board and all devices that can be mounted or

configured.

6. Reconfigure the devices on the board.

# drvconfig; devlinks; disks; ports; tapes;

Reconfiguring the system normally falls under one or more of the

following categories:

● Board removal – If you remove a board that is not to be replaced,

you can (but do not have to) execute the reconfiguration sequence

to clean up the /dev links for disk devices.

● Board change – If you remove a board and then insert it into a

different slot, or replace a board with another board that has

different I/O devices, you must execute the reconfiguration

sequence to configure the I/O devices associated with the board.

● Board installation – When adding a board, you must execute the

reconfiguration sequence to configure the I/O devices associated

with the board.

● Board replacement – If you replace a board with another board

that hosts the same set of I/O devices, inserting the replacement

into the same slot, you might not need to execute the

reconfiguration sequence.

The console should display a list of devices and their addresses.

7. Activate the devices on the board using commands such as mount and ifconfig, as appropriate.

Procedures - Installing a Replacement I/O Board

This procedure assumes that you have previously performed the

Removing an I/O Board procedure discussed earlier in this module.

1. If you are not continuing from the procedure Removing an I/O

Board, use the cfgadm command and select a card cage slot to use,

but do not insert the board yet.

2. View the configuration list and verify that the slot is unconfigured.

# cfgadm

3. Insert the board in the slot and look for an acknowledgment on the

console, such as:

Name board inserted into slot<number>.

4. Use the cfgadm command again to look for the system name

assigned to the new board.

5. Configure the board using the system name for the board.

# cfgadm -c configure sysctrl0:slot<number>

6. Configure any I/O devices on the board using commands such as

drvconfig and devlinks, as appropriate.

7. Activate the devices on the board using commands such as mount and ifconfig, as appropriate.
