- Topic ID: id_17423153
- Version: 3.0
- Date: Nov 27, 2020 2:15:25 AM
Host Computer (Z840) Recon GPU Card Replacement
Prerequisites
Overview
This procedure shall be followed when replacing the Recon GPU card in the Host Computer (Z840).
Figure 1. Recon GPU card

1 Host Computer Removal
Procedure
- Shut down the system. Select one of the following methods to power OFF the Console:
- If Applications are up, click the Shut Down button on the desktop display and select Shutdown.
- If Applications are down, open a Terminal Window. Type: halt, then press ENTER.
- When the halt command has finished, power OFF the console at the front panel switch.
- Apply LOTO. See Equipment Service - Lockout-Tagout-PPE procedure.
- Remove Front and Top covers. Refer to the following procedure.
Refer to Replacement → Console → Console Cover Removal and Installation
- Remove the host computer from the console chassis. Refer to the following procedure.
Refer to Replacement → Console → Host Computer (Z840) Replacement
2 Recon GPU Card Replacement
Procedure
- Open the host computer side access panel.
- Remove the Expansion Card Support.
Figure 2. Z840 Airflow Guide and Expansion Card Support

- Disconnect the power cable from the Recon GPU card.
- Replace the existing Recon GPU card with the new one.
Note: Lift up the card latch when removing the card (see Figure 4).
Figure 3. Z840 Component Location

Figure 4. PCI card Latch

- Connect the power cable to the Recon GPU card.
- Install the Expansion Card Support and close the Side Access Panel.
3 Restore the Console
Procedure
- Install the Host computer into the Console chassis. Refer to the following procedure.
Refer to Replacement → Console → Host Computer (Z840) Replacement
- Reconnect all cables removed earlier to the Z840 computer.
- Install Console covers. Refer to the following procedure.
Refer to Replacement → Console → Console Cover Removal and Installation
- Remove LOTO on console.
4 Finalization
Procedure
- Confirm Host computer powers up when console power is turned on.
- Check that the GPU card is installed. Open a shell, then type:
{ctuser@hostname} ls /proc/driver/nvidia/gpus | wc -l
- If “2” displays, the Recon GPU card is installed.
- If “1” displays, the Recon GPU card is not installed.
- Ensure GPU ECC state is ON:
- Open a Unix shell and log on as root.
- Type: su - [ENTER].
- Type the root password [ENTER].
- Type: nvidia-smi [ENTER].
- Check GPU ECC status as below:
- If the GPU ECC is ON, below is what the output would look like (boxed in green):

- If the GPU ECC is OFF, below is what the output would look like (boxed in red):

- How to turn ECC back on:
- Type: nvidia-smi -g 0 --ecc-config=1 [ENTER]
- A message shows that ECC is enabled and that a reboot is required:

- After the reboot, confirm that ECC is ON by repeating the previous steps.
- Perform the Functional Checks → System Scanning Test instructions from the procedure list.
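
The GPU count check and the ECC check in the Finalization steps above can be sketched as a small shell script. This is only an illustration, not part of the service software: the function names check_recon_gpu and ecc_summary and all printed messages are hypothetical. The directory /proc/driver/nvidia/gpus is taken from the procedure. The ECC query uses the nvidia-smi --query-gpu=index,ecc.mode.current field, and the sketch assumes it prints "Enabled" or "Disabled" per GPU; any other value (for example, from a GPU without ECC support) is reported as unknown rather than guessed.

```shell
#!/bin/sh
# Sketch of the Finalization checks. Function names and messages are
# illustrative assumptions, not part of the service software.

# Count the entries under the NVIDIA driver directory.
# The default path is the one used in the procedure.
check_recon_gpu() {
    gpu_dir="${1:-/proc/driver/nvidia/gpus}"
    if [ ! -d "$gpu_dir" ]; then
        echo "NVIDIA driver directory not found: $gpu_dir"
        return 2
    fi
    count=$(ls "$gpu_dir" | wc -l | tr -d ' ')
    case "$count" in
        2) echo "Recon GPU card is installed (2 GPUs found)"; return 0 ;;
        1) echo "Recon GPU card is not installed (1 GPU found)"; return 1 ;;
        *) echo "Unexpected GPU count: $count"; return 2 ;;
    esac
}

# Summarize the ECC mode per GPU.
# Reads CSV lines such as "0, Enabled" on stdin (index, ecc.mode.current).
# Returns non-zero if any GPU reports ECC as Disabled.
ecc_summary() {
    rc=0
    while IFS=, read -r idx mode; do
        mode=$(printf '%s' "$mode" | tr -d ' ')
        case "$mode" in
            Enabled)  echo "GPU $idx: ECC ON" ;;
            Disabled) echo "GPU $idx: ECC OFF"; rc=1 ;;
            *)        echo "GPU $idx: ECC state not reported ($mode)" ;;
        esac
    done
    return $rc
}

# Usage on the console (requires the NVIDIA driver to be loaded):
#   check_recon_gpu
#   nvidia-smi --query-gpu=index,ecc.mode.current --format=csv,noheader | ecc_summary
```

Both functions only read state. Turning ECC back on still follows the nvidia-smi command and reboot described in the procedure.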