## Adaptive Computing

Gus Uht (w/ Rick Vaccaro)

Dept. of Electrical and Computer Engineering



#### UNIVERSITY OF Rhode Island



Copyright © 2003-5, A. K. Uht, R. J. Vaccaro Patent applied for. BARC: January 21, 2005





- Three typical goals (alone or combined):
  - 1. ~Maximize performance.
  - 2. ~Minimize power consumption.
  - 3. Adapt to current (or prior) conditions:
    - a. <u>Environmental</u>, e.g., temperature. (current)
    - b. <u>Operational</u>, e.g., power supply voltage. (current)
    - c. <u>Manufacturing</u>, e.g., slower or faster chips. (prior)
- ~All adaptive methods applicable to <u>all.</u>
- "Better-than-worst-case design"





- 1. Adaptive Methods
- 2. TEAtime Prototype Picture

- 3. Adaptive Control System
- 4. TEAPC & TEA42 Prototypes and Data
- 5. Demo TEA42
- 6. Summary



- Timing Error *Toleration*:
  - 1. Change something (freq., volt.), then:
  - 2. Let error occur, <u>detect it</u>, then <u>recover</u>.
  - 3. Change the thing back, repeat: GOTO 1.
  - Complex, hard to design; only ~optimal <u>within</u> cycle.
- Timing Error Avoidance:
  - 1. Change something (freq., volt.), then:
  - 2. Stop just <u>before</u> error would occur.
  - 3. Change the thing back, repeat: GOTO 1.
  - Simple, easy to design & build; <u>constant</u> cycles.



### **TEAtime Prototype**






## TEAPC, TEA42 Goals

- Motivating Goals:
  - Realize TEAtime characteristics in a real computer.
  - *Adaptive computing*, in particular.
- Additional Goals:
  - 1. Workload adaptation.
  - 2. Reduced power consumption.
  - 3. Improved reliability.
  - 4. Disaster tolerance (always enabled).
  - 5. ...and all in a production machine.
- <u>BUT</u>: can't redesign or build Pentium 4's.
- <u>SO</u>: use real IBM/Intel-standard PC.



## Current Control System

- Only input is CPU temperature (feedback line).
- Primary output is CPU frequency (N).
  [sometimes: Vcore = f (N)]
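
The loop above (temperature in, frequency out) can be sketched as a toy simulation. This is only an illustration: the first-order thermal model, the gain, the setpoint, and the frequency limits are all hypothetical values chosen for the example, and the simple integrating controller here is not the controller actually used in TEAPC.

```python
def simulate_thermal_loop(setpoint_c=60.0, ambient_c=25.0, steps=200):
    """Toy feedback loop: CPU temperature is the only input, frequency the output.

    Every constant below is a hypothetical value picked for illustration.
    """
    freq_ghz = 2.0          # controller output (the slide's N, expressed as a frequency)
    temp_c = ambient_c      # measured CPU temperature (the feedback line)
    for _ in range(steps):
        # Hypothetical plant: temperature relaxes toward a level set by frequency.
        steady_c = ambient_c + 12.0 * freq_ghz
        temp_c += 0.2 * (steady_c - temp_c)
        # Controller: raise frequency while below the setpoint, lower it above.
        error_c = setpoint_c - temp_c
        freq_ghz += 0.01 * error_c
        # Keep the output inside an assumed legal frequency range.
        freq_ghz = min(max(freq_ghz, 1.5), 3.3)
    return freq_ghz, temp_c


if __name__ == "__main__":
    cool = simulate_thermal_loop(ambient_c=25.0)
    warm = simulate_thermal_loop(ambient_c=35.0)
    print("cool room: %.2f GHz at %.1f C" % cool)
    print("warm room: %.2f GHz at %.1f C" % warm)
```

In this sketch a warmer room settles at a lower frequency for the same temperature setpoint, which is the adaptive behavior the slide describes.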





- Hardware:
  - No modifications  $\rightarrow$  all COTS parts.
  - System: ~3.0 GHz Pentium 4;
    Intel chipset; 1 GB RAM.
- Software (teapcwin program):
  - Realizes feedback control system.
  - Standard MS Windows application.
  - (Standard OS: W2K SP4; no modifications [of course].)
  - Small: 1.1 megabytes.
  - Fast: low CPU utilization.







### ...and now it's time for the:

# DEMO




**Overall Summary** 

- TEAtime, TEAPC & TEA42 realize:
  - 1. Better-than-worst-case performance.
  - 2. *Adaptive* operation to both environment and/or loading.
  - 3. Low-power, high-reliability operation.
  - 4. Disaster tolerance.
- Feedback-control great for a system, too.
- Adaptive systems are the way to go.
- They Work!!





- Appendix.
- Computer, March, 2004 Special Issue.
- IEEE Trans. Computers, Feb. 2005.
- My website: <u>www.ele.uri.edu/~uht</u>
- Or μRI website: <u>www.ele.uri.edu/muri</u>






- Timing Error Toleration.
- Timing Error Avoidance.
- Adaptive Control Systems.
- TEAPC Block Diagram.
- TEAPC Components.
- [TEAPC] Experiment Setup.
- TEAtime Block Diagram.
- [TEAtime Data.]
- TEAtime Summary.
- TEA42 Disaster Tolerance.



## **Timing Error Toleration**

- Operation:
  - 1. <u>Speed up clock</u> (resp. <u>reduce voltage</u>) until error occurs.
  - 2. <u>Slow down clock</u> (resp. <u>increase voltage</u>) for no error.
  - 3. Backtrack / repair error.
  - 4. Repeat: GOTO 1.
- Examples:
  - Uht 2000: *TIMERRTOL*: performance ~maximized; not pursued.
  - Austin et al. 2003: Razor: power ~minimized; chip fabbed.
- Plusses:
  - True (or better than) perf. max., resp. power min. [<u>within</u> cycle]
- Minuses:
  - Can be costly (TIMERRTOL: 2x cost; Razor: little extra cost)
  - Is complex and hard to design.
  - Can lead to <u>increase</u> in cycles  $\rightarrow$  perf. may NOT be true max.
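
The last minus can be illustrated with a toy model of the four-step loop. This is a sketch under stated assumptions, not TIMERRTOL or Razor: the function name, the picosecond values, and the fixed recovery cost are all hypothetical, and a real design would not probe the limit on every cycle.

```python
def run_toleration(true_delay_ps, start_period_ps, step_ps, work_cycles, recovery_cycles):
    """Toy toleration loop: speed up until an error occurs, repair it, slow down."""
    period = start_period_ps
    done = 0            # useful cycles completed
    errors = 0          # timing errors detected and repaired
    total_cycles = 0    # useful + failed + recovery cycles
    while done < work_cycles:
        total_cycles += 1
        if period < true_delay_ps:
            # Error occurred and was detected: the cycle's work is lost.
            errors += 1
            total_cycles += recovery_cycles   # step 3: backtrack / repair
            period += step_ps                 # step 2: slow down clock
        else:
            done += 1
            period -= step_ps                 # step 1: speed up clock
    return errors, total_cycles


if __name__ == "__main__":
    errors, total = run_toleration(
        true_delay_ps=8000, start_period_ps=10000, step_ps=500,
        work_cycles=100, recovery_cycles=2,
    )
    print(errors, total)
```

With these made-up numbers, 100 useful cycles take 385 cycles in total, so a shorter period does not by itself guarantee maximum performance.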



## **Timing Error Avoidance**

- **Operation:**
  - 1. <u>Speed up clock</u> (resp. <u>reduce voltage</u>) 'til **just before** error occurs.
  - 2. <u>Slow down clock</u> (resp. <u>increase voltage</u>) for no error.
  - 3. *Repeat: GOTO 1.* [No backtracking or repair needed.]
- **Examples:** 
  - Olivieri et al. 1999: used a microcontroller.
  - Uht 2003: *TEAtime*: ~performance maximization; prototype built.
    - One-bit wide slowest-path test logic always errs before real logic does.
- **Plusses:** 
  - Close to true performance maximization, resp. power minimization.
  - Simple, easy to design and build. (TEAtime)
  - No increase in cycles.
- Minuses:
  - Shorter papers. (no, wait, that's a plus...)
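
A toy model of the avoidance loop follows. It assumes the test logic is slower than the real logic by a fixed margin, so it is the test path that errs first; the function name and all picosecond values are hypothetical, and the model leaves out the actual TEAtime hardware.

```python
def run_avoidance(real_delay_ps, margin_ps, start_period_ps, step_ps, cycles):
    """Toy avoidance loop: the slower test path errs first and steers the clock."""
    test_delay_ps = real_delay_ps + margin_ps   # assumed: test logic is slower
    period = start_period_ps
    real_errors = 0
    history = []
    for _ in range(cycles):
        history.append(period)
        if period < real_delay_ps:
            real_errors += 1        # would be a real timing error
        if period < test_delay_ps:
            period += step_ps       # test logic erred: slow down clock
        else:
            period -= step_ps       # test logic passed: speed up clock
    return history, real_errors


if __name__ == "__main__":
    history, real_errors = run_avoidance(
        real_delay_ps=8000, margin_ps=500, start_period_ps=10000,
        step_ps=500, cycles=100,
    )
    print(min(history), real_errors)
```

As long as the step is no larger than the margin, the real logic never sees an error in this model, so there is nothing to repair and the cycle count stays at exactly `cycles`.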



## **Adaptive Control Systems**

- TEAtime:
  - Simple up/down control system; 0 delay.
- Skadron, Bahar:
  - Classic feedback control system.
  - In hardware.
- TEAPC, TEA42:
  - Bigger delays.
  - State-space feedback control system.
  - In software; little overhead.
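
For reference, a discrete-time state-space feedback controller has the generic textbook form below. The slides do not give the TEAPC/TEA42 model, so the matrices $A$, $B$, $C$ and the gain $K$ are placeholders; reading $y$ as CPU temperature and $u$ as the frequency setting is an assumption based on the control-system slide.

$$
x[k+1] = A\,x[k] + B\,u[k], \qquad y[k] = C\,x[k], \qquad u[k] = -K\,x[k]
$$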



## **TEAPC Block Diagram**





## **TEAPC Components**

| PC Component                   | Manufacturer                         | Part Number/Description                                                       |
|--------------------------------|--------------------------------------|-------------------------------------------------------------------------------|
| Motherboard                    | Gigabyte                             | GA-8KNXP (Rev. 2); w/DPS regulator                                            |
| CPU                            | Intel                                | P4 3.0 GHz 800 MHz bus                                                        |
| Chipset                        | Intel                                | 875P, ICH5R                                                                   |
| Clock Synthesizer              | ICS                                  | ICS952635                                                                     |
| Super I/O (Environment Mon.)   | ITE                                  | IT8712F V0.6                                                                  |
| CPU Volt. Regulator Control    | ITE                                  | IT8206R V0.1                                                                  |
| Main Memory                    | Ultra                                | U10-5903R; 2 x 512 MB;<br>400 MHz DDR, Dual Channel<br>(Operated at 320 MHz.) |
| Operating System               | Microsoft                            | Windows 2000 SP4, HT disabled                                                 |
| Disk System – RAID 0+1         | ITE                                  | GigaRAID IT8212F                                                              |
| Disks                          | Maxtor                               | 4 x 6E040L0, 40 GB, 133MHz IDE                                                |
| Equipment for experiments only |                                      |                                                                               |
| Fan Controller & Temp. Mon.    | Thermaltake                          | Hardcano 12; for 4 fans, 4 thermocouples                                      |
| Power Meter                    | Electronic<br>Educational<br>Devices | watts up? PRO<br>(Note: this is the unit's model name.)                       |
| CPU Fan Controller             | custom                               | On/Off, control sel. (MOBO or Hardcano)                                       |




## **Experiment Setup**






## **TEAtime Block Diagram**



NOTES: $w, x \gg 1$; DAC: Digital-to-Analog Converter; VCO: Voltage-Controlled Oscillator.

- Timing Error Avoidance system
  - Blue: TEAtime hardware
  - Green: on FPGA





**TEAtime Summary** 

- Performance of prototype improves by >= 34%
- TEAtime provides adaptable frequency control of any synchronous digital system
- Always ~maximizes performance
- Very cheap
- Very easy to add to existing or future designs
- ~Can be adapted to physically existing systems
- <u>It Works!!</u>




## **TEA42 Disaster Tolerance**

- Example: CPU fan dies...
- Changes (automatic, via feedback system):
  - Freq:  $3.15 \text{ GHz} \rightarrow 1.5 \text{ GHz}$
  - Vcore: ~1.5 V $\rightarrow$ ~1.1 V
  - $\rightarrow$ Power: ~160 W $\rightarrow$ ~100 W (~<u>37% savings</u>)
- CPU temperature stabilizes at safe value (with this CPU).
- System still works.
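
The arithmetic behind these figures can be checked in a few lines. The savings figure follows directly from the slide's numbers. The second calculation applies the standard CMOS dynamic-power relation (power proportional to V² · f) as a rough comparison only: the slide does not say whether the wattages are CPU-only or whole-system, and the relation ignores static power.

```python
# Figures from the slide (all approximate).
p_before_w, p_after_w = 160.0, 100.0
v_before, v_after = 1.5, 1.1
f_before_ghz, f_after_ghz = 3.15, 1.5

# Measured savings: (160 - 100) / 160.
measured_savings = (p_before_w - p_after_w) / p_before_w

# Rough comparison: dynamic power scales as V^2 * f (static power ignored).
dynamic_ratio = (v_after / v_before) ** 2 * (f_after_ghz / f_before_ghz)

print(f"measured savings: {measured_savings:.1%}")      # 37.5%
print(f"V^2*f scaling ratio: {dynamic_ratio:.2f}")      # 0.26
```

The measured drop is smaller than the V² · f ratio alone would suggest, which would be consistent with the power figure including components that do not scale with CPU voltage and frequency.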