  1. Uni-Processor
    1. Statically Scheduled Pipelines
      1. Classic 5-stage Pipeline
        1. Data Hazards
        2. Control Hazards
        3. Structural Hazards
        4. Precise Exception
      2. Out-of-order instruction completion
      3. Superpipelined & Superscalar
      4. Branch Prediction
      5. Static instruction scheduling
        1. Local
        2. Global
    2. Dynamically Scheduled Pipelines
      1. Enforcing data dependencies: Tomasulo algorithm
      2. Speculative execution: Execution beyond unresolved branches
      3. Adding Speculation to Tomasulo algorithm
      4. Dynamic memory disambiguation
      5. Explicit register renaming
      6. Checkpointing
      7. Register fetch after instruction issue
      8. Speculative instruction scheduling
      9. Memory disambiguation
      10. Beating the data-flow limit: value prediction
      11. Multiple instructions per clock
      12. Dealing with complex ISAs
    3. VLIW Micro-architecture
      1. Duality of Dynamic and Static Techniques
      2. VLIW Architecture
      3. Loop Unrolling
      4. Software pipelining
      5. Non-cyclic VLIW Scheduling
      6. Predicated Execution
      7. Speculative memory disambiguation
      8. Exception
    4. EPIC Micro-architecture
    5. Vector Micro-architecture
  2. Multi-Processor
  3. Memory Hierarchies
    1. The pyramid of memory levels
      1. Memory access locality
      2. Memory hierarchy coherence
      3. Memory inclusion
    2. Cache hierarchy
      1. Cache mapping and organization
      2. Replacement policies
      3. Write policies
      4. Cache hierarchy performance
      5. Classification of cache misses
      6. Non-blocking (lockup-free) caches
      7. Cache prefetching and preloading
    3. Virtual Memory
      1. Motivation for virtual memory
      2. Operating Systems' View of Virtual Memory
      3. Virtual Address Translation
      4. Memory Access Control
      5. Hierarchical Page Tables
      6. Inverted page table
      7. Translation Lookaside Buffer
      8. Virtual-address caches with physical tags
      9. Virtual-address caches with virtual tags
  4. Coherence and Memory Consistency
    1. Background
      1. Shared-memory communication model
      2. Hardware components
    2. Coherence and Memory Access Atomicity
      1. Why is coherence in multiprocessors so hard?
      2. Cache Protocols
        1. Snooping protocols
        2. Directory protocols (cc-NUMA)
      3. Memory access atomicity
      4. Plain Coherence
    3. Sequential Consistency
      1. Formal model for sequential consistency
      2. Access ordering rules for sequential consistency
      3. Memory access buffering
    4. Synchronization
      1. Basic synchronization primitives
      2. Hardware-based synchronization
      3. Software-based synchronization
    5. Relaxed Memory Consistency Models
      1. Not relying on synchronization
      2. Relying on synchronization
    6. Speculative violations of memory orders
      1. Conservative memory model enforcement in OoO processors
      2. Speculative violations of memory orders
  5. Introduction
    1. What is computer architecture?
    2. Components of parallel architecture
      1. Processors
      2. Memory
      3. Interconnects
    3. Parallelism in architecture
      1. Instruction-level parallelism (ILP)
      2. Thread-Level Parallelism (TLP)
      3. Vector and array processors
    4. Performance
      1. Benchmarking
      2. Reporting performance for a set of programs
      3. Reporting speedups
      4. Amdahl's law
      5. Parallel speedup
    5. Technological Challenges
      1. Power and energy
      2. Reliability
      3. Wire Delays
      4. Design Complexity
      5. Limits of miniaturization and the CMOS end-point