8-bit Five-Stage Pipelined RISC Microprocessor (VHDL → FPGA)

Microarchitecture + synthesizable RTL · hazard-aware pipeline · full toolflow validation

This CPU implements a fixed instruction encoding over an 8-bit datapath with 16 general-purpose registers and a strict Load/Store model. The architecture is partitioned into a five-stage pipeline with explicit stage boundaries and deterministic commit behavior. Validation combines waveform-level verification and FPGA synthesis/place-and-route evidence.

  • VHDL
  • Vivado
  • Artix-7 (Basys 3)
  • Pipeline
  • Hazards
  • BRAM
  • Timing closure

Design Philosophy: Silicon-Disciplined RTL

  • Microarchitecture choices were treated as implementation-bound decisions shaped by routing, clocks, fanout, and critical paths.
  • Each stage owns explicit logic and synchronous state boundaries to avoid timing ambiguity.
  • The RTL was structured for deterministic behavior and survivability across synthesis and place-and-route.
RISC CPU design concept
Figure 1 — Design intent aligned with implementation reality.
Global five-stage CPU architecture
Figure 2 — Five-stage datapath and control partitioning.

Pipeline Contract: Five-Stage Datapath Partitioning

  • The datapath is partitioned into LI / DI / EX / MEM / ER stages with clear responsibilities.
  • Pipeline registers are explicitly inserted at each stage boundary.
  • Load/Store paths are separated from compute paths with predictable stage latency.

FPGA Deployment Platform (Basys 3 / Artix-7)

  • Basys 3 provides realistic clocking infrastructure, BRAM/DSP resources, and enough fabric for constrained implementation.
  • It serves as the toolflow endpoint from synthesis to place-and-route under real constraints.
  • Bring-up stays deterministic through fully synchronous design assumptions.
Basys 3 Artix-7 FPGA board
Figure 3 — Target FPGA platform for implementation closure.
Top-level CPU RTL integration
Figure 4 — Top-level decomposition and interface ownership.

Top-Level Integration: RTL Decomposition and Signal Ownership

  • The top level decomposes into ALU, register file, instruction memory, data memory, and control.
  • Control is stage-local to keep fanout bounded and timing pressure contained.
  • The structure is fully synthesizable with no ambiguous combinational feedback loops.

Execution Core: ALU + Flag Semantics

  • Add/sub datapaths and operation muxing depth are the primary EX-stage timing risk.
  • MUL operations are resource-disciplined and mapped to DSP blocks where applicable.
  • N / C / O flags are produced deterministically and aligned with commit semantics.
ALU datapath and flags
Figure 5 — Execution core and deterministic flag generation.
Register file architecture
Figure 6 — Register file with write collision bypass behavior.

Register File: Dual-Read, Single-Write with Collision Bypass

  • The register file implements 16×8 storage with 2R1W access.
  • A same-cycle RAW collision path applies write-first bypass behavior.
  • This bypass lowers bubble insertion and reduces hazard pressure.

Memory Subsystem: Synchronous DMEM (Load/Store Boundary)

  • Data memory follows synchronous access discipline for predictable behavior.
  • Latency is deterministic and inference is FPGA-friendly.
  • Load/Store logic remains cleanly separated from compute operations.
Synchronous data memory block
Figure 7 — DMEM organization at the Load/Store boundary.
Instruction memory implementation
Figure 8 — IMEM organization with stable instruction image.

Instruction Fetch: ROM-like IMEM and Stable Program Image

  • Instruction memory exposes a stable program image with synchronous readout.
  • Fetch behavior is consistent with pipeline timing assumptions.
  • The model stays clear for both simulation intent and synthesis mapping.

Waveform-Level Verification: Stage-by-Stage Proof

  • The testbench checks stage propagation, commit timing, and Load/Store correctness.
  • Waveforms provide cycle-accurate visibility into hazards and control decisions.
  • Verification evidence anchors the architectural contract to observed behavior.
CPU testbench waveform results
Figure 9 — Testbench waveforms proving stage-level behavior.
Post-implementation FPGA view
Figure 10 — Post-implementation mapping and locality check.

Physical Realization: Post-Implementation FPGA Mapping

  • Placement and routing were reviewed for congestion, locality, and structural cleanliness.
  • The view confirms the RTL remains coherent after tool transformations.
  • EX depth and control fanout typically bound fMAX.

Second Region View: Routing Density and Floorplan Distribution

  • This second region corroborates routing density and floorplan distribution.
  • It highlights whether the design is tool-hostile or structurally disciplined.
  • The result reinforces a hardware-is-the-truth validation posture.
Second FPGA region routing view
Figure 11 — Additional floorplan region for routing density assessment.

Deliverables

This page summarizes the implementation. The PDF captures full architecture, verification traces, and FPGA implementation evidence.