Introduction
This manual describes the Saturn Vector Unit, a parameterized and extensible vector microarchitecture that implements the RISC-V Vector (RVV) extension. Saturn was developed to address an academic need for a representative, compliant, and flexible generator of RISC-V vector units targeting deployment in domain-specialized cores. Saturn is implemented as a parameterized Chisel RTL generator, which enables a range of configurations across many target deployment scenarios. This document discusses the microarchitectural details of all Saturn components.
- Chapter 1 describes the motivation for Saturn and compares Saturn’s design approach to those of existing data-parallel microarchitecture archetypes
- Chapter 2 discusses the system organization of Saturn
- Chapter 3 describes the microarchitecture of Saturn’s vector frontend unit
- Chapter 4 describes the microarchitecture of Saturn’s vector load-store unit
- Chapter 5 describes the microarchitecture of Saturn’s datapath and vector instruction sequencers
- Chapter 6 provides guidance on writing efficient vector code for Saturn
- Chapter 7 discusses the historical context of Saturn within past academic vector units
Objectives
Saturn was developed with the following objectives:
- Provide a representative baseline implementation of the RISC-V Vector specification
- Support full compliance with the complete RVV specification, including support for virtual memory and precise faults
- Target performant ASIC implementations, rather than FPGA deployments
- Be sufficiently parameterized to support configurations across a wide power/performance/area design space
- Demonstrate efficient scheduling of vector operations on a microarchitecture with a short hardware vector length
- Implement a SIMD-style microarchitecture, comparable to existing SIMD datapaths in DSP microarchitectures
- Integrate with existing efficient area-compact scalar cores, rather than high-IPC general-purpose cores
- Support extensibility with custom vector instructions, functional units, and accelerators that leverage the baseline capability in the standard RVV ISA
- Target deployment as part of a DSP core or similarly domain-specialized core, instead of general-purpose systems
Questions, bug reports, and requests for further documentation can be sent to jzh@berkeley.edu.
1. Background
This chapter provides background on deployment scenarios for data-parallel systems and on the architectures that dominate the commercial space for such systems. Saturn, as an implementation of a modern scalable vector ISA targeting deployment in specialized cores, fills an underexplored niche in the space of open-source data-parallel microarchitectures. The chapter also compares Saturn’s microarchitectural philosophy with alternative vector approaches.
1.1. Data-parallel Workloads
We broadly categorize the range of deployment scenarios and workloads for programmable data-parallel systems into four domains. These categories are differentiated by power/area constraints and workload characteristics. Figure 2 outlines the four deployment domains where architectures that exploit data-level parallelism (DLP) are critical, and the salient architectural and microarchitectural archetypes that have come to dominate each domain.