Bridging the Gap – From Functional Description to RTL Design
The process of transforming a high-level functional description into a Register-Transfer Level (RTL) design is a crucial step in digital hardware design. This transition involves converting abstract functionality into a hardware description language that captures the behavior and structure of the digital circuit.
In this article, we will explore the key steps and considerations involved in the journey from a functional description to RTL, shedding light on the intricacies of this critical phase in hardware development.
Understanding Functional Description:
A functional description serves as the blueprint for the digital system, outlining its intended behavior, functionality, and interactions with the external environment. This phase often involves the use of high-level modeling languages, such as C, C++, SystemC, or MATLAB (behavioral models written in SystemVerilog or VHDL are also used), to describe the system’s functionality through algorithms, data flow, and control logic.
This abstract representation helps designers conceptualize the system’s behavior before diving into detailed RTL design.
The functional specification can be written in a high-level language, for example, C, C++, or MATLAB.
We want to write the specification in a high-level language because it gives us a lot of flexibility: we can experiment freely and analyze how different algorithmic trade-offs affect the system behavior. But there is a downside to this as well.
The downside is that once the specification is written in a high-level language, it has to be converted to RTL, and that creates what is known as the Implementation Gap.
So, we write a functional specification in C, C++, SystemC, or MATLAB, or capture it as metadata, and eventually we need to convert it into RTL, which is typically written in Verilog, SystemVerilog, or VHDL. This conversion is what we mean by the Implementation Gap.
RTL: General Structure
Let’s briefly look at what RTL is and how a typical RTL design is structured.
RTL can also be called a data flow description. It has two distinct parts: the data path and the control path.
The data path is where the major portion of the computation is done. It contains arithmetic logic units (ALUs) and other functional units such as adders, multipliers, and logic gates, which perform the arithmetic and logical operations.
How data moves through the data path is decided by control signals generated in the control path. A finite state machine (FSM) generates these control signals, multiplexers route the data based on them, and registers hold the data that is fed to the arithmetic logic units and so on. This is the general structure of an RTL design.
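To make this split concrete, here is a minimal cycle-by-cycle sketch in Python (not an HDL) of a dot-product unit. The state names, the control-signal names (acc_sel, acc_en, idx_en), and the function names are all assumptions made for illustration. The FSM only produces control signals, and the data path only computes and stores data.

```python
def control_path(state, idx, n):
    """FSM: return (next_state, control signals) for the current clock cycle."""
    if state == "IDLE":
        # First cycle: steer the mux to 0 so the accumulator register is cleared.
        return "RUN", {"acc_sel": "zero", "acc_en": True, "idx_en": False}
    if state == "RUN" and idx < n:
        # Accumulate one product per cycle and advance the index register.
        return "RUN", {"acc_sel": "alu", "acc_en": True, "idx_en": True}
    # All elements consumed: hold every register.
    return "DONE", {"acc_sel": "alu", "acc_en": False, "idx_en": False}


def data_path(acc, idx, a, b, ctrl):
    """Multiplier, adder, mux and registers; returns the next register values."""
    operand_a = a[idx] if idx < len(a) else 0
    operand_b = b[idx] if idx < len(b) else 0
    alu_out = acc + operand_a * operand_b                 # multiplier feeding an adder
    mux_out = 0 if ctrl["acc_sel"] == "zero" else alu_out  # mux driven by the FSM
    next_acc = mux_out if ctrl["acc_en"] else acc          # register with enable
    next_idx = idx + 1 if ctrl["idx_en"] else idx
    return next_acc, next_idx


def simulate_dot_product(a, b):
    """Clock the design until the FSM reaches DONE (assumes len(a) == len(b))."""
    state, acc, idx, cycles = "IDLE", 0, 0, 0
    while state != "DONE":
        next_state, ctrl = control_path(state, idx, len(a))
        acc, idx = data_path(acc, idx, a, b, ctrl)
        state = next_state
        cycles += 1
    return acc, cycles
```

Calling simulate_dot_product([1, 2, 3], [4, 5, 6]) returns the result together with the number of clock cycles the FSM needed, which is exactly the timing information an untimed algorithm does not carry.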
Functional Specification to RTL
RTL needs to be modeled in a language or framework that allows both sequential and concurrent modeling, and hardware description languages provide exactly that facility.
Therefore, RTL is typically modeled in Verilog, VHDL, or SystemVerilog.
We know that there is an Implementation Gap between the functional specification, which describes the design in a high-level language, and the RTL, which describes the design in terms of how data moves from register to register.
There are typically three methods for filling this Implementation Gap: Manual Coding, IP Assembly, and Behavioral Synthesis.
Manual Coding: Given an algorithm, we write the RTL by hand. The RTL describes which arithmetic operation or computation is done in which clock cycle, so the designer adds the timing information and other details that the algorithm itself does not carry.
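As a small illustration of what "adding the timing information" means, the sketch below takes an untimed expression and a hand-written schedule that assigns each operation to a clock cycle, assuming a design with a single multiplier and a single adder. The schedule format and all names here are hypothetical, chosen only for this example.

```python
def reference(a, b, c, d):
    """Untimed functional specification: no notion of clock cycles."""
    return a * b + c * d


# Hand-written schedule: one entry per clock cycle, each naming the
# destination register, the operation, and the two source registers.
MANUAL_SCHEDULE = [
    ("p", "mul", "a", "b"),  # cycle 0: the single multiplier computes a*b
    ("q", "mul", "c", "d"),  # cycle 1: the same multiplier is reused for c*d
    ("y", "add", "p", "q"),  # cycle 2: the adder combines both products
]

OPERATIONS = {"mul": lambda x, y: x * y, "add": lambda x, y: x + y}


def run_schedule(schedule, inputs):
    """Execute one scheduled operation per cycle; return (result, cycles used)."""
    regs = dict(inputs)
    for dst, op, src1, src2 in schedule:
        regs[dst] = OPERATIONS[op](regs[src1], regs[src2])
    return regs["y"], len(schedule)
```

Running run_schedule(MANUAL_SCHEDULE, {"a": 2, "b": 3, "c": 4, "d": 5}) gives the same value as reference(2, 3, 4, 5), but now with an explicit cycle count. In manual coding, the designer makes these cycle-by-cycle decisions and then expresses them in the HDL.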
IP Assembly: IP assembly means reusing existing RTL. It is typically used in SoC design methodologies. An SoC, or System-on-Chip, is a complete system built on a single chip.
Earlier, a system was built from separate components, for example, a separate processor, separate memory, and various other parts. We integrated them on a board, connecting them to get the required functionality.
But with the increasing level of integration, we can now embed all of those things together inside one chip. A chip in which the various components are integrated is known as a system-on-chip or SoC. An SoC can consist of processors, hardware accelerators, memory, peripherals, analog components, RF devices, etc. These are connected using structured communication links, and SoCs can also contain embedded software.
Merits of SoC design methodology
Improves Productivity: In SoC design methods we do not design the individual components; we reuse existing ones. If we have already designed a processor, we take that processor and put it in our design. Then we take a pre-designed memory component and put it in the SoC.
So the designer’s task is not to design a complicated chip in which every component is built from scratch, but only to integrate those components. The design task reduces to integrating the various components at the system level and, of course, verifying the result. This is simpler than designing each component individually, and that is how SoC-based design improves productivity.
Lower Cost: Because many things are reused, the design takes less time. We do not spend much effort on what has already been done, and that saves design cost.
Increases Features: Since we can easily integrate many things on the same chip, we can have many features. The SoC design methodologies allow us to implement very complex features or complicated functionality within a single chip.
IPs are pre-designed and pre-verified subsystems or blocks. For example, an IP may be a processor, a memory unit, a peripheral, or many other kinds of things.
An IP can be developed in-house by a company, or it can be purchased from a third-party IP vendor.
Integration of IPs: When designing an SoC, the major task is to instantiate those IPs and connect them.
Metadata describes the top-level IP models, bus interfaces, ports, registers, and the required configuration. It is filled in by the IP integrator, and there are standard formats for capturing it, for example IP-XACT, SystemRDL, or XML; even a spreadsheet can be used.
Generator tools produce an SoC-level RTL with the instantiated IPs. A generator tool reads the metadata about which IPs to use, how to connect them, and so on, and automatically produces RTL with the IPs instantiated. It can also produce a verification environment and low-level software drivers. This greatly helps the designer and avoids manual integration mistakes.
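The idea of a generator tool can be sketched in a few lines of Python. The metadata layout below is a made-up, simplified dictionary, not IP-XACT or SystemRDL, and the IP names (cpu_core, sram_ctrl, uart) and port names are hypothetical. The sketch reads the metadata, checks that every connection refers to a declared net, and emits a Verilog top-level module as text.

```python
# Hypothetical, simplified metadata. A real flow would read IP-XACT,
# SystemRDL, or a spreadsheet; this dict only mimics the idea.
SOC_META = {
    "top": "soc_top",
    "nets": [
        {"name": "bus_addr", "width": 32},
        {"name": "bus_data", "width": 32},
        {"name": "irq", "width": 1},
    ],
    "instances": [
        {"ip": "cpu_core", "name": "u_cpu",
         "ports": {"addr": "bus_addr", "data": "bus_data", "irq": "irq"}},
        {"ip": "sram_ctrl", "name": "u_mem",
         "ports": {"addr": "bus_addr", "data": "bus_data"}},
        {"ip": "uart", "name": "u_uart", "ports": {"irq": "irq"}},
    ],
}


def generate_top(meta):
    """Emit Verilog text for a top level that instantiates and wires the IPs."""
    declared = {net["name"] for net in meta["nets"]}
    lines = [f"module {meta['top']} (input wire clk, input wire rst_n);"]
    for net in meta["nets"]:
        width = net["width"]
        vec = f"[{width - 1}:0] " if width > 1 else ""
        lines.append(f"  wire {vec}{net['name']};")
    for inst in meta["instances"]:
        # Catch integration mistakes early: every port must hit a declared net.
        for port, net in inst["ports"].items():
            if net not in declared:
                raise ValueError(f"{inst['name']}.{port}: unknown net '{net}'")
        # Assumption for this sketch: every IP has clk and rst_n ports.
        ports = {"clk": "clk", "rst_n": "rst_n", **inst["ports"]}
        conns = ", ".join(f".{port}({net})" for port, net in ports.items())
        lines.append(f"  {inst['ip']} {inst['name']} ({conns});")
    lines.append("endmodule")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_top(SOC_META))
```

Because the connections come from one checked description instead of hand-edited instantiations, a misspelled net name is reported as an error rather than silently becoming an unconnected wire.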
Behavioral Synthesis: Behavioral synthesis is the process of converting an untimed algorithm to an equivalent RTL design. Untimed means the algorithm carries no timing information: it does not say which operation is performed in which clock cycle.
The RTL, in contrast, is fully timed: it says which computation is done in which clock cycle. While doing this transformation, the tool must satisfy the specified constraints on resource usage, latency, and other things.
Behavioral synthesis is also called high-level synthesis.
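One core step of this transformation is scheduling: assigning each operation of the untimed algorithm to a clock cycle without exceeding the available hardware units. The sketch below uses a simple resource-constrained list scheduling heuristic as a stand-in; it is not how any particular tool works. The dataflow graph, the unit names, and the assumption that every operation takes exactly one cycle are illustrative.

```python
# Dataflow graph for y = a*b + c*d + e*f.
# Each entry: operation name -> (unit type, operations it depends on).
DATAFLOW = {
    "m1": ("mul", []),
    "m2": ("mul", []),
    "m3": ("mul", []),
    "s1": ("add", ["m1", "m2"]),
    "s2": ("add", ["s1", "m3"]),
}


def list_schedule(ops, resources):
    """Assign each single-cycle operation to a clock cycle.

    resources maps a unit type to the number of units usable per cycle.
    Returns a dict: operation name -> cycle in which it executes.
    """
    schedule = {}
    cycle = 0
    while len(schedule) < len(ops):
        free = dict(resources)  # every unit becomes available again each cycle
        placed = False
        for name, (unit, deps) in ops.items():
            if name in schedule:
                continue
            # An operation is ready once all its inputs were produced earlier.
            ready = all(d in schedule and schedule[d] < cycle for d in deps)
            if ready and free.get(unit, 0) > 0:
                schedule[name] = cycle
                free[unit] -= 1
                placed = True
        if not placed:
            raise ValueError("cyclic dependency or no unit for some operation")
        cycle += 1
    return schedule


def latency(schedule):
    """Number of clock cycles before the last result is available."""
    return max(schedule.values()) + 1
```

With resources {"mul": 1, "add": 1} the three multiplications are serialized on one multiplier, while {"mul": 3, "add": 1} lets them run in the same cycle and shortens the latency at the price of more hardware.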
An untimed algorithm can be implemented in many ways, so a behavioral synthesis tool has many options to choose from. Which of those implementations is better and which is worse has to be analyzed.
To do that analysis, the behavioral synthesis tool needs to evaluate the cost of each implementation and pick the one with the lowest cost. Typical cost metrics are:
1. Area: This can be measured by the number of circuit elements in the RTL.
2. Latency: This is the number of clock cycles required before results are available.
3. Maximum clock frequency: This is determined by the worst-case combinational delay.
4. Power dissipation, throughput, etc.
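A rough sketch of such a cost comparison is shown below. The three candidate implementations and all of their numbers are invented for illustration, the weighted-sum cost function is just one simple possibility, and power and throughput are left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    area: int                 # number of circuit elements (illustrative units)
    latency_cycles: int       # clock cycles before the result is available
    critical_path_ns: float   # worst-case combinational delay

    @property
    def max_clock_mhz(self):
        """Maximum clock frequency implied by the worst combinational delay."""
        return 1000.0 / self.critical_path_ns

    @property
    def latency_ns(self):
        """Latency in time, when clocked at the maximum frequency."""
        return self.latency_cycles * self.critical_path_ns


def cost(candidate, area_weight, time_weight):
    """Weighted-sum cost; the weights express what the designer cares about."""
    return area_weight * candidate.area + time_weight * candidate.latency_ns


def pick_best(candidates, area_weight, time_weight):
    """Return the candidate with the lowest cost."""
    return min(candidates, key=lambda c: cost(c, area_weight, time_weight))


# Made-up numbers for three ways of implementing the same algorithm.
CANDIDATES = [
    Candidate("serial", area=100, latency_cycles=4, critical_path_ns=5.0),
    Candidate("parallel", area=300, latency_cycles=3, critical_path_ns=5.0),
    Candidate("chained", area=300, latency_cycles=1, critical_path_ns=12.0),
]
```

With pick_best(CANDIDATES, 1.0, 10.0) the small serial design wins, while a much larger time weight such as pick_best(CANDIDATES, 1.0, 100.0) favors the chained design, even though its long combinational path lowers the maximum clock frequency.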