Introduction to Direct Digital Synthesis

Direct Digital Synthesis (DDS) is a digitally controlled method of producing analog waveforms at multiple frequencies from a single reference clock. In this post we take a closer look at the theory behind direct digital synthesis and at how to digitally generate a sine wave using an FPGA and Xilinx’s DDS core. If you already know the theory behind the operation of a DDS, you can skip the first part 🙂 !

Some fundamentals

The figure below shows a simplified architecture of a DDS system. The three main components are the phase accumulator, the phase-to-amplitude converter and a Digital to Analog Converter (DAC).

The frequency of the output waveform is controlled by two variables:

  • The reference clock frequency (the input clock signal that you feed into the DDS).
  • The variable programmed into the delta phase register, called the tuning word or M.

The phase accumulator is the most important part of the system: every clock cycle, a number called the tuning word is added to the phase accumulator register. The truncated output of the phase accumulator serves as the address into the sine lookup table, and each address in the lookup table corresponds to a phase point on the sine wave from 0° to 360°. The lookup table therefore contains the digital amplitude information for one complete cycle of the wave, and it maps the phase information into a digital amplitude word.

The tuning word provided to the phase accumulator can be large or small:

  • Large phase increment: the phase accumulator steps quickly through the sine lookup table and thus generates a high-frequency wave.
  • Small phase increment: the phase accumulator takes more steps per cycle of the wave and thus generates a low-frequency wave.
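To make the interplay between tuning word, phase accumulator and lookup table more tangible, here is a minimal software sketch in Python (a behavioural model, not HDL and not the Xilinx core). The widths `ACC_BITS`, `LUT_BITS` and `AMP_BITS` and the function names are assumptions chosen only to keep the numbers small and readable.

```python
import math

ACC_BITS = 8   # phase accumulator width (assumed, kept small for readability)
LUT_BITS = 4   # number of accumulator MSBs used as lookup-table address (assumed)
AMP_BITS = 8   # signed amplitude word width (assumed)

# One complete sine cycle: 2**LUT_BITS phase points from 0 to just under 360 degrees.
SINE_LUT = [
    round((2 ** (AMP_BITS - 1) - 1) * math.sin(2 * math.pi * k / 2 ** LUT_BITS))
    for k in range(2 ** LUT_BITS)
]


def dds_samples(tuning_word, n_samples):
    """Return n_samples amplitude words for the given tuning word M."""
    acc = 0
    samples = []
    for _ in range(n_samples):
        address = acc >> (ACC_BITS - LUT_BITS)       # keep only the MSBs
        samples.append(SINE_LUT[address])
        acc = (acc + tuning_word) % 2 ** ACC_BITS    # accumulator wraps around
    return samples


def samples_per_period(tuning_word):
    """Number of clock cycles the accumulator needs for one full wrap."""
    return 2 ** ACC_BITS / tuning_word


print(dds_samples(16, 4), samples_per_period(16))  # small M: many steps per period
print(dds_samples(64, 4), samples_per_period(64))  # large M: few steps per period
```

With the small tuning word the model needs 16 clock cycles for one period of the sine, with the large tuning word only 4, which is exactly the low-frequency versus high-frequency behaviour described in the list above.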

The previous paragraph used the term “truncated output”, so let’s clarify it. Phase truncation occurs at the interface between the phase accumulator and the phase-to-amplitude converter: only a subset of the bits at the output of the phase accumulator appears at the input of the phase-to-amplitude converter. For example, in a DDS with a 32-bit accumulator, only the 16 most significant bits might be passed along to the phase-to-amplitude converter. This reduces the power consumption and complexity of the phase-to-amplitude converter and has no impact on the frequency resolution of the DDS, which is still set by the full accumulator width.
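The following Python sketch illustrates why truncation leaves the frequency resolution untouched: the resolution depends only on the accumulator width, while the truncated address only determines the size of the lookup table. The 32-bit and 16-bit widths come from the example above; the 10 MHz clock and the function names are assumptions for illustration.

```python
def frequency_resolution(f_clk_hz, acc_bits):
    """Smallest output frequency step: one LSB of the tuning word."""
    return f_clk_hz / 2 ** acc_bits


def lut_address(acc_value, acc_bits, addr_bits):
    """Keep only the addr_bits most significant bits of the accumulator."""
    return acc_value >> (acc_bits - addr_bits)


ACC_BITS, ADDR_BITS = 32, 16   # widths from the example in the text
F_CLK_HZ = 10e6                # assumed reference clock

# The frequency step is set by all 32 accumulator bits.
print(frequency_resolution(F_CLK_HZ, ACC_BITS), "Hz per tuning-word LSB")

# The lookup table only has to cover the truncated address range.
print(2 ** ADDR_BITS, "LUT entries instead of", 2 ** ACC_BITS)

# The lower 16 accumulator bits are simply dropped when addressing the table.
print(hex(lut_address(0x1234ABCD, ACC_BITS, ADDR_BITS)))
```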

It is important to know that a DDS system is a sampled data system. This means that all the issues involved in sampling, such as quantization noise, aliasing and filtering, must be considered.

If you want some more in-depth theory, I recommend reading the following paper (this was also my main source of information for this tutorial).

The DDS Compiler core from Xilinx

The datasheet for the DDS compiler core from Xilinx can be found via the following link. By default, the DDS compiler operates in standard mode, which uses phase truncation, as shown in the following figure:

Components D1 and A1 form an integrator, which computes a phase slope that is mapped to a sinusoid by the lookup table T1. The quantizer Q1 truncates the phase angle and generates a lower-precision representation of the angle. This value is fed into the address port of the lookup table, which performs the mapping from phase-space to time.

The output frequency can be calculated with the following formula:

  fout = fclk · Δθ / 2^Bθ(n)

where:

  • fout is the frequency of the output wave.
  • fclk is the frequency of the system clock.
  • Δθ is the phase increment value.
  • Bθ(n) is the phase width, that is the number of bits in the phase accumulator.
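The relation between these quantities, fout = fclk · Δθ / 2^Bθ(n), can be tried out with a few lines of Python. This is only a sketch; the function name `output_frequency` and the example values are assumptions and are not taken from the Xilinx documentation.

```python
def output_frequency(f_clk_hz, phase_increment, phase_width):
    """Output frequency of a DDS: fout = fclk * delta_theta / 2**B."""
    return f_clk_hz * phase_increment / 2 ** phase_width


# A phase increment of 1 gives the smallest possible output frequency,
# which is also the frequency resolution of the DDS.
print(output_frequency(10e6, 1, 16))        # 152.587890625 Hz

# Half of the accumulator range per clock cycle gives fclk / 2,
# the upper limit imposed by sampling (Nyquist).
print(output_frequency(10e6, 2 ** 15, 16))  # 5000000.0 Hz
```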

To calculate the phase increment value required to generate a given output frequency, the following formula is used:

  Δθ = fout · 2^Bθ(n) / fclk

Designing with the core


For this implementation, we would like to generate an output wave with a frequency of 1 MHz using a 16-bit output width. The first step is to create a block diagram in Vivado and add a clocking wizard and a processor system reset. The clocking wizard will provide a stable 10 MHz clock signal for the DDS. The DDS core itself is configured as follows:

We need to calculate the phase increment value with the formula for Δθ mentioned previously:

Since we know the reference input clock for the DDS, the desired output frequency of the waveform and the output length (in bits), a small calculation in Matlab results in the following:
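For readers without Matlab, the same kind of calculation can be sketched in Python. The numbers below assume that the phase width equals the 16-bit width used in this design and that the phase increment is rounded to the nearest integer; the function name `phase_increment` is hypothetical, and the result is not guaranteed to be identical to the value entered in the original project.

```python
def phase_increment(f_out_hz, f_clk_hz, phase_width):
    """Phase increment for a desired output frequency: fout * 2**B / fclk."""
    return f_out_hz * 2 ** phase_width / f_clk_hz


F_CLK_HZ = 10e6    # DDS reference clock from the clocking wizard
F_OUT_HZ = 1e6     # desired output frequency
PHASE_WIDTH = 16   # assumed phase width in bits

exact = phase_increment(F_OUT_HZ, F_CLK_HZ, PHASE_WIDTH)
word = round(exact)                        # the core needs an integer value
binary = format(word, "016b")              # 16-bit binary representation
actual = F_CLK_HZ * word / 2 ** PHASE_WIDTH

print(exact)    # ideal, non-integer phase increment
print(word)     # rounded phase increment
print(binary)   # bit pattern of the rounded value
print(actual)   # output frequency that the rounded value produces
```

Because the ideal value is not an integer, the rounded phase increment produces an output frequency that is slightly off from exactly 1 MHz (by roughly 61 Hz under these assumptions), which is a direct consequence of the limited frequency resolution of a 16-bit phase width.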

After calculating the phase increment, let’s enter its 16-bit value into the phase angle increment values field of the configuration screen:

After all of this, the resulting block design is depicted in the figure below:

Everything is ready to go: just create a wrapper for the block design and we can start writing a testbench for the system to see if everything went well and all configurations match up.

The testbench

The code for the testbench is provided below:

-- Company: /
-- Engineer: Levi Marien
-- Description: Testbench for the DDS-system

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity tb_dds_wrapper is
-- port ();
end tb_dds_wrapper;

architecture Behavioral of tb_dds_wrapper is
  -- component declaration
  component dds_wrapper
    port (
      aclken             : in  std_logic;
      m_axis_data_tdata  : out std_logic_vector (15 downto 0);
      m_axis_data_tready : in  std_logic;
      m_axis_data_tvalid : out std_logic;
      reset              : in  std_logic;
      sys_clock          : in  std_logic
    );
  end component;

  -- signal declaration
  signal sys_clock : std_logic := '0';
  signal reset     : std_logic;
  signal aclken    : std_logic;

  signal m_axis_data_tdata  : std_logic_vector(15 downto 0);
  signal m_axis_data_tready : std_logic;
  signal m_axis_data_tvalid : std_logic;

  -- procedure to generate a free-running clock signal
  procedure clk_gen(signal i_clk : out std_logic; constant freq : real) is
    constant period    : time := 1 sec / freq;        -- full period
    constant high_time : time := period / 2;          -- high time
    constant low_time  : time := period - high_time;  -- low time; always >= high_time
  begin
    -- check the arguments
    assert (high_time /= 0 fs) report "clk_gen: high time is zero; time resolution too large for frequency" severity failure;
    -- generate clock cycles forever
    loop
      i_clk <= '1';
      wait for high_time;
      i_clk <= '0';
      wait for low_time;
    end loop;
  end procedure;

begin

  -- Clock generation with concurrent procedure call
  clk_gen(sys_clock, 100.00E6);
  -- Report the simulator time resolution
  assert false report "Time resolution: " & time'image(time'succ(0 fs)) severity note;

  dut : dds_wrapper
    port map(
      sys_clock          => sys_clock,
      reset              => reset,
      aclken             => aclken,
      m_axis_data_tdata  => m_axis_data_tdata,
      m_axis_data_tready => m_axis_data_tready,
      m_axis_data_tvalid => m_axis_data_tvalid);

  stimuli : process
  begin
    -- EDIT Adapt initialization as needed
    reset              <= '1';
    aclken             <= '0';
    m_axis_data_tready <= '0';

    -- Reset generation (reset is driven low for 100 ns, then released)
    reset <= '0';
    wait for 100 ns;
    reset <= '1';

    -- EDIT Add stimuli here
    wait for 6500 ns;                   -- wait until the clock is stable

    aclken             <= '1';
    m_axis_data_tready <= '1';

    wait;                               -- suspend forever so the stimuli run only once
  end process stimuli;
end Behavioral;

The testbench first resets the whole system and then waits for 6500 ns (this is the time needed for the locked signal to go high, indicating that the 10 MHz clock from our clocking wizard is stable). At this point the reference clock signal to the DDS is stable and we can assert the clock enable signal (aclken) and the m_axis_data_tready signal, indicating that we are ready to receive the output of the DDS. Once the DDS core has performed its calculations, it sets the m_axis_data_tvalid signal high to indicate that the output data is valid. While the valid signal is high, the DDS provides valid output samples and a waveform is created:

The markers clearly show that the period of the wave is equal to 1000 nanoseconds. Converting this to a frequency (1 / 1000 ns) gives an output frequency of 1 MHz, so we meet our design constraints.

The code and project are available on GitHub using the following link. If you have any questions or if anything is unclear, let me know in the comments 😉 !

Happy designing!
