Algorithm

Computing the log base 2 of a whole number is easy: just use a priority encoder. But what about a fixed point number? That sounds a lot harder. But, like many things, there’s a trick to make things easier.

Consider the following equation:

$\log_2{x} = \log_2 (2^n \cdot 2^{-n} \cdot x) = n + \log_2(2^{-n} \cdot x)$

If we choose $n$ such that $2^{-n}\cdot x\in [1,2)$ , then $n$ is just the whole number portion of the log. and we can compute that with a priority encoder. Furthermore, we can compute the $2^{-n}\cdot x$ portion by barrel-shifting the original number. Finally, we can use a lookup table to compute the lower bits. The nice thing about this is that the lookup table can be much smaller, since we only need to store the values between 1 and 2.

Another way to think about this is that the logarithm curve from [2,4) is the same as the one from [1,2), just shifted up by 2 and scaled horizontally by 2. Likewise the curve from [4,8) is scaled the same way. The algorithm is just taking advantage of this symmetry.

Oh– additionally, this algorithm only works on input values greater or equal to one, so the input and output values are always positive.

Implementation

Now, let’s code up our Verilog module. We’ll start with the module declaration:

module log2
 #(
 parameter in_int = 16,
 parameter in_frac = 8,
 parameter out_int = $clog2(in_int),
 parameter out_frac = in_int+in_frac-out_int,
 parameter lut_precision = 6,
 parameter register_output_stage = 0
 )

Since the module is dealing with a fixed-point input and output value, we need to specify how many integer and fractional bits there are in the input and output values. The lut_precision parameter specifies the log base 2 of the number of entries in our lookup table, and the default setting will be 64 entries in the table. There is also a parameter which allows for an optional final output register stage.

Next come the module ports:

  (
   input clk,
   input [in_int+in_frac-1:0] din,
   input din_valid,             // data in is valid
   output din_ready,            // ready to receive data
   output reg [out_int+out_frac-1:0] dout,
   output reg dout_valid,       // data out is ready (valid)
   output reg dout_error        // data out is incorrect - input data was less than 1.0
   );

We have the clock clk and the input value din. A handshake input indicating that the din value is valid. There is an output signal din_ready indicating that the module is ready to accept an input value. There is, of course, the output value dout, and a valid signal dout_valid. Since it is possible to provide an illegal input between 0 and 1, there is also a dout_error signal indicating the output value is not valid due to an inadmissible input value.
Now, let’s look at the body of the module:

  assign din_ready = 1;

Ready? Please. We’re always ready.

Pipeline Stage 1

Next, we instantiate the recursive priority encoder from my previous blog post. This takes the integer portion of the input value and produces the integer portion of the output value. There is also a prienc_error output which indicates that none of the input bits were set. This means we were given an input value that was less than one.

  wire [$clog2(in_int)-1:0] prienc_out;
  wire prienc_error;
  
  priencr #(in_int)
  priencr
    (
     .decode(din[in_frac+:in_int]),
     .encode(prienc_out),
     .valid(prienc_error)
     );

Next, we have the stage one pipeline logic:

  reg [$clog2(in_int)-1:0] stage1_prienc_out;
  reg stage1_error;
  reg stage1_valid;
  reg [in_int+in_frac-1:0] stage1_din;
  
  always @(posedge clk)
    begin
      stage1_din <= din;
      stage1_prienc_out <= prienc_out;
      stage1_error <= prienc_error;
      stage1_valid <= din_valid;
    end

Pipeline Stage 2

Stage two of the pipeline is the barrel shift logic. This shifts the input value to the left, based on the priority encoder output. Things are flipped, though, so lower priority encoder output values cause a larger shift. We’ll also pipeline the other signals:

  reg [in_int+in_frac-1:0] stage2_barrel_out;
  reg [out_int-1:0] stage2_barrel_out_int;
  reg stage2_error;
  reg stage2_valid;
  always @(posedge clk)
    begin
      stage2_barrel_out <= stage1_din << (in_int-stage1_prienc_out-1);
      stage2_barrel_out_int <= stage1_prienc_out;
      stage2_error <= stage1_error;
      stage2_valid <= stage1_valid;
    end

Pipeline Stage 3

The third pipeline stage is the lookup table, which computes the fractional part of the output. We will use an initial block to fill the table. First, we define a function to take the floating point log base-2 of an input value, since Verilog does not have one built in. Remember this rule from high school algebra?

  function real rlog2;
    input real x;
    begin
      rlog2 = $ln(x)/$ln(2);
    end
  endfunction

Next we declare the table and fill it in an initial block:

  reg [out_frac-1:0] lut[(1<<lut_precision)-1:0];
  integer i;
  initial
    begin : init_lut
      lut[0] = 0;
      for (i=1; i<1<<lut_precision; i=i+1)
        lut[i] = $rtoi(rlog2(1.0+$itor(i)/$itor(1<<lut_precision))*$itor(1<<out_frac)+0.5);
    end

Now, we need to use the barrel shift output as an address to the lookup table. We also carry along some of the stage 2 results:

  reg [out_frac-1:0] stage3_lut_out;
  reg [out_int-1:0] stage3_out_int;
  reg stage3_error;
  reg stage3_valid;
  always @(posedge clk)
    begin
      stage3_out_int <= stage2_barrel_out_int;
      stage3_lut_out <= lut[stage2_barrel_out[in_int+in_frac-2-:lut_precision]];
      stage3_error <= stage2_error;
      stage3_valid <= stage2_valid;
    end

Finally we have the code for the optional stage 4 pipeline. This is controlled by the register_output_stage parameter:

  generate
    if (register_output_stage)
      begin
        always @(posedge clk) dout = {stage3_out_int,stage3_lut_out};
        always @(posedge clk) dout_error = stage3_error;
        always @(posedge clk) dout_valid = stage3_valid;
      end
    else
      begin
        always @(*) dout = {stage3_out_int,stage3_lut_out};
        always @(*) dout_error = stage3_error;
        always @(*) dout_valid = stage3_valid;
      end
    endgenerate

Caveats

So there you have it: a module that computes log base 2 for fixed point inputs. However, there ‘s still one tiny problem: the design is not synthesizable by XST or Vivado. I haven’t tried other tools, but they may have issues with it as well.

The issue is that the tools don’t seem to implement the built-in Verilog floating point functions during elaboration. Essentially, the synthesis tool needs to run the initial block in order to know how to populate the lookup table. In general, the tools can do this. However, if they catch wind of a float value, they tuck their tail between their legs and run.

My workaround for this is to use the Perl preprocessor I describe in a previous blog post. But that’s a topic for another time.

3 thoughts on “Computing the Logarithm Base 2”

Aaron Ferrucci on October 29, 2015 at 10:15 am said:

Sorry to report: I tried compiling your code in a recent version of Quartus, and got this error:

Error (10172): Verilog HDL unsupported feature error at log2.sv(98): real variable data type values are not supported

(That’s this line:
function real rlog2;
)
Pete Johnson on October 29, 2015 at 10:21 am said:

I’m not surprised. I think there is this thinking in the software engineers of “we don’t support reals for synthesis so if you see a real function error our”. But during the initialization phase they really only need to implement the functions as software. At that point it’s not synthesis and hardware it’s just running some code to get an initial value. It should be easy to implement. This is a really useful feature for implementing a lot of math functions.

-Pete
Pingback: Getting the Log Base 2 Algorithm to Synthesize | Beyond Circuits

Beyond Circuits

Thoughts on logic design that goes beyond circuits.

Computing the Logarithm Base 2