# Algorithm

Computing the log base 2 of a whole number is easy: just use a priority encoder. But what about a fixed point number? That sounds a lot harder. But, like many things, there’s a trick to make things easier.

Consider the following equation:

If we choose such that , then is just the whole number portion of the log. and we can compute that with a priority encoder. Furthermore, we can compute the portion by barrel-shifting the original number. Finally, we can use a lookup table to compute the lower bits. The nice thing about this is that the lookup table can be much smaller, since we only need to store the values between 1 and 2.

Another way to think about this is that the logarithm curve from [2,4) is the same as the one from [1,2), just shifted up by 2 and scaled horizontally by 2. Likewise the curve from [4,8) is scaled the same way. The algorithm is just taking advantage of this symmetry.

Oh– additionally, this algorithm only works on input values greater or equal to one, so the input and output values are always positive.

# Implementation

Now, let’s code up our Verilog module. We’ll start with the module declaration:

module log2
#(
parameter in_int = 16,
parameter in_frac = 8,
parameter out_int = $clog2(in_int),
parameter out_frac = in_int+in_frac-out_int,
parameter lut_precision = 6,
parameter register_output_stage = 0
)

Since the module is dealing with a fixed-point input and output value, we need to specify how many integer and fractional bits there are in the input and output values. The *lut_precision* parameter specifies the log base 2 of the number of entries in our lookup table, and the default setting will be 64 entries in the table. There is also a parameter which allows for an optional final output register stage.

Next come the module ports:

(
input clk,
input [in_int+in_frac-1:0] din,
input din_valid, // data in is valid
output din_ready, // ready to receive data
output reg [out_int+out_frac-1:0] dout,
output reg dout_valid, // data out is ready (valid)
output reg dout_error // data out is incorrect - input data was less than 1.0
);

We have the clock *clk* and the input value *din*. A handshake input indicating that the *din* value is valid. There is an output signal *din_ready* indicating that the module is ready to accept an input value. There is, of course, the output value *dout*, and a valid signal dout_valid. Since it is possible to provide an illegal input between 0 and 1, there is also a *dout_error* signal indicating the output value is not valid due to an inadmissible input value.

Now, let’s look at the body of the module:

assign din_ready = 1;

Ready? Please. We’re always ready.

## Pipeline Stage 1

Next, we instantiate the recursive priority encoder from my previous blog post. This takes the integer portion of the input value and produces the integer portion of the output value. There is also a *prienc_error* output which indicates that none of the input bits were set. This means we were given an input value that was less than one.

wire [$clog2(in_int)-1:0] prienc_out;
wire prienc_error;
priencr #(in_int)
priencr
(
.decode(din[in_frac+:in_int]),
.encode(prienc_out),
.valid(prienc_error)
);

Next, we have the stage one pipeline logic:

reg [$clog2(in_int)-1:0] stage1_prienc_out;
reg stage1_error;
reg stage1_valid;
reg [in_int+in_frac-1:0] stage1_din;
always @(posedge clk)
begin
stage1_din <= din;
stage1_prienc_out <= prienc_out;
stage1_error <= prienc_error;
stage1_valid <= din_valid;
end

## Pipeline Stage 2

Stage two of the pipeline is the barrel shift logic. This shifts the input value to the left, based on the priority encoder output. Things are flipped, though, so lower priority encoder output values cause a larger shift. We’ll also pipeline the other signals:

reg [in_int+in_frac-1:0] stage2_barrel_out;
reg [out_int-1:0] stage2_barrel_out_int;
reg stage2_error;
reg stage2_valid;
always @(posedge clk)
begin
stage2_barrel_out <= stage1_din << (in_int-stage1_prienc_out-1);
stage2_barrel_out_int <= stage1_prienc_out;
stage2_error <= stage1_error;
stage2_valid <= stage1_valid;
end

## Pipeline Stage 3

The third pipeline stage is the lookup table, which computes the fractional part of the output. We will use an initial block to fill the table. First, we define a function to take the floating point log base-2 of an input value, since Verilog does not have one built in. Remember this rule from high school algebra?

function real rlog2;
input real x;
begin
rlog2 = $ln(x)/$ln(2);
end
endfunction

Next we declare the table and fill it in an initial block:

reg [out_frac-1:0] lut[(1<<lut_precision)-1:0];
integer i;
initial
begin : init_lut
lut[0] = 0;
for (i=1; i<1<<lut_precision; i=i+1)
lut[i] = $rtoi(rlog2(1.0+$itor(i)/$itor(1<<lut_precision))*$itor(1<<out_frac)+0.5);
end

Now, we need to use the barrel shift output as an address to the lookup table. We also carry along some of the stage 2 results:

reg [out_frac-1:0] stage3_lut_out;
reg [out_int-1:0] stage3_out_int;
reg stage3_error;
reg stage3_valid;
always @(posedge clk)
begin
stage3_out_int <= stage2_barrel_out_int;
stage3_lut_out <= lut[stage2_barrel_out[in_int+in_frac-2-:lut_precision]];
stage3_error <= stage2_error;
stage3_valid <= stage2_valid;
end

Finally we have the code for the optional stage 4 pipeline. This is controlled by the *register_output_stage* parameter:

generate
if (register_output_stage)
begin
always @(posedge clk) dout = {stage3_out_int,stage3_lut_out};
always @(posedge clk) dout_error = stage3_error;
always @(posedge clk) dout_valid = stage3_valid;
end
else
begin
always @(*) dout = {stage3_out_int,stage3_lut_out};
always @(*) dout_error = stage3_error;
always @(*) dout_valid = stage3_valid;
end
endgenerate

# Caveats

So there you have it: a module that computes log base 2 for fixed point inputs. However, there ‘s still one tiny problem: the design is not synthesizable by XST or Vivado. I haven’t tried other tools, but they may have issues with it as well.

The issue is that the tools don’t seem to implement the built-in Verilog floating point functions during elaboration. Essentially, the synthesis tool needs to run the initial block in order to know how to populate the lookup table. In general, the tools can do this. However, if they catch wind of a float value, they tuck their tail between their legs and run.

My workaround for this is to use the Perl preprocessor I describe in a previous blog post. But that’s a topic for another time.