Sunday, March 19, 2017

RAM HDL Coding Technique (Block RAM And Distributed RAM For FPGA)

Memories are storage elements that hold data. RAM is a type of memory that retains data only as long as power is applied. An FPGA has a limited number of LUTs and CLBs, and using them to build a large memory is not a good idea. FPGA vendors such as Xilinx and Altera therefore include dedicated memory blocks in the FPGA that can be used to implement large memories.

As a rule of thumb, comparatively large memories such as code RAM or data RAM (e.g. 512 KB or 1 MB) are built from the FPGA's dedicated memory blocks (called Block RAM), while small memories of about 2 KB or less are built from FPGA logic blocks (called distributed RAM).

A simple HDL description, such as a 2-D array of registers, can be used as a memory.
For example:
reg [31:0]  mem [1023:0] ;

But:
Will the synthesis tool infer it as Block RAM or as distributed RAM?
Is it possible to force Block RAM rather than distributed RAM, or vice versa?

The answer is yes: there is a specific coding technique for inferring each type.
There are two ways to write into a memory, synchronously and asynchronously, and likewise two ways to read from it.

For Block RAM coding, both write and read should be synchronous.
For distributed RAM coding, write should be synchronous and read should be asynchronous.
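
Besides the read style, many synthesis tools also accept an attribute that requests a particular implementation. The sketch below assumes Xilinx Vivado, where the attribute is named ram_style; other tools use different names (Intel Quartus, for instance, uses ramstyle with its own values) or may ignore the hint. The module name small_ram_hint and its 64 x 8 size are made up for illustration.

```verilog
// Hypothetical 64 x 8 memory; names and sizes are illustrative only.
module small_ram_hint (
  input  wire       clk,
  input  wire       write_en,
  input  wire [5:0] addr,
  input  wire [7:0] write_data,
  output wire [7:0] read_data
);

  // Vivado-style hint: "distributed" asks for LUT RAM, "block" for Block RAM.
  // Assumption: other tools use a different attribute name or ignore this one.
  (* ram_style = "distributed" *) reg [7:0] mem [0:63];

  // Synchronous write
  always @(posedge clk) begin
    if (write_en)
      mem[addr] <= write_data;
  end

  // Asynchronous read, matching the distributed RAM rule stated above
  assign read_data = mem[addr];

endmodule
```

The attribute is only a request. If the coding style does not fit the requested resource (for example, an asynchronous read combined with a "block" hint), the tool will typically warn and fall back to another implementation.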

The following examples show Block RAM and distributed RAM.

Block RAM Coding in Verilog

module block_ram (clk, addr,write_data, write_en, read_data);
input             clk ;
input [15:0]  addr ;
input [31:0]  write_data ;
input             write_en ;
output [31:0] read_data ;

reg [31:0] mem [65535:0] ;
reg [31:0] r_data ;

//Write into Memory
always @ (posedge clk)
begin
  if (write_en)
    mem[addr] <= write_data ;
end 

//Read from Memory
always @ (posedge clk)
begin
    r_data <= mem[addr] ;
end 
assign read_data = r_data ;

endmodule


Distributed RAM Coding in Verilog

module distributed_ram (clk, addr,write_data, write_en, read_data);
input             clk ;
input [15:0]  addr ;
input [31:0]  write_data ;
input             write_en ;
output [31:0] read_data ;

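// Note: the 64K x 32 size only mirrors the Block RAM example above;
// a distributed RAM this deep would consume a very large number of LUTs.
reg [31:0] mem [65535:0] ;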

//Write into Memory
always @ (posedge clk)
begin
  if (write_en)
    mem[addr] <= write_data ;
end 

//Read from Memory
assign  read_data = mem[addr] ;

endmodule
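
To see the difference between the two read styles in simulation, the following sketch puts both reads on one small memory and prints the outputs around a clock edge. The module names read_style_demo and read_style_demo_tb, the 16 x 8 size, and the stimulus values are all made up for illustration.

```verilog
// Hypothetical demo: one 16 x 8 memory with both read styles side by side.
module read_style_demo (
  input  wire       clk,
  input  wire       write_en,
  input  wire [3:0] addr,
  input  wire [7:0] write_data,
  output reg  [7:0] sync_data,   // registered read (Block RAM style)
  output wire [7:0] async_data   // combinational read (distributed RAM style)
);
  reg [7:0] mem [0:15];

  always @(posedge clk) begin
    if (write_en)
      mem[addr] <= write_data;
    sync_data <= mem[addr];      // updates only on the clock edge
  end

  assign async_data = mem[addr]; // follows addr without waiting for a clock
endmodule

// Simulation-only testbench
module read_style_demo_tb;
  reg        clk = 1'b0;
  reg        write_en = 1'b0;
  reg  [3:0] addr = 4'd0;
  reg  [7:0] write_data = 8'h00;
  wire [7:0] sync_data, async_data;

  read_style_demo dut (
    .clk(clk), .write_en(write_en), .addr(addr),
    .write_data(write_data), .sync_data(sync_data), .async_data(async_data)
  );

  always #5 clk = ~clk;

  initial begin
    // Write 8'h3C to address 2
    @(negedge clk); addr = 4'd2; write_data = 8'h3C; write_en = 1'b1;
    @(negedge clk); write_en = 1'b0;
    // Move away, then present address 2 again between clock edges
    addr = 4'd0;
    @(negedge clk); addr = 4'd2;
    #1 $display("before edge: async=%h", async_data);                  // async=3c
    @(posedge clk);
    #1 $display("after edge:  async=%h sync=%h", async_data, sync_data); // both 3c
    $finish;
  end
endmodule
```

The asynchronous output shows the stored value as soon as the address changes, while the synchronous output needs the next rising clock edge. That one-cycle read latency is what lets the tool map the read onto a Block RAM's registered output.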


Both memories above have only a write_en signal, which writes the complete word. In the code above, when write_en is asserted, the full 32-bit write_data is written into memory.
Some designs need bit-wise or byte-wise writes, which the examples above do not support.

Distributed RAM and Block RAM differ as follows when writing individual bits or bytes.

Distributed RAM
1. Data width - no limitation.
2. Bit and byte writes are possible.

Block RAM
1. Data width - multiples of 8 bits only (some FPGAs also support multiples of 9 bits).
2. Only byte writes are possible.

Here is example code for both RAM types.


Block RAM Coding in Verilog (With Byte Enable)

module block_ram (clk, addr,write_data, write_en, byte_en, read_data);
input             clk ;
input [15:0]  addr ;
input [31:0]  write_data ;
input [3:0]    byte_en ;
input             write_en ;
output [31:0] read_data ;

reg [31:0] mem [65535:0] ;
reg [31:0] r_data ;

//Write into Memory (byte_en[i] enables byte i of the 32-bit word)
integer i ;
always @ (posedge clk)
begin
  if (write_en)
    for (i = 0; i < 4; i = i + 1)
      if (byte_en[i])
        mem[addr][8*i +: 8] <= write_data[8*i +: 8] ;
end 

//Read from Memory
always @ (posedge clk)
begin
    r_data <= mem[addr] ;
end 
assign read_data = r_data ;

endmodule


Distributed RAM Coding in Verilog (With Bit Enable)

module distributed_ram (clk, addr,write_data, write_en, bit_en, read_data);
input             clk ;
input [15:0]  addr ;
input [31:0]  write_data ;
input [31:0]    bit_en ;
input             write_en ;
output [31:0] read_data ;

reg [31:0] mem [65535:0] ;


localparam DATA_WIDTH = 32 ;

wire [31:0] data_real ;  // per-bit write mask, gated by write_en
wire [31:0] data_rmw ;   // merged word that is written back

// read - modify - write
genvar j;

generate for (j = 0; j < DATA_WIDTH; j = j + 1) begin : g_we
   assign data_real[j] = write_en ? (bit_en[j]) : 1'b0;
end endgenerate

// Implement the bit write by read - modify - write
assign data_rmw = (~data_real & mem[addr]) | ( data_real & write_data);

// DATA WRITE
always @ (posedge clk)
begin
  if (write_en)
    mem[addr] <= data_rmw ;
end

// DATA READ
assign read_data = mem[addr] ;

endmodule
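
The read-modify-write merge used above can be checked on its own with a small worked example. This is a simulation-only sketch; the module name bit_merge_example and the signal names and values are chosen purely for illustration.

```verilog
// Simulation-only sketch of the bit-enable merge:
// masked bits come from the new word, all other bits keep the old word.
module bit_merge_example;
  reg  [31:0] old_word;   // value currently stored in memory
  reg  [31:0] new_word;   // incoming write data
  reg  [31:0] mask;       // 1 = overwrite this bit, 0 = keep old bit
  wire [31:0] merged = (~mask & old_word) | (mask & new_word);

  initial begin
    old_word = 32'hAAAA_AAAA;
    new_word = 32'h5555_5555;
    mask     = 32'h0000_FFFF;   // update only the lower 16 bits
    #1 $display("merged = %h", merged);   // prints: merged = aaaa5555
    $finish;
  end
endmodule
```

The upper half keeps the old value because its mask bits are 0, and the lower half takes the new data because its mask bits are 1.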