A PAC Tutorial – Packet Architects AB

In this tutorial we look at some of the language features and write a short application that performs basic operations on an Ethernet packet.

The PAC main() function is executed once for each packet. It takes packet data as input parameters and produces information as output parameters that are used by subsequent blocks in the design. The function can also modify the packet data. From a programming perspective, PAC code can be regarded as executing on a sequential processor. The hardware implementation of a PAC application is an automatically generated pipeline in which many packets are processed simultaneously at wire speed. There are hardware aspects that you might have to take into account when writing a PAC application, but they are not covered in this tutorial.

The switch architecture separates packet processing into ingress processing and egress processing. There are therefore two PAC applications, each with its own main() function, and they are executed independently. The inputs and outputs of each processing function are determined by the adjacent blocks in the architecture.

Language fundamentals

PAC code has a structure that is similar to C.

if ( match == 1 ) {
    // There is a match
    dstQueue = destQ;
} else {
    // No match. Drop packet!
    dropPkt = 1;
}

One difference from the C language is the bit field data type, which is similar to SystemVerilog's. Since bit fields are fundamental to packet processing, this allows very concise descriptions.

bit:4 small;
bit:200 big;
bit:50 part1;
bit:150 part2;

small = big<15:12>; // the selected bits are extracted and assigned
{ part1, part2 } = big; // the bits of big are split across part1 and part2
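
To make the bit-select and split operations concrete, here is a small software model in Python. It is not PAC code. The helper names bits() and split() are hypothetical, and the model assumes that part1 receives the most significant bits of big, as a SystemVerilog-style concatenation would.

```python
def bits(value, msb, lsb):
    """Model of value<msb:lsb>: bits msb down to lsb, inclusive."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def split(value, *widths):
    """Split value into fields of the given widths, most significant first.

    Assumption: the left-most field takes the top bits.
    """
    fields = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        fields.append((value >> shift) & ((1 << width) - 1))
    return tuple(fields)


# A stand-in value for the 200-bit variable "big".
big = (0x3 << 150) | 0xABCD

small = bits(big, 15, 12)            # model of: small = big<15:12>
part1, part2 = split(big, 50, 150)   # model of: { part1, part2 } = big
```

Joining the two parts again with (part1 << 150) | part2 gives back the original value, which is a convenient sanity check for the split.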

struct ethernet_header {
bit:48 da;
bit:48 sa;
bit:16 tlen;
};

ethernet_header ehead;
dest = ehead.da;

There is usually a need for a processor that controls the switch. To make control and status registers for the processor easy to implement, there is a reg datatype. It is similar to a struct, but it also tells the FlexSwitch generator that the variable should be accessible from the processor interface. The generator can also automatically create documentation for all registers.

reg enable_switch {
bit:1 enabled;
} enable_register;

Essential to packet processing is having access to tables, typically for holding forwarding information. The ram datatype is used to create tables that can be accessed as easily as a C array.

Ingress processing

The ingress processing is executed for each packet received on the ingress interface (usually from the MACs in the Ethernet world). The processing should decode the packet headers and determine where the packet shall be sent.

Here is an outline of the main() ingress processing function. The interface to the surrounding blocks in the architecture is described as input/output parameters to the main function.

void main( // Incoming packet details
input t_port srcPort,
input t_cell srcPkt,
input t_resultSize srcBytes,

// Signals used by accepted packet
output t_cell dstPkt,
output t_queue dstQueue,
output t_mask dstPortmask,
output t_resultSize dstBytes,

// Signals used by drop packet
output bool dropPkt,
output byte dropId)
{
// Place code here.
}

An ingress main() function has a few input parameters: the source port, the packet data itself, and the number of bytes in the first cell of the packet. The outputs of the ingress processing are the updated packet, the destination ports (as a port mask), the destination queue for those ports, and the number of bytes in the cell.

There is also drop signaling for packets that shall be dropped. The ingress packet processing sets dropPkt to drop a packet, together with a dropId that specifies which counter to update when dropPkt is high. The drop counters are used for system debugging to determine where packets are dropped and for what reason.

Let’s start by defining a control register that a processor can use to configure the packet processing. In this case we define a register that can enable/disable each of the switch’s ports.

// A register for each port
reg portsEnabled {
bit:1 enabled;
} prtsEnabled[c_SrcPort];

Packet processing normally requires some forwarding tables to determine where to send packets, so let's define a table.

// A ram is going to be used as the lookup table
ram lookupTab {
bit:48 address; // the real address
bit:c_SrcPort destMask; // the destination mask
bit:3 destQueue; // The destination queue.
} lookupTable[4096][4];

This defines a small table of 4096 * 4 = 16384 entries. The table consists of 4 RAMs, each having 4096 entries/addresses. We will explain later why we need this table structure.

Now to do packet processing we need to be able to access the data in the table. First we declare some variables that can hold table data.

lookupTab result_1, result_2; // A struct named lookupTab is created automatically
                              // from the definition of the table
int index_x,index_y;

From a programmer’s perspective the table is simply a matrix that can be read and written like any memory. This is how to read from the table.

result_1 = lookupTable[1240][1]; // Read out entry 1240 from memory #1.

index_x = 234;
index_y = 2;

result_2 = lookupTable[index_x][index_y]; // Read out entry 234 from memory #2.
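
As a rough mental model, the table can be pictured in software as a two-dimensional array of records. The Python sketch below is not PAC code. It only mirrors the field layout of lookupTab, and the example values written into it are made up for illustration.

```python
from dataclasses import dataclass

NUM_ENTRIES = 4096  # addresses per RAM
NUM_RAMS = 4        # number of parallel RAMs


@dataclass
class LookupTab:
    address: int = 0     # 48-bit address
    dest_mask: int = 0   # one bit per port
    dest_queue: int = 0  # 3-bit queue number


# lookup_table[entry][ram], in the same index order as lookupTable[4096][4].
lookup_table = [[LookupTab() for _ in range(NUM_RAMS)]
                for _ in range(NUM_ENTRIES)]

# Write an entry, then read it back: entry 234 in memory #2.
lookup_table[234][2] = LookupTab(address=0x001122334455,
                                 dest_mask=0b0100,
                                 dest_queue=5)
index_x, index_y = 234, 2
result_2 = lookup_table[index_x][index_y]
```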

The next step is packet decoding. Packet Architects has defined a number of functions to make packet decoding as simple as possible. In this tutorial we use Ethernet packets. An Ethernet packet starts with a 48-bit destination address followed by a 48-bit source address and then a 16-bit number that is used either as a length field or to identify which protocol follows in the packet data.

To use the predefined functions for reading packet data we must first declare a pointer data structure.

rdPnt rdPt; // Read pointer structure.

// Here we will put the extracted protocol headers.
bit:48 da;
bit:48 sa;
bit:16 type;

To read packet data we use the pktRead() function. It reads a number of bytes defined by the first parameter and then updates the pointer. Each pktRead will start reading at the byte after the previous pktRead().

// Read the packet data.
da = pktRead(6,rdPt);
sa = pktRead(6,rdPt);
type = pktRead(2,rdPt);
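
The behaviour described above (read a number of bytes, then advance the pointer) can be modelled in a few lines of Python. This is only a sketch under assumptions. ReadPointer and pkt_read are hypothetical stand-ins, the packet is passed in explicitly (which the PAC function does not require), and the bytes are interpreted in network byte order.

```python
class ReadPointer:
    """Hypothetical stand-in for rdPnt: remembers the next byte to read."""

    def __init__(self):
        self.offset = 0


def pkt_read(num_bytes, rd_pt, packet):
    """Return the next num_bytes of packet as an integer and advance rd_pt."""
    start = rd_pt.offset
    end = start + num_bytes
    if end > len(packet):
        raise ValueError("read past end of packet")
    rd_pt.offset = end
    # Assumption: fields are big-endian, as on the wire.
    return int.from_bytes(packet[start:end], "big")


# Example frame: broadcast DA, made-up SA, EtherType 0x0800, then payload.
packet = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"

rd_pt = ReadPointer()
da = pkt_read(6, rd_pt, packet)
sa = pkt_read(6, rd_pt, packet)
eth_type = pkt_read(2, rd_pt, packet)
```

After the three reads the pointer sits at byte 14, the first byte after the Ethernet header.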

Besides pktRead there is a ready-to-use library for reading, for jumping over certain sections of the packet, and for reading out interesting fields. For the outgoing packet there are also packet write instructions: pktWrite (write packet data unchanged), pktOverWrite (write over existing packet data), pktInsert (insert new data into a packet, making the packet correspondingly longer) and pktDel (delete packet data).
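
The effect of the overwrite, insert and delete operations on the packet bytes can be sketched in Python as follows. The functions below are hypothetical models that take an explicit byte offset. They do not reflect the actual signatures of pktOverWrite, pktInsert or pktDel, which are not shown in this tutorial.

```python
def overwrite(packet, offset, data):
    """Replace len(data) bytes starting at offset; packet length is unchanged."""
    return packet[:offset] + data + packet[offset + len(data):]


def insert(packet, offset, data):
    """Insert data at offset; the packet grows by len(data) bytes."""
    return packet[:offset] + data + packet[offset:]


def delete(packet, offset, count):
    """Remove count bytes starting at offset; the packet shrinks by count."""
    return packet[:offset] + packet[offset + count:]


# Example Ethernet header with made-up addresses.
pkt = bytes.fromhex("ffffffffffff" "001122334455" "0800")
vlan_tag = bytes.fromhex("8100a00a")  # example 4-byte VLAN tag

tagged = insert(pkt, 12, vlan_tag)    # add a tag after the source address
untagged = delete(tagged, 12, 4)      # strip it again
new_da = overwrite(pkt, 0, bytes.fromhex("0a0b0c0d0e0f"))
```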

Now back to the example. A very common operation in packet decoding is to access parts of a variable, i.e. bit fields. This is done by selecting the fields using the <n:m> operator:

bit:48 da;
bit:24 manuf;
bit:24 id;

manuf = da<47:24>;
id = da<23:0>;

We’ve seen how to use a table as an ordinary array. In a real Ethernet switch, however, the address table can’t be indexed directly by the address: the Ethernet address is 48 bits, so the table would be too large to fit in any device. Most switches therefore implement the address table as a hash table. The hash search first calculates a hash key from the 48-bit Ethernet address. The key is then used as the address to read out the entries from multiple memories in parallel. Each address table entry contains an Ethernet address, so the packet’s Ethernet address is compared with each of the entries read out; if one matches, we have found the corresponding address entry. This is why the table is built from several memories: different addresses can fold to the same hash key, and with four memories up to four such addresses can be stored under one key.

Here is the code to calculate the hash key from the Ethernet address.

// Folding the 48 bit address to a hash key of 12 bits
bit:12 hashkey;
hashkey = da<47:36> ^ da<35:24> ^ da<23:12> ^ da<11:0>;
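
The same folding can be checked in ordinary software. The Python sketch below XORs the four 12-bit slices of a 48-bit address; the function name fold_hash and the example addresses are made up for illustration.

```python
def fold_hash(da):
    """Fold a 48-bit address into a 12-bit key by XOR-ing its four slices."""
    key = 0
    for lsb in (36, 24, 12, 0):
        key ^= (da >> lsb) & 0xFFF
    return key


da = 0x001122334455
hashkey = fold_hash(da)   # 0x001 ^ 0x122 ^ 0x334 ^ 0x455

# A different address built from the same slices in another order
# folds to the same key, so collisions are unavoidable.
other = 0x122001334455
collides = fold_hash(other) == hashkey
```

Because XOR does not depend on the order of its operands, any two addresses made of the same 12-bit slices collide. This is one reason the lookup compares the stored address against the packet's address instead of trusting the key alone.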

Each address table entry also contains a destination mask, a bit mask in which each bit corresponds to one port. The packet is sent out on the ports whose bits are set in the mask. The entry also contains a destination queue, which tells the design which queue on the destination port the packet is placed in. Alternatively, the priority queue on the destination port can be selected from the priority field in the incoming packet’s VLAN tag.
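
To illustrate how a destination mask is interpreted, the following Python sketch lists the ports selected by a mask. It assumes that bit 0 of the mask corresponds to port 0, which this tutorial does not state. The function name and the 8-port width are chosen only for the example.

```python
def ports_in_mask(mask, num_ports):
    """Return the port numbers whose bit is set in mask.

    Assumption: bit n of the mask corresponds to port n.
    """
    return [port for port in range(num_ports) if (mask >> port) & 1]


dest_mask = 0b00100110
ports = ports_in_mask(dest_mask, 8)   # ports 1, 2 and 5
```

A mask with several bits set sends the packet to several ports, and an all-zero mask selects no port at all.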

Now let’s put it all together into a complete application to illustrate how simple yet powerful the PAC language is.

// A register for each port
reg portsEnabled {
    bit:1 enabled;
} prtsEnabled[c_SrcPort];

// A ram is going to be used as the lookup table
ram lookupTab {
    bit:48 address;         // The Ethernet address
    bit:c_SrcPort destMask; // The destination mask
    bit:3 destQueue;        // The destination queue
} lookupTable[4096][4];

void main( // Incoming packet details
           input t_port srcPort,
           input t_cell srcPkt,
           input t_resultSize srcBytes,

           // Signals used by accepted packet
           output t_cell dstPkt,
           output t_queue dstQueue,
           output t_mask dstPortmask,
           output t_resultSize dstBytes,

           // Signals used by dropped packet
           output bool dropPkt,
           output byte dropId)
{
    // Variable declarations
    bit:12 hashkey;       // The address to read out in the table/memories.
    rdPnt rdPt;           // Read pointer structure.
    lookupTab result;     // Entry read out from the table.

    bit match;            // Used to determine if there is a match!
    bit:c_SrcPort destPM; // Destination ports
    bit:3 destQ;          // Destination queue

    bit:48 da;
    bit:48 sa;
    bit:16 type;

    // Check if port is enabled.
    if (prtsEnabled[srcPort].enabled == 0) {
        dropPkt = 1;
        dropId = 0;
    } else {
        da = pktRead(6, rdPt);
        sa = pktRead(6, rdPt);
        type = pktRead(2, rdPt);
        hashkey = da<47:36> ^ da<35:24> ^ da<23:12> ^ da<11:0>;
        match = 0;
        // Check all four memories for a match.
        for (int i = 0; i < 4 && match == 0; i++) {
            result = lookupTable[hashkey][i];
            if (result.address == da) {
                match = 1;
                destPM = result.destMask;
                destQ = result.destQueue;
            }
        }
        if (match == 1) {
            // There is a match
            dstQueue = destQ;
            dstPortmask = destPM;
        } else {
            // No match. Drop packet!
            dropPkt = 1;
            dropId = 1;
        }
    }
}
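
For readers who want to experiment with the lookup flow outside the PAC toolchain, here is a small Python model of the same steps: port-enable check, header read, hash fold, and a compare across the four memories. It is a sketch, not the generated hardware behaviour. The function names, the tuple return values and the use of None for an empty table slot are assumptions made for this example.

```python
NUM_RAMS = 4
NUM_ENTRIES = 4096
DROP_PORT_DISABLED = 0  # dropId used when the source port is disabled
DROP_NO_MATCH = 1       # dropId used when the address is not in the table


def fold_hash(da):
    """Fold a 48-bit address into a 12-bit key by XOR-ing its four slices."""
    key = 0
    for lsb in (36, 24, 12, 0):
        key ^= (da >> lsb) & 0xFFF
    return key


def ingress(src_port, packet, ports_enabled, lookup_table):
    """Return ("forward", dest_mask, dest_queue) or ("drop", drop_id)."""
    if not ports_enabled[src_port]:
        return ("drop", DROP_PORT_DISABLED)
    da = int.from_bytes(packet[0:6], "big")
    # Hardware reads the four RAMs in parallel; here we scan them in turn.
    for entry in lookup_table[fold_hash(da)]:
        if entry is not None and entry["address"] == da:
            return ("forward", entry["dest_mask"], entry["dest_queue"])
    return ("drop", DROP_NO_MATCH)


# Empty table: None marks an unused slot (an assumption of this model).
lookup_table = [[None] * NUM_RAMS for _ in range(NUM_ENTRIES)]
known_da = 0x001122334455
lookup_table[fold_hash(known_da)][2] = {
    "address": known_da, "dest_mask": 0b0110, "dest_queue": 3}

ports_enabled = [True, True, False, True]
known = bytes.fromhex("001122334455" "0a0b0c0d0e0f" "0800")
unknown = bytes.fromhex("00aabbccddee" "0a0b0c0d0e0f" "0800")

hit = ingress(0, known, ports_enabled, lookup_table)
miss = ingress(0, unknown, ports_enabled, lookup_table)
disabled = ingress(2, known, ports_enabled, lookup_table)
```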

Egress processing

The egress processing of PAC takes place after the packet has been scheduled from the buffer memory. The egress packet processing can re-queue a packet into a different queue or drop the packet.

void main( // Incoming packet details
           input t_port fromPort,
           input t_resultSize fromBytes,
           input t_queue fromQueue,
           input t_queueInfo fromQueueInfo,

           // Signals used for writing out the packet
           output t_cell dstPkt,
           output t_resultSize dstBytes,

           // Signals used by egress
           // to re-queue a packet to another egress port.
           output bit dstPortValid,
           output t_port dstPort,
           output t_queue dstQueue,
           output t_egressInfo dstQueueInfo,

           // Signals used to drop a packet at egress.
           output bool dropPkt,
           output byte dropId)
{
    // Code here.
}

The inputs are the port the packet came from, the number of bytes in the packet, the queue it came from, and a user-definable field of type t_queueInfo. The egress processing can use this field to tell itself where the packet has been before, which is useful for IP multicast and output mirroring.

To re-queue a packet, the dstPortValid, dstPort, dstQueue and dstQueueInfo outputs are used. They tell the queue engine to re-queue the packet to the queue and port that this information points to. A packet can be both sent out and re-queued if needed. If the packet should only be re-queued, the drop signals can be used.

The egress processing can drop packets with the same mechanism as the ingress processing, including a drop counter.

This should give you, as a user, a good introduction to what our PAC language offers. We strongly believe that not a single line of Verilog/VHDL code should ever have to be written for packet processing again!