Implementing 'grep' on an FPGA.

FPGA projects on this site, or abroad

Implementing 'grep' on an FPGA.

Postby hamster » Sun Jan 23, 2011 11:33 pm

Hi,

On my Xeon @2.7GHz I can grep at 13MB/s looking for three patterns, but even without search optimisations I could think I could achieve one byte char per clock cycle on my FPGA

I've hand-coded something that looks for 18 words in a byte stream, and found it to be workable but very tedious.

Has anybody implemented something like 'grep' on an FPGA? - After all, grep just changes the regular expressions into a state machine, and FPGAs are great with state machines!

Maybe something somebody has written a code generator for this, removing all the drudge?

Mike
hamster
 
Posts: 38
Joined: Mon Sep 27, 2010 10:19 pm

Postby Jotta » Wed Jan 26, 2011 10:29 pm

Hi,

in the simplest case Pattern=Keyword, it should not be to difficult
to create a FSM (also for >1 keywords), even with a simple
software script. The only problem for >1 keywords is to handle
more than 1 hit (>1 keywords are identical).

But I guess you mean Pattern=RegularExpression. I've never
found such a tool, so I think you have to go the hard way and
write a generator and transform the resulting FSM into VHDL/Verilog.

I wrote a simple Generator for regexs long ago, a good source
is e.g, Hopcraft/Ullman "Introd. to Automata Theory.." or
other CompilerDesign books.

Good Luck
Jotta
 
Posts: 3
Joined: Thu Sep 30, 2010 12:11 pm

Implementing 'grep' on an FPGA.

Postby hamster » Thu Jan 27, 2011 3:46 am

Hi, thanks for you kind words.

I've done a 'hand coded' version for matching my list of 20 or so keywords, and although it processes each byte in a single cycle it is pretty hefty, at about 300 4LUTs (or about 4LUTS per keyword of each character).

I'm going to recode it to be table driven (each row being ["Char to match", "State # on Match","State on no-match", "output signals"]), which will be slower, but the table could be held in a BRAM, giving just an 8 bit comparator and a MUX to select which state to move to next.

As ever, it's the old speed, size, complexity trade-off....
hamster
 
Posts: 38
Joined: Mon Sep 27, 2010 10:19 pm

Postby Jotta » Thu Jan 27, 2011 7:26 am

Hi,

I read your thread about TinyBasic implementation, so I think
your "grep" is only keyword based.

You have not mentioned how you implemented your FSM, is
there a small FSM for each keyword or a big one for all words?
I would prefer the last approach. This FSM can be simplified
and compacted (and your char set can maybe reduced to
7 or 6 Bits). Putting this in a BRAM should also not so
difficult.

Good Luck
Jotta
 
Posts: 3
Joined: Thu Sep 30, 2010 12:11 pm


Return to General projects