Hi Yassen!
I've decided to build a neural network that will learn a simple logic function provided by the user (AND, OR, ...).
I've managed to simulate it in Matlab using fixed-point arithmetic. Here are the facts regarding the VHDL implementation:
- I've implemented a LUT for the tansig (tan-sigmoid) function, and since it's a simple ROM it does not use many resources.
- I've built a kind of fast ‘multiplier-adder’ that multiplies the inputs by their weights and sums the products.
- My implementation (I'm using Xilinx ISE; the target is a Spartan-3E) uses the embedded 18x18 multipliers in the Spartan.
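To make the setup above concrete, here is a minimal Python sketch of one neuron: a fixed-point multiply-and-sum followed by a tansig lookup in a ROM-style table. It is only a behavioural model, not the VHDL itself. The number of fractional bits, the ROM depth, the input range covered by the ROM, and all function names are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 8                  # assumed fixed-point format: 8 fractional bits
SCALE = 1 << FRAC_BITS
LUT_SIZE = 256                 # assumed ROM depth
X_MIN, X_MAX = -4.0, 4.0       # assumed input range covered by the ROM


def to_fixed(x):
    """Convert a float to a fixed-point integer with FRAC_BITS fractional bits."""
    return int(round(x * SCALE))


# ROM contents: tanh sampled uniformly over [X_MIN, X_MAX), stored in fixed point.
TANSIG_ROM = [
    to_fixed(math.tanh(X_MIN + (X_MAX - X_MIN) * i / LUT_SIZE))
    for i in range(LUT_SIZE)
]

# Width of one ROM step in fixed point (8/256 = 1/32, i.e. 8 with 8 fractional bits).
STEP = to_fixed((X_MAX - X_MIN) / LUT_SIZE)


def tansig_lut(value):
    """Look up tanh(value) in the ROM; inputs outside the range saturate."""
    index = (value - to_fixed(X_MIN)) // STEP
    index = max(0, min(LUT_SIZE - 1, index))
    return TANSIG_ROM[index]


def neuron(inputs, weights, bias):
    """Multiply inputs by weights, sum the products, then apply the tansig LUT."""
    acc = bias << FRAC_BITS            # align the bias with the products
    for x, w in zip(inputs, weights):
        acc += x * w                   # each product has 2 * FRAC_BITS fractional bits
    acc >>= FRAC_BITS                  # scale back to FRAC_BITS fractional bits
    return tansig_lut(acc)
```

For example, `neuron([to_fixed(1.0), to_fixed(1.0)], [to_fixed(0.5), to_fixed(0.5)], 0)` returns a fixed-point value close to `tanh(1.0)`; the error depends on the assumed ROM resolution.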
Now, the problem is that my Spartan has only 20 multipliers. I know that this is more than enough for a small network that only learns a simple logic function, but I'll try to extend my problem to number recognition on an 8x8 matrix. In that case 20 multipliers are not enough.
As far as I can see, I have two possibilities:
1) Use the maximum number of multipliers (in my case 20) to build a component that just multiplies and sums all the numbers (let’s call it the ‘mulsum’ component). If the problem requires a larger neural network, I just feed my ‘mulsum’ component with different parts of the network.
The good side of this kind of implementation would be simplicity. The bad side would be sequential calculation, which just does not fit the neural network philosophy. There is also a clock problem: so far I’ve implemented 16 embedded multipliers and my clock went down to 48 MHz! I’m obviously losing some points here.
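The time-multiplexed idea in option 1 can be sketched in Python as follows. This is only a behavioural model under stated assumptions: `mulsum` stands for one pass through the 20 multipliers, and `neuron_sum` is a hypothetical helper that feeds it one slice of the inputs and weights per pass and accumulates the partial sums.

```python
NUM_MULTIPLIERS = 20   # embedded multipliers available on the device


def mulsum(inputs, weights):
    """One pass: multiply up to NUM_MULTIPLIERS pairs and sum the products."""
    assert len(inputs) == len(weights) <= NUM_MULTIPLIERS
    return sum(x * w for x, w in zip(inputs, weights))


def neuron_sum(inputs, weights):
    """Weighted sum of any number of inputs, reusing mulsum sequentially.

    Returns the total and the number of passes that were needed.
    """
    assert len(inputs) == len(weights)
    total = 0
    passes = 0
    for start in range(0, len(inputs), NUM_MULTIPLIERS):
        end = start + NUM_MULTIPLIERS
        total += mulsum(inputs[start:end], weights[start:end])
        passes += 1
    return total, passes
```

For a neuron fed by all 64 pixels of an 8x8 matrix, this model needs 4 passes, which is the sequential cost mentioned above.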
2) Use a self-made multiplier, a Booth multiplier for example. Every input/weight pair would have one.
The downside of this kind of implementation would be my lack of knowledge about multipliers. I don’t know how complex they are, how many gates they consume, what their timing is like, ...
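For reference, the basic radix-2 Booth algorithm can be modelled in a few lines of Python. This is a sketch of the algorithm only, assuming an 18-bit signed operand width to match the embedded multipliers; it says nothing about gate count or timing. Each loop iteration corresponds to one step of a bit-serial design: inspect two adjacent multiplier bits, then add, subtract, or skip the shifted multiplicand. The function name and the `n_bits` parameter are illustrative.

```python
def booth_multiply(multiplicand, multiplier, n_bits=18):
    """Multiply two signed n_bits-wide integers using radix-2 Booth recoding."""
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    assert lo <= multiplicand <= hi and lo <= multiplier <= hi

    q = multiplier & ((1 << n_bits) - 1)   # two's complement bit pattern
    product = 0
    prev_bit = 0                           # implicit bit to the right of bit 0
    for i in range(n_bits):
        bit = (q >> i) & 1
        if (bit, prev_bit) == (1, 0):      # start of a run of ones: subtract
            product -= multiplicand << i
        elif (bit, prev_bit) == (0, 1):    # end of a run of ones: add
            product += multiplicand << i
        # (0, 0) and (1, 1): no operation, only the shift
        prev_bit = bit
    return product
```

In this bit-serial form one multiplication takes `n_bits` steps, so a hardware version built this way would trade multiplier resources for clock cycles.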
Do you have any thoughts on that? Do you have any experience with, or literature on, building multipliers from scratch?
Best regards,
Matej Gutman