fpga4fun.com

by **Oneironaut** » Wed Nov 05, 2008 9:45 pm

Finally added the second 512K SRAM to my board and things are up and running.

The Xilinx board came with nothing but the XC2C256 on it, so I added my own dual 512K SRAMs and the logic that lets the video system talk to the outside world. Any microcontroller or microprocessor can draw graphics to a VGA screen using a flicker-free double-buffer system: draw to the back buffer, then just issue a flip command. Flicker-free and fast. The logic allows 5 V or 3.3 V interfacing.
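As a rough illustration of the flip described above, here is a minimal Verilog sketch. It assumes the two pages are selected by a single bit and that the swap is deferred to vertical blanking so it never happens mid-frame. The module name, the signal names, and the vblank deferral are assumptions made for illustration, not details of the actual board.

```verilog
// Hypothetical sketch: page flip for a double-buffered frame store.
// front_page selects the page being displayed; the host draws into the other.
module page_flip (
    input  wire clk,
    input  wire rst,
    input  wire flip_req,    // one-cycle pulse from the host: "flip"
    input  wire vblank,      // high during vertical blanking (assumed available)
    output reg  front_page,  // page currently scanned out to the VGA
    output wire back_page    // page the host draws into
);
    reg flip_pending;        // remembers a flip requested during active video

    assign back_page = ~front_page;

    always @(posedge clk) begin
        if (rst) begin
            front_page   <= 1'b0;
            flip_pending <= 1'b0;
        end else begin
            if (flip_req)
                flip_pending <= 1'b1;
            // Swap pages only while no pixels are being displayed.
            if (vblank && (flip_pending || flip_req)) begin
                front_page   <= ~front_page;
                flip_pending <= 1'b0;  // later assignment wins over the set above
            end
        end
    end
endmodule
```

The host only ever writes to `back_page`, so a half-drawn frame is never visible.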

Hope to have some graphics demos and games done soon. I will probably start by writing some GFX demos in an AVR324p.

60 frames per second with screenfuls of priority-based sprites will be easy to achieve, even for a 20MHz AVR or PIC.

Cheers,
Brad

by **Oneironaut** » Mon Nov 10, 2008 3:57 am

Now that all of the 74HC logic has been tested, the AVR324p can talk to the system and draw text, sprites, and animate graphics at speeds close to what a 486 DX could have done back in the good ol' days.

I only have basic IO routines done, but hope to have some real demos to show off the speed of the graphics board in a week or so.

So far the entire code is using 35% of the Xilinx XC2C256, and includes a fast back buffer screen clearing routine to offload the external CPU, as that takes more than 960,000 cycles to complete.
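To illustrate the kind of hardware clear mentioned above, here is a minimal sketch of an address counter that writes a fill value to every location of the back buffer. It assumes an 800x600 page with one location per pixel (480,000 writes) and a memory interface that accepts one write per clock. `clear_engine` and all of its ports are hypothetical names, not the actual CPLD code.

```verilog
// Hypothetical sketch: hardware back-buffer clear.
module clear_engine #(
    parameter PIXELS = 480000          // assumed: 800 x 600, one location per pixel
) (
    input  wire        clk,
    input  wire        rst,
    input  wire        start,          // one-cycle pulse from the host
    input  wire [7:0]  fill,           // value written to every location
    output reg         busy,           // high while the clear is running
    output reg  [18:0] addr,           // 19 bits cover 512K locations
    output reg  [7:0]  wdata,
    output reg         we              // active-high write request
);
    always @(posedge clk) begin
        if (rst) begin
            busy  <= 1'b0;
            we    <= 1'b0;
            addr  <= 19'd0;
            wdata <= 8'd0;
        end else if (!busy) begin
            we <= 1'b0;
            if (start) begin
                // Begin at location 0 with the requested fill value.
                busy  <= 1'b1;
                addr  <= 19'd0;
                wdata <= fill;
                we    <= 1'b1;
            end
        end else if (addr == PIXELS - 1) begin
            // Last location has been written; stop.
            busy <= 1'b0;
            we   <= 1'b0;
        end else begin
            addr <= addr + 1'b1;
        end
    end
endmodule
```

The host issues `start` and polls `busy` instead of writing every pixel itself.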

Once the final circuit boards are done, I will have a standalone high-resolution super VGA card ready to connect to any uC or CPU. Even a simple BASIC Stamp could pull off a decent arcade-quality conversion.

Brad

by **sofian** » Sun Nov 30, 2008 9:22 pm

Are your code and schematic public?

by **Oneironaut** » Mon Dec 01, 2008 12:32 am

As soon as everything is tested, I will be putting up a website.

Thanks,
Brad

by **sofian** » Wed Dec 03, 2008 10:55 pm

Ok

If you need help routing the PCB for the prototype, please send the schematic.

by **Oneironaut** » Thu Dec 04, 2008 9:54 pm

Thanks for the offer.

This project has currently been sent to an engineering firm for final development and I hope to have boards available early in the new year. I want to develop a decent code base for PIC and AVR (C and ASM) before offering the boards though.

I will post more photos of the final unit soon.

Here are a few photos as well as video of the very first prototype being used with an external AVR324 to generate a simple sprites demo...

http://www.lucidscience.com/az641/

Updated....
http://www.lazarus64.com/

This unit was based on mostly logic parts and another AVR for sync generation and back buffer control.

The final unit will be about half the size of a playing card and has only a single SRAM and one CPLD.

Cheers,
Brad

by **Oneironaut** » Fri Jun 12, 2009 11:14 pm

Thanks!

I am running my system at 57 MHz now, and it has no problems with noise at all.

I have made a decent amount of progress since the last time I posted...

http://www.lazarus64.com/

I am now using the XESS Spartan3 board...

Brad

by **nazi** » Sun Aug 30, 2009 9:40 am

Hi,

I have to design an SRAM controller for a Spartan3 XC3S1500 FPGA board. It comes with 1Mb of asynchronous SRAM, model CY7C1041CV33.

My first task is to write a controller for the storage of an array.
I downloaded the specifications of the SRAM, but I don't know how to use it...

I know this should be really simple, but I'm a real newbie... So if someone could help me by describing the steps I have to go through to write the controller, or by providing a simple controller template, I would be really thankful.

Waiting for a reply.

regards

by **Oneironaut** » Sun Sep 13, 2009 8:20 pm

I know this is a late reply, but in case you are still working on your project, let me warn you about something that took me months to figure out when making my SRAM controller.

**** Bus TurnAround Time ****

Yes, even though this is probably the most important thing you need to know about, not one single datasheet for the 4 different types of SRAM I use has even bothered to mention it anywhere!

Since real data is scarce, I have found that 2 dead cycles are typically needed when you switch between read and write on an SRAM. Leave them out and all kinds of fun things will happen.
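For anyone wanting to try this, here is a minimal sketch of a controller that inserts idle cycles whenever the access direction changes. It is a simplified illustration under stated assumptions, not my actual controller: the module, port names, and one-clock access pulses are hypothetical, and the tri-state buffer for the data pins is left to the top level.

```verilog
// Hypothetical sketch: insert DEAD_CYCLES idle clocks on a read/write change.
module sram_turnaround #(
    parameter DEAD_CYCLES = 2          // figure reported above; must be <= 4 here
) (
    input  wire        clk,
    input  wire        rst,
    input  wire        req,            // held high by the requester until ack
    input  wire        wr,             // 1 = write, 0 = read
    input  wire [18:0] addr,
    input  wire [7:0]  wdata,
    output reg         ack,            // access accepted this cycle
    output reg  [18:0] sram_addr,
    output reg         sram_we_n,
    output reg         sram_oe_n,
    output reg  [7:0]  sram_dout,
    output reg         sram_drive      // 1 = FPGA drives the data pins
);
    reg       last_wr;                 // direction of the previous access
    reg [1:0] dead;                    // idle cycles still to wait

    always @(posedge clk) begin
        if (rst) begin
            ack <= 1'b0; sram_we_n <= 1'b1; sram_oe_n <= 1'b1;
            sram_drive <= 1'b0; sram_addr <= 19'd0; sram_dout <= 8'd0;
            last_wr <= 1'b0; dead <= 2'd0;
        end else begin
            // Defaults: bus idle, nobody driving.
            ack <= 1'b0; sram_we_n <= 1'b1; sram_oe_n <= 1'b1;
            sram_drive <= 1'b0;
            if (dead != 0) begin
                dead <= dead - 1'b1;               // still turning around
            end else if (req) begin
                if (wr != last_wr && DEAD_CYCLES != 0) begin
                    // Direction change: this cycle is the first idle one.
                    last_wr <= wr;
                    dead    <= DEAD_CYCLES - 1;
                end else begin
                    last_wr    <= wr;
                    ack        <= 1'b1;
                    sram_addr  <= addr;
                    sram_dout  <= wdata;
                    sram_we_n  <= ~wr;
                    sram_oe_n  <= wr;
                    sram_drive <= wr;
                end
            end
        end
    end
endmodule
```

Setting `DEAD_CYCLES` to 0 removes the wait entirely, which makes it easy to test whether the idle cycles are really needed on a given board.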

There is an SRAM called ZBT (zero bus turnaround) as well, but I have not tried it yet.

Brad

by **tkbits** » Mon Sep 14, 2009 5:00 am

Asynchronous SRAMs (like the mentioned Cypress RAM) don't have clock cycles.

by **elpuri** » Mon Sep 14, 2009 5:28 am

Which is probably the reason that it's not mentioned in the datasheets :wink:

Brad, you probably had some other timing violation in your design. Or did you really use synchronous SRAM? I think they're quite rare. Except for the block RAM inside an FPGA, of course.

by **Oneironaut** » Mon Sep 14, 2009 2:24 pm

Asynchronous SRAMs do have bus turnaround times. When I say clock cycles, I mean your main system clock. I am running my video system at 57.252MHz, and the SRAM requires 2 dead cycles between finishing a video line read and allowing the host to write. This is the same for every type of RAM I tried except for parts called ZBT.

CY7CX1CV33 is the same SRAM I am using right now.... 512K 10ns asynchronous.

In the one datasheet I did find that mentioned turnaround time, they also call the dead cycles clock cycles, referring to the maximum speed of the part. At 10ns (100MHz), the manufacturer claimed 3 dead clock cycles, which is why I am using 2 cycles at 57.252MHz.
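The arithmetic behind that choice can be checked with a small simulation-only sketch. It takes the figures in this post at face value (3 dead cycles at 10 ns, i.e. 30 ns in total) and rounds up to whole system clock periods. The parameter names and the `ceil_div` helper are illustrative assumptions.

```verilog
// Hypothetical sketch: convert a claimed turnaround time into whole clocks.
module dead_cycle_calc;
    // Assumed figures, in picoseconds:
    localparam T_TURN_PS    = 30000;  // 3 cycles at 10 ns, as claimed above
    localparam T_CLK_57_PS  = 17467;  // 1 / 57.252 MHz, rounded
    localparam T_CLK_100_PS = 10000;  // 1 / 100 MHz

    // Integer division rounded up.
    function integer ceil_div;
        input integer a;
        input integer b;
        begin
            ceil_div = (a + b - 1) / b;
        end
    endfunction

    initial begin
        $display("57.252 MHz: %0d dead cycles", ceil_div(T_TURN_PS, T_CLK_57_PS));
        $display("100 MHz:    %0d dead cycles", ceil_div(T_TURN_PS, T_CLK_100_PS));
    end
endmodule
```

Under these assumptions the result is 2 cycles at 57.252 MHz and 3 cycles at 100 MHz.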

Accounting for this turnaround time solved every single glitch in my system and allowed much higher memory efficiency.
http://www.lazarus64.com/video/capture.wmv

I am also using the Spartan3 (XC3S1000), so we have almost identical hardware.

Here is a good bit of info on ZBT...
http://www.altera.com/literature/an/an183.pdf

I still find it strange how little information there is on SRAM bus turnaround time out there considering how important it is when trying to get any real performance from the SRAM.

Before I got into FPGAs, I used slow 55ns-70ns SRAM, and this was of no concern.

Brad

by **elpuri** » Mon Sep 14, 2009 3:15 pm

In a sense yes it has a "turnaround time", because there are setup and hold time restrictions, but of the parts I've looked at none have times that would be multiples of the access time.

The Altera application note you linked specifically talks about ZBT SRAM versus syncburst SRAM.

Couldn't find a part called CY7CX1CV33 on Cypress' website. Typo?

I just have a hard time believing that there is an asynchronous SRAM manufacturer conspiracy that makes them leave this vital information out of their datasheets. That would make no sense at all (as opposed to synchronous burst SRAMs having bus turnaround time because they must have some control logic built inside).

If you can dig up an asynchronous SRAM datasheet which mentions bus turnaround for the part, I'll believe you.

by **Oneironaut** » Mon Sep 14, 2009 3:52 pm

The part is CY7C1049CV33 or DV33, I was just referring to the CYC series generally.

http://www.ociw.edu/instrumentation/ccd/parts/CY7C1049CV33.pdf

In the datasheet, the usual RW timings are given, which are basic for all SRAMs. Notice how not once does it mention bus turnaround times.

If it wasn't for the fact that I have a working system here, I would also believe you that there is no such timing consideration. It took days to finally sort out small glitches that I once thought were in my code, but in fact were bus turnaround times.

All I did was add 2 dead cycles after the horizontal pixel read, which buffers 256 pixels into a blockram before allowing RW access to the ram. It was strange... reading worked right away once the "memready" flag was set, but the first two writes would be random or glitchy.

After much digging, I found several other people who have discovered this problem when using SRAM near its maximum ratings.

When I read this it made me think...
http://www.eeherald.com/section/design-guide/esmod15.html

>>
ZBT (zero bus turnaround): the turnaround is the number of clock cycles it takes to change access to the SRAM from write to read and vice versa. The turnaround for ZBT SRAMs or the latency between read and writes cycle is zero. In short the ZBT is designed to eliminate dead cycles when turning the bus around between read and writes and reads.
>>

So I simply added 2 dead cycles between the last buffer read and the "memready" flag and it solved all problems instantly.
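As a minimal sketch of that fix, the flag can be held low for a fixed number of clocks after the line-buffer read finishes. This is a simplified illustration, not the code from my GPU: `line_read_active` and the module name are hypothetical, and only the "memready" name comes from the description above.

```verilog
// Hypothetical sketch: delay "memready" by DEAD_CYCLES after the video read.
module memready_delay #(
    parameter DEAD_CYCLES = 2          // must be <= 3 with the 2-bit counter
) (
    input  wire clk,
    input  wire rst,
    input  wire line_read_active,      // high while a video line is being buffered
    output wire memready               // host/GPU may access the SRAM
);
    reg [1:0] wait_cnt;                // dead cycles still to wait

    always @(posedge clk) begin
        if (rst || line_read_active)
            wait_cnt <= DEAD_CYCLES;   // reload while the video read owns the bus
        else if (wait_cnt != 0)
            wait_cnt <= wait_cnt - 1'b1;
    end

    // Ready only when the video read is over and the dead cycles have elapsed.
    assign memready = !line_read_active && (wait_cnt == 0);
endmodule
```

With `DEAD_CYCLES = 2`, `memready` rises two clocks later than it would without the wait.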

To further test this dead cycle fix, I upped the master clock from ~60MHz to 100MHz and found that I indeed needed 3 cycles, just as someone on another site pointed out when using SRAM at 10ns.

I can't be the only person here who has discovered that SRAM bus turnaround time is real?? I guess if it wasn't a real issue, IDT would not have created ZBT SRAM.

Here is a tiny bit of code from my GPU section that does a block copy from one page of memory to another...

```verilog
/////////////////////////////////////////////////////////////////////////////////////////////////
////////// COMMAND 015 - BLOCK COPY : (ROTATION VALUE : ALPHA COLOR)
////////// X1LOC,Y1LOC : SOURCE X1,SOURCE Y1
////////// X2LOC,Y2LOC : SOURCE X2,SOURCE Y2
////////// X3LOC,Y3LOC : DESTINATION X,DESTINATION Y
////////// X4LOC,Y4LOC : SOURCE PAGE,DESTINATION PAGE (100 = CURRENT DRAWING PAGE)
/////////////////////////////////////////////////////////////////////////////////////////////////
if (rxcomm == 15 & comstate == 1) begin
    deltax <= x1loc;

    // SELECT POSITION 0
    if (rxdata1 == 0) begin
        deltay <= x3loc;
    end

    // SELECT POSITION 1
    if (rxdata1 == 1) begin
        deltay <= y3loc;
        x3loc <= x3loc + (x2loc-x1loc);
    end

    // SELECT POSITION 2
    if (rxdata1 == 2) begin
        deltay <= x3loc + (x2loc-x1loc);
        x3loc <= x3loc + (x2loc-x1loc);
        y3loc <= y3loc + (y2loc-y1loc);
    end

    // SELECT POSITION 3
    if (rxdata1 == 3) begin
        deltay <= y3loc + (x2loc-x1loc);
        y3loc <= y3loc + (x2loc-x1loc);
    end

    // SELECT POSITION 4
    if (rxdata1 == 4) begin
        deltay <= x3loc + (x2loc-x1loc);
        x3loc <= x3loc + (x2loc-x1loc);
    end

    // SELECT POSITION 5
    if (rxdata1 == 5) begin
        deltay <= y3loc + (y2loc-y1loc);
        x3loc <= x3loc + (x2loc-x1loc);
        y3loc <= y3loc + (x2loc-x1loc);
    end

    // SELECT POSITION 6
    if (rxdata1 == 6) begin
        deltay <= x3loc;
        y3loc <= y3loc + (y2loc-y1loc);
    end

    // SELECT POSITION 7
    if (rxdata1 > 6) begin
        rxdata1 <= 7;
        deltay <= y3loc;
    end

    // CALCULATE SOURCE AND DESTINATION PAGES
    xtemp <= x4loc*51200;
    if (y4loc < 100) ytemp <= y4loc*51200;
    if (y4loc == 100) ytemp <= drawpage;
    comstate <= 2;
end

// EXECUTE COMMAND PIPELINE FOR BLOCK COPY
if (rxcomm == 15 & comstate == 2 & memready == 1) begin
    pipeline <= pipeline + 1;

    // PIXEL READ CYCLE
    if (pipeline == 2) begin
        sramadr <= x1loc+(y1loc*256)+xtemp;
        sramwe <= 1;
        sramoe <= 0;
        sramce <= 0;
    end
    if (pipeline == 4) delta <= sramport;

    // PIXEL WRITE CYCLE
    if (pipeline == 5) begin
        sramout <= palette[delta];
        sramadr <= x3loc+(y3loc*256)+ytemp;
        sramwe <= 0;
        sramoe <= 1;
        sramce <= 1;
    end

    // BLOCK COPY MATH FOR POSITION 0
    if (rxdata1 == 0 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        x3loc <= x3loc + 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            x3loc <= deltay;
            y1loc <= y1loc + 1;
            y3loc <= y3loc + 1;
        end
    end

    // BLOCK COPY MATH POSITION 1
    if (rxdata1 == 1 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        y3loc <= y3loc + 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            y3loc <= deltay;
            y1loc <= y1loc + 1;
            x3loc <= x3loc - 1;
        end
    end

    // BLOCK COPY MATH FOR POSITION 2
    if (rxdata1 == 2 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        x3loc <= x3loc - 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            x3loc <= deltay;
            y1loc <= y1loc + 1;
            y3loc <= y3loc - 1;
        end
    end

    // BLOCK COPY MATH POSITION 3
    if (rxdata1 == 3 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        y3loc <= y3loc - 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            y3loc <= deltay;
            y1loc <= y1loc + 1;
            x3loc <= x3loc + 1;
        end
    end

    // BLOCK COPY MATH FOR POSITION 4
    if (rxdata1 == 4 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        x3loc <= x3loc - 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            x3loc <= deltay;
            y1loc <= y1loc + 1;
            y3loc <= y3loc + 1;
        end
    end

    // BLOCK COPY MATH POSITION 5
    if (rxdata1 == 5 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        y3loc <= y3loc - 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            y3loc <= deltay;
            y1loc <= y1loc + 1;
            x3loc <= x3loc - 1;
        end
    end

    // BLOCK COPY MATH FOR POSITION 6
    if (rxdata1 == 6 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        x3loc <= x3loc + 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            x3loc <= deltay;
            y1loc <= y1loc + 1;
            y3loc <= y3loc - 1;
        end
    end

    // BLOCK COPY MATH POSITION 7
    if (rxdata1 == 7 & pipeline == 6) begin
        x1loc <= x1loc + 1;
        y3loc <= y3loc + 1;
        if (x1loc == x2loc) begin
            x1loc <= deltax;
            y3loc <= deltay;
            y1loc <= y1loc + 1;
            x3loc <= x3loc + 1;
        end
    end

    // FINISH PIPELINE
    if (pipeline == 6) begin
        if (x1loc < x2loc | y1loc < y2loc) pipeline <= 0;
        if (delta != rxdata2) begin
            sramwe <= 0;
            sramoe <= 1;
            sramce <= 0;
        end
        else begin
            sramwe <= 1;
            sramoe <= 1;
            sramce <= 1;
        end
    end

    // COMPLETE COMMAND
    if (pipeline == 7) begin
        sramwe <= 1;
        sramoe <= 1;
        sramce <= 1;
        rxcomm <= 0;
        pipeline <= 0;
        hostak <= 1;
    end
end
```

Notice that the pipeline must jump from 2 to 4 during the transition between read and writes, allowing 2 dead cycles for bus turnaround time. Without this latency, the first few writes are random or fail.

This took days to figure out.

Oh well, my system is almost ready for manufacture now, and I am sticking with.... bus turnaround time is real!

Brad

by **elpuri** » Mon Sep 14, 2009 4:49 pm

> In the datasheet, the usual RW timings are given, which are basic for all SRAMs. Notice how not once does it mention bus turnaround times.

Exactly. I think that speaks more in favor of asynchronous SRAMs not having BTT than of their having such a constraint.

> If it wasn't for the fact that I have a working system here, I would also believe you that there is no such timing consideration. It took days to finally sort out small glitches that I once thought were in my code, but in fact were bus turnaround times.

Maybe you have accidentally registered some of the control signals (or forgotten to register something else), causing latency relative to the data? Maybe your wiring causes the OE line to ring when you pull it high for writing, and it needs those two cycles to settle? There are plenty of more plausible explanations for the phenomenon than manufacturers blatantly refusing to mention this constraint in their datasheets.
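To make the first possibility concrete, here is a minimal sketch of an SRAM pin interface in which the address, strobes, write data, and output enable are all registered on the same clock edge, so none of them can lag the others by a cycle. The module and signal names are hypothetical and it assumes `do_read` and `do_write` are never asserted together. It is an illustration of the idea, not a diagnosis of Brad's design.

```verilog
// Hypothetical sketch: every SRAM pin registered on the same clock edge.
module sram_pins (
    input  wire        clk,
    input  wire        do_read,        // assumed mutually exclusive with do_write
    input  wire        do_write,
    input  wire [18:0] addr,
    input  wire [7:0]  wdata,
    output reg  [7:0]  rdata,          // data sampled from the SRAM pins
    output reg  [18:0] sram_addr,
    output reg         sram_we_n,
    output reg         sram_oe_n,
    inout  wire [7:0]  sram_data
);
    reg [7:0] dout;                    // registered write data
    reg       drive;                   // 1 = FPGA drives sram_data

    // The FPGA releases the bus whenever it is not writing, so the SRAM
    // can drive it during reads.
    assign sram_data = drive ? dout : 8'bz;

    always @(posedge clk) begin
        // All outputs change together on this edge.
        sram_addr <= addr;
        sram_we_n <= ~do_write;
        sram_oe_n <= ~do_read;
        dout      <= wdata;
        drive     <= do_write;
        // Input register: captures whatever is on the pins each clock.
        rdata     <= sram_data;
    end
endmodule
```

With both the outputs and the input registered like this, read data shows up in `rdata` two clock edges after `do_read` is asserted, which is exactly the kind of latency that is easy to overlook.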

> I can't be the only person here who has discovered that SRAM bus turnaround time is real?? I guess if it wasn't a real issue, IDT would not have created ZBT SRAM.

SANTA CLARA, Calif.--(BUSINESS WIRE)--March 30, 1998--Integrated Device Technology, Inc. (IDT), the pioneer of Zero Bus Turnaround(TM) (ZBT(TM)) architecture, today introduced the industry's fastest 4-Mbit flow-through synchronous ZBT SRAM.

With all due respect, judging from the video you've done some really cool stuff, but I still think the problem is in your design and the two-cycle wait just masks the real problem.

by **Oneironaut** » Mon Sep 14, 2009 6:37 pm

Thanks for your comments!

I will be doing the PCB in the next month, so if there is some kind of delay or ringing, I will find it then. It would be nice to mop up those 2 dead cycles, as the block copy command is by far the most important in the system.

From my research and reading though, this turnaround time seems to be a real issue. One that isn't often shown in the datasheet.

The SRAM datasheets all seem to show waveforms for single or continuous reads and writes, never RW interleaving. Of course, the zero to 60MPH speed rating on my car assumes you are stopped, not already doing 60MPH in reverse, and that would indeed change things!

I will report back after I have a proper circuit board and have the chance to see if those 2 dead cycles are still necessary.

Cheers,
Brad


800x600 Dual Buffer VGA Display
