- Info about Xilinx FPGA history inc. the XC2018 and XC3030 that I have in my shed catalogue.
- Also some possible Xilinx info about old development tools and procedures.
- Initially just a copy of `test05h`. Doesn't "Fit", as `test05h` didn't before I changed it to "Density" optimisation in the Fit stage.
- I tried first just changing the existing XST (Synthesis) speed-based optimisation from "Normal" to "High". This may have made it worse, requiring 51 equations instead of 50 equations.
- I changed it to "Goal: Area; Effort: Normal". This didn't work either. The text report shows what failed to map:
************************* Summary of UnMapped Logic ************************
** 4 Buried Nodes **
Signal Total Total User
Name Pts Inps Assignment
U1/tone<1> 2 2
U1/tone<2> 2 3
U1/tone<3> 2 4
U1/tone<4> 2 5
...I think Speed/High mode failed to map 6 signals? I do wonder whether rearranging the logic could push for a better result, since I don't know that we need speed and timing precision for this design.
- Changing to "Area/High" still doesn't help so I changed back to "Speed/Normal". I then tried the "Exhaustive Fit Mode" (see here for more info).
- It can be seen that this iteratively runs `cpldfit` with different "Collapsing Input Limit" and "Collapsing Product Term Limit" combinations, hoping to find one that succeeds.
- It is relatively fast for each try, with such a small design, but we could still be looking at over 2,000 iterations, with each taking a second or more in my VM.
- As it happens, Exhaustive Fit Mode did manage to find a solution after only a few minutes, which it determined with these parameters:
INFO:Cpld:994 - Exhaustive fitting is trying pterm limit: 22 and input limit: 54
...and these overall stats:
Design Name: test05i Date: 6-20-2020, 2:04AM
Device Used: XC9572XL-7-VQ64
Fitting Status: Successful
************************* Mapped Resource Summary **************************
Macrocells Product Terms Function Block Registers Pins
Used/Tot Used/Tot Inps Used/Tot Used/Tot Used/Tot
51 /72 ( 71%) 288 /360 ( 80%) 111/216 ( 51%) 47 /72 ( 65%) 6 /52 ( 12%)
** Function Block Resources **
Function Mcells FB Inps Pterms IO
Block Used/Tot Used/Tot Used/Tot Used/Tot
FB1 8/18 24/54 90/90* 2/13
FB2 12/18 27/54 78/90 2/13
FB3 18/18* 28/54 36/90 1/14
FB4 13/18 32/54 84/90 1/12
----- ----- ----- -----
51/72 111/216 288/360 6/52
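The search that Exhaustive Fit Mode performs can be pictured as a simple grid search. The sketch below is only an illustration in Python, not how `cpldfit` is actually implemented: `try_fit` is a hypothetical stand-in for one fitter run, the limit ranges (taken from the 90 product terms and 54 inputs per function block in the report) and the order in which combinations are tried are assumptions, and the fake fitter in the usage lines simply pretends that only the combination reported above succeeds.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def exhaustive_fit(
    try_fit: Callable[[int, int], bool],
    pterm_limits: Iterable[int],
    input_limits: Iterable[int],
) -> Tuple[Optional[Tuple[int, int]], int]:
    """Try (pterm limit, input limit) pairs until one fits.

    Returns the first successful pair (or None) and the number of tries.
    """
    tries = 0
    for pterm_limit, input_limit in product(pterm_limits, input_limits):
        tries += 1
        if try_fit(pterm_limit, input_limit):
            return (pterm_limit, input_limit), tries
    return None, tries


# Assumed ranges: 1..90 product terms and 1..54 inputs (hypothetical,
# not taken from the tool's documentation).
PTERM_RANGE = range(1, 91)
INPUT_RANGE = range(1, 55)

# Fake fitter for illustration: pretend only (22, 54) succeeds.
found, tries = exhaustive_fit(
    lambda p, i: (p, i) == (22, 54), PTERM_RANGE, INPUT_RANGE
)
worst_case = len(PTERM_RANGE) * len(INPUT_RANGE)
print("found:", found, "worst case tries:", worst_case)
```

With these assumed ranges the worst case is 90 x 54 = 4,860 tries, which at a second or more per try is well over an hour if nothing fits early.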
- Compare this with a non-exhaustive "Density" optimisation:
Design Name: test05i Date: 6-20-2020, 2:09AM
Device Used: XC9572XL-7-VQ64
Fitting Status: Successful
************************* Mapped Resource Summary **************************
Macrocells Product Terms Function Block Registers Pins
Used/Tot Used/Tot Inps Used/Tot Used/Tot Used/Tot
62 /72 ( 86%) 186 /360 ( 52%) 104/216 ( 48%) 47 /72 ( 65%) 6 /52 ( 12%)
** Function Block Resources **
Function Mcells FB Inps Pterms IO
Block Used/Tot Used/Tot Used/Tot Used/Tot
FB1 18/18* 29/54 35/90 2/13
FB2 18/18* 32/54 69/90 2/13
FB3 17/18 32/54 67/90 1/14
FB4 9/18 11/54 15/90 1/12
----- ----- ----- -----
62/72 104/216 186/360 6/52
By default, this is using `-inputs 54` and `-pterms 90`.
- For now, I will just leave this on "Density" optimisation, but I know I could use Exhaustive Fitting if I need it later.
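To make the trade-off between the two runs easier to see, the headline numbers from both reports can be put side by side. This is just a small Python sketch with the figures copied by hand from the summaries above; the dictionary layout and the `utilisation` helper are my own illustration, not anything the Xilinx tools produce.

```python
# (used, total) pairs copied from the two Mapped Resource Summary blocks.
EXHAUSTIVE = {
    "macrocells": (51, 72),
    "product_terms": (288, 360),
    "fb_inputs": (111, 216),
    "registers": (47, 72),
    "pins": (6, 52),
}
DENSITY = {
    "macrocells": (62, 72),
    "product_terms": (186, 360),
    "fb_inputs": (104, 216),
    "registers": (47, 72),
    "pins": (6, 52),
}


def utilisation(used: int, total: int) -> int:
    """Percentage used, rounded to the nearest whole percent."""
    return round(100 * used / total)


for name in EXHAUSTIVE:
    ex_used, total = EXHAUSTIVE[name]
    de_used, _ = DENSITY[name]
    print(
        f"{name:14s} exhaustive {ex_used:3d}/{total} ({utilisation(ex_used, total)}%)"
        f"  density {de_used:3d}/{total} ({utilisation(de_used, total)}%)"
        f"  diff {de_used - ex_used:+d}"
    )
```

The exhaustive result uses 11 fewer macrocells but 102 more product terms than the Density result, so the two fits trade one resource for the other.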
So MusicBox4 might be a little wrong...? The ROM code is only included in this link at the bottom of the page, and the rest of the code in that archive is different from what the page recommends we change.
Hence, I adapted what was in the actual archive to fit with the code I have in `test05i` (copied from `test05h`).
This produced a design that wouldn't initially fit, even with 'Density' optimisation. I let it run with Exhaustive Fitting, and it did take a long time. I'm not sure how long but according to timestamps on various files it took 3 hours and 7 minutes.
The solution was:
INFO:Cpld:994 - Exhaustive fitting is trying pterm limit: 14 and input limit: 15
...with these results:
Design Name: test05i Date: 6-20-2020, 6:32AM
Device Used: XC9572XL-7-VQ64
Fitting Status: Successful
************************* Mapped Resource Summary **************************
Macrocells Product Terms Function Block Registers Pins
Used/Tot Used/Tot Inps Used/Tot Used/Tot Used/Tot
70 /72 ( 97%) 267 /360 ( 74%) 138/216 ( 64%) 55 /72 ( 76%) 6 /52 ( 12%)
** Function Block Resources **
Function Mcells FB Inps Pterms IO
Block Used/Tot Used/Tot Used/Tot Used/Tot
FB1 18/18* 41/54 67/90 2/13
FB2 17/18 36/54 84/90 2/13
FB3 18/18* 29/54 56/90 1/14
FB4 17/18 32/54 60/90 1/12
----- ----- ----- -----
70/72 138/216 267/360 6/52
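As a sanity check on how tight this fit is, the per-function-block figures can be summed and compared with the totals line. A minimal Python sketch follows, with the numbers typed in from the report above; the tuple layout is my own, and reading the `*` marks as "fully used" is an assumption based on them only appearing next to 18/18 and 90/90 entries.

```python
# Per-function-block usage from the report: (macrocells, FB inputs, pterms, IO).
USED = {
    "FB1": (18, 41, 67, 2),
    "FB2": (17, 36, 84, 2),
    "FB3": (18, 29, 56, 1),
    "FB4": (17, 32, 60, 1),
}
# Per-block capacity for macrocells, FB inputs and product terms.
CAPACITY = (18, 54, 90)

# Column-wise sums, which should match the totals line of the report.
totals = tuple(sum(col) for col in zip(*USED.values()))
spare_macrocells = len(USED) * CAPACITY[0] - totals[0]
full_blocks = [fb for fb, row in USED.items() if row[0] == CAPACITY[0]]

print("totals:", totals)
print("spare macrocells:", spare_macrocells)
print("blocks with no free macrocells:", full_blocks)
```

The sums agree with the 70/72, 138/216, 267/360 and 6/52 totals, and only two macrocells are left over in the whole device.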
Anyway, these were the specific changes I made from the base `test05h` code to produce `test05i`:
- Moved `divide_by_12` module into main `music.v` code.
- Added the inline melody ROM, but I also changed the tail end of it from the original:
240: note<= 8'd25;
241: note<= 8'd0;
242: note<= 8'd00;
default: note <= 8'd0;
endcase
...to now alternate the last few notes rapidly between notes 2 octaves apart:
240: note<= 25;
241: note<= 60;
242: note<= 36;
243: note<= 60;
244: note<= 36;
245: note<= 60;
246: note<= 36;
247: note<= 60;
248: note<= 36;
249: note<= 60;
250: note<= 36;
default: note <= 0;
endcase
NOTE: This change breaks the fitting step, again. For more info, see below.
- Added the instance of `music_ROM` to read notes into `fullnote`.
- Extended main `tone` counter from 28 bits to 30 bits, because now instead of counting the 64 notes (upper 6 bits) to play in a scale, we are counting 256 note addresses (upper 8 bits) to retrieve from the inline ROM.
- Simplified `counter_octave` to use a bit shift instead of hardcoded constants.
- Added `fullnote!=0` check; if the note is `0`, the speaker is silent.
- Where `tone[21:0]` counts the duration of each note (2^22 cycles of the 25MHz clock halved from 50MHz, or ~168ms), we check for when the top 4 bits are `0000`, and silence the speaker during that time, which basically means the first ~10ms of each note is silent, to sound like a tiny gap.
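The timing figures in the last step follow from the counter widths. The sketch below redoes the arithmetic in Python, assuming `tone` is clocked by the 25MHz output of the clock halver (which is what makes the ~168ms figure come out) and that the upper 8 bits address the 256 ROM entries; the constant names are mine.

```python
CLOCK_HZ = 25_000_000   # assumed: 50MHz board clock halved
DURATION_BITS = 22      # tone[21:0] counts the length of one note
GAP_TOP_BITS = 4        # speaker is muted while the top 4 bits are 0000
ADDRESS_BITS = 8        # upper bits of tone select the ROM address

note_cycles = 2 ** DURATION_BITS
gap_cycles = 2 ** (DURATION_BITS - GAP_TOP_BITS)

note_ms = 1000 * note_cycles / CLOCK_HZ
gap_ms = 1000 * gap_cycles / CLOCK_HZ
loop_s = (2 ** ADDRESS_BITS) * note_cycles / CLOCK_HZ

print(f"note length  : {note_ms:.1f} ms")
print(f"silent gap   : {gap_ms:.1f} ms")
print(f"full ROM pass: {loop_s:.1f} s")
```

Under those assumptions each note lasts about 167.8ms, the gap is about 10.5ms, and one pass through all 256 ROM addresses takes roughly 43 seconds.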
Anyway, the melody it plays is "Rudolph the Red-Nosed Reindeer". It's the whole song, except for the last few notes, which were left out of the ROM for some reason.
Note that when I made the change to the end of the ROM, as described in one of the steps above, it broke the fitting step again, so I'm running it overnight with the strictest settings I can... though it's possible my XST change for "Area/High" will trip me up; I'm not sure. It's also possible the design simply won't fit as-is, in which case I might consider taking the 50MHz-to-25MHz clock halver and dividing further, to maybe 12.5MHz or even 6.25MHz. This will add 2 more bits to `reg clk25`, but probably save us 4 bits in total, between `tone` and `counter_note` (and don't forget to match `div`).
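The bit-saving argument can be checked with a little arithmetic. This Python sketch assumes the note length should stay at roughly 168ms, that the divider register needs one bit per halving, and that two counters (taken here to be `tone` and `counter_note`) each lose one bit per extra halving; the function and its field names are hypothetical.

```python
import math

BOARD_HZ = 50_000_000
NOTE_SECONDS = 2 ** 22 / 25_000_000   # ~168ms, the note length at 25MHz


def bit_budget(divide_by: int) -> dict:
    """Counter widths needed when the 50MHz clock is divided by a power of 2."""
    divider_bits = int(math.log2(divide_by))
    clock_hz = BOARD_HZ / divide_by
    # Bits needed to count one note's worth of cycles at the slower clock.
    duration_bits = round(math.log2(NOTE_SECONDS * clock_hz))
    # Relative to the existing divide-by-2 design (assumption: two counters
    # shrink by one bit for every extra halving of the clock).
    extra_halvings = divider_bits - 1
    divider_bits_added = extra_halvings
    counter_bits_saved = 2 * extra_halvings
    return {
        "clock_mhz": clock_hz / 1e6,
        "divider_bits": divider_bits,
        "duration_bits": duration_bits,
        "divider_bits_added": divider_bits_added,
        "counter_bits_saved": counter_bits_saved,
        "net_bits_saved": counter_bits_saved - divider_bits_added,
    }


for n in (2, 4, 8):
    print(n, bit_budget(n))
```

For a divide-by-8 (6.25MHz) clock this gives 2 divider bits added against 4 counter bits saved, so a net saving of 2 register bits if the assumptions hold.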
There is one other line I think I could change, though I'm not sure if it would make much difference. It could be this, perhaps?
always @(posedge clk)
  if (counter_note==0 && octave!=0)
    counter_octave <= counter_octave==0 ? 5'b11111 >> octave : counter_octave-1;
...i.e. don't constantly assign `0` back into `counter_octave`.
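For reference, the reload values that the shift produces can be tabulated. This is a Python model of just the expression `5'b11111 >> octave` truncated to 5 bits, not of the whole module; the range of shift amounts shown (0 to 5) is an assumption about which `octave` values occur.

```python
def reload_value(octave: int) -> int:
    """Model of the 5-bit Verilog expression 5'b11111 >> octave."""
    return (0b11111 >> octave) & 0b11111


for octave in range(6):
    reload = reload_value(octave)
    # The counter counts reload, reload-1, ..., 0, so it passes through
    # zero once every reload+1 enabled clock edges.
    print(f"shift {octave}: reload {reload:2d} ({reload:05b}), period {reload + 1}")
```

Each extra shift halves the period (32, 16, 8, 4, 2, 1), which is the octave-doubling behaviour the hardcoded constants were providing before.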
I ended up changing the code to use a 6.25MHz clock (by dividing 50MHz by 8), so I was able to save on some repeated counter bits in `music`, and this made the fitting work on the first try. I also put the last line of the song in, but had to trim the last note to allow a gap of silence before it loops back to the start.
- I should learn more about Parameterized Modules which allow us to do things like define a module with a variable number of bits (i.e. variable vector size). See also: this and this.
- AR# 40377: Info on possible work-arounds if XST crashes.