Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

himbaechel: dqce_bel != BelId() / Issues with rPLL #1408

Open
jpf91 opened this issue Dec 22, 2024 · 4 comments · May be fixed by #1410
Open

himbaechel: dqce_bel != BelId() / Issues with rPLL #1408

jpf91 opened this issue Dec 22, 2024 · 4 comments · May be fixed by #1410

Comments

@jpf91
Copy link

jpf91 commented Dec 22, 2024

Hi there,

Issue Summary

I'm trying to use the rPLL in a Tang Nano 20K Device. I connect the 27MHz clock to the PLL and configure it to multiply by 4. As a simple test program, I have a counter which is driven by the resulting 108 MHz clock. I connect one of the board LEDs to the PLL lock signal and the others are toggled whenever the counter reaches a certain value.

This works fine, as long as I only toggle one LED. The PLL lock LED is on and the other LED toggles as expected. It also works for 2 toggling LEDs. With 3 LEDs, the LEDs no longer toggle, or sometimes they do, but not deterministically. PLL lock still seems to be on, but of course I wouldn't see short unlocked phases when looking at the LED.

I then remembered that on some FPGAs, PLLs must drive clock buffers and those should drive the nets. So I tried to add a DQCE in between. With the DQCE, the 1 or 2 LED variant still works. The 3 LED variant then even fails pnr:

terminate called after throwing an instance of 'nextpnr_himbaechel::assertion_failure'
  what():  Assertion failure: dqce_bel != BelId() (/work/_builds/linux-x64/nextpnr-himbaechel/nextpnr/himbaechel/uarch/gowin/globals.cc:364)

A random guess would be that the logic for the LED registers is in different "clock regions". And that when explicitly instantiating the DQCE, nextpnr spots the issue and crashes. But when not using a DQCE, it just somehow produces an invalid placement. But that's just a wild guess, unfortunately I don't know anything about the gowin architectures.

Reproduction Testcase

The code can be found in this gist: https://gist.github.com/jpf91/9796d712ad611e2836d8b61bf1a83aaf
Use the following commands to reproduce the issue:

# Synthesize
yosys -p "read_verilog CLKGen.v main.v; synth_gowin -top main -json top.syn.json" \
  -q -l top.syn.log --detailed-timing

#PnR
nextpnr-himbaechel --device GW2AR-LV18QN88C8/I7 --json top.syn.json --write top.pnr.json \
  --report top.pnr.json --vopt family=GW2A-18C \
  --vopt cst=top.cst --pre-pack top.py -q -l top.pnr.log \
  --threads 4 --detailed-timing-report

# Bitstream
gowin_pack -d GW2A-18C -o top.fs top.pnr.json

# Upload
openFPGALoader -b tangnano20k -f top.fs

When using the code as is, it produces the dqce_bel != BelId() crash. When removing one of the led_buf[X] <= !led_buf[X]; lines, the code works. When removing the DQCE in CLKGen.v, the crash is gone but the generated bitstream does not work.

Am I doing something wrong here, or is there really a bug in the tools?

Tool version: oss-cad-suite-linux-x64-20241222

@jpf91
Copy link
Author

jpf91 commented Dec 22, 2024

According to Gowin UG286, the GW2AR-18 has 4 clock regions ("quadrants"). And as far as I understand, they need to be driven by different DQCEs. To test, I connected a second DQCE to the PLL, duplicated the counter logic and drive one counter from each DQCE. I then ensure one LED is updated by one counter and the two other LEDs by the other.

This setup passes PnR and it also works in hardware. However, in practice it would be very difficult to manually handle the clock regions like this. So I assume I shouldn't instantiate the DQCE manually, but nextpnr should automatically instantiate it for me when using a PLL? Is there any way to check whether it properly connects the rPLL to DQCE when I did not manually instantiate it?

I might go ahead and try some more complex designs without physical outputs. As long as all logic is placed into one quadrant, I think this could work...

@yrabbit
Copy link
Contributor

yrabbit commented Dec 22, 2024

Thanks for testing!

There is no need to care about quadrants and you use DQCE if you are going to save energy by switching off the clock in quadrants, which is only relevant when powered by batteries I think.

So there are essentially two bug reports here - a crash issue with the placement of the DQCE and the thee LEDs not flashing. I'll look into them in the near future.

@yrabbit
Copy link
Contributor

yrabbit commented Dec 23, 2024

#PnR
nextpnr-himbaechel --device GW2AR-LV18QN88C8/I7 --json top.syn.json --write top.pnr.json
--report top.pnr.json --vopt family=GW2A-18C
--vopt cst=top.cst --pre-pack top.py -q -l top.pnr.log
--threads 4 --detailed-timing-report

Just an off-topic remark - top.pnr.json is specified twice, this will lead to interesting results 😄

@yrabbit
Copy link
Contributor

yrabbit commented Dec 23, 2024

So, at first glance, both difficulties are solved. I will need some time to test the other examples (and there are many - compilation alone takes two hours or so on my machine, not to mention the hardware checks).

VID_20241223_114710.mp4
VID_20241223_115102.mp4

@yrabbit yrabbit linked a pull request Dec 23, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants