The total effort spent on verification is continuously on
the rise. This increase can be attributed to multiple reasons, such as:
- Rising design complexity, driven further by Moore’s law
- Constrained random test benches coupled with complex cross cover bins
- Incorporating multiple techniques to confirm verification closure
- Debugging the RTL and the verification environment
A study conducted by Wilson Research Group, commissioned by Mentor Graphics, revealed that the mean time a design engineer spends on verification increased from 46% in 2007 to 50% in 2010. It also confirmed that debugging claims most of a verification engineer's bandwidth. While the effort spent on RTL debugging may rise gradually with design size and complexity, TB debugging shows frequent spikes. The absence of a planned approach and limited tool support for it add to the woes.
Issues in the verification environment arise mainly due to:
- Incorrect understanding of the protocol
- Limited understanding of the language and methodology features
- First timers making silly mistakes
- ‘JUGAAD’ (Hindi word for workaround)
Unlike design code, verification code was never subjected to area and performance optimization, and verification engineers were liberal in developing code. If something didn’t work, they found a quick workaround (jugaad) and got it working without contemplating the impact on testbench efficiency. Market dynamics now demand faster product turnaround, and sluggish verification impacts the product development schedule considerably.
Below is one such case study, picked from past experience, in which a complex core with parallel datapaths culminating in the memory arbiter (MA) block was to be verified.
GIVEN
Constrained random verification (CRV) with Vera+RVM was used to verify MA and the block (XYZ) feeding MA. 100% functional coverage was achieved at block level for both modules. XYZ used the complete frame structure for verification, so the average simulation time of a test was 30 mins, while MA used just a series of bytes & words long enough to fill the FIFOs, and its simulation time was <5 mins. To stress MA further with complete frames of data and confirm that it works fine with XYZ, CRV was chosen with XYZ+MA as the DUT. The rest of the datapath feeding XYZ was left to directed verification at top level, as the total size of the core was quite large.
EXECUTION
The team quickly integrated the two environments and started simulating the tests. But this new env was taking ~16X more time than the XYZ standalone environment, impacting the regression time severely. This kicked off the debugging process to find the bottleneck. The first approach was to comment out the instances of the MA monitor & scoreboard in the integrated env and rerun. If the simulation time dropped, the plan was to uncomment the instances and their tasks one by one to root cause the problem. On rerunning with this change there was no drop in simulation time. Damn! How was that possible?
Reviewing the changes, the team figured out that instead of commenting out the instances, the engineer had commented out the start of transactions. He claimed that just having an instance in the env shouldn’t affect simulation time as long as no transactions are getting processed by the MA components. Made sense! But then why this Kolaveri (killer instinct)?
To nail down the problem, multiple approaches such as code review, increasing the verbosity of logs, and profiling were kicked off in parallel.
ROOT CAUSE
The MA TB had 2 mistakes. A thread was spawned from the new() task of the scoreboard for maintaining the data structure, and this code had a delay(1) in it. The delay was added by the MA engineer as a JUGAAD at some point while debugging the standalone env.
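task ma_xyz :: abc()
{
  // variable declarations…
  while(1)
  {
    …
    delay(1);   // JUGAAD: loop wakes up every simulation time unit
  }
}
task new()
{
  …
  fork
    abc();      // thread starts at construction, not from start_xactor
  join none
}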
Since this thread was spawned from new(), it stayed active even though the start_xactor task was dormant, and that caused the slowdown: with delay(1) the loop woke up on every simulation time unit, whether or not MA had any transactions to process. Replacing the delay with posedge(clock) solved the issue, and to respect the guidelines the task was moved to a suitable place in the TB.
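To get a feel for why a stray delay(1) in a free-running thread hurts, here is a toy model in Python. It is not Vera and not the project's code: the scheduler, the function name count_wakeups, the clock period of 10 time units and the run length are all made up for illustration. It only counts how often a self-rescheduling thread wakes up; the real slowdown depends on the simulator and on what the loop body does.

```python
import heapq

def count_wakeups(end_time, step):
    """Toy event scheduler (hypothetical): one thread re-schedules itself
    every `step` time units; return how many times it wakes up on or
    before `end_time`."""
    queue = [step]              # time of the first wake-up
    wakeups = 0
    while queue:
        now = heapq.heappop(queue)
        if now > end_time:
            break               # simulation is over
        wakeups += 1            # the loop body would run here
        heapq.heappush(queue, now + step)
    return wakeups

END_TIME = 100_000      # assumed simulation length, in time units
CLOCK_PERIOD = 10       # assumed clock period, in time units

unit_delay_wakeups = count_wakeups(END_TIME, 1)          # models delay(1)
posedge_wakeups = count_wakeups(END_TIME, CLOCK_PERIOD)  # models one wake-up per clock

print(unit_delay_wakeups, posedge_wakeups)
```

Under these assumed numbers the delay(1) thread wakes up once per time unit and the clocked thread once per clock period, so the gap grows with the clock period, and it is paid even when the component has nothing to do.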
Lesson learnt: the 'Jugaad' of yesteryear's verification envs doesn’t work so well in modern day verification environments. Think twice while fixing your verification code, or else the debugging effort on your project will again overshoot the average!
I invite you to share your experiences with such goofups! Drop an email to siddhakarana@gmail.com