Wednesday, May 1, 2013

Constrained Random Verification flow strategy

The explosive growth of the cellular market has affected the semiconductor industry like never before. Product life cycles have moved to an accelerated track to meet time to market. In parallel, engineering teams are on a constant quest to add more functionality to a given die size with higher performance and lower power consumption. To manage this, the industry adopted reusability in design & verification. IPs & VIPs have carved out a growing niche market. Whether reuse happens by borrowing from internal groups or buying from external vendors, the basic question that arises is whether the given IP/VIP will meet the specifications of the SoC/ASIC. To ensure that the IP serves the requirements of multiple applications, thorough verification is required. Directed verification falls short in meeting this target, and that is where Constrained Random Verification (CRV) plays an important role.
 
A recent independent study conducted by Wilson Research Group and commissioned by Mentor Graphics revealed some interesting results on the deployment of CRV.
 
In the past 5 years –
- System Verilog as a verification language has grown by 271%
- Adoption of CRV increased by 51%
- Functional coverage by 65%
- Assertions by 70%
- Code coverage by 46%
 
- UVM grew by 486% from 2010 to 2012
- UVM is expected to grow by 46% in the next 12 months
- Half of the designs over 5M gates use UVM
 
A well defined strategy with Coverage Driven Verification (CDV) riding on CRV can really be a game changer in this competitive industry scenario. Unfortunately, most groups have no such strategy and pick ad hoc approaches, only to lose focus during execution. At a very basic level, the focus of CRV is to generate random legal scenarios to weed out corner cases or hidden bugs that are not easily anticipated otherwise. This is enabled by developing a verification environment that can generate test scenarios under the direction of constraints, automate the checking and provide guidance on progress. CDV, on the other hand, uses CRV as the base while defining Simple, Achievable, Measurable, Realistic and Time bound coverage goals. These goals are represented in the form of functional coverage, code coverage or assertions.
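To make the idea concrete, here is a minimal sketch in Python of constrained random generation with a functional coverage goal. It is not taken from any real testbench: the packet fields, the "reads never exceed 32 beats" constraint and the bin boundaries are all assumptions made up for illustration.

```python
import random

# Hypothetical transaction fields and limits, chosen only for illustration.
KINDS = ("READ", "WRITE")
LENGTH_BINS = {"short": range(1, 5), "medium": range(5, 33), "long": range(33, 65)}


def random_packet(rng):
    """Constraint solver stand-in: draw one legal packet."""
    kind = rng.choice(KINDS)
    max_len = 32 if kind == "READ" else 64  # assumed constraint: reads never exceed 32 beats
    return {"kind": kind, "length": rng.randint(1, max_len)}


def length_bin(length):
    """Map a length onto its functional coverage bin."""
    for name, span in LENGTH_BINS.items():
        if length in span:
            return name
    raise ValueError(f"illegal length {length}")


def coverage_goal():
    """Cross of kind x length bin, minus the one combination the constraint forbids."""
    return {(k, b) for k in KINDS for b in LENGTH_BINS} - {("READ", "long")}


def run(seed, max_packets=1000):
    """Generate packets until every bin is hit or the packet budget runs out."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    goal, hit, sent = coverage_goal(), set(), 0
    while hit != goal and sent < max_packets:
        pkt = random_packet(rng)
        hit.add((pkt["kind"], length_bin(pkt["length"])))
        sent += 1
    return sent, len(hit) / len(goal)
```

Calling run(seed) returns the number of packets generated and the fraction of the coverage goal that was hit. The same seed always reproduces the same run, which is what makes a randomly found failure debuggable.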
 
Successful deployment of CDV+CRV demands avoiding redundant simulation cycles while ensuring that the overall goals, both defined (coverage) and perceived (verified design), are met. Multiple approaches to enable this are in use –
- Run random regressions while observing the coverage trend until incremental runs stop hitting additional coverage. Analyze the coverage results and feed them back into the constraints to hit the remaining scenarios.
- Run random regressions and use coverage grading to come up with a defined regression suite. Use this for faster turnarounds, with a set of directed tests hitting the rest.
- Look for advanced graph based solutions that help you attain 100% coverage with an optimal set of inputs.
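Coverage grading, the second approach above, can be sketched as a greedy selection: keep picking the run that adds the most bins not yet covered. The Python below is only an illustration of that idea, not how any particular EDA tool grades; the run names and bin names are hypothetical.

```python
def grade(coverage_by_test):
    """Greedy grading: repeatedly keep the test that adds the most new bins.

    coverage_by_test maps a test/seed name to the set of bins it hit.
    Returns a list of (test, newly covered bin count) in selection order.
    """
    remaining = set().union(*coverage_by_test.values())
    ranked = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical by name)
        best = max(sorted(coverage_by_test),
                   key=lambda t: len(coverage_by_test[t] & remaining))
        gain = coverage_by_test[best] & remaining
        ranked.append((best, len(gain)))
        remaining -= gain
    return ranked


# Hypothetical regression results: which bins each random seed hit.
runs = {
    "seed_11": {"a", "b", "c"},
    "seed_12": {"b", "c"},
    "seed_13": {"c", "d"},
    "seed_14": {"a", "d", "e"},
}
suite = grade(runs)  # seed_11 and seed_14 together cover all five bins
```

Runs that never appear in the graded list (seed_12 and seed_13 here) add no coverage of their own, so they are the first candidates to drop from a fast-turnaround regression suite.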
 
To define a strategy the team needs to understand the following –
- Size of design, coverage goals and schedule?
- Availability of HW resources (server farm & licenses)?
- Transition from simulator to accelerator at any point during execution?
- Turnaround time for regressions with above inputs?
- Room to run random regressions further after achieving coverage goals?
- Does the design services partner bring in complementing skills to meet the objective?
- Does the identified EDA tool vendor support all requirements to enable the process, i.e. simulator, accelerator, verification planner, VIPs, verification manager to run regressions, coverage analysis, coverage grading, trend analysis and other graph based technologies?
 
A sample flow using CRV is given below -
 



Saturday, March 30, 2013

Verification Futures India 2013 - Quick recap!

Verification Futures started in the UK in 2011 and reached India in 2013. It is a one day conference organized by T&VS, providing a platform for users to share their challenges in verification and for EDA vendors to respond with potential and upcoming solutions. The conference was held on 19 March in Bangalore and turned out to be a huge success. It’s a unique event offering an opportunity to meet the fraternity and collaborate on challenges. I thank Mike Bartley for bringing it to India and highly recommend attending it.
 
The discussions covered a variety of topics in verification, and all the challenges ultimately pointed back to the basic issue of ‘verification closure’. The market demands designs with more functionality, a small footprint, high performance and low power, delivered in a continuously shrinking time window. Every design experiences constant changes in the spec till the last minute, expecting all functions of the ASIC design cycle to respond promptly. With limited resources (in terms of quantity and capabilities), the turnaround time for verification falls on the critical path. Multiple approaches surfaced during the discussions at the event, giving enough food for thought to solution seekers and hope to the community. Some of them are summarized below.
 
Hitting the coverage targets is the end goal for closure. However, the definition of these goals is limited on one side by an individual’s ability to capture the design in a specification and on the other by the ability to translate that specification into a coverage model. Further, a disconnect with the software team aggravates this issue. The software may not exercise all capabilities of the hardware, and may actually hit cases not even imagined. HW-SW co-verification could be a potential solution to narrow down the ever-increasing verification space and to increase the useful coverage.
 
Verifying the design and delivering one that is an exact representation of the spec has been the responsibility of the verification team. Given that the problem is compounding by the day, there may be a need to enable “Design for Verification”: designs that are correct by construction, are easier to debug and follow bug avoidance strategies during development. EDA would need to enhance tool capabilities and the design community would need to undergo a paradigm shift to enable this.
 
Constrained Random Verification has been adopted widely to hit corner cases and ensure high confidence in verification. However, this approach also leads to redundant stimulus generation, covering the same ground over & over again. This means that, even with grading in place, achieving 100% coverage is easier said than done. Deploying directed approaches (like graph based) or formal has its own set of challenges. A combination of these approaches may be needed. Which flow suits which part of the design? Is 100% proof/coverage a ‘must’? Can we come up with objective ways of defining closure with a mixed bag? The answer lies in collaboration between the ecosystem partners, including EDA vendors, IP vendors, design service providers and the product developers. The key would be to ‘learn from each other’s experiences’.
 
If we cannot contain the problem, are there alternatives to manage the explosion? Is there a replacement for the CPU based simulation approach? Can we avoid the constraint of limited CPU cycles during the peak execution period? Cloud based solutions offering elasticity, hardware acceleration for increased velocity, and GPU based platforms for enhanced performance are some potential solutions.
 
The presentation from Mentor included a quote from Peter Drucker –
- What gets measured, gets done
- What gets measured, gets improved
- What gets measured, gets managed
 
While the context of the citation was coverage, it applies to all aspects of verification. To enable continual improvement we need to think beyond the constraints, monitor beyond the signals and measure beyond coverage!
 

Sunday, March 3, 2013

Over-verification : an intricate puzzle

For verification, it was an eventful week. DVCON 2013 kept everyone busy with record attendance at the sessions and with following the tweets & blogs that resulted from them. The major highlight of this year’s conference was the release of the latest update to the System Verilog standard, IEEE 1800-2012, with free PDF copies made available courtesy of Accellera.
 
With verification constantly marching to increase its claim on ASIC design schedule while retaining its position as a major factor for silicon re-spins, verification planning was a hot topic of discussion at DVCON. Some of the interesting points that came out of a panel discussion on verification planning were –
 
- Verification plan is not just a wish list. You have to define how you’re going to get there.
- Problem is not over-planning, but over-verifying designs because there has not been enough planning.
- Biggest objection we hear is we don’t have time to capture a verification plan. But you'll lose more time if you don't.
- What’s useful in verification planning is “ruthless prioritization.” You can never get it all done.
- My biggest challenge is getting marketing input into my verification plan.
- Failure to plan means planning to fail.
 
Last week, I guest blogged on a similar topic based on a recent survey conducted by Catherine & Neil. Clearly, the issue of poor planning gets highlighted in all areas of product development.
 
Traditionally, verification plans were just a list of features to be verified, addressing ‘what to verify’. With the emergence of CRV, the plans started including the second aspect, i.e. ‘how to verify’. Further, to bring focus to this never ending verification problem, CDV was adopted. The verification plans now started including target numbers in terms of coverage (code, functional and assertions) to define ‘when are we done’. With a given set of resources, when the ASIC design schedule is imposed on the verification plan, meeting the goals is a challenge. There arises a need to prioritize verification in terms of the features. Remember, any code that is not verified will not work!
 
To enable this “ruthless prioritization”, collaboration is required among marketing, software and hardware groups to align to the design objectives. Everyone needs to understand the potential end applications and preferential ways in the design to achieve them. In case of IPs, this could mean that the initial releases target a limited set of applications based on the customers on board. Once that is achieved, ‘over-verifying’ can take over to further close on the grey areas. In case of SoC, it is a tough call. While design cost continues to increase with diminishing dimensions, the break even may not happen with limited applications (as a result of limited verification) of the SoC. The problem gets aggravated further with the specifications changing on the fly. A platform based approach could be a potential solution where variants of SoC are churned out frequently, but again defining the platform and prioritizing the features boils down to the same problem. A tough nut to crack!
 
The whole point of over-verifying comes to the forefront because verification is the long pole in the schedule. What if the designs could be ‘over verified’ within the timelines? What does it take to achieve that? Are tools like intelligent test benches, formal verification, hardware acceleration or cloud computing a solution? If yes, what is the associated cost and impact?
 
There is no easy answer to any of these questions. In the words of Albert Einstein, “The world we have created is a product of our thinking. It cannot be changed without changing our thinking.”
 
Probably when an answer comes out, we will be on the road of commoditizing the hardware. Till then 'over-verification' is what we have to live with!

Sunday, January 27, 2013

Evolution of the test bench - Part 2

In the last post, we looked into the directed verification approach, where the test benches were typically dumb while the tests comprised stimuli and monitors. Progress on verification was linearly related to the number of tests developed and passing. There was no concept of functional coverage and even the usage of code coverage was limited. Apart from HDLs, programming languages like C & C++ continued to support the verification infrastructure. Managing the growing complexity and the constant pressure to reduce the design schedule demanded an alternate approach for verification. This gave birth to a new breed of languages – HVLs (Hardware Verification Languages).
 
HVLs
 
The first in this category, introduced by Verisity, was popularly known as the ‘e’ language. It was based on AOP (Aspect Oriented Programming) and required a separate tool (Specman) in addition to the simulator. This language spearheaded the entry of HVLs into verification and was followed by ‘Vera’, which was based on OOP (Object Oriented Programming) and promoted by Synopsys. Along with these two languages, SystemC tried to penetrate this domain with support from multiple EDA vendors but couldn’t really gain wide acceptance. The main idea promoted by all these languages was CRV (Constrained Random Verification). The philosophy was to empower the test bench with drivers, monitors, checkers and a library of sequences/scenarios. The generation of tests was automated, with the state space exploration guided by constraints and progress measured using functional coverage.
 
Methodologies
 
As adoption of these languages spread, the early adopters started building proprietary methodologies around them. To modularize development, BCLs (Base Class Libraries) were developed by each organization. Maintaining local libraries and continuously improving them while ensuring simulator compatibility was not a sustainable solution. The EDA vendors came forward with methodologies for each of these languages to resolve the above issue and standardize the usage of language. Verisity led the show with eRM (e Reuse Methodology) followed by RVM (Reference Verification Methodology) from Synopsys. These methodologies helped in putting together a process to move from block to chip level and across projects in an organized manner thereby laying the foundation for reuse. Though verification was progressing at a fast pace with these entrants, there were some inherent issues with these solutions that left the industry wanting for something more. The drawbacks include –
 
- Requirement for an additional tool license beyond simulator
- Simulator efficiency took a hit because control was passed back & forth to this additional tool
- These solutions had limited portability across simulators
- As reusability picked up, finding VIPs based on the HVL was difficult
- Hardware accelerators started picking up and these HVLs couldn’t complement them completely
- Ramp up time for engineers moving across organizations was high
 
System Verilog
 
To move to the next level of standardization, Accellera decided to improve on Verilog instead of driving e or Vera as the industry standard. This led to the birth of System Verilog, which proved to be a game changer in multiple respects. The primary motivation behind driving SV was to have a common language for design & verification and to address the issues with other HVLs. The initial thrust for System Verilog came from Synopsys, which declared Vera open source and contributed it to the definition of System Verilog for verification. Further, Synopsys, in association with ARM, moved RVM to VMM (Verification Methodology Manual) based on System Verilog, providing a framework for early adopters. With IEEE recognizing SV as a standard (1800) in 2005, the acceptance rate increased further. By this time Cadence had acquired Verisity after its quest to promote SystemC as a verification language. eRM was transformed into URM (Universal Reuse Methodology), which supported e, SystemC and System Verilog. This was followed by Mentor proposing AVM (Advanced Verification Methodology), supporting System Verilog & SystemC. Though System Verilog settled the dust by claiming the maximum footprint across organizations, the availability of multiple methodologies introduced inertia to industry wide reusability. The major issues faced include –
 
- Learning a new methodology almost every 18 months
- The methodologies had limited portability across simulators
- A verification env developed using VIP from one vendor was not easily portable to another
- Teams confused in terms of road maps for these methodologies based on industry adoption
 
Road to UVM
 
To tone down this problem, Mentor and Cadence merged their methodologies and came up with OVM (Open Verification Methodology), while Synopsys continued to stick to VMM. Though the problem was reduced, there was still a need for a common methodology and Accellera took the initiative to develop one. UVM (Universal Verification Methodology), largely based on OVM and deriving features from VMM, was finally introduced. While IEEE recognized ‘e’ as a standard (1647) in 2011, it was already too late. Functional coverage, assertion coverage and code coverage all joined together to provide the quantitative metrics to answer ‘are we done’, giving rise to CDV (Coverage Driven Verification).
 
Suggested Reading - Standards War

Thursday, December 27, 2012

Evolution of the test bench - Part 1

Nothing is permanent except change and need constantly guides innovation. Taking a holistic view with reference to a theme throws light on the evolution of the subject. In a pursuit to double the transistors periodically, the design representation has experienced a shift from transistors → gates → RTL and now to synthesizable models. As a by-product of this evolution, ensuring functional correctness became an ever growing unbounded problem. Test bench is the core of verification and has witnessed radical changes to circumvent this issue.
 
While the initial traces of the first test bench are hard to find, formal test benches entered the scene with the advent of HDLs. These test benches were directed in nature, developed using Verilog/VHDL and mostly dumb. The simulations were completely steered by the tests, i.e. the stimuli and checkers mostly resided inside the tests. Soon variants of this scheme emerged, wherein the HDLs were coupled with other programming languages like C/C++ or scripting languages like Perl & Tcl. Typical characteristics of that period were a stretched process node life, multiple re-spins as the norm, an elongated product life cycle and relatively less TTM pressure. For most of the 90’s these test benches facilitated verification and were developed and maintained by the designers themselves. As design complexity increased, there was a need for an independent pair of eyes to verify the code. This need launched the verification team into the ASIC group! The directed verification test suite soon became a formal deliverable, with the number of tests defining progress and coverage. After sustaining for almost a decade, this approach struggled to keep pace with the ramifications in the design space. The prime challenges that teams started facing include –
 
- The progress was mostly linear and directly proportional to the complexity i.e. no. of tests to be developed
- The tests were written strictly with respect to the clock period, and a slight change in design would lead to a lot of rework
- Maintenance of the test suite for changes in the architecture/protocol was manual & quite time consuming
- Poor portability & reusability across projects or different versions of the same architecture
- High dependency on the engineers developing the tests in absence of a standard flow/methodology
- Corner case scenarios limited by the experience and imagination of the test plan developer
- Absence of a feedback to confirm that a test written for a specific feature really exercised the feature correctly
 
Though burdened with a list of issues, this approach still survives in legacy code.
 
In the last decade, SoC designs picked up and directed verification again came to the fore front. With processor(s) on chip, integration verification of the system is achieved mainly with directed tests developed in C/C++/ASM targeting the processor.  Such an approach is required at SoC level because –
 
- The scenario should run on the SoC using the processor(s) on chip
- Debugging a directed test case is easier given the prolonged simulation time at SoC level
- The focus is on integration verification, wherein the test plan is relatively straightforward
- Constraints required for CRV need to be stricter at SoC level than at block level, and fine tuning them would involve a lot of debugging iterations, which is costly at SoC level
 
While the above points justify the need for a directed approach to verify an SoC, the challenges start unfolding once the team moves past the basic connectivity and integration tests. Developing scenarios that mimic the use-cases, concurrent behaviour, performance and power monitoring is an intricate task. Answers to this problem are evolving, and once widely accepted they will complement the SoC directed verification approach to further evolve into a defined methodology.
 
Directed verification approach isn't dead! It has lived since the start of verification and the Keep It Simple Silly principle would continue to drive its existence for years to come. In the next part, we check out the evolution of CRV, its related HVLs and methodologies.
 
Wish you a Happy and Prosperous 2013!!!
 
KEYWORD - HISTORY

Sunday, October 21, 2012

HANUMAN of verification team!

While Moore’s law continues to guide innovation and complexity in semiconductors, there are other factors that are further accelerating it. From iPod to iPhone5 via iPad, Apple has redefined the dynamics of this industry. New product launches include one from Apple every year, multiple equally competitive products from other players, and launches from Chinese cell phone makers every 3 months. There is an ever increasing demand to add multiple functions to the SoCs while getting it right the 'first time' in 'limited time'. IP and sub system reuse has controlled the design lag of these SoCs significantly. While IP is assumed to be pre-verified at block level, the total time required to verify the SoC is still significant. A lot of this is attributed to –
 
- Need for a staged approach to verify integration of the IPs on SoC
- Need to run concurrency tests, performance tests and use case scenarios
- Prolonged simulation time due to large design size
- Deploying multiple techniques to converge on verification closure
- Extended debug due to test bench complexity, test case complexity and all of the above
 
These challenges coupled with Murphy’s law conspire to pose a question on the verification schedule that claims a significant portion of the SoC design cycle. In the blog post Failure and Rescue, the author points to an interesting fact: “things can and will go wrong. Yet some have a better capacity to prepare for the possibility, to limit the damage, and to sometimes even retrieve success from failure”. This directly applies to SoC verification too. Verification leads and managers are expected to build teams and ensure that the tools and processes deployed will diminish perceived risks while reducing unforeseen ones. The process involves bringing in engineers, tools and processes that match the project requirements. For effective management and resiliency, one needs HANUMAN to bring balance to execution amidst uncertainty.
 
Who is Hanuman? HANUMAN is a Hindu deity, an ardent devotee of Lord Rama. With the commencement of the festive season in India, a number of mythological stories gain prominence. Hanuman is a central character in the Indian epic Ramayana, and also finds mention in several other texts. He is also referred to as ‘Sankat Mochan’, i.e. the SAVIOR who helped Lord Rama in precarious circumstances during the fight against evil. Last season, we correlated these epics with verification.
 
So where does HANUMAN find relevance in Verification?
 
SoCs today cannot be verified just with a team of verification engineers and a simulator. The process demands much more, viz. –
 
- Meticulous planning on what to verify, how to verify, who will verify what and when we are done.
- Architecting the verification infrastructure to address verification plan development & tracking, test scenario generation, VIP sourcing and integration, assertion identification, coverage collection, power aware simulations, acceleration or emulation, regression management and automated triaging.
- Engineers who can use the infrastructure efficiently, are experts in protocols and methodology, and are strong at problem solving and debugging.
 
Handling complexity amidst dubiety demands a RESCUER i.e. HANUMAN. The stakes are high and so are the challenges. Multiple intricate situations arise during the course of verification to decelerate the schedule. The RECOVERER from such situations can be an engineer, a tool or a methodology and that entity at that instance is a manifestation of HANUMAN.
 
Sit and recall your past projects...
...if you delivered in such situations, feel proud to be one
...if you identify someone who did it, acknowledge one
...if you haven't till now then be one!
 
May these avatars of HANUMAN continue to drive your silicon ‘right the first time and every time’.
Happy Dussehra!

Sunday, October 7, 2012

Verifying with JUGAAD

The total effort spent on verification is continuously on the rise. This boost can be attributed to multiple reasons such as –
- Rising complexity of the design further guided by Moore’s law
- Constrained random test benches coupled with complex cross cover bins
- Incorporating multiple techniques to confirm verification closure
- Debugging RTL and the Verification environment
 
A study conducted by Wilson Research Group, commissioned by Mentor Graphics, revealed that the mean time a design engineer spends in verification increased from an average of 46% in 2007 to 50% in 2010. It also confirmed that debugging claims the largest part of a verification engineer's bandwidth. While the effort spent on RTL debugging may rise gradually with design size and complexity, TB debugging shows frequent spikes. The absence of a planned approach and limited tool support add further to the woes. Issues in the verification environment arise mainly due to –
- Incorrect understanding of the protocol
- Limited understanding of the language and methodology features
- First timers making silly mistakes
- ‘JUGAAD’ (Hindi word for workaround)
 
Unlike design, verification code was never subjected to area and performance optimization, and verification engineers were liberal in developing code. If something doesn’t work, find a quick workaround (jugaad) and get it working without contemplating the impact on testbench efficiency. Market dynamics now demand faster product turnaround, and sluggish verification impacts the product development schedule considerably. Below is one such case study picked from past experience, wherein a complex core with parallel datapaths culminating in the memory arbiter (MA) block was to be verified.
 
GIVEN
 
CRV with Vera+RVM was used to verify MA and the block (XYZ) feeding MA. 100% functional coverage was achieved at block level for both modules. XYZ used the complete frame structure for verification, so the average simulation time of a test was 30 mins, while MA used just a series of bytes & words long enough to fill the FIFOs, and its simulation time was <5 mins. To stress MA further with complete frames of data and confirm that it works fine with XYZ, CRV was chosen for XYZ+MA as a DUT. The rest of the datapath feeding XYZ was left to directed verification at top level, as the total size of the core was quite large.
 
EXECUTION
 
The team quickly integrated the two environments and started simulating the tests. But this new env was taking ~16X more time as compared to XYZ standalone environment thereby impacting the regression time severely. This kicked off the debugging process of analyzing the bottleneck. First approach was to comment out the instances of MA monitor & scoreboard in the integrated env and rerun. If simulation time reduces then uncomment the instances and its tasks one by one to root cause the problem. On rerunning with this change there was no drop in simulation time. Damn! How was that possible?
 
Reviewing the changes, the team figured out that instead of commenting out the instances, the engineer had commented out the start of transactions. He claimed that just having an instance in the env shouldn’t affect as long as no transactions are getting processed by MA components. Made sense! But then why this Kolaveri (killer instinct)?
 
To nail down the problem multiple approaches like code review, increasing verbosity of logs and profiling were kicked off in parallel.
 
ROOT CAUSE
 
The MA TB had 2 mistakes. A thread was spawned from the new() task of the scoreboard for maintaining the data structure, and this code had a delay(1) in it. The MA engineer had added this as a JUGAAD at some point while debugging the standalone env.
 
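task ma_xyz :: abc()
{
     // variable declarations…
     while(1)
     {
        
        delay(1);         // the JUGAAD: this thread wakes up after every delay(1)
     }
}
task new()
{
   
    fork
      abc();              // spawned at construction, so it runs even when start_xactor is dormant
    join none
}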
 
Since this thread was spawned from new(), it stayed active even while the start_xactor task was dormant, waking up after every delay(1) and slowing down the simulation. Replacing this delay with posedge(clock) solved the issue, and to respect guidelines the task was moved to a suitable place in the TB.
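Why a tiny delay in a forever loop hurts can be seen with a toy event-driven kernel. The Python sketch below is not Vera and does not model the real simulator. It only counts how often a thread gets scheduled, and the clock period of 10 time units is an assumption picked for illustration.

```python
import heapq


def every(delay):
    """A forever loop that sleeps `delay` time units per iteration."""
    while True:
        yield delay


def simulate(processes, end_time):
    """Toy event-driven kernel: returns how many times each process woke up.

    Each process is a generator yielding the delay until its next wake-up.
    """
    queue = [(0, i) for i in range(len(processes))]  # (wake-up time, process index)
    heapq.heapify(queue)
    wakeups = [0] * len(processes)
    while queue:
        now, i = heapq.heappop(queue)
        if now >= end_time:
            continue  # past the end of simulation: drop the event
        wakeups[i] += 1
        heapq.heappush(queue, (now + next(processes[i]), i))
    return wakeups


# Thread 0 mimics delay(1); thread 1 mimics waiting on a clock edge,
# assuming a hypothetical clock period of 10 time units.
counts = simulate([every(1), every(10)], end_time=1000)
```

In this toy run the delay(1) style thread is scheduled ten times as often as the clock-synchronous one while doing no useful extra work. The ratio in a real environment depends on the timescale and the clock period, so treat the numbers here as illustrative only.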
 
Lesson learnt – 'Jugaad' in the verification env of yesteryears doesn’t work so very well with modern day verification environments. Think twice while fixing your verification code, or else the debugging effort on your project will again overshoot the average!
 
I invite you to share your experiences with such goofups! Drop an email to siddhakarana@gmail.com

Saturday, September 15, 2012

Communicating BUGS or BUGgy Communication

A few decades back when the designs had limited gate count, designers used to verify their code themselves. With design complexity increasing, the verification engineers were introduced to the ASIC teams. As Moore’s law continues to drive the complexity further, IP reuse picked up and this ensured that the engineers are spread all around the globe and communication across geographies is inevitable.
 
The reason for introducing the verification engineer was BUGS. A lot has been written and discussed on verification but reference to bugs is limited. With IPs getting sourced from different parts of the world and companies having extended teams everywhere, communicating the BUGs effectively becomes all the more important. “Wrong assumptions are costly” and in semiconductor industry this can throw a team out of business completely. Recently, in a mid-program verification audit for a startup with teams sitting between US & India, I realized that putting a well defined structure on communicating bugs could improve the turnaround time tremendously. Due to different time zones and working style, there was a lot of back & forth communication between the team members. Having a well defined format for communicating bugs helped a lot.
 
BUG COMMUNICATION CYCLE
 
BUG reported → BUG reproduced by designer → BUG fixed → FIX validated → BUG closed → revisit later for review, reproducing when required or data mining.
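The cycle above is essentially a small state machine. A minimal Python sketch, with state names that are my own shorthand rather than those of any bug tracking tool, could look like this:

```python
from enum import Enum


class BugState(Enum):
    REPORTED = 1
    REPRODUCED = 2
    FIXED = 3
    VALIDATED = 4
    CLOSED = 5


def advance(state):
    """Return the next stage of the cycle; a closed bug stays closed."""
    if state is BugState.CLOSED:
        return state
    return BugState(state.value + 1)


def can_move(current, target):
    """Allow only single forward steps, so a fix cannot be closed without validation."""
    return target.value == current.value + 1
```

The point of can_move is the discipline it encodes: a bug in the FIXED state cannot jump straight to CLOSED without passing through fix validation.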
 
APPROACH
 
Before the advent of CRV, the directed verification approach was common. Communicating bugs was simple and required limited information. The introduction of CRV helped in finding bugs faster, but sharing and preserving information around bugs became complicated. With a well defined approach, this problem can be moderated. Here is a sample format on what should be taken care of at each stage of the above cycle –
 
BUG reporting
 
Defining mnemonics for blocks or data path or scenarios etc. that can be appended along with the bug title helps. This enables categorizing the bugs at any time during project execution.
 
While the bug tracking tool enforces adding the project related information, severity, prioritization and contact details of all concerned to broadcast information, the detailed description section should include –
 
- Brief description of the issue
- Information on diagnosis based on the debug
- Test case(s) used to discover this bug
- Seed value(s) for which the bug surfaces
- Command line option to simulate the scenario and reproduce the issue
- Testbench Changelist/TAG on which the issue was diagnosed
- RTL Changelist/TAG on which the issue was diagnosed
- Assertion/Cover point that covers this bug
- Link to available logs & dump that has the failure
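
As a rough sketch, the description fields listed above could be captured in a small record so that incomplete reports are easy to flag. The class name, field names and the `missing_fields` helper are all hypothetical, chosen for this example only; they do not correspond to any particular bug tracking tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BugReport:
    """One bug report; every field mirrors an item from the list above."""
    title: str            # ideally prefixed with a block/scenario mnemonic
    description: str      # brief description of the issue
    diagnosis: str        # what the debug has shown so far
    test_cases: List[str]
    seeds: List[int]      # seed value(s) for which the bug surfaces
    command_line: str     # how to rerun the failing scenario
    tb_changelist: str    # testbench changelist/TAG
    rtl_changelist: str   # RTL changelist/TAG
    coverage_item: str    # assertion or cover point that covers the bug
    log_link: str         # pointer to logs and dumps

    def missing_fields(self) -> List[str]:
        """Names of fields left empty, in declaration order."""
        return [name for name, value in vars(self).items() if not value]
```

A report with an empty seed list or no log link would show up in `missing_fields()`, which is exactly the information a designer in another time zone would otherwise have to ask for.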
 
BUG fixing
 
After the designer has root caused and fixed the bug, the bug report needs to be updated with –
 
- Root cause analysis and the fix
- Files affected during this fix
- RTL changelist/TAG that has the fix
 
FIX Validation
 
After the BUG moves to the fixed stage, the verification engineer needs to update the TB if required and rerun the test. The test should pass with the seed value(s) that exposed the bug and then with multiple random seeds. The assertion/cover point should be hit multiple times in these runs. With this, the report can be updated further with –
 
- RTL changelist/TAG used to validate the bug
- Testbench changelist/TAG on which the issue was validated
- Pointer to the logs & waveforms of the validated test (if required to be preserved)
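
The rerun step described above (original failing seeds first, then extra random seeds) can be sketched as follows. `run_test` is a hypothetical callback that launches one simulation with the given seed and returns True on pass; it stands in for whatever simulator flow a team actually uses. Checking that the assertion/cover point was hit is not modeled here.

```python
import random
from typing import Callable, Iterable, List


def validate_fix(
    run_test: Callable[[int], bool],
    failing_seeds: Iterable[int],
    extra_random_seeds: int = 10,
    rng_seed: int = 0,
) -> List[int]:
    """Return the seeds that still fail; an empty list means the fix held."""
    # A fixed generator seed keeps the list of extra seeds reproducible.
    rng = random.Random(rng_seed)
    seeds = list(failing_seeds)
    seeds += [rng.randrange(2**32) for _ in range(extra_random_seeds)]
    return [seed for seed in seeds if not run_test(seed)]
```

Any seed returned by `validate_fix` is worth adding to the bug report before the bug is reopened.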
 
BUG Closed
 
After validating the fix, the bug can be moved to closed state. The verification engineer needs to check if there is a need to move this test to the smoke test list/mini regression and update the list accordingly.
 
Following the above approach would really help to communicate, reproduce and preserve the information around BUGs. The rising intricacies of the design demand a disciplined approach to attain efficiency in the outcome. Communicating bugs is the first step towards it!
 
HAPPY BUG HUNTING!!!

Wednesday, August 15, 2012

Laws and Verification

It’s Independence Day in India and I decided to use the freedom of applying some of the common laws to “Sid’dha-karana” aka Verification :)

LAW means “a generalization that describes recurring facts or events in nature” and engineering is full of laws. Though laws are formulated for a particular aspect of engineering, most of these are widely applicable to parallel areas too. The idea here is to salute the heroes who articulated these postulates and derive conclusions applicable to our field of interest.

The Laws

In the semiconductor industry, the most widely referenced one is Moore’s Law, formulated by Intel co-founder Gordon E. Moore, who described the trend in his 1965 paper. It was later generalized as – The number of transistors on a chip will double approximately every two years.

Another popular adage is Murphy’s Law, named after Edward A. Murphy and based on his conclusion after experiments in 1949: "If there's more than one way to do a job, and one of those ways will result in disaster, then somebody will do it that way." The law has since been generalized as – Anything that can go wrong, will go wrong.

The next set of laws forms the basis for classical mechanics and is popularly known as Newton’s laws of motion, compiled by Sir Isaac Newton in 1687. The laws are –
First Law - Every object in a state of uniform motion tends to remain in that state of motion unless an external force is applied to it.
Second Law – The relationship between an object's mass m, its acceleration a, and the applied force F is F = ma. Acceleration and force are vectors and the direction of the force vector is the same as the direction of the acceleration vector.
Third Law – For every action there is an equal and opposite reaction.

Another law commonly discussed with reference to multiprocessing is Amdahl's law named after computer architect Gene Amdahl. It states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.
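
In its usual form, Amdahl's law gives the overall speedup as 1 / ((1 - f) + f / s), where f is the fraction of time the faster mode can be used and s is the speedup of that mode. A minimal sketch, with function and parameter names chosen only for this example:

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the work runs in the faster mode."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    unchanged = 1.0 - fraction_enhanced
    return 1.0 / (unchanged + fraction_enhanced / speedup_enhanced)


# Half of the work made 10x faster gives well under 2x overall.
print(round(amdahl_speedup(0.5, 10.0), 2))  # 1.82
```

Even an infinitely fast mode that applies to half of the work cannot do better than 2x overall, which is the limit the law describes.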

Finally, the law of Natural Selection, formulated by Charles Darwin in support of the theory of evolution, states that some individuals survive and reproduce more often than others, and as a consequence their heritable characteristics become more common over time.

Applying to Verification

Moore’s law – Amount of code to be verified doubles every 2 years.

Murphy’s law – Any code that isn’t verified will not work.

Based on the above two laws, we can further conclude that the total verification effort doubles every 2 years, where effort = Engineering time + Tools + Flows + CPU cycles.

Newton’s laws
First Law - Once the bug rate stabilizes, the code continues to be stable unless there is a change in external force. This force can be exerted by modifying constraints or introducing another methodology like assertions, formal etc.
Second Law - The improvement in the rate of finding bugs is directly proportional to the net force applied and inversely proportional to the total lines of code. This means the effort, approach and outcome all follow the F=ma relationship, wherein force is as described in the First Law, m = lines of code and a is the improvement in bug rate. Compared to IPs, large designs like SoCs need more effort to modify the approach, and the rate of finding bugs slows down for a given time.
Third Law - Every attempt to break the design results in a more stable and mature design.

Amdahl’s law – When multiple verification techniques are applied, the overall gain (i.e. stability of code) is limited by the fraction of the time these techniques can be used e.g. CRV over directed, Formal over CRV, Manual tests vs auto generation of tests etc.

Law of Natural Selection – Change is inevitable and during the process of verification evolution, multiple approaches show up but only a few find their footprint across design tape outs.

Drop in a comment with other laws you find applying to verification!
Happy Independence Day!!!

Sunday, April 29, 2012

Verification claims 70% of the chip design schedule!

Human psychology points to the fact that constant repetition of any statement registers it in the subconscious mind, and we start believing it. The statement “Verification claims 70% of the schedule” has been floating around in articles, keynotes and discussions for almost 2 decades, so much so that even in the absence of any data validating it, we have believed it as a fact for a long time now. However, the progress in the verification domain indicates that this number might actually be a "FACT".

20 years back, designs were a few K gates and the design team verified the RTL themselves. The test benches and tests were all developed in HDLs, and a sophisticated verification environment was not even part of the discussions. It was assumed that verification accounted for roughly 50% of the effort.

Since then, design complexity has grown exponentially and state of the art test benches with a lot of metrics have pre-empted legacy verification. Instead of designers, a team of verification engineers is deployed on each project to overcome the cumbersome task. Verification still continues to be an endless task, demanding aggressive adoption of new techniques quite frequently.

A quick glance at the task list of a verification team shows the following items –
- Development of metric driven verification plan based on the specifications.
- Development of HVL+Methodology based constrained random test benches.
- Development of directed test benches for verifying processor integration in SoC.
- Power aware simulations.
- Analog mixed signal simulations.
- Debugging failures and regressing the design.
- Adding tests to meet coverage goals (code, functional & assertions).
- Formal verification.
- Emulation/Hardware acceleration to speed up the turnaround time.
- Performance testing and usecases.
- Gate level simulations with different corners.
- Test vector development for post silicon validation.

The above list doesn’t include modeling for virtual platforms as it is still in the early adopter stage. Along with the verification team, significant quanta of cycles are added by the design team towards debugging. If we try to quantify the CPU cycles required for verification on any project, the figures would easily overshadow any other task of the ASIC design cycle.

Excerpts from the Wilson Research study (commissioned by Mentor) indicate interesting data (approximated) –
- The industry adoption of code coverage had increased to 72 percent by 2010.
- The industry adoption of assertions had increased to 72 percent by 2010.
- Functional coverage adoption grew from 40% to 72% from 2007 to 2010.
- Constrained-random simulation techniques grew from 41% in 2007 to 69% in 2010.
- The industry adoption of formal property checking has increased by 53% from 2007 to 2010.
- Adoption of HW assisted acceleration/emulation increased by 75% from 2007 to 2010.
- Mean time a designer spends in verification has increased from an average of 46% in 2007 to 50% in 2010.
- Average verification team size grew by a whopping 58% during this period.
- 52% of chip failures were still due to functional problems.
- 66% of projects continue to be behind schedule. 45% of chips require two silicon passes and 25% require more than two passes.
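
The bullets above quote both adoption levels ("from 40% to 72%") and growth figures ("increased by 53%"), which are easy to mix up. A small sketch of the arithmetic, using only the from/to pairs quoted above; the function names are illustrative:

```python
def point_change(before_pct: float, after_pct: float) -> float:
    """Change in percentage points between two adoption levels."""
    return after_pct - before_pct


def relative_growth(before_pct: float, after_pct: float) -> float:
    """Growth of the later level relative to the earlier one, in percent."""
    return (after_pct - before_pct) * 100.0 / before_pct


# Functional coverage adoption, 2007 to 2010
print(point_change(40, 72), relative_growth(40, 72))               # 32 80.0
# Constrained-random simulation, 2007 to 2010
print(point_change(41, 69), round(relative_growth(41, 69), 1))     # 28 68.3
# Mean time a designer spends in verification, 2007 to 2010
print(point_change(46, 50), round(relative_growth(46, 50), 1))     # 4 8.7
```

The same data reads as "32 points" or "80% growth" depending on which measure is used, so it helps to state which one is meant.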

While the biggies of the EDA industry are evolving the tools incessantly, a brigade of startups has surfaced, each trying to check this exorbitant problem of verification. The solutions attack the problem from multiple perspectives. Some of them are trying to shorten the regression cycle, some are moving the task from engineers to tools, some provide data mining, while others provide guidance to reduce the overall effort.

The semiconductor industry is continuously defining ways to control the volume of verification, not only by adding new tools or techniques but also by redefining the ecosystem and collaborating at various levels. The steep rise in the usage of IPs (e.g. ARM’s market valuation reaching $12 billion, and Semico reporting that the third-party IP market grew by close to 22 percent) and VIPs (read posts 1, 2, 3) is a clear indication of this fact.

So much has been added to the arsenal of verification teams and their ownership in the ASIC design cycle that one can safely assume the verification effort has moved from 50% in the early 90s to 70% now. And since the process is still ON, it would be interesting to see if this magic figure of 70% persists or moves up further!!!