Harnessing Big Data to Simplify Debugging
May 9, 2016
Asi Lifshitz, CTO
www.thevtool.com
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
RTL Debugging
• Verification is one of the major bottlenecks on the way to tape-out
• Debugging failing tests is complex and time-consuming
Source: Wilson Research & Mentor Graphics, 2014
Debugging Today
• Iterating between the waveforms and the simulation log file
• Simulation log files can reach several GB
Debugging Tomorrow
• Big Data tools will quickly and efficiently extract data from huge log files
• Extracting and manipulating data gets simpler
• Data can be presented in a graphical way
• Shortening the debug time will shorten the project schedule and increase the engineer’s productivity
Big Data
• Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate
• The term often refers simply to the use of advanced methods for extracting value from data, and seldom to a particular size of data set
Big Data – Cont.
• For some organizations, facing a few gigabytes of data for the first time may trigger a need to reconsider data management options
• For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration
Database
• A database is an organized collection of data
• The data is typically organized in a way that supports processes that require information
• A database management system (DBMS) is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data
Database for Log Files
• A database can be used to query a specific record, i.e., a specific message
• However, if some computation is required, a database search engine should be used
– A concrete example that goes beyond the capabilities of a database is when the DV engineer would like to see all messages from time point tp1 to time point tp2
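To make the time-range request concrete, here is a minimal Python sketch of what it computes. It is illustrative only: the sample records and the function name `messages_between` are assumptions, and a search engine would answer the same request from its index rather than from an in-memory list.

```python
from bisect import bisect_left, bisect_right

# Hypothetical (time point, message text) pairs, sorted by time,
# in the order they would appear in a simulation log.
messages = [
    (1000, "reset released"),
    (2500, "write to register"),
    (4000, "read from register"),
    (4498000, "data mismatch"),
]


def messages_between(records, tp1, tp2):
    """Return all records whose time point lies in the closed range [tp1, tp2].

    Assumes `records` is sorted by time point, so two binary searches
    locate the slice without scanning the whole list.
    """
    times = [time for time, _ in records]
    first = bisect_left(times, tp1)   # first record with time >= tp1
    last = bisect_right(times, tp2)   # one past the last record with time <= tp2
    return records[first:last]


if __name__ == "__main__":
    for time, text in messages_between(messages, 2000, 4000):
        print(time, text)
```

Because simulation logs are written in time order, the sorted-input assumption usually holds without an extra sort.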
Database Search Engine
• A search engine allows the user to search for information using simple keywords
Lucene
• A free and open-source database search engine, originally written in Java
• Has been ported to Delphi, Perl, C#, C++, Python, Ruby, and PHP
• Suitable for any application that requires full text indexing and searching capability
• The core of its logical architecture is the idea of a document containing fields of text
Lucene for Verification
• A simulation log file is a structured textual file, and as such it can be indexed
• Once indexed, the Lucene API can be used to search for all the "interesting" events that are needed for debugging a failing test
UVM
• The Universal Verification Methodology (UVM) is a standardized methodology for verifying integrated circuit designs
• More than 70% of the industry has adopted UVM, and the numbers will only grow with time
Source: Wilson Research & Mentor Graphics, 2014
UVM Messages
• A UVM-based simulation log contains UVM messages that usually have the following format:
– Verbosity
– Filename(line)
– Timepoint
– Emitter
– Message
UVM Message Example
• UVM_ERROR /project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv(1863) @ 4498000: uvm_test_top.env.sb [WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3
• UVM_ERROR is the verbosity (or severity)
• /project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv(1863) is the filename(line)
• @ 4498000 is the time point
• uvm_test_top.env.sb is the emitter of the message
• [WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3 is the message
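The breakdown above can be sketched as a regular expression in Python. This is a minimal sketch, not the presenter's parser: the pattern is inferred from the single example message, the names `UVM_RE` and `parse_uvm_message` are assumptions, and real simulator logs may format messages differently.

```python
import re

# Pattern for the five elements: verbosity, filename(line), time point,
# emitter, and message. Inferred from one example message only.
UVM_RE = re.compile(
    r"^(?P<verbosity>UVM_\w+)\s+"            # e.g. UVM_ERROR
    r"(?P<filename>\S+)\((?P<line>\d+)\)\s+"  # path followed by (line)
    r"@\s*(?P<time>\d+):\s+"                  # @ time point:
    r"(?P<emitter>\S+)\s+"                    # hierarchical emitter name
    r"(?P<message>.*)$"                       # rest of the line
)

EXAMPLE = (
    "UVM_ERROR /project/sflash/verification/SFLASH_controller_ENV/src/"
    "sflash_controller_env_sb.sv(1863) @ 4498000: uvm_test_top.env.sb "
    "[WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, "
    "but expected 0x532e4cb3"
)


def parse_uvm_message(text):
    """Split one log line into its five elements, or return None if it does not match."""
    match = UVM_RE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])  # numeric fields become integers
    record["time"] = int(record["time"])
    return record


if __name__ == "__main__":
    for field, value in parse_uvm_message(EXAMPLE).items():
        print(f"{field}: {value}")
```

Lines that are not UVM messages, such as tool banners, simply return `None` and can be skipped by the caller.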
Using Lucene for UVM Messages
• Parse the log file so that every message is broken into the five elements above and stored as a record in the Lucene database
• The user can now use the efficient API of Lucene to extract information
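One way to picture the storage step is a toy inverted index over already-parsed records. This is a plain-Python stand-in, not Lucene and not the presenter's code: the class `TinyIndex`, its methods, and the first two sample records are hypothetical; only the last record is taken from the example message in this deck.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy in-memory index mapping (field, term) to the ids of matching records."""

    def __init__(self):
        self.records = []
        self.postings = defaultdict(set)

    def add(self, record):
        """Store a record and index every word-like term of every field."""
        doc_id = len(self.records)
        self.records.append(record)
        for field, value in record.items():
            for term in re.findall(r"\w+", str(value).lower()):
                self.postings[(field, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return the records whose `field` contains `term` (case-insensitive)."""
        ids = sorted(self.postings.get((field, term.lower()), ()))
        return [self.records[i] for i in ids]


# The first two records are hypothetical; the third is the deck's example message.
records = [
    {"verbosity": "UVM_INFO", "filename": "apb_driver.sv", "line": 120,
     "time": 1200, "emitter": "uvm_test_top.env.apb_agent.driver",
     "message": "Write 0x1 to sflash_reg.enable"},
    {"verbosity": "UVM_INFO", "filename": "spi_monitor.sv", "line": 88,
     "time": 3000, "emitter": "uvm_test_top.env.spi_agent.monitor",
     "message": "Collected data packet 0x532e4000"},
    {"verbosity": "UVM_ERROR",
     "filename": "/project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv",
     "line": 1863, "time": 4498000, "emitter": "uvm_test_top.env.sb",
     "message": "[WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3"},
]

index = TinyIndex()
for rec in records:
    index.add(rec)

if __name__ == "__main__":
    for hit in index.search("message", "packet"):
        print(hit["time"], hit["message"])
```

The per-field postings mirror the "document containing fields of text" idea: each parsed message is one document and each of the five elements is one field.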
Extracting UVM Records
• Designed to handle huge numbers of records, Lucene returns the requested records in negligible time, for example:
– All messages of a specific verbosity, or of a specific verbosity within some time range
– Messages containing a specific string
– All messages emitted from the APB UVC writing 0x1 to register sflash_reg.enable
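The three kinds of request listed above can be expressed as one filter over parsed records. The sketch below uses a linear scan in plain Python purely to show the query semantics; it does not use Lucene, and the function `query`, its parameters, the emitter names, and the first two records are assumptions made for illustration.

```python
def query(records, verbosity=None, time_range=None, text=None, emitter=None):
    """Return the records that satisfy every criterion that is not None.

    verbosity  : exact match, e.g. "UVM_ERROR"
    time_range : (start, end) tuple, inclusive on both ends
    text       : case-insensitive substring of the message
    emitter    : case-insensitive substring of the emitter path
    """
    result = []
    for rec in records:
        if verbosity is not None and rec["verbosity"] != verbosity:
            continue
        if time_range is not None and not (time_range[0] <= rec["time"] <= time_range[1]):
            continue
        if text is not None and text.lower() not in rec["message"].lower():
            continue
        if emitter is not None and emitter.lower() not in rec["emitter"].lower():
            continue
        result.append(rec)
    return result


# Hypothetical records, except the last one, which is the deck's example message.
records = [
    {"verbosity": "UVM_INFO", "time": 1200,
     "emitter": "uvm_test_top.env.apb_agent.driver",
     "message": "Write 0x1 to sflash_reg.enable"},
    {"verbosity": "UVM_INFO", "time": 3000,
     "emitter": "uvm_test_top.env.spi_agent.monitor",
     "message": "Collected data packet 0x532e4000"},
    {"verbosity": "UVM_ERROR", "time": 4498000,
     "emitter": "uvm_test_top.env.sb",
     "message": "[WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3"},
]

if __name__ == "__main__":
    print(query(records, verbosity="UVM_ERROR"))
    print(query(records, verbosity="UVM_INFO", time_range=(0, 2000)))
    print(query(records, emitter="apb", text="sflash_reg.enable"))
```

A scan like this touches every record on every query, which is exactly the cost an index avoids on multi-gigabyte logs.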
Why Graphical Representation?
• It is extremely hard to navigate through the log file in search of the necessary information without being overwhelmed or missing something important
• A graphical representation of data is more natural and much easier to analyze
Graphical Representation of a Log File
Graphical Debugging
• The transition from debugging a textual file to a graphical representation is intuitive
• Problems are traced much faster. The engineer can quickly see what is wrong, when the pattern changes, or when some unexpected event has occurred
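The deck does not describe how its graphical view is built, so the following is only one simple possibility: counting messages per time bucket and verbosity, which yields data that can be plotted or, as here, drawn as text bars. The event list and the names `bucket_counts` and `render` are hypothetical.

```python
from collections import Counter

# Hypothetical (time point, verbosity) pairs taken from parsed messages.
events = [
    (100, "UVM_INFO"), (250, "UVM_INFO"), (900, "UVM_INFO"),
    (1100, "UVM_WARNING"), (1200, "UVM_INFO"),
    (2100, "UVM_ERROR"), (2150, "UVM_ERROR"), (2900, "UVM_ERROR"),
]


def bucket_counts(events, bucket_size):
    """Count messages per (time bucket, verbosity) pair."""
    counts = Counter()
    for time, verbosity in events:
        counts[(time // bucket_size, verbosity)] += 1
    return counts


def render(counts, bucket_size):
    """Draw one text bar per (bucket, verbosity), ordered by bucket start time."""
    lines = []
    for (bucket, verbosity), n in sorted(counts.items()):
        start = bucket * bucket_size
        lines.append(f"{start:>8} {verbosity:<12} {'#' * n}")
    return "\n".join(lines)


counts = bucket_counts(events, 1000)

if __name__ == "__main__":
    print(render(counts, 1000))
```

Even in this crude form, a burst of errors in one time window stands out in a way it would not while scrolling through the raw log.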
Summary
• The complexity and size of today's designs require new techniques, as the traditional ones lead to very long debugging times
• Harnessing tools that are used for processing Big Data can simplify and shorten the debugging of failing tests
• We hope that this work will encourage more research on bringing these strong capabilities to existing and new EDA tools
Thank You