Software Static Analysis
Successful I18n Project
Planning using Static Analysis
Lingoport, Inc.
3985 Wonderland Hill Ave.
Boulder, Colorado
USA 80304
+1 303 444 8020
www.lingoport.com
Adam Asnes
Grand Poisson
Copyright: March 2011
Please do not reproduce without authorized permission
Olivier Libouban
G11n Lead
Lingoport
• Internationalization Services
– Assessment
– Project planning
– I18n development
– I18n testing
– Localization integration
• Globalyzer
– Internationalization software
• Find and fix i18n issues in code
Agenda
• Business Case
• I18n issues
• Static Analysis Background
• Requirements Gathering
• Static Analysis Detail
• Project Plan Example
• Agile planning
• Continuous Integration for i18n
Engineering for Locale Support
• Globalization (g11n) has two components:
– Internationalization (i18n): software engineering to enable localization
– Localization (L10n): culture-specific resources (translation, etc.)
Business Case:
Nothing gets internationalized or localized just because it would be cool
I18n Needs – Biz vs. Tech
Engineering thinks about…
1. Multi-tiered web application?
2. Complex Interface?
3. Database components?
4. Embedded Strings?
5. Locale aware application?
6. Can it manage multiple data formats?
7. I18n testing plan?
8. Tactics to get it done
Our Software must be in Japanese, French, German, Chinese, and Spanish by November
I18n is Business Driven
• Global initiatives
– Expanding opportunities, New customers
• Competitive pressure
• Lost time to market
• Iterative code fixing, problems keep slipping
through
• Development costs in the hundreds of
thousands to millions of dollars
You Need a Plan – Scope 1st, design later
• Project becomes real with $$$
• CFO thinking in terms of ROI
– Deal Based
• Revenue – Costs = Profit
– Strategic
• Revenue over X years – Costs + effect on equity – risk
• Leverage global investment of organization
– Cost of Time to Market
• If you're late or lousy, that has significant opportunity cost
Engineering:
Localization is a Downstream Concern
• “Somebody else's problem” in the world of many developers
• Creates an opportunity to educate and shepherd
teams through globalization
Is It Internationalized?
• Teams typically underestimate i18n requirements
• Most don't know the answer
• Agile or other feature and release requirements
often overrun less formally measured i18n
requirements
• There is a Management Value in being able to
confirm global readiness
Example: Hard-Coded English Text
1 million lines of source code
Found:
20,000 Embedded Strings which cannot be efficiently translated
String orderStatus = "Your order has been processed. A confirmation e-mail will be sent to you shortly.";
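A minimal sketch of how such a string could be externalized with the standard Java `ResourceBundle` API. The class names, the key `order.status.processed`, and the French text are illustrative assumptions, not taken from the original slides; a real project would more likely keep the text in `.properties` files handed to translators.

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

public class OrderMessages {

    /** Default (English) resources; the key name is a hypothetical example. */
    public static class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"order.status.processed",
                 "Your order has been processed. "
                     + "A confirmation e-mail will be sent to you shortly."}
            };
        }
    }

    /** Hypothetical French translation (accents written as Unicode escapes). */
    public static class Messages_fr extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"order.status.processed",
                 "Votre commande a \u00e9t\u00e9 trait\u00e9e. "
                     + "Un e-mail de confirmation vous sera envoy\u00e9 sous peu."}
            };
        }
    }

    /** Looks the message up by key instead of hard-coding the English text. */
    public static String orderStatus(Locale locale) {
        ResourceBundle bundle =
            ResourceBundle.getBundle(Messages.class.getName(), locale);
        return bundle.getString("order.status.processed");
    }

    public static void main(String[] args) {
        System.out.println(orderStatus(Locale.ROOT));
        System.out.println(orderStatus(Locale.FRENCH));
    }
}
```

The calling code no longer contains any translatable text, so adding a language means adding a bundle rather than editing source.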
Character Sets/Encodings
• Character set (e.g. Unicode)
– A set of characters, each assigned a numeric code point, used to support a given language or series of languages (coded character set)
• Character encoding (e.g. UTF-16, UTF-8)
– A scheme that maps each code point in a character set to a sequence of bytes for storage and transmission
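As a small illustration, the same Unicode character can occupy a different number of bytes depending on the encoding. This sketch uses only the standard `java.nio.charset.StandardCharsets` constants; the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    /** Number of bytes the text occupies when encoded as UTF-8. */
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes the text occupies when encoded as UTF-16 (big-endian, no BOM). */
    public static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Latin capital A, e with acute accent, and a CJK ideograph
        String[] samples = {"A", "\u00e9", "\u65e5"};
        for (String s : samples) {
            System.out.printf("U+%04X  UTF-8: %d byte(s)  UTF-16: %d byte(s)%n",
                s.codePointAt(0), utf8Length(s), utf16Length(s));
        }
    }
}
```

Code that assumes one byte per character holds for the first sample only, which is why buffer sizes and column widths need review during i18n work.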
Character Sets and Encoding
• This is broken:
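The original slide showed an image at this point that is not part of the transcript. As a stand-in, here is a sketch of one common way text ends up broken: bytes written in one encoding are read back in another. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Encodes text as UTF-8, then wrongly decodes the bytes as ISO-8859-1. */
    public static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with the same encoding, which preserves the text. */
    public static String roundTrip(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";
        System.out.println("Correct:   " + roundTrip(original));
        System.out.println("Corrupted: " + misdecode(original));
    }
}
```

The accented letter becomes two unrelated characters because its two UTF-8 bytes are each interpreted as a separate ISO-8859-1 character.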
Sample Code (Java) – i18n Examples
I18n Engineering Considerations
• Locale Handling
• Character encoding
• Strings
– External, Grammar, Segments, Plurals, Wrapping
– String Handling (char *, etc.)
– Tabs, spaces, delimiters, etc.
• Resource management – centralized, normalized, re-usable
• Dates - Calendar
• Times
• Sorting & searching
• Currency
• Transaction process
• Character set conversions
• On line help
• Sounds
• Honorific titles
• Telephone formats
• Postal formats
• Region-specific functions
• Shipping conditions
• Numerical formats
• Page layout, LTR, RTL
• Fonts and attributes
• Icons, colors
• Reporting, workflow
• Database support
• Multi-byte enabling
• Business logic
• Measurements, units
• Input Methods
• Data exchange
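Two of the considerations above, numerical formats and sorting, can be sketched with the standard `java.text` classes. The helper names are illustrative assumptions; the locales are only examples.

```java
import java.text.Collator;
import java.text.NumberFormat;
import java.util.Arrays;
import java.util.Locale;

public class LocaleFormatDemo {

    /** Formats a number with the grouping and decimal separators of the locale. */
    public static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    /** Sorts a copy of the words using the collation rules of the locale. */
    public static String[] sortForLocale(String[] words, Locale locale) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(formatNumber(1234567.891, Locale.US));
        System.out.println(formatNumber(1234567.891, Locale.GERMANY));

        String[] words = {"banana", "apple", "Cherry"};
        String[] byCodePoint = words.clone();
        Arrays.sort(byCodePoint); // plain code-point order puts "Cherry" first
        System.out.println(Arrays.toString(byCodePoint));
        System.out.println(Arrays.toString(sortForLocale(words, Locale.ENGLISH)));
    }
}
```

A plain `Arrays.sort` on strings compares code points, so uppercase letters sort before lowercase ones; a `Collator` gives the order a user of that locale would expect.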
Internationalization Challenge
• Software Data Path - it's not just the display
[Diagram: Display, Input, Transform, Store, Retrieve, Transform]
New Internationalization Project!
• What to do?
– Large amount of code
– Change in requirements
– Change in architecture
– Change in development practices
– Change in testing requirements
Practical Challenges
• Sift through hundreds of thousands or millions of
lines of code
• Managing the fixing of complex problems
• Creating a product that looks, feels and behaves natively to its worldwide users
• Source code must be refactored to adapt seamlessly to any language, streamlining support and updates
Code Review
• What to Identify
– Embedded strings
– Locale-sensitive methods/functions/classes
– Image references
– Unsafe programming constructs (e.g. regular expressions that assume US alphabetical order, pointer arithmetic, and more)
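A short sketch of the regular-expression problem named above: a pattern built on the A to Z range rejects perfectly valid words in other languages, while a Unicode letter class accepts them. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class WordValidator {

    // Locale-limiting: only the 26 unaccented Latin letters are accepted.
    private static final Pattern ASCII_LETTERS = Pattern.compile("[a-zA-Z]+");

    // Safer: \p{L} matches any Unicode letter.
    private static final Pattern ANY_LETTERS = Pattern.compile("\\p{L}+");

    public static boolean isWordAsciiOnly(String s) {
        return ASCII_LETTERS.matcher(s).matches();
    }

    public static boolean isWord(String s) {
        return ANY_LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String name = "M\u00fcller"; // German surname with u-umlaut
        System.out.println("ASCII-only pattern accepts it: " + isWordAsciiOnly(name));
        System.out.println("Unicode pattern accepts it:    " + isWord(name));
    }
}
```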
Code Analysis
• How to Identify Issues
– “Brute force”
• Engineers search for and resolve known issues
• Count display pages
• Pseudo-localization
• Scripts and page by page analysis
– Globalyzer-assisted review, static analysis
• An I18n code analysis tool is employed to examine source code for a large range of potential and known issues
• Issues can be identified and resolved in a more systematic fashion
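Pseudo-localization, listed above as a brute-force technique, can be sketched as a simple string transform: accent the vowels, pad the text to simulate expansion, and bracket it so truncation is visible. The character mapping and the 30% padding are assumptions chosen for illustration, not a standard.

```java
public class PseudoLocalizer {

    private static final String PLAIN = "aeiouAEIOU";
    // Accented counterparts, in the same order as PLAIN.
    private static final String ACCENTED =
        "\u00e0\u00e9\u00ee\u00f6\u00fc\u00c0\u00c9\u00ce\u00d6\u00dc";

    /** Returns a readable but visibly "foreign" version of the text. */
    public static String pseudoLocalize(String text) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : text.toCharArray()) {
            int i = PLAIN.indexOf(c);
            sb.append(i >= 0 ? ACCENTED.charAt(i) : c);
        }
        // Pad by roughly 30% (rounded up) to mimic longer translations.
        int padding = (text.length() * 3 + 9) / 10;
        for (int i = 0; i < padding; i++) {
            sb.append('~');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(pseudoLocalize("Save"));
        System.out.println(pseudoLocalize("Your order has been processed."));
    }
}
```

Any text that still appears unbracketed and unaccented in the running application did not pass through the resource lookup and is a candidate embedded string.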
Traditional Approach - repeat, and repeat, and repeat, and repeat
• Localize and see what you're missing
• GREP, overwhelm developers
• View pages. Pore through code for strings, methods, etc.
• Externalize and refactor one by one
• Test, Pseudo-Localize
Globalyzer Server and Clients
Static Analysis on the Source Code
Server
Client Command Line
Globalyzer is methodology agnostic. Project Managers may use it in a 'traditional' approach or an Agile approach.
Globalyzer Principles - Customization
• Globalyzer Server manages Rule Sets Configuration
– Globalyzer Rule Sets are used to identify i18n issues in the code base
– Rules embody the i18n issue detection logic
– One rule set targets one programming language (& variant)
– Default rule sets are based on research and years of experience
– Rules must be tailored to a specific project
– Rules can be shared amongst team members
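To make the idea of a detection rule concrete, here is a deliberately naive sketch of one: report string literals that look like user-facing sentences. This is not Globalyzer's rule format or detection logic, which the slides do not describe; the regular expressions and the "contains two words" heuristic are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmbeddedStringScanner {

    // Matches a double-quoted literal, allowing backslash escapes inside it.
    private static final Pattern STRING_LITERAL =
        Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Crude filter: at least two words separated by whitespace.
    private static final Pattern LOOKS_LIKE_TEXT =
        Pattern.compile(".*\\p{L}+\\s+\\p{L}+.*");

    /** Returns the literals in the source that look like translatable text. */
    public static List<String> findEmbeddedStrings(String source) {
        List<String> hits = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(source);
        while (m.find()) {
            String literal = m.group(1);
            if (LOOKS_LIKE_TEXT.matcher(literal).matches()) {
                hits.add(literal);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String source =
            "String msg = \"Your order has been processed.\";\n"
            + "String key = \"order.status\";\n";
        List<String> hits = findEmbeddedStrings(source);
        System.out.println(hits.size() + " candidate(s): " + hits);
    }
}
```

Even this toy rule shows why rules need tailoring per project: the filter that separates display text from keys, paths, and SQL depends on the conventions of the code base.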
Globalyzer Principles – Desktop Analysis
• Globalyzer desktop client:
– Scan source code using Globalyzer Rule Sets
– Detect and report potential i18n issues
– Manage i18n issues
– Assist in fixing the code to make it i18n compliant
Globalyzer Principles - Automation
• Globalyzer Command Line
– For integration in the overall software process to run at given
frequencies
– Generate reports once a setup has been established
– Different strategies
• Segment the code base into small scan projects that
reflect the i18n effort
• Focus on i18n scope
I18n Processes
• Planning
• Market Requirements Analysis
• Architectural Requirements Analysis
• Code Review
• I18n Design
• I18n Implementation
• Testing
• And beyond…
• Localization
• Support
Merging Requirements and
• Architectural Changes
What's not in the code
– Locale support
– Changes to how data is passed around
– Discuss and analyze technical requirements
• Code Analysis
What's in the code
– Strings
– Refactoring locale-limiting methods/functions
– Find and count issues
I18n Architectural Challenge – what’s not in the code
• Database: character encoding support
• Application code: e.g. Java, C++, VB
• 3rd party products
• U/I: e.g. JSP, ASP, ASPX
• Business logic
• Platforms, browser support requirements
• Marketing requirements: locale behavior
COMPLICATIONS
Operational Challenges
• Ongoing development
– Agile?
– Code Branching?
– Multiple teams?
Release Path
• Internationalization, 1st Time
– Most of U/I
– Breaks the DB
– Data I/O
– Test entire product
• Feature Release
– 3 week sprint?
– Focus on code subset
– Concentrated testing
• Static analysis with Globalyzer
Code branch, merge, testing strategy
Factors to Plan On
• Programming languages
• How many tiers, what do they do
• Database support
• Locale Requirements
• 3rd Party Products – support for Unicode?
• Size of Application – Lines of Code
• Amount of Embedded Strings to be Externalized
• Estimate of concatenation
• DB refactoring
• Methods/Functions/Classes replacement
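Concatenation is worth estimating separately because sentences assembled from fragments fix the word order in code. A minimal sketch of the usual replacement, a single pattern with placeholders via the standard `java.text.MessageFormat`; the patterns themselves are invented examples.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class MessageDemo {

    /** Concatenation: the word order is frozen in code and cannot be translated. */
    public static String concatenated(int page, int total) {
        return "Page " + page + " of " + total;
    }

    /** Pattern with placeholders: a translator can reorder {0} and {1}. */
    public static String formatted(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(concatenated(2, 10));
        System.out.println(formatted("Page {0} of {1}", Locale.US, 2, 10));
        // Hypothetical translated pattern that needs the total first.
        System.out.println(formatted("{1} pages, showing page {0}", Locale.US, 2, 10));
    }
}
```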
Tiers and Technologies
1
• Java
• C#
2
• JavaScript
• VB
3
• C++
• Older languages: e.g. RPG
Time and effort increase
Other Issues
• Stability of the build
• Quality of the code
– History
• Focus of the developers
• Source code management approach
• New concurrent development introducing new
i18n problems
Questions & Answers
Adam Asnes
Olivier Libouban
Resources
http://www.lingoport.com
Globalyzer
http://www.globalyzer.com
Blog
http://i18nblog.com
Lingoport:
Requirements and Planning
Adam Asnes
President & CEO
Lingoport
Olivier Libouban
Globalization Lead
Lingoport
Why go through requirements?
• I18n work is software engineering
• To determine the scope of the i18n work, the i18n team cannot simply look at the code and come up with an i18n project
• Scope also leads to planning, cost, resources
• How to describe i18n requirements?
Focus on one requirement: Locale
• One product instance per locale?
• Multi-locale support
• Locale detection?
• User account support?
Ex: WebSphere Portal Locale Determination
– User logged in: display the user's preferred language
– No preferred user language: look for the user's browser language
• If the portal supports that language, it displays content in that language.
• If the browser has more than one language defined, the portal uses the first language in the list to display the content.
– If no browser language can be found, for example if the browser used does not send a language, the portal resorts to its own default language.
– If the user has a portlet that does not support the language that was determined by the previous steps, that portlet is shown in its own default language.
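The fallback chain described above can be sketched as two small functions. This follows the slide's wording only; it is not the WebSphere Portal API, and every class, method, and parameter name here is hypothetical. A `null` user preference stands for "no preferred language".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class LocaleResolver {

    /** Picks the portal locale: user preference, then browser, then default. */
    public static Locale resolvePortalLocale(Locale userPreferred,
                                             List<Locale> browserLocales,
                                             Set<Locale> portalSupported,
                                             Locale portalDefault) {
        if (userPreferred != null) {
            return userPreferred;
        }
        if (!browserLocales.isEmpty()) {
            // Per the description, only the first browser language is considered.
            Locale first = browserLocales.get(0);
            if (portalSupported.contains(first)) {
                return first;
            }
        }
        return portalDefault;
    }

    /** A portlet that lacks the chosen locale falls back to its own default. */
    public static Locale resolvePortletLocale(Locale portalLocale,
                                              Set<Locale> portletSupported,
                                              Locale portletDefault) {
        return portletSupported.contains(portalLocale) ? portalLocale : portletDefault;
    }

    public static void main(String[] args) {
        Set<Locale> supported =
            new HashSet<>(Arrays.asList(Locale.ENGLISH, Locale.FRENCH));
        List<Locale> browser = Arrays.asList(Locale.FRENCH, Locale.GERMAN);

        Locale portal = resolvePortalLocale(null, browser, supported, Locale.ENGLISH);
        System.out.println("Portal locale:  " + portal);

        Set<Locale> portletSupported = new HashSet<>(Arrays.asList(Locale.ENGLISH));
        System.out.println("Portlet locale: "
            + resolvePortletLocale(portal, portletSupported, Locale.ENGLISH));
    }
}
```

Writing the rules down this way turns the locale requirement into something testable, which helps when scoping the work.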
One-Time Locale Selection
System based Locale Detection
More of the typical i18n requirements
• Target date(s)
• System requirements
• Existing & potential use cases for:
– UI text entry
– Text display
– Text processing
– Collation
– Handling of locale-sensitive data (dates, numbers, currencies, etc.)
• Client Installer considerations
Architectural Discussion
• Thorough Product Demo
• Walk through major architecture components
Conceptual illustrative architecture
Specific development and integration
[Diagram: CODE (UI, Business, Persistence, Workflow Engine) integrated with Web Services, Rules Engine, JMS, RDBMS, LDAP, CMS, and 3rd Parties]
April 19, 2011 – p 45
Specific i18n software engineering focus
• UI: HTML, server side, JavaScript, input forms, CSS, content presentation, etc.
• Business logic: searches, comparisons, data exchange with external systems
• Persistence: exchanges with RDBMS, Content Management, LDAP, file-based persistence (XML, etc.)
Specific development i18n issues
• String externalization (outside of code) and i18n resource bundles
• Locale-sensitive methods: searching, retrieving, sorting, date and time, string operations, character operations, etc.
• Code resources (images, etc.)
• Overall programming language specifics
Data stores i18n issues
• PL/SQL
• Encoding
• Locale files (xml, xls, csv, etc)
• Database specific issues, date/time, conversion, sorting, soundex, etc.
• Storing and retrieving local data in local language (vs. a “generic” schema)
• User entered data
• Columns requiring translation
• Attributes, user names, postal addresses, etc
• Database design
Content Management i18n issues
• Accessing the proper locale
• Encoding of content
External system i18n issues
• Modality of data exchange / data loss
• Accessing the proper locale
• Encoding/persistence of content on external system
Process requirements:
how to fit into an existing environment
• Lifecycle
• Documentation
• Integration
• QA
• Type of meetings
• Build
• Source control
• Branching
• Reporting structure
• Review boards
• JUnit
• Globalyzer
• Bug Reporting
Static Analysis Detail
Globalyzer example – Running and Reporting
Example Project Plan
Looking at a plan from a service project
Example Project Plan
Combine:
• 1 Part Architecture
• 1 Part Code Metrics
• 1 Part Experience
Lingoport:
Agile & Internationalization
Agile in one slide (smallest nutshell)
• Roles (Product Owner, Scrum Master, Team)
• Product Backlog
• Sprints (user stories are designed, implemented, tested in a 'short' timeframe, e.g. 3 weeks)
• Sprint Backlog
• Daily Scrums
• Demonstrable
• 'Shippable'
i18n and Agile Challenges
• Traditionally, legacy i18n has followed a waterfall model:
– i18n cuts across the code, for instance:
• Encoding problems … in all the code
• Formatting issues … in all the code
• Externalize strings …
– i18n needs a systemic approach
– i18n projects tend to have long life cycles
– (L10n: must get an entire locale done)
• From a methodology perspective, Agile:
– is feature driven
– runs in “short” Sprints
• Sometimes a Hybrid approach works best
Agile & i18n Process Challenges
Lingoport Project Assessment - Legacy
• Uncover potential i18n issues from 2 perspectives:
– Code perspective: Globalyzer reporting/metrics
– Architectural: Locale/technical i18n requirements
• Allows creation of the initial 'i18n product backlog'
• Can, but does not need to, be part of a Sprint
• Provides an overall scope and effort estimate
• Can feed into a number of processes
– TDD, ADD, Waterfall, … Agile
• Involve the Product Owner: communication resource
Lingoport Project Organization
Backlog identification and Scoping
• The i18n product backlog is a prioritized list of
requirements, stories, features, etc.
• What the customer wants, described using the (Product Owner's) customer's terminology
ID | Name | Imp | Est | How to demo | Notes
1 | Locale Setting and Tracking | 30 | 5 | Log in; if no login before, default locale; splash screen for Locale | If first time, otherwise remembers
… | …
… | …
2 | Locale for languages | 10 | 8 | Log in for an 'en US' user, Locale is default; go to page 'www.'; change Locale; check pseudo localization |
… | ..
Lingoport Project Organization
Sprint Management
• i18n code branching
• Agile typically uses development build, CI
environments
• Must pass „regular‟ dev criteria
• Must be able to push i18n code branching easily
and vice versa
• I18n tests must be available to other teams in CI
• Some items are more sensitive than others
– Database schema changes and implications on all source
Continuous Integration - Basics
Team 1
Team 2
Team 3
Team 4
Team 5
CI & Scan Results Summary
CI & Scan Details Results