
Analysis and System Test of Powertrain Embedded Control Systems in Heavy Vehicles during Start-Up and Shutdown

MARK BARTISH

Master of Science Thesis Stockholm, Sweden 2011


Master’s Thesis in Computer Science (30 ECTS credits) at the School of Engineering Physics, Royal Institute of Technology, 2011
Supervisor at CSC: Alexander Baltatzis
Examiner: Stefan Arnborg

TRITA-CSC-E 2011:065
ISRN-KTH/CSC/E--11/065--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc

Abstract

This diploma project was performed at Scania CV AB in Södertälje. The goal was to investigate embedded powertrain control systems with respect to their start-up and shutdown processes, which are particularly sensitive phases in these systems. This is because these control systems, called Electronic Control Units (ECUs), interact through a communication bus called a Controller Area Network (CAN). A unit that sends faulty data may affect other ECUs on the same communication bus. The ECUs on a bus do not all start simultaneously, and the variation in start-up times must be taken into account. During shutdown, the sensitive process of saving Non-Volatile Memory (NVM) data is initiated. Should something go wrong during this process, the result may be corruption of operational data and End-Of-Line (EOL) configuration. Misleading error codes may also be set.

Scania therefore wanted one or several test cases for system test of the powertrain ECU software, focused specifically on these areas. The author of this report performed a technical analysis of the “problem areas” of the ECUs as well as a failure report analysis in order to determine the areas of greatest risk. Based on this analysis, the system functional requirements on the ECUs were identified and test cases were developed. The work resulted in two test cases, each of which is related to an identified problem area. The test cases are divided into test flows, which are sets of direct instructions for how the tests should be performed. Each test case verifies one or more system functional requirements and is meant to be implemented as scripts for the test automation rigs. The implementation as test automation scripts was not part of this diploma work; the test flows were only run manually in a laboratory environment. A theoretical study of different techniques for software testing was also performed. The result of this study is presented in the theory chapter of the report.

Referat

Analysis and System Test of Embedded Powertrain Control Systems in Heavy Vehicles during Start-Up and Shutdown

This degree project was carried out at Scania CV AB in Södertälje. The purpose was to investigate embedded powertrain control systems in trucks and buses with respect to problems related to start-up and shutdown, which are particularly sensitive phases for these control systems. The reason is that the control units communicate over CAN (Controller Area Network), and a control unit that sends faulty data affects all other units on the same communication bus. The systems on the network do not all start at exactly the same time, so variations in start-up time must be taken into account. During shutdown, saving NVM data1 to EEPROM can be a problem; an unexpected shutdown of a control system can result in corrupt data. These problems can lead to misleading error codes being set.

Scania therefore wanted to develop test cases for system test of the software in these control systems, focused specifically on these problem areas. The work began with a technical analysis of the problem areas and continued with a review of failure reports, both internal ones and reports from authorized Scania workshops. Software requirements were then identified, and test cases were developed according to the company’s governing documents that define the test case development process. The result was two test cases, each of which concerns one identified problem area. The test cases are divided into test flows, which are sets of direct instructions for how the testing is to be carried out. Each test flow verifies one or more system requirements. The test flows are intended as a basis for implementing test scripts for the test automation rigs. No script implementation was done within the scope of the degree project, only a manual run in a laboratory environment. A theoretical study of different software testing techniques was also performed. Its results are presented in the theory part of the report.

1 NVM = Non-Volatile Memory

Contents

1 Introduction

2 Background and Problem Statement
2.1 Electronic Systems in Heavy Vehicles
2.2 EMS - Engine Management System
2.2.1 Hardware
2.2.2 Software
2.3 GMS - Gearbox Management System
2.3.1 Scania Opticruise
2.3.2 Scania Comfort Shift
2.3.3 Scania Retarder
2.4 EEC - Exhaust Emission Control
2.5 OBD - On Board Diagnostics
2.5.1 KWP - Keyword Protocol
2.6 The problem
2.7 Test platforms
2.7.1 Automated Testing in Simulator Rigs
2.7.2 Manual Testing in a Vehicle

3 Theory
3.1 CAN - Controller Area Network
3.2 Theory behind CAN
3.2.1 Real-time System
3.2.2 Differential bus
3.2.3 Data transmission
3.2.4 Bit stuffing
3.2.5 Bit arbitration
3.3 Software Testing
3.3.1 Module Testing (also called Unit Testing)
3.3.2 Function Testing
3.3.3 Integration testing
3.3.4 System testing
3.3.5 Acceptance Testing
3.4 Other Types of Testing
3.4.1 Regression testing
3.5 Software Development and Testing Procedures
3.5.1 The V-model
3.6 Test Techniques: Black Box Testing
3.6.1 Decision Tables and Decision Trees
3.6.2 State Transition Testing
3.6.3 Equivalence Class Partitioning and Boundary Value Analysis
3.7 White Box Testing
3.8 Gray Box Testing
3.9 Smoke testing
3.10 Non Functional Testing
3.11 How far should we test?
3.12 Formal Methods

4 Methods
4.1 Technical Analysis
4.2 Failure Report Analysis
4.3 Requirement Identification
4.4 Test Techniques
4.5 Developing the Test Cases

5 Results
5.1 Technical Analysis
5.1.1 CAN messages and signals
5.1.2 Signal and Component Statuses
5.1.3 Start Up
5.1.4 Cranking
5.1.5 Shutdown
5.2 Definitions
5.3 Test case 1: EMS – EEC Communication, CAN timeouts detection
5.3.1 Requirement Identification
5.3.2 Test Techniques
5.3.3 Test Flows
5.3.4 Testing the Test Case
5.4 Test case 2: EMS Shutdown
5.4.1 Requirement Identification
5.4.2 Test Techniques
5.4.3 Test Flows
5.4.4 Detect an Abnormal Shutdown and Set a DTC and Internal Event (INTE)
5.4.5 Possibility to Cancel a Shutdown in Progress
5.4.6 Test Results

6 Conclusions and Suggestions for Future Work
6.1 Conclusions
6.2 Future Work Suggestions

Bibliography

Nomenclature

CCP CAN Calibration Protocol

ComP Common Platform

COO Coordinator control unit

CRC Cyclic Redundancy Check

DEC Diagnostic Event Code

DIN Deutsches Institut für Normung

DTC Diagnostic Trouble Code

E2 See EEPROM

ECU Electronic Control Unit

EEC Exhaust Emission Control

EEPROM Electrically Erasable and Programmable Read-Only Memory

EMS Engine Management System

EOL End of Line

GMS Gearbox Management System

ICL Instrument Cluster

ISO International Organization for Standardization

J1939 The SAE standard for CAN communication defining some of the signals that are sent between control units

KWP Keyword Protocol

NEVS System Test group within NE

NE Powertrain Control System department at Scania

OBD On-Board Diagnostics

OPC Opticruise

S8 A version of EMS

SAE Society of Automotive Engineers

SCR Selective Catalytic Reduction

SDP3 Scania Diagnos and Programmer 3

SFR System Functional Requirements

U15 An input signal used for wake-up of the ECUs. It is 0 V if the ignition key is OFF and at the same level as U30 if the ignition key is ON (DIN 72552)

U30 An input line for the main power source to the ECUs (DIN 72552)

VCI Vehicle Communication Interface

Chapter 1

Introduction

Today, electronic control systems are used in almost all motorized vehicles. An electronic controller consists of computer hardware and software that constantly reads input signal values from the sensors connected to it and, based on these values, calculates the output signals to the actuators; in other words, it controls the system. Such controllers are called Electronic Control Units (ECUs), a term that will be used throughout the report.

The ECUs control different parts of a vehicle’s function, from the most vital, like fuel injection in the engine, to less important ones like the radio and the cab heating system. As the complexity of every vehicle’s electrical and electronic system grows, so does the need to systematize the ECUs. It is neither possible nor desirable to have one single control unit for the whole vehicle. To make the system modular there need to be several ECUs, one for each function. This of course means that the ECUs must be able to communicate with each other. A standard named CAN1 was developed for this purpose.

Scania CV AB is a manufacturer of heavy vehicles (trucks, buses) and industrial and marine engines (subsequently denoted I&M-engines). Scania’s research and development department is located in Södertälje, Sweden.

This master’s thesis project was performed at the System Test group within the department for development of powertrain control systems at Scania Research and Development. The objective was to analyze the operation of powertrain ECUs during their start-up and shutdown phases: memory initialization, non-volatile data management (EEPROM data), file handling and communication establishment, as well as sensor/actuator signals and CAN signals with signal statuses and related DTCs (Diagnostic Trouble Codes). After this analysis, a list of software SFRs (System Functional Requirements) was compiled, consisting of requirements already defined in other documents at the department as well as requirements defined by the test developer. The latter may be requirements that are derived from SFDs (System Function Descriptions) but not explicitly stated there. The ultimate aim was to develop one or several test cases that cover the SFRs, a process which included choosing test techniques and a test platform.

1 CAN = Controller Area Network


Chapter 2

Background and Problem Statement

As mentioned in the introduction, digital electronic controllers have made their way into the automotive domain during the last decades. A considerable part of Scania Research and Development (R&D) is involved in the development of these control systems, commonly known as embedded systems.

Embedded systems are similar to desktop computers in the sense that they also have a microprocessor and surrounding hardware, such as a motherboard and memory chips. However, unlike desktop computers they are not general-purpose machines. They are designed, in both hardware and software, to perform a very specific task, such as controlling the operation of an engine. Hardware resources like CPU and memory are often much more limited in embedded systems than in desktop computers. Also, the operating systems in embedded applications usually do not provide services such as virtual memory management, which puts much stronger requirements on memory management in the application software. Another difference from desktop computers is the human interaction. Some embedded systems lack direct human interaction of any kind, while others communicate with the user in a different way than a desktop computer does. Light-emitting diodes (LEDs), small displays, relays, switches and similar devices may be used for user interaction.

2.1 Electronic Systems in Heavy Vehicles

Figures 2.1 and 2.2 illustrate a network of ECUs in a Scania truck. Each ECU is responsible for different functionality in the vehicle. A typical ECU is connected to different sensors and actuators; it constantly reads input signals from the sensors and outputs control signals to the actuators. ECUs are connected to each other in such a way that they can act as sensors and actuators for each other (more about this in section 3.1). There are many ECUs in a typical modern vehicle; a modern car can contain up to 40 ECUs [HYB10].


Figure 2.1. ECUs in a Scania truck. For the meaning of the different colors, see fig. 2.2.

Not all of the ECUs used in Scania vehicles are developed by Scania. Some are purchased from external suppliers, but many of the ECUs shown in fig. 2.1 are developed by Scania. This applies to all of the powertrain ECUs, which are named GMS1, EMS and EEC (not shown in fig. 2.1).

2.2 EMS - Engine Management System

Engine Management System: this ECU is found on all engines produced by Scania, both road vehicle engines and I&M-engines. It is one of the most important and complex ECUs found in a Scania product. There have been several different software and hardware versions of the EMS over the past years. The current one is named S8. The differences from the previous version, S7, are mostly in software. The largest software modification with respect to S7 is that S8 is based on what is called the Common Platform (ComP). S8 is fully compatible with S7 with respect to the connector-pin interface and replaces it as a spare part.

1 With the exception of fully automatic gearboxes, which are purchased from external suppliers as a complete unit.

Figure 2.2. Topological view of a CAN network in a typical Scania vehicle. The three different communication buses (red, yellow and green) divide the ECUs logically into three groups. ECUs on the same bus communicate with each other directly, while any communication between ECUs on different buses is controlled by the Coordinator ECU. Each of the buses may operate at a different bit rate. The red bus interconnects the ECUs that are most critical for the vehicle’s function and operational safety. The yellow bus contains less critical but still important ECUs, and the green bus is for systems which do not have any impact on the operational safety of the vehicle (however, some ECUs on the green bus are responsible for providing information to the driver that may be considered safety critical). Not all ECUs are directly connected to one of the three buses; one example is the EEC, which communicates directly only with the EMS.

Figure 2.3. EMS version S7 mounted on a 6-cylinder engine

2.2.1 Hardware

EMS S8 has 140 connector pins to sensors and actuators, including four CAN pairs: one for the red CAN bus, one for the EEC sub-bus, one for internal development use only and one that is currently unused [Scania Internal Documentation].

2.2.2 Software

From the software point of view the EMS consists of different components (or layers).

Common Platform (ComP) may be seen as the lowest-level software (i.e. closest to the hardware). ComP contains software for direct signal I/O, signal processing and a part of the non-volatile memory (NVM) management, and it interacts closely with the LLAP. ComP also contains code that translates physical signals (measured in volts) into engineering units. ComP can be seen as the “foundation” and is used in multiple ECUs (all powertrain ECUs).


Figure 2.4. EMS S8 software architecture layers and managers.

Low Level Application software (LLAP) is responsible for transmission and reception of CAN messages, encoding and decoding of signals to and from CAN messages according to Scania CAN specifications, and timeout diagnosis. It is also responsible for some hardware checks, AD conversion and diagnosis of sensors and actuators. Since this layer contains parts that are directly involved in the start-up and shutdown process, it is analyzed quite thoroughly.

High Level Application software (APPL or HLAP) contains managers (which in turn contain modules) that control and monitor fuel injection, combustion, gas exchange, after-treatment and so on. These are of limited relevance to this thesis.

Virtual Sensors (VSEN) is irrelevant to the project and will not be described.

File manager (FILE) is responsible for reading and writing non-volatile data from/to EEPROM, as well as for the RAM mirror, a mechanism by which data normally residing in EEPROM is read into RAM during execution for faster access. During shutdown the data is written back to EEPROM.

Run-Time Database (RTDB) manages signals that are cross-layer accessible within the unit.

Commonly used utilities (UTIL) is not relevant to the report and will therefore not be described.
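The RAM mirror handled by FILE can be illustrated with a minimal Python sketch. The class `RamMirror`, its method names and the dictionary standing in for EEPROM are assumptions made for illustration only; they are not taken from the EMS software.

```python
class RamMirror:
    """Toy model of a RAM mirror in front of a slow non-volatile store."""

    def __init__(self, eeprom):
        self.eeprom = eeprom      # dict standing in for EEPROM contents
        self.ram = {}             # fast working copy used during execution
        self.dirty = set()        # keys changed since the last write-back

    def load(self):
        """Start-up: copy the non-volatile data into RAM."""
        self.ram = dict(self.eeprom)
        self.dirty.clear()

    def read(self, key):
        return self.ram[key]

    def write(self, key, value):
        """Run time: only the RAM copy is updated."""
        self.ram[key] = value
        self.dirty.add(key)

    def flush(self):
        """Shutdown: write changed values back and return how many were written."""
        written = len(self.dirty)
        for key in sorted(self.dirty):
            self.eeprom[key] = self.ram[key]
        self.dirty.clear()
        return written


# Hypothetical data items
eeprom = {"idle_speed_adaptation": 12, "engine_hours": 1500}
mirror = RamMirror(eeprom)
mirror.load()
mirror.write("engine_hours", 1501)
stale_before_flush = eeprom["engine_hours"] != mirror.read("engine_hours")
mirror.flush()
```

Until `flush` has completed, the non-volatile copy is stale, which is one way to see why an interrupted shutdown can lose or corrupt data.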



2.3 GMS - Gearbox Management System

GMS is a control unit that controls a special type of gearbox found in Scania trucks, called Opticruise, as well as an auxiliary braking system named Retarder. There are several different types of gearboxes in Scania road vehicles on the market today (2011). These are:

• Manual gearbox

• Automatic gearbox, developed by external suppliers

• Opticruise, an older version with a clutch pedal

• Opticruise, the new version without a clutch pedal

• Comfort shift

The manual gearbox is purely mechanical, which means that the driver manually controls the shifting of gears by moving the shift stick into the right position. The automatic gearbox obviously needs some sort of controller, but since it is developed outside Scania, it is outside the scope of this description. Opticruise and Comfort Shift, on the other hand, are developed by Scania and are described in greater detail below.

Scania Opticruise and Scania Retarder are controlled by a control unit named OPC (a logical controller within the physical GMS unit), which currently (2011) is at version OPC5.

2.3.1 Scania Opticruise

Scania Opticruise is a so-called Automated Manual Transmission. The gearbox itself is an ordinary manual gearbox in which gear shifting is controlled by electric or electro-hydraulic actuators instead of by the driver. The clutch is used by the driver only to get the vehicle rolling from standstill. A gear change is done by sending a request to the EMS for an engine speed that matches the wheels’ speed and engaging the gear without ever opening the clutch. In the second-generation Opticruise there is no clutch pedal at all, meaning that even take-off from standstill is controlled automatically. This is done by electro-hydraulic clutch actuators.
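The speed matching behind a clutchless gear change can be illustrated with a small calculation. This is only a sketch of the general principle: the function name and all numbers are hypothetical, and the gear ratio is assumed to be defined as gearbox input speed divided by output speed.

```python
def synchronous_engine_speed(output_shaft_rpm, gear_ratio):
    """Engine speed (rpm) at which a gear can engage without slip.

    gear_ratio is input speed divided by output speed for that gear.
    """
    return output_shaft_rpm * gear_ratio


# Hypothetical upshift at a constant output shaft speed
output_shaft_rpm = 1000.0
current_ratio, target_ratio = 1.5, 1.25
engine_rpm_before = synchronous_engine_speed(output_shaft_rpm, current_ratio)
engine_rpm_after = synchronous_engine_speed(output_shaft_rpm, target_ratio)
requested_change = engine_rpm_after - engine_rpm_before
```

With these numbers the engine speed has to drop by 250 rpm before the higher gear can be engaged, which is the kind of request the gearbox controller would send to the EMS.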

2.3.2 Scania Comfort Shift

Scania Comfort Shift is a gear changing system used on buses. Instead of a mechanical link between the gear lever and the gearbox, an electrical actuator is used to switch gears. A gear shifting scenario looks as follows: the driver requests a new gear by moving the gearshift lever (which is not mechanically linked to the gearbox) into position without opening the clutch; the controller registers the request and, as soon as the clutch pedal is pressed down, controls the gear shifting actuator to switch gears.
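The scenario above amounts to a small piece of state: a pending request that is executed when the clutch pedal is pressed. A minimal sketch, assuming a hypothetical `ComfortShiftController` class that is not part of any Scania software:

```python
class ComfortShiftController:
    """Toy model: a requested gear is engaged only once the clutch is pressed."""

    def __init__(self, gear=1):
        self.gear = gear          # gear currently engaged
        self.requested = None     # gear requested by the lever, if any

    def lever_moved(self, gear):
        """The driver moves the lever: the request is only registered."""
        self.requested = gear

    def clutch_pedal(self, pressed):
        """When the pedal is pressed, a pending request is executed."""
        if pressed and self.requested is not None:
            self.gear = self.requested   # stands in for driving the actuator
            self.requested = None


ctrl = ComfortShiftController(gear=2)
ctrl.lever_moved(3)
gear_before_clutch = ctrl.gear     # still 2: the clutch has not been pressed
ctrl.clutch_pedal(pressed=True)
gear_after_clutch = ctrl.gear      # 3: the pending request was executed
```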


2.3.3 Scania Retarder

A retarder is an auxiliary braking system that assists the vehicle’s ordinary brakes when moving downhill in order to decrease their wear and tear. The retarder control unit (RET) is a so-called logical node that is physically present in the same box as Opticruise (OPC), i.e. it uses the same embedded controller hardware as OPC, namely the GMS.

2.4 EEC - Exhaust Emission Control

Figure 2.5. An illustration of the increasingly restrictive emission limits for heavy road vehicles in the EU.

Legislative requirements on exhaust emissions from diesel engines in heavy vehicles are becoming more and more restrictive in Europe. Naturally this increases the demands on exhaust gas cleaning mechanisms (this process is known as after-treatment). As this report is written, the legislation known as Euro5 is in force. It specifies limits on four different kinds of toxic gases and particles allowed in vehicle exhaust: Carbon Monoxide (CO), Nitrogen Oxides (NOx), Hydrocarbons (HC) and Particulate Matter (PM), also known as soot. The limits for heavy vehicles are defined as exhaust mass per unit of energy output (g/kWh), unlike for passenger cars where they are defined as exhaust mass per distance driven (g/km). In January 2013 Euro6 will come into force, which heavily restricts the allowed amount of NOx in the exhaust [WPEMSTD].

A cleaning mechanism named Selective Catalytic Reduction (SCR) is used in Scania vehicles to reduce NOx emissions. It makes use of a liquid called AdBlue, which contains urea and is injected into the exhaust system of the vehicle. There the urea forms ammonia, which reacts with the NOx over a catalyst so that the NOx is converted into nitrogen and water.

A separate ECU called Exhaust Emission Control system (EEC) is responsible for controlling this process. The EEC is located on a CAN sub-bus to the EMS, meaning that it can communicate directly only with the EMS. Communication with other ECUs on the red CAN bus must go through the EMS (see fig. 2.2).

Figure 2.6. An illustration of the SCR process. The SCR unit is controlled by EEC3.

2.5 OBD - On Board Diagnostics

Figure 2.7. A VCI unit that is used to connect a computer to the diagnostic porton a Scania vehicle

OBD is present on every modern vehicle; it has been a legal requirement for diesel vehicles sold in the European Union since 2004 [WPCAN]. OBD is a generic term referring to a vehicle’s self-diagnostic and reporting capability [OBD]. OBD-II is a connection interface standard for a 16-pin connector that enables an external device (typically a laptop) to be connected to the vehicle’s control system network. This enables workshops to perform diagnosis on the vehicle’s electrical/electronic system. It also allows programming of the vehicle’s ECUs, i.e. storing configuration parameters in the ECU memory.

Each ECU in a vehicle performs a number of diagnostic tests on the sensors and actuators connected to it as well as on the CAN communication channels with other ECUs. When these tests indicate that, for example, a component is faulty, does not send sensor data at all or sends an implausible value, a Diagnostic Trouble Code (DTC) is built. The DTC is stored in the ECU’s non-volatile memory and can be read using a diagnostic tool connected to the OBD port. The DTCs are primarily intended to enable effective diagnosis in a workshop as well as functional degradation of the vehicle.
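A diagnostic test of this kind can be sketched as a plausibility check that records a code when it fails. The class `SensorDiagnosis`, the DTC names and the limits below are hypothetical, and a Python set stands in for the ECU’s non-volatile memory.

```python
from dataclasses import dataclass, field


@dataclass
class SensorDiagnosis:
    """Toy plausibility check that records DTCs in a set standing in for NVM."""
    low: float
    high: float
    dtc_missing: str = "DTC_SENSOR_NO_DATA"          # hypothetical code names
    dtc_implausible: str = "DTC_SENSOR_IMPLAUSIBLE"
    stored_dtcs: set = field(default_factory=set)

    def check(self, value):
        """Return True if the value is usable; otherwise store a DTC."""
        if value is None:                            # sensor sent no data
            self.stored_dtcs.add(self.dtc_missing)
            return False
        if not (self.low <= value <= self.high):     # value outside plausible range
            self.stored_dtcs.add(self.dtc_implausible)
            return False
        return True


# Hypothetical coolant temperature sensor with a plausible range in degrees C
coolant = SensorDiagnosis(low=-40.0, high=150.0)
results = [coolant.check(v) for v in (85.0, None, 300.0)]
```

A caller could use the boolean result to switch to a substitute value, which corresponds to the functional degradation mentioned above.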

2.5.1 KWP - Keyword Protocol

Keyword Protocol (KWP), or KWP2000, is a protocol for communication between a diagnostic device and a vehicle, standardized as ISO 14230. In modern Scania vehicles the communication between the diagnostic device (often a PC) and the vehicle is done over the CAN network. KWP2000 makes it possible to send a diagnostic command to the vehicle’s ECU system and receive information from it. A command may for instance be a request for DTCs, or a request for an ECU to reset itself. The diagnostic device sends a parameter identifier (PID), which is an integer of one or more bytes, to the diagnostic server in the ECU and receives a number of bytes in response. Two PC software tools are used to communicate with vehicles by KWP2000: Scania Diagnos and Programmer 3 (SDP3), which is intended for workshops that service Scania vehicles, and XCOM, which is intended for development purposes only and is used internally at NE.
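A request and its response can be sketched as byte strings. The sketch below assumes the KWP2000 conventions that a positive response echoes the request’s service identifier plus 0x40 and that a negative response starts with 0x7F. These conventions, the service identifier 0x18 for reading DTCs, the simplified payload layout and the class `ToyDiagnosticServer` are illustrative assumptions that should be checked against ISO 14230; the transport over CAN is left out entirely.

```python
POSITIVE_OFFSET = 0x40        # assumed: positive response SID = request SID + 0x40
NEGATIVE_RESPONSE = 0x7F      # assumed: first byte of a negative response
SERVICE_NOT_SUPPORTED = 0x11  # assumed response code


class ToyDiagnosticServer:
    """Hypothetical stand-in for the diagnostic server inside an ECU."""

    def __init__(self, stored_dtcs):
        self.stored_dtcs = stored_dtcs            # list of 16-bit DTC numbers
        self.handlers = {0x18: self.read_dtcs}    # assumed SID for reading DTCs

    def read_dtcs(self, params):
        # Simplified payload: DTC count followed by two bytes per DTC
        payload = bytearray([len(self.stored_dtcs)])
        for dtc in self.stored_dtcs:
            payload += dtc.to_bytes(2, "big")
        return bytes(payload)

    def handle(self, request):
        sid, params = request[0], request[1:]
        handler = self.handlers.get(sid)
        if handler is None:
            return bytes([NEGATIVE_RESPONSE, sid, SERVICE_NOT_SUPPORTED])
        return bytes([sid + POSITIVE_OFFSET]) + handler(params)


ecu = ToyDiagnosticServer(stored_dtcs=[0x1234])
accepted = ecu.handle(bytes([0x18, 0x00]))   # request for stored DTCs
rejected = ecu.handle(bytes([0x99]))         # service the toy server lacks
```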

2.6 The problem

Whenever the driver of a Scania vehicle turns the ignition key on, a signal called U15 is sent to all ECUs in the system. It is a wake-up signal for the ECUs, meaning that for the vehicle to be fully operational, an ECU must, within a limited amount of time, power itself on, start the boot-loader, read the software that is stored in the EEPROM2 (also called E2) and be ready to send and receive messages to/from other ECUs on CAN. Otherwise, other ECUs may set DTCs indicating communication errors with the ECU that did not wake up in time. This may lead to problems both inside the ECU and in the communication between several ECUs. For instance, the engine cannot start without a properly operating EMS. It is therefore important that all ECUs comply with the requirements on maximum start-up time and establish proper CAN communication.

Many things can go wrong during the start-up process. An ECU that does not wake up properly within the required time limit will make other ECUs degrade the signal statuses of the signals contained in the CAN messages expected from the failing ECU. During cranking (starter motor operation) the starter motor draws a large current, which causes a temporary drop in supply voltage that may make some ECUs shut down or, even worse, enter an undefined state.

When the ignition key is switched to the OFF position, the powertrain ECUs are expected to perform a well-defined and controlled shutdown, which includes saving operational data, adaptations, diagnostic information and similar data to non-volatile memory (NVM).

The problem is to investigate which areas are the most problematic during start-up and shutdown, and how test cases can be developed to perform a system test of the powertrain ECUs developed at Scania with the focus on correct start-up and shutdown.

2 EEPROM = Electrically Erasable and Programmable Read Only Memory
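The receiver-side behaviour described above, degrading a signal status and setting a DTC when an expected message is late, can be sketched as follows. The class `TimeoutMonitor`, the status strings, the DTC name and all time limits are hypothetical and only illustrate the principle.

```python
class TimeoutMonitor:
    """Toy receiver-side supervision of one periodic CAN message."""

    def __init__(self, startup_grace_ms, timeout_ms):
        self.startup_grace_ms = startup_grace_ms  # allowed sender start-up time
        self.timeout_ms = timeout_ms              # max gap between two messages
        self.last_rx_ms = None
        self.signal_status = "not_available"
        self.dtcs = set()

    def on_message(self, now_ms):
        """Called when the supervised message is received."""
        self.last_rx_ms = now_ms
        self.signal_status = "ok"

    def poll(self, now_ms):
        """Called cyclically; now_ms is the time since ignition on (U15 high)."""
        if self.last_rx_ms is None:
            late = now_ms > self.startup_grace_ms
        else:
            late = now_ms - self.last_rx_ms > self.timeout_ms
        if late:
            self.signal_status = "error"
            self.dtcs.add("CAN_TIMEOUT")          # hypothetical DTC name
        return self.signal_status


# Sender that never wakes up: the receiver degrades the status and sets a DTC
silent = TimeoutMonitor(startup_grace_ms=1000, timeout_ms=100)
status_silent = silent.poll(1500)

# Sender that starts in time and keeps transmitting
healthy = TimeoutMonitor(startup_grace_ms=1000, timeout_ms=100)
healthy.on_message(600)
status_healthy = healthy.poll(650)
```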

2.7 Test platforms

2.7.1 Automated Testing in Simulator Rigs

At Scania, automated testing of the ECUs at different levels has been performed for some time. This involves system tests on single ECUs, integration tests on a few ECUs and integration tests on all the ECUs that would normally be present in a vehicle. These tests are performed in so-called HIL3 rigs. The function of a HIL rig is to simulate the environment of an ECU in such a way that the ECU behaves as if it were inside a real vehicle. The real vehicle is simulated by mathematical models, and the ECU communicates with the model through several types of signal converters and processors, which convert its signals into data for the model. The model is controlled by a computer, which allows different environments to be chosen for simulation. The rigs allow manipulation of CAN messages as well as injection of different electrical failures or interruptions. There is a framework (based on Python) that allows Python scripts to be created to perform different test actions automatically. There are currently three rigs in the powertrain control lab: one that contains only the EMS, another that contains solely the GMS and a third which has EMS, GMS, EEC, COO and ICL.
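To give an idea of what an automated test action might look like, the following sketch measures a start-up time against a simulated rig. The class `SimulatedRig` and the function `measure_startup_ms` are invented for this illustration and do not reflect the API of the framework used at Scania; the rig here is a pure software model with a fixed boot time.

```python
class SimulatedRig:
    """Hypothetical stand-in for a HIL rig; not the Scania framework API."""

    def __init__(self, ecu_boot_ms):
        self.ecu_boot_ms = ecu_boot_ms   # time the simulated ECU needs to boot
        self.time_ms = 0
        self.ignition_on_at = None

    def set_ignition(self, on):
        self.ignition_on_at = self.time_ms if on else None

    def step(self, ms):
        """Advance the simulated time."""
        self.time_ms += ms

    def ecu_is_transmitting(self):
        if self.ignition_on_at is None:
            return False
        return self.time_ms - self.ignition_on_at >= self.ecu_boot_ms


def measure_startup_ms(rig, limit_ms, step_ms=10):
    """Test flow: ignition on, then wait for the first CAN message."""
    rig.set_ignition(True)
    start = rig.time_ms
    while rig.time_ms - start <= limit_ms:
        if rig.ecu_is_transmitting():
            return rig.time_ms - start
        rig.step(step_ms)
    return None   # the ECU did not start within the limit


fast = measure_startup_ms(SimulatedRig(ecu_boot_ms=300), limit_ms=500)
slow = measure_startup_ms(SimulatedRig(ecu_boot_ms=800), limit_ms=500)
```

A test script would then compare the measured value with the required maximum start-up time and report a failure when `None` is returned.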

2.7.2 Manual Testing in a Vehicle

Since the models that simulate the environment of the ECUs are and always will be approximations of the real world, many tests in which a true real-world operating environment is significant are performed in a real vehicle in its real working conditions. The interface of the tested ECU(s) in the test vehicle is connected to CAN and the electrical components through a break-out box. The break-out box allows each contact at the ECU interface to be interrupted in order to measure each signal that goes in or out of the ECU. One can for instance measure the current in the voltage supply line (U30) in order to determine whether the ECU is asleep or awake. Manipulation of signals is also possible through the break-out box.

The COO ECU provides a diagnostic port for connecting a computer to the vehicle and performing operations like DTC reading and programming of some parameters. It is also possible to connect a PC to the CCP4 port, allowing real-time logging of internal variables in the ECU. Logging of signals sent on CAN is also possible by connecting a listening device to the CAN_L and CAN_H contacts on the break-out box and using appropriate PC software. The one used at NE is XCom, an internally developed application used for diagnosis and programming of parameters by the KWP2000 protocol (see 2.5.1). Logging and manipulation of ECU internal variables and CAN signals is done with the help of ATI Vision (Accurate Technologies Inc). Since internal variables are read by reading specific memory addresses in the ECU, a database which contains a mapping between variable names and memory addresses is loaded into Vision.

3 HIL = Hardware In the Loop

4 CCP = CAN Calibration Protocol
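The mapping between variable names and memory addresses mentioned above can be illustrated with a short sketch. The variable names, addresses, data types and memory contents are invented for this example, and the code does not represent the database format used by ATI Vision.

```python
import struct

# Hypothetical extract of a variable database: name -> (address, struct format)
VARIABLE_MAP = {
    "engine_speed_rpm": (0x1000, "<H"),   # unsigned 16-bit, little-endian
    "coolant_temp_c":   (0x1002, "<b"),   # signed 8-bit
}


def read_variable(memory, base_address, name):
    """Decode one named variable from a raw memory image."""
    address, fmt = VARIABLE_MAP[name]
    offset = address - base_address
    (value,) = struct.unpack_from(fmt, memory, offset)
    return value


# Memory image starting at 0x1000: 1200 rpm (0x04B0) and -5 degrees C (0xFB)
image = bytes([0xB0, 0x04, 0xFB])
rpm = read_variable(image, 0x1000, "engine_speed_rpm")
temp = read_variable(image, 0x1000, "coolant_temp_c")
```

The same lookup explains why the database must match the software version in the ECU: if the addresses shift between builds, the decoded values become meaningless.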


Chapter 3

Theory

With separate control units controlling different functionality, forming a distributed controller network in which some nodes control vital and safety-critical functions of a heavy vehicle, there are hard requirements on the communication network that these ECUs use to communicate with each other. CAN is used for ECU-to-ECU communication in vehicles produced by Scania today.

3.1 CAN - Controller Area Network

In the beginning of the 1990s the demands for comfort in cars increased. Electric window lifts, electric seat adjustment, electric rear-view mirrors, climate control, audio systems, navigation, etc., appeared on the market. The international requirements regarding safety and environment also increased, and the vehicles became more fuel-efficient and environmentally friendly. Safety functions like ABS brakes and the immobilizer arrived, as well as more efficient automatic gear changing mechanisms.

The increased demands in the automotive industry drove the development of a communication bus system adapted to embedded microcomputer systems that would fulfill high transmission rate demands, have good real-time properties and be robust and cheap. By the beginning of the 1990s many vehicle manufacturers had developed their own bus concepts, and it became difficult for the suppliers to support the different systems. Each large manufacturer attempted to make its own solution an international standard. A couple of these bus system solutions continued to be used in the mid-90s. One of them was CAN. CAN (Controller Area Network) is a standard developed by Bosch in the 1980s in order to fulfill the increasing demands of the European automotive industry. It was later also accepted by the American automotive industry due to its successful use in Europe. CAN was officially released in 1986 at the congress of the SAE (Society of Automotive Engineers). The CAN data link layer and some aspects of the physical layer are standardized in ISO 11898.

In figure 2.2 we can see a typical ECU network in a Scania vehicle. There are three buses: red for vital and safety-critical ECUs, yellow for less critical but still important ECUs, and finally green for comfort function controllers. These three buses are interconnected through a coordinator ECU (COO). The coordinator acts as a gateway for messages that need to travel between ECUs on different buses, since the buses typically operate at different bit rates. Currently (year 2011), the red bus is operated at 250 kbit/s, as are the yellow and the green buses.

3.2 Theory behind CAN

CAN is a multi-master broadcast network for connecting ECUs into a distributed controller network [CANEMB]. By multi-master we mean that there is no master node; communication is carried out between the ECUs directly. Broadcast means that each node sends its messages on the whole network and each node is able to listen to every message on the bus. In other words, there is no way to send a message between node A and node B without all other nodes on the bus knowing it.

3.2.1 Real-time System

The real-time system concept is often misunderstood as describing a system that needs to be fast. In fact, real-time has nothing to do with speed as such. Although it is often desirable for a system to perform fast, speed is not a requirement for a system to be real-time. How fast a system should react to changes is defined by the dynamics of the controlled environment. [RTOS, p.1] defines a real-time system as a system in which performance depends not only on the correctness of the single controller actions but also on the time at which the actions are produced. The main characteristic of a real-time system is that (in case it is a controller) it should, given an input signal, finish the calculation of the output signal within a deadline, i.e. the maximum time allowed for a computational process to finish its execution. Real-time tasks can be divided into hard and soft ones. A missed deadline in a hard real-time task not only results in system malfunction but can be directly dangerous. In a soft real-time task a certain rate of missed deadlines can be tolerated without more severe effects than degraded performance. A system that is able to run hard real-time tasks is called a hard real-time operating system.
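The difference between being fast and meeting deadlines can be illustrated with a small sketch. The function names, the response times and the 100 ms deadline below are made up for illustration and do not come from any Scania system.

```python
def missed_deadlines(response_times_ms, deadline_ms):
    """Return the response times that exceeded the deadline."""
    return [t for t in response_times_ms if t > deadline_ms]


def meets_hard_deadline(response_times_ms, deadline_ms):
    """Hard real-time: not a single deadline may be missed."""
    return not missed_deadlines(response_times_ms, deadline_ms)


def meets_soft_deadline(response_times_ms, deadline_ms, max_miss_rate):
    """Soft real-time: a bounded rate of missed deadlines is tolerated."""
    if not response_times_ms:
        return True
    misses = len(missed_deadlines(response_times_ms, deadline_ms))
    return misses / len(response_times_ms) <= max_miss_rate


# Hypothetical measurements: a slow but predictable task ...
slow_task = [80, 85, 90, 82, 88]
# ... and a task that is much faster on average but misses once.
fast_task = [5, 4, 6, 5, 120]
```

With a deadline of 100 ms, `slow_task` satisfies the hard requirement although it is the slower of the two, while `fast_task` only satisfies a soft requirement that tolerates one miss in five.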

One can find many examples of both hard and soft real-time applications in the automotive industry. So-called drive-by-wire systems, which have some use in trucks and buses, use communication networks (like CAN) to control the throttle. A late response of such a system may result in an uncontrollably accelerating vehicle (at least until the driver reacts and activates the brakes), which can be directly dangerous.

CAN is a system that clearly is hard real-time. A late response of a CAN controller (the unit within an ECU that is responsible for physical layer CAN communication) can be damaging to a vehicle.

3.2.2 Differential bus

Physically, a CAN transmission channel consists of two electrical wires which we denote CAN High (CAN_H) and CAN Low (CAN_L). CAN is a differential bus, meaning that the difference in voltage between the two lines gives either a dominant (logical 0) or a recessive (logical 1) bit. In an arbitration situation (described below) the dominant bit wins the arbitration, and the other nodes transmitting simultaneously stop transmitting, allowing the node which sent the dominant bit to continue its transmission.

Figure 3.1. CAN differential bus. In a differential bus there are two states: dominant (logical 0) and recessive (logical 1). The states are based on the voltage difference between the CAN_L and CAN_H lines. As can be seen in table 3.1, the recessive state is given by CAN_L and CAN_H at the same voltage level, and the dominant state otherwise. The sequence given in the figure is 0101.

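| Line | Recessive min | Recessive nominal | Recessive max | Dominant min | Dominant nominal | Dominant max | Unit |
|---|---|---|---|---|---|---|---|
| CAN_H | 2.0 | 2.5 | 3.0 | 2.75 | 3.5 | 4.5 | Volt |
| CAN_L | 2.0 | 2.5 | 3.0 | 0.5 | 1.5 | 2.25 | Volt |

Table 3.1. Voltage levels of the differential bus that give the recessive and the dominant logical state, respectively, on the bus during a transmission.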

A dominant bit “wins” over a recessive bit if a conflict between two simultaneous transmissions arises. See 3.2.5.
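As a minimal sketch of how the two line voltages map to a logical bit, the nominal levels of table 3.1 can be decoded as follows. The decision threshold of 1.0 V is an illustrative assumption (halfway between the nominal differential voltages of 0 V and 2.0 V) and is not taken from the CAN standard or from any transceiver data sheet.

```python
# Illustrative threshold (assumption): halfway between the nominal
# differential voltages of the recessive (0.0 V) and dominant (2.0 V) state.
THRESHOLD_V = 1.0


def bus_bit(can_h, can_l):
    """Return 0 (dominant) or 1 (recessive) for one pair of line voltages."""
    return 0 if (can_h - can_l) > THRESHOLD_V else 1


def decode(samples):
    """Decode a list of (CAN_H, CAN_L) voltage pairs into a bit string."""
    return "".join(str(bus_bit(h, l)) for h, l in samples)


# Nominal levels from table 3.1, alternating dominant and recessive,
# which gives the sequence shown in figure 3.1.
samples = [(3.5, 1.5), (2.5, 2.5), (3.5, 1.5), (2.5, 2.5)]
```

Calling `decode(samples)` on these nominal levels yields the sequence 0101.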

3.2.3 Data transmission

[Figure: bit layout of the CAN extended data frame (CAN Extended Frame Format). The fields and their lengths are listed in the table below. The arbitration field is 32 bits and the control field is 6 bits. Bit stuffing is applied up to and including the CRC, and the remaining bits are sent without bit stuffing. Maximum frame length with bit stuffing = 150 bits.]

Figure 3.2. CAN data frame

There are four types of frames on CAN:

Data frame

A frame that contains ordinary data of up to 8 bytes in length. This type of frame is referred to later in this report as a CAN message. The format of this type of frame is given below:

| Field | Length in bits | Meaning |
|---|---|---|
| SOF | 1 | Start of Frame |
| Identifier | 11 | Identifier of the message. Contains priority information. |
| SRR | 1 | Substitute Remote Request (must be recessive) |
| IDE | 1 | Recessive (1) in an extended frame, dominant (0) in a standard frame. Makes sure a standard frame gets higher priority than an extended one. |
| Identifier Ext | 18 | Extended part of the identifier |
| RTR | 1 | Should be dominant (0) |
| Reserved bits | 2 | Always dominant (0) |
| DLC | 4 | Data length in bytes, 0-8 |
| Data Field | 0 - 64 | Data to transmit |
| CRC | 15 | Cyclic Redundancy Check. Used for message integrity check. |
| CRC delimiter | 1 | Always recessive (1) |
| ACK bit | 1 | Sent as recessive (1), can be set dominant (0) by receivers |
| ACK delimiter | 1 | Always recessive (1) |
| End of Frame | 7 | Always recessive (1) |

Remote Frame

CAN messages may be either periodic or sent only on request. In the latter case a request must be made by the node that desires to receive the information. The requesting node then sends a remote frame, which looks like a data frame but differs from it in two ways: the RTR bit is recessive in a remote frame, and a remote frame has no data field. Because the RTR bit is recessive in a remote frame, a remote frame that is sent simultaneously with a data frame loses the arbitration, so that the data frame is transmitted first on the bus and the remote frame must be resent.

Error Frame

A node that detects a fault may send a message that violates the bit stuffing rules. By sending 6 bits of the same polarity, either dominant or recessive, a node notifies the other nodes on the network that it has discovered the fault. The other nodes will then transmit error frames as well. An error frame consists of 6 consecutive bits, either all dominant or all recessive, followed by 8 recessive bits that form an error delimiter. Six dominant bits indicate an active error flag, which is sent by a node that detects an error, while 6 recessive bits are sent by a node that detects an active error flag on the bus.

Overload Frame

An overload frame is a way for a receiving node to indicate to the sending node that it is busy at the moment. The overall layout is very similar to that of an error frame.

Figure 3.3. A data field of a CAN message containing 64 bits of data. In this example it consists of 5 signals, each using 12 bits. We see bytes on the vertical axis and bits on the horizontal axis. Bits are given from right to left. A part of each signal name is hidden since the names have information class ’Internal’. We see five different temperature signals in the depicted CAN message.

A CAN message contains up to 8 bytes (64 bits) of data, which contain one or more signals. A signal can range from one bit to 64 bits in size, and different signals within the same message can have different lengths in bits. In fig. 3.3 we see what a layout of signals within a message can look like. In this particular case all signals have the same length. The layout of each CAN message must be known by both the sending and the receiving node on the network.
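A layout like the one in fig. 3.3, with five 12-bit signals in a 64-bit data field, could be packed and unpacked as in the sketch below. The bit positions and the byte order are assumptions made for illustration (signal i occupies bits 12·i to 12·i + 11, counted from the least significant bit of the first byte). The real layout is defined by the message specification, and the function names are hypothetical.

```python
SIGNAL_BITS = 12   # width of each signal, as in fig. 3.3
SIGNAL_COUNT = 5   # number of signals in the message


def pack_signals(values):
    """Pack five 12-bit raw signal values into an 8-byte data field."""
    if len(values) != SIGNAL_COUNT:
        raise ValueError("expected %d signals" % SIGNAL_COUNT)
    word = 0
    for index, value in enumerate(values):
        if not 0 <= value < (1 << SIGNAL_BITS):
            raise ValueError("signal value does not fit in 12 bits")
        word |= value << (index * SIGNAL_BITS)
    # Assumed byte order: least significant byte first.
    return word.to_bytes(8, "little")


def unpack_signals(data):
    """Extract the five 12-bit raw signal values from an 8-byte data field."""
    if len(data) != 8:
        raise ValueError("expected 8 data bytes")
    word = int.from_bytes(data, "little")
    mask = (1 << SIGNAL_BITS) - 1
    return [(word >> (index * SIGNAL_BITS)) & mask
            for index in range(SIGNAL_COUNT)]
```

Both sender and receiver must use the same layout: `unpack_signals(pack_signals(values))` returns the original values only because the two functions agree on positions and widths. The four bits left over (64 - 5·12) stay unused in this sketch.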

3.2.4 Bit stuffing

Bit stuffing makes sure that normally no more than five consecutive bits of the same logical value are sent over the bus. Six consecutive bits of the same polarity are in fact used to indicate an error. A bit of opposite polarity is inserted after each occurrence of five consecutive bits of the same polarity. At the receiver side the opposite is done: if a sequence of five bits of the same value has occurred, the next bit is removed. Here is an example:

Original frame: 00000 10101 11110 10000 11110 01111 1...

After bit stuffing: 00000 [1]1010 11111 [0]0100 00111 10011 111[0]... (bracketed bits are the inserted stuffing bits)

The receiver removes the stuffed bits: 00000 [1]1010 11111 [0]0100 00111 10011 111[0]... (bracketed bits are discarded)

Received bits: 00000 10101 11110 10000 11110 01111 1...

The last ten bits of a frame in fig. 3.2, i.e. the CRC delimiter, the ACK bits and the End-of-Frame bits, are not subjected to bit stuffing. Neither are the three inter-frame bits. Bit stuffing implies that there may be more bits to send over the bus than the frame contains. In the worst case, the number of stuff bits added is given by

$$n_s(n) = \left\lfloor \frac{n - 1}{4} \right\rfloor$$

where n is the number of bits in the stream that is subject to bit stuffing and ⌊x⌋ means the floor of x (the largest integer smaller than or equal to x). [CANTIMING, eq. 13.1]
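The stuffing rule and the worst-case bound can be checked with a short sketch. Bits are represented as strings of '0' and '1', and the function names are chosen for illustration only. Inserted stuff bits count towards the next run of equal bits, which is what makes the worst case one stuff bit per four payload bits.

```python
def stuff(bits):
    """Insert a bit of opposite polarity after five equal consecutive bits."""
    out = []
    run_bit, run_len = None, 0
    for b in bits:
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuffed = "1" if b == "0" else "0"
            out.append(stuffed)
            run_bit, run_len = stuffed, 1   # the stuff bit starts a new run
    return "".join(out)


def destuff(bits):
    """Remove the bit that follows five equal consecutive bits."""
    out = []
    run_bit, run_len = None, 0
    expect_stuff_bit = False
    for b in bits:
        if expect_stuff_bit:
            if b == run_bit:
                raise ValueError("six equal bits in a row: stuffing rule violated")
            run_bit, run_len = b, 1         # drop the bit, but it starts a new run
            expect_stuff_bit = False
            continue
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        expect_stuff_bit = run_len == 5
    return "".join(out)


def worst_case_stuff_bits(n):
    """Worst-case number of stuff bits for n >= 1 stuffed bits: floor((n - 1) / 4)."""
    return (n - 1) // 4


def worst_case_pattern(n):
    """A bit pattern of length n that triggers the worst case."""
    return ("00000" + "11110000" * (n // 8 + 1))[:n]
```

Applied to the example above, `stuff("0000010101111101000011110011111")` gives the 34-bit stuffed sequence shown in the text, and `destuff` recovers the original 31 bits.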

3.2.5 Bit arbitration

Bit arbitration is a mechanism that prevents two nodes from transmitting on the bus simultaneously. When a node is about to send a message on CAN, a check is performed to determine whether the bus is idle. If it is not, the node waits until it is. If two nodes start to transmit a data frame simultaneously, the one with the highest priority “wins” the arbitration because its dominant (0) priority bit wins over the recessive bit of a message with lower priority. This means that the node sending the lower-priority message stops transmitting and waits for the bus to become free before retransmitting.
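The arbitration can be sketched as a bit-by-bit comparison in which the bus level is dominant (0) as soon as at least one node sends a dominant bit. The sketch below assumes 11-bit identifiers that are unique per message and are transmitted most significant bit first. The function name and the example identifiers are hypothetical.

```python
def arbitrate(identifiers, id_bits=11):
    """Return the identifier that wins bitwise arbitration on the bus."""
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one identifier is required")
    if any(not 0 <= ident < (1 << id_bits) for ident in contenders):
        raise ValueError("identifier does not fit in %d bits" % id_bits)
    for position in range(id_bits - 1, -1, -1):      # most significant bit first
        sent = {ident: (ident >> position) & 1 for ident in contenders}
        bus_level = min(sent.values())               # a dominant 0 overrides a recessive 1
        # Nodes that sent recessive but read dominant back off.
        contenders = {ident for ident, bit in sent.items() if bit == bus_level}
    (winner,) = contenders
    return winner
```

For example, `arbitrate([0x1A0, 0x0F3, 0x7FF])` returns `0x0F3`. Under these assumptions the lowest identifier value always wins, which is why a lower identifier means a higher priority.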

3.3 Software Testing

The software testing process may be coarsely divided into three levels which are

• Module/Unit testing

• Integration Testing

• System Testing

Of course, depending on the application, other levels and sub-levels than those mentioned above are possible, as shown in fig. 3.4.

3.3.1 Module Testing (also called Unit Testing)

Module testing is performed in order to verify that a single unit within a software system is functioning as intended. A unit (or module) is typically a class in object-oriented programming or a source code file (a C module, for instance) in procedure-oriented programming. The unit is isolated from the rest of the system and tested independently, often in a separate testing framework [AST].

3.3.2 Function Testing

Function testing is the testing of what may be a combination of modules controlling a specific function, such as engine torque limitation in a truck engine.

3.3.3 Integration testing

Successful and thorough unit testing makes it plausible that a unit within a software system is functioning as intended on its own, but there is no guarantee that the unit interacts with other units/modules as intended. Integration testing is about verifying that units tested separately interact with each other as expected. The method is typically to take a few units and run some tests, then add more units and run some tests again, gradually increasing the number of units. Any errors revealed at this stage are most likely related to the units’ interfaces, since the units have been tested successfully on their own.

3.3.4 System testing

System testing is about testing the whole system against the requirement specifications. One wants to verify that the system and all of its components function as intended in the regular working environment.

3.3.5 Acceptance Testing

Acceptance testing is usually performed by actual users of a software product (so-called beta testing) before a final release in order to test the product in its real operating environment and possibly discover bugs that were not discovered in prior testing phases.

3.4 Other Types of Testing

3.4.1 Regression testing

Regression testing is a type of testing that is performed after changes in the code, with the focus on revealing bugs that have previously been fixed but may have reappeared after the change. It looks to reveal software regressions, meaning that previously correctly working functionality stops working, hence the name. Regressions usually occur when newly added code, whether it adds functionality or fixes a bug, introduces a new bug or makes a fixed one reappear [MSDNRT]. The method is to re-run previously successful tests and check that no bugs appear or reappear. This type of test can be performed on almost all test levels, from module to system integration.

Regarding the relation of regression testing to the system test of powertrain ECUs, it can be mentioned that system test is a very large area that involves very advanced hardware and software. As regression tests are performed over and over again, it is not possible to do all of this manually. Manual tests are performed on a live vehicle. The cost in terms of both man-hours and equipment (vehicle) utilization would increase massively if this type of testing were performed manually. The need for test automation is therefore significant here, and so-called Hardware-In-the-Loop (HIL) rigs are used for this purpose. Their working principle is to simulate the environment of the ECU to make it believe that it is actually inside a real vehicle. For example, the EMS located inside the HIL rig is made to act exactly as it would if it were controlling a real engine. The dynamics of the engine are simulated by computer models based on differential equations.

3.5 Software Development and Testing Procedures

3.5.1 The V-model

One can summarize the way of software development from an initial idea to a complete product using a so-called V-model. When the project starts, one is at the upper left tip of the V drawn in fig. 3.4 and moves successively down towards the lower tip, where the code implementation takes place. The development part (left side of the V) goes from the highest level (overview architecture and design) to the lowest (implementation of separate modules). The testing part (right side of the V) goes from the lowest level (unit tests) to the highest level (system integration test) [TDP, p.43]. This model is a coarse representation of the method used by the ECU systems software development groups at Scania.

The model in fig. 3.4 is simplified. The points are generalized and the exact ones are defined depending on the application. Also, software development (as well as testing) is an iterative process. Testing is performed continuously during the development at module level and system level. Acceptance level testing is done immediately prior to a release, often together with the customer.

[Figure: V-shaped diagram. Development side: Business Requirements, System Requirements, Function Requirements, Module Design Spec. Tip: Code. Testing side: Unit Test, Function Test, System Test, Acceptance Test.]

Figure 3.4. The V Model used in ECU software development at Scania

The concepts described below will be referred to later in the report; therefore a brief presentation of the most important software testing types and techniques is given in the subsequent sections.

3.6 Test Techniques: Black Box Testing

Black box testing is the testing of a part of a software system where no internal knowledge of the object being tested is necessary. The primary goal is to test functionality from a user perspective. Test cases are built based on the functional requirement specification and on documents describing what the software should do. The test developer decides what input should be given and what the expected output is. A description of the different black box test techniques follows.

3.6.1 Decision Tables and Decision Trees

Decision tables are used to analyze and test complex sets of rules where a number of variables are used. The purpose is to examine the logical correctness of the set of rules and to identify appropriate test cases. [TDP, p. 181]

The technique basically means that all variables in a set of rules are listed and combined. An example of a decision table is given in tab. 3.2. The first quadrant lists all the conditions, and the second quadrant the condition entries. In the third and fourth quadrants we have the actions and the action entries, respectively. A condition is something that is easy to qualify as either fulfilled or not, e.g. engine output torque within [200, 1000] Nm, gearbox in neutral or not, functional degradation due to a DTC enabled or not, and similar. An action is something that should or should not be done when a set of conditions is fulfilled.

| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Condition 1 | T | T | F | F |
| Condition 2 | T | F | T | F |
| Action 1 | Y | Y | N | Y |
| Action 2 | N | N | Y | Y |
| Action 3 | N | Y | N | Y |

Table 3.2. Decision table example. T = True, F = False, Y = Yes, N = No

In tab. 3.2 we have a complete list of all possible combinations of two conditions, of which there are $2^2 = 4$; in general $2^n$ condition combinations are possible given n conditions. The strategy is then to rule out inconsistent combinations of conditions, which are often present. It may also be the case that one condition rules all the others out. After simplifying the table to contain only valid combinations of conditions, we can use heuristics to choose which cases to test. Generally, each column in the table is a test case. However, the number of columns grows exponentially with the number of conditions. NEVS made a summary of the pros and cons of decision tables, which are:

| Pros | Cons |
|---|---|
| Good survey | Exponential growth with the number of conditions |
| Test cases directly from the table | Difficult to identify all the conditions |
| Easy to limit number of test cases | Important test cases may disappear during simplification |
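The enumeration of condition combinations behind tab. 3.2 can be sketched in a few lines. The action entries are copied from tab. 3.2, while the function names and the consistency rule in the usage note are made up for illustration.

```python
from itertools import product

CONDITIONS = ["Condition 1", "Condition 2"]

# Action entries of tab. 3.2, keyed by the condition entries of each column
# (True = T/Y, False = F/N).
ACTIONS = {
    (True, True):   {"Action 1": True,  "Action 2": False, "Action 3": False},
    (True, False):  {"Action 1": True,  "Action 2": False, "Action 3": True},
    (False, True):  {"Action 1": False, "Action 2": True,  "Action 3": False},
    (False, False): {"Action 1": True,  "Action 2": True,  "Action 3": True},
}


def all_combinations(n):
    """Return all 2**n combinations of n boolean conditions."""
    return list(product([True, False], repeat=n))


def test_cases(is_consistent=lambda combination: True):
    """Return one test case (conditions, expected actions) per valid column."""
    return [(combination, ACTIONS[combination])
            for combination in all_combinations(len(CONDITIONS))
            if is_consistent(combination)]
```

Called without arguments, `test_cases()` returns the four columns of tab. 3.2 in the same order. Passing a rule such as `lambda c: not (c[0] and c[1])` (a hypothetical inconsistency between the two conditions) removes the first column. The exponential growth is visible directly: `len(all_combinations(10))` is 1024.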

3.6.2 State Transition Testing

State transition testing is a model-based test technique. It is commonly used when testing event-triggered systems, real-time systems and digital electronics hardware. Since we often deal with that kind of system in the automotive industry, this is an important test technique for us. What characterizes this technique is that we use a finite automaton graph to represent the states as nodes and the transitions between states as edges. A very simple ATM machine can serve as an example.

[Figure: state diagram with the states S1 (Waiting for a card), S2 (Waiting for PIN) and S3 (Waiting for transaction), connected by the transitions E1 to E9 that are listed in tab. 3.3.]

Figure 3.5. A simplified picture of an ATM transaction process. There are 3 states and 9 transitions in this finite automaton model.

In state transition testing, as in any other test technique, we need a good heuristic to cover the state graph in the best way. In real-world applications, the state graphs are often much more complex than the one presented above, and to test all the transitions would be infeasible. Some heuristics for graph coverage are

1. The most probable paths

2. Traveling Salesman path: visit all states once

3. Eulerian path: visit all edges once

4. Risk based: path where a combination of transitions can be problematic

5. All paths of a certain length: takes a long time to process

6. All ways out of a state

7. All events that should not trigger a transition: this verifies system robustness

[TDP]

A transition table for the ATM example is given in tab. 3.3.

| Trans. No. | Start state | Event | Response | End state |
|---|---|---|---|---|
| E1 | S1 | Wrong card | Eject card | S1 |
| E2 | S1 | Correct card inserted | Ask for PIN | S2 |
| E3 | S2 | Wrong PIN third time | Block card | S1 |
| E4 | S2 | Cancel transaction | Eject card | S1 |
| E5 | S2 | Wrong PIN | Ask for PIN | S2 |
| E6 | S2 | Correct PIN entered | Ask for transaction | S3 |
| E7 | S3 | Wrong transaction input | Ask for transaction | S3 |
| E8 | S3 | Transaction request correct | Perform transaction, eject card | S1 |
| E9 | S3 | Cancel transaction | Eject card | S1 |

Table 3.3. State transition table of an ATM machine
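The transition table lends itself to a direct encoding, against which a candidate test path can be replayed and its transition coverage measured. The sketch below copies tab. 3.3. The function names and the example path are chosen for illustration and are only one of many paths that cover all transitions.

```python
# Transition table from tab. 3.3: name -> (start state, event, response, end state)
TRANSITIONS = {
    "E1": ("S1", "Wrong card", "Eject card", "S1"),
    "E2": ("S1", "Correct card inserted", "Ask for PIN", "S2"),
    "E3": ("S2", "Wrong PIN third time", "Block card", "S1"),
    "E4": ("S2", "Cancel transaction", "Eject card", "S1"),
    "E5": ("S2", "Wrong PIN", "Ask for PIN", "S2"),
    "E6": ("S2", "Correct PIN entered", "Ask for transaction", "S3"),
    "E7": ("S3", "Wrong transaction input", "Ask for transaction", "S3"),
    "E8": ("S3", "Transaction request correct", "Perform transaction, eject card", "S1"),
    "E9": ("S3", "Cancel transaction", "Eject card", "S1"),
}


def replay(path, start="S1"):
    """Walk a sequence of transitions and return (final state, expected responses)."""
    state = start
    responses = []
    for name in path:
        source, _event, response, target = TRANSITIONS[name]
        if source != state:
            raise ValueError("%s cannot be taken from state %s" % (name, state))
        responses.append(response)
        state = target
    return state, responses


def uncovered(path):
    """Return the transitions that a path never exercises."""
    return set(TRANSITIONS) - set(path)


# One path that exercises every transition at least once.
FULL_COVERAGE_PATH = ["E1", "E2", "E5", "E3", "E2", "E4", "E2",
                      "E6", "E7", "E8", "E2", "E6", "E9"]
```

The list of responses returned by `replay` is the expected output of the test flow, and `uncovered` shows which transitions a shorter path, for example the most probable one, leaves untested.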

3.6.3 Equivalence Class Partitioning and Boundary Value Analysis

The purpose of equivalence class partitioning is to reduce the number of tests or the test data set by means of a heuristic: a single representative of each class of test cases is used, chosen such that it has the highest probability of finding an error if one exists. The choice of such a representative test case should be based on the assumption (although it is not absolutely certain) that

• if one test case within the class finds a deviation, then the other ones within the class also do

• if one test case does not find any deviations, then neither do the other test cases

The test cases in a set are considered equivalent if the above-mentioned points are met. Experience shows that errors usually occur at the boundaries (both in input and output data). This is especially true when working with numeric values. Test case design by equivalence classes consists of two steps.

1. Identify the equivalence classes

2. Design the test case

[MYERS, p.42]

[Figure: number line for the engine speed (RPM) from −∞ to ∞ with boundaries at 0 and 3000, divided into class 1 (invalid), class 2 (valid) and class 3 (invalid).]

Figure 3.6. Illustration of equivalence class partitioning and boundary value analysis. Here we have three intervals: ω < 0 rpm, 0 ≤ ω ≤ 3000 rpm and ω > 3000 rpm.

As an example, we see in fig. 3.6 that a continuous numerical variable such as the engine speed is divided into three equivalence classes. In order to test how engine controller software reacts to different sensor values, we may manipulate the sensor to give any value we want. We can choose these values wisely so that we only have to test a few values in each equivalence class and still have a very high probability of discovering a deviation (error), instead of testing all possible values (this number is finite since the measurement resolution is limited). Experience shows that tests of values on the boundaries usually have the highest probability of discovering an error (boundary value analysis).
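For the engine speed example of fig. 3.6, the classification and a typical choice of boundary test values can be sketched as follows. The resolution of 1 rpm and the function names are assumptions made for illustration.

```python
LOW_RPM, HIGH_RPM = 0, 3000   # limits of the valid class in fig. 3.6


def equivalence_class(rpm):
    """Map an engine speed to one of the three classes of fig. 3.6."""
    if rpm < LOW_RPM:
        return "class 1 (invalid)"
    if rpm <= HIGH_RPM:
        return "class 2 (valid)"
    return "class 3 (invalid)"


def boundary_values(low=LOW_RPM, high=HIGH_RPM, resolution=1):
    """Return the values on and next to each boundary of the valid class."""
    return [low - resolution, low, low + resolution,
            high - resolution, high, high + resolution]
```

With the assumed resolution, `boundary_values()` gives -1, 0, 1, 2999, 3000 and 3001 rpm: six test values that touch all three classes and both boundaries, instead of one test per representable sensor value.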

3.7 White Box Testing

This type of testing requires knowledge of the internal workings of a piece of software. The purpose here is to test how the object behaves on the inside. Test cases typically contain input data that exercises different execution paths of the software.

3.8 Gray Box Testing

In gray box testing we use the knowledge that we have of data structures and algorithms for test case design, but we test at the user level, i.e. only at the interface (black box level).

However, modifying a data repository does qualify as gray box, as the user would not normally be able to change the data outside of the system under test. Gray box testing may also include reverse engineering to determine, for instance, boundary values or error messages.

3.9 Smoke testing

Smoke testing in software development means performing a quick and less detailed test as a preliminary to further testing. A set of test cases that test the most important functionality is used as a smoke test to reveal severe bugs that need to be fixed before ordinary testing can begin.

3.10 Non Functional Testing

Apart from the functional requirements on software, there are often other sorts of requirements on how the software should behave. The standard ISO 9126, Software Product Evaluation, defines six characteristics for assessing software quality. These are

1. Functionality: Presence of desired functions, correctness, inter-operation, compliance with standards, security

2. Reliability: System robustness and correct function in different situations, failure tolerance, recovery, accessibility

3. Usability: Ease to understand and use the system

4. Efficiency: The optimal resource utilization, timing aspects (performance), resource requirements, scalability

5. Maintainability: The possibility to upgrade the system when necessary, the ability to analyze, modify and test the system

6. Portability: Possibility of the system to work in different environments, different operating systems or with different databases

The above points are important to bear in mind during the development process. It is also important to let the experts in the respective areas perform these assessments. For example, usability is in focus at an early stage of the system design and should be handled by usability experts. It is not a good idea to let test developers without specialized knowledge handle questions concerning it. [TDP, p.96]

3.11 How far should we test?

There is no simple answer to this question. Performing an exhaustive test of the software is practically impossible for all but the simplest software. One way is to use five basic criteria which together are used to determine when we are finished or, more likely, have achieved good enough quality. These are:

1. We have achieved the goal we set in the test coverage heuristics.

2. The number of discovered deviations (errors) is less than a limit value we define.

3. The cost of finding more errors is larger than the estimated value loss due to undiscovered errors.

4. The project team decides together that a product is ready to be released.

5. The management team decides to release the product.

[TDP, p. 269]

Each of the above-mentioned criteria has its weaknesses when used on its own. The fact that we do not find any more errors may mean that we are not doing the tests right, and not that there are no errors. The decision that the cost is higher than the gain from performing more tests is a subjective assessment that is difficult for a test developer to make. The 5th criterion is a deadline that is decided outside the testers' domain and is not related to the quality of the tests but to a business assessment of when the product shall be released. Several of the mentioned criteria shall be used together in order to decide when the product is ready for release.

Sufficient Quality

As mentioned, it is often not the testers who decide when a product is released. However, a tester should be prepared to answer the question of what opinion he/she has about the quality. The answer should be well founded. James Bach's view of what he calls good enough quality can be summarized in four points (referring to the product).

• It has sufficient number of advantages.

• It does not have critical problems.

• The advantages are sufficiently higher than the disadvantages.

• Further testing would, the current situation and all parts considered, do more harm than good.

[BACH]

The bottom line here is that several criteria must be taken into consideration together before deciding that testing is finished and the product is release-ready.

3.12 Formal Methods

The methods for software testing presented above are well accepted and widely used in practical applications due to their simplicity of use and the possibility to reuse the test cases over software refactoring cycles. This applies primarily to black box testing methods. Black box testing can also be performed without detailed knowledge of the internal workings of the tested system, which is often desired since it is almost always different people that develop and test software in large industrial applications. However, these methods totally lack any kind of theoretical background. Formal methods is a somewhat ambiguous denomination of a collection of mathematical/logical tools for software testing and verification. These are, unlike the software testing methods above, very well founded theoretically, but their use in practical applications is very limited due to their complexity.

Chapter 4

Methods

The overall aim of this project is to contribute to the work aimed at making the powertrain ECU software less prone to problems during start up and shutdown. A natural first step in this work is to investigate what makes the ECUs prone to start up and shutdown problems. By using technical analysis and complementing the investigation with an analysis of failure reports, we identify a number of areas that are considered to be the highest-risk areas for the ECUs during start up and shutdown. The work process itself was largely based on the steps described below.

4.1 Technical Analysis

During this part we go into the details of three different phases of ECU operation, which are start up, cranking and shutdown. We study a number of technical regulations, system descriptions and software architecture documents, and also look at the ECU software code base itself, in order to determine scenarios that could lead to problems like misleading DTCs, corrupt files etc.

4.2 Failure Report Analysis

While the technical analysis reveals areas where problems can occur, we need to complement it with information about where they actually do occur. This was done by searching internal failure report databases as well as interviewing system architects and system owners of EMS, GMS and EEC, as well as platform software, LLAP and ComP architects. The database search was one of the most difficult parts because it is not always straightforward to investigate how a failure occurred based on the information given in the database to which authorized workshops and field tests report. In order to perform an investigation, one often needs more information about the state of the ECUs at the time when the problem occurred, and this information was not always available. Another database that was searched contained failure reports on problems found by the staff at NE, the powertrain control system department. This database often contained more detailed information that is usable for finding the actual cause of the problem.

4.3 Requirement Identification

Now that we have knowledge of where actual problems may occur, we need to define or identify the system functional requirements for the ECU software. This is done by studying documentation on system function descriptions, system function specifications and technical descriptions, and extracting how the system software should function, which leads to a requirements definition. It is very common that a round of discussion with software developers and/or function developers is needed. In the case of this thesis we identified the requirements by going through the documents [TBENV], [TBJ1939COM], the CANM¹ description and requirements document, and the FILE² description and requirements. A number of requirements that were found to be related to start up or shutdown of the ECUs were picked and combined into a test case requirement set upon which the test cases were built.

4.4 Test Techniques

A deep study of different test techniques was done and it is reported in chapter 3. During this phase we need to decide which test technique to apply to our test cases and how. We make a test design, which is a specification of the test techniques we choose and how we use them. The test design is specified in each test specification document.

4.5 Developing the Test Cases

After making a test design, which is a rather theoretical part of the test development process, it is time to develop a test flow, which is a set of exact instructions of what to do and when. We need to identify the testability of the different requirements, that is, how the tests should technically be performed. During this part we choose the test platform, which may be a single ECU, a HIL rig or a complete vehicle. We chose to perform all of our test cases in HIL rigs because we need to perform many high-precision timing measurements that are not feasible in a vehicle. A single ECU is not an option either, since we often need access to scenarios where several ECUs communicate with each other in a realistic manner. During this step we also identify the variables that we need to observe, whether they are internal state variables in an ECU or CAN signals, as well as the sampling frequency and similar technical details. The result of this process is a complete test case specification that may be implemented as a script for the test automation HIL rigs.

¹ CAN Communication Manager in the LLAP layer

² Non Volatile Memory Manager


Chapter 5

Results

5.1 Technical Analysis

5.1.1 CAN messages and signals

Each CAN message that is sent periodically on the bus by an ECU has a defined period time Tmax (according to the SAE J1939 specifications). A database that contains all CAN messages and their specifications, including the period time, comes along with each software release. Not all CAN messages are periodic, though; some are sent only on request. In order to ensure good real-time performance in the embedded ECU systems, each ECU must be able to detect when expected periodic CAN messages from other ECUs are missing.

In EMS, LLAP¹ is responsible for monitoring CAN messages and indicating a timeout when a signal is not received by EMS when it is expected. This is indicated by setting a signal status that travels along with the signal from LLAP to APPL through RTDB. The signal value itself should be set to a pre-programmed default value.

5.1.2 Signal and Component Statuses

Each of the ECUs that we are working with in this project has a mechanism for diagnostics of the electrical components connected to the ECU. This mechanism is rather complex and only a subset containing the necessary basics will be covered here. One part of this mechanism lies inside the ComP layer of the controller software, another part is in the application layer. We call the former DIMA-BSW and the latter DIMA-AP. A diagnosed component is an electrical component that is either a sensor/actuator connected to the ECU or a part of the ECU electronics such as memory, AD/DA converters and similar. The status of a component can indicate that a component:

¹ See 2.2


• is flawless

• is possibly affected by an error that is present in the system

• is affected by an electrical or non-electrical error

• does not exist in the current configuration

A signal is a software variable that is transferred between different parts of the software; most signals represent physical quantities expressed in units such as volt, bar or kelvin. A signal is usually transferred together with a signal status. The signal status is encoded using 8 bits and is contained in the same C struct as the signal value. The signal status indicates whether the signal

• is flawless

• is possibly affected by an error that is present in the system

• value is a good replacement value

• value is possibly a bad replacement value

• value has a plausibility error

• is not available or based on a nonexistent component

At ECU start up, the signal status is initialized to INIT, meaning that the signals are not classified as good or bad at all until a specified amount of time has passed since power up of the ECU.

5.1.3 Start Up

A measurement of the start up times of two ECUs was performed. A measurement device (Ipetronik M-SENS) was connected to the S8 U15 signal through a break-out box along with the CAN1 channel, which made it possible to record the U15 signal in the same graph as all the CAN signals sent by EMS on the red CAN bus, using ATI Vision. The difference between the time point where U15 becomes high (24 V) and the time point where the first CAN signal is sampled is considered the start up time. This was performed on a live vehicle. Five iterations of this measurement were performed; the resulting average start up time was 0.2733 s with a standard deviation of 0.0017 s. See fig. 5.1.

Further, a number of failure reports from different kinds of test vehicles indicate the resetting of duty cycle data when starting with a low battery voltage as a problem area. These problems should be handled at module level and/or ComP level. Other reports point out possibly false or misleading DTCs with a variety of causes at start up. Some of the reported problems are nearly impossible to test at the system level and should be investigated by the developers at module level. Others seem to be a result of a previous incorrect shutdown or file saving.


Figure 5.1. Measuring of the EMS S8 start up time. Plot annotations: U15 is turned on; first CAN signal from S8; first CAN signal from RET.

5.1.4 Cranking

It is a known phenomenon that engine start causes a short drop in the supply voltage from the battery, since the starter motor in a heavy vehicle requires a massive amount of electrical current. Technically, a driver must first turn the ignition key to the ON position, which gives the U15 signal to all the system ECUs. Only after that can the engine be started by turning the key to the START position. Also, the engine management system EMS must be fully up and running for it to be possible to start the engine. This diploma project is focused not on engine start up but on ECU start up; however, this phenomenon is still interesting since the voltage drop may affect the ECU behavior.

5.1.5 Shutdown

When the driver turns off the ignition key, the U15 signal is broken. The software of the ECUs detects this and initiates the shutdown process of the ECU. However, the ignition key turn-off is only one of several conditions for ECU shutdown. To mention a few examples: the engine controller, EMS, must ensure that the engine is standing still; the gearbox controller, GMS, must check that the neutral gear is engaged; and the SCR system controller, EEC, must sometimes perform a regeneration process which may take some time, during which the controller must be running. Also, most of the ECUs have a set of data that is saved to non-volatile memory during shutdown so that it can be read at the next start up. This includes but is not limited to duty cycle data, DTC and freeze-frame² data, adaptation data etc. This data is saved at shutdown and an ECU must not power itself down until the saving process is complete. There are also kinds of data that are not allowed to be written to NVM areas at shutdown. An example of the latter is End-of-Line (EOL) configuration data. This data is critical for the system function and is therefore written only during a reprogramming session initiated by a KWP request. During system runtime a copy of this data resides in RAM and there is, at least in theory, a possibility that it may accidentally be modified in RAM and then written back to NVM at shutdown and in this way be corrupted. A test case (denoted Test Case 2) in this report is focused on these issues with the EMS.

Figure 5.2. A starter motor requires a lot of electrical current from the battery. Batteries in heavy trucks normally supply 24 V DC. But as we see, the heavy current consumption during engine start may make the voltage level drop significantly below 24 V. This may affect the ECU software. Here we see the U15 signal level (which is the same as U30 provided ignition is ON, and 0 V otherwise). The step at 4.3 s is due to the ignition key being turned to ON. At 5.2 s the ignition key is turned to the ENGINE START position and the engine starts. Later, when the engine is running, the generator produces about 28 V.

5.2 Definitions

Before we continue to describe the results achieved during this project, we must clearly define the concepts test case and test flow. A test case is a system function test specification document that gives a complete and detailed description of the tested function, system requirements, prerequisites, post-requisites, limitations, used test design techniques, used heuristics and finally test flows.

² Freeze Frame is a set of state variables whose values are saved at the moment when a DTC is set


A test flow is a set of direct instructions to the implementer of the test. The instructions describe what to do, when to do it and how to do it. A test flow typically covers one requirement or a number of them. In some cases, several requirements are best combined in a single test flow. A test flow may be conducted directly in a vehicle or a test rig and it can also be implemented as a script for a test rig.

A total of two test cases were developed in this project. The complete test case documents are not presented here since they contain some information that is not allowed to be published; the parts of these documents that are considered most interesting and do not reveal internal information are given below. Test flows are presented as completely as possible, with some modifications in order not to reveal internal information. The main principles should however be clearly readable.

5.3 Test case 1: EMS – EEC Communication, CAN timeouts detection

5.3.1 Requirement Identification

Correspondence with test developers at NEVS revealed that a test developer had discovered a DTC indicating a CAN message timeout. A message that EEC sends to EMS with a nominal period time (which we call Tmax) of 200 ms was wrongly programmed in EMS to be expected every 50 ms. For some messages, a DTC is set when 5 · Tmax has passed without an expected message (a requirement states that no related DTC may be set earlier than 5 · Tmax). In this particular case, the DTC “Communication with the SCR control unit error” was set, apparently because EMS expected the message from EEC every 50 ms and was therefore allowed to set the CAN timeout DTC after 5 · 50 = 250 ms, which it did. In normal cases the message is received every 200 ms, which is less than the 250 ms allowed to pass before a DTC is set. However, due to high bus load the message was delayed by an additional amount of time and was missing for 250 ms or longer, which led to the DTC.
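The arithmetic behind this failure can be sketched in a few lines. This is only an illustration of the numbers given above; the function names and the constant TIMEOUT_FACTOR are hypothetical and do not come from the ECU code base.

```python
# Illustrative sketch only: names are hypothetical, numbers are from the text.
TIMEOUT_FACTOR = 5  # a timeout DTC may be set after 5 * (expected period)


def dtc_threshold_ms(expected_period_ms):
    """Time without a message after which a timeout DTC may be set."""
    return TIMEOUT_FACTOR * expected_period_ms


def delay_margin_ms(expected_period_ms, nominal_period_ms):
    """Extra delay a nominally periodic message tolerates before the threshold."""
    return dtc_threshold_ms(expected_period_ms) - nominal_period_ms


# Receiver wrongly programmed to expect the 200 ms message every 50 ms:
print(delay_margin_ms(50, 200))   # 250 - 200 = 50
# Receiver programmed with the correct period of 200 ms:
print(delay_margin_ms(200, 200))  # 1000 - 200 = 800
```

With the wrong expected period the message may be delayed by only 50 ms before the DTC threshold is reached, which is why a high bus load was enough to trigger the DTC.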

We decided that a test case should be developed to detect this kind of error. This test case is based on the following requirements:

1. No DTC is allowed to be set until 5 frames of a message have been lost (i.e. 5 · Tmax time has passed since the last received message)

2. S8 LLAP must detect a missing CAN message after 5 missing instances (i.e. if tmessage ≥ 5 · Tmax)

3. Results of the timeout test for a message must not be reported until at least 2 seconds + 5 · Tmax have passed since ECU on

4. Timeout must not be reported outside 'Operating voltage mode', in this case 22 V – 32 V

A closer inspection of these requirements follows.

Requirement 1

The rationale behind this requirement is that fault codes (DTCs) indicating a communication error with an ECU should not be set too early. Such a fault code may be misleading and make a user or a repair workshop mechanic think that there is a faulty ECU that should be replaced when in fact a CAN message was just delayed a little because of high load on the CAN bus. Therefore each ECU that receives CAN messages from other ECUs must tolerate a delay on the messages that are expected to be received within a certain period time. System architects at Scania have decided that a delay of up to 5 times the nominal message period time should be acceptable and not result in a DTC.

As to the testability of this requirement, it is not easy to check DTCs in an ECU at exactly the right moment according to the test specifications. This is because a KWP request must be made to the ECU asking it for a list of DTCs, and time is required to process the request and return an answer containing the list of DTCs. This process can take up to several seconds to complete, so this direct approach is not feasible since we must know with millisecond precision when a DTC is set. We therefore use another way. By inspecting the code we find that the CAN communication manager, call it CANM, reports a timeout diagnostic result through what is called a test communication struct (testcom struct for short). A testcom struct is a C struct, or more precisely a bit-field, containing 8 bits. There is a testcom struct variable declared for almost all CAN messages that S8 receives from EEC3. These variables can be monitored in ATI Vision³ by using CCP⁴. By monitoring this 8-bit integer we can see whether a timeout failure is reported by looking at one bit in this structure.

³ ATI Vision (or just Vision) is computer software used to perform measurements of internal state variables in an ECU, CAN monitoring etc.

⁴ CCP = CAN Calibration Protocol, which is a way to monitor internal ECU variables with a 100 Hz sampling rate.

Requirement 2

The signals that are contained within CAN messages often carry valuable real-time data. There are cases where closed control loops over CAN are based on these signals. Therefore it is critical that the application layer of the ECU software knows whether a signal value is reliable or not. LLAP indicates this by altering the signal status. The signal status is an 8-bit variable that travels from LLAP to APPL via RTDB in the same C struct as the signal value and is used by APPL to decide whether to rely on the signal value, to degrade some functions of the vehicle/engine or to set a DTC. When a CAN message that is expected to be received with a certain period time stops arriving, the LLAP must degrade its signal status after 5 times the period time and set the signal value itself to a pre-defined default value. There are many levels of signal status; in this test case, however, we deal with only three of them: FLAWLESS, NOTAVAILABLE and BADREPLACEMENT.

In fig. 5.3 we see actual measurements (through CCP) that show how the signal status is degraded for two messages, each of which is sent with a cycle time of 6 times its nominal cycle time. When the CAN message with a nominal cycle time of 50 ms is received (every 6 · 50 = 300 ms), the status of its signals is set to FLAWLESS and remains FLAWLESS for 5 · 50 = 250 ms. After that the signal status is degraded to NOTAVAILABLE and holds that status for 50 ms, after which a new instance of the message is received and the signal status is restored to FLAWLESS. The same applies to the message with a cycle time of 200 ms. This measurement was performed on a single EMS S8 unit connected to a DC power source. A VCI2 unit was connected (see fig. 2.7) to the CAN2 and CAN3 ports of the S8 unit. The software used to program the S8 unit and read DTCs/DECs was XCOM (internal Scania software), which can send diagnostic information to and receive it from the ECU through its CAN3 channel. The software used to monitor the internal ECU status variables, including signal statuses, was ATI Vision. Vision also allows us to send simulated CAN messages so that they appear to come from the EEC3 unit, which is not present. The CAN messages were sent this way.
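The degradation pattern described above can be captured by a small model. The sketch below assumes an idealized receiver that degrades the status exactly 5 · Tmax after the last reception and restores it immediately when the next message arrives; the function name is hypothetical.

```python
# Idealized model of the status pattern; not taken from the ECU software.
def status_durations_ms(cycle_ms, tmax_ms, factor=5):
    """Return (FLAWLESS time, NOTAVAILABLE time) per cycle of a message
    that is sent every cycle_ms while the receiver expects it every tmax_ms."""
    timeout_ms = factor * tmax_ms
    flawless_ms = min(cycle_ms, timeout_ms)
    not_available_ms = max(0, cycle_ms - timeout_ms)
    return flawless_ms, not_available_ms


# Message with nominal cycle time 50 ms, sent every 6 * 50 = 300 ms:
print(status_durations_ms(300, 50))    # (250, 50)
# Message with nominal cycle time 200 ms, sent every 6 * 200 = 1200 ms:
print(status_durations_ms(1200, 200))  # (1000, 200)
```

A message sent at its nominal cycle time never reaches the timeout in this model, so its NOTAVAILABLE time is zero.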

Requirement 3

Variation in the wake-up time of different ECUs makes it inappropriate to set a diagnostic trouble code too early. For the same reason that an ECU is required to tolerate 5 missing CAN message frames from another ECU before setting a timeout DTC, an ECU must tolerate ECUs that do not start to transmit CAN messages immediately after U15 on. This is to ensure that a DTC that indicates a faulty ECU is set if and only if there is a faulty ECU. A time of two seconds is therefore allowed to pass after ECU wake-up, in addition to the usual rule of no DTCs until 5 · Tmax has passed without an expected CAN message, to give all ECUs on the CAN bus a chance to power up properly.

As to the testability of this requirement, we need to be able to measure how much time has passed since the ECU was turned on. As measurements indicate, it takes some time for an ECU to wake up and start responding to CCP after U15 on. It is therefore not accurate to base the time tracking on the time point when U15 is turned on. Instead, we use an internal variable of the ECU that counts how many 10 ms loops have passed since initialization of the ECU. By recording this variable together with other variables such as testcom structs we can see at which point in ECU run-time a timeout indication appeared.
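A minimal sketch of how the loop counter can be turned into a check of requirement 3 is given below. The names are hypothetical and integer milliseconds are used to avoid rounding issues; the comparison uses "at least", following the wording of the requirement.

```python
# Hypothetical names; the real counter variable is internal to the ECU.
LOOP_PERIOD_MS = 10     # the counter is incremented once per 10 ms loop
STARTUP_GRACE_MS = 2000  # two seconds allowed for other ECUs to wake up
TIMEOUT_FACTOR = 5


def ecu_uptime_ms(loop_count):
    """ECU run-time derived from the 10 ms loop counter."""
    return loop_count * LOOP_PERIOD_MS


def timeout_report_allowed(loop_count, tmax_ms):
    """True once at least 2 s + 5 * Tmax have passed since ECU initialization."""
    earliest_ms = STARTUP_GRACE_MS + TIMEOUT_FACTOR * tmax_ms
    return ecu_uptime_ms(loop_count) >= earliest_ms


# For a message with Tmax = 200 ms the earliest allowed report is at 3000 ms,
# i.e. at loop count 300.
print(timeout_report_allowed(299, 200))  # False
print(timeout_report_allowed(300, 200))  # True
```

When a recorded testcom struct shows a timeout indication, the loop count recorded in the same sample can be passed to timeout_report_allowed to decide whether the indication came too early.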


Figure 5.3. We see how the signal status (no engineering unit) jumps between FLAWLESS (upper line) and NOTAVAILABLE (lower line) for two CAN messages traveling from EEC3 to S8. The messages are sent with a cycle time 6 times the nominal cycle time. In the upper graph the status belongs to a signal contained in a message with a cycle time of 200 ms, and in the lower graph the corresponding cycle time is 50 ms.

Requirement 4

This requirement is another measure to counter the phenomenon of misleading DTCs. As we discovered during the technical analysis, the supply voltage from a vehicle's battery may vary significantly. When the control unit system is powered by the vehicle's battery, i.e. when the ignition is on but the engine is off, and the battery is weak and unable to supply the nominal 24 V DC, not all sensors/actuators and control systems can be expected to work properly and send reliable data. Therefore, according to [TBENV], full ECU system functionality can only be expected when the system supply voltage is within the range of 22 V – 32 V. Outside this range, a CAN timeout DTC may be misleading and is therefore not allowed.

Taking these requirements as a set of rules, we can now formulate a decision table which is used to decide which sets of conditions are to be covered by the test, see tab. 5.1.



Decisions | U30 within 22 V – 32 V   | T | F | T | F | T | F | T | F
          | tECUon > 2 s + 5 · Tmax  | T | T | F | F | T | T | F | F
          | tmsg > 5 · Tmax          | T | T | T | T | F | F | F | F
Actions   | DTC allowed              | Y | N | N | N | N | N | N | N
          | Timeout detected by LLAP | Y | N | N | N | N | N | N | N

Table 5.1. Decision table for the described test case
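The decision table can also be expressed as a small function and enumerated, which is one way a test script could generate the eight rule columns. This is a sketch with hypothetical names; it encodes exactly the table above, where both actions are Y only when all three decisions are true.

```python
from itertools import product


def expected_actions(u30_in_range, ecu_on_long_enough, msg_timed_out):
    """Expected actions for one rule (column) of the decision table."""
    allowed = u30_in_range and ecu_on_long_enough and msg_timed_out
    return {"dtc_allowed": allowed, "timeout_detected_by_llap": allowed}


def all_rules():
    """Enumerate the eight rules in the same column order as tab. 5.1."""
    rules = []
    for msg_timed_out, ecu_on_long_enough, u30_in_range in product(
        [True, False], repeat=3
    ):
        actions = expected_actions(u30_in_range, ecu_on_long_enough, msg_timed_out)
        rules.append((u30_in_range, ecu_on_long_enough, msg_timed_out, actions))
    return rules


for rule in all_rules():
    print(rule)
```

Only the first rule, with all decisions true, permits a DTC and a reported timeout.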

5.3.2 Test Techniques

In the cases where we verify that the system reacts properly to a timeout of a CAN message, we do it by sending a simulated message to the ECU on the CAN bus with a modified cycle time.

Requirement 1

Tested using equivalence class partitioning and boundary value analysis in the time domain for all diagnosed messages using their respective testcom struct. The equivalence classes are

1. tmsg,act < 5 · tmsg

2. tmsg,act ≥ 5 · tmsg

where tmsg is the maximal period time of each tested message and tmsg,act is the period at which a manipulated message is sent during the test process.

Requirement 2

Tested using equivalence class partitioning and boundary value analysis in the time domain in a similar way as for the previous requirement. In this case we use the signal statuses specified in the system function specification.

Requirement 3

Tested using boundary value analysis in the time domain (testcom struct). The equivalence classes are

1. tECUon < 2 s + 5 · tmsg

2. tECUon ≥ 2 s + 5 · tmsg

Requirement 4

According to [TBENV] the Operating Voltage Mode is 22 V to 32 V. This should be tested by equivalence class partitioning and boundary value analysis. We have three equivalence classes

1. Uop < 22 V

2. 22 V ≤ Uop ≤ 32 V

3. 32 V < Uop
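As an illustration, the three classes can be written as a classification function, and test voltages can then be picked from each class and near the boundaries. The names and the chosen sample voltages are hypothetical; only the limits 22 V and 32 V come from the text.

```python
# Limits of the Operating Voltage Mode as quoted from [TBENV] in the text.
U_MIN_V = 22.0
U_MAX_V = 32.0


def voltage_class(u_op_v):
    """Map a supply voltage to one of the three equivalence classes."""
    if u_op_v < U_MIN_V:
        return "below operating range"
    if u_op_v <= U_MAX_V:
        return "operating range"
    return "above operating range"


def timeout_report_permitted(u_op_v):
    """A CAN timeout may only be reported inside the operating range."""
    return voltage_class(u_op_v) == "operating range"


# One value inside each class plus values close to the two boundaries
# (hypothetical choices for illustration).
for u in (18.0, 21.9, 22.0, 27.0, 32.0, 32.1, 34.0):
    print(u, voltage_class(u), timeout_report_permitted(u))
```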

5.3.3 Test Flows

By logically grouping the requirements we define three test flows which cover different test requirements. For each test flow it is specified which requirements it covers. The test flows are presented below. Please note that the real variable and constant names are not allowed to be published. Therefore the variable names given in the test flows below are not real; they are, however, chosen to be descriptive of what each variable represents.

A note about the use of boundary value analysis: we are using a somewhat relaxed version of it by not analyzing signal data exactly on the boundaries, since the system requirements do not define them clearly and it is not deemed necessary.



Test Flow 1

This test flow covers requirements 1 and 3.


Test Flow 2

This test flow covers requirement 2. Note that the “appendix” mentioned here refers to the appendix in the internal test specification document. Fig. 5.5 in this report is its close correspondence.



Test Flow 3

This test flow covers requirements 1, 3 and 4 and involves manipulating the system supply voltage.

5.3.4 Testing the Test Case

This test case was run on two different software versions, one of which had a known bug. The software version with the known bug was 61.38.00, while the bug was fixed in version 61.38.01. Fig. 5.4 shows how the signal status of one message whose Tmax according to CANdb is 200 ms behaves when the message is sent with an actual cycle time of 1200 ms. In 61.38.00 we see cycles of 950 ms with status NOTAVAILABLE and 250 ms with status FLAWLESS. This is incorrect behavior and violates requirement 2 by degrading the signal status too early. In fig. 5.5 we see the correct behavior of the signal. The test case was able to detect this bug.
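A scripted version of this check could analyze the recorded status trace automatically. The sketch below is an assumption about how such an analysis might look, not the method used in the internal test scripts. It assumes one status sample every 10 ms (the 100 Hz CCP rate) and a trace that starts at a message reception; the traces are synthetic and built from the durations quoted above, not from recorded data.

```python
from itertools import groupby

SAMPLE_PERIOD_MS = 10  # CCP sampling at 100 Hz


def run_lengths_ms(samples, sample_period_ms=SAMPLE_PERIOD_MS):
    """Collapse a sampled status trace into (status, duration in ms) runs."""
    return [
        (status, len(list(group)) * sample_period_ms)
        for status, group in groupby(samples)
    ]


def degrades_too_early(samples, tmax_ms, factor=5):
    """True if a FLAWLESS run ends before factor * tmax_ms has elapsed.

    The last run is ignored because the recording may end in the middle of it.
    """
    runs = run_lengths_ms(samples)
    return any(
        status == "FLAWLESS" and duration_ms < factor * tmax_ms
        for status, duration_ms in runs[:-1]
    )


# Synthetic traces for a 1200 ms cycle of a message with Tmax = 200 ms.
faulty = (["FLAWLESS"] * 25 + ["NOTAVAILABLE"] * 95) * 2
correct = (["FLAWLESS"] * 100 + ["NOTAVAILABLE"] * 20) * 2

print(run_lengths_ms(faulty)[:2])        # [('FLAWLESS', 250), ('NOTAVAILABLE', 950)]
print(degrades_too_early(faulty, 200))   # True
print(degrades_too_early(correct, 200))  # False
```

With real recordings the sampling jitter would have to be taken into account, for example by allowing a tolerance of one sample period around the threshold.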


Figure 5.4. Signal status variation; the higher line indicates the FLAWLESS signal status, the lower the NOTAVAILABLE signal status. This recording shows the faulty behavior.

Figure 5.5. Signal status variation; the higher line indicates the FLAWLESS signal status, the lower the NOTAVAILABLE signal status. This recording shows the correct behavior.

5.4 Test case 2: EMS Shutdown

The process of shutting down an ECU may be complex and depend on conditions both within the software of the ECU being shut down and in other ECUs. For instance, EMS must wait for EEC3 to shut down in some cases. In general, an ECU must be shut down in such a way that it is possible to start it up again without it entering an undefined state. Another important issue is that data in the RAM mirror must be saved correctly to EEPROM during shutdown. Each time data is saved, a CRC⁵ checksum is calculated and stored in EEPROM. During start up, the checksum of the EEPROM data is calculated again and compared to the one stored when the data was saved. A mismatch indicates a corrupt data area, which should be replaced by pre-defined default values.

⁵ CRC = Cyclic Redundancy Check
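The save-and-verify principle described above can be sketched as follows. The CRC variant used in the ECU is not stated in this report, so the standard CRC-32 from Python's zlib module is used purely as a stand-in, and the function names are hypothetical.

```python
import zlib


def save_block(data):
    """Return what would be written to EEPROM: the data and its checksum."""
    return bytes(data), zlib.crc32(data)


def load_block(stored_data, stored_crc, default_data):
    """Return the stored data if its checksum matches, otherwise the defaults."""
    if zlib.crc32(stored_data) != stored_crc:
        return default_data  # corrupt area: fall back to pre-defined defaults
    return stored_data


defaults = bytes(8)
data, crc = save_block(b"\x01\x02\x03\x04\x05\x06\x07\x08")

# An intact block is read back unchanged at the next start up.
print(load_block(data, crc, defaults) == data)  # True

# A block that was corrupted, e.g. by an interrupted save, is replaced.
corrupted = b"\xff" + data[1:]
print(load_block(corrupted, crc, defaults) == defaults)  # True
```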

The technical analysis shows that S8 in a road vehicle (truck or bus) is powered by the vehicle's battery, or by the generator when the engine is running. Besides the U30 power supply, S8 (like almost any other ECU) has a U15 input signal, which is defined as

$$\mathrm{U15} = \begin{cases} \mathrm{U30} & \text{if ignition is on} \\ 0 & \text{otherwise} \end{cases}$$

When the ignition is turned from on to off, the ECU detects the loss of the U15 signal and initiates its shutdown sequence. During this sequence the ECU assumes that it has full U30 voltage supply. If U30 is lost while the shutdown phase is in progress, the ECU may enter an undefined state and/or corrupt the EEPROM data. This scenario is fully realistic in a truck because trucks are equipped with a master battery switch, usually located outside the cab near the battery, or in some special-purpose vehicles inside the cab. This switch turns off all of the vehicle's electrical systems (with very few exceptions). If this switch is turned off while the ignition is on, S8 will lose power at the same moment as the U15 signal is switched to 0. In the worst case, the ECU may start the EEPROM saving process without having a chance to complete it due to the sudden voltage loss.

The above illustrates the rationale behind developing a test case that tests the correctness of the shutdown process. As with the previous test case, we begin the development process with requirement identification.

5.4.1 Requirement Identification

Four requirements were identified and chosen to be covered by this test case. The abbreviation EXEM in this section refers to a software module within the SYSM manager that controls the initial phase of the execution of an ECU.

Requirement 1: The EMS should turn itself off if and only if the following conditions are fulfilled

1. U15 is OFF for 100 ms

2. No engine movement for 100 ms

3. APPL layer allows shutdown or a timeout has occurred

4. EEC, if present, allows shutdown or a wait timeout has occurred
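The four conditions combine into a single predicate, which can be sketched as follows. The field names are hypothetical; the 100 ms limits and the "or a timeout has occurred" alternatives are taken from the list above.

```python
from dataclasses import dataclass

REQUIRED_STABLE_MS = 100  # U15 off and no engine movement for 100 ms


@dataclass
class ShutdownInputs:
    u15_off_ms: int          # time since U15 went off (0 if U15 is on)
    engine_still_ms: int     # time without engine movement
    appl_allows: bool        # APPL layer allows shutdown
    appl_timed_out: bool     # APPL wait timeout has occurred
    eec_present: bool        # an EEC is part of the configuration
    eec_allows: bool         # EEC allows shutdown
    eec_timed_out: bool      # EEC wait timeout has occurred


def shutdown_allowed(s):
    """True if and only if all four conditions of requirement 1 hold."""
    u15_ok = s.u15_off_ms >= REQUIRED_STABLE_MS
    engine_ok = s.engine_still_ms >= REQUIRED_STABLE_MS
    appl_ok = s.appl_allows or s.appl_timed_out
    eec_ok = (not s.eec_present) or s.eec_allows or s.eec_timed_out
    return u15_ok and engine_ok and appl_ok and eec_ok


all_ok = ShutdownInputs(150, 150, True, False, True, True, False)
engine_running = ShutdownInputs(150, 0, True, False, True, True, False)
print(shutdown_allowed(all_ok))          # True
print(shutdown_allowed(engine_running))  # False
```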


Requirement 2: EXEM shall validate a DTC if the ECU does not perform the correct shutdown actions for a configurable number of times in a row

See description of requirement 3

Requirement 3: EXEM shall validate an INTE as soon as it detects abnormal shutdown conditions

It is important that an incorrect shutdown process is detected by the ECU during the following start-up. A cause may be an interruption of the power (U30) supply or a software/hardware failure. According to a system requirement, an incorrect shutdown must result in an internal event (INTE) and, after a configurable number of times, a DTC. This test case verifies these requirements by provoking an incorrect shutdown and checking that an INTE and a DTC are set. The incorrect shutdown is provoked by cutting the U30 power supply to EMS while it is running, giving it no chance to save files and perform a normal shutdown sequence.
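The interplay between requirements 2 and 3 can be illustrated with a small counter model. This is a sketch under the assumption that the counter of abnormal shutdowns is reset by a normal shutdown ("in a row"); the class and parameter names are hypothetical and the model says nothing about how EXEM is actually implemented.

```python
class ShutdownMonitor:
    """Toy model: INTE on every abnormal shutdown, DTC after N in a row."""

    def __init__(self, max_abnormal_in_a_row):
        self.max_abnormal_in_a_row = max_abnormal_in_a_row
        self.abnormal_in_a_row = 0

    def on_startup(self, previous_shutdown_was_normal):
        """Evaluate the previous shutdown; return (inte_set, dtc_set)."""
        if previous_shutdown_was_normal:
            self.abnormal_in_a_row = 0
            return False, False
        self.abnormal_in_a_row += 1
        dtc_set = self.abnormal_in_a_row >= self.max_abnormal_in_a_row
        return True, dtc_set


monitor = ShutdownMonitor(max_abnormal_in_a_row=3)
print(monitor.on_startup(False))  # (True, False): INTE only
print(monitor.on_startup(False))  # (True, False)
print(monitor.on_startup(False))  # (True, True): third time in a row, DTC
print(monitor.on_startup(True))   # (False, False): normal shutdown resets
```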

Requirement 4: It must be possible to abort a shutdown process until all shutdown actions are acknowledged

There is a system requirement stating that a powertrain control unit should be up and running within 500 ms from ignition key on. Measurements indicate that a complete file saving process takes about 900 ms, meaning that an S8 shutdown takes at least 900 ms. It is therefore impossible to fulfill the 500 ms start-up time requirement if the ignition key is turned on very shortly after it has been turned off and the system performs a complete shutdown and start-up sequence. Therefore there is a system requirement that S8 must interrupt the ongoing shutdown sequence and restore itself to a fully operational ECU if U15 is suddenly turned on while the ECU is saving files and preparing for shutdown as a result of a U15 turn-off.



5.4.2 Test Techniques

For requirement 1 we use a decision table. Based on the conditions in requirement 1 we form a decision table as shown in tab. 5.2. With the help of the decision table we get a good overview of all possible scenarios. Their number is reasonably small in this case. One should be careful with decision tables, as one or more important condition combinations may be excluded when they should not be. To verify the other requirements no special test technique was used, since the test data set and the number of execution flows are limited and can be verified directly.

Table 5.2. Decision table for the S8 shutdown test case. The majority of the combinations of the four conditions were excluded from the test case heuristically, based on the assumption that it is unlikely that two conditions in combination would give a failure if they do not when tested separately. This way we test whether the EMS shuts down when all conditions are fulfilled and does not shut down when one condition at a time is not fulfilled.
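The selection heuristic described in the caption of tab. 5.2 can be sketched as follows: one combination with all conditions fulfilled, plus one combination per condition in which only that condition is not fulfilled. The condition names are hypothetical shorthand for the four conditions of requirement 1, and the enumeration is one reading of the heuristic rather than a reproduction of the internal table.

```python
# Hypothetical shorthand for the four conditions of requirement 1.
CONDITIONS = ("u15_off", "engine_still", "appl_allows", "eec_allows")


def selected_combinations():
    """Return (conditions, expect_shutdown) pairs chosen by the heuristic."""
    all_fulfilled = dict.fromkeys(CONDITIONS, True)
    cases = [(all_fulfilled, True)]
    for name in CONDITIONS:
        one_violated = dict(all_fulfilled)
        one_violated[name] = False
        cases.append((one_violated, False))
    return cases


cases = selected_combinations()
print(len(cases), "of", 2 ** len(CONDITIONS), "combinations selected")
for conditions, expect_shutdown in cases:
    print(conditions, "-> shutdown expected:", expect_shutdown)
```

The remaining combinations, in which two or more conditions are violated at the same time, are the ones excluded by the heuristic.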

5.4.3 Test Flows

Test Flow 1: All Conditions Fulfilled

Note that variable names are modified in order not to disclose internal information; the real variable names are different. Also note that u15 in this case is a boolean variable that is either 0 or 1, indicating whether the ignition is off or on respectively. During this test flow we assume that the EEC3 unit does not request stay-alive. We also assume that the signal status of the EEC3 stay-alive request is FLAWLESS at all times. Note: the variable u15_allow_shutdown is used to determine whether the U15 condition for shutdown is fulfilled. For the U15 condition to be fulfilled, u15 must be 0 for 100 ms.


Test Flow 2: Engine does not Stop

In this test flow we test that the EMS does not turn off when the engine is running. S8 is not allowed to turn itself off until there has been no engine movement for some time. In this test flow we verify that the EMS stays turned on for as long as the engine is on. There is no timeout, so EMS may theoretically stay up for an unlimited amount of time as long as the engine is running, but since we cannot test for an unlimited time we must define a time limit. We set 60 seconds as it appears to be a reasonable time limit. It is reasonable to assume that EMS either shuts itself down soon after all conditions for shutdown except engine movement have been fulfilled (which would mean that this test case fails) or stays up as long as the engine is moving. Even if the EMS were to shut down after more than 60 seconds, the consequences are not deemed severe.



Test Flow 3: APPL Layer Holds Shutdown due to Adaptations

(Test flow table not reproduced. It refers to the names ADAPTATION_TIMEOUT and adaptation_shutdown_halt.)


Test Flow 4: EEC3 halts the Shutdown

(Test flow table not reproduced. It refers to the name adaptation_shutdown_halt.)

5.4.4 Detect an Abnormal Shutdown and Set a DTC and Internal Event (INTE)

Test Flow 5

Pt | Action | Expected result
1 | Turn U15 on and connect XCOM to the power train ECUs. |
2 | Clear DTCs, DECs and INTEs |
3 | Turn battery off (U15+U30) |
4 | Turn battery back on, turn U15 on and reconnect XCOM to the ECU (if necessary) | Check INTEs; an INTE in module (reset cause check) should be set.
5 | Clear all DTCs/DECs/INTEs |
6 | Perform battery turn off and turn back on the number of times given by MAX_NO_OF_ABN_SHUTDOWNS | Check that a DTC 0xF001 (Incorrect EMS shutdown) is set

5.4.5 Possibility to Cancel a Shutdown in Progress

Since it is not known exactly how much time the file saving process takes, we cannot put a time requirement on how long it should be possible to interrupt a shutdown after U15 off. This case is therefore based on first measuring how much time a file saving process takes and then performing the scenario again, but with U15 turned back on after half the measured time. Since variations in file saving time of more than ±50% seem very unlikely, we consider the system not to behave according to the requirement if it is not possible to interrupt a shutdown after half the measured time.
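The two steps of this procedure can be sketched as follows. The first helper picks the instant at which U15 is turned back on; the second decides from a recording whether the ECU was reset, by checking whether the counter of time since ECU on ever decreases. Both function names are hypothetical, and the sample values are invented for illustration only.

```python
def u15_reenable_delay_ms(measured_save_time_ms):
    """Turn U15 back on after half the measured file saving time."""
    return measured_save_time_ms // 2


def ecu_was_reset(uptime_samples_s):
    """True if the uptime counter ever drops, i.e. the ECU restarted."""
    return any(
        later < earlier
        for earlier, later in zip(uptime_samples_s, uptime_samples_s[1:])
    )


# With a measured saving time of about 900 ms, U15 is re-enabled after 450 ms.
print(u15_reenable_delay_ms(900))  # 450

# Invented uptime samples: a counter that keeps growing means the shutdown
# was aborted, a counter that restarts from zero means the ECU went down.
print(ecu_was_reset([6.0, 6.5, 7.0, 7.5]))  # False
print(ecu_was_reset([6.0, 6.5, 0.1, 0.6]))  # True
```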

Test Flow 6

5.4.6 Test Results

This test case was run in the test automation environment, which is a Hardware-In-the-Loop (HIL) rig containing the ECUs EMS (S8), GMS (OPC5), EEC3, COO7 and ICL2. A selection of the most interesting results is presented here.

Test Flow 1

Figure 5.6. Vision recording of the scenario given in test flow 1 of this test case. The recording was stopped after the ECU (EMS) stopped responding to CCP, and we assume that it powers itself down directly after that. This means that the last samples in this recording are the last samples from the ECU before shutdown. At approx. 23.5 s the ignition (U15) is turned off and approx. 100 ms after that the u15_allow_shutdown condition becomes true. At approx. 25.5 s the engine stops moving and the eec_allow_shutdown conditions become true, and after approx. 1.2 s the EMS stops responding to CCP. This 1.2 s can be explained by the time it takes to save the NVM data.

Test Flow 2 – 4

The behavior in these test flows is very similar to that in the test flow 1 result. The difference is only in details and variable names.

Test Flow 5

By inspection with XCOM it was found that the INTE code was set when the U30 switch was shut off while the ECU was running in the HIL lab. Also, after one shutdown performed by cutting off U30 to the control unit, the DTC 0xF001 Incorrect EMS shutdown was set.

Test Flow 6

Please observe that in the following plots the units on the y-axis differ between lines. Boolean variables (u15 and e2saving_in_progress) and time counters, which have the unit seconds, are mixed in the same plot.

Figure 5.7. U15 is turned off at approx. 6.1 s and then turned back on at approx. 7.3 s. As we can see, the counter that counts the time since the ECU was turned on did not reset. This means that the ECU went back to its fully operational state without being turned off first.

Figure 5.8. This is the scenario where the EMS actually does turn itself off, as we can see from the run time counter. The period during which the counters (ecuon_time_counter and e2saving_time) stay constant is the time when the EMS is down and Vision is not receiving any samples over CCP.
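The two outcomes shown in figures 5.7 and 5.8 can be told apart automatically from the run time counter. The following sketch assumes the counter is available as a plain list of samples in which the last value is held while the ECU is silent; the function names and the example traces are hypothetical.

```python
def counter_was_reset(counter):
    """True if the uptime counter ever decreases between consecutive samples,
    which indicates that the ECU was powered down and started again."""
    return any(b < a for a, b in zip(counter, counter[1:]))


def longest_flat_period(times, counter):
    """Longest time span during which the counter value did not change.

    A long flat period suggests that no new samples arrived, i.e. the ECU
    was down and the recording tool held the last value.
    """
    longest = 0.0
    start = 0
    for i in range(1, len(counter)):
        if counter[i] != counter[start]:
            start = i  # a new run of equal values begins here
        else:
            longest = max(longest, times[i] - times[start])
    return longest


# Shutdown cancelled: the counter keeps increasing (as in figure 5.7).
resumed = [5.0, 6.0, 7.0, 8.0]
# ECU power cycled: the counter is held, then restarts (as in figure 5.8).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
cycled = [5.0, 6.0, 6.0, 6.0, 0.5, 1.5]

print(counter_was_reset(resumed), counter_was_reset(cycled))
print(longest_flat_period(times, cycled))
```

A reset together with a flat period gives two independent indications of a real power-down, which helps guard against a single noisy sample being misread as a restart.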

Chapter 6

Conclusions and Suggestions for Future Work

6.1 Conclusions

In general, many things can go wrong during an ECU or vehicle start-up and shutdown. To classify the areas of highest risk, one needs experience in developing or testing the ECU software (preferably both) and thorough knowledge of its internal structure. The author’s communication with platform software developers was of great help in making progress and in identifying some high-risk problem areas.

Through communication (informal interviews) with other developers and testers at the department and analysis of technical documents, fault reports and the fault code database, some problem areas were identified as probably the most high-risk, and two test cases were developed. These problems can be coarsely partitioned into:

• Start-up problems such as CAN communication timeouts and fault codes that may be false or misleading. A great number of these problems can most likely be explained by a previous improper shutdown.

• Cranking problems, which are a result of a sudden voltage drop on the U30 supply line to all the ECUs. At very cold temperatures (down to −40 °C) this problem is even worse than what is shown in fig. 5.2. In such cases the voltage may reach 0 V and stay at this level for a significant amount of time [Interview with the platform architect at NE, Feb. 2011].

• Shutdown problems, which comprise all cases where an ECU is turned off unexpectedly due to either a deliberate or non-deliberate U30 cut-off or an unexpected ECU reset caused by software (or hardware) problems. A common denominator for these scenarios is that the ECU has little chance of saving its data to NVM. It can be compared to typing a document in a word processor on a desktop computer and then having the computer suddenly lose power before the document has been saved.

The assignor’s requirement was to develop several test cases related to identified problem areas within the start-up and shutdown of the ECUs and, if time and/or resources allowed, also to implement the test cases in either a test automation rig or a live vehicle. When the first test case was developed (the test specification was produced) at the end of March 2011, it was clear that there was no time to go on with the implementation phase. The natural next step for future work (a recommendation to Scania) is therefore to implement the test case EMS-EEC CAN Timeout as well as part of the EMS Shutdown test case as a HIL script.

One of the main conclusions of this work is that there is no simple solution to the start-up and shutdown problems. Also, system test alone is not a sufficient approach to attack these problems. Many things happen “under the surface” during start-up and shutdown, which requires a more white-box approach. In addition, many of the problems discovered are not caused by software bugs. This mainly applies to problems that arise due to battery voltage or interruption of the system electrical supply. Therefore other ways, such as module testing and possibly hardware redesign, must be considered.

6.2 Future Work Suggestions

One natural continuation of this work would be to investigate further which files in the EMS S8 system are most critical to save correctly and to add test flows focused on these to the “S8 Shutdown” test case. Also, it must be verified that data that is not allowed to be written to non-volatile memory, such as EOL data, is not written at shutdown. This can be done by modifying the configuration parameters in the RAM mirror through CCP and shutting down the unit being tested. At the next start-up the RAM mirror must contain the original value of the parameter, i.e. it must not be affected by the modification made before shutdown. Either all configuration parameters can be tested this way, or a heuristic selection can be made of the most important ones.
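The EOL check described above follows a modify, power cycle, compare pattern, which can be sketched as below. The sketch runs against a toy simulation rather than a real ECU: the class SimulatedEcu, its methods and the parameter name axle_ratio are all hypothetical, and in the rig the reads and writes would go through CCP.

```python
class SimulatedEcu:
    """Toy stand-in for an ECU with NVM contents and a RAM mirror."""

    def __init__(self, nvm, writes_eol_at_shutdown=False):
        self.nvm = dict(nvm)
        self.ram = dict(nvm)  # RAM mirror is loaded from NVM at start-up
        self.writes_eol_at_shutdown = writes_eol_at_shutdown

    def write_ram(self, name, value):
        self.ram[name] = value

    def read_ram(self, name):
        return self.ram[name]

    def power_cycle(self):
        """Shut down and start up again."""
        if self.writes_eol_at_shutdown:  # the faulty behaviour being tested for
            self.nvm.update(self.ram)
        self.ram = dict(self.nvm)


def eol_parameter_preserved(ecu, name, test_value):
    """Modify a parameter in the RAM mirror, power cycle the unit and check
    that the original value is back after start-up."""
    original = ecu.read_ram(name)
    if test_value == original:
        raise ValueError("test value must differ from the original value")
    ecu.write_ram(name, test_value)
    ecu.power_cycle()
    return ecu.read_ram(name) == original


good = SimulatedEcu({"axle_ratio": 3})
bad = SimulatedEcu({"axle_ratio": 3}, writes_eol_at_shutdown=True)
print(eol_parameter_preserved(good, "axle_ratio", 4))
print(eol_parameter_preserved(bad, "axle_ratio", 4))
```

Looping eol_parameter_preserved over a list of parameter names covers both options mentioned in the text, either all configuration parameters or a selected subset.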

Another suggestion is to implement the given test flows in a test automation environment, specifically the HIL rigs, to make them more useful as regression tests. The test flows may have to be modified in order to be “scriptable”, as they are currently written on the presumption that they are run manually in the HIL rig. Instead of inspecting the signal and/or testcom struct graphs manually in ATI Vision, a script can be written to go through the data arrays and compare the values of samples at critical time points with predefined reference arrays.
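A comparison of recorded samples against reference values at critical time points could look roughly like the following. This is a sketch under assumptions: the recording is taken to be available as two parallel lists (timestamps in ascending order and values), and the function names, checkpoint format and example data are hypothetical.

```python
from bisect import bisect_right


def sample_at(times, values, t):
    """Value of the most recent sample at or before time t (zero-order hold)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no sample at or before the requested time")
    return values[i]


def check_against_reference(times, values, checkpoints, tolerance=0.0):
    """Compare the signal with reference values at critical time points.

    checkpoints is a list of (time, expected_value) pairs. Returns a list of
    (time, expected, actual) tuples for every checkpoint that does not match.
    """
    failures = []
    for t, expected in checkpoints:
        actual = sample_at(times, values, t)
        if abs(actual - expected) > tolerance:
            failures.append((t, expected, actual))
    return failures


# Illustrative data: a boolean signal such as u15 recorded once per second.
times = [0.0, 1.0, 2.0, 3.0]
u15 = [1, 1, 0, 0]

print(check_against_reference(times, u15, [(0.5, 1), (2.5, 0)]))
print(check_against_reference(times, u15, [(2.5, 1)]))
```

Returning the list of mismatches rather than a single boolean lets a regression report state which time point deviated and by how much.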

Failure report analysis shows that many problems in the development or field test vehicles are due to battery/electrical problems and supply voltage variations. One of the problems is that it is possible to shut down an ECU in an improper way by switching off the master battery switch while the ignition is on. An intuitive thought is that this problem can be solved by introducing a backup battery in each ECU, independent of the main battery. When U30 is turned off, the ECU can then use the backup battery power to perform a proper shutdown, which would reduce the number of misleading fault codes significantly. This would, however, require a major hardware redesign of the ECUs. The author of this report is unable to decide whether it is worth the effort, but recommends that the matter be investigated.

Bibliography

[AST] Pär Eriksson, Jim Pettersson, Implementation of a System for Automatic Software Verification, diploma thesis. Umeå University, 2006.

[BACH] James Bach, Good Enough Quality: Beyond the Buzzword. IEEE Computer Society, 1997.

[CANEMB] Gianluca Cena, Adriano Valenzano, Controller Area Networks for Embedded Systems. Taylor & Francis Group, LLC, 2009.

[CANTIMING] Thomas Nolte, Timing Analysis of CAN-Based Automotive Communication Systems. Mälardalen University, Taylor & Francis Group, LLC, 2008.

[ECUSK10] ECU systemkurs. Scania CV AB (Internal Document), 2010.

[HYB10] Lecture Notes in Hybrid and Embedded Control Systems, Lecture 1. KTH, School of Electrical Engineering, Stockholm, 2010.

[KVASERCAN] Kvaser CAN home page. Kvaser, http://www.kvaser.com/can. Accessed 2011-03-30.

[MSDNRT] Regression Testing. MSDN Library, http://msdn.microsoft.com/en-us/library/aa292167

[MYERS] Glenford Myers, The Art of Software Testing. John Wiley and Sons, 2004.

[OBD] OBD. KMB Systems, http://www.kbmsystems.net/obd_tech.htm


[RTOS] Giorgio Buttazzo, Real-Time Operating Systems: Problems and Novel Solutions. University of Pavia, Vol. 2469 of Lecture Notes in Computer Science, Springer-Verlag, pp. 37-51, 2002.

[SDS8] System description EMS S8. Scania CV AB (Internal Document), 2008.

[TBENV] Technical Regulation - Requirements and verification methods for electrical factors in a 24V system. Scania CV AB (Internal Document), 2010.

[TBJ1939COM] Technical Regulation - Data communication requirements, control units connected to a SAE J1939 network segment. Scania CV AB (Internal Document), 2010.

[TDP] Torbjörn Ryber, Testdesign för programvara. No Digit Media, ISBN 91-976062-1-9, 2006.

[WPCAN] Controller Area Network. Wikipedia article, http://en.wikipedia.org/wiki/Controller_area_network

[WPEMSTD] European Emission Standards. Wikipedia article, http://en.wikipedia.org/wiki/European_emission_standards

