An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly...

22
An Efficient Virtual System Clock for the Wireless Raspberry Pi Computer Platform Diego L. C. Dutra, Edilson C. Corrˆ ea, Claudio L. Amorim February 2016 Abstract The use of Dynamic Voltage and Frequency Scaling (DVFS) by Energy- Efficient (EE) computer systems considerably increases the requirements regarding the design of efficient system clocks. On the one hand, the operation of a system clock must support the independent operating fre- quencies of the processor core units, the dynamic migration of the running processes between the processors core units, and the use of synchroniza- tion and time interpolation techniques to maintain the accuracy of the system clock. On the other hand, an efficient system clock has to mini- mize the overhead of its own operation, aiming at energy efficiency of EE computer systems. In this paper, we present the design and evaluation of the RVEC virtual system clock for the EE Wireless Raspberry Pi (RasPi) platform. In the RasPi platform, the use of DVFS for reducing the energy consumption hinders the direct use of the cycle count (CCNT) of the ARM11 processor core for building an efficient system clock. Therefore, a distinct feature of RVEC is to obviate this obstacle, such that it can make use of the CCNT circuit for precise and accurate time measurements, concurrently with the use of DVFS by the operating system of the ARM11 processor core. Specifically, this paper presents the design and experimental evaluation of an implementation of the RVEC virtual system clock in the Linux kernel of the RasPi platform with DVFS. Our experimental results validate the RVEC virtual system clock as an efficient system clock for the EE RasPi platform that runs the Linux operating system. 1 Introduction In computer systems, system clocks provide the time measurements that are fundamental to the development and assessment of computer programs. Similarly, distributed applications, including financial transactions, distributed databases, and multiplayer games depend on accurate and reliable system clocks. Conventionally, system clocks guarantee correct and precise time measurements by using a standard technique based on a time-invariant cycle count, such as the 1

Transcript of An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly...

Page 1: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

An Efficient Virtual System Clock for the

Wireless Raspberry Pi Computer Platform

Diego L. C. Dutra, Edilson C. Correa, Claudio L. Amorim

February 2016

Abstract

The use of Dynamic Voltage and Frequency Scaling (DVFS) by Energy-Efficient (EE) computer systems considerably increases the requirementsregarding the design of efficient system clocks. On the one hand, theoperation of a system clock must support the independent operating fre-quencies of the processor core units, the dynamic migration of the runningprocesses between the processors core units, and the use of synchroniza-tion and time interpolation techniques to maintain the accuracy of thesystem clock. On the other hand, an efficient system clock has to mini-mize the overhead of its own operation, aiming at energy efficiency of EEcomputer systems.

In this paper, we present the design and evaluation of the RVEC virtualsystem clock for the EE Wireless Raspberry Pi (RasPi) platform. In theRasPi platform, the use of DVFS for reducing the energy consumptionhinders the direct use of the cycle count (CCNT) of the ARM11 processorcore for building an efficient system clock. Therefore, a distinct feature ofRVEC is to obviate this obstacle, such that it can make use of the CCNTcircuit for precise and accurate time measurements, concurrently withthe use of DVFS by the operating system of the ARM11 processor core.Specifically, this paper presents the design and experimental evaluation ofan implementation of the RVEC virtual system clock in the Linux kernelof the RasPi platform with DVFS. Our experimental results validate theRVEC virtual system clock as an efficient system clock for the EE RasPiplatform that runs the Linux operating system.

1 Introduction

In computer systems, system clocks provide the time measurements thatare fundamental to the development and assessment of computer programs.Similarly, distributed applications, including financial transactions, distributeddatabases, and multiplayer games depend on accurate and reliable system clocks.Conventionally, system clocks guarantee correct and precise time measurementsby using a standard technique based on a time-invariant cycle count, such as the

1

Page 2: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Time Stamp Counter (TSC). Recently, Energy-Efficient (EE) computer systemsthat use multicore processors with Dynamic Voltage and Frequency Scaling(DVFS) [1] required some advances in the design of system clocks. Specifically,a system clock now must support the independent operating frequencies of theprocessor core units and the dynamic migration of the running processes betweenthe processor core units. However, such a design of a system clock also facesthe negative effects of DVFS on its efficiency, which can potentially reduce boththe precision and accuracy of its time measurements.

To counteract the negative effects of DVFS, a system clock frequently main-tains its accuracy by using a synchronization mechanism, such as NTP [2], theFlooding Time Synchronization Protocol (FTSP) [3] or the Timing-sync Pro-tocol for Sensor Networks (TPSN) [4]. In addition, to increase the accuracy ofa system clock, it is also necessary to resort to some complementary techniquethat implements a time interpolation method in the operating system. How-ever, both types of techniques have not yet provided an efficient system clockwith the property of strictly increasing and precise (SIP) time counting 1 [5].Moreover, the design of an efficient system aiming at energy efficiency of EEcomputer systems clock must also minimize the cost of operation of the systemclock.

In this paper, we use the Raspberry Pi [6] (RasPi) computer as a repre-sentative platform of EE computer systems[]. The RasPi platform is based onthe 32-bit ARM1176JZF-S [7] processor core and is used especially in embed-ded computing systems [8] and Wireless Sensor Networks (WSNs) [9, 10, 11]for remote sensing and monitoring applications [12, 13] because of the energyefficiency of the ARM processors. The sources of the timing events of the RasPiplatform consist of periodic interrupt scheduling and the system time counter(STC). STC is a free-running counter that operates at 1 MHz, which the Linuxoperating system uses to implement a time interpolation technique. However,the use of DVFS prevents the direct use of the CPU cycle count (CCNT) forbuilding an efficient system clock to perform the time measurements (Broom-head et al. [14] report equivalent issue for the x86 architecture cycle counter(TSC)).

This paper extends the material presented in our previous publication tothe RasPi platform [15] by extending the experimental results of the RVEC SIPproperty [5] of both RVEC and the CCNT counter of the ARM11 processorand presents an evaluation of the impacts of the NTP synchronization on thesystem clocks of the RaspPi platform. The contributions of this paper are thefollowing:

• Development of an RVEC implementation in the Linux kernel of the RasPiplatform;

• Assessment of the SIP property of both RVEC and the ARM CCNTcounter;

1A system clock with the SIP property assures that two consecutive clock readings, T1 andT2, will return time measurements T2 > T1 for any time interval between the two readings

2

Page 3: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

• Validation of the RVEC virtual system clock as an efficient system clockfor a computer system that runs Linux;

• Evaluation of the impacts of use of NTP on the system clocks of the RasPiplatform.

The remainder of this paper is organized as follows. Section 2 presentsrelated work. In Section 3, we describe the organization and integration ofRVEC to protect it from the time drifts that appear in the Linux kernel of theRasPi computer. In Section 4, we present results of our experimental evaluationof RVEC and the analysis of NTP synchronization on the RaspPi platform.Finally, in Section 5, we present our conclusion.

2 Related work

Dutra et al. [5] introduced the original design and implementation of RVECin the Intel Xeon processor family with DVFS. The authors also showed thatthe nodes of a cluster of computers can remain synchronized globally after eachof the cluster nodes initializes its local RVEC by using a remote synchronizationclient-server algorithm.

Veitch et al. [16] developed the RADClock system clock, which is builton existing system clocks or cycles of time counters such as the TSC. RAD-Clock provides information on the global time and the absolute global time forsynchronization of computer network nodes. However, RADClock depends onNTP for periodic resynchronization and also has limited applicability to multi-core processors because of its implementation method.

Tian et al. [17] presented a global clock that uses the TSC as the baseclock circuit together with a remote clock synchronization algorithm, which issimilar to NTP. Because of the direct use of TSC, such a global clock cannotwork properly with processors that have multiple core units or use DVFS.

Souza et al. [18] proposed an auxiliary synchronization network that usesa remote pulse generator to ensure that all nodes of a cluster of computerssimultaneously receive the remote clock pulse, which each node uses to updateits local clock without involving the operating system. Although the proposedsolution guarantees a SIP time count, it depends on dedicated hardware and isnot applicable to wireless networks.

With regard to the maintenance of the accuracy of the system clock, thepredominant solution is to run the NTP daemon that periodically resynchronizesthe system clock by using the remote NTP server [19]. However, under a heavyworkload, the execution of the NTP daemon is often delayed, which causes timedrifts in the system clock of up to tens of seconds [20]. Moreover, if consecutivereadings of the system clock are issued within time intervals shorter than tensof milliseconds the system clock cannot guarantee the SIP property of the timecounting. Hong et al. [21] evaluated the accuracy of the system clock based onNTP synchronization by using local GPS hardware that provides an absolute

3

Page 4: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

time reference. The authors reported results that indicate a median error of2 ms to 5 ms in the system clock by using the standard NTP clients, whereasthe work [20] evaluated the NTP clients that cause over-utilization of the CPU.By considering the results of both works, we can infer that the use of NTPsynchronization is still less efficient for WSNs in which the stability of wirelesslinks is not guaranteed and the WSN nodes have lower processing capacity; thus,the NTP synchronization cannot ensure the SIP property of the time countingfor system clocks used in WSNs.

3 RVEC: A Strictly Increasing and Precise Vir-tual Clock

Currently, the system clocks of a multicore processor that runs Linux willreceive periodic interrupts from the auxiliary circuits of multiple time-invariantcycle count that operate at a lower frequency than that of the multicore pro-cessor. As a result, the time measurements of the system clocks will have lowerprecision and its accuracy will be determined by the degree of stability of thetime intervals between the hardware interrupts. In practice, the system clocksare exposed to other interrupts that occur within variable time intervals, so,the efficiency of system clocks will depend on the use of a clock synchronizationmechanism, such as NTP or TPSN.

Other complementary solutions for improving the accuracy of system clocksimplement a time interpolation technique in the operating system by using thecycle count of the faster processor, without guaranteeing the SIP property for thesystem clocks. Furthermore, the periodic execution of a synchronization mecha-nism counteracts the energy efficiency of an EE computing system, whereas thehandling of interrupts increases the overheads of power consumption in battery-powered embedded systems.

In past work [5], we introduced the RVEC virtual system clock, which webriefly review next. Assume a running application in a multicore processorwith DVFS. Over the execution time of the application, the passage of timewill evolve accordingly to its execution time intervals, each of which dependson the operating frequency of the core unit used in the specific time interval.Therefore, the implementation method of RVEC focuses on tracking each of theexecution time intervals and the accumulated execution time intervals for therunning applications. To this end, RVEC implements a virtual system clock byusing a simple data structure stored in main memory composed of a core unitcycle count and the elapsed time, as follows. Note that our choice of using thecycle count of the core units has some significant advantages: the cycle countoperates at the same high frequency of the core unit and generates no interrupts;moreover, the core unit cycle count is based on an oscillator that is highly stableand accurate.

The RVEC implementation uses a data structure composed of two fields:base count and the age time. The base count field stores the last value read

4

Page 5: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

from the core unit register, which indicates the current number of cycles. Theage time ns field stores the elapsed time between the time of the RVEC creationup to the time indicated by the base count field, which is the last value read fromthe core unit register. This way, the data structure must be updated wheneverthe core unit frequency changes. Figure 1 shows the update procedure used bythe implementation of RVEC.

Figure 1: RVEC update procedure when DVFS is used

Figure 2 illustrates the steps of the RVEC update procedure used for an EEcomputer system that support DVFS. In the figure, we note that just before thecore unit operating frequency changes, the value of the core unit cycle countis stored in the base count field, after which the age time ns field is updated.Specifically, the figure shows the change of the operating frequency of a processorcore from 800 MHz to 400 MHz when the cycle count reaches the value 4. Byusing the procedure shown in Figure 1, the RVEC base count and age time nsfields will be updated to 4 ns and 5 ns, respectively.

Figure 2: The time calculation performed when the operating frequency of aprocessor core changes

Figure 3 presents the RVEC read procedure, which is applied to the previousexample of Figure 2. Specifically, at the instant when the core unit cycle countreaches the value 8, the reading of RVEC will return the value 15 ns. The com-bination of RVEC data structure with the maintenance and reading procedurescreates a virtual timekeeping mechanism that supports a time count with theSIP property.

3.1 Implementation of the RVEC in the RasPi platform

The RasPi platform has been used in various sensing projects [12, 13], asthe computing platform for building Wireless Sensor Networks (WSN) [9]. In

5

Page 6: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 3: The RVEC reading procedure

fact, the energy efficiency of a computing platform is an important issue forWSN applications, increasing the attractiveness of using of an embedded devicethat can use DVFS to reduce its energy consumption during periods of lowutilization, although the use of DVFS can affect some of the circuits used bythe system clock. In this regard, our Linux implementation of RVEC in theRasPi platform introduces an efficient solution that enables the use of both theARM processor cycle count- namely, the Cycle Count Register (CCNT) [22]-and DVFS for energy-efficient WSN applications.

To work properly, we integrated RVEC into the Linux kernel boot process.For instance, in the x86 platform, the core unit running the boot process isthe same one that starts the operating system kernel and performs the initialconfiguration of RVEC [5]. In contrast, in the RasPi platform, the GPU startsand executes most of the boot process, rather than the core unit.

The initialization process of the RasPi platform consists of four stages, asfollows. First, the VideoCore GPU starts the execution of the first-stage boot-loader that is stored in the ROM of the BCM2835 SoC. After, the first-stagebootloader reads the SD card and transfers the second-stage bootloader (boot-code.bin) from the SD card to its L2 cache. Next, the execution of the second-stage bootloader enables the SoC DRAM and loads start.elf- i.e., the third-stagebootloader- from the SD card into the DRAM, where start.elf is also the GPUfirmware. start.elf then splits the DRAM space between the GPU and the ARMcore through the the use of the VideoCore GPU MMU (Memory ManagementUnit). After that, start.elf loads kernel.img, which is the binary file that con-tains the Linux kernel, into the Raspi platform and sends a reset signal to theARM processor core. Finally, the ARM1176JZF-S processor core executes thekernel.img thereby loading the operating system and starting the RVEC initial-ization procedure.

The ARM1176JZF-S processor core includes the CP15 system control copro-cessor that supplies the Cycle Count Register (CCNT), which is a CPU cyclecount equivalent to the TSC in the x86 architecture. However, unlike the TSC,access to the CCNT register by RVEC is enabled during the initialization pro-cedure of the kernel. Figure 4 shows how the processor core reaches the CCNT.The access to the CCNT count is performed by reading the c15 register of theCP15 coprocessor. The CCNT is always available within the operating systemkernel whereas the MCR instruction is used to read the CCNT, in which theinstruction fields must have the following data.

• Opcode 1 defined as 0;

6

Page 7: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 4: ARM1176JZF-S processor and its Coprocessor CP15

• CRn defined as c15;

• CRm defined as c12;

• Opcode 2 defined as 1.

Afterwards, we addressed the problem of the count limitation of the CCNT’s32-bit count. Clearly, given a core unit that operates at a 1 GHz frequency,CCNT overflow occurs every 4.295 s. Thus, the use of CCNT for measuringtime intervals greater than 4.3 s will violate the SIP property of the time count-ing. Therefore, we expanded the 32-bit CCNT cycle count to a 63-bit CCNTcycle count by using the macro cnt32 to 63 available in the Linux kernel 2; thisexpansion is performed during the Linux task scheduling.

The data structure struct tb shown in Figure 5 corresponds to an imple-mentation of the conceptual data structure of RVEC by using the base countand age time ns fields described above. For the correct operation of RVEC, wemust include one instance of the struct tb data structure in both the datastructures that support the processing queue of the core unit (struct rq) anda task in the operating system (struct task struct).

The data structure struct tb shown in Figure 5 is the implementation ofthe conceptual data structure of RVEC described earlier in this section, withthe fields base counter and age time ns. For its correct operation of RVEC, wemust include one instance of the struct tb data structure in the data struc-ture representing the core processing queue (struct rq) and, also, in the datastructure that represents a task in the operating system (struct task struct).

The final implementation of the RVEC for Linux running in the ARM plat-form is shown in Figure 6. To simplify the display of RVEC implementation,the figure shows a hypothetical multiprocessor with two core units. In the fig-ure, we can see both subsystems of Linux that have been adapted to guarantee

2include/linux/cnt32 to 63.h

7

Page 8: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 5: RVEC data structure

the correct operation of RVEC. The double adaptation of the Linux kernel isnecessary to ensure that the changes to the processor frequency will not causethe effect in cascade of RVEC updates on all running applications tasks in theprocessor core. Of note, this implementation of the RVEC technique allows dif-ferent running applications tasks in the RasPi platform, each of which can checkits associated RVECs through the system call clock gettime() [23] by requiringonly that each of the applications tasks use the clock identifier CLOCK RVECas the input parameter.

Figure 6: Overview of the RVECs core unit and RVECs application task

To enable the proper operation of RVEC, the necessary adaptations tothe Linux DVFS drive are shown in Figure 7. The changes were performedin the bcm2835-cpufreq device driver 3 developed for the ARM11 family ofprocessors, which includes the ARM1176JZF-S processor of the RasPi plat-form. The original function bcm2835 cpufreq set clock()used to change theoperating frequency of the processor core was extended to include two newfunctions- namely, rvec cpu freq change pre() and rvec cpu freq change pos().The rvec cpu freq change pre() function for RVEC works on the previous change(change pre) configuration of the operating frequency, which occurs in the man-agement subsystem of the core unit running the update RVEC procedure. Afterchanging the operating frequency of the processor core, the function rve cpu freq change pos()adds to the current sum of elapsed times the time spent using the current fre-quency, since the beginning of the execution of the rvec cpu freq change pre()function, after which the current value of core unit CCNT register is stored inbase count.

3./drivers/cpufreq/bcm2835-cpufreq.c

8

Page 9: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 7: Modifications of the bcm2835-cpufreq driver used by RVEC

4 Experimental Evaluation

In this section, we present an experimental evaluation of the RVEC virtualsystem clock for the RasPi platform. First, we assess the impact of using NTPon the time synchronization of a WSN node based on the Wi-Fi RasPi platform,followed by an evaluation of the efficiency of the RVEC-based timekeeping tech-nique. Table 1 summarizes the characteristics of our experimental platform. Allresults have a confidence interval of 99.9% for the sample mean.

Table 1: The RasPi Platform - Components

Component DescriptionRaspberry Pi Model B+

Processor ARM1176JZF-SDefault ClockSource STC (1 MHz)

Linux 3.2.27+

4.1 On using the NTP synchronization of the WiFi RasPiplatform

In computer systems, the system clock usually maintains its accuracy throughperiodic synchronization with an absolute time reference, such as the NTP. How-ever, in heavily loaded computer systems, NTP-based synchronization cannotrun frequently enough to compensate for the time drifts of the system clocks,which in turn reduces the accuracy of the time counting. In practice, evenfor a lightly loaded computer system, time adjustments based on the NTP can

9

Page 10: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

advance the local system clock far enough that the deadline of a running appli-cation becomes compromised, as we will show next.

We collected the experimental results from the corrections performed bythe NTP daemon on the system clock for three distinct computing platforms: awired desktop computer, a wired RasPi platform, and a wireless RasPi platformusing the WPi dongle. For each of the three computing platforms, we evaluatedthree distinct time intervals between the NTP queries. In the first experiment,we executed the NTP queries continuously with no time interval between thequeries. In the next two experiments, we inserted intervals of 10 s and 600 sbetween the NTP queries. We performed each of the experiments 200 times,each of which sent 20 queries to the NTP server. The computing platformwas lightly loaded by using minimal system services, one NTP client and amonitoring application. We also discarded the execution of the experiment inwhich the time interval between two successive NTP queries did not obey thelimit set to the NTP query rate because of some network failure. Note thata loaded computer system can further degrade the accuracy of a local systemclock because the hardware of a cycle counter issues interruptions that can bemissed.

Table 2: ∆ Statistics of the NTP updates to a Local System ClockPlatform Time Interval (s) µ (ms) σ (ms)

Intel i7-870 Ethernet 0 s 0.503 0.112Intel i7-870 Ethernet 10 s 0.660 0.235Intel i7-870 Ethernet 600 s 10.169 0.651

Raspberry Pi Ethernet 0 s 0.640 0.415Raspberry Pi Ethernet 10 s 0.708 0.625Raspberry Pi Ethernet 600 s 20.348 4.232

Raspberry Pi WiFi 0 s 1.486 1.019Raspberry Pi WiFi 10 s 2.042 3.995Raspberry Pi WiFi 600 s 29.174 25.137

Table 2 shows the results of the nine experiments. The first column identifiesthe computing platform, the second column shows the time interval betweenthe NTP queries, and the third column presents the average time adjustmentthat the NTP required of the local clock. The last column shows the standarddeviation of the time measurements.

The first experiment creates an ideal case for the NTP service, which runscontinuously in a lightly loaded computer system. In this ideal case, the tablereveals that a running application will be unable to respond to the alerts thatoccur within time intervals shorter than 1 ms because the NTP service requires1 ms to perform only two consecutive updates to the system clock of a wireddesktop computer.

The results of Table 2 contrast with the execution of the NTP service whetherperformed by a desktop computer or the RasPi platform. Although both com-puting platforms can reach the NTP server over the Ethernet connection, theRasPi platform compared with the desktop computer, performed updates ofgreater time length to its local clock within the same time interval between

10

Page 11: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

NTP queries.Furthermore, the use of a Wi-Fi connection between the RasPi platform and

the NTP server requires a higher synchronization rate of NTP updates to thelocal system clock. The Wi-Fi communication of the RasPi platform increasedthe average update time from 0.708 ms to 2.042 ms for the time interval of10 s between queries with a standard deviation 6.4 times greater. Moreover, inthe case of a time interval of 600 s between queries, the average update time of29.174 ms is 43.38% bigger, and the standard deviation is 5.94 times larger.

Considering the above results, which confirm the observations we found inprevious work [24], we expect that a conventional approach to the design ofa WSN composed of nodes based on the RasPi platform that uses NTP forthe synchronization of the system clocks will significantly limit the accuracy ofthe time measurements used by the distributed applications. In fact, Corbett etal. [25] presents Spanner, a globally-distributed database based on an implemen-tation of TrueTime API that keeps clock uncertainty small at less than 10 msby using GPS and atomic clocks. The authors showed that Spanner supportstransaction semantics with reduced overheads by enforcing tighter bounds onthe clock uncertainty.

4.2 Evaluation of the SIP property of RVEC

An evaluation of the RVEC SIP property must first experimentally confirmthe adherence of CCNT to the SIP property. To this end, we built a microbench-mark composed of a block of 12 operations within a loop of 400 iterations, whereeach of the operations performed an arithmetic sum. The experiment consistedof running the microbenchmark up to 100 times and using the MCR instructionto read the value of the cycle count stored in the CCNT register that was usedto infer the microbenchmark runtime.

4.2.1 Assessing the SIP property of the CCNT cycle count

As we previously discussed, CCNT is a 32-bit cycle count register that op-erates at the same frequency as the processor core and provides time measure-ments with greater accuracy than those of conventional system clocks. However,CCNT can be used as a time base for the RVEC solution only if CCNT is ad-herent to the SIP property, as described in Section 1.

Figure 8 shows one of the samples of the CCNT’s experimental evaluationdescribed in Section 4.2 and is used to validate the CCNT SIP property. Asseen in the figure, the execution of the microbenchmark did not violate the SIPproperty. The experimental evaluation of the CCNT revealed an average runtime of 361.91 ms with a standard deviation of 18.33 ms and the processorfrequency set at 800 MHz. The coefficient of variation for the experiments was0.051, which provides a preliminary indication of the stability of the CCNT.

11

Page 12: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 8: Assessing the SIP property of CCNT

Therefore, based on these results, we concluded that the CCNT is adherent tothe SIP property over the time length of all experiments that were performed.

Figure 9: Assessing the overflow of CCNT count

However, we note that the CCNT is SIP adherent under the limited conditionprovided by the short duration of each of the experiments. Specifically, theCCNT is a 32-bit register, so an overflow will occur when it is used for measuring

12

Page 13: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

intervals greater than 5 s, as shown in Figure 9. In this figure, the averageduration of each of the iterations was 255.372 ms, thus causing an overflow ofthe CCNT count after the execution of every 20 iterations. In other words, theseresults confirm the limitation of directly using the 32-bit CCNT for the timemeasurements of running applications in the RasPi platform. As we reported inSection 3.1, the Linux OS provides a C macro to expand the CCNT count from32-bit to 63-bit, which we used in building the RVEC virtual system clock.

4.2.2 Assessing the SIP property of the RVEC

We collected the following results during the experimental evaluation of theRVEC virtual system clock. Figure 10 shows an execution sample of the exper-iments we performed to evaluate the RVEC SIP property. For this evaluation,we repeated the same experiments used for the evaluation of the CCNT. Specifi-cally, we ran the same microbenchmark 100 times. The evaluation of the RVECSIP property produced an average run time of 372.88 µs with a standard devi-ation 20.02 µs for the processor frequency fixed at 800 MHz. The coefficient ofvariation for the experiments was 0.054, which gives us a preliminary indicationof the stability of the RVEC implementation; therefore, RVEC is adherent tothe SIP property. Moreover, it is important to note that the increase in thenumber of instructions required to gain access to RVEC is responsible for theincreased coefficient of variation compared with CCNT.

Figure 10: Assessing the SIP property of RVEC

To evaluate the use of RVEC for measuring long periods of time, we submit-

13

Page 14: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

ted RVEC to a loop of 100 iterations, each of which has an average runtime of1.108 s. The results are shown in Figure 11. Clearly, the use of RVEC solvedthe existing overflow problem that occurs in the 32-bit CCNT hardware, as thisfigure shows.

Figure 11: Assessing the RVEC on using CCNT for long periods of time

Figure 12 presents the evaluation of RVEC SIP property when using DVFSfor changing the processor frequency. We used a microbenchmark composed of atesting task and a frequency change task. The microbenchmark starts with theexecution of the frequency change task that releases the execution of the testingtask by calling sem post(&OkExec) and then waits for a call to sem wait(&frqS),which will change the core operating frequency. Next, the microbenchmark (postfork) testing tasks continues to wait in a call to sem wait(&OkExec) to ensurethat the creation of the frequency change task occurs before the execution of thetesting task. The microbenchmark performed an evaluation of a loop with 5, 000arithmetic instructions, which was divided into two blocks of 2, 500 instructions;at the end of the first block, the testing task calls sem post(&frqS) and releasesthe frequency change task. As a result, the operating frequency decreases from800 MHz to 400 MHz when executing the second loop concurrently. In thefigure, the blue colour represents the RVEC time measurements, whereas the redcolour represents those of the CCNT. The figure shows, as we could expect, thatthe RVEC curve indicates an increase in the average execution time, whereasthe associated curve of CCNT remains unchanged.

Moreover, in the figure, the red curve (CCNT) has three different stages ofprogression: the first stage is the iteration interval (0..2500), the second stageis in the interval range [2500..2650), and the third stage is in the interval range[2650..5000). The rates of the first and third stages of progression are equal,whereas the rate of progression of the second stage is influenced by the interfer-

14

Page 15: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 12: Assessing the SIP property of both RVEC and CCNT by changingboth the operating frequency and the concurrent task for execution

ence (concurrent execution) of the frequency change task with the testing taskrunning the second part of the microbenchmark. This hypothesis is based onthe method by which we implemented the support to RVEC in the DVFS devicedriver, which is shown in Figure 7 of Section 3.1. Furthermore, in Figure 7, it isstill possible to observe that RVEC (blue curve) shows only two distinct stagesof progression with two distinctive rates, despite the reduced time measurementsduring the [2500..2650) interval.

To confirm the hypothesis we used to explain the behaviour of the previousexperiment, we developed a new experiment in which the change in the operatingfrequency occurs within the same task that will subsequently perform the secondloop, instead of using another task. Figure 13 shows the RVEC behaviour whenusing the same 5, 000 instructions from the previous experiment; it is possibleto note the presence of only two progression rates of execution by using CCNT(red colour) and RVEC (blue colour). Therefore, based on the results shownin Figure 13 the presence of three different stages of progression for CCNT inFigure 12 effectively demonstrates the result of the concurrent execution of boththe testing task and frequency change task of the microbenchmark.

Figure 12 and Figure 13 show the RVEC and CCNT curves, respectively.We note that the curves exhibit a small divergence that increases progressively

15

Page 16: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 13: The SIP property of RVEC and CCNT when only the operatingfrequency changes

along the experiments, which can be explained by the different access costs ofRVEC and CCNT.

4.3 The Access Cost of the RVEC virtual system clock

We evaluated the access cost of RVEC by using the CCNT cycle count, whichprovides the best available accuracy for a cycle count within the RasPi platform.In this way, we could also assess CCNT’s own access cost and the access costsof the MONOTONIC and REALTIME system clocks of Linux running in theRasPi platform. We used a microbenchmark that consists of a block of 2, 400arithmetic instructions within a loop, in which access to the system clock beingevaluated will occur after the execution of the first 1, 200 arithmetic instructionsin the loop. This evaluation loop was executed 1, 000 times, in which one of thefour system clocks- namely, CCNT, RVEC, MONOTONIC, and REALTIME –in turn, was evaluated for each of the loop iterations by using the system callclock gettime() for the last three system clocks.

Figure 14 presents the five experimental configurations used by the mi-crobenchmark described above. In Scenario 1, no query is performed to any

16

Page 17: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Figure 14: Evaluation of the Configuration Overhead

of the system clocks that served as the reference of the least access cost forcomparison purposes. Configurations 2, 3, 4, and 5 evaluate the cost of accessto CCNT, RVEC, MONOTONIC, and REALTIME, respectively, where each ofthe configurations was evaluated 100 times.

Table 3: Execution Time x System Clock (800 MHz)

System Mean Std. Dev. Overhead OverheadClock (µs) (µs) (µs) (%)CCNT 171.687 6.729 1.333 0.78RVEC 174.502 7.419 4.148 2.43

MONOTONIC 173.107 6.918 2.753 1.62REALTIME 173.013 6.891 2.659 1.56

Table 3 shows the results of our experimental evaluation of the RasPi plat-form running at 800 MHz. For the reference configuration, the average accesstime was 170.354 µs, with standard deviation of 6.507 µs. CCNT, RVEC,MONOTONIC and REALTIME had an average overhead of 1.333 µs (0.78%),4.148 µs (2.43%), 2.753 µs (1.62%), and 2.659 µs (1.56%), respectively. Com-pared with MONOTONIC, the small difference that is favourable to REAL-TIME is explained by the additional validations performed by the MONO-TONIC system clock.

We repeated the same experiments by using the RasPi platform but changingits operating frequency to 400 MHz. In this case, as Table 4 shows, the averageaccess time was 341.679 µs for the reference configuration, whereas CCNT,RVEC, MONOTONIC, and REALTIME obtained average execution overheadof 2.702 µs (0.79%), 8.002 µs (2.34%), 5.188 µs (1.52%) and 5.125 µs (1.50%),respectively.

Overall, the foregoing results demonstrate that the overheads of the RVECsaccess costs are compatible with those of current system clocks (MONOTONICand REALTIME) for the RasPi platform. In addition, these current system

17

Page 18: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Table 4: Execution Time x System Clock (400 MHz)

System Mean Std. Dev. Overhead OverheadClock (µs) (µs) (µs) (%)CCNT 344.381 19.831 2.702 0.79RVEC 349.680 20.448 8.002 2.34

MONOTONIC 346.867 19.187 5.188 1.52REALTIME 346.804 19.720 5.125 1.50

clocks of the RasPi platform neither guarantee the SIP property nor supportthe use of DVFS whereas RVEC does.

4.4 Evaluation of the Time Count Precision of RVEC

In this section, we perform an evaluation of the precision of the Linux systemclocks for the RasPi platform by using the microbenchmark with configuration 1,as described in Section 4.3. Now, however, the block of 2, 400 arithmetic instruc-tions is measured by using the CCNT, the Linux system clocks (MONOTONICand REALTIME), and the RVEC implementation for the RasPi platform. Ourexperimental evaluation consisted of repeating the experiment of the previoussection 100 times.

Table 5: Time Precision x System Clock (800 MHz)

System Mean Std. Dev.Clock (µs) (µs)CCNT 170.439 6.554RVEC 172.820 7.248

MONOTONIC 172.047 7.086REALTIME 171.459 7.140

Table 5 shows the results of the time precision of the different system clocks.The results were collected for the RasPi platform operating at a frequencyof 800 MHz. For the CCNT, the average execution time of the block was170.439 µs with a standard deviation of 6.554 µs, which represents a coeffi-cient of variation of 0.038. In the case of RVEC, the average execution timewas 172.820 µs with standard deviation of 7.248 µs with coefficient of variationof 0.042, whereas the MONOTONIC and REALTIME system clocks measured172.047 µs and 171.459 µs with coefficients of variation of 0.041 and 0.042,respectively.

18

Page 19: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

4.5 Discussion of results

The results of the experimental evaluation presented above confirmed the sta-bility of the CCNT cycle counter of the ARM1176JZF-S processor. The resultsalso provide preliminary evidence, although significant, of the stability of theRVEC implementation in the RasPi platform. The results presented in Sec-tion 4.1 demonstrate that the NTP synchronization, which is predominantlyused by computer systems in wired networks, reduces the efficiency of systemclocks by reducing their accuracy and precision; ultimately, the current systemclocks of the RasPi platform cannot guarantee the property of strictly increasingand precise (SIP) time counting.

In the case of wireless networks, NTP synchronization further reduces theefficiency of system clocks, especially for wireless sensor networks that use theRasPi platform as the WSN node. In fact, the experiments presented in Sec-tion 4.4 show that both CCNT (0.038) and RVEC (0.042) have coefficients ofvariation less than 0.05. Furthermore, in Section 4.2, the results also show thatthe RVEC virtual system clock is the only system clock that adheres to the SIPproperty independently of the length of the time interval that is measured.

5 Conclusion

In this work, we used the ARM1176JZF-S processor cycle counter (CCNT) toimplement and evaluate the efficiency of the RVEC virtual system clock for theRasPi platform. In contrast to the current system clocks (MONOTONIC andREALTIME) of the RasPi platform, the RVEC virtual system clock guaranteesthe property of strictly increasing and precise (SIP) time counting independentlyof the length of time interval being measured.

Our experimental results show that the implementation of RVEC presentsan access cost slightly higher than that of the current system clocks of the RasPiplatform. However, the results also demonstrate that RVEC preserves the SIPproperty of time counting even if it is exposed to changes in the operatingfrequency of the processor core. Therefore, the RVEC can afford for the RasPiplatform to increase its energy efficiency by making use of DVFS, in contrast tothe current system clocks of the RasPi platform that restrict its use.

Furthermore, our results also indicate that the current system clocks sufferhigh overhead by using the NTP synchronization in a WSN node that usesthe RasPi platform, in contrast with the RVEC virtual system clock whichmaintains its adherence to the SIP property even if RVEC uses the 32-bit CCNTregister. Future research work will address the limit of stability of RVEC overlarger time intervals and its effects on the energy efficiency of EE computerplatforms considering that RVEC is based on a hardware counter that generatesno interrupts.

Overall, our results confirm that RVEC provides an alternative system clockwith nanosecond resolution for the RasPi platform while maintaining its ac-cess cost equivalent to that of the other system clocks of the RasPi platform.

19

Page 20: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

Most importantly, the idea behind RVEC of using a virtual system clock as anefficient system clock for energy-efficient computer systems fosters the develop-ment of new techniques for wireless sensor networks, such as an efficient globalsynchronization time protocol that uses RVEC as the basic system clock.

References

[1] D. Suleiman, M. Ibrahim, and I. Hamarash, “Dynamic voltage frequencyscaling (dvfs) for microprocessors power and energy reduction,” in 4th In-ternational Conference on Electrical and Electronics Engineering, 2005.

[2] D. Mills, “Network time protocol (version 3) specification, implementa-tion,” United States, 1992.

[3] M. Maroti, B. Kusy, G. Simon, and A. Ledeczi, “The flooding time syn-chronization protocol,” in Proceedings of the 2Nd International Conferenceon Embedded Networked Sensor Systems, ser. SenSys ’04.

[4] R. Fan, I. Chakraborty, and N. Lynch, “Clock synchronization for wire-less networks,” in In Proc. 8th International Conference on Principles ofDistributed Systems (OPODIS), 2004, pp. 400–414.

[5] D. Dutra, L. Whately, and C. L. Amorim, “Attaining strictly increasingand precise time count in energy-efficient computer systems,” in ComputerArchitecture and High Performance Computing (SBAC-PAD), 2013 25thInternational Symposium on, Oct 2013, pp. 65–72.

[6] B. Corporation, BCM2835 ARM Peripherals, 2012. [Online].Available: https://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf

[7] J. Khan, S. Bilavarn, and C. Belleudy, “Energy analysis of a dvfs basedpower strategy on arm platforms,” in Faible Tension Faible Consommation(FTFC), 2012 IEEE, June 2012, pp. 1–4.

[8] T. Simunic, L. Benini, and G. De Micheli, “Energy-efficient design ofbattery-powered embedded systems,” Very Large Scale Integration (VLSI)Systems, IEEE Transactions on, vol. 9, no. 1, pp. 15–28, Feb 2001.

[9] V. Vujovic and M. Maksimovic, “Raspberry pi as a wireless sensor node:Performances and constraints,” in Information and Communication Tech-nology, Electronics and Microelectronics (MIPRO), 2014 37th InternationalConvention on, May 2014, pp. 1013–1018.

[10] M. Haghighi and D. Cliff, “Sensomax: An agent-based middleware for de-centralized dynamic data-gathering in wireless sensor networks,” in Collab-oration Technologies and Systems (CTS), 2013 International Conferenceon, May 2013, pp. 107–114.

20

Page 21: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

[11] M. Kochlan, M. Hodon, L. Cechovic, J. Kapitulik, and M. Jurecka, “Wsnfor traffic monitoring using raspberry pi board,” in Computer Science andInformation Systems (FedCSIS), 2014 Federated Conference on, Sept 2014,pp. 1023–1026.

[12] T. Nagy and Z. Gingl, “Low-cost photoplethysmograph solutions usingthe raspberry pi,” in Computational Intelligence and Informatics (CINTI),2013 IEEE 14th International Symposium on, Nov 2013, pp. 163–167.

[13] R. Neves and A. Matos, “Raspberry pi based stereo vision for small sizeasvs,” in Oceans - San Diego, 2013, Sept 2013, pp. 1–6.

[14] T. Broomhead, J. Ridoux, and D. Veitch, “Counter availability and char-acteristics for feed-forward based synchronization,” in Precision Clock Syn-chronization for Measurement, Control and Communication, 2009. ISPCS2009. International Symposium on, Oct 2009, pp. 1–6.

[15] E. Correa, D. Dutra, and C. L. Amorim, “Relogio virtual estritamentecrescente para o computador raspberry pi,” in Simposio de Sistemas Com-putaciaonais (WSCAD-SCC), Oct. 2015, in Portuguese.

[16] D. Veitch, J. Ridoux, and S. Korada, “Robust synchronization of absoluteand difference clocks over networks,” Networking, IEEE/ACM Transac-tions on, vol. 17, no. 2, pp. 417–430, April 2009.

[17] G.-S. Tian, Y.-C. Tian, and C. Fidge, “High-precision relative clock syn-chronization using time stamp counters,” in Engineering of Complex Com-puter Systems, 2008. ICECCS 2008. 13th IEEE International Conferenceon, March 2008, pp. 69–78.

[18] A. F. de Souza, de Souza S. F., C. L. de Amorim, P. Lima, andP. Rounce, “Hardware supported synchronization primitives for clusters,”in PDPTA’08, 2008, pp. 520–526.

[19] D. L. Mills, “Network Time Protocol (Version 3) Specification, Implemen-tation and Analysis,” 1992, Network Working group report RFC 1305.

[20] S. Muir, “The seven deadly sins of distributed systems,” in First Workshopon Real, Large Distributed Systems, WORLDS’04, Dec. 2004.

[21] C.-Y. Hong, C.-C. Lin, and M. Caesar, “Clockscalpel: Understandingroot causes of internet clock synchronization inaccuracy,” in Proceedingsof the 12th International Conference on Passive and Active Measurement,ser. PAM’11. Berlin, Heidelberg: Springer-Verlag, 2011, pp. 204–213.[Online]. Available: http://dl.acm.org/citation.cfm?id=1987510.1987531

[22] A. Limited, ARM1176JZ-S Technical Reference Manual, 2009. [Online].Available: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0333h/DDI0333H arm1176jzs r0p7 trm.pdf

21

Page 22: An E cient Virtual System Clock for the Wireless Raspberry ... · with the property of strictly increasing and precise (SIP) time counting 1 [5]. Moreover, the design of an e cient

[23] N. Clifford and A. Brouwer, Linux Programmer’s Manual: clock gettimesystem call. [Online]. Available: http://man7.org/linux/man-pages/man2/clock gettime.2.html

[24] G. Neville-Neil, “Time is an illusion.” Queue, vol. 13, no. 9, pp.30:57–30:72, Nov. 2015. [Online]. Available: http://doi.acm.org/10.1145/2857274.2878574

[25] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman,S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak,E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan,R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, andD. Woodford, “Spanner: Google’s globally distributed database,”ACM Trans. Comput. Syst., vol. 31, no. 3, pp. 8:1–8:22, Aug. 2013.[Online]. Available: http://doi.acm.org/10.1145/2491245

22