
Edge AI assists Partial Content Caching with Smart

Content Prefetching Scheme

Kyi Thar, and Choong Seon Hong

Department of Computer Science and Engineering, Kyung Hee University

{kyithar, cshong}@khu.ac.kr

Abstract

Watching videos (contents) from mobile devices accounts for most of today's network traffic, and this traffic is projected to keep growing exponentially. Thus, numerous content-level and chunk-level caching schemes have been proposed to handle the increasing traffic. Those caching schemes cache whole videos at the edge nodes, but most users view only the beginning of a video. Hence, caching the complete video at the edge node is an ineffective way to reduce network traffic or to improve cache utilization. In this paper, we therefore propose a chunk-level caching scheme that stores popular videos partially, together with a smart prefetching scheme that supplies the missing chunks of a video. Simulation results confirm that our proposed scheme outperforms existing schemes.

1. Introduction

According to Cisco, watching videos from mobile devices accounts for most of the network traffic and is projected to keep increasing exponentially [1]. Thus, many researchers have proposed caching schemes based on reactive and proactive approaches to handle the growing video traffic. In reactive caching, the edge node decides whether to store a video when the request or the video arrives [2]. In the proactive approach, popular videos are cached, based on prediction results, before they are requested by any user [3]. The performance of the proactive approach therefore depends on the accuracy of the prediction model. Deep learning models are currently attracting considerable attention for content popularity prediction because of advances in big data and high-performance computing. The aforementioned caching schemes store complete popular videos at the edge nodes (i.e., base stations). The main issue is that most users view only the beginning of a video, because they stop watching when they do not like the beginning. Hence, caching the whole video is an ineffective way to reduce network traffic or to improve the users' Quality of Experience (QoE).

Therefore, in this paper, we propose an edge Artificial Intelligence (AI)-assisted partial video caching scheme to improve cache performance. We additionally propose a smart prefetching scheme to reduce the latency of accessing the missing chunks. The goal of this paper is to minimize the latency with which users' devices access videos.

This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-01287, Evolvable Deep Learning Model Generation Platform for Edge Computing). *Dr. CS Hong is the corresponding author.

Our contributions are summarized as follows:

- We formulate the video access latency minimization problem.

- We propose a popularity prediction model based on deep neural networks.

- We propose a deep reinforcement learning based caching and prefetching scheme.

Figure 1 System Model

2. System Model

The system model of the proposed scheme is shown in Fig. 1. The Content Server c is located outside the current Autonomous System and provides the set of videos F = {1, 2, ..., f}. We assume that all videos have the same size, i.e., the same number of chunks. The set of edge nodes (base stations), denoted B = {1, 2, ..., b}, is located nearest to the users. The set of users is denoted U = {1, 2, ..., u}. An edge node b serves a requested video from its cache storage whenever the video is cached there.

Proceedings of the 2019 Korea Software Congress, p. 936

Otherwise, it retrieves the requested video from the Content Server c and delivers the content to the requesting user u. The cache storage size of edge node b is denoted s_b, and the total cache storage size of the whole network is ∑_{b∈B} s_b = S.
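The serve-or-fetch behavior of an edge node described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the class name, latency values, and chunk-keyed cache are assumptions made for the example:

```python
class EdgeNode:
    """Minimal sketch of an edge node that serves cached chunks locally
    and falls back to the content server on a cache miss."""

    def __init__(self, capacity_chunks):
        self.capacity = capacity_chunks      # s_b: cache size in chunks
        self.cache = set()                   # (video_id, chunk_id) pairs

    def store(self, video_id, chunk_id):
        # Respect the capacity constraint: number of cached chunks <= s_b
        if len(self.cache) < self.capacity:
            self.cache.add((video_id, chunk_id))
            return True
        return False

    def request(self, video_id, chunk_id, delta_ub=1.0, delta_bcp=10.0):
        """Latency a user experiences for one chunk request:
        delta_ub on a cache hit, delta_bcp on a miss (fetched from the
        content server), mirroring the cost terms in problem (1)."""
        if (video_id, chunk_id) in self.cache:
            return delta_ub
        return delta_bcp

node = EdgeNode(capacity_chunks=2)
node.store(1, 0)                 # cache chunk 0 of video 1
print(node.request(1, 0))        # hit  -> 1.0
print(node.request(1, 5))        # miss -> 10.0
```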

3. Problem Formulation

In this section, we formulate the partial content caching and prefetching problem as a content access latency minimization problem.

$$\min_{x}\ \sum_{f}^{F} \sum_{i}^{I} r_{f,i}^{(t+1)} \left( x_{f,i}^{(t+1)} \delta_{ub} + \left(1 - x_{f,i}^{(t+1)}\right) \delta_{b}^{cp} \right), \qquad (1)$$

$$\text{subject to:}\quad \sum_{f}^{F} \sum_{i}^{I} x_{f,i}^{(t+1)} \leq s_b, \qquad x_{f,i}^{(t+1)} \in \{0, 1\},$$

where δ_ub is the latency of retrieving videos from the edge node b, and δ_b^cp is the latency of retrieving videos at the edge node b from the content server. The future request count for chunk i of content f is denoted r_{f,i}^{(t+1)}. The cache decision variable for storing chunk i of content f based on future requests is denoted x_{f,i}^{(t+1)} ∈ {0,1}; chunk i of content f is stored on the edge node when x_{f,i}^{(t+1)} = 1. The constraint states that the cache storage of edge node b cannot hold more content than its capacity. Optimization problem (1) is a combinatorial problem, and solving it exactly requires an exhaustive search. It is also hard to solve because the future request count of each video is unknown. Therefore, we apply a deep learning approach to solve this problem.
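To make the combinatorial nature of problem (1) concrete, the sketch below enumerates every feasible cache placement for a toy instance and picks the one minimizing the objective. All numbers (request counts, latencies, capacity) are hypothetical illustration values, not from the paper:

```python
from itertools import combinations

# Toy instance: 2 videos x 3 chunks, cache capacity of 2 chunks.
# r[(f, i)]  : assumed future request count for chunk i of video f
# delta_ub   : latency of serving a chunk from the edge node
# delta_bcp  : latency of fetching a chunk from the content server
r = {(0, 0): 9, (0, 1): 4, (0, 2): 1,
     (1, 0): 7, (1, 1): 2, (1, 2): 1}
delta_ub, delta_bcp, capacity = 1.0, 10.0, 2

def total_latency(cached):
    """Objective (1): sum over f, i of r * (x*delta_ub + (1-x)*delta_bcp),
    where x = 1 iff the chunk is in the cached set."""
    return sum(req * (delta_ub if key in cached else delta_bcp)
               for key, req in r.items())

# Exhaustive search over all placements satisfying sum_x <= capacity.
best = min((set(c) for k in range(capacity + 1)
            for c in combinations(r, k)),
           key=total_latency)
print(sorted(best))              # [(0, 0), (1, 0)]: the most-requested chunks
print(total_latency(best))      # 96.0
```

Even this tiny instance checks 22 placements; with realistic catalog sizes the search space is exponential, which motivates the learning-based approach.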

4. Edge AI assists partial caching and smart prefetching

In this section, we solve optimization problem (1) using an edge AI assisted cache decision and prefetching decision scheme. Fig. 2 shows an overview of the system components needed to implement the proposed scheme. Fig. 2(a) and (b) represent the key input parameters for the cache decision. Fig. 2(a) shows an example of the content popularity profile, which is the output of the Popularity Prediction module. Fig. 2(b) shows a sample of the collected video access data, where red blocks mark the most accessed chunks of each video. As shown in Fig. 2(c), the cache storage space is divided into two partitions: i) Content Storage and ii) Prefetching Buffer. The Content Storage partition stores the partial popular videos, and the Prefetching Buffer stores the chunks currently being prefetched. The Popularity Prediction module predicts video popularity with the help of a deep learning model; the deep learning model used in this prediction module is detailed in Section 5. The Cache Decision module decides which chunks of a video to store based on the popularity profile and the historical data. The Prefetching Decision module retrieves the missing chunks. Note that both the Cache Decision and Prefetching Decision modules utilize deep reinforcement learning.
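The two-partition cache layout of Fig. 2(c) could be organized as below. This is a sketch under stated assumptions: the split ratio, class name, and simple eviction are illustrative choices, and the paper's actual decisions are made by deep reinforcement learning:

```python
class PartitionedCache:
    """Cache storage split into a Content Storage partition (long-lived
    popular chunks) and a Prefetching Buffer (transient chunks fetched
    ahead of the user's playback position)."""

    def __init__(self, total_chunks, buffer_fraction=0.2):
        # Assumed 80/20 split between the two partitions.
        self.buffer_size = int(total_chunks * buffer_fraction)
        self.storage_size = total_chunks - self.buffer_size
        self.content_storage = set()   # popular chunk prefixes of videos
        self.prefetch_buffer = set()   # chunks currently being prefetched

    def cache_popular(self, video_id, chunk_id):
        if len(self.content_storage) < self.storage_size:
            self.content_storage.add((video_id, chunk_id))

    def prefetch(self, video_id, chunk_id):
        # When the buffer is full, drop an arbitrary entry; a real
        # scheme would let the DRL agent pick the eviction victim.
        if len(self.prefetch_buffer) >= self.buffer_size:
            self.prefetch_buffer.pop()
        self.prefetch_buffer.add((video_id, chunk_id))

    def has(self, video_id, chunk_id):
        key = (video_id, chunk_id)
        return key in self.content_storage or key in self.prefetch_buffer

cache = PartitionedCache(total_chunks=10)   # 8 storage + 2 buffer slots
cache.cache_popular(3, 0)                   # popular first chunk of video 3
cache.prefetch(3, 4)                        # chunk fetched ahead of playback
print(cache.has(3, 0), cache.has(3, 4))     # True True
```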

The partial content caching and prefetching process is shown in Alg. 1. The input of the algorithm is the raw data collected at edge node b; its outputs are the cache and prefetching decisions. The initial deep learning models are constructed at the cloud data center, because this step requires high computing power (line 3), and are then transferred to edge node b. Edge node b predicts content popularity with the deep learning model and constructs the popularity profile of the videos. The edge node then makes a cache decision based on the collected video access data and the predicted popularity profile. When a user's retrieval of a video reaches a certain chunk threshold, the Prefetching Decision module pre-downloads the subsequent chunks before they are requested.

Figure 2 Overview System Components

Algorithm 1: Content caching and prefetching process

1: Input: Raw collected data

2: Output: Caching of the popular chunks and prefetching of the missing chunks

▷ Cloud Datacenter

3: Construct the initial deep learning models at the cloud data center and transfer them to edge node b

▷ Edge Node

4: Predict the popularity of each content using the deep learning model and construct the Popularity Profile

5: Cache chunks i of content f based on the popularity profile and the video access data, storing the popular chunks until the cache space is full

6: Prefetch the missing chunks based on the threshold h and the current request pattern
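The threshold rule of step 6 can be sketched as a simple trigger: once a user's playback passes chunk h of a video whose later chunks are not cached, the edge node starts fetching the next missing chunks. The function name, `window` parameter, and trigger logic here are hypothetical illustrations; the paper's Prefetching Decision module makes this decision with deep reinforcement learning:

```python
def prefetch_trigger(playback_chunk, cached_chunks, total_chunks,
                     threshold_h, window=3):
    """Return the missing chunks to prefetch once the user's playback
    position passes the threshold h (cf. Algorithm 1, step 6).

    playback_chunk : index of the chunk the user is currently watching
    cached_chunks  : set of chunk indices already at the edge node
    threshold_h    : chunk index beyond which prefetching is triggered
    window         : how many upcoming chunks to fetch ahead
    """
    if playback_chunk < threshold_h:
        return []                      # user may still abandon the video
    upcoming = range(playback_chunk + 1,
                     min(playback_chunk + 1 + window, total_chunks))
    return [i for i in upcoming if i not in cached_chunks]

# A video with 100 chunks; only the first 10 are cached (partial caching).
cached = set(range(10))
print(prefetch_trigger(4, cached, 100, threshold_h=8))   # [] (below h)
print(prefetch_trigger(9, cached, 100, threshold_h=8))   # [10, 11, 12]
```

Triggering only past the threshold matches the paper's observation that most viewers abandon a video early, so bandwidth is not wasted prefetching tails that will never be watched.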

5. Performance Evaluations

In this section, we examine the performance of the proposed scheme and of existing schemes using a Python-based simulator. We extract the video request pattern from the MovieLens dataset, which covers 26,000,000 ratings given to 45,000 movies by 270,000 users. We assume each video has 100 chunks. We used a bi-directional LSTM model with the configuration LSTM <28, 28, 28>, Dense <76, 87, 64> (i.e., 28 cells in LSTM layers 1, 2, and 3, and 76, 87, and 64 cells in dense layers 1, 2, and 3, respectively). For the deep reinforcement learning, we chose a deep neural network with a <128, 128, 128> configuration. As Key Performance Indicator metrics, we chose the cache hit probability and the content access latency, which is the objective function.
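The two KPIs can be computed from a request trace as follows: hit probability is the fraction of requests served from the cache, and average latency is the per-request mean of the objective's cost terms. The trace and latency values below are illustrative examples, not MovieLens data:

```python
def cache_hit_probability(requests, cached):
    """Fraction of chunk requests served from the edge cache."""
    hits = sum(1 for req in requests if req in cached)
    return hits / len(requests)

def average_latency(requests, cached, delta_ub=1.0, delta_bcp=10.0):
    """Average per-request latency: delta_ub on a hit, delta_bcp on a
    miss, i.e. the objective of problem (1) divided by request count."""
    total = sum(delta_ub if req in cached else delta_bcp
                for req in requests)
    return total / len(requests)

# Illustrative trace of (video, chunk) requests: users mostly request
# the beginnings of videos, which favors partial (prefix) caching.
trace = [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (1, 0), (3, 0), (1, 3)]
cached = {(1, 0), (1, 1), (2, 0)}           # cached prefixes only
print(cache_hit_probability(trace, cached)) # 0.5
print(average_latency(trace, cached))       # 5.5
```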

Fig. 3 shows the measured cache hit probability, where a greater cache hit probability indicates better cache utilization. As the cache storage size increases, the cache hit probability also rises, owing to the larger capacity to cache more videos.

Fig. 4 shows the content retrieval latency comparison; the lower the retrieval latency, the better the performance. The results in Fig. 4 show that our proposed scheme reduces the content retrieval latency by 10% on average compared to the other schemes.

6. Conclusion

In this paper, we proposed a scheme that caches the popular chunks by jointly applying partial content caching and smart prefetching. The proposal aims to let users access requested videos instantly. As a result, the proposed scheme improves both the cache hit probability and the average latency of accessing videos. We extensively simulated the proposed scheme, and the preliminary results confirm that our scheme achieves a higher cache hit rate and a lower average content access latency than the other proposals.

Figure 3 Cache hit comparison between the optimal, proposed, and randomized solutions

Figure 4 Latency comparison between the optimal, proposed, and randomized solutions

References

[1] Cisco Visual Networking Index (VNI). Accessed: Feb. 7, 2019. [Online].

[2] Saeed Ullah, Kyi Thar, and Choong Seon Hong, "Management of Scalable Video Streaming in Information Centric Networking," Multimedia Tools and Applications, 28 pages, October 2016.

[3] Anselme Ndikumana and Choong Seon Hong, "Self-Driving Car Meets Multi-access Edge Computing for Deep Learning-Based Caching," The International Conference on Information Networking (ICOIN 2019), Jan. 9-11, 2019, Kuala Lumpur, Malaysia.

[4] K. Thar, T. Z. Oo, Y. K. Tun, D. H. Kim, K. T. Kim, and C. S. Hong, "A Deep Learning Model Generation Framework for Virtualized Multi-Access Edge Cache Management," IEEE Access, vol. 7, pp. 62734-62749, 2019. doi: 10.1109/ACCESS.2019.2916080.
