NCAR storage accounting and analysis possibilities
David L. Hart, Pam Gillman, Erich Thanhardt
NCAR CISL
July 22, 2013
3
Why storage accounting?
• Big Data
  – Increasing cost of storage with respect to compute
• NSF data management plan mandate
  – Tools for users
• Some info is better than no info
  – Some process is better than ad hoc fire drills
• Supports allocation processes
4
Accounting for archive storage
• NCAR has “charged” users for archive use for many years.
  – Archive accounting has institutional inertia
• NCAR HPSS details, June-July 2013

  Date      Files (M)   PB (unique)   PB (2nd copy)   Users   TB+
  6/2/13    137.6       19.5          22.3              991   181
  6/9/13    138.2       19.8          22.6              991   307
  6/16/13   138.8       20.1          22.9              992   370
  6/23/13   141.1       20.5          23.3              998   347
  6/30/13   142.4       20.7          23.5            1,002   266
  7/7/13    142.5       20.9          23.6            1,005   135
5
Archive storage record
• Activity date – date record was collected
• Activity type – Read, Write, Storage
• Unix uid
• Project code – project to charge
• Number of files
• Bytes – read, written, or stored
• Class of service – e.g., single-copy, dual-copy
• DNS – of client host
• Frequency – interval, in days, between accounting runs
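The fields above map naturally onto a flat record. A minimal sketch in Python, assuming field names and types that the slides do not specify (the production schema may differ); the example project code and host name are hypothetical:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ArchiveStorageRecord:
    """One HPSS accounting record, following the fields listed on this slide."""
    activity_date: date      # date the record was collected
    activity_type: str       # "Read", "Write", or "Storage"
    uid: int                 # Unix uid of the file owner
    project_code: str        # NCAR project to charge
    file_count: int          # number of files covered by this record
    bytes: int               # bytes read, written, or stored
    class_of_service: str    # e.g., "single-copy" or "dual-copy"
    client_dns: str          # DNS name of the client host
    frequency_days: int      # interval, in days, between accounting runs

# Example: one week of stored data charged to a hypothetical project.
example = ArchiveStorageRecord(
    activity_date=date(2013, 7, 7),
    activity_type="Storage",
    uid=12345,
    project_code="P0000000",          # hypothetical project code
    file_count=4200,
    bytes=3_500_000_000_000,
    class_of_service="dual-copy",
    client_dns="hsi.example.ucar.edu",  # hypothetical client host
    frequency_days=7,
)
```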
6
Collecting data from HPSS
• Read/write activity
  – Analyze logs from HSI and HTAR (since May 2013). Logs are archived daily and processed weekly.
• Storage activity
  – Weekly DB2 table scan and separate post-processing steps.
• Accounting system impact
  – Approx. 6,000 records per week
• Major accounting requirements
  – Use of HPSS accounting hooks to associate the NCAR project code with the HPSS file “account”
  – The accounting system and HPSS enforce the requirement that every user have a “default project” to which files are charged if no other project is provided (see the sketch below)
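A minimal sketch of the weekly read/write aggregation and the default-project fallback described above. This is illustrative, not NCAR's actual tooling: the parsed log-entry format and the default-project table are assumptions.

```python
from collections import defaultdict

# Hypothetical default-project table; in production this comes from the
# accounting system, which guarantees every user has a default project.
DEFAULT_PROJECT = {"alice": "P1111111", "bob": "P2222222"}

def weekly_read_write_records(log_entries):
    """Aggregate parsed HSI/HTAR log entries into weekly accounting records.

    Each entry is assumed to be a dict with keys:
    user, operation ("read" or "write"), bytes, project (may be None).
    Returns one record per (user, project, operation) combination.
    """
    totals = defaultdict(lambda: {"bytes": 0, "files": 0})
    for entry in log_entries:
        # Fall back to the user's default project if none was supplied.
        project = entry.get("project") or DEFAULT_PROJECT[entry["user"]]
        key = (entry["user"], project, entry["operation"])
        totals[key]["bytes"] += entry["bytes"]
        totals[key]["files"] += 1

    return [
        {
            "user": user,
            "project": project,
            "activity_type": op.capitalize(),
            "bytes": agg["bytes"],
            "file_count": agg["files"],
            "frequency_days": 7,
        }
        for (user, project, op), agg in totals.items()
    ]
```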
7
Accounting for disk storage
• Focus on long-term project spaces, which are allocated
  – But the mechanism captures scratch snapshots, too!
• GLADE total storage, June-July 2013
  Date      Files (M)   PB     Users   TB+
  6/8/13    183.05      2.87   2,506   55.3
  6/15/13   192.96      2.97   2,525   99.3
  6/22/13   210.32      3.02   2,490   53.1
  6/29/13   212.80      3.11   2,500   89.5
  7/6/13    224.76      3.11   2,509    8.8
8
Disk storage record
• Event time – date record was collected
• Project directory
• Group – Unix group
• Username
• Number of files
• kB used
• Period – reporting interval, in days
• QOS – a quality of service field (for future use)
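A small sketch of the disk-side record, again with assumed field names; since the slide's only unit is kB, a helper converts it to TB for reporting (decimal units assumed):

```python
from typing import TypedDict

class DiskStorageRecord(TypedDict):
    """One weekly GLADE snapshot record; field names are illustrative."""
    event_time: str          # date the record was collected, e.g. "2013-07-06"
    project_directory: str   # e.g. a path under /glade/project (illustrative)
    group: str               # Unix group
    username: str
    file_count: int
    kb_used: int             # kilobytes used
    period_days: int         # reporting interval, in days
    qos: str                 # quality-of-service field, reserved for future use

def kb_to_tb(kb: int) -> float:
    """Convert the record's kB figure to TB for user-facing reports."""
    return kb / 1_000_000_000  # 1 TB = 10**9 kB, assuming decimal units
```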
9
Collecting data from GPFS
• File systems don’t have a concept of “project”, but GPFS has a notion of “filesets”
  – Leverage filesets to map to project spaces (see the sketch below)
  – For scratch, work, and home: report per-user data
• Process runs weekly and provides a storage snapshot
  – With GPFS tools, the process requires only a few minutes to complete; a full file system scan is not required
• Accounting system impact
  – Approx. 4,000 records per week
• Major accounting requirements
  – Agreements and processes between GLADE administrators and User Services about how spaces are created
  – Deviation from those agreements would break the accounting
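The fileset-to-project mapping referenced above can be as simple as a lookup table maintained under the administrative agreements on this slide. This sketch assumes a pre-parsed per-fileset usage report; the slides do not specify which GPFS tool or output format is used, and the fileset names and project codes are hypothetical.

```python
# Hypothetical mapping from GPFS fileset name to NCAR project code.
FILESET_TO_PROJECT = {
    "p_cesm_dev": "P3333333",
    "p_wrf_runs": "P4444444",
}

def weekly_disk_records(fileset_usage, snapshot_date, period_days=7):
    """Turn per-fileset usage rows into weekly disk accounting records.

    Each row is assumed to be a dict with keys:
    fileset, username, group, file_count, kb_used.
    Filesets without a project mapping are reported so the process fails
    loudly rather than silently dropping usage (deviation breaks accounting).
    """
    records, unmapped = [], []
    for row in fileset_usage:
        project = FILESET_TO_PROJECT.get(row["fileset"])
        if project is None:
            unmapped.append(row["fileset"])
            continue
        records.append({
            "event_time": snapshot_date,
            "project": project,
            "username": row["username"],
            "group": row["group"],
            "file_count": row["file_count"],
            "kb_used": row["kb_used"],
            "period_days": period_days,
        })
    if unmapped:
        raise ValueError(f"Unmapped filesets: {sorted(set(unmapped))}")
    return records
```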
10
ANALYSIS AND REPORTING
11
Storage growth over time (1)
[Charts: HPSS growth in 2013 (series: PB and PB w/2nd copy) and GLADE growth in 2013 (series: /glade/p/work, /glade/project, /glade/scratch; y-axis in TB), from 1/6/13 through 6/30/13]
12
Storage growth over time (3)
User reports show each project's usage by week, with a per-user breakdown (see the sketch below)
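One way to produce such a report from the weekly disk records, sketched with pandas; the actual reporting tooling is not named in the slides, and the column names are assumptions.

```python
import pandas as pd

def project_report(records: pd.DataFrame) -> pd.DataFrame:
    """Weekly totals for one project, broken down by user.

    Expects columns: event_time, project, username, kb_used.
    Returns TB used per user per week, with a per-week total column.
    """
    df = records.copy()
    df["tb_used"] = df["kb_used"] / 1e9          # decimal kB -> TB
    df["week"] = pd.to_datetime(df["event_time"]).dt.to_period("W")
    pivot = df.pivot_table(index="week", columns="username",
                           values="tb_used", aggfunc="sum", fill_value=0.0)
    pivot["total"] = pivot.sum(axis=1)
    return pivot
```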
13
Top consumers
[Charts: Project holdings in HPSS – % of projects, % of files, and % of TB by project size (0-1 TB, 1-10 TB, 10-100 TB, 100-1000 TB, >1000 TB); User holdings in GLADE – % of users, % of files, and % of TB by user holdings (0-0.1 TB, 0.1-1 TB, 1-10 TB, 10-100 TB, >100 TB)]
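The distributions above can be reproduced by bucketing each project's (or user's) holdings into the size bins shown and computing each bin's share of projects, files, and TB. A sketch using the HPSS project bins from the left-hand chart; the input record shape is an assumption.

```python
import bisect

# Bin edges in TB for HPSS project holdings, matching the chart labels.
EDGES = [1, 10, 100, 1000]
LABELS = ["0-1 TB", "1-10 TB", "10-100 TB", "100-1000 TB", ">1000 TB"]

def holdings_distribution(projects):
    """Share of projects, files, and TB in each size bin.

    `projects` is assumed to be a list of dicts with keys: tb, files.
    Returns {bin label: (% projects, % files, % TB)}.
    """
    counts = [[0, 0, 0.0] for _ in LABELS]   # projects, files, TB per bin
    for p in projects:
        i = bisect.bisect_right(EDGES, p["tb"])
        counts[i][0] += 1
        counts[i][1] += p["files"]
        counts[i][2] += p["tb"]
    totals = [sum(col) for col in zip(*counts)]
    return {
        label: tuple(100.0 * c / t if t else 0.0 for c, t in zip(row, totals))
        for label, row in zip(LABELS, counts)
    }
```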
14
Aggregate behavior (1)
[Chart: Weekly HPSS growth, in TB (y-axis from -50 to 450)]
Net growth, 3/3-4/7 — ~261 TB
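Weekly growth like the chart above is simply the difference between consecutive weekly totals; deletions show up as negative weeks. A sketch, assuming an ordered list of (date, TB stored) snapshots:

```python
def weekly_growth(snapshots):
    """Net change between consecutive snapshots.

    `snapshots` is a list of (date_string, tb_stored) tuples, ordered by date.
    Returns a list of (date_string, tb_delta).
    """
    return [
        (later[0], later[1] - earlier[1])
        for earlier, later in zip(snapshots, snapshots[1:])
    ]

# Example using the HPSS unique-copy totals from the earlier table,
# converted from PB to TB.
growth = weekly_growth([("6/2/13", 19500), ("6/9/13", 19800), ("6/16/13", 20100)])
# -> [("6/9/13", 300), ("6/16/13", 300)]
```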
15
Aggregate behavior (2)
[Chart: TB written daily; x-axis labeled 14-Oct-08 through 1-Jul-09]
Data written, 3/3-4/7: 594 TB
16
Compute v. storage (1)
[Chart: HPC use, Disk GB, and Tape GB by year-week, 2012-47 through 2013-26; y-axis: core-hours or GB (millions)]
17
Compute v. storage use (2)
[Chart: HPC use, disk GB, and tape GB by year-week, 2012-43 through 2013-09; y-axis: core-hours used or gigabytes stored (millions)]
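Charts like the two above can be reproduced from the weekly accounting data with a few lines of matplotlib; this is an illustration rather than the tooling used for the slides, and the argument names are assumptions.

```python
import matplotlib.pyplot as plt

def plot_compute_vs_storage(weeks, core_hours, disk_gb, tape_gb):
    """Plot HPC use against disk and tape holdings by year-week.

    All four arguments are equal-length sequences; values are plotted
    in millions, matching the slides' y-axis.
    """
    fig, ax = plt.subplots(figsize=(10, 4))
    for label, series in (("HPC use", core_hours),
                          ("Disk GB", disk_gb),
                          ("Tape GB", tape_gb)):
        ax.plot(weeks, [v / 1e6 for v in series], label=label)
    ax.set_xlabel("Year-Week")
    ax.set_ylabel("Core-hours or GB (millions)")
    ax.legend()
    fig.autofmt_xdate()   # slant the week labels so they stay readable
    fig.tight_layout()
    return fig
```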
18
Big compute != Big data
[Chart: per-user HPC charges and GB growth on a log scale (1 to 100,000,000), with users sorted by HPC charges]
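The point of the chart above is that a user's HPC charges say little about their storage growth. With per-user totals in hand, that claim can be checked directly with a rank correlation; this analysis is an illustration, not one shown in the slides, and the field names are assumptions.

```python
from scipy.stats import spearmanr

def compute_vs_storage_correlation(users):
    """Spearman rank correlation between HPC charges and storage growth.

    `users` is assumed to be a list of dicts with keys:
    hpc_charges (core-hours) and gb_growth. A value near 0 supports
    "big compute != big data"; a value near 1 would mean the heaviest
    compute users are also the heaviest storage users.
    """
    charges = [u["hpc_charges"] for u in users]
    growth = [u["gb_growth"] for u in users]
    rho, p_value = spearmanr(charges, growth)
    return rho, p_value
```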
19
What is “Big Data”?
[Charts: Average file size vs. Total data holdings –
  Number of users by average file size (<0.1 GB, <1 GB, <10 GB, <100 GB, <1000 GB)
  Number of users by data stored per user (<1 GB up through >1,000,000 GB)
  GB stored by average file size (same bins)
  GB stored by data stored per user (same bins)]
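The four panels above are two ways of binning the same per-user totals: by average file size and by total holdings. A sketch that computes both distributions; the record shape is an assumption, and the open-ended top bin is not special-cased here.

```python
import math
from collections import defaultdict

def log10_bin(value_gb):
    """Label a value with its order-of-magnitude bin, e.g. '<10 GB'."""
    if value_gb <= 0:
        return "<0.1 GB"
    exponent = math.floor(math.log10(value_gb)) + 1
    return f"<{10 ** exponent} GB"

def user_distributions(users):
    """Users and GB stored per bin, binned two ways as in the four panels.

    `users` is assumed to be a list of dicts with keys: gb_stored, file_count.
    Returns (by_average_file_size, by_total_holdings), each mapping a bin
    label to [number of users, GB stored].
    """
    by_avg_size = defaultdict(lambda: [0, 0.0])
    by_holdings = defaultdict(lambda: [0, 0.0])
    for u in users:
        avg_file_gb = u["gb_stored"] / max(u["file_count"], 1)
        for bins, key in ((by_avg_size, log10_bin(avg_file_gb)),
                          (by_holdings, log10_bin(u["gb_stored"]))):
            bins[key][0] += 1                # number of users in the bin
            bins[key][1] += u["gb_stored"]   # GB stored in the bin
    return dict(by_avg_size), dict(by_holdings)
```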
20
Managing “orphaned” files
• Verifying accounting records lets site operators identify files owned by inactive users or inactive projects
• On July 7, HPSS accounting showed 177 users with 885 TB of “orphaned” files
• Early outreach to users and project leads does translate into deletions and fewer files for which no owner can be found
  – Users are required to be “actively engaged” in the disposition of their archive holdings: www2.cisl.ucar.edu/docs/hpss/policies
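A sketch of the cross-check described on this slide: join the storage records against the lists of active users and projects and total up whatever is left over. The record shape and argument names are assumptions.

```python
def find_orphaned_holdings(storage_records, active_users, active_projects):
    """Identify holdings whose owner or project is no longer active.

    `storage_records` is assumed to be a list of dicts with keys:
    username, project, tb_stored. Returns the total orphaned TB and the
    set of inactive owners, i.e. the kind of figures reported above
    (177 users, 885 TB on July 7).
    """
    orphaned_tb = 0.0
    inactive_owners = set()
    for rec in storage_records:
        if rec["username"] not in active_users or rec["project"] not in active_projects:
            orphaned_tb += rec["tb_stored"]
            inactive_owners.add(rec["username"])
    return orphaned_tb, inactive_owners
```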
21
QUESTIONS?