fcdfdata016


Transcript of fcdfdata016

Page 1: fcdfdata016

fcdfdata016 Disk/Cache

Station central-analysis

smaster

Stager

stagerng

FSS central-analysis

fss

Stager

stagerng

Disk/Cache

Disk/Cache

Disk/Cache

Stager

stagerng

Stager

stagerng

Stager

stagerng

Page 2: fcdfdata016

Node1

Cache

Node2

Cache

Node3

Cache

Node4

Cache

Node5

Cache

Station

smaster

Stager

stagerng

Stager

stagerng

Stager

stagerng

Stager

stagerng

Stager

stagerng

Page 3: fcdfdata016

fcdfdata016

<fcdfdata016>

Disks/Cache

Page 4: fcdfdata016

fcdfdata016

<fcdfdata016>

Station central-analysis

smaster

Disks/Cache

Page 5: fcdfdata016

fcdfdata016

<fcdfdata016>

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Page 6: fcdfdata016

<nglas08> setenv SAM_STATION chris

<nglas08> sam dump station --disks

*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 53 minutes 25 seconds, admins: jozwiak terekhov

Known batch systems: lsf

Default batch system: lsf

No replica selection criteria

There are 0 authorized transfer groups

Minimum delivery is 1KB; external deliveries are unconstrained

STATION DISKS:

disk 7844 nglas08.fnal.gov:/sam/test9/jozwiak/dev/chris, 29947KB/20GB free

disk 8064 nglas08.fnal.gov:/sam/test10/jozwiak/dev/chris, 93110KB/20GB free

*** END OF STATION DUMP ***

sam dump station --disks
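
The disk lines in the dump above share a regular layout: disk ID, host:path, then free space over total size. As a minimal sketch (not part of SAM), the Python below parses one such line. The function name parse_disk_line and the assumption that 1 GB = 1024 * 1024 KB are illustrative choices, not taken from the slides.

```python
import re

# Layout of one "disk" line, as it appears in the station dump.
DISK_LINE = re.compile(
    r"^disk\s+(?P<disk_id>\d+)\s+(?P<host>[^:\s]+):(?P<path>[^,\s]+),\s+"
    r"(?P<free>\d+)(?P<free_unit>[KMG]B)/(?P<size>\d+)(?P<size_unit>[KMG]B)\s+free$"
)

# Assumption: units are binary multiples of 1 KB.
UNIT_KB = {"KB": 1, "MB": 1024, "GB": 1024 * 1024}


def parse_disk_line(line):
    """Return a dict describing one station disk, or None if the line does not match."""
    m = DISK_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "disk_id": int(m["disk_id"]),
        "host": m["host"],
        "path": m["path"],
        "free_kb": int(m["free"]) * UNIT_KB[m["free_unit"]],
        "size_kb": int(m["size"]) * UNIT_KB[m["size_unit"]],
    }


if __name__ == "__main__":
    sample = "disk 7844 nglas08.fnal.gov:/sam/test9/jozwiak/dev/chris, 29947KB/20GB free"
    print(parse_disk_line(sample))
```

Lines that are not disk entries (the banner, the batch-system lines) simply return None, so the function can be mapped over the whole dump.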

Page 7: fcdfdata016

<nglas08> sam dump station --groups

*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 57 minutes 3 seconds, admins: jozwiak terekhov

Known batch systems: lsf

Default batch system: lsf

No replica selection criteria

There are 0 authorized transfer groups

Minimum delivery is 1KB; external deliveries are unconstrained

AUTHORIZED GROUPS:

group test: admins: jozwiak , swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/150, disk: 13054803KB/40GB, locks:0B/0KB

group test1: admins: jozwiak , swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/40, disk: 1714466KB/30GB, locks:0B/0KB

group test2: admins: jozwiak , swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/50, disk: 7170234KB/40GB, locks:0B/0KB

*** END OF STATION DUMP ***

sam dump station --groups
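
Each group line above reports current and maximum quotas for projects and disk. The sketch below, assuming the line layout shown in this dump and 1 GB = 1024 * 1024 KB, extracts those numbers and computes the fraction of the disk quota in use. The name parse_group_line is hypothetical, and the lock quota is ignored.

```python
import re

# Layout of one "group" line, as it appears in the station dump.
GROUP_LINE = re.compile(
    r"^group\s+(?P<name>[^:\s]+):\s+admins:\s+(?P<admins>.*?)\s*,\s+"
    r"swap policy:\s+(?P<policy>\w+),\s+fair share:\s+(?P<share>\d+),\s+"
    r"quotas \(cur/max\):\s+projects\s+=\s+(?P<proj_cur>\d+)/(?P<proj_max>\d+),\s+"
    r"disk:\s+(?P<disk_cur_kb>\d+)KB/(?P<disk_max_gb>\d+)GB"
)

KB_PER_GB = 1024 * 1024  # assumption: binary units


def parse_group_line(line):
    """Return quota information for one group, or None if the line does not match."""
    m = GROUP_LINE.match(line.strip())
    if m is None:
        return None
    disk_cur_kb = int(m["disk_cur_kb"])
    disk_max_kb = int(m["disk_max_gb"]) * KB_PER_GB
    return {
        "name": m["name"],
        "admins": m["admins"].split(),
        "swap_policy": m["policy"],
        "fair_share": int(m["share"]),
        "projects": (int(m["proj_cur"]), int(m["proj_max"])),
        "disk_kb": (disk_cur_kb, disk_max_kb),
        "disk_fraction": disk_cur_kb / disk_max_kb,
    }


if __name__ == "__main__":
    sample = ("group test: admins: jozwiak , swap policy: LRU, fair share: 0, "
              "quotas (cur/max): projects = 0/150, disk: 13054803KB/40GB, locks:0B/0KB")
    info = parse_group_line(sample)
    print(info["name"], "uses {:.1%} of its disk quota".format(info["disk_fraction"]))
```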

Page 8: fcdfdata016

<nglas08> sam dump station --projects

*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 1 hours 4 minutes 49 seconds, admins: jozwiak terekhov

Known batch systems: lsf

Default batch system: lsf

No replica selection criteria

There are 0 authorized transfer groups

Minimum delivery is 1KB; external deliveries are unconstrained

PROJECT MANAGER: fileReleaseTO = 1 days, max files given to project: Unlimited

NO PROJECTS

*** END OF STATION DUMP ***

sam dump station --projects

Page 9: fcdfdata016

fcdfdata016

<fcdfdata016> sam submit --script=userscript --group=groupname --cpu-per-event= --defname=

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Page 10: fcdfdata016

fcdfdata016

<fcdfdata016>

>>>>>> Starting project with the Station Master

Station Master contacted, result: Started project 49008(49008_sam_) for group test

Waiting for the project to initialize...

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Page 11: fcdfdata016

fcdfdata016

<fcdfdata016>

Callback from server: 'OK|Project is ready'

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Page 12: fcdfdata016

fcdfdata016

<fcdfdata016>

>>>>>> Submitting the job to the batch system.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Page 13: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> PSUSP
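
The submission messages on the preceding slides carry two identifiers that are useful later: the SAM project (ID, name, group) and the batch job (ID, queue). A small sketch, assuming the message wording is exactly as shown on these slides, pulls them out with regular expressions. The function name parse_submit_output is hypothetical.

```python
import re

# Message formats copied from the slides; other SAM or LSF versions may word them differently.
PROJECT_RE = re.compile(
    r"Started project (?P<project_id>\d+)\((?P<name>[^)]+)\) for group (?P<group>\w+)"
)
JOB_RE = re.compile(r"Job <(?P<job_id>\d+)> is submitted to queue <(?P<queue>[^>]+)>\.")


def parse_submit_output(text):
    """Collect project and batch-job identifiers from the output of a submission."""
    result = {}
    project = PROJECT_RE.search(text)
    if project:
        result["project_id"] = int(project["project_id"])
        result["project_name"] = project["name"]
        result["group"] = project["group"]
    job = JOB_RE.search(text)
    if job:
        result["job_id"] = int(job["job_id"])
        result["queue"] = job["queue"]
    return result


if __name__ == "__main__":
    output = "\n".join([
        "Station Master contacted, result: Started project 49008(49008_sam_) for group test",
        "Waiting for the project to initialize...",
        "Callback from server: 'OK|Project is ready'",
        "Job <52554> is submitted to queue <sam_lo>.",
    ])
    print(parse_submit_output(output))
```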

Page 14: fcdfdata016

<nglas08> sam dump station --projects

*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 1 hours 12 minutes 44 seconds, admins: jozwiak terekhov

Known batch systems: lsf

Default batch system: lsf

No replica selection criteria

There are 0 authorized transfer groups

Minimum delivery is 1KB; external deliveries are unconstrained

PROJECT MANAGER: fileReleaseTO = 1 days, max files given to project: Unlimited

STATION PROJECTS:

project 49205_sam_(49205) user jozwiak.test started 01 Nov 14:08:45 UNIX pid 158400787 still wants/currently uses 5/0 files

*** END OF STATION DUMP ***

sam dump station --projects
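
The project line in this dump shows how many files a project still wants and how many it currently uses. As a sketch under the assumption that the line layout is fixed as shown, the following Python splits it into fields. The name parse_project_line is hypothetical, and the user field is kept exactly as printed (jozwiak.test) rather than interpreted.

```python
import re

# Layout of one "project" line, as it appears under STATION PROJECTS.
PROJECT_LINE = re.compile(
    r"^project\s+(?P<name>[^\s(]+)\((?P<project_id>\d+)\)\s+user\s+(?P<user>\S+)\s+"
    r"started\s+(?P<started>.+?)\s+UNIX pid\s+(?P<pid>\d+)\s+"
    r"still wants/currently uses\s+(?P<wants>\d+)/(?P<uses>\d+)\s+files$"
)


def parse_project_line(line):
    """Return the fields of one station project line, or None if it does not match."""
    m = PROJECT_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "name": m["name"],
        "project_id": int(m["project_id"]),
        "user": m["user"],
        "started": m["started"],  # kept as text; the dump prints no year
        "pid": int(m["pid"]),
        "files_wanted": int(m["wants"]),
        "files_in_use": int(m["uses"]),
    }


if __name__ == "__main__":
    sample = ("project 49205_sam_(49205) user jozwiak.test started 01 Nov 14:08:45 "
              "UNIX pid 158400787 still wants/currently uses 5/0 files")
    print(parse_project_line(sample))
```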

Page 15: fcdfdata016

<nglas08> sam dump project --project=49205_sam_

*** BEGIN GPM DUMP ***

Input files: 1003853..1011900

1003853: sim.ztautau.1000evts.017-1442-c5.01, size=0K, unbuffered yet

1003854: sim.ztautau.1000evts.017-1442-c5.02, size=0K, unbuffered yet

1011651: sim.pmc02_01.pythia.ztautau_mb1.1av_200evts.267_1553, size=0K, unbuffered

1011900: sim.pmc02_01.pythia.ztautau_mb1.1av_200evts.276_1152, size=0K, unbuffered

Cached (not buffered) files: (none)

Buffered files: (none)

External files with delivery problems: (none)

Umer contexts (name, state, join time, nSeen): (no umers)

Proc contexts (ID: name, state, join time [, current|last]): (no procs)

Processes waiting for call back:(none)

*** END GPM DUMP ***

sam dump project --project=<project name>
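
The input-file entries in the project dump each give a file ID, a file name, a size, and a buffering state. A minimal sketch for listing them, assuming the "ID: name, size=NK, state" layout seen above (parse_input_files is a hypothetical helper, not a SAM function):

```python
import re

# One input-file entry: "<id>: <name>, size=<n>K, <state>"
FILE_LINE = re.compile(
    r"^(?P<file_id>\d+):\s+(?P<name>[^,\s]+),\s+size=(?P<size_k>\d+)K,\s+(?P<state>.+)$"
)


def parse_input_files(dump_text):
    """Return the input-file entries found in a project dump, in order of appearance."""
    files = []
    for line in dump_text.splitlines():
        m = FILE_LINE.match(line.strip())
        if m:
            files.append({
                "file_id": int(m["file_id"]),
                "name": m["name"],
                "size_k": int(m["size_k"]),
                "state": m["state"],
            })
    return files


if __name__ == "__main__":
    sample = "\n".join([
        "Input files: 1003853..1011900",
        "1003853: sim.ztautau.1000evts.017-1442-c5.01, size=0K, unbuffered yet",
        "1011651: sim.pmc02_01.pythia.ztautau_mb1.1av_200evts.267_1553, size=0K, unbuffered",
        "Buffered files: (none)",
    ])
    for entry in parse_input_files(sample):
        print(entry["file_id"], entry["state"])
```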

Page 16: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> PSUSP

Optimizer

Page 17: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> PSUSP

eworker

eworker eworker

eworker

Page 18: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> PSUSP

eworker

eworker eworker

eworker

encp

encp encp encp

Page 19: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> PSUSP

eworker

eworker eworker

eworker

encp

encp encp encp

Enstore

Page 22: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> PSUSP

eworker

eworker

encp encp

Enstore

Page 23: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

eworker

eworker

encp encp

Enstore

Page 24: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

eworker

eworker

encp encp

Enstore

samscript.sh

userscript

Page 25: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

eworker

eworker

encp encp

Enstore

samscript.sh

userscript

consumer

Page 30: fcdfdata016

SAMManager:sam Getting next input file...

SAMManager:sam Project master will call back.

Page 31: fcdfdata016

<nglas08> sam dump project --project=49225_sam_

*** BEGIN GPM DUMP ***

Input files: 1099393..1099756

1099417: d0g.test_file_1G_a_dev.0001_001, size=0K, unbuffered yet

1099418: d0g.test_file_1G_a_dev.0002_001, size=0K, unbuffered yet

Cached (not buffered) files: (none)

Buffered files: (none)

External files with delivery problems: (none)

Umer contexts (name, state, join time, nSeen):

36422: jozwiak(test-harness:1), active, 05 Nov 13:59:09, 31

Proc contexts (ID: name, state, join time [, current|last]):

144663: jozwiak(test-harness:1)@nglas08, wait, 05 Nov 13:59:10, 1099415

Processes waiting for call back:

CID=36422: [email protected]:11872 (05 Nov 20:53:43)

*** END GPM DUMP ***

sam dump project --project=49225_sam_

Page 32: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

eworker

eworker

encp encp

Enstore

samscript.sh

userscript

consumer

Page 34: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

Page 37: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

rm

rm

Page 38: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

Page 39: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

Optimizer

Page 40: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

eworker eworker

Page 41: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

eworker eworker

rcp rcp

Page 42: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

eworker eworker

rcp rcp

Other Cache

Page 44: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

eworker eworker

rcp rcp

Page 45: fcdfdata016

fcdfdata016

<fcdfdata016>

Job <52554> is submitted to queue <sam_lo>.

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

52554 <user> RUN

samscript.sh

userscript

consumer

Page 54: fcdfdata016

fcdfdata016

<fcdfdata016>

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Project

pmaster

Batch (LSF)

Page 55: fcdfdata016

fcdfdata016

<fcdfdata016>

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Batch (LSF)

Page 56: fcdfdata016

fcdfdata016

<fcdfdata016> sam submit….

<fcdfdata016> sam submit….

<fcdfdata016> sam run project…

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Batch (LSF)

Page 57: fcdfdata016

fcdfdata016

<fcdfdata016>

Station central-analysis

smaster

Disks/Cache

Stager

stagerng

Batch (LSF)

52668 <user1> RUN

52675 <user2> RUN

52756 <user3> PSUSP

Project

pmaster

Project

pmaster

Project

pmaster

samscript.sh samscript.sh

userscript userscript

consumer consumer

eworker

eworker

rcp

encp

Other Cache

Enstore

Page 59: fcdfdata016

fcdfdata016

<fcdfdata016>

Disks

FSS central-analysis

fss

Stager

stagerng

Page 60: fcdfdata016

<fcdfdata016> sam dump fss

FSS version v3_2_2 at station central-analysis running on fcdfdata016.fnal.gov 6 hours 57 minutes 34 seconds

No routing (all transfers are direct)

Configuration for operation retrial (count, interval/timeout)

DBS contact: 3, 1 hours

Opter contact: 1, 1 hours

Authorization receipt:1, 1 hours

Stager contact: 1, 1 hours

Transfer (retrials upon timeout and upon failure): 3, 6 hours

Relay (multi-stage routing only): 3, 1 hours

File Storage Server Dump:

Stagers are known at nodes: fcdfdata016.fnal.gov

No requests ever submitted

sam dump fss
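
The FSS dump above lists, for each operation, a retry count and an interval or timeout in hours. As a sketch only, assuming every such line has the form "operation: count, N hours", the Python below collects them into a dictionary with the interval converted to seconds. The name parse_retry_config is hypothetical.

```python
import re

# "<operation>: <count>, <n> hours" (the dump prints "hours" even for a value of 1).
RETRY_LINE = re.compile(
    r"^(?P<operation>[^:]+):\s*(?P<count>\d+),\s*(?P<hours>\d+)\s+hours$"
)


def parse_retry_config(lines):
    """Map each operation name to (retry count, interval or timeout in seconds)."""
    config = {}
    for line in lines:
        m = RETRY_LINE.match(line.strip())
        if m:
            config[m["operation"].strip()] = (int(m["count"]), int(m["hours"]) * 3600)
    return config


if __name__ == "__main__":
    dump = [
        "No routing (all transfers are direct)",
        "DBS contact: 3, 1 hours",
        "Authorization receipt:1, 1 hours",
        "Transfer (retrials upon timeout and upon failure): 3, 6 hours",
        "Stagers are known at nodes: fcdfdata016.fnal.gov",
    ]
    for operation, (count, seconds) in parse_retry_config(dump).items():
        print(operation, count, seconds)
```

Lines that do not end in a count and an hour value, such as the routing and stager lines, are skipped.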

Page 61: fcdfdata016

fcdfdata016

<fcdfdata016> sam store descrip.py --source=<file loc> [--dest=/pnfs…..]

Disks

FSS central-analysis

fss

Stager

stagerng

Page 62: fcdfdata016

fcdfdata016

<fcdfdata016> sam store descrip.py --source=<file loc> [--dest=/pnfs…..]

Disks

FSS central-analysis

fss

Stager

stagerng

Descrip.py

Metadata

Info about file

SAM checks info, checks location,

Page 63: fcdfdata016

fcdfdata016

<fcdfdata016> sam store descrip.py --source=<file loc> [--dest=/pnfs…..]

Disks

FSS central-analysis

fss

Stager

stagerng

eworker

encp, rcp, bbftp

Page 64: fcdfdata016

Node from Really Far Away

Disk

FSS From Really Far Away

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

Page 65: fcdfdata016

Node from Really Far Away

Disk

FSS

Routing: fcdfdata016

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

sam store enstore

Page 66: fcdfdata016

Node from Really Far Away

Disk

FSS

Routing: fcdfdata016

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

eworker

bbftp fcdfdata016

Page 67: fcdfdata016

Node from Really Far Away

Disk

FSS From Really Far Away

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

eworker

bbftp fcdfdata016

Page 68: fcdfdata016

Node from Really Far Away

Disk

FSS From Really Far Away

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

Page 69: fcdfdata016

Node from Really Far Away

Disk

FSS From Really Far Away

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

eworker

encp

Page 71: fcdfdata016

Node from Really Far Away

Disk

FSS From Really Far Away

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore

rm

Page 72: fcdfdata016

Node from Really Far Away

Disk

FSS From Really Far Away

Stager

fcdfdata016

FSS central-analysis

Stager

Tmp Disk

Enstore