Journaled Component Files
John Scholes and Richard Smith
13 October, 2008
Or – How to never see FILE DAMAGED again!
Component files
Purely linear file layout
• Free space
• Component data (APL arrays)
• Global file information (root)
Updating a linear file
• Replacing a component with a smaller one wastes space
• Replacing a component with a larger one is not possible ...
• ... unless you move potentially large amounts of data first
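
The cost described above can be illustrated with a toy model. This is only a sketch under assumed names (`replace_linear` and the slot tuples are hypothetical, not the real file format): each component occupies a fixed-size slot, and the slots are stored back to back.

```python
def replace_linear(slots, index, new_data):
    """Replace one component in a back-to-back layout.

    slots is a list of (capacity, data) pairs (a made-up representation).
    Returns (bytes_moved, bytes_wasted).
    """
    capacity, _ = slots[index]
    if len(new_data) <= capacity:
        # Fits in place, but the unused tail of the slot is wasted.
        slots[index] = (capacity, new_data)
        return 0, capacity - len(new_data)
    # Too big: every slot stored after this one must shift to make room.
    moved = sum(cap for cap, _ in slots[index + 1:])
    slots[index] = (len(new_data), new_data)
    return moved, 0


slots = [(10, b"a" * 10), (1000, b"b" * 1000), (5000, b"c" * 5000)]
print(replace_linear(slots, 0, b"A" * 4))   # smaller: nothing moves, 6 bytes wasted
print(replace_linear(slots, 0, b"A" * 20))  # larger: 6000 bytes must move
```
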
Actual file layout
• Free space
• Global file information (root)
• Component index blocks
• Component data (APL arrays)
• Free space nodes
Updating a component
• Write the new data to free space (note that the free space node is overwritten)
• Update the component index blocks
• Update the free space nodes
• Update the root
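
The indirection through the index can be sketched as follows. This is an in-memory toy, not the real on-disk structures: the class name, the dictionaries and the `next_free` counter (standing in for the free space nodes and root) are all assumptions made for illustration.

```python
class ComponentFile:
    """Toy model: an index maps component numbers to data blocks."""

    def __init__(self):
        self.blocks = {}    # block address -> component data
        self.index = {}     # component number -> block address
        self.next_free = 0  # stand-in for the free space nodes and root

    def put(self, number, data):
        # 1. Write the new data to free space.
        addr = self.next_free
        self.blocks[addr] = data
        self.next_free += 1
        # 2. Point the index at the new copy.
        old = self.index.get(number)
        self.index[number] = addr
        # 3. Return the old copy, if any, to free space.
        if old is not None:
            del self.blocks[old]


f = ComponentFile()
f.put(1, b"small")
f.put(1, b"much larger replacement")  # no other component has to move
print(f.blocks[f.index[1]])
```

Because readers only reach data through the index, a replacement of any size never forces other components to move.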
Adding a component
• Write the new data in free space (note that a free space node is overwritten)
• Update the component index blocks
• Update the free space nodes
• Update the root
Adding – and causing damage
• Write the new data in free space (note that a free space node is overwritten)
• ** APL process is killed **
• The free space node is still referenced but has been corrupted
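
The failure can be reproduced in a toy model. The sketch below assumes a made-up structure (a dictionary with a root, a free list of linked nodes, and a `Crash` exception standing in for the killed process); it is not the real file format.

```python
class Crash(Exception):
    """Stands in for the APL process being killed."""


def unsafe_add(disk, data, crash_after_write=False):
    addr = disk["root"]["first_free"]         # free space node to reuse
    next_free = disk["blocks"][addr]["next"]  # read it before overwriting
    disk["blocks"][addr] = {"data": data}     # the node is overwritten
    if crash_after_write:
        raise Crash
    disk["root"]["index"].append(addr)        # update the component index
    disk["root"]["first_free"] = next_free    # update free space and root


def free_list_is_valid(disk):
    """Walk the free list; every block on it must still be a free node."""
    addr = disk["root"]["first_free"]
    while addr is not None:
        node = disk["blocks"][addr]
        if "next" not in node:
            return False  # the root references a node that is now data
        addr = node["next"]
    return True


disk = {"root": {"first_free": 0, "index": []},
        "blocks": {0: {"next": 1}, 1: {"next": None}}}
try:
    unsafe_add(disk, b"new component", crash_after_write=True)
except Crash:
    pass
print(free_list_is_valid(disk))  # the free list is now corrupt
```
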
The solution - journaling
• The free space in a file can be safely updated
• The majority of an update occurs in this free space
• Updates to existing data are first written to a journal
• The update is then completed
Adding - journaled
• The free space can be updated
• The journal is put in free space
• Most of the component is written (the free space node was left intact)
• All remaining updates are journaled
• The journal is activated
• Only free space updated so far
• Entire update recorded in file
• The journal is executed
• The journal is removed
• The update is complete
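
The stages of a journaled update can be sketched as below. This is a minimal illustration, not the actual implementation: the dictionary layout, the `journaled_put` name and the `stop_after` argument (which simulates the process being killed) are all assumptions.

```python
def journaled_put(disk, number, data, stop_after=None):
    """Add or replace a component using a journal (toy model)."""
    # 1. Write the data and the journal into free space. Nothing refers
    #    to them yet, so existing structures are untouched.
    addr = disk["next_free"]
    disk["blocks"][addr] = data
    disk["journal"] = {"active": False,
                       "index_updates": {number: addr},
                       "next_free": addr + 1}
    if stop_after == "write":
        return
    # 2. Activate the journal: the entire update is now recorded.
    disk["journal"]["active"] = True
    if stop_after == "activate":
        return
    # 3. Execute the journal: apply the recorded changes.
    disk["index"].update(disk["journal"]["index_updates"])
    disk["next_free"] = disk["journal"]["next_free"]
    # 4. Remove the journal: the update is complete.
    disk["journal"] = None


disk = {"index": {}, "blocks": {}, "next_free": 0, "journal": None}
journaled_put(disk, 1, b"component one")
print(disk["index"], disk["journal"])
```
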
Accessing the file – example 1
• Normal case: there is no journal
• Nothing needs to be done
Accessing the file – example 2
• Process killed before journal complete
• The updates were all in free space
• The file has been safely rolled back
Accessing the file – example 3
• Process killed after journal complete but before update finished
• The journal is (re-)executed
• The journal is removed
• The update has been completed and damage repaired
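
The three access cases can be sketched as one recovery check run when a file is opened. This is a toy model under assumed names (`recover`, and a dictionary holding an `index` and an optional `journal` with an `active` flag); the real structures are not described in the slides.

```python
def recover(disk):
    """Inspect the journal on open and return what was done (toy model)."""
    journal = disk.get("journal")
    if journal is None:
        return "nothing to do"  # normal case: there is no journal
    if not journal["active"]:
        # Killed before the journal was complete: only free space was
        # touched, so discarding the journal rolls the file back.
        disk["journal"] = None
        return "rolled back"
    # Killed after the journal was complete: (re-)execute it. Applying
    # the same index updates twice is harmless.
    disk["index"].update(journal["index_updates"])
    disk["journal"] = None
    return "update completed"


half_done = {"index": {1: 0},
             "journal": {"active": True, "index_updates": {2: 1}}}
print(recover(half_done), half_done["index"])
```
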
Journaled files
• Are supported now in 12.0.3
• Have very little impact on performance and file size
• May be enabled on a per-file basis
• ⎕FPROPS converts a file to/from journaled
Journaled files
• Can only be accessed by 12.0.3 or later (but journaling can be switched off)
• Are not enabled by default
• Protect from file damage if APL is killed
• Do not currently always protect from file damage if the O/S is killed
Disk caching
• Disk writes are held in memory and flushed efficiently (out of sequence)
• Data is still flushed if APL is killed
• But if the O/S is killed, out-of-sequence data may be lost
[Diagram: APL Process, O/S Kernel, Disk]
Why this matters - example
1. Write to free space (inc journal)
2. Mark journal as present
• O/S dies; update 1 incomplete
• Executing this broken journal would corrupt the file
• There are 4 such points in an update
Critical update sequence
These must be done atomically:
1. Write to free space (inc journal)
2. Mark journal as present
3. Execute the journal
4. Remove the journal
fsync solution
• fsync causes APL to wait for the data to be committed to disk
• Could issue 4 fsyncs per update
[Diagram: APL Process, O/S Kernel, Disk]
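
Issuing one fsync per step can be sketched as follows. The helper names, offsets and byte contents are invented for illustration; only `os.fsync` itself is a real call, and the sketch says nothing about how the interpreter actually lays out its writes.

```python
import os
import tempfile


def write_and_sync(f, offset, data):
    """Write data at offset, then wait until the O/S reports it is on disk."""
    f.seek(offset)
    f.write(data)
    f.flush()             # move the data from Python's buffer to the O/S
    os.fsync(f.fileno())  # ask the O/S to commit its cached writes


def four_fsync_update(f):
    """Run the four steps in order, syncing after each one."""
    steps = [
        (16, b"data+journal"),  # 1. write to free space (inc journal)
        (0, b"J"),              # 2. mark journal as present
        (8, b"index"),          # 3. execute the journal
        (0, b"-"),              # 4. remove the journal
    ]
    for offset, data in steps:
        write_and_sync(f, offset, data)
    return len(steps)  # number of fsyncs issued


with tempfile.TemporaryFile() as f:
    print(four_fsync_update(f))
```

Each call blocks until the previous step is durable, which fixes the ordering at the cost of four waits per update.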
fsync solution
• Slows the application considerably
• So we should reduce the number of fsyncs if possible
• The good news is that we can
First fsync elimination
1. Write to free space (inc journal)
2. Mark journal as present
• O/S dies; update 1 incomplete
• Executing this broken journal would corrupt the file
• Solution: add checksums to detect it
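
Detecting an incomplete journal with a checksum can be sketched as below. The record layout (length and CRC-32 in an 8-byte header) and the function names are assumptions; the slides do not say which checksum or format is used.

```python
import struct
import zlib


def seal_journal(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC-32 (assumed layout)."""
    return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload


def journal_is_intact(record: bytes) -> bool:
    """Return True only if the whole payload arrived and matches its CRC."""
    if len(record) < 8:
        return False
    length, crc = struct.unpack("<II", record[:8])
    payload = record[8:8 + length]
    return len(payload) == length and zlib.crc32(payload) == crc


record = seal_journal(b"index updates")
print(journal_is_intact(record))       # complete journal
print(journal_is_intact(record[:-3]))  # tail never reached the disk
```

A journal that is marked as present but fails this check is treated as if it had never been written, so it is never executed.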
Second fsync elimination
2. Mark journal as present
3. Start executing the journal
• O/S dies; journal no longer present
• No journal for recovery
• Solution: use the checksumming and redundancy to rebuild indices
Second fsync elimination
• Note: omitting this fsync does not prevent damage
• But we are able to fix it
Third fsync elimination
3. Execute the journal
4. Remove the journal
• O/S dies; earlier updates lost
• No journal for recovery
• Rebuild indices
Fourth fsync elimination
4. Remove the journal
• O/S dies; update lost
• If the journal is still present we may re-execute it on recovery
• Otherwise it will fail its checksum validation
Additional journaling options
• Two fsyncs eliminated by checksumming
• One further fsync eliminated if recovery tool used
• Last fsync eliminated if recovery tool used ...
• ... potential loss of more data
Additional journaling options
• Are planned for a future release
• Will have a greater impact on performance and file size
• Will offer a variety of options so that security and performance may be balanced
• Will be configured on a per-file basis