Large Files Without the Trials
Aaron VanDerlip and Sally Kleinfeldt, Plone Symposium East 2010
Thursday, June 3, 2010
Acknowledgments
• Bioneers provides environmental education and social connectivity through conferences, radio and TV, books, and online materials
• Engaged Jazkarta to build a file asset server based on Plone to help them organize, capture, and store multimedia and textual content with files as large as 5 GB.
Acknowledgments
• Aaron VanDerlip - Project Manager
• Kapil Thangavelu - Developer
What is a Big File?
• Anything that makes you wait...
Plone Problems with Big Files
1. Uploading/Downloading
2. Versioning
Uploading Big Files
• Both the user and a Zope thread are waiting for the file transfer
Uploading Big Files
• Browser encodes file in multipart MIME format
• Zope must undo this encoding
• CPU and memory intensive, and SLOW
• Zope thread is blocked during this process
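To make the encode/decode cost concrete, here is a minimal sketch of a multipart body for one file field and the work needed to undo it. It is not Zope's parser; the names `BOUNDARY`, `encode_upload`, and `decode_upload` are hypothetical, and the decoder assumes the boundary never occurs inside the payload.

```python
import re

BOUNDARY = b"----sketchboundary"  # hypothetical; browsers generate a random one


def encode_upload(field, filename, payload):
    """Wrap one file in a multipart/form-data body, roughly as a browser does."""
    disposition = (b'Content-Disposition: form-data; name="' + field +
                   b'"; filename="' + filename + b'"')
    lines = [
        b"--" + BOUNDARY,
        disposition,
        b"Content-Type: application/octet-stream",
        b"",
        payload,
        b"--" + BOUNDARY + b"--",
        b"",
    ]
    return b"\r\n".join(lines)


def decode_upload(body, boundary):
    """Undo the encoding: scan the whole body in memory and pull out each file."""
    files = {}
    for chunk in body.split(b"--" + boundary)[1:]:
        if chunk.startswith(b"--"):
            break  # closing delimiter reached
        # Each part is framed as CRLF + headers + blank line + payload + CRLF.
        headers, _, payload = chunk[2:-2].partition(b"\r\n\r\n")
        match = re.search(rb'filename="([^"]*)"', headers)
        if match:
            files[match.group(1)] = payload
    return files
```

Every byte of the upload passes through `split` and `partition` here, which is the kind of work that ties up a thread when the file is several gigabytes.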
Downloading Big Files
• ...the same thing happens in reverse
Learning from Rails
• Get file encoding/decoding and read/write operations out of Plone
• Web servers are really good at this: Apache, Nginx, and Lighttpd
• Our implementation uses Apache
• Apache file streaming is fast and threads are cheap
Learning from Rails
• Uploads: Apache plus mod_porter http://therailsway.com/tags/porter
• Downloads: Apache plus mod_xsendfile http://john.guen.in/past/2007/4/17/send_files_faster_with_xsendfile/
• ...and of course ZODB Blob storage
Mod Porter
• Parses the multipart MIME data
• Writes the file to disk
• Changes the Request to contain a pointer to the temp file on disk
• All done efficiently in C code inside your Apache process
Apache Config for Mod Porter
LoadModule apreq_module /usr/lib/apache2/modules/mod_apreq2.so
LoadModule porter_module /usr/lib/apache2/modules/mod_porter.so
# Apache has a default read limit of 64MB, set it higher
APREQ2_ReadLimit 2G
...
Porter On
# Files below this size will not be handled by mod_porter
PorterMinSize 14M
# Where the uploaded files are stored
PorterDir /mnt/uploads-Apache
X-Sendfile
• HTTP header
• Set an X-Sendfile header and the path of a file on your response
• Apache does the rest
Apache Config for X-Sendfile
LoadModule xsendfile_module /usr/lib/apache2/modules/mod_xsendfile.so
...
EnableSendfile On
XSendFile on
# Config to send file resources directly from blob storage
XSendFilePath /mnt/bioneers/var/blobstorage
Using X-Sendfile from Python
def download(self, response, file_path):
    # Apache (mod_xsendfile) streams the file named in this header
    response.setHeader("X-Sendfile", file_path)
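The same idea can be sketched outside Zope as a plain WSGI application: the Python side sends only headers and an empty body, and the front-end server substitutes the file. This is an illustration, not the ore.bigfile code; `make_sendfile_app`, `download_name`, and the example path are hypothetical, and it assumes a front end with mod_xsendfile enabled and the path allowed by XSendFilePath.

```python
def make_sendfile_app(file_path, download_name):
    """Return a WSGI app that hands the transfer of file_path to the web server."""

    def app(environ, start_response):
        headers = [
            # The front-end server reads this header and streams the file itself.
            ("X-Sendfile", file_path),
            ("Content-Type", "application/octet-stream"),
            ("Content-Disposition",
             'attachment; filename="%s"' % download_name),
        ]
        start_response("200 OK", headers)
        # No file data is read or returned by the Python process.
        return [b""]

    return app


# Hypothetical usage: a file below the blob storage directory from the config
application = make_sendfile_app(
    "/mnt/bioneers/var/blobstorage/example.blob", "example.mov")
```

The application thread is free again as soon as the headers are written, however large the file is.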
Blob Storage
• Uploads
• Blob.consumeFile moves file from Apache’s temp area to blob storage (ZODB/blob.py)
• Uses os.rename, file never enters Plone
• Downloads
• Served directly from blob storage
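The reason the rename step is cheap can be shown with a small sketch: moving a file within one filesystem changes a directory entry and never reads the contents. This is not ZODB's implementation; `consume_upload`, `storage_dir`, and `blob_name` are hypothetical names, and the sketch assumes the temp area and the storage directory share a filesystem.

```python
import os


def consume_upload(temp_path, storage_dir, blob_name):
    """Move an uploaded temp file into storage without reading its contents."""
    os.makedirs(storage_dir, exist_ok=True)
    target = os.path.join(storage_dir, blob_name)
    # On a single filesystem this is a metadata operation, so its cost does not
    # grow with file size. os.rename may fail across filesystems, which this
    # sketch does not handle.
    os.rename(temp_path, target)
    return target
```

This is also why the Apache upload directory and the blob storage directory are best kept on the same mount.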
Upload Process
What About Really Really Big Files?
• Use FTP
• Supports continuation and batching
• Handles files too large for browser limits
• Content editors use FTP to transfer files to an upload directory
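Continuation can be sketched with Python's standard `ftplib`: ask the server how much of the file it already has, then restart the transfer from that offset. This is an illustration only; `resume_upload` is a hypothetical helper, and it assumes the server supports the SIZE and REST commands.

```python
import ftplib
import os


def resume_upload(ftp, local_path, remote_name):
    """Upload local_path, continuing from whatever the server already holds."""
    ftp.voidcmd("TYPE I")  # switch to binary mode before asking for SIZE
    try:
        offset = ftp.size(remote_name) or 0
    except ftplib.error_perm:
        offset = 0  # remote file does not exist yet
    with open(local_path, "rb") as fp:
        fp.seek(offset)  # skip the bytes that already arrived
        # rest=... makes ftplib send a REST command before STOR
        ftp.storbinary("STOR " + remote_name, fp, rest=offset or None)
    # Number of bytes sent in this attempt
    return os.path.getsize(local_path) - offset
```

Here `ftp` would be a connected and logged-in `ftplib.FTP` instance; an interrupted transfer can be retried by calling the function again.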
Uploading with FTP
ore.bigfile
• Minimally intrusive, works with the grain of Plone
• Provides Big File content type
• IFrontendFileServer interface defines two methods that provide web server support for upload and download
• Apache and Nginx implementations provided
ore.bigfile Limitations
• Upload directory is hardcoded
• Very large images that Mod Porter intercepts may cause errors
Versioning Big Files
Solution
• Bypass CMFEditions - no file size limitation
• Create a new version only when file changes (not metadata)
• Allow old versions to be purged
• Version information stored on Big File object using annotations
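The versioning rules above can be sketched in a few lines. This is not the ore.bigfile code: a plain dict stands in for the object's annotations, the key `VERSIONS_KEY` and both function names are hypothetical, and detecting a file change by content hash is an assumption, since the slides do not say how the change is detected.

```python
import hashlib

VERSIONS_KEY = "bigfile.versions"  # hypothetical annotation key


def file_digest(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large file is never held in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_version(annotations, path, note=""):
    """Add a version entry only when the file content changed."""
    versions = annotations.setdefault(VERSIONS_KEY, [])
    digest = file_digest(path)
    if versions and versions[-1]["digest"] == digest:
        return False  # metadata-only edit: no new version
    versions.append({"digest": digest, "note": note})
    return True


def purge_versions(annotations, keep_last):
    """Drop the oldest entries, keeping at most keep_last; return how many went."""
    versions = annotations.get(VERSIONS_KEY, [])
    cut = max(len(versions) - keep_last, 0)
    del versions[:cut]
    return cut
```

Keeping the version list on the object itself means no separate versioning tool has to copy or compare the file data.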
Conclusion
• ore.bigfile solves the Big File problem for a particular use case; it is not feature complete
• It does so by taking advantage of mature web server technology
• The code is minimally intrusive
• It provides a strategy for implementation we can learn from as we improve Plone’s Big File story
http://svn.objectrealms.net/view/public/browser/ore.bigfile
Questions