Managing Data with iPlant Introduction to Uploading, Downloading, Sharing, and Metadata in the Data Store

Background on the iPlant Data Store

• iPlant Learning Center provides quick tutorials, slides, and documentation for everything you will see here
• Backbone of the iPlant CI
• Connects to all iPlant services
• Appropriate for any (research) file types of any size
• Cloud-based, backed up, initial 100 GB expandable to 1 TB
• Built on iRODS
  • Folder = Collection
  • Other than that, you don’t have to think about this if you don’t want to

• Demo 1 – Navigating the Data Store in the Discovery Environment

Upload and Download In the Discovery Environment

‘Simple’, for small files (~ 5 files, <1.9 GB)

‘Bulk’, for larger files and folders (<10GB)

Import from URL (no size limit)
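
The three options above can be read as a simple decision rule. The following is a minimal sketch only, not part of the Discovery Environment: the function name, the decimal definition of a gigabyte, and the choice to apply each limit to the total size of a transfer are all assumptions made for illustration.

```python
GB = 10**9  # assumption: decimal gigabytes

SIMPLE_MAX_BYTES = 1_900_000_000  # 'Simple': under 1.9 GB
SIMPLE_MAX_FILES = 5              # 'Simple': roughly 5 files
BULK_MAX_BYTES = 10 * GB          # 'Bulk': under 10 GB


def suggest_upload_method(total_bytes, file_count, available_at_url=False):
    """Suggest a Discovery Environment upload option (hypothetical helper)."""
    if available_at_url:
        # Import from URL has no size limit, but the data must already be online.
        return "Import from URL"
    if file_count <= SIMPLE_MAX_FILES and total_bytes < SIMPLE_MAX_BYTES:
        return "Simple"
    if total_bytes < BULK_MAX_BYTES:
        return "Bulk"
    # Larger transfers fall outside the limits listed on this slide.
    return "Outside the limits listed here"


print(suggest_upload_method(500_000_000, 3))
print(suggest_upload_method(5 * GB, 40))
```

Transfers that fall outside these limits are the case the desktop and command-line tools later in the deck are meant for.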

Advantages (+)
• Covers most upload/download sharing needs
• Point and click

Disadvantages (-)
• Some size/speed limitations

Demo 2 – Transferring Data from the Discovery Environment

Tips

Spaces / Special Characters
• Many software packages are sensitive to spaces in file names and/or the special characters below. Users may wish to rename uploaded files before using them in an analysis. This is good advice for any transfer method.

~ ` ! @ # $ % ^ & * ( ) + =

{ } [ ] | \ : ; " ' < > , ? /
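
Renaming can be done before upload with a short script. This is a minimal sketch, assuming Python 3 is available locally; the function name and the choice of an underscore as the replacement character are illustrative assumptions, not an iPlant convention. It works on a single file name rather than a full path, because "/" is one of the characters it replaces.

```python
import re

# The special characters listed on the slide; whitespace is handled separately.
SPECIAL_CHARS = "~`!@#$%^&*()+={}[]|\\:;\"'<>,?/"


def sanitize_filename(name, replacement="_"):
    """Replace runs of spaces and listed special characters in a file name."""
    pattern = "[" + re.escape(SPECIAL_CHARS) + r"\s]+"
    cleaned = re.sub(pattern, replacement, name)
    # Drop replacement characters left at the very start or end of the name.
    return cleaned.strip(replacement)


print(sanitize_filename("my sample (v2).fastq"))
```

Periods, hyphens, and underscores are left untouched, so file extensions survive the renaming.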

Tips

Bulk Transfers
• Requires Java 6 or later to be enabled in your browser (http://www.java.com)
• Not currently compatible with Google Chrome
• Window and web browser must remain open and active until the transfer is complete

Import from URL
• Monitor Notifications to check that the URL import has been submitted; you will receive a notification when the import is complete

iDrop Desktop
• Drag and drop files and folders
• File sizes up to your total allocation
• Fast transfers
• Synchronize folders with Data Store
• Download and installation instructions

Can demo installation

Advantages (+)
• Upload/download large file sizes and numbers of files

Disadvantages (-)
• Sharing and permission features more complex

Demo 3 – Transferring Data with iDrop Desktop

Tips

iDrop
• Requires Java 6 or later (http://www.java.com)
• At the bottom of the iDrop window you can monitor, pause, and restart transfers. You can also view additional details by clicking Manage.
• When iDrop Desktop is open/running there will be an icon in your system tray or menu along with other background programs (e.g. Wi-Fi, Bluetooth). You need to close iDrop from this icon to completely close the program.

iCommands

• Ability to script and automate
• Access from terminal/server
• Can resume transfers
• Download and installation instructions

Can demo installation

Advantages (+)
• Customizability

Disadvantages (-)
• Requires at least some command line expertise

Demo 4 – Accessing the Data Store with iCommands

Tips

Useful iget / iput options:
-f  force - overwrite local files
-P  output the progress of the download
-r  recursive - retrieve subcollections (directories)
-T  renew socket after 10 min. (use with large files)
-V  very verbose
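
Because iCommands can be scripted, the options above can be assembled programmatically. This is a minimal sketch, assuming iCommands is installed and on the PATH; the function name and the example paths are hypothetical, and only the five flags listed on the slide are used. The sketch builds and prints the command line without running it.

```python
import shlex


def build_transfer_command(tool, source, destination, *, force=False,
                           progress=False, recursive=False,
                           renew_socket=False, very_verbose=False):
    """Build an iget or iput argument list (hypothetical helper)."""
    if tool not in ("iget", "iput"):
        raise ValueError("tool must be 'iget' or 'iput'")
    flags = [
        (force, "-f"),         # overwrite existing files
        (progress, "-P"),      # output transfer progress
        (recursive, "-r"),     # include subcollections (directories)
        (renew_socket, "-T"),  # renew socket after 10 min, for large files
        (very_verbose, "-V"),  # very verbose output
    ]
    cmd = [tool] + [flag for enabled, flag in flags if enabled]
    return cmd + [source, destination]


# Example paths are placeholders, not real Data Store locations.
cmd = build_transfer_command("iget", "/zone/home/user/reads", "./reads",
                             recursive=True, progress=True)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run` once the paths point at real locations.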

Sharing Files in the Data Store

Two easy ways to share data within the Discovery Environment

Discovery Environment Sharing
• Share files/folders instantly
• Control access permissions
• Manage sharing between collaborators

Sharing via Public Link
• No iPlant account required
• Limited to individual files
• URLs are public (less secure, but can be revoked)

Demo 5 – Data Sharing in the DE

Tips

• When sharing via the Discovery Environment, use the following chart to decide the permissions you wish to grant:

Permission chart. Columns: Read, Download, Metadata, Info Types, Rename, Move, Delete. Rows: Read, Write, Own.

Need help?

• iPlant Learning Center provides quick tutorials, slides, and documentation for everything you will see here

• http://ask.iplantcollaborative.org/questions/

Analyses in the Discovery Environment

Using Bioinformatics Apps

Background on the iPlant Discovery Environment

• iPlant Learning Center provides quick tutorials, slides, and documentation for everything you will see here
• So far we have mainly looked at the Data tab
• The DE is also a powerful interface for Apps and Analyses
• This is where scalability and extensibility really come into play
• Can customize and create new apps
• Seamlessly integrated with high-throughput computing
• Apps are linked to iPlant wiki for documentation

Demo 1 – Apps and analyses in the DE

Tips

• Mark an App as a favorite and it will appear in your workspace.

• Use the Apps menu to:
  • Rate Apps to provide feedback
  • Click the info icon to see the user manual for an App

• Re-launch a job by clicking on the App name in the Analyses menu. The App will re-launch populated with the last parameters used, giving you the option to alter the settings you want.

Tips

• View a file containing metadata (settings, etc.) connected to your analysis. Select a job and click View Parameters.

Tips

• Can customize tools in the DE (simplifying here)

• App installation / modification happens at two levels:
  • Apps (and dependencies) are installed on the DE cluster (done by iPlant support)
  • The DE interface is created by the App integrator/user and published to the DE

• Detailed instructions with videos, manuals, documentation in Learning Center

Viewing and Editing Metadata in the DE

• User metadata is stored as AVUs
  • Attribute – Value – Unit

• Template-based metadata

• Can also view and edit with iCommands (tomorrow)
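
An AVU is simply a triple, which can be pictured as a small record type. This is a minimal sketch for illustration only; the class, the helper function, and the example attribute names and values are hypothetical and do not reflect how the Data Store stores metadata internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AVU:
    """One Attribute-Value-Unit triple; the unit is optional."""
    attribute: str
    value: str
    unit: Optional[str] = None


def values_for(avus, attribute):
    """Return every value recorded under the given attribute name."""
    return [avu.value for avu in avus if avu.attribute == attribute]


# Hypothetical metadata for a single sequence file.
metadata = [
    AVU("organism", "Zea mays"),
    AVU("read_length", "100", "bp"),
    AVU("tissue", "leaf"),
    AVU("tissue", "root"),
]

print(values_for(metadata, "tissue"))
```

Note that the same attribute may appear more than once, so a lookup returns a list rather than a single value.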

Demo 2 – Metadata in the DE

Metadata in the DE

Tips

• Currently can only use one template at a time

• Can create custom metadata templates

Need help?

• iPlant Learning Center provides quick tutorials, slides, and documentation for everything you will see here

• http://ask.iplantcollaborative.org/questions/

Syncing Folders With iDrop

iPlant Learning Center provides quick tutorials, slides, and documentation for everything you will see here

iDrop Desktop Synchronization

1. Click on Settings
2. Click on the Synchronization tab
3. Click New to start a new synchronization
4. Enter a name (e.g. Project 1 sequence data)
5. Click Choose Local Folder to select a folder
6. Select a synchronization mode
7. Select a frequency for synchronization
8. Click Choose iRODS Folder to select the location to synch to
9. Click Update to save your synchronization

Demo – Dropbox Sync

Tips

Choose one of three methods to synchronize files:

• Local > iRODS – local files are copied to the Data Store at synch

• Local < iRODS – files in the Data Store are copied to the local computer at synch

• Local <> iRODS – files in the Data Store and on the local computer are both synched

• If syncing to a Dropbox folder, Dropbox must be paused during setup
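
The difference between the three modes can be sketched as a direction rule. This is a simplified illustration, not iDrop's actual algorithm: it assumes files are compared by name only, whereas a real synchronization would presumably also consider modification times or checksums. All names in the sketch are hypothetical.

```python
from enum import Enum


class SyncMode(Enum):
    LOCAL_TO_IRODS = "Local > iRODS"
    IRODS_TO_LOCAL = "Local < iRODS"
    BIDIRECTIONAL = "Local <> iRODS"


def plan_sync(local_files, remote_files, mode):
    """Return (uploads, downloads) for one synch pass, by file name only."""
    local_files, remote_files = set(local_files), set(remote_files)
    uploads, downloads = set(), set()
    if mode in (SyncMode.LOCAL_TO_IRODS, SyncMode.BIDIRECTIONAL):
        # Files present locally but missing from the Data Store.
        uploads = local_files - remote_files
    if mode in (SyncMode.IRODS_TO_LOCAL, SyncMode.BIDIRECTIONAL):
        # Files present in the Data Store but missing locally.
        downloads = remote_files - local_files
    return sorted(uploads), sorted(downloads)


local = {"a.fastq", "b.fastq"}
remote = {"b.fastq", "c.fastq"}
for mode in SyncMode:
    print(mode.value, plan_sync(local, remote, mode))
```

In the one-way modes, one of the two lists is always empty; only the two-way mode moves files in both directions.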

Need help?

• iPlant Learning Center provides quick tutorials, slides, and documentation for everything you will see here

• http://ask.iplantcollaborative.org/questions/

Advanced Topics

Searching in the DE, iCommands and Metadata, Data Commons Plans and Progress

Searching in the Discovery Environment

Demo 1 – Basic and advanced searching

• Basic search bar lets you search all files and folders where you have permission

• Advanced search allows searching based on metadata, permissions, and share status

iCommands to view and edit metadata

Demo 2 – imeta commands

• Ability to interact with metadata at the command line

• Already installed as part of iCommands

• Documentation is a little wimpy
  • Try Here and Here
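
As a starting point, the two most common imeta operations are adding an AVU and listing the AVUs on an object. This is a minimal sketch, assuming iCommands is installed; the helper names and the example path are hypothetical. It relies on the standard imeta forms `imeta add -d|-C Name AttName AttValue [AttUnits]` and `imeta ls -d|-C Name`, where `-d` targets a file (data object) and `-C` a folder (collection). The sketch builds the commands without running them.

```python
def imeta_add(path, attribute, value, unit=None, collection=False):
    """Build an 'imeta add' argument list (hypothetical helper)."""
    target = "-C" if collection else "-d"  # collection (folder) or data object (file)
    cmd = ["imeta", "add", target, path, attribute, value]
    if unit:
        cmd.append(unit)  # the unit is the optional third part of the AVU
    return cmd


def imeta_ls(path, collection=False):
    """Build an 'imeta ls' argument list to show the AVUs on a path."""
    return ["imeta", "ls", "-C" if collection else "-d", path]


# The path below is a placeholder, not a real Data Store location.
print(imeta_add("/zone/home/user/sample.fastq", "read_length", "100", "bp"))
print(imeta_ls("/zone/home/user/sample.fastq"))
```

Either list can be handed to `subprocess.run` from a script that loops over many files.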

Tips

• At the moment, metadata added via template is not available from the command line

Data, Project, Research Management In the Data Commons

• Tools for sharing, managing, and publishing data

• A home for high-value public datasets to be used with iPlant analysis tools

• A way to manage projects

• Metadata templates and workflows for common analysis types

• Working hard to lay the groundwork, with development starting early 2015

Data, Project, Research Management In the Data Commons

• Additional layer on top of the Data Store for Users to publish packages of data and metadata to the Data Commons and supported external repositories with appropriate long-term identifiers and licenses

• Data will be static, searchable, discoverable, and linked to external repositories

• Based on Data Strategy, current CI, Developer input, recognized need for additional components to support Data Commons effort

Data Commons

Data Commons Development Plans

Staging Area

DE Project Interface

Planned Features
• Define ‘Project’ data / metadata
• Faceted view of Data Store based on metadata
• Organize data, enter standardized and free-text metadata, and match file types with suitable analysis options

Metadata Progress
• Data management and Genomic use cases mature and available for development
• Existing tags, metadata collection, and metadata search are critical components

Next Steps
• Development based on use cases and manual walkthrough
• Define CI-wide Project concept

Planned Features
• Interface between Data Commons and rest of the CI
• Select from Project Interface and distill data/metadata into package for publication to the Data Commons and beyond
• Select appropriate licenses and identifiers for data and external repositories
• Metadata “carry through” from other platforms

Planned Features
• Static, searchable, discoverable, licensed data packages with persistent identifiers and links to external repositories
• Data will be available and useful to the community, not buried in Data Store

Next Steps
• Define Developer needs based on use cases and data models
• Define entry beyond DE
• Integrate ontologies and controlled vocabularies with metadata

Next Steps
• Manually shepherd existing use case through all components to better define Development needs
• Develop requirements for additional use cases and data types
• Provide documentation
• Define potential EOT deliverables

Goal: Data models and workflows for the entire data lifecycle

specimen collection

analysis

project creation

publication

Interface Mockups are in Review

Demo 3 – Data Commons Interface Mockups

Metadata Templates for G2F

• Open discussion now that we are all updated on G2F and iPlant infrastructure

• Project Interface needs for G2F?
• Metadata needs for G2F?
• Other needs?
• Perhaps start here?

Thanks!

• iPlant Learning Center provides quick tutorials, slides, and documentation

• http://ask.iplantcollaborative.org/questions/