Installation - Plus

Loretta Auvil

National Center for Supercomputing Applications

University of Illinois at Urbana-Champaign

lauvil@illinois.edu

Outline

• Installation

• Meandre servers and clusters

• Development Tools: Eclipse Plugin

• Hands-On

Considerations

• Do you want to use SEASR-powered services?

– May not need to install anything (besides a browser)

• Do you want to run analytics on your laptop?

– Quick 3-step process

• Do you want to provide SEASR-powered services?

– Start simple

– Scale as needed

• Deploying all the extra goodies

Using SEASR-Powered Services

• SEASR provides some demo services

• Requires a browser

• You can access them from

– Community Hub to execute a flow

• http://seasr.org

– Meandre Server Client to execute a flow; or tune properties and execute a flow

• Hosted at http://demo.seasr.org:1714

– Meandre Workbench to execute a flow; or tune properties and execute a flow; or create a flow

• Hosted at http://demo.seasr.org:1712

– Zotero to analyze your collections with existing flows

I Need To Run SEASR on my laptop

• I want to run on my laptop (server)

– I have copyrighted information

– I have a collection for analysis that is too big to be moved

– I just want to test it and have fun with it

• Getting a Meandre server up and running in 3 steps

1. Install Java: http://www.java.com/en/download/

2. Download the Meandre server jar into a new directory: http://seasr.org/meandre/download/

3. Use "Start-Infrastructure" or type "java -jar meandre-server-1.4.5.jar"

• Access your new installation at

– http://localhost:1714/public/services/ping.html
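To confirm that the server started, the ping page can also be checked from a script. The following is a minimal sketch, assuming the default host and port from the slide; the function names `ping_url` and `is_server_up` are illustrative and not part of Meandre.

```python
import urllib.error
import urllib.request


def ping_url(host="localhost", port=1714):
    """Build the ping page URL shown on the slide for a Meandre server."""
    return f"http://{host}:{port}/public/services/ping.html"


def is_server_up(url, opener=urllib.request.urlopen, timeout=5):
    """Return True if the ping page answers with HTTP 200, False otherwise.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `is_server_up(ping_url())` right after step 3 should return True once the server has finished starting, and False if nothing is listening on port 1714.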

Specialized Downloadable Bundles

• On the SEASR/Meandre download site

– http://seasr.org/meandre/download

• Installation bundles available for:

– Mac OS

– Linux

– Windows

• Bundles contain:

– Zip file that includes executable files

– Set of demo components and flows

• Requires Java (1.5 or greater) to be installed

Bundles Include

• The bundle comes with

– Meandre Server

– ZigZag console/compiler/runtime

– Meandre Workbench (also provided as a war file)

• Provides simple scripts to

– Start/stop the Meandre server

– Start/stop the Meandre Workbench

What About Setting Up My Own Server?

• You can also deploy the bundles on a server using the same approach.

Customization of My Server

• This will support

– Moderated traffic

– Persistent web services can be provided using this server

• Application Server

– Workbench can be deployed alongside the Meandre Server in the embedded Jetty Application Server

– Workbench can be deployed on your favorite application server using the .war file

• Database Options

– Meandre uses an embedded Derby as the database

– Meandre can also be set up to use MySQL for the database

Backend Using Derby Database

• Default meandre-config-store.xml

<entry key="DB_USER"></entry>
<entry key="DB_DRIVER_CLASS">org.apache.derby.jdbc.EmbeddedDriver</entry>
<entry key="DB">Derby</entry>
<entry key="DB_PASSWD"></entry>
<entry key="DB_URL">jdbc:derby:./MeandreStore;create=true;logDevice=./DerbyLog</entry>

Backend Using MySQL

• Change meandre-config-store.xml to

<entry key="DB_USER">USERNAME</entry>
<entry key="DB_DRIVER_CLASS">com.mysql.jdbc.Driver</entry>
<entry key="DB">MySQL</entry>
<entry key="DB_PASSWD">PASSWORD</entry>
<entry key="DB_URL"><![CDATA[jdbc:mysql://your-server.com/YOURDB?useUnicode=yes&characterEncoding=utf8&autoReconnect=true]]></entry>

• Changing from Derby to MySQL

– Stop the server

– Change the meandre-config-store.xml file

– Restart the server

– Your server is now backed by MySQL
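The edit to meandre-config-store.xml can be scripted while the server is stopped. The sketch below is only an illustration: it assumes the file wraps its entries in a single root element (shown here as a hypothetical `<properties>` root, which the slides do not show), and `switch_to_mysql` is an illustrative helper, not a Meandre tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for meandre-config-store.xml. Only the <entry>
# elements come from the slides; the <properties> root is an assumption.
SAMPLE_DERBY_CONFIG = """<properties>
<entry key="DB_USER"></entry>
<entry key="DB_DRIVER_CLASS">org.apache.derby.jdbc.EmbeddedDriver</entry>
<entry key="DB">Derby</entry>
<entry key="DB_PASSWD"></entry>
<entry key="DB_URL">jdbc:derby:./MeandreStore;create=true;logDevice=./DerbyLog</entry>
</properties>"""


def switch_to_mysql(xml_text, user, password, db_url):
    """Return the config text with the five DB_* entries set for MySQL."""
    root = ET.fromstring(xml_text)
    new_values = {
        "DB": "MySQL",
        "DB_DRIVER_CLASS": "com.mysql.jdbc.Driver",
        "DB_USER": user,
        "DB_PASSWD": password,
        "DB_URL": db_url,
    }
    for entry in root.iter("entry"):
        key = entry.get("key")
        if key in new_values:
            entry.text = new_values[key]
    # The serializer escapes "&" in the URL as "&amp;", which is equivalent
    # to the CDATA form used on the slide.
    return ET.tostring(root, encoding="unicode")
```

Entries other than the five database keys are left untouched, so the same function can be applied to a config file that carries additional settings.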

Scaling Up

• Two possible routes

– Deploy a farm of self-contained services (via ZigZag)

– Use the Meandre Cluster solution

• Both require your sysadmin/netadmin to provide a highly available load balancer (some virtual appliances available)

• To create a cluster

– Use the previous MySQL setup

– Point all the servers to the same database

– The server interface pages will allow you to monitor the servers

Installing The Workbench

• Use the installation bundles

• Use the war file

– Install your favorite application server

– Deploy the war file against the application server

Installing the Community Hub

• The community hub is a WordPress plugin

• Lets you point to a Meandre server

• Makes all the flows available for execution

• Pages and posts can add the tag

– [meandre-desc SERVER_REPOSITORY_URL FLOW_URI]

– E.g. [meandre-desc http://demo.seasr.org:1714/public/services/repository.rdf http://test.org/flow/text_processing_demo_1]

• Renders the description of the flow and provides a simple execute button to allow visitors to run the flow

• Deploy the zip file into the WordPress plugins directory
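When many flows need to be published, the tag text can be generated rather than typed. This is a small sketch that only assembles the tag in the form shown above; `meandre_desc_tag` is an illustrative helper name, and the validation rules are an assumption based on the tag being space-separated.

```python
def meandre_desc_tag(repository_url, flow_uri):
    """Build a [meandre-desc ...] tag for a WordPress page or post."""
    for value in (repository_url, flow_uri):
        # The tag separates its two arguments with a space and ends with "]",
        # so neither argument may contain whitespace or a closing bracket.
        if not value or "]" in value or any(ch.isspace() for ch in value):
            raise ValueError(f"not usable inside the tag: {value!r}")
    return f"[meandre-desc {repository_url} {flow_uri}]"
```

For the example on the slide, the helper would be called with the demo repository URL and the `text_processing_demo_1` flow URI, and the returned string pasted into the post body.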

Eclipse Plugin for Developers

• On the SEASR/Meandre download site

– http://seasr.org/meandre/download/

• Steps for installation

1. Exit Eclipse

2. Download the zip file into the Eclipse/dropins directory

3. Unzip the file

4. Restart Eclipse

Developers: Eclipse Plugin

• Uploads components to the Meandre Server

• Lists components installed

• Allows for removal of components

• Shows additional data of interest to a programmer

Meandre: The Architecture

• The design of the Meandre architecture follows three directives:

– provide a robust and transparent scalable solution from a laptop to large-scale clusters

– create a unified solution for batch and interactive tasks

– encourage reusing and sharing components

• To meet these goals, the architecture relies on four stacked layers and builds on top of service-oriented architectures (SOA)

Meandre: Basic Single Server

Meandre MDX: Cloud Computing

• Servers can be

– instantiated on demand

– disposed when done or on demand

• A cluster is formed by at least one server

• The Meandre Distributed Exchange (MDX)

– Orchestrates operational integrity by managing cluster configuration and membership using a shared database resource.
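The idea of tracking cluster membership through a shared database can be illustrated with a toy registry. This is a conceptual sketch only: the table layout and function names are hypothetical and do not reflect the actual MDX schema, and an in-memory SQLite database stands in for the shared database.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open the shared store and make sure the membership table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS servers ("
        "host TEXT, port INTEGER, PRIMARY KEY (host, port))"
    )
    return conn


def join(conn, host, port):
    """Register a server that was instantiated on demand."""
    conn.execute("INSERT OR IGNORE INTO servers VALUES (?, ?)", (host, port))
    conn.commit()


def leave(conn, host, port):
    """Remove a server that has been disposed of."""
    conn.execute(
        "DELETE FROM servers WHERE host = ? AND port = ?", (host, port)
    )
    conn.commit()


def members(conn):
    """List the current cluster members as (host, port) pairs."""
    rows = conn.execute("SELECT host, port FROM servers ORDER BY host, port")
    return rows.fetchall()
```

Because every server reads and writes the same table, each one sees the same membership list, which is the property the shared database resource is meant to provide.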

Meandre MDX: The Picture

[Figure: MDX Backbone]

Meandre MDX: The Architecture

• Virtualization infrastructure

– Provides uniform access to the underlying execution environment. It relies on virtualization of machines and the use of Java for hardware abstraction.

• IO standardization

– A unified layer provides access to shared data stores, distributed file systems, specialized metadata stores, and other service-oriented architecture gateways.

Meandre MDX: The Architecture

• Data-intensive flow infrastructure

– Provides the basic Meandre execution engine for data-intensive flows, component repositories and discovery mechanisms, extensible plugins, and web user interfaces (webUIs).

• Interaction layer

– Can provide self-contained applications via webUIs, create plugins for third-party services, interact with the embedding application that relies on the Meandre engine, or provide services to the cloud.

Meandre: ZigZag Script Language

• ZigZag is a simple language for describing data-intensive flows

– Modeled on Python for simplicity.

– ZigZag is a declarative language for expressing the directed graphs that describe flows.

• Command-line tools compile and execute ZigZag files.

– A compiler is provided to transform a ZigZag program (.zz) into a Meandre archive unit (.mau).

– MAUs can then be executed by a Meandre engine.

Meandre: ZigZag Script Language

• An example flow diagram

– The flow below pushes two strings that get concatenated and printed to the console

Meandre: ZigZag Script Language

• ZigZag code that represents the example flow:

#
# Imports the three required components and creates the component aliases
#
import <http://localhost:1714/public/services/demo_repository.rdf>
alias <http://test.org/component/push_string> as PUSH
alias <http://test.org/component/concatenate-strings> as CONCAT
alias <http://test.org/component/print-object> as PRINT
#
# Creates four instances for the flow
#
push_hello, push_world, concat, print = PUSH(), PUSH(), CONCAT(), PRINT()
#
# Sets up the properties of the instances
#
push_hello.message, push_world.message = "Hello ", "world!"
#
# Describes the data-intensive flow
#
@phres, @pwres = push_hello(), push_world()
@cres = concat( string_one: phres.string; string_two: pwres.string )
print( object: cres.concatenated_string )
#
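For readers who know Python better than ZigZag, the same data flow can be mimicked with ordinary functions. This is only a rough analogue of what the example flow computes, not ZigZag and not a Meandre API; the port names (`string`, `string_one`, `string_two`, `concatenated_string`) are taken from the script, and the function names are illustrative.

```python
def push(message):
    """Stand-in for a push_string instance: emits its message on 'string'."""
    return {"string": message}


def concat(string_one, string_two):
    """Stand-in for concatenate-strings: joins its two input ports."""
    return {"concatenated_string": string_one + string_two}


def run_flow():
    """Wire the instances together in the same order as the script."""
    phres = push("Hello ")
    pwres = push("world!")
    cres = concat(string_one=phres["string"], string_two=pwres["string"])
    return cres["concatenated_string"]


if __name__ == "__main__":
    # Plays the role of the print-object component at the end of the flow.
    print(run_flow())
```

The difference worth keeping in mind is that the ZigZag script declares the graph and leaves execution to the Meandre engine, whereas this sketch simply calls the steps in sequence.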

Meandre: ZigZag Script Language

• Automatic Parallelization

– Multiple instances of a component can be run in parallel to boost throughput.

– A specialized operator is available in ZigZag to cause multiple instances of a given component to be used

• Consider the simple flow example shown in the diagram

• The dataflow declaration would look like

#
# Describes the data-intensive flow
#
@pu = push()
@pt = pass( string:pu.string )
print( object:pt.string )

Meandre: ZigZag Script Language

• Automatic Parallelization

– Adding the operator [+AUTO] to the middle component

– [+AUTO] tells the ZigZag compiler to parallelize the pass component instance by the number of cores available on the system.

– [+AUTO] may also be written [+N], where N is a numeric value, for example [+10].

# Describes the data-intensive flow
#
@pu = push()
@pt = pass( string:pu.string ) [+AUTO]
print( object:pt.string )
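The effect of running several instances of the middle component can be approximated in plain Python with a worker pool. This is a loose analogue under stated assumptions, not how the ZigZag compiler implements the operator: `resolve_instances`, `pass_through`, and `run_parallel` are illustrative names, and a thread pool stands in for the parallel component instances.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_instances(spec):
    """Map "AUTO" to the number of cores, or N to N parallel instances."""
    if spec == "AUTO":
        return os.cpu_count() or 1
    count = int(spec)
    if count < 1:
        raise ValueError("instance count must be at least 1")
    return count


def pass_through(item):
    """Stand-in for the middle 'pass' component: forwards its input."""
    return item


def run_parallel(items, spec="AUTO"):
    """Feed items through several pass_through workers at once."""
    with ThreadPoolExecutor(max_workers=resolve_instances(spec)) as pool:
        # pool.map returns results in input order; that is a property of
        # this sketch, not a statement about ZigZag's behaviour.
        return list(pool.map(pass_through, items))
```

Calling `run_parallel(data, "AUTO")` sizes the pool by the core count, while `run_parallel(data, 10)` corresponds to writing a fixed value in the operator.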

Meandre: ZigZag Script Language

• Automatic Parallelization

– Adding the operator [+4] would result in a directed graph

# Describes the data-intensive flow
#
@pu = push()
@pt = pass( string:pu.string ) [+4]
print( object:pt.string )

# Describes the data-intensive flow
#
@pu = push()
@pt = pass( string:pu.string ) [+4!]
print( object:pt.string )

Scaling Genetic Algorithms with Meandre

Intel 2.8 GHz quad-core, 4 GB RAM. Average of 20 runs.

And Beyond with Hadoop

60 dual quad-core Xeons with 8 GB RAM. Gigabit Ethernet.

Resource exhaustion

Meandre: Flows to MAU

• Flows can be executed using their RDF descriptors

• Flows can be compiled into MAU

• MAU is:

– Self-contained representation

– Ready for execution

– Portable

– The base of flow execution in grid environments

Demonstration

• Installation of Meandre

• Meandre Eclipse Plugin

• JIRA, Confluence, Bamboo - what they are and what we use them for

• Usage of ZigZag

– Compiling and executing flows using ZigZag

– Usage of ZigZag for Zotero-enabled flows

– Usage of ZigZag for Fedora flows

Learning Exercises

• Open an existing ZigZag flow

• Convert your flow from yesterday to ZigZag

• Compile the script

• Execute the script

• Have participants download and install SEASR on their personal computers

• Have participants sign up for accounts to access the SEASR suite of Atlassian tools

• Use JIRA to log a support request

Discussion Questions

• What challenges (if any) would scholars have installing the SEASR software?

• Do you see your institution's IT department running the SEASR environment or would it be your research group?

• Which environment would you most likely use, the Meandre Workbench or the ZigZag scripting language?