Essential UNIX skills for biologists
Lane Medical Library & Knowledge Management Center, http://lane.stanford.edu
Essential UNIX Skills for Biologists
Yannick Pouliot, PhD
Bioresearch Informationist
Lane Medical Library & Knowledge Management Center
1/14/2009
The Bioresearch Informationist: At Your Service
• Yannick Pouliot, PhD, Lane Medical Library & Knowledge Management Center
• Bioresearch Informationist ≈ computational biologist in residence
  - Lane Library service
  - Closely coordinated with CMGM
• Role: Support laboratory researchers regarding biocomputational resources and their use
  - …especially postdocs
• Contact: [email protected]
Goals
• Deliver basic understanding of core UNIX commands
• Tips on running UNIX on Mac and Windows
… and on a procedural note, we’ll be using anonymous polling to determine whether you’re happy with the material and speed of delivery …
But First: LaneConnex -- Your Key to Finding Resources Quickly
So, Why UNIX?
UNIX is good for:
1. performing complex operations with very few keystrokes
2. operating on large numbers of objects, e.g.,
   - searching file contents very specifically
   - renaming files
   - moving/copying files
UNIX is fast: fast running and fast to invoke
LINUX (≈ UNIX) is free and runs on everything
UNIX Trip-Ups
• UNIX is capitalization-sensitive
  - ls ≠ Ls
• What you type is what you get
  - no mistyping!
  - mind those commands, e.g., rm -fr * = delete everything in the current directory and its subdirectories, without asking for confirmation!
    → DON’T DO THIS AT HOME!
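One habit that may help with the rm trip-up is to preview what a pattern matches with ls before handing the same pattern to rm. A minimal sketch, assuming a POSIX shell; the temporary directory and the file names are made up for illustration:

```shell
# Work in a throwaway directory so nothing real can be deleted.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch draft1.txt draft2.txt final.txt

# Step 1: preview exactly which files the pattern matches.
ls draft*.txt

# Step 2: only then give the same pattern to rm.
rm draft*.txt

# final.txt did not match the pattern, so it is still there.
remaining=$(ls)
echo "$remaining"
```

Adding the -i option (rm -i draft*.txt) makes rm ask for confirmation before each deletion.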
So How Does One Access UNIX?
• Mac: UNIX underlies Mac’s graphical interface
  - Applications → Utilities → Terminal
• Windows: Must install code (more later)
Exploring UNIX
Key UNIX Concepts
• UNIX is command-line based (no cute icons).
• There are flavors of UNIX
  - “Mac” UNIX ≈ Linux ≈ UNIX
• “Shell” = command line interface
  - different shells exist, all with the same basic functionality
• Anything you can imagine, UNIX can do
  - … but you may have to think about it…
• In UNIX, anything can be done in at least three different ways…
• UNIX has:
  - commands (built-in) → most of today’s workshop
  - utilities ≈ “super-commands”, e.g., grep, for parsing text; not built-in but usually there
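Concept: Redirection ***
• Redirection operators
  - “>”: send output to a file (overwrite)
  - “>>”: add output to the end of a file (don’t overwrite)
  - “<”: read input from a file
  - (“<<” is different: it starts a “here-document” of inline input, not an append)
• Applies to both input and output
  - prog.exe < file.txt
  - prog.exe > file.txt
  - prog.exe < file.txt > file1.txt
  - prog.exe >> file.txt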
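A minimal sketch of the redirection operators in action, assuming a POSIX shell. The file names notes.txt and shouted.txt are made up for illustration, and everything happens in a temporary directory:

```shell
# Work in a throwaway directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# ">" creates the file, or overwrites it if it already exists.
echo "first line" > notes.txt
echo "replacement line" > notes.txt

# ">>" appends to the end of the file instead of overwriting it.
echo "appended line" >> notes.txt

# "<" feeds a file to a command as its input; wc -l counts the lines.
line_count=$(wc -l < notes.txt)

# Input and output redirection can be combined in one command:
# tr reads notes.txt, converts to upper case, and writes shouted.txt.
tr '[:lower:]' '[:upper:]' < notes.txt > shouted.txt

cat notes.txt
cat shouted.txt
echo "lines in notes.txt: $line_count"
```

Because the second ">" overwrote the first line, notes.txt ends up with two lines, not three.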
Concept: Metacharacters ***
• “*” = 0 or more characters of any kind
• “.” or “?” = exactly one character of any kind
  - Exact character depends on the tool: the shell uses “?” in file names; grep uses “.” in regular expressions, where “*” means “0 or more of the preceding character”
• Metacharacters can be used with nearly any other command, e.g.,
  - ls file?.txt
  - ls file*.txt
  - ls *.*
  - more *.txt
  - grep '.*omics' *.txt
• NB: There are lots of other kinds of metacharacters…
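To see what the shell does with file name metacharacters, it can help to echo a pattern and look at what it expands to. A small sketch, assuming a POSIX shell; the file names are invented for the demonstration:

```shell
# Create a few empty practice files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1.txt file2.txt file10.txt notes.txt README

# "?" stands for exactly one character:
# matches file1.txt and file2.txt, but not file10.txt.
one_char=$(echo file?.txt)

# "*" stands for zero or more characters:
# matches file1.txt, file2.txt and file10.txt.
any_chars=$(echo file*.txt)

# "*.*" needs a dot somewhere in the name, so README is left out.
with_dot=$(echo *.*)

echo "file?.txt -> $one_char"
echo "file*.txt -> $any_chars"
echo "*.*       -> $with_dot"
```

The shell expands the pattern before the command runs, so ls, more and cat all see the resulting list of names rather than the pattern itself.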
Concept: Stringing Commands Together Using Pipes
“|” = pipe: sends the output of one command to the next command as its input, e.g.: ls -1 | more
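Pipes (written with the “|” character) can be chained, each command working on the output of the one before it. A minimal sketch, assuming a POSIX shell and three made-up file names:

```shell
# Set up a small practice directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt c.fasta

# ls -1 writes one name per line; wc -l counts those lines.
n_files=$(ls -1 | wc -l)

# Three commands in a row: list names, keep only those ending
# in ".txt", then count what is left.
n_txt=$(ls -1 | grep '\.txt$' | wc -l)

echo "files: $n_files, text files: $n_txt"
```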
Polling Time: How’s the speed?
1. Too fast
2. Too slow
3. More or less OK
4. I feel nauseous
Overview of Selected UNIX Commands
ls [options] [names] ****
• Lists contents of directories, including directories themselves
  - Basically, lists files…
• When names are provided, lists files contained in a directory name or that match a file name.
  - names can include filename metacharacters.
• The options display information in different formats. The most useful options include -F, -R, -l, and -s.
Examples
1. List all details of all files in current directory
   ls -l
2. List just the filenames
   ls -1
3. Create a file that contains a list of the filenames
   ls -1 > mylist.txt
4. List files named “example” followed by a single character, e.g., example1.txt, etc.
   ls -1 example?.txt
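The -F and -R options mentioned above are not covered by the examples, so here is a small sketch of both, assuming a POSIX shell. The directory data and the file names are invented for illustration:

```shell
# Build a tiny directory tree to list.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
touch notes.txt data/seq1.fasta

# -F marks each directory with a trailing "/".
marked=$(ls -F)

# -R also lists the contents of subdirectories.
recursive=$(ls -R)

# A name containing a metacharacter limits the listing to matches.
fasta_only=$(ls data/*.fasta)

echo "$marked"
echo "$recursive"
echo "$fasta_only"
```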
cat/more/head/tail → commands to look at content of files
• cat: returns everything
• more: same but one page at a time ****
• head: returns top x lines
• tail: returns bottom x lines
• all can operate on multiple files
Examples
1. Show contents of all txt files
   cat *.txt
2. Show first 100 lines of file
   head -n 100 file.txt
3. Show first 1000 lines of file and paginate:
   head -n 1000 file.txt | more
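head and tail can also be combined through a pipe to pull a block of lines out of the middle of a file. A minimal sketch, assuming a POSIX shell; numbers.txt is a made-up 20-line practice file:

```shell
# Build a 20-line practice file: "line 1" through "line 20".
workdir=$(mktemp -d)
cd "$workdir" || exit 1
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > numbers.txt

# Top 3 lines of the file.
first3=$(head -n 3 numbers.txt)

# Bottom 2 lines of the file.
last2=$(tail -n 2 numbers.txt)

# Lines 5 to 7: take the top 7, then keep the bottom 3 of those.
middle=$(head -n 7 numbers.txt | tail -n 3)

echo "$first3"
echo "$last2"
echo "$middle"
```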
grep: Searching File Contents Using “Regular Expressions” ****
grep [options] pattern [files]
• Very powerful: Searches file contents for presence of a string
  - grep protein *.txt
  - about a million options…
• Also searches using regular expressions
  - Definition: a pattern that describes the characteristics of one or more strings, e.g.:
    - te?xt (with grep -E, matches “txt” or “text”)
    - [a-z]*omics
Examples
1. Find all text files whose contents contain words ending in “omics” (“genomics”, “proteomics”, “transcriptomics”):
   grep -l '[a-z]*omics' *.txt
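A sketch of the “omics” search on three made-up files, assuming a grep that supports the -o and -h options (GNU and BSD grep both do, but they are not part of the POSIX standard). Quoting the pattern keeps the shell from treating it as a file name pattern:

```shell
# Three small practice files; only a.txt and c.txt mention "-omics" words.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Genomics and proteomics core\nNothing relevant here\n' > a.txt
printf 'Protein structure notes\n' > b.txt
printf 'Transcriptomics workshop\n' > c.txt

# -E: extended regular expressions.
# [[:alpha:]]+omics: one or more letters followed by "omics".
# -l: print only the names of the files that contain a match.
matching_files=$(grep -El '[[:alpha:]]+omics' *.txt)

# -o: print only the matching text, not the whole line.
# -h: leave out the file name in front of each match.
words=$(grep -Eoh '[[:alpha:]]+omics' *.txt)

echo "$matching_files"
echo "$words"
```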
Polling Time: How’s the speed?
1. Too fast
2. Too slow
3. More or less OK
4. Need coffee
uniq [options] filename1 **
• Very handy for listing unique (or duplicate) lines in a file
• Has options to…
  - ignore the first n fields (delimited by tabs or spaces)
  - compare only the first n characters
• Compares ADJACENT lines only, so it works properly ONLY on sorted files
Examples
1. List unique lines using unsorted file
   sort test1.txt | uniq
2. Count the number of occurrences of each unique line using sorted file
   uniq -c test2.txt
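A short sketch of why sorting matters for uniq, assuming a POSIX shell. genes.txt and its contents are invented for the demonstration:

```shell
# A practice file in which no two identical lines are next to each other.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'BRCA1\nTP53\nBRCA1\nEGFR\nTP53\nBRCA1\n' > genes.txt

# Without sorting, uniq sees no adjacent duplicates and removes nothing.
unsorted=$(uniq genes.txt | wc -l)

# Sorting first puts identical lines together, so uniq can collapse them.
distinct=$(sort genes.txt | uniq)

# -c prefixes each line with its count; sort -rn puts the largest first.
counts=$(sort genes.txt | uniq -c | sort -rn)

# -d prints only the lines that occur more than once.
duplicated=$(sort genes.txt | uniq -d)

echo "lines kept without sorting: $unsorted"
echo "$distinct"
echo "$counts"
echo "$duplicated"
```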
find [pathnames] [conditions] ***
• Very powerful: can specify anything, including exclusions and negations
• Descends the directory tree beginning at each pathname and locates files that meet the specified conditions. The default pathname is the current directory (some versions of find require an explicit pathname).
• Most useful conditions are -name and -type (for general use)
• Can search very large numbers of file names, if slowly…
Examples
1. List all files named chapter1 in the /work directory:
   find /work -name chapter1 -print
2. Look for filenames in the current directory and below that don't begin with a capital letter (-name takes a shell-style pattern, not a regular expression):
   find . ! -name '[A-Z]*' -print
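A sketch of -name, -type and negation on a small made-up directory tree, assuming a POSIX shell. The pattern is written as [[:upper:]] rather than [A-Z] here because the behaviour of letter ranges can vary with the locale settings:

```shell
# Build a small practice tree.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p project/data
touch project/Readme.txt project/notes.txt
touch project/data/Sample1.fasta project/data/sample2.fasta

# -type f: regular files only; -name takes a quoted shell-style pattern.
fasta=$(find project -type f -name '*.fasta' | sort)

# "!" negates the condition that follows it:
# files whose names do NOT begin with a capital letter.
lowercase=$(find project -type f ! -name '[[:upper:]]*' | sort)

# -type d: directories only.
dirs=$(find project -type d | sort)

echo "$fasta"
echo "$lowercase"
echo "$dirs"
```

The pattern is quoted so that find receives it as written; without the quotes the shell would try to expand it against the current directory first.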
UNIX on Windows
• Easy: UnxUtils
  - = UNIX “light”
  - Excellent for most tasks
  - Not a complete emulation of UNIX
  - Download here; make sure to follow installation instructions
  - More later…
• Hard: Cygwin
  - difficult to make it behave perfectly
  - can run in parallel with Windows
• Easier: create a dual boot
  - Provides ability to boot either Windows or Linux
  - Requires reboot to switch…
Resources
• UNIX commands: http://en.wikibooks.org/wiki/Guide_to_Unix/Commands
• Another list of UNIX utilities: http://en.wikipedia.org/wiki/List_of_Unix_utilities
Everything You Need to Know About UNIX in Short Form: eBooks from Lane
• The ultimate quick reference for LINUX
• More than you typically need, but you can zoom into what you need
UnxUtils Installation: The MiniMe of UNIX
• Download
• Installation instructions
→ Let’s do it together if you have a PC and want it