A brief introduction to version control systems

Post on 07-Jul-2015


This is a lunchtime talk I gave to the Southampton astronomy department. The aim was to make them aware of version control systems and when they might need to use them.

Transcript of A brief introduction to version control systems

A brief introduction to version control systems

Tim Staley

Astronomy Group Monday Seminar, Southampton, November 2013

WWW: timstaley.co.uk

The problem No backup Manual copies Centralised VCS Distributed VCS

Aims

• Help identify problems that can be solved.
• Introduce basic concepts of version control.
• Explain why various technologies exist, and which you should choose.

When you need version control

• Complex documents, built up over time.
• Multiple collaborators (or even just multiple machines).
• Multiple versions which ‘co-evolve.’
• Reproducibility (‘snapshots’).

Four Evolutionary Stages

Stage 0: Not backing up

DON’T DO THIS

Stage 1: Manual copies

Flaws:
• Manual = fallible.
• Backup: copies of copies.
• Labelling.

We need metadata: datestamps, annotations, attribution. And tools to make this stuff quick and easy!
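To make the point concrete, here is a minimal sketch of what manual copies plus hand-rolled metadata might look like. It is not from the talk: the `snapshot` and `read_log` helpers, the `log.jsonl` file name, and the author name are all hypothetical, chosen only for illustration.

```python
import json
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def read_log(backup_dir: Path) -> list:
    """Return all snapshot records, oldest first."""
    log_path = backup_dir / "log.jsonl"   # hypothetical log file name
    if not log_path.exists():
        return []
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]


def snapshot(source: Path, backup_dir: Path, author: str, note: str) -> Path:
    """Copy `source` into `backup_dir` and record who, when and why."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    number = len(read_log(backup_dir)) + 1           # sequential label
    stamp = datetime.now(timezone.utc).isoformat()   # datestamp
    target = backup_dir / f"{source.stem}_v{number:03d}{source.suffix}"
    shutil.copy2(source, target)
    record = {"file": target.name, "date": stamp, "author": author, "note": note}
    with open(backup_dir / "log.jsonl", "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return target


# Demo in a throwaway directory, so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    paper = Path(tmp) / "paper.tex"
    backups = Path(tmp) / "backups"
    paper.write_text("First draft\n")
    snapshot(paper, backups, "alice", "first draft")
    paper.write_text("First draft, with results\n")
    snapshot(paper, backups, "alice", "added results section")
    for entry in read_log(backups):
        print(entry["file"], entry["author"], entry["note"])
```

Even this small script has to invent its own labelling scheme and log format, which is exactly the bookkeeping a version control system does for you.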

Aside: ‘Cloudy’ technologies

Trade-off: convenience vs control.

Good for:
• Small docs, frequently updated across multiple locations (e.g. to-do list).
• Basic backups of items unlikely to evolve (photos, etc).

Problems:
• Versioning is all automated: you can’t choose sensible ‘checkpoints’ to mark out.
• Collaboration is still broken, unless you’re working on very simple docs.

NEED MORE METADATA

Stage 2: Centralised version control

e.g.
• ‘Concurrent Versions System’ (CVS, now defunct).
• ‘Subversion’ (SVN).

Basic concepts, 1

Record an annotated history of changesets.

• Trunk, branch
• Parents, ancestors
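These terms can be sketched as a small data structure. This is an illustrative toy, not how any particular VCS stores its history; the `Changeset` class, the `ancestors` helper, and the example messages and authors are assumptions made up for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Changeset:
    """One annotated entry in the history."""
    message: str
    author: str
    parent: Optional["Changeset"] = None   # None for the very first changeset


def ancestors(changeset):
    """Walk back through the parents, nearest first."""
    found = []
    current = changeset.parent
    while current is not None:
        found.append(current)
        current = current.parent
    return found


# Trunk: the main line of development.
root = Changeset("Start paper", "alice")
trunk = Changeset("Add introduction", "alice", parent=root)
trunk_tip = Changeset("Add results", "alice", parent=trunk)

# Branch: a second line of development starting from an existing changeset.
branch_tip = Changeset("Try alternative analysis", "bob", parent=trunk)

print([c.message for c in ancestors(trunk_tip)])
print([c.message for c in ancestors(branch_tip)])
```

The trunk tip and the branch tip share the same ancestors here, because the branch forked from the changeset just before the trunk tip.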

Basic concepts, 2

Centralised ⇔ Master copy

• Repository
• Checkout
• Commit / Revision
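A toy model may help fix the vocabulary. The sketch below assumes a repository holding a single file, with revisions numbered from 1; the `CentralRepository` class and its method signatures are invented for illustration and are not the API of CVS or SVN.

```python
from collections import namedtuple

Revision = namedtuple("Revision", ["author", "message", "content"])


class CentralRepository:
    """Toy master copy: stores every committed revision of one file."""

    def __init__(self):
        self._revisions = []

    def checkout(self, revision=None):
        """Return (revision number, content); the latest revision by default."""
        latest = len(self._revisions)
        if revision is None:
            revision = latest
        if not 0 <= revision <= latest:
            raise ValueError(f"no such revision: {revision}")
        content = self._revisions[revision - 1].content if revision else ""
        return revision, content

    def commit(self, base_revision, author, message, content):
        """Record new content; refuse if the working copy is out of date."""
        if base_revision != len(self._revisions):
            raise RuntimeError("working copy out of date: update before committing")
        self._revisions.append(Revision(author, message, content))
        return len(self._revisions)


repo = CentralRepository()
rev, text = repo.checkout()                       # empty repository
rev = repo.commit(rev, "alice", "Start paper", "Introduction\n")
rev = repo.commit(rev, "alice", "Add results", "Introduction\nResults\n")
print(repo.checkout())     # latest revision
print(repo.checkout(1))    # any earlier snapshot is still available
```

The out-of-date check is a deliberate simplification: it stands in for the way a central server makes you update your working copy before accepting a commit built on an old revision.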

Basic concepts, 3

Merging

In simple cases, merges are automatic! Tree records allow us to build the new combined version.

Manual merging: when conflicts exist, we have the info and tools to resolve them manually.
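The idea behind both automatic and manual merging can be sketched as a three-way merge: compare each edited copy against the common ancestor. This is a simplified illustration that treats a document as a dictionary of named sections. Real tools work on lines of text and are considerably more sophisticated, and the `three_way_merge` function and the section names are hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`. Returns (merged, conflicts).

    Each argument maps a section name to its text. Values are assumed
    never to be None, since None is used here to mean "section absent".
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (or neither changed it)
            value = o
        elif o == b:      # only "theirs" changed it
            value = t
        elif t == b:      # only "ours" changed it
            value = o
        else:             # both changed it differently: needs a human
            conflicts.append(key)
            continue
        if value is not None:   # None means the section was deleted
            merged[key] = value
    return merged, conflicts


base = {"intro": "Stars are hot.", "results": "TBD"}
ours = {"intro": "Stars are very hot.", "results": "TBD"}
theirs = {"intro": "Stars are hot.", "results": "We found three novae."}

# Different sections were edited, so the merge is automatic.
print(three_way_merge(base, ours, theirs))

# Both sides edited the same section: reported as a conflict instead.
theirs_too = {"intro": "Stars are extremely hot.", "results": "TBD"}
print(three_way_merge(base, ours, theirs_too))
```

Without the common ancestor there would be no way to tell which side made a change, which is why the recorded history matters for merging.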

Distributed VCS

1986 – early 2000s: Why would you make this any more complex? This works.

INTERWEBS

(See e.g. visualised history of Python, https://www.youtube.com/watch?v=cNBtDstOTmA)

Centralised doesn’t scale

• Many collaborators.
• Cannot check in half-finished work to master.
• Cannot keep track of a branch for every collaborator.
• Resort back to a hybrid of central copy under version control, with many local, manual backups for intermediate work.

The distributed model

Stage 3: Distribute!

• Everyone has their own mirror, or clone, of the repository.
• Changes are distributed via pushes and pulls.
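The clone, push and pull vocabulary can be illustrated with a toy in-memory model. This is a sketch under strong simplifying assumptions (a single linear history, no merging, no network), and the `Repo` class and its methods are invented for illustration rather than taken from git or Mercurial.

```python
import hashlib


class Repo:
    """Toy distributed repository: a full copy of the history, held locally."""

    def __init__(self):
        self.commits = {}   # commit id -> (parent id, message)
        self.head = None    # id of the most recent commit

    def commit(self, message):
        """Record a change locally; no server is involved."""
        digest = hashlib.sha1(f"{self.head}:{message}".encode()).hexdigest()
        self.commits[digest] = (self.head, message)
        self.head = digest
        return digest

    def clone(self):
        """Make an independent mirror containing the whole history."""
        copy = Repo()
        copy.commits = dict(self.commits)
        copy.head = self.head
        return copy

    def pull(self, other):
        """Fetch the commits we lack from `other` (no merging in this toy)."""
        if other.head is None or other.head in self.commits:
            return   # nothing new to fetch
        if self.head is not None and self.head not in other.commits:
            raise RuntimeError("histories have diverged: a merge is needed")
        self.commits.update(other.commits)
        self.head = other.head

    def push(self, other):
        """Send our commits to `other`: a pull seen from the other side."""
        other.pull(self)

    def log(self):
        """Messages from the newest commit back to the first."""
        messages, current = [], self.head
        while current is not None:
            parent, message = self.commits[current]
            messages.append(message)
            current = parent
        return messages


origin = Repo()
origin.commit("Start paper")
alice, bob = origin.clone(), origin.clone()

alice.commit("Add introduction")   # committed locally, e.g. while offline
alice.commit("Add results")
alice.push(origin)                 # sync later
bob.pull(origin)
print(bob.log())
```

Note that nothing makes `origin` special in this model: it is just another clone that the collaborators agree to push to and pull from.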

Distribute!

Benefits for you:
• More flexible. Allows different workflows and collaborative behaviour, etc.
• Can commit offline, sync later.
• Talk to me later if you want the details.

So which should I use?

At this stage, git and mercurial are functionally equivalent, but git has won the majority mindshare, therefore: better support, better chance of collaborators using the same system, etc.

Summary

• Version control helps with:
    • Backups
    • Reproducibility
    • Comparing arbitrary historical versions.
    • Maintaining multiple live versions.
• Lots of free services and material online to help you out.
• Bit of a learning curve at first, but the payoff is large in the long run. (And now you have a head start!)

Advanced Reading

To start, google ‘git intro’, etc. Then...
• Git for Computer Scientists: http://eagain.net/articles/git-for-computer-scientists/
• Understanding Git Conceptually: http://www.sbf5.com/~cduan/technical/git/
• Understanding the Git Workflow: https://sandofsky.com/blog/git-workflow.html