Written for the Jan 11 LIAS Seminar on source control systems.
ScottLawrence will write one for darcs, and PatrickStein will
write one for monotone.
Shouldn't take up more than 15 minutes, in order to be fair
to Scott and Pat. Other presentations on this topic for
the meeting can be found at the SeminarSeries topic.
Slide 1: Subversion
Slide 2: Background
|
Subversion (SVN) is a source control
system intended as a replacement for the venerable
Concurrent Version System¹ (CVS) tool.
CVS has a number of problems of varying severity.
See the "well known" issues in the table on the right.
The Subversion developers' intent is to "take over" (their words) the existing
CVS user base.
Therefore, there are a lot of similarities to the end user
between the two systems.
The back end of the system is very different (and most would
agree, much improved).
Oddly, there is no direct import of a CVS project into Subversion;
there exist several "mostly complete" scripts that help a lot with
this.
|
CVS Problems
- Only manages files. Doesn't handle changes at the directory level (e.g., renaming a file) or at the hierarchy level (e.g., moving directories).
- Handles non-text files... barely.
- Non-atomic updates. Larger projects on the 'net see problems where some files are committed and others aren't. Users can really step on each other in large projects.
- Needs the repository to do anything. No network and want to diff your changes? Tough luck.
|
¹ http://cvshome.org/ used to be the home site for CVS.
It looks like it's been moved to http://ximbiot.com/cvs/cvshome/
sometime in 2004. I don't know if this is a good thing, or
who the ximbiot people are. Heads up.
Slide 3: CVS vs SVN
|
CVS Problems
- Only manages files. Doesn't handle changes at the directory level (e.g., renaming a file) or at the hierarchy level (e.g., moving directories).
- Handles non-text files... barely.
- Non-atomic updates. Larger projects on the 'net see problems where some files are committed and others aren't. Users can really step on each other in large projects.
- Needs the repository to do anything. No network and want to diff your changes? Tough luck.
|
SVN Approaches
- Manages the entire hierarchy, handles renames, new files don't inherit old histories, directories are versioned.
- Handles binary data and text data the same way, only Δ is stored (even for binary data). Can also do special processing based on MIME types.
- Atomic updates. Either your changes are all in, or not. No "partial" commits.
- Assumes disk space is cheap and manages a copy of your tree locally. Allows simple operations on local tree w/o repository.
|
Slide 4: SVN Similarities
Both tools share the idea of a "main" switching command.
- All CVS functionality takes the form of
cvs cmd ...
- All SVN functionality takes the form of
svn cmd ...
; svn help ci
commit (ci): Send changes from your working copy to the repository.
usage: commit [PATH...]
A log message must be provided, but it can be empty. If it is not
given by a --message or --file option, an editor will be started.
Valid options:
-m [--message] arg : specify commit message ARG
-F [--file] arg : read data from file ARG
-q [--quiet] : print as little as possible
-N [--non-recursive] : operate on single directory only
--targets arg : pass contents of file ARG as additional args
--force-log : force validity of log message source
--username arg : specify a username ARG
--password arg : specify a password ARG
--no-auth-cache : do not cache authentication tokens
--non-interactive : do no interactive prompting
--editor-cmd arg : use ARG as external editor
--encoding arg : treat value as being in charset encoding ARG
--config-dir arg : read user configuration files from directory ARG
Slide 5: SVN Similarities
Many commands work the way you've come to expect (more or less)
with CVS.
- commit
- Updating the repository with your latest changes.
- checkout
- Pulling content from a repository to your sandbox.
- add
- Adding content to your tree.
- remove
- Removing content from your tree.
- ...
- ...
|
Likewise for repository access
- local
- SVN uses
file:///path/ to indicate a repository living on a locally accessible filesystem (CVS would just name the path).
- svn
- SVN uses
svn://server/path/ to specify a path to a remote repository served by svnserve. CVS uses pserver, and the repository would be specified as :pserver:user@server:/path instead.
- ssh
- SVN provides secure access via
ssh to any system you have the keys to, by using a URL like svn+ssh://server/path.²
|
² Minor difference: under CVS, using rsh or ssh
would run the cvs tool on the remote system.
Under SVN, this connection just runs svnserve instead,
so svn+ssh: just secures the same connection that svn:
access would give you.
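The three access methods differ only in the URL scheme. Here is a small sketch of that; `repo_url` is a hypothetical helper written for this slide (it is not part of Subversion), and the host and path are just placeholders borrowed from the examples in these notes.

```shell
# repo_url METHOD HOST PATH: print the SVN URL for one access method.
# Hypothetical helper for illustration only; not part of Subversion.
repo_url() {
    case "$1" in
        local) printf 'file://%s\n' "$3" ;;        # HOST is ignored
        svn)   printf 'svn://%s%s\n' "$2" "$3" ;;
        ssh)   printf 'svn+ssh://%s%s\n' "$2" "$3" ;;
        *)     echo "unknown method: $1" >&2; return 1 ;;
    esac
}

repo_url local ""     /path/to/svn/projA/trunk
repo_url svn   saturn /path/to/svn/projA/trunk
repo_url ssh   saturn /path/to/svn/projA/trunk
```

Any of the printed URLs can be handed to `svn co`, provided the repository is actually reachable that way.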
Slide 6: SVN Differences
Branches and tags are very different under SVN.
When you create a repository, the recommended style is to do
something like the following.
Every project has branches, tags, and trunk directories
created at its top level.
              |-branches-
      |-projA-|-tags-----
|-svn-|       |-trunk----
      |
      |       |-branches-
      |-projB-|-tags-----
              |-trunk----
All main development occurs in the
trunk directory.
In fact, your project could go on forever without ever
once looking at or considering the
branches or
tags
directories.
svn co svn+ssh://saturn/path/to/svn/projA/trunk workdir
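To try the recommended layout without touching a real server, something like the following should work. It is a sketch that assumes `svn` and `svnadmin` are installed; the scratch directory and the local `file://` repository are stand-ins for the real `svn+ssh://saturn/...` one.

```shell
# Scratch area; every path below is a throwaway placeholder.
work=$(mktemp -d)
svnadmin create "$work/svn"
repo="file://$work/svn"

# One commit creates the three top-level directories for each project.
svn mkdir -q -m "Create standard layout" \
    "$repo/projA" "$repo/projA/trunk" "$repo/projA/branches" "$repo/projA/tags" \
    "$repo/projB" "$repo/projB/trunk" "$repo/projB/branches" "$repo/projB/tags"

# Day-to-day work only ever needs the trunk.
svn co -q "$repo/projA/trunk" "$work/workdir"
svn ls "$repo/projA"
```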
Slide 7: SVN Differences - Branches
So, maybe the repository has a couple files in it, now.
              |-branches-
|-svn-|-projA-|-tags-----
              |-trunk----|-foo.c
                         |-foo.h
"I have a crazy idea." So I create a new branch by making
an SVN copy of the trunk. These copies are cheap; they don't actually
replicate data everywhere in the repository (SVN just records the
source for a file's content).
svn copy .../projA/trunk \
.../projA/branches/crazy
|
Now, the repository looks like this:
              |-branches-|-crazy-|-foo.c
              |                  |-foo.h
|-svn-|-projA-|-tags-----
              |
              |-trunk----|-foo.c
                         |-foo.h
I start working on my project with:
svn co .../projA/branches/crazy workdir
When I want to merge changes in from the trunk, I run a merge
in my working copy.
Let's say I want to apply the differences in the trunk between
revisions 123 and 124 to my branch.
svn merge -r123:124 .../projA/trunk
|
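The whole branch-and-merge cycle can be replayed against a scratch repository. This is a sketch, assuming `svn` and `svnadmin` are installed; the paths are throwaway, and the revision numbers in the comments are what a fresh repository should produce (standing in for the 123:124 above).

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
repo="file://$work/svn"
svn mkdir -q -m "Layout" "$repo/projA" "$repo/projA/trunk" "$repo/projA/branches"  # r1

# Put foo.c on the trunk.
svn co -q "$repo/projA/trunk" "$work/trunk"
echo 'int foo(void) { return 1; }' > "$work/trunk/foo.c"
svn add -q "$work/trunk/foo.c"
svn commit -q -m "Add foo.c" "$work/trunk"                                        # r2

# Branch: a cheap copy made directly in the repository.
svn copy -q -m "Start crazy branch" \
    "$repo/projA/trunk" "$repo/projA/branches/crazy"                              # r3
svn co -q "$repo/projA/branches/crazy" "$work/crazy"

# The trunk moves on...
echo 'int bar(void) { return 2; }' >> "$work/trunk/foo.c"
svn commit -q -m "Add bar()" "$work/trunk"                                        # r4

# ...and the branch sandbox picks up the trunk's r3:r4 changes.
svn merge -q -r3:4 "$repo/projA/trunk" "$work/crazy"
svn status "$work/crazy"
```

The merge only changes the sandbox; it still takes an `svn commit` to record it on the branch.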
Slide 8: SVN Differences - Branches and Tags
If it looks like
branches is just another
directory, and that SVN doesn't really know what a branch
is, and that all the real work is happening via SVN
copy
and
merge operations...
you're right!
Tags are handled the same way as branches.
For example, let's say I want to create a 1.0 release.
I could do it from the main
trunk
with the following, which simply creates
a new hierarchy in the repository named
tags/1.0.
svn copy .../projA/trunk .../projA/tags/1.0
This would yield
              |-branches-|-crazy
              |
|-svn-|-projA-|-tags-----|-1.0--
              |
              |-trunk----
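A scratch run makes the "a tag is just a copy" point concrete. This is a sketch, assuming `svn` and `svnadmin` are installed; the repository is a throwaway local one.

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
repo="file://$work/svn"
svn mkdir -q -m "Layout" "$repo/projA" "$repo/projA/trunk" "$repo/projA/tags"   # r1

# Tagging is the same cheap copy used for branching.
svn copy -q -m "Tag release 1.0" "$repo/projA/trunk" "$repo/projA/tags/1.0"     # r2

svn ls "$repo/projA/tags"
# The verbose log for r2 records the path and revision the tag was copied from.
svn log -v -r2 "$repo"
```

Nothing stops anyone from committing into `tags/1.0` afterwards; treating tags as read-only is a convention, not something the copy enforces.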
Slide 9: SVN Differences - Tags
SVN developers recommend using both branches and tags for releases.
For example, imagine this repository.
              |-branches-|-1.0---
              |          |-crazy-
              |
|-svn-|-projA-|          |-1.0.0-
              |-tags-----|-1.0.1-
              |          |-1.0.2-
              |-trunk----
Each one of the tag directories was created with an
svn copy
from the
branches/1.0 tree at some point in time.
Development continued in
branches/1.0 of course, and so
over time each
svn copy to
tags became a sort of
"snapshot". That's really all there is to it.
Slide 10: SVN Differences - Status and Update
The
status and
update commands are different in SVN
than they are in CVS. And, by "different" we mean that
they work.³
svn status can report on version information, tell you
which components have changed since the last update,
and so on. It's what you'd expect, and we don't risk
updating our trees when we didn't mean to by forgetting
the
-n option to
update.
svn update is meant to actually bring your sandbox up
to date with respect to the repository.
It has no dry-run mode of its own; instead,
svn status -u
(--show-updates) asks the repository what an update would
change, without touching your hierarchy.
³ Oh, come on. Every single one of us has
cvs -n up committed to memory as a way to see what our
sandbox looks like, and not a single one of us actually
does anything useful with the output from cvs status.
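Here is the status/update split in action, as a sketch against a scratch repository (it assumes `svn` and `svnadmin` are installed; the two sandboxes "mine" and "theirs" are made up for the demonstration).

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
repo="file://$work/svn"
svn co -q "$repo" "$work/mine"
svn co -q "$repo" "$work/theirs"

# Someone else commits a new file.
echo hello > "$work/theirs/foo.c"
svn add -q "$work/theirs/foo.c"
svn commit -q -m "Add foo.c" "$work/theirs"

# I have a local, unversioned file of my own.
echo scratch > "$work/mine/notes.txt"

svn status "$work/mine"       # local state only: "?" for notes.txt
svn status -u "$work/mine"    # also asks the repository: "*" marks what is out of date
svn update -q "$work/mine"    # only now does the sandbox change
```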
Slide 11: SVN Differences - Ignoring Files and Properties
There is no direct equivalent of the
.cvsignore file
in CVS.
Instead, SVN uses metadata called "properties".
Anything in SVN can have metadata associated with it.
Files, directories, links,
etc. can all have metadata.
SVN itself ignores most of the metadata; it's mostly there for
the end user. However, there are some properties which
are used by SVN.
SVN reserves all properties starting with the characters
svn: in their name. The property used to implement
patterns to ignore is called
svn:ignore and it is
applied to (you guessed it) directories.
-
svn propget svn:ignore .
- List all the patterns currently being ignored in the current directory.
-
svn propset svn:ignore '*.fasl' .
- Sets
*.fasl as the list of patterns ignored in this directory.
-
svn propset svn:ignore -F .ignore .
- Set the
svn:ignore property from the contents of the .ignore file.
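To watch svn:ignore take effect, here is a sketch in a scratch sandbox (it assumes `svn` and `svnadmin` are installed; the file names are made up).

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
svn co -q "file://$work/svn" "$work/wc"

touch "$work/wc/foo.fasl" "$work/wc/foo.lisp"
svn status "$work/wc"                         # both files show up as "?"

svn propset -q svn:ignore '*.fasl' "$work/wc"
svn propget svn:ignore "$work/wc"             # prints the pattern list
svn status "$work/wc"                         # foo.fasl is no longer listed
```

The property change is itself a local modification of the directory, so it has to be committed before anyone else sees it.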
Slide 12: SVN Differences - Properties cont'd
-
svn:ignore - lists the file name patterns to exclude from SVN.
-
svn:keywords - lists keywords to be expanded by SVN client.
-
svn:executable - do OS-specific actions to ensure file can be "run"
-
svn:eol-style - manually tweak end-of-line for text files.
-
svn:mime-type - controls how to diff/merge/patch the file, and also plays nicely with Apache (see below).
-
svn:externals - lets you pull content from other repositories into the current directory on checkout and update.
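As a quick illustration of two of these, here is a sketch that sets svn:keywords and svn:eol-style on a new file (it assumes `svn` and `svnadmin` are installed; the repository and file are scratch placeholders).

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
wc="$work/wc"
svn co -q "file://$work/svn" "$wc"

# A file carrying an unexpanded keyword.
printf '/* $Id$ */\n' > "$wc/foo.c"
svn add -q "$wc/foo.c"
svn propset -q svn:keywords "Id" "$wc/foo.c"
svn propset -q svn:eol-style native "$wc/foo.c"
svn commit -q -m "Add foo.c" "$wc"

cat "$wc/foo.c"                # the client has filled in $Id$ after the commit
svn proplist -v "$wc/foo.c"    # shows both properties and their values
```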
Slide 13: SVN Differences - Repositories
You have to decide between speed and reliability when using
SVN.
CVS had one repository type, a directory tree full of RCS files, while
SVN has two repository types:
- BerkeleyDB
- Fast, real database-like quality, but can only be used on a local filesystem. Never use on NFS, Samba, AFS, or other remote filesystems.
- FSFS
- Slightly slower, file-based database. Safe to use over reasonable networks (NFS).
SVN supplies other tools to work with the repository directly
(e.g., no need to checkout a CVSROOT tree).
- svnlook
- Provides all sorts of inspection of a repository (since you can't see the repository tree from
ls, even when using FSFS).
- svnadmin
- Provides
dump, load, recover, and other database-like functionality. A dump followed by a load is the preferred way to move a repository between systems, by the way.
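A sketch of both tools at work, moving a scratch repository via a dump file (it assumes `svnadmin`, `svnlook`, and `svn` are installed; "old" and "new" are placeholder repository names).

```shell
work=$(mktemp -d)
svnadmin create "$work/old"
svn mkdir -q -m "Layout" "file://$work/old/projA" "file://$work/old/projA/trunk"

svnlook youngest "$work/old"    # latest revision number
svnlook tree "$work/old"        # the versioned tree, which ls can't show you

# Move: dump to a portable stream, then load it into a fresh repository.
svnadmin dump -q "$work/old" > "$work/projA.dump"
svnadmin create "$work/new"
svnadmin load -q "$work/new" < "$work/projA.dump"
svnlook youngest "$work/new"
```

Note that svnlook and svnadmin take a plain filesystem path to the repository, not a URL.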
Slide 14: SVN Surprises - Conflicts
Earlier, I mentioned that SVN checkins are atomic;
either the whole set of changes makes it into the repository,
or none of it does.
"But what about conflicts?"
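A conflict never makes it into the repository.
If someone else has committed changes to a file since your last
update, your commit is refused as out of date and nothing is checked in.
Running svn update then merges their changes into your sandbox and
marks any file it cannot merge cleanly as conflicted.
The user then takes care of the conflict in the local file tree.
When the conflict is cleared and the files are set and ready
to go, the user runs
svn resolved PATH...
to clear the conflict marker, and then commits again.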
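The conflict cycle as the command-line client handles it can be replayed with two scratch sandboxes. This is a sketch, assuming `svn` and `svnadmin` are installed; sandboxes "a" and "b" and the file contents are invented for the demonstration.

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
repo="file://$work/svn"
svn co -q "$repo" "$work/a"
echo "one" > "$work/a/menu.c"
svn add -q "$work/a/menu.c"
svn commit -q -m "Add menu.c" "$work/a"          # r1
svn co -q "$repo" "$work/b"

# Both sandboxes change the same line; A commits first.
echo "from A" > "$work/a/menu.c"
svn commit -q -m "A's change" "$work/a"          # r2
echo "from B" > "$work/b/menu.c"

# B's commit is rejected as out of date; nothing reaches the repository.
svn commit -q -m "B's change" "$work/b" || echo "commit refused"

# Updating pulls r2 into B's sandbox and flags the conflict there.
svn update -q --non-interactive "$work/b"
svn status "$work/b"                             # "C" marks menu.c

# B fixes the file by hand, tells SVN, and commits again.
echo "from A and B" > "$work/b/menu.c"
svn resolved "$work/b/menu.c"
svn commit -q -m "Merge A and B" "$work/b"       # r3
```

Throughout, the conflict lives only in B's sandbox; A and everyone else can keep committing.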
Slide 15: SVN Surprises - Revision Numbers
Everything in a repository shares the same revision number.
Everything in a repository shares the same revision number.
...
That means if you edit foo/goo/menu.c and commit your change,
every directory and file in the repository gets its revision
bumped along with menu.c.
It is not uncommon for repositories to reach revision numbers in
the thousands or even tens of thousands.
These are fast and easy to manage, though; SVN does a good job
of keeping all the changes between revision n-1 and n in
a single datum (in FSFS, this is a single file named n).
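The shared counter is easy to see in a scratch repository. This is a sketch, assuming `svn` and `svnadmin` are installed; the `bar` directory is a made-up bystander that the second commit never touches.

```shell
work=$(mktemp -d)
svnadmin create "$work/svn"
repo="file://$work/svn"
svn mkdir -q -m "Layout" "$repo/foo" "$repo/foo/goo" "$repo/bar"    # r1

svn co -q "$repo" "$work/wc"
echo 'int menu;' > "$work/wc/foo/goo/menu.c"
svn add -q "$work/wc/foo/goo/menu.c"
svn commit -q -m "Add menu.c" "$work/wc"                            # r2

svnlook youngest "$work/svn"                       # one counter for the whole repository
svn info "$repo/bar" | grep '^Revision'            # bar is "at" the latest revision too
svn info "$repo/bar" | grep '^Last Changed Rev'    # though it last changed in r1
```

"Last Changed Rev" is the closest thing to a CVS-style per-item version number.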
Overall, this is a win, but it does lead one to consider:
The LIAS group, for example, stored a lot of 3rd party
software in its repository; these are used to build
our main software products, but they aren't part of our
build and checkout process.
Is this a good idea, or, now that we use SVN, should we split
things across multiple repositories?