Lecture 40

Version control

MCS 275 Spring 2024
Emily Dumas


Reminders and announcements:

  • Please complete your course evaluations.
  • Homework 14 is the last one 🎉
  • Project 4 autograder opens Monday 🤖
  • Projects requiring subdirectories should upload as a ZIP file

Version control

A system to:

  • Track changes
  • Document changes
  • Archive previous versions
  • Allow concurrent work

Version control systems (VCS) are also known as "source code management" (SCM).

Do you have this?


        project4.py
        project4old.py
        project4draft.py
        project4-new.py
        project4-fixed.py
        project4-fixed-debug.py
        project4final.py
        project4final2.py
        project4final3.py
        project4final3 (1).py
        project4final_fixed-new2_revised\ (1).2024-04-17.py

A version control system (VCS) can help.

VCS history

  • Historical milestones
    • 1970s: VAX/VMS filesystem has versioning
    • 1980s: Revision Control System (RCS)
    • 1990s: Concurrent Versions System (CVS)
    • 2000s: Subversion (SVN)
  • Common until recently
    • fossil, mercurial, darcs, perforce, ...
  • Widely used today
    • git

git

A VCS created by Linus Torvalds* in 2005.

Key properties:

  • Open source: free to use; multiple implementations available.
  • Distributed: everyone has a copy of the full history.
  • Nonlinear: supports parallel branches of development; no concept of a single "latest" version.
  • Offline-friendly: many commands operate only on local files. Sync with others when ready.

* Finnish software developer and creator of Linux (1991).

Installing git

See the official guide. You have it if the command git in the terminal shows a help message.

In short:

  • Linux - probably already have it
  • macOS - just try running git in the terminal
  • Windows - need to install it
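A quick way to check for git from a terminal, as a minimal sketch: ask it to print its version. If the shell reports that the command is not found, git still needs to be installed.

```shell
# Print the installed git version.
# An error such as "command not found" means git is not installed yet.
git --version
```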

Online services

There are some popular online services that will keep a copy of your repository on a server and/or let you interact with it in a browser.

These let you voluntarily centralize a purposely decentralized system.

Centralize?

Centralizing means you can't exchange work with collaborators when that service is down or unreachable.

But it also means you rarely need to worry about how communication between collaborators will work (no file sharing or network setup...)

Today, git is usually used through a centralizing web service, and an outage at one of these services causes large productivity losses across industries.

Project

Repository

git init

Creates a git repository in the current directory.

Initially has empty history and doesn't track any files.
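A minimal sketch of creating a repository. The temporary directory is only an assumption for this demo, so that nothing real is touched; in practice you would run git init inside your project directory.

```shell
# Work in a throwaway directory (hypothetical location for this demo).
project="$(mktemp -d)"
cd "$project"

# Create an empty repository here; git keeps its database in the .git subdirectory.
git init

# Reports that there are no commits yet and nothing is tracked.
git status
```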

Data lifecycle

Example

Let's imagine how this would work for a program fetch.py that downloads and saves an HTML document to a file.

git add

Put current version of the file in a staging area.

git commit

Record staged changes in the database.

(These files will be tracked from now on.)
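The add and commit steps might look like the sketch below. The contents of fetch.py, the commit message, and the author identity are placeholders assumed for illustration (the identity is set through environment variables so that the demo does not depend on your git configuration).

```shell
# Throwaway repository for the demo.
project="$(mktemp -d)"
cd "$project"
git init -q

# Placeholder identity, assumed for this demo only.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the real fetch.py.
printf 'print("fetch placeholder")\n' > fetch.py

# Put the current version of fetch.py in the staging area.
git add fetch.py

# Record the staged changes in the database, with a description.
git commit -m "Add first version of fetch.py"
```

After the commit, fetch.py is tracked: later changes to it will show up in git status.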

git log

Show recent commits and descriptions.

git status

Show summary of current situation.
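A short sketch of log and status in use, again in a throwaway repository with placeholder file contents and identity. The --oneline and --short options are just compact output formats; plain git log and git status show more detail.

```shell
# Throwaway repository with one commit (placeholder identity and contents).
project="$(mktemp -d)"
cd "$project"
git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
printf 'print("version 1")\n' > fetch.py
git add fetch.py
git commit -q -m "Add fetch.py"

# Edit the file without staging or committing the change.
printf 'print("version 2")\n' > fetch.py

# One line per commit: abbreviated hash and description.
git log --oneline

# Compact summary: fetch.py is listed as modified but not staged.
git status --short
```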

Another commit

git push

Contact a remote repository and send it commits that are in our database but not theirs.

Fails if the remote has commits we haven't pulled yet!
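To try pushing without any online service, one option is to use a bare repository in a local directory as the remote. This is only a sketch: the directory names, file contents, and identity are assumptions for the demo, and with a hosted service the remote would be a URL instead.

```shell
# Throwaway workspace (hypothetical paths) and placeholder identity.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# A bare repository (database only, no working files) plays the role of the remote.
git init -q --bare "$work/remote.git"

# Clone it; the clone remembers the remote under the name "origin".
git clone -q "$work/remote.git" "$work/local"
cd "$work/local"

# Make one commit locally.
printf 'print("fetch placeholder")\n' > fetch.py
git add fetch.py
git commit -q -m "Add fetch.py"

# Send the current branch's commits to the remote.
git push -q origin HEAD
```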

git pull

Contact a remote repository and get commits from its database that are not yet in ours.

May trigger a merge if there have been changes to both local and remote since we last pulled.
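The sketch below simulates two collaborators sharing one remote, all in local directories. The names alice and bob, the paths, and the file contents are hypothetical. Here only the remote has changed since the clone, so the pull simply brings the local copy up to date and no merge is needed.

```shell
# Throwaway workspace (hypothetical paths) and placeholder identity.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q --bare "$work/remote.git"

# First collaborator makes a commit and pushes it.
git clone -q "$work/remote.git" "$work/alice"
cd "$work/alice"
printf 'print("version 1")\n' > fetch.py
git add fetch.py
git commit -q -m "Add fetch.py"
git push -q origin HEAD

# Second collaborator clones; their database matches the remote at this point.
git clone -q "$work/remote.git" "$work/bob"

# First collaborator pushes another commit that the second does not have.
printf 'print("version 2")\n' > fetch.py
git add fetch.py
git commit -q -m "Update fetch.py"
git push -q origin HEAD

# Second collaborator pulls: the missing commit is fetched and the files are updated.
cd "$work/bob"
git pull -q
git log --oneline
```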

Looking at history

git show COMMIT:FILE

will display file contents at any commit.
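As an illustration, the sketch below makes two commits and then displays the older version of the file. HEAD~1 means "the commit before the current one"; a commit hash copied from git log works in the same position. File contents and identity are placeholders.

```shell
# Throwaway repository with two commits (placeholder identity and contents).
project="$(mktemp -d)"
cd "$project"
git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

printf 'print("version 1")\n' > fetch.py
git add fetch.py
git commit -q -m "Add fetch.py"

printf 'print("version 2")\n' > fetch.py
git add fetch.py
git commit -q -m "Update fetch.py"

# Display fetch.py as it was one commit ago; the file on disk is not changed.
git show HEAD~1:fetch.py
```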

git clone

Make a local copy of an existing repository (from URL, directory, ...).
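A minimal sketch of cloning from a directory; cloning from a URL has the same form, with the URL in place of the source path. The paths, file contents, and identity here are assumptions for the demo.

```shell
# Throwaway workspace (hypothetical paths) and placeholder identity.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# An existing repository with one commit.
git init -q "$work/original"
cd "$work/original"
printf 'print("fetch placeholder")\n' > fetch.py
git add fetch.py
git commit -q -m "Add fetch.py"

# Copy the repository, including its full history, into a new directory.
git clone -q "$work/original" "$work/copy"

# The clone has the same files and the same commits.
cd "$work/copy"
git log --oneline
```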

Not covered today

  • checkout – change which version is seen in the filesystem
  • reset – set files and/or DB back to a previous state
  • branch – name a series of commits

Revision history

  • 2023-04-23 Finalization of the 2023 lecture this was based on
  • 2024-04-17 Initial publication