New Views on your History with git replace

Christian Couder, Murex ([email protected]), OSDC.fr 2013, October 5, 2013

description

Git has become the most popular version control system in the open source world, and more and more companies are using it too. A source code history managed by Git is supposed to be immutable, because Git uses a content-addressed database in which objects are indexed by their SHA-1 hash. Still, when mistakes have been made, or to make some history-based features more useful or more reliable, it can be worthwhile to transform the Git history. git replace is a good way to do that.

Transcript of New Views on your History with git replace

Page 1: New Views on your History with git replace

New Views on your History with git replace

Christian Couder, [email protected]

OSDC.fr 2013

October 5, 2013

Page 2: New Views on your History with git replace

About Git

A Distributed Version Control System (DVCS):

● created by Linus Torvalds
● maintained by Junio Hamano
● since 2005
● preferred VCS among open source developers

Page 3: New Views on your History with git replace

Git Design

Git is made of these things:

● “Objects”
● “Refs”
● config, indexes, logs, hooks, grafts, packs, ...

Only “Objects” and “Refs” are transferred from one repository to another.

Page 4: New Views on your History with git replace

Git Objects

● Blob: content of a file
● Tree: content of a directory
● Commit: state of the whole source code
● Tag: stamp on an object

Page 5: New Views on your History with git replace

Git Objects Storage

● Git Objects are stored in a content addressable database.
● The key to retrieve each Object is the SHA-1 of the Object’s content.
● A SHA-1 is a 160-bit / 40-hex / 20-byte hash value which is considered unique.

Page 6: New Views on your History with git replace

Blob

blob = content of a file

SHA1: e8455...

blob size
/* content of this blob, it can be anything like an image, a video, ...
   but most of the time it is source code like: */
#include <stdio.h>

int main(void)
{
    printf("Hello world!\n");
    return 0;
}
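The "blob size" header shown above is part of what gets hashed: the object ID is the SHA-1 of that header, a NUL byte, and then the file content. Here is a minimal sketch to check this by hand. It assumes Git's default SHA-1 object format and the GNU coreutils `sha1sum` tool; the sample content and variable names are arbitrary.

```shell
# Arbitrary sample content (a trailing newline is added when it is printed).
content='Hello world!'

# Size in bytes of the content, including the trailing newline.
size=$(( $(printf '%s\n' "$content" | wc -c) ))

# Hash the header "blob <size>" + NUL byte, followed by the content itself.
manual_id=$( { printf 'blob %d\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d ' ' -f 1 )

# Ask Git for the ID of the same content (nothing is stored without -w).
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual: $manual_id"
echo "git:    $git_id"
```

Both lines should show the same 40-character hexadecimal value.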

Page 7: New Views on your History with git replace

Example of storing and retrieving a blob

# echo "Whatever..." | git hash-object -w --stdin
aa02989467eea6d8e0bc68f3663de51767a9f5b1

# git cat-file -p aa02989467
Whatever...

Page 8: New Views on your History with git replace

Tree

tree = content of a directory

It can point to blobs and other trees.

SHA1: 0de24...

tree size
blob  e8455...  hello.c
tree  10af9...  lib

Page 9: New Views on your History with git replace

Example of storing and retrieving a tree

# BLOB=aa02989467eea6d8e0bc68f3663de51767a9f5b1
# (printf "100644 whatever.txt\0"; echo $BLOB | xxd -r -p) |
    git hash-object -t tree -w --stdin
0625da548ef0a7038c44b480f10d5550b2f2f962

# git cat-file -p 0625da548e
100644 blob aa02989467...    whatever.txt
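The example above writes the binary tree format by hand. As a hedged alternative, `git mktree` builds a tree object from text in the format printed by `git ls-tree`. The sketch below runs in a throwaway repository; the temporary directory and variable names are only illustrative.

```shell
# Work in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store the blob, then describe the tree in "git ls-tree" format:
# <mode> SP <type> SP <object ID> TAB <file name>
blob=$(echo "Whatever..." | git hash-object -w --stdin)
tree=$(printf '100644 blob %s\twhatever.txt\n' "$blob" | git mktree)

# Inspect the type and the content of the new object.
git cat-file -t "$tree"
git cat-file -p "$tree"
```

The hand-written `printf ... | git hash-object -t tree` form shows what the object really contains, while `git mktree` avoids dealing with binary object IDs.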

Page 10: New Views on your History with git replace

Commit

commit = information about some changes

It points to one tree and 0 or more parents.

SHA1: 98ca9...

commit size
tree 0de24...
parents ()
author Christian <timestamp>
committer Christian <timestamp>

My commit message

Page 11: New Views on your History with git replace

Example of storing and retrieving a commit (1)

# TREE=0625da548ef0a7038c44b480f10d5550b2f2f962
# ME="Christian Couder <[email protected]>"
# DATE=$(date "+%s %z")
# (echo -e "tree $TREE\nauthor $ME $DATE";
   echo -e "committer $ME $DATE\n\nfirst commit") |
    git hash-object -t commit -w --stdin
37449e955443883a0a888ee100cfd0a7ba7927b3

Page 12: New Views on your History with git replace

Example of storing and retrieving a commit (2)

# git cat-file -p 37449e9554
tree 0625da548ef0a7038c44b480f10d5550b2f2f962
author Christian Couder <[email protected]> 1380447450 +0200
committer Christian Couder <[email protected]> 1380447450 +0200

first commit
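The commit above was assembled line by line to show its format. In everyday scripts the same kind of object can be created with `git commit-tree`, which fills in the author, committer and timestamps itself. A minimal sketch, assuming a throwaway repository and a made-up identity (`Example Author`), set only so the example does not depend on local configuration:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity used for the author and committer lines.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Recreate the blob and the tree from the previous examples.
blob=$(echo "Whatever..." | git hash-object -w --stdin)
tree=$(printf '100644 blob %s\twhatever.txt\n' "$blob" | git mktree)

# Create a root commit: no -p option is given, so it has no parents.
# The commit message is read from standard input.
commit=$(echo "first commit" | git commit-tree "$tree")

git cat-file -p "$commit"
```

The printed object has the same layout as the hand-made one: a tree line, author and committer lines, a blank line, then the message.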

Page 13: New Views on your History with git replace

Git Objects Relations

Commit, SHA1: e84c7...
    tree 29c43...
    parents ()
    author Christian
    committer Christian
    Initial commit

Commit, SHA1: 98ca9...
    tree 5c11f...
    parents (e84c7...)
    author Arnaud
    committer Arnaud
    Change hello.c

Tree, SHA1: 29c43...
    blob 0de24... hello.c
    tree 98ca9... doc

Tree, SHA1: 98ca9...
    blob 677f4... readme
    blob 23ae9... install

Tree, SHA1: 5c11f...
    blob bc789... hello.c
    tree 98ca9... doc

Blob, SHA1: 0de24...
    int main() { ... }

Blob, SHA1: bc789...
    int main(void) { ... }

Page 14: New Views on your History with git replace

Git Refs

● Head: branch, .git/refs/heads/
● Tag: lightweight tag, .git/refs/tags/
● Remote: distant repository, .git/refs/remotes/
● Note: note attached to an object, .git/refs/notes/
● Replace: replacement of an object, .git/refs/replace/

Page 15: New Views on your History with git replace

Example of storing and retrieving a branch

# git update-ref refs/heads/master 37449e9554

# git rev-parse master
37449e955443883a0a888ee100cfd0a7ba7927b3

# git reset --hard master
HEAD is now at 37449e9 first commit

# cat whatever.txt
Whatever...

Page 16: New Views on your History with git replace

Result from previous examples

master -> commit 37449e9554 -> tree 0625da548e -> blob aa02989467

Page 17: New Views on your History with git replace

Commits in Git form a DAG (Directed Acyclic Graph)

● history direction is from left to right
● new commits point to their parents

Page 18: New Views on your History with git replace

git bisect

● B introduces a bad behavior called "bug" or "regression"
● red commits are called "bad"
● blue commits are called "good"
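To make the good/bad vocabulary concrete, here is a small sketch of an automated bisection. It assumes a throwaway repository with five made-up commits, where the third one adds the word "bug" to a file. The file name, commit messages and test command are all illustrative.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, so the example does not depend on local configuration.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Five commits; the third one introduces the "bug".
for i in 1 2 3 4 5; do
    if [ "$i" -eq 3 ]; then line="bug"; else line="line $i"; fi
    echo "$line" >> file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
good=$(git rev-parse HEAD~4)      # first commit: known to be "good"
culprit=$(git rev-parse HEAD~2)   # third commit: the one bisect should find

# Start with the "bad" commit first, then the "good" one.
git bisect start HEAD "$good" >/dev/null

# The test command must exit with 0 for "good" and 1 for "bad".
git bisect run sh -c '! grep -q bug file.txt' >/dev/null

# When bisection is over, refs/bisect/bad points to the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git log -1 --format='first bad commit: %s' "$first_bad"

git bisect reset >/dev/null 2>&1
```

With only five commits, `git bisect run` needs two test runs here. The benefit grows with the length of the history, since the number of steps is roughly logarithmic in the number of commits.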

Page 19: New Views on your History with git replace

Problem when bisecting

Sometimes the commit that introduced a bug will be in an untestable area of the graph.

For example:

W -- X -- X1 -- X2 -- X3 -- Y -- Z

Commit X introduced a breakage, later fixed by commit Y.

Page 20: New Views on your History with git replace

Possible solutions

Possible solutions to bisect anyway:

● apply a patch before testing and remove it afterwards (can be done using "git cherry-pick"), or
● create a fixed up branch (can be done with "git rebase -i"), for example:

W -- X -- X1 -- X2 -- X3 -- Y -- Z

(X + Y) -- X1' -- X2' -- X3' -- Z'

Z1

Page 21: New Views on your History with git replace

A good solution

The idea is that we will replace Z with Z' so that we bisect from the beginning using the fixed up branch.

W -- X -- X1 -- X2 -- X3 -- Y -- Z

(X + Y) -- X1' -- X2' -- X3' -- Z' -- Z1

$ git replace Z Z'
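The sketch below replays this idea on a tiny, made-up history. It assumes a throwaway repository, a hypothetical `commit` helper function, and a shorter chain (W, X, X1, Y, Z) than the one on the slide. For brevity the fixed up commits are created directly rather than with "git rebase -i".

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, so the example does not depend on local configuration.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Hypothetical helper: write <content> into <file> and commit with <message>.
commit() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

# Original history: W -- X -- X1 -- Y -- Z, where X breaks app.txt and Y fixes it.
commit app.txt "works" "W"
w=$(git rev-parse HEAD)
commit app.txt "broken" "X"
commit feature.txt "feature" "X1"
commit app.txt "works better" "Y"
commit more.txt "more" "Z"
z=$(git rev-parse HEAD)

# Fixed up history: W -- (X + Y) -- X1' -- Z', built on a detached HEAD.
git checkout -q "$w"
commit app.txt "works better" "X + Y"
commit feature.txt "feature" "X1'"
commit more.txt "more" "Z'"
z_fixed=$(git rev-parse HEAD)

# Go back to the original tip and replace Z with Z'.
git checkout -q "$z"
git replace "$z" "$z_fixed"

echo "History of Z with the replace ref:"
git log --format=%s "$z"
echo "History of Z without it:"
git --no-replace-objects log --format=%s "$z"
```

After the replacement, anything that walks the history from Z, including git bisect, goes through the fixed up commits, while the original commits remain in the repository.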

Page 22: New Views on your History with git replace

Grafts

Created mostly for projects with old repositories, like the Linux kernel.

● “.git/info/grafts” file
● each line describes the parents of a commit
● <commit> <parent> [<parent>]*
● this overrides the content in the commit

Page 23: New Views on your History with git replace

Problem with Grafts

They are neither objects nor refs, so they cannot be easily transferred.

We need something that is either:

● an object, or
● a ref

Page 24: New Views on your History with git replace

Solution, part 1: replace ref

● It is a ref in .git/refs/replace/
● Its name is the SHA-1 of the object that should be replaced.
● It contains, so it points to, the SHA-1 of the replacement object.

Page 25: New Views on your History with git replace

Solution, part 2: git replace

● git replace [ -f ] <object> <replacement>: to create a replace ref

● git replace -d <object>: to delete a replace ref

● git replace [ -l [ pattern ] ]: to list some replace refs
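The three forms above can be tried on two blobs in a throwaway repository. This is only a sketch: the blob contents and variable names are made up, and blobs are used simply because they are the easiest objects to create.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two unrelated blobs: the object to replace and its replacement.
old=$(echo "old content" | git hash-object -w --stdin)
new=$(echo "new content" | git hash-object -w --stdin)

# Create the replace ref.
git replace "$old" "$new"

# List replace refs: the name of each one is the ID of the replaced object.
listed=$(git replace -l)

# The ref lives under refs/replace/ and points to the replacement object.
target=$(git rev-parse "refs/replace/$old")

# Reading the old object now returns the content of the replacement.
seen=$(git cat-file -p "$old")
echo "with replace ref:    $seen"

# Delete the replace ref: the original content is visible again.
git replace -d "$old"
restored=$(git cat-file -p "$old")
echo "without replace ref: $restored"
```

Nothing is rewritten or lost at any point: both blobs stay in the object database, and only the ref decides which one is shown.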

Page 26: New Views on your History with git replace

Replace ref transfer

● as with heads, tags, notes, remotes
● except that there are no shortcuts and you must be explicit
● refspec: refs/replace/*:refs/replace/*
● refspec can be configured (in .git/config), or used on the command line (after git push/fetch <remote>)
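Both ways of giving the refspec can be sketched with a local bare repository standing in for the remote. The directory layout, the `origin` remote name and the blob contents below are assumptions made for the example.

```shell
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/remote.git"

# Create a replace ref locally (two made-up blobs).
old=$(echo "old content" | git hash-object -w --stdin)
new=$(echo "new content" | git hash-object -w --stdin)
git replace "$old" "$new"

# On the command line: push all replace refs with an explicit refspec.
git push -q origin 'refs/replace/*:refs/replace/*'

# In .git/config: make every later "git fetch origin" also fetch replace refs.
git config --add remote.origin.fetch '+refs/replace/*:refs/replace/*'

# Check which replace refs the remote repository now has.
git ls-remote origin 'refs/replace/*'
```

Whether to configure the fetch refspec is a policy choice: it makes the remote's view of the history apply automatically to everyone who sets it up.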

Page 27: New Views on your History with git replace

Creating replacement objects

When needed, the following commands can help:

● git rebase [ -i ]
● git cherry-pick
● git hash-object
● git filter-branch

Page 28: New Views on your History with git replace

What can it be used for?

Create new views of your history.

Right now only 2 views are possible:

● the view with all the replace refs enabled
● the view with all the replace refs disabled, using --no-replace-objects or the GIT_NO_REPLACE_OBJECTS environment variable
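The two views can be compared side by side. This is a minimal sketch, again with two made-up blobs in a throwaway repository; the variable names are illustrative.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

old=$(echo "old content" | git hash-object -w --stdin)
new=$(echo "new content" | git hash-object -w --stdin)
git replace "$old" "$new"

# View 1: replace refs enabled (the default).
with_replace=$(git cat-file -p "$old")

# View 2: replace refs disabled, either with the command line option...
without_option=$(git --no-replace-objects cat-file -p "$old")

# ...or with the environment variable.
without_env=$(GIT_NO_REPLACE_OBJECTS=1 git cat-file -p "$old")

echo "enabled:             $with_replace"
echo "disabled (option):   $without_option"
echo "disabled (variable): $without_env"
```

The option only affects the single command it is given to, while an exported environment variable is also inherited by the Git commands that scripts and tools run.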

Page 29: New Views on your History with git replace

Why new views?

● split old and new history or merge them
● fix bugs to bisect on a clean history
● fix mistakes in author, committer, timestamps
● remove big files to have something lighter to use, when you don’t need them
● prepare a repo cleanup
● mask/unmask some steps
● ...
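As an illustration of the "fix mistakes in author" item above, the sketch below creates a commit with a wrong author, builds a corrected copy with git hash-object, and replaces the original with it. All names and email addresses are invented, and the `sed` expression assumes the author line appears exactly as written.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical committer identity, so the example does not depend on local configuration.
export GIT_COMMITTER_NAME="Example Committer" GIT_COMMITTER_EMAIL="committer@example.com"
export GIT_AUTHOR_NAME="Example Committer" GIT_AUTHOR_EMAIL="committer@example.com"

# A commit recorded with the wrong author.
echo "Whatever..." > whatever.txt
git add whatever.txt
git commit -q -m "first commit" --author="Wrong Name <wrong@example.com>"
orig=$(git rev-parse HEAD)

# Build a corrected copy of the commit object: same tree, parents,
# committer and message, but a different name and email on the author line.
fixed=$(git cat-file commit "$orig" |
    sed 's/^author Wrong Name <wrong@example.com>/author Right Name <right@example.com>/' |
    git hash-object -t commit -w --stdin)

git replace "$orig" "$fixed"

echo "with replace ref:    $(git log -1 --format='%an <%ae>' "$orig")"
echo "without replace ref: $(git --no-replace-objects log -1 --format='%an <%ae>' "$orig")"
```

Unlike rewriting with git filter-branch, the commit ID that branches and child commits refer to stays the same, so nothing else has to be rewritten.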

Page 30: New Views on your History with git replace

Limitations

● everything is still in the repo
● so the repo is still big
● there are probably bugs
● confusing?
● ...

Page 31: New Views on your History with git replace

Current and future work

● a script to replace grafts
● fix bugs
● allow subdirectories in .git/refs/replace/
● maybe allow “views” as set of active subdirectories
● ...
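On the first item: Git releases published after this talk include a `git replace --graft <commit> [<parent>...]` option, which records the same parent override as a graft line, but as a replace ref. The sketch below assumes a Git version that has this option. It joins two made-up, unrelated histories in a throwaway repository.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, so the example does not depend on local configuration.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# "Old" history: a single root commit.
echo "old" > file.txt
git add file.txt
git commit -q -m "old history"
old_tip=$(git rev-parse HEAD)

# "New" history: an unrelated root commit on an orphan branch.
git checkout -q --orphan new-history
echo "new" > file.txt
git add file.txt
git commit -q -m "new history"
new_root=$(git rev-parse HEAD)

# Equivalent of the graft line "<new_root> <old_tip>", stored as a replace ref.
git replace --graft "$new_root" "$old_tip"

echo "History with the replace ref:"
git log --format=%s "$new_root"
echo "History without it:"
git --no-replace-objects log --format=%s "$new_root"
```

Because the result is an ordinary replace ref, it can be pushed and fetched with the refs/replace/* refspec, which a “.git/info/grafts” file cannot.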

Page 32: New Views on your History with git replace

Considerations

● best of both worlds: immutability and configurability of history
● no true view
● history is important for freedom

Page 33: New Views on your History with git replace

Many thanks to:

● Junio Hamano (comments, help, discussions, reviews, improvements),
● Ingo Molnar,
● Linus Torvalds,
● many other great people in the Git and Linux communities, especially: Andreas Ericsson, Johannes Schindelin, H. Peter Anvin, Daniel Barkalow, Bill Lear, John Hawley, ...
● OSDC/OWF organizers and attendees,
● Murex, the company I am working for.

Page 34: New Views on your History with git replace

Questions?