New Views on your History with git replace
Christian Couder, [email protected]
OSDC.fr 2013
October 5, 2013
About Git
A Distributed Version Control System (DVCS):
● created by Linus Torvalds
● maintained by Junio Hamano
● since 2005
● preferred VCS among open source developers
Git Design
Git is made of these things:
● “Objects”
● “Refs”
● config, indexes, logs, hooks, grafts, packs, ...
Only “Objects” and “Refs” are transferred from one repository to another.
Git Objects
● Blob: content of a file
● Tree: content of a directory
● Commit: state of the whole source code
● Tag: stamp on an object
Git Objects Storage
● Git Objects are stored in a content addressable database.
● The key to retrieve each Object is the SHA-1 of the Object’s content.
● A SHA-1 is a 160-bit / 40-hex / 20-byte hash value which is considered unique.
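To make the addressing concrete: Git does not hash the bare content, but a short header (the object type, the size in bytes and a NUL byte) followed by the content. The sketch below recomputes a blob's key by hand and compares it with what `git hash-object` reports. It assumes a Git that uses SHA-1 object names and a `sha1sum` command (GNU coreutils); the variable names are illustrative only.

```shell
# Content to hash (illustrative).
content='Hello world!'

# Size in bytes of the content, including the newline added by printf below.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Hash "blob <size>\0<content>" by hand (assumes sha1sum is installed).
manual=$({ printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1)

# Ask Git for the object name of the same content (nothing is written).
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual: $manual"
echo "git:    $git_id"
```

Both lines should show the same 40-hex value, which is why identical content is always stored under the same key.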
Blob
blob = content of a file
[Diagram: a blob object (SHA1: e8455...) made of a "blob <size>" header and its content. The content can be anything, like an image or a video, but most of the time it is source code like:]
#include <stdio.h>
int main(void) { printf("Hello world!\n"); return 0; }
Example of storing and retrieving a blob
# echo "Whatever..." | git hash-object -w --stdin
aa02989467eea6d8e0bc68f3663de51767a9f5b1
# git cat-file -p aa02989467
Whatever...
Tree
tree = content of a directory
It can point to blobs and other trees.
[Diagram: a tree object (SHA1: 0de24...) made of a "tree <size>" header and two entries: blob e8455... named hello.c, and tree 10af9... named lib.]
Example of storing and retrieving a tree
# BLOB=aa02989467eea6d8e0bc68f3663de51767a9f5b1
# (printf "100644 whatever.txt\0"; echo $BLOB | xxd -r -p) | git hash-object -t tree -w --stdin
0625da548ef0a7038c44b480f10d5550b2f2f962
# git cat-file -p 0625da548e
100644 blob aa02989467... whatever.txt
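The example above writes the raw tree format by hand. As a hedged alternative, the `git mktree` plumbing command builds a tree from lines in `git ls-tree` format. The sketch below runs in a throwaway repository created with `mktemp`; the variable names are illustrative.

```shell
# Work in a throwaway repository (path chosen by mktemp).
repo=$(mktemp -d) && cd "$repo" && git init -q

# Store the blob first, as in the blob example.
blob=$(echo "Whatever..." | git hash-object -w --stdin)

# git mktree reads lines of the form: <mode> <type> <sha>TAB<name>
tree=$(printf '100644 blob %s\twhatever.txt\n' "$blob" | git mktree)

git cat-file -t "$tree"   # prints the object type
git cat-file -p "$tree"   # prints the tree entries
```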
Commit
commit = information about some changes
It points to one tree and 0 or more parents.
[Diagram: a commit object (SHA1: 98ca9...) made of a "commit <size>" header, tree 0de24..., parents (), author Christian <timestamp>, committer Christian <timestamp>, and the message "My commit message".]
Example of storing and retrieving a commit (1)
# TREE=0625da548ef0a7038c44b480f10d5550b2f2f962
# ME="Christian Couder <[email protected]>"
# DATE=$(date "+%s %z")
# (echo -e "tree $TREE\nauthor $ME $DATE"; echo -e "committer $ME $DATE\n\nfirst commit") | git hash-object -t commit -w --stdin
37449e955443883a0a888ee100cfd0a7ba7927b3
Example of storing and retrieving a commit (2)
# git cat-file -p 37449e9554
tree 0625da548ef0a7038c44b480f10d5550b2f2f962
author Christian Couder <[email protected]> 1380447450 +0200
committer Christian Couder <[email protected]> 1380447450 +0200

first commit
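In day-to-day plumbing the commit text is rarely assembled by hand. A minimal sketch of the same step with `git commit-tree`, which writes the commit object and prints its SHA-1. The identity values are placeholders, not taken from the slides, and the repository is a throwaway one.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

# Identity for the demo commit (placeholder values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# Blob and tree, built as in the previous examples.
blob=$(echo "Whatever..." | git hash-object -w --stdin)
tree=$(printf '100644 blob %s\twhatever.txt\n' "$blob" | git mktree)

# A root commit: one tree, no parent.
commit=$(git commit-tree "$tree" -m "first commit")

git cat-file -p "$commit"
```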
Git Objects Relations
[Diagram: two commits with the trees and blobs they point to.]
● Commit e84c7...: tree 29c43..., parents (), author and committer Christian, message "Initial commit"
● Commit 98ca9...: tree 5c11f..., parents (e84c7...), author and committer Arnaud, message "Change hello.c"
● Tree 29c43...: blob 0de24... hello.c, tree 98ca9... doc
● Tree 5c11f...: blob bc789... hello.c, tree 98ca9... doc
● Tree 98ca9... (doc): blob 677f4... readme, blob 23ae9... install
● Blob 0de24...: int main() { ... }
● Blob bc789...: int main(void) { ... }
Git Refs
● Head: branch, .git/refs/heads/
● Tag: lightweight tag, .git/refs/tags/
● Remote: distant repository, .git/refs/remotes/
● Note: note attached to an object, .git/refs/notes/
● Replace: replacement of an object, .git/refs/replace/
Example of storing and retrieving a branch
# git update-ref refs/heads/master 37449e9554
# git rev-parse master
37449e955443883a0a888ee100cfd0a7ba7927b3
# git reset --hard master
HEAD is now at 37449e9 first commit
# cat whatever.txt
Whatever...
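Other ref namespaces from the list above can be populated in the same way. The sketch below creates a head, a lightweight tag and a note in a throwaway repository, then lists them with `git for-each-ref`. The ref names `demo` and `v0` and the identity values are illustrative assumptions.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# An empty tree and a root commit for the refs to point at.
tree=$(git mktree </dev/null)
commit=$(git commit-tree "$tree" -m "first commit")

git update-ref refs/heads/demo "$commit"   # head (branch)
git tag v0 "$commit"                       # lightweight tag
git notes add -m "a note" "$commit"        # note, stored under refs/notes/

# List every ref together with the object it points to.
git for-each-ref --format='%(objectname:short) %(refname)'
```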
Result from previous examples
master -> commit 37449e9554 -> tree 0625da548e -> blob aa02989467
Commits in Git form a DAG (Directed Acyclic Graph)
● history direction is from left to right
● new commits point to their parents
git bisect
● B introduces a bad behavior called "bug" or "regression"
● red commits are called "bad"
● blue commits are called "good"
Problem when bisecting
Sometimes the commit that introduced a bug will be in an untestable area of the graph.
For example:
W -- X -- X1 -- X2 -- X3 -- Y -- Z
Commit X introduced a breakage, later fixed by commit Y.
Possible solutions
Possible solutions to bisect anyway:
● apply a patch before testing and remove it afterwards (can be done using "git cherry-pick"), or
● create a fixed up branch (can be done with "git rebase -i"), for example:
W -- X -- X1 -- X2 -- X3 -- Y -- Z -- Z1
 \
  X + Y -- X1' -- X2' -- X3' -- Z'
A good solution
The idea is that we will replace Z with Z' so that we bisect from the beginning using the fixed up branch.
W -- X -- X1 -- X2 -- X3 -- Y -- Z
 \
  X + Y -- X1' -- X2' -- X3' -- Z' -- Z1
$ git replace Z Z'
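A minimal sketch of this scenario in a throwaway repository, with a shortened history (X2 and X3 are left out). The helper `mk`, the file name `f` and the file contents are hypothetical and only serve to build commits quickly with plumbing commands.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# Hypothetical helper: mk <message> <file content> [<parent>]
# Creates a commit whose tree holds a single file "f" and prints its SHA-1.
mk() {
    blob=$(echo "$2" | git hash-object -w --stdin)
    tree=$(printf '100644 blob %s\tf\n' "$blob" | git mktree)
    git commit-tree "$tree" ${3:+-p "$3"} -m "$1"
}

# Original history: X introduces the breakage, Y fixes it.
W=$(mk "W" "ok")
X=$(mk "X" "broken" "$W")
X1=$(mk "X1" "broken, step 1" "$X")
Y=$(mk "Y" "fixed, step 1" "$X1")
Z=$(mk "Z" "fixed, step 2" "$Y")
Z1=$(mk "Z1" "fixed, step 3" "$Z")

# Fixed up branch: X and Y squashed, later commits rebuilt on top.
XY=$(mk "X + Y" "fixed" "$W")
X1p=$(mk "X1'" "fixed, step 1" "$XY")
Zp=$(mk "Z'" "fixed, step 2" "$X1p")

git replace "$Z" "$Zp"

# The history seen from Z1 now runs through the fixed up branch.
git log --format=%s "$Z1"
```

With the replace ref in place, the log lists Z1, Z', X1', X + Y and W, so a bisection started from Z1 never visits the broken commits.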
Grafts
Created mostly for projects like the Linux kernel with old repositories.
● “.git/info/grafts” file
● each line describes the parents of a commit
● <commit> <parent> [<parent>]*
● this overrides the content in the commit
Problem with Grafts
They are neither objects nor refs, so they cannot be easily transferred.
We need something that is either:
● an object, or
● a ref
Solution, part 1: replace ref
● It is a ref in .git/refs/replace/
● Its name is the SHA-1 of the object that should be replaced.
● It contains, so it points to, the SHA-1 of the replacement object.
Solution, part 2: git replace
● git replace [ -f ] <object> <replacement>: to create a replace ref
● git replace -d <object>: to delete a replace ref
● git replace [ -l [ pattern ] ]: to list some replace refs
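The three forms can be tried in a throwaway repository. In this sketch a commit with a typo in its message is replaced by a corrected one. The messages, variable names and identity values are illustrative assumptions.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

tree=$(git mktree </dev/null)
old=$(git commit-tree "$tree" -m "frist commit")   # typo on purpose
new=$(git commit-tree "$tree" -m "first commit")   # corrected replacement

git replace "$old" "$new"            # create the replace ref
git replace -l                       # list: prints the SHA-1 of the replaced object
git rev-parse "refs/replace/$old"    # the ref points to the replacement
git log -1 --format=%s "$old"        # shows the replacement's message

git replace -d "$old"                # delete the replace ref
git log -1 --format=%s "$old"        # shows the original message again
```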
Replace ref transfer
● as with heads, tags, notes, remotes
● except that there are no shortcuts and you must be explicit
● refspec: refs/replace/*:refs/replace/*
● refspec can be configured (in .git/config), or used on the command line (after git push/fetch <remote>)
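A sketch of both variants against a local bare repository standing in for the remote: the refspec is given on the command line for the push, and configured in .git/config for the fetch. The directory names, the `demo` branch and the leading `+` (forced update) in the configured refspec are assumptions made for this example.

```shell
base=$(mktemp -d) && cd "$base"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# A bare repository plays the role of the remote.
git init -q --bare server.git

# First repository: create a commit, a branch and a replace ref.
git init -q work && cd work
git remote add origin "$base/server.git"
tree=$(git mktree </dev/null)
old=$(git commit-tree "$tree" -m "frist commit")
new=$(git commit-tree "$tree" -m "first commit")
git update-ref refs/heads/demo "$old"
git replace "$old" "$new"

# Replace refs are only pushed when the refspec is given explicitly.
git push -q origin demo 'refs/replace/*:refs/replace/*'

# Second repository: configure the refspec once, then fetch as usual.
cd "$base" && git init -q other && cd other
git remote add origin "$base/server.git"
git config --add remote.origin.fetch '+refs/replace/*:refs/replace/*'
git fetch -q origin

git log -1 --format=%s origin/demo   # uses the fetched replace ref
```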
Creating replacement objects
When needed, the following commands can help:
● git rebase [ -i ]
● git cherry-pick
● git hash-object
● git filter-branch
What can it be used for?
Create new views of your history.
Right now only 2 views are possible:
● the view with all the replace refs enabled
● the view with all the replace refs disabled, using --no-replace-objects or the GIT_NO_REPLACE_OBJECTS environment variable
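The two views can be compared on a single object. In this sketch one blob is replaced by another, and the same object name is read with replace refs enabled and then disabled. The contents and variable names are illustrative.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

old=$(echo "old content" | git hash-object -w --stdin)
new=$(echo "new content" | git hash-object -w --stdin)
git replace "$old" "$new"

git cat-file -p "$old"                            # view with replace refs enabled
git --no-replace-objects cat-file -p "$old"       # view with replace refs disabled
GIT_NO_REPLACE_OBJECTS=1 git cat-file -p "$old"   # same, through the environment
```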
Why new views?
● split old and new history or merge them
● fix bugs to bisect on a clean history
● fix mistakes in author, committer, timestamps
● remove big files to have something lighter to use, when you don’t need them
● prepare a repo cleanup
● mask/unmask some steps
● ...
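As one example from this list, a mistake in an author name can be masked without rewriting the branch. The sketch edits the raw commit text with `sed`, stores the result with `git hash-object` and replaces the original commit with it. The names "Chris" and "Christian" and the email address are placeholder values for this example.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME="Chris" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

tree=$(git mktree </dev/null)
old=$(git commit-tree "$tree" -m "first commit")

# Rewrite only the author line of the raw commit and store the result.
new=$(git cat-file commit "$old" |
      sed 's/^author Chris /author Christian /' |
      git hash-object -t commit -w --stdin)

git replace "$old" "$new"

git log -1 --format='%an: %s' "$old"                        # corrected view
git --no-replace-objects log -1 --format='%an: %s' "$old"   # original view
```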
Limitations
● everything is still in the repo
● so the repo is still big
● there are probably bugs
● confusing?
● ...
Current and future work
● a script to replace grafts
● fix bugs
● allow subdirectories in .git/refs/replace/
● maybe allow “views” as set of active subdirectories
● ...
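On the first item: Git releases published after this talk include a `git replace --graft <commit> [<parent>...]` option, which creates a replacement commit with the given parents. The sketch below assumes such a Git version is installed and joins two unrelated root commits, which is what a graft line "<new_root> <old_root>" would express. The variable names and messages are illustrative.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# Two unrelated root commits standing for old and new history.
tree=$(git mktree </dev/null)
old_root=$(git commit-tree "$tree" -m "old history")
new_root=$(git commit-tree "$tree" -m "new history")

# Make new_root appear to have old_root as its parent
# (assumes a Git version that provides --graft).
git replace --graft "$new_root" "$old_root"

git log --format=%s "$new_root"
```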
Considerations
● best of both worlds: immutability and configurability of history
● no true view
● history is important for freedom
Many thanks to:
● Junio Hamano (comments, help, discussions, reviews, improvements),
● Ingo Molnar,
● Linus Torvalds,
● many other great people in the Git and Linux communities, especially: Andreas Ericsson, Johannes Schindelin, H. Peter Anvin, Daniel Barkalow, Bill Lear, John Hawley, ...
● OSDC/OWF organizers and attendees,
● Murex, the company I am working for.
Questions?