How to Use GIT Java APIs to Diff Different Versions
Last week I introduced the JGit Java API with a simple sample illustrating how to read content from HEAD. If you have multiple versions of a source code or text file, you may want to see the differences between them. An easy tool for this is the standard diff.
The JGit Java API has built-in support for generating a diff between any two versions of a file, be it source code, a properties file, an XML file, or any other text file. Here is a sample that shows how to do this.
/** Copyright Steve Jin 2013 */
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.util.List;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.diff.DiffFormatter;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectReader;
import org.eclipse.jgit.treewalk.CanonicalTreeParser;

public class JGitDiff
{
  public static void main(String[] args) throws Exception
  {
    File gitWorkDir = new File("C:/temp/gittest/");
    Git git = Git.open(gitWorkDir);

    // Hash of the older version to compare against HEAD
    String oldHash = "d7db296cc2730ca562f91cfa539d6955a21284b6";

    // "^{tree}" resolves each revision to its root tree object
    ObjectId headId = git.getRepository().resolve("HEAD^{tree}");
    ObjectId oldId = git.getRepository().resolve(oldHash + "^{tree}");

    ObjectReader reader = git.getRepository().newObjectReader();

    CanonicalTreeParser oldTreeIter = new CanonicalTreeParser();
    oldTreeIter.reset(reader, oldId);
    CanonicalTreeParser newTreeIter = new CanonicalTreeParser();
    newTreeIter.reset(reader, headId);

    // One DiffEntry per file that differs between the two trees
    List<DiffEntry> diffs = git.diff()
        .setNewTree(newTreeIter)
        .setOldTree(oldTreeIter)
        .call();

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DiffFormatter df = new DiffFormatter(out);
    df.setRepository(git.getRepository());

    for (DiffEntry diff : diffs)
    {
      // Write the patch text for this entry, print it, then clear the buffer
      df.format(diff);
      String diffText = out.toString("UTF-8");
      System.out.println(diffText);
      out.reset();
    }
  }
}
The output of the program is as follows:
diff --git a/file1.txt b/file1.txt
index 7702b88..805e7c6 100644
--- a/file1.txt
+++ b/file1.txt
@@ -1 +1 @@
-DoubleCloud.org rocks!
\ No newline at end of file
+DoubleCloud.org really rocks!
\ No newline at end of file
As you may have noticed, the sample uses a hash string to identify the old version, which is rarely how you would identify a version in reality. For one thing, it's hidden in the .git/objects folder with other objects identified by hash strings. Most likely you would use a tag or a branch head to identify a particular version. The hard-coded hash is just a trade-off to simplify the sample.
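To illustrate the point about tags and branch heads, here is a minimal sketch that passes names instead of a hash to the same resolve() call. It is not part of the original sample: the class name JGitNamedDiff, the helper methods treeFor() and diff(), and the revision names "v1.0" and "master" are all hypothetical, and the try-with-resources blocks assume JGit 4.0 or later, where Git and ObjectReader are AutoCloseable. Instead of the full patch text, it prints only the change type and path of each changed file.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectReader;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.treewalk.CanonicalTreeParser;

public class JGitNamedDiff
{
  /** Builds a tree iterator for a revision given as a branch, a tag, or a hash. */
  static CanonicalTreeParser treeFor(Repository repo, ObjectReader reader, String rev)
      throws IOException
  {
    // resolve() returns null when the name does not exist in the repository
    ObjectId treeId = repo.resolve(rev + "^{tree}");
    if (treeId == null)
    {
      throw new IllegalArgumentException("Cannot resolve revision: " + rev);
    }
    CanonicalTreeParser parser = new CanonicalTreeParser();
    parser.reset(reader, treeId);
    return parser;
  }

  /** Lists the files that differ between two named revisions. */
  static List<DiffEntry> diff(Git git, String oldRev, String newRev) throws Exception
  {
    Repository repo = git.getRepository();
    try (ObjectReader reader = repo.newObjectReader())
    {
      return git.diff()
          .setOldTree(treeFor(repo, reader, oldRev))
          .setNewTree(treeFor(repo, reader, newRev))
          .call();
    }
  }

  public static void main(String[] args) throws Exception
  {
    // "v1.0" and "master" are placeholder names; use a tag and branch from your own repository
    try (Git git = Git.open(new File("C:/temp/gittest/")))
    {
      for (DiffEntry entry : diff(git, "v1.0", "master"))
      {
        // A deleted file has no new path, so fall back to the old one
        String path = entry.getChangeType() == DiffEntry.ChangeType.DELETE
            ? entry.getOldPath()
            : entry.getNewPath();
        System.out.println(entry.getChangeType() + " " + path);
      }
    }
  }
}
```

Because resolve() accepts the usual Git revision syntax, the same helper should also work with a hash or with expressions such as HEAD~1, so the hard-coded hash from the sample can still be passed in unchanged.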
Do you have any documentation on the best way to handle printing out the fileNames from a given commit?
Hi Mike,
I don’t remember off the top of my head, as I haven’t touched it for quite some time.
Steve
May I know what oldHash is in this code?