How to Use GIT Java APIs to Diff Different Versions
Last week I introduced the JGit Java API with a simple sample illustrating how to read content from HEAD. If you have multiple versions of a source code or text file, you may want to see the differences between them. The standard tool for this is diff.
The JGit Java API has built-in support for generating a diff between any two versions of a file, be it source code, a properties file, an XML file, or any other text file. Here is a sample that shows how to do this.
/** Copyright Steve Jin 2013 */
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.util.List;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.diff.DiffFormatter;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectReader;
import org.eclipse.jgit.treewalk.CanonicalTreeParser;

public class JGitDiff
{
    public static void main(String[] args) throws Exception
    {
        // Open the existing repository in the working directory.
        File gitWorkDir = new File("C:/temp/gittest/");
        Git git = Git.open(gitWorkDir);

        // Resolve the tree of HEAD and the tree of an older commit.
        String oldHash = "d7db296cc2730ca562f91cfa539d6955a21284b6";
        ObjectId headId = git.getRepository().resolve("HEAD^{tree}");
        ObjectId oldId = git.getRepository().resolve(oldHash + "^{tree}");

        // Prepare a tree iterator for each side of the comparison.
        ObjectReader reader = git.getRepository().newObjectReader();
        CanonicalTreeParser oldTreeIter = new CanonicalTreeParser();
        oldTreeIter.reset(reader, oldId);
        CanonicalTreeParser newTreeIter = new CanonicalTreeParser();
        newTreeIter.reset(reader, headId);

        // List the entries that changed between the two trees.
        List<DiffEntry> diffs = git.diff()
            .setNewTree(newTreeIter)
            .setOldTree(oldTreeIter)
            .call();

        // Format each changed entry as a unified diff and print it.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DiffFormatter df = new DiffFormatter(out);
        df.setRepository(git.getRepository());
        for (DiffEntry diff : diffs)
        {
            df.format(diff);
            String diffText = out.toString("UTF-8");
            System.out.println(diffText);
            out.reset(); // clear the buffer before the next entry
        }
    }
}
The output of the program is as follows:
diff --git a/file1.txt b/file1.txt
index 7702b88..805e7c6 100644
--- a/file1.txt
+++ b/file1.txt
@@ -1 +1 @@
-DoubleCloud.org rocks!
\ No newline at end of file
+DoubleCloud.org really rocks!
\ No newline at end of file
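If you want to process this output in code rather than just print it, the "@@ -1 +1 @@" hunk header is the part worth decoding: it gives the starting line and line count on the old and new side, and in the unified diff format an omitted count means 1. Below is a minimal sketch of a parser for that header using only the standard library. The HunkHeader class and its field names are my own illustration, not part of JGit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a JGit class): decodes a unified diff hunk header.
class HunkHeader {
    // Shape: "@@ -oldStart[,oldCount] +newStart[,newCount] @@"
    private static final Pattern HEADER =
        Pattern.compile("^@@ -(\\d+)(?:,(\\d+))? \\+(\\d+)(?:,(\\d+))? @@");

    final int oldStart;
    final int oldCount;
    final int newStart;
    final int newCount;

    private HunkHeader(int oldStart, int oldCount, int newStart, int newCount) {
        this.oldStart = oldStart;
        this.oldCount = oldCount;
        this.newStart = newStart;
        this.newCount = newCount;
    }

    static HunkHeader parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a hunk header: " + line);
        }
        // A missing count means the hunk covers exactly one line on that side.
        int oldCount = m.group(2) == null ? 1 : Integer.parseInt(m.group(2));
        int newCount = m.group(4) == null ? 1 : Integer.parseInt(m.group(4));
        return new HunkHeader(Integer.parseInt(m.group(1)), oldCount,
                              Integer.parseInt(m.group(3)), newCount);
    }

    public static void main(String[] args) {
        HunkHeader h = HunkHeader.parse("@@ -1 +1 @@");
        // prints "old 1,1 new 1,1"
        System.out.println("old " + h.oldStart + "," + h.oldCount
            + " new " + h.newStart + "," + h.newCount);
    }
}
```

For the output above, this reads as: one line starting at line 1 in the old file was replaced by one line starting at line 1 in the new file.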
As you may have noticed, the sample identifies the old version with a hash string, which is rarely how you would do it in practice. For one thing, it's hidden in the .git/objects folder along with other objects identified by hash strings. Most likely you would use a tag or a branch head to identify a particular version. The resolve() call accepts those names as well, so an expression such as "master^{tree}" should work in place of the hash. Hard-coding the hash is just a trade-off to simplify the sample.
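For a sense of where these hash strings come from, here is a minimal sketch of how Git derives the ID of a file's content (a blob), such as the abbreviated IDs on the index line of the diff output. It assumes the default SHA-1 object format: the ID is the SHA-1 of the header "blob <size>" followed by a NUL byte and then the raw content. Commit IDs like the one hard-coded in the sample are computed the same way over a "commit" header and the commit text. The GitBlobId class name is my own and is not part of JGit.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not a JGit class): computes a Git blob ID with the JDK only.
class GitBlobId {
    static String blobId(byte[] content) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            // Git hashes "blob <size>\0" followed by the raw bytes of the file.
            String header = "blob " + content.length + "\0";
            sha1.update(header.getBytes(StandardCharsets.US_ASCII));
            sha1.update(content);

            // Render the 20-byte digest as 40 lowercase hex characters.
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the old file holds exactly this text with no trailing newline.
        byte[] content = "DoubleCloud.org rocks!".getBytes(StandardCharsets.UTF_8);
        System.out.println(blobId(content));
    }
}
```

If the bytes match what was committed, the printed ID should begin with the abbreviated value shown on the index line. Any difference in encoding or line endings produces a completely different hash, which is one more reason to refer to versions by tag or branch name instead.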
