and it turned out that the idea Craig Venter was really pushing harnessed all of this combinatorics ... will cohere into one. Given this fragment and given this fragment, 200 bases overlap, so we'll put them together and see if we can put the next one in. And if those two overlap and then you have a third one that comes in and it matches, you know you have it right. With a human genome, which is 3 billion A's, C's, G's and T's, you need to be a little smarter. So now you set all sorts of heuristics. So you say, "Well, let's take every pair of reads that overlaps by at least, say, 100 letters," and they have the identical 100 letters. The problem is there's always sequencing error. For every one of those comparisons, you're computing all these details about it. So that is the problem that the computer has to solve as it's doing assembly of all of these pieces. When I first started looking at genomes, the only genomes that were being sequenced were one-celled microbes, bacteria and archaea. Proteins from one organism would have absolutely no match whatsoever in the next organism over, s