Clustering corpus paragraphs for lexical differentiation