“The quick, brown fox jumps over the lazy dog.” and “There is a dog and a fox. The fox, which is brown, jumps over the dog, which is lazy. The fox is quick.” should give a value of 1.
“The quick, brown fox jumps over the lazy dog.” and “There is a dog and a fox. The fox, which is brown, jumps over the dog, which is lazy. The fox is fast.” should give a value of 0.999.
Yes, I know that it’s complicated, but it’s not impossible. Google clearly does something similar when grouping stories together for news.google.com.
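To make the scoring idea concrete, here is a rough sketch of the crudest version I can think of: throw away filler words and compare the sets of words that remain (Jaccard similarity). The stop-word list and the function names are my own inventions for illustration, and this is certainly not how Google does it. It gets the first pair of examples right but falls well short on the second, because it has no idea that "quick" and "fast" mean the same thing.

```python
import re

# Hypothetical stop-word list, only large enough for the fox examples.
STOP_WORDS = {"a", "an", "and", "is", "over", "the", "there", "which"}


def content_words(text):
    """Lowercase the text and return the set of words that are not stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def similarity(a, b):
    """Jaccard similarity of the two content-word sets, from 0.0 to 1.0."""
    wa, wb = content_words(a), content_words(b)
    if not wa and not wb:
        return 1.0  # two texts with no content words are treated as identical
    return len(wa & wb) / len(wa | wb)


original = "The quick, brown fox jumps over the lazy dog."
reworded = ("There is a dog and a fox. The fox, which is brown, jumps over "
            "the dog, which is lazy. The fox is quick.")
synonym = reworded.replace("quick", "fast")

print(similarity(original, reworded))           # 1.0
print(round(similarity(original, synonym), 3))  # 0.714, not the 0.999 I want
```

Getting from 0.714 to something like 0.999 would need knowledge of synonyms (a thesaurus or some learned notion of word meaning), which is where the complicated part starts.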
Then I want to have all news stories automatically compared to corporate press releases. I want my webpage to show me the press release on one side and the news article on the other. I want the news article to be shaded in two different colours: one for sections that are possibly reworded but ultimately just taken from the press release, and one for sections that represent actual work done by the reporter.
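The shading could be sketched along the same lines: split both texts into sentences, score each article sentence against every press release sentence, and label it by its best match. Everything here is an assumption on my part, including the 0.6 threshold, the label names, the tiny stop-word list and the made-up Acme Corp texts. A real version would need a far better similarity measure and would then map the labels to colours on the page.

```python
import re

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "as", "is", "it", "the", "to", "we", "will"}


def content_words(text):
    """Lowercase the text and return the set of words that are not stop words."""
    return {w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOP_WORDS}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def shade(article, press_release, threshold=0.6):
    """Label each article sentence 'from_release' or 'reporter'."""
    release_sets = [content_words(s) for s in split_sentences(press_release)]
    shaded = []
    for sentence in split_sentences(article):
        words = content_words(sentence)
        # Best Jaccard score against any single press release sentence.
        best = max((len(words & r) / len(words | r)
                    for r in release_sets if words | r), default=0.0)
        label = "from_release" if best >= threshold else "reporter"
        shaded.append((label, sentence))
    return shaded


# Made-up texts, purely for illustration.
press_release = ("Acme Corp launched a new widget today. "
                 "The widget is twice as fast as the old model.")
article = ("Today Acme Corp launched a new widget. "
           "Analysts we spoke to doubt it will sell.")

for label, sentence in shade(article, press_release):
    print(label, "|", sentence)
```

On these texts the first article sentence comes out as `from_release` (same words, different order) and the second as `reporter`.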