Occurrence Statistics of Entities on the Web

Goal: estimate the ratio of entities in a given text corpus using maximum mean discrepancy (MMD).
Used existing named entity disambiguators for feature extraction and libsvm for benchmarking and comparison.
Although SVM performed better than MMD for entity ratio estimation in my experiments, the seminar led to a useful survey report on named entity disambiguation techniques and MMD.
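
As an illustration of what MMD-based ratio estimation can look like, here is a minimal Python sketch for the two-class case. It is not the code used in the seminar: the RBF kernel, the `gamma` value, the restriction to two classes, and the function names `rbf_kernel_mean` and `estimate_ratio_mmd` are all assumptions made for the example. The idea is to pick the mixing proportion whose weighted combination of the labeled class mean embeddings is closest, in the kernel feature space, to the mean embedding of the unlabeled corpus.

```python
import numpy as np


def rbf_kernel_mean(X, Y, gamma):
    """Average of k(x, y) = exp(-gamma * ||x - y||^2) over all pairs (x in X, y in Y)."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return float(np.mean(np.exp(-gamma * np.maximum(sq_dists, 0.0))))


def estimate_ratio_mmd(X_pos, X_neg, X_unlabeled, gamma=0.5):
    """Estimate the fraction of positives in X_unlabeled (assumed two-class setup).

    Minimises || mu_U - theta * mu_pos - (1 - theta) * mu_neg ||^2 over theta,
    where mu_* are kernel mean embeddings. The minimiser has a closed form,
    which is then clipped to the valid range [0, 1].
    """
    k_pp = rbf_kernel_mean(X_pos, X_pos, gamma)
    k_nn = rbf_kernel_mean(X_neg, X_neg, gamma)
    k_pn = rbf_kernel_mean(X_pos, X_neg, gamma)
    k_up = rbf_kernel_mean(X_unlabeled, X_pos, gamma)
    k_un = rbf_kernel_mean(X_unlabeled, X_neg, gamma)

    # With a = mu_pos - mu_neg and b = mu_U - mu_neg, theta = <a, b> / <a, a>.
    a_dot_a = k_pp - 2.0 * k_pn + k_nn
    a_dot_b = k_up - k_un - k_pn + k_nn
    if a_dot_a <= 0.0:
        raise ValueError("Class embeddings coincide; the ratio is not identifiable.")
    return float(np.clip(a_dot_b / a_dot_a, 0.0, 1.0))
```

Here `X_pos` and `X_neg` stand for feature vectors of labeled examples (for instance, features produced by an entity disambiguator), and `X_unlabeled` for the corpus whose entity ratio is unknown. An SVM baseline would instead classify each unlabeled item and count the predicted positives.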