Occurrence Statistics of Entities on the Web

Objective

  • To estimate the ratio of entities in a given text corpus using maximum mean discrepancy (MMD).

  • Used existing named entity disambiguators for feature extraction and LIBSVM for benchmarking and comparison.

  • Although SVM performed better than MMD for entity ratio estimation in my experiments, the seminar led to a useful survey report on named entity disambiguation techniques and MMD.
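
To make the objective above concrete, here is a minimal sketch of one common way to estimate a class ratio with MMD in the two-class case. It is an illustration under stated assumptions, not the code used in the seminar: the function names (rbf, mean_kernel, estimate_ratio), the Gaussian kernel, and the gamma value are all hypothetical choices, and the feature vectors are assumed to have already been produced by a disambiguator. The idea is to pick the mixing weight theta for which the weighted combination of the two labelled kernel mean embeddings lies closest to the mean embedding of the unlabelled corpus.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(A, B, gamma=0.5):
    """Average kernel value over all pairs: the inner product of the
    empirical mean embeddings of samples A and B."""
    total = sum(rbf(a, b, gamma) for a in A for b in B)
    return total / (len(A) * len(B))

def estimate_ratio(pos, neg, unlabeled, gamma=0.5):
    """Estimate the fraction of `unlabeled` that belongs to the `pos` class.

    Minimises || theta*mu_pos + (1 - theta)*mu_neg - mu_unl ||^2 over theta,
    which has a closed form for two classes, then clips theta to [0, 1].
    """
    k_pp = mean_kernel(pos, pos, gamma)
    k_nn = mean_kernel(neg, neg, gamma)
    k_pn = mean_kernel(pos, neg, gamma)
    k_pu = mean_kernel(pos, unlabeled, gamma)
    k_nu = mean_kernel(neg, unlabeled, gamma)

    # <mu_pos - mu_neg, mu_unl - mu_neg>
    numerator = k_pu - k_pn - k_nu + k_nn
    # ||mu_pos - mu_neg||^2, the squared MMD between the two labelled classes
    denominator = k_pp - 2.0 * k_pn + k_nn
    if denominator < 1e-12:
        raise ValueError("labelled classes are indistinguishable under this kernel")

    theta = numerator / denominator
    return min(1.0, max(0.0, theta))
```

With hypothetical labelled samples pos and neg and an unlabelled sample corpus, estimate_ratio(pos, neg, corpus) returns the estimated share of the positive class. An SVM baseline, by contrast, would classify each item and count the predicted labels.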

Results/Report