Biomedical named entity extraction: some issues of corpus compatibilities

SpringerPlus

Table 4 Evaluation results of the approach on cross-corpus datasets (all values are percentages). 'FM' denotes F-measure

Approach	Training set	Test set	Recall	Precision	FM
Best Ind. Classifier	JNLPBA (protein only)+AIMed	AIMed	83.14	83.19	83.17
SOO	JNLPBA (protein only)+AIMed	AIMed	85.10	85.01	85.05
Best Ind. Classifier	JNLPBA (protein + DNA)+AIMed	AIMed	82.17	84.15	83.15
SOO	JNLPBA (protein + DNA)+AIMed	3-fold cross validation on AIMed	84.07	86.01	85.03
Best Ind. Classifier	JNLPBA (protein only)+GENETAG	GENETAG	89.44	93.07	91.22
SOO	JNLPBA (protein only)+GENETAG	GENETAG	91.19	94.98	93.05
Best Ind. Classifier	JNLPBA (protein + DNA + RNA)+GENETAG	GENETAG	88.70	93.55	91.06
SOO	JNLPBA (protein + DNA + RNA)+GENETAG	GENETAG	90.09	95.16	92.56
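
The FM column can be related to the recall and precision columns with a short calculation. The sketch below assumes that 'F-measure' means the balanced F-measure, i.e. the harmonic mean of recall and precision; the table does not state which variant is used, but the reported values agree with this definition up to rounding. The function name `f_measure` and the list `ROWS` are illustrative only; the figures are copied from Table 4.

```python
def f_measure(recall, precision):
    """Balanced F-measure: harmonic mean of recall and precision.

    Assumption: the table reports the balanced (beta = 1) variant.
    Inputs and output are percentages.
    """
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, reported FM), copied from the rows of Table 4
ROWS = [
    (83.14, 83.19, 83.17),
    (85.10, 85.01, 85.05),
    (82.17, 84.15, 83.15),
    (84.07, 86.01, 85.03),
    (89.44, 93.07, 91.22),
    (91.19, 94.98, 93.05),
    (88.70, 93.55, 91.06),
    (90.09, 95.16, 92.56),
]

for recall, precision, reported in ROWS:
    computed = f_measure(recall, precision)
    # Print the recomputed value next to the reported one for comparison
    print(f"R={recall:.2f} P={precision:.2f} "
          f"computed FM={computed:.2f} reported FM={reported:.2f}")
```

Under this assumption, each recomputed value lies within about 0.01 of the reported FM, which is consistent with rounding to two decimal places.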