Incorporating Prior Domain Knowledge into a Support Vector Machine Classifier with Explanation-Based Learning

$342,101 | FY2004 | CSE | NSF

University of Illinois at Urbana-Champaign, Urbana, IL

Investigators

Abstract

This project addresses a central problem in machine learning: can background knowledge, even when approximate and imperfect, improve learning accuracy and efficiency? The project combines inductive learning using Support Vector Machines (SVMs) with a new variant of explanation-based learning (EBL). The test domain is the recognition of handwritten Chinese characters. A limited training set is augmented with background knowledge of pen strokes: how they generate characters and how they can result in image pixels. EBL and SVMs are combined in two ways: (1) "phantom examples" are generated to enable learning with fewer input examples, and (2) EBL is used to generate new kernel functions for the SVM. The expected scientific advances include machine recognition of pictogram characters, concept learning in domains where training examples are scarce, and explanation-based learning with imperfect domain theories. Potential broader impacts include more automated preservation and dissemination of historical Chinese texts through the production of clean machine-readable copies, and easier application of conventional machine translation systems to pictogram languages without the stumbling block of image input. Automatic processing of pictogram and other line-drawing input may in turn enable new computer-based educational applications and opportunities. Integrating prior knowledge with inductive machine learning may also lead to more cognitively plausible algorithms that learn from training experience on a more human scale.
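The two ways of combining background knowledge with an SVM can be illustrated with a small sketch. This is not the project's actual method: the 8x8 grid, the straight-line stroke model, the endpoint jitter, and the row/column ink features are all hypothetical choices made for illustration. The sketch shows (1) phantom examples produced by perturbing a stroke-level description of a character and re-rendering it to pixels, and (2) a kernel built from knowledge-derived features rather than raw pixels.

```python
import random

GRID = 8  # hypothetical image size (GRID x GRID pixels), chosen for illustration


def render(strokes, size=GRID):
    """Rasterize straight strokes ((x0, y0), (x1, y1)) into a flat 0/1 pixel list."""
    img = [0] * (size * size)
    for (x0, y0), (x1, y1) in strokes:
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for i in range(steps + 1):
            x = round(x0 + (x1 - x0) * i / steps)
            y = round(y0 + (y1 - y0) * i / steps)
            if 0 <= x < size and 0 <= y < size:
                img[y * size + x] = 1
    return img


def phantom_examples(strokes, n, jitter=1, seed=0):
    """Generate n perturbed renderings of one stroke-level description.

    Assumption: small random shifts of stroke endpoints stand in for
    handwriting variation; a real domain theory would be far richer.
    """
    rng = random.Random(seed)

    def shift(p):
        return (p[0] + rng.randint(-jitter, jitter),
                p[1] + rng.randint(-jitter, jitter))

    return [render([(shift(a), shift(b)) for a, b in strokes]) for _ in range(n)]


def stroke_features(img, size=GRID):
    """Hypothetical knowledge-derived features: ink count per row and per column."""
    rows = [sum(img[r * size:(r + 1) * size]) for r in range(size)]
    cols = [sum(img[c::size]) for c in range(size)]
    return rows + cols


def stroke_kernel(a, b):
    """Inner product of explicit feature maps, so it is a valid (PSD) kernel."""
    fa, fb = stroke_features(a), stroke_features(b)
    return sum(x * y for x, y in zip(fa, fb))


# A toy cross-shaped character: one horizontal and one vertical stroke.
cross = [((1, 4), (6, 4)), ((4, 1), (4, 6))]
phantoms = phantom_examples(cross, n=5)
```

In such a setup, the phantom images would be appended to the scarce real training set, and a function like `stroke_kernel` would be passed to any SVM implementation that accepts a user-defined kernel.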
