Template Matching Methods for Thermal Infrared Human Target Recognition

The template matching method first obtains templates of the whole human body or of body parts and organizes them into a template set. Recognition is then performed by matching each candidate target against the template set under a chosen distance measure. Matching based on whole-body templates is usually fairly robust and, provided the template set is complete, works for both stationary and moving human targets. A complete template set is hard to prepare, however, because posture, size, and background vary widely, and matching fails entirely when the scene is complex or the pose has not been modeled.

According to their properties, whole-body templates are divided into contour templates, grayscale templates, and probabilistic templates. Common human body matching templates are shown in the following figure.

A contour template captures the contour shape of the human target in a given pose. After the contours are processed with a distance transform and compared under a measure such as the Chamfer distance or the Hausdorff distance, similar contours are clustered according to their distance differences, forming a coarse-to-fine hierarchy of human contour templates that is used for recognition. Because human contours in thermal infrared images are blurred, the contours are generally obtained by manual outlining, which is tedious and labor-intensive.
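The two distance measures named above can be illustrated with a small sketch. It assumes contours are given as lists of (x, y) points and uses a brute-force nearest-point search instead of a precomputed distance transform; the function names and the toy templates are hypothetical and only serve the illustration.

```python
import math


def nearest_distance(point, points):
    """Euclidean distance from `point` to the closest point in `points`."""
    px, py = point
    return min(math.hypot(px - qx, py - qy) for qx, qy in points)


def directed_chamfer(template, candidate):
    """Mean nearest-point distance from the template contour to the candidate."""
    return sum(nearest_distance(p, candidate) for p in template) / len(template)


def hausdorff(contour_a, contour_b):
    """Symmetric Hausdorff distance: the worst nearest-point distance in either direction."""
    a_to_b = max(nearest_distance(p, contour_b) for p in contour_a)
    b_to_a = max(nearest_distance(p, contour_a) for p in contour_b)
    return max(a_to_b, b_to_a)


def best_template(candidate, templates):
    """Return (name, distance) of the template closest to the candidate contour."""
    scored = ((name, directed_chamfer(points, candidate))
              for name, points in templates.items())
    return min(scored, key=lambda item: item[1])


# Toy contours (hypothetical data, not taken from any thermal image).
templates = {
    "standing": [(0, 0), (0, 1), (0, 2), (0, 3)],
    "crouching": [(0, 0), (1, 0), (1, 1), (0, 1)],
}
candidate = [(0, 0), (0, 1), (0, 2), (1, 3)]
print(best_template(candidate, templates))
```

In a coarse-to-fine hierarchy, the same comparison would first be run against cluster representatives, and only the members of the best-matching cluster would be compared in detail.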

A grayscale template is a manually specified standard grayscale image of a human target. Whether a candidate is a human body is decided by directly correlating the candidate with the grayscale template, or by comparing statistics such as the gray-level histogram distribution and the moment of inertia (second-order moment). Because human poses are complex, both contour and grayscale template sets tend to be large, so efficient matching is very important. In practice, a tree-structured hierarchical matching strategy [100] or the construction of probabilistic templates [101] serves this goal well.
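One common way to correlate a candidate with a grayscale template is zero-mean normalized cross-correlation. The sketch below assumes that both images have the same size and are given as lists of rows of gray values; the decision threshold of 0.8 is an arbitrary illustrative value, not one taken from the text.

```python
import math


def normalized_cross_correlation(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized images."""
    a = [value for row in patch for value in row]
    b = [value for row in template for value in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A flat image has zero variance; treat it as uncorrelated.
    return numerator / denominator if denominator > 0 else 0.0


def matches_template(patch, template, threshold=0.8):
    """Accept the candidate when its correlation with the template is high enough."""
    return normalized_cross_correlation(patch, template) >= threshold
```

Because the measure subtracts the mean and divides by the spread, a candidate that is uniformly brighter or has higher contrast than the template still correlates strongly, which is useful when the apparent temperature of the target varies.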

The probabilistic template is an improvement over the raw image template. To apply it, the gray-level probability distribution of the human target is first learned from a large number of training images containing human targets in different postures. The gray level of each pixel of the candidate target is then mapped through this distribution, and a threshold decides whether the pixel belongs to the target. Because it accounts for multiple poses, the probabilistic template is more robust than contour and grayscale templates, which have fixed postures. Recognition improves further if the template is trained on large sets of both positive samples and interference samples, so that it reflects the probability distributions of human targets and of interfering targets at the same time.
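The train, map, and threshold steps can be sketched as follows. This is a minimal illustration that assumes the training images are already cropped to human-target pixels, models the distribution as a coarse gray-level histogram, and uses only positive samples; the bin count and the threshold are hypothetical choices.

```python
def train_gray_distribution(training_images, bins=8, levels=256):
    """Estimate the probability of each gray-level bin from target pixels."""
    counts = [0] * bins
    for image in training_images:
        for row in image:
            for value in row:
                counts[value * bins // levels] += 1
    total = sum(counts)
    return [count / total for count in counts]


def classify_pixels(candidate, distribution, threshold, levels=256):
    """Mark a pixel as target when the probability of its gray level reaches the threshold."""
    bins = len(distribution)
    return [
        [distribution[value * bins // levels] >= threshold for value in row]
        for row in candidate
    ]


# Toy data: warm (bright) pixels dominate the hypothetical training crops.
training = [[[200, 210], [220, 30]], [[205, 215], [225, 230]]]
distribution = train_gray_distribution(training)
print(classify_pixels([[201, 40], [228, 100]], distribution, threshold=0.2))
```

To also reflect interference samples, a second distribution could be trained on them in the same way and the decision made on the ratio of the two probabilities rather than on a single one.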

Part-based template matching builds on the detection of human body parts. Once the detection results for each part are available, the constraints between the parts are used to obtain the recognition result. This approach handles occlusion and complex human poses better, but it raises two difficulties: how to define the parts, and how to integrate the detection information from multiple parts. There are two main answers to the first problem. One is to divide the person according to physiological structure, for example into head and shoulders, torso, and lower body. The other is to divide the image into multiple blocks according to the characteristics of the image itself. For the second problem, the more common practice is to define a mathematical model and integrate the information by solving for the extremum of the model, where the extremum corresponds to the best result.
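As an illustration of integrating part detections by finding the extremum of a model, the sketch below scores every combination of part detections as the sum of detector scores minus a penalty for deviating from an expected vertical layout, and keeps the combination with the highest score. The model, the expected offsets, and the penalty weight are all assumptions made for this example and do not come from any of the cited methods.

```python
from itertools import product

# Hypothetical expected vertical offset (pixels) of each part below the head-shoulder part.
EXPECTED_OFFSET = {"torso": 20, "legs": 50}


def configuration_score(config, penalty=0.05):
    """Score one configuration; `config` maps part name to (y, detector_score)."""
    score = sum(detector_score for _, detector_score in config.values())
    head_y = config["head_shoulders"][0]
    for part, expected in EXPECTED_OFFSET.items():
        actual = config[part][0] - head_y
        score -= penalty * abs(actual - expected)
    return score


def best_configuration(detections, penalty=0.05):
    """Exhaustively search all part combinations for the maximum model score."""
    parts = list(detections)
    best = None
    for combo in product(*(detections[part] for part in parts)):
        config = dict(zip(parts, combo))
        score = configuration_score(config, penalty)
        if best is None or score > best[1]:
            best = (config, score)
    return best
```

Exhaustive search is only practical for a handful of candidates per part; the point of the sketch is that a detection with a higher individual score can lose to one that fits the spatial constraints better.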

Following these ideas, the adaptive classifier combination algorithm proposed by Mohan et al. [102] divides the human body into four parts: head, lower body, left shoulder, and right shoulder. The algorithm extracts Haar-like features for each part to train a separate part detector, then uses the outputs of the four part detectors as the input for training a combined detector, which integrates the detection information of the parts. The reported figure is 92%.

Bo Wu proposes a method based on Bayesian inference, approaching the problem from a probabilistic perspective [103]. The algorithm divides the human body into three parts (head and shoulders, torso, and legs) and trains a detector for each. Once the detection results of all parts are available, Bayesian inference integrates them to obtain the pedestrian position.

Leibe's implicit shape model [104] differs from both methods above. It does not predefine body parts but treats the human body as a collection of image patches. Any patch may correspond to a head or a shoulder, and the specific size and position of each patch are determined by the algorithm. In the training phase the algorithm builds a patch dictionary, that is, an index over all patches, so that the information associated with a patch can later be retrieved by a dictionary lookup. In the recognition phase the algorithm first detects interest points, extracts the image patches around them, and finds matches in the dictionary. Each matched patch then votes for the position of the body center, using its position relative to the body center recorded during training, and any center point whose vote count exceeds a given value is treated as a possible detection. Finally, the recognition result is output after verification. The algorithm not only produces recognition results but also segments the human target at the same time. It is a typical method that performs recognition first and detection second.
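The dictionary lookup and center voting can be sketched in a few lines. This is a heavily simplified illustration, not the implicit shape model as published: it assumes each patch has already been matched to a discrete dictionary entry (a string label here), casts unweighted votes into a coarse grid, and omits the verification and segmentation stages. All names and data are hypothetical.

```python
from collections import Counter


def build_dictionary(training_patches):
    """Index patches: entry label -> list of (dx, dy) offsets from patch to body center."""
    dictionary = {}
    for label, offset in training_patches:
        dictionary.setdefault(label, []).append(offset)
    return dictionary


def vote_for_centers(observations, dictionary, cell=4):
    """Each matched patch votes for the grid cells where the body center could lie."""
    votes = Counter()
    for label, (x, y) in observations:
        for dx, dy in dictionary.get(label, []):
            votes[((x + dx) // cell, (y + dy) // cell)] += 1
    return votes


def detect_centers(observations, dictionary, min_votes, cell=4):
    """Return the grid cells whose vote count reaches `min_votes`."""
    votes = vote_for_centers(observations, dictionary, cell)
    return sorted(center for center, count in votes.items() if count >= min_votes)
```

Patches that are ambiguous, such as a shoulder that could sit on either side of the body, scatter single votes across several cells, while votes from consistent patches accumulate in one cell, which is why a vote threshold separates plausible centers from noise.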
