Greetings,
Figured I would share the small OCR cracker I made in Python. It uses the Python Imaging Library (PIL) to process the image. It converts the image to greyscale and finds the darkest pixel value with getextrema(), then masks the image so only the darkest pixels (the outline of the pet) remain. It then uses getbbox() to get a virtual rectangle (left, top, right, bottom) around the focal point of the pet. Finally, the x and y coords I use are the center of that rectangle (the most human point).
Code:
import Image  # classic PIL import; with Pillow use: from PIL import Image

im = Image.open("capt.jpg")
im = im.convert("L")              # greyscale
lo, hi = im.getextrema()          # darkest and lightest pixel values
im = im.point(lambda p: p == lo)  # keep only the darkest pixels
rect = im.getbbox()               # (left, top, right, bottom) around them
x = 0.5 * (rect[0] + rect[2])     # center of the rectangle
y = 0.5 * (rect[1] + rect[3])
So far it's been 100% accurate, and the points it has chosen have been very human-like. For those interested, this code will be integrated into Neolib with the next commit.
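For anyone who wants to see the darkest-pixel/bounding-box/center idea without needing PIL or a captcha file, here is a minimal sketch of the same logic on a plain 2D list of greyscale values. The function name and the synthetic grid are my own for illustration, not part of the library:

def darkest_bbox_center(pixels):
    # Find the darkest value in the grid, then return the center (x, y)
    # of the bounding box of all pixels at that value. This mirrors the
    # getextrema() / point() / getbbox() steps from the snippet above.
    lo = min(min(row) for row in pixels)
    xs = [x for row in pixels for x, p in enumerate(row) if p == lo]
    ys = [y for y, row in enumerate(pixels) for p in row if p == lo]
    # getbbox()-style box: right and bottom edges are exclusive
    left, right = min(xs), max(xs) + 1
    top, bottom = min(ys), max(ys) + 1
    return 0.5 * (left + right), 0.5 * (top + bottom)

# 5x5 white grid with a dark 2x2 blob at columns 1-2, rows 2-3
grid = [[255] * 5 for _ in range(5)]
for y in (2, 3):
    for x in (1, 2):
        grid[y][x] = 10

print(darkest_bbox_center(grid))  # (2.0, 3.0)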