Face detection and recognition in shotwell
18 Aug 2018After dabbling a bit with OpenFace, I wanted to add similar face detection and recognition abilities to a typical Linux desktop photo app. So I discovered Shotwell, which is a photo manager for Gnome. Shotwell had a partial implementation of face detection (no recognition) which was under a build define and not enabled in the releases. With that code as the starting point, I started integrating the ideas from OpenFace into Shotwell.
The WIP source code is here. Dependencies are:
- OpenCV >= 2.3.0
- DNN models from OpenFace
Enable face-detect
Shotwell uses meson for its build system. The face detection code is disabled by default in the build configuration. Do the following in the meson build directory to enable it:
meson configure -Dface-detection=true
User interface
The UI for face detection and recognition is still evolving and needs some refinement before it could be enabled in the build by default.
After building shotwell with the face-detection flag enabled, a new button, Faces, shows up in the photo UI.
Faces can either be manually entered by drawing a rectangle on the photo, or auto-detected using the ‘Detect Faces’ button. After running face detection on the photo, rectangles are drawn around the faces detected. To tag a face, the name of the face is entered in place of the ‘Unknown face’ placeholder using the ‘Edit’ button. If not changed, the face is not saved into the shotwell database. Once tagged, a new list called ‘Faces’ appears in the sidebar. This list contains all the faces tagged in current photo database.
In order to use a face in a photo as a reference for automatic recognition in other photos, click on the face name in the ‘Faces’ sidebar. Then select the photo containing the face and click the menu item in Faces -> Train Face … From Photo.
There are more enhancements required in the interface for providing list of suggestions during recognition and batch multiple photos for recognition.
Architecture
The face detection and recognition has to happen in a separate process since OpenCV and shotwell can require different versions of GTK in some installations. The earlier implementation used to spawn the face detection process (a.k.a facedetect) as a child each time the ‘Detect Faces’ button on the GUI was clicked. This is not very efficient and does not have a flexible API. So the first change was to move to DBus for talking to the external process. The DBus daemon is used to start the facedetect process if it is not already running, using a DBus service file.
facedetect API
boolean LoadNet(string dir)
:- Loads the OpenFace DNN models,
res10_300x300_ssd_iter_140000_fp16.caffemodel
for detection andopenface.nn4.small2.v1.t7
for recognition - Returns true if load was successful
- Loads the OpenFace DNN models,
FaceRect[] DetectFaces(string image, string cascade, float scale, boolean infer)
:- Loads
image
and runs it through DNN based face detection if DNN loaded, else runs it through HAAR cascade based detection using thecasacade
file name passed in - If
infer
is true, each face is converted to 128 element embedding - Returns a list of
FaceRect
s that contain the bounding box and embedding vector per face
- Loads
Using the API
The face detection flow in shotwell calls DetectFaces
for a photo with infer
set to true. The facedetect process
returns a list of face bounding boxes and embedding vectors. The face table maintained in the shotwell database has an
additional column to store the 128 element embedding for each face (it had just the bounding box previously).
Face recognition is done by computing the dot-product of the embedding vector of faces. Once a reference face is set using the menu item in the Faces page, shotwell uses the embedding vector from that photo and computes the dot-product with faces detected in other photos (via the ‘Detect Faces’ button).
A threshold of 70% is used to determine if a face matches the reference face. Any such matches are automatically tagged by shotwell before presenting the result dialog.
Next steps
- Improve UI for face detection and recognition
- Sanity check faces for aspect ratio of bounding box and duplicate matches in a photo
- Batch processing of photos to label faces
- Align faces before generating embedding vector
- Implement other clustering techniques for face recognition
- Use the OpenFace API, but it depends on dlib which is not available as a package on many distros
Credits
- OpenCV - this sample code in particular
- OpenFace - for the DNN models