This paper focuses on how to measure visual similarity effectively and efficiently for local-feature-based representations. Among existing methods, metrics based on Bag of Visual Words (BoV) techniques are efficient and conceptually simple, at the expense of effectiveness. By