Web Defacement Detection Trends
In general, web defacement detection has focused on detecting unauthorized changes to the web server, that is, changes not made by the site's developers, using File Integrity Monitoring (FIM), a Host-based Intrusion Detection System (HIDS), or both. Either approach will detect changes made to the site's files. However, both lack the ability to detect the defacement techniques most prevalent today: code and/or data injection attacks and DNS hijacking. These attacks do not actually modify the code or configuration of the website; instead, they introduce new content or redirect the user to a different website.
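To make the FIM idea concrete, here is a minimal sketch of a hash-based integrity check. It assumes the site is a directory of files on disk; the function names `snapshot` and `diff_snapshots` are hypothetical and do not come from any particular FIM product.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under root to the SHA-256 digest of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(baseline, current):
    """Report files that were added, removed, or modified since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            name
            for name in set(baseline) & set(current)
            if baseline[name] != current[name]
        ),
    }
```

A check like this only sees files on the server. Content injected into a database or a visitor redirected through DNS hijacking leaves every digest unchanged, which is the gap described above.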
On the other hand, techniques that use more complex algorithms, such as web embeddings that use n-grams to construct a vectorized profile of the site, handle these problems efficiently (see Figure 1). Because the vectorized profile is a representation of the site's content itself, comparing the profile before and after a change reveals significant discrepancies when an attack or defacement happens.
One thing to note about this method is that it uses text to construct the “profile”. So what if the attacker leaves the text content untouched and instead changes the pictures, banners, or even hyperlinks on the site?
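As a rough illustration of an n-gram profile, the sketch below counts character trigrams in the page text and compares two profiles with cosine similarity. The choice of character n-grams, `n=3`, and cosine similarity are assumptions made for this example; the function names are hypothetical.

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Count overlapping character n-grams after normalizing case and whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


baseline = ngram_profile("Welcome to our company homepage")
minor_edit = ngram_profile("Welcome to our company home page")
defaced = ngram_profile("Hacked by someone")

# A small editorial change stays close to the baseline; a defacement does not.
print(cosine_similarity(baseline, minor_edit))
print(cosine_similarity(baseline, defaced))
```

In practice a threshold on the similarity score would decide when to raise an alert; picking that threshold depends on how often the legitimate content changes.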
Figure 1: Example of profile constructed from the website's contents
For this case, a more advanced method called image-based object recognition comes into play. This method converts a screenshot of the site into matrices and uses them as the representation of the site. When a change happens (see Figure 2), comparing the matrices before and after shows significant discrepancies, and the comparison also covers changes made to non-text content. However, even with these advantages, the technique requires extensive computational resources to construct the profile, since it needs to process every pixel of the screenshot.
Figure 2: Example of profile constructed from the website's screenshots
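The sketch below shows the matrix comparison in its simplest form. It is plain pixel differencing, not object recognition, and it assumes the screenshots have already been decoded into grayscale matrices (lists of rows with values from 0 to 255). The function name is hypothetical.

```python
def mean_abs_difference(baseline, current):
    """Return the mean absolute pixel difference, scaled to the range 0.0 to 1.0."""
    same_shape = len(baseline) == len(current) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(baseline, current)
    )
    if not same_shape:
        raise ValueError("screenshots must have the same dimensions")

    total = 0
    count = 0
    # Every pixel is visited once, which is why cost grows with screenshot size.
    for row_a, row_b in zip(baseline, current):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / (255 * count) if count else 0.0


before = [[0, 0], [0, 0]]
after = [[255, 255], [0, 0]]  # top half of the page replaced
print(mean_abs_difference(before, after))  # 0.5
```

Because the comparison works on pixels, a swapped banner or image changes the score even when the page text is identical.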
To tackle this, many security engineers compare changes using other representations with lower computational requirements, such as a compressed screenshot, the DOM tree, or even the raw HTML source. The point is to transform part of the site, usually its content, into a form that is easier to compare.
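One possible way to use the DOM tree as a lightweight representation is sketched below. It flattens the HTML into a list of tag paths, keeps `href` and `src` values so that link and image changes are visible, and hashes the result. The class and function names are hypothetical, and the choice to ignore text nodes is an assumption made for this example.

```python
import hashlib
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StructureExtractor(HTMLParser):
    """Collect one path such as 'html/body/img[src=logo.png]' per element."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = []

    def handle_starttag(self, tag, attrs):
        path = "/".join(self.stack + [tag])
        refs = [f"{k}={v}" for k, v in attrs if k in ("href", "src") and v]
        self.paths.append(path + ("[" + ",".join(refs) + "]" if refs else ""))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag (tolerates unclosed tags).
            while self.stack.pop() != tag:
                pass


def structure_paths(html):
    """Return the ordered list of element paths found in the HTML."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return parser.paths


def dom_fingerprint(html):
    """Hash the element paths into a single digest that is cheap to compare."""
    return hashlib.sha256("\n".join(structure_paths(html)).encode("utf-8")).hexdigest()
```

Since text nodes are ignored here, a fingerprint like this would be paired with a text-based profile rather than used on its own.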
SentryPage takes this kind of approach: it is an AI-based service that monitors your website and detects defacement by taking advantage of a representation of the web content. SentryPage will alert you if any funny business happens or has been done to your site. For further information about SentryPage, do not hesitate to Contact Us.
- Borgolte, K. et al. 2015. Meerkat: Detecting Website Defacements through Image-based Object Recognition.