Deep learning for vulnerability discovery in web applications represented in intermediate languages

Date: 05/11/2019

1. Project Title/ Job position title: Deep learning for vulnerability discovery in web applications represented in intermediate languages

2. Area of knowledge: Physical Sciences, Mathematics and Engineering Panel

3. Group of disciplines: Theoretical and Applied Mathematics, Computer Sciences and IT

4. Research project/ Research Group description:
Web applications are the most common vehicle for accessing services and resources in enterprises. However, they often contain vulnerabilities that can be exploited remotely, causing serious damage to organizations and exposing private user information. Essential services, such as banking and healthcare, demand trustworthy applications, so it is crucial that they are programmed with security in mind, preventing successful attacks that can disturb or interrupt their operation.
Despite the advances made in web application security, companies have not been able to substantially decrease the number of vulnerabilities reported annually. A key factor behind this observation is the growth in complexity caused by the differing semantics of the multiple languages that can make up an application, which complicates the work of tools that inspect programs in search of flaws. One way to circumvent this complexity is to perform the analysis on an intermediate language representation of the web application.
In the project, we investigate techniques for analyzing the source code of web applications represented in an intermediate language, with the goal of discovering vulnerabilities and then automatically removing the errors found by applying patches to the source code, i.e., performing code correction. To this end, we plan to use techniques from the code analysis area, such as static and dynamic analysis, and from the artificial intelligence area, focusing on deep learning and natural language processing (NLP). Recently, we have applied a few of these techniques to specific scenarios with promising results, but in the project we intend to extend them to build tools that are highly accurate and scalable to large codebases, with the final aim of improving the security of the web. These tools will encompass both the identification and the correction of vulnerabilities, with correction being a promising and challenging research area.

5. Job position description:
The student will be involved in the various tasks required for building a successful tool for the discovery and correction of vulnerabilities, from the design of the solution to its evaluation with real web applications. In more detail:
– Investigate different classes of flaws that might affect web applications
– Build a dataset of applications that contain representative vulnerabilities, both in a programming language and in an intermediate language representation
– Research alternative techniques that could be employed to locate the flaws
– Study machine learning methods that could be used to find the vulnerabilities
– Research methods that could be applied to correct the code for removing flaws
– Build a tool based on the investigated techniques
– Test and evaluate the tool with relevant web applications and report discovered vulnerabilities to developers, providing them with a possible correction of their code
The project is developed with members of the Navigators group of the LASIGE research lab. Several members of the group (and lab) are involved in research activities that aim to enhance the correctness of applications in general, with fruitful and outstanding results in the past.
The work is defined in the context of several European consortia, and collaborations with other teams are envisioned.

6. Group leader:
* title: Professors
* full name: Nuno Ferreira Neves and Ibéria Medeiros
* email: and
* research project/ research group website:
* website description: the site belongs to one of the groups of the LASIGE Lab (