GitHub is applying machine learning techniques to its rule-based security code scanning capabilities, extending them to less common vulnerability patterns by automatically inferring new rules from the existing ones, noted InfoQ.
To detect vulnerabilities in a repository, the CodeQL engine first builds a database that encodes a relational representation of the code. It then executes a series of CodeQL queries against that database, each designed to find a particular type of security problem, according to GitHub’s blog post.
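As a rough illustration of what such a query looks like, the sketch below is a minimal CodeQL query in the style of the JavaScript libraries; the specific sink name (`query`) and the alert message are illustrative assumptions, not part of GitHub's actual query suites:

```ql
/**
 * @name Possibly unsafe database call (illustrative sketch)
 * @kind problem
 */
import javascript

// Flag calls to a hypothetical sink function named "query";
// real security queries model many known framework APIs.
from CallExpr call
where call.getCalleeName() = "query"
select call, "Possibly unsafe call to a database API."
```

Each query like this targets one pattern in known APIs; the machine learning approach described in the article aims to generalize such queries beyond the frameworks they explicitly model.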
CodeQL uses a machine learning model to extend an existing security query to cover a wider range of frameworks and libraries. Because the model is trained to recognize problems in code it has never seen before, queries that use it can surface results in frameworks and libraries not described in the original query.