Add more data sources and improve data quality

Organization: AboutCode

Project: VulnerableCode

Mentee: Ambuj Kulshreshtha (ambuj-1211)

Mentors:

Overview

There is a large backlog of open tickets for new data sources. This project focuses on adding more vulnerability data sources and consuming their advisories. I chose to work on the following issues: Collect advisories for AlmaLinux #1201, Collect vulnerabilities from Amazon Linux #72, Collect Oracle Linux #75, Add data in CSAF format #1315, VCIO does not collect some Severity (cvssv3.1) scores for a CVE #1238, Add CWE support in all importers #1093, and Collect rockylinux advisories #753. Consuming these data sources will help build a more comprehensive vulnerability database.

Implementation

  • Created Importers to add more advisory data from different data sources:

    • I have added a few new importer modules to the VulnerableCode project to incorporate advisory data from different data sources. Some of the importers I created include the Curl Importer, RockyLinux Importer, AlmaLinux Importer, and Amazon Linux Importer. I also worked on creating an importer to retrieve data in CSAF format from the cisagov repo.

  • Added CWE support in multiple importers:

    • Many importers did not include CWE information, as noted in Add CWE support in all importers #1093. I addressed this issue by adding CWE data to multiple importers. Several importers still lack CWE data because their upstream data sources do not provide it; I will add CWE data for them if those sources are updated.

  • Found bugs in some vulnerabilities

  • Testing:

    • I wrote doctests for each importer, documenting each function in the module in terms of its parameters and return values.

    • I wrote unit tests for each module I built to verify that it works correctly.
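As an illustration of the doctest style described above, a minimal sketch might look like the following. The function name `extract_cwe_ids` and the regular expression are assumptions made for this example; they are not taken from the VulnerableCode codebase, and the real importers may structure this differently.

```python
import re

# Hypothetical helper for illustration only (not part of VulnerableCode).
# Matches identifiers such as "CWE-79", case-insensitively.
CWE_PATTERN = re.compile(r"CWE-(\d+)", re.IGNORECASE)


def extract_cwe_ids(text):
    """
    Return the sorted, de-duplicated list of integer CWE ids found in ``text``.

    >>> extract_cwe_ids("Heap overflow (CWE-122) and CWE-787: out-of-bounds write")
    [122, 787]
    >>> extract_cwe_ids("cwe-79 CWE-79")
    [79]
    >>> extract_cwe_ids("no weakness listed")
    []
    """
    # findall returns the captured digit strings; a set removes duplicates.
    return sorted({int(cwe_id) for cwe_id in CWE_PATTERN.findall(text)})


if __name__ == "__main__":
    import doctest

    # Runs the examples embedded in the docstring above.
    doctest.testmod()
```

Running the module directly executes the docstring examples, so the documentation and the checks stay in one place.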

Linked Pull Requests

Sr. no | Name                                    | Link                              | Status
-------+-----------------------------------------+-----------------------------------+-------
1      | Added Curl Advisories                   | aboutcode.org/vulnerablecode#1439 | Open
2      | Added AlmaLinux Advisories              | aboutcode.org/vulnerablecode#1491 | Open
3      | Added CWE support in multiple importers | aboutcode.org/vulnerablecode#1526 | Open
4      | Added RockyLinux advisories             | aboutcode.org/vulnerablecode#1535 | Open
5      | Added Amazon Linux advisories           | aboutcode.org/vulnerablecode#1569 | Open

Pre GSoC Work

I began contributing to AboutCode with the Add Curl Advisories issue, in which I added the curl advisories data source to the VulnerableCode database. This issue helped me to:

  • Understand the importers.

  • Understand the database models of VulnerableCode.

  • Understand the structure of AdvisoryData.

  • Explore many components, such as PackageURL, AffectedPackage, and Severities.

Post GSoC

I am committed to working on the open pull requests until they are merged, addressing reviews and feedback from the mentors. I will prioritize completing any remaining tasks related to my GSoC work. This includes fixing bugs that affect the few CVEs missing CVSSv3 severity scores and references from NVD. Once these tasks are completed, I plan to explore and contribute to more projects within AboutCode.

Acknowledgements

I would like to thank my mentors:

This summer was full of new challenges and learning. I got to learn a lot from everyone on the team. The weekly status calls were incredibly helpful in solving all my doubts. It was fun building for AboutCode, and I will continue to contribute to the codebase of VulnerableCode and other projects as well. I plan to explore more projects in AboutCode and contribute to them because I would love to be a part of this wonderful project.

Thank you, everyone, for your continuous support and belief in me. Your guidance and encouragement have been invaluable, and I am truly grateful for all the help and trust you’ve shown me throughout this journey.