Symbol and String Collection
The package indexing endpoint now also supports the symbol and string collection
pipelines and stores the collected symbols and strings in the extra_data field of each resource.
How it works
When PurlDB receives an index request for a PURL via the /api/collect
endpoint along with the symbol/string addon_pipeline, it fetches the archive
download_url and creates a package for the PURL with relevant metadata.
Next, a scan job is scheduled; it downloads the archive for the PURL
and runs the scan_single_package pipeline. The scan job also runs the requested addon_pipelines.
Upon completion of the scan job, the package is updated with resource data, and the
collected source_symbols and source_strings are stored in the extra_data field of the resources.
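To illustrate where the collected data ends up, here is a minimal sketch that pulls source_symbols and source_strings out of a list of resources. The resource dictionaries, the "path" key, and the helper name are assumptions made for this example; only the extra_data, source_symbols, and source_strings names come from the description above, and the real API response may be shaped differently.

```python
def collect_symbols_and_strings(resources):
    """Map each resource path to the symbols and strings found in its extra_data.

    `resources` is assumed to be a list of dicts with a "path" key and an
    optional "extra_data" dict (a hypothetical shape for this sketch).
    """
    collected = {}
    for resource in resources:
        extra_data = resource.get("extra_data") or {}
        symbols = extra_data.get("source_symbols", [])
        strings = extra_data.get("source_strings", [])
        # Skip resources for which nothing was collected.
        if symbols or strings:
            collected[resource.get("path")] = {
                "source_symbols": symbols,
                "source_strings": strings,
            }
    return collected


# Hypothetical sample data, not real PurlDB output.
sample_resources = [
    {"path": "src/main.c", "extra_data": {"source_symbols": ["main", "parse_args"]}},
    {"path": "po/messages.po", "extra_data": {"source_strings": ["Hello"]}},
    {"path": "README", "extra_data": {}},
]

result = collect_symbols_and_strings(sample_resources)
```

Resources with an empty extra_data field are left out of the result, so the mapping only lists files for which a pipeline actually collected something.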
Currently, PurlDB supports the following addon_pipelines for symbol and string collection:
collect_symbols_ctags
collect_strings_gettext
collect_symbols_tree_sitter
collect_symbols_pygments
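As a rough sketch of how an index request with these pipelines might be assembled, the helper below builds a /api/collect URL and rejects pipeline names that are not in the list above. The base URL, the trailing slash, the "purl" query parameter, and the use of a repeated "addon_pipeline" query parameter are assumptions for illustration; check the API documentation of your PurlDB instance for the exact request format.

```python
from urllib.parse import urlencode

# Pipeline names taken from the list above.
SUPPORTED_ADDON_PIPELINES = (
    "collect_symbols_ctags",
    "collect_strings_gettext",
    "collect_symbols_tree_sitter",
    "collect_symbols_pygments",
)


def build_collect_url(base_url, purl, addon_pipelines):
    """Build a collect URL for `purl` with the requested addon pipelines.

    The query parameter names used here are assumptions, not a confirmed API.
    """
    unknown = [name for name in addon_pipelines if name not in SUPPORTED_ADDON_PIPELINES]
    if unknown:
        raise ValueError(f"Unsupported addon pipeline(s): {', '.join(unknown)}")
    # One ("addon_pipeline", name) pair per requested pipeline.
    params = [("purl", purl)] + [("addon_pipeline", name) for name in addon_pipelines]
    return f"{base_url.rstrip('/')}/api/collect/?{urlencode(params)}"


# Hypothetical host and package, used only for illustration.
url = build_collect_url(
    "https://purldb.example.org",
    "pkg:npm/foo@1.0",
    ["collect_symbols_ctags", "collect_strings_gettext"],
)
```

Validating the names on the client side gives an early, readable error for a mistyped pipeline instead of relying on the server's response.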
See the detailed tutorial on How To get symbols and strings from a PURL/package in PurlDB.