The Fault in Our Stars: An Analysis of GitHub Stars as an Importance Metric for Web Source Code
2024-03-01 10:24:05 Author: www.ndss-symposium.org

Simon Koch, David Klein, and Martin Johns (TU Braunschweig)

Are GitHub stars a good surrogate metric for assessing the importance of open-source code? Security research frequently uses them as a proxy for importance, but the reliability of this relationship has not yet been studied. Likewise, their relationship to the download numbers provided by code registries, another commonly used metric, has yet to be ascertained. We address this research gap by analyzing the correlation between GitHub stars and download numbers, as well as the correlation of each with detected deployments across websites. Our data set consists of 925,978 data points across three web programming languages: PHP, Ruby, and JavaScript. We assess deployment across websites using 58 hand-crafted fingerprints for JavaScript libraries. Our results reveal a weak relationship between GitHub stars and download numbers, with correlations ranging from 0.47 for PHP down to 0.14 for JavaScript. PHP and Ruby show a large number of projects with few stars but many downloads, whereas JavaScript shows the opposite pattern, with a noticeably higher count of libraries with many stars but apparently few downloads. For detected deployments, we found correlations of 0.61 with stars and 0.63 with downloads. Our results indicate that both downloads and stars are moderately strong indicators of the importance of client-side deployed JavaScript libraries.
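
To make the kind of comparison described above concrete, the following is a minimal sketch of a rank correlation between star counts and download counts. The abstract does not state which correlation coefficient the authors used; Spearman's rank correlation is assumed here only because it is a common choice for heavily skewed count data. The function names and the sample numbers are hypothetical and are not taken from the paper's data set.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks for values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with the entry at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical (stars, downloads) pairs, for illustration only.
    stars = [12, 340, 5, 9800, 77, 1500]
    downloads = [40000, 2100, 300, 880000, 150, 95000]
    print(f"Spearman correlation: {spearman(stars, downloads):.2f}")
```

A rank-based coefficient depends only on the ordering of projects, so a handful of extremely popular libraries does not dominate the result the way it would with a correlation on raw counts.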


Article source: https://www.ndss-symposium.org/ndss-paper/auto-draft-490/