VINCI uses machine learning and data science techniques to build a classifier: an algorithm that classifies each reported bug as “true” or “false”. The classifier is trained on an existing labeled (or “tagged”) dataset, one that contains examples of both “true” and “false” bugs. In most classification applications, approximately 70-80% of the original dataset is used for training and the remaining 20-30% for testing.
VINCI is a novel classifier that 1) requires only 10% of the dataset for training, 2) classifies the remaining 90% of the dataset, 3) achieves an accuracy of 90% or more, and 4) does all of this in a fraction of the time of comparable solutions.
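To make the split proportions above concrete, here is a minimal Python sketch that divides a list of findings by a configurable training fraction. It is an illustration only and is not VINCI's sampling method; the function name `split_dataset`, the `seed` parameter, and the placeholder findings are all assumptions made for this example.

```python
import random


def split_dataset(findings, train_fraction, seed=0):
    """Shuffle the findings and split them into (train, test) lists.

    Hypothetical helper for illustration; not part of VINCI.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(findings)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder findings standing in for scanner output.
findings = [f"finding-{i}" for i in range(1000)]

# Conventional split: 80% for training, 20% for testing.
conventional_train, conventional_test = split_dataset(findings, 0.8)

# The split described for VINCI: 10% for training, 90% left to classify.
small_train, large_test = split_dataset(findings, 0.1)

print(len(conventional_train), len(conventional_test))  # 800 200
print(len(small_train), len(large_test))  # 100 900
```

With 1,000 findings, the conventional split leaves 800 items to label by hand, while a 10% training fraction leaves only 100.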
VINCI offers a friendly user interface where the user can import the output of scanning tools in .csv or .xml format. VINCI uses ML sampling techniques to separate the dataset into a training set and a test set. The user has the option to label each finding in the training set as true or false through the VINCI interface. VINCI then uses the labeled training set to create a schema, which it uses to build the model and train the algorithm that makes predictions on the test set. All functionality has default settings, and the user can also modify them.
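The workflow above (import, sample, label, train, predict) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions and does not reflect VINCI's internals: the CSV columns (`id`, `rule`, `file`), the simulated `analyst_label` function, and the per-rule majority-vote model are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import csv
import io
import random
from collections import Counter, defaultdict

# Hypothetical scanner export; the column names are assumptions.
SCAN_CSV = """id,rule,file
1,SQLI,app/db.py
2,SQLI,app/api.py
3,XSS,web/view.py
4,XSS,web/form.py
5,SQLI,app/admin.py
6,XSS,web/list.py
7,SQLI,app/jobs.py
8,XSS,web/home.py
"""


def load_findings(text):
    """Step 1: import scanner output from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def sample_split(findings, train_fraction, seed=0):
    """Step 2: randomly separate findings into training and test sets."""
    shuffled = list(findings)
    random.Random(seed).shuffle(shuffled)
    n_train = max(1, round(len(shuffled) * train_fraction))
    return shuffled[:n_train], shuffled[n_train:]


def analyst_label(finding):
    """Step 3: stand-in for a user labeling a finding in the interface."""
    return "true" if finding["rule"] == "SQLI" else "false"


def train_majority_model(labeled):
    """Step 4: stand-in model that stores the majority label per rule."""
    votes = defaultdict(Counter)
    for finding, label in labeled:
        votes[finding["rule"]][label] += 1
    model = {rule: c.most_common(1)[0][0] for rule, c in votes.items()}
    # Fallback label for rules that never appeared in the training set.
    default = Counter(label for _, label in labeled).most_common(1)[0][0]
    return model, default


def predict(model, default, finding):
    """Step 5: predict a label for an unlabeled finding."""
    return model.get(finding["rule"], default)


findings = load_findings(SCAN_CSV)
train_set, test_set = sample_split(findings, train_fraction=0.25)
labeled = [(f, analyst_label(f)) for f in train_set]
model, default = train_majority_model(labeled)
predictions = {f["id"]: predict(model, default, f) for f in test_set}
print(predictions)
```

A real classifier would use far richer features than the rule name alone. The sketch only shows how the five steps connect, with a small labeled sample driving predictions for the rest of the findings.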
ML4Cyber
Copyright © 2024 ML4Cyber - All Rights Reserved.