AI and machine learning offer tremendous promise for
humanity in terms of helping us make sense of Big Data. But, while the
processing power of these tools is integral for understanding trends and
predicting threats, it’s not sufficient on its own.
Thoughtful design of threat intelligence—design that accounts for the ultimate needs of its consumers—is essential too. There are three areas where thoughtful design of AI for cybersecurity increases overall utility for its end users.
Designing where your data comes from
To set the process of machine learning in motion, data
scientists rely on robust data sets they can use to train models that deduce
patterns. If your data is siloed, relies on a single community of endpoints,
or is made up only of data gathered from sensors like honeypots and crawlers, there
are bound to be gaps in the resultant threat intelligence.
A diverse set of real-world endpoints is essential to achieve
actionable threat intelligence. For one thing, machine learning models can be prone
to picking up biases if exposed to either too much of a particular threat or
too narrow a user base. That may make the model adept at discovering one
type of threat, but not so great at noticing others. Well-rounded, globally sourced
data provides the most accurate picture of threat trends.
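One way to spot the kind of skew described above is to measure how much of a training set comes from any single region or threat type. The following is a minimal sketch, not a description of any vendor's pipeline: the sample records, the field names `region` and `threat_type`, and the 50% cutoff are all assumptions made for illustration.

```python
from collections import Counter


def source_shares(samples, key):
    """Map each distinct value of `key` to its fraction of all samples."""
    counts = Counter(sample[key] for sample in samples)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def overrepresented(shares, max_share=0.5):
    """Return the values whose share exceeds `max_share` (an arbitrary cutoff)."""
    return sorted(value for value, share in shares.items() if share > max_share)


# Hypothetical training samples; real telemetry would carry many more fields.
samples = [
    {"region": "NA", "threat_type": "phishing"},
    {"region": "NA", "threat_type": "phishing"},
    {"region": "NA", "threat_type": "ransomware"},
    {"region": "EU", "threat_type": "phishing"},
]

region_shares = source_shares(samples, "region")
threat_shares = source_shares(samples, "threat_type")

# Flag any region or threat type that dominates the data set.
print(overrepresented(region_shares))
print(overrepresented(threat_shares))
```

A check like this does not remove bias on its own, but it can show when a data set leans too heavily on one user base or one kind of threat before a model is trained on it.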
Another significant reason real-world endpoints are
essential is that some threats excel at evading traditional crawling
mechanisms. This is especially common for phishing sites targeting specific
geos or user environments and for malware executables. Phishing sites can hide
their malicious content from crawlers, and malware can appear benign or sit on
a user’s endpoint for extended periods of time without taking an action.
Designing how to illustrate data’s context
Historical behavior helps predict future behavior, so
designing threat intelligence that accounts for context is essential. Take a
major website like www.google.com for example. Historical threat intelligence signals
it’s been benign for years, leading to the conclusion that its owners have put
solid security practices in place and are committed to not letting it become a
vector for bad actors. On the other hand, if we look at a domain that was only
very recently registered or has a long history of presenting a threat, there’s
a greater chance it will behave maliciously in the future.
Illustrating this type of information in a useful way can
take the form of a reputation score. Predictions about the future behavior of
a data object (whether a URL, file, or mobile app) are probabilistic, so a
reputation score expresses how likely that object is to become a threat. That
lets organizations determine the level of risk they are comfortable with and
set their policies accordingly.
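To make the link between a reputation score and a policy concrete, here is a minimal sketch. It assumes a hypothetical 0 to 100 scale where a higher score means a more trustworthy object; the function name `decide` and the default thresholds are illustrative and do not describe any particular product.

```python
def decide(score, block_below=20, warn_below=60):
    """Turn a reputation score into a policy action.

    Assumes a hypothetical 0-100 scale where higher means more trustworthy.
    Each organization picks thresholds that match its own risk tolerance.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < block_below:
        return "block"
    if score < warn_below:
        return "warn"
    return "allow"


# The same object can be treated differently under different risk tolerances.
default_action = decide(45)                 # moderate tolerance
strict_action = decide(45, block_below=50)  # a more cautious organization
print(default_action, strict_action)
```

Keeping the thresholds as parameters is the point of the design: the score stays the same for everyone, while each organization decides where its own cutoff sits.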
Designing how you classify and apply the data
Finally, how a threat intelligence provider classifies data
and the options they offer partners and users in terms of how to apply it can
greatly increase its utility. Protecting networks, homes, and devices from
internet threats is one thing, and certainly desirable for any threat
intelligence feed, but that’s far from all it can do.
Technology vendors designing a parental control product, for
instance, need threat intelligence capable of classifying content based on its
appropriateness for children. And any parent knows malware isn’t the only thing
children should be shielded from. Categories like adult content, gambling
sites, or hubs for pirating legitimate media may also be worth avoiding.
This flexibility extends to the workplace, too, where peer-to-peer streaming
and social media sites can affect worker productivity and slow network speeds,
not to mention introduce regulatory compliance concerns. Being able to classify
internet objects with such scalpel-like precision makes thoughtfully designed
threat intelligence much more useful for the partners leveraging it.
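The parental control and workplace examples above amount to applying different category policies to the same classification data. The sketch below illustrates that idea under stated assumptions: the category labels, the two policy sets, and the `is_blocked` helper are hypothetical names chosen for illustration, not an actual feed schema.

```python
# Hypothetical category labels; a real feed would define its own taxonomy.
PARENTAL_CONTROL = frozenset({"malware", "phishing", "adult", "gambling", "piracy"})
WORKPLACE = frozenset({"malware", "phishing", "p2p_streaming", "social_media"})


def is_blocked(categories, policy):
    """Return True if any category assigned to the object is in the policy."""
    return not policy.isdisjoint(categories)


# One classified object, two different outcomes depending on the consumer.
gambling_site = {"gambling"}
print(is_blocked(gambling_site, PARENTAL_CONTROL))
print(is_blocked(gambling_site, WORKPLACE))
```

Because the classification is separate from the policy, a vendor can build a new product by defining a new set of blocked categories rather than reclassifying the data.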
Just as critical is the speed at which new threat intelligence findings
are applied to every protected endpoint. It’s well known that
static threat lists can’t keep up with the pace of today’s malware, but updating
those lists on a daily basis isn’t cutting it anymore either. The time from initial
detection to global protection must be a matter of minutes.
This brings us back to where we started: the need for a robust, geographically diverse data set from which to draw our threat intelligence. For more information on how the Webroot Platform draws its data to protect customers and vendor partners around the globe, visit our threat intelligence page.