Streaming analytics™ with Netezza provides a distinct
advantage in a broad range of applications, including spatial
analysis, text mining, risk profiling, real-time pricing,
network monitoring, fraud detection and many others. In addition,
the ability to run a complex analysis against a huge live
database, without the delays and costs of moving data to separate
hardware, opens up new types of analysis that were previously
out of reach. Here
are just a few examples of how the NPS appliance is being
used for streaming analytics:
“Fingerprinting” with Hashing Algorithms:
The Message-Digest algorithm 5 (MD5) is a standard cryptographic
hash function that produces a 128-bit hash value. It is commonly
used to store password hashes and to verify that transferred files
are intact. It is also used in chain-of-custody document fingerprinting.
By performing the hash directly “on stream,” the
NPS system runs hash algorithms on millions or even billions
of records in seconds. This is typically hundreds of times
faster than the conventional approach, which requires moving
data from the warehouse to a supercomputing grid, performing
the hash there and then loading the results back into the
warehouse.
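
As a rough illustration of record fingerprinting, the sketch below computes an MD5 digest for each record using Python's standard hashlib module. The `fingerprint` helper and the sample records are hypothetical and are not part of the NPS system; the code only shows what the hash step produces, not how the appliance runs it on stream.

```python
import hashlib


def fingerprint(record: str) -> str:
    """Return the MD5 digest of a record as a 32-character hex string."""
    return hashlib.md5(record.encode("utf-8")).hexdigest()


# Hypothetical sample records; any change to a record changes its digest.
records = ["invoice-001|ACME|1200.00", "invoice-002|ACME|87.50"]
fingerprints = {record: fingerprint(record) for record in records}
```

Two records with identical content always yield the same digest, which is what makes the value usable as a fingerprint.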
“Fuzzy Text” Search Analytics: Fuzzy text
search analysis uses algorithms that provide a “best
guess” of the most likely results. One example is the Levenshtein
edit distance algorithm, which calculates the minimum number of
single-character edits (insertions, deletions and substitutions)
required to transform, for example, “Madison
Avenue” into “Main Street.” This type of
algorithm is used by national security applications for complex
analysis of names in port of entry data, as well as other
text searching scenarios which require analysis of billions
of text records. These types of capabilities open the Netezza
appliance to analysis that was not only performance-constrained
in the past but simply impossible through a SQL interface.
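
The Levenshtein edit distance described above can be sketched with the classic dynamic-programming formulation. This is a minimal Python illustration, not the implementation used on the appliance; the function name `levenshtein` is chosen here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


# Distance between the two street names used in the text.
distance = levenshtein("Madison Avenue", "Main Street")
```

Keeping only two rows of the table holds memory proportional to the length of the second string, which matters when the comparison is repeated across very large numbers of text records.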
Predictive Model Scoring: Many companies use predictive
modeling to finely segment their customers and make real-time
decisions about promotions, pricing, fraud and other applications.
This typically involves a time-consuming process: after transaction
data is loaded into the data warehouse, the company extracts
large volumes of it to a separate server cluster. The data
must be denormalized and then fed into a predictive modeling
application for the actual scoring. Eventually, a score for
each customer is loaded back into the data warehouse. The
total round trip can take hours, dominated by the latency
of large data transfers. This entire process can now be done
within the Netezza appliance, in a fraction of the previous
time, enabling real-time offers and promotions to the right
customers.
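
To make the scoring step concrete, here is a minimal Python sketch that applies an already trained logistic model to one customer record. The feature names, weights and intercept are invented for illustration, and the text does not say which model type any given company uses; the point is only that scoring is a small per-row computation that can run where the data lives.

```python
import math

# Hypothetical coefficients from a previously trained logistic model.
WEIGHTS = {"recent_purchases": 0.8, "days_since_visit": -0.05}
INTERCEPT = -1.0


def score(customer: dict) -> float:
    """Return a probability-like score between 0 and 1 for one customer."""
    # Linear combination of the features; missing features count as 0.
    z = INTERCEPT + sum(
        weight * customer.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    # The logistic function maps the linear score into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


customer_score = score({"recent_purchases": 3, "days_since_visit": 10})
```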
Geospatial Analytics: Geospatial analysis performs
operations such as combining multiple maps or map layers according
to predefined rules, or identifying regions within a specified
distance of one or more features, such as roads or rivers.
Geospatial analysis is used for solving problems such as “Find
all properties within 10 miles of Hurricane Katrina’s
eye path” or “Find all properties in Massachusetts
that physically straddle county boundaries.” A Netezza
partner is building an entire geospatial library on the NPS
system, with analytics embedded within SQL functions. Users
will be able to run geospatial applications on huge, comprehensive
data sets, taking advantage of Netezza performance to make
rapid and informed decisions.
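
A "within a specified distance" query like the hurricane example can be sketched in Python with the haversine great-circle formula. The sketch rests on two simplifying assumptions that are not from the source: the path is represented only by its vertices, so distance is measured to the nearest vertex rather than to the line between vertices, and the Earth is treated as a sphere. The function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def within_distance(points, path, max_miles):
    """Return the points lying within max_miles of any vertex of the path."""
    return [
        point for point in points
        if any(haversine_miles(*point, *vertex) <= max_miles for vertex in path)
    ]


# Hypothetical property locations and a one-vertex path, as (lat, lon) pairs.
properties = [(0.0, 0.1), (0.0, 5.0)]
nearby = within_distance(properties, path=[(0.0, 0.0)], max_miles=10)
```

A production geospatial library would measure distance to path segments and use spatial indexing instead of comparing every point with every vertex.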