Four key building blocks for insight-driven network automation

The migration of content, applications and services to the cloud is driving network behaviors characterized by constantly shifting traffic flows, complex end-to-end paths and unpredictable bandwidth demand. Small, agile DevOps teams are engaged in cloud-native application and service deployments that are highly dynamic, distributed and diverse. These trends are creating serious operational challenges for both cloud-native webscalers and network service providers (NSPs), challenges that are exacerbated by the dramatic expansion of the potential attack surface beyond the traditional security perimeter. In addition, a proliferation of weakly secured IoT devices has created a platform that hackers are exploiting to launch massive-scale, botnet-based DDoS attacks.

My colleague Tim Doiron and I just published an ACG Research white paper on “Powering Intelligent Network Services With Real-Time Visibility, Analytics and Automation” that describes how NSPs and webscalers can use real-time visibility, analytics and automation to overcome these challenges by taking advantage of the latest advances in network infrastructure hardware and software. The paper examines the four key building blocks that enable operators to realize the benefits of real-time, insight-driven network automation:

The data plane should be fully instrumented in both hardware and software, capable of extracting visibility data used for tracking, identifying and characterizing traffic flows. Data plane fabrics based on custom ASICs will continue to play a vital role by providing embedded support for packet flow classification mechanisms, flexible packet header and payload pattern matching for filtering functions, built-in flow replication and the ability to support millions of granular ACL rules.
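To make the classification idea concrete, here is a minimal Python sketch of 5-tuple flow classification with ACL-style match rules. It only illustrates the kind of logic that data plane ASICs implement at line rate in hardware; the rule set and packet fields are assumptions for illustration, not any vendor's API.

```python
# Minimal sketch: classify packets into flows by 5-tuple and apply
# simple ACL-style match rules, the kind of logic a data plane ASIC
# evaluates at line rate across millions of granular rules.
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int

# Hypothetical ACL rules: every (field, value) pair must match.
ACL_RULES = [
    {"dst_port": 53, "protocol": "udp"},   # e.g., mirror DNS traffic
    {"dst_ip": "203.0.113.10"},            # e.g., watch one server
]

def matches(rule: dict, key: FlowKey) -> bool:
    return all(getattr(key, field) == value for field, value in rule.items())

flow_counters: Counter = Counter()

def classify(packet: dict) -> None:
    """Extract the flow key and count bytes for flows matching any rule."""
    key = FlowKey(packet["src_ip"], packet["dst_ip"], packet["proto"],
                  packet["sport"], packet["dport"])
    if any(matches(rule, key) for rule in ACL_RULES):
        flow_counters[key] += packet["bytes"]

classify({"src_ip": "198.51.100.7", "dst_ip": "203.0.113.10",
          "proto": "tcp", "sport": 49152, "dport": 443, "bytes": 1500})
print(flow_counters)
```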

The control plane should be fully programmable and should evolve to incorporate a sophisticated orchestration layer hosting multiple applications for real-time network automation use cases: responding to detected failures, anomalies, performance bottlenecks, suboptimal utilization and security threats.
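The closed-loop pattern behind these use cases can be sketched in a few lines. The event types, actions and playbook below are purely hypothetical stand-ins for what a production orchestration layer would implement.

```python
# Toy sketch of closed-loop automation: analytics-generated events are
# mapped to automated responses by an orchestration layer. All names
# here are illustrative, not a real product API.

def reroute_traffic(event: dict) -> None:    # placeholder remediations
    print(f"rerouting around {event['target']}")

def rate_limit(event: dict) -> None:
    print(f"rate-limiting traffic toward {event['target']}")

def open_ticket(event: dict) -> None:
    print(f"escalating {event['target']} to the operations team")

PLAYBOOK = {
    "link_failure": reroute_traffic,
    "ddos_suspected": rate_limit,
    "utilization_anomaly": open_ticket,
}

def handle(event: dict) -> None:
    """Dispatch a detected condition to its automated response."""
    action = PLAYBOOK.get(event["type"], open_ticket)  # default: escalate
    action(event)

handle({"type": "ddos_suspected", "target": "peering-link-7"})
```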

Multi-domain, multi-layer visibility is critical for dynamic traffic flows traversing complex end-to-end paths while the underlying network topology shifts in response to changing conditions. Another critical visibility requirement is identifying servers and endpoints not just by IP address, but by application, service or subscriber, which is a non-trivial problem in today’s vast and complex Internet. NSPs and webscalers also need a map of the “supply chain of the Internet”, which tracks the relationships and dependencies between cloud-native applications and services, CDNs, and peering and transit networks.
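One way to picture the identity problem is as an enrichment join: raw flow records carry only IP addresses, and a separately maintained identity map labels them by application or service. In practice that map would be assembled from DNS, BGP, CDN and peering data; the entries below are invented for illustration.

```python
# Hypothetical enrichment step: join flow records against an identity
# map so that flows are labeled by application/service, not just IP.

IDENTITY_MAP = {  # illustrative data only
    "203.0.113.10": {"application": "video-streaming", "provider": "ExampleCDN"},
    "198.51.100.25": {"application": "voip", "provider": "ExampleTelco"},
}

def enrich(flow: dict) -> dict:
    """Attach application/service identity to a raw flow record."""
    identity = IDENTITY_MAP.get(flow["dst_ip"], {"application": "unknown"})
    return {**flow, **identity}

print(enrich({"src_ip": "192.0.2.1", "dst_ip": "203.0.113.10", "bytes": 9000}))
```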

Big Data analytics plays the critical role of extracting actionable insights in real time by ingesting petabytes of data: streaming telemetry for visibility into network paths, traffic flows and performance, along with data collected from a wide array of other sources that reveals the identity of applications, services and subscribers.
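The ingest-and-aggregate step of such a pipeline might look like the following sketch, which reduces a stream of interface telemetry samples to rolling utilization statistics. In production this logic would run on a distributed stream processor; the sampling cadence and window size here are assumptions.

```python
# Sketch of streaming ingestion: telemetry samples arrive continuously
# and are reduced to per-interface rolling statistics in real time.
from collections import defaultdict, deque
from statistics import mean

WINDOW = 60  # keep the last 60 samples per interface (assumed 1s cadence)
samples = defaultdict(lambda: deque(maxlen=WINDOW))

def ingest(record: dict) -> None:
    """Consume one telemetry sample into the rolling window."""
    samples[record["interface"]].append(record["bytes_per_sec"])

def utilization(interface: str, capacity_bps: float) -> float:
    """Rolling mean utilization over the window, as a fraction of capacity."""
    window = samples[interface]
    return (mean(window) * 8 / capacity_bps) if window else 0.0

ingest({"interface": "eth0", "bytes_per_sec": 9.0e8})
print(f"{utilization('eth0', 10e9):.0%}")  # 72% of a 10 Gbps link
```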

In combination, these four key building blocks enable operators to deploy intelligent networks that use visibility and analytics to drive closed-loop feedback for insight-driven network automation.

To learn more, read the white paper and tune in to the upcoming Light Reading webinar “Real-Time Visibility & Analytics to Enable Actionable Intelligence & Network Automation”, to be held Thursday, November 9, 2017, 12:00 p.m. New York / 6:00 p.m. London.


Visibility & analytics at the ONUG Spring 2017 conference

I was invited to speak at the Open Networking User Group’s ONUG Spring 2017 conference, held in San Francisco back in April, about “A Framework for Infrastructure Visibility, Analytics and Operational Intelligence”. My presentation is up on Slideshare, and ONUG has posted a video of the session.

My goal was to stimulate thinking about how we bring the power of Big Data to infrastructure monitoring and analytics by creating a common framework in which tools share visibility data from an array of sources and feed it into a set of shared analytics engines that support a variety of operational use cases.

It’s not economically feasible, nor is it technically desirable, for each tool to bring its own Big Data analytics stack and ingest dedicated streaming telemetry feeds. As an industry, we need to think about how we can create more commonality at the lower layers of the stack to implement lower cost solutions that facilitate data sharing and common analytics across a wide range of use cases.
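To make the commonality argument concrete, here is one hedged sketch of the idea: each tool normalizes its native output into a shared visibility record that any analytics engine can consume. The schema, field names and adapter below are my assumptions, not an ONUG specification.

```python
# Sketch of a shared visibility record: tools publish normalized data
# into a common bus and analytics engines consume from that bus instead
# of each tool owning its own Big Data stack.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class VisibilityRecord:
    source: str       # e.g. "netflow", "snmp", "streaming-telemetry"
    entity: str       # the device, interface or service being described
    metric: str
    value: float
    timestamp: float

def normalize_netflow(raw: dict) -> VisibilityRecord:
    """Adapter: map one tool's native format onto the shared record."""
    return VisibilityRecord("netflow", raw["exporter"], "flow_bytes",
                            raw["bytes"], time.time())

record = normalize_netflow({"exporter": "edge-router-1", "bytes": 123456})
print(json.dumps(asdict(record)))  # ready for any shared analytics engine
```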

On this front, ONUG has a Monitoring & Analytics initiative that is working to define user requirements and develop proof-of-concept demos for a new, comprehensive suite of tools to manage software-defined infrastructure. There was a panel at the conference that provided an update on the status of the initiative, and ONUG has posted a video of this session.

I also moderated an interesting panel discussion on Retooling for the Software-Defined Enterprise that featured Aryo Kresnadi from FedEx, Ian Flint from Yahoo and Dan Ellis from Kentik, all of whom have extensive experience using and building monitoring & analytics tools in cloud-scale environments. ONUG has posted a video of this session as well, along with many others from the conference, on ONUG’s Vimeo channel.

If these topics interest you, be sure to save the date for ONUG Fall 2017, which will be held October 17 & 18 in New York City.

Cloud-scale technologies for cloud-scale infrastructure visibility & analytics

I think we can all agree that cloud-scale technologies are wonderful things, enabling hyper-agile delivery of applications and services to billions of users worldwide. Software-defined networking, virtualization, microservices, containers, open source software and Open Compute platforms are enabling cloud service providers to achieve mind-boggling economies of scale while keeping pace with insatiable user demand.

However, as telecom service providers and large-scale enterprises move to embrace cloud-scale technologies, they are proving to be both a blessing and a curse. The benefits are straightforward: rapidly deliver a broader range of applications and services at lower cost while being able to quickly respond to changing customer needs. The downside is that both service providers and enterprises need to employ new toolsets for developing, deploying and managing these applications and services.

Disaggregation and decomposition are consistent themes in cloud-scale technology. Monolithic platforms are disaggregated into software-driven control planes running on commodity hardware. Network functions and computing resources are virtualized and decoupled from the underlying hardware. Monolithic applications are decomposed into many microservices, each running in its own container. The business value in terms of lower hardware costs coupled with increased flexibility and agility is real, but there are added costs associated with managing all these different piece parts.

The problem becomes obvious when service providers and enterprises try to apply existing management tools and methodologies to cloud-scale infrastructure. For all their internal complexity, monolithic platforms and applications are simpler to configure, monitor and control than multiple layers of many different software components running on virtualized infrastructure. While the industry has recently made great strides by adopting new tools for cloud-scale infrastructure configuration and orchestration, we are still playing catch-up in terms of equally effective approaches to cloud-scale visibility and analytics.

Yet here is where cloud-scale technologies come to their own rescue. Disaggregating and decomposing software and hardware functions, with proper instrumentation implemented at each layer and in every component, lets us gain full visibility into the entire stack from top to bottom. New technologies like streaming telemetry then provide extremely granular, real-time visibility into the application and service delivery infrastructure.

Therefore, it’s only natural that cloud-scale visibility and analytics should be implemented on native cloud-scale platforms, leveraging the same technologies: software-defined networking, virtualization, microservices, containers, open source software and Open Compute platforms. This is especially critical when employing Big Data analytics, where the basic technologies are inherently cloud-scale, and well-suited for ingesting Big Data streaming telemetry feeds and performing real-time streaming analytics on this data.
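As a small example of what real-time streaming analytics on this data can mean, the sketch below flags a telemetry sample as anomalous when it deviates sharply from a rolling baseline. The window size, warm-up length and threshold are arbitrary assumptions; real systems would use far more sophisticated models.

```python
# Minimal streaming anomaly detection: flag a sample when it deviates
# from the recent baseline by more than a few standard deviations.
from collections import deque
from statistics import mean, pstdev

class AnomalyDetector:
    def __init__(self, window: int = 300, threshold: float = 3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Return True if value is an outlier versus the rolling baseline."""
        anomalous = False
        if len(self.history) >= 30:  # wait for a minimal baseline
            mu, sigma = mean(self.history), pstdev(self.history)
            anomalous = sigma > 0 and abs(value - mu) > self.threshold * sigma
        self.history.append(value)
        return anomalous

detector = AnomalyDetector()
for sample in [100, 102, 99, 101, 100] * 10 + [500]:
    if detector.observe(sample):
        print(f"anomaly detected: {sample}")  # fires only on the 500
```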


Real-time network visibility & analytics for operational intelligence

When I launched my network analytics practice for ACG just over a year ago, I decided that my initial research needed to focus on the value of real-time network visibility and Big Data analytics for operational intelligence. SDN, virtualization and the widespread adoption of cloud-scale technologies are enabling new techniques, including streaming telemetry, for instrumenting networks and gaining real-time visibility into traffic flows and network state. At the same time, streaming analytics allows network operators to immediately turn insights into action within seconds or minutes instead of hours or days. Big Data also supports the large-scale data sets needed to apply machine learning techniques for predictive analytics and AI-based network automation.
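As one toy illustration of the predictive side, the sketch below fits a linear trend to recent link utilization and estimates when the link saturates. The data points, capacity and model are invented for illustration; production predictive analytics would apply much richer machine learning over far larger datasets.

```python
# Toy predictive analytics: least-squares trend on utilization history,
# used to estimate how many periods remain until a link hits capacity.

def forecast_saturation(history: list[float], capacity: float) -> float:
    """Fit a linear trend; return periods until capacity is reached."""
    n = len(history)
    xs = range(n)
    x_mean, y_mean = (n - 1) / 2, sum(history) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))
             / sum((x - x_mean) ** 2 for x in xs))
    if slope <= 0:
        return float("inf")  # no growth trend, no predicted saturation
    return (capacity - history[-1]) / slope

# Daily utilization in Gbps on an assumed 10 Gbps link:
days = forecast_saturation([4.0, 4.5, 5.1, 5.4, 6.0, 6.6], 10.0)
print(f"link saturates in about {days:.1f} days")  # ~6.7 days
```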

The ROI for real-time operational intelligence is compelling across a wide array of use cases, including: rapid root cause analysis and reduced mean time to repair (MTTR); immediate detection of security threats inside the network perimeter; real-time performance monitoring for on-the-fly traffic engineering; continuous KPI monitoring for service assurance; and the holy grail: closed-loop feedback for analytics-driven automation. The potential gains are huge and the industry is witnessing a new wave of innovation that will enable us to reinvent how networks are deployed and operated, and how services are delivered and managed.

Network operators are leveraging new real-time visibility and analytics technologies in three separate, but interconnected, domains:

  • Telecom network and communication services
  • Cloud-scale services delivered via the Internet
  • Hyperscale data center infrastructure

Therefore, my research in operational intelligence has separate tracks covering developments in each domain, although there is overlap between tracks. For example, new telecom services are being delivered via the cloud, and SD-WANs are telecom services that use the Internet to connect users to applications in the cloud. The cloud-scale services track looks at visibility and analytics from the perspective of a network operator delivering or consuming services over networks it doesn’t own or operate, whereas the hyperscale data center track looks at the role of visibility and analytics in managing the infrastructure used to deliver those cloud-scale services.

As a result, my research spans three separate, but interrelated, markets:

  • Telecom services
  • Cloud-scale services
  • Enterprise IT

While today these are three distinct markets, over the next decade I expect the lines will blur as the industry converges on delivering the majority of applications and services via public and hybrid clouds. Picture one vast market – cloud-scale services – segmented by application type: consumer, enterprise IT, communications, IoT, etc. In that world, the network simply provides access and transport for user devices, machines and sensors to connect with applications running in the cloud.

As an industry, we need to solve many technical problems in order to get there, with security being the most significant challenge, but today’s breakthroughs in real-time network visibility and Big Data analytics will play a key role in realizing this vision.