Last week I attended NXTWORK 2017, Juniper Networks’ customer event in San Francisco, where the agenda included keynote presentations by CEO Rami Rahim, new CTO Bikash Koley (most recently at Google) and other Juniper executives. One key takeaway is the focus on helping enterprise and service provider customers overcome the increased complexity of delivering applications and services in hybrid multicloud environments spanning the enterprise, service provider and hyperscaler domains.
Juniper’s vision for simplicity is a “self-driving network” that integrates multiple technologies in order to streamline network operations using intent-driven automation mechanisms that leverage real-time visibility and analytics across multiple layers and domains. The week before the event, Juniper announced its bot-based approach to automation, in which network operation tasks are performed by intelligent software-based bots that execute typical human workflows. The goal is to develop a library of bots which can also work cooperatively under the supervision of a master bot to carry out more complex workflows. Bot-based automation mimics the way things work in a NOC, with multiple people focused on different operational tasks but communicating with each other to share information when required. The approach is similar to NASA’s mission control center, where the flight director oversees a number of specialists who are each responsible for specific aspects of flight operations, working independently yet also cooperating when needed.
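The cooperating-bots pattern described above can be sketched in a few lines of code. To be clear, this is a purely illustrative model of the concept, not Juniper's implementation; all class names, tasks and thresholds below are invented for the example.

```python
# Illustrative sketch of bot-based automation: specialist "bots" each own one
# operational task, and a master bot coordinates them, mimicking how NOC staff
# work independently but share information when required.

class SpecialistBot:
    """Performs a single operational task and reports a result."""
    def __init__(self, name, task):
        self.name = name
        self.task = task  # callable implementing one workflow step

    def run(self, context):
        return self.task(context)

class MasterBot:
    """Supervises specialist bots, composing their results into a larger workflow."""
    def __init__(self, bots):
        self.bots = bots

    def execute(self, context):
        results = {}
        for bot in self.bots:
            # each specialist works on its own task but sees earlier findings
            results[bot.name] = bot.run({**context, **results})
        return results

# Hypothetical workflow: collect telemetry, diagnose, then remediate.
bots = [
    SpecialistBot("telemetry", lambda ctx: {"cpu": 0.92}),
    SpecialistBot("diagnose",
                  lambda ctx: "overload" if ctx["telemetry"]["cpu"] > 0.9 else "ok"),
    SpecialistBot("remediate",
                  lambda ctx: "reroute traffic" if ctx["diagnose"] == "overload" else "none"),
]
outcome = MasterBot(bots).execute({})
print(outcome["remediate"])  # -> reroute traffic
```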
SVP Kevin Hutchins’ security keynote focused on the complex set of challenges enterprises face as applications move to the cloud and the traditional security perimeter shifts to encompass a greatly expanded attack surface. Many enterprises lack both the necessary tools and highly skilled personnel required to properly secure networks and application infrastructure in these new multicloud environments.
Juniper’s cybersecurity strategy is based on these principles:
- Pervasive & dynamic security policies across all domains and layers
- Automated & simplified security to streamline workflows
- Adaptive & open security driven by visibility, analytics and AI
In Juniper’s model for software-defined secure networks, multi-layer visibility and machine learning techniques are used to collect and analyze data to identify possible threats. The resulting threat intelligence drives an intent-based security policy director, which in turn sets specific enforcement policies implemented in network elements, multicloud infrastructure components and end points. Juniper’s security product portfolio is evolving to fully realize this vision, and several new capabilities were announced at the event, most notably the positioning of the recently acquired Cyphort product as Juniper’s on-premise advanced threat detection platform, complementing Juniper’s existing SkyATP cloud-based platform.
With Contrail Networking, Contrail Cloud Platform, Contrail Enterprise Multicloud and Contrail Security, Juniper is making an aggressive push to be a leading supplier of networking solutions to simplify the complex process of migrating from today’s legacy networks to hybrid multicloud environments. The company has the core technology and technical expertise to do this, and the recent acquisitions of AppFormix and Cyphort indicate that Juniper is willing to go outside the company when necessary. With a clear vision in place, success hinges on execution and possibly additional acquisitions, particularly in the area of multicloud security, which Juniper acknowledges is still a work in progress.
I attended last week’s MEF17 conference in Orlando, where the telecom industry association announced its MEF 3.0 Transformational Global Services Framework “for defining, delivering, and certifying agile, assured, and orchestrated communication services across a global ecosystem of automated networks”. MEF members have been busy defining a set of APIs for Lifecycle Service Orchestration (LSO), with a strong focus on inter-carrier orchestration for delivering end-to-end services across multiple domains, and these APIs were discussed in many conference sessions and demonstrated by vendors and service providers in the Proof of Concept Showcase.
It was no surprise to hear everyone talking about delivering SD-WAN and other virtualized services, but because these are largely software-driven constructs, it’s critical that the industry adopt open standards in order to support multi-vendor, multi-carrier deployments. This is where the MEF plans to take a leading role, using its base of standards for inter-carrier Ethernet services as a springboard. Here’s a link to the MEF 3.0 video that describes the association’s global services framework, which encompasses not only service definitions and APIs but also an automated, self-service certification platform and a broad community of vendors, service providers, open source projects and other standards bodies.
Automation was the other hot topic at the event. In his keynote, MEF CTO Pascal Menezes stressed the importance of telemetry-driven analytics and machine learning (“AI analytics”) for automating service orchestration at the connectivity layer and for virtualized, overlay services. He also talked about using visibility and analytics to make networks application aware for intent-based service orchestration.
I am keen to track the MEF’s progress in 2018 as it works to define APIs that facilitate automation at both the service and network layers, and I’m hoping we’ll see tangible results in this area prior to next year’s MEF18 event.
The migration of content, applications and services to the cloud is driving network behaviors characterized by constantly shifting traffic flows, complex end-to-end paths and unpredictable bandwidth demand. Small, agile DevOps teams are engaged in cloud-native application and service deployments that are highly dynamic, distributed and diverse. These trends are creating serious operational challenges for both cloud-native webscalers and network service providers (NSPs), which are exacerbated by the dramatic expansion of the potential attack surface beyond the traditional security perimeter. In addition, a proliferation of weakly secured IoT devices has created a platform that hackers are exploiting to launch massive scale botnet-based DDoS attacks.
My colleague Tim Doiron and I just published an ACG Research white paper on “Powering Intelligent Network Services With Real-Time Visibility, Analytics and Automation” that describes how NSPs and webscalers can use real-time visibility, analytics and automation to overcome these challenges by taking advantage of the latest advances in network infrastructure hardware and software. The paper examines the four key building blocks that enable operators to realize the benefits of real-time, insight-driven network automation:
The data plane should be fully instrumented in both hardware and software, capable of extracting visibility data used for tracking, identifying and characterizing traffic flows. Data plane fabrics based on custom ASICs will continue to play a vital role by providing embedded support for packet flow classification mechanisms, flexible packet header and payload pattern matching for filtering functions, built-in flow replication and the ability to support millions of granular ACL rules.
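As a rough illustration of the classification work such a data plane performs, here is first-match ACL processing over a packet's header fields, done in software. The rule format and field names are invented for the example; real ASICs implement this in hardware at line rate against millions of rules.

```python
# Sketch of 5-tuple flow classification against granular ACL rules.
import ipaddress

def matches(rule, pkt):
    """Return True if the packet satisfies every constraint in the rule."""
    if rule.get("src") and ipaddress.ip_address(pkt["src"]) not in ipaddress.ip_network(rule["src"]):
        return False
    if rule.get("dst") and ipaddress.ip_address(pkt["dst"]) not in ipaddress.ip_network(rule["dst"]):
        return False
    if rule.get("proto") and pkt["proto"] != rule["proto"]:
        return False
    if rule.get("dport") and pkt["dport"] != rule["dport"]:
        return False
    return True

def classify(acl, pkt):
    """First-match semantics, as in typical ACL processing."""
    for rule in acl:
        if matches(rule, pkt):
            return rule["action"]
    return "permit"  # illustrative default; many real ACLs default to deny

acl = [
    # mirror DNS traffic from the 10/8 range to a collector for analysis
    {"src": "10.0.0.0/8", "proto": "udp", "dport": 53, "action": "mirror"},
    {"dst": "192.0.2.0/24", "action": "deny"},
]
pkt = {"src": "10.1.2.3", "dst": "198.51.100.7", "proto": "udp", "dport": 53}
print(classify(acl, pkt))  # -> mirror
```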
The control plane should be fully programmable and evolve to incorporate a sophisticated orchestration layer implementing multiple applications for real-time network automation use cases in response to detected failures, anomalies, performance bottlenecks, sub-optimal utilization and security threats.
Multi-domain, multi-layer visibility is critical for dynamic traffic flows traversing complex end-to-end paths while the underlying network topology shifts in response to changing conditions. Another critical visibility requirement is identifying servers and end points not just by IP address, but by application, service or subscriber, which is a non-trivial problem in today’s vast and complex Internet. NSPs and webscalers also need a map of the “supply chain of the Internet”, which tracks the relationships and dependencies between cloud-native applications and services, CDNs and peering and transit networks.
Big Data analytics plays the critical role of extracting actionable insights in real time by ingesting petabytes of data: streaming telemetry that provides visibility into network paths, traffic flows and performance, along with data collected from a wide array of other sources that reveals the identity of applications, services and subscribers.
In combination, these four key building blocks enable operators to deploy intelligent networks that use visibility and analytics to drive closed-loop feedback for insight-driven network automation.
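The closed loop can be sketched end to end in a few lines. The metric names, threshold and remedial action below are illustrative assumptions, not taken from the paper:

```python
# Minimal closed-loop sketch: telemetry (data plane) -> analytics -> insight
# -> automated action (control plane / orchestration).

def collect_telemetry():
    # stand-in for streaming telemetry from instrumented network elements
    return [{"link": "core-1", "util": 0.97}, {"link": "edge-4", "util": 0.41}]

def analyze(samples, threshold=0.9):
    # analytics layer: turn raw measurements into actionable insights
    return [s["link"] for s in samples if s["util"] > threshold]

def automate(congested_links):
    # orchestration layer: translate insights into remedial actions
    return {link: "shift traffic to alternate path" for link in congested_links}

actions = automate(analyze(collect_telemetry()))
print(actions)  # -> {'core-1': 'shift traffic to alternate path'}
```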
To learn more, read the white paper and tune into the upcoming Light Reading webinar “Real-Time Visibility & Analytics to Enable Actionable Intelligence & Network Automation”, to be held Thursday, November 9, 2017, 12:00 p.m. New York / 6:00 p.m. London.
I just published an ACG Research market impact report on Juniper Networks’ AppFormix monitoring and automation solution for intent-driven cloud-scale infrastructure. The report examines the ramifications for data center operators of the highly dynamic, cloud-scale application deployment environments I described in the “3 D’s” of hybrid and multi-cloud application deployment.
Data center operators have access to a wealth of tools for application, infrastructure and network monitoring, provided by numerous vendors and open source initiatives. Yet the current generation of tools falls short in helping operators overcome the challenges of managing cloud-scale application deployment, which is characterized by massive scale, software-driven complexity and highly dynamic run-time environments in which workloads and resources fluctuate constantly. These operators need real-time, full stack monitoring that spans the entire environment.
They also need tools that can remove time-consuming manual workflows from the remedial action feedback loop. Infrastructure monitoring and analytics should feed actionable insights directly to the orchestration layer to automate the process of taking action in response to anomalies or changing conditions by reallocating resources or redistributing workloads. In other words, infrastructure monitoring needs to move from operator-centric to automation-centric.
Collecting and analyzing full stack monitoring data in real time is a Big Data problem. Juniper Networks’ AppFormix takes an innovative approach to solving it, utilizing the distributed computing resources inherent in cloud-scale infrastructure to perform local machine learning on the metrics extracted from each node, significantly reducing the flow of data streamed to the central analytics engine and database.
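The node-local analytics pattern can be illustrated with a simple running-baseline filter. To be clear, this is a generic sketch of the concept (a z-score test over a sliding window), not AppFormix's actual algorithm:

```python
# Each node keeps a baseline of its own metrics and forwards only anomalous
# samples upstream, shrinking the volume of data the central engine must ingest.
import statistics

class NodeFilter:
    def __init__(self, window=20, sigma=3.0):
        self.history = []
        self.window = window
        self.sigma = sigma

    def observe(self, value):
        """Return the sample only if it deviates from the local baseline."""
        anomalous = False
        if len(self.history) >= 5:  # need a few samples before judging
            mean = statistics.mean(self.history)
            stdev = statistics.pstdev(self.history) or 1e-9
            anomalous = abs(value - mean) > self.sigma * stdev
        self.history = (self.history + [value])[-self.window:]
        return value if anomalous else None

node = NodeFilter()
stream = [50, 51, 49, 50, 52, 51, 50, 95, 50]  # one CPU spike at 95
forwarded = [v for v in stream if node.observe(v) is not None]
print(forwarded)  # -> [95]: only the outlier reaches the central engine
```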
Providers of infrastructure monitoring solutions are busy incorporating machine learning and Big Data analytics into their products. However, in addition to its unique approach to Big Data analytics, what differentiates Juniper Networks’ AppFormix is its integration of analytics-driven, policy-based control, which continuously monitors key metrics against pre-defined SLAs and automatically triggers the orchestration layer to make the adjustments necessary to assure the operator’s business objectives. The net result is automation-centric monitoring for intent-driven cloud-scale infrastructure.
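A minimal sketch of analytics-driven, policy-based control might look like the following; the SLA definitions, metric names and the scale-out action are all hypothetical:

```python
# Key metrics are checked continuously against pre-defined SLAs; any breach
# triggers the orchestration layer rather than paging a human operator.

SLAS = {
    "api-latency-ms": {"max": 200},
    "db-cpu-percent": {"max": 80},
}

def check_slas(metrics):
    """Return orchestration actions for every metric breaching its SLA."""
    actions = []
    for name, value in metrics.items():
        sla = SLAS.get(name)
        if sla and value > sla["max"]:
            # in a real system this would call the orchestrator's API,
            # e.g. to add instances or live-migrate a workload
            actions.append((name, "scale-out"))
    return actions

result = check_slas({"api-latency-ms": 350, "db-cpu-percent": 55})
print(result)  # -> [('api-latency-ms', 'scale-out')]
```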
For more information, watch the ACG Research Hot Seat video with Sumeet Singh, AppFormix founder and VP engineering, Juniper Networks.
Looking forward to participating in the ONUG Fall 2017 conference in New York, October 17 & 18.
I’ll be one of three judges for ONUG’s new Right Stuff Innovation Awards, which recognize companies presenting in the PoC Theater whose innovative products and solutions are most closely aligned with the guidelines published by two ONUG working groups.
As I wrote in this blog, ONUG is now expanding its focus to helping enterprise IT managers address the challenges of hybrid multi-cloud application, infrastructure and network deployments, fostering cross-industry vendor collaboration in order to drive the development of more open, software-driven and cost-effective solutions.
I’m also moderating a Birds of a Feather session on Software-Defined Application, Infrastructure and Network Monitoring, with a strong focus on hybrid multi-cloud environments. This will be a forum for enterprise DevOps, ITOps, NetOps and SecOps users to share their experiences, challenges, concerns and recommendations with the vendor community. The session will immediately follow the Monitoring & Analytics Meetup where the M&A working group will discuss the progress ONUG is making in this operationally critical area.
Hope to see you in New York in just over a week, and if you are able to attend, be sure to check out some of the PoC Theater presentations and the vendors exhibiting in the Technology Showcase.
If you plan to attend the conference, register using discount code ACG30 to save 30%.
While describing the challenges of enterprise IT application development in his FutureStack keynote, New Relic CEO Lew Cirne addressed the key question: “How to go fast at scale?” He pointed out that it’s not uncommon for DevOps shops to perform HUNDREDS of application deploys per DAY, while larger outfits deploy THOUSANDS. Listening to Lew describe how New Relic’s customers are rapidly developing and deploying cloud-based applications, it really hit me again that “Toto, we’re not in Kansas anymore”.
This got me thinking about the “3 D’s” of cloud application deployment:
- Dynamic
- Distributed
- Diverse
Let’s explore each of these and the challenges they are creating for DevOps, ITOps, SecOps and NetOps teams charged with deploying, securing, monitoring and managing hybrid and multi-cloud applications along with the underlying application and network infrastructure.
Dynamic. The basic premise of DevOps is that small, highly focused teams work separately, but in parallel, continuously developing and deploying independent parts that make up a greater whole. This process is inherently dynamic, with some teams doing hundreds of deploys per day. More importantly, application run-time environments are becoming increasingly dynamic. In a Docker environment, new containers can be spun up and down in seconds, driven by the ebb and flow of application demands. In a microservices architecture, in which applications are composed of small, modular services, the interactions between the microservices themselves are inherently dynamic and unpredictable, as new application capabilities are created by different combinations of the supporting microservices.
Distributed. Hybrid and multi-cloud environments are highly distributed, with applications and data possibly residing on-premise in legacy three tier data centers, on-premise in private clouds built using cloud-scale architectures, or in one or more public clouds utilizing SaaS, PaaS, IaaS capabilities and serverless computing. In addition, the underlying cloud compute and application infrastructures are highly distributed in order to ensure high availability and be able to easily scale compute and storage capacity on-demand. The interactions between application components distributed across these different environments can be very complex, both within a given data center and over the network between data centers. We truly live in an age when “the network is the computer”.
Diverse. Application development is highly diverse, with enterprise IT developers using many different programming languages and run-time environments, including bare metal servers, virtual machines and containers. There are also multiple software frameworks used to implement these different environments, and developers may mix and match various components to create their own custom stacks. Each cloud service provider offers its own set of application services, supported by its own full stack and characterized by a comprehensive set of APIs. There are also many different ways data can be stored and queried, ranging from legacy RDBMSs to the latest NoSQL Big Data repositories.
Combined, these “3 D’s” are creating serious challenges for enterprise operations teams and have put a premium on monitoring and analytics solutions for gaining real-time visibility into what is happening at the application, infrastructure and network layers, as well as how to correlate anomalies and events at one layer with observed behavior and conditions at another. I think it’s safe to say “we’re not in Kansas anymore”!
Returning to FutureStack, Lew closed his keynote by describing the challenge of “interconnectivity” in “3 D” environments and the use of instrumentation for “transaction tracing” in order to map out the flow of service execution to identify problematic services that may be negatively impacting overall performance. Lew noted that in this area, New Relic is leveraging open source software – OpenTracing – which is a Cloud Native Computing Foundation member project.
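To make transaction tracing concrete, here is a toy tracer, not the OpenTracing API that the real products build on, which records timed spans with parent references so the slowest service in a flow can be identified. Service names and sleep durations are invented:

```python
# Each service records a timed span with a reference to its parent, so the
# end-to-end flow of execution can be reconstructed and problematic services
# that drag down overall performance can be spotted.
import time

SPANS = []

class Span:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        SPANS.append({
            "name": self.name,
            "parent": self.parent,
            "ms": (time.perf_counter() - self.start) * 1000,
        })

# Hypothetical request flow: a checkout transaction fanning out to two services.
with Span("checkout") as root:
    with Span("inventory", parent=root.name):
        time.sleep(0.01)   # fast dependency
    with Span("payments", parent=root.name):
        time.sleep(0.05)   # the slow service stands out in the trace

slowest = max((s for s in SPANS if s["parent"]), key=lambda s: s["ms"])
print(slowest["name"])  # -> payments
```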
The interconnectivity problem is yet another reason why the solutions that New Relic and other APM vendors are developing are so critical. If DevOps and ITOps teams don’t have the tools they need to monitor and manage large-scale deployments of highly dynamic and distributed applications across heterogeneous environments, enterprise IT won’t be able to “go fast at scale”. The result will be higher operating expenses, lost business opportunities and a serious drag on digital transformation initiatives.
I recently attended New Relic’s FutureStack customer conference in New York City, which was a well organized event with great content delivered by subject matter experts, including many New Relic customers. It was my first engagement with the New Relic team and a good opportunity to take an in-depth look at the world of visibility and analytics top-down from the perspective of application performance monitoring (APM).
New Relic is a fast-growing leader in the APM market, with revenue of $263.5 million in fiscal 2017, up 45% from fiscal 2016. More than 16,000 customers worldwide use New Relic’s SaaS-based product suite, including 40% of the Fortune 100. Company founder and CEO Lew Cirne was a pioneer in the modern APM market, founding Wily Technology almost 20 years ago. It was refreshing to hear that Lew is still a developer at heart and takes regular week-long sabbaticals to work on ideas for new products.
New Relic offers a complementary set of products that serve as a “Digital Intelligence Platform” across three inter-related domains: digital user experience, application performance and infrastructure monitoring. The company’s core technology and expertise is embodied in its APM product line, which is used to instrument applications written in the leading programming languages and running across a wide range of execution environments. In his keynote, Lew emphasized that New Relic’s approach is to “instrument everything” so that DevOps teams always have full visibility into the behavior and performance of all applications. He noted that the old rule was nothing goes into production without a full QA cycle, but the new rule is no application should be deployed without complete instrumentation.
New Relic also provides several products for monitoring user experience by instrumenting mobile applications and browsers, including synthetic monitoring solutions that can proactively detect problems before users are impacted. Last year, the company moved into infrastructure monitoring that extends beyond basic server/OS monitoring to integrate a wide range of cloud-native application services provided by AWS and Microsoft Azure. Together, the full suite of New Relic products enables development and IT operations teams to see a complete picture of application behavior and performance from the end point to the execution environment and the underlying service infrastructure.
How does New Relic make sense of all the metrics and event data that are extracted using this ubiquitous instrumentation? “Applied intelligence” is the other side of the “instrument everything” coin, and this is where New Relic is doing impressive work with Big Data and real-time analytics. The company operates its own cloud infrastructure to deliver SaaS-based services to its customers. In order to be able to ingest, process and store the massive amount of metric and event data collected from customer applications, New Relic built its own high performance, multi-tenant, Big Data database from the ground up. The system currently processes on average 1.5 BILLION metrics and events per MINUTE. That’s a whole lot of data and speaks to why I believe SaaS-based analytics is the preferred approach for the vast majority of Big Data monitoring solutions, for several reasons.
First, SaaS solutions have significantly lower up front costs and can be deployed rapidly. Second, the elastic nature of the cloud allows the customer to rapidly scale monitoring, on-demand. Third, Big Data technology is a moving target and a SaaS solution shields the customer from having to deal with software updates and hardware upgrades, in addition to possible technology obsolescence. Last, and perhaps most importantly, since applications are migrating to the cloud, monitoring and analytics should follow. Given the option of a cloud-based Big Data monitoring solution, I can’t think of a good reason why mainstream enterprise IT organizations would choose to deploy on-premise.
Visibility into applied intelligence is provided by New Relic’s Insights product for visualizing application insights, including user-customizable dashboards that were showcased by customers in the main tent session that concluded the conference. Under the hood, New Relic has employed advanced statistical analysis and other techniques for correlating data extracted from user experience, application and infrastructure monitoring.
One example is RADAR, a new Insights feature that was introduced at FutureStack. RADAR “looks ahead and behind” to automatically glean useful intelligence for “situational awareness” that might not be readily apparent to customers looking at the usual dashboards. The analytics software acts like an intelligent assistant, constantly searching for anomalies and conditions that the customer might overlook or not discover until it’s too late. It’s not necessarily AI in the strictest sense of the term, but it’s certainly just as helpful.
FutureStack was also a great forum for learning how many leading enterprise IT organizations are embracing DevOps for application deployments spanning hybrid and multi-cloud environments, but I’ll wrap up my thoughts on FutureStack in my next post with a closer look at this far-reaching trend and its market impact.