
Multi-Dimensional Video Quality Management

Jim Welch,

Senior Consulting Engineer

IneoQuest Technologies

 

Tom Skinner,

Engineering Product Manager

Time Warner Cable

 

Introduction


 

Cable service providers face an increasingly competitive landscape as telcos and satellite service providers expand their video offerings. Reliably delivering thousands of high-quality video streams through dozens of network processing nodes in a regional delivery area to hundreds of thousands of subscribers requires an automated quality assurance system. That system must correlate per-program impairment events throughout the network and give operations personnel an intelligent view of performance data, so that the expense of detecting, locating, and repairing faults is minimized. Today’s dynamic and growing physical IP plant deployments, along with a growing subscriber base, demand nothing less if a provider is to remain competitive.

 

An efficient, effective operational quality assurance system is not optional for a service provider’s modern high-volume video over IP network. In most deployed video over IP system designs, even a single packet lost from a viewed stream results in a customer-perceivable impairment. To keep costs in check, an operator must know immediately when transient issues occur and where they occur, and must have network visualization tools that allow issues to be correlated for fault isolation and cost-effective troubleshooting dispatch. Without such tools, the operations staff is reduced to best guesses, random interconnect and device replacement, switch and router configuration and tuning with no way to measure results, and reliance on the goodwill of customers to put up with unsatisfactory video along the way, which is a formula for high costs and customer churn.

 

This paper outlines a major MSO’s regional network distribution architecture and shows how a high volume distributed continuous program monitoring and analysis system provides a new, logical approach to an end-to-end solution for fault detection and isolation while minimizing operational costs.

 

Highly Available Redundant Architecture Presents New Troubleshooting Challenges

 

As shown in Figure 1, a network with multiple program origination points, or headends, has been architected to serve five regions. Two of the five headends have been equipped with state-of-the-art broadcast equipment designed to operate as an autonomous redundant pair that distributes programming to all five headends. Each location has its own switching mechanism to select the east or west path as the active source for its local serving area. If the primary path fails, the backup is switched online. This level of protection gives all five areas the highest programming availability, but it also increases the complexity of monitoring and fault isolation.

 

 

 

Figure 1: Five headend locations, each with its own redundant interconnect for high availability

 

The five headends serve as the local centers for distribution to multiple remote systems, which in turn distribute programs to hub sites with even more attached nodes. While the headends could be designed with around-the-clock support, extending full-time support to the hubs is cost prohibitive, and the nodes are designed to operate unmanned. This is done, of course, to minimize costs to customers.

 

Consider a scenario in which a customer deep in the network calls the Call Center to report video impairments at his home or business. His report simply describes occasional blocky pictures. The Call Center is able to reliably determine that the customer equipment is not at fault and informs the Regional Network Operations Center (RNOC) of the issue. Total traffic loss will typically be identified by alarm reporting agents via SNMP to the NOC or by polling the switching fabric. However, the reported impairment is “blocky video,” and this level of alarm reporting is of little use in identifying the cause. The closest engineer is thirty minutes away from the first point in the path to that customer’s house. Several engineers could be dispatched to multiple locations to reduce the time, but that is costly. And if the reported impairment is only periodic, then even with several engineers at multiple locations it may take an extended period just to see the problem, much less resolve it.

 

Ironically, the situation is further exacerbated by the very technology that makes today’s cost-effective video over IP networks possible: MPEG compression. By effectively and efficiently eliminating any bits not needed to transmit a quality image, MPEG compression makes every packet in a compressed flow essential. Nearly any dropped packet, no matter where in the transmission network it is dropped, therefore results in a customer-perceivable artifact. Even small, transient loss events thus affect customers.

 

This situation is quite disturbing to the customer. While he may tolerate an occasional brief interruption, an extended event or too-frequent events could push him to consider switching to a competitor. Because multiple personnel may have been dispatched, operating costs increase, and the time the engineers spend away from their normal job functions adds even more. Unchecked, such costs could spiral out of control. They could tie up funds that should be financing plant growth and subscriber additions, compounding the problem by dragging down revenue growth.

 

 

 

Figure 2: Complex, redundant, modern IP networks can make transient fault locating time consuming and costly without distributed continuous program monitoring tools.

 

Identifying where in the network the last known good program source is present is critical to dispatching the correct number of appropriate personnel and sending them to the right location. Triage of the network is needed to obtain this information. Having the right test equipment installed and operating at various points in the network enables a quick diagnosis, keeping customers’ stream quality high while reducing time to repair and costs.

 

The concept is straightforward and is applied every day in detecting and isolating faults in simpler, localized systems. As illustrated in Figure 3, program quality is first verified at the headend through deep packet inspection, including Transport Stream parameter integrity. Key locations in the distribution network are then examined for loss and jitter using the Media Delivery Index (MDI) metric, which indicates where the transport fault first appears. This identifies the location for repairs.
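
As a rough illustration of the two MDI components, the sketch below computes a Delay Factor (jitter) and a Media Loss Rate (loss) for one measurement interval. It follows the virtual-buffer formulation commonly given for MDI in RFC 4445, in simplified form. The `Arrival` record, the function names, and the way packets are represented are assumptions made for this example and do not describe any particular probe.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Arrival:
    time_s: float        # arrival time, seconds from the start of the interval
    payload_bytes: int   # media payload carried by this packet


def delay_factor_ms(arrivals: Iterable[Arrival], media_rate_bps: float) -> float:
    """Delay Factor: spread of a virtual buffer that fills as packets arrive
    and drains at the nominal media rate, expressed in milliseconds."""
    rate_bytes_per_s = media_rate_bps / 8.0
    received = 0          # payload bytes received before the current packet
    levels = []
    for a in arrivals:
        pre = received - rate_bytes_per_s * a.time_s   # level just before arrival
        post = pre + a.payload_bytes                   # level just after arrival
        levels.extend((pre, post))
        received += a.payload_bytes
    if not levels:
        return 0.0
    return (max(levels) - min(levels)) / rate_bytes_per_s * 1000.0


def media_loss_rate(expected_packets: int, received_packets: int,
                    interval_s: float) -> float:
    """Media Loss Rate: media packets lost per second over the interval."""
    return (expected_packets - received_packets) / interval_s
```

With evenly paced arrivals the Delay Factor stays near one packet's worth of playout time, while bursty arrivals of the same packets produce a larger value. Any nonzero Media Loss Rate indicates that packets were dropped upstream of the measurement point.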

 

Extending this time-honored divide-and-conquer debug strategy to continuous monitoring of all streams automates the time-consuming and costly manual procedure for detecting and locating faults: distributed monitoring probes report results that are continuously correlated in the centralized video management system. Automation makes possible the detection and logging of even the most infrequent, transient events. Correlation of such events pinpoints the location and nature of the fault. Appropriate technical resources can be directed specifically to the problem area, and performance results are continuously available via trending charts to confirm whether repairs were successful. Continuous network performance measurement feedback thus leads to an ever-improving network with decreasing operational costs.
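
The correlation step can be pictured as a walk along a flow's delivery path, looking for the first monitored point that reports loss. The following is a minimal sketch under assumed inputs: the path is an ordered list of location names, and each probe reports a loss count for the flow. The function name and data shapes are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def locate_fault(path: List[str],
                 loss_by_location: Dict[str, int]
                 ) -> Tuple[Optional[str], Optional[str]]:
    """Return (last_known_good, first_impaired) for one flow.

    `path` lists locations from the source toward the subscriber.
    `loss_by_location` holds the loss count each probe reported for the flow.
    """
    last_good: Optional[str] = None
    for location in path:
        if location not in loss_by_location:
            continue                      # no probe here, so nothing is known
        if loss_by_location[location] > 0:
            return last_good, location    # fault lies between these two points
        last_good = location
    return last_good, None                # no monitored point saw loss
```

For example, if probes at City 1 and City 2 report no loss while City 4 and a downstream hub both report loss, the result is `("City 2", "City 4")`, which narrows the dispatch to the segment between those two monitored points.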

 

Figure 3: Distributed Continuous Program Monitoring yields the information needed to verify stream quality at the ingest and to expedite fault isolation and repair in the transport network

 

Centralized Correlation and Display of Monitoring and Analysis Data

 

A distributed continuous program monitoring solution gives the operator a view of a multidimensional data cube, which consists of

 

• the status of all programs,

• at all locations,

• and available continuously over time.

 

These data arrays, created from the continuous flow of data reported by network probes, are the essential information needed to achieve and maintain a high-availability video over IP network. They represent large amounts of data to which the operator needs ready and simple access. The information must be presented so that the operator is alerted to faults and can immediately identify which programs are affected, how they are affected, where they are affected, and exactly when.
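
One way to picture the program, location, and time dimensions is as a sparse store keyed by all three, which can then be sliced along any of them. The sketch below is illustrative only: the class name `ImpairmentCube`, the one-minute bucket size, and the query interface are assumptions for this example, not a description of the video management system's internals.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

Cell = Tuple[str, str, int]   # (program, location, start of time bucket in seconds)


class ImpairmentCube:
    """Sparse impairment counts indexed by program, location, and time bucket."""

    def __init__(self, bucket_s: int = 60) -> None:
        self.bucket_s = bucket_s
        self._counts: Dict[Cell, int] = defaultdict(int)

    def record(self, program: str, location: str, timestamp_s: float,
               impairments: int = 1) -> None:
        # Round the timestamp down to the start of its time bucket.
        bucket = int(timestamp_s // self.bucket_s) * self.bucket_s
        self._counts[(program, location, bucket)] += impairments

    def query(self, program: Optional[str] = None, location: Optional[str] = None,
              start_s: Optional[float] = None,
              end_s: Optional[float] = None) -> Dict[Cell, int]:
        """Return matching cells; None means 'any' along that dimension."""
        result: Dict[Cell, int] = {}
        for (prog, loc, bucket), count in self._counts.items():
            if program is not None and prog != program:
                continue
            if location is not None and loc != location:
                continue
            if start_s is not None and bucket + self.bucket_s <= start_s:
                continue
            if end_s is not None and bucket >= end_s:
                continue
            result[(prog, loc, bucket)] = count
        return result
```

Querying by program answers where and when that program was impaired, and querying by location and time window answers which programs were affected there.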

 

As shown in Figure 4, the video management system summarizes and presents the data in an easily usable form. This screenshot is a real-time view. All programs are identified by an easily recognized, user-specified alias as well as by IP address and port number. Both flow and program status, determined by preset measurement thresholds, are shown as green (good) or red (bad). Impairment counts over time intervals highlight the magnitude of faults. A 24-hour flow history (lower right) is always available, indicating exactly how many and which flows have been impaired, and exactly where and when. By tracking cumulative jitter excursions through MDI, momentary events of excessive queue utilization in transport devices can be identified and used as early warning indicators of problems before customer-perceivable loss actually occurs.
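
The threshold logic described above might be sketched as follows. The threshold values, the `Thresholds` and `FlowStatus` types, and the function name are hypothetical. The sketch assumes that any measured loss marks a flow red and that a jitter excursion on an otherwise loss-free flow raises an early warning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    max_loss_per_s: float = 0.0    # hypothetical: any loss marks the flow red
    max_df_ms: float = 50.0        # hypothetical jitter limit for early warning


@dataclass(frozen=True)
class FlowStatus:
    color: str            # "green" or "red"
    early_warning: bool   # jitter excursion seen while the flow is still loss-free


def classify_flow(mlr: float, df_ms: float,
                  limits: Thresholds = Thresholds()) -> FlowStatus:
    """Map one interval's loss rate and delay factor to a display status."""
    if mlr > limits.max_loss_per_s:
        return FlowStatus("red", False)
    # No loss yet: flag excessive jitter as a sign of queue pressure upstream.
    return FlowStatus("green", df_ms > limits.max_df_ms)
```

Keeping the early warning separate from the red/green color preserves the two-state display while still surfacing jitter trends before loss occurs.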

 

Figure 4: Real Time View of a remotely monitored network location shows current status of all flows with a 24 hour history. Any status class may be further examined in detail through a simple mouse click. Correlation with status information from other locations and other flows rapidly points the operator to trouble spots.

 

The results from each monitoring point are easily correlated in the view shown in Figure 5, which displays the status of each flow at each successive location and thus points to a trouble spot. Green/red probe, flow, and program status simplifies the use of powerful distributed tools for effective trouble locating.

 

Figure 5: Real-time status view of multiple simultaneously monitored locations points to trouble spots

 

The major MSO strives to maintain a quality customer experience while keeping operating costs reasonable. Continuous distributed program monitoring identifies distribution issues before the customer even knows that an issue may be developing, which is the level of service customers expect. The same tools that continuously monitor all program flows at strategic locations are used to identify where an issue originates, expediting the dispatch of the right resources to the right location.

 

Continuous distributed program monitoring provides the long-term trending results that make possible a structured, continuous improvement program, one that reduces costs over time rather than letting operational costs increase with growing plant complexity.

 

In a typical program flow from City 1 to City 4, an MPEG-2 program stream is received via satellite in City 1 and is multiplexed with other programs to form a Multiple Program Transport Stream (MPTS). This multicast MPTS is then fed to a switch that aggregates other multicast MPTSs as well as Single Program Transport Streams (SPTSs) and sends them around the Regional Transport Network (RTN) for all areas to use. Flowing from east to west, the MPTSs travel through City 2 and City 3 to the City 4 headend. They continue through City 5 back to City 1, completing the ring. The received MPTSs are sent to all local hub locations, where they are modulated and up-converted to RF for transport on the HFC network. The HFC network is then split into smaller nodes that serve neighborhoods and ultimately reach the customer’s home. Each of these interconnects presents an opportunity to introduce errors into the MPEG stream. The ability to identify what the error is and where it originates is vital in determining where to send an engineer. Without remotely operated continuous program monitoring test equipment, each location would require an engineer on site to troubleshoot the issue, adding unnecessary time and cost.
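
The ring traversal in this example can be written down as a small helper that lists the headends a stream passes between a source and a destination. This is a sketch only: the function name is hypothetical, the ring order is taken from the flow described above, and treating the opposite ring direction as the alternate path is an assumption.

```python
from typing import List


def ring_path(ring: List[str], source: str, dest: str,
              reverse: bool = False) -> List[str]:
    """List the headends traversed from source to dest around the ring.

    `ring` is given in the direction of normal flow; reverse=True walks
    the ring in the opposite direction.
    """
    if source == dest:
        return [source]
    order = list(reversed(ring)) if reverse else list(ring)
    start = order.index(source)           # raises ValueError if source is unknown
    if dest not in order:
        raise ValueError(f"unknown destination: {dest}")
    path = []
    for step in range(len(order)):
        node = order[(start + step) % len(order)]
        path.append(node)
        if node == dest:
            break
    return path
```

With the ring `["City 1", "City 2", "City 3", "City 4", "City 5"]`, the normal-direction path from City 1 to City 4 passes through City 2 and City 3, while the reverse direction reaches City 4 through City 5. Either list is the set of interconnects a probe-by-probe comparison would need to cover.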

 

 

Figure 6: Distributed monitoring and analysis probes save maintenance costs by helping engineers find issues quickly and cost effectively. This active view shows the state of all flows at all probe locations all the time.

 

Summary

 

Distributed continuous program monitoring provides the infrastructure to establish detailed, reliable feedback on quality and service assurance. This is a necessary component in establishing an organizational culture of quality delivery and continual cost reduction. In any business, the best intentions of the most capable staff cannot succeed in containing costs and improving quality without accurate product quality metrics, and this is especially true in the dynamic and transport-sensitive video delivery industry. A continuous real-time product like video must use continuous real-time feedback metrics that reflect the continuous viewing aspect of customer use.

 

Continuous real-time metrics obtained from thousands of streams generate large volumes of data. Visualizing key trends and locating specific faults through a centralized, intelligent Video Management System display closes the feedback loop for quality improvement and gives the operations staff the tools needed to produce results that show on the bottom line.

 

The major MSO’s state-of-the-art regional video service provider systems described here show the flexibility and power of modern video over IP delivery. Combined with distributed continuous monitoring and visualization tools for comprehensive fault detection and location, the delivery systems can grow to support virtually unlimited subscribers while assuring system quality and without ballooning operations costs.


