
05 January 2012
In Part 1 of this two-part series, I laid out the demands that VDI places on contemporary and common WAN environments. I also outlined the reasons why VDI presents new challenges to the WAN and WAN engineers. In this second part, I will give you some techniques to monitor your WAN for VDI trouble "hotspots" and also give you practical solutions you can implement today without tearing out your existing WAN.
Learn from the *recent* past
If you have assessed your WAN as ready for VDI, then create utilization baselines a month or two before your full deployment of VDI. You can do this by taking snapshots of your WAN utilization for specific periods. Generate a report for an entire month, a weekly report for each week of that same month and a daily report for each day of a typical week. The reports will serve as your comparative baseline when you have VDI performance issues and others ask if the WAN was the cause. Overlay your current utilization report for a given period on top of your baseline and see if there are any anomalies.
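As a rough illustration of the overlay idea, here is a minimal Python sketch. It assumes you can export utilization percentages for matching intervals from both reports. The function name `overlay_anomalies`, the 20-point margin and the sample figures are all hypothetical, not taken from any particular monitoring product.

```python
def overlay_anomalies(baseline, current, margin_pct=20.0):
    """Compare two utilization series interval by interval.

    Returns (index, baseline_value, current_value) for every interval
    where current utilization exceeds the baseline by more than
    margin_pct percentage points. The margin is an assumed starting
    point; tune it for your own environment.
    """
    if len(baseline) != len(current):
        raise ValueError("baseline and current must cover the same intervals")
    return [
        (i, b, c)
        for i, (b, c) in enumerate(zip(baseline, current))
        if c - b > margin_pct
    ]


# Hypothetical daily average utilization (%) for a five-day work week.
baseline_week = [35.0, 40.0, 42.0, 38.0, 41.0]
current_week = [37.0, 71.0, 45.0, 39.0, 66.0]

for day, before, now in overlay_anomalies(baseline_week, current_week):
    print(f"Day {day}: baseline {before}% vs current {now}%")
```

With these sample numbers, the second and fifth days would be flagged for a closer look.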
If you have the budget and resources, look to implement an intelligent network utilization monitor such as Riverbed Cascade or SolarWinds Orion. Both will capture baselines and perform comparisons as part of their normal operation.
The devil is in the details
Utilization charts such as those mentioned above are useful for finding potential hotspots but they won't bring you to a final conclusion. If you don't already have one, implement a network analyzer that captures traffic flow at the protocol and session level. This will allow you to find the culprits that are monopolizing your bandwidth and even tell you what they might be doing. If you do this, remember the following:
• Past-time is as good as real-time - Don't be satisfied with just capturing information for the last 5 minutes. Make sure you allow your analyzer to keep detailed information for at least a week.
• Who did what? - Knowing which device/user was causing the flooding is useful but also track the ports so that you can trace the usage to a particular application.
• Speed up your monitoring system - Don't be cheap when giving your monitoring system resources. You need it to be fast to expedite your diagnosis and find the offenders. There is a reason why police officers learn to drive fast.
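To make the "who did what" point concrete, the following Python sketch totals bytes per source address and destination port from a list of flow records. The record layout (source IP, destination port, byte count) and the function name `top_talkers` are assumptions for illustration; a real analyzer export will have its own format.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Sum bytes per (source IP, destination port) and return the n largest.

    Each flow is assumed to be a (src_ip, dst_port, byte_count) tuple.
    Keeping the port in the key lets you trace usage to an application,
    not just to a device.
    """
    totals = Counter()
    for src_ip, dst_port, byte_count in flows:
        totals[(src_ip, dst_port)] += byte_count
    return totals.most_common(n)


# Hypothetical flow records for one capture window.
flows = [
    ("10.0.1.15", 1494, 120_000),  # TCP 1494 is the default ICA port
    ("10.0.1.22", 80, 900_000),    # web traffic
    ("10.0.1.22", 80, 650_000),
    ("10.0.1.15", 3389, 80_000),   # TCP 3389 is the default RDP port
]

for (src_ip, dst_port), total_bytes in top_talkers(flows, n=2):
    print(f"{src_ip} port {dst_port}: {total_bytes} bytes")
```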
Rethink your WAN threshold tolerance
WAN engineers often look for "tabletops" or "plateaus" which indicate over-subscribed WAN lines. Unfortunately, typical WAN utilization graphs show the average across a sampling period. Hence, if your sampling period is 1 minute and the line was maxed out for only 15 of those seconds, you won't see that the line reached 100% utilization even though the VDI users may have had a frozen screen for the entire 15 seconds. Let's use the following example:
• Sampling period = 1 minute
• WAN line is 100% utilized (maybe because someone is downloading a very large file) for 15 seconds
• For the other 45 seconds, the WAN is consistently at 50%
• Average utilization in your charts will show only 62.5%, i.e. (15 x 100% + 45 x 50%) / 60
In the example above, VDI users across the same WAN connection likely experienced severely degraded response times and may have even experienced screens paused for 15 seconds. If a user complains about pauses for 5 seconds, imagine what colorful words they would scream out about IT for delays of 15 seconds.
You should either use shorter sampling periods (I recommend 15 seconds) or redefine your acceptable bandwidth peak.
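The arithmetic in the example can be checked with a short Python sketch. It assumes one utilization reading per second and, for simplicity, that the 15-second burst lines up exactly with a sampling window; the function names are hypothetical.

```python
def average_utilization(per_second):
    """Average a list of per-second utilization readings (%)."""
    return sum(per_second) / len(per_second)


def windowed_averages(per_second, window):
    """Average the readings over consecutive windows of `window` seconds."""
    return [
        average_utilization(per_second[i:i + window])
        for i in range(0, len(per_second), window)
    ]


# 15 seconds at 100% followed by 45 seconds at 50%, as in the example.
samples = [100.0] * 15 + [50.0] * 45

print(windowed_averages(samples, 60))  # one 1-minute sample
print(windowed_averages(samples, 15))  # four 15-second samples
```

The 1-minute window reports 62.5%, hiding the saturation, while the 15-second windows report 100% for the first interval and 50% for the rest. If the burst straddled two windows it would be partly averaged away again, which is one more reason to lower your acceptable peak threshold as well.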
Leverage WAN optimization
If you read the first article, you might think this contradicts its entire section titled "WAN optimization won't come to the rescue". That section still holds true. WAN optimization won't be your savior but will still benefit the environment. Below are reasons to consider WAN optimization:
• Reducing the bandwidth used by other traffic reduces contention, which opens up more of the WAN throughput for VDI traffic
• Some WANOp devices (such as the Riverbed Steelhead) provide prioritization, allowing you to reserve the bandwidth you need
• More WANOp devices now support compression and caching of VDI protocols such as ICA and RDP. In fact, Riverbed announced recently it would even support Teradici PC-over-IP in an upcoming release.
Implement bandwidth limits
A tactic rarely implemented in VDI environments is the use of bandwidth limits. One of the fears is how it might affect the quality of the desktop and the user's satisfaction. However, allowing uncapped sessions means that a single user who maxes out the line can degrade performance for every other user.
The applications that can cause flooding are not limited to video playback and other graphics-intensive apps. Even something as simple as an online game on Yahoo can eat up much of your bandwidth. Try playing some of those Flash games online and see what it does to your ICA or PC-over-IP protocol traffic. You'll see the transmit rate reach well above 500Kbps because the compiled Flash code, though small, acts like video rendering in VDI.
To offset any fears of too low a cap, start with a cap of 1/3 your WAN line speed (e.g. roughly 500Kbps on a T1) and work your way down from there. At the very least, you have now prevented a single user from becoming a bandwidth hog.
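The starting cap is simple arithmetic, sketched below in Python. The T1 rate of 1544Kbps is the standard line speed; the function name and the rounding to the nearest 100Kbps are assumptions made here to arrive at a tidy figure.

```python
T1_KBPS = 1544  # standard T1 line rate in Kbps


def starting_cap_kbps(line_speed_kbps, fraction=1 / 3, round_to=100):
    """Suggest a per-session starting cap as a fraction of line speed.

    The result is rounded to the nearest `round_to` Kbps. Both the
    one-third fraction (from the article) and the rounding step are
    starting points to adjust downward after testing.
    """
    raw = line_speed_kbps * fraction
    return int(round(raw / round_to) * round_to)


print(starting_cap_kbps(T1_KBPS))  # cap for a T1
print(starting_cap_kbps(10_000))   # cap for a 10Mbps line
```

One third of a T1 is about 515Kbps, which rounds to the 500Kbps starting point used above.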
Revisit your QoS design
VDI is such a "game changer" that it is time to review your network QoS design. You likely have Voice and Video-over-IP in the highest available class of service, with critical business applications below it (e.g. ERP, email) and non-critical applications such as web traffic pushed further down. VDI deserves a higher priority than even the critical business applications, or at least should be classified alongside them. The reason for this is that VDI traffic directly impacts the user's perception of the performance of their applications.
Contemporary, non-virtualized applications are usually designed to assume distance between the app and the server and use techniques such as local caching and session state management. One example of this is Outlook. Outlook once required a direct and fast connection to the Exchange server. By default, Outlook now utilizes the OST (offline, local database) and then uses background synchronization, which hides any network latencies from the user for most Outlook operations.
However, VDI requires an acknowledgement and response for almost every user operation, which creates a higher dependency on network performance. In fact, latency has an even more dramatic effect on VDI than it does on VoIP because VDI uses TCP, whereas VoIP uses UDP and is designed to accept and account for packet loss, jitter and drops.
Review your QoS design to accommodate for this new demand.
WAN engineers have new challenges when the CIO decides to put VDI on the network. They have to be more responsive and exercise deeper inspection and control of their networks. If you think Voice-over-IP created new demands that required changes in the way we design and manage our networks, wait until you see what VDI will do on the network. It is more demanding and directly impacts the user's perception of IT infrastructure performance. These techniques will hopefully prepare you for the new demands and challenges created by VDI.




