Essential Performance Tips For SAS Visual Analytics: Meera Venkataramani, SAS Institute Inc
ABSTRACT
INTRODUCTION
Identifying performance issues distributed across SAS Viya is a challenging task that
requires a step-by-step approach to find the cause of the bottleneck. When it comes to
performance, there are a myriad of factors involved with each one contributing in its own
way to the overall problem. Sometimes, it’s like finding a needle in a haystack. Isolating the
various factors is the key.
This paper provides guidelines and basic validation tests to identify these issues and
determine how to solve them. Establishing early on what lies within SAS and what lies
outside SAS gives us an edge. We have seen over the years that everyone has a different approach to solving
performance issues and everyone has their own set of steps and tools they use to start this
evaluation. This paper is an effort to gather, review, test, and document all of these tools in
one place and to provide a step-by-step framework to solve performance issues.
performance or system downtime. Applications must be fully functional, efficient, and
readily available around the clock to deliver optimal performance and a high-quality
experience to their customers at all times. Would you like it if your Netflix movie
took far too long to buffer and occasionally froze or dropped out?
SAS is committed to delivering the best possible performance with our applications to our
customers. However, dealing with system performance is a joint responsibility between
SAS, the customer, the customer’s IT organization, and even other third-party service
providers. Customers expect SAS to understand how to get the best performance, even
though we don’t control all aspects of the issue. Solving such problems often involves
getting several teams in a room to work through the various layers of the application and
infrastructure.
SAS VIYA ARCHITECTURE
Not only does SAS Viya bring exciting advancements in high-performance analytics, it
introduces several modern practices to SAS software architecture. SAS Viya brings a more
resilient, elastic, unified, and open architecture, which leverages cloud-friendly
microservices and a next generation analytics run-time engine.
To administer SAS Viya, you should have a good understanding of each component of the
SAS Viya architecture and of how the components can be collectively managed to keep your
environment available, secure, and performant for the users and processes you support.
Your system is only as good as the
weakest link in your chain. For SAS Viya to perform optimally, each of the components in
SAS Viya should also be performing optimally. Figure 1 shows a high-level architectural view
of SAS Viya.
IT INFRASTRUCTURE
Let’s see an example of what a customer’s IT infrastructure might look like.
Figure 2 illustrates how a modern organization might function in a global environment. It
provides a high-level map or plan of the information assets in an organization, which guides
the current operations. Notice the following in the figure:
• A front-end web server talks with an application server that talks with a middleware
server that queries one or more database servers.
• Then, all of those servers might talk with DNS servers to look up IP addresses or
map them back to server names.
When that happens, just one weak link slows the whole application down.
For SAS Viya to perform optimally, the underlying infrastructure that SAS Viya depends
upon must be responsive across all the different components of the application service.
Otherwise, the entire application service is impacted.
One of the biggest factors that impacts application performance is design. Performance
must be designed in. When applications are specified, performance goals need to be
delineated along with the details of the environment the applications will run in. Often
development is left out of this and applications are monitored, analyzed and “fixed” after
they are released into production. This never works as well as when performance is one of
the key goals of the application design before a line of code is written.
Hardware Sizing
Hardware sizing is never an exact science, particularly when you consider the potential
these days for database and user population growth. So, when you receive a sizing from
SAS, consider going over it with your SAS representative and confirming whether the sizing
takes into account your best estimates for storage and usage growth in both the near term
and the foreseeable future. All customers should have plans in place to manage application
growth. Make sure you have adequately accounted for memory, CPU, and disk space.
Virtualization
In addition to latency and bandwidth being an issue, there is a good chance that customers
are running SAS software on virtualized hardware that is shared with other applications. It
is very challenging in this dynamic environment to understand what will impact your
application performance, as it requires intimate knowledge of your ever-changing
application structure at any given moment. In addition, virtualization usually adds
overhead compared with running directly on physical hardware.
The modern application is complex, and a single transaction trace can sprawl across many
layers in a virtualized, cloud world – a perfect storm impacting application performance.
This growing complexity impacts application performance from the end-user experience all
the way back through transactions, the application layer, application infrastructure, and IT
infrastructure. The IT team should be able to run tests to validate that the virtualized
hardware is not overcommitted and that resources are used optimally.
Browsers
Today’s web-based applications tend to push user-interaction work — often accompanied by
lots of data — to the client workstation. From there, JavaScript code processes hundreds or
thousands of rows of data, which can cause multi-second pauses before the client displays
the updates.
A web browser is the key blind spot for gaining true end-to-end visibility into application
performance. With new approaches to application design and the increased usage of web
services, the ability to monitor the processing that takes place within the browser has
become one of the key requirements for full visibility into application performance.
Applications perform differently in different browsers. Adequate testing and use cases
should be validated using different browsers on different operating systems to make sure
that your application performs optimally on all of the supported browsers.
INFRASTRUCTURE TESTS
Now that you have deployed SAS Viya, everything looks OK in your development and test
environments, and the system is rolled out to production. But then you start to see
problems. Tests that once ran smoothly in the lab are now showing degraded performance.
End users are complaining. Your boss is upset. The pressure builds
for you to finally fix that slow application everyone depends upon. Where do you start?
Start by performing system-level baseline tests (outside of SAS Viya) to rule out issues
related to network and connectivity. Assessing the health of your system infrastructure in
this way is key to moving further with the diagnosis and will help save a tremendous
amount of time. These tests allow us to rule out basic, obvious issues
associated with memory, network connectivity, latency, bandwidth, input/output, and disk
space requirements. Let us look at some of these tests in greater detail.
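As a first pass, the memory and disk space checks mentioned above can be scripted with standard Linux commands. The following is a minimal sketch, not a SAS-provided utility: the function names, the choice of / as the file system to inspect, and the 90% warning threshold are assumptions made for illustration, and the memory check assumes a Linux kernel that exposes MemAvailable in /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helpers for a quick baseline check; none of these names come from SAS.

# Print the percentage of disk space used on the file system that holds the given path.
disk_pct_used() {
    # -P forces the portable format with one line per file system.
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Print available memory in MiB (Linux only; reads /proc/meminfo).
mem_available_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# Warn when usage reaches an illustrative threshold (assumed value: 90%).
check_disk() {
    used=$(disk_pct_used "$1")
    if [ "$used" -ge 90 ]; then
        echo "WARN: $1 is ${used}% full"
    else
        echo "OK: $1 is ${used}% full"
    fi
}

check_disk /
echo "Available memory: $(mem_available_mib) MiB"
```

Running the same script on every node and comparing the results is a cheap way to spot a machine that is short on memory or disk space before moving on to the network tests.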
To run the tkgridperf network test, go to your SAS Cloud Analytic Services (CAS) controller
node and issue the following commands:
export GRIDHOST=<name of controller node>
export GRIDINSTALLLOC=/opt/sas/viya/home/SASFoundation/utilities
export GRIDRSHCOMMAND=/usr/bin/ssh
export TKPATH=/opt/sas/viya/home/SASFoundation/sasexe
cd /opt/sas/viya/home/SASFoundation/utilities/bin
./tkgridperf
Note: This example assumes that SAS is installed under /opt/sas.
The tkgridperf network test returns three results. Let us look at example output from one of
my systems that has one controller and three worker nodes.
bash@system02>./tkgridperf
Grid initialized with 4 machines
Time for bcast(100M bytes, 20 times): 6013 ms
Time for reduce(max, 4 bytes, 10K times): 5220 ms
Time for allreduce(max, 4 bytes, 10K times): 5156 ms
The bcast test sends 100M bytes of data to all nodes 20 times. This is a test of throughput
between nodes. It is more a measurement of bandwidth, because it moves a large amount of
data a small number of times. The reduce test gathers 4 bytes from each node to the
controller 10,000 times. This is a small amount of data, so this is mostly a latency test.
The allreduce test gathers 4 bytes to all nodes 10,000 times. Again, this is a latency test.
The reduce and allreduce tests use tiny payloads (4 bytes) but repeat 10,000 times. They
test latency, so if the latency is high, the value in milliseconds will be high. Ideally, you
want the value in milliseconds to be as low as possible.
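To make the numbers concrete, the sample output above can be converted into an approximate throughput and a per-operation latency. This is a rough sketch under two assumptions that the tool's output does not confirm: that "100M bytes" means 100 x 10^6 bytes per broadcast, and that dividing the total volume by the elapsed time gives a fair aggregate figure (it ignores how the data fans out across nodes). The function names are made up for illustration.

```shell
#!/bin/sh
# Rough arithmetic on the sample tkgridperf figures shown above.

# Aggregate broadcast rate in MB/s: 20 broadcasts of 100 MB (assumed 10^6 bytes per MB).
bcast_mb_per_s() {
    awk -v ms="$1" 'BEGIN { printf "%.1f\n", (100 * 20) / (ms / 1000) }'
}

# Average time per operation in ms for the 10K-iteration reduce and allreduce tests.
ms_per_op() {
    awk -v ms="$1" 'BEGIN { printf "%.3f\n", ms / 10000 }'
}

echo "bcast:     $(bcast_mb_per_s 6013) MB/s"    # 332.6
echo "reduce:    $(ms_per_op 5220) ms per op"   # 0.522
echo "allreduce: $(ms_per_op 5156) ms per op"   # 0.516
```

Derived figures like these are most useful for comparing runs on the same grid over time, or before and after removing a suspect node.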
The tkgridperf test will not tell you which node is slow. It will only tell you whether the CAS grid
as a whole is working well together. If you get a bad result and want to narrow down the
results to find out which node is slow, you could start by lowering the number of worker
nodes with the -procs option. You can also remove nodes from the machine list and try to
isolate the slow node (or nodes) that way.
This test can be used as a first pass to identify obvious latency, bandwidth, and other
networking issues. CAS is not used at all in these tests; however, the test uses the same
communication library that CAS uses.
qperf
The Linux command qperf is another way to measure network bandwidth and latency.
The qperf command works over TCP/IP, RDMA, and many other transports. It
connects two nodes: one node runs qperf with no arguments and acts as the server, and a
second node runs qperf with two arguments (the server host and the test to run). Many
options are available.
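A typical client call is sketched below. The tcp_bw and tcp_lat names are standard qperf tests for TCP bandwidth and latency. The wrapper function itself is hypothetical, and the sketch assumes that qperf is already running with no arguments on the server node.

```shell
#!/bin/sh
# Hypothetical wrapper around the qperf client. Start plain "qperf" (no arguments)
# on the server node first, then call this function from the second node.
measure_link() {
    if [ "$#" -ne 1 ]; then
        echo "usage: measure_link <server-host>" >&2
        return 2
    fi
    if ! command -v qperf >/dev/null 2>&1; then
        echo "qperf is not installed on this node" >&2
        return 127
    fi
    # qperf accepts several test names in one call: here, TCP bandwidth and TCP latency.
    qperf "$1" tcp_bw tcp_lat
}
```

For example, measure_link worker01 (where worker01 is a placeholder host name) reports the bandwidth and latency between the current node and that server. Repeating the call against each worker in turn helps isolate a slow link.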
Dstat is a handy system administration tool for viewing Linux system resources. It generates
useful system resource statistics in real time. (Dstat replaces multiple Linux tools, including
vmstat, netstat, ifstat, iostat, and mpstat.) Dstat lets you monitor systems while you are
troubleshooting, testing performance tuning, or benchmarking.
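The sketch below shows one way to take a fixed number of dstat samples while you reproduce a slow operation. The -t, -c, -m, -d, and -n flags are standard dstat options; the wrapper function and the interval and count used in the usage note are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: sample time, CPU, memory, disk, and network statistics.
sample_resources() {
    if [ "$#" -ne 2 ]; then
        echo "usage: sample_resources <interval-seconds> <count>" >&2
        return 2
    fi
    if ! command -v dstat >/dev/null 2>&1; then
        echo "dstat is not installed on this node" >&2
        return 127
    fi
    # -t time, -c CPU, -m memory, -d disk, -n network; then the delay and the number of updates.
    dstat -tcmdn "$1" "$2"
}
```

For example, sample_resources 5 12 prints 12 updates at 5-second intervals, which is usually enough to see whether CPU, memory, disk, or network activity spikes while the problem is occurring.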
gridmon.sh is a console or terminal application that can be run from a Linux terminal or a
terminal emulator like PuTTY. gridmon.sh displays data that is streamed from all of the
machines on your CAS server. It shows information about jobs, individual machines on the
server, and the attached disks. The gridmon application shows the status closer to real
time, and it has some useful additional details, such as the specific percentage of CPU,
memory, and number of ranks (both active and pending). gridmon.sh also lets you show
ranks, kill jobs, or run the gstack application to collect results.
gridmon.sh is a popular tool for anyone wanting to know everything about a SAS Viya
system. You could use all the menu choices of gridmon as a cheat sheet. It even has a
record/playback feature so that you can make a recording at a customer site and send it to
SAS Technical Support so that we can analyze it further.
The SAS documentation for gridmon.sh provides a complete list of the available commands.
The descriptions should give you enough information for cases where you want to see, for
example, which sessions or processes are running which actions and how much memory,
disk space, and CPU they are using. More difficult problems often require a
combination of detailed machine, operating system, and CAS knowledge. This is where the
record/playback feature comes in. Customers can send their recordings, CAS logs, and a
description of the symptoms to SAS Technical Support, so that we at SAS can hopefully
provide a diagnosis, or at least next steps.
Microservices Analysis
SAS Viya is based on a microservices architecture that structures an application as a
collection of loosely coupled services. In a microservices architecture, the services are fine-
grained and the protocols are lightweight. The benefit of decomposing an application into
different smaller services is that it improves modularity. This makes the application easier to
understand, develop, and test, and more resilient to architecture erosion.
We could use many of the operating system-level commands to identify the microservice
that is consuming the largest amount of resources. Commands such as ps, top, and htop
can be used to find this information. Once you have identified the most CPU-hungry
process, you can find out which microservice it is.
Here is an example:
Use the ps -ef | grep sas command on the SAS Viya services machine. This gives you a
list of processes along with the full command line of each one. (ps -ef does not report
memory; use a format such as ps aux to see the memory consumption of each microservice.)
If you look closely, the output also gives you the name of the microservice, because the
JAR name is in the full command line of the Java process. However, it isn’t always visible
when you use the top command. You can then use the JAR name to identify which
microservices are consuming which resources.
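The lookup described above can be sketched as two small shell functions. Both function names are hypothetical, the --sort option assumes the procps version of ps found on most Linux distributions, and the command line in the demonstration is made up rather than taken from a real deployment.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of SAS Viya.

# List the ten busiest processes whose command line mentions "sas".
# Requires procps ps for --sort; the [s] keeps grep from matching itself.
top_sas_processes() {
    ps -eo pid,pcpu,pmem,args --sort=-pcpu | grep '[s]as' | head -n 10
}

# Read command lines on stdin and print the JAR file name found in each one.
jar_name() {
    sed -n 's|.*[/ ]\([^/ ]*\.jar\).*|\1|p'
}

# Demonstration on a made-up command line.
echo 'java -Xmx512m -jar /path/to/example-service.jar' | jar_name   # example-service.jar
```

On a services machine, top_sas_processes | jar_name would print the JAR names of the busiest Java processes in order of CPU use. Lines without a JAR in the command line are skipped.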
HTML5, JavaScript, and CSS improvements, more and more logic and behavior have been
pushed down to the client. This adds to the overall perceived performance of a website or
web application.
There are several ways you could go deeper into the SAS Visual Analytics application layer
itself, such as analyzing the SAS Visual Analytics logs from a SAS Viya system, inspecting
the web page using the browser’s developer tools, or running a network sniffing tool such as
Wireshark or Fiddler.
Developer Tools
All modern browsers and most other environments support debugging tools, a special UI in
an application that makes debugging much easier. Developer tools enable you to trace the
code step-by-step to see what exactly is going on. These are all tools that are built into the
browser and do not require additional modules or configuration. If you are ambitious and
curious, these tools can give you a lot of insight into how your SAS Visual Analytics
application is behaving.
You can download Fiddler from here: www.telerik.com/fiddler. Installation is very easy and
straightforward. Once you’ve installed Fiddler, open it, and in the upper left corner select
File > Capture Traffic to start capturing HTTP traffic.
SAS TECHNICAL SUPPORT
SAS Technical Support’s mission is to help our customers make the best use of our software
products through effective and responsive support, active advocacy, and a broad and
flexible range of self-help resources. Get world-class technical support through our SAS
tracking system, and visit our wealth of knowledge-based resources at SAS Technical Support.
Also please visit the SAS Support Communities for help if you are stuck on a problem.
While you're there, get a SAS tip and share what you know. This community of SAS experts
is there to help you to succeed.
CONCLUSION
While each problem is different and brings its own complexity with it, the general guidelines
I’ve laid out and tests I recommended should help you diagnose or get closer to identifying
the bottlenecks in your SAS Viya applications. Most often, precious time is lost between
the moment the problem starts occurring and the moment a final diagnosis is made. A lot of
this downtime and frustration can be avoided if the obvious, basic issues with
infrastructure, network, connectivity, and latency are ruled out early on,
so that we can focus only on the complex, hard-to-find issues.
Often the problems are complex and come disguised as something so hairy and
weird that we waste endless amounts of time chasing the wrong issues, when a
simple test to check network or connectivity could have given us useful information to begin
with.
The forensic process is almost like watching a mystery thriller and requires you to be
curious and adventurous. It is not for everyone and certainly not for the faint of heart. If
you do want to embark on this adventure, you get to be the
detective here. Either way, remember that SAS Technical Support is always there to help you, so
engage them early in your troubleshooting journey.
REFERENCES
Brown, Tony. 2019. “Engineering CAS Performance: Hardware, Network, and Storage Considerations for CAS Servers.” Proceedings of the SAS Global Forum 2019 Conference. Cary, NC: SAS Institute Inc. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2019/3351-2019.pdf.
Ellington, Bryan. 2020. “SAS® Viya® Monitoring Using Open-Source Tools.” Proceedings of the SAS Global Forum 2020 Conference. Cary, NC: SAS Institute Inc. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4214-2020.pdf.
Kuell, Jim. 2020. “Diagnosing the Most Common SAS® Viya® Performance Problems.” Proceedings of the SAS Global Forum 2020 Conference. Cary, NC: SAS Institute Inc. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4296-2020.pdf.
SAS Institute Inc. SAS® Cloud Analytic Services. Cary, NC: SAS Institute Inc. https://go.documentation.sas.com/?cdcId=calcdc&cdcVersion=3.5&docsetId=calserverscas&docsetTarget=n08000viyaservers000000admin.htm&locale=en#n08193viyaservers000000admin.
SAS Institute Inc. SAS Note 42197. “A list of papers useful for troubleshooting system
performance problems.” http://support.sas.com/kb/42/197.html.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Meera Venkataramani
SAS Campus Drive
Cary, NC 27513
SAS Institute Inc.
www.sas.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or
trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA
registration. Other brand and product names are trademarks of their respective companies.