Prizm Content Connect v8 Server Sizing Guide
Version 1.0
03/03/2014
Copyright © 2014

Table of Contents

Summary
Benchmark Configuration Details
    Determining Targets
    Figure 1
    Environments Examined
        Windows Testing Notes
        Linux Testing Notes
    Hardware Configurations
        Server 1 Configuration (Entry Level)
        Server 2 Configuration (High End)
    Document Data Set
Benchmark Findings
    Resource Constraints
        Disk IO
        CPU
        RAM
        Disk Space
    Benchmark Results
        Server 1 (Entry Level)
        Server 2 (High End)

Summary

Determining the amount of hardware required to run the Prizm Content Connect service (PCCIS) to support your user base is a complex problem. To help determine the server hardware requirements for a given number of document views per minute, this document summarizes the findings from performance testing on our benchmark systems. In addition to the metrics defined here, it outlines the major factors that affect the performance of Prizm Content Connect. As your user base grows, understanding which changes in your environment have the greatest impact on the performance of Prizm Content Connect is critical before you make any of them.
The data provided in this document is expressed in terms of Viewing Sessions. When viewing content in Prizm Content Connect using the RESTful PCCIS API, you create a unique Viewing Session for each piece of content being viewed. To simplify the concept, you may think of a viewing session as a document view.

Benchmark Configuration Details

Determining how to get the most benefit from a given set of hardware is a somewhat complex process. At some point system resources become burdened and begin to adversely affect the system’s performance and its ability to serve content to the end user. In this section we outline the process we followed to determine where the “sweet spot” for a server resides.

Determining Targets

To determine when a particular server has reached its practical limit for handling PCCIS traffic, we use two PCCIS metrics:

● Seconds to Convert Office Document to PDF
● Seconds to Generate Page 1

The goal is to find the rate of viewing session creation at which the time to generate page 1 and/or convert the entire document becomes higher than desired. This point identifies the load level where the server begins to become less efficient. While the system will run and process content beyond this point, performance may be adversely affected. The graph below illustrates this concept.

Figure 1

Additionally, performance measurements gathered from utilities on the server itself were used to ensure that no particular resource was overburdened. The metrics used include:

● Average CPU Usage (total across all cores/CPUs)
● Current Disk Queue Length
● Average Disk Queue Length
● Disk Utilization %

Environments Examined

Because Prizm Content Connect supports both Windows and Linux environments, testing was performed on both platforms. The details below outline the environments where we benchmarked the product and the additional tools we used to gather data.
These environments are the most common environments used by the current customer base.

Windows Testing Notes

Operating System: Windows Server 2008 R2
Data Gathering Tools: Performance Monitor
Data:

● Average Disk Queue Length (Physical Disk) - Derived value calculated as Disk Transfers/sec * Avg. Disk sec/Transfer. Essentially the “% Disk Time” counter divided by 100.
● Current Disk Queue Length (Physical Disk) - The actual disk queue length at the time of sampling.
● % Processor Time - The percentage of time the processor was busy during the sampling interval.

Linux Testing Notes

Operating System: CentOS 6.4 (x86_64)
Data Gathering Tools: iostat

• Command: “iostat -xkd 30 dm-0 > disk_activity.log”

Data:

● avgqu-sz - The average disk queue size over the measurement period. The queue contains IO requests waiting to be serviced.
● await - The average total time an IO request takes to complete. This includes both the time waiting in the queue and the service time.
● svctm - The average time an IO request takes to be serviced. This time does NOT include queue wait time. Queue wait time = await - svctm.
● %util - The percentage of time that the device spent servicing requests.

Hardware Configurations

Because the hardware used to host Prizm Content Connect can vary, we examined the product in two environments to get a sense of how performance scales as the host hardware improves. The two systems we chose represent opposite ends of the server spectrum: Server 1 represents an “Entry Level” server, while Server 2 represents more advanced servers. The details of each server are outlined below.

Server 1 Configuration (Entry Level)

System Manufacturer: Dell Inc.
System Model: PowerEdge R420
System Type: x64-based PC
Processor: Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz, 2200 Mhz, 4 Core(s), 4 Logical Processor(s)
Processor: Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz, 2200 Mhz, 4 Core(s), 4 Logical Processor(s)
Hyper Threading: Disabled
BIOS Version/Date: Dell Inc. 1.5.2, 3/11/2013
Installed Physical Memory (RAM): 16.0 GB
Disk Model: SEAGATE ST300MM0006 SCSI Disk Device
Disk Speed: 10000 RPM
Disk Size: 300 GB

Server 2 Configuration (High End)

System Type: x64-based PC
Processor: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, 10 Core(s), 20 logical Processor(s)
Processor: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, 10 Core(s), 20 logical Processor(s)
Hyper Threading: Enabled
Installed Physical Memory (RAM): 198276396 kB
Shared Memory: tmpfs, 95G
Disk Model: TOSHIBA MK1002TSKB
Disk Speed: 7200 RPM
Disk Size: 1 TB

Document Data Set

Prizm Content Connect can process many different types of files. The majority of our clients display a common set of file types in the Prizm Content Connect HTML5 Viewer. The breakdown we used in our benchmark tests is based on the percentages seen at some of our larger clients.

Content Format | Number Processed | Percentage of Processed
-------------- | ---------------- | -----------------------
PDF | 999 | 45%
MS Word, DOC(X) | 608 | 27%
MS PowerPoint, PPT(X) | 302 | 14%
MS Excel, XLS(X) | 299 | 14%
RTF | 4 | < 1%
HTML | 2 | < 1%
Total | 2214 |

Benchmark Findings

This section discusses how different hardware characteristics affect the processing power of Prizm Content Connect. The characteristics outlined here have the greatest impact on a system’s ability to process content for display.

Resource Constraints

No single piece of hardware will address every performance concern you might have. To avoid underutilizing your hardware, Accusoft recommends keeping these characteristics in balance within your environment.
Disk IO

In its default configuration on common hardware of the day, the biggest limiting resource for PCCIS is most likely to be disk IO. PCCIS caches a considerable amount of data on a local drive, consisting of document data (both original and converted) as well as state and other information. As the load on a server increases, the data being written to and read from disk continues to grow with each new request.

We often see customers proposing hardware configurations that have multi-core, multi-threaded CPUs but only a single ~7200 RPM hard disk drive. A single IO device in this situation can quickly become overburdened, causing its queue length to grow to the point where much of the time used for document conversions is spent waiting for IO requests.

There are options today that can greatly improve IO transfer rates:

• SSD drives: A good option to minimize IO wait time, allowing high-end multi-core processors to extract the most performance out of PCC.
• Shared Memory Drive (Linux): A chunk of the available RAM is shared and appears as a mounted file system. This storage is not persistent, but that is perfectly acceptable for many parts of the PCCIS cache.

Maximizing the number of disk IO operations your system can perform is the most beneficial investment you can make to allow the system hosting Prizm Content Connect to perform at its peak.

CPU

The CPU is another resource that should be considered when determining hardware requirements for PCC. Assuming PCC performance is not bottlenecked by disk IO (that is, an SSD, shared memory, or another disk IO optimization is in use), the ideal range for CPU usage is about 60% total across all available logical cores. The 60% target is a guideline for where system activity should be for the system to be at peak productivity, as referenced by the yellow line in Figure 1 above.
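The disk IO and CPU guidance above can be turned into a rough check on measured data. The following is a minimal sketch, not part of the benchmark tooling: the Sample fields, the function names, and the 0.5 wait-ratio cut-off are assumptions made for illustration. Only the 60% CPU guideline and the relation Queue wait time = await - svctm come from this guide.

```python
from dataclasses import dataclass

CPU_TARGET_PERCENT = 60.0  # CPU guideline stated in this guide


@dataclass
class Sample:
    """One measurement interval. Field names are hypothetical."""
    cpu_percent: float  # average CPU usage across all logical cores
    await_ms: float     # iostat await: queue wait plus service time
    svctm_ms: float     # iostat svctm: service time only


def queue_wait_ms(sample: Sample) -> float:
    """Queue wait time = await - svctm, floored at zero."""
    return max(sample.await_ms - sample.svctm_ms, 0.0)


def likely_constraint(sample: Sample, wait_ratio_limit: float = 0.5) -> str:
    """Rough classification of the limiting resource.

    wait_ratio_limit is an assumed cut-off, not a benchmark result:
    if more than that share of each IO request is spent queued,
    the disk is treated as the constraint.
    """
    wait = queue_wait_ms(sample)
    if sample.await_ms > 0 and wait / sample.await_ms > wait_ratio_limit:
        return "disk"
    if sample.cpu_percent > CPU_TARGET_PERCENT:
        return "cpu"
    return "ok"
```

Disk is checked first because the CPU guideline only applies once disk IO is no longer the bottleneck.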
RAM

Based on our experimentation and testing, we recommend that a server contain 2 GB of memory per physical CPU core for use by Prizm Content Connect for processing. Any memory set aside for a RAM disk or shared memory volume should be in addition to this recommendation.

Disk Space

Because the content being processed can vary greatly in size, the cache space required by Prizm Content Connect varies accordingly. The recommendations in this section are based on averages determined from product usage reports. The minimum required disk space should be calculated with the following formula:

75 MB x Viewing Sessions/Minute x (Viewing Session Timeout Minutes + Cache Expiration Minutes)

Example (Benchmark server, Linux):

75 x 10 x (20 + 20) = 30,000 MB (30 GB)

Benchmark Results

Server 1 (Entry Level)

Platform/OS | Viewing Sessions/Minute | Conversion Service Requests/Minute | Seconds to Generate Page 1 | Seconds to Convert Office Document to PDF
----------- | ----------------------- | ---------------------------------- | -------------------------- | -----------------------------------------
Microsoft Windows Server 2008 R2 Datacenter | 4 | 180 | 4 | 7
Linux CentOS 6.4 | 12 | 300 | 4 | 7

Server 2 (High End)

Platform/OS | Viewing Sessions/Minute | Conversion Service Requests/Minute | Seconds to Generate Page 1 | Seconds to Convert Office Document to PDF‡
----------- | ----------------------- | ---------------------------------- | -------------------------- | ------------------------------------------
Microsoft Windows Server 2008 R2 Datacenter* | 10 | 240 | 2 | 5
Linux CentOS 6.4 | 30 | 720 | 2 | 5
Linux CentOS 6.4** | 36 | 1200 | 2 | 5

* Estimated based on a predetermined ratio of performance for Windows to Linux.
** Using Shared Memory for all PCCIS and Conversion Service cache directories, with the Proxy Server still using HDD.
‡ Data in this column is provided for conversion services sizing estimates and is not relevant to viewing performance.
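The RAM and Disk Space recommendations in this guide reduce to simple arithmetic, so they can be scripted when comparing candidate servers. The sketch below is illustrative only: the function and constant names are hypothetical, and the sample inputs (30 sessions per minute, 20-minute timeout and cache expiration, 8 physical cores) are assumed values, not a sizing recommendation.

```python
MB_PER_VIEWING_SESSION = 75    # per-session figure from the Disk Space formula
RAM_GB_PER_PHYSICAL_CORE = 2   # RAM recommendation from this guide


def min_cache_disk_mb(sessions_per_minute: float,
                      session_timeout_min: float,
                      cache_expiration_min: float) -> float:
    """Minimum cache disk space in MB, following the Disk Space formula."""
    retention_min = session_timeout_min + cache_expiration_min
    return MB_PER_VIEWING_SESSION * sessions_per_minute * retention_min


def recommended_ram_gb(physical_cores: int, ram_disk_gb: float = 0.0) -> float:
    """RAM for PCC processing plus any RAM disk or shared memory volume."""
    return RAM_GB_PER_PHYSICAL_CORE * physical_cores + ram_disk_gb


if __name__ == "__main__":
    # Assumed inputs for illustration only.
    disk_mb = min_cache_disk_mb(30, 20, 20)
    print(f"Cache disk: {disk_mb:,.0f} MB (about {disk_mb / 1000:.0f} GB)")
    print(f"RAM for 8 physical cores: {recommended_ram_gb(8):.0f} GB")
```

The ram_disk_gb argument keeps shared memory separate from processing memory, matching the guidance that RAM disk capacity is counted in addition to the per-core figure.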