How to Handle the “Memory Pages Per Second is too High” Alert

Sometimes, when you set up SCOM (System Center Operations Manager, the monitoring tool) in your environment and configure a SQL Management Pack to monitor “Memory Pages Per Second”, you will get the following alert: “Memory Pages Per Second is too High. The threshold for the Memory\Pages/sec performance counter has been exceeded.”

Let’s first discuss how System Center Operations Manager raises this alert.

The Pages/sec counter is the rate at which pages are read from or written to disk to resolve hard page faults, that is, memory requests for pages that are not in memory. In other words, Pages/sec is the sum of Pages Input/sec and Pages Output/sec. (Pages output are pages removed from memory and heading to disk.) A hard page fault happens when the memory manager does not find the page it needs in physical memory because it has been swapped out to disk. Fetching it back from disk degrades the performance of the server and of SQL Server.

This rule generates an alert when Pages/sec exceeds the defined threshold for a sustained period of time (the default is Memory\Pages/sec > 50).
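The "sustained period" condition is what separates a one-off spike from a real alert. Here is a minimal Python sketch of that idea, not SCOM's actual implementation: the function name `sustained_breach`, the `min_consecutive` parameter and the sample values are all assumptions made for illustration. Only the default threshold of 50 comes from the rule described above.

```python
def sustained_breach(samples, threshold=50.0, min_consecutive=3):
    """Return True if `min_consecutive` samples in a row exceed `threshold`.

    `min_consecutive` is a hypothetical stand-in for the rule's
    "sustained period"; the real sampling logic lives in the management pack.
    """
    run = 0
    for value in samples:
        # Extend the current run on a breach, reset it otherwise.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical Pages/sec samples collected at a fixed interval.
single_spike = [12, 180, 9, 14, 11]
steady_paging = [12, 80, 95, 120, 70]

print(sustained_breach(single_spike))   # False: one spike is not sustained
print(sustained_breach(steady_paging))  # True: four breaches in a row
```

A single spike never triggers here, which matches the intent of the rule: short bursts of paging are normal, while continuous paging is worth a look.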

When this threshold is exceeded, it often means that the system is low on physical memory, because the operating system is reading from the hard disk to recover information from virtual memory. In this state overall system performance drops significantly, which results in poor operating system and application performance.

So make sure this rule is tuned to the server's configuration and performance capacity before you use it. Before enabling the rule, create a baseline Pages/sec performance signature for each system. Once you have a baseline, set a default threshold that works for the largest percentage of managed nodes, and then use the override feature to define thresholds for the exceptions.
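The baseline-then-override approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names, the sample values, the 95th percentile and the 20% headroom are hypothetical choices, not SCOM settings, and in practice the override itself would be applied in the SCOM console.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def thresholds_needing_override(baselines, default_threshold, pct=95, headroom=1.2):
    """Return {server: suggested threshold} for servers whose baseline
    percentile already sits above the default threshold.

    `pct` and `headroom` are assumptions chosen for this example.
    """
    overrides = {}
    for server, samples in baselines.items():
        baseline = percentile(samples, pct)
        if baseline > default_threshold:
            overrides[server] = round(baseline * headroom, 1)
    return overrides


# Hypothetical baseline Pages/sec samples per server.
baselines = {
    "sql01": [5, 8, 12, 20, 30],
    "sql02": [40, 90, 110, 150, 200],
}

print(thresholds_needing_override(baselines, default_threshold=50))
# {'sql02': 240.0}
```

Servers that stay under the default keep it, and only the exceptions get their own threshold, which keeps the number of overrides small.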

From the SQL Server perspective, this alert is a primary indicator that there could be memory pressure or a memory bottleneck. How seriously to take it depends on several factors: how frequently the same server raises the alert, how the server is configured, how much memory is allocated to SQL Server, the number of instances running on the server, and so on.

Note: A high Pages/sec value does not necessarily mean that you have a memory problem; it could be a temporary issue. But if the same server keeps generating this alert, monitor the counters below over a period of time to confirm whether you need to add more memory to reduce the number of page faults and improve performance.

Memory – Available MBytes

SQL Server: Buffer Manager – Page Life Expectancy (Higher is good)

SQL Server: Memory Manager – Memory Grants Pending (Lower is good)

SQL Server: Buffer Manager – Buffer Cache Hit Ratio (Higher is good)

SQL Server: Memory Manager – Target Server Memory

SQL Server: Memory Manager – Total Server Memory

SQL Server: SQL Statistics – Batch Requests/sec

SQL Server: SQL Statistics – SQL Compilations/sec

From a capacity planning and performance point of view, watch for an upward trend in the value of Memory\Pages/sec, and add memory when paging operations absorb more than 20-50% of your total disk I/O bandwidth.
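To put a number on "share of disk I/O bandwidth", one rough approach is to convert Pages/sec into bytes per second and divide by the disk throughput. The sketch below assumes 4 KB memory pages (typical on x86/x64 Windows) and assumes you already have a disk throughput figure in bytes per second, whether measured or rated. The function name and the sample figures are hypothetical.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KB pages, typical on x86/x64 Windows


def paging_share(pages_per_sec, disk_bytes_per_sec):
    """Fraction of disk throughput consumed by paging (0.0 if no throughput)."""
    if disk_bytes_per_sec <= 0:
        return 0.0
    return pages_per_sec * PAGE_SIZE_BYTES / disk_bytes_per_sec


# Hypothetical figures: 2500 pages/sec against 40,960,000 bytes/sec of disk I/O.
share = paging_share(2500, 40_960_000)
print(f"{share:.0%}")  # 25%

# The 20% lower bound of the 20-50% guideline above.
if share >= 0.20:
    print("Paging is taking a significant share of disk I/O")
```

Treat the result as an estimate only: it ignores I/O sizes and caching, so it is best used to follow the trend rather than as an exact measurement.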

Getting this alert from a SQL Server once in a while is OK: you can monitor for some time and resolve it if all goes fine. But if the alert is frequent, you have to investigate the counters above, because it could be a memory crunch or bottleneck. I would say the trend in these counter values is more important than their absolute value at any given time. If you notice that the average Page Life Expectancy is dropping over time, that is quite a significant indication that the server is not in good condition.
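Since the trend matters more than any single reading, a simple least-squares slope over collected samples is one way to check whether Page Life Expectancy is really falling. This is a minimal sketch; the function name and the daily averages are made up for illustration.

```python
def trend_slope(values):
    """Least-squares slope of `values` against their sample index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to compute a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical daily average Page Life Expectancy values, in seconds.
daily_ple = [5200, 4800, 4500, 3900, 3600, 3100]

slope = trend_slope(daily_ple)
print(round(slope))  # -420, i.e. PLE is falling by about 420 seconds per day

if slope < 0:
    print("Page Life Expectancy is trending down")
```

A clearly negative slope over several days is a stronger signal than one low reading, which may simply reflect a large scan or a maintenance job.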

Conclusion: Don’t ignore this alert if it comes frequently from certain servers, and during that time focus on the trend shown by the counters above. (You can ignore the alert if you know that some heavy activity runs on the server at certain times and causes it, and of course if the customer is not complaining.) If you are sure that you can’t do anything for some servers because their hardware lifecycle is over and you don’t want to spend much on them, then redefine the threshold or override the alert with the help of the SCOM team in your environment. Otherwise, if the counters above tell you it is actually a serious issue, ask the customer to add more memory to their dedicated server. Just be sure before you ask: memory is quite cheap these days, but there is still a cost involved.
