Saturday, July 28, 2012

Maximus V Formula Overclocking Guide

This guide covers UEFI BIOS tweaking and overclocking tuning for the Maximus V Formula series. Most of the available UEFI options are similar to other motherboards from the ASUS Z77 family, with the exception of extra voltage controls and memory profiles – both of which provide extra overclocking margin and ease-of-use. We cover most of the available functions below, and provide a brief description of what each does and when to adjust (where applicable).
All of the overclocking-related action takes place within the Ai Tweaker menu (UEFI Advanced Mode).

Load Gamers’ OC Profile:

This will load a preconfigured overclock suitable for a 24/7 system with adequate cooling.

Load Ivy Bridge LN2 Extreme OC Profile 1 & 2:

These options are available when the onboard LN2 jumper is set to enabled. Both presets are for use with LN2 cooling only as the voltages and settings used require the processor be sub-zero cooled for both safety and stability purposes. Do NOT use these settings with air or water cooled processors.

Ai Overclock Tuner:

Options are Auto, Manual and X.M.P.
Auto: This is the default setting, and needs to be changed to Manual if you wish to change BCLK (BCLK is the base reference frequency from which processor and other system bus frequencies are derived).

X.M.P:

Extreme Memory Profile. Use this option if you have Sandy Bridge/Ivy Bridge qualified XMP memory.
X.M.P. profiles contain pre-sets for system buses and in some cases voltages. If the specified speed of the DIMMs is greater than the supported memory frequency of the platform, a platform specific X.M.P. profile option becomes mandatory because processor core and memory controller voltage requirements vary from architecture to architecture.
High-speed enthusiast memory kits manufactured before the inception of the Sandy Bridge/Ivy Bridge platforms may not contain adequate voltage offset settings for the system to be completely stable. In such instances, manual adjustment of memory controller voltage and memory timings may be necessary.
It is also wise to purchase a single memory kit rated at the density and timings you wish to run rather than combining multiple kits to make up that density. The XMP profile and memory module SPD is configured by the memory vendor for a single kit only and does not take into account timing and voltage offsets that may be required for two or more kits to operate in tandem. One of the reasons that high frequency high density kits are more expensive than their lower density counterparts (even when the operating frequency and IC used is the same) is because the binning process at higher densities is more stringent – only a few ICs make the grade. Making a wise investment here will save frustration later on.
A final note on memory purchasing: Sandy Bridge processors are binned to run DDR3-1333 speeds at stock voltages (CAS 9). Higher operating frequencies are defined as overclocked, so voltage requirements and overall stability will vary from CPU to CPU. The same rule applies to Ivy Bridge processors, although the margin has been extended to DDR3-1600 at stock voltages for these CPUs. As such, unconditional stability at higher operating frequencies cannot be guaranteed and will vary between processor samples.

BCLK Frequency:

This function becomes available if X.M.P or Ai Overclock Tuner “Manual” are selected. The base BCLK frequency is 100MHz. The CPU core frequency is derived via multiplication with the Turbo Ratio setting (final frequency is displayed at the top-left of the Ai Tweaker menu). BCLK also adjusts memory operating frequency in association with the applied memory ratio (Memory Frequency and CPU bus speed: DRAM speed ratio mode settings below).
Bear in mind that the adjustment margin for this setting is not large: most processors tolerate roughly +/- 7 MHz around the base frequency, although some processor samples can exceed this.
To make things easier, we’ve included an auto calculator which displays the target CPU and memory bus frequency in the top-left area of the Ai Tweaker menu.
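The arithmetic behind that on-screen readout can be sketched in a few lines of Python. This is an illustration only, not ASUS's implementation: the function names are made up for this example, the +/- 7 MHz window is simply the typical margin quoted above, and the memory calculation assumes the selected DDR3 speed scales linearly with BCLK from its value at 100 MHz.

```python
BASE_BCLK_MHZ = 100.0      # stock BCLK quoted in this guide
TYPICAL_MARGIN_MHZ = 7.0   # typical adjustment margin quoted in this guide


def cpu_frequency_mhz(bclk_mhz, turbo_ratio):
    """Target CPU core frequency: BCLK multiplied by the Turbo Ratio."""
    return bclk_mhz * turbo_ratio


def memory_speed(ddr3_speed_at_base, bclk_mhz):
    """Assumption: the selected DDR3 speed scales linearly with BCLK."""
    return ddr3_speed_at_base * bclk_mhz / BASE_BCLK_MHZ


def within_typical_bclk_range(bclk_mhz):
    """True if BCLK sits inside the typical +/- 7 MHz window."""
    return abs(bclk_mhz - BASE_BCLK_MHZ) <= TYPICAL_MARGIN_MHZ


if __name__ == "__main__":
    print(cpu_frequency_mhz(100.0, 45))      # 45X at stock BCLK
    print(memory_speed(1600, 103.0))         # DDR3-1600 ratio with BCLK at 103 MHz
    print(within_typical_bclk_range(108.0))  # checks a value beyond the window
```

A value outside the typical window is not necessarily unusable; it only means the processor sample would have to be one of the better ones.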

Turbo Ratio:

Options are “Auto”, “By All Cores” and “By Per Core”. A description of these settings is provided in the right-hand column of the UEFI BIOS and can be seen when the Turbo Ratio setting is selected.
By All Cores: This sets the CPU core frequency multiplier; multiplied by BCLK to give the target CPU frequency (under full load conditions if SpeedStep is active). “Auto” = stock CPU multiplier Ratio used. Manual numerical entry of the desired Turbo Ratio is accepted. *
Per Core: Allows setting the maximum Turbo multiplier of each physical processor core.*
*The available multiplier range is limited by both processor model and the ability of each CPU.

Internal PLL Overvoltage:

Increases internal phase locked loop rail voltage, allowing higher processor core frequency overclocking. A setting of Auto will enable this setting for you as you increase the CPU core multiplier over a certain threshold.
Most good processor samples will not need this setting enabled until overclocking past a core multiplier of 45X (4.5GHz CPU speed).
The stability of S3 sleep resume may be affected with this setting “Enabled”. If you find that your CPU won’t overclock past 4.5GHz without this setting Enabled, then the only choice may be to run at a lower speed if you find the system is unable to resume from S3 successfully.

CPU bus speed to DRAM speed ratio mode:

This setting is for Ivy Bridge processors only. This can be left at Auto to apply changes in accordance with the Memory Frequency setting.

Memory Frequency:

Selects the desired memory operating frequency (memory ratio). This setting is a derivative of BCLK and CPU bus speed:DRAM speed ratio mode. The target operating frequency is displayed within the drop-down list of this setting as well as the top-left corner of the Ai Tweaker menu.
Ivy Bridge CPUs have a wider range of memory ratio settings at their disposal than the previous generation Sandy Bridge processors. When used in conjunction with BCLK*, this allows more granular bus frequency control, which should help us to tune a system to its full potential.
For daily use, we recommend opting for memory kits specified at a maximum of DDR3-1600.
Higher speed memory shows minuscule performance gains in most desktop software, so it is wiser to spend money elsewhere on the system. Further, the stability of the system at higher operating frequencies cannot be guaranteed; problems often show up when resuming from sleep states and when the system is stressed by software. Hence our advice to opt for memory kits that are within processor specifications if you do not wish to spend time tuning the system for stability.
At the other end of the spectrum, benchmarking fanatics will find that 2GB PSC based kits offer the best overall performance in memory sensitive benchmarks. Target operating frequencies between DDR3-2400 and DDR3-2600 seem to be the optimal point for best scores and times in sensitive benchmarks utilizing CAS 6 or 7 timing sets, in tandem with sub-zero cooling of both the processor and memory modules. Higher speeds are possible, but only by relaxing secondary and tertiary memory timing parameters, which often costs efficiency. If PSC based kits are not available, then Samsung based kits offer a modern alternative, albeit requiring looser operating latency at equivalent frequencies.
*Within functional limits of the BCLK setting.

DRAM Timing Control:

Takes us to the DRAM Timing sub-section:
Most of these settings can safely be left at Auto unless you wish to tune the system for optimal scoring in benchmarks. The primary timings will be set in accordance with the memory module SPD at a given frequency or fall back on ASUS defaults as memory bus frequency is increased.
If you do wish to overclock memory then we suggest starting off by entering the Memory Preset subsection and selecting a memory profile that is based upon the ICs that are used on your memory modules.

DRAM CAS Latency:

Column Address Strobe, defines the time it takes for data to be ready for burst after a read command is issued. As CAS factors in almost every read transaction, it is considered to be the most important timing in relation to memory read performance. To calculate the actual time period denoted by the number of clock cycles set for CAS we can use the following formula:
tCAS in nanoseconds = (CAS * 2000) / Memory Frequency
Here Memory Frequency is the DDR3 speed rating (for example, 1600 for DDR3-1600). This same formula can be applied to all memory timings that are set in DRAM clock cycles.
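As a quick worked example, the conversion can be written as a small Python helper. The function name is invented for this sketch, and it assumes the frequency passed in is the DDR3 speed rating (1600 for DDR3-1600), which is what the factor of 2000 implies.

```python
def timing_ns(clock_cycles, ddr3_speed):
    """Convert a timing in DRAM clock cycles to nanoseconds.

    ddr3_speed is the DDR3 speed rating, e.g. 1600 for DDR3-1600.
    """
    return clock_cycles * 2000 / ddr3_speed


if __name__ == "__main__":
    # CAS 9 at DDR3-1600 and CAS 7 at DDR3-2400, printed in nanoseconds.
    print(timing_ns(9, 1600))
    print(timing_ns(7, 2400))
```

CAS 9 at DDR3-1600 works out to 11.25 ns, which is why a numerically higher CAS value at a higher memory frequency can still mean a shorter real-world delay.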

DRAM RAS TO CAS Latency:

Also known as tRCD. Defines the time it takes to complete a row access after an activate command is issued to a rank of memory. This timing is of secondary importance behind CAS as memory is divided into rows and columns (each row contains 1024 column addresses). Once a row has been accessed, multiple CAS requests can be sent to the row to read or write data. While a row is “open” it is referred to as an open page. Up to eight pages can be open at any one time on a rank (a rank is one side of a memory module) of memory.

DRAM RAS# PRE Time:

Also known as tRP. Defines the number of DRAM clock cycles it takes to Precharge a row after a page close command is issued in preparation for the next row access to the same physical bank. As multiple pages can be open on a rank before a page close command is issued the impact of tRP towards memory performance is not as prevalent as CAS or tRCD – although the impact does increase if multiple page open and close requests are sent to the same memory IC and to a lesser extent rank (there are 8 physical ICs per rank and only one page can be open per IC at a time, making up the total of 8 open pages per rank simultaneously).

DRAM RAS Active Time:

Also known as tRAS. This setting defines the number of DRAM cycles that elapse before a Precharge command can be issued. The minimum clock cycles tRAS should be set to is the sum of CAS+tRCD+tRTP.
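The minimum tRAS rule is simple enough to check with a short Python sketch. The function names and the example timing values are purely illustrative and are not recommendations for any particular kit.

```python
def minimum_tras(cas, trcd, trtp):
    """Minimum tRAS in DRAM clocks: the sum of CAS, tRCD and tRTP."""
    return cas + trcd + trtp


def tras_is_valid(tras, cas, trcd, trtp):
    """True if the chosen tRAS is at or above the minimum."""
    return tras >= minimum_tras(cas, trcd, trtp)


if __name__ == "__main__":
    # Hypothetical timing set: CAS 9, tRCD 9, tRTP 6.
    print(minimum_tras(9, 9, 6))
    print(tras_is_valid(24, 9, 9, 6))
    print(tras_is_valid(20, 9, 9, 6))
```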

DRAM Command Mode:

Also known as Command Rate. Specifies the number of DRAM clock cycles that elapse between issuing commands to the DIMMs after a chip select. The impact of Command Rate on performance can vary. For example, if most of the data requested by the CPU is in the same row, the impact of Command Rate becomes negligible. If however the banks in a rank have no open pages, and multiple banks need to be opened on that rank or across ranks, the impact of Command Rate increases.
Most DRAM module densities will operate fine with a 1N Command Rate. Memory modules containing older DRAM IC types may however need a 2N Command Rate.

Latency Boundary:

This setting contains presets for the tertiary timing section below. A higher number is less aggressive. We recommend you start with a setting of 14, run a stress test to check that the system is stable, and then decrease the value one step at a time, re-testing after each change.
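The step-down procedure can be pictured as a simple loop. The Python sketch below is only a way of reasoning about the process: the setting has to be changed by hand in UEFI, `is_stable` stands in for whatever stress test you run after each change, and the `floor` of 1 is an assumed lower bound rather than a documented limit.

```python
def tune_latency_boundary(is_stable, start=14, floor=1):
    """Step the value down from `start` until a stress test fails.

    is_stable: callable taking the candidate value and returning True
               if the stress test passed (hypothetical stand-in).
    Returns the lowest value that passed, or None if even `start` failed.
    """
    last_stable = None
    for value in range(start, floor - 1, -1):
        if not is_stable(value):
            break  # stop at the first failure and keep the previous value
        last_stable = value
    return last_stable


if __name__ == "__main__":
    # Stand-in result: pretend the system is stable at 10 and above.
    print(tune_latency_boundary(lambda value: value >= 10))
```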


Secondary Memory Timings

DRAM RAS to RAS Delay:

Also known as tRRD (activate to activate delay). Specifies the number of DRAM clock cycles between consecutive Activate (ACT) commands to different banks of memory on the same physical rank. The minimum spacing allowed at the chipset level is 4 DRAM clocks.

DRAM Ref Cycle Time:

Also known as tRFC. Specifies the number of DRAM clocks that must elapse before a command can be issued to the DIMMs after a DRAM cell refresh.

DRAM Write Recovery Time:

Defines the number of clock cycles that must elapse between a memory write operation and a Precharge command. Most DRAM configurations will operate with a setting of 9 clocks up to DDR3-2500. Change to 12~16 clocks if experiencing instability.

DRAM Read to Precharge Time:

Also known as tRTP. Specifies the spacing between the issuing of a read command and tRP (Precharge) when a read is followed by a page close request. The minimum possible spacing is limited by DDR3 burst length which is 4 DRAM clocks. Most 2GB memory modules will operate fine with a setting of 4~6 clocks up to speeds of DDR3-2000 (depending upon the number of DIMMs used in tandem). High performance 4GB DIMMs (DDR3-2000+) can handle a setting of 4 clocks provided you are running 8GB of memory in total and that the processor memory controller is capable. If running more than 8GB expect to relax tRTP as memory frequency is increased.

DRAM Four Activate Window:

Also known as tFAW. This timing specifies the number of DRAM clocks that must elapse before more than four Activate commands can be sent to the same rank. The minimum spacing is tRRD*4, and since we know that the minimum value of tRRD is 4 clocks, we know that the minimum value for tFAW at the chipset level is 16 DRAM clocks.
As the effects of tFAW spacing are only realised after four activates to the same DIMM, the overall performance impact of tFAW is not large. However, benchmarks like Super Pi 32M can benefit from setting tFAW to the minimum possible value. As with tRRD, setting tFAW below its lowest possible value will result in the memory controller reverting to the lowest possible value (16 DRAM clocks or tRRD * 4).
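The relationship between tRRD and tFAW, including the fallback to the lowest possible value, can be sketched in Python as follows. The function names are invented for this example, and the clamping is a model of the behaviour described above rather than the memory controller's actual logic.

```python
MIN_TRRD = 4  # minimum tRRD at the chipset level, in DRAM clocks


def effective_trrd(requested_trrd):
    """tRRD as applied: never lower than the chipset minimum."""
    return max(requested_trrd, MIN_TRRD)


def effective_tfaw(requested_tfaw, requested_trrd):
    """tFAW as applied: never lower than four times the applied tRRD."""
    return max(requested_tfaw, effective_trrd(requested_trrd) * 4)


if __name__ == "__main__":
    print(effective_tfaw(10, 4))  # request below the minimum
    print(effective_tfaw(24, 5))  # request above the minimum
```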

DRAM Write to Read Delay:

Also known as tWTR. Sets the number of DRAM clocks to wait before issuing a read command after a write command. The minimum spacing is 4 clocks.
As with tRTP this value may need to be increased according to memory density and memory frequency.

DRAM CKE Minimum Pulse width:

This setting can be left on Auto for all overclocking.
CKE defines the minimum number of clocks that must elapse before the system can transition from normal operation to a low-power state and vice versa.

DRAM RTL & IOL:

Unlike other timings, DRAM RTL and IOL are measured in memory controller clock cycles rather than DRAM bus cycles. These settings can safely be left on Auto for all normal use. The RTL and IOL define the number of memory controller cycles that elapse before data is returned to the memory controller after a read CAS command is issued. The IOL setting works in conjunction with RTL to fine tune DRAM buffer output latency. Both settings are auto-sensed by the memory controller during the POST process (memory training). Manual adjustment should not be necessary unless the system is being used in order to obtain maximum DRAM frequency screenshots (limited stability).

Tertiary Memory Timings


Most of these timings can be left on Auto unless tweaking for Super Pi 32M. The best way to tune these settings if benchmarking is to set them to their maximum value and then decrease one step at a time while monitoring stability at every change. We have color-coded text within this section to highlight more important timings over lesser ones.
Red = more important
Black = less important
On some settings, Intel has already enforced a 2-clock preset to which the UEFI value is added, and on others the memory controller calculates a minimum delay to which the UEFI value is added.

tRWDR (DD):

Sets the delay period between a read command that is followed by a write command, where the write command requires the access of data from a different rank or DIMM. A setting of 1 clock works with some high performance DIMM configurations (dependent upon CAS). Relax to 2~7 clocks only if you are experiencing stability issues when running in excess of 4GB of memory over DDR3-2300.

tRWSR:

Sets the delay between a read command followed by a write command to the same rank. A setting of 2 is possible with high performance 2GB DIMMs. If experiencing instability or non-POST with CAS 8 or 9 then try a setting of 4+. To use a setting of 3 with CAS 9, set Stretch_ODT to 8 clocks using MemTweakit and monitor for any change in performance.

tRR (DD):

Sets the read to read delay where the subsequent read requires the access of a different DIMM. For high performance DIMMs start with a setting of 2 and increase to 3+ if you experience no POST.

tRR (DR):

Sets the delay between read commands when the subsequent read requires the access of a different rank on the same DIMM. This setting is an additive to an internally calculated value.

tRRSR:

Sets the delay between read commands to the same rank. From a performance perspective a setting of 4 clocks is optimal.

tWW(DD):

Sets the write to write delay where the subsequent write requires the access of a different DIMM. 4 clocks will work with most configurations; increase if using 4GB or 8GB DIMMs with all slots populated.

tWW(DR):

Sets the write to write delay where the subsequent write command requires the access of a different rank on the same DIMM; increase if using 4GB or 8GB DIMMs with all slots populated.

tWWSR:

Sets the delay between write commands to the same rank. From a performance perspective a setting of 4 clocks is optimal.

Misc Memory Settings


MRC Fast Boot:

Bypasses longer memory training routines during system reboot, which can help speed up boot times. If using higher memory frequency divider ratios (DDR3-2133 and over), then disabling this setting while trying to achieve stability can be beneficial. Once the desired system stability has been obtained, enable this setting to prevent the auto-sensed parameters from drifting on subsequent reboots.

DRAM CLK Period:

Defines memory controller latencies in conjunction with the applied memory frequency. A setting of 5 gives best overall performance though may impact stability.

Transmitter Slew & Receiver Slew:

A setting of around ‘3’ on Transmitter Slew is a good starting point with most DIMMs and may yield the best results. Tweaking these settings will require some time, but can extend overclocking headroom for DRAM frequency. It’s best to adjust one step at a time and then run a memory intensive benchmark or stress test to monitor for changes in failure rates to find the optimal settings.
After changing Transmitter slew, one should go through the same steps tuning Receiver Slew. Both settings should be tuned before relying on an increase of voltage.

MCH Duty Sense CHA & CHB:

These settings can be left on Auto most of the time. If experimenting, start at the middle value of 15, check for impact on stability, then move up by 2 and re-check. Tuning will be system and DIMM specific and depend on operating frequency.

CHA & CHB DIMM Control:

Allows a user to disable a channel without physically removing the DIMM. Leave on Auto unless experimenting or testing individual channels for stability.

DRAM Read and Write Additional Swizzle:

Leave these settings on Auto unless experiencing instability at high DRAM frequency. Toggling from enabled to disabled or vice-versa may help pass a benchmark where the DIMMs were otherwise unstable.

GPU.DIMM Post:

This takes us to a sub-menu where we can check that DIMMs and GPUs have been detected at POST.

CPU Power Management:

Takes us to a sub-menu that allows configuration of non-Turbo ratio CPU multipliers as well as set power thresholds for Turbo multipliers. Information is provided within UEFI with regards to the usage of each option.

DIGI+ Power Control:


Each of the settings within the DIGI+ VRM section has an explanation listed in the right hand column of UEFI. All settings have been configured to scale on Auto in accordance with overclocking. We recommend you leave the thermal control parameters as is for all operating conditions. We’ll highlight some of the other settings below for clarification purposes.

Load-Line Calibration:

Also known as LLC. Sets the margin between applied and load voltage. For 24/7 use a setting of 50% is considered optimal, providing the best balance between set and load voltage in a manner that complements the VRM for all loading conditions. Some users prefer using higher values, although this will impact overshoot to a small degree.

VRM Spread Spectrum:

Assigns enhanced modulation of the VRM output in order to reduce the peak magnitude of noise radiated into nearby circuitry. This setting should only be used at stock operating frequency, as the modulation routines can impact transient response.

All “Current Capability” settings:


A setting of 100% on all of these settings should be ample to overclock processors using conventional cooling methods. If pushing processors using LN2 or other sub-zero forms of cooling then increase the current threshold to each voltage rail respectively. A setting of 140% should ensure OCP does not trip during benchmarks.

CPU Voltage:

There are two ways to control CPU core voltage: Offset Mode and Manual Mode. Manual Mode assigns a static level of voltage for the processor. Offset Mode allows the processor to request voltage according to loading conditions and operating frequency. Offset Mode is preferred for 24/7 systems as it allows the processor to lower its voltage during idle conditions, thus saving a small amount of power and reducing unnecessary heat.
The caveat of Offset Mode is that the voltage the processor will request under full load is impossible to predict without loading the processor fully. The base level of voltage used will increase in accordance with the CPU multiplier ratio. It is therefore best to start with a low multiplier ratio and work upwards in 1X steps while checking for stability at each increase. Enter the OS, load the CPU and use CPU-Z to check the voltage the CPU requests from the buck controller. If the level of voltage requested is very high, then you can reduce the full load voltage by applying a negative offset in UEFI. For example, if our full load voltage at a 45X CPU multiplier ratio happened to be 1.40V, we could reduce that to 1.35V by applying a 0.05V negative offset in UEFI.
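The offset arithmetic from that example can be written out as a small Python sketch. The function names are hypothetical, and the sketch assumes the offset shifts the full load voltage one-for-one, which is what the example implies; always confirm the real load voltage in the OS after applying a change.

```python
def offset_for_target(observed_load_v, target_load_v):
    """Offset to enter in UEFI; a negative result means a negative offset."""
    return round(target_load_v - observed_load_v, 3)


def load_voltage_with_offset(observed_load_v, offset_v):
    """Expected full load voltage after applying the offset (assumed 1:1)."""
    return round(observed_load_v + offset_v, 3)


if __name__ == "__main__":
    # The guide's example: 1.40V observed at 45X, 1.35V wanted.
    offset = offset_for_target(1.40, 1.35)
    print(offset)
    print(load_voltage_with_offset(1.40, offset))
```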
Most of the information pertaining to overclocking Sandy Bridge CPUs has already been well documented on the internet. For those of you purchasing retail Ivy Bridge CPUs, we expect most samples to achieve 4.3-4.5GHz with air and water cooling. Higher overclocks are possible although full-loading of the CPUs will result in very high temperatures even though the current consumed by these processors is not excessive. We suspect this is a facet of the 22nm process.

iGPU Voltage:

Sets the rail voltage of the integrated GPU. Same function as CPU Vcore with regards to Manual and Offset Mode. Should this option become available when you are not using the iGPU, then you may force the iGPU to disabled via System agent config > graphics config/ option > internal graphics. Doing so ensures better memory stability and overclocking headroom.

DRAM Voltage:

Sets voltage for the memory modules. 1.50V DIMMs qualified on Sandy Bridge and Ivy Bridge processors are recommended for use on this platform.

IMC-DRAM Offset Sign:

Selects whether the value set in the IMC-DRAM Offset function below is added to or subtracted from the base voltage.

IMC-DRAM Offset:

This setting offsets the DRAM voltage seen by a portion of the memory controller. The base voltage at Auto is DRAM voltage. By using the positive or negative setting above, we can offset processor side DRAM voltage in 0.00661V steps. The reason this setting has been added is we found that some DIMMs exhibit more stability when a specific processor side DRAM pin has its voltage set above or below the voltage supplied to the memory modules. Usually a setting 5 steps above or below DRAM voltage (circa 0.03V at 0.00661V per step) is sufficient to help in memory intensive benchmarks. In my testing to date the memory ICs that seem to respond best to an offset are Elpida BBSE based modules. Don’t stray too far from Auto.
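To see what a given number of steps means in volts, the calculation can be sketched in Python. The function name and the "+"/"-" representation of the Offset Sign are assumptions made for this illustration; only the 0.00661V step size comes from the description above.

```python
IMC_DRAM_STEP_V = 0.00661  # step size quoted in this guide


def imc_side_dram_voltage(dram_voltage, steps, sign="+"):
    """Processor side DRAM voltage after applying the IMC-DRAM offset."""
    if sign not in ("+", "-"):
        raise ValueError("sign must be '+' or '-'")
    delta = steps * IMC_DRAM_STEP_V
    return dram_voltage + delta if sign == "+" else dram_voltage - delta


if __name__ == "__main__":
    # 1.50V DIMMs with three and five steps in each direction.
    for steps in (3, 5):
        print(steps,
              round(imc_side_dram_voltage(1.50, steps, "+"), 5),
              round(imc_side_dram_voltage(1.50, steps, "-"), 5))
```

By this arithmetic, three steps is a shade under 0.020V and five steps is about 0.033V either side of DRAM voltage.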

VCCSA Voltage:

Sets the voltage for the System Agent. It can be left on Auto for most overclocking.

VCCIO:

May need adjustment on Sandy Bridge processors if using 16GB of memory or memory modules that contain ICs that represent a tough load to the memory controller. 1.05V is the base value; if adjusting, increase in 0.025V steps and check stability at each increment. Maintaining a DC delta between this setting and DRAM voltage may be beneficial if using very high DRAM voltages (on Ivy Bridge, too).

CPU PLL Voltage:

For most overclocking, the minimum voltage requirements will be centered around 1.80V. If using higher processor multiplier ratios or DRAM frequencies over DDR3-2200, then a small over-voltage here can aid stability. Do note that the processor will become increasingly sensitive to PLL voltage changes at sub-zero temperatures and when nearing the maximum frequency the CPU is capable of.

Skew Driving Voltage:

Base is 1.05V. Adjustment is only required when running sub-zero processor temperatures or very high BCLKs. We have taken the time to enter offsets for you to work with within the LN2 profiles.

2nd VCCIO Voltage:

This rail is split from the VCCIO power rail to allow you to adjust both separately. As a starting point keep this close to VCCIO and then try setting this at a different value if chasing maximum processor overclocks (benchmarking use).

PCH Voltage:

It can be left at default values for all overclocking. We have not observed any relationship between this voltage rail and any other in our testing to date.

VTTDDR:

Supplies power to the VTT input pin on DRAM memory modules. In most cases this setting can be left on Auto. At high DRAM clocks (in excess of DDR3-2400) increasing this voltage may help improve stability. Start with 0.85V and work up. Traditionally this setting should be at 50% of VDIMM, however in our testing we have found 0.85V a good starting point for improving stability in Super Pi 32M. A one or two step change above or below that can help 32M pass where it would otherwise fail.
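For reference, the traditional 50% rule is easy to compute and compare against the 0.85V starting point mentioned above. The Python helper below is a minimal sketch with an invented function name; the DIMM voltages in the example are arbitrary inputs, not recommendations.

```python
def nominal_vttddr(vdimm):
    """Traditional VTTDDR target: 50% of the DIMM voltage."""
    return vdimm * 0.5


if __name__ == "__main__":
    # Print the 50% figure for a couple of example DIMM voltages.
    for vdimm in (1.50, 1.65):
        print(vdimm, round(nominal_vttddr(vdimm), 3))
```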

DRAM DATA and CTRL References for all channels:

Allow adjustment of the DRAM read/write reference voltages for the DATA and CTRL signal lines. A setting of Auto defaults to 50% of VDIMM which should be adequate for almost all overclocking. Adjustment can sometimes be required when benchmarking memory at very high operating frequencies. In such instances a small reduction or increase (one step) above or below 50% can help aid stability in memory intensive benchmarks. Also if processors are sub-zero cooled, there may come a point where the memory controller becomes unstable regardless of operating frequency. This is where fiddling with these voltages can sometimes help pass benchmarks that would be otherwise unstable.

CPU Spread Spectrum:

Modulates processor core frequency in order to reduce the peak magnitude of radiated noise emissions. We recommend setting this to Disabled if overclocking the system, as the modulation can interfere with system stability.

BCLK Recovery:

When enabled, this setting will return BCLK to a setting of 100 MHz (default) if the system fails to POST. Disabling it will NOT return BCLK to 100MHz when OC Failure is detected.



