System Performance Optimization - An Overview
[ Name: 제니퍼소프트, Date: 07-05-18 23:07:12 ]

A problem can be described as the disparity between the current state and the target state. In other words, resolving a problem begins with quantifying how things are now and determining how things should be; the gap between the current state and the target state is the problem. A system performance problem can be viewed in the same way. It is derived by determining the system’s current state through system tests and monitoring, setting a performance target, and then measuring the gap between the two.
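The idea of a problem as a measurable gap can be illustrated with a minimal sketch. The metric names and numbers below are hypothetical, chosen only for illustration, and the `Metric` class and `gap` function are assumptions of this example rather than part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    current: float          # value measured through tests or monitoring
    target: float           # value the system is expected to reach
    higher_is_better: bool  # True for throughput, False for response time


def gap(metric: Metric) -> float:
    """Return how far the current state falls short of the target.

    A positive result means the target is not yet met; zero or a
    negative result means the metric already meets the target.
    """
    if metric.higher_is_better:
        return metric.target - metric.current
    return metric.current - metric.target


# Hypothetical measurements, for illustration only.
metrics = [
    Metric("throughput (requests/sec)", current=120.0, target=200.0, higher_is_better=True),
    Metric("mean response time (sec)", current=3.5, target=2.0, higher_is_better=False),
]

for m in metrics:
    print(f"{m.name}: gap = {gap(m):.1f}")
```

A gap expressed this way only covers problems that can be quantified; it says nothing about the cause.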

In a social phenomenon, solving a problem is relatively easy once the current state is quantified and the target is set appropriately. In a system, however, there are factors that cannot be resolved merely by knowing the current state and the target state. Take a sudden system crash as an example: the current state is the abnormal shutdown and the target is stable operation. Analyzing the cause of the problem in this situation is a difficult and time-consuming task. Additionally, the problem cannot be quantified with numbers alone; a downed system cannot be improved by X%. Either the system is working or it is not; it is an all-or-nothing situation. To a consultant, this type of situation can feel like gambling.

System Optimization

System performance optimization means making the system perform at its optimum state within the limited resources assigned to it. To achieve maximum performance with the given resources, every performance bottleneck other than the system resources themselves (CPU, MEM, NET, etc.) has to be eliminated. Apart from improper use of resources, the system resources are not the real issue, because a shortage can be resolved simply by adding more. A performance obstacle refers to a condition in which the service is unavailable for any reason, including system failure due to bugs. System optimization therefore generally includes both removing performance obstacles and “tuning” to maximize throughput.

Removing Performance Bottlenecks

A performance bottleneck is a condition in which a service cannot be performed, either because the service has stopped or because there is an excessive delay in its response time. Removing a performance bottleneck therefore means identifying and resolving the cause of the hindered service. Performance tests are not mandatory for removing a performance bottleneck. A system crash and an out-of-memory error are typical examples of performance bottlenecks that can occur in a system. Even though such a bottleneck can be recognized intuitively without any tests or analysis, being able to reproduce the phenomenon artificially is still important. If the phenomenon can be reproduced on demand, 70 to 80 percent of the problem can be considered solved. In particular, if the phenomenon can be reproduced on a development server, logical thinking is the only tool anyone needs to resolve the bottleneck. To reiterate, reproducing the problem is a significant step in resolving a performance bottleneck.

Performance Tuning

System performance can be expressed in terms of throughput and service response time. The effort to reduce the service response time and maximize the throughput is called “tuning”. Quantifying the current performance state and setting a target are vital in tuning; this is why system testing and tuning are closely related. Generally, performance tuning is carried out by eliminating performance bottlenecks, and occasionally by expanding H/W resources or adjusting S/W parameters.
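As a rough illustration of the two measures, the sketch below computes throughput and mean response time from a list of request timings. The `Request` record, the function names, and the sample timings are all hypothetical; a real test tool would collect these values itself.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start: float  # seconds since the start of the test
    end: float    # seconds since the start of the test


def throughput(requests: list[Request]) -> float:
    """Completed requests per second over the observed window."""
    if not requests:
        return 0.0
    window = max(r.end for r in requests) - min(r.start for r in requests)
    return len(requests) / window if window > 0 else 0.0


def mean_response_time(requests: list[Request]) -> float:
    """Average time in seconds between the start and end of a request."""
    if not requests:
        return 0.0
    return sum(r.end - r.start for r in requests) / len(requests)


# Hypothetical timings, for illustration only.
sample = [
    Request(0.0, 1.0),
    Request(0.5, 2.5),
    Request(1.0, 2.0),
    Request(2.0, 4.0),
]

print(f"throughput: {throughput(sample):.2f} requests/sec")
print(f"mean response time: {mean_response_time(sample):.2f} sec")
```

Measuring both values before and after a change is what turns a tuning effort into a comparison against a target rather than an impression.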


Difficulty in System Optimization

Improving response time, increasing throughput, and finding the root cause of a system crash are arduous tasks. Often, the difficulty in optimizing a system comes not from a lack of technical skill but from a performance problem caused by unforeseen circumstances, or from a fundamentally flawed approach to the problem.

  • Inability to Define the Problem (During Performance Analysis): This is the case when the administrator cannot even determine whether there really is a problem. Setting a system performance goal on a hunch without prior testing, or benchmarking system performance with nothing more than a simple login test, indicates that the performance problem has not been defined. Trying to resolve an undefined performance problem is like running a marathon without a finish line.
  • Finding the Root Cause is Too Difficult: This is the case when a certain phenomenon can be seen consistently during server operation. The administrator can tell that there is a problem but has no clue how to find the root cause. Typically, this type of case is due to a S/W, H/W, or package bug.
  • Follow-Up or Verification is Difficult: This is a typical scenario in a production system, where the root cause of the performance problem has been identified and a resolution applied, but it is not possible to verify whether the problem has actually been solved. Cases where an incorrect coding standard was used, or where too many programs need to be changed, fall into this category.
The following are some examples of difficulties faced during system optimization.
  • “I feel uneasy about the system. Please take a look.” => [First check whether there is a problem at all.]
  • The system suddenly went down without any log information. => [Wait indefinitely after setting up a monitoring tool.]
  • Check and correct all the relevant code until the problem is resolved. => [Too many resources, both time and manpower, are spent resolving the problem.]
  • The problem only happens during class registration. => [Must wait until the next registration period to reproduce the problem.]

Advice on Performing System Optimization

Difficulty in system optimization can be due to the administrator’s lack of technical knowledge, but often it is because the approach is improper due to a lack of experience. The following are points to keep in mind when performing system optimization.

Division of viewpoint is the foundation of problem solving. Start from a macroscopic viewpoint, then narrow the scope by eliminating each potential root cause. The goal is to narrow the problem down to where it originates, whether that is the system, the application, the DB, the network, or elsewhere. Focusing only on the displayed phenomenon and working from a narrow perspective at the outset is dangerous: it often plants incorrect assumptions in the minds of both the administrator and the customer, and when the problem solving then fails, the administrator’s reputation and credibility crumble in the customer’s eyes. When I start tuning a system, I often state that “90% of the root causes initially hypothesized are wrong”. The lesson is that the initial observation of the problem is usually not relevant, so it is important to look at the big picture first and then narrow down the problem. Having the right expertise at the right time is also very helpful, though it does not happen often.

- Use a Top-Down Approach

As mentioned earlier, do not focus too much on the error or phenomenon itself. Look at the problem from a macroscopic perspective and keep analyzing the information from the system perspective, dividing and conquering the problem one section at a time until the root cause is found.
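One way to picture the first step of a top-down analysis is to split the elapsed time of a slow request across the major sections and look at the largest share before drilling into details. The sketch below assumes such a per-section breakdown is already available from monitoring; the section names, the numbers, and the `rank_tiers` function are hypothetical.

```python
def rank_tiers(elapsed_by_tier: dict[str, float]) -> list[tuple[str, float]]:
    """Return (tier, share of total elapsed time) pairs, largest share first."""
    total = sum(elapsed_by_tier.values())
    if total <= 0:
        return []
    shares = [(tier, elapsed / total) for tier, elapsed in elapsed_by_tier.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Hypothetical breakdown of one slow request, in seconds.
elapsed = {
    "network": 0.25,
    "application": 0.75,
    "DB": 2.75,
    "system": 0.25,
}

ranking = rank_tiers(elapsed)
for tier, share in ranking:
    print(f"{tier}: {share:.0%} of elapsed time")

# The section with the largest share is the first candidate to examine;
# the same split can then be repeated inside that section.
print("examine first:", ranking[0][0])
```

The ranking only suggests where to look first. It remains a hypothesis to be confirmed or eliminated, not a diagnosis.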

- Two is better than one

Cooperation and teamwork drastically improve the chance of resolving a performance problem. Two people with different approaches working together as a team works especially well. There are cases in which experts from different companies have come together and successfully resolved difficult problems as a team. Even within the same company, getting people from different departments to collaborate on a difficult problem has resulted in success.

- A Publicized Hypothesis Protects Everyone

Certainty is rare in resolving performance problems. Resolving a system problem under a firm assumption or with a narrowly focused method is the enemy of the administrator’s credibility, and even if the specific problem is resolved, it may not have a positive effect on overall system operation and development. When a firm assurance is given as to the cause and resolution of a system problem and the underlying assumption turns out to be wrong, the administrator’s credibility deteriorates and the work done thus far becomes useless. It is important to separate the relevant information from the irrelevant during system tests and observation, analyze the data logically, hypothesize the possible root causes, and publicize those hypotheses to everyone involved so that they can work as a team.

- If the eye (Monitoring Tool) is closed, the road (Solution) cannot be seen

Even leading experts in system optimization cannot resolve a problem without relevant and adequate data. Quality logging and monitoring tools are essential for collecting and organizing that data. Demanding that a consultant resolve the issue simply by observing the phenomenon is irresponsible. Some customers expect consultants to know the root cause of an issue simply from an error message, but this is a mistaken expectation.

- Time Spent Troubleshooting Cannot Be Estimated

“Resolve the issue in 3 days” or “within the next couple of hours”: from a business perspective such goals can be set, but they are often unrealistic. In performance tuning and problem solving, you can work as fast as possible, but you cannot guarantee a result within a set time.

- Do not rely entirely on intuition

Some consultants and administrators rely too much on their intuition. Though that intuition may be the result of many years of relevant experience, relying on intuition alone can be damaging when it is wrong. Using intuition or past experience to assume the root cause and promise results to the customer can damage the consultant’s reputation and credibility.

Conclusion

System optimization can be done in many different ways. Resolving the problem expediently is important, but following a proper procedure and having countermeasures for unforeseen circumstances are equally important. Maintaining the consultant’s credibility is as critical as resolving the problem itself.

Final Thoughts
  • The resolution is provided by a “Person”
  • Making a difficult decision in a dilemma is the “Know-How”
  • A good approach is based on a mixture of experience and precedent
  • Collect relevant data through quality monitoring