Q&A from Confio’s Exadata Deployment Case Study Webcast with Cardinal Health

post date Tweet This
post comments No Comments

On January 26th, 2012, Confio presented a live webcast with Bronwyn Altizer, Lead Oracle DBA at Cardinal Health showcasing the recent Exadata deployment at Cardinal Health.  Watch recorded webcast. At the end of the presentation there were a number of questions from our audience. Here are the responses.

Q. Is Oracle doing Exadata patching or your team?
A. Yes, Oracle is currently handing everything related to Exadata at Cardinal Health.

Q. What is cost saving after the Exadata migration?
A. The exact cost savings was not specified, but Cardinal has realized cost savings and other benefits by:
- eliminating 20+ Unix servers and the floor space/power/management related to them
- also, Cardinal.com can do many more product searches now which has led to more revenue from the site

Q. Oracle recommends hiding indexes rather then dropping them when doing performance tuning in Exadata. Did Cardinal Health hide indexes or drop them?
A. Bronwyn talked about removing some indexes to help performance once they move to Exadata. The first step was to hide these indexes while performance testing was done. Once performance testing completed the indexes were removed eventually for space savings.

Q. If a query accesses data via an index, would it use a Smart Scan?
A. Smart scans only work for specific operations like table and index scans and here are the three basic requirements:
- there must be a full scan of an object (TABLE ACCESS FULL or INDEX FAST FULL SCAN, etc in plan)
- the scan must use direct path reads
- the object must be stored in the Storage Servers

Q. What version of Oracle is Cardinal on?
A. 11.2.0.2

Q. How is DBRM working in Exadata? We had issues with DBRM as we were running into resmgr:cpu quantum waits
A. After some trial and error DBRM is not working well at Cardinal. If you are seeing “resmgr: cpu quantum waits” this means that DBRM is doing what you told it to do and is throttling your CPU resources. Increase the amount of CPU available to the user experiencing the issue.

Q. Do you spend a lot of time chasing down bugs via Service Requests?
A. Cardinal has not had any more issues with bugs in Exadata than other Oracle platforms.

Q. Is there any feature that can be utilized to speed up the RMAN backup ?
A. Exadata will inherently offload RMAN backup processing to the storage servers rather than the database server in a non-Exadata environment. Based on several Oracle Exdata whitepapers, full image copies achieved over 7TB / hour and 10-48 TB for incremental backups.

Q. What is the primary work load, data warehousing/reporting or OLTP?
A. A little of everything. The primary performance boost was achieved on Cardinal.com which is an OLTP order entry system. However, reporting and data warehousing applications have also benefited greatly.

One Reason Why IgniteVM is better than vSphere

post date Tweet This
post comments No Comments

I hate wasting time. And if you are like me, you hate wasting time as well. And I *know* your end-users hate watching you waste time when trying to find the root cause for performance problems.

As an example let’s look at one possible resource bottleneck: CPU. When you have a database server that is reporting a high amount of CPU usage one of the first questions you will want to ask is this: is the pressure something internal to the database engine, or external?

If your database instance is running on a guest inside of VMWare then you have little chance of discovering if the pressure is internal or external. And to make matters worse, the native tools available inside of vSphere require you to spend a lot of time bouncing from one screen to another in order to have any chance at discovering the root cause. You don’t get any threshold information so you have no idea if the counters are good or bad. And the icing on the cake is the fact that vSphere does not show you any information down to the SQL statement level.

Enter IgniteVM. With IgniteVM you get to quickly see each layer (host, guest, database instance) along with thresholds. You can drill into specific SQL statements. You can spend far less time coming up with a list of actions to take in order to alleviate the performance issue.

But don’t just take a few words on a blog as proof, check out the video I put together to help show you what I am trying to say.

Get the Flash Player to see the wordTube Media Player.

If you haven’t yet had the chance to download a trial of Ignite8 or IgniteVM yet let me make it easy for you.

Do You Make These 5 Database Design Mistakes?

post date Tweet This
post comments No Comments

Look, everyone makes mistakes. It’s true. But not all of us have the chance to make mistakes that end up costing millions of dollars in hardware and production support costs.

Any one of the following five mistakes listed below will add additional costs to your company. It’s guaranteed. The costs could be hardware related (extra disk space, network bandwidth), which tend to add up quickly. The costs are also support related (bad performance, database re-design, report creation, etc.), and they add up very quickly.

Want to save your company some money, some headaches for your end-users, and ensure data quality all at the same time? Then avoid these five database design mistakes.

1. Going BIG, just in case

I often see data types being chosen as if they don’t matter. But the truth is they do matter. If you know that the only possible values for a certain column are between 0 and 100,000 then you don’t need to slap a BIGINT data type for that column when a INT would do just fine. Why does this matter? The BIGINT data type requires 8 bytes of storage, and the INT requires only 4 bytes of storage. That means for each row of data you could be wasting 4 bytes. Doesn’t sound like much, right?

OK then, let’s consider that your table has two million rows. Multiply those rows by 4 bytes and you have 8 million bytes, or roughly 7.8MB of wasted space. I know it doesn’t sound like a lot, does it? Well, it adds up, and quickly. I’ve only shown you an example for just one column, but what about those NCHAR(2000) columns that are storing mostly first and last names? How much extra overhead for those? How about your date columns? Do you have a need for calendar dates prior to the year 1900, or after the year  2079? Then SMALLDATETIME is likely to work just fine for you.Oh, and let’s not forget that these columns can be indexed, and those indexes will also be unnecessarily wider as well.

Choosing the right data type matters, for all sorts of reasons. Take the time and make an effort to get it right at the start.

2. Expecting a DBMS feature to mean the same across all systems

Bigger companies tend to have a variety of systems, all of which need to exchange data. It is quite common to see familiar terms in use by those systems. For example, you are likely to see the use of the term TIMESTAMP. You simply cannot assume that the format of TIMESTAMP will be the same between systems. If you do you are likely to have issues at some point. If you are lucky the issues will manifest themselves early. If you are unlucky the issue will lay dormant until the worst possible time for your system to come to a grinding halt.

What's the worst that can happen?

I chose TIMESTAMP as my example here because it is one of my favorite misnomers inside of SQL Server. Why? Because TIMESTAMP has nothing to do with date or time. And I love this hidden gem inside the BOL:

The Transact-SQL timestamp data type is different from the timestamp data type defined in the ISO standard.

So that means other systems that do adhere to the ISO standard of what it means to be a TIMESTAMP are going to return something vastly different. Take Oracle, for example, and how they define a TIMESTAMP value. Even DB2 has some subtle differences as well. The lesson here is that when you are working with disparate systems, do not assume that familiar terminologies will work in exactly the same way.

3. Expecting database-generated identifiers to be sequential

Being able to insert data into a database in sequential order has many benefits, and most database designers will look to implement some type of mechanism to ensure that such sequences are used. However, many of those mechanisms will leave “gaps” from time to time, and these gaps lead to confusion as people except to see {1, 2, 3, 4, 5…} and not something like {1, 2, 4, 5, 6…}. As soon as a gap is found people send out a search party, looking for the missing data. I know I’ve lost hours trying to track down “bugs” in systems where people thought rows of data were missing, only to realize that the integrity of the data was fine but the method for generating the sequence was not.

Now, for reasons I can’t explain, people just love GUIDs in database designs. One of the (many) issues with GUIDs is that they are not sequential by default. This will lead to decreased performance, often in a very short amount of time. For a great explanation of this in detail, check out this post by Kimberly Tripp at SQL Skills. The part you want to pay attention to is here:

“But, a GUID that is not sequential…can be a horribly bad choice - primarily because of the fragmentation that it creates in the base table but also because of its size. It’s unnecessarily wide…”

Yeah, that *is* what she said.

4. Not examining your foreign keys (FKs) as part of your indexing strategy

I am assuming, of course, that you even have foreign keys defined. I’ve seen *many* databases that have little to no primary keys, foreign keys, or even any indexes defined. No, I don’t know who would do such a thing either. But they’re out there, and sooner or later you will find them, too.

Assuming you have FKs defined then you should be evaluating to see if it would make sense to add indexes to match those FK definitions. In some case, it will. In other cases, it won’t. But you should make certain that this type of review is part of your overall design process.

5. Applying surrogate key without having the original business key as an alternate

Back to those folks who love the GUIDs for a moment. If you are using a surrogate key (and, ideally, one that is sequential) in your database design you should not forget that the data in your table should also have an alternate key based upon the business logic that was used to necessitate the creation of the table in the first place.

Many database designers tend to focus on the surrogate key only, and forget all about the other columns. And this is bad for your data quality, as it would allow for duplicate values to creep into your business layer if you do not put a unique constraint/index on that alternate key. Nobody wants that, trust me.

And there you have it. Avoid these five database design mistakes and save your company money, headaches, and ensure data quality all at the same time.

Ignite: What To Do When OEM Breaks

post date Tweet This
post comments No Comments

There are some days when a stroll down memory lane is most welcoming, and today is one of those days as I read this blog post written by Kyle Hailey: http://dboptimizer.com/2011/12/01/confio-ignite/

When I read that post it reminds me of myself about two years ago. I was a production DBA and used a myriad of tools for performance monitoring and troubleshooting. I had an in-house system that had been built over the years as well. And with everything we had in place, it always seemed to fall short of what was needed.

In time we narrowed down the key things we needed in a solution. They were:

  • It had to monitor more than one database platform (In my case, MSSQL and Sybase), and it had to do so seamlessly
  • It had to operate without the need of installing binaries on the monitored server, to reduce administrative overhead
  • It had to help me find the root cause of issues in the shortest time possible

When I found Ignite8 from Confio it was like having my eyes opened for the first time. I loved it so much that in the span of three months I went from potential customer to employee.

I like how Kyle mentions “what if OEM breaks?” It reminds me of how we felt whenever one of our tools would fail to collect the data needed, or if the data was there but it took hours to get the results you needed to make an informed decision on any action items.

Read the post from Kyle to find out more reasons why Ignite is, in a word, amazing.

Ignite or OEM? The DBA says the answer is both!

post date Tweet This
post comments No Comments

Did you know 90% of Confio’s Oracle customers use both Oracle Enterprise Manager (OEM) and Ignite to solve performance issues?  Based on my personal experience as a DBA in a large financial services transaction environment, this is not surprising.  The combination of the OEM and Confio Ignite 8 provides the kind of help I needed to better understand database performance issues and save time, money, and frustration.

Here are the three key reasons why you should considering using Ignite alongside OEM, and why I think it is necessary if you are serious about improving your performance as a DBA:

  • Quickly gets you directly to the source of the problem
  • Simple enough for everyone to use, without impacting my production instances
  • Provides charts and graphs that managers and developers understand, so I can make my case about the source of the problem

There are some very distinct differences between OEM and Ignite. OEM (sometimes referred to as Grid Control) was created to help manage the Oracle database environment.  Citing Oracle documentation, “Enterprise manager is a system management software that delivers centralized monitoring, administration, and life cycle management functionality for the complete Oracle IT infrastructure, including systems running Oracle and non-Oracle technologies.”

In contrast, Ignite 8 is a tool that is focused on database performance monitoring using wait time analysis. “Confio’s Ignite tool is a comprehensive database performance analysis and monitoring solution for DBAs, IT managers, and application developers.  Ignite 8 identifies performance bottlenecks, improves application service, and reduces overall cost of Oracle database operations.”

Because Ignite is dedicated to solving performance issues, Ignite fills a niche that OEM does not.  If a DBA is looking for the root cause of a performance issue, they can use Ignite to identify the root cause in 3 to 4 clicks.  Seriously. If you don’t believe me, install, and see for yourself.

Ignite is a great collaboration tool.  Ignite is completely agentless and puts a negligible load on the databases being monitored and their hosts.  Since there is a very light load put on the monitored server from Ignite, there is no worry about granting read only access (or more where appropriate) to developers and even management. This enables DBAs and developers to fight problems, not each other.  The tool is easy to read and easy to use, so everyone can see the same the same information from the database thus helping to break down those not-so-invisible barriers — otherwise known as  communication “silos” — between departments, management, and worker bees.  This is especially useful when I needed to articulate technical issues to management.  With Ignite’s graphing capabilities, even the less technical people can better understand the impact of performance related issues.

Another place where Ignite fills a void is when a company has a combination of Oracle Enterprise Edition (EE) and Standard Edition (SE).  You get OEM Grid with both versions; however the diagnostic and performance tuning packs are not available for the SE version.  Using Ignite with SE can be a lifesaver because Ignite doesn’t need the tuning pack tables so the monitoring will be the same for EE as it is for SE.  Ignite allows you to follow alarms to see problem queries, server resources, trends and sessions.

Below are three scenarios that help illustrate why Ignite compliments OEM and vice versa.

Scenario 1:  User receives an alert from OEM.  The alert stats that the tablespace is 97% full for a particular database from OEM Grid.  You log into OEM and add a datafile to the tablespace to get the percent full to go under 75%.

Scenario 2 :    DBA receives an alert from OEM.  The alert states that the I/O is running very high.  The DBA logs into OEM and confirms the I/O is high, the performance tab shows there are several SQL statements running but you can’t pin point exactly where the problem is.  The DBA opens Ignite and sees there is one query that is causing 90% of all the wait.  After drilling into the SQL statement, it is found that this is new code that was put in the previous night, and it is returning 10 times the amount of data due to a mistake in the code migration.  You notify the developer and the code is reverted, fixing the problem.

 Scenerio 3:    Every Monday morning, a user has a report sent from Ignite that lists the top 15 SQL statements across their databases.  This morning there is a spike on the report for a particular database from Saturday evening.  The DBA goes into Ignite to find out more about what caused the spike and drills into the Saturday incident and sees that, because it’s of end of month, the Accounting department was working over the weekend and ran a customized report that caused the spike.  You can then email the graph that points this spike out with the SQL associated with it to management for your daily OPS meeting to explain what caused the performance incident.  Since Ignite is easy to read, it is easy to share this information among all the groups.

In conclusion, some Oracle OEM users might assume they would use either tool but not both.  I don’t think this is correct.  The vast majority of Oracle DBAs using Ignite use OEM for database management (resource usage monitoring, user management, storage management, etc.) and Ignite for end-user performance analysis, monitoring, and problem resolution.  This is reduces time consuming tasks a DBA has to perform in their day to day job making them faster and more efficient; saving money for the company.  All DBAs are being asked to take on more tasks and management of more databases in their organization.   Ignite is a critical tool in a DBA’s toolbox.