五种提高SQL性能的方法(一)

时间:2024-07-02 02:15:06 计算机毕业论文 我要投稿
  • 相关推荐

五种提高SQL性能的方法(一)

译文
Five Ways to Rev up Your SQL Performance
 Sometimes all it takes is a little tweak here or there to make your application run much faster. Ah, but the key is figuring out how to tweak it! Sooner or later you'll face a situation where a SQL query in your application isn't responding the way you intended. Either it doesn't return the data you want or it takes entirely too long to be reasonable. If it slows down a report or your enterprise application, users won't be pleased if they have to wait inordinate amounts of time. And just like your parents didn't want to hear why you were coming in past curfew, users don't want to hear why your query is taking so long. ("Sorry, Mom, I used too many LEFT JOINs.") Users want applications to respond quickly and their reports to return analytical data in a flash. I myself get impatient when I surf the Web and a page takes more than ten seconds to load (OK, more like five seconds).
 To resolve these issues, it is important to get to the root of the problem. So where do you start? The root cause is usually in the database design and the queries that access it. In this month's column I'll demonstrate four techniques that can be used to either improve your SQL Server™-based application's performance or improve its scalability. I'll examine the use of LEFT JOINs, CROSS JOINs, and retrieving an IDENTITY value. Keep in mind that there is no magic solution. Tuning your database and its queries takes time, analysis, and a lot of testing. While the techniques here are proven, some may work better than others in your application.
 Returning an IDENTITY From an INSERT
 I figured I would start with something I get a lot of questions about: how to retrieve an IDENTITY value after performing a SQL INSERT. Often, the problem is not how to write the query to retrieve the value, but rather where and when to do it. In SQL Server, the statement to retrieve the IDENTITY value created by the most recent SQL statement run on the active database connection is as follows:
 SELECT @@IDENTITY
 While this SQL is far from daunting, it is important to keep in mind that if the most recent SQL statement was not an INSERT or you run this SQL against a different connection than the INSERT SQL, you will not get back the value you expect. You must run this code to retrieve the IDENTITY immediately following the INSERT SQL and on the same connection, like this:
 INSERT INTO Products (ProductName) VALUES ('Chalk')
 SELECT @@IDENTITY
 Running these queries on a single connection against the Northwind database will return to you the IDENTITY value for the new product called Chalk. So in your Visual Basic® application using ADO, you could run the following statement:
 Set oRs = oCn.Execute("SET NOCOUNT ON;INSERT INTO Products _
 (ProductName) VALUES ('Chalk');SELECT @@IDENTITY")
 lProductID = oRs(0)
 This code tells SQL Server not to return a row count for the query, then executes the INSERT statement and returns the IDENTITY value just created for the new row. The SET NOCOUNT ON statement means the Recordset that is returned has one row and one column that contains the new IDENTITY value. Without this statement, an empty Recordset is returned (because the INSERT statement returns no data) and then a second Recordset is returned, which contains the IDENTITY value. This can be confusing, especially since you never intended the INSERT to return a Recordset. This situation occurs because SQL Server sees the row count (that is, one row affected) and interprets it as representing a Recordset. So the true data is pushed back into a second Recordset. While you can get to this second Recordset using the NextRecordset method in ADO, it is much easier (and more efficient) if you can always count on the Recordset being the first and only one returned.
 While this technique gets the job done, it does require extra code in the SQL statement. Another way of getting the same result is to use the SET NOCOUNT ON statement preceding the INSERT and to put the SELECT @@IDENTITY statement in a FOR INSERT trigger on the table, as shown in the following code snippet. This way, any INSERT statement into that table will automatically return the IDENTITY value.
 CREATE TRIGGER trProducts_Insert ON Products FOR INSERT AS
     SELECT @@IDENTITY
 GO
 The trigger only fires when an INSERT occurs on the Products table, so it always will return an IDENTITY after a successful INSERT. Using this technique, you can consistently retrieve IDENTITY values in the same manner across your application.
 Inline Views Versus Temp Tables
 Queries sometimes need to join data to other data that may only be gathered by performing a GROUP BY and then a standard query. For example, if you want to return the information about the five most recently placed orders, you would first need to know which orders they are. This can be retrieved by using a SQL query that returns the orders' IDs. This data could be stored in a temporary table, a common technique, and then joined to the Product table to return the quantity of products sold on those orders:
 CREATE TABLE #Temp1 (OrderID INT NOT NULL, _
                      OrderDate DATETIME NOT NULL)
 INSERT INTO #Temp1 (OrderID, OrderDate)
 SELECT     TOP 5 o.OrderID, o.OrderDate
 FROM Orders o ORDER BY o.OrderDate DESC
 SELECT     p.ProductName, SUM(od.Quantity) AS ProductQuantity
 FROM     #Temp1 t
     INNER JOIN [Order Details] od ON t.OrderID = od.OrderID
     INNER JOIN Products p ON od.ProductID = p.ProductID
 GROUP BY p.ProductName
 ORDER BY p.ProductName
 DROP TABLE #Temp1
 This batch of SQL creates a temporary table, inserts the data into it, joins other data to it, and drops the temporary table. This is a lot of I/O for this query, which could be rewritten to use an inline view instead of a temporary table. An inline view is simply a query that can be joined to in the FROM clause. So instead of spending a lot of I/O and disk access in tempdb on a temporary table, you could instead use an inline view to get the same result:
 SELECT p.ProductName,
     SUM(od.Quantity) AS ProductQuantity
 FROM     (
     SELECT TOP 5 o.OrderID, o.OrderDate
     FROM     Orders o
     ORDER BY o.OrderDate DESC
     ) t
     INNER JOIN [Order Details] od ON t.OrderID = od.OrderID
     INNER JOIN Products p ON od.ProductID = p.ProductID
 GROUP BY
     p.ProductName
 ORDER BY
     p.ProductName
 This query is not only more efficient than the previous one, it's shorter. Temporary tables consume a lot of resources. If you only need the data to join to other queries, you might want to try using an inline view to conserve resources.
 Avoid LEFT JOINs and NULLs
 There are, of course, times when you need to perform a LEFT JOIN and use NULL values. But they are not a solution for all occasions. Changing the way you structure your SQL queries can mean the difference between a report that takes minutes to run and one that takes only seconds. Sometimes you have to morph the data in a query to look the way your application wants it to look. While the TABLE datatype reduces resource gluttony, there are still plenty of areas in a query that can be optimized. One valuable, commonly used feature of SQL is the LEFT JOIN. It can be used to retrieve all of the rows from a first table and all matching rows from a second table, plus all rows from the second table that do not match the first one. For example, if you wanted to return every Customer and their orders, a LEFT JOIN would show the Customers who did and did not have orders.
 This tool can be overused. LEFT JOINs are costly since they involve matching data against NULL (nonexistent) data. In some cases this is unavoidable, but the cost can be high. A LEFT JOIN is more costly than an INNER JOIN, so if you could rewrite a query so it doesn't use a LEFT JOIN, it could pay huge dividends (see the diagram in Figure 1).
 
 Figure 1 Query
 One technique to speed up a query that uses a LEFT JOIN involves creating a TABLE datatype and inserting all of the rows from the first table (the one on the left-hand side of the LEFT JOIN), then updating the TABLE datatype with the values from the second table. This technique is a two-step process, but could save a lot of time compared to a standard LEFT JOIN. A good rule is to try out different techniques and time each of them until you get the best performing query for your application.
 When you are testing your query's speed, it's important to run it several times and take an average. Your query (or stored procedure) could be stored in the procedure cache in SQL Server's memory and thus would appear to take longer the first time and shorter on all subsequent tries. In addition, other queries could be running against the same tables while your query runs. This could cause your query to stand in line while other queries lock and unlock tables. For example, if you are querying while someone is updating data in that table, your query may take longer to execute while the update commits.
 One of the easiest ways to avoid slowdowns with LEFT JOINs is to design the database around them as much as possible. For example, let's assume that a product may or may not have a category. If the product table stores the ID of its category and there was no category for a particular product, you could store a NULL value in the field. Then you would have to perform a LEFT JOIN to get all of the products and their categories. You could create a category with the value of "No Category" and thus specify the foreign key relationship to disallow NULL values. By doing this, you can now use an INNER JOIN to retrieve all products and their categories. While this may seem like a workaround with extra data, this can be a valuable technique as it can eliminate costly LEFT JOINs in SQL batches. Using this concept across the board in a database can save you lots of processing time. Remember, even a few seconds means a lot to your users, and those seconds really add up when you have many users accessing an online database application.
 Use Cartesian Products Wisely
 For this tip, I will go against the grain and advocate the use of Cartesian products in certain situations. For some reason, Cartesian products (CROSS JOINS) got a bad rap and developers are often cautioned not to use them at all. In many cases, they are too costly to use effectively. But like any tool in SQL, they can be valuable if used properly. For example, if you want to run a query that will return data for every month, even on customers that had no orders that particular month, you could use a Cartesian product quite handily. The SQL in Figure 2 does just that.
 While this may not seem like magic, consider that if you did a standard INNER JOIN from Customers to Orders, grouped by the month and summed the sales, you would only get the months where the customer had an order. Thus, you would not get back a 0 value for the months in which the customer didn't order any products. If you wanted to plot a graph per customer showing every month and its sales, you would want the graph to include 0 month sales to identify those months visually. If you use the SQL in Figure 2, the data skips over the months that had $0 in sales because there are no rows in the Orders table for nonsales (it is assumed that you do not store what did not occur).
 The code in Figure 3 is longer, but can achieve the same goal of getting all the sales data, even for months without sales. First, it grabs a list of all of the months in the past year and puts them in the first TABLE datatype table (@tblMonths). Next, the code gets a list of all customers' company names who had sales during that time period and puts them in another TABLE datatype table (@tblCus-tomers). These two tables store all of the basic data required to create the resultset except the actual sales numbers.
 All of the months are listed in the first table (12 rows) and all of the customers who had sales in that time frame are listed in the second table (81 for me). Not every customer purchased a product in each of the past 12 months, so performing an INNER or LEFT JOIN won't return every customer for every month. These operations will only return the customers and the months when they did purchase something.
 A Cartesian product can return all customers for all months. A Cartesian product basically multiplies the first table by the second table and results in a rowset that contains the number of rows in the first table times the number of rows in the second table. Thus, the Cartesian product returns 972 rows into the table @tblFinal. The last steps are to update the table @tblFinal with the monthly sales totals for each customer during the date range and to select the final rowset.
 Use CROSS JOINs with caution if you do not need a true Cartesian product because they can be very resource intensive. For example, if you do a CROSS JOIN on products and categories and then use a WHERE clause, DISTINCT or GROUP BY to filter out most of the rows, you could have gotten to the same result in a much more efficient manner by using an INNER JOIN. Cartesian products can be very useful when you need the data returned for all possibilities, as in the case when you want to load a graph with monthly sales dates. But you should not use them for other purposes as INNER JOINs are much more efficient in most scenarios.
 Odds and Ends
 Here are a few other common techniques that can help improve the efficiency of your SQL querying. Let's assume you are going to group all of your salespeople by region and sum their sales, but you only want salespeople who were marked active in your database. You could group the salespeople by region and use a HAVING clause to eliminate the salespersons who are not active, or you could do this in the WHERE clause. Doing this in the WHERE clause reduces the number of rows that need to be grouped, so it is more efficient than doing it in the HAVING clause. Filtering row-based criteria in the HAVING clause forces the query to group data that could have been eliminated in the WHERE clause.
 Another efficiency trick is to use the DISTINCT keyword to find a distinct list of data rows instead of using the GROUP BY clause. In this case, the SQL using the DISTINCT keyword will be more efficient. Reserve use of the GROUP BY for occasions when you need to calculate an aggregate function (SUM, COUNT, MAX, and so on). Also, avoid using the DISTINCT keyword if your query will always return a unique row on its own. In that case, the DISTINCT keyword will only add overhead.
 You've seen that numerous techniques can be employed to optimize queries and implement specific business rules; the trick is to try a few and compare their performance. Most important is to test, test, and test again. In future installments of this column, I'll continue to explore SQL Server concepts including database design, good indexing practices, and SQL Server security paradigms.

原文
五种提高SQL性能的方法
 有时, 为了让应用程序运行得更快,所做的全部工作就是在这里或那里做一些很小调整。啊,但关键在于确定如何进行调整!迟早您会遇到这种情况:应用程序中的 SQL 查询不能按照您想要的方式进行响应。它要么不返回数据,要么耗费的时间长得出奇。如果它降低了报告或您的企业应用程序的速度,用户必须等待的时间过长,他们就会很不满意。就像您的父母不想听您解释为什么在深更半夜才回来一样,用户也不会听你解释为什么查询耗费这么长时间。(“对不起,妈妈,我使用了太多的 LEFT JOIN。”)用户希望应用程序响应迅速,他们的报告能够在瞬间之内返回分析数据。就我自己而言,如果在 Web 上冲浪时某个页面要耗费十多秒才能加载(好吧,五秒更实际一些),我也会很不耐烦。
 为了解决这些问题,重要的是找到问题的根源。那么,从哪里开始呢?根本原因通常在于数据库设计和访问它的查询。在本月的专栏中,我将讲述四项技术,这些技术可用于提高基于 SQL Server? 的应用程序的性能或改善其可伸缩性。我将仔细说明 LEFT JOIN、CROSS JOIN 的使用以及 IDENTITY 值的检索。请记住,根本没有神奇的解决方案。调整您的数据库及其查询需要占用时间、进行分析,还需要大量的测试。这些技术都已被证明行之有效,但对您的应用程序而言,可能其中一些技术比另一些技术更适用。
 从 INSERT 返回 IDENTITY
 我决定从遇到许多问题的内容入手:如何在执行 SQL INSERT 后检索 IDENTITY 值。通常,问题不在于如何编写检索值的查询,而在于在哪里以及何时进行检索。在 SQL Server 中,下面的语句可用于检索由最新在活动数据库连接上运行的 SQL 语句所创建的 IDENTITY 值:
 SELECT @@IDENTITY
 这个 SQL 语句并不复杂,但需要记住的一点是:如果这个最新的 SQL 语句不是 INSERT,或者您针对非 INSERT SQL 的其他连接运行了此 SQL,则不会获得期望的值。您必须运行下列代码才能检索紧跟在 INSERT SQL 之后且位于同一连接上的 IDENTITY,如下所示:
 INSERT INTO Products (ProductName) VALUES ('Chalk')
 SELECT @@IDENTITY
 在一个连接上针对 Northwind 数据库运行这些查询将返回一个名称为 Chalk 的新产品的 IDENTITY 值。所以,在使用 ADO 的 Visual Basic? 应用程序中,可以运行以下语句:
 Set oRs = oCn.Execute("SET NOCOUNT ON;INSERT INTO Products _
 (ProductName) VALUES ('Chalk');SELECT @@IDENTITY")
 lProductID = oRs(0)
 此代码告诉 SQL Server 不要返回查询的行计数,然后执行 INSERT 语句,并返回刚刚为这个新行创建的 IDENTITY 值。SET NOCOUNT ON 语句表示返回的记录集有一行和一列,其中包含了这个新的 IDENTITY 值。如果没有此语句,则会首先返回一个空的记录集(因为 INSERT 语句不返回任何数据),然后会返回第二个记录集,第二个记录集中包含 IDENTITY 值。这可能有些令人困惑,尤其是因为您从来就没有希望过 INSERT 会返回记录集。之所以会发生此情况,是因为 SQL Server 看到了这个行计数(即一行受到影响)并将其解释为表示一个记录集。因此,真正的数据被推回到了第二个记录集。当然您可以使用 ADO 中的 NextRecordset 方法获取此第二个记录集,但如果总能够首先返回该记录集且只返回该记录集,则会更方便,也更有效率。
 此方法虽然有效,但需要在 SQL 语句中额外添加一些代码。获得相同结果的另一方法是在 INSERT 之前使用 SET NOCOUNT ON 语句,并将 SELECT @@IDENTITY 语句放在表中的 FOR INSERT 触发器中,如下面的代码片段所示。这样,任何进入该表的 INSERT 语句都将自动返回 IDENTITY 值。
 CREATE TRIGGER trProducts_Insert ON Products FOR INSERT AS
     SELECT @@IDENTITY
 GO
 触发器只在 Products 表上发生 INSERT 时启动,所以它总是会在成功 INSERT 之后返回一个 IDENTITY。使用此技术,您可以始终以相同的方式在应用程序中检索 IDENTITY 值。
 内嵌视图与临时表
 某些时候,查询需要将数据与其他一些可能只能通过执行 GROUP BY 然后执行标准查询才能收集的数据进行联接。例如,如果要查询最新五个定单的有关信息,您首先需要知道是哪些定单。这可以使用返回定单 ID 的 SQL 查询来检索。此数据就会存储在临时表(这是一个常用技术)中,然后与 Products 表进行联接,以返回这些定单售出的产品数量:
 CREATE TABLE #Temp1 (OrderID INT NOT NULL, _
                      OrderDate DATETIME NOT NULL)
 INSERT INTO #Temp1 (OrderID, OrderDate)
 SELECT     TOP 5 o.OrderID, o.OrderDate
 FROM Orders o ORDER BY o.OrderDate DESC
 SELECT     p.ProductName, SUM(od.Quantity) AS ProductQuantity
 FROM     #Temp1 t
     INNER JOIN [Order Details] od ON t.OrderID = od.OrderID
     INNER JOIN Products p ON od.ProductID = p.ProductID
 GROUP BY p.ProductName
 ORDER BY p.ProductName
 DROP TABLE #Temp1
 这些 SQL 语句会创建一个临时表,将数据插入该表中,将其他数据与该表进行联接,然后除去该临时表。这会导致此查询进行大量 I/O 操作,因此,可以重新编写查询,使用内嵌视图取代临时表。内嵌视图只是一个可以联接到 FROM 子句中的查询。所以,您不用在 tempdb 中的临时表上耗费大量 I/O 和磁盘访问,而可以使用内嵌视图得到同样的结果:
 SELECT p.ProductName,
     SUM(od.Quantity) AS ProductQuantity
 FROM     (
     SELECT TOP 5 o.OrderID, o.OrderDate
     FROM     Orders o
     ORDER BY o.OrderDate DESC
     ) t
     INNER JOIN [Order Details] od ON t.OrderID = od.OrderID
     INNER JOIN Products p ON od.ProductID = p.ProductID
 GROUP BY
     p.ProductName
 ORDER BY
     p.ProductName
 此查询不仅比前面的查询效率更高,而且长度更短。临时表会消耗大量资源。如果只需要将数据联接到其他查询,则可以试试使用内嵌视图,以节省资源。
 避免 LEFT JOIN 和 NULL
 当然,有很多时候您需要执行 LEFT JOIN 和使用 NULL 值。但是,它们并不适用于所有情况。改变 SQL 查询的构建方式可能会产生将一个花几分钟运行的报告缩短到只花几秒钟这样的天壤之别的效果。有时,必须在查询中调整数据的形态,使之适应应用程序所要求的显示方式。虽然 TABLE 数据类型会减少大量占用资源的情况,但在查询中还有许多区域可以进行优化。SQL 的一个有价值的常用功能是 LEFT JOIN。它可以用于检索第一个表中的所有行、第二个表中所有匹配的行、以及第二个表中与第一个表不匹配的所有行。例如,如果希望返回每个客户及其定单,使用 LEFT JOIN 则可以显示有定单和没有定单的客户。
 此工具可能会被过度使用。LEFT JOIN 消耗的资源非常之多,因为它们包含与 NULL(不存在)数据匹配的数据。在某些情况下,这是不可避免的,但是代价可能非常高。LEFT JOIN 比 INNER JOIN 消耗资源更多,所以如果您可以重新编写查询以使得该查询不使用任何 LEFT JOIN,则会得到非常可观的回报(请参阅图 1 中的图)。
 
 图 1:查询
 加快使用 LEFT JOIN 的查询速度的一项技术涉及创建一个 TABLE 数据类型,插入第一个表(LEFT JOIN 左侧的表)中的所有行,然后使用第二个表中的值更新 TABLE 数据类型。此技术是一个两步的过程,但与标准的 LEFT JOIN 相比,可以节省大量时间。一个很好的规则是尝试各种不同的技术并记录每种技术所需的时间,直到获得用于您的应用程序的执行性能最佳的查询。
 测试查询的速度时,有必要多次运行此查询,然后取一个平均值。因为查询(或存储过程)可能会存储在 SQL Server 内存中的过程缓存中,因此第一次尝试耗费的时间好像稍长一些,而所有后续尝试耗费的时间都较短。另外,运行您的查询时,可能正在针对相同的表运行其他查询。当其他查询锁定和解锁这些表时,可能会导致您的查询要排队等待。例如,如果您进行查询时某人正在更新此表中的数据,则在更新提交时您的查询可能需要耗费更长时间来执行。
 避免使用 LEFT JOIN 时速度降低的最简单方法是尽可能多地围绕它们设计数据库。例如,假设某一产品可能具有类别也可能没有类别。如果 Products 表存储了其类别的 ID,而没有用于某个特定产品的类别,则您可以在字段中存储 NULL 值。然后您必须执行 LEFT JOIN 来获取所有产品及其类别。您可以创建一个值为“No Category”的类别,从而指定外键关系不允许 NULL 值。通过执行上述操作,现在您就可以使用 INNER JOIN 检索所有产品及其类别了。虽然这看起来好像是一个带有多余数据的变通方法,但可能是一个很有价值的技术,因为它可以消除 SQL 批处理语句中消耗资源较多的 LEFT JOIN。在数据库中全部使用此概念可以为您节省大量的处理时间。请记住,对于您的用户而言,即使几秒钟的时间也非常重要,因为当您有许多用户正在访问同一个联机数据库应用程序时,这几秒钟实际上的意义会非常重大。
 灵活使用笛卡尔乘积
 对于此技巧,我将进行非常详细的介绍,并提倡在某些情况下使用笛卡尔乘积。出于某些原因,笛卡尔乘积 (CROSS JOIN) 遭到了很多谴责,开发人员通常会被警告根本就不要使用它们。在许多情况下,它们消耗的资源太多,从而无法高效使用。但是像 SQL 中的任何工具一样,如果正确使用,它们也会很有价值。例如,如果您想运行一个返回每月数据(即使某一特定月份客户没有定单也要返回)的查询,您就可以很方便地使用笛卡尔乘积。 图 2 中的 SQL 就执行了上述操作。
 虽然这看起来好像没什么神奇的,但是请考虑一下,如果您从客户到定单(这些定单按月份进行分组并对销售额进行小计)进行了标准的 INNER JOIN,则只会获得客户有定单的月份。因此,对于客户未订购任何产品的月份,您不会获得 0 值。如果您想为每个客户都绘制一个图,以显示每个月和该月销售额,则可能希望此图包括月销售额为 0 的月份,以便直观标识出这些月份。如果使用 图 2 中的 SQL,数据则会跳过销售额为 0 美元的月份,因为在定单表中对于零销售额不会包含任何行(假设您只存储发生的事件)。
 图 3 中的代码虽然较长,但是可以达到获取所有销售数据(甚至包括没有销售额的月份)的目标。首先,它会提取去年所有月份的列表,然后将它们放入第一个 TABLE 数据类型表 (@tblMonths) 中。下一步,此代码会获取在该时间段内有销售额的所有客户公司的名称列表,然后将它们放入另一个 TABLE 数据类型表 (@tblCus-tomers) 中。这两个表存储了创建结果集所必需的所有基本数据,但实际销售数量除外。 第一个表中列出了所有月份(12 行),第二个表中列出了这个时间段内有销售额的所有客户(对于我是 81 个)。并非每个客户在过去 12 个月中的每个月都购买了产品,所以,执行 INNER JOIN 或 LEFT JOIN 不会返回每个月的每个客户。这些操作只会返回购买产品的客户和月份。
 笛卡尔乘积则可以返回所有月份的所有客户。笛卡尔乘积基本上是将第一个表与第二个表相乘,生成一个行集合,其中包含第一个表中的行数与第二个表中的行数相乘的结果。因此,笛卡尔乘积会向表 @tblFinal 返回 972 行。最后的步骤是使用此日期范围内每个客户的月销售额总计更新 @tblFinal 表,以及选择最终的行集。
 如果由于笛卡尔乘积占用的资源可能会很多,而不需要真正的笛卡尔乘积,则可以谨慎地使用 CROSS JOIN。例如,如果对产品和类别执行了 CROSS JOIN,然后使用 WHERE 子句、DISTINCT 或 GROUP BY 来筛选出大多数行,那么使用 INNER JOIN 会获得同样的结果,而且效率高得多。如果需要为所有的可能性都返回数据(例如在您希望使用每月销售日期填充一个图表时),则笛卡尔乘积可能会非常有帮助。但是,您不应该将它们用于其他用途,因为在大多数方案中 INNER JOIN 的效率要高得多。
 拾遗补零
 这里介绍其他一些可帮助提高 SQL 查询效率的常用技术。假设您将按区域对所有销售人员进行分组并将他们的销售额进行小计,但是您只想要那些数据库中标记为处于活动状态的销售人员。您可以按区域对销售人员分组,并使用 HAVING 子句消除那些未处于活动状态的销售人员,也可以在 WHERE 子句中执行此操作。在 WHERE 子句中执行此操作会减少需要分组的行数,所以比在 HAVING 子句中执行此操作效率更高。HAVING 子句中基于行的条件的筛选会强制查询对那些在 WHERE 子句中会被去除的数据进行分组。
 另一个提高效率的技巧是使用 DISTINCT 关键字查找数据行的单独报表,来代替使用 GROUP BY 子句。在这种情况下,使用 DISTINCT 关键字的 SQL 效率更高。请在需要计算聚合函数(SUM、COUNT、MAX 等)的情况下再使用 GROUP BY。另外,如果您的查询总是自己返回一个唯一的行,则不要使用 DISTINCT 关键字。在这种情况下,DISTINCT 关键字只会增加系统开销。
 您已经看到了,有大量技术都可用于优化查询和实现特定的业务规则,技巧就是进行一些尝试,然后比较它们的性能。最重要的是要测试、测试、再测试。在此专栏的将来各期内容中,我将继续深入讲述 SQL Server 概念,包括数据库设计、好的索引实践以及 SQL Server 安全范例。

 

【五种提高SQL性能的方法(一)】相关文章:

谈英语听力提高的方法03-05

提高AutoCAD绘图速度的若干方法03-18

单片机通信性能分析和评价方法03-18

提高矿井主提升能力方法的探讨03-20

提高历史课堂的教学效率的方法03-02

谈提高钢琴练习效率的有效方法03-29

怎样提高资金控制力的方法03-24

关于提高企业员工韧性的途径和方法03-18

提高浅静脉穿刺成功率的方法03-04