SQL Prompt Tutorial: Why Is SELECT * (BP005) Bad in Production Code? (Part 2)

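SQL Prompt is a practical SQL code-completion tool. If SQL Prompt warns you about using the asterisk, or "star" (*), in a SELECT statement, consider replacing it with an explicit column list. Doing so prevents unnecessary network load and query performance problems, and avoids problems when the column order changes and you are inserting into a table. This article is the second half of the tutorial.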

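SQL Prompt searches the database's object names, syntax, and code snippets automatically and offers suitable code choices. Automatic script formatting keeps code simple and readable, which is especially useful when a developer is unfamiliar with a script. SQL Prompt works as soon as it is installed and can greatly improve coding efficiency. Users can also customize it so that it works the way they expect.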


This second half covers the rest of "Why is SELECT * bad in production code?" (continuing from Part 1), followed by "SELECT * in applications".

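Misinterpretation

With SELECT *, you cannot guarantee that your code will always return the same columns in the same order, which means it is not resilient to database refactoring. Upstream modifications to the table source can change the order or the number of columns. If you use INSERT INTO…SELECT * to transfer data, the best outcome is an error, because the consequences of assigning data to the wrong target column can be horrifying.

I'll demonstrate how dangerous this is if you use it in production code and some database refactoring then becomes necessary. Here, we make a mistake while copying sensitive information. It is very easy to do, and it could lead to financial irregularities without triggering any error. If you are of a nervous disposition, look away now.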

/* we create a table just for our testing */
CREATE TABLE dbo.ExchangeRates --let's pretend we have this data
  (
  CurrencyRateDate DATETIME NOT NULL,
  AverageRate MONEY NOT NULL,
  EndOfDayRate MONEY NOT NULL,
  FromCurrency NVARCHAR(50) NOT NULL,
  FromRegion NVARCHAR(50) NOT NULL,
  ToCurrency NVARCHAR(50) NOT NULL,
  ToRegion NVARCHAR(50) NOT NULL
  );
/* we now steal data for it from AdventureWorks next-door */
INSERT INTO dbo.ExchangeRates
  SELECT CurrencyRate.CurrencyRateDate, CurrencyRate.AverageRate,
    CurrencyRate.EndOfDayRate, Currency.Name AS FromCurrency,
    CountryRegion.Name AS FromRegion, CurrencyTo.Name AS ToCurrency,
    CountryRegionTo.Name AS ToRegion
    FROM Adventureworks2016.Sales.CurrencyRate
      INNER JOIN Adventureworks2016.Sales.Currency
        ON CurrencyRate.FromCurrencyCode = Currency.CurrencyCode
      INNER JOIN Adventureworks2016.Sales.CountryRegionCurrency
        ON Currency.CurrencyCode = CountryRegionCurrency.CurrencyCode
      INNER JOIN Adventureworks2016.Person.CountryRegion
        ON CountryRegionCurrency.CountryRegionCode = CountryRegion.CountryRegionCode
      INNER JOIN Adventureworks2016.Sales.Currency AS CurrencyTo
        ON CurrencyRate.ToCurrencyCode = CurrencyTo.CurrencyCode
      INNER JOIN Adventureworks2016.Sales.CountryRegionCurrency AS CountryRegionCurrencyTo
        ON CurrencyTo.CurrencyCode = CountryRegionCurrencyTo.CurrencyCode
      INNER JOIN Adventureworks2016.Person.CountryRegion AS CountryRegionTo
        ON CountryRegionCurrencyTo.CountryRegionCode = CountryRegionTo.CountryRegionCode;
GO
/* so we start our test by creating a view to show exchange rates from Ecuador */
CREATE VIEW dbo.EquadorExhangeRates
AS
SELECT ExchangeRates.CurrencyRateDate, ExchangeRates.AverageRate,
       ExchangeRates.EndOfDayRate, ExchangeRates.FromCurrency,
       ExchangeRates.FromRegion, ExchangeRates.ToCurrency, ExchangeRates.ToRegion
  FROM dbo.ExchangeRates
  WHERE ExchangeRates.FromRegion = 'Ecuador';
GO
/* now we just fill a table variable with the rows from the view and display the first ten */
DECLARE @MyUsefulExchangeRates TABLE
  (
  CurrencyRateDate DATETIME NOT NULL,
  AverageRate MONEY NOT NULL,
  EndOfDayRate MONEY NOT NULL,
  FromCurrency NVARCHAR(50) NOT NULL,
  FromRegion NVARCHAR(50) NOT NULL,
  ToCurrency NVARCHAR(50) NOT NULL,
  ToRegion NVARCHAR(50) NOT NULL
  );
INSERT INTO @MyUsefulExchangeRates
  (CurrencyRateDate, AverageRate, EndOfDayRate,
   FromCurrency, FromRegion, ToCurrency, ToRegion)
  SELECT * --this isn't good at all
    FROM dbo.EquadorExhangeRates;
--display the first ten rows from the table to see what we have
SELECT TOP 10 UER.CurrencyRateDate, UER.AverageRate, UER.EndOfDayRate,
  UER.ToCurrency, UER.ToRegion, UER.FromCurrency, UER.FromRegion
  FROM @MyUsefulExchangeRates AS UER
  ORDER BY UER.CurrencyRateDate DESC;
GO
/* end of first part. Now someone decides to alter the view */
ALTER VIEW dbo.EquadorExhangeRates
AS
SELECT ExchangeRates.CurrencyRateDate, ExchangeRates.AverageRate,
  ExchangeRates.EndOfDayRate, ExchangeRates.ToCurrency, ExchangeRates.ToRegion,
  ExchangeRates.FromCurrency, ExchangeRates.FromRegion
  FROM dbo.ExchangeRates
  WHERE ExchangeRates.FromRegion = 'Ecuador';
GO
/* we repeat the routine to extract the first ten rows exactly as before */
DECLARE @MyUsefulExchangeRates TABLE
  (
  CurrencyRateDate DATETIME NOT NULL,
  AverageRate MONEY NOT NULL,
  EndOfDayRate MONEY NOT NULL,
  FromCurrency NVARCHAR(50) NOT NULL,
  FromRegion NVARCHAR(50) NOT NULL,
  ToCurrency NVARCHAR(50) NOT NULL,
  ToRegion NVARCHAR(50) NOT NULL
  );
INSERT INTO @MyUsefulExchangeRates
  (CurrencyRateDate, AverageRate, EndOfDayRate,
   FromCurrency, FromRegion, ToCurrency, ToRegion)
  SELECT * --bad, bad, bad
    FROM dbo.EquadorExhangeRates;
--check that the data is the same. It isn't, is it? No sir!
SELECT TOP 10 UER.CurrencyRateDate, UER.AverageRate, UER.EndOfDayRate,
  UER.ToCurrency, UER.ToRegion, UER.FromCurrency, UER.FromRegion
  FROM @MyUsefulExchangeRates AS UER
  ORDER BY UER.CurrencyRateDate DESC;
GO
/* now just tidy up and tear down */
DROP VIEW dbo.EquadorExhangeRates;
DROP TABLE dbo.ExchangeRates;

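These are the "before" and "after" results.

As you can see, by switching the "to" and "from" columns in the view, we have "unintentionally" corrupted the data. Spelling out the column list may look like redundancy in your code. However, it runs at least as fast as, and can even run faster than, using the asterisk to specify all the columns on the assumption that they arrive in a particular order.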
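The corruption above disappears once the source columns are named as well as the target columns. What follows is only a minimal sketch, assuming that the dbo.ExchangeRates table and the dbo.EquadorExhangeRates view from the demonstration still exist (that is, it runs before the tidy-up step); the alias EER is introduced here purely for illustration.

```sql
DECLARE @MyUsefulExchangeRates TABLE
  (
  CurrencyRateDate DATETIME NOT NULL,
  AverageRate MONEY NOT NULL,
  EndOfDayRate MONEY NOT NULL,
  FromCurrency NVARCHAR(50) NOT NULL,
  FromRegion NVARCHAR(50) NOT NULL,
  ToCurrency NVARCHAR(50) NOT NULL,
  ToRegion NVARCHAR(50) NOT NULL
  );
-- Each source column is named, so it lines up with its target column
-- regardless of the order in which the view happens to expose its columns.
INSERT INTO @MyUsefulExchangeRates
  (CurrencyRateDate, AverageRate, EndOfDayRate,
   FromCurrency, FromRegion, ToCurrency, ToRegion)
  SELECT EER.CurrencyRateDate, EER.AverageRate, EER.EndOfDayRate,
         EER.FromCurrency, EER.FromRegion, EER.ToCurrency, EER.ToRegion
    FROM dbo.EquadorExhangeRates AS EER; -- EER is an illustrative alias
```

With this form, altering the view to reorder its columns leaves the copied data unchanged, and renaming or dropping a column raises an error instead of silently shifting values.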

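Constraint problems

When we use SELECT * with a large number of joined tables, we can, and probably will, have duplicate column names. Here is a simple query from AdventureWorks: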

SELECT *
  FROM HumanResources.Employee AS e
    INNER JOIN Person.Person AS p
      ON p.BusinessEntityID = e.BusinessEntityID
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
      ON e.BusinessEntityID = edh.BusinessEntityID
    INNER JOIN HumanResources.Department AS d
      ON edh.DepartmentID = d.DepartmentID
  WHERE (edh.EndDate IS NULL);

This code lists the duplicated column names:

DECLARE @SourceCode NVARCHAR(4000)='
SELECT *
  FROM HumanResources.Employee AS e
    INNER JOIN Person.Person AS p
      ON p.BusinessEntityID = e.BusinessEntityID
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
      ON e.BusinessEntityID = edh.BusinessEntityID
    INNER JOIN HumanResources.Department AS d
      ON edh.DepartmentID = d.DepartmentID
  WHERE (edh.EndDate IS NULL);
--'
SELECT Count(*) AS Duplicates, name
  FROM sys.dm_exec_describe_first_result_set(@SourceCode, NULL, 1)
  GROUP BY name
  HAVING Count(*) > 1
  ORDER BY Count(*) DESC;

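This causes problems for any application that tries to make sense of such a result by selecting named columns. If you try to create a table from the result using SELECT…INTO, it fails.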

SELECT * INTO MyTempTable
  FROM HumanResources.Employee AS e
    INNER JOIN Person.Person AS p
      ON p.BusinessEntityID = e.BusinessEntityID
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
      ON e.BusinessEntityID = edh.BusinessEntityID
    INNER JOIN HumanResources.Department AS d
      ON edh.DepartmentID = d.DepartmentID
  WHERE (edh.EndDate IS NULL);

Msg 2705, Level 16, State 3, Line 19
Column names in each table must be unique. Column name 'BusinessEntityID' in table 'MyTempTable' is specified more than once.

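Again, this means that your SELECT * code is fragile. If someone renames a column in one table, it can create a duplicate column in a SELECT * INTO somewhere else, and you are left scratching your head, wondering why a routine that worked perfectly has suddenly crashed.

There is one place where SELECT * has a special meaning and cannot be replaced: when you convert a result to JSON and need the joined tables embedded as objects in the result.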
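One way to make the SELECT…INTO robust is to list only the columns you need and alias any names that would otherwise collide. This is a sketch rather than the author's code: the temporary table #EmployeeDepartment and the aliases DepartmentName, EmployeeModifiedDate, and PersonModifiedDate are hypothetical names chosen for illustration, and the columns are those that appear in the AdventureWorks output shown in this article.

```sql
-- #EmployeeDepartment and the AS aliases are illustrative, not from the article
SELECT e.BusinessEntityID,                       -- taken from one table only
       e.JobTitle,
       p.FirstName,
       p.LastName,
       d.Name AS DepartmentName,
       edh.StartDate,
       e.ModifiedDate AS EmployeeModifiedDate,   -- disambiguated by alias
       p.ModifiedDate AS PersonModifiedDate
  INTO #EmployeeDepartment
  FROM HumanResources.Employee AS e
    INNER JOIN Person.Person AS p
      ON p.BusinessEntityID = e.BusinessEntityID
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
      ON e.BusinessEntityID = edh.BusinessEntityID
    INNER JOIN HumanResources.Department AS d
      ON edh.DepartmentID = d.DepartmentID
  WHERE edh.EndDate IS NULL;
```

Because every output column has a unique name, a column added to or renamed in one of the joined tables cannot introduce a duplicate into the new table.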

SELECT *
  FROM HumanResources.Employee AS employee
    INNER JOIN Person.Person AS person
      ON person.BusinessEntityID = employee.BusinessEntityID
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS history
      ON employee.BusinessEntityID = history.BusinessEntityID
    INNER JOIN HumanResources.Department AS d
      ON history.DepartmentID = d.DepartmentID
  WHERE (history.EndDate IS NULL)
  FOR JSON AUTO

This gives you the following (I show only the first document in the array):

[{
    "BusinessEntityID": 1, "NationalIDNumber": "295847284", "LoginID": "adventure-works\\ken0",
    "JobTitle": "Chief Executive Officer", "BirthDate": "1969-01-29", "MaritalStatus": "S",
    "Gender": "M", "HireDate": "2009-01-14", "SalariedFlag": true, "VacationHours": 99,
    "SickLeaveHours": 69, "CurrentFlag": true,
    "rowguid": "F01251E5-96A3-448D-981E-0F99D789110D", "ModifiedDate": "2014-06-30T00:00:00",
    "person": [{
        "BusinessEntityID": 1, "PersonType": "EM", "NameStyle": false, "FirstName": "Ken",
        "MiddleName": "J", "LastName": "Sánchez", "EmailPromotion": 0,
        "Demographics": "<IndividualSurvey><TotalPurchaseYTD>0</TotalPurchaseYTD></IndividualSurvey>",
        "rowguid": "92C4279F-1207-48A3-8448-4636514EB7E2", "ModifiedDate": "2009-01-07T00:00:00",
        "history": [{
            "BusinessEntityID": 1, "DepartmentID": 16, "ShiftID": 1, "StartDate": "2009-01-14",
            "ModifiedDate": "2009-01-13T00:00:00",
            "d": [{
                "DepartmentID": 16, "Name": "Executive",
                "GroupName": "Executive General and Administration",
                "ModifiedDate": "2008-04-30T00:00:00"
            }]
        }]
    }]
}]

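There is no conflict here, because each ModifiedDate column is encapsulated in the object that represents its source table.

The corresponding XML looks like this: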

<employee BusinessEntityID="1" NationalIDNumber="295847284" LoginID="adventure-works\ken0"
          JobTitle="Chief Executive Officer" BirthDate="1969-01-29" MaritalStatus="S" Gender="M"
          HireDate="2009-01-14" SalariedFlag="1" VacationHours="99" SickLeaveHours="69" CurrentFlag="1"
          rowguid="F01251E5-96A3-448D-981E-0F99D789110D" ModifiedDate="2014-06-30T00:00:00">
  <person BusinessEntityID="1" PersonType="EM" NameStyle="0" FirstName="Ken" MiddleName="J"
          LastName="Sánchez" EmailPromotion="0"
          rowguid="92C4279F-1207-48A3-8448-4636514EB7E2" ModifiedDate="2009-01-07T00:00:00">
    <Demographics>
      <IndividualSurvey>
        <TotalPurchaseYTD>0</TotalPurchaseYTD>
      </IndividualSurvey>
    </Demographics>
    <history BusinessEntityID="1" DepartmentID="16" ShiftID="1" StartDate="2009-01-14"
             ModifiedDate="2009-01-13T00:00:00">
      <d DepartmentID="16" Name="Executive" GroupName="Executive General and Administration"
         ModifiedDate="2008-04-30T00:00:00"/>
    </history>
  </person>
</employee>
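The article does not show the query behind this XML. A plausible sketch, assuming it is the same join as the JSON example with FOR XML AUTO in place of FOR JSON AUTO, would be:

```sql
-- Assumption: same query as the JSON example, changed only in its final clause.
-- In AUTO mode each table alias becomes an element, nested in the order the
-- tables first appear in the select list, and its columns become attributes.
SELECT *
  FROM HumanResources.Employee AS employee
    INNER JOIN Person.Person AS person
      ON person.BusinessEntityID = employee.BusinessEntityID
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS history
      ON employee.BusinessEntityID = history.BusinessEntityID
    INNER JOIN HumanResources.Department AS d
      ON history.DepartmentID = d.DepartmentID
  WHERE (history.EndDate IS NULL)
  FOR XML AUTO;
```

As with the JSON version, the repeated ModifiedDate and BusinessEntityID names do not clash, because each one is an attribute of the element for its own source table.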

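Maintainability

When you lay out code, specifying the columns not only avoids mistakes in assigning values to the right columns or variables, but also makes the code more readable. As far as you can, for the sake of your future self or of the poor soul who will one day have to maintain the code, spell out the names of the columns involved. Sure, the code looks a little clunkier, but if a fairy appeared on your shoulder and said that your code would be clearer and more reliable if you typed it twice, would you do it?

SELECT * in applications

Sometimes you will see long-running queries that request all columns and that originate from an application, often one that uses LINQ. Usually this is not deliberate: the developer has made the mistake of leaving out the column specification, and an innocent-looking LINQ query is translated into SELECT * or into a column list that contains every column. If the WHERE clause is too general, or is left out entirely, the consequences are compounded, because the network is typically the slowest component and all that unnecessary data piles up on it.

For example, using AdventureWorks and LINQPad, you can do this in LINQ: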

Persons.OrderBy (p => p.BusinessEntityID).Take (100)

…which LINQ translates into the query that is actually executed. You will see that it selects every column…

SELECT TOP (100) [t0].[BusinessEntityID], [t0].[PersonType], [t0].[NameStyle], [t0].[Title],
    [t0].[FirstName], [t0].[MiddleName], [t0].[LastName], [t0].[Suffix], [t0].[EmailPromotion],
    [t0].[AdditionalContactInfo], [t0].[Demographics], [t0].[rowguid] AS [Rowguid], [t0].[ModifiedDate]
FROM [Person].[Person] AS [t0]
ORDER BY [t0].[BusinessEntityID]

Similarly, this expression

from row in Persons select row

…returns every column of every row in the entire table.

SELECT [t0].[BusinessEntityID], [t0].[PersonType], [t0].[NameStyle], [t0].[Title],
    [t0].[FirstName], [t0].[MiddleName], [t0].[LastName], [t0].[Suffix], [t0].[EmailPromotion],
    [t0].[AdditionalContactInfo], [t0].[Demographics], [t0].[rowguid] AS [Rowguid], [t0].[ModifiedDate]
FROM [Person].[Person] AS [t0]

By contrast, this…

from row in Persons.Where(i => i.LastName == "Bradley") select row.FirstName+" "+row.LastName

…translates into the more sensible:

-- Region Parameters
DECLARE @p0 NVarChar(1000) = 'Bradley'
DECLARE @p1 NVarChar(1000) = ' '
-- EndRegion
SELECT ([t0].[FirstName] + @p1) + [t0].[LastName] AS [value]
FROM [Person].[Person] AS [t0]
WHERE [t0].[LastName] = @p0

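Conclusion

The general code smell is asking for more data than you need. It is almost always better, and faster, to let the data source do the filtering for you. The use of SELECT *, although entirely legitimate in some cases, is often a sign of this more general problem. Developers who are fluent in C# or VB but not in SQL are tempted to download entire rows, or even entire tables, and do the filtering in more familiar territory. The extra network load and latency ought to be reason enough to avoid the practice, yet the result is often misdiagnosed as a "slow database". A long column list, often one that lists every column, is almost as harmful as SELECT *, although SELECT * carries an additional risk whenever any refactoring takes place.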


Source: 慧都
