me talking out loud » Blog Archive » “Getting” Joins
NOTE: I highly recommend people read the Coding Horror extension of my post. I wrote this up as a quick and dirty cheat for a friend. Coding Horror actually takes his time with the subject and gives it a better explanation.
I was asked to post this after explaining it to someone on IRC. If you have tried to understand how joins work and constantly get confused about which join to use, you just need to keep a simple picture in mind (I like pictures). I will be explaining joins by referencing a Venn diagram. We will start with just an empty diagram: The T1 circle represents all the records in table 1. I will use red to signify the records that will be returned by a particular join.
INNER JOIN An inner join only returns those records that have “matches” in both tables.
OUTER JOIN A full outer join returns all the records from both tables, whether or not they have a match in the other table.
LEFT JOIN A left join returns all the records in the “left” table (T1) whether they have a match in the right table or not.
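The two circles of the Venn diagram can be tried out on a pair of tiny tables. This is a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in database; the tables `t1` and `t2` and their rows are made up for illustration. Ids 2 and 3 sit in the overlap of the circles, id 1 exists only in T1, and id 4 exists only in T2.

```python
import sqlite3

# Hypothetical tables: t1 is the "left" circle, t2 the "right" circle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, color TEXT);
    INSERT INTO t1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO t2 VALUES (2, 'red'), (3, 'green'), (4, 'blue');
""")

# INNER JOIN: only the overlap of the two circles.
inner_rows = conn.execute("""
    SELECT t1.id, t1.name, t2.color
    FROM t1 INNER JOIN t2 ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

# LEFT JOIN: the whole T1 circle; color is NULL (None) where T2 has no match.
left_rows = conn.execute("""
    SELECT t1.id, t1.name, t2.color
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

print(inner_rows)  # [(2, 'two', 'red'), (3, 'three', 'green')]
print(left_rows)   # [(1, 'one', None), (2, 'two', 'red'), (3, 'three', 'green')]
```

The row for id 4 never appears in either result, because neither join asks for the part of T2 that lies outside T1.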
Join Hints (Transact-SQL)
Join hints specify that the query optimizer enforce a join strategy between two tables in SQL Server 2012. For general information about joins and join syntax, see FROM (Transact-SQL).
    <join_hint> ::= { LOOP | HASH | MERGE | REMOTE }
LOOP | HASH | MERGE: Specifies that the join in the query should use looping, hashing, or merging.
REMOTE: Specifies that the join operation is performed on the site of the right table. If the right table is local, the join is performed locally. REMOTE cannot be used when one of the values being compared in the join predicate is cast to a different collation using the COLLATE clause. REMOTE can be used only for INNER JOIN operations.
Join hints are specified in the FROM clause of a query.
A. The following example specifies that the JOIN operation in the query is performed by a HASH join.
B. The following example specifies that the JOIN operation in the query is performed by a LOOP join.
C.
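To make "looping, hashing, or merging" more concrete, here is a rough sketch of the three strategies as plain Python functions over lists of tuples. It illustrates the general textbook algorithms only and is not how SQL Server implements them; the function names and the sample rows (`orders`, `details`) are invented for this example. Each row is a tuple whose first element is the join key.

```python
from collections import defaultdict

def loop_join(left, right):
    """Nested loops: compare every left row with every right row."""
    return [(l, r) for l in left for r in right if l[0] == r[0]]

def hash_join(left, right):
    """Hash: build a lookup table on the right input, then probe it per left row."""
    buckets = defaultdict(list)
    for r in right:
        buckets[r[0]].append(r)
    return [(l, r) for l in left for r in buckets.get(l[0], [])]

def merge_join(left, right):
    """Merge: sort both inputs on the key, then advance two cursors in step."""
    left, right = sorted(left), sorted(right)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            # Find the end of the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            # Pair every left row in the matching run with that right run.
            while i < len(left) and left[i][0] == key:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out

# Made-up sample rows: (key, payload)
orders = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
details = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

loop_result = sorted(loop_join(orders, details))
hash_result = sorted(hash_join(orders, details))
merge_result = sorted(merge_join(orders, details))
print(loop_result == hash_result == merge_result)  # True
```

All three produce the same matched pairs; they differ only in how much work they do to find them, which is the trade-off a join hint takes out of the optimizer's hands.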
Data Science Toolkit
Usage
Command Line on OS X and Linux
Download python_tools.zip, extract into a new folder, cd into it and run . This will create a set of scripts you can run directly from the command line, like this:
    html2text | text2people
The command above fetches the New York Times front page, extracts a plain text version, and then pulls out likely names.
    file2text -h ~/scanned_documents/*.jpg > scanned_text.txt
This will run OCR on all the JPEG images in that folder (the same command also works on PDF, DOC and XLS files). Commands that take in inputs that aren't natural-language text or html treat their arguments as the strings to process, rather than file names.
    ip2coordinates "67.169.73.113"
If you do want to run one of these commands on a large number of inputs, you can pipe them in from a file on stdin, and each line in the file will be treated as an input:
    ip2coordinates < someips.txt
    text2places -h > nytimes_places.csv
    export DSTK_API_BASE=
Ruby
CROSS APPLY Explained
My first introduction to the APPLY operator was using the DMVs. For quite a while after first being introduced, I didn’t understand it or see a use for it. While it is undeniable that it has some required uses when dealing with table valued functions, its other uses evaded me for a while. Luckily, I started seeing some code that used it outside of table valued functions. It finally struck me that it could be used as a replacement for correlated subqueries and derived tables. I never liked correlated subqueries because it always seemed like adding full blown queries in the select list was confusing and improper.
    SELECT SalesOrderID = soh.SalesOrderID
          ,OrderDate = soh.OrderDate
          ,MaxUnitPrice = (SELECT MAX(sod.UnitPrice)
                           FROM Sales.SalesOrderDetail sod
                           WHERE soh.SalesOrderID = sod.SalesOrderID)
    FROM AdventureWorks.Sales.SalesOrderHeader AS soh
It always seemed to me that these operations should go below the FROM clause. Luckily, this is where the CROSS APPLY steps in so nicely.
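The idea of moving the subquery "below the FROM clause" can be tried on a small scale. The sketch below uses SQLite through Python's `sqlite3` module, which is an assumption made for runnability: SQLite has no APPLY operator, so the second query uses a joined derived table as the nearest equivalent. The tables `order_header` and `order_detail` are hypothetical stand-ins for the AdventureWorks tables, and the T-SQL in the comment is a reconstruction of the general CROSS APPLY shape, not code taken from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_header (order_id INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE order_detail (order_id INTEGER, unit_price REAL);
    INSERT INTO order_header VALUES (1, '2024-01-05'), (2, '2024-01-06');
    INSERT INTO order_detail VALUES (1, 10.0), (1, 25.0), (2, 7.5);
""")

# Correlated subquery in the select list, as in the post's example.
correlated = conn.execute("""
    SELECT soh.order_id, soh.order_date,
           (SELECT MAX(sod.unit_price)
            FROM order_detail AS sod
            WHERE sod.order_id = soh.order_id) AS max_unit_price
    FROM order_header AS soh
    ORDER BY soh.order_id
""").fetchall()

# Same result with the aggregate moved below FROM as a derived table.
# In T-SQL the equivalent CROSS APPLY shape would be roughly (not run here):
#   SELECT soh.SalesOrderID, soh.OrderDate, d.MaxUnitPrice
#   FROM AdventureWorks.Sales.SalesOrderHeader AS soh
#   CROSS APPLY (SELECT MAX(sod.UnitPrice) AS MaxUnitPrice
#                FROM AdventureWorks.Sales.SalesOrderDetail AS sod
#                WHERE sod.SalesOrderID = soh.SalesOrderID) AS d
derived = conn.execute("""
    SELECT soh.order_id, soh.order_date, mx.max_unit_price
    FROM order_header AS soh
    JOIN (SELECT order_id, MAX(unit_price) AS max_unit_price
          FROM order_detail
          GROUP BY order_id) AS mx
      ON mx.order_id = soh.order_id
    ORDER BY soh.order_id
""").fetchall()

print(correlated)  # [(1, '2024-01-05', 25.0), (2, '2024-01-06', 7.5)]
print(derived == correlated)  # True
```

One difference to keep in mind: the inner join to the derived table would drop an order that has no detail rows, whereas the correlated subquery would return it with a NULL maximum. The sample data here has details for every order, so the two results match.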
Using Cross Joins
A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. The following example shows a Transact-SQL cross join.
    USE AdventureWorks2008R2;
    GO
    SELECT p.BusinessEntityID, t.Name AS Territory
    FROM Sales.SalesPerson p
    CROSS JOIN Sales.SalesTerritory t
    ORDER BY p.BusinessEntityID;
The result set contains 170 rows (SalesPerson has 17 rows and SalesTerritory has 10; 17 multiplied by 10 equals 170). However, if a WHERE clause is added, the cross join behaves as an inner join.
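Both claims, the row-count arithmetic and the WHERE clause turning a cross join into an inner join, can be checked on a small scale. This is a sketch under stated assumptions: it runs on SQLite via Python's `sqlite3` rather than SQL Server, and the `sales_person` and `sales_territory` tables hold three and two invented rows instead of the 17 and 10 in AdventureWorks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_person (person_id INTEGER PRIMARY KEY, territory_id INTEGER);
    CREATE TABLE sales_territory (territory_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO sales_person VALUES (274, NULL), (275, 2), (276, 1);
    INSERT INTO sales_territory VALUES (1, 'Northwest'), (2, 'Northeast');
""")

# No WHERE clause: every person paired with every territory (3 x 2 rows).
cross_rows = conn.execute("""
    SELECT p.person_id, t.name
    FROM sales_person p CROSS JOIN sales_territory t
    ORDER BY p.person_id, t.name
""").fetchall()

# Adding a WHERE clause on the join columns filters the product down...
filtered_rows = conn.execute("""
    SELECT p.person_id, t.name
    FROM sales_person p CROSS JOIN sales_territory t
    WHERE p.territory_id = t.territory_id
    ORDER BY p.person_id
""").fetchall()

# ...to exactly what an inner join on the same condition returns.
inner_rows = conn.execute("""
    SELECT p.person_id, t.name
    FROM sales_person p INNER JOIN sales_territory t
      ON p.territory_id = t.territory_id
    ORDER BY p.person_id
""").fetchall()

print(len(cross_rows))              # 6
print(filtered_rows)                # [(275, 'Northeast'), (276, 'Northwest')]
print(filtered_rows == inner_rows)  # True
```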
Using Self-Joins
A table can be joined to itself in a self-join. Use a self-join when you want to create a result set that joins records in a table with other records in the same table. To list a table two times in the same query, you must provide a table alias for at least one instance of the table name.
A. The following example uses a self-join to find the products that are supplied by more than one vendor. Because this query involves a join of the ProductVendor table with itself, the ProductVendor table appears in two roles.
    USE AdventureWorks2008R2;
    GO
    SELECT DISTINCT pv1.ProductID, pv1.VendorID
    FROM Purchasing.ProductVendor pv1
    INNER JOIN Purchasing.ProductVendor pv2
        ON pv1.ProductID = pv2.ProductID
        AND pv1.VendorID <> pv2.VendorID
    ORDER BY pv1.ProductID
B. The following example performs a self-join of the Sales.SalesPerson table to produce a list of all the territories and the sales people working in them.
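The "two roles" in example A are easier to see with a handful of rows. The following sketch assumes SQLite via Python's `sqlite3` and a made-up `product_vendor` table in which product 1 has two vendors, product 2 has one, and product 3 has three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_vendor (product_id INTEGER, vendor_id INTEGER);
    INSERT INTO product_vendor VALUES
        (1, 100), (1, 101),
        (2, 100),
        (3, 102), (3, 103), (3, 104);
""")

# pv1 and pv2 are two aliases for the same table. A row survives only if
# another row exists for the same product with a different vendor.
multi_vendor = conn.execute("""
    SELECT DISTINCT pv1.product_id, pv1.vendor_id
    FROM product_vendor pv1
    INNER JOIN product_vendor pv2
        ON pv1.product_id = pv2.product_id
        AND pv1.vendor_id <> pv2.vendor_id
    ORDER BY pv1.product_id, pv1.vendor_id
""").fetchall()

print(multi_vendor)
# [(1, 100), (1, 101), (3, 102), (3, 103), (3, 104)]
```

Product 2 drops out because its only row has no partner row with a different vendor. DISTINCT matters for product 3: without it, each of its rows would appear twice, once per other vendor.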
Using Inner Joins
An inner join is a join in which the values in the columns being joined are compared using a comparison operator. In the ISO standard, inner joins can be specified in either the FROM or WHERE clause. This is the only type of join that ISO supports in the WHERE clause. Inner joins specified in the WHERE clause are known as old-style inner joins. The following Transact-SQL query is an example of an inner join:
    USE AdventureWorks2008R2;
    GO
    SELECT *
    FROM HumanResources.Employee AS e
    INNER JOIN Person.Person AS p
        ON e.BusinessEntityID = p.BusinessEntityID
    ORDER BY p.LastName
This inner join is known as an equi-join. The following example uses a less-than (<) join to find sales prices of product 718 that are less than the list price recommended for that product. Here is the result set.
    ProductID  Name                     ListPrice  Selling Price
    718        HL Road Frame - Red, 44  1431.5000  758.0759
    718        HL Road Frame - Red, 44  1431.5000  780.8182
    718        HL Road Frame - Red, 44  1431.5000  858.90
    (3 row(s) affected)
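A join condition does not have to be an equality. As a small illustration of the less-than join described above, the sketch below uses SQLite via Python's `sqlite3`; the `product` and `sales` tables and their prices are invented for the example and do not reproduce the AdventureWorks query or its result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, list_price REAL);
    CREATE TABLE sales (product_id INTEGER, unit_price REAL);
    INSERT INTO product VALUES (718, 'Road Frame', 1431.5);
    INSERT INTO sales VALUES (718, 758.0), (718, 1431.5), (718, 900.0), (719, 10.0);
""")

# The ON clause combines an equality (same product) with a less-than
# comparison (sold below list price).
below_list = conn.execute("""
    SELECT DISTINCT p.product_id, p.name, p.list_price, s.unit_price
    FROM product AS p
    INNER JOIN sales AS s
        ON p.product_id = s.product_id
        AND s.unit_price < p.list_price
    ORDER BY s.unit_price
""").fetchall()

print(below_list)
# [(718, 'Road Frame', 1431.5, 758.0), (718, 'Road Frame', 1431.5, 900.0)]
```

The sale at exactly 1431.5 is excluded because the comparison is strict, and the sale for product 719 is excluded because no product row matches it.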
Using Outer Joins
Inner joins return rows only when there is at least one row from both tables that matches the join condition. Inner joins eliminate the rows that do not match with a row from the other table. Outer joins, however, return all rows from at least one of the tables or views mentioned in the FROM clause, as long as those rows meet any WHERE or HAVING search conditions. All rows are retrieved from the left table referenced with a left outer join, and all rows from the right table referenced in a right outer join.
SQL Server uses the following ISO keywords for outer joins specified in a FROM clause:
    LEFT OUTER JOIN or LEFT JOIN
    RIGHT OUTER JOIN or RIGHT JOIN
    FULL OUTER JOIN or FULL JOIN
Consider a join of the Product table and the ProductReview table on their ProductID columns. To include all products, regardless of whether a review has been written for one, use an ISO left outer join.
Consider a join of the SalesTerritory table and the SalesPerson table on their TerritoryID columns.
Northeast 275
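The Product and ProductReview case can be sketched with a few rows. The example assumes SQLite via Python's `sqlite3` and uses hypothetical `product` and `product_review` tables; only the helmet has reviews. It also shows a common follow-on use of the left outer join: testing the right-hand columns for NULL to find rows that have no match at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_review (review_id INTEGER PRIMARY KEY,
                                 product_id INTEGER, rating INTEGER);
    INSERT INTO product VALUES (1, 'Chain'), (2, 'Helmet'), (3, 'Pedal');
    INSERT INTO product_review VALUES (10, 2, 5), (11, 2, 4);
""")

# Every product is returned; rating is NULL (None) where no review exists.
all_products = conn.execute("""
    SELECT p.name, r.rating
    FROM product AS p
    LEFT OUTER JOIN product_review AS r
        ON p.product_id = r.product_id
    ORDER BY p.product_id, r.review_id
""").fetchall()

# Filtering on a NULL right-hand key keeps only the unmatched products.
unreviewed = conn.execute("""
    SELECT p.name
    FROM product AS p
    LEFT OUTER JOIN product_review AS r
        ON p.product_id = r.product_id
    WHERE r.review_id IS NULL
    ORDER BY p.product_id
""").fetchall()

print(all_products)  # [('Chain', None), ('Helmet', 5), ('Helmet', 4), ('Pedal', None)]
print(unreviewed)    # [('Chain',), ('Pedal',)]
```

A right outer join is the same operation with the table roles swapped, so listing the tables in the opposite order and using LEFT OUTER JOIN gives the same rows.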