what can you use to combine data from two or more tables into a single result set?
Acquire data science while social distancing
How to bring together tables using SQL to combine datasets
Find out where magpies are the about dangerous in Australia using the SQL keywords UNION, UNION ALL, INNER Join, LEFT OUTER JOIN and RIGHT OUTER JOIN
Introduction
For the final 4 weeks my friends and I accept been helping each other larn new skills while social distancing. We are learning data science equally an online study group. We are nearly finished the SQL portion. I am and so proud of everyone's efforts and how willing they have been to take a become at something new.
In the last couple weeks my parents have fifty-fifty decided to give my lessons a go. They merely accept a few weeks to take hold of upwardly on, and then it will exist interesting to hear what they recall.
Previous lesson
Last calendar week we used data on the three albums released past the Spice Girls to determine which of the albums was the best. We did this using the Group BY keyword so that we could aggregate the statistics for each album.
This lesson
Now that we know how to grouping and amass data in the table, this week we will acquire how to bring together tables together.
It is helpful to exist able to combine data sets. Specially if there are different details in each table. By joining them together you are able to exercise some calculations. Alternatively, you can create a new table that contains all the different details together in one dataset.
To combine tables we volition apply the Matrimony, UNION ALL, INNER JOIN, LEFT OUTER Bring together and RIGHT OUTER Join keywords.
Primal learnings
- use the keyword UNION to stack datasets without duplicate values
- use the keyword UNION ALL to stack datasets with duplicate values
- use the keyword INNER JOIN to join two tables together and only get the overlapping values
- apply the keyword LEFT OUTER JOIN to join two tables together and not loose whatsoever data from the left table, even those records that do not have a friction match in the right table
- utilize the keyword RIGHT OUTER Bring together to join two tables together and not loose whatever data from the right table, even those records that do not have a lucifer in the left table
- sympathize the deviation between the Matrimony and Spousal relationship ALL keywords
- sympathize the difference between an INNER JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN.
The problem
Everybody knows that Australia is full of unsafe animals. However, one of the lesser known predators is the magpie. This bird swoops down on unsuspecting victims, with precipitous beaks well equiped to have out an eye. We want to find out how common magpie attacks are in each of the Australian states. Where do nosotros take to clothing protective headgear? 😜
The data
To solve our trouble nosotros accept several tables of data.
The start is a tabular array contains the number of magpie attacks reported from each Australian state co-ordinate to the website magpie alert. It is important to notation that the information on this site is crowdsourced. People cocky-report when they have been swooped past a magpie. Therefore in that location may be some biases based on where the website is almost pop.
Our second ready of data is a table of dropbear attacks. Dropbears are some other ferocious Australian brute. We can employ this table equally a comparison to the magpie attacks information.
Disclaimer: dropbears are not real so the data is fabricated up by me 😃
In addition to the two tables on animate being attacks, we also have data on the Australian states that I got from Wikipedia. We can utilize this table to normalise our animal attack tables for population or area differences betwixt the states.
Syntax to combine tables
The simplest manner to combine two tables together is using the keywords UNION or UNION ALL. These two methods pile one lot of selected information on top of the other.
SELECT
name_column_one,
name_column_three
FROM
name_of_table_one
Marriage
SELECT
name_column_one,
name_column_three
FROM
name_of_table_two
;
The difference betwixt the two keywords is that UNION only takes distinct values, but UNION ALL keeps all of the values selected. Both are used with the same syntax.
SELECT
name_column_one,
name_column_three
FROM
name_of_table_one
Union ALL
SELECT
name_column_one,
name_column_three
FROM
name_of_table_two
;
What if yous want to combine tables based on a matching value?
There are several dissimilar means we can combine tables based on value matching. They include the INNER JOIN, Full OUTER JOIN, LEFT OUTER Join and Correct OUTER Join.
Available joins are slightly different in different versions of SQL linguistic communication. We have been learning MySQL. Therefore we volition focus only on the joins available in MySQL. INNER JOIN, LEFT OUTER JOIN and Right OUTER JOIN, merely not FULL OUTER Bring together are the ones that may be used in MySQL.
If you would like to learn how to do a Full OUTER Join that is covered in one of my other articles on the difference between inner and outer joins in SQL.
What if you simply want the data where both tables contain a matching value?
If you lot want to perform a bring together where you only include data where both tables contain matching values in a specified cavalcade, and so you would use an INNER JOIN.
Inner joins return only the parts of ii datasets that overlap. So that records volition be returned only where in that location is a matching value in the cavalcade you are joining on in both tables. The syntax for an INNER JOIN is shown below:
SELECT *
FROM
name_of_table_one
INNER JOIN
name_of_table_two
ON
name_of_table_one.name_column_one = name_of_table_two.name_column_one
In the example higher up, the records from table one and table ii would both be returned, but only if the values in column i of tabular array one match the values in column one of table 2. Whatsoever records that do non have matching values would not be returned by an INNER JOIN.
One function of the bring together syntax that nosotros accept not come beyond in our lessons before, is referring to a column past both table and column name. This is important when joining tables because both tables could have a column with the same proper name. If you don't include the table name when selecting columns with the same name, the programme will not know which one you are referring to.
To avoid confusion, nosotros utilize the table name and the cavalcade name separated by a full stop to create a unique identifier for each cavalcade.
What if you want to join on a matching value, but just want to proceed all the data from the left table and merely those records that match from the right table?
To join two tables based on a column lucifer without loosing any of the data from the left table, you would utilise a LEFT OUTER JOIN.
Left outer joins are used when you want to get all the values from 1 table but only the records that match the left table from the right table.
SELECT *
FROM
name_of_table_one
LEFT OUTER Join
name_of_table_two
ON
name_of_table_one.name_column_one = name_of_table_two.name_column_one
In our LEFT OUTER JOIN instance above, all rows from tabular array one will exist returned plus the rows that tabular array two had in mutual with table one based on column i in each table.
What is the deviation between a LEFT OUTER JOIN and a Correct OUTER Bring together?
The syntax for a RIGHT OUTER Join is the same every bit for a LEFT OUTER Join. The merely deviation between the two is that the right table, in our example tabular array two, will retain all of its records. Whilst the left table, table one will only continue its records where in that location is a friction match betwixt its column one and table two'southward column i.
SELECT *
FROM
name_of_table_one
Correct OUTER JOIN
name_of_table_two
ON
name_of_table_one.name_column_one = name_of_table_two.name_column_one
At present let'south accept a get
- Go to https://www.db-dabble.com/
- In the left paw cavalcade put the CREATE Tabular array and INSERT INTO queries below
CREATE TABLE magpie_attacks(
state_code varchar(255),
state_name varchar(255),
animal varchar(255),
number_of_attacks int(255)
); INSERT INTO magpie_attacks(
state_code,
state_name,
fauna,
number_of_attacks
)
VALUES
('SA', 'South Australia', 'magpie', 154),
('VIC', 'Victoria', 'magpie', 972),
('TAS', 'Tasmania', 'magpie', 0),
('NSW', 'New South Whales', 'magpie', 823),
('QLD', 'Queensland', 'magpie', 1141),
('WA', 'Western Commonwealth of australia', 'magpie', 287),
('Human action', 'Australian Capital Territory', 'magpie', 668) ; CREATE TABLE dropbear_attacks(
state_code varchar(255),
state_name varchar(255),
animal varchar(255),
number_of_attacks int(255)
); INSERT INTO dropbear_attacks(
state_code,
state_name,
animal,
number_of_attacks
)
VALUES
('SA', 'South Australia', 'dropbear', 21),
('VIC', 'Victoria', 'dropbear', 67),
('TAS', 'Tasmania', 'dropbear', 30),
('NSW', 'New Southward Whales', 'dropbear', 19),
('QLD', 'Queensland', 'dropbear', xl),
('WA', 'Western Australia', 'dropbear', 37)
; CREATE TABLE australian_states(
state_code varchar(255),
state_name varchar(255),
population int(255),
area_km2 int(255)
); INSERT INTO australian_states(
state_code,
state_name,
population,
area_km2
)
VALUES
('SA', 'South Commonwealth of australia', 1751693, 1044353),
('VIC', 'Victoria', 6594804, 237657),
('TAS', 'Tasmania', 534281, 90758),
('NSW', 'New Southward Whales', 8089526, 809952),
('QLD', 'Queensland', 5095100, 1851736),
('WA', 'Western Commonwealth of australia', 2621680, 2642753),
('Act', 'Australian Capital Territory', 426709, 2358),
('NT', 'Northern Territory', 245869, 1419630)
;
iii. In the right hand column put your queries and run them using the 'Run' button in the superlative left
iv. Run the query below and see if it returns what you would expect it to:
SELECT
*
FROM
magpie_attacks
UNION
SELECT
*
FROM
dropbear_attacks
;
5. Run the query below and meet if it returns what you would expect it to:
SELECT
state_code,
state_name
FROM
magpie_attacks
UNION
SELECT
state_code,
state_name
FROM
dropbear_attacks
;
half dozen. Run the query beneath and compare the results to what you got from the previous query:
SELECT
state_code,
state_name
FROM
magpie_attacks
UNION ALL
SELECT
state_code,
state_name
FROM
dropbear_attacks;
7. Run the query below and encounter if information technology returns what yous would await it to:
SELECT
magpie_attacks.state_code,
magpie_attacks.number_of_attacks AS magpie_attacks,
dropbear_attacks.number_of_attacks Every bit dropbear_attacks
FROM
magpie_attacks
INNER Join
dropbear_attacks
ON
magpie_attacks.state_code = dropbear_attacks.state_code;
8. Run the query below and compare the results to the previous query:
SELECT
magpie_attacks.state_code,
magpie_attacks.number_of_attacks AS magpie_attacks,
dropbear_attacks.number_of_attacks As dropbear_attacks
FROM
magpie_attacks
LEFT OUTER Bring together
dropbear_attacks
ON
magpie_attacks.state_code = dropbear_attacks.state_code;
ix. Run the query below and compare the results to the previous two queries:
SELECT
magpie_attacks.state_code,
magpie_attacks.number_of_attacks Equally magpie_attacks,
dropbear_attacks.number_of_attacks AS dropbear_attacks
FROM
magpie_attacks
Correct OUTER JOIN
dropbear_attacks
ON
magpie_attacks.state_code = dropbear_attacks.state_code;
10. Run the query below and compare the results to the previous two queries:
SELECT
magpie_attacks.state_code,
magpie_attacks.number_of_attacks Equally magpie_attacks,
dropbear_attacks.number_of_attacks Equally dropbear_attacks,
dropbear_attacks.number_of_attacks / magpie_attacks.number_of_attacks * 100 AS 'dropbear_attacks_as_percentage_of_magpie_attacks'
FROM
magpie_attacks
INNER JOIN
dropbear_attacks
ON
magpie_attacks.state_code = dropbear_attacks.state_code
;
Practise 1: Combine the magpie_attacks table and the australian_states table using each of the different spousal relationship and join methods that we have learned in this lesson. Feel costless to select every bit many or as few columns equally you need to in order to make your queries run.
Exercise 2: Write a query to notice out which Australian state has the greatest number of magpie attacks equally a percentage of the population in that state. Hint: you can use the query in step 10 as a reference if needed.
Learning review
After completing this lesson you should know:
- how to use the keyword UNION to stack datasets without duplicate values
- how to utilize the keyword Wedlock ALL to stack datasets with duplicate values
- how to utilize the keyword INNER Bring together to join two tables together and only get the overlapping values
- how to use the keyword LEFT OUTER JOIN to join 2 tables together and not loose any data from the left tabular array that does not have a match in the right tabular array
- how to use the keyword Correct OUTER JOIN to join 2 tables together and not loose any data from the right table that does not have a match in the left table
- understand the difference betwixt the UNION and UNION ALL KEYWORDS
- understand the difference between an INNER JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN.
Next Lesson
Side by side lesson will be a review of all we have learned in the terminal 5 lessons on SQL. I ever recall it is a expert idea to practice new skills to consolidate the lessons. Hopefully the review lesson will also let us use all the skills learned over the lessons in a more than independent style. Upward till now the exercises take been fairly like to the examples that preceded them. All the same, I am hoping that part of what we will learn next lesson is how to cull what methods to use in order to solve the problems.
In addition to data, my other passion is painting. You tin can notice my wildlife art at www.katemarielewis.com
All Lessons in the learning data scientific discipline while socially distancing (LDSWSD) series
Source: https://towardsdatascience.com/combining-tables-using-sql-f6ddfd5f8b6f
0 Response to "what can you use to combine data from two or more tables into a single result set?"
Postar um comentário