Week 1: Getting Started with SQL for Data Analytics

The 3 pillars which make up the backbone of SQL (Structured Query Language) are:

SELECT | FROM | WHERE

These 3 commands translate the request you are looking for into a language the database can understand to retrieve the data you need.

How does it work?

Imagine a big bucket, inside are a mix of different types of balls – your boss has asked you to retrieve certain balls from the bucket based on a set of rules. 

Inside the bucket are big balls, small balls, green balls, red balls, some of them are striped, some of them are spotted.

SQL is like the set of instructions your boss gives you to retrieve the balls they need – this is how the instruction may be written:

SELECT  - go and get balls

  BALL

FROM  - from the bucket

 BUCKET

WHERE  - only green balls

  COLOUR =  ‘GREEN’

Just by reading that little query, you can decipher that your boss wants you to pick up all the green balls from the bucket.

Things to note

Now, this may seem quite simple, however, there are a number of assumptions that have been made when your boss passed on these set of instructions:

Bucket-Balls ExampleTechnical Meaning
You can read English, therefore, know what these words mean: SELECT, FROM, WHERE, Balls, Bucket and GREEN.You are using the correct SQL Language (SQL Server / PSQL / MySQL / GoogleSQL etc.)
You know what a bucket looks like and the types of balls inside.You know the Schema (columns of data and their data-types e.g. text/numbers/dates etc.) of the table.
There is only 1 bucket to choose from.You know the table architecture, which tables to query and have sufficient permission to access those tables.
There are, in fact, green balls in the bucket.You know what data is in which tables so you are not looking for pink cubes in the balls bucket.

So, even though SQL can be an easy language to read and understand, just as important as syntax is domain knowledge, business case and data accuracy.

Your first SQL Query

SELECT 

  * 

FROM table

WHERE

  condition

LIMIT 100

What this query does is ask the database to return every single column in a specific table but only the first 100 results.

Almost like tipping the bucket of balls upside down and getting out the first 100 balls regardless of its features.

Note: the reason I have added “LIMIT | WHERE ROWNUM | TOP” is that different databases use different SQL languages so have different ways to restrict a certain number of rows from a query – please check the syntax used for your database in your scenario to make sure you’re using the right clause.

In this case we will be using MySQL syntax (LIMIT)

Why only query a certain number of rows?

Often tables in databases span hundreds, thousands, if not, millions of rows, so as not to overload the machine you are using to run SQL queries or rack up unnecessary costs for cloud-based SQL platforms, always restrict the data.

Just don’t forget to remove the restriction once the query is finalised!

Scroll to Top