This article is a tutorial aimed at providing a basic introduction to the use of SQL (Structured Query Language) and is meant to complement the articles related to Managed and virtual tables. SQL is vital to the usage of these tables, whether it is to manipulate, extract or store data.
The outcomes of this tutorial are as follows:
1. Appropriately apply the following SQL statements/clauses/operators:
- AND, OR, NOT
- ORDER BY
- INSERT INTO
2. Construct and apply a SQL statement to view/extract data from a managed table
3. Construct and apply a SQL statement to modify an existing managed table in terms of:
- Inserting new records
- Updating existing records
- Deleting existing records
For users looking for a full and in-depth tutorial of SQL, you can click here to be re-directed.
1.You will be re-directed to an external website of which SolveXia has no association with.
2.The SolveXia platform requires the SQL code to be written in the same format and convention as Microsoft SQL Server.
1. Extracting data
To extract data from a table into an excel file or text file, you will need to use the Select to Excel/Text instructions (available in the File manipulation action step). This section will provide an introduction to the SQL queries/statements that can be used to extract data.
1.1 The "SELECT" query
A simple SQL query to return data from a managed table called Table_1.
Note: When retrieving data from a managed table, it is important to wrap the table's name in square brackets. In this case, Table_1 should appear as [Table_1] in the query.
1.2 The "WHERE" clause
There may be times where you may only want to extract a subset of the full data stored in the table. A typical example is if you stored your cash flow data in a table called, [Cashflow] and you wish to only extract the data only for a specific data. In such a situation, a "WHERE" clause allows you to filter the data you to a specific condition.
Given the scenario above to extract a specific month's cash flow, the below query demonstrates how to use a "WHERE" clause in tandem with a "SELECT" query
In 1.2, we went through an example of using the "WHERE" clause based on a singular condition. In practice, you may have multiple conditions that you wish to filter on.
The three operators you can use with the "WHERE" cause are:
The "AND" operator is used to retrieve data when all conditions you specified have been met (i.e. all conditions have been evaluated to True).
With our cash flow scenario, the below example demonstrates how to extract data on which you wish to see the cash flow due to a specific item for a specific month.
The 'OR' operator is used to retrieve data when at least one of the conditions you have specified is met.
With our example, the below extract demonstrates how to extract data for two months.
The 'NOT' operator is used to retrieve data when the specified conditions are not met (i.e. conditions evaluated to False).
The below extract shows how to use the 'NOT' operator to extract all cashflow data as long as it isn't associated with a specific label.
Users are kindly reminded to keep in mind of De Morgan's laws when using the 'NOT' operator with 'AND'/'OR' operators.
1.4 Ordered sets
Now that you are able to write a query to display the information you require, you may also need to sort the results in either ascending or descending order. Using our previous example, the syntax for sorting your data is shown below.
By default data is sorted in ascending order. To sort a specific column in descending order, place the keyword "DESC" after the column name.
2. Inserting and deleting data
You can insert data into an existing managed table through the "INSERT INTO" statement. You can insert new records in multiple ways.
The first method allows you to insert records based on values:
The second method allows you to insert records from another table:
Note: If you are adding a record that has values for all the columns in the table AND you are inserting values in the same as order corresponding to the table structure, you do not need to specify the column names after the table name in the "INSERT INTO" query.
Now, if you wanted to delete the data that we just insert into [Table_1]. This can be done through the "DELETE" statement.
Note: The "WHERE" clause is very important to delete data. If you leave out the "WHERE "clause, all data will be deleted (and won't be recoverable). It is also vital to be aware of the data you are deleting with the "WHERE" clause. It is always good practice to do a "SELECT * FROM" with the same conditions before executing the "DELETE FROM". This will allow you to examine the data you are deleting first.
3. Manipulating data
To manipulate data that exist in a Managed Table, you can achieve this by using the "UPDATE" statement. The UPDATE statement allows you to manipulate the data of existing columns by changing the values directly or using formulas.
The below extract shows how you can manipulate a column by changing the values directly.
Alternatively, you can update a table's column based on the values of other columns.
The "UPDATE" Statement allows you to manipulate the data of multiple columns; to do so you will need to separate each column you are manipulating with "," in the SET component. See the below extract for an example of the syntax.
Of course, it goes without saying, the UPDATE statement does not allow you to add/remove columns, only manipulate the data for existing columns. i.e. if Column N does not exist, the statement will fail.
So far we have shown you to update the data for an entire column by changing its values directly or through formulas. However, there may be cases where you only need to change a subset of the data in the column(s). You can achieve this by applying a "WHERE" clause at the end of your UPDATE statement.
To test your understanding of the content of this tutorial. There is a data set provided in this article, air_mov.xlsx, to be used for the following exercise.
Please start off by completing the following steps.
- Import the air_mov.xlsx into a managed table. Name the managed table, air_mov_raw.
- Create two managed tables of the same structure as air_mov_raw. Name these tables air_mov_syd and air_mov_mel.
If you are having difficulties doing the first two exercises, please refer to this article for some help.
Complete the following task.
- Remove records before (and including) 2011.
- Fill in the missing data for all records; missing data will appear as blank fields for the row. You may assume Domestic + International movement = Total Movement.
- Create a step to clear air_mov_syd and air_mov_mel managed tables.
- Using a SQL command move data from air_mov_raw to air_mov_syd where all records have Airport = 'SYDNEY'. Similarly, import data to air_mov_mel where all records have Airport = 'MELBOURNE'.