SQL Basics: When To Use SELECT * INTO

in STEMGeeks4 years ago

Recently, a development team reached out because they faced issues when testing their data. The team had created a quick copy of the data with the SQL Server code of SELECT * INTO, but their testing failed when they used the newly created table in A-B testing. As they described their issue, I highlighted to the team that they didn't only need a copy of the data, but they needed a copy of the functionality as well. In other words, the data were only a part of their needs in their testing. While the execution of the code created a copy of the data, it did not create the same functionality (demo of this).

Some points worth considering before we proceed to using the SELECT * INTO syntax:

  • Do we only need to test the validity of specific data?
  • Are we creating a quick backup copy of the data without needing the defaults?
  • Are we A-B testing specific data only or are we A-B testing data with functionality?
  • What CRUD operations will we be performing on the copied table?
There are other considerations as well, but these make good starting points before we run out with a design that may not be appropriate in our situation. Like the development team I assisted, you may discover after the fact that the testing was insufficient because the wrong solution was used.


In any SQL language, before using any quick copy tool, we want to consider the reason for the copy. If we're only copying data because we need a quick backup, with SQL Server (in this case), the SELECT * INTO syntax will function as intended. But this means that we get an exact copy of data with nothing else from the table as far as defaults or possible schema needed in testing. For instance, if we need a copy of configuration data and we ONLY need the configuration data for possible reversion, then this technique will suffice as a backup. However, if we wanted to copy data for an A-B test, but our A-B test included data creation that had (1) primary keys, (2) default constraints, and (3) other restrictions, these would not be copied in the new table outside of the specific data values. The data values would be copied, but the specific schema would not (without the defaults and constraints applied, we only get a partial schema, not the full schema).

In the case of A-B testing where we want the functionality AND the data, a full script and copy will be required. This means that the full defaults and constraints are copied, as well as any dependencies, such as foreign keys. We cannot thoroughly test our code if we do not have everything created that is required in our testing. Keep in mind that testing data and functionality significantly differ from testing data only, or using data for a backup. Some of the testing discussion assumes we're testing in the same environment. In some contexts, we may have an A and B testing environment (ie: Sandbox A and Sandbox B). The advantage of these types of environments is that we can easily do A-B testing and avoid situations like the above discussion where we're trying to test side-by-side in the same environment on the same server within the same database. There are other techniques that are similar, which may save us time if testing is involved.


Source: SQL Basics: SELECT * INTO (Table Copy and Testing)

Sort:  

I didn't know it's part of the SQL-standard, thought it was something postgres-specific.

MySQL doesn't support it ... https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html ... but according to the docs, mariadb supports it by now. Haven't tested it though.

Yes, some SQL languages support it, some don't. Most SQL languages have something similar and they work in a similar way where some of the schema is copied, while some is not (like defaults). In this example, I use a specific SQL platform, SQL Server, to demonstrate this, but it would vary by SQL tool.