In the previous post we dealt with optimizing the SELECT statement and the WHERE clause. In this post we will look at optimizing the INSERT statement. SELECT is the most common statement on dynamic websites, while INSERT is used heavily wherever users are allowed to write data. The tips for optimizing INSERT are:
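- REPLACE can be used instead of INSERT to overwrite old rows. It takes no WHERE clause; instead, if the new row has the same value as an existing row for the PRIMARY KEY or a UNIQUE index, the old row is deleted before the new one is inserted. REPLACE can therefore do the job of both INSERT and UPDATE, but because it deletes and re-inserts, any column you do not list falls back to its default value.
- Inserting multiple rows with a single INSERT is very efficient and one of the most important optimizations. Inserting one row at a time wastes server time and resources and increases disk overhead; inserting many rows in one statement is considerably faster.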
INSERT INTO tbl_name VALUES (1,'a'),(2,'b'),(3,'c');
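- The DELAYED keyword with INSERT pushes the rows into a buffer, to be written later when the table is not being read or written by another client. This is very useful for MyISAM, which locks the whole table on reads and writes. Make sure you don’t need the inserted values immediately. This can be implemented where “Login using Request” is used. It works only with table-locking engines such as MyISAM, and it does not work with INSERT…SELECT. Note that DELAYED is deprecated since MySQL 5.6, and from MySQL 5.7 the server accepts the keyword but ignores it.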
INSERT DELAYED INTO tbl_name VALUES (1,'a'),(2,'b'),(3,'c');
- A priority can also be set for INSERT statements with LOW_PRIORITY or HIGH_PRIORITY, which is quite useful for some tasks. This only affects engines that use table-level locking, such as MyISAM.
- The IGNORE keyword with INSERT is a real treat. IGNORE converts errors into warnings and lets the statement complete. E.g., if one of the rows you insert repeats a value in a unique column, IGNORE skips that row and carries on with the rest, while a normal INSERT would abort.
- The fastest way to load data into a table is LOAD DATA INFILE. It is usually 20 times faster than using INSERT statements.
LOAD DATA INFILE 'datafile.txt' INTO TABLE tbl_name;
If you want a clearer picture, read the MySQL manual.
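The multi-row INSERT, REPLACE, IGNORE and priority tips above can be sketched on a small table. The table `users`, its columns and the sample values are made up for illustration; treat this as a rough sketch rather than a recipe for your schema.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE users (
    id    INT NOT NULL PRIMARY KEY,
    email VARCHAR(100) NOT NULL,
    UNIQUE KEY uq_email (email)
) ENGINE=MyISAM;

-- One statement, three rows: a single round trip to the server.
INSERT INTO users (id, email) VALUES
    (1, 'a@example.com'),
    (2, 'b@example.com'),
    (3, 'c@example.com');

-- REPLACE: id 2 already exists, so the old row is deleted
-- and the new one is inserted in its place.
REPLACE INTO users (id, email) VALUES (2, 'new-b@example.com');

-- IGNORE: the row with id 3 is a duplicate, so it is skipped with a warning;
-- the row with id 4 is still inserted instead of the whole statement failing.
INSERT IGNORE INTO users (id, email) VALUES
    (3, 'c@example.com'),
    (4, 'd@example.com');

-- Show the warnings produced by the previous statement.
SHOW WARNINGS;

-- LOW_PRIORITY waits until no other client is reading the table
-- (only meaningful for table-locking engines such as MyISAM).
INSERT LOW_PRIORITY INTO users (id, email) VALUES (5, 'e@example.com');
```

Listing the columns explicitly, as done here, keeps the statements working if a column is added to the table later.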
Writing an SQL query is easy, but a query can use a lot of server resources. That matters most for huge databases where execution delay must be kept to a minimum, like Facebook: if they don’t update your status quickly enough, it will cost them users. Let’s start:
- The most important thing for query optimization is the INDEX. Make sure you define indexes on the columns you use most often in WHERE, JOIN and ORDER BY clauses. Avoid indexing every possible column, though: each extra index takes space and slows down INSERT, UPDATE and DELETE.
- Use the EXPLAIN statement to see which indexes are used for a SELECT statement.
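As a minimal sketch of the two tips above, the following creates an index and then asks MySQL how it would run a query. The `orders` table, its columns and the index name are hypothetical.

```sql
-- Hypothetical table for illustration.
CREATE TABLE orders (
    id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_id INT NOT NULL,
    created_at  DATETIME NOT NULL,
    total       DECIMAL(10,2) NOT NULL
);

-- Index the column that the WHERE clause filters on.
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- EXPLAIN reports, per table, the candidate indexes (possible_keys)
-- and the one the optimizer actually picked (key).
EXPLAIN SELECT id, total
FROM orders
WHERE customer_id = 42;
```

If the `key` column of the EXPLAIN output is NULL, no index is being used for that table, which is usually the first thing to investigate.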
Those were the general tips. For speeding up queries on MyISAM tables:
- Use ANALYZE TABLE after loading data into a table. This updates a value for each index part that indicates the average number of rows that have the same value. For a unique index this is obviously always 1. MySQL uses this value to decide which index to choose when tables are joined.
- Sorting a table by an index is a good idea if you read all rows of the table in the order of that index (for MyISAM, myisamchk --sort-index --sort-records=1 sorts by index 1).
- Removal of unnecessary parentheses, which MySQL does for you. E.g.:
((a AND b AND c) OR (a AND b AND c AND d))
is treated as:
(a AND b AND c) OR (a AND b AND c AND d)
- Constant folding. E.g.:
(a<b AND b=c) AND a=5
is rewritten as:
b>5 AND b=c AND a=5
- Constant expressions used by indexes are evaluated only once, so use them wherever you can.
- COUNT(*) without a WHERE clause on a single table is fetched directly from the MyISAM table information, which is very fast.
- Constant tables are read first, before any other tables in the query. A constant table is either:
  - An empty table or a table with one row.
  - A table used with a WHERE clause on a PRIMARY KEY or UNIQUE index, where all index parts are compared to constant expressions.
- A JOIN can be slower than other queries, because MySQL tries the possible join combinations to find the best one. If ORDER BY or GROUP BY is used and all their columns come from the same table, that table is preferred first when joining.
- In some cases, MySQL can read rows from the index without even consulting the data file. If all columns used from the index are numeric, only the index tree is used to resolve the query.
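- A query with LIMIT often executes faster than the same query without it, because MySQL can stop as soon as it has found the requested number of rows.
I hope you found this helpful. We’ll continue with optimizing INSERT statements in the next post.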
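Several of the MyISAM tips above can be tried out with a few statements. This is only a sketch; the `page_hits` table, its columns and the index name are invented for the example.

```sql
-- Hypothetical MyISAM table for illustration.
CREATE TABLE page_hits (
    id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    page_id INT NOT NULL,
    user_id INT NOT NULL,
    KEY idx_page_user (page_id, user_id)
) ENGINE=MyISAM;

-- After a bulk load, refresh the key distribution statistics.
ANALYZE TABLE page_hits;

-- No WHERE clause: MyISAM answers from the row count stored with the table.
SELECT COUNT(*) FROM page_hits;

-- Lookup on the full PRIMARY KEY with a constant: the table can be read
-- as a constant table (with a matching row present, EXPLAIN typically
-- reports type = const).
EXPLAIN SELECT * FROM page_hits WHERE id = 1;

-- Both selected columns are part of idx_page_user, so the data file need
-- not be read (EXPLAIN shows "Using index" in Extra when this applies).
EXPLAIN SELECT page_id, user_id FROM page_hits WHERE page_id = 7;

-- LIMIT lets the server stop once it has found enough rows.
SELECT id, user_id FROM page_hits WHERE page_id = 7 LIMIT 10;
```

Running the EXPLAIN statements before and after loading real data is a quick way to see whether the optimizer's choices change with the statistics.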
MySQL is a well-known name in the RDBMS world and many of you must have used it. It implements one of the oldest ways of creating a database. I have never used it for anything other than web applications, though. To run MySQL I use phpMyAdmin, a browser application made for ease of use. We will be discussing some MySQL settings other than the defaults.
Required: phpMyAdmin (if you can do all this from the command line, good for you! 🙂)
- InnoDB vs MyISAM: These are the two most widely used storage engines. Facebook uses the InnoDB storage engine, but with their own version of MySQL, which is reportedly more than 10,000 times more efficient than the default. They have optimized it for write operations (writing to the database). We’ll look at the major differences between the two (not all of them), and you can choose which is best for you.
- MyISAM:
  - Data in MyISAM tables is split between three files on disk: one for the table format, one for the data, and one for the indexes.
  - The maximum number of rows is around 4.295E+09 by default. This limit can be greatly increased by building MySQL with the --with-big-tables option; it is a compile-time option, not something you set when creating a table.
  - A table can have up to 64 indexes by default. You can raise this to anything up to 128 by building MySQL with --with-max-indexes=N, where N is at most 128. This too is set when compiling MySQL, not at run time.
  - TEXT and BLOB columns can be indexed, and FULLTEXT indexes are supported, which is of great importance for search functions.
  - Concurrent inserts are supported: rows can be inserted while other clients read the table. The only condition is that the data file has no free blocks in the middle, which are the result of deleting rows or of updating a dynamic-length row with more data than its current contents.
  - You can put the data file and index file in different directories on different physical devices to get more speed, using the DATA DIRECTORY and INDEX DIRECTORY table options. This is something of a prerequisite for huge databases.
  - Table locking is used, so concurrent writes are not possible.
- InnoDB:
  - It is a transaction-safe (ACID-compliant) storage engine for MySQL with commit, rollback, and crash-recovery capabilities to protect user data.
  - Consistent nonlocking reads increase multi-user concurrency and performance.
  - It stores user data in clustered indexes to reduce I/O for common queries based on primary keys.
  - It supports foreign keys, which is of great use if you don’t want to check by hand that values inserted into one table exist in the table they refer to.
  - It is designed for maximum performance when processing large data volumes. Its CPU efficiency is probably not matched by any other disk-based relational database engine.
  - It maintains its own buffer pool for caching data and indexes in main memory.
  - It stores its tables and indexes in a tablespace, which may consist of several files (or raw disk partitions).
  - Row-level locking is used, so simultaneous writes to a table are possible.
- To make the auto-increment value of a primary key start at something other than 1, go to the “OPERATIONS” tab for the table and change the auto-increment value to whatever you like, say 1000000.
- Go to the home page of phpMyAdmin and click “PRIVILEGES”. Here you can assign different privileges to the users of your database. When you add a user, you can set that user’s privileges at the same time.
- The “ENGINES” tab on the home page tells you which storage engines are installed on your machine.
- The “VARIABLES” tab shows the configuration that is currently in effect. You can change these variables. For example, big tables is “off” by default; switching it on makes the server keep all temporary tables on disk instead of in memory. It does not raise the number of rows a table can hold (that is the compile-time --with-big-tables option).
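If you prefer the command line, the phpMyAdmin steps above have plain SQL equivalents. The following is a sketch under assumptions: the `authors` and `posts` tables, the user `blog_user`, the password `change_me` and the database name `blog` are all placeholders invented for this example.

```sql
-- Hypothetical parent/child tables showing InnoDB with a foreign key.
CREATE TABLE authors (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE posts (
    id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    author_id INT NOT NULL,
    title     VARCHAR(200) NOT NULL,
    -- InnoDB rejects a post whose author_id has no matching row in authors.
    FOREIGN KEY (author_id) REFERENCES authors (id)
) ENGINE=InnoDB;

-- Same effect as changing the value in the "Operations" tab.
ALTER TABLE posts AUTO_INCREMENT = 1000000;

-- Create a user and give it limited privileges on one database.
-- 'blog_user', 'change_me' and 'blog' are placeholders.
CREATE USER 'blog_user'@'localhost' IDENTIFIED BY 'change_me';
GRANT SELECT, INSERT, UPDATE ON blog.* TO 'blog_user'@'localhost';

-- List the available storage engines and which one is the default.
SHOW ENGINES;

-- Inspect a configuration variable.
SHOW VARIABLES LIKE 'big_tables';
```

Granting only the privileges an application actually needs, as in the GRANT above, limits the damage if its credentials leak.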
I personally use InnoDB, though it doesn’t matter for the amount of data I have. So, these were a few things that I believe can help you create a good database structure. The storage engine plays the most important role, which is why it was discussed in detail. I highly recommend reading the MySQL manual.