What are the unwritten conventions to be followed when designing a query and designing databases?
Here are some ideas on the same. (for sql server).
The same principles could be extended to other relational databases as well.
Database design :
1) Use primary keys. A table without a primary key is a poorly designed table. (Transactional replication in sql server requires a primary key, please note.)
2) When choosing primary key, use some combination of integers and characters that is compact, unique and not intuitive. There is a historical perspective to this, please note.
Conventional wisdom says that primary key should be monotonically increasing integers (which consume less bytes), and hence theoretically, should be optimum.
But, in the light of the OWASP recommendations, this poses a security loophole which can be exploited by hackers. Hence use an alphanumeric key, where the numeric component is based on some logical entity, and is not intuitive.
As per OWASP recommendations, violation of this principle can lead to unwarranted data accesses through url modifications and compromise sensitive data in web applications.
3) Normalize your database.....to the Third Normal form. (What follows from this is that the database is already normalized to the First and Second Normal forms.)
This leads to the elimination of data redundancy and more effecient query performance.
4) Preferably maintain foreign key relationships through the application itself. This point has been arrived at through personal experience and is my personal stand on this issue. When there are too many foreign key relationships in the database it leads to too many CPU reads and can hamper query execution and increase data processing times.
5) Choose your column widths as per the User Interface design. If your phone no. has 10 digits then do not provide 100 characters to store it. This is sloppy design. Match database design with UI layer and business logic considerations.
6) Do not waste space. If your data has variable length upto 5 characters use varchar(5). not char(5) which is padded with spaces. This again is sloppy design.
7) Add a CreatedBy, DateCreated and DateUpdated to all tables. These fields serve as audit fields and constitute robust design.
8) Use stored procedures as far as possible. (if required use output parameters). This way you avoid using inline queries and also let the database take care of management of the stored procedure.
9) Do not create table objects as cache tables in the database. Use temporary tables whose domain is the body of the stored procedure. Use of table objects can create locking issues.
10) Encrypt all passwords in the database. Do not store passwords as clear text. The reason for this is too obvious to highlight. It could land you in some serious difficulties if you don't do this !
Writing a bad performing query is worse than not writing a query at all.
1) A "WHERE" clause is a primary row filter. When using it remember that "=" operator gives the best results.
Compare and contrast this with "<>" which is much slower.
Other inequality operators like ">=" also do not perform as well.
2) Avoid using inline queries in applications. This can lead to SQL Injection attacks, another OWASP recommendation.
3) Don't use dynamic sql in stored procedures. This is fodder for hackers and others with maliciious intent for the same reason listed in (2) above.
4)When using a WHERE clause on multiple columns place the greatest filter first in the list. eg) Admin / Staff as opposed to Male / Female which is less restrictive. (and hence filters out less rows).
5) Don't use "SELECT * " where it is not required. This enforces a table scan and is very slow. The more the columns the more the overhead.
These are some simple guidelines to follow. There are certainly more, but these are by far the most important, and simple.