Checking for a Null value in SQL

Yesterday I fell into a very simple but dangerous SQL trap.  Can you identify it?

SELECT first_column, another_column
FROM my_table
WHERE first_column = 1
    AND (another_column >= 100 OR 
         another_column = null)

Found it?  It’s the comparison to null.  My intention was to get rows where another_column was null, however the query did not do so.  It didn’t give me any results where another_column was null, but didn’t give me any error either.  Why?

Null in SQL is not a real value.  Null means “lack of a value” or “unknown”. Anything compared to null leads to an “unknown” result. Therefore, in SQL there are actually three values conditions can evaluate to: true, false, and unknown/null. This is called three-valued logic. SQL WHERE clauses only return rows that evaluate to true, not rows that evaluate to false or “unknown”.

The proper way of checking for null is by using the IS NULL or IS NOT NULL predicates. The correct SQL query would have been as follows:

SELECT first_column, another_column
FROM my_table
WHERE first_column = 1
    AND (another_column >= 100 OR 
         another_column IS null)

So if you’re used to languages where you check for null with equal signs, remember that in SQL, you must use the special IS NULL or IS NOT NULL predicates instead of the equals sign or other comparison operators. Don’t get tripped up like I did.

If you want to read more about, Wikipedia has a detailed article on null in SQL.

Database query optimization is a useful skill

Recently I read an article titled “Quick tip: Improve developer habits by showing time cost of DB queries” by Dan Birken.  He claims that one common reason why websites load slowly are because of unoptimized database queries:

Websites can be slow for many reasons, but an overwhelmingly common one is time spent performing database queries. Now database queries are likely a very important part of rendering the webpage – the information to populate a page has to come from somewhere. But sometimes rendering a page is slowed down by either unnecessary queries or improperly optimized queries.

I completely agree with his view.  For most web developers, querying information from a database is quite a common action.  Therefore, I believe learning how to write efficient and optimized database queries is a good skill to have.

I was fortunate to learn about this topic when I was developing MyBB way back years ago.  At the bottom of each MyBB page, administrators are shown debugging info with a summary of loading times and other statistics.

Debug info
Debugging information presented to the administrator on each MyBB page

Clicking on the “advanced details” link yields a much more in-depth analysis of the page generation, including an explanation of all the database queries used to create the page.

Performance debugging details
Advanced performance debugging details presented to MyBB administrators.

This is very useful to determine where the bottlenecks occur when generating the page.  Some common mistakes include repeating similar queries instead of batching them together, selecting more data than is needed, and doing full table scans rather than indexed queries.

However, not all software and web frameworks has built-in performance debugging pages like what MyBB has.  If you’re programming something from scratch, it might take some extra effort to create a useful debugging interface.  What other tools can we use then?

Using the EXPLAIN query (works at least for MySQL, PostgreSQL, SQLite) will give you the details about how a specific query will be executed by the database.  All you need to do is to tack “EXPLAIN” on at the beginning of your SELECT query, and run it through your favourite database management program.  SitePoint has a thorough article on how to understand the output of an EXPLAIN query.

Of course, it’s also beneficial to have a concrete understanding of database indexes and how they work.  One of my favourite sites on this topic is Use the Index, Luke!  They also have a fun 3-minute quiz to test your knowledge, if you think you already know everything 😉  (I got 4/5 on my first try, how about you? 🙂 )

Do you know of any other resources to help with optimizing database queries?  Let me know in the comments.

Side Project 1: Real Time Bus Map in Vancouver

Since TransLink released their new mobile Next Bus site with real-time GPS updates of bus locations, I’ve been trying to find ways to get the data and rehash it into something that  Metro Vancouver transit enthusiasts (more specifically, enthusiasts who chase buses and monitor the transit system’s operation) will find useful.

There were two main shortcomings of TransLink’s site from the viewpoint of a transit enthusiast:

  1. Can’t search for a specific bus.  Often times transit enthusiasts “chase” a particular bus, usually a new bus, a fresh bus after a midlife refurbishment, or a bus with a new advertisement wrap.
  2. Can’t see the entire system as a whole.  This one’s pretty self explanatory.  It’s just fun to be able to see where all the buses are.

So I created a system which gleaned information from the TransLink site and aggregated it into a useful interface which I called “T-Comm”.

This is named after CMBC’s Transit Communications centre which has an interface similar to what I created.  Using the information I was also able to add additional functionality like grabbing the bus’s schedule for the current trip and even the entire day.

Since most transit enthusiasts would be using this on the go, I knew I had to make this site mobile-friendly. The enthusiasts I knew used a myriad of mobile devices including Blackberries, iPhones, and Androids, so it would not have made sense for me to create a native app for each of the platforms; it would have killed me in terms of time and energy.  I chose to use jQuery Mobile and Google Maps API as the basis of the frontend.  The backend is powered by PHP/MySQL. I was amazed at the ease I was able to make something mobile-friendly using jQuery Mobile.  It was actually fun too.  More importantly testers reported positively on their mobile devices.

I’m hoping that TransLink will release the GPS data officially for developers.  They’ve said after April 2012 on Twitter, so I’m crossing my fingers.

Contact me if you would like access to the site.  Since the current site hammers TransLink’s servers I’m trying to tread lightly.  Once official GPS data is used I will open it to the public, but I don’t really see this being used by the public as it is quite enthusiast-oriented.

Update: The site has been updated to use TransLink’s official data feed for real time bus data, so here’s the link: http://tcomm.bustrainferry.com.  More info on this site can be found on my T-Comm project info page.

MyBB Security

These are just my thoughts about MyBB security updates.  I’m not a security expert of any sort, but I just offer my opinion based on the knowledge I have.
Over the last few weeks there have been two releases to MyBB to patch potential security vulnerabilities that have been discovered by various parties. I have seen some people who have found these seemingly miniscule updates too trivial to apply to their own boards, despite the fact that I and various other members of the MyBB staff have recommended these updates.

These people seem to believe that just because no harm has been done by people attempting to exploit the vulnerability, or just because no harm has been done when they try the exploit script by themselves, that the upgrade is not required. Personally I find this absurd.

First of all, I’d like to point out that not all proof-of-concept scripts are harmful; as their name suggests, these scripts prove the concept, but may not actually compromise the system. Wikipedia says: “In both computer security and encryption, proof of concept refers to a demonstration that in principle shows how a system may be protected or compromized, without the necessity of building a complete working vehicle for that purpose.”

Just because a board administrator cannot find a way to exploit the vulnerability, doesn’t mean that another malacious user won’t find a way. Just because nothing has been “done” to the board when an attempt has been made, doesn’t mean that eventually someone else won’t find a way to compromise the board. For example, the 1.1.3 release patched a serious security vulnerability where a malacious user could execute arbitrary PHP code at their own heart’s content (with a malaciously-formed username). As an administrator, you may not even detect any problems on the surface if you tried the proof-of-concept script, or seen usernames that have registered on your board, but nothing harmful has happened. In fact, much more serious and critical information may have been available to the hands of malacious users, if they indeed have compromised the board in this manner, and the patch released was not applied.

As well, once the security vulnerability has been patched, anyone with a malacious intent would be able to figure out how to exploit it, and may be able to compromise boards which have not patched the vulnerability.

Okay, so I may not be a security expert, however, I do use my common sense (and I do hope that you use yours). When a security vulnerability has been found, and has been identified to affect the particular version of MyBB (or any other software), we do not just release these patches to annoy our users with little upgrades every few weeks. No, we actually do want to improve our software by patching these holes and keeping our users safe. If a vulnerability has been reported, it is most likely that something harmful can be done to your board, and if a board administrator wishes to take that risk and not upgrade, it is his or her decision, and I cannot force anyone to apply the patch.
Obviously it is possible that sometimes the malacious users will compromise boards before we can find the vulnerability and release the patch, but I assure you that security is at the highest priority with the MyBB Group, and we strive to keep our customers safe from these exploits in as a timely manner as possible.

However, once we have released a patch, it is up to each and every individual board administrator to update their board to keep them and their board safe from the exploit. Each security patch, no matter how small, should be considered as significant. I hope that you all take this into mind the next time you ponder whether or not to update your board.

After writing all this about security, I hope I won’t get hit on my behind by something that I have just fervently preached. 🙂

MyBB Development Update: Attachments

In anticipation of the release of first beta/preview of MyBB 1.2, I’ve been committing a lot of small features here and there. Some of them are from past suggestions on the Suggestions forum on the Community Forums (which proves that yes we do add features that people suggest to the MyBB releases). However, there are others that I have added on my own.

One of the sections I concentrated on during the last few days was the Attachment Manager, in both the Admin CP and User CP. I thought that it lacked some functionality, including finding broken attachments. There are three main ways where an attachment can be “lost.” First of all, the attachment can exist in the database, but the actual file has gone missing (directly deleted, or otherwise). Secondly, the attachment does exist in the uploads folder, but has lost its identification in the database. Thirdly, the attachment may not be associated with a post (eg. uploaded to a post that never was posted). I’ve started to implement an “Orphan Search” page in the Admin CP which currently searches for attachments that are in the uploads folder but aren’t associated in the database, and also gives an option to delete them. Regarding the first problem, I’ve added colour coding in the main attachment search page in the Admin CP that indicates that the file associated with the attachment does not exist. Now there are also a bit more detailed attachment statistics in the User CP (I thought that one line in the thead wasn’t sufficient ;)) I’ll probably be working on this more after the release of 1.2 so there will hopefully be more to come, for the attachment managers.

Anyway, as we’re nearing the release of the 1.2, I can see that there have been some significant modifications to certain parts of MyBB. Most of the changes were within the code. As you probably know, we are always looking for ways to make MyBB more OOP, so you’ll see a few new classes in the code. MySQL full-text searching capability should make its way into this release. This will hopefully make searching a lot better for both the user and the server. These are not all the changes, but just the few major ones. We like to give people surprises, so we’ll let you see them when it’s released 🙂