Sorting MySQL Records Based on Latitude and Longitude

The objective here is to achieve proximity-based searches for individuals or locations using latitude and longitude data, within the context of Location-Based Services (LBS). This pertains to building a backend for location-based applications without relying on MongoDB. While there are various solutions available online, it often boils down to a few fundamental approaches. Therefore, it is imperative to amalgamate the information available online with your unique requirements and gradually optimize your solution according to different stages and use cases.

In this context, ‘x’ represents latitude, and ‘y’ represents longitude.

P.S.: The SQL code provided is not complete; it’s intended for reading and reference purposes, helping you understand the concepts involved.

I. MySQL: A Simple Approach without Spatial Functions

  1. Approximation: Based on your specific scenario, determine a range and calculate latitude and longitude to form a rectangular area (A). This approach may lack precision but serves as a foundation. It is one of the most straightforward methods to implement.
WHERE latitude > y - range
AND latitude < y + range
AND longitude > x - range
AND longitude < x + range
ORDER BY ABS(longitude - x) + ABS(latitude - y)
LIMIT 10;
  1. Calculate Distance with PHP Functions and Sort Accordingly:
/**
 * @param $lat1 Latitude
 * @param $lng1 Longitude
 * @param $lat2 Latitude
 * @param $lng2 Longitude
 * @return float|int Distance (in kilometers)
 */
function GetDistance($lat1, $lng1, $lat2, $lng2)
{
    $EARTH_RADIUS = 6378.137;
    $radLat1 = rad($lat1);
    $radLat2 = rad($lat2);
    $a = $radLat1 - $radLat2;
    $b = rad($lng1) - rad($lng2);
    $s = 2 * asin(sqrt(pow(sin($a / 2), 2) +
            cos($radLat1) * cos($radLat2) * pow(sin($b / 2), 2)));
    $s = $s * $EARTH_RADIUS;
    $s = round($s * 10000) / 10000;
    return $s;
}

/**
 * @param $d
 * @return float
 * Convert to Radians
 */
function rad($d)
{
    return $d * pi() / 180.0;
}

For sorting, you can implement a suitable algorithm or utilize built-in sorting functions available in many programming languages.

II. MySQL: An Optimized Approach without Spatial Functions

In this optimization phase, we refine the range calculation used in the first approach. Calculate a more accurate range (B) based on the radius, which is circular (as the radius represents the distance between two points). However, during filtering (C), the range remains rectangular. Precisely speaking, circle B is the inscribed circle within rectangle C, and points outside circle B but inside rectangle C will also appear in the SQL results. Nevertheless, this approach is more accurate than the first.

$radius = 1; // Radius range in kilometers
$rangeLat = 180 / pi() * $radius / 6372.797; // Latitude range
$rangeLng = $rangeLat / cos($x * pi() / 180.0); // Longitude range
$maxLat = $x + $range; // x1
$minLat = $x - $range; // x0
$maxLng = $y + $lngR; // y1
$minLng = $y - $lngR; // y0

I have seen attempts to incorporate this calculation directly into SQL, resulting in complex SQL statements. However, such calculations are not best suited for SQL, and I do not recommend this approach.

III. MySQL: Using Spatial Functions

In the previous phase, we obtained four points, forming a rectangular area. All we need to do now is determine whether a point falls within this area. MySQL’s spatial functions can support any polygon, and they even allow for index optimization. While MyISAM can create spatial indexes, as of MySQL 5.7, InnoDB also supports spatial indexing.

The core idea here is to determine whether a point is within a specified range. If your database has a column of type geometry, you can execute a query like this:

SELECT id, AsText(g)
FROM geom
WHERE MBRContains(GeomFromText('Polygon((x0 y0, x1 y0, x1 y1, x0 y1, x0 x0))'), g);

This query accurately filters points within the specified range. Even when using the ORDER BY clause to limit results by distance, it does not significantly impact performance.

SELECT id, AsText(g), SQRT(POW(ABS(X(g) - X(x)), 2) + POW(ABS(Y(g) - Y(y)), 2)) AS distance
FROM geom[*1]
WHERE MBRIntersects(g, GeomFromText('Polygon((x0 y0, x1 y0, x1 y1, x0 y1, x0 x0))'))[*2]
AND SQRT(POW(ABS(X(g) - X(x)), 2) + POW(ABS(Y(g) - Y(y)), 2)) < radius[*3]
ORDER BY distance;[*4]
  • *1: Calculate results based on the distance formula.
  • *2: Condition 1 checks if the points in column ‘g’ intersect with the calculated range. Note the use of MBRIntersects() instead of MBRContains().
  • *3: Condition 2 checks if the ‘distance’ is less than the specified radius.
  • *4: Finally, the results are ordered by distance.

The above code is based on the official documentation and serves as a crucial reference. If you find it challenging to understand, don’t worry; I’ll provide an example:

Let’s use MBRContains() as an example. You can experiment with MBRWithin() as well; just pay attention to the parameter order. The results are essentially the same.

Function Usage: MBRContains(g1, g2)

Function Description: Returns 1 or 0 to indicate whether the minimum bounding rectangle (MBR) of ‘g1’ contains the MBR of ‘g2’. The function explicitly states whether ‘g1’ contains ‘g2’, so be cautious with the parameter order. This function supports any polygon.

Target Point: D(1,1), which can also be a range (refer to *1).

Range: E(0 0, 0 3, 3 3, 3 0, 0 0), a closed rectangle. It supports any closed polygon.

SELECT id, name
FROM geom
WHERE MBRContains(GeomFromText('Polygon((0 0, 0 3, 3 3, 3 0, 0 0))'),
                  GeomFromText('Point(1 1)'));[*1]
  • *1: In this context, the target point D can also be any closed polygon. The parameters

differ; instead of Point(), you’d use Polygon().

This SQL query can be interpreted as determining whether point D is within range E.

SELECT id, name, AsText(g)[*1]
FROM geom
WHERE MBRContains(GeomFromText('Polygon((0 0, 0 3, 3 3, 3 0, 0 0))'),
                  g);
  • *1: Since the ‘g’ column is of geometry type, we use AsText to convert it before displaying.

This SQL query can be interpreted as listing all points within database ‘geom’ that are contained within range E.

For more information on intersecting, containing, touching, and other spatial methods, refer to the development documentation in section “19.5.5. Relationships Between Geometric Minimum Bounding Rectangles (MBRs)”.

IV. GeoHash

GeoHash is a method to convert two-dimensional latitude and longitude coordinates into strings. The length of the string determines the precision. The more matching characters two GeoHashes share, the closer the corresponding coordinates are in most cases (with some exceptions).

Similar to: (G->)|J…me…|K…(<-H)

GeoHash divides the Earth’s surface into rectangular regions. Therefore, within the same rectangular region, GeoHashes are identical. However, there is a boundary issue. Suppose G and H are two adjacent rectangular regions. I am at the right boundary of G (right next to H), restaurant J is at the left boundary of G, and restaurant K is at the left boundary of H. If you calculate the GeoHash, it might suggest that restaurant J is closer to me, which doesn’t make sense.

This problem can be addressed by increasing the region’s precision and expanding the similar area.

You can implement GeoHash conversion from latitude and longitude using existing code available online. I won’t provide the code here, but there are PHP extensions available for faster calculations.

To give you an idea of precision:

  • With a 6-character GeoHash, you’re roughly within 1 kilometer of the nearby area.

V. Redis GeoHash

Redis offers high-efficiency geolocation capabilities through GeoHash (a high-performance and high-precision version). It encapsulates many useful methods and allows operations based on latitude and longitude without the need to convert to GeoHash strings manually. Redis can return points based on distance, support JSON output, and facilitate sorting.

However, being a key-value store, Redis has considerations related to persistence and capacity.

Redis doesn’t operate in isolation. It can be paired with MySQL, serving as a front-end cache for high-frequency location data. With the right architecture, you can create an efficient solution for location-based queries using both Redis and MySQL.

These are MySQL-based solutions for handling latitude and longitude data. If you have a better approach or want to discuss further, please feel free to contribute to the comments section.

Happy Coding!

http://howto-use-mysql-spatial-ext.blogspot.com

http://www.cnblogs.com/dengxinglin/archive/2012/12/14/2817761.html#a